BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160731Z
LOCATION:D174
DTSTART;TZID=America/Chicago:20181116T105000
DTEND;TZID=America/Chicago:20181116T111000
UID:submissions.supercomputing.org_SC18_sess146_ws_ftxs114@linklings.com
SUMMARY:Extending and Evaluating Fault-Tolerant Preconditioned Conjugate G
 radient Methods
DESCRIPTION:Workshop\nResiliency\, Scientific Computing\, Workshop Reg
  Pass\n\nExtending and Evaluating Fault-Tolerant Preconditioned
  Conjugate Gradient Methods\n\nPachajoa\, Levonyak\, Gansterer\n\nWe
  compare and refine exact and heuristic fault-tolerance extensions for
  the preconditioned conjugate gradient (PCG) and the split
  preconditioner conjugate gradient (SPCG) methods for recovering from
  failures of compute nodes of large-scale parallel computers. In the
  exact state reconstruction (ESR) approach\, which is based on a method
  proposed by Chen (2011)\, the solver keeps extra information from
  previous search directions of the (S)PCG solver\, so that its state
  can be fully reconstructed if a node fails unexpectedly. ESR does not
  make use of checkpointing or external storage for saving dynamic
  solver data and has only negligible computation and communication
  overhead compared to the failure-free situation. In exact arithmetic\,
  the reconstruction is exact\, but in finite-precision computations\,
  the number of iterations until convergence can differ slightly from
  the failure-free case due to rounding effects. We perform experiments
  to investigate the behavior of ESR in floating-point arithmetic and
  compare it to the heuristic linear interpolation (LI) approach by
  Langou et al. (2007) and Agullo et al. (2016)\, which does not have to
  keep extra information and thus has lower memory requirements. Our
  experiments illustrate that ESR\, on average\, has essentially zero
  overhead in terms of additional iterations until convergence\,
  whereas the LI approach incurs much larger overheads.
URL:https://sc18.supercomputing.org/presentation/?id=ws_ftxs114&sess=sess1
 46
END:VEVENT
END:VCALENDAR

