BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160726Z
LOCATION:D220
DTSTART;TZID=America/Chicago:20181111T141500
DTEND;TZID=America/Chicago:20181111T141800
UID:submissions.supercomputing.org_SC18_sess160_ws_whpc126@linklings.com
SUMMARY:Study of Performance Variability on Dragonfly Systems
DESCRIPTION:Workshop\nDiversity, Education, Hot Topics, Workshop Reg Pass\
 n\nStudy of Performance Variability on Dragonfly Systems\n\nWang\n\nDragon
 fly networks are being widely adopted in high-performance computing system
 s. On these networks, however, interference caused by resource sharing can
  lead to significant network congestion and performance variability. On a 
 shared network, different job placement policies lead to different traffic
  distributions. A contiguous job placement policy achieves localized
  communication by assigning adjacent compute nodes to the same job. A
  random job placement policy, on the other hand, achieves balanced
  network traffic by placing application processes sparsely across the
  network to distribute the message load uniformly. Localized
  communication and balanced network traffic have opposite advantages
  and drawbacks. Localizing communication reduces
  the number of hops for message transfers at the cost of potential network
  congestion, while balancing network traffic reduces potential local
  congestion at the cost of increased message transfer hops.\n\nIn this
  study, we first present a comparative analysis, using trace-based
  simulations, of the trade-off between localizing communication and
  balancing network traffic. We also demonstrate the effect of external
  network interference by introducing background traffic, and show that
  localized communication can help reduce the application performance
  variation caused by network sharing. We then introduce an online
  simulation framework that improves performance and scalability, and
  discuss the validation of the simulation observations against a
  production Dragonfly system with respect to performance variability.
URL:https://sc18.supercomputing.org/presentation/?id=ws_whpc126&sess=sess1
 60
END:VEVENT
END:VCALENDAR

