BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260522T150120Z
LOCATION:C140/142
DTSTART;TZID=America/Chicago:20181115T143000
DTEND;TZID=America/Chicago:20181115T150000
UID:submissions.supercomputing.org_SC18_sess190_pap322@linklings.com
SUMMARY:Anatomy of High-Performance Deep Learning Convolutions on SIMD Arc
 hitectures
DESCRIPTION:Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj
  Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporati
 on)\n\nConvolution layers are prevalent in many classes of deep neural net
 works, including Convolutional Neural Networks (CNNs) which provide state-
 of-the-art results for tasks like image recognition, neural machine transl
 ation, and speech recognition. The computationally expensive nature of a c
 onvolution operation has led to the proliferation of implementations inclu
 ding matrix-matrix multiplication formulation, and direct convolution prim
 arily targeting GPUs. In this paper, we introduce direct convolution kerne
 ls for x86 architectures, in particular for Xeon and Xeon Phi systems, whi
 ch are implemented via a dynamic compilation approach. Our JIT-based imple
 mentation shows close to theoretical peak performance, depending on the se
 tting and the CPU architecture at hand. We additionally demonstrate how th
 ese JIT-optimized kernels can be integrated into a light-weight multi-node
  graph execution model. This illustrates that single- and multi-node runs 
 yield high efficiencies and high image throughputs when executing state-o
 f-the-art image recognition tasks on CPUs.\n\nTag: Applications, Cosmology
 , Data Analytics, Deep Learning, Machine Learning, Programming Systems, St
 orage, Visualization\n\nRegistration Category: Tech Program Reg Pass\n\nSe
 ssion Chair: Tal Ben-Nun (Lawrence Livermore National Laboratory (LLNL))\n
 \n
END:VEVENT
END:VCALENDAR
