BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160731Z
LOCATION:C141/143/149
DTSTART;TZID=America/Chicago:20181115T153000
DTEND;TZID=America/Chicago:20181115T160000
UID:submissions.supercomputing.org_SC18_sess186_pap407@linklings.com
SUMMARY:Dac-Man: Data Change Management for Scientific Datasets on HPC Sys
 tems
DESCRIPTION:Paper\nArchitectures, Data Management, File Systems, Networks,
  State of the Practice, System Software, Workflows, Tech Program Reg Pass\
 n\nDac-Man: Data Change Management for Scientific Datasets on HPC Systems\
 n\nGhoshal, Ramakrishnan, Agarwal\n\nScientific data is growing rapidly an
 d often changes due to instrument configurations, software updates, or qua
 lity assessments. These changes in datasets can result in significant wast
 e of compute and storage resources on HPC systems as downstream pipelines 
 are reprocessed. Data changes need to be detected, tracked, and analyzed f
 or understanding the impact of data change, managing data provenance, and 
 making efficient and effective decisions about reprocessing and use of HPC
  resources. Existing methods for identifying and capturing change are ofte
 n manual, domain-specific, and error-prone and do not scale to large scien
 tific datasets. In this paper, we describe the design and implementation o
 f the Dac-Man framework, which identifies, captures, and manages change in
  large scientific datasets, and enables plug-in of domain-specific change
  analysis with minimal user effort. Our evaluations show that it can
  retrieve file changes from directories containing millions of files and
  terabytes of data in less than a minute.
URL:https://sc18.supercomputing.org/presentation/?id=pap407&sess=sess186
END:VEVENT
END:VCALENDAR

