Quantifying Resiliency in the Extreme Scale HPC Co-Design Space


1 Quantifying Resiliency in the Extreme Scale HPC Co-Design Space
Jeffrey S. Vetter, Jeremy Meredith
Dagstuhl Seminar #15281: Algorithms and Scheduling Techniques to Manage Resilience and Power Consumption in Distributed Systems
9 Jul 2015
ORNL is managed by UT-Battelle for the US Department of Energy
vetter@computer.org

2 Overview
- Our community has major challenges in HPC as we move to extreme scale
  - Power, Performance, Resilience, Productivity
  - Major shifts in architectures, software, applications
  - Not just HPC: Most uncertainty in two decades
- New technologies emerging to address some of these challenges
  - Heterogeneous computing
  - Nonvolatile memory
- Consequently, we now have critical situations in
  - Portable programming models
  - Better design and analysis tools for procurement, optimization, etc.
- Aspen is a tool for structured design and analysis
  - Co-design applications and architectures for performance, power, resiliency

3 Surveying the HPC Landscape: Today and Tomorrow

4 Notional Exascale Architecture Targets (From Exascale Arch Report 2009)
System attributes: 2001 2010 2015 2018
System peak: 10 Tera 2 Peta 200 Petaflop/sec 1 Exaflop/sec
Power: ~0.8 MW 6 MW 15 MW 20 MW
System memory: PB 0.3 PB 5 PB PB
Node performance: TF TF 0.5 TF 7 TF 1 TF 10 TF
Node memory BW: 25 GB/s 0.1 TB/sec 1 TB/sec 0.4 TB/sec 4 TB/sec
Node concurrency: O(100) O(1,000) O(1,000) O(10,000)
System size (nodes): ,700 50,000 5,000 1,000, ,000
Total Node Interconnect BW: 1.5 GB/s 150 GB/sec 1 TB/sec 250 GB/sec 2 TB/sec
MTTI: day O(1 day) O(1 day)
Parallel I/O??
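The peak and power rows of the targets table are the two that survive intact, so the implied energy efficiency can be worked out directly. The sketch below is only illustrative: it assumes the four values in each row pair up with the years 2001, 2010, 2015 and 2018 in order, treats "~0.8 MW" as exactly 0.8 MW, and uses a hypothetical helper name (`gflops_per_watt`) that does not come from the talk.

```python
# Peak (flop/s) and power (W) per year, read from the two intact table rows.
# Assumption: values pair up with the years in column order.
targets = {
    2001: (10e12, 0.8e6),    # 10 Tera, ~0.8 MW (approximate in the source)
    2010: (2e15, 6e6),       # 2 Peta, 6 MW
    2015: (200e15, 15e6),    # 200 Petaflop/sec, 15 MW
    2018: (1e18, 20e6),      # 1 Exaflop/sec, 20 MW
}


def gflops_per_watt(peak_flops: float, power_watts: float) -> float:
    """Energy efficiency in Gflop/s per watt (hypothetical helper)."""
    return peak_flops / power_watts / 1e9


efficiency = {
    year: gflops_per_watt(peak, power)
    for year, (peak, power) in targets.items()
}

for year, value in sorted(efficiency.items()):
    print(f"{year}: {value:.4g} Gflop/s per watt")
```

Under these assumptions the 2018 target works out to 50 Gflop/s per watt, against roughly a third of a Gflop/s per watt for the 2010 column, which is why power appears first in the list of challenges.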

5 Today's Status

6 (Un-)Balanced Systems??
System attributes: 2001 2010 2014 2015 est 2018 Summit/Titan 2018
Name: Seaborg3 Jaguar Titan SUMMIT
System peak: 10 Tera 2 27 200 136 5.0 1 Exaflop/sec
Power (MW)
Node main memory (GB)
System memory (PB)
Node Persistent Memory (GB): 800 inf
System Persistent Memory (PB): 2.72 inf
Node performance (TF): 0.024 0.125 1.4 0.5 7 40 28.6 1 10
Node memory BW: 25 GB/s 0.1 TB/sec 1 TB/sec 0.4 TB/sec 4 TB/sec
Node concurrency: O(100) O(1,000) *POWER9s + *VOLTAs O(1,000) O(10,000)
System size (nodes)
Total Node Interconnect BW (GB/s): 1.5 GB/s 150 GB/sec 1 TB/sec 250 GB/sec 2 TB/sec
Injection bandwidth per node (GB/s)
File system capacity (PB)
File system bandwidth (TB/s)
MTTI: day O(1 day) O(1 day)
Annotations:
- Power is constant
- 1/5 of the node count
- Heterogeneous
- I/O and NIC bandwidth has plateaued
- NVM is new!
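The Summit/Titan column can be sanity-checked from the values that survive in the flattened table. The sketch below rests on assumptions, not on anything the slide states outright: it reads 27 and 136 as the Titan and Summit system peaks (same unit), 1.4 and 40 as their node performance in TF, and it uses decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes) to back out a node count from the persistent-memory rows.

```python
# Assumed reading of the flattened rows: Titan column vs. SUMMIT column.
titan = {"system_peak": 27.0, "node_tf": 1.4}
summit = {"system_peak": 136.0, "node_tf": 40.0}

# Ratio column, rounded to one decimal as the slide appears to do.
ratios = {key: round(summit[key] / titan[key], 1) for key in titan}

# Node count implied by 2.72 PB of system persistent memory at 800 GB per node
# (assumes decimal units; the slide does not give this number directly).
implied_nodes = round(2.72e15 / 800e9)

print(ratios)
print(implied_nodes)
```

If this reading is right, the computed ratios match the 5.0 and 28.6 that appear in the table, which supports treating those two entries as the Summit/Titan column: node performance grows far faster than system peak because the node count shrinks.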

7 Notional Future Architecture
(Figure; label: Interconnection Network)

8 Investigating Emerging Technologies
- Heterogeneous Computing
- NV Memory
- Optical interconnect, Silicon photonics
- Storage systems (key, value)

9 Earlier Experimental Computing Systems (past decade)
- Popular architectures since ~2004
- The past decade has started the trend away from traditional simple architectures
- Examples: Cell, GPUs, FPGAs, SoCs, etc.
- Mainly driven by facilities costs and successful (sometimes heroic) application examples

10 We've seen this for decades; Why is it different this time?
IEEE Spectrum; Bob Colwell, Hotchips 25
Avago acquires Broadcom; Intel acquires Altera; next??

11 Emerging Computing Architectures Now
- Heterogeneous processing
  - Latency tolerant cores
  - Throughput cores
  - Special purpose hardware (e.g., AES, MPEG, RND)
- Memory
  - Fused, configurable memory
  - 2.5D and 3D Stacking
  - HMC, HBM, WIDEIO2, LPDDR4, etc.
  - New devices (PCRAM, ReRAM)
- Interconnects
  - Collective offload
  - Scalable topologies
- Storage
  - Active storage
  - Non-traditional storage architectures (key-value stores)
- Improving performance and programmability in face of increasing complexity
  - Power, resilience
HPC (mobile, enterprise, embedded) computer design is more fluid now than in the past two decades.

12 Recent announcements

13 NVRAM Technology Continues to Improve Driven by Market Forces

14 Comparison of emerging memory technologies
Jeffrey Vetter, ORNL; Robert Schreiber, HP Labs; Trevor Mudge, University of Michigan; Yuan Xie, Penn State University
Technologies: SRAM, DRAM, eDRAM, 2D NAND Flash, 3D NAND Flash, PCRAM, STTRAM, 2D ReRAM, 3D ReRAM
Data Retention: N N N Y Y Y Y Y Y
Cell Size (F^2): < <1
Minimum F demonstrated (nm)
Read Time (ns): <
Write Time (ns): <
Number of Rewrites
Read Power: Low Low Low High High Low Medium Medium Medium
Write Power: Low Low Low High High High Medium Medium Medium
Power (other than R/W): Leakage Refresh Refresh None None None None Sneak Sneak
Maturity

15 Opportunities for NVM in Emerging Systems
- Burst Buffers
- In situ visualization
- In-mem tables
J.S. Vetter and S. Mittal, Opportunities for Nonvolatile Memory Systems in Extreme-Scale High-Performance Computing, Computing in Science & Engineering, 17(2):73-82, 2015, /MCSE

16 With so many complex architectural choices, we need new methods and tools
- Performance Portable Programming Models
- Design tools for architectures and applications
  - Performance
  - Resiliency
  - Power

17 Workflow within the Exascale Ecosystem
"(Application driven) co-design is the process where scientific problem requirements influence computer architecture design, and technology constraints inform formulation and design of algorithms and software." Bill Harrod (DOE)
Slide courtesy of ExMatEx Co-design team.
Figure labels: Domain/Alg Analysis Application Co-Design Proxy Apps Application Design System Design Vendor Analysis Sim Exp Proto HW Prog Models HW Simulator Tools Hardware Co-Design HW Design Open Analysis Models Simulators Emulators SW Solutions HW Constraints Computer Science Co-Design System Software Stack Analysis Prog models Tools Compilers Runtime OS, I/O,...

18 Prediction Techniques Ranked

19 Prediction Techniques Ranked

20 Aspen: Abstract Scalable Performance Engineering Notation
Source code, Aspen code
Creation
- Static analysis via compilers
- Empirical, Historical
- Manual for future applications
Representation in Aspen
- Modular
- Sharable
- Composable
- Reflects prog structure
Use
- Interactive tools for graphs, queries
- Design space optimization
- Drive simulators
- Feedback to runtime systems
Existing models for MD, UHPC CP 1, Lulesh, 3D FFT, CoMD, VPFFT
Researchers are using Aspen for parallel applications, scientific workflows, capacity planning, quantum computing, etc.
K. Spafford and J.S. Vetter, Aspen: A Domain Specific Language for Performance Modeling, in SC12: ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis, 2012

21 Manual Example of LULESH

22 Creating Aspen Models
S. Lee, J.S. Meredith, and J.S. Vetter, COMPASS: A Framework for Automated Performance Modeling and Prediction, in ACM International Conference on Supercomputing (ICS). Newport Beach, California: ACM, 2015, /

23 Simple MM example generated from COMPASS

24 Example Queries
- Automated Big O Notation
- Idealized Concurrency

25 Example Queries
Computational Intensity and Memory Usage per Subroutine

26 Fast and Accurate Enough for Online Decisions

27 PANORAMA Overview
Figure labels: Raw and Correlated Monitoring Data ESnet Workflow Simulation Infrastructure Design Resources OLCF Model Validation NERSC Pegasus Framework Anomaly Detection and Diagnosis Viz APS SNS Aspen Modeling Language and System Resource Mapping and Adaptation Workflow Execution HPSS VDF ExoGENI ESnet testbed
E. Deelman, C. Carothers et al., PANORAMA: An Approach to Performance Modeling and Diagnosis of Extreme Scale Workflows, International Journal of High Performance Computing Applications, (to appear), 2015

28 End-to-end Resiliency Design using Aspen

29 Data Vulnerability Factor: Why a new metric and methodology?
- Analytical model of resiliency that includes important features of architecture and application
  - Fast
  - Flexible
- Balance multiple design dimensions
  - Application requirements
  - Architecture (memory capacity and type)
- Focus on main memory initially
- Prioritize vulnerabilities of application data
L. Yu, D. Li et al., Quantitatively modeling application resilience with the data vulnerability factor (Best Student Paper Finalist), in SC14: International Conference for High Performance Computing, Networking, Storage and Analysis. New Orleans, Louisiana: IEEE Press, 2014, pp , /sc

30 DVF Defined
Data Structure Vulnerability:
$$\mathrm{DVF}_d = N_{error} \times N_{ha}$$
Application Vulnerability:
$$\mathrm{DVF}_a = \sum_{i=1}^{n} \mathrm{DVF}_{d_i}$$
Larger DVF indicates higher vulnerability, and vice versa.
Hardware Effects: Number of Errors ($N_{error}$)
- Hardware Failure Rate ($FIT$)
- Execution Time ($T$)
- Footprint Size ($S_d$)
$$N_{error} = FIT \times T \times S_d$$
Application Effects: Number of Hardware Accesses ($N_{ha}$)
- Hardware Access Pattern
We focus on a specific hardware component, the main memory, in this work.
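The DVF definitions compose mechanically, which a few lines of Python can make concrete. This is a minimal sketch under stated assumptions, not the Aspen implementation: it reads the data-structure DVF as the product of the error count and the access count, leaves the units of FIT, T and S_d to the caller (they only need to be mutually consistent), and every name in it (`DataStructure`, `n_error`, `dvf_data`, `dvf_app`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class DataStructure:
    """One application data structure (hypothetical container)."""
    name: str
    footprint: float   # S_d: footprint size
    accesses: float    # N_ha: number of hardware (main memory) accesses


def n_error(fit: float, exec_time: float, footprint: float) -> float:
    """Expected number of errors: N_error = FIT * T * S_d."""
    return fit * exec_time * footprint


def dvf_data(fit: float, exec_time: float, ds: DataStructure) -> float:
    """Data structure vulnerability: DVF_d = N_error * N_ha (assumed product)."""
    return n_error(fit, exec_time, ds.footprint) * ds.accesses


def dvf_app(fit: float, exec_time: float,
            structures: Iterable[DataStructure]) -> float:
    """Application vulnerability: sum of DVF_d over all data structures."""
    return sum(dvf_data(fit, exec_time, ds) for ds in structures)


# Made-up numbers, chosen only to keep the arithmetic easy to follow.
example = [DataStructure("a", 4.0, 5.0), DataStructure("b", 1.0, 10.0)]
total = dvf_app(2.0, 3.0, example)
print(total)
```

Because the application value is a plain sum, the per-structure terms can be ranked directly, which is what makes the metric useful for deciding which data deserves protection first.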

31 Implementing DVF
- Extend Aspen performance modeling language
  - Specify memory access patterns
  - Combine error rates with memory regions and performance
- Assign DVF to each application memory region, sum for application

32 Workflow to calculate Data Vulnerability Factor

33 An Example of Aspen Program for DVF
Pseudocode:
    procedure VM(A, B, C)
        for i ← 1, 1000 do
            C[i] ← C[i] + A[i*4] * B[i*8]
        end for
    end procedure
Extended Aspen Statements:
    kernel vecmul {
      execute mainblock2 [1] {
        flops [2*(n^3)] as sp, fmad, simd
        access {1000} from {mata} as stream(4,16)
        access {4000} from {matb} as stream(4,32)
        access {8000} from {matc} as stream(4,4)
      }
    }
Processing: Extended Parser, Extended Compiler
Syntax Tree (node shown three times on the slide):
    Resilience Statements:
      Footprint Sizes: Int: 16,000
      Data Structures: Ident: mata
      Access Pattern: Stream Int: 4 Int: 16
Resilience Modeling Results:
    Data structure A: Number of errors: 30,400; Number of memory accesses: 51; DVF: e+06
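As a consistency check on the results panel, and assuming the data-structure DVF is the product of the number of errors and the number of memory accesses, the two surviving values for data structure A multiply out as shown below. The mantissa of the DVF did not survive on the slide, so this figure is recomputed rather than quoted; it does agree with the surviving exponent e+06.

```latex
\mathrm{DVF}_A = N_{error} \times N_{ha} = 30{,}400 \times 51 = 1{,}550{,}400 \approx 1.55 \times 10^{6}
```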

34 DVF Results
Provides insight for balancing interacting factors

35 DVF: next steps
- Evaluated different architectures
  - How much no-ECC, ECC, NVM?
- Evaluate software and applications
  - ABFT
  - C/R
  - TMR
  - Containment domains
  - Fault tolerant MPI
- End-to-End analysis
  - Where should we bear the cost for resiliency? Not everywhere!

36 Summary
- Our community has major challenges in HPC as we move to extreme scale
  - Power, Performance, Resilience, Productivity
  - Major shifts in architectures, software, applications
  - Not just HPC: Most uncertainty in two decades
- New technologies emerging to address some of these challenges
  - Heterogeneous computing
  - Nonvolatile memory
- Consequently, we now have critical situations in
  - Portable programming models
  - Performance prediction for procurement, optimization, etc.
- Aspen is a tool we have developed for performance prediction
  - Co-design applications and architectures for performance, power, resiliency

37 Acknowledgements
Contributors and Sponsors
- Future Technologies Group: http://ft.ornl.gov
- US Department of Energy Office of Science
- DOE Vancouver Project: https://ft.ornl.gov/trac/vancouver
- DOE Blackcomb Project: https://ft.
- DOE ExMatEx Codesign Center
- DOE Cesar Codesign Center
- DOE Exascale Efforts
- Scalable Heterogeneous Computing Benchmark team
- US National Science Foundation Keeneland Project
- US DARPA
- NVIDIA CUDA Center of Excellence
