Exploring Use-cases for Non-Volatile Memories in support of HPC Resilience

Size: px

Start display at page:

Download "Exploring Use-cases for Non-Volatile Memories in support of HPC Resilience"

Wilfred Robbins
5 years ago
Views:

1 Exploring Use-cases for Non-Volatile Memories in support of HPC Resilience Onkar Patil 1, Saurabh Hukerikar 2, Frank Mueller 1, Christian Engelmann 2 1 Dept. of Computer Science, North Carolina State University 2 Computer Science and Mathematics Division, Oak Ridge National Laboratory

2 MOTIVATION Exaflop Computers compute devices + memory devices + interconnects + cooling and power Close Proximity! Manufacturing processes not foolproof Lower durability and reliability Frequency of device failures and data corruptions effectiveness and utility Future Applications need to be more resilient, Maintain a balance between performance and power consumption Minimize trade-offs

PROBLEM STATEMENT Non-volatile memory (NVM) technologies maintain state of computation in the

improving resilience of an application against crashes Persistent memory regions to improve HPC

DYNAMIC DATA STRUCTURES STATIC DATA STRUCTURES DYNAMIC DATA STRUCTURES STATIC DATA STRUCTURES

3 PROBLEM STATEMENT Non-volatile memory (NVM) technologies maintain state of computation in the primary memory architecture More potential as specialized hardware Data Retention critical in improving resilience of an application against crashes Persistent memory regions to improve HPC resiliency key aspect of this project APPLICATION APPLICATION APPLICATION STATIC DATA STRUCTURES DYNAMIC DATA STRUCTURES STATIC DATA STRUCTURES DYNAMIC DATA STRUCTURES STATIC DATA STRUCTURES DYNAMIC DATA STRUCTURES NVM NVM NVM NVM-based Main Memory Application-directed Checkpointing Data Versioning

4 RESULTS Experimentation Setup 16-node cluster with Dual socket, Quad-Core AMD Opteron, 128 GB memory, Intel SSD from 100GB to 256GB DGEMM benchmark of the HPCC benchmark suite Tested for 4, 8 and 16-node configurations for a matrix sizes of 1000, 2000 and 3000 elements

5 GFLOPS Time(sec) 10 GFLOPS in node scaling for StarDGEMM PMEM_ONLY PMEM_CPY PMEM_VER 100 Execution times in node scaling for StarDGEMM PMEM_ONLY PMEM_CPY 1 10 PMEM_VER Nodes Nodes 8 16 only allocation and NVM-based main memory perform better An inefficient lookup algorithm

6 GFLOPS Time(sec) 10 1 GFLOPS for problem size scaling in StarDGEMM PMEM_ONL Y Execution time for problem size scaling in StarDGEMM PMEM_ONLY PMEM_CPY PMEM_VER No. of elements All modes perform similar and consistently for node and data scaling Execution time increases exponentially for multiple copies of memory No. of elements

7 CONCLUSION & FUTURE WORK Conclusion: Non-volatile memory devices can be used as specialized hardware for improving the resilience of the system Future Work: Memory usage modes to make applications efficient and maintain complete system state Minimal overhead Support more complex applications Lightweight recovery mechanisms to work with the checkpointing schemes Reduce downtime and rollback time Intelligent policies that can help build resilient static and dynamic runtime system

8 Evaluating Performance of Burst Buffer Models for Real-Application Workloads in HPC Systems Harsh Khetawat Frank Mueller Christopher Zimmer

9 Introduction Existing storage systems becoming bottleneck Solution: burst buffers Use burst buffers for: Checkpoint/Restart I/O Staging Write-through cache for parallel FS Burst Buffers on Cori

10 Burst buffer placement: Placement Co-located with compute nodes (Summit) Co-located with I/O nodes (Cori) Separate set of nodes Trade-offs in choice of placements Capability I/O models, staging, etc. Predictability Impact on shared resources, runtime variability Economic Infrastructure reuse, cost of storage device I/O performance dependent on placement Choice of network topology

Idea Simulate network and burst buffer architectures CODES simulation suite Real-world I/O traces (Darshan) Full multi-tenant system with mixed workloads (capability/capacity)

11 Idea Simulate network and burst buffer architectures CODES simulation suite Real-world I/O traces (Darshan) Full multi-tenant system with mixed workloads (capability/capacity) Supports network topologies Local & external storage models Combine network topologies and storage architectures Performance under striping/protection schemes Reproducible tool for HPC centers

12 Conclusion Determine based on workload characteristics: Burst buffer placement Network topology Performance of striping across burst buffers Overhead of resilience schemes Reproducible tool to: Simulate specific workloads Determine best fit

13 Thank You

Scalable, Fault-Tolerant Membership for MPI Tasks on HPC Systems

fastos.org/molar Scalable, Fault-Tolerant Membership for MPI Tasks on HPC Systems Jyothish Varma 1, Chao Wang 1, Frank Mueller 1, Christian Engelmann, Stephen L. Scott 1 North Carolina State University,