Challenges in High Performance Computing. William Gropp


1 Challenges in High Performance Computing William Gropp

2  What is HPC?
- High Performance Computing is the use of computing to solve challenging problems that require significant computing resources
  - Supercomputers
  - Clusters (large and small)
  - Accelerator-equipped workstations
  - Custom hardware (e.g., Anton)
- Traditionally FLOPS
  - But could be memory, data, bandwidth, real-time, ...

3  What Sort of Problems are Solved with HPC?
- Engineering: everything from consumer products to spacecraft
  - Getting Pringles into a can (3-D, time-dependent CFD)
  - Optimizing bottles for minimum weight with the required robustness
  - Fuel-efficient aircraft with novel materials
- Insights in science
  - Formation of galaxies; effects of different theories of dark matter and dark energy
  - Weather and climate forecasting
  - Formation (and points of attack) of viruses such as HIV

4  Advancing Science and Engineering
- Advances in a broad range of science and engineering disciplines will be enabled by sustained petascale computers:
  - Molecular Science
  - Weather & Climate Forecasting
  - Astronomy
  - Earth Science
  - Health

5  Messages
- Big is big
  - Data-driven is an important area, but not all data-driven problems are big data (despite current hype). The distinction is important.
  - There are different measures of big, but a TB of data that can be processed by a linear algorithm is not big.
- A key feature of an extreme computing system is a fast interconnect
  - Low latency, high link bandwidth, high bisection bandwidth
  - Provides fast access to data everywhere in the system, particularly with one-sided access models
  - Think of a map(r1, r2, ...) function that requires more than one record, where the specific input records are unpredictable (e.g., data dependent on a previous result); a sketch follows below
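As an illustration of the data-dependent, one-sided access the last bullet describes, here is a minimal MPI-3 RMA sketch. It is not from the slides: the Record layout, NRECORDS_LOCAL, and the pointer-chasing loop are hypothetical, chosen only to show why a fast interconnect with one-sided access matters when the next record to fetch is not known in advance.

```c
/* Hypothetical sketch (not from the slides): chasing data-dependent records
 * across ranks with MPI-3 one-sided access. Record, NRECORDS_LOCAL, and the
 * traversal are illustrative. */
#include <mpi.h>
#include <stdlib.h>

typedef struct { double value; int next_owner; int next_index; } Record;
#define NRECORDS_LOCAL 1000

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    Record *local = calloc(NRECORDS_LOCAL, sizeof(Record));
    MPI_Win win;
    /* Expose this rank's records; displacements are in units of one Record. */
    MPI_Win_create(local, NRECORDS_LOCAL * sizeof(Record), sizeof(Record),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_lock_all(0, win);
    Record r = local[0];
    for (int hop = 0; hop < 10; hop++) {
        Record next;
        /* Which rank and index we read next depends on the previous record,
         * so the access pattern cannot be predicted or batched in advance. */
        MPI_Get(&next, sizeof(Record), MPI_BYTE,
                r.next_owner, r.next_index, sizeof(Record), MPI_BYTE, win);
        MPI_Win_flush(r.next_owner, win);   /* complete this get before use */
        r = next;
    }
    MPI_Win_unlock_all(win);

    MPI_Win_free(&win);
    free(local);
    MPI_Finalize();
    return 0;
}
```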

6  HPC and Clouds and Big Data
- Clouds can provide some HPC services
  - Especially where each experiment runs on one node
  - Single nodes can do a lot!
- But clouds are very poor at applications that require tightly coordinated computations across tens or hundreds of thousands of nodes
  - The one thing that makes a supercomputer "super" today is a high-bandwidth, low-latency interconnect, and the software to match
- Big Data also requires big compute
  - Some applications only need independent computation (e.g., clouds)

7  Extrapolation is Risky
- 1989 (24 years ago)
  - Intel introduces the 486DX
  - Eugene Brooks writes "Attack of the Killer Micros"
  - 4 years before the TOP500
  - Top systems at about 2 GF peak
- 1999 (14 years ago)
  - NVIDIA introduces its GPU (GeForce 256)
    - Programming GPUs is still a challenge 14 years later
  - Top system is ASCI Red: 9,632 cores, 3.2 TF peak (about 3 GPUs in 2013)
  - MPI is 7 years old

8  HPC Today
- High(est)-end systems
  - 1 PF (10^15 ops/s) achieved on a few "peak friendly" applications by 2011; Blue Waters demonstrated > 1 PF end-to-end on a larger set this year
  - Much worry about scalability and how we're going to get to an ExaFLOPS
- Systems are all oversubscribed
  - DOE INCITE awarded almost 900M processor hours in 2009, 1,600M-1,700M hours in later years, and over 5B hours in 2013
  - NSF PRAC awards for Blue Waters are similarly competitive
- Widespread use of clusters, many with accelerators; cloud computing services
  - These are transforming the low and midrange
  - Laptops are (far) more powerful than the supercomputers I used as a graduate student

9  HPC in 2011
- Sustained-PF systems
  - K Computer (Fujitsu) at RIKEN, Kobe, Japan (2011)
  - Sequoia Blue Gene/Q at LLNL
  - NSF Track 1 Blue Waters at Illinois
  - Milky Way-2 (TH-2) in China (applications yet to be shown)
- Still programmed with MPI and MPI+other (e.g., MPI+OpenMP, MPI+OpenCL/CUDA, or MPI+OpenACC); a minimal hybrid sketch follows below
  - But in many cases using toolkits, libraries, etc.
  - And "not so bad": applications will be able to run when the system is turned on
- Replacing MPI will require some compromise, e.g., domain-specific (higher-level but less general) approaches
  - Lots of evidence that fully automatic solutions won't work
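The MPI+X combinations named above can be as simple as an MPI reduction across nodes wrapped around an OpenMP loop within each node. A minimal, hedged sketch; the loop body and counts are illustrative, not code from the talk:

```c
/* A minimal MPI+OpenMP hybrid sketch (illustrative only): MPI between nodes,
 * OpenMP threads within a node, i.e., the "MPI+X" pattern the slide refers to. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided, rank;
    /* Request thread support for threads that do not call MPI concurrently. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = 0.0;
    #pragma omp parallel for reduction(+:local)
    for (int i = 0; i < 1000000; i++)
        local += 1.0 / (1.0 + i);          /* node-local work across threads */

    double global;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("global sum = %f\n", global);

    MPI_Finalize();
    return 0;
}
```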

10  Blue Waters and Sequoia Computing Systems

System Attribute                        Blue Waters (NCSA)    Sequoia (LLNL)
Vendors                                 Cray/AMD/NVIDIA       IBM
Processors                              Interlagos/Kepler     PowerPC A2 variant
Total peak performance (PF)
Total peak performance, CPU/GPU (PF)    7.6/                  /0.0
Number of CPU chips (8, 16 cores/chip)  48,576                98,304
Number of GPU chips                     3,072                 0
CPU memory (TB)                         1,510                 1,572
Interconnect                            3D Torus              5D Torus
On-line disk storage (PB)               26                    50(?)
Sustained disk transfer (TB/s)          >1
Archival storage                        >300                  ?
Sustained tape transfer (GB/s)          >100                  ?

11  Blue Waters and MilkyWay-2 Computing Systems

System Attribute                        Blue Waters (NCSA)          Milky Way-2 (NUDT)
Vendors                                 Cray/AMD/NVIDIA             NUDT/Inspur
Processors                              Interlagos/Kepler           Ivy Bridge/Phi
Total peak performance (PF)
Total peak performance, CPU/GPU (PF)    7.6/                        /48.1
Number of CPU chips                     48,576 (8, 16 cores/chip)   32,000 (12 cores/chip)
Number of GPU/accelerator chips         3,072                       48,000
CPU memory (TB)                         1,510                       1,404
Interconnect                            3D Torus                    Fat Tree
On-line disk storage (PB)               26                          12(?)
Sustained disk transfer (TB/s)          >1                          ?
Archival storage                        >300                        ?
Sustained tape transfer (GB/s)          >100                        ?

12  HPC in ...
- Exascale systems are likely to have:
  - Extreme power constraints, leading to
    - Clock rates similar to today's systems
    - A wide diversity of simple computing elements (simple for hardware but complex for software)
    - Much smaller memory per core and per FLOP
    - Expensive data movement (in time and power)
  - Faults that will need to be detected and managed
    - Some detection may be the job of the programmer, as hardware detection takes power
  - Extreme scalability and performance irregularity
    - Performance will require enormous concurrency
    - Performance is likely to be variable; simple, static decompositions will not scale
  - A need for latency-tolerant algorithms and programming
    - Memory and processors will be 100s to 10,000s of cycles away; waiting for operations to complete will cripple performance

13  Algorithms and Applications Will Change
- Applications need to become more dynamic and more integrated
- System software must work with the application:
  - Code complexity (autotuning)
  - Dynamic resources (no simple PGAS)
  - Latency hiding (nonblocking algorithms, interfaces, futures); see the sketch below
  - Resource sharing (more performance information, performance asserts, runtime coordination)
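For the latency-hiding bullet above, here is a minimal sketch using an MPI-3 nonblocking collective. The dot-product reduction and the overlapped local work are illustrative assumptions, not code from the talk.

```c
/* Sketch of latency hiding with a nonblocking collective (MPI-3): start the
 * reduction, do independent work while it is in flight, then wait. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    double local_dot = 1.0, global_dot;
    MPI_Request req;

    /* Start the reduction, but do not wait for it yet. */
    MPI_Iallreduce(&local_dot, &global_dot, 1, MPI_DOUBLE, MPI_SUM,
                   MPI_COMM_WORLD, &req);

    /* ... independent local computation goes here while the reduction is in
     * flight, hiding the interconnect latency ... */

    MPI_Wait(&req, MPI_STATUS_IGNORE);   /* result is needed from here on */
    printf("global dot = %f\n", global_dot);

    MPI_Finalize();
    return 0;
}
```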

14  How Do We Make Effective Use of These Systems?
- Better use of our existing systems
  - Blue Waters provides a sustained PF, but that typically requires ~10 PF peak
- Improve node performance
  - Make the compiler better
  - Give better code to the compiler (an example follows below)
  - Get realistic with algorithms/data structures
- Improve parallel performance/scalability
- Improve productivity of applications
  - Better tools and interoperable languages, not a (single) new programming language
- Improve algorithms
  - Optimize for the real issues: data movement, power, resilience, ...
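One common instance of "giving better code to the compiler" is removing aliasing ambiguity so a loop can be vectorized. The sketch below is illustrative only; the function name, sizes, and data are made up, not taken from the talk.

```c
/* Illustrative only: the restrict qualifiers and simple unit-stride loop
 * assert that x and y do not alias, which lets most compilers vectorize
 * the sweep without runtime checks. */
#include <stdio.h>
#include <stdlib.h>

static void axpy(long n, double a,
                 const double * restrict x, double * restrict y) {
    for (long i = 0; i < n; i++)
        y[i] += a * x[i];          /* unit stride, no aliasing: vectorizable */
}

int main(void) {
    const long n = 1000000;
    double *x = malloc(n * sizeof(double));
    double *y = malloc(n * sizeof(double));
    for (long i = 0; i < n; i++) { x[i] = 1.0; y[i] = 2.0; }

    axpy(n, 0.5, x, y);
    printf("y[0] = %f\n", y[0]);   /* expect 2.5 */

    free(x);
    free(y);
    return 0;
}
```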

15  Common Themes
- Multiple operations must be pending at any time
  - Asynchronous I/O, communication, even computation
  - Split computations and communication
- Complex systems require adaptive approaches
  - Autotuning for likely choices, runtime optimization
- Operations must be on aggregates
  - CPU: vectors (GPU: gangs/workers/vectors)
  - I/O: collective, parallel I/O
  - Example: parallel collective I/O for a distributed data structure, such as a mesh distributed across all nodes (a sketch follows below)
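As a concrete, hedged illustration of the collective I/O example above, the sketch below writes a block-distributed array to a single shared file with MPI-IO. The file name "mesh.out" and the block size are made up; the collective *_all call is what lets the MPI implementation aggregate requests across processes.

```c
/* Sketch: collective, parallel I/O for a block-distributed 1D array.
 * Each process writes its block to one shared file with a single
 * collective call, so the library can merge and schedule the requests. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int nlocal = 1 << 20;                 /* elements owned by this rank */
    double *block = malloc(nlocal * sizeof(double));
    for (int i = 0; i < nlocal; i++) block[i] = rank;

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "mesh.out",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Each rank's data lands at its own offset in the shared file. */
    MPI_Offset offset = (MPI_Offset)rank * nlocal * sizeof(double);
    MPI_File_write_at_all(fh, offset, block, nlocal, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(block);
    MPI_Finalize();
    return 0;
}
```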

16  Four Levels of Collective I/O
[Figure: I/O access patterns for Level 0 through Level 3 collective I/O, shown across processes]

17  Distributed Array Access: Write Bandwidth
[Figure: write bandwidth for a 512 x 512 x 512 array (1 GB of data) at 32, 128, and 256 processes; note the log scale]
Thanks to Weikuan Yu, Wei-keng Liao, Bill Loewe, and Anthony Chan for these results.

18  What's Different at Peta/Exascale
- Performance focus
  - Only a little: basically, the resource is expensive, so a premium is placed on making good use of it
  - Quite a bit: the node is more complex and has more features that must be exploited, and the interconnect performs operations
- Scalability
  - Solutions that work at modest scale are often inefficient at 100,000-way parallelism
  - Some algorithms scale well: explicit time marching in 3D
  - Some don't: direct implicit methods
  - Some scale well for a while: FFTs (communication volume in Alltoall)
  - Load balance and latency are critical issues
- Fault tolerance is becoming important
  - Now: reduce time spent in checkpoints
  - Soon: lightweight recovery from transient errors

19  Preparing for the Next Generation of HPC Systems
- Better use of existing resources
  - Performance-oriented programming
  - Dynamic management of resources at all levels
  - Embrace hybrid programming models (you already have if you use SSE/VSX/OpenMP/...)
  - Focus on results
    - Adapt to available network bandwidth and latency
    - Exploit I/O capability (available space grew faster than processor performance!)
- Prepare for the future
  - Fault tolerance
  - Hybrid processor architectures
  - Latency-tolerant algorithms
  - Data-driven systems

20  Research Directions
- Integrated, interoperable, component-oriented languages
  - Generalization of so-called domain-specific languages, which are really data-structure-specific languages
- Performance modeling and tuning
  - Performance information in the language; performance considered as part of correctness
- Fault tolerance at the high end
  - Fault tolerance features in the language, working with hardware and algorithms
- Correctness
  - Correctness features for testing in the language
  - Support for special cases (e.g., provably deterministic expression of deterministic algorithms)

21  Recommended Reading
- Alan Karp, "Bit Reversal on Uniprocessors," SIAM Review, 1996.
- W. K. Anderson, W. D. Gropp, D. K. Kaushik, D. E. Keyes, and B. F. Smith, "Achieving High Sustained Performance in an Unstructured Mesh CFD Application," Proceedings of Supercomputing, 1999.
- Catherine McGeoch, "Experimental Analysis of Algorithms," Notices of the American Mathematical Society, March 2001.
- Sally McKee, "Reflections on the Memory Wall," ACM Conference on Computing Frontiers, 2004.

22  Still open: a once-a-year opportunity to work with high-end networks
- sc13.supercomputing.org/content/scinet-network-research-program

23  Six Questions
1. What is the appropriate balance between HPC needs at the extreme scale (fundamental research that can be done in no other way) and the needs of the long tail (research that needs more than a desktop computer)? How do you support all computing needs for research?
2. Applications have needs and wants, and these may not be the same. For example, applications want to use their existing algorithms and code, but it may not be possible to run those fast enough; the application may need to change approach. How do you get application scientists to separate their needs from their wants?
3. HPC is about performance. How do you support both the research and especially the engineering work needed to make codes efficient? If you don't do this, where do you find the funds to buy the additional computational capacity required to meet the extra demand created by running less-than-optimized codes?

24  Six Questions (cont'd)
4. Data and HPC should be closely connected. Truly big data (much greater than 10 PB) requires significant compute, for example. How do you change the perception that big data and big compute are antagonistic? What big data problems would be best solved on big compute platforms such as Blue Waters?
5. Current computer architecture is reaching its limits. Where are the new architectures? How do you solve the chicken-and-egg problem: do new architectures require demand from new applications, while applications require a well-established, dependable architecture? Should NSF only consume architectures created by others (whether industry or other agencies), or should it have some control of its core computational technology?
6. How do we get past the usual suspects of applications (computational fluid dynamics, n-body problems, etc.)? How do we extend the use of HPC into new areas in science, engineering, and the humanities?

25  Six Questions: The Short Form
1. What is the right balance between HPC and other computing infrastructure?
2. How can we encourage applications to try new approaches?
3. How do we ensure applications make efficient use of infrastructure?
4. What big data problems need HPC?
5. How can we support innovative computer architecture research?
6. How do we bring computing to new areas of science?
