Challenges in High Performance Computing. William Gropp
1. Challenges in High Performance Computing (William Gropp)
2. What is HPC?
- High Performance Computing is the use of computing to solve challenging problems that require significant computing resources:
  - Supercomputers
  - Clusters (large and small)
  - Accelerator-equipped workstations
  - Custom hardware (e.g., Anton)
- Traditionally measured in FLOPS, but the critical resource could be memory, data, bandwidth, or real-time constraints.
3. What Sort of Problems are Solved with HPC?
- Engineering: everything from consumer products to spacecraft
  - Getting Pringles into a can (3-D, time-dependent CFD)
  - Optimizing bottles for minimum weight with required robustness
  - Fuel-efficient aircraft with novel materials
- Insights in science
  - Formation of galaxies; effects of different theories of dark matter and dark energy
  - Weather and climate forecasting
  - Formation (and points of attack) of viruses such as HIV
4. Advancing Science and Engineering
- Advances in a broad range of science and engineering disciplines will be enabled by sustained petascale computers:
  - Molecular Science
  - Weather & Climate Forecasting
  - Astronomy
  - Earth Science
  - Health
5. Messages
- Big is big.
- Data-driven is an important area, but not all data-driven problems are big data (despite current hype). The distinction is important.
  - There are different measures of big, but a TB of data that can be processed by a linear algorithm is not big.
- The key feature of an extreme computing system is a fast interconnect:
  - Low latency, high link bandwidth, high bisection bandwidth
  - Provides fast access to data everywhere in the system, particularly with one-sided access models
  - Think of a map(r1, r2, ...) function that requires more than one record, where the specific input records are unpredictable (e.g., data dependent on a previous result); see the sketch below.
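A minimal sketch of that one-sided access pattern, using MPI RMA (the window layout, record size, and index computation are hypothetical stand-ins): each process exposes its local records in an MPI window, and any process can fetch a data-dependent remote record with MPI_Get without the owner participating.

```c
#include <mpi.h>
#include <stdio.h>

#define RECORDS_PER_RANK 1024   /* hypothetical: fixed-size records per rank */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each rank exposes its slice of the distributed record table. */
    double *local;
    MPI_Win win;
    MPI_Win_allocate(RECORDS_PER_RANK * sizeof(double), sizeof(double),
                     MPI_INFO_NULL, MPI_COMM_WORLD, &local, &win);
    MPI_Win_fence(0, win);
    for (int i = 0; i < RECORDS_PER_RANK; i++)
        local[i] = rank * RECORDS_PER_RANK + i;
    MPI_Win_fence(0, win);   /* initialization now visible to all ranks */

    /* Fetch a record whose location depends on a previous result;
       the owner does not participate (one-sided, passive target). */
    long next = (rank * 31L + 7) % ((long)size * RECORDS_PER_RANK);
    int target = (int)(next / RECORDS_PER_RANK);
    MPI_Aint offset = next % RECORDS_PER_RANK;
    double record;
    MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);
    MPI_Get(&record, 1, MPI_DOUBLE, target, offset, 1, MPI_DOUBLE, win);
    MPI_Win_unlock(target, win);   /* completes the MPI_Get */

    printf("rank %d fetched record %ld = %.0f from rank %d\n",
           rank, next, record, target);

    MPI_Barrier(MPI_COMM_WORLD);   /* all accesses done before freeing */
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```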
6. HPC and Clouds and Big Data
- Clouds can provide some HPC services
  - Especially where each experiment runs on one node; single nodes can do a lot!
- But clouds are very poor at applications that require tightly coordinated computations across tens or hundreds of thousands of nodes
  - The one thing that makes a supercomputer super today is a high-bandwidth, low-latency interconnect, and the software to match
- Big Data also requires big compute
  - Some applications only need independent computation (e.g., clouds)
7. Extrapolation is Risky
- 1989 (T minus 24 years):
  - Intel introduces the 486DX
  - Eugene Brooks writes "Attack of the Killer Micros"
  - 4 years before the TOP500
  - Top systems at about 2 GF peak
- 1999 (T minus 14 years):
  - NVIDIA introduces its GPU (GeForce 256); programming GPUs is still a challenge 14 years later
  - Top system: ASCI Red, 9632 cores, 3.2 TF peak (about 3 GPUs in 2013)
  - MPI is 7 years old
8. HPC Today
- High(est)-end systems
  - 1 PF (10^15 ops/s) achieved on a few peak-friendly applications by 2011; Blue Waters demonstrated > 1 PF end-to-end on a larger set this year
  - Much worry about scalability and how we're going to get to an ExaFLOPS
- Systems are all oversubscribed
  - DOE INCITE awarded almost 900M processor hours in 2009; 1600M-1700M hours in (?); over 5B hours in 2013
  - NSF PRAC awards for Blue Waters similarly competitive
- Widespread use of clusters, many with accelerators; cloud computing services
  - These are transforming the low and midrange
- Laptops (far) more powerful than the supercomputers I used as a graduate student
9. HPC in 2011
- Sustained-PF systems
  - K Computer (Fujitsu) at RIKEN, Kobe, Japan (2011)
  - Sequoia Blue Gene/Q at LLNL
  - NSF Track 1 Blue Waters at Illinois
  - Milky Way-2 (TH-2) in China (apps yet to be shown)
- Still programmed with MPI and MPI+other (e.g., MPI+OpenMP, MPI+OpenCL/CUDA, or MPI+OpenACC); see the hybrid sketch below
  - But in many cases using toolkits, libraries, etc.
  - And not so bad: applications will be able to run when the system is turned on
- Replacing MPI will require some compromise, e.g., domain-specific (higher-level but less general) approaches
  - Lots of evidence that fully automatic solutions won't work
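A minimal MPI+OpenMP sketch of the "MPI+other" style named above (the decomposition and sizes are hypothetical): MPI splits a global sum across ranks, and OpenMP threads parallelize each rank's local loop.

```c
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

#define N_LOCAL 1000000   /* hypothetical per-rank problem size */

static double x[N_LOCAL];

int main(int argc, char **argv) {
    /* Request thread support, since OpenMP threads run inside each rank. */
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (int i = 0; i < N_LOCAL; i++)
        x[i] = 0.5 * (rank + 1);

    /* OpenMP parallelizes the node-local work ... */
    double local = 0.0;
    #pragma omp parallel for reduction(+:local)
    for (int i = 0; i < N_LOCAL; i++)
        local += x[i];

    /* ... and MPI combines the partial results across the machine. */
    double global;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("global sum = %e (%d threads per rank)\n",
               global, omp_get_max_threads());

    MPI_Finalize();
    return 0;
}
```

Build with something like mpicc -fopenmp; the same shape carries over to MPI+CUDA or MPI+OpenACC, with the accelerator taking the place of the OpenMP loop.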
10. Blue Waters and Sequoia Computing Systems

System Attribute                         Blue Waters (NCSA)      Sequoia (LLNL)
Vendors                                  Cray/AMD/NVIDIA         IBM
Processors                               Interlagos/Kepler       PowerPC A2 variant
Total Peak Performance (PF)              ?                       ?
Peak Performance, CPU/GPU (PF)           7.6 / ?                 ? / 0.0
Number of CPU Chips (8, 16 cores/chip)   48,576                  98,304
Number of GPU Chips                      3,072                   0
Amount of CPU Memory (TB)                1,510                   1,572
Interconnect                             3D Torus                5D Torus
Amount of On-line Disk Storage (PB)      26                      50(?)
Sustained Disk Transfer (TB/sec)         >1                      ?
Amount of Archival Storage (PB)          >300                    ?
Sustained Tape Transfer (GB/sec)         >100                    ?
11. Blue Waters and MilkyWay-2 Computing Systems

System Attribute                         Blue Waters (NCSA)      Milky Way-2 (NUDT)
Vendors                                  Cray/AMD/NVIDIA         NUDT/Inspur
Processors                               Interlagos/Kepler       IvyBridge/Phi
Total Peak Performance (PF)              ?                       ?
Peak Performance, CPU/GPU (PF)           7.6 / ?                 ? / 48.1
Number of CPU Chips                      48,576 (8, 16 cores)    32,000 (12 cores)
Number of GPU/Accelerator Chips          3,072                   48,000
Amount of CPU Memory (TB)                1,510                   1,404
Interconnect                             3D Torus                Fat Tree
Amount of On-line Disk Storage (PB)      26                      12(?)
Sustained Disk Transfer (TB/sec)         >1                      ?
Amount of Archival Storage (PB)          >300                    ?
Sustained Tape Transfer (GB/sec)         >100                    ?
12. HPC in the Exascale Era
- Exascale systems are likely to have:
  - Extreme power constraints, leading to:
    - Clock rates similar to today's systems
    - A wide diversity of simple computing elements (simple for hardware but complex for software)
    - Much smaller memory per core and per FLOP
    - Expensive data movement, in both time and power
  - Faults that will need to be detected and managed
    - Some detection may be the job of the programmer, as hardware detection takes power
  - Extreme scalability and performance irregularity
    - Performance will require enormous concurrency
    - Performance is likely to be variable; simple, static decompositions will not scale
  - A need for latency-tolerant algorithms and programming
    - Memory and processors will be 100s to 10,000s of cycles away; waiting for operations to complete will cripple performance
13. Algorithms and Applications Will Change
- Applications need to become more dynamic and more integrated
- System software must work with the application:
  - Code complexity (autotuning)
  - Dynamic resources (no simple PGAS)
  - Latency hiding (nonblocking algorithms, interfaces, futures); see the sketch below
  - Resource sharing (more performance information, performance asserts, runtime coordination)
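A minimal sketch of the latency-hiding item above (the overlapped work is a hypothetical stand-in): MPI-3 nonblocking collectives let a reduction proceed in the background while the rank does independent computation, instead of stalling on an operation that is hundreds of cycles or more away.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = rank + 1.0, global;
    MPI_Request req;

    /* Start the reduction, but do not wait for it ... */
    MPI_Iallreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
                   MPI_COMM_WORLD, &req);

    /* ... overlap it with independent work (hypothetical stand-in). */
    double work = 0.0;
    for (int i = 0; i < 1000000; i++)
        work += 1e-6 * i;

    /* Block only when the reduced value is actually needed. */
    MPI_Wait(&req, MPI_STATUS_IGNORE);
    if (rank == 0)
        printf("sum = %.1f (overlapped work = %.1f)\n", global, work);

    MPI_Finalize();
    return 0;
}
```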
14. How Do We Make Effective Use of These Systems?
- Better use of our existing systems
  - Blue Waters provides a sustained PF, but that typically requires ~10 PF peak
- Improve node performance
  - Make the compiler better
  - Give better code to the compiler (see the sketch after this list)
  - Get realistic with algorithms/data structures
- Improve parallel performance/scalability
- Improve productivity of applications
  - Better tools and interoperable languages, not a (single) new programming language
- Improve algorithms
  - Optimize for the real issues: data movement, power, resilience, ...
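One hedged illustration of "give better code to the compiler" (an example added here, not from the slides): unit-stride loops over restrict-qualified pointers tell the compiler there is no aliasing, so it is free to vectorize.

```c
/* Without restrict, the compiler must assume y might alias x and will
   often refuse to vectorize; with restrict it can emit SIMD code. */
void axpy(int n, double a,
          const double *restrict x, double *restrict y) {
    for (int i = 0; i < n; i++)   /* unit stride: SIMD-friendly */
        y[i] += a * x[i];
}
```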
15. Common Themes
- Multiple operations must be pending at any time
  - Asynchronous I/O, communication, even computation
  - Split computations and communication
- Complex systems require adaptive approaches
  - Autotuning for likely choices, runtime optimization
- Operations must be on aggregates
  - CPU: vectors (GPU: gangs/workers/vectors)
  - I/O: collective, parallel I/O
  - Example: parallel collective I/O for a distributed data structure, e.g., a mesh distributed across all nodes (see the collective-I/O sketch after the figures below)
16. Four Levels of Collective I/O
[Figure: the four levels of I/O access, Level 0 through Level 3, shown across the participating processes.]
17. Distributed Array Access: Write Bandwidth
[Chart, log scale: write bandwidth for a 512 x 512 x 512 array (1 GB of data) at 32, 128, and 256 processes.]
Thanks to Weikuan Yu, Wei-keng Liao, Bill Loewe, and Anthony Chan for these results.
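A minimal sketch of the highest level of collective I/O above (Level 3; the file name, array size, and 1-D decomposition are hypothetical): each rank describes its block of the distributed array with a subarray datatype, sets that as its file view, and all ranks write with one collective call so the MPI-IO layer can aggregate the requests.

```c
#include <mpi.h>
#include <stdlib.h>

#define N 512   /* hypothetical global array edge, decomposed along dim 0 */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* 1-D decomposition: each rank owns N/size rows (assumes size divides N). */
    int nlocal = N / size;
    int gsizes[2] = { N, N }, lsizes[2] = { nlocal, N };
    int starts[2] = { rank * nlocal, 0 };

    MPI_Datatype filetype;
    MPI_Type_create_subarray(2, gsizes, lsizes, starts,
                             MPI_ORDER_C, MPI_DOUBLE, &filetype);
    MPI_Type_commit(&filetype);

    double *slab = malloc((size_t)nlocal * N * sizeof(double));
    for (int i = 0; i < nlocal * N; i++)
        slab[i] = rank;

    /* The file view maps each rank's data to its block of the file; the
       collective _all call lets MPI-IO aggregate all requests (Level 3). */
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "array.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native", MPI_INFO_NULL);
    MPI_File_write_all(fh, slab, nlocal * N, MPI_DOUBLE, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    free(slab);
    MPI_Type_free(&filetype);
    MPI_Finalize();
    return 0;
}
```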
18. What's Different at Peta/Exascale
- Performance focus
  - Only a little: basically, the resource is expensive, so a premium is placed on making good use of it
  - Quite a bit: the node is more complex and has more features that must be exploited; the interconnect performs operations
- Scalability
  - Solutions that work at modest scale are often inefficient at 100,000-way
  - Some algorithms scale well: explicit time marching in 3D
  - Some don't: direct implicit methods
  - Some scale well for a while: FFTs (communication volume in Alltoall)
  - Load balance and latency are critical issues
- Fault tolerance becoming important
  - Now: reduce time spent in checkpoints (see the rule-of-thumb sketch below)
  - Soon: lightweight recovery from transient errors
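On "reduce time spent in checkpoints": a common rule of thumb (Young's approximation; an aside added here, not from the slides) picks the checkpoint interval that balances checkpoint cost against the expected rework after a failure.

```c
#include <math.h>
#include <stdio.h>

/* Young's approximation: t_opt = sqrt(2 * C * M), where C is the time to
   write one checkpoint and M is the system's mean time between failures. */
double optimal_checkpoint_interval(double checkpoint_sec, double mtbf_sec) {
    return sqrt(2.0 * checkpoint_sec * mtbf_sec);
}

int main(void) {
    /* Hypothetical numbers: a 10-minute checkpoint, a 24-hour system MTBF. */
    double t = optimal_checkpoint_interval(600.0, 24.0 * 3600.0);
    printf("checkpoint roughly every %.0f minutes\n", t / 60.0);
    return 0;
}
```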
19. Preparing for the Next Generation of HPC Systems
- Better use of existing resources
  - Performance-oriented programming
  - Dynamic management of resources at all levels
  - Embrace hybrid programming models (you have already if you use SSE/VSX/OpenMP/...)
  - Focus on results
- Adapt to available network bandwidth and latency
- Exploit I/O capability (available space grew faster than processor performance!)
- Prepare for the future
  - Fault tolerance
  - Hybrid processor architectures
  - Latency-tolerant algorithms
  - Data-driven systems
20. Research Directions
- Integrated, interoperable, component-oriented languages
  - Generalization of so-called domain-specific languages; really data-structure-specific languages
- Performance modeling and tuning
  - Performance info in the language; performance considered as part of correctness
- Fault tolerance at the high end
  - Fault-tolerance features in the language, working with hardware and algorithms
- Correctness
  - Correctness features for testing in the language
  - Support for special cases (e.g., provably deterministic expression of deterministic algorithms)
21. Recommended Reading
- "Bit Reversal on Uniprocessors" (Alan Karp, SIAM Review, 1996)
- "Achieving High Sustained Performance in an Unstructured Mesh CFD Application" (W. K. Anderson, W. D. Gropp, D. K. Kaushik, D. E. Keyes, B. F. Smith, Proceedings of Supercomputing, 1999)
- "Experimental Analysis of Algorithms" (Catherine McGeoch, Notices of the American Mathematical Society, March 2001)
- "Reflections on the Memory Wall" (Sally McKee, ACM Conference on Computing Frontiers, 2004)
22. Still open: a once-a-year opportunity to work with high-end networks. sc13.supercomputing.org/content/scinet-network-research-program
23. Six Questions
1. What is the appropriate balance between HPC needs at the extreme scale (fundamental research that can be done in no other way) and the needs of the long tail (research that needs more than a desktop computer)? How do you support all computing needs for research?
2. Applications have needs and wants, and these may not be the same. E.g., applications want to use their existing algorithms and code, but it may not be possible to run those fast enough; the application may need to change approach. How do you get application scientists to separate their needs from their wants?
3. HPC is about performance. How do you support both the research and especially the engineering work needed to make codes efficient? If you don't, where do you find the funds to buy the additional computational capacity required to meet the additional need created by running less-than-optimized codes?
24. Six Questions (cont'd)
4. Data and HPC should be closely connected. Truly big data (much greater than 10 PB) requires significant compute, for example. How do you change the perception that big data and big compute are antagonistic? What big data problems would be best solved on big-compute platforms such as Blue Waters?
5. Current computer architecture is reaching its limits. Where are the new architectures? How do you solve the chicken-and-egg problem: new architectures require demand from new applications, and applications require a well-established, dependable architecture? Should NSF only consume architectures created by others (whether industry or other agencies), or should it have some control of its core computational technology?
6. How do we get past the usual suspects of applications (computational fluid dynamics, n-body problems, etc.)? How do we extend the use of HPC into new areas in science, engineering, and the humanities?
25. Six Questions: The Short Form
1. What is the right balance between HPC and other computing infrastructure?
2. How can we encourage applications to try new approaches?
3. How do we ensure applications make efficient use of infrastructure?
4. What big data problems need HPC?
5. How can we support innovative computer architecture research?
6. How do we bring computing to new areas of science?