HPC as a Driver for Computing Technology and Education


1 HPC as a Driver for Computing Technology and Education
Tarek El-Ghazawi, The George Washington University, Washington D.C., USA

2 NOW - July 2015: The TOP 10 Systems
Columns: Rank | Site | Computer | Cores | Rmax [Pflops] | % of Peak | Power [MW] | MFlops/Watt
(The numeric columns are truncated in this transcription; only rank, site, and computer are listed.)
1. National Super Computer Center in Guangzhou, China: Tianhe-2, NUDT, Xeon 12C 2.2GHz + Intel Xeon Phi (57c) + Custom
2. DOE / OS Oak Ridge Nat Lab, USA: Titan, Cray XK7, AMD (16C) + Nvidia Kepler GPU (14c) + Custom
3. DOE / NNSA L Livermore Nat Lab, USA: Sequoia, BlueGene/Q (16c) + Custom
4. RIKEN Advanced Inst for Comp Sci, Japan: K computer, Fujitsu SPARC64 VIIIfx (8c) + Custom
5. DOE / OS Argonne Nat Lab, USA: Mira, BlueGene/Q (16c) + Custom
6. Swiss CSCS: Piz Daint, Cray XC30, Xeon 8C + Nvidia Kepler (14c) + Custom
7. KAUST, Saudi Arabia: Shaheen II, Cray XC30, Xeon 16C + Custom
8. TACC, USA: Stampede, Dell Intel (8c) + Intel Xeon Phi (61c) + IB
9. Forschungszentrum Juelich (FZJ), Germany: JuQUEEN, BlueGene/Q, Power BQC 16C 1.6GHz + Custom
10. DOE / NNSA LLNL, USA: Vulcan, BlueGene/Q, Power BQC 16C 1.6GHz + Custom
(422) Software Comp, USA: HP Cluster

3 HPC is a Top National Priority!
Executive Order from the White House: Establishment of a National Strategic Computing Initiative (NSCI), 29 July

4 National Strategic Computing Initiative
Five strategic themes of the NSCI:
1) Create systems that can apply exaflops of computing power to exabytes of data
2) Keep the United States at the forefront of HPC capabilities
3) Improve HPC application developer productivity
4) Make HPC readily available
5) Establish hardware technology for future HPC systems

5 Future/Investments - International Exascale HPC Programs
Columns: Country | Funding | Year(s) | Remarks
European Union: 700M, Private-Public Partnership commitment through European Tech Platform for HPC (ETP4HPC); 143.4M in M dedicated FP7 Exascale projects
India: $2B, led by IISc (Indian Institute of Science) and ISRO (Indian Space Research Organization), targeting a 132 ExaFLOP/s machine; $750M, C-DAC (Center for Development of Advanced Computing) to set up 70 supercomputers over 5 years
Japan: $1.38B, Post-K computer to be installed at RIKEN; tentatively based on Extreme SIMD chip PACS-G
China: -, due to the U.S./DoC ban, will use Chinese parts to upgrade the current #1 system
Tarek El-Ghazawi, GWU

6 Why is HPC Important?
Critical for economic competitiveness (highlighted by Minister Daoudi) because of its wide applications (through simulations and intensive data analyses)
Drives computer hardware and software innovations for future conventional computing
Is becoming ubiquitous, i.e. all computing/information technology is turning parallel!
Is that why it is turning into an international HPC muscle-flexing contest?

7 Why is HPC Important? (1) Competitiveness
[Figure: two product development cycles]
Design -> Build -> Test
Design -> Model -> Simulate -> Build

8 Why is HPC Important? Competitiveness: HPC Application Examples
Molecular Dynamics (HIV-1 Protease Inhibitor Drug): simulation for 2 ns takes 2 weeks on a desktop, 6 hours on a supercomputer
Gene Sequence Alignment (Phylogenetic Analysis): 32 days on a desktop, 1.5 hours on a supercomputer
Car Crash Simulations (2 million elements simulation): 4 days on a desktop, 25 minutes on a supercomputer
Understanding Fundamental Structure of Matter: requires a billion billion calculations per second
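The run times quoted on this slide translate directly into speedup factors. The short sketch below does that arithmetic; the dictionary keys and the reading of "2 weeks" as 14 days are assumptions made for illustration, not part of the original slide.

```python
# Desktop vs. supercomputer run times quoted on the slide, converted to hours.
HOURS_PER_DAY = 24

# (desktop_hours, supercomputer_hours); labels are illustrative names only.
examples = {
    "molecular_dynamics_2ns": (14 * HOURS_PER_DAY, 6.0),   # 2 weeks vs. 6 hours
    "phylogenetic_analysis": (32 * HOURS_PER_DAY, 1.5),    # 32 days vs. 1.5 hours
    "car_crash_2M_elements": (4 * HOURS_PER_DAY, 25 / 60), # 4 days vs. 25 minutes
}


def speedup(desktop_hours, super_hours):
    """Ratio of desktop time to supercomputer time."""
    return desktop_hours / super_hours


speedups = {name: speedup(d, s) for name, (d, s) in examples.items()}

for name, value in speedups.items():
    print(f"{name}: about {value:.0f}x faster on the supercomputer")
```

Under these assumptions the three examples come out at roughly 56x, 512x, and 230x.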

9 Why is HPC Important? (2) HPC of Today is Conventional Computing for Tomorrow
The ASCI Red Supercomputer: 9000 chips for 3 TeraFLOPs in 1997
Intel 80 Core Chip: 1 chip and 1 TeraFLOPs in

10 Why is HPC Important? (3) HPC Concepts are becoming Ubiquitous
Sony PS3: uses the Cell processors!
Samsung S6: 8 cores
The Road Runner: was the fastest supercomputer in 08; uses Cell processors!
Tile64: a 64 CPU chip; can be in your future laptop!
HPC is ubiquitous! All computing is becoming HPC. Can we become bystanders?

11 How Did we Get Here - Supercomputers in recent History
Columns: Computer | Processor | # Pr. | Year | Rmax (TFlops)
(The # Pr., Year, and Rmax values are truncated in this transcription.)
Tianhe-2 (MilkyWay-2) [till now]: TH-IVB-FEP Cluster, Intel Xeon E C 2.200GHz, TH Express-2, Intel Xeon Phi 31S1P
Titan: Cray XK7, Opteron 16 Cores, 2.2GHz, Nvidia K20X
K-Computer, Japan: SPARC64 VIIIfx 2.0GHz
Tianhe-1A, China: Intel EM64T Xeon X56xx (Westmere-EP) 2930 MHz (11.72 Gflops) + NVIDIA GPU, FT C
Jaguar, Cray: Cray XT5-HE Opteron Six Core 2.6 GHz
Roadrunner, IBM: PowerXCell 8i 3200 MHz (12.8 GFlops)
BlueGene/L - eServer Blue Gene Solution, IBM (two entries): PowerPC MHz (2.8 GFlops)
BlueGene/L beta-system, IBM: PowerPC MHz (2.8 GFlops)
Earth-Simulator / NEC: NEC 1000 MHz (8 GFlops)
IBM ASCI White, SP (two entries): POWER3 375 MHz (1.5 GFlops)
Intel ASCI Red: Intel IA-32 Pentium Pro 333 MHz (0.333 GFlops)

12 How Did we Get Here - Supercomputers in recent History See:

13 How Did we Get Here - Supercomputers in recent History
[Figure: performance vs. time, from TeraFLOPS to PetaFLOPS. Eras: Vector Machines; Massively Parallel Processors; MPPs with Multicores and Heterogeneous Accelerators. Labels: Discrete, Integrated, HPCC, "End of Moore's Law in Clocking!"]


15 How to Make Progress
Launch a competitive funding cycle or a large national project
Pose a system challenge
  ~33.8 PFLOPS / 17.8 MW provides about 2 GF/Watt
  To get to Exascale (1,000 PFLOPS) using the same total power we need about 56 GF/Watt (10^18 FLOPS / 17.8 MW), roughly a 30x improvement
Pose an application challenge(s)
Let the community compete for government funding with innovative ideas
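A quick way to check the energy-efficiency arithmetic on this slide is to divide performance by power. The sketch below uses the slide's 33.8 PFLOPS and 17.8 MW figures; taking "Exascale" to mean exactly 10^18 FLOPS is an assumption, and the function and variable names are illustrative.

```python
RMAX_FLOPS = 33.8e15   # 33.8 PFLOPS, as quoted on the slide
POWER_WATTS = 17.8e6   # 17.8 MW, as quoted on the slide
EXAFLOPS = 1e18        # assumption: "Exascale" taken as 10^18 FLOPS


def gflops_per_watt(flops, watts):
    """Energy efficiency in GFLOPS per watt."""
    return flops / watts / 1e9


current = gflops_per_watt(RMAX_FLOPS, POWER_WATTS)
needed = gflops_per_watt(EXAFLOPS, POWER_WATTS)

print(f"Current efficiency:          {current:.2f} GF/W")
print(f"Needed for 1 EFLOPS @ 17.8MW: {needed:.2f} GF/W")
print(f"Required improvement:        {needed / current:.1f}x")
```

With a different power budget, only POWER_WATTS in the second call needs to change.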

16 Challenges - The End of Moore's Law
The phenomenon of exponential improvements in processors was observed in 1965 by Intel co-founder Gordon Moore
The speed of a microprocessor doubles every months, assuming the price of the processor stays the same: Wrong, not anymore!
The price of a microchip drops about 48% every months, assuming the same processor speed and on-chip memory capacity: OK, for now
The number of transistors on a microchip doubles every months, assuming the price of the chip stays the same: OK, for now

17 No faster clocking but more Cores? Source: Ed Davis, Intel

18 Accelerators and Dealing with the Moore's Law Challenge Through Parallelism
Columns: Device | Fab. Process (nm) | Freq (GHz) | # Cores | Peak FP Performance: SPFP GFlops, DPFP GFlops | Peak Power (W) | DP Flops/W | Memory BW (GB/s) | Memory type
(The numeric values are truncated in this transcription.)
PowerXCell 8i (memory type: XDR)
Nvidia Kepler K40 (GDDR)
Intel Xeon Phi 7120P (244 threads; GDDR)
Intel Xeon 12-core 2.7 GHz E5-2697v2 (DDR)
AMD Opteron 6370P Interlagos (DDR)
Xilinx XC7VX1140T
Xilinx XCUV
Altera Stratix V GSB

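19 Accelerators/Heterogeneous Computing
Accelerators attached to the microprocessor: FPGAs, Cell, GPUs, Phi
Columns: Application | Speedup | Savings: Cost, Power, Size
(The speedup and cost values are truncated in this transcription.)
DNA Match: power 779x, size 253x
DES Breaker: power 3439x, size 1116x
El-Ghazawi et al., The Promise of HPRCs, IEEE Computer, February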

20 A General Execution Model for Heterogeneous Computers
[Figure: the microprocessor (µP) transfers control and input data to the accelerator (GPU, CELL B.E., FPGA, Clearspeed, or Intel Xeon Phi); the accelerator returns output data and transfers control back.]

21 Challenges for Accelerators
1. Application must lend itself to the rule, and different accelerators suit different types of computations
2. Programmer partitions the code across the CPU and accelerator
3. Programmer co-schedules CPU and accelerator, and ensures good utilization of the expensive accelerator resources
4. Programmer explicitly transfers data between CPU and accelerator
5. Accelerators are fast compared to the link, and the transfer overhead can render the use of the accelerator useless or even harmful
6. Multiple programming paradigms are needed
7. A new accelerator means learning/porting to a new programming interface
8. Changing the ratio of CPUs to accelerators also requires substantial programming unless accelerators are virtualized
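Challenge 5 can be made concrete with a very simple cost model: offloading pays off only if the accelerated compute time plus the transfer time over the link is shorter than running on the CPU. The sketch below is a minimal model under that assumption; the function names and all numbers (20x kernel speedup, 8 GB/s link, data sizes) are hypothetical and chosen only to show both outcomes.

```python
def offload_time(cpu_time, accel_speedup, bytes_moved, link_bandwidth, link_latency=0.0):
    """Time to run a kernel on the accelerator, including data transfer.

    cpu_time       : seconds the kernel takes on the CPU alone
    accel_speedup  : how many times faster the accelerator computes the kernel
    bytes_moved    : total bytes sent to and returned from the accelerator
    link_bandwidth : bytes per second over the CPU-accelerator link
    link_latency   : fixed per-offload overhead in seconds
    """
    compute = cpu_time / accel_speedup
    transfer = link_latency + bytes_moved / link_bandwidth
    return compute + transfer


def effective_speedup(cpu_time, accel_speedup, bytes_moved, link_bandwidth, link_latency=0.0):
    """End-to-end speedup once transfers are counted; below 1.0 means offload hurts."""
    return cpu_time / offload_time(
        cpu_time, accel_speedup, bytes_moved, link_bandwidth, link_latency
    )


# Hypothetical kernel: 1 s on the CPU, 20x faster on the accelerator, 8 GB/s link.
heavy_traffic = effective_speedup(1.0, 20.0, bytes_moved=8e9, link_bandwidth=8e9)
light_traffic = effective_speedup(1.0, 20.0, bytes_moved=0.8e9, link_bandwidth=8e9)

print(f"Moving 8 GB:   {heavy_traffic:.2f}x (offload is slower than the CPU)")
print(f"Moving 0.8 GB: {light_traffic:.2f}x")
```

In this model the same 20x kernel yields a net slowdown when 8 GB must cross the link, and under 7x when only 0.8 GB does, which is why data movement dominates the partitioning decision.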

22 Challenges for Advancing or for Exascale
DoE ASCAC Subcommittee Report, Feb
1. Energy Efficiency
2. Interconnect Technology
3. Memory Technology
4. Scalable System Software
5. Programming Systems
6. Data Management
7. Exascale Algorithms
8. Algorithms for Discovery, Design & Decision
9. Resilience and Correctness
10. Scientific Productivity
Legend: data movement and/or programming related

23 Exascale Technological Challenges
The Power Wall: frequency scaling is no longer possible, power increases rapidly
The Memory Wall: the gap between processor speed and memory speed is widening
The Interconnect Wall: available bandwidth per compute operation is dropping; power needed for data movement is increasing
Programmability Wall, Resilience Wall,

24 The Data Movement Challenge
[Figures: bandwidth density vs. system distance; energy vs. system distance. Source: ASCAC 14]
Locality matters a lot: cost (energy and time) rapidly increases with distance
Locality should be exploited at short distances, and is needed even more at far distances

25 Data Movement and the Hierarchical Locality Challenge

26 Locality is Not Flat Anymore: Chip and System

27 Locality is Not Flat Anymore: Chip and System

28 Locality is Not Flat Anymore: Chip and System

29 Locality is Not Flat in Extreme Scale: Chip and System
[Figure label: Cray XC]

30 Locality in Extreme Scale: Chip and System Perspectives
[Figure labels: Tile64 (chip), Cray XC (system)]

31 What Does that Mean for Programmers
Exploiting Hierarchical Locality
  Machine level and chip level
  Hierarchical Tiled Data Structures
  Hierarchical Locality Exploitation with RTS
  MPI+X
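One way to picture a hierarchical tiled data structure is a matrix multiply whose loops are tiled at two levels, so that work is grouped first by a large tile and then by smaller tiles inside it. The sketch below is only an illustration of that loop structure in plain, single-process Python: mapping the outer tiles to the machine level and the inner tiles to the chip level is an assumption, the tile sizes are arbitrary, and none of this is the runtime system or MPI+X code the slide refers to.

```python
def tiles(start, stop, size):
    """Yield (lo, hi) ranges that cut [start, stop) into tiles of at most `size`."""
    for lo in range(start, stop, size):
        yield lo, min(lo + size, stop)


def tiled_matmul(a, b, outer=4, inner=2):
    """Multiply two square matrices using a two-level tiling.

    Outer tiles stand in for machine-level locality (e.g. one tile per node);
    inner tiles stand in for chip-level locality (e.g. one tile per core).
    """
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i0, i1 in tiles(0, n, outer):                 # outer (machine-level) tiles
        for j0, j1 in tiles(0, n, outer):
            for k0, k1 in tiles(0, n, outer):
                for ii0, ii1 in tiles(i0, i1, inner):  # inner (chip-level) tiles
                    for jj0, jj1 in tiles(j0, j1, inner):
                        for kk0, kk1 in tiles(k0, k1, inner):
                            for i in range(ii0, ii1):
                                for j in range(jj0, jj1):
                                    acc = c[i][j]
                                    for k in range(kk0, kk1):
                                        acc += a[i][k] * b[k][j]
                                    c[i][j] = acc
    return c


def naive_matmul(a, b):
    """Reference triple loop, used only to check the tiled version."""
    n = len(a)
    return [[float(sum(a[i][k] * b[k][j] for k in range(n))) for j in range(n)]
            for i in range(n)]


n = 6
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i * j + 1) for j in range(n)] for i in range(n)]
result = tiled_matmul(a, b)
print("tiled result matches naive:", result == naive_matmul(a, b))
```

The tiling changes only the order in which the same multiply-adds are performed; on real hardware the benefit comes from each tile's data staying close to the unit that works on it.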

32 General Implications
Short-term programming challenge
Golden opportunity for the smart programmer
New hardware advances are needed first, and they will influence software
These may be silicon based or nanotechnologies like carbon nanotube transistors by IBM (9nm), and may keep things the way they are from the software side for a while

33 General Implications - Longer Run
Long-term hardware technology may move toward
  Nano-photonics for computing
  Quantum Computing
Many of the new hardware computing innovations may show up first as discrete accelerators, then as on-chip accelerators, then move closer to the processor's internal circuitry ("data path")

34 Longer term
The bad news: with the limits of silicon approaching, we may see departures from conventional methods of computing, which may dramatically change the way we conceive software
The good news: history has shown that good ideas from the past get resurrected in new ways

35 Conclusions
Graduating an intelligent IT workforce can be a golden egg for countries like Morocco
You can teach skills, but it is imperative to teach and stress concepts in the curriculum
  Stress Parallelism
  Stress Locality
  See the recommendations by IEEE/NSF and SIAM for incorporating parallelism in Computer Science, Computer Engineering, and Computational Science and Engineering curricula, and add locality
For the very long term: there is nothing better than having good foundations in Physics and Math, even for CS and CE majors

36 Conclusions cont.
Integrate teaching soft skills, as President Ouaouicha said
  Communications
  Entrepreneurism and marketing, individually and in groups
  Patenting and legal


HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA STATE OF THE ART 2012 18,688 Tesla K20X GPUs 27 PetaFLOPS FLAGSHIP SCIENTIFIC APPLICATIONS

More information

NVIDIA Update and Directions on GPU Acceleration for Earth System Models

NVIDIA Update and Directions on GPU Acceleration for Earth System Models NVIDIA Update and Directions on GPU Acceleration for Earth System Models Stan Posey, HPC Program Manager, ESM and CFD, NVIDIA, Santa Clara, CA, USA Carl Ponder, PhD, Applications Software Engineer, NVIDIA,

More information

HPC IN EUROPE. Organisation of public HPC resources

HPC IN EUROPE. Organisation of public HPC resources HPC IN EUROPE Organisation of public HPC resources Context Focus on publicly-funded HPC resources provided primarily to enable scientific research and development at European universities and other publicly-funded

More information

Introduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620

Introduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620 Introduction to Parallel and Distributed Computing Linh B. Ngo CPSC 3620 Overview: What is Parallel Computing To be run using multiple processors A problem is broken into discrete parts that can be solved

More information

Technology challenges and trends over the next decade (A look through a 2030 crystal ball) Al Gara Intel Fellow & Chief HPC System Architect

Technology challenges and trends over the next decade (A look through a 2030 crystal ball) Al Gara Intel Fellow & Chief HPC System Architect Technology challenges and trends over the next decade (A look through a 2030 crystal ball) Al Gara Intel Fellow & Chief HPC System Architect Today s Focus Areas For Discussion Will look at various technologies

More information

Roadmapping of HPC interconnects

Roadmapping of HPC interconnects Roadmapping of HPC interconnects MIT Microphotonics Center, Fall Meeting Nov. 21, 2008 Alan Benner, bennera@us.ibm.com Outline Top500 Systems, Nov. 2008 - Review of most recent list & implications on interconnect

More information

PLAN-E Workshop Switzerland. Welcome! September 8, 2016

PLAN-E Workshop Switzerland. Welcome! September 8, 2016 PLAN-E Workshop Switzerland Welcome! September 8, 2016 The Swiss National Supercomputing Centre Driving innovation in computational research in Switzerland Michele De Lorenzi (CSCS) PLAN-E September 8,

More information

Systems Architectures towards Exascale

Systems Architectures towards Exascale Systems Architectures towards Exascale D. Pleiter German-Indian Workshop on HPC Architectures and Applications Pune 29 November 2016 Outline Introduction Exascale computing Technology trends Architectures

More information

CS5222 Advanced Computer Architecture. Lecture 1 Introduction

CS5222 Advanced Computer Architecture. Lecture 1 Introduction CS5222 Advanced Computer Architecture Lecture 1 Introduction Overview Teaching Staff Introduction to Computer Architecture History Future / Trends Significance The course Content Workload Administrative

More information

IBM HPC DIRECTIONS. Dr Don Grice. ECMWF Workshop November, IBM Corporation

IBM HPC DIRECTIONS. Dr Don Grice. ECMWF Workshop November, IBM Corporation IBM HPC DIRECTIONS Dr Don Grice ECMWF Workshop November, 2008 IBM HPC Directions Agenda What Technology Trends Mean to Applications Critical Issues for getting beyond a PF Overview of the Roadrunner Project

More information

Introduction: Modern computer architecture. The stored program computer and its inherent bottlenecks Multi- and manycore chips and nodes

Introduction: Modern computer architecture. The stored program computer and its inherent bottlenecks Multi- and manycore chips and nodes Introduction: Modern computer architecture The stored program computer and its inherent bottlenecks Multi- and manycore chips and nodes Motivation: Multi-Cores where and why Introduction: Moore s law Intel

More information

The Stampede is Coming Welcome to Stampede Introductory Training. Dan Stanzione Texas Advanced Computing Center

The Stampede is Coming Welcome to Stampede Introductory Training. Dan Stanzione Texas Advanced Computing Center The Stampede is Coming Welcome to Stampede Introductory Training Dan Stanzione Texas Advanced Computing Center dan@tacc.utexas.edu Thanks for Coming! Stampede is an exciting new system of incredible power.

More information

Exascale: challenges and opportunities in a power constrained world

Exascale: challenges and opportunities in a power constrained world Exascale: challenges and opportunities in a power constrained world Carlo Cavazzoni c.cavazzoni@cineca.it SuperComputing Applications and Innovation Department CINECA CINECA non profit Consortium, made

More information

8/28/12. CSE 820 Graduate Computer Architecture. Richard Enbody. Dr. Enbody. 1 st Day 2

8/28/12. CSE 820 Graduate Computer Architecture. Richard Enbody. Dr. Enbody. 1 st Day 2 CSE 820 Graduate Computer Architecture Richard Enbody Dr. Enbody 1 st Day 2 1 Why Computer Architecture? Improve coding. Knowledge to make architectural choices. Ability to understand articles about architecture.

More information

Digital Signal Processor Supercomputing

Digital Signal Processor Supercomputing Digital Signal Processor Supercomputing ENCM 515: Individual Report Prepared by Steven Rahn Submitted: November 29, 2013 Abstract: Analyzing the history of supercomputers: how the industry arrived to where

More information

The Uintah Framework: A Unified Heterogeneous Task Scheduling and Runtime System

The Uintah Framework: A Unified Heterogeneous Task Scheduling and Runtime System The Uintah Framework: A Unified Heterogeneous Task Scheduling and Runtime System Alan Humphrey, Qingyu Meng, Martin Berzins Scientific Computing and Imaging Institute & University of Utah I. Uintah Overview

More information

Lecture 1: Gentle Introduction to GPUs

Lecture 1: Gentle Introduction to GPUs CSCI-GA.3033-004 Graphics Processing Units (GPUs): Architecture and Programming Lecture 1: Gentle Introduction to GPUs Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com Who Am I? Mohamed

More information

System Packaging Solution for Future High Performance Computing May 31, 2018 Shunichi Kikuchi Fujitsu Limited

System Packaging Solution for Future High Performance Computing May 31, 2018 Shunichi Kikuchi Fujitsu Limited System Packaging Solution for Future High Performance Computing May 31, 2018 Shunichi Kikuchi Fujitsu Limited 2018 IEEE 68th Electronic Components and Technology Conference San Diego, California May 29

More information

Race to Exascale: Opportunities and Challenges. Avinash Sodani, Ph.D. Chief Architect MIC Processor Intel Corporation

Race to Exascale: Opportunities and Challenges. Avinash Sodani, Ph.D. Chief Architect MIC Processor Intel Corporation Race to Exascale: Opportunities and Challenges Avinash Sodani, Ph.D. Chief Architect MIC Processor Intel Corporation Exascale Goal: 1-ExaFlops (10 18 ) within 20 MW by 2018 1 ZFlops 100 EFlops 10 EFlops

More information

Trends in HPC Architectures and Parallel

Trends in HPC Architectures and Parallel Trends in HPC Architectures and Parallel Programmming Giovanni Erbacci - g.erbacci@cineca.it Supercomputing, Applications & Innovation Department - CINECA CINECA, February 11-15, 2013 Agenda Computational

More information

The Stampede is Coming: A New Petascale Resource for the Open Science Community

The Stampede is Coming: A New Petascale Resource for the Open Science Community The Stampede is Coming: A New Petascale Resource for the Open Science Community Jay Boisseau Texas Advanced Computing Center boisseau@tacc.utexas.edu Stampede: Solicitation US National Science Foundation

More information

End User Update: High-Performance Reconfigurable Computing

End User Update: High-Performance Reconfigurable Computing End User Update: High-Performance Reconfigurable Computing Tarek El-Ghazawi Director, GW Institute for Massively Parallel Applications and Computing Technologies(IMPACT) Co-Director, NSF Center for High-Performance

More information

CS 5803 Introduction to High Performance Computer Architecture: Performance Metrics

CS 5803 Introduction to High Performance Computer Architecture: Performance Metrics CS 5803 Introduction to High Performance Computer Architecture: Performance Metrics A.R. Hurson 323 Computer Science Building, Missouri S&T hurson@mst.edu 1 Instructor: Ali R. Hurson 323 CS Building hurson@mst.edu

More information

Overview. Introduction to Parallel Computing CIS 410/510 Department of Computer and Information Science. Lecture 1 Overview

Overview. Introduction to Parallel Computing CIS 410/510 Department of Computer and Information Science. Lecture 1 Overview Overview Introduction to Parallel Computing CIS 410/510 Department of Computer and Information Science Outline q Course Overview What is CIS 410/510? What is expected of you? What will you learn in CIS

More information

Trends in HPC Architectures and Parallel

Trends in HPC Architectures and Parallel Trends in HPC Architectures and Parallel Programmming Giovanni Erbacci - g.erbacci@cineca.it Supercomputing, Applications & Innovation Department - CINECA Agenda - Computational Sciences - Trends in Parallel

More information

HPC Technology Update Challenges or Chances?

HPC Technology Update Challenges or Chances? HPC Technology Update Challenges or Chances? Swiss Distributed Computing Day Thomas Schoenemeyer, Technology Integration, CSCS 1 Move in Feb-April 2012 1500m2 16 MW Lake-water cooling PUE 1.2 New Datacenter

More information

Accelerating HPC. (Nash) Dr. Avinash Palaniswamy High Performance Computing Data Center Group Marketing

Accelerating HPC. (Nash) Dr. Avinash Palaniswamy High Performance Computing Data Center Group Marketing Accelerating HPC (Nash) Dr. Avinash Palaniswamy High Performance Computing Data Center Group Marketing SAAHPC, Knoxville, July 13, 2010 Legal Disclaimer Intel may make changes to specifications and product

More information

Overview of HPC and Energy Saving on KNL for Some Computations

Overview of HPC and Energy Saving on KNL for Some Computations Overview of HPC and Energy Saving on KNL for Some Computations Jack Dongarra University of Tennessee Oak Ridge National Laboratory University of Manchester 1/2/217 1 Outline Overview of High Performance

More information

Let s say I give you a homework assignment today with 100 problems. Each problem takes 2 hours to solve. The homework is due tomorrow.

Let s say I give you a homework assignment today with 100 problems. Each problem takes 2 hours to solve. The homework is due tomorrow. Let s say I give you a homework assignment today with 100 problems. Each problem takes 2 hours to solve. The homework is due tomorrow. Big problems and Very Big problems in Science How do we live Protein

More information

Technology Trends Presentation For Power Symposium

Technology Trends Presentation For Power Symposium Technology Trends Presentation For Power Symposium 2006 8-23-06 Darryl Solie, Distinguished Engineer, Chief System Architect IBM Systems & Technology Group From Ingenuity to Impact Copyright IBM Corporation

More information

What does Heterogeneity bring?

What does Heterogeneity bring? What does Heterogeneity bring? Ken Koch Scientific Advisor, CCS-DO, LANL LACSI 2006 Conference October 18, 2006 Some Terminology Homogeneous Of the same or similar nature or kind Uniform in structure or

More information

HPC future trends from a science perspective

HPC future trends from a science perspective HPC future trends from a science perspective Simon McIntosh-Smith University of Bristol HPC Research Group simonm@cs.bris.ac.uk 1 Business as usual? We've all got used to new machines being relatively

More information

GPU > CPU. FOR HIGH PERFORMANCE COMPUTING PRESENTATION BY - SADIQ PASHA CHETHANA DILIP

GPU > CPU. FOR HIGH PERFORMANCE COMPUTING PRESENTATION BY - SADIQ PASHA CHETHANA DILIP GPU > CPU. FOR HIGH PERFORMANCE COMPUTING PRESENTATION BY - SADIQ PASHA CHETHANA DILIP INTRODUCTION or With the exponential increase in computational power of todays hardware, the complexity of the problem

More information

Supercomputing with Commodity CPUs: Are Mobile SoCs Ready for HPC?

Supercomputing with Commodity CPUs: Are Mobile SoCs Ready for HPC? Supercomputing with Commodity CPUs: Are Mobile SoCs Ready for HPC? Nikola Rajovic, Paul M. Carpenter, Isaac Gelado, Nikola Puzovic, Alex Ramirez, Mateo Valero SC 13, November 19 th 2013, Denver, CO, USA

More information

A unified Energy Footprint for Simulation Software

A unified Energy Footprint for Simulation Software A unified Energy Footprint for Simulation Software Hartwig Anzt, Armen Beglarian, Suren Chilingaryan, Andrew Ferrone, Vincent Heuveline, Andreas Kopmann Hartwig Anzt September 12, 212 ENGINEERING MATHEMATICS

More information

Fabio AFFINITO.

Fabio AFFINITO. Introduction to High Performance Computing Fabio AFFINITO What is the meaning of High Performance Computing? What does HIGH PERFORMANCE mean??? 1976... Cray-1 supercomputer First commercial successful

More information

CINECA and the European HPC Ecosystem

CINECA and the European HPC Ecosystem CINECA and the European HPC Ecosystem Giovanni Erbacci Supercomputing, Applications and Innovation Department, CINECA g.erbacci@cineca.it Enabling Applications on Intel MIC based Parallel Architectures

More information

D6.1 AllScale Computing Infrastructure

D6.1 AllScale Computing Infrastructure H2020 FETHPC-1-2014 An Exascale Programming, Multi-objective Optimisation and Resilience Management Environment Based on Nested Recursive Parallelism Project Number 671603 D6.1 AllScale Computing Infrastructure

More information

High Performance Computing in Europe and USA: A Comparison

High Performance Computing in Europe and USA: A Comparison High Performance Computing in Europe and USA: A Comparison Hans Werner Meuer University of Mannheim and Prometeus GmbH 2nd European Stochastic Experts Forum Baden-Baden, June 28-29, 2001 Outlook Introduction

More information

Introduction to Parallel Programming for Multicore/Manycore Clusters Introduction

Introduction to Parallel Programming for Multicore/Manycore Clusters Introduction duction to Parallel Programming for Multicore/Manycore Clusters Introduction Kengo Nakajima Information Technology Center The University of Tokyo 2 Motivation for Parallel Computing (and this class) Large-scale

More information