CSE5351: Parallel Processing. Part 1B. UTA Copyright (c) Slide No 1


1 Slide No 1 CSE5351: Parallel Processing, Part 1B

2 Slide No 2 State of the Art in Supercomputing. Several of the following slides (some modified) are courtesy of Dr. Jack Dongarra, a Distinguished Professor of Computer Science at the University of Tennessee.

3 Slide No 3 Look at the Fastest Computers
Strategic importance of supercomputing:
- Essential for scientific discovery
- Critical for national security
- Fundamental contributor to the economy and competitiveness through use in engineering and manufacturing
Supercomputers are the tool for solving the most challenging problems through simulations.

4 Slide No 4 The TOP500 List (H. Meuer, H. Simon, E. Strohmaier, & J. Dongarra)
- Listing of the 500 most powerful computers in the world
- Yardstick: Rmax from LINPACK, solving a dense system Ax = b
- Updated twice a year: SC'xy in the States in November; meeting in Germany in June
- All data available from the TOP500 site (www.top500.org)
(Chart: TPP performance rate plotted against list size.)
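To make the yardstick concrete, here is a minimal sketch of what the LINPACK measurement does: solve a dense system Ax = b, time it, and convert the time into a Gflop/s rate using the standard 2/3 n^3 + 2 n^2 operation count. This is only an illustration; the real HPL benchmark is a tuned, distributed-memory code, and the problem size here is tiny by Top500 standards.

```python
# Sketch of a LINPACK-style measurement: solve dense Ax = b, report Gflop/s.
import time
import numpy as np

n = 4000                                 # problem size (HPL uses far larger n)
A = np.random.rand(n, n)
b = np.random.rand(n)

t0 = time.perf_counter()
x = np.linalg.solve(A, b)                # LU factorization + triangular solves
elapsed = time.perf_counter() - t0

flops = 2.0 / 3.0 * n**3 + 2.0 * n**2    # standard LINPACK operation count
print(f"Achieved {flops / elapsed / 1e9:.1f} Gflop/s (Rmax-style rate)")
print("relative residual:", np.linalg.norm(A @ x - b) / np.linalg.norm(b))
```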

5 Slide No 5 Performance Development of HPC over the Last 23 Years (from the Top500)
(Log-scale plot, from 100 Mflop/s up to 1 Eflop/s, of the SUM, N=1, and N=500 performance over time; marked values include 33.9 PFlop/s and 166 TFlop/s.)
For comparison: My laptop, 70 Gflop/s; my iPhone, 4 Gflop/s.

6 Slide No 6 State of Supercomputing in 2015
- Pflop/s computing fully established, with 67 systems.
- Three technology architecture possibilities, or "swim lanes", are thriving:
  - Commodity (e.g. Intel)
  - Commodity + accelerator (e.g. GPUs) (88 systems)
  - Special-purpose lightweight cores (e.g. IBM BG, Knights Landing)
- Interest in supercomputing is now worldwide and growing in many new markets (over 50% of Top500 computers are in industry).
- Exascale projects exist in many countries and regions.
- Intel processors have the largest share, 86%, followed by AMD at 4%.

7 Slide No 7 July 2015: The TOP 10 Systems
(Original table columns: Rank, Site, Computer, Country, Cores, Rmax [Pflop/s], % of Peak, Power [MW], MFlops/Watt; the numeric values were not preserved in this transcription.)
1. National Super Computer Center in Guangzhou; Tianhe-2, NUDT, Xeon 12C 2.2 GHz + Intel Xeon Phi (57c) + Custom; China
2. DOE / OS, Oak Ridge Nat Lab; Titan, Cray XK7, AMD (16C) + Nvidia Kepler GPU (14c) + Custom; USA
3. DOE / NNSA, Livermore Nat Lab; Sequoia, BlueGene/Q (16c) + Custom; USA
4. RIKEN Advanced Inst for Comp Sci; K computer, Fujitsu SPARC64 VIIIfx (8c) + Custom; Japan
5. DOE / OS, Argonne Nat Lab; Mira, BlueGene/Q (16c) + Custom; USA
6. Swiss CSCS; Piz Daint, Cray XC30, Xeon 8C + Nvidia Kepler (14c) + Custom; Switzerland
7. KAUST; Shaheen II, Cray XC30, Xeon 16C + Custom; Saudi Arabia
8. Texas Advanced Computing Center; Stampede, Dell Intel (8c) + Intel Xeon Phi (61c) + IB; USA
9. Forschungszentrum Juelich (FZJ); JuQUEEN, BlueGene/Q, Power BQC 16C 1.6 GHz + Custom; Germany
10. DOE / NNSA, Livermore Nat Lab; Vulcan, BlueGene/Q, Power BQC 16C 1.6 GHz + Custom; USA
500. (422); Software Comp.; HP Cluster; USA

8 Slide No 8 Seven Top500 Systems in Australia
(Original table columns: Rank, Name, Computer, Site, Manufacturer, Total Cores, Rmax, Rpeak; most numeric values were not preserved in this transcription.)
- C01N: SuperBlade SBI-7127RG-E/SGI ICE X, Intel Xeon E5-2695v2 12C 2.4 GHz, Infiniband FDR, Intel Xeon Phi 7120P; Tulip Trading; Supermicro/SGI
- Magnus: Cray XC40, Xeon E5-2690v3 12C 2.6 GHz, Aries interconnect; Pawsey SC Centre, WA; Cray Inc.
- Fujitsu PRIMERGY CX250 S1, Xeon E5, 2.600 GHz, Infiniband FDR; ANU; Fujitsu
- Avoca (rank 100): BlueGene/Q, Power BQC 16C 1.60 GHz, Custom; Victorian Life Sci Comp Initiative; IBM
- CSIRO GPU Cluster: Nitro G16 3GPU, Xeon E5, 2 GHz, Infiniband FDR, Nvidia K20m; CSIRO; Xenon Systems
- Sukuriputo Okane (rank 406): SGI ICE X, Intel Xeon E5-2695v2 12C 2.4 GHz, Infiniband FDR, NVIDIA 2090; SGI
- Galaxy (rank 408): Cray XC30, Intel Xeon E5-2692v2 10C 3.000 GHz, Aries interconnect; Pawsey SC Centre, WA; Cray Inc.

9 Slide No 9 Accelerators (53 systems)
By accelerator type: Intel MIC (13), Clearspeed CSX600 (0), ATI GPU (2), IBM PowerXCell 8i (0), NVIDIA 2070 (4), NVIDIA 2050 (7), NVIDIA 2090 (11), NVIDIA K20 (16).
By country: 19 US, 9 China, 6 Japan, 4 Russia, 2 France, 2 Germany, 2 India, 1 Italy, 1 Poland, 1 Australia, 2 Brazil, 1 Saudi Arabia, 1 South Korea, 1 Spain, 2 Switzerland, 1 UK.

10 Slide No 10 Processors / Systems
(Pie chart of processor families across Top500 systems: Intel SandyBridge, Intel Nehalem, AMD x86_64, PowerPC, Power, Intel Core, Sparc, and Others, with shares ranging from 55% down to 1%.)

11 Slide No 11 Vendors / System Share
HP 196 (39%), IBM 164 (33%), Cray Inc. 48 (9%), SGI 17 (3%), Bull 14 (3%), Fujitsu 8 (2%), Dell 8 (2%), NUDT 4 (1%), Hitachi 4 (1%), NEC 4 (1%), Others 33 (6%).

12 Slide No 12 Countries Share (absolute counts)
US: 267, China: 63, Japan: 28, UK: 23, France: 22, Germany: ...

13 Slide No 13 Performance Development in Top500
(Log-scale plot of N=1 and N=500 performance over time, from Gflop/s levels extrapolated toward 1 Eflop/s.)

14 Slide No 14 Today's #1 System vs. an Exascale System
Metric | Tianhe-2 (today) | Exa system | Difference (Today vs. Exa)
System peak | 55 Pflop/s | 1 Eflop/s | ~20x
Power | 18 MW (3 Gflops/W) | ~20 MW (50 Gflops/W) | O(1), ~15x in Gflops/W
System memory | 1.4 PB (1.024 PB CPU + ... PB CoP) | ... PB | ~50x
Node performance | 3.43 TF/s (0.4 CPU + 3 CoP) | 1.2 or 15 TF/s | O(1)
Node concurrency | 24 cores CPU + ... cores CoP | O(1k) or 10k | ~5x to ~50x
Node interconnect BW | 6.36 GB/s | ... GB/s | ~40x
System size (nodes) | 16,000 | O(100,000) or O(1M) | ~6x to ~60x
Total concurrency | 3.12 M (12.48 M threads, 4/core) | O(billion) | ~100x
MTTF | Few / day | O(<1 day) | O(?)

15 Slide No 15 Exascale System Architecture with a cap of $200M and 20 MW
Metric | Tianhe-2 (today) | Exa system | Difference (Today vs. Exa)
System peak | 55 Pflop/s | 1 Eflop/s | ~20x
Power | 18 MW (3 Gflops/W) | ~20 MW (50 Gflops/W) | O(1), ~15x in Gflops/W
System memory | 1.4 PB (1.024 PB CPU + ... PB CoP) | ... PB | ~50x
Node performance | 3.43 TF/s (0.4 CPU + 3 CoP) | 1.2 or 15 TF/s | O(1)
Node concurrency | 24 cores CPU + ... cores CoP | O(1k) or 10k | ~5x to ~50x
Node interconnect BW | 6.36 GB/s | ... GB/s | ~40x
System size (nodes) | 16,000 | O(100,000) or O(1M) | ~6x to ~60x
Total concurrency | 3.12 M (12.48 M threads, 4/core) | O(billion) | ~100x
MTTF | Few / day | Many / day | O(?)
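The power rows in these two tables are simple arithmetic; a quick check of the Gflops/W figures, using only the numbers quoted on the slides:

```python
# Check the Gflops/W figures quoted on the slides.
def gflops_per_watt(perf_flops, power_watts):
    return perf_flops / power_watts / 1e9

tianhe2 = gflops_per_watt(55e15, 18e6)   # ~3 Gflops/W today
exa     = gflops_per_watt(1e18, 20e6)    # 50 Gflops/W under the 20 MW cap
print(f"Tianhe-2: {tianhe2:.1f} Gflops/W, Exascale target: {exa:.0f} Gflops/W, "
      f"required improvement: ~{exa / tianhe2:.0f}x")
```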

16 Slide No 16 ORNL's Titan Hybrid System: Cray XK7 with AMD Opteron and NVIDIA Tesla processors
Footprint: 4,352 ft² (about 404 m²)
SYSTEM SPECIFICATIONS:
- Peak performance of 27 PF (24.5 Pflop/s from the GPUs, the rest from the AMD CPUs)
- 18,688 compute nodes, each with a 16-core AMD Opteron CPU, an NVIDIA Tesla K20x GPU, and ... GB memory
- 512 service and I/O nodes
- 200 cabinets
- 710 TB total system memory
- Cray Gemini 3D torus interconnect
- 9 MW peak power

17 Slide No 17 Summary
Major challenges are ahead for extreme computing:
- Parallelism
- Hybrid architectures
- Fault tolerance
- Power
- and many others not discussed here
We will need completely new approaches and technologies to reach the Exascale level.

18 Slide No 19 To be published in the January 2011 issue of The International Journal of High Performance Computing Applications.
"We can only see a short distance ahead, but we can see plenty there that needs to be done." Alan Turing (1912-1954)

19 Slide No 20 Technology Trends: Microprocessor Capacity
Gordon Moore (co-founder of Intel), Electronics Magazine, 1965: the number of devices per chip doubles every 18 months.
- 2X transistors/chip every 1.5 years: called "Moore's Law".
- Microprocessors have become smaller, denser, and more powerful.
- Not just processors: bandwidth, storage, etc. also improve, roughly 2X memory and processor speed, and 1/2 the size, cost, and power, every 18 months.
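As a quick illustration of what a 1.5-year doubling period implies, a small sketch; the time spans chosen below are arbitrary examples, not figures from the slide:

```python
# Compounding implied by "2x every 1.5 years": growth factor over a span of years.
def moore_factor(years, doubling_period=1.5):
    return 2 ** (years / doubling_period)

for span in (6, 10, 23):
    print(f"{span:>2} years -> ~{moore_factor(span):,.0f}x more transistors per chip")
```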

20 Slide No 21 Moore's Law is Alive and Well
(Log-scale plot of transistors per chip, in thousands, over time.)
Data from Kunle Olukotun, Lance Hammond, Herb Sutter, Burton Smith, Chris Batten, and Krste Asanović. Slide from Kathy Yelick.

21 Slide No 22 But Clock Frequency Scaling Has Been Replaced by Scaling Cores / Chip
(Log-scale plot of transistors (in thousands), frequency (MHz), and cores per chip over time.)
15 years of exponential growth (~2x/year) has ended.
Data from Kunle Olukotun, Lance Hammond, Herb Sutter, Burton Smith, Chris Batten, and Krste Asanović. Slide from Kathy Yelick.

22 Slide No 23 Performance Has Also Slowed, Along with Power
(Log-scale plot of transistors (in thousands), frequency (MHz), power (W), and cores per chip over time.)
Power is the root cause of all this: a hardware issue just became a software problem.
Data from Kunle Olukotun, Lance Hammond, Herb Sutter, Burton Smith, Chris Batten, and Krste Asanović. Slide from Kathy Yelick.

23 Slide No 24 Power Cost of Frequency
Power ∝ Voltage² × Frequency (V²F)
Frequency ∝ Voltage, hence Power ∝ Frequency³

24 Slide No 25 Power Cost of Frequency
Power ∝ Voltage² × Frequency (V²F)
Frequency ∝ Voltage, hence Power ∝ Frequency³
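A small numeric sketch of what this cube law implies, under the idealized assumptions that power scales with frequency cubed and that the workload parallelizes perfectly; this is the classic argument for trading clock rate for cores:

```python
# Idealized consequence of Power ~ Frequency^3 (voltage scaled with frequency).
# Compare one core running 20% faster vs. two cores each running 20% slower.
base_power, base_perf = 1.0, 1.0

# Single core overclocked by 20%
oc_perf  = 1.20 * base_perf
oc_power = base_power * 1.20 ** 3          # ~1.73x the power for 1.2x the work

# Two cores, each underclocked by 20% (assumes perfectly parallel work)
mc_perf  = 2 * 0.80 * base_perf            # 1.6x the work
mc_power = 2 * base_power * 0.80 ** 3      # ~1.02x the power

print(f"1 core  @ +20%: perf {oc_perf:.2f}x, power {oc_power:.2f}x")
print(f"2 cores @ -20%: perf {mc_perf:.2f}x, power {mc_power:.2f}x")
```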

25 Slide No 26 Looking at the Gordon Bell Prize
(Recognizes outstanding achievement in high-performance computing applications and encourages development of parallel processing.)
- 1 GFlop/s; 1988; Cray Y-MP; 8 processors. Static finite element analysis.
- 1 TFlop/s; 1998; Cray T3E; 1024 processors. Modeling of metallic magnet atoms, using a variation of the locally self-consistent multiple scattering method.
- 1 PFlop/s; 2008; Cray XT5; 1.5x10^5 processors. Superconductive materials.
- 1 EFlop/s; ~2018; ?; 1x10^7 processors (10^9 threads).

26 Slide No 28 Hardware and System Software Scalability
Barriers:
- Fundamental assumptions of system software architecture did not anticipate exponential growth in parallelism.
- The number of components and the MTBF change the game.
Technical focus areas: system hardware scalability, system software scalability, applications scalability.
Technical gap: 1000x improvement in system software scaling; 100x improvement in system software reliability.
(Bar chart: average number of cores per supercomputer, Top20 of the Top500, on a scale of 0 to 100,000.)
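Why the number of components "changes the game": a minimal sketch of how system MTBF shrinks with component count, assuming independent components with exponentially distributed failure times; the 5-year per-component MTBF is an illustrative assumption, not a figure from the slide:

```python
# System MTBF shrinks roughly as 1/N for N independent components.
component_mtbf_hours = 5 * 365 * 24        # ~43,800 h per component (assumed)

for n_components in (1_000, 10_000, 100_000, 1_000_000):
    system_mtbf = component_mtbf_hours / n_components
    print(f"{n_components:>9,} components -> system MTBF ~ {system_mtbf:8.2f} hours")
```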

27 Slide No 31 Commodity plus Accelerator (today in 88 of the Top500 systems)
Commodity CPU: Intel Xeon, 8 cores, 3 GHz, 8 x 4 ops/cycle = 96 Gflop/s (DP).
Accelerator (GPU): Nvidia K20X "Kepler", 2688 CUDA cores (192 CUDA cores per SMX), 0.732 GHz, 2688 x 2/3 ops/cycle = 1.31 Tflop/s (DP), 6 GB memory.
Interconnect: PCI-e Gen2/3, 16 lanes, 64 Gb/s (8 GB/s), i.e. about 1 GW/s (gigawords per second).
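The peak rates quoted above follow from the usual cores x ops/cycle x clock product; a quick check using the slide's numbers:

```python
# Peak double-precision rates from the slide: cores * ops_per_cycle * clock (GHz) = Gflop/s.
def peak_gflops(cores, ops_per_cycle, clock_ghz):
    return cores * ops_per_cycle * clock_ghz

xeon = peak_gflops(8, 4, 3.0)                   # 8-core Xeon, 4 DP ops/cycle/core
k20x = peak_gflops(2688, 2.0 / 3.0, 0.732)      # K20X: 2688 CUDA cores, 2/3 DP op/cycle each
print(f"Xeon: {xeon:.0f} Gflop/s, K20X: {k20x / 1000:.2f} Tflop/s, ratio ~{k20x / xeon:.1f}x")
```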

28 Slide No 32 Recent Developments
- US DOE planning to deploy O(100) Pflop/s systems for $525M (hardware):
  - Oak Ridge Lab and Lawrence Livermore Lab to receive IBM- and Nvidia-based systems.
  - Argonne Lab to receive an Intel-based system.
  - After this: Exaflops.
- US Dept of Commerce is preventing some China groups from receiving Intel technology, citing concerns about nuclear research being done with the systems (February). On the blockade list:
  - National SC Center Guangzhou, site of Tianhe-2
  - National SC Center Tianjin, site of Tianhe-1A
  - National University for Defense Technology, developer
  - National SC Center Changsha, location of NUDT
- For the first time, fewer than 50% of Top500 systems are in the U.S.: 231 of the systems are U.S.-based; China is #3 with 37.

29 Slide No 33 Today's Multicores (all of the Top500 systems are based on multicore)
- Intel Haswell (18 cores)
- Intel Xeon Phi (60 cores)
- IBM Power 8 (12 cores)
- AMD Interlagos (16 cores)
- Nvidia Kepler (2688 CUDA cores)
- IBM BG/Q (18 cores)
- Fujitsu Venus (16 cores)
- ShenWei (16 cores)

30 Slide No 34 Problem with Processors
As we put more processing power on the multicore chip, one of the problems is getting the data to the cores. The next generation will be more integrated: 3D designs with a photonic network.

31 Slide No 35 Peak Performance per Core (we are here)
Floating point operations per cycle per core. Most recent computers have FMA (fused multiply-add), i.e. x = x + y*z in one cycle.
- Intel Xeon (earlier models) and AMD Opteron, SSE2: 2 flops/cycle DP & 4 flops/cycle SP
- Intel Xeon Nehalem ('09) & Westmere ('10), SSE4: 4 flops/cycle DP & 8 flops/cycle SP
- Intel Xeon Sandy Bridge ('11) & Ivy Bridge ('12), AVX: 8 flops/cycle DP & 16 flops/cycle SP
- Intel Xeon Haswell ('13) & Broadwell ('14), AVX2: 16 flops/cycle DP & 32 flops/cycle SP
- Xeon Phi (per core): 16 flops/cycle DP & 32 flops/cycle SP
- Intel Xeon Skylake ('15): 32 flops/cycle DP & 64 flops/cycle SP
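These per-cycle figures compose as SIMD lanes x (2 for FMA) x number of vector units per core; a small sketch of that composition (the 2.5 GHz clock and the per-core unit counts below are illustrative assumptions, not taken from the slide):

```python
# flops/cycle (DP) = SIMD lanes * (2 if FMA else 1) * vector units per core
def flops_per_cycle(simd_bits, fma=True, units=1, dp_bytes=8):
    lanes = simd_bits // (dp_bytes * 8)
    return lanes * (2 if fma else 1) * units

examples = {
    "SSE2, no FMA, 1 unit":     flops_per_cycle(128, fma=False),            # 2
    "AVX, no FMA, 2 units":     flops_per_cycle(256, fma=False, units=2),   # 8
    "AVX2 + FMA, 2 units":      flops_per_cycle(256, fma=True,  units=2),   # 16
    "AVX-512 + FMA, 2 units":   flops_per_cycle(512, fma=True,  units=2),   # 32
}
clock_ghz = 2.5   # illustrative clock; peak Gflop/s per core = flops/cycle * clock
for name, fpc in examples.items():
    print(f"{name:24s}: {fpc:2d} flops/cycle -> {fpc * clock_ghz:.0f} Gflop/s per core @ {clock_ghz} GHz")
```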

32 Slide No 36 Memory Transfer (It's All About Data Movement)
Example on my laptop, with one level of memory (omitting latency here):
- CPU: Intel Core i7-4850HQ (Haswell), 2.3 GHz (Turbo Boost 3.5 GHz), 56 GFLOP/s per core x 2 cores
- Cache: 6 MB
- Main memory: 8 GB, 25.6 GB/s
The model is simplified (see next slide), but it provides an upper bound on performance: we will never go faster than what the model predicts. (And, of course, we can go slower.)
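A minimal sketch of the upper bound this one-level memory model gives, using the 56 GFLOP/s-per-core and 25.6 GB/s figures from the slide; the streaming kernel and its flop/byte counts below are illustrative assumptions, not from the slide:

```python
# Simple "speed limit" model: time >= max(flops / peak, bytes_moved / bandwidth).
peak_flops = 56e9 * 2          # 56 GFLOP/s per core x 2 cores
bandwidth  = 25.6e9            # bytes/s from main memory

def upper_bound_gflops(flops, bytes_moved):
    t = max(flops / peak_flops, bytes_moved / bandwidth)   # whichever limit binds
    return flops / t / 1e9

# daxpy-like kernel: y = a*x + y over n doubles -> 2n flops, 3*8n bytes moved
n = 10_000_000
print(f"streaming kernel bound: {upper_bound_gflops(2 * n, 24 * n):.2f} Gflop/s "
      f"(vs. {peak_flops / 1e9:.0f} Gflop/s peak)")
```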
