Supercomputing Applications in CAS


1 Supercomputing Applications in CAS. Xue-bin Chi, Supercomputing Center, CNIC, CAS. April 13-14, 2011, Helsinki, Finland. This work was contributed by all our staff.

2 Outline: Overview of the HPC history of China; Brief introduction of SCCAS; Grid environment overview; Applications in science and engineering; Future plan for the high-performance environment

3 Overview of the HPC History of China: 1. Started in the late 1980s. 2. Developed in the 1990s.

4 Parallel Computers ( -1994)
BJ 01 ( ): 4 processors with global and local memory, made by ICT, CAS
Transputers ( ): several parallel systems; our group had a 17-transputer system
KJ 8950 ( ): 16 processors with global and local memory, made by ICT, CAS
Dawning 1000 parallel computer appeared ( ): peak performance 2.5 GFLOPS in single precision, HPL 50%; similar to the Paragon machine; NX parallel implementation environment; Intel i860

5 National High Performance Computing Centers ( )
National High Performance Computing Center (Beijing): Institute of Computing Technology, CAS
National High Performance Computing Center (Wuhan): Huazhong University of Science and Technology
National High Performance Computing Center (Chengdu): Southwest Jiaotong University
National High Performance Computing Center (Hefei): University of Science and Technology of China
Supercomputing Center of the Chinese Academy of Sciences (SCCAS): 1996
Shanghai Supercomputing Center (SSC): 2001

6 Clusters ( -2005)
Lenovo DeepComp 6800 (SCCAS): 2003; TOP500 rank 14th; peak performance 5.2 TFLOPS; Itanium II, Linux OS, MPI, QsNet; budget 60M RMB (MOST 15M, Lenovo 15M, CAS 30M)
Dawning 4000 (SSC): 2004; TOP500 rank 10th; peak performance 11 TFLOPS; AMD Opteron, Linux OS, MPI, Myrinet 2000; budget 60M RMB (MOST 30M, Shanghai Government 30M)
Many other clusters were installed during that period

7 100 TFLOPS HPCs ( -2010)
Lenovo DeepComp ( TFLOPS): our center, Dec. 2008; budget 180M RMB (MOST 80M, Lenovo 25M, CAS 75M)
Dawning 5000: peak performance 200 TFLOPS, AMD processors; Aug. 2009, at Shanghai Supercomputing Center; TOP500 rank 10th in June 2009; budget 200M RMB (MOST 100M, Shanghai Government 100M)
Software and environment: 100M

8 PetaFLOPS HPCs
Dawning Nebulae: 600 TFLOPS CPU + 3 PFLOPS GPU; TOP500 rank 2nd; budget 600M RMB (MOST 200M, Shenzhen Government 400M); upgrade: 1 PFLOPS CPU, 200 TFLOPS Godson
NUDT Tianhe: 200 TFLOPS CPU + 1.2 PFLOPS GPU; TOP500 rank 7th; budget 600M RMB (MOST 200M, Tianjin Government 400M); upgrade: 1 PFLOPS CPU + 4 PFLOPS GPU, HPL 2.5 PFLOPS

9 Brief Introduction of SCCAS: 1. Providing computing power and technical support for scientists. 2. Training in HPC algorithms and programming.

10 SCCAS Computing Resources
During ( ) (9th five-year plan):
1996: SGI Power Challenge XL, 6.4 GFLOPS, 16 CPUs
1998: Hitachi SR ( GFLOPS), 32 CPUs
2000: Dawning 2000 II, 111.7 GFLOPS, 164 CPUs

11 SCCAS Computing Resources
During ( ) (10th five-year plan):
2003: Lenovo DeepComp 6800, 5 TFLOPS, 1024 CPUs; TOP500: No. 14; China TOP100: No. 1
During 2006-now (11th five-year plan):
2008: Lenovo DeepComp ( TFLOPS), 13,000 cores; TOP500: No. 19; China TOP100: No. 2; 3 kinds of nodes: Altix 4700, IBM 3950, IBM blades
In ( ): 300 TFLOPS GPU

12 SCCAS Services: Users. The number of SCCAS users has increased massively since 1996 (from 20 to more than 310).

13 SCCAS R&D of Algorithms and Software: parallel AMR (Adaptive Mesh Refinement) method; parallel eigenvalue problem; parallel Fast Multipole Method; parallel computing model; Gridmol; ScGrid middleware; PSEPS; FMM radar; porting of many open-source software packages

14 Grid Environment Overview: 1. China National Grid (CNGrid). 2. China Scientific Computing Grid (ScGrid).

15 China National Grid

16 China National Grid
CNGrid operation center: Supercomputing Center of CNIC, CAS
Northern major node: Supercomputing Center of CNIC, CAS
Southern major node: Shanghai Supercomputing Center
Normal nodes (9 sites): Tsinghua University; Beijing Institute of Applied Physics and Computational Mathematics; Shandong University; University of Science and Technology of China; Huazhong University of Science and Technology; Shenzhen Institute of Advanced Technology of CAS; Hong Kong University; Xi'an Jiaotong University; Gansu Province Supercomputing Center

17 China Scientific Computing Grid (ScGrid): 3-level grid. Properties (CAS ARUMS): Complementarity, Autonomic ability, Scalability; Availability, Realizability, Usability, Manageability, Stability

18 GPU Clusters within CAS
Site | Vendor | Rpeak (TFLOPS)
Institute of Electrical Engineering | Lenovo | 112
Shenzhen Institutes of Advanced Technology | Lenovo | 200
USTC | Lenovo | 183
CNIC | Lenovo, Dawning | 300
National Astronomical Observatories | Lenovo | 158
Institute of Geology and Geophysics | Lenovo, Dawning | 200
Institute of Modern Physics | Lenovo | ( )
Institute of High Energy Physics | Dawning | ( )
Institute of Metal Research | Dawning | 183
Purple Mountain Observatory | Dawning | 180
Sum | | ( ) PFLOPS

19 [Architecture diagram: Windows/Linux clients (users, administrator) and web portal; SCE middleware; HPC, cluster, workstation, storage]

20 Applications in Science and Engineering: 1. Scientific research. 2. Industrial computing.

21 Prediction of Sandstorms: real-time sandstorm prediction system for the China Meteorological Administration. DeepComp ( ) CPUs: run time reduced from 15 hours to 8 minutes.
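The reduction from 15 hours to 8 minutes can be expressed as a speedup factor. The following is a minimal sketch of that arithmetic; the function name `speedup` is hypothetical, and only the two durations come from the slide.

```python
def speedup(time_before_s: float, time_after_s: float) -> float:
    """Return how many times faster the second run is than the first."""
    if time_before_s <= 0 or time_after_s <= 0:
        raise ValueError("durations must be positive")
    return time_before_s / time_after_s


# Durations quoted on the slide, converted to seconds.
before = 15 * 3600  # 15 hours
after = 8 * 60      # 8 minutes

print(f"speedup: {speedup(before, after):.1f}x")  # prints "speedup: 112.5x"
```

The slide does not say how many CPUs were used before and after, so this factor describes elapsed time only and says nothing about parallel efficiency.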

22 Climate and Atmosphere Models: CUACE (CMA Unified Atmospheric Chemistry Environment) and RIEMS (Regional Integrated Environment Modeling System). [Plots: run time and parallel efficiency at 54 km and 108 km resolution; MPI vs. MPI+OpenMP]

23 The surface temperature of the joint area of Asia and the Indian-Pacific Ocean

24 Computational Fluid Dynamics [Plot: speedup vs. number of CPUs]

25 Galactic Wind Simulation: 8192 computational cores; internationally advanced in both computational scale and speed

26 Galactic Wind Simulation on Tianhe-1A [Plot: measured vs. linear speedup, with parallel efficiency]

27 Simulation of the Evolution of the Universe: 2048 cores, 13 days

28 Space Weather Prediction Research: shock wave and magnetic cloud interaction; MHD computing with a parallel AMR method (512 cores)

29 Drug Screening for Avian Influenza: DeepComp 7000, 2400 CPU cores; time consumed decreased from 2 months to 8 hours; the compounds screened out by this computation are forwarded to the relevant department for further biological testing

30 Simulation: Side-plate Formation in Ti Alloys
Phase-field model: time-dependent Ginzburg-Landau (TDGL) equation; Cahn-Hilliard equation; anisotropic interfacial energy
Numerical method: time domain, finite difference, operator splitting
Grid size: 1024*1024*1024; number of DOFs: 4.295*10^9
Run on 4,096 cores; parallel efficiency 94%; 3 simulations, 462,828 core hours
Collaborators: Ms. Mei Yang, Dr. Hao Wang, Dr. Gang Wang, Prof. Dongsheng Xu (Institute of Metal Research, CAS, Shenyang)
[Figure: side-plate formation visualization (512^3)]
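The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above. The reading that the DOF count corresponds to a whole number of unknowns per grid point, and the assumption that the three simulations were of equal length and each used all 4,096 cores, are not stated in the source.

```python
GRID_EDGE = 1024            # grid is 1024 * 1024 * 1024 (from the slide)
REPORTED_DOFS = 4.295e9     # number of DOFs (from the slide)
CORES = 4096                # cores used (from the slide)
TOTAL_CORE_HOURS = 462_828  # total for all simulations (from the slide)
SIMULATIONS = 3             # number of simulations (from the slide)

grid_points = GRID_EDGE ** 3

# Assumption: DOFs = unknowns per grid point * grid points.
unknowns_per_point = round(REPORTED_DOFS / grid_points)

# Assumption: equal-length runs, each on all 4,096 cores.
core_hours_per_run = TOTAL_CORE_HOURS / SIMULATIONS
wall_hours_per_run = core_hours_per_run / CORES

print(f"grid points:         {grid_points:,}")
print(f"unknowns per point:  {unknowns_per_point}")
print(f"wall hours per run:  {wall_hours_per_run:.1f}")
```

Under those assumptions the reported DOF count matches four unknowns per grid point, and each run takes roughly a day and a half of wall-clock time.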

31 Side-plate Formation in Ti Alloys Test (Tianhe-1A) [Plot: measured vs. linear speedup, with parallel efficiency]

32 Parallel Software HPSEPS: eigenvalue problem parallel solvers for sparse and dense symmetric matrices; SVD parallel solver; LSQR parallel solver. Eigensolver for dense matrices with 8192 cores on Tianhe-1A (N=40000). [Table: number of cores, time (s), speedup, efficiency; efficiency values 100%, 73%, 56%, 39%, 25%]

33 Hypre on Tianhe-1A: convection-reaction-diffusion equation div(K grad u + B u) + C u = F; SMG-preconditioned GMRES solver; N = 1024x1024, Np = k x k. Taking the efficiency on 1024 cores as 100%, the efficiency on 8100 cores is 51%.
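Efficiency here is measured relative to a 1024-core baseline rather than a single core. A minimal sketch of how such a relative efficiency is typically computed follows. The function name `relative_efficiency` and the run times are hypothetical, since the slide gives only the resulting percentage and does not say whether the test was a strong-scaling or weak-scaling one, so both conventions are shown.

```python
def relative_efficiency(t_base: float, p_base: int,
                        t: float, p: int,
                        weak_scaling: bool = False) -> float:
    """Parallel efficiency on p cores relative to a p_base-core baseline.

    Strong scaling (fixed total problem): E = (t_base * p_base) / (t * p).
    Weak scaling (fixed work per core):   E = t_base / t.
    """
    if min(t_base, t) <= 0 or min(p_base, p) <= 0:
        raise ValueError("times and core counts must be positive")
    if weak_scaling:
        return t_base / t
    return (t_base * p_base) / (t * p)


# Hypothetical timings, chosen only to illustrate the formula.
t_1024 = 100.0  # seconds on 1024 cores (assumed)
t_8100 = 24.8   # seconds on 8100 cores (assumed)

e = relative_efficiency(t_1024, 1024, t_8100, 8100)
print(f"strong-scaling efficiency on 8100 cores: {e:.0%}")
```

Both core counts are perfect squares (32 x 32 and 90 x 90), which fits the Np = k x k process grid named on the slide.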

34 P3DFFT on Tianhe-1A: parallel 3D FFT based on FFTW 3.2.1, but MPI_Alltoall is avoided. Problem sizes: 1024x1024x x1024x2048. [Table: #Procs, T(P3DFFT), efficiency]

35 SC PEtot: First Large-scale Planewave GPU Code
Real-space and G-space calculation: both features are implemented on the GPU; speedup can be 5~6x for CG
Data structure: transpose data from (16, 1) to (1, 16) on the CPU for a more efficient GPU implementation
Sphere-to-cube FFT: use the sphere to reduce CPU-GPU data transfer size; map to the cube on the GPU to use CUFFT
MPI communication and GPU overlap: non-blocking AlltoAll to overlap CPU communication; communication and computation pipeline in FFT
Timings, G (FFT, CG): GPU 1.8 s, 118 s; CPU 20.5 s, 645 s
Timings, R (Hpsi, CG): GPU ( ) s; CPU ( ) s

36 SC PEtot: 5x Faster than VASP
Testing systems on SC-PEtot: a 512-atom GaAs bulk system; a 933-atom CdSe quantum dot (QD) passivated by hydrogen atoms and placed in a large box. These two systems represent two different situations: one with atoms uniformly occupying the space, the other with a large empty space.
Results: 4.5x to 9x speedup compared with CPU PEtot; about 5x speedup compared with VASP
SC-PEtot testing results compared with CPU PEtot and VASP:
Systems: 512 GaAs, 512 GaAs, 512 GaAs, 512 GaAs, 512 GaAs, 933 CdSe
Nodes: ( )
VASP (CPU): ( )
PEtot (CPU): ( )
PEtot (GPU): ( )
Speedup (vs. CPU PEtot): 9.7x, 9.2x, 9.1x, 6.3x, 5.8x, 8.5x
Speedup (vs. VASP): 5.49x, 5.1x, 6.6x, 6.7x

37 P_InsPecT / cuda-InsPecT Software
Introduction: both are optimized versions of the InsPecT software. P_InsPecT is open source and can be downloaded from SCBG; cuda-InsPecT will be open source.
Function: InsPecT is software for the unrestricted identification of PTMs (post-translational modifications).
Characteristics: P_InsPecT runs via MPI on a CPU cluster or on the CPU nodes of an HPC system; cuda-InsPecT runs via MPI + CUDA C on a GPU cluster.
Performance, P_InsPecT (one modification): database: 36547 mass spectra; ( ) protein sequences; environment: DeepComp 7000; time: one core (estimate) ( ) h; 2048 cores 0.4 h
Performance, cuda-InsPecT (two modifications): database: 62346 mass spectra; ( ) protein sequences; environment: Dawn 6000A; time: one core (estimate) 6 years; 677 Fermi C2050 ( ) h

38 VAT4M (A Visual Analysis Tool for CryoEM Density Maps): volume rendering with an intuitive interface; automatic segmentation of density maps; automatic structure fitting for multi-molecular models

39 CAD/CAE Platform in Automation Industry
Who we provide our service to: CAD/CAE engineers
What we deliver to them:
A high-performance service: DeepComp 7000 for the parallel solver; high-performance graphics workstations for pre/post-processing
A highly usable cloud computing solution via the internet: remote graphical access to workstations; grid-computing-based resources; web portal integrated with CAE software
A highly secure environment: secure access through VPN and firewall; the web portal isolates the cluster from unauthorized access

40 Aerodynamic Noise Simulation of a High-speed Train: 13 million mesh elements; simulation on 128 cores; 11 days for the flow field; 15 days for noise

41 Fingerprint Recognition: fingerprint database: 19,600,000 fingerprints; HPC: 1024 cores (16 nodes, 64 cores/node); speed: 4000 fingerprints/second/core; run time: 3 months (linear speedup)
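The per-core rate and core count above imply an aggregate throughput for the whole machine. A minimal sketch of that arithmetic follows. Taking "3 months" as 90 days is an assumption, and the slide does not state what total workload the run covered, so the last figure is only the volume implied by the quoted rate under linear speedup.

```python
NODES = 16                    # from the slide
CORES_PER_NODE = 64           # from the slide
RATE_PER_CORE = 4000          # fingerprints per second per core (from the slide)
RUN_DAYS = 90                 # assumption: "3 months" taken as 90 days

cores = NODES * CORES_PER_NODE
aggregate_rate = cores * RATE_PER_CORE  # fingerprints per second, all cores

# Volume processed over the whole run if every core sustains the quoted rate.
total_processed = aggregate_rate * RUN_DAYS * 24 * 3600

print(f"cores:            {cores}")
print(f"aggregate rate:   {aggregate_rate:,} fingerprints/s")
print(f"processed in run: {total_processed:.3e}")
```

The node and core figures are consistent with each other: 16 nodes at 64 cores per node give the 1024 cores quoted on the slide.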

42 Oil Reservoir Simulation of the Daqing Oilfield: grid size 199*87*67; Dawning 2000 II, 64 CPUs. [Figure: oil and water percentages on the same layer in different years: Nov. 1966 (left) and Jan. 2100 (right)]

43 Future Plan on High Performance Environment

44 Petascale Supercomputers in CAS
1 PetaFLOPS computer in 2012: budget 250M RMB; for scientific computing
10 PetaFLOPS computer in 2015: collaboration with Beijing local government; budget 700M RMB; for scientific and industrial computing

45 100 PFLOPS Supercomputers in China
Many petascale HPCs; at least one PFLOPS HPC; budget 4B RMB (MOST 2.4B, local government 1.6B)
1-10 ExaFLOPS HPC: budget cannot be estimated

46 Questions & Suggestions. Thank you very much!


More information

Advanced Numerical Techniques for Cluster Computing

Advanced Numerical Techniques for Cluster Computing Advanced Numerical Techniques for Cluster Computing Presented by Piotr Luszczek http://icl.cs.utk.edu/iter-ref/ Presentation Outline Motivation hardware Dense matrix calculations Sparse direct solvers

More information

Optimizing Data Locality for Iterative Matrix Solvers on CUDA

Optimizing Data Locality for Iterative Matrix Solvers on CUDA Optimizing Data Locality for Iterative Matrix Solvers on CUDA Raymond Flagg, Jason Monk, Yifeng Zhu PhD., Bruce Segee PhD. Department of Electrical and Computer Engineering, University of Maine, Orono,

More information

Mixed Precision Methods

Mixed Precision Methods Mixed Precision Methods Mixed precision, use the lowest precision required to achieve a given accuracy outcome " Improves runtime, reduce power consumption, lower data movement " Reformulate to find correction

More information

InfiniBand Strengthens Leadership as the Interconnect Of Choice By Providing Best Return on Investment. TOP500 Supercomputers, June 2014

InfiniBand Strengthens Leadership as the Interconnect Of Choice By Providing Best Return on Investment. TOP500 Supercomputers, June 2014 InfiniBand Strengthens Leadership as the Interconnect Of Choice By Providing Best Return on Investment TOP500 Supercomputers, June 2014 TOP500 Performance Trends 38% CAGR 78% CAGR Explosive high-performance

More information

CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC. Guest Lecturer: Sukhyun Song (original slides by Alan Sussman)

CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC. Guest Lecturer: Sukhyun Song (original slides by Alan Sussman) CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC Guest Lecturer: Sukhyun Song (original slides by Alan Sussman) Parallel Programming with Message Passing and Directives 2 MPI + OpenMP Some applications can

More information

Introduction CPS343. Spring Parallel and High Performance Computing. CPS343 (Parallel and HPC) Introduction Spring / 29

Introduction CPS343. Spring Parallel and High Performance Computing. CPS343 (Parallel and HPC) Introduction Spring / 29 Introduction CPS343 Parallel and High Performance Computing Spring 2018 CPS343 (Parallel and HPC) Introduction Spring 2018 1 / 29 Outline 1 Preface Course Details Course Requirements 2 Background Definitions

More information

ICON for HD(CP) 2. High Definition Clouds and Precipitation for Advancing Climate Prediction

ICON for HD(CP) 2. High Definition Clouds and Precipitation for Advancing Climate Prediction ICON for HD(CP) 2 High Definition Clouds and Precipitation for Advancing Climate Prediction High Definition Clouds and Precipitation for Advancing Climate Prediction ICON 2 years ago Parameterize shallow

More information

EULAG: high-resolution computational model for research of multi-scale geophysical fluid dynamics

EULAG: high-resolution computational model for research of multi-scale geophysical fluid dynamics Zbigniew P. Piotrowski *,** EULAG: high-resolution computational model for research of multi-scale geophysical fluid dynamics *Geophysical Turbulence Program, National Center for Atmospheric Research,

More information

Simulation of chaotic mixing dynamics in microdroplets

Simulation of chaotic mixing dynamics in microdroplets Simulation of chaotic mixing dynamics in microdroplets Hongbo Zhou, Liguo Jiang and Shuhuai Yao the Hong Kong University of Science and Technology Date: 20/10/2011 Presented at COMSOL Conference 2011 China

More information

GPU Clouds IGT Cloud Computing Summit Mordechai Butrashvily, CEO 2009 (c) All rights reserved

GPU Clouds IGT Cloud Computing Summit Mordechai Butrashvily, CEO 2009 (c) All rights reserved GPU Clouds IGT 2009 Cloud Computing Summit Mordechai Butrashvily, CEO moti@hoopoe-cloud.com 02/12/2009 Agenda Introduction to GPU Computing Future GPU architecture GPU on a Cloud: Visualization Computing

More information

HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA

HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA STATE OF THE ART 2012 18,688 Tesla K20X GPUs 27 PetaFLOPS FLAGSHIP SCIENTIFIC APPLICATIONS

More information

MPI Alltoall Personalized Exchange on GPGPU Clusters: Design Alternatives and Benefits

MPI Alltoall Personalized Exchange on GPGPU Clusters: Design Alternatives and Benefits MPI Alltoall Personalized Exchange on GPGPU Clusters: Design Alternatives and Benefits Ashish Kumar Singh, Sreeram Potluri, Hao Wang, Krishna Kandalla, Sayantan Sur, and Dhabaleswar K. Panda Network-Based

More information

HPC Capabilities at Research Intensive Universities

HPC Capabilities at Research Intensive Universities HPC Capabilities at Research Intensive Universities Purushotham (Puri) V. Bangalore Department of Computer and Information Sciences and UAB IT Research Computing UAB HPC Resources 24 nodes (192 cores)

More information

Flux Vector Splitting Methods for the Euler Equations on 3D Unstructured Meshes for CPU/GPU Clusters

Flux Vector Splitting Methods for the Euler Equations on 3D Unstructured Meshes for CPU/GPU Clusters Flux Vector Splitting Methods for the Euler Equations on 3D Unstructured Meshes for CPU/GPU Clusters Manfred Liebmann Technische Universität München Chair of Optimal Control Center for Mathematical Sciences,

More information

Radial Basis Function-Generated Finite Differences (RBF-FD): New Opportunities for Applications in Scientific Computing

Radial Basis Function-Generated Finite Differences (RBF-FD): New Opportunities for Applications in Scientific Computing Radial Basis Function-Generated Finite Differences (RBF-FD): New Opportunities for Applications in Scientific Computing Natasha Flyer National Center for Atmospheric Research Boulder, CO Meshes vs. Mesh-free

More information

Complexity and Advanced Algorithms. Introduction to Parallel Algorithms

Complexity and Advanced Algorithms. Introduction to Parallel Algorithms Complexity and Advanced Algorithms Introduction to Parallel Algorithms Why Parallel Computing? Save time, resources, memory,... Who is using it? Academia Industry Government Individuals? Two practical

More information

HPC projects. Grischa Bolls

HPC projects. Grischa Bolls HPC projects Grischa Bolls Outline Why projects? 7th Framework Programme Infrastructure stack IDataCool, CoolMuc Mont-Blanc Poject Deep Project Exa2Green Project 2 Why projects? Pave the way for exascale

More information

GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS

GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS Agenda Forming a GPGPU WG 1 st meeting Future meetings Activities Forming a GPGPU WG To raise needs and enhance information sharing A platform for knowledge

More information

Peta-Scale Simulations with the HPC Software Framework walberla:

Peta-Scale Simulations with the HPC Software Framework walberla: Peta-Scale Simulations with the HPC Software Framework walberla: Massively Parallel AMR for the Lattice Boltzmann Method SIAM PP 2016, Paris April 15, 2016 Florian Schornbaum, Christian Godenschwager,

More information

COST EFFICIENCY VS ENERGY EFFICIENCY. Anna Lepak Universität Hamburg Seminar: Energy-Efficient Programming Wintersemester 2014/2015

COST EFFICIENCY VS ENERGY EFFICIENCY. Anna Lepak Universität Hamburg Seminar: Energy-Efficient Programming Wintersemester 2014/2015 COST EFFICIENCY VS ENERGY EFFICIENCY Anna Lepak Universität Hamburg Seminar: Energy-Efficient Programming Wintersemester 2014/2015 TOPIC! Cost Efficiency vs Energy Efficiency! How much money do we have

More information

The Development of Modeling, Simulation and Visualization Services for the Earth Sciences in the National Supercomputing Center of Korea

The Development of Modeling, Simulation and Visualization Services for the Earth Sciences in the National Supercomputing Center of Korea icas2013 (September 9~12, Annecy/France) The Development of Modeling, Simulation and Visualization Services for the Earth Sciences in the National Supercomputing Center of Korea Min-Su Joh, Joon-Eun Ahn,

More information

TESLA P100 PERFORMANCE GUIDE. HPC and Deep Learning Applications

TESLA P100 PERFORMANCE GUIDE. HPC and Deep Learning Applications TESLA P PERFORMANCE GUIDE HPC and Deep Learning Applications MAY 217 TESLA P PERFORMANCE GUIDE Modern high performance computing (HPC) data centers are key to solving some of the world s most important

More information

COMPUTING ELEMENT EVOLUTION AND ITS IMPACT ON SIMULATION CODES

COMPUTING ELEMENT EVOLUTION AND ITS IMPACT ON SIMULATION CODES COMPUTING ELEMENT EVOLUTION AND ITS IMPACT ON SIMULATION CODES P(ND) 2-2 2014 Guillaume Colin de Verdière OCTOBER 14TH, 2014 P(ND)^2-2 PAGE 1 CEA, DAM, DIF, F-91297 Arpajon, France October 14th, 2014 Abstract:

More information

Commodity Cluster Computing

Commodity Cluster Computing Commodity Cluster Computing Ralf Gruber, EPFL-SIC/CAPA/Swiss-Tx, Lausanne http://capawww.epfl.ch Commodity Cluster Computing 1. Introduction 2. Characterisation of nodes, parallel machines,applications

More information

A Scalable GPU-Based Compressible Fluid Flow Solver for Unstructured Grids

A Scalable GPU-Based Compressible Fluid Flow Solver for Unstructured Grids A Scalable GPU-Based Compressible Fluid Flow Solver for Unstructured Grids Patrice Castonguay and Antony Jameson Aerospace Computing Lab, Stanford University GTC Asia, Beijing, China December 15 th, 2011

More information

Full Vehicle Dynamic Analysis using Automated Component Modal Synthesis. Peter Schartz, Parallel Project Manager ClusterWorld Conference June 2003

Full Vehicle Dynamic Analysis using Automated Component Modal Synthesis. Peter Schartz, Parallel Project Manager ClusterWorld Conference June 2003 Full Vehicle Dynamic Analysis using Automated Component Modal Synthesis Peter Schartz, Parallel Project Manager Conference Outline Introduction Background Theory Case Studies Full Vehicle Dynamic Analysis

More information

Towards Exascale Computing with the Atmospheric Model NUMA

Towards Exascale Computing with the Atmospheric Model NUMA Towards Exascale Computing with the Atmospheric Model NUMA Andreas Müller, Daniel S. Abdi, Michal Kopera, Lucas Wilcox, Francis X. Giraldo Department of Applied Mathematics Naval Postgraduate School, Monterey

More information

Timothy Lanfear, NVIDIA HPC

Timothy Lanfear, NVIDIA HPC GPU COMPUTING AND THE Timothy Lanfear, NVIDIA FUTURE OF HPC Exascale Computing will Enable Transformational Science Results First-principles simulation of combustion for new high-efficiency, lowemision

More information

Continued Investigation of Small-Scale Air-Sea Coupled Dynamics Using CBLAST Data

Continued Investigation of Small-Scale Air-Sea Coupled Dynamics Using CBLAST Data Continued Investigation of Small-Scale Air-Sea Coupled Dynamics Using CBLAST Data Dick K.P. Yue Center for Ocean Engineering Department of Mechanical Engineering Massachusetts Institute of Technology Cambridge,

More information

NVIDIA GPU TECHNOLOGY UPDATE

NVIDIA GPU TECHNOLOGY UPDATE NVIDIA GPU TECHNOLOGY UPDATE May 2015 Axel Koehler Senior Solutions Architect, NVIDIA NVIDIA: The VISUAL Computing Company GAMING DESIGN ENTERPRISE VIRTUALIZATION HPC & CLOUD SERVICE PROVIDERS AUTONOMOUS

More information

A Brief Introduction to CFINS

A Brief Introduction to CFINS A Brief Introduction to CFINS Center for Intelligent and Networked Systems (CFINS) Department of Automation Tsinghua University Beijing 100084, China 6/30/2016 1 Outline Mission People Professors Students

More information

Green Supercomputing

Green Supercomputing Green Supercomputing On the Energy Consumption of Modern E-Science Prof. Dr. Thomas Ludwig German Climate Computing Centre Hamburg, Germany ludwig@dkrz.de Outline DKRZ 2013 and Climate Science The Exascale

More information

CUDA. Matthew Joyner, Jeremy Williams

CUDA. Matthew Joyner, Jeremy Williams CUDA Matthew Joyner, Jeremy Williams Agenda What is CUDA? CUDA GPU Architecture CPU/GPU Communication Coding in CUDA Use cases of CUDA Comparison to OpenCL What is CUDA? What is CUDA? CUDA is a parallel

More information

Portable and Productive Performance with OpenACC Compilers and Tools. Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc.

Portable and Productive Performance with OpenACC Compilers and Tools. Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc. Portable and Productive Performance with OpenACC Compilers and Tools Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc. 1 Cray: Leadership in Computational Research Earth Sciences

More information

It s a Multicore World. John Urbanic Pittsburgh Supercomputing Center

It s a Multicore World. John Urbanic Pittsburgh Supercomputing Center It s a Multicore World John Urbanic Pittsburgh Supercomputing Center Waiting for Moore s Law to save your serial code start getting bleak in 2004 Source: published SPECInt data Moore s Law is not at all

More information

Fujitsu s Technologies to the K Computer

Fujitsu s Technologies to the K Computer Fujitsu s Technologies to the K Computer - a journey to practical Petascale computing platform - June 21 nd, 2011 Motoi Okuda FUJITSU Ltd. Agenda The Next generation supercomputer project of Japan The

More information

High-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs

High-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs High-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs Gordon Erlebacher Department of Scientific Computing Sept. 28, 2012 with Dimitri Komatitsch (Pau,France) David Michea

More information