Unit Name. Date These work are contributed by all our staffs
|
|
- Melina Payne
- 5 years ago
- Views:
Transcription
1 Supercomputing Applications Title in CAS Xue-bin Chi Supercomputing Center, CNIC, CAS Unit Name April 13-14, 14, 2011, Helsinki, Finland Date These work are contributed by all our staffs
2 Outline Overview the HPC History of China Brief Introduction of SCCAS Grid environment overview Applications in science and engineering Future Plan on high performance environment
3 Overview the HPC History of China 1.Started in the late of 1980 s 2.Developing in the 1990 s
4 Parallel computers ( ) 1994) BJ 01 ( ) 4 processors, with global and local memmory, made by ICT, CAS Transputers ( ) Severalparallelsystems systems, our group had a 17 transputer system KJ 8950 ( ) 16 processors, with global and local memmory, made by ICT, CAS st parallel computer Dawning 1000 occurred Peak performance 2.5GFLOPS in single precision, HPL 50% Similar to Paragon machine, NX parallel implementation environment Intel i860
5 National High Performance computing Centers ( ) National High Performance computing Center (Beijing) In Institute of computing technology, CAS National High Performance computing Center (Wuhan) Huazhong University of Science and Technology National High Performance computing Center (Chengdu) Southwest Jiaotong University National High Performance computing Center (Hefei) University of Science and Technology of China Supercomputing Center in Chinese Academy of Sciences (SCCAS) 1996 Shanghai Supercomputing Center (SSC) 2001
6 Clusters ( ) 2005) Lenovo DeepComp 6800 (SCCAS) 2003, top500 rank 14th, peak performance 5.2TFLOPS Itanium II, Linux OS, MPI, QsNet Budget 60M RMB MOST 15M, Lenovo 15M, CAS 30M Dawning 4000 (SSC) 2004, top500 rank 10th, peak performance 11 TFLOPS AMD Opteron, Linux Os, MPI, Merinet 2000 Budget 60M RMB MOST 30M, Shanghai Government 30M Many other clusters installed during that period
7 100TFLOPS HPCs ( ) 2010) Lenovo DeepComp TFLOPS, our center, Dec., 2008 Budget 180M RMB Dawning 5000 MOST 80M, Lenovo 25M, CAS 75M Peak performance 200TFLOPS, AMD processors Aug., 2009, in Shanghai Supercomputing Center Top500 rank 10th on June of 2009 Budget 200M RMB MOST 100M, Shanghai Government 100M Software and environment 100M
8 Peta FlopsHPCs Dawning Nebulae 600TFLOPS CPU + 3PFLOPS GPU Top500 Rank 2nd Budget 600M RMB MOST 200M, Shenzhen Government 400M Upgrade 1PFLOPS CPU, 200 TFLOPS Godson NUDT Tianhe 200TFLOPS CPU + 1.2PFLOPS GPU Top500 rank 7 th Budget 600M RMB MOST 200M, Tianjin Government 400M Upgrade 1PFLOPS CPU + 4PFLOPS GPU HPL 2.5PFLOPS
9 Brief Introduction of SCCAS 1.Providing computing power and technical support for scientists 2.Training in HPC algorithm and programming
10 SCCAS Computing Resources During (9th 5 year plan) In 1996, SGI Power Challenge XL 6.4Gflops 16 CPUs In1998: Hitachi SR GFlops 32CPUs In 2000, Dawning 2000 II 111.7Gflops 164 CPUs
11 SCCAS Computing Computing Resources During (10th 5 year plan) In 2003, Lenovo DeepComp6800 5Tflops, 1024 CPUs TOP500:No.14;China TOP100:No. 1 During 2006 now (11th 5 year plan) In 2008, Lenovo DeepComp Tflops, 13,000 cores TOP500:No.19;China TOP100:No.2 3 kinds of nodes, Altix 4700, IBM 3950, IBM Blades In , 300TFLOPS GPU
12 SCCAS Services: Services: Users Users in SCCAS has massively increased since 1996 (from 20 to more than 310)
13 SCCAS R&D of Algorithmsandsoftware software Parallel AMR (Adaptive Mesh Refinement) method Parallel Eigenvalue Problem Parallel Fast Multipole Method Parallel Computing Model Gridmol ScGrid middleware PSEPS FMM radar Transplant many open source software
14 Grid environment overview 1.China National Grid (CNGrid) 2.China Scientific Computing Grid (ScGrid)
15 China National Grid
16 China National Grid CNGrid operation center Supercomputing center of CNIC of CAS Northern major node Supercomputing center of CNIC of CAS Southern major node Shanghai supercomputing center Normal nodes 9 sites, they are Tsinghua University, Beijing Institute of Applied Physics and Computational Mathematics, Shandong University, University of Science and Technology of China, Huazhong University of Science and Technology, Shenzhen Institute of Advanced Technology of CAS, Hong Kong University, Xi an Jiaotong University, Gansu province Supercomputing Center
17 China Scientific Computing Grid (ScGrid) 3 Level Grid Property Complementarity Autonomic Ability Scalability Availability Realizability Usability Manageability Stability CAS ARUMS
18 GPU Clusters within CAS Site Vendor R peak /Tflops Institute. of Electrical Engnineering Lenovo 112 Shenzhen Institutes of Advanced Technology Lenovo 200 USTC Lenovo 183 CNIC Lenovo, Dawning 300 National Astronomical Observatories Lenovo 158 Institute of Geology and Geophysics Lenovo, Dawning 200 Institute of Modern Physics Lenovo Institute of High Energy Physics Dawning Institute of Metals Research Dawning 183 Purple Mountain Observatory Dawning 180 SUM Pflops
19 Windows / Linux Clients Users Administrator Web Portal SCE Middleware HPC, Cluster, Workstation, Storage
20 Applications in Science and Engineering 1.Scientific ifi Research 2.Industrial Computing
21 Prediction of Sandstorm Real time prediction system of Sandstorms for China Meteorological Administration DeepComp CPUs from 15hours down to 8mins
22 Climate and Atmosphere Model CUACE(CMA Unified Atmospheric 3000 Chemistry 1000 Environment) 0 TI ME 54km km km km % 100% 80% 60% 40% 20% 0% Ef f i ci ency km 100% 90% % % % % 108km 100% % % % % % 54km 108km 100 RIEMS(Regional 80 Integrated 70 Environment 60 Modeling System) ) MPI MPI+OpenMP
23 The surface temperature eof the Joined Area of Asia&India Pacific sa da ac c Ocean
24 Computational Fluid Dynamics Speed d-up CPU number
25 Gl Galactic Wind Simulation 8192 Computational cores, Internationally advanced both in computational scale Internationally advanced both in computational scale and speed
26 Galactic Wind Simulation on Tianhe 1A Speedup Li near Efficiency % % %
27 Simulation the Universe Evolution by 2048 Cores 13 Days
28 Spaceweather prediction research Shock wave magnetic cloud interactive Using parallel AMR method to do MHD Computing (512 cores)
29 Drug screening for Avian Influenza DeepComp7000 p 2400 CPU cores Time consumed decreased from 2 months to 8hours The compounds screened out by this computation is forwarded to related department tfor further biological test
30 Simulation: Side plates Formation in Ti Alloys Phase-field model Time-dependent Ginzburg-Landau(TDGL) equation Cahn-Hillard equation Anisotropic interfacial energy Numerical Method time domain finite difference operator splitting Grid size 1024*1024*1024 Number of DOFs 4.295*10 9 Run on 4,096 cores Parallel efficiency 94%. 3 simulations 462,828 core hours. Collaborators Ms. Mei Yang Dr. Hao Wang Dr. Gang Wang Prof. Dongsheng Xu Institute of Metal Research, CAS, Shenyang Side-plates Formation Visualization(512 3 )
31 Side plates formation in Ti Alloys test (Tianhe 1A) Speedup Li near Efficiency % % %
32 Parallel Software HPSEPS Eigenvalue problem parallel solvers for sparse and dense symmetric matrics SVD Parallel Solver LSQR Parallel Solver Eigensolver for dense matrices with 8192 cores on Tianhe-1A(N=40000) 16 H i E i i H ( X) X X Num of Cores Time(sec) Speed up Efficency 100% 73% 56% 39% 25%
33 Hypre on Tianhe 1A Convection Reaction Diffusion div ( K grad u + B u) + C u = F SMG preconditioned GMRES solver N=1024x1024, Np=kxk Assume efficency on 1024 core is100%,we can get efficency in 8100 cores is 51%
34 P3DFFT on Tianhe 1 A Parallel 3D FFT Based on FFTW3.2.1, but MPI_AlltoAll is avoid 1024x1024x x1024x2048 #Procs T(P3DFFT) Efficiency
35 SC PEtot: First Large scale PlanewaveGPUCode Real space & G-space calculation lation Both feature are implemented in GPU Speedup can be 5~6x for CG Data structure Transpose data from (16, 1) to (1,16) in CPU for more efficient GPU implementation Sphere-to-Cube FFT Use sphere to reduce CPU-GPU data transfer size Mapping to cube in GPU to use CUFFT MPI communication and GPU overlap Non-block AlltoAll to overlap CPU communication Communication & computation pipeline in FFT G FFT CG GPU 1.8s 118s CPU 20.5s 645s R Hpsi CG GPU s CPU s
36 SC PEtot:5xfaster thanvasp Testing systems on SC-PEtot: 512 atom GaAs bulk system 933 atom CdSe quantum dot (QD) passivated by hydrogen atoms placed in a large box These two systems represent two different situations: one with atoms uniformly occupying the space, another with a large empty space. Results 4.5x to 9 times speedup compared with CPU PEtot About 5 times speedup compared with VASP Nodes systems 512 GaAs 512 GaAs 512 GaAs 512 GaAs 512 GaAs 933 CdSe VASP (CPU) PEtot (CPU) PEtot (GPU) Speed up(with CPU) 9.7x 9.2x 9.1x 6.3x 5.8x 8.5x Speed up(with VASP) 5.49x 5.1x 6.6x 6.7x SC-Petot testing results compared with CPU Petot and VASP
37 P_InsPecT/cuda InsPecT Software Software Introduction Both are optimized InsPecT software P_InsPecT is open source and can be downloaded from SCBG Cuda InsPecT will be open source Software Function InsPecT is an unrestricted identification software of PTMs(post translational modifications) Software Characteristics P_InsPecT via MPI run on CPU cluster or CPU nodes of HPC Software Performance Software:P_InsPecT(one ( modification) ) Database:36547 mass spectrometric; protein sequences Environment:DeepComp7000 One Core (estimate) 2048 Cores Time h 0.4 h cuda-inspect via MPI+cuda C run on GPU cluster Software:cuda-InsPecT(two ( modifications) ) Database:62346 mass spectrometric; protein sequencese Environment:Dawn 6000A One Core (estimate) t 677 Fermi C2050 Time 6 years h
38 VAT4M ( A Visual Analysis tool for CryoEM Density Maps) Volume rendering with an intuitive interface Automatic ti segmentation tti of density maps Automatic structure fitting for multi molecular models
39 CAD/CAE Platform in Automation Industry Who we provide our service to CAD/CAE engineers What we deliver to them A high performance service DeepComp 7000 for parallel l solver High performance graphic workstations for pre/post processing A high usability cloud computing solution via internet Remote graphical access to workstations Grid computing based resources Web portalintegrated with CAEsoftware A high safety environment Secure access through VPN and firewall Isolate unauthorized access from cluster by web portal
40 Aerodynamic Noise Simulation inhigh speed Train 13 million mesh elements 128 cores simulation 11 day for flow field 15 days for noise
41 Fingerprint Recognition Fingerprint database :19,600,000 fingerprints HPC:1024 cores ( 16 nodes, 64 cores/node ) Speed:4000 fingerprints/second/core Run time:3 months (linear Speedup)
42 Oil Reservoir simulation on Daqing Oilfield Grid size: 199*87*67 Dawning2000II 64 CPUs The different Oil and water percentage on the same layer at different years: Nov. 1966(left) and Jan. 2100(right)
43 Future Plan on High Performance Environment
44 Petascale Supercomputers in CAS 1 PetaFlops computer in 2012 Budget 250M RMB For scientific computing 10 PetaFlops computer in 2015 Collaboration with Beijing local government Budget 700M RMB For scientific computing and industry computing
45 100PFlops Supercomputers in China Many peta scale HPCs At least one PFLOPS HPC Budget 4B RMB MOST 2.4B, Local Government 1.6B 1 10ExaFLOPS HPC Budget Can t estimate
46 Question & Suggestion Thank you very much!
INSPUR and HPC Innovation
INSPUR and HPC Innovation Dong Qi (Forrest) Product manager Inspur dongqi@inspur.com Contents 1 2 3 4 5 Inspur introduction HPC Challenge and Inspur HPC strategy HPC cases Inspur contribution to HPC community
More informationINSPUR and HPC Innovation. Dong Qi (Forrest) Oversea PM
INSPUR and HPC Innovation Dong Qi (Forrest) Oversea PM dongqi@inspur.com Contents 1 2 3 4 5 Inspur introduction HPC Challenge and Inspur HPC strategy HPC cases Inspur contribution to HPC community Inspur
More informationEASIS: An Optimized Information Service for High Performance Computing Environment
EASIS: An Optimized Information Service for High Performance Computing Environment Can Wu, Xiaoning Wang, Haili Xiao, Rongqiang Cao, Yining Zhao, Xuebin Chi Computer Network Information Center, Chinese
More informationThe next generation supercomputer. Masami NARITA, Keiichi KATAYAMA Numerical Prediction Division, Japan Meteorological Agency
The next generation supercomputer and NWP system of JMA Masami NARITA, Keiichi KATAYAMA Numerical Prediction Division, Japan Meteorological Agency Contents JMA supercomputer systems Current system (Mar
More informationGodson Processor and its Application in High Performance Computers
Godson Processor and its Application in High Performance Computers Weiwu Hu Institute of Computing Technology, Chinese Academy of Sciences Loongson Technologies Corporation Limited hww@ict.ac.cn 1 Contents
More informationPractical Scientific Computing
Practical Scientific Computing Performance-optimized Programming Preliminary discussion: July 11, 2008 Dr. Ralf-Peter Mundani, mundani@tum.de Dipl.-Ing. Ioan Lucian Muntean, muntean@in.tum.de MSc. Csaba
More informationIntroduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620
Introduction to Parallel and Distributed Computing Linh B. Ngo CPSC 3620 Overview: What is Parallel Computing To be run using multiple processors A problem is broken into discrete parts that can be solved
More informationSupercomputer Networking via IPv6. Xing Li, Congxiao Bao, Wusheng Zhang
Supercomputer Networking via IPv6 Xing Li, Congxiao Bao, Wusheng Zhang 2018-08-08 Outline Sunway TaihuLight Supercomputer CERNET and CERNET2 Challenges Solution Performance Q&A Remarks 2 National Supercomputing
More informationMatrix-free multi-gpu Implementation of Elliptic Solvers for strongly anisotropic PDEs
Iterative Solvers Numerical Results Conclusion and outlook 1/18 Matrix-free multi-gpu Implementation of Elliptic Solvers for strongly anisotropic PDEs Eike Hermann Müller, Robert Scheichl, Eero Vainikko
More informationCRAY XK6 REDEFINING SUPERCOMPUTING. - Sanjana Rakhecha - Nishad Nerurkar
CRAY XK6 REDEFINING SUPERCOMPUTING - Sanjana Rakhecha - Nishad Nerurkar CONTENTS Introduction History Specifications Cray XK6 Architecture Performance Industry acceptance and applications Summary INTRODUCTION
More informationHybrid KAUST Many Cores and OpenACC. Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS
+ Hybrid Computing @ KAUST Many Cores and OpenACC Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS + Agenda Hybrid Computing n Hybrid Computing n From Multi-Physics
More informationPART I - Fundamentals of Parallel Computing
PART I - Fundamentals of Parallel Computing Objectives What is scientific computing? The need for more computing power The need for parallel computing and parallel programs 1 What is scientific computing?
More informationSupercomputers. Alex Reid & James O'Donoghue
Supercomputers Alex Reid & James O'Donoghue The Need for Supercomputers Supercomputers allow large amounts of processing to be dedicated to calculation-heavy problems Supercomputers are centralized in
More informationHPC with GPU and its applications from Inspur. Haibo Xie, Ph.D
HPC with GPU and its applications from Inspur Haibo Xie, Ph.D xiehb@inspur.com 2 Agenda I. HPC with GPU II. YITIAN solution and application 3 New Moore s Law 4 HPC? HPC stands for High Heterogeneous Performance
More informationINSPUR HPC Development
INSPUR HPC Development -Industrial Perspective of China Zilong (DAVID) XU Contents 1. Introduction of INSPUR 2. Introduction of INSPUR HPC 3. Being A Responsible HPC Company 2 Contents 1. Introduction
More informationHPC and Big Data: Updates about China. Haohuan FU August 29 th, 2017
HPC and Big Data: Updates about China Haohuan FU August 29 th, 2017 1 Outline HPC and Big Data Projects in China Recent Efforts on Tianhe-2 Recent Efforts on Sunway TaihuLight 2 MOST HPC Projects 2016
More informationHPC Infrastructure and Applications in Chinese Academy of Sciences
HPC Infrastructure and Applications in Chinese Academy of Sciences Xuebin CHI, Haili XIAO, Rongqiang CAO, Yining ZHAO (chi@sccas.cn) Computer Network Information Center (CNIC) Chinese Academy of Sciences
More informationEfficient multigrid solvers for strongly anisotropic PDEs in atmospheric modelling
Iterative Solvers Numerical Results Conclusion and outlook 1/22 Efficient multigrid solvers for strongly anisotropic PDEs in atmospheric modelling Part II: GPU Implementation and Scaling on Titan Eike
More information2014 China TOP100 List of High Performance Computer
2014 China TOP100 List of High Performance Computer (November, 2014) 1/ 15 2014 China TOP100 List of High Performance Computer The Specialty Association of Mathematical & Scientific Software, CSIA Evaluation
More informationInfiniBand Strengthens Leadership as The High-Speed Interconnect Of Choice
InfiniBand Strengthens Leadership as The High-Speed Interconnect Of Choice Providing the Best Return on Investment by Delivering the Highest System Efficiency and Utilization Top500 Supercomputers June
More informationAdaptive Mesh Astrophysical Fluid Simulations on GPU. San Jose 10/2/2009 Peng Wang, NVIDIA
Adaptive Mesh Astrophysical Fluid Simulations on GPU San Jose 10/2/2009 Peng Wang, NVIDIA Overview Astrophysical motivation & the Enzo code Finite volume method and adaptive mesh refinement (AMR) CUDA
More informationINTENSIVE COMPUTATION. Annalisa Massini
INTENSIVE COMPUTATION Annalisa Massini 2017-2018 Intensive Computation - 2017/2018 2 Course topics The course will cover topics that are in some sense related to intensive computation: Matlab (an introduction)
More informationSpeedup Altair RADIOSS Solvers Using NVIDIA GPU
Innovation Intelligence Speedup Altair RADIOSS Solvers Using NVIDIA GPU Eric LEQUINIOU, HPC Director Hongwei Zhou, Senior Software Developer May 16, 2012 Innovation Intelligence ALTAIR OVERVIEW Altair
More informationTianhe-2, the world s fastest supercomputer. Shaohua Wu Senior HPC application development engineer
Tianhe-2, the world s fastest supercomputer Shaohua Wu Senior HPC application development engineer Inspur Inspur revenue 5.8 2010-2013 6.4 2011 2012 Unit: billion$ 8.8 2013 21% Staff: 14, 000+ 12% 10%
More informationOverview of Tianhe-2
Overview of Tianhe-2 (MilkyWay-2) Supercomputer Yutong Lu School of Computer Science, National University of Defense Technology; State Key Laboratory of High Performance Computing, China ytlu@nudt.edu.cn
More informationStockholm Brain Institute Blue Gene/L
Stockholm Brain Institute Blue Gene/L 1 Stockholm Brain Institute Blue Gene/L 2 IBM Systems & Technology Group and IBM Research IBM Blue Gene /P - An Overview of a Petaflop Capable System Carl G. Tengwall
More informationAdaptive-Mesh-Refinement Hydrodynamic GPU Computation in Astrophysics
Adaptive-Mesh-Refinement Hydrodynamic GPU Computation in Astrophysics H. Y. Schive ( 薛熙于 ) Graduate Institute of Physics, National Taiwan University Leung Center for Cosmology and Particle Astrophysics
More informationACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016
ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016 Challenges What is Algebraic Multi-Grid (AMG)? AGENDA Why use AMG? When to use AMG? NVIDIA AmgX Results 2
More informationMathematical computations with GPUs
Master Educational Program Information technology in applications Mathematical computations with GPUs Introduction Alexey A. Romanenko arom@ccfit.nsu.ru Novosibirsk State University How to.. Process terabytes
More informationLarge scale Imaging on Current Many- Core Platforms
Large scale Imaging on Current Many- Core Platforms SIAM Conf. on Imaging Science 2012 May 20, 2012 Dr. Harald Köstler Chair for System Simulation Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen,
More informationInformatization and Grid Technology
Informatization and Grid Technology Qian Depei Xian Jiaotong University May 21, 2004 Opportunity and challenge China is experiencing a rapid development in informatization Huge investment on information
More informationImplicit Low-Order Unstructured Finite-Element Multiple Simulation Enhanced by Dense Computation using OpenACC
Fourth Workshop on Accelerator Programming Using Directives (WACCPD), Nov. 13, 2017 Implicit Low-Order Unstructured Finite-Element Multiple Simulation Enhanced by Dense Computation using OpenACC Takuma
More informationIntroduction to National Supercomputing Centre in Guangzhou and Opportunities for International Collaboration
Exascale Applications and Software Conference 21st 23rd April 2015, Edinburgh, UK Introduction to National Supercomputing Centre in Guangzhou and Opportunities for International Collaboration Xue-Feng
More informationJack Dongarra University of Tennessee Oak Ridge National Laboratory University of Manchester
Jack Dongarra University of Tennessee Oak Ridge National Laboratory University of Manchester 12/24/09 1 Take a look at high performance computing What s driving HPC Future Trends 2 Traditional scientific
More informationKengo Nakajima Information Technology Center, The University of Tokyo. SC15, November 16-20, 2015 Austin, Texas, USA
ppopen-hpc Open Source Infrastructure for Development and Execution of Large-Scale Scientific Applications on Post-Peta Scale Supercomputers with Automatic Tuning (AT) Kengo Nakajima Information Technology
More informationKey Technologies for 100 PFLOPS. Copyright 2014 FUJITSU LIMITED
Key Technologies for 100 PFLOPS How to keep the HPC-tree growing Molecular dynamics Computational materials Drug discovery Life-science Quantum chemistry Eigenvalue problem FFT Subatomic particle phys.
More informationInauguration Cartesius June 14, 2013
Inauguration Cartesius June 14, 2013 Hardware is Easy...but what about software/applications/implementation/? Dr. Peter Michielse Deputy Director 1 Agenda History Cartesius Hardware path to exascale: the
More informationCHAO YANG. Early Experience on Optimizations of Application Codes on the Sunway TaihuLight Supercomputer
CHAO YANG Dr. Chao Yang is a full professor at the Laboratory of Parallel Software and Computational Sciences, Institute of Software, Chinese Academy Sciences. His research interests include numerical
More informationHow to perform HPL on CPU&GPU clusters. Dr.sc. Draško Tomić
How to perform HPL on CPU&GPU clusters Dr.sc. Draško Tomić email: drasko.tomic@hp.com Forecasting is not so easy, HPL benchmarking could be even more difficult Agenda TOP500 GPU trends Some basics about
More informationResource virtualization and optimization via Grid and Cloud Computing
Resource virtualization and optimization via Grid and Cloud Computing Moon J Kim IBM Senior Technical Staff Member/Chief Architect/Master Inventor moonkim@us.ibm.com Grid Motivations Research Accelerate
More informationFaster Innovation - Accelerating SIMULIA Abaqus Simulations with NVIDIA GPUs. Baskar Rajagopalan Accelerated Computing, NVIDIA
Faster Innovation - Accelerating SIMULIA Abaqus Simulations with NVIDIA GPUs Baskar Rajagopalan Accelerated Computing, NVIDIA 1 Engineering & IT Challenges/Trends NVIDIA GPU Solutions AGENDA Abaqus GPU
More informationA Kernel-independent Adaptive Fast Multipole Method
A Kernel-independent Adaptive Fast Multipole Method Lexing Ying Caltech Joint work with George Biros and Denis Zorin Problem Statement Given G an elliptic PDE kernel, e.g. {x i } points in {φ i } charges
More informationExascale: challenges and opportunities in a power constrained world
Exascale: challenges and opportunities in a power constrained world Carlo Cavazzoni c.cavazzoni@cineca.it SuperComputing Applications and Innovation Department CINECA CINECA non profit Consortium, made
More informationUpdate on LRZ Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities. 2 Oct 2018 Prof. Dr. Dieter Kranzlmüller
Update on LRZ Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities 2 Oct 2018 Prof. Dr. Dieter Kranzlmüller 1 Leibniz Supercomputing Centre Bavarian Academy of Sciences and
More informationRealtime Photometry System for AST3 (AST3_RPS)
Realtime Photometry System for AST3 (AST3_RPS) Yu Ce ( 于策 ) yuce@tju.edu.cn School of Computer Science and Technology, Tianjin University ( 天津大学 ) Under the direction of Zhaohui Shang Xi an, Aug. 2010
More informationSupercomputer and grid infrastructure! in Poland!
Supercomputer and grid infrastructure in Poland Franciszek Rakowski Interdisciplinary Centre for Mathematical and Computational Modelling 12th INCF Nodes Workshop, 16.04.2015 Warsaw, Nencki Institute.
More informationPerformance Evaluation of Quantum ESPRESSO on SX-ACE. REV-A Workshop held on conjunction with the IEEE Cluster 2017 Hawaii, USA September 5th, 2017
Performance Evaluation of Quantum ESPRESSO on SX-ACE REV-A Workshop held on conjunction with the IEEE Cluster 2017 Hawaii, USA September 5th, 2017 Osamu Watanabe Akihiro Musa Hiroaki Hokari Shivanshu Singh
More informationAmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015
AmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015 Agenda Introduction to AmgX Current Capabilities Scaling V2.0 Roadmap for the future 2 AmgX Fast, scalable linear solvers, emphasis on iterative
More informationPractical Scientific Computing
Practical Scientific Computing Performance-optimised Programming Preliminary discussion, 17.7.2007 Dr. Ralf-Peter Mundani, mundani@tum.de Dipl.-Ing. Ioan Lucian Muntean, muntean@in.tum.de Dipl.-Geophys.
More informationPresentations: Jack Dongarra, University of Tennessee & ORNL. The HPL Benchmark: Past, Present & Future. Mike Heroux, Sandia National Laboratories
HPC Benchmarking Presentations: Jack Dongarra, University of Tennessee & ORNL The HPL Benchmark: Past, Present & Future Mike Heroux, Sandia National Laboratories The HPCG Benchmark: Challenges It Presents
More informationFujitsu s Approach to Application Centric Petascale Computing
Fujitsu s Approach to Application Centric Petascale Computing 2 nd Nov. 2010 Motoi Okuda Fujitsu Ltd. Agenda Japanese Next-Generation Supercomputer, K Computer Project Overview Design Targets System Overview
More informationHigh performance Computing and O&G Challenges
High performance Computing and O&G Challenges 2 Seismic exploration challenges High Performance Computing and O&G challenges Worldwide Context Seismic,sub-surface imaging Computing Power needs Accelerating
More informationNumerical Algorithm Co-Design multi-scale simulation at extreme scale
Co-Design 2014, Guangzhou, Nov. 6-8 Numerical Algorithm Co-Design multi-scale simulation at extreme scale Wei Ge Institute of Process Engineering (IPE), CAS Co-Design 2014, Guangzhou, Nov. 6-8 Co-Design
More informationSoftware and Performance Engineering for numerical codes on GPU clusters
Software and Performance Engineering for numerical codes on GPU clusters H. Köstler International Workshop of GPU Solutions to Multiscale Problems in Science and Engineering Harbin, China 28.7.2010 2 3
More informationFabio AFFINITO.
Introduction to High Performance Computing Fabio AFFINITO What is the meaning of High Performance Computing? What does HIGH PERFORMANCE mean??? 1976... Cray-1 supercomputer First commercial successful
More informationTESLA P100 PERFORMANCE GUIDE. Deep Learning and HPC Applications
TESLA P PERFORMANCE GUIDE Deep Learning and HPC Applications SEPTEMBER 217 TESLA P PERFORMANCE GUIDE Modern high performance computing (HPC) data centers are key to solving some of the world s most important
More informationHigh-Performance Scientific Computing
High-Performance Scientific Computing Instructor: Randy LeVeque TA: Grady Lemoine Applied Mathematics 483/583, Spring 2011 http://www.amath.washington.edu/~rjl/am583 World s fastest computers http://top500.org
More informationCUDA Experiences: Over-Optimization and Future HPC
CUDA Experiences: Over-Optimization and Future HPC Carl Pearson 1, Simon Garcia De Gonzalo 2 Ph.D. candidates, Electrical and Computer Engineering 1 / Computer Science 2, University of Illinois Urbana-Champaign
More informationScheduling Strategies for HPC as a Service (HPCaaS) for Bio-Science Applications
Scheduling Strategies for HPC as a Service (HPCaaS) for Bio-Science Applications Sep 2009 Gilad Shainer, Tong Liu (Mellanox); Jeffrey Layton (Dell); Joshua Mora (AMD) High Performance Interconnects for
More informationCS 668 Parallel Computing Spring 2011
CS 668 Parallel Computing Spring 2011 Prof. Fred Annexstein @proffreda fred.annexstein@uc.edu Office Hours: 11-1 MW or by appointment Tel: 513-556-1807 Meeting: TuTh 2:00-3:25 in RecCenter 3240 Lecture
More informationJapan s post K Computer Yutaka Ishikawa Project Leader RIKEN AICS
Japan s post K Computer Yutaka Ishikawa Project Leader RIKEN AICS HPC User Forum, 7 th September, 2016 Outline of Talk Introduction of FLAGSHIP2020 project An Overview of post K system Concluding Remarks
More informationIntroduction to High-Performance Computing
Introduction to High-Performance Computing Dr. Axel Kohlmeyer Associate Dean for Scientific Computing, CST Associate Director, Institute for Computational Science Assistant Vice President for High-Performance
More informationThe Architecture and the Application Performance of the Earth Simulator
The Architecture and the Application Performance of the Earth Simulator Ken ichi Itakura (JAMSTEC) http://www.jamstec.go.jp 15 Dec., 2011 ICTS-TIFR Discussion Meeting-2011 1 Location of Earth Simulator
More informationAdvanced Numerical Techniques for Cluster Computing
Advanced Numerical Techniques for Cluster Computing Presented by Piotr Luszczek http://icl.cs.utk.edu/iter-ref/ Presentation Outline Motivation hardware Dense matrix calculations Sparse direct solvers
More informationOptimizing Data Locality for Iterative Matrix Solvers on CUDA
Optimizing Data Locality for Iterative Matrix Solvers on CUDA Raymond Flagg, Jason Monk, Yifeng Zhu PhD., Bruce Segee PhD. Department of Electrical and Computer Engineering, University of Maine, Orono,
More informationMixed Precision Methods
Mixed Precision Methods Mixed precision, use the lowest precision required to achieve a given accuracy outcome " Improves runtime, reduce power consumption, lower data movement " Reformulate to find correction
More informationInfiniBand Strengthens Leadership as the Interconnect Of Choice By Providing Best Return on Investment. TOP500 Supercomputers, June 2014
InfiniBand Strengthens Leadership as the Interconnect Of Choice By Providing Best Return on Investment TOP500 Supercomputers, June 2014 TOP500 Performance Trends 38% CAGR 78% CAGR Explosive high-performance
More informationCMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC. Guest Lecturer: Sukhyun Song (original slides by Alan Sussman)
CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC Guest Lecturer: Sukhyun Song (original slides by Alan Sussman) Parallel Programming with Message Passing and Directives 2 MPI + OpenMP Some applications can
More informationIntroduction CPS343. Spring Parallel and High Performance Computing. CPS343 (Parallel and HPC) Introduction Spring / 29
Introduction CPS343 Parallel and High Performance Computing Spring 2018 CPS343 (Parallel and HPC) Introduction Spring 2018 1 / 29 Outline 1 Preface Course Details Course Requirements 2 Background Definitions
More informationICON for HD(CP) 2. High Definition Clouds and Precipitation for Advancing Climate Prediction
ICON for HD(CP) 2 High Definition Clouds and Precipitation for Advancing Climate Prediction High Definition Clouds and Precipitation for Advancing Climate Prediction ICON 2 years ago Parameterize shallow
More informationEULAG: high-resolution computational model for research of multi-scale geophysical fluid dynamics
Zbigniew P. Piotrowski *,** EULAG: high-resolution computational model for research of multi-scale geophysical fluid dynamics *Geophysical Turbulence Program, National Center for Atmospheric Research,
More informationSimulation of chaotic mixing dynamics in microdroplets
Simulation of chaotic mixing dynamics in microdroplets Hongbo Zhou, Liguo Jiang and Shuhuai Yao the Hong Kong University of Science and Technology Date: 20/10/2011 Presented at COMSOL Conference 2011 China
More informationGPU Clouds IGT Cloud Computing Summit Mordechai Butrashvily, CEO 2009 (c) All rights reserved
GPU Clouds IGT 2009 Cloud Computing Summit Mordechai Butrashvily, CEO moti@hoopoe-cloud.com 02/12/2009 Agenda Introduction to GPU Computing Future GPU architecture GPU on a Cloud: Visualization Computing
More informationHETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA
HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA STATE OF THE ART 2012 18,688 Tesla K20X GPUs 27 PetaFLOPS FLAGSHIP SCIENTIFIC APPLICATIONS
More informationMPI Alltoall Personalized Exchange on GPGPU Clusters: Design Alternatives and Benefits
MPI Alltoall Personalized Exchange on GPGPU Clusters: Design Alternatives and Benefits Ashish Kumar Singh, Sreeram Potluri, Hao Wang, Krishna Kandalla, Sayantan Sur, and Dhabaleswar K. Panda Network-Based
More informationHPC Capabilities at Research Intensive Universities
HPC Capabilities at Research Intensive Universities Purushotham (Puri) V. Bangalore Department of Computer and Information Sciences and UAB IT Research Computing UAB HPC Resources 24 nodes (192 cores)
More informationFlux Vector Splitting Methods for the Euler Equations on 3D Unstructured Meshes for CPU/GPU Clusters
Flux Vector Splitting Methods for the Euler Equations on 3D Unstructured Meshes for CPU/GPU Clusters Manfred Liebmann Technische Universität München Chair of Optimal Control Center for Mathematical Sciences,
More informationRadial Basis Function-Generated Finite Differences (RBF-FD): New Opportunities for Applications in Scientific Computing
Radial Basis Function-Generated Finite Differences (RBF-FD): New Opportunities for Applications in Scientific Computing Natasha Flyer National Center for Atmospheric Research Boulder, CO Meshes vs. Mesh-free
More informationComplexity and Advanced Algorithms. Introduction to Parallel Algorithms
Complexity and Advanced Algorithms Introduction to Parallel Algorithms Why Parallel Computing? Save time, resources, memory,... Who is using it? Academia Industry Government Individuals? Two practical
More informationHPC projects. Grischa Bolls
HPC projects Grischa Bolls Outline Why projects? 7th Framework Programme Infrastructure stack IDataCool, CoolMuc Mont-Blanc Poject Deep Project Exa2Green Project 2 Why projects? Pave the way for exascale
More informationGPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS
GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS Agenda Forming a GPGPU WG 1 st meeting Future meetings Activities Forming a GPGPU WG To raise needs and enhance information sharing A platform for knowledge
More informationPeta-Scale Simulations with the HPC Software Framework walberla:
Peta-Scale Simulations with the HPC Software Framework walberla: Massively Parallel AMR for the Lattice Boltzmann Method SIAM PP 2016, Paris April 15, 2016 Florian Schornbaum, Christian Godenschwager,
More informationCOST EFFICIENCY VS ENERGY EFFICIENCY. Anna Lepak Universität Hamburg Seminar: Energy-Efficient Programming Wintersemester 2014/2015
COST EFFICIENCY VS ENERGY EFFICIENCY Anna Lepak Universität Hamburg Seminar: Energy-Efficient Programming Wintersemester 2014/2015 TOPIC! Cost Efficiency vs Energy Efficiency! How much money do we have
More informationThe Development of Modeling, Simulation and Visualization Services for the Earth Sciences in the National Supercomputing Center of Korea
icas2013 (September 9~12, Annecy/France) The Development of Modeling, Simulation and Visualization Services for the Earth Sciences in the National Supercomputing Center of Korea Min-Su Joh, Joon-Eun Ahn,
More informationTESLA P100 PERFORMANCE GUIDE. HPC and Deep Learning Applications
TESLA P PERFORMANCE GUIDE HPC and Deep Learning Applications MAY 217 TESLA P PERFORMANCE GUIDE Modern high performance computing (HPC) data centers are key to solving some of the world s most important
More informationCOMPUTING ELEMENT EVOLUTION AND ITS IMPACT ON SIMULATION CODES
COMPUTING ELEMENT EVOLUTION AND ITS IMPACT ON SIMULATION CODES P(ND) 2-2 2014 Guillaume Colin de Verdière OCTOBER 14TH, 2014 P(ND)^2-2 PAGE 1 CEA, DAM, DIF, F-91297 Arpajon, France October 14th, 2014 Abstract:
More informationCommodity Cluster Computing
Commodity Cluster Computing Ralf Gruber, EPFL-SIC/CAPA/Swiss-Tx, Lausanne http://capawww.epfl.ch Commodity Cluster Computing 1. Introduction 2. Characterisation of nodes, parallel machines,applications
More informationA Scalable GPU-Based Compressible Fluid Flow Solver for Unstructured Grids
A Scalable GPU-Based Compressible Fluid Flow Solver for Unstructured Grids Patrice Castonguay and Antony Jameson Aerospace Computing Lab, Stanford University GTC Asia, Beijing, China December 15 th, 2011
More informationFull Vehicle Dynamic Analysis using Automated Component Modal Synthesis. Peter Schartz, Parallel Project Manager ClusterWorld Conference June 2003
Full Vehicle Dynamic Analysis using Automated Component Modal Synthesis Peter Schartz, Parallel Project Manager Conference Outline Introduction Background Theory Case Studies Full Vehicle Dynamic Analysis
More informationTowards Exascale Computing with the Atmospheric Model NUMA
Towards Exascale Computing with the Atmospheric Model NUMA Andreas Müller, Daniel S. Abdi, Michal Kopera, Lucas Wilcox, Francis X. Giraldo Department of Applied Mathematics Naval Postgraduate School, Monterey
More informationTimothy Lanfear, NVIDIA HPC
GPU COMPUTING AND THE Timothy Lanfear, NVIDIA FUTURE OF HPC Exascale Computing will Enable Transformational Science Results First-principles simulation of combustion for new high-efficiency, lowemision
More informationContinued Investigation of Small-Scale Air-Sea Coupled Dynamics Using CBLAST Data
Continued Investigation of Small-Scale Air-Sea Coupled Dynamics Using CBLAST Data Dick K.P. Yue Center for Ocean Engineering Department of Mechanical Engineering Massachusetts Institute of Technology Cambridge,
More informationNVIDIA GPU TECHNOLOGY UPDATE
NVIDIA GPU TECHNOLOGY UPDATE May 2015 Axel Koehler Senior Solutions Architect, NVIDIA NVIDIA: The VISUAL Computing Company GAMING DESIGN ENTERPRISE VIRTUALIZATION HPC & CLOUD SERVICE PROVIDERS AUTONOMOUS
More informationA Brief Introduction to CFINS
A Brief Introduction to CFINS Center for Intelligent and Networked Systems (CFINS) Department of Automation Tsinghua University Beijing 100084, China 6/30/2016 1 Outline Mission People Professors Students
More informationGreen Supercomputing
Green Supercomputing On the Energy Consumption of Modern E-Science Prof. Dr. Thomas Ludwig German Climate Computing Centre Hamburg, Germany ludwig@dkrz.de Outline DKRZ 2013 and Climate Science The Exascale
More informationCUDA. Matthew Joyner, Jeremy Williams
CUDA Matthew Joyner, Jeremy Williams Agenda What is CUDA? CUDA GPU Architecture CPU/GPU Communication Coding in CUDA Use cases of CUDA Comparison to OpenCL What is CUDA? What is CUDA? CUDA is a parallel
More informationPortable and Productive Performance with OpenACC Compilers and Tools. Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc.
Portable and Productive Performance with OpenACC Compilers and Tools Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc. 1 Cray: Leadership in Computational Research Earth Sciences
More informationIt s a Multicore World. John Urbanic Pittsburgh Supercomputing Center
It s a Multicore World John Urbanic Pittsburgh Supercomputing Center Waiting for Moore s Law to save your serial code start getting bleak in 2004 Source: published SPECInt data Moore s Law is not at all
More informationFujitsu s Technologies to the K Computer
Fujitsu s Technologies to the K Computer - a journey to practical Petascale computing platform - June 21 nd, 2011 Motoi Okuda FUJITSU Ltd. Agenda The Next generation supercomputer project of Japan The
More informationHigh-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs
High-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs Gordon Erlebacher Department of Scientific Computing Sept. 28, 2012 with Dimitri Komatitsch (Pau,France) David Michea
More information