China's HPC development: a brief review and perspectives

1 China's HPC development: a brief review and perspectives
Depei Qian, Beihang University/Sun Yat-sen University
International Symposium on Impact of extreme scale computing, Tokyo, Japan, Nov. 2, 2017

2 Outline
- A brief review
- The new HPC key project in China
- Issues in exascale system development

3 A Brief review

4 Three 863 key projects on HPC
- High Performance Computer and Core Software
  - Research on resource sharing and collaborative work
  - Grid-enabled applications in multiple areas
  - TFlops computers and China National Grid (CNGrid) testbed
- High Productivity Computer and Grid Service Environment
  - High productivity: application performance, efficiency in program development, portability of programs, robustness of the system
  - Emphasizing service features of the HPC environment
  - Developing peta-scale computers
- High Productivity Computer and Application Service Environment
  - Developing 100PF computers
  - Developing large scale HPC applications
  - Upgrading of CNGrid

5 High performance computers
- 2013: Tianhe-2
  - CPU+MIC heterogeneous accelerated architecture
  - 54.9 PF peak, 33.9 PF Linpack
  - No. 1 in Top500 six times, from 2013 to 2015
  - Installed at the National Supercomputing Center in Guangzhou
  - Will be upgraded to 100PF this year
- 2016: Sunway TaihuLight
  - Implemented with home-grown Shenwei many-core processors, 10 million cores in total
  - 125 PF peak, 93 PF Linpack
  - No. 1 in Top500 in June and Nov. of 2016
  - Installed at the National Supercomputing Center in Wuxi
[Photos: Tianhe-2, Sunway Bluelight]
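The peak and Linpack figures quoted on this slide imply each machine's Linpack efficiency (Linpack divided by peak). A minimal Python sketch of that arithmetic, using only the numbers above; the function and variable names are illustrative:

```python
def linpack_efficiency(linpack_pf: float, peak_pf: float) -> float:
    """Return sustained Linpack performance as a fraction of peak."""
    return linpack_pf / peak_pf

# (Linpack PF, peak PF) as quoted on the slide.
systems = {
    "Tianhe-2": (33.9, 54.9),
    "Sunway TaihuLight": (93.0, 125.0),
}

efficiency = {name: linpack_efficiency(lp, pk) for name, (lp, pk) in systems.items()}

for name, eff in efficiency.items():
    print(f"{name}: {eff:.1%} of peak")
```

Both ratios come out above the 60% Linpack efficiency that the new key project later sets as its exascale target.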

6 Tianhe-2 upgrade
Items: Tianhe-2 | Tianhe-2A
- Nodes & performance: nodes with Intel CPU + KNC, 54.9 Pflops | nodes with Intel CPU + Matrix (node counts and the Tianhe-2A Pflops value are missing in the source)
- Interconnection: 10 Gbps, 1.57 us | 14 Gbps, 1 us
- Memory: 1.4 PB | 3.4 PB
- Storage: 12.4 PB, 512 GB/s | 19 PB, 1 TB/s
- Energy efficiency: 17.8 MW, 1.9 Gflops/W | about 18 MW, >5 Gflops/W
- Heterogeneous software: MPSS for Intel KNC | OpenMP/OpenCL for Matrix
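The energy-efficiency row can be cross-checked by multiplying efficiency by power draw. The sketch below is only that arithmetic; the slide does not say whether the Gflops/W figures refer to peak or Linpack performance, so the interpretation in the comments is an assumption.

```python
def implied_pflops(gflops_per_watt: float, megawatts: float) -> float:
    """Performance (PFlops) implied by an energy efficiency and a power draw."""
    gflops = gflops_per_watt * megawatts * 1e6  # MW -> W, giving GFlops
    return gflops / 1e6                          # GFlops -> PFlops

# Tianhe-2 column: 17.8 MW at 1.9 Gflops/W.
tianhe2 = implied_pflops(1.9, 17.8)
# Tianhe-2A column: "about 18 MW" at ">5 Gflops/W", so this is a lower bound.
tianhe2a_min = implied_pflops(5.0, 18.0)

print(f"Tianhe-2:  {tianhe2:.2f} PFlops implied")
print(f"Tianhe-2A: more than about {tianhe2a_min:.0f} PFlops implied")
```

The Tianhe-2 result is close to the 33.9 PF Linpack figure rather than the 54.9 PF peak, which suggests (but does not prove) that the 1.9 Gflops/W value is Linpack-based.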

7 Matrix-2000 accelerator
Chip specification
- 4 super-nodes (SN), 8 clusters per SN, 4 cores per cluster
- Core: self-defined 256-bit vector ISA, 16 DP flops/cycle per core
- On-chip interconnection
- Peak performance at 1.2 GHz: 4 SNs x 8 clusters x 4 cores x 16 flops x 1.2 GHz (the Tflops value is missing in the source)
- Peak power dissipation: ~240 W
- Interface: 8 DDR channels, x16 PCIE 3.0 EP port
[Figure: chip block diagram with super-nodes SN0 to SN3, each holding clusters 0 to 7 of four cores (C), plus PCIE and DDR4 interfaces]
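The peak-performance product on this slide can be evaluated directly. The sketch below only multiplies the factors stated on the slide; the constant names are illustrative.

```python
# Factors exactly as listed in the chip specification.
SUPER_NODES = 4
CLUSTERS_PER_SN = 8
CORES_PER_CLUSTER = 4
DP_FLOPS_PER_CYCLE = 16   # double-precision flops per cycle per core
CLOCK_GHZ = 1.2

cores = SUPER_NODES * CLUSTERS_PER_SN * CORES_PER_CLUSTER
peak_gflops = cores * DP_FLOPS_PER_CYCLE * CLOCK_GHZ  # GHz * flops/cycle = Gflops
peak_tflops = peak_gflops / 1000

print(f"{cores} cores, {peak_tflops:.4f} Tflops peak")
```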

8 Compute Nodes
Heterogeneous compute nodes
- Intel Xeon CPU x2, Matrix-2000 x2
- Memory: 192GB
- Interconnection: 14G proprietary network
- Peak performance: 5.34 Tflops
[Figure: node block diagram with two CPUs linked by QPI, two MT-2000 accelerators on 16X PCIE, DDR4 memory, NIC and comm. port, Gb LAN (GE), PCH via DMI, IPMB, CPLD]

9 HPC environment
- 2016: China National Grid, composed of 17 national supercomputing centers and HPC centers
- World-leading class computing resources

10 HPC applications
- 2016: HPC applications in many domains
- 10-million-core parallelism reached; Gordon Bell Prize in 2016
- Developed a number of application software packages, adopted by production systems
  - aircraft design
  - high speed train design
  - oil & gas exploration
  - new drug discovery
  - ensemble weather forecasting
  - bio-information
  - car development
  - design optimization of large fluid machinery
  - electromagnetic computation

11 Problems identified
- Lack of a long-term national program for high performance computing
- Weak in kernel HPC technologies
  - processor/accelerator
  - novel devices (new memory, storage, and network)
  - large scale parallel algorithms and program implementation
- Application software is the bottleneck
  - applications rely on imported commercial software: expensive, small scale parallelism, restricted by export regulation
- Shortage of cross-disciplinary talent
  - Not enough people with both domain and IT knowledge
  - Lack of multi-disciplinary collaboration

12 The new key HPC project in China

13 Reform of research system in China
- The national research and development system is undergoing a reform
- 100+ different national R&D programs/initiatives are merged into 5 tracks of national programs
  - Basic research program (NSFC)
  - Mega-science and technology programs
  - Key R&D program (former 863, 973, enabling programs)
  - Enterprise innovation program
  - Facility/talent program

14 A new key project on HPC
- High performance computing has been identified as a priority subject under the key R&D program (track 3)
- Strategic studies and planning have been conducted since 2013
- A proposal on HPC in the 13th five-year plan was submitted in early 2015
- The key R&D project was approved in Oct. by a multi-government agency committee led by the MOST

15 Motivations
- The key value of exascale computers identified
  - Addressing the grand challenge problems: energy shortage, pollution, climate change
  - Enabling industry transformation
    - supporting development of important products: high speed train, commercial aircraft, automobile
    - promoting economy transformation
  - For social development and people's benefit: new drug discovery, precision medicine, digital media
  - Enabling scientific discovery: high energy physics, computational chemistry, new material, astrophysics
- Promote computer industry by technology transfer
- Developing HPC systems by self-controllable technologies
  - a lesson learnt from the recent embargo regulation

16 Major tasks
- Exa-scale computer development
  - R&D on novel architectures and key technologies of the exa-scale computer
  - Developing the exa-scale computer based on home-grown processors
  - Technology transfer to promote development of high-end servers
- HPC applications development
  - Basic research on exa-scale modeling methods and parallel algorithms
  - Developing high performance application software
  - Establishing the HPC application eco-system
- HPC environment development
  - Developing software and platform for the national HPC environment
  - Upgrading of the national HPC environment CNGrid
  - Developing service systems on the national HPC environment
Each task will cover basic research, key technology development, and application demonstration

17 Task 1: Exa-scale computer development
Basic research
- Novel high performance interconnect
  - Theoretical work on the novel interconnect based on the enabling technologies of 3D chips, silicon photonics and on-chip networks
- Programming & execution models for exa-scale systems
  - new programming models for heterogeneous systems
  - Improving programming efficiency

18 Task 1: Exa-scale computer development
Key technology
- prototype systems for verifying the exa-scale system technologies
- 3 typical applications to verify the design
- exa-scale computer technologies
  - architecture optimized for multi-objectives
  - highly efficient computing node
  - high performance processor/accelerator design
  - exa-scale system software
  - scalable interconnect
  - parallel I/O
  - exa-scale infrastructure
  - energy efficiency
  - exa-scale system reliability

19 Task 1: Exa-scale computer development
Exa-scale computer development
- exaflops in peak
- Linpack efficiency >60%
- 10PB memory
- EB storage
- 30GF/W energy efficiency
- interconnect >500Gbps
- large scale system management and resource scheduling
- easy-to-use parallel programming environment
- system monitoring and fault tolerance
- support for large scale applications
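The headline targets on this slide imply a power envelope, a minimum Linpack result and a memory-to-compute ratio. The sketch below works these out under two assumptions that the slide does not state: that the 30 GF/W figure is measured against peak performance, and that 1 PB means 10^15 bytes.

```python
PEAK_FLOPS = 1e18           # "exaflops in peak"
EFFICIENCY_GF_PER_W = 30.0  # "30GF/W energy efficiency" (assumed relative to peak)
LINPACK_FRACTION = 0.60     # "Linpack efficiency >60%"
MEMORY_BYTES = 10e15        # "10PB memory" (assuming 1 PB = 10**15 bytes)

# Power needed to deliver the peak at the target efficiency.
power_mw = PEAK_FLOPS / (EFFICIENCY_GF_PER_W * 1e9) / 1e6
# Lower bound on sustained Linpack performance.
min_linpack_pf = PEAK_FLOPS * LINPACK_FRACTION / 1e15
# Memory capacity per unit of peak compute.
bytes_per_flop = MEMORY_BYTES / PEAK_FLOPS

print(f"Power envelope: about {power_mw:.1f} MW")
print(f"Linpack target: more than {min_linpack_pf:.0f} PFlops")
print(f"Memory ratio:   {bytes_per_flop:.2f} bytes per flop/s")
```

Under these assumptions the machine would draw roughly 33 MW, about twice the 17.8 MW quoted for Tianhe-2 earlier in the talk.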

20 Task 2: HPC application development
Basic research
- computable modeling and computational methods for exa-scale systems
- scalable highly efficient parallel algorithms and parallel libraries for exa-scale systems
Key technology
- programming framework for exa-scale software development

21 Task 2: HPC application development
Application software
- Numerical devices
  - numerical nuclear reactor
  - numerical aircraft
  - numerical earth system
  - numerical engine
- High performance domain application software
  - complex engineering projects and critical equipment
  - numerical simulation of ocean
  - design of energy-efficient large fluid machinery
  - drug discovery
  - electromagnetic environment simulation
  - ship design
  - oil exploration
  - digital media rendering
- High performance application software for research
  - material science
  - high energy physics
  - astrophysics
  - life science

22 Task 2: HPC application development
HPC application software development
- establishing a national-level R&D center for HPC application software
- building a platform for HPC software development and optimization
  - tools for performance/energy efficiency and pre-/post-processing
- building a software resource repository
- developing typical domain application software
- a joint effort involving national supercomputing centers, universities, and institutes

23 Task 3: HPC environment development
Basic research
- models and architecture for computational services
- virtual data space
Key technology
- mechanism and platform for the national HPC environment, providing technical support for service mode operation
- upgrading the national HPC environment (CNGrid)

24 Task 3: HPC environment development
Services
- integrated business platform, e.g.
  - complex product design
  - HPC-enabled EDA platform
- application villages
  - innovation and optimization of industrial products
  - drug discovery
- SME computing and simulation platform
- platform for HPC education
  - provide computing resources and services to undergraduate and graduate students

25 Projects supported
- The first call for proposals was issued in Feb. (the year and the number of projects supported are missing in the source)
- The second call was issued in Oct. 2016; 18 projects supported, mainly application software
- The third call was issued in Oct. 2017; the review process will begin soon

26 Sugon exa-prototype: specification
Metrics: prototype | exascale | ratio (most numeric values are missing in the source)
- Computing: node peak (TF), no. of nodes, no. of silicon units, system peak (PF)
- Storage: memory (PB), storage (PB)
- Network: silicon switch
  - Dim. global net: 2*1*3 | 8*8*6 | 4*8*2
  - Dim. local net: 2*3*2 | 2*3*2 | 1
- Power consumption
- Energy efficiency (GF/W)
- Size W*D*H (m): 6*6*6 | 24*24*6 | 16
- Total cabinets

27 Sugon exa-prototype: general design
- Computing sub-system
  - home-grown X86 processor + DCU accelerator
  - in 2019: CPU > 1TF, DCU > 15TF
- Network sub-system
  - 400Gbps 6D-torus, 384 routers
- Storage sub-system
  - Distributed storage architecture, extensible to EB
- Infrastructure sub-system
  - Immersion phase-change cooling
  - High voltage DC power supply
  - Hierarchical 3D assembly
- Software sub-system
  - Mature and complete libs and programming tools
  - Light-weight virtualization and software-defined architecture
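For a torus network, the switch count is the product of the dimension sizes and the worst-case hop count is the sum of half of each dimension (rounded down), because every ring has wrap-around links. The sketch below applies this to the global (8*8*6) and local (2*3*2) dimensions listed for the exascale column of the specification slide. Treating those two 3D networks as one combined 6D torus is an assumption made here for illustration, not something the slides state.

```python
from math import prod

def torus_size(dims):
    """Number of switch positions in a torus with the given dimension sizes."""
    return prod(dims)

def torus_diameter(dims):
    """Worst-case shortest-path hop count: each ring of size k contributes k // 2."""
    return sum(k // 2 for k in dims)

GLOBAL_DIMS = (8, 8, 6)   # "Dim. global net", exascale column
LOCAL_DIMS = (2, 3, 2)    # "Dim. local net", exascale column
DIMS_6D = GLOBAL_DIMS + LOCAL_DIMS  # assumed combination into a 6D torus

print("global positions:", torus_size(GLOBAL_DIMS))
print("6D positions:    ", torus_size(DIMS_6D))
print("6D diameter:     ", torus_diameter(DIMS_6D), "hops")
```

The product of the global dimensions equals the 384 routers quoted on this slide, which fits this reading, although the slides do not spell out how the router count relates to the two networks.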

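28 Sugon exa-prototype: hierarchical 3D structure
Columns: level | nodes per unit | units in the prototype | units in the exascale machine (values are missing in the source)
- Node pair
- Super node
- Silicon block
- Silicon cube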

29 Sugon exa-prototype: computing node
- Node: 2 CPUs and 2 DCUs
- CPU and DCU interconnected by GOP high speed bus
- Memory bandwidth: 2667 Mbps, DDR4
- Memory capacity: 128G DDR4
- Interconnect: 200Gbps fast fabric
[Figure: board diagram with CPU0 to CPU3 and DCU0 to DCU3 linked by 16x GOP*2, PCIe 16x and XGKR*2 connections, U/R/LR DDR4 DIMMs, 2x200G NIC, AIU, M.2 and SATA/PCIe 4x storage, BIOS, midplane]

30 Tianhe exa-prototype: flexible architecture
- Reconfigurable flexible architecture, meeting the requirements of different applications
- Virtualized OS, providing a configurable computing environment
- Software-defined interconnect, guaranteeing bandwidth and fault isolation
- Hierarchical storage QoS guarantee technology, providing stable and independent storage bandwidth
- Dynamic optimization, providing architecture-aware optimization
[Figure: stack of application, compiler, runtime, OS, computing node, computing sub-system, IO storage sub-system]

31 Tianhe exa-prototype: technical route
[Figure: special purpose accelerator, customized many-core and general purpose many-core compared on performance, energy efficiency and ease of use]
- General purpose many-core is adopted by the prototype

32 Tianhe exa-prototype: technical features
- Flexible architecture to meet the requirements of different applications
- New generation many-core processor, pursuing balanced computing and memory access
- Optoelectronic integrated high speed interconnect, greatly improving performance and energy efficiency
- Fault tolerance based on new storage medium
- Accurate heat dissipation, trading off manufacturing cost against operational cost

33 Tianhe exa-prototype: interconnect
- High-radix router for low power consumption, low cost and high density
- Exascale communication need: single node > 400Gbps
- Chip power budget <200W, at most 12 ports of 400 Gbps
- Co-design of ultra short distance Serdes PHY, PHY coding, and link layer
- Optoelectronic integration for interconnect

34 Sunway exa-prototype: hardware system
- System composed of computing, interconnect, storage, power supply and cooling
- New generation many-core based system, 512 nodes, performance >4PFlops
- Self-developed network chip, fat-tree interconnect, point to point bandwidth > 200Gbps
- Storage subsystem based on Shenwei storage server
- Self-developed high voltage (300V) DC power supply
- Highly efficient water cooling, copper cold plate with enhanced heat transfer
[Figure labels: two-level fat-tree interconnect, DC power supply system, new generation many-core processor, node assembled with enhanced heat-transfer cold plate, computing cabinet, water-cooling unit]

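35 Sunway exa-prototype: computing node
- Connection to the interconnect: 2 x 25Gbps x 4
- Point to point one-way bandwidth: 200Gbps
- Peak performance >8TFlops
- Memory > 64GB
[Figure: node diagram with core groups 0 to 3, DDR3/DDR4 memory, network interfaces to the high-speed computing network and the Ethernet management network, PCI-E, Ethernet, BMC, and clock, processor and power management with node monitoring]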
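The node and system figures on the two Sunway hardware slides are mutually consistent, as the short check below shows. It uses only numbers quoted on those slides and treats the ">" values as lower bounds; the constant names are illustrative.

```python
# Node link: "2 x 25Gbps x 4" should match the 200 Gbps one-way bandwidth.
LINK_FACTOR = 2
LANE_GBPS = 25
LANE_COUNT = 4
node_bandwidth_gbps = LINK_FACTOR * LANE_GBPS * LANE_COUNT

# System: 512 nodes at more than 8 TFlops each should exceed 4 PFlops.
NODES = 512
NODE_PEAK_TFLOPS_MIN = 8.0
system_peak_pflops_min = NODES * NODE_PEAK_TFLOPS_MIN / 1000

print(f"Node bandwidth: {node_bandwidth_gbps} Gbps")
print(f"System peak:    more than {system_peak_pflops_min:.3f} PFlops")
```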

36 Sunway exa-prototype: software system
Basic software for home-grown many-core processor
- Components: parallel OS, high performance storage management system, parallel compiler, parallel program development environment
- Highly efficient compiler for heterogeneous many-core
- SIMD auto-vectorization
- High performance basic math libs
- Integrated multi-domain OS for heterogeneous many-core
- Dynamic storage management
- Supporting MPI-1, MPI-2, MPI-3 and OpenMP 3.0; compatible with OpenACC 2.0
- Debugger for heterogeneous many-core

37 Sunway exa-prototype: demo applications
- Porting applications on TaihuLight; performance optimization is being conducted
- Floating platform design
- Seismic
- Aircraft design
- Ocean model

38 Sunway exa-prototype: applications
10-million-core applications on TaihuLight
- 2016
  - Fully Implicit Solver for Atmospheric Dynamics
  - Surface Wave Modeling
  - Phase Field Simulations of Coarsening Dynamics
  - Atomistic Simulation of Silicon Nanowires
  - Run-away Electron Trajectory Simulation
  - Genome Functional Annotation and Homeotic Gene Building
  - Spacecraft CFD Numerical Simulation
- 2017
  - Extreme-scale Graph Processing Framework
  - Simulation of Planetary Rings
  - Simulations of Quantum Spin Liquid States via PEPS++
  - Molecular Dynamics Simulation of Condensed Covalent Materials
  - Cryo-EM Macromolecule Structure Determination
  - Redesigning CAM-SE
  - Nonlinear Earthquake Simulation

39 Issues in exascale system development

40 Major challenges to exa-scale systems
- Power consumption
- Performance obtained by applications
- Programmability
- Resilience
- How to make tradeoffs between performance, power consumption, and programmability?
- How to achieve continuous non-stop operation?
- How to adapt to a wide range of applications with reasonable efficiency?

41 Architecture
- Novel architectures beyond the current heterogeneous accelerated/manycore-based ones are expected
- Co-processor or partitioned heterogeneous architecture?
  - Low utilization of the co-processor in some applications that use the CPU only
  - Bottleneck in moving data between CPU and co-processor
- Application-aware architecture
  - on-chip integration of special purpose units (idea from Prof. Andrew Chien)
  - using the right tool to do the right things
  - dynamically reconfigurable? how to program?

42 Memory system
- Pursuing large capacity, low latency, high bandwidth
- Increase capacity and lower power consumption by using DRAM/NVM together
  - Data placement issue
- Improving bandwidth and latency by using the 3D stacking technology
- Reduce data movement by placing the data closer to processing
  - HBM/HMC near processor
  - On-chip DRAM
  - Simple functions in memory
- Reduce data copy cost by using unified memory space in heterogeneous architecture

43 Interconnect
- Pursuing low latency, high bandwidth and low energy consumption
- Adopt new technologies
  - silicon photonics communication between components
  - optical interconnect / communication
  - miniature optical devices
- High scalability, adapting to exascale system interconnect requirements
  - Connecting 10,000+ nodes
  - Low-hop, low-latency topology
  - Reliable and intelligent routing

44 Programming the heterogeneous systems
- Addressing the issues in programming the heterogeneous parallel systems
  - efficient expression of the parallelism, dependence, data sharing, execution semantics
  - problem decomposition appropriate for heterogeneous systems
- Improving programming by means of a holistic approach
  - new programming models
  - programming language extensions and compilers
  - parallel debugging
  - runtime support and optimization
  - architectural support

45 Computational models and algorithms
- Full-chain innovation: mathematical methods, computer algorithms, algorithm implementation and optimization
- A good mathematical method is often more effective than hardware improvement and algorithm optimization
- Architecture-aware algorithm implementation and optimization is necessary for heterogeneous systems
- Domain-specific libraries for improving software productivity and performance

46 Resilience
- Resilience is one of the key issues of the exa-scale system
  - Large scale of the system: 50K to 100K nodes
  - Huge number of components
  - Very short MTBF
- Long non-stop operation is required for solving large scale problems
- Reliability measures are required at different levels, including device, node, and system levels
- Software/hardware coordination is necessary
  - fast context saving and recovery for checkpointing in case of short MTBF
  - fault tolerance at the algorithm and application software level
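The link between node count, MTBF and checkpointing can be illustrated with a rough model. The sketch below assumes independent, exponentially distributed node failures (so system MTBF is node MTBF divided by node count) and uses Young's first-order approximation for the checkpoint interval, sqrt(2 x checkpoint cost x MTBF). Neither the model nor the numbers (a 5-year node MTBF and a 5-minute checkpoint) come from the talk; they are hypothetical values chosen for illustration.

```python
import math

def system_mtbf_hours(node_mtbf_hours: float, nodes: int) -> float:
    """System MTBF assuming independent, exponentially distributed node failures."""
    return node_mtbf_hours / nodes

def young_interval_hours(checkpoint_cost_hours: float, mtbf_hours: float) -> float:
    """Young's approximation of the checkpoint interval: sqrt(2 * C * MTBF)."""
    return math.sqrt(2.0 * checkpoint_cost_hours * mtbf_hours)

NODE_MTBF_HOURS = 5 * 365 * 24     # hypothetical: one failure per node every 5 years
CHECKPOINT_COST_HOURS = 5 / 60     # hypothetical: 5 minutes to write a checkpoint

for nodes in (50_000, 100_000):    # node range quoted on the slide
    mtbf = system_mtbf_hours(NODE_MTBF_HOURS, nodes)
    interval = young_interval_hours(CHECKPOINT_COST_HOURS, mtbf)
    print(f"{nodes} nodes: system MTBF {mtbf * 60:.0f} min, "
          f"checkpoint every {interval * 60:.0f} min")
```

With these inputs the system MTBF drops below one hour, and the checkpoint cost becomes a large fraction of the interval, which is where the approximation itself starts to lose accuracy. That is one way to see why the slide asks for fast context saving and for fault tolerance at the algorithm level.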

47 Importance of the tools
- Development and optimization of large scale parallel software require scalable tools
- Particularly important for systems implemented with home-grown processors, which current commercial and research tools do not support
- Three kinds of default tools required
  - Parallel debugger for correctness
  - Performance tuner for performance
  - Energy optimizer for energy efficiency

48 Urgent need for eco-system
- The eco-system for exa-scale systems based on home-grown processors is urgently needed
  - languages, compilers, OS, runtime
  - tools
  - application development support
  - application software
- Need to attract hardware manufacturers and third party software developers
  - product family instead of a single machine
- Collaboration between industry, academia and end-users required

49 Thank you!


More information

The Architecture and the Application Performance of the Earth Simulator

The Architecture and the Application Performance of the Earth Simulator The Architecture and the Application Performance of the Earth Simulator Ken ichi Itakura (JAMSTEC) http://www.jamstec.go.jp 15 Dec., 2011 ICTS-TIFR Discussion Meeting-2011 1 Location of Earth Simulator

More information

INSPUR and HPC Innovation

INSPUR and HPC Innovation INSPUR and HPC Innovation Dong Qi (Forrest) Product manager Inspur dongqi@inspur.com Contents 1 2 3 4 5 Inspur introduction HPC Challenge and Inspur HPC strategy HPC cases Inspur contribution to HPC community

More information

Introduction to the K computer

Introduction to the K computer Introduction to the K computer Fumiyoshi Shoji Deputy Director Operations and Computer Technologies Div. Advanced Institute for Computational Science RIKEN Outline ü Overview of the K

More information

Petascale Computing Research Challenges

Petascale Computing Research Challenges Petascale Computing Research Challenges - A Manycore Perspective Stephen Pawlowski Intel Senior Fellow GM, Architecture & Planning CTO, Digital Enterprise Group Yesterday, Today and Tomorrow in HPC ENIAC

More information

HPC Issues for DFT Calculations. Adrian Jackson EPCC

HPC Issues for DFT Calculations. Adrian Jackson EPCC HC Issues for DFT Calculations Adrian Jackson ECC Scientific Simulation Simulation fast becoming 4 th pillar of science Observation, Theory, Experimentation, Simulation Explore universe through simulation

More information

Oak Ridge National Laboratory Computing and Computational Sciences

Oak Ridge National Laboratory Computing and Computational Sciences Oak Ridge National Laboratory Computing and Computational Sciences OFA Update by ORNL Presented by: Pavel Shamis (Pasha) OFA Workshop Mar 17, 2015 Acknowledgments Bernholdt David E. Hill Jason J. Leverman

More information

Path to Exascale? Intel in Research and HPC 2012

Path to Exascale? Intel in Research and HPC 2012 Path to Exascale? Intel in Research and HPC 2012 Intel s Investment in Manufacturing New Capacity for 14nm and Beyond D1X Oregon Development Fab Fab 42 Arizona High Volume Fab 22nm Fab Upgrades D1D Oregon

More information

Fujitsu s Technologies to the K Computer

Fujitsu s Technologies to the K Computer Fujitsu s Technologies to the K Computer - a journey to practical Petascale computing platform - June 21 nd, 2011 Motoi Okuda FUJITSU Ltd. Agenda The Next generation supercomputer project of Japan The

More information

Green Supercomputing

Green Supercomputing Green Supercomputing On the Energy Consumption of Modern E-Science Prof. Dr. Thomas Ludwig German Climate Computing Centre Hamburg, Germany ludwig@dkrz.de Outline DKRZ 2013 and Climate Science The Exascale

More information

Aggregation of Real-Time System Monitoring Data for Analyzing Large-Scale Parallel and Distributed Computing Environments

Aggregation of Real-Time System Monitoring Data for Analyzing Large-Scale Parallel and Distributed Computing Environments Aggregation of Real-Time System Monitoring Data for Analyzing Large-Scale Parallel and Distributed Computing Environments Swen Böhm 1,2, Christian Engelmann 2, and Stephen L. Scott 2 1 Department of Computer

More information

White paper FUJITSU Supercomputer PRIMEHPC FX100 Evolution to the Next Generation

White paper FUJITSU Supercomputer PRIMEHPC FX100 Evolution to the Next Generation White paper FUJITSU Supercomputer PRIMEHPC FX100 Evolution to the Next Generation Next Generation Technical Computing Unit Fujitsu Limited Contents FUJITSU Supercomputer PRIMEHPC FX100 System Overview

More information

Inauguration Cartesius June 14, 2013

Inauguration Cartesius June 14, 2013 Inauguration Cartesius June 14, 2013 Hardware is Easy...but what about software/applications/implementation/? Dr. Peter Michielse Deputy Director 1 Agenda History Cartesius Hardware path to exascale: the

More information

INTRODUCTION TO THE ARCHER KNIGHTS LANDING CLUSTER. Adrian

INTRODUCTION TO THE ARCHER KNIGHTS LANDING CLUSTER. Adrian INTRODUCTION TO THE ARCHER KNIGHTS LANDING CLUSTER Adrian Jackson adrianj@epcc.ed.ac.uk @adrianjhpc Processors The power used by a CPU core is proportional to Clock Frequency x Voltage 2 In the past, computers

More information

Preparing GPU-Accelerated Applications for the Summit Supercomputer

Preparing GPU-Accelerated Applications for the Summit Supercomputer Preparing GPU-Accelerated Applications for the Summit Supercomputer Fernanda Foertter HPC User Assistance Group Training Lead foertterfs@ornl.gov This research used resources of the Oak Ridge Leadership

More information

The Red Storm System: Architecture, System Update and Performance Analysis

The Red Storm System: Architecture, System Update and Performance Analysis The Red Storm System: Architecture, System Update and Performance Analysis Douglas Doerfler, Jim Tomkins Sandia National Laboratories Center for Computation, Computers, Information and Mathematics LACSI

More information

Interconnect Your Future

Interconnect Your Future Interconnect Your Future Gilad Shainer 2nd Annual MVAPICH User Group (MUG) Meeting, August 2014 Complete High-Performance Scalable Interconnect Infrastructure Comprehensive End-to-End Software Accelerators

More information

China's supercomputer surprises U.S. experts

China's supercomputer surprises U.S. experts China's supercomputer surprises U.S. experts John Markoff Reproduced from THE HINDU, October 31, 2011 Fast forward: A journalist shoots video footage of the data storage system of the Sunway Bluelight

More information

Challenges in High Performance Computing. William Gropp

Challenges in High Performance Computing. William Gropp Challenges in High Performance Computing William Gropp www.cs.illinois.edu/~wgropp 2 What is HPC? High Performance Computing is the use of computing to solve challenging problems that require significant

More information

创新释放高性能计算潜力 林俊 : 华为服务器领域首席架构师

创新释放高性能计算潜力 林俊 : 华为服务器领域首席架构师 创新释放高性能计算潜力 林俊 : 华为服务器领域首席架构师 Market Trends 2 2 Requirement for Compute Security Big Data Cloud Mobility Internet of Things Industry 4.0 Intelligent City 2020 Millions of MIPS Opportunity for Innovation

More information

Advances of parallel computing. Kirill Bogachev May 2016

Advances of parallel computing. Kirill Bogachev May 2016 Advances of parallel computing Kirill Bogachev May 2016 Demands in Simulations Field development relies more and more on static and dynamic modeling of the reservoirs that has come a long way from being

More information

Introduction CPS343. Spring Parallel and High Performance Computing. CPS343 (Parallel and HPC) Introduction Spring / 29

Introduction CPS343. Spring Parallel and High Performance Computing. CPS343 (Parallel and HPC) Introduction Spring / 29 Introduction CPS343 Parallel and High Performance Computing Spring 2018 CPS343 (Parallel and HPC) Introduction Spring 2018 1 / 29 Outline 1 Preface Course Details Course Requirements 2 Background Definitions

More information

The Mont-Blanc approach towards Exascale

The Mont-Blanc approach towards Exascale http://www.montblanc-project.eu The Mont-Blanc approach towards Exascale Alex Ramirez Barcelona Supercomputing Center Disclaimer: Not only I speak for myself... All references to unavailable products are

More information

Race to Exascale: Opportunities and Challenges. Avinash Sodani, Ph.D. Chief Architect MIC Processor Intel Corporation

Race to Exascale: Opportunities and Challenges. Avinash Sodani, Ph.D. Chief Architect MIC Processor Intel Corporation Race to Exascale: Opportunities and Challenges Avinash Sodani, Ph.D. Chief Architect MIC Processor Intel Corporation Exascale Goal: 1-ExaFlops (10 18 ) within 20 MW by 2018 1 ZFlops 100 EFlops 10 EFlops

More information

Search for Optimal Network Topologies for Supercomputers 寻找超级计算机优化的网络拓扑结构

Search for Optimal Network Topologies for Supercomputers 寻找超级计算机优化的网络拓扑结构 Search for Optimal Network Topologies for Supercomputers 寻找超级计算机优化的网络拓扑结构 GUO, Meng 郭猛 guomeng@sdas.org Shandong Computer Science Center (National Supercomputer Center in Jinan) 山东省计算中心 ( 国家超级计算济南中心 )

More information

The Future of High Performance Interconnects

The Future of High Performance Interconnects The Future of High Performance Interconnects Ashrut Ambastha HPC Advisory Council Perth, Australia :: August 2017 When Algorithms Go Rogue 2017 Mellanox Technologies 2 When Algorithms Go Rogue 2017 Mellanox

More information

Stockholm Brain Institute Blue Gene/L

Stockholm Brain Institute Blue Gene/L Stockholm Brain Institute Blue Gene/L 1 Stockholm Brain Institute Blue Gene/L 2 IBM Systems & Technology Group and IBM Research IBM Blue Gene /P - An Overview of a Petaflop Capable System Carl G. Tengwall

More information

Dynamical Exascale Entry Platform

Dynamical Exascale Entry Platform DEEP Dynamical Exascale Entry Platform 2 nd IS-ENES Workshop on High performance computing for climate models 30.01.2013, Toulouse, France Estela Suarez The research leading to these results has received

More information

The Road to ExaScale. Advances in High-Performance Interconnect Infrastructure. September 2011

The Road to ExaScale. Advances in High-Performance Interconnect Infrastructure. September 2011 The Road to ExaScale Advances in High-Performance Interconnect Infrastructure September 2011 diego@mellanox.com ExaScale Computing Ambitious Challenges Foster Progress Demand Research Institutes, Universities

More information

High performance Computing and O&G Challenges

High performance Computing and O&G Challenges High performance Computing and O&G Challenges 2 Seismic exploration challenges High Performance Computing and O&G challenges Worldwide Context Seismic,sub-surface imaging Computing Power needs Accelerating

More information

Performance Optimizations via Connect-IB and Dynamically Connected Transport Service for Maximum Performance on LS-DYNA

Performance Optimizations via Connect-IB and Dynamically Connected Transport Service for Maximum Performance on LS-DYNA Performance Optimizations via Connect-IB and Dynamically Connected Transport Service for Maximum Performance on LS-DYNA Pak Lui, Gilad Shainer, Brian Klaff Mellanox Technologies Abstract From concept to

More information

Mapping MPI+X Applications to Multi-GPU Architectures

Mapping MPI+X Applications to Multi-GPU Architectures Mapping MPI+X Applications to Multi-GPU Architectures A Performance-Portable Approach Edgar A. León Computer Scientist San Jose, CA March 28, 2018 GPU Technology Conference This work was performed under

More information

TSUBAME-KFC : Ultra Green Supercomputing Testbed

TSUBAME-KFC : Ultra Green Supercomputing Testbed TSUBAME-KFC : Ultra Green Supercomputing Testbed Toshio Endo,Akira Nukada, Satoshi Matsuoka TSUBAME-KFC is developed by GSIC, Tokyo Institute of Technology NEC, NVIDIA, Green Revolution Cooling, SUPERMICRO,

More information

Towards Exascale Computing with the Atmospheric Model NUMA

Towards Exascale Computing with the Atmospheric Model NUMA Towards Exascale Computing with the Atmospheric Model NUMA Andreas Müller, Daniel S. Abdi, Michal Kopera, Lucas Wilcox, Francis X. Giraldo Department of Applied Mathematics Naval Postgraduate School, Monterey

More information

Accelerating High Performance Computing.

Accelerating High Performance Computing. Accelerating High Performance Computing http://www.nvidia.com/tesla Computing The 3 rd Pillar of Science Drug Design Molecular Dynamics Seismic Imaging Reverse Time Migration Automotive Design Computational

More information

Intel Many Integrated Core (MIC) Architecture

Intel Many Integrated Core (MIC) Architecture Intel Many Integrated Core (MIC) Architecture Karl Solchenbach Director European Exascale Labs BMW2011, November 3, 2011 1 Notice and Disclaimers Notice: This document contains information on products

More information

The Stampede is Coming: A New Petascale Resource for the Open Science Community

The Stampede is Coming: A New Petascale Resource for the Open Science Community The Stampede is Coming: A New Petascale Resource for the Open Science Community Jay Boisseau Texas Advanced Computing Center boisseau@tacc.utexas.edu Stampede: Solicitation US National Science Foundation

More information

Game-changing Extreme GPU computing with The Dell PowerEdge C4130

Game-changing Extreme GPU computing with The Dell PowerEdge C4130 Game-changing Extreme GPU computing with The Dell PowerEdge C4130 A Dell Technical White Paper This white paper describes the system architecture and performance characterization of the PowerEdge C4130.

More information

The Earth Simulator Current Status

The Earth Simulator Current Status The Earth Simulator Current Status SC13. 2013 Ken ichi Itakura (Earth Simulator Center, JAMSTEC) http://www.jamstec.go.jp 2013 SC13 NEC BOOTH PRESENTATION 1 JAMSTEC Organization Japan Agency for Marine-Earth

More information

Managing HPC Active Archive Storage with HPSS RAIT at Oak Ridge National Laboratory

Managing HPC Active Archive Storage with HPSS RAIT at Oak Ridge National Laboratory Managing HPC Active Archive Storage with HPSS RAIT at Oak Ridge National Laboratory Quinn Mitchell HPC UNIX/LINUX Storage Systems ORNL is managed by UT-Battelle for the US Department of Energy U.S. Department

More information

HPC Algorithms and Applications

HPC Algorithms and Applications HPC Algorithms and Applications Intro Michael Bader Winter 2015/2016 Intro, Winter 2015/2016 1 Part I Scientific Computing and Numerical Simulation Intro, Winter 2015/2016 2 The Simulation Pipeline phenomenon,

More information

It s a Multicore World. John Urbanic Pittsburgh Supercomputing Center Parallel Computing Scientist

It s a Multicore World. John Urbanic Pittsburgh Supercomputing Center Parallel Computing Scientist It s a Multicore World John Urbanic Pittsburgh Supercomputing Center Parallel Computing Scientist Waiting for Moore s Law to save your serial code started getting bleak in 2004 Source: published SPECInt

More information

Fujitsu HPC Roadmap Beyond Petascale Computing. Toshiyuki Shimizu Fujitsu Limited

Fujitsu HPC Roadmap Beyond Petascale Computing. Toshiyuki Shimizu Fujitsu Limited Fujitsu HPC Roadmap Beyond Petascale Computing Toshiyuki Shimizu Fujitsu Limited Outline Mission and HPC product portfolio K computer*, Fujitsu PRIMEHPC, and the future K computer and PRIMEHPC FX10 Post-FX10,

More information

Introduction of Oakforest-PACS

Introduction of Oakforest-PACS Introduction of Oakforest-PACS Hiroshi Nakamura Director of Information Technology Center The Univ. of Tokyo (Director of JCAHPC) Outline Supercomputer deployment plan in Japan What is JCAHPC? Oakforest-PACS

More information

Distributed Dense Linear Algebra on Heterogeneous Architectures. George Bosilca

Distributed Dense Linear Algebra on Heterogeneous Architectures. George Bosilca Distributed Dense Linear Algebra on Heterogeneous Architectures George Bosilca bosilca@eecs.utk.edu Centraro, Italy June 2010 Factors that Necessitate to Redesign of Our Software» Steepness of the ascent

More information

Pedraforca: a First ARM + GPU Cluster for HPC

Pedraforca: a First ARM + GPU Cluster for HPC www.bsc.es Pedraforca: a First ARM + GPU Cluster for HPC Nikola Puzovic, Alex Ramirez We ve hit the power wall ALL computers are limited by power consumption Energy-efficient approaches Multi-core Fujitsu

More information

HPC Innovation Lab Update. Dell EMC HPC Community Meeting 3/28/2017

HPC Innovation Lab Update. Dell EMC HPC Community Meeting 3/28/2017 HPC Innovation Lab Update Dell EMC HPC Community Meeting 3/28/2017 Dell EMC HPC Innovation Lab charter Design, develop and integrate Heading HPC systems Lorem ipsum Flexible reference dolor sit amet, architectures

More information

Timothy Lanfear, NVIDIA HPC

Timothy Lanfear, NVIDIA HPC GPU COMPUTING AND THE Timothy Lanfear, NVIDIA FUTURE OF HPC Exascale Computing will Enable Transformational Science Results First-principles simulation of combustion for new high-efficiency, lowemision

More information

Atos announces the Bull sequana X1000 the first exascale-class supercomputer. Jakub Venc

Atos announces the Bull sequana X1000 the first exascale-class supercomputer. Jakub Venc Atos announces the Bull sequana X1000 the first exascale-class supercomputer Jakub Venc The world is changing The world is changing Digital simulation will be the key contributor to overcome 21 st century

More information

In-Network Computing. Paving the Road to Exascale. 5th Annual MVAPICH User Group (MUG) Meeting, August 2017

In-Network Computing. Paving the Road to Exascale. 5th Annual MVAPICH User Group (MUG) Meeting, August 2017 In-Network Computing Paving the Road to Exascale 5th Annual MVAPICH User Group (MUG) Meeting, August 2017 Exponential Data Growth The Need for Intelligent and Faster Interconnect CPU-Centric (Onload) Data-Centric

More information

INTRODUCTION TO THE ARCHER KNIGHTS LANDING CLUSTER. Adrian

INTRODUCTION TO THE ARCHER KNIGHTS LANDING CLUSTER. Adrian INTRODUCTION TO THE ARCHER KNIGHTS LANDING CLUSTER Adrian Jackson a.jackson@epcc.ed.ac.uk @adrianjhpc Processors The power used by a CPU core is proportional to Clock Frequency x Voltage 2 In the past,

More information

C-DAC HPC Trends & Activities in India. Abhishek Das Scientist & Team Leader HPC Solutions Group C-DAC Ministry of Communications & IT Govt of India

C-DAC HPC Trends & Activities in India. Abhishek Das Scientist & Team Leader HPC Solutions Group C-DAC Ministry of Communications & IT Govt of India C-DAC HPC Trends & Activities in India Abhishek Das Scientist & Team Leader HPC Solutions Group C-DAC Ministry of Communications & IT Govt of India Presentation Outline A brief profile of C-DAC, India

More information

Cray XD1 Supercomputer Release 1.3 CRAY XD1 DATASHEET

Cray XD1 Supercomputer Release 1.3 CRAY XD1 DATASHEET CRAY XD1 DATASHEET Cray XD1 Supercomputer Release 1.3 Purpose-built for HPC delivers exceptional application performance Affordable power designed for a broad range of HPC workloads and budgets Linux,

More information

High Performance Computing with Fujitsu

High Performance Computing with Fujitsu High Performance Computing with Fujitsu Ivo Doležel 0 2017 FUJITSU FUJITSU Software HPC Cluster Suite A complete HPC software stack solution HPC cluster general characteristics HPC clusters consist primarily

More information

High Performance Computing with Accelerators

High Performance Computing with Accelerators High Performance Computing with Accelerators Volodymyr Kindratenko Innovative Systems Laboratory @ NCSA Institute for Advanced Computing Applications and Technologies (IACAT) National Center for Supercomputing

More information

INSPUR and HPC Innovation. Dong Qi (Forrest) Oversea PM

INSPUR and HPC Innovation. Dong Qi (Forrest) Oversea PM INSPUR and HPC Innovation Dong Qi (Forrest) Oversea PM dongqi@inspur.com Contents 1 2 3 4 5 Inspur introduction HPC Challenge and Inspur HPC strategy HPC cases Inspur contribution to HPC community Inspur

More information

Practical Scientific Computing

Practical Scientific Computing Practical Scientific Computing Performance-optimized Programming Preliminary discussion: July 11, 2008 Dr. Ralf-Peter Mundani, mundani@tum.de Dipl.-Ing. Ioan Lucian Muntean, muntean@in.tum.de MSc. Csaba

More information

COMP 633 Parallel Computing.

COMP 633 Parallel Computing. COMP 633 Parallel Computing http://www.cs.unc.edu/~prins/classes/633/ Parallel computing What is it? multiple processors cooperating to solve a single problem hopefully faster than a single processor!

More information

PORTING CP2K TO THE INTEL XEON PHI. ARCHER Technical Forum, Wed 30 th July Iain Bethune

PORTING CP2K TO THE INTEL XEON PHI. ARCHER Technical Forum, Wed 30 th July Iain Bethune PORTING CP2K TO THE INTEL XEON PHI ARCHER Technical Forum, Wed 30 th July Iain Bethune (ibethune@epcc.ed.ac.uk) Outline Xeon Phi Overview Porting CP2K to Xeon Phi Performance Results Lessons Learned Further

More information