Inauguration Cartesius, June 14, 2013
Hardware is Easy... but what about software/applications/implementation?
Dr. Peter Michielse, Deputy Director
Agenda
  History
  Cartesius
  Hardware path to exascale: the next frontier
  Back to the future - actual improvements
  What can we offer to you, here and now?
1983-2008: National Supercomputing Services
  CDC Cyber 205 (1984): 1 CPU, 100 MFlop/s
  Cray C916/121024 (2003): 12 CPUs, 12 GFlop/s
  SGI Altix 3700 (2003): 1400 CPUs, 3.2 TFlop/s
  IBM Power6 (2008): 3456 cores, 65 TFlop/s
New national supercomputer Cartesius
  Cartesius 2013: 270 TFlop/s
  Cartesius end 2014: > 1 PFlop/s
So we face a factor of 10M in peak performance in about 30 years.
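The factor of 10M can be checked from the figures quoted on these slides: 100 MFlop/s for the CDC Cyber 205 (1984) and more than 1 PFlop/s planned for Cartesius at the end of 2014. The sketch below is only an illustration; it assumes smooth exponential growth over the 30 years, which is a simplification, and the variable names are made up for this example.

```python
import math

# Figures quoted on the slides.
cyber205_flops = 100e6      # CDC Cyber 205 (1984): 100 MFlop/s
cartesius_flops = 1e15      # Cartesius end 2014: > 1 PFlop/s (lower bound)
years = 2014 - 1984         # about 30 years

# Overall growth factor in peak performance.
factor = cartesius_flops / cyber205_flops

# Assumption: constant exponential growth between the two data points.
annual_growth = factor ** (1.0 / years)
doubling_time_years = math.log(2) / math.log(annual_growth)

print(f"overall factor:       {factor:.0e}")
print(f"growth per year:      {annual_growth:.2f}x")
print(f"doubling time (yrs):  {doubling_time_years:.2f}")
```

Under that assumption, peak performance grew by roughly a factor of 1.7 per year.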
High-level architecture Cartesius
InfiniBand FDR14 low-latency network, connecting:
  Fat node island: 32 nodes, 1,024 cores, 256 GB/node
  Multiple thin node islands: 32 nodes, 1,024 cores, 64 GB/node, 2 GB/core
  Interactive nodes: 16 cores, 128 GB/node
  Multiple service & management nodes
  180 TB home file system
  > 5 PB scratch & project Lustre file systems
Performance roadmap and plans
  Huygens (Sept. 2008 - June 2013): peak performance 65 TF, average application capacity 1.0
  Cartesius Phase 1 (July 2013): peak performance 270 TF, application capacity 3.4-13.0
  Cartesius Phase 2 (2H 2014): peak performance > 1 PF, application capacity 11.8-48.3
On-demand plans (2H 2014):
  Performance growth relative to user demand and scientific challenges
  Accelerator options to be investigated with user applications; this will start as soon as possible
  Accelerator platform operational before year end
The next (or final?) frontier
Copyright: Paramount Pictures.
The hardware road to the future
Table from Rick Stevens, Argonne National Laboratory, USA
Hardware: challenging, but considered doable.
Only rely on hardware?
Although Moore's law seems to continue, individual CPU cores do not get faster:
  Huygens Power6 core at 18.8 Gflop/s peak performance (2008)
  Currently installed cores in Cartesius at around the same peak performance (double flops/cycle, half frequency)
Scaling to millions of cores is not straightforward:
  Implementation issues
  Algorithmic design
New hardware:
  Accelerators, GPUs, many-core, FPGA, personalities, ...
Reliability and resilience:
  How to deal with to-be-expected frequent hardware failures?
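The remark "double flops/cycle, half frequency" rests on the usual rule that per-core peak performance is clock frequency times floating-point operations per cycle. The sketch below illustrates that arithmetic. The 4.7 GHz clock and 4 flops/cycle used for the Power6 core are assumptions chosen because they reproduce the 18.8 Gflop/s on the slide; the second core is a hypothetical one that simply halves the clock and doubles the flops per cycle, not a specification of the Cartesius processors.

```python
def peak_gflops(freq_ghz: float, flops_per_cycle: int) -> float:
    """Per-core peak performance in Gflop/s: frequency x flops per cycle."""
    return freq_ghz * flops_per_cycle

# Assumed Power6 parameters (not stated on the slide): 4.7 GHz, 4 flops/cycle.
power6_core = peak_gflops(4.7, 4)

# Hypothetical newer core: half the frequency, double the flops per cycle.
newer_core = peak_gflops(4.7 / 2, 4 * 2)

print(f"Power6 core: {power6_core:.1f} Gflop/s")
print(f"Newer core:  {newer_core:.1f} Gflop/s")
```

Both cores come out at the same peak, so any gain has to come from using more cores and wider vector units, which is a software and implementation matter.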
June 13, 2007: Huygens part I
At the 2007 inauguration of Huygens part I, we said:
  Until 2032: "Calculatio fortunae" may hold for hardware development (classical, FPGA, GPU, quantum, ...)
  But it has to be assisted (or even more than that) by software development and intelligent implementation
  Climate research, turbulence modelling: getting to a 1 M application performance improvement in 25 years will probably require 1 k coming from software and implementation...
Let's see how far we got in 6 years.
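To judge progress after 6 years against a 25-year target, the target has to be translated into an interim milestone. The sketch below does this under the assumption of a constant improvement rate over the whole period, which the slide does not state; the function and variable names are illustrative only.

```python
YEARS_TOTAL = 25    # 2007 to 2032
YEARS_ELAPSED = 6   # 2007 to 2013

def interim_factor(target: float,
                   years_total: int = YEARS_TOTAL,
                   years_elapsed: int = YEARS_ELAPSED) -> float:
    """Improvement expected after years_elapsed, assuming a constant
    (geometric) rate that reaches `target` after years_total."""
    return target ** (years_elapsed / years_total)

# 1 k from software and implementation, 1 M overall (slide figures).
software_6y = interim_factor(1e3)
overall_6y = interim_factor(1e6)
hardware_6y = overall_6y / software_6y   # remaining share left to hardware

print(f"software/implementation after 6 years: {software_6y:.1f}x")
print(f"hardware after 6 years:                {hardware_6y:.1f}x")
print(f"overall after 6 years:                 {overall_6y:.1f}x")
```

With a constant rate, software and implementation would need to have delivered a bit more than 5x by 2013 to stay on track for 1 k by 2032.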
Improvements over 6 years
Prof. Henk Dijkstra (UU), climate research:
  Implementation improvement: 4x over the past 6 years, no algorithmic improvement
  Next challenge: coupled ocean/atmosphere; ocean (0.1 degree, 45 levels), atmosphere (0.5 degree, 45 levels); grid 5*10^8 + 10^8; 100 years
Improvements over 6 years
Prof. Arthur Veldman and dr. Roel Verstappen (both RUG), turbulence modeling:
  Algorithm/implementation improvement: 100x over the past 18 years (on average about 4.64x per 6 years)
  Next challenge: DNS on fast car (2015), Re = 10^7, grid 10^13, flops 10^22
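Two quick calculations put these turbulence figures in perspective: the average gain per 6 years implied by 100x in 18 years, and the run time of a 10^22 flop simulation. The sketch assumes a constant improvement rate and, optimistically, that the application sustains the full 1 PFlop/s peak of Cartesius Phase 2; real codes sustain only a fraction of peak, so the run time is a lower bound.

```python
# Average improvement per 6-year period, assuming a constant rate:
# 18 years contain three 6-year periods, so take the cube root of 100.
six_year_factor = 100 ** (6 / 18)

# Work for the "DNS on fast car" challenge (slide figures).
total_flops = 1e22
grid_points = 1e13
flops_per_grid_point = total_flops / grid_points

# Assumption: the code sustains the full 1 PFlop/s peak (optimistic).
sustained_flops_per_s = 1e15
runtime_days = total_flops / sustained_flops_per_s / 86400

print(f"average gain per 6 years: {six_year_factor:.2f}x")
print(f"flops per grid point:     {flops_per_grid_point:.0e}")
print(f"run time at 1 PFlop/s:    {runtime_days:.0f} days")
```

Even at full peak the simulation would occupy the whole machine for months, which is why further algorithmic and implementation gains matter.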
EC vision on HPC
15 February 2012: "HPC: Europe's place in a global race". Some conclusions and related recommendations:
  Renew the HPC strategy, similar to the decision to set up the European Space Agency (ESA) in 1975;
  Implement Research and Development projects concerning exascale roadmaps, including both hardware and software issues;
  Europe has an outstanding position in scalable codes and expertise. Investing in exascale software development in these fields contributes to keeping competitiveness in areas of significant importance for science and industry.
But: Member States have to take care of their own national (Tier-1) systems!
What does that mean for you?
Let's indeed assume that "Hardware is Easy"
  Question may be answered by ASML, Bull and Intel this afternoon
Remember that the user is on top of the food chain
Focus on:
  Algorithms
  Implementation
  Scaling
  Curriculum Scientific Computing
What does that mean for you?
Expertise is available for Cartesius and Lisa:
  Helpdesk
  Parallelisation and optimisation of codes
  Implementation improvements
  Scaling
  DCCP projects
  Wim Nieuwpoort Award
  Parallel I/O techniques
  Visualisation and rendering
  Training (also customised)
  3rd party codes
  PRACE and DECI access requests
And let's not forget: also for grid, HPC cloud, Hadoop, Beehub, ...
From SURF SciencePark