Green Supercomputing


Green Supercomputing: On the Energy Consumption of Modern E-Science. Prof. Dr. Thomas Ludwig, German Climate Computing Centre (DKRZ), Hamburg, Germany, ludwig@dkrz.de

Outline: DKRZ 2013 and Climate Science; The Exascale Era (2020); Energy Efficiency Research; The Future.

DKRZ 2013 and Climate Science

DKRZ in Hamburg

IBM Power6 Computer System: rank 368 in the TOP500 of June 2013, 8,064 cores, 115 TFLOPS Linpack, 6 PB of disks.

Sun StorageTek Tape Library: 100 PB storage capacity, 90 tape drives, HPSS HSM system.

Mission of DKRZ: to provide high-performance computing platforms, sophisticated and high-capacity data management, and superior service for premium climate science. DKRZ is operated as a non-profit company with the Max Planck Society as principal shareholder and has 70+ staff.

Climate Modelling

Energy Costs at DKRZ: 2 MW for computers, storage, cooling and the building. Annual budget for power: >2 M€. Currently we use certified renewable energy; otherwise this would mean ca. 10,000 t of CO2 per year. High-performance computing centres draw 1-10 MW. Energy costs are becoming the limiting factor for HPC usage.

Energy Cost History at DKRZ

Business Model at Google

Business Model at DKRZ

Energy Costs for Science. For the 5th IPCC assessment report, the German part uses ca. 30 M core-hours at DKRZ; DKRZ offers ca. 60 M core-hours per year. Energy costs for the German IPCC contribution: ca. 1 M€, i.e. 9,000,000 kWh to solution with DKRZ's Blizzard system, or 4,500,000 kg of CO2 with regular German electricity. Climate researchers should predict climate change... and not produce it!

Total Costs at DKRZ. Total cost of ownership (TCO):
Building: 25 M€ / 25 y
Computer and storage: 36 M€ / 5 y
Electricity: 2 M€/y
Staff: 3 M€/y
Tapes: 0.5 M€/y
Maintenance: 1 M€/y
Other costs: 1.5 M€/y
TCO of DKRZ per year: approximately 16 M€. Processor hours per year: approximately 60 M. Price per processor hour: about 27 cents.
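The 27-cent figure follows from the slide's own numbers by straightforward arithmetic. A minimal sketch in Python, with amounts in EUR as stated above and the building and the machines annualized over their write-off periods:

```python
# Annualized total cost of ownership at DKRZ (figures from the slide, in EUR).
building    = 25_000_000 / 25   # building written off over 25 years
system      = 36_000_000 / 5    # computer and storage written off over 5 years
electricity = 2_000_000         # per year
other       = 6_000_000         # staff, tapes, maintenance and other costs per year

tco_per_year = building + system + electricity + other   # ~16.2 M EUR
core_hours   = 60_000_000                                # processor hours delivered per year

print(f"TCO per year: {tco_per_year / 1e6:.1f} M EUR")
print(f"Price per processor hour: {tco_per_year / core_hours * 100:.0f} cents")
```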

Total Costs for Science at DKRZ. TCO of DKRZ per year: approximately 16 M€. Publications per year: let's assume 400. Mean price per publication: 40,000 €. This could be justifiable for climate science, but what about astrophysics and, e.g., galaxy collisions?

Costs in the Petascale Era. Operational costs: electricity.
2002: Earth Simulator (Yokohama): $600 million, 3 MW, $2.5 million/year
2010: Tianhe-1A (Tianjin): $88 million, 4 MW, $3.5 million/year
2011: K computer (Kobe): around $1 billion, 12 MW, $10 million/year
2011: Sequoia (Livermore): $250 million, 8 MW, $7 million/year
2012: SuperMUC (Munich): €135 million, 3 MW, €5 million/year

The Exascale Era

The Exascale Era. In approximately 2019 we will hit the next improvement by a factor of 1000. Same procedure as every ten years? Just more powerful computers, i.e. exaflops? Just more disks, i.e. exabytes? From Petascale to Exascale: evolution or revolution? Terascale to Petascale was an evolution: just more of MPI and Fortran/C/C++.

Expected Systems Architecture (systems of 2009 vs. projected 2018, and the difference between today and 2018):
System peak: 2 Pflop/s → 1 Eflop/s, O(1000)
Power: 6 MW → ~20 MW
System memory: 0.3 PB → 32-64 PB [0.03 Bytes/Flop], O(100)
Node performance: 125 GF → 1, 2 or 15 TF, O(10)-O(100)
Node memory BW: 25 GB/s → 2-4 TB/s [0.002 Bytes/Flop], O(100)
Node concurrency: 12 → O(1k) or O(10k), O(100)-O(1000)
Total node interconnect BW: 3.5 GB/s → 200-400 GB/s (1:4 or 1:8 from memory BW), O(100)
System size (nodes): 18,700 → O(100,000) or O(1M), O(10)-O(100)
Total concurrency: 225,000 → O(billion) [O(10) to O(100) for latency hiding], O(10,000)
Storage: 15 PB → 500-1000 PB (>10x system memory is the minimum), O(10)-O(100)
I/O: 0.2 TB/s → 60 TB/s (how long to drain the machine), O(100)
MTTI: days → O(1 day), O(10) reduction

The Exascale Revolution: some sort of disruptiveness. Many more processors; more diverse hardware (e.g. GPUs); mandatory fault tolerance; mandatory energy efficiency.

TCO Considerations. Research and development costs: exascale programs to build an exaflops computer with exabyte storage systems are under way in the USA, Japan, Europe, China and Russia, a multi-billion investment in R&D. Investment cost: the first EFLOPS computer will cost $500-$1,500 million. Operational costs: 20 MW, about $20 million/year.

Exascale Science: the usual finally-we-can-do suspects. Biology: simulate the human brain. Particle physics: the Higgs boson has been found, what next? Medical science: eliminate cancer, Alzheimer's, etc. Astrophysics: understand galaxy collisions. However, what we learn here is that modern science depends on high-performance computing!

Exascale Climate Research. Finally: cloud computing.

Power Consumption Development (figure by courtesy of ZIH Dresden).

Power Efficiency Development (figure by courtesy of ZIH Dresden).

Costs of Science in the Future. Problem for exascale-HPC-based science: the power consumption might be too high, and nobody will be willing to pay for it. Consequences / requirements: look for higher energy efficiency in all components. What if we are not successful? It will harm Western science and engineering productivity.

Research and Development. Goal: sustained HPC-based science and engineering. EESI, the European Exascale Software Initiative: WG 4.2 Software Ecosystems, subtopic Power Management, works on concepts for research on energy efficiency. In general, there is much research on energy efficiency in Europe.

Energy Efficiency Research

Levels of Activity: Hardware, Software, Brainware.

Energy Efficient Hardware. Progress at all levels is needed: lower energy consumption in all parts. We see progress in semiconductor technology, and we see other technologies on the horizon: carbon nanotubes, biocomputers, quantum computers. Power proportionality is needed: high consumption under high load, low consumption under low load.

Poor Power Proportionality (chart: relative power usage and power efficiency versus resource utilization).

High Power Proportionality (chart: relative power usage and power efficiency versus resource utilization).
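To make the two charts above concrete: the usual linear model puts relative power at an idle baseline plus a load-dependent share, and power efficiency is utilization divided by relative power. A minimal sketch, where the idle fractions of 60% ("poor") and 10% ("high") are illustrative assumptions rather than measured values:

```python
# Linear power model: relative power P(u) = idle + (1 - idle) * u, utilization u in [0, 1].
# Power efficiency is useful work per power drawn, i.e. u / P(u).
def relative_power(u, idle_fraction):
    return idle_fraction + (1.0 - idle_fraction) * u

def power_efficiency(u, idle_fraction):
    return u / relative_power(u, idle_fraction)

for u in (0.1, 0.5, 1.0):
    poor = power_efficiency(u, idle_fraction=0.6)   # draws 60% of peak power when idle
    high = power_efficiency(u, idle_fraction=0.1)   # nearly power-proportional hardware
    print(f"utilization {u:4.0%}: poor proportionality {poor:.2f}, high proportionality {high:.2f}")
```

At low utilization the poorly proportional machine wastes most of its power on the idle baseline, which is exactly what the first chart shows.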


Power Proportional Hardware. High energy proportionality: CPUs, in particular those for mobile and embedded systems; they use energy-saving modes with smooth mode switching. Poor energy proportionality: disk drives, network components, DRAM; mode switching comes with reactivation penalties.
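Whether such a mode switch pays off depends on how long the idle phase is compared to the reactivation penalty. A minimal sketch of the break-even check; the 8 W active, 1 W standby and 40 J transition figures are illustrative assumptions for a disk drive, not measurements:

```python
# A power-saving mode is only worthwhile if the energy saved while sleeping
# exceeds the energy spent on switching the device down and reactivating it.
def worth_sleeping(idle_seconds, p_active_w, p_sleep_w, transition_j):
    saved_j = (p_active_w - p_sleep_w) * idle_seconds
    return saved_j > transition_j

# Break-even time here: transition_j / (p_active_w - p_sleep_w) = 40 / 7 ≈ 5.7 s.
for idle in (2, 10, 60):
    decision = "sleep" if worth_sleeping(idle, 8.0, 1.0, 40.0) else "stay active"
    print(f"{idle:3d} s idle gap -> {decision}")
```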

Abstraction Levels: Modeling, Mathematics, Algorithms, Programming Language, Parallel Program, Machine Language, Libraries, Operating System, Hardware, Machine Room. Crosscutting: Performance Tuning, Energy Efficiency Tuning, Compiler, Job Scheduler.

Energy Efficiency Research. Modeling: Which energy consumption can be expected with which hardware for which application? Which is the optimal system for an application: HPC system, grid, or cloud? How much energy does it take to move my application there? Simulation: How does environment A behave compared to environment B? How does rearranged software behave? Measurement: Where can I measure what (hardware/software)?

Energy Efficiency Research. Evaluation: visualize and understand measurements; automatic analysis of energy bottlenecks? Improved concepts: facility management, computer hardware, operating system, middleware, programming, job and data scheduling. Benchmarking.

Research at the University of Hamburg (I): Energy Efficient Cluster Computing (eeclust) [completed]. Goal: switch off all unused hardware and minimize the reactivation penalty. Analyze parallel programs with trace-based analysis, find phases of resource inactivity, and switch resources into power-saving modes during these phases; the instrumentation is entered into the source code.
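The eeclust idea can be illustrated with a small, hypothetical sketch: scan a per-device activity trace for idle gaps long enough to hide the reactivation penalty and emit switching hints for those gaps. The trace format, the 6-second threshold and the printed hints are assumptions for illustration only, not the project's actual tooling or API:

```python
# Hypothetical sketch of trace-based idle-phase detection in the spirit of eeclust:
# find gaps between activity phases of a device that are long enough to pay for
# a switch into a power-saving mode (compare the break-even check above).
from dataclasses import dataclass

@dataclass
class Phase:
    start: float   # seconds since program start
    end: float

def idle_gaps(phases, min_gap_s):
    """Yield (gap_start, gap_end) for every idle gap of at least min_gap_s seconds."""
    for prev, nxt in zip(phases, phases[1:]):
        if nxt.start - prev.end >= min_gap_s:
            yield prev.end, nxt.start

# Illustrative disk-activity trace of one process: two early writes, a long
# I/O-free compute phase, then a checkpoint.
disk_activity = [Phase(0.0, 1.5), Phase(2.0, 3.0), Phase(45.0, 47.5)]

for start, end in idle_gaps(disk_activity, min_gap_s=6.0):
    print(f"hint: switch disk to standby at t={start:.1f} s, wake it before t={end:.1f} s")
```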

Research at the University of Hamburg (II): Energy-Aware Numerics (Exa2Green). Goals: develop metrics for the quantitative assessment and analysis of the energy profile of algorithms; develop advanced and detailed power consumption monitoring and profiling; develop new smart algorithms using energy-efficient software models; develop a smart, power-aware scheduling technology for high-performance clusters; conduct a proof of concept using the weather forecast model COSMO-ART.

Brainware. Let us have a business-oriented look at scientific applications: they have high costs to develop (human resources), high costs to run (electricity), and some have high costs to store the results (disks, tapes). Electricity costs are a disproportionately high factor. Use better hardware and software to reduce costs, and use brainware to reduce the runtime and thus the costs.

Brainware... Example: the IPCC AR5 production runs. Tune the program and save 10% of the runtime: that saves 900,000 kWh, i.e. about 100,000 €, which corresponds to 1.5 years of a skilled tuning specialist at DKRZ. Real examples are available, e.g. from HECToR, the UK National Supercomputing Service: success stories on code tuning and the corresponding budget savings.
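The arithmetic behind this example, gathered in one place; the electricity price of about 0.11 EUR/kWh is the one implied by the slide's own figures (900,000 kWh corresponding to roughly 100,000 EUR):

```python
# Savings from tuning the IPCC AR5 production runs by 10% (figures from the slides).
baseline_kwh = 9_000_000    # kWh to solution of the German AR5 contribution
eur_per_kwh  = 0.11         # price implied by the slide: 900,000 kWh ~ 100,000 EUR

saved_kwh = 0.10 * baseline_kwh
saved_eur = saved_kwh * eur_per_kwh

print(f"10% less runtime saves {saved_kwh:,.0f} kWh, about {saved_eur:,.0f} EUR")
# The slide equates this saving with ~1.5 years of a skilled tuning specialist at
# DKRZ, i.e. the tuning effort roughly pays for itself within a single campaign.
```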

The Future of Computational Science and Engineering

Future Architectures. A few exascale systems for capability computing: difficult to use efficiently, expensive to operate (>50 M€ annual budget). Grid infrastructures: based on existing concepts, using tier-1 compute centres. Cloud infrastructures: all sorts of services will be offered and used, by commercial and non-commercial providers, which can offer good prices for computing and storage.

Future Usage Concepts. Map each application to an appropriate environment. What is appropriate? Cheap to transfer the application to, and cheap to execute the application on; transfer costs for code and data must be considered. The model of resource usage is defined by the applications, and data-intensive applications are critical. Energy-aware scheduling: models and solutions are available.
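A minimal sketch of the mapping decision described above: total energy to solution is the energy to transfer code and data plus the energy to execute, and the application should go wherever that sum is lowest. All per-gigabyte and per-core-hour figures below are illustrative assumptions:

```python
# Energy-aware mapping: choose the environment with the lowest total energy,
# counting both the transfer of code/data and the execution itself.
def total_energy_kwh(transfer_gb, core_hours, kwh_per_gb, kwh_per_core_hour):
    return transfer_gb * kwh_per_gb + core_hours * kwh_per_core_hour

# A data-intensive application: 50 TB of input data, 20,000 core hours of work.
candidates = {
    "local HPC system": total_energy_kwh(0,      20_000, kwh_per_gb=0.01, kwh_per_core_hour=0.020),
    "remote cloud":     total_energy_kwh(50_000, 20_000, kwh_per_gb=0.01, kwh_per_core_hour=0.015),
}

for name, kwh in candidates.items():
    print(f"{name:>16}: {kwh:,.0f} kWh")
print("map application to:", min(candidates, key=candidates.get))
```

Even though the remote environment executes more efficiently in this example, moving 50 TB dominates the total, which is why data-intensive applications are the critical case.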

Future Policy. Objective: minimize kWh-to-solution, for more science in a shorter time, for cheaper science, and for greener science. What do we need? Adapted funding systems: more people, less iron. Education of computer scientists: currently there are not enough of them.
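The metric itself fits in one line: kWh-to-solution is the average power draw times the time-to-solution. The two runs below (full clock frequency vs. a slightly slower, lower-power configuration) are illustrative assumptions showing how the two objectives can pick different winners:

```python
# kWh-to-solution = average power draw (kW) * time-to-solution (hours).
def kwh_to_solution(avg_power_kw, runtime_h):
    return avg_power_kw * runtime_h

fast      = kwh_to_solution(avg_power_kw=2000, runtime_h=10)   # full clock frequency
throttled = kwh_to_solution(avg_power_kw=1500, runtime_h=12)   # reduced clock frequency

print(f"time-to-solution optimum: 10 h, {fast:,.0f} kWh")
print(f"kWh-to-solution optimum:  12 h, {throttled:,.0f} kWh")
```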

Paradigm Shift: from time-to-solution to kWh-to-solution, for a more economical and ecological science.