GCS@LRZ update: HPC, Data Center Infrastructure and Machine Learning
HPC User Forum, February 28, 2017, Stuttgart
Arndt Bode, Chairman of the Board, Leibniz-Rechenzentrum of the Bavarian Academy of Sciences and Humanities, and Technische Universität München

Leibniz Supercomputing Centre (LRZ)

IT equipment floor space: 3,160.5 m² (34,019 ft²; 6 rooms on 3 floors)
Infrastructure floor space: 6,393.5 m² (68,819 ft²)

Generic IT services to all Munich universities:
- Internet access, Munich Scientific Network
- IT service & support
IT services to all Bavarian universities:
- Software license management
- Backup & archive services
HPC resources to scientists in Europe:
- European Tier-0 level of HPC
- PRACE member

SuperMUC: The First High-Temperature Direct Liquid Cooled Petascale Supercomputer

Phase 1
- System: IBM/Lenovo iDataPlex dx360M4 (Thin Nodes)
- Number of cores: 147,456; nodes: 9,216; cores per node: 16
- Shared memory per node: 32 GByte
- Processor: Intel Sandy Bridge-EP Xeon E5-2680 8C
- Turbo frequency: 3.1 to 3.5 GHz; max operating frequency: 2.7 GHz; thermal design power: 130 W
- Theoretical peak performance: 3.18 PetaFLOPS; performance @ HPL: 2.89 PetaFLOPS
- Energy efficiency: 0.85 GFLOPS/W
- Typical power consumption: 2.3 MW; power consumption @ HPL: 3.4 MW

Phase 2
- System: Lenovo/IBM NeXtScale nx360M5
- Number of cores: 86,016; nodes: 3,072; cores per node: 28
- Shared memory per node: 64 GByte
- Processor: Intel Haswell Xeon E5-2697 v3
- Turbo frequency: 3.1 to 3.6 GHz; max operating frequency: 2.6 GHz; thermal design power: 145 W
- Theoretical peak performance: 3.57 PetaFLOPS; performance @ HPL: 2.81 PetaFLOPS
- Energy efficiency: 1.9 GFLOPS/W
- Typical power consumption: 1.1 MW; power consumption @ HPL: 1.48 MW
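As a quick cross-check, the quoted efficiency figures follow directly from the HPL performance and the power consumption at HPL. A minimal sketch; the helper function is illustrative, not from the slides:

```python
# Energy efficiency = HPL performance / power consumption at HPL.
# Figures are the Phase 1 and Phase 2 numbers from this slide.
def gflops_per_watt(hpl_petaflops: float, hpl_megawatts: float) -> float:
    # 1 PetaFLOPS = 1e6 GFLOPS and 1 MW = 1e6 W, so the factors cancel.
    return (hpl_petaflops * 1e6) / (hpl_megawatts * 1e6)

print(gflops_per_watt(2.89, 3.40))  # Phase 1: ~0.85 GFLOPS/W
print(gflops_per_watt(2.81, 1.48))  # Phase 2: ~1.90 GFLOPS/W
```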

HPC: New Procurements

SuperMUC-NG: negotiations started in 2017, contract expected by the end of 2017. Funding: project SiVeGCS.

Overview of current systems: [system overview table shown on slide]

Overview of HPC Systems @LRZ (continued)

[System overview table shown on slide]

Schematic overview of LRZ's cooling infrastructure

Costs @LRZ

1 kWh (2000): €0.07
1 kWh (2016): €0.161
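To put the price increase in context, a rough, illustrative calculation (not from the slides) of what it means at SuperMUC Phase 2's typical power draw of 1.1 MW, taken from the earlier slide:

```python
# Illustrative only: annual energy cost at a constant 1.1 MW draw
# under the 2000 and 2016 tariffs quoted on this slide.
PRICE_2000 = 0.070   # EUR per kWh (slide)
PRICE_2016 = 0.161   # EUR per kWh (slide)
TYPICAL_KW = 1100    # 1.1 MW typical power consumption

kwh_per_year = TYPICAL_KW * 8760  # hours per year
print(f"price increase: {PRICE_2016 / PRICE_2000:.1f}x")                    # ~2.3x
print(f"annual cost in 2000: {kwh_per_year * PRICE_2000 / 1e6:.2f} M EUR")  # ~0.67
print(f"annual cost in 2016: {kwh_per_year * PRICE_2016 / 1e6:.2f} M EUR")  # ~1.55
```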

Energy-Efficiency Challenges

Data Center: Overview of Some Energy-Efficiency R&D Activities @LRZ

Pillar I: Building Infrastructure (neighboring buildings, utility providers, environment). Goal: improve key performance indicators.
Pillar II: System Hardware. Goal: reduce hardware power consumption.
Pillar III: System Software. Goal: optimize resource usage, tune the system.
Pillar IV: Applications. Goal: optimize application performance.

Activities and tools across the pillars: Infrastructure Data Analyser (IDA), SIMOPEK (www.simopek.de), Global Optimization Framework, Resource Management & Scheduling System, Power Variation Aware Configuration Adviser, Lightweight Adaptive Consumption Predictor (LACP), Power Data Aggregation Monitor (PowerDAM).

Schematic overview of LRZ's chiller-less cooling infrastructure

Schematic overview of the hydraulic connection to the KLT14 cooling tower

[Diagram: two subcircuits (I and II) joined via a heat exchanger; the tower-side circuit carries water mixed with glycol. Labeled elements: a sensor measuring the inlet water temperature entering from the distribution bar, sensors measuring the currently generated heat and the water flow rate, sensors measuring the speed and status of the pumps, the pumps themselves, a three-way valve, a hydraulic gate, and a sensor measuring the return water temperature to the distribution bar.]

Schematic overview of the KLT14 cooling tower

[Diagram: sprinkler pipe and fans, with a sensor measuring the inlet water temperature to the cooling tower, a sensor measuring the return water temperature from the cooling tower, and sensors indicating the status and speed of the fans.]

The Efficiency of the Cooling Infrastructure

COP = Q_CoolingCircuits / P_CoolingCircuits

Q_CoolingCircuits: aggregated amount of cold generated by the four cooling circuits
P_CoolingCircuits: aggregated amount of power consumed by the four cooling circuits
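A minimal sketch of this formula in code, assuming per-circuit measurements for the four cooling circuits are available; the function name and the sample readings are hypothetical:

```python
# COP of the cooling infrastructure: aggregated cold generated by the
# four cooling circuits divided by the aggregated power they consume.
def cooling_cop(q_cold_kw: list, p_power_kw: list) -> float:
    return sum(q_cold_kw) / sum(p_power_kw)

# Hypothetical readings (kW) for the four circuits:
q = [420.0, 380.0, 510.0, 290.0]   # cold generated per circuit
p = [21.0, 19.0, 26.0, 14.0]       # power consumed per circuit
print(cooling_cop(q, p))           # COP = 20.0
```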

Predicting the Efficiency of the Cooling Infrastructure Using Recurrent Neural Networks

Network type: LSTM (Long Short-Term Memory)
Software packages: Keras, TensorFlow

12 inputs:
- aggregated amount of cold generated by each cooling circuit
- aggregated amount of power consumed by the fans of each cooling circuit
- number of active cooling towers
- wet bulb temperature
- inlet water temperature (to the distribution bar) from each cooling circuit (x4)
- return water temperature (to the distribution bar) to each cooling circuit (x4)

Output: COP of the warm-water cooling infrastructure
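A minimal Keras sketch of this kind of model: an LSTM regressor mapping the 12 input signals to a single predicted COP value. The layer size, look-back window, and training settings are illustrative assumptions, not LRZ's actual configuration:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

WINDOW = 24       # assumed look-back window (samples per sequence)
N_FEATURES = 12   # the 12 inputs listed on the slide

# LSTM regressor: sequences of sensor readings in, one COP value out.
model = keras.Sequential([
    layers.Input(shape=(WINDOW, N_FEATURES)),
    layers.LSTM(64),        # assumed hidden-state size
    layers.Dense(1),        # predicted COP
])
model.compile(optimizer="adam", loss="mse")

# As on the slides: train on 2015 operational data, predict for 2016.
# Random placeholder arrays stand in for the real sensor time series.
X_2015 = np.random.rand(1000, WINDOW, N_FEATURES)
y_2015 = np.random.rand(1000)
model.fit(X_2015, y_2015, epochs=5, batch_size=32, verbose=0)

X_2016 = np.random.rand(200, WINDOW, N_FEATURES)
cop_pred = model.predict(X_2016)   # predicted COP, shape (200, 1)
```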

COP Prediction Results (trained on 2015 operational data, predicted for 2016)

[Prediction plots shown on slide. Some of the data is not available due to maintenance in the building automation systems.]

GCS@LRZ Summary

- New procurement(s) for HPC in the context of GCS
- General-purpose system, embedded into the general IT infrastructure: visualization, big data, networking
- Research and development in energy efficiency