ENERGY AWARE COMPUTING


1 ENERGY AWARE COMPUTING Luigi Brochard, Lenovo Distinguished Engineer, WW HPC & AI HPC Knowledge June ,

2 Agenda
Different metrics for energy efficiency. Lenovo cooling solutions. Lenovo software for energy aware computing.
2017 Lenovo. All rights reserved. 2

3 How to measure Power Efficiency
PUE: PUE = Total Facility Power / IT Equipment Power. Power usage effectiveness (PUE) is a measure of how efficiently a computer data center uses its power; it is the ratio of the total power used by a computer facility to the power delivered to the computing equipment. Ideal value is 1.0. It does not take into account how the IT power itself can be optimised.
ITUE: ITUE = (IT power + VR + PSU + Fan) / IT Power. IT power usage effectiveness (ITUE) measures how well the node power can be optimised. Ideal value is 1.0.
ERE: ERE = (Total Facility Power - Reuse) / IT Equipment Power. Energy reuse effectiveness (ERE) measures how efficiently a data center reuses the power dissipated by the computer; it is the ratio of the total power used by a computer facility, minus the reused energy, to the power delivered to the computing equipment. An ideal ERE is 0.0. ERE = PUE - Reuse/IT Equipment Power; if there is no reuse, ERE = PUE.
2017 Lenovo. All rights reserved. 3
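To make the relationships between the three metrics concrete, here is a minimal sketch (not from the slides; the function names and sample figures are illustrative assumptions) that computes them from facility-level and node-level power readings:

```python
# Illustrative sketch of the three efficiency metrics defined above.
# All names and example numbers are assumptions for demonstration only.

def pue(total_facility_power_kw: float, it_equipment_power_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power (ideal = 1.0)."""
    return total_facility_power_kw / it_equipment_power_kw

def itue(it_power_kw: float, vr_loss_kw: float, psu_loss_kw: float, fan_power_kw: float) -> float:
    """IT power Usage Effectiveness: (IT power + VR + PSU + fan) / IT power (ideal = 1.0)."""
    return (it_power_kw + vr_loss_kw + psu_loss_kw + fan_power_kw) / it_power_kw

def ere(total_facility_power_kw: float, reused_power_kw: float, it_equipment_power_kw: float) -> float:
    """Energy Reuse Effectiveness: (total facility power - reused power) / IT power (ideal = 0.0)."""
    return (total_facility_power_kw - reused_power_kw) / it_equipment_power_kw

if __name__ == "__main__":
    # Hypothetical data-center snapshot: 1200 kW facility, 1000 kW IT load, 300 kW of heat reused.
    print(f"PUE  = {pue(1200, 1000):.2f}")          # 1.20
    print(f"ITUE = {itue(1000, 30, 50, 40):.2f}")   # 1.12
    print(f"ERE  = {ere(1200, 300, 1000):.2f}")     # 0.90 (equals PUE when nothing is reused)
```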

4 Choice of Cooling
Air Cooled: Standard air flow with internal fans. Fits in any datacenter. Maximum flexibility. Broadest choice of configurable options supported. Supports Native Expansion nodes (Storage NeX, PCI NeX). PUE ~2-1.5, ERE ~2-1.5. Choose for broadest choice of customizable options.
Air Cooled with Rear Door Heat Exchangers: Air cooled, supplemented with an RDHX door on the rack. Uses chilled water with economizer (18 C water). Enables extremely tight rack placement. PUE ~ , ERE ~ . Choose for balance between configuration flexibility and energy efficiency.
Direct Water Cooled: Direct water cooling with no internal fans. Higher performance per watt. Free cooling (45 C water). Energy re-use. Densest footprint. Ideal for geos with high electricity costs and new data centers. Supports highest wattage processors. PUE ~1.1, ERE << 1 with hot water. Choose for highest performance and energy efficiency.
2017 Lenovo. All Rights Reserved 4

5 TCO: payback period for DWC vs RDHx
[Chart: payback period for DWC and RDHx in new and existing data centers at electricity rates of $0.06/kWh, $0.12/kWh and $0.20/kWh]
New data centers: water cooling has immediate payback. For existing air-cooled data centers, the payback period strongly depends on the electricity rate.
2017 Lenovo. All Rights Reserved 5

6 idataplex dx360m4 ( )
idataplex rack with 84 dx360m4 servers. dx360 M4 nodes: 2x CPUs (130W, 115W), 16x DIMMs (4GB/8GB), 1 HDD / 2 SSDs, network card. 85% heat recovery, water 18 C-45 C, 0.5 lpm per node.
[Images: dx360m4 server, idataplex rack]
2017 Lenovo. All Rights Reserved 6

7 NextScale nx360m5 WCT ( )
NextScale chassis: 6U / 12 nodes, 2 nodes per tray. nx360m5 WCT: 2x CPUs (up to 165W), 16x DIMMs (8GB/16GB/32GB), 1 HDD / 2 SSDs, 1 ML2 or PCIe network card. 85% heat recovery, water 18 C-45 C (and even up to 50 C), 0.5 lpm per node.
[Images: copper water loops, 2 nodes of nx360m5 WCT in a tray, NextScale chassis, scalable manifold, rack configuration, nx360m5 with 2 SSDs]
2017 Lenovo. All Rights Reserved 7

8 SuperMUC systems at LRZ: Phase 1 and Phase 2
Phase 1: Ranked 28 and 29 in the Top500 of June 2016; fastest computer in Europe on the Top 500, June. Nodes with 2 Intel Sandy Bridge EP CPUs each. HPL = 2.9 PetaFLOP/s. InfiniBand FDR10 interconnect. Large file space for multiple purposes: 10 PetaByte file space based on IBM GPFS with 200 GigaByte/s I/O bandwidth. Innovative technology for energy-effective computing: hot water cooling, energy aware scheduling. Most energy efficient high-end HPC system, PUE 1.1. Total power consumption over 5 years to be reduced by ~37%, from 27.6 M to 17.4 M.
Phase 2: Acceptance completed. 3096 nx360m5 compute nodes with Haswell EP CPUs. HPL = 2.8 PetaFLOP/s. Direct hot water cooled, energy aware scheduling. InfiniBand FDR14. GPFS, 10 x GSS26, 7.5 PB capacity, 100 GB/s I/O bandwidth.
2017 Lenovo. All Rights Reserved 8

9 Lenovo Water Cooling added value
Classic water cooling: Direct water cooling of the CPU only; only 60% of the heat goes to water, so 40% still needs to be air cooled. Inlet water temperature up to 35 C, so no free cooling all year long in all geos. Heat from the water is wasted. Unproven technology. Power of the server is not managed.
Lenovo water cooling: Direct water cooling of CPU/DIMMs/VRs; 80 to 85% of the heat goes to water, so just 10% still needs to be air cooled. Inlet water temperature up to 45 C (and even 50 C), so free cooling all year long in all geos. Water is hot enough to be efficiently reused, like with an absorption chiller, giving ERE << 1. 3rd generation water cooling, with more than nodes installed. Power and energy are managed and optimized.
2017 Lenovo. All Rights Reserved 9

10 DWC reduces Processor Temperature on Xeon 2697 v4
Conclusion: Direct water cooling lowers processor power consumption by about 5% and allows higher processor frequency.
Test setup: NeXtScale with 2-socket 2697 v4, 128 GB of 2400 MHz DIMMs, inlet water temperature 28 C.
2017 Lenovo. All Rights Reserved 10

11 Air and DWC performance and DC power on Xeon 2697 v4
Conclusion: With Turbo OFF, direct water cooling reduces power by 5%. With Turbo ON, it increases performance by 3% and still reduces power by 1%.
DC energy is measured through the ibmaem DC energy accumulator.
2017 Lenovo. All rights reserved. 11

12 Savings from Lenovo Direct Water Cooling
Higher TDP processors supported.
Reduced server power consumption: lower processor power consumption (~5%), no fans per node (~4%).
Reduced cooling power consumption: with DWC at 45 C we assume free cooling all year long (~25%).
Additional savings with energy aware software.
Total savings = ~35-40%.
Free cooling all year long => fewer chillers => CAPEX savings.
2017 Lenovo. All rights reserved. 12

13 Re-Use of Waste Heat
New buildings in Germany are very well thermally insulated: standard heat requirement of only 50 W/m2. SuperMUC's waste heat would be sufficient to heat m2 of office space (~10 x). What to do with the waste heat during summer?
2017 Lenovo. All Rights Reserved 13

14 CooLMUC-2: Waste Heat Re-Use for Chilled Water Production, ERE = 0.3
Lenovo NeXtScale Water Cool (WCT) system technology. Water inlet temperature 50 C. All-season chiller-less cooling. 384 compute nodes, 466 TFlop/s peak performance.
SorTech adsorption chillers based on zeolite-coated metal fiber heat exchangers, a factor of 3 higher than current chillers based on silica gel. COP = 60%. Total electricity reduced by 50+%.
Energy Reuse Effectiveness (ERE) measures how efficiently a data center reuses the power dissipated by the computer: ERE = (Total Facility Power - Reuse) / IT Equipment Power.
2017 Lenovo. All Rights Reserved 14

15 CooLMUC-2: ERE = 0.3
ERE = (Total Facility Power - Reuse) / IT Equipment Power = /104 = 0.32
[Chart: CooLMUC-2 power consumption; CooLMUC-2 heat output into the warm water cooling loop; cold water generated by absorption chillers (COP ~ 0.5-0.6)]
Leibniz Supercomputing Centre 15
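As a rough cross-check of the figure above (treating the 104 in the denominator as the IT equipment power in kW, which is an assumption about units), the facility power that is not offset by heat reuse can be backed out from the reported ERE:

```python
# Back-of-the-envelope check of the CooLMUC-2 ERE figure.
# Assumption: the denominator 104 is the IT equipment power in kW.
it_power_kw = 104.0
ere = 0.32

# ERE = (facility_power - reused_power) / it_power,
# so the facility power net of reuse is simply ERE * IT power:
net_facility_power_kw = ere * it_power_kw
print(f"Facility power not offset by heat reuse: ~{net_facility_power_kw:.0f} kW")  # ~33 kW
```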

16 Savings from Direct Water Cooling with Lenovo
Server power consumption: lower processor power consumption (~5%), no fans per node (~4%).
Cooling power consumption: with DWC at 45 C, we assume free cooling all year long (~25%).
Additional savings with energy aware software.
Total savings = ~35-40%.
Heat reuse: with DWC at 50 C, an additional 30% savings as free chilled water is generated.
With heat reuse, total savings => 50+%.
2017 Lenovo. All Rights Reserved 16

17 Lenovo references with DWC ( )
Site | Nodes | Country | Install date | Max. inlet water
LRZ SuperMUC | | Germany | | 45 C
LRZ SuperMUC 2 | | Germany | | 45 C
LRZ SuperCool2 | | Germany | | 50 C
NTU | | Singapore | | 45 C
Enercon | | Germany | | 45 C
US Army | | Hawaii | | 45 C
Exxon Research | | NA | | 45 C
NASA Goddard | | NA | | 45 C
PIK | | Germany | | 45 C
KIT | | Germany | | 45 C
Birmingham U ph1 | | UK | | 45 C
Birmingham U ph2 | | UK | | 45 C
MMD | | Malaysia | | 45 C
UNINET | | Norway | | 45 C
Peking U | | China | | 45 C
More than nodes up and running with DWC Lenovo technology. 17

18 How to manage/control power and energy
Report: temperature and power consumption per node / per chassis; power consumption and energy per job.
Optimize: reduce power of inactive nodes; reduce power of active nodes.
2017 Lenovo. All rights reserved. 18

19 Power Management on NeXtScale
IMM = Integrated Management Module (node-level systems management): monitors DC power consumed by the node as a whole and by the CPU and memory subsystems; monitors inlet air temperature for the node; caps DC power consumed by the node as a whole; monitors CPU and memory subsystem throttling caused by node-level throttling.
FPC = Fan/Power Controller (chassis-level systems management): monitors AC and DC power consumed by individual power supplies and aggregates to chassis level; monitors DC power consumed by individual fans and aggregates to chassis level; enables or disables power savings for a node.
PCH = Platform Controller Hub (i.e., south bridge). ME = Management Engine (embedded in the PCH, runs Intel NM firmware). HSC = Hot Swap Controller (provides power readings).
2017 Lenovo. All rights reserved. 19

20 DC power sampling and reporting frequency
[Diagram: sampling and reporting chain with rates: High Level Software, 1 Hz, IMM/BMC, 1 Hz, NM/ME, 200 Hz, RAPL CPU/memory (energy MSRs), 500 Hz, Meter, 10 Hz, HSC, 1 kHz, Sensor]
2017 Lenovo. All rights reserved. 20

21 Power reading improvements on next gen. NeXtScale
Targets: better than or equal to +/-3% power reading accuracy down to the node's minimum active power (~40-50 W DC). Power granularity <= 100 mW. At least 100 Hz update rate for node power readings. 100 readings are returned en masse via an IPMI raw OEM command; in addition, there is a beginning timestamp, and the time between successive readings is maintained at 10 ms.
[Diagram labels: ipmi raw oem cmd, INA226 (used for metering), FPGA (FIFO), Bulk 12V, IMM, Rsense, ME (Node Manager), SN (used for capping), Node 12V]
2017 Lenovo. All rights reserved. 21
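The slide describes bulk reads of 100 samples with a single beginning timestamp and a fixed 10 ms spacing. Below is a minimal sketch of how a management tool might expand such a buffer into timestamped samples; the data layout and function name are assumptions, not the actual IPMI OEM format:

```python
from datetime import datetime, timedelta
from typing import List, Tuple

SAMPLE_INTERVAL_MS = 10  # the slide states 10 ms between successive readings

def expand_bulk_power_readings(start: datetime, readings_mw: List[int]) -> List[Tuple[datetime, float]]:
    """Turn a bulk buffer of node power samples (milliwatts) plus its beginning
    timestamp into (timestamp, watts) pairs, assuming a fixed 10 ms spacing."""
    return [
        (start + timedelta(milliseconds=i * SAMPLE_INTERVAL_MS), mw / 1000.0)
        for i, mw in enumerate(readings_mw)
    ]

if __name__ == "__main__":
    # Hypothetical bulk read: 100 samples around 300 W with 100 mW granularity.
    t0 = datetime(2017, 6, 1, 12, 0, 0)
    buffer_mw = [300_000 + (i % 5) * 100 for i in range(100)]
    samples = expand_bulk_power_readings(t0, buffer_mw)
    print(samples[0], samples[-1])  # first and last timestamped samples, 990 ms apart
```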

22 Demo at ISC 17: Lenovo HPC Software Stack
Lenovo OpenSource HPC Stack: built on top of OpenHPC with xCAT, enhanced with Lenovo configuration, plugins and scripts. Add in IBM Spectrum Scale and Singularity*. Add in Lenovo Antilles and the Energy Aware Runtime**. Integrated and supported: a ready-to-use HPC stack.
* in collaboration with Greg Kurtzer from Lawrence Berkeley National Lab; ** in collaboration with BSC.
Enterprise Professional Services: installation and custom services; may not include service support for third-party software.
An open-source IaaS suite to run and manage HPC, Big Data and workflows optimally and transparently on a virtualized infrastructure, adjusting dynamically to user and datacenter needs through energy policies.
[Stack diagram: Customer Applications; User/Operator Web Portal (Antilles, simplified user job view / customized); xCAT Extreme Cloud Admin. Toolkit / Confluent; Energy Awareness (Lenovo EAR); Containers (Singularity); Parallel File Systems (IBM Spectrum Scale & Lustre, NFS); OS; OFED; Lenovo System x (virtual, physical, desktop, server): compute, storage, network (OmniPath, InfiniBand)]
2017 Lenovo. All rights reserved. 22

23 Confluent: HeatMap 2017 Lenovo. All rights reserved. 23

24 How to manage/control power and energy
Report: temperature and power consumption per node / per chassis; power consumption and energy per job (xCAT / Confluent).
Optimize: reduce power of inactive nodes; reduce frequency of active nodes (IBM Platform LSF, SLURM, PBSpro).
2017 Lenovo. All rights reserved. 24

25 Demo at ISC 17: Energy Aware Runtime: Motivation
Power and energy have become a critical constraint for HPC systems. Performance and power consumption of parallel applications depend on: architectural parameters, runtime node configuration, application characteristics, input data.
Selecting the optimal frequency manually is difficult and time consuming (it costs resources and therefore power) and the result is not reusable: the best frequency may change over time and between nodes, and the exercise must be repeated for every new architecture or new input set (configure the application for architecture X, execute with N frequencies, calculate time and energy).
2017 Lenovo. All rights reserved. 25

26 Energy Aware Runtime (EAR) Goals
Offer a dynamic and transparent solution to energy awareness: avoid having to re-execute applications again and again.
Easy to use: without source code modifications, without historic application information, supporting standard programming models (MPI, MPI+OpenMP), using standard libraries and tools as much as possible to be easily portable, open source.
Frequency changes based on simple energy policies with performance thresholds.
Minimize the overhead introduced: remember we are HPC and we want to save energy!
2017 Lenovo. All rights reserved. 26

27 EAR proposal
Automatic and dynamic frequency selection based on:
Architecture characterization.
Application characterization: outer loop detection (DPD), application signature computation (CPI, GBS, POWER, TIME).
Performance and power projection.
User/system policy definition for frequency selection (configured with thresholds):
- MINIMIZE_ENERGY_TO_SOLUTION. Goal: save energy by reducing frequency (with potential performance degradation). The performance degradation is limited by a MAX_PERFORMANCE_DEGRADATION threshold.
- MINIMIZE_TIME_TO_SOLUTION. Goal: reduce time by increasing frequency (with potential energy increase). A MIN_PERFORMANCE_EFFICIENCY_GAIN threshold prevents applications that do not scale with frequency from consuming more energy for nothing.
2017 Lenovo. All rights reserved. 27

28 BQCD use case
Berlin quantum chromodynamics program (BQCD). Two input sets generate two different use cases:
- Input 1 generates a CPU-intensive use case, much more sensitive to node frequency changes.
- Input 2 generates a memory-intensive use case, less sensitive to node frequency changes.
2017 Lenovo. All rights reserved. 28

29 BQCD use case: BQCD at Lenox
MPI+OpenMP code, configured with 16 MPI processes (4 nodes, 4 ppn), number of threads adapted to the number of cores. Architecture tested: Lenox cluster, Intel(R) Xeon(R) CPU E GHz, 18 cores. Making this comparison took 96 hours of computation x 4 nodes!
[Charts: execution time (sec.) and average power (W) for the CPU and MEM use cases at 2.3, 2.2, 2.1, 2.0, 1.9 and 1.8 GHz]
2017 Lenovo. All rights reserved. 29

30 BQCD static analysis
CPU use case, 1.8 GHz vs. 2.3 GHz: power saving is 34%, performance degradation is 31%, energy variation is 14%. Minimal energy is at 2.0 GHz.
Memory use case, 1.8 GHz vs. 2.3 GHz: power saving is 14%, performance degradation is 4%, energy variation is 10%. Minimal energy is at 1.8 GHz.
If we set a maximum performance penalty (e.g. 20%), the CPU version will run at 2.0 GHz and the memory version will run at 1.8 GHz.
[Charts: BQCD_CPU and BQCD_MEM, each frequency Fi vs. 2.3 GHz: performance degradation, power saving, energy saving]
2017 Lenovo. All rights reserved. 30

31 2017 Lenovo. All rights reserved. 31

32 EAR overall overview
Learning phase, at EAR installation (*): kernels execution -> coefficients computation -> store coefficients in the coefficients DB.
EAR execution (loaded with the application): DPD detects the outer loop -> compute power and performance metrics for the outer loop -> optimal frequency calculation, reading the coefficients and the energy policy.
(*) or every time the cluster configuration is modified (more memory per node, new processors...).
2017 Lenovo. All rights reserved. 32

33 Learning Phase 2017 Lenovo. All rights reserved. 33

34 Performance and Power models
We use linear models to predict power and time at any available frequency fn from a reference frequency Rf, based on:
- HW coefficients (A(Rf,fn), B(Rf,fn), C(Rf,fn), D(Rf,fn), E(fn) and F(Rf,fn)) which represent the parameters of each node in the system, measured in the learning phase (at EAR installation or every time the cluster configuration is modified).
- The application signature (Power, CPI and GBS) computed while the application is running. Power, CPI and GBS are the basic metrics that characterize an application: we define them as the application signature.
Learning phase: execution of selected kernels with the available frequencies on each node of the system; metrics collection (execution time, average power, CPI and GBS); computation of the node coefficients to be used in the performance and power models.
2017 Lenovo. All rights reserved. 34
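The slide names the coefficients and the signature metrics but not the exact functional form of the model. The sketch below assumes a simple linear combination of the signature metrics with per-node coefficients, just to show how a signature measured at the reference frequency could be projected to another frequency; the structure and scaling assumptions are illustrative, not EAR's published model:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Signature:
    cpi: float    # cycles per instruction measured at the reference frequency
    gbs: float    # memory bandwidth in GB/s
    power: float  # average node power in W
    time: float   # execution time of the detected outer loop in s

@dataclass
class Coefficients:
    # Per-node coefficients for one (reference frequency, target frequency) pair,
    # fitted from the kernels run in the learning phase.
    a: float
    b: float
    c: float
    d: float
    e: float
    f: float

def project(sig: Signature, co: Coefficients, ref_ghz: float, target_ghz: float) -> Tuple[float, float]:
    """Project (power, time) from the reference frequency to a target frequency.
    The linear combinations below are an assumed form of the model."""
    proj_power = co.a * sig.power + co.b * sig.gbs + co.c
    proj_cpi = co.d * sig.cpi + co.e * sig.gbs + co.f
    # Assumption: time scales with the CPI ratio and inversely with the frequency ratio.
    proj_time = sig.time * (proj_cpi / sig.cpi) * (ref_ghz / target_ghz)
    return proj_power, proj_time
```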

35 Dynamic Application Characterization by Dynamic Periodicity Detection 2017 Lenovo. All rights reserved. 35

36 Dynamic characterization is driven by DPD
[Flow: new event -> dynamic pattern detection -> if a new pattern is detected, or if the signature changes: signature computation (CPI, GBS, POWER, TIME) -> performance/power projection models applied -> optimal frequency selection based on the policy and its arguments]
2017 Lenovo. All rights reserved. 36

37 Dynamic Pattern Detection
The DPD algorithm detects repetitive sequences of events. EAR generates an event identifier per dynamic MPI call, using MPI arguments such as the type of call, sendbuff, recvbuff, etc. HPC applications usually present this repetitive behaviour since they exploit loops. DPD receives a new event as input and reports the dpd_status (e.g. NEW_LOOP, NEW_ITERATION).
2017 Lenovo Internal. All rights reserved. 37

38 DPD example
Event sequence over time: 0,4,8, 1,2,3, 1,2,3, 1,2,3, 1,2,3, 4,8, 1,2,3, 1,2,3, 1,2,3, 1,2,3, 4,8
Reported statuses: NEW_LOOP (Loop_id=1, Loop_size=3), NEW_ITERATION (Loop_id=1, Loop_size=3), NEW_LOOP (Loop_id=1, Loop_size=3), NEW_ITERATION (Loop_id=1, Loop_size=3), END_LOOP, NEW_LOOP (Loop_id=4, Loop_size= ).
2017 Lenovo Internal. All rights reserved. 38
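The actual DPD algorithm is not spelled out on the slides. Below is a minimal sketch of the idea, under the assumption that it is enough to look for the most recent window of events repeating immediately before itself; the class, its method and the status strings are illustrative only:

```python
from typing import List, Optional

class SimpleDPD:
    """Toy periodicity detector, illustrative only (not the EAR DPD algorithm).
    Reports NEW_LOOP the first time the trailing window of events repeats itself,
    NEW_ITERATION each time another full period completes, and END_LOOP on a break."""

    def __init__(self, max_loop_size: int = 16):
        self.max_loop_size = max_loop_size
        self.history: List[int] = []
        self.loop: Optional[List[int]] = None  # event pattern of the loop being tracked
        self.pos = 0                           # position inside the current iteration

    def push(self, event: int) -> Optional[str]:
        self.history.append(event)
        if self.loop is not None:
            if event == self.loop[self.pos]:
                self.pos = (self.pos + 1) % len(self.loop)
                return "NEW_ITERATION" if self.pos == 0 else None
            self.loop, self.pos = None, 0
            return "END_LOOP"
        # No loop tracked: look for the smallest trailing window that repeats
        # immediately before itself, e.g. ... 1,2,3,1,2,3 -> loop [1,2,3].
        for size in range(2, min(self.max_loop_size, len(self.history) // 2) + 1):
            if self.history[-size:] == self.history[-2 * size:-size]:
                self.loop, self.pos = self.history[-size:], 0
                return "NEW_LOOP"
        return None

if __name__ == "__main__":
    # Event stream shaped like the slide's example sequence.
    dpd = SimpleDPD()
    events = [0, 4, 8] + [1, 2, 3] * 4 + [4, 8] + [1, 2, 3] * 4 + [4, 8]
    for e in events:
        status = dpd.push(e)
        if status:
            print(e, status)
```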

39 EAR at work on BQCD_CPU
A big loop is detected and the policy is applied: the frequency drops from 2.6 GHz to 2.4 GHz, power is reduced and the iteration time stays similar.
[Charts for MPI ranks 0 and 8: outer loop size detected, frequency, measured power (W) and measured iteration time (usecs) over accumulated time]
2017 Lenovo Internal. All rights reserved. 39

40 EAR policies
MINIMIZE_ENERGY_TO_SOLUTION. Goal: minimize the energy consumed with a limit on the performance penalty. The user/sysadmin defines a MAX_PERFORMANCE_PENALTY. Based on the signature and node coefficients, EAR computes a performance projection for each available frequency and selects the optimal frequency that minimizes energy with performance_penalty <= MAX_PERFORMANCE_PENALTY.
MINIMIZE_TIME_TO_SOLUTION. Goal: improve the execution time while guaranteeing a minimum performance improvement that justifies the extra energy consumption. The user/sysadmin defines a MIN_PERFORMANCE_EFFICIENCY_GAIN. Based on the signature and node coefficients, EAR computes a performance projection for each available frequency and selects the optimal frequency that improves application performance with performance_improvement >= MIN_PERFORMANCE_EFFICIENCY_GAIN.
2017 Lenovo. All rights reserved. 42
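A minimal sketch of how such a policy decision could look, assuming we already have projected (time, power) pairs per frequency from the models above; the function name, data layout and sample numbers are illustrative, not EAR's actual code:

```python
from typing import Dict, Tuple

# proj: frequency (GHz) -> (projected_time_s, projected_power_w), from the projection models.
def minimize_energy_to_solution(proj: Dict[float, Tuple[float, float]],
                                ref_freq: float,
                                max_perf_penalty: float) -> float:
    """Pick the frequency with the lowest projected energy whose slowdown vs. the
    reference frequency stays within max_perf_penalty (e.g. 0.10 for 10%)."""
    ref_time, _ = proj[ref_freq]
    best_freq, best_energy = ref_freq, ref_time * proj[ref_freq][1]
    for freq, (time_s, power_w) in proj.items():
        penalty = (time_s - ref_time) / ref_time
        energy = time_s * power_w
        if penalty <= max_perf_penalty and energy < best_energy:
            best_freq, best_energy = freq, energy
    return best_freq

if __name__ == "__main__":
    # Hypothetical projections loosely inspired by the BQCD_CPU numbers on the slides.
    projections = {2.3: (100.0, 290.0), 2.2: (104.0, 270.0), 2.1: (109.0, 255.0),
                   2.0: (114.0, 240.0), 1.9: (121.0, 225.0), 1.8: (131.0, 192.0)}
    print(minimize_energy_to_solution(projections, ref_freq=2.3, max_perf_penalty=0.15))  # 2.0
```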

41 EAR evaluation 2017 Lenovo Internal. All rights reserved. 43

42 Performance and power projection models: BQCD real vs. projected values
[Charts: real vs. projected execution time and real vs. projected average power for BQCD_CPU and BQCD_MEM at 2.3 to 1.8 GHz, plus the projection error per frequency]
Average error is 5%.
2017 Lenovo Internal. All rights reserved. 45

43 BQCD best static frequency selection
CPU use case, 1.8 GHz vs. 2.3 GHz: power saving is 34%, performance degradation is 31%, energy variation is 14%. Minimal energy is at 2.0 GHz.
Memory use case, 1.8 GHz vs. 2.3 GHz: power saving is 14%, performance degradation is 4%, energy variation is 10%. Minimal energy is at 1.8 GHz.
[Charts: BQCD_CPU and BQCD_MEM, each frequency Fi vs. 2.3 GHz: performance degradation, power saving, energy saving]
2017 Lenovo Internal. All rights reserved. 46

44 BQCD: EAR MINIMIZE_ENERGY_TO_SOLUTION
Default frequency 2.3 GHz; MAX_PERFORMANCE_DEGRADATION threshold set to TH = 5%, 10%, 15%.
CPU use case: TH = 5% -> [2.2 GHz, 2.3 GHz] selected; TH = 10% -> 2.2 GHz selected; TH = 15% -> [2.0 GHz, 2.2 GHz] selected.
MEM use case: TH = 5% -> 1.8 GHz selected; TH = 10% -> 1.8 GHz selected; TH = 15% -> 1.8 GHz selected.
[Charts: performance degradation, power saving and energy saving vs. MAX_PERFORMANCE_DEGRADATION for BQCD_CPU and BQCD_MEM]
2017 Lenovo Internal. All rights reserved. 47

45 BQCD: EAR MINIMIZE_TIME_TO_SOLUTION
Default frequency 2.0 GHz; MIN_PERFORMANCE_EFFICIENCY_GAIN set to 70% - 80%.
BQCD_CPU: one node is executed at 2.3 GHz, the rest remain at 2.0 GHz.
BQCD_MEM: all the nodes execute at 2.0 GHz.
[Charts: performance gain, power increment and energy increment vs. MIN_PERFORMANCE_EFFICIENCY_GAIN for BQCD_CPU and BQCD_MEM]
2017 Lenovo Internal. All rights reserved. 48

46 EAR distributed design
EAR offers a distributed design where each node works locally; there is no centralized information. One MPI process per node takes care of: dynamic pattern detection; EAR metrics collection; the EAR node state (collecting metrics, evaluating the signature, etc.); EAR policy implementation. Per-node logging messages: one summary file per node is generated.
2017 Lenovo. All rights reserved. 49

47 EAR overhead evaluation
The main EAR overhead comes from the DPD algorithm: the current EAR version invokes DPD at each MPI call, and DPD consumes about 2 usecs per invocation. The BQCD CPU-intensive use case executes 16 million DPD invocations in 500 seconds. Profile information is reported by MPI ranks 0, 4, 8 and 12. We are working on optimizations to reduce the number of DPD calls.
[Table: MPI rank, DPD calls, usecs per DPD call for ranks 0, 4, 8 and 12 (around 2 usecs each)]
2017 Lenovo. All rights reserved. 50
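Using the figures quoted on the slide (16 million DPD invocations at roughly 2 usecs each over a 500 s run), the order of magnitude of the overhead can be estimated as follows:

```python
# Rough overhead estimate from the numbers quoted on the slide.
dpd_calls = 16_000_000   # DPD invocations in the BQCD CPU-intensive run
usecs_per_call = 2.0     # approximate cost per DPD invocation
runtime_s = 500.0        # total run time

overhead_s = dpd_calls * usecs_per_call / 1e6
print(f"DPD time: {overhead_s:.0f} s, i.e. ~{overhead_s / runtime_s:.1%} of the run")  # ~32 s, ~6.4%
```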

48 EAR Goals: where are we?
Our main goal was to offer a dynamic and transparent solution to energy awareness: avoiding having to re-execute applications again and again. Easy to use: without source code modifications, without historic application information, supporting standard programming models (MPI, MPI+OpenMP), using standard libraries and tools as much as possible to be easily portable (PAPI for hardware counters, core and uncore, when available on the machine configuration; libcpupower; the ibmaem kernel module for node energy, from the Linux distro), open source (in progress). Frequency changes based on simple energy policies with performance thresholds. Minimizing the overhead introduced: the overhead has been significantly reduced, but there is still margin for improvement.
2017 Lenovo. All rights reserved. 51

49 Conclusion
Lenovo has been a leader for years in energy aware computing, with end-to-end solutions to reduce CAPEX, OPEX and TCO:
Maximize heat removal by water.
Maximize water temperature to manage systems and infrastructure: free cooling all year round, efficient use of adsorption chillers.
With open source and partner solutions, and going beyond: dynamically adjust node frequency in operation.
Minimize TCO by working on all metrics: PUE + ITUE + ERE.
2017 Lenovo. All rights reserved. 52

50 Lenovo ISC17 Booth
[Booth layout: AI demo, EAR demo, NXS WCT demo, rack with various HW, Table 1, Table 2, OpenHPC Software Stack table, Excelero demo, Stark HW, HW interactive Show & Tell]
2017 Lenovo Internal. All rights reserved. 53
