Powering the Road to National HPC Leadership
1 Powering the Road to National HPC Leadership
Jack C. Wells, Director of Science, Oak Ridge Leadership Computing Facility / Oak Ridge National Laboratory
Join the Conversation #OpenPOWERSummit
2 Powering the Road to National HPC Leadership
Jack C. Wells, Director of Science, Oak Ridge Leadership Computing Facility, Oak Ridge National Laboratory
2018 OpenPOWER Summit, Las Vegas, 19 March 2018
ORNL is managed by UT-Battelle for the US Department of Energy. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. Some of the work presented here is from the Total and Oak Ridge National Laboratory collaboration, which is done under CRADA agreement NFE-… Some of the experiments were supported by an allocation of advanced computing resources provided by the National Science Foundation. The computations were performed on Nautilus at the National Institute for Computational Sciences.
3 A"Little"About"ORNL Oak$Ridge$National$ Laboratory$is$the$ largest$us$ Department$of$ Energy$(DOE)$open$ science$laboratory$ Oak Ridge, Tennessee
4 What is a Leadership Computing Facility (LCF)?
- Collaborative DOE Office of Science user-facility program at ORNL and ANL
- Mission: provide the computational and data resources required to solve the most challenging problems
- Two centers, two architectures, to address the diverse and growing computational needs of the scientific community
- Highly competitive user allocation programs (INCITE, ALCC)
- Projects receive 10x to 100x more resource than at other generally available centers
- LCF centers partner with users to enable science & engineering breakthroughs (Liaisons, Catalysts)
5 ORNL has systematically delivered a series of leadership-class systems
On scope. On budget. Within schedule.
Titan, five years old in October 2017, continues to deliver world-class science research in support of our user community. We will operate Titan through 2019, when it will be decommissioned.
[Timeline figure, OLCF-1 through OLCF-3: a …-fold improvement in 8 years]
- 2004: Cray X1E Phoenix, 18.5 TF
- 2005: Cray XT3 Jaguar, 25 TF
- 2006: Cray XT3 Jaguar, 54 TF
- 2007: Cray XT4 Jaguar, 62 TF
- 2008: Cray XT4 Jaguar, 263 TF
- 2008: Cray XT5 Jaguar, 1 PF
- 2009: Cray XT5 Jaguar, 2.5 PF
- 2012: Cray XK7 Titan, 27 PF
6 We are building on this record of success to enable exascale in 2021
[Roadmap figure, OLCF-4 to OLCF-5: a 500-fold improvement in 9 years]
- 2012: Cray XK7 Titan, 27 PF
- 2018: IBM Summit, 200 PF
- 2021: Frontier, ~1 EF
7 Coming in 2018: Summit will replace Titan as the OLCF's leadership supercomputer
Summit, slated to be more powerful than any other existing supercomputer, is the Department of Energy's Oak Ridge National Laboratory's newest supercomputer for open science.
8 Summit Overview
Compute node: 2 x IBM POWER9 (22 cores, 4 threads/core, NVLink), 6 x NVIDIA GV100 (7 TF, NVLink), coherent shared memory, 512 GB DRAM (DDR4), 96 GB HBM2 (3D-stacked), 1600 GB NVMe-compatible PCIe SSD, 25 GB/s EDR IB (2 ports)
Compute rack: 18 compute servers, 39.7 TB memory/rack, 55 kW max power/rack; warm water (70°F) for direct-cooled components, RDHX for air-cooled components
Compute system: 256 compute racks, 4,608 compute nodes, 10.2 PB total memory, Mellanox EDR IB fabric, 200 PFLOPS, ~13 MW
GPFS file system: 250 PB storage, 2.5 TB/s read, 2.5 TB/s write
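The system-level figures follow directly from the per-node ones. Here is a quick back-of-envelope check (a Python sketch; all inputs are the slide's numbers, with GB/PB treated as decimal units):

```python
# Back-of-envelope check of Summit's headline numbers from per-node specs.
# All inputs come from the slide above.

NODES = 4_608
NODES_PER_RACK = 18
GPUS_PER_NODE = 6
GPU_TFLOPS = 7.0                        # double-precision peak per GV100
DDR4_GB, HBM2_GB, NVME_GB = 512, 96, 1_600

racks = NODES // NODES_PER_RACK
gpu_peak_pf = NODES * GPUS_PER_NODE * GPU_TFLOPS / 1_000
total_mem_pb = NODES * (DDR4_GB + HBM2_GB + NVME_GB) / 1e6

print(f"racks:    {racks}")             # 256, as quoted
print(f"GPU peak: {gpu_peak_pf:.0f} PF")  # ~194 PF; the quoted 200 PF
                                          # also counts the POWER9 CPUs
print(f"memory:   {total_mem_pb:.1f} PB") # ~10.2 PB (DDR4 + HBM2 + NVMe)
```

Note that the 10.2 PB "total memory" figure only works out if the per-node NVMe SSD is counted alongside DDR4 and HBM2.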
9 Summit Node Overview
[Node diagram] Per-node totals: 42 TF (6 x 7 TF GPUs), 96 GB HBM2 (6 x 16 GB), 512 GB DRAM (2 x 16 x 16 GB DDR4), 25 GB/s network (2 x 12.5 GB/s EDR IB ports), 83 MMsg/s
Link speeds: HBM2 900 GB/s per GPU; DRAM 135 GB/s per POWER9 socket; NVLink 50 GB/s between each POWER9 and its GPUs and between GPUs; X-Bus (SMP) 64 GB/s between the two POWER9s; PCIe Gen4 16 GB/s from each POWER9 toward the NIC; NVMe 6.0 GB/s read, 2.2 GB/s write
HBM and DRAM speeds are aggregate (read + write). All other speeds (X-Bus, NVLink, PCIe, IB) are bidirectional.
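The takeaway from the diagram is how steeply bandwidth falls off with distance from the GPU. A minimal sketch of that hierarchy, using the quoted link speeds (the transfer-time helper is illustrative only and ignores latency and protocol overhead):

```python
# Summit node bandwidth hierarchy, figures as quoted on the slide above.
# Illustrates why data placement matters: the same buffer costs ~70x
# more to move over one IB port than to stream out of HBM2.

LINK_GBPS = {
    "HBM2 (per GPU, aggregate)":        900.0,
    "DDR4 (per socket, aggregate)":     135.0,
    "X-Bus (P9 <-> P9)":                 64.0,
    "NVLink (P9<->GPU, GPU<->GPU)":      50.0,
    "EDR IB (per port)":                 12.5,
    "NVMe read":                          6.0,
}

def transfer_ms(gigabytes: float, link: str) -> float:
    """Idealized time to move `gigabytes` over `link`, in milliseconds."""
    return gigabytes / LINK_GBPS[link] * 1_000

# Time to move one GPU's worth of HBM2 (16 GB) over each link:
for link in LINK_GBPS:
    print(f"{link:32s} 16 GB in {transfer_ms(16, link):7.1f} ms")
```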
10 Coming in 2018: Summit will replace Titan as the OLCF's leadership supercomputer
Feature | Titan | Summit
Application performance | baseline | 5-10x Titan
Number of nodes | 18,688 | 4,608
Node performance | 1.4 TF | 42 TF
Memory per node | 32 GB DDR3 + 6 GB GDDR5 | 512 GB DDR4 + 96 GB HBM2
NV memory per node | 0 | 1600 GB
Total system memory | 710 TB | >10 PB (DDR4 + HBM2 + non-volatile)
System interconnect | Gemini (6.4 GB/s) | dual-rail EDR IB (25 GB/s)
Interconnect topology | 3D torus | non-blocking fat tree
Bi-section bandwidth | 15.6 TB/s | … TB/s
Processors | 1 AMD Opteron + 1 NVIDIA Kepler | 2 IBM POWER9 + 6 NVIDIA Volta
File system | 32 PB, 1 TB/s, Lustre | 250 PB, 2.5 TB/s, GPFS
Power consumption | 9 MW | 13 MW
In short: many fewer nodes, but much more powerful nodes; much more memory per node and in total; a faster interconnect; much higher bandwidth between CPUs and GPUs; and a much larger and faster file system.
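A quick ratio check on the table makes the "fewer but more powerful nodes" point concrete (a sketch; inputs are the table's figures, and peak-FLOP ratios only bound, not predict, application speedup):

```python
# Titan -> Summit ratios, computed from the comparison table above.

titan  = {"nodes": 18_688, "node_tf": 1.4}
summit = {"nodes":  4_608, "node_tf": 42.0}

node_ratio   = summit["nodes"] / titan["nodes"]      # ~0.25x the nodes
perf_ratio   = summit["node_tf"] / titan["node_tf"]  # 30x per node
system_ratio = node_ratio * perf_ratio               # ~7.4x system peak

print(f"nodes: {node_ratio:.2f}x, per-node: {perf_ratio:.0f}x, "
      f"system peak: {system_ratio:.1f}x")  # cf. the 5-10x application target
```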
11 What is CORAL? The program through which Summit & Sierra are procured
Several DOE labs have strong supercomputing programs and facilities. To bring the next generation of leading supercomputers to these labs, DOE created CORAL (the Collaboration of Oak Ridge, Argonne, and Livermore) to jointly procure these systems and, in so doing, align strategy and resources across the DOE enterprise.
The grouping of DOE labs was based on common acquisition timings; the collaboration is a win-win for all parties.
"Summit" system and "Sierra" system: OpenPOWER technologies (IBM POWER CPUs, NVIDIA Tesla GPUs, Mellanox EDR 100 Gb/s InfiniBand), paving the road to exascale performance.
12 OLCF Program to Ready Application Developers and Users
We are preparing users through:
- Application readiness and early science through the Center for Accelerated Application Readiness (CAAR)
- Training and web-based documentation
- Early access on SummitDev and the Summit Phase I system (already accepted)
- Access for the broader user base on the final, accepted Phase II system
Goals: early science achievements; demonstrated application readiness; prepared INCITE & ALCC proposals; a Summit hardened for full-user operations.
13 Summit Early Science Program (ESP)
We put out a Call for Proposals in December 2017, resulting in 62 Letters of Intent (LOI) received by year's end:
- 27 are from PIs at universities
- 32 are from PIs at national laboratories or research institutions (DOE, NASA)
- 14 are CAAR project-related LOIs
- 27 have had past INCITE allocations
- 9 have had past ALCC allocations
- 15 have connections to the US DOE Exascale Computing Project
- 9 are AI- or deep learning-related
Proposals are due at the beginning of June. ESP users will gain full access to Summit for early science later this year.
14 Summit will be the world's smartest supercomputer for open science
But what makes a supercomputer smart? Summit provides unprecedented opportunities for the integration of artificial intelligence (AI) and scientific discovery. Here's why:
GPU Brawn: Summit links more than 27,000 deep-learning-optimized NVIDIA GPUs with the potential to deliver exascale-level performance (a billion billion calculations per second) for AI applications.
High-Speed Data Movement: NVLink high-bandwidth technology built into all of Summit's processors supplies the next-generation "information superhighways" needed to quickly train deep learning algorithms for challenging science problems.
Memory Where It Matters: Summit's sizable local memory gives AI researchers a convenient launching point for data-intensive tasks, an asset that allows for faster AI training and greater algorithmic accuracy.
[Photo: one of Summit's 4,600 IBM AC922 nodes. Each node contains six NVIDIA Volta GPUs and two IBM POWER9 CPUs, giving scientists new opportunities to automate, accelerate, and drive understanding using artificial intelligence techniques.]
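For scale: 4,608 nodes x 6 GPUs yields the "more than 27,000" figure, and the exascale-level claim rests on reduced-precision Tensor Core throughput. A sketch of the arithmetic, assuming NVIDIA's commonly quoted ~125 TF mixed-precision peak per V100 (an assumption on our part; the slide gives no per-GPU AI figure):

```python
# Where "exascale-level performance for AI" comes from. The 125 TF/GPU
# Tensor Core figure is NVIDIA's commonly quoted V100 mixed-precision
# peak -- an assumption, not a number stated on this slide.

NODES, GPUS_PER_NODE = 4_608, 6
FP64_TF_PER_GPU = 7.0        # from the system overview slide
TENSOR_TF_PER_GPU = 125.0    # assumed V100 FP16 Tensor Core peak

gpus = NODES * GPUS_PER_NODE
print(f"GPUs: {gpus:,}")                                    # 27,648
print(f"FP64 peak:   {gpus * FP64_TF_PER_GPU / 1e6:.2f} EF")    # ~0.19 EF
print(f"Tensor peak: {gpus * TENSOR_TF_PER_GPU / 1e6:.2f} EF")  # ~3.5 EF
```

Under that assumption the machine's mixed-precision peak lands in the low exaflops, which is the sense in which "exascale-level for AI" is meant.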
15 Summit will be the world's smartest supercomputer for open science
But what can a smart supercomputer do? Science challenges for a smart supercomputer:
Identifying Next-Generation Materials: By training AI algorithms to predict material properties from experimental data, longstanding questions about material behavior at atomic scales could be answered, leading to better batteries, more resilient building materials, and more efficient semiconductors.
Predicting Fusion Energy: Predictive AI software is already helping scientists anticipate disruptions to the volatile plasmas inside experimental reactors. Summit's arrival allows researchers to take this work to the next level and further integrate AI with fusion technology.
Deciphering High-Energy Physics Data: With AI supercomputing, physicists can lean on machines to identify important pieces of information in data too massive for any single human to handle, data that could change our understanding of the universe.
Combating Cancer: Through the development of scalable deep neural networks, scientists at the US Department of Energy and the National Cancer Institute are making strides in improving cancer diagnosis and treatment.
16 Summit is still under construction
We expect to accept the machine in the summer of 2018, allow early users on this year, and allocate time to our first users through the INCITE program in January 2019. We are continuing node and file-storage installation and software testing.
17 Questions? Jack Wells