OpenPOWER Innovations for HPC. IBM Research. IWOPH workshop, ISC, Germany, June 21, 2017. Christoph Hagleitner
1 IWOPH workshop, ISC, Germany June 21, 2017 OpenPOWER Innovations for HPC IBM Research Christoph Hagleitner, IBM Research - Zurich Lab
2 IBM Research - Zurich
- Established in 1956
- Staff of different nationalities
- Open Collaboration: 43 funded projects and 500+ partners in Horizon 2020
- Binnig and Rohrer Nanotechnology Centre opened in 2011 (Public Private Partnership with ETH Zürich and EMPA)
- 7 European Research Council Grants
- Two Nobel Prizes:
  - 1986 for the scanning tunneling microscope (Binnig and Rohrer)
  - 1987 for the discovery of high-temperature superconductivity (Müller and Bednorz)
3 Agenda
- Cognitive Computing & HPC
- HPC Software ecosystem
- HPC system roadmap
- HPC Processor
- HPC Accelerators
4 Cognitive Computing Applications
[Figure: applications placed by computational complexity (O(N), O(N^2), O(N^3)) against data volume (MB, GB, TB, PB). Labels: Graph Analytics, Deep Learning, Hadoop, HPC, Classic HPC Applications, Dim. Reduction, Uncertainty Quantification, Knowledge Graph Creation, Information Retrieval, Database queries.]
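To make the complexity axis of the figure concrete, the following is a minimal sketch of how the amount of work grows with data volume for the three complexity classes named on the slide. It assumes that N is proportional to data volume and ignores constant factors; the function name `work_growth` is hypothetical and not from the talk.

```python
# Sketch: growth of work with data volume for O(N), O(N^2), O(N^3).
# Assumption: N is proportional to data volume; constant factors are ignored.

def work_growth(volume_factor, exponent):
    """Factor by which work grows when N grows by volume_factor."""
    return volume_factor ** exponent

# One step along the data-volume axis (for example MB -> GB) is a 1000x increase.
STEP = 1000

for label, exponent in (("O(N)", 1), ("O(N^2)", 2), ("O(N^3)", 3)):
    print(f"{label}: work grows by {work_growth(STEP, exponent):.0e}x")
```

Under these assumptions, a workload in the O(N^3) class gains a factor of a billion in work for each step along the volume axis, which is one way to read why the slide separates such workloads from linear database queries.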
5 Cognitive Computing: Integration & Disaggregation
Hadoop-style workloads: scale-out via network
- main metrics: cost (capital, energy), compute density, scalability
- homogeneous nodes (CPU / FPGA / NVMe plus compute)
- datacenter disaggregation
Complex HPC-like workloads: scale-up via high-speed buses
- main metrics: memory / accelerator / inter-node BW
- optimal mix of heterogeneous resources (CPU / GPU / FPGA / HBM / DRAM / NVMe)
- compute density, scalability
- heterogeneous nodes
- data centric design
6 Dense, Energy Efficient Computing: Hyperscale FPGA
- Cloud economics
  - density (>1000 nodes / rack)
  - integrated NICs
  - switch card (backplane, no cables)
  - medium to low-cost compute chips
- Passive liquid cooling
  - ultimate density (cooling >70W / node)
  - energy re-use
- Built to integrate heterogeneous resources
  - CPUs
  - Accelerators
8 OpenPOWER, a catalyst for Open Innovation
- Open Development: OpenPOWER enables greater innovation through both open software and open hardware
- Collaboration across multiple thought leaders: collaborative development model drives collective thought leadership, simultaneously across multiple disciplines
- Performance of leading POWER architecture: broadens the capability and performance of the POWER platform
The OpenPOWER Foundation creates an open ecosystem, using the POWER Architecture to share expertise, investment, and server-class intellectual property to serve the evolving needs of customers.
9 OpenPOWER Foundation: 230+ Partners, 24 countries
10 SuperVessel: OpenPOWER Cloud
Open cloud platform based on Power/OpenPOWER and OpenStack technology for business partners, developers, and university students
- Heterogeneous Computing
- Zurich Research Lab
11 SuperVessel: OpenPOWER Cloud for Developers
12 OpenPOWER Software Support
- 50+ IBM Innovation Centers, over 2,300 Linux ISVs developing on Power
- Moving to little endian (almost complete)
13 OpenPOWER OS & Compiler Support
- Choose your favorite and latest Linux flavor: RHEL, Ubuntu in little endian (ppc64le)
- Standard compilers: GCC 4.8.5, MPICH 3.0.4, CUDA 8.0
- AT9.0.3 compilers: GCC 5.3.1, Python 3.4, and more optimized for POWER
- AT compilers: GCC 6.2.1, Python 3.5, and more optimized for POWER
- IBM compilers: XLF, XLC, ...
- Optimized libraries: MASS (math functions), ESSL (BLAS), and MPI
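Since the slide stresses little-endian Linux (ppc64le), a quick check of the host platform can be useful before installing prebuilt packages. The sketch below uses only the Python standard library; the helper names `describe_host` and `is_ppc64le` are hypothetical, and it assumes the operating system reports the machine type as `ppc64le`, as Linux on little-endian POWER normally does.

```python
import platform
import sys

def describe_host():
    """Return the machine type and byte order reported by this host."""
    return {"machine": platform.machine(), "byteorder": sys.byteorder}

def is_ppc64le(machine, byteorder):
    """True if the values describe a little-endian 64-bit POWER host."""
    return machine == "ppc64le" and byteorder == "little"

host = describe_host()
print(host, "ppc64le:", is_ppc64le(host["machine"], host["byteorder"]))
```

On an x86 workstation this reports `ppc64le: False`, which is the expected result there.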
14 Introducing PowerAI: Enterprise Deep Learning Distribution
- Package of Pre-Compiled Major Deep Learning Frameworks
- Easy to install & get started with Deep Learning with Enterprise-Class Support
- Optimized for Performance To Take Advantage of NVLink
- Enabled by High Performance Computing Infrastructure
15 PowerAI Deep Learning Software Distribution
- Deep Learning Frameworks: Caffe, NVCaffe, IBMCaffe, Torch, TensorFlow, Distributed TensorFlow, Theano, Chainer
- Supporting Libraries: OpenBLAS, Bazel, Distributed Communications, NCCL, DIGITS
- Accelerated Servers and Infrastructure for Scaling: Cluster of NVLink Servers; Spectrum Scale (High-Speed Parallel File System); Scale to Cloud
16 S822LC: IBM POWER8+ for HPC
17 OpenPOWER Core Technology Roadmap
- Mellanox Interconnect: Connect-IB (FDR Infiniband, PCIe Gen3); ConnectX-4 (EDR Infiniband, CAPI over PCIe Gen3); ConnectX-5 (Next-Gen Infiniband, Enhanced CAPI over PCIe Gen4)
- NVIDIA GPUs: Kepler (PCIe Gen3); Pascal (NVLink); Volta (Enhanced NVLink)
- IBM CPUs: POWER8 (PCIe Gen3 & CAPI Interface); POWER8 (NVLink & CAPI); POWER9 (Enhanced NVLink, OpenCAPI & PCIe Gen4)
- Accelerator Links: PCIe Gen3 1x (GPU, FPGA); NVLINK 5x (GPU, FPGA); 25G Accelerator Link 7-10x, 4x PCIe Gen4 (GPU, FPGA)
18 Looking Ahead: POWER9 Chip
- New Core Microarchitecture
  - Stronger thread performance
  - Efficient agile pipeline
  - POWER ISA v3.0
- Enhanced Cache Hierarchy
  - 120MB NUCA L3 architecture
  - 12 x 20-way associative regions
  - Advanced replacement policies
  - Fed by 7 TB/s on-chip bandwidth
- Cloud + Virtualization Innovation
  - Quality of service assists
  - New interrupt architecture
  - Workload optimized frequency
  - Hardware enforced trusted execution
- 14nm FinFET
  - Improved device performance and reduced energy
  - 17 layer metal stack and eDRAM
  - 8.0 billion transistors
- Leadership Hardware Acceleration Platform
  - Enhanced on-chip acceleration
  - Nvidia NVLink 2.0: high bandwidth, advanced features
  - CAPI 2.0: coherent accelerator and storage attach (PCIe G4)
  - OpenCAPI: improved latency and bandwidth, open interface
- State of the Art I/O Subsystem
  - PCIe Gen4, 48 lanes
- High Bandwidth Signaling Technology
  - 16 Gb/s interface: Local SMP
  - 25 Gb/s interface: 25G Link for Accelerator and remote SMP
19 POWER9 marchitecture
Re-factored Core Provides Improved Efficiency & Workload Alignment
- Enhanced pipeline efficiency with modular execution and intelligent pipeline control
- Increased pipeline utilization with symmetric data-type engines: Fixed, Float, 128b, SIMD
- Shared compute resource optimizes data-type interchange
20 POWER9: POWER ISA v3.0
- Broader data type support
  - 128-bit IEEE 754 Quad-Precision Float: full width quad-precision for financial and security applications
  - Expanded BCD and 128b Decimal Integer: for database and native analytics
  - Half-Precision Float Conversion: optimized for accelerator bandwidth and data exchange
- Support Emerging Algorithms
  - Enhanced Arithmetic and SIMD
  - Random Number Generation Instruction
- Accelerate Emerging Workloads
  - Memory Atomics: for high scale data-centric applications
  - Hardware Assisted Garbage Collection: optimize response time of interpretive languages
- Cloud Optimization
  - Enhanced Translation Architecture: optimized for Linux
  - New Interrupt Architecture: automated partition routing for extreme virtualization
  - Enhanced Accelerator Virtualization
  - Hardware Enforced Trusted Execution
- Energy & Frequency Management
  - POWER9 Workload Optimized Frequency: manage energy between threads and cores with reduced wakeup latency
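The half-precision conversion item is about shrinking data before it crosses a link to an accelerator. The sketch below illustrates the trade-off in software only: it uses the IEEE 754 binary16 format code `e` from Python's standard `struct` module and does not exercise any POWER9 instruction. The helper name `round_trip_half` is hypothetical.

```python
import struct

def round_trip_half(value):
    """Encode a Python float as IEEE 754 binary16 and decode it again."""
    return struct.unpack("<e", struct.pack("<e", value))[0]

# Bytes per element for half, single, and double precision.
SIZES = {
    name: struct.calcsize(fmt)
    for name, fmt in (("half", "<e"), ("single", "<f"), ("double", "<d"))
}

print("bytes per element:", SIZES)
print("0.1 after a half-precision round trip:", round_trip_half(0.1))
```

Half precision moves a quarter of the bytes of double precision per element, at the cost of a visible rounding error, which is why it suits data exchange with accelerators more than general arithmetic.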
21 POWER8 Caches
- L2: 1 MB 8 way per core
- L3: 96 MB (12 x 8 MB 8 way Bank)
- L4: 128 MB (on Centaur)
- NUCA Cache policy (Non-Uniform Cache Architecture)
- Cache bandwidth: 4 TB/sec L2 BW, 3 TB/sec L3 BW
22 Looking Ahead: POWER9 Data Capacity & Throughput
- L3 Cache: 120 MB Shared Capacity NUCA Cache
  - 10 MB Capacity + 512k L2 per 2x SMT4 Core
  - Enhanced Replacement with Reuse & Data-Type Awareness
- High-Throughput On-Chip Fabric
  - Over 7 TB/s On-chip Switch
  - Move Data in/out at 256 GB/s per 2x SMT4 Core
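The L3 capacities quoted for POWER8 and POWER9 can be checked with simple arithmetic. The sketch below assumes the POWER9 cache is made of the 12 regions mentioned on slide 18, each holding the 10 MB quoted here; the function name `total_l3_mb` is hypothetical.

```python
def total_l3_mb(regions, mb_per_region):
    """Total shared L3 capacity in MB for a banked (NUCA) cache."""
    return regions * mb_per_region

power8_l3 = total_l3_mb(12, 8)    # slide 21: 12 x 8 MB banks
power9_l3 = total_l3_mb(12, 10)   # slides 18 and 22: 12 regions of 10 MB (assumed pairing)

growth = (power9_l3 - power8_l3) / power8_l3
print(f"POWER8 L3: {power8_l3} MB, POWER9 L3: {power9_l3} MB, growth: {growth:.0%}")
```

Both totals match the figures on the slides (96 MB and 120 MB), so the capacity gain comes from larger regions rather than more of them, under the stated assumption.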
23 Looking Ahead: POWER9 Accelerator Interfaces
- Extreme Accelerator Bandwidth and Reduced Latency
  - PCIe Gen 4 x 48 lanes: 192 GB/s peak bandwidth (duplex)
  - IBM BlueLink 25Gb/s x 48 lanes: 300 GB/s peak bandwidth (duplex)
- Coherent Memory and Virtual Addressing Capability for all Accelerators
  - CAPI 2.0: 4x bandwidth of POWER8 using PCIe Gen 4
  - NVLink 2.0: next generation of GPU/CPU bandwidth and integration using BlueLink
  - OpenCAPI: high bandwidth, low latency and open interface using BlueLink
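The peak duplex figures on this slide follow from lanes times per-lane rate, counted in both directions. The sketch below reproduces them; the BlueLink rate of 25 Gb/s per lane is from the slide, while the 16 Gb/s per lane used for PCIe Gen 4 is the standard raw signaling rate and is an assumption added here. Encoding and protocol overheads are ignored, and the function name is hypothetical.

```python
def duplex_gbytes_per_s(lanes, gbit_per_lane):
    """Raw peak bandwidth in GB/s, summed over both directions."""
    per_direction = lanes * gbit_per_lane / 8  # Gb/s -> GB/s
    return 2 * per_direction

bluelink = duplex_gbytes_per_s(lanes=48, gbit_per_lane=25)
pcie_gen4 = duplex_gbytes_per_s(lanes=48, gbit_per_lane=16)  # 16 Gb/s per lane assumed

print(f"BlueLink: {bluelink:.0f} GB/s, PCIe Gen 4: {pcie_gen4:.0f} GB/s")
```

Achievable throughput will be lower than these raw numbers once line encoding and protocol overhead are included.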
24 POWER9 + Accelerators: GPUs See GTC
25 POWER9 + GPU: Unified Memory Pascal Volta See GTC
26 CAPI: Coherent Accelerator Processor Interface
- Standard I/O Model Flow: DD Call, Copy/Pin, MMIO Notify, Accelerate, Poll / Int, Copy/Unpin, Return DD
- Flow with a Coherent Model: Shared Mem. Notify Accelerator, Accelerate, Shared Memory Completion
[Diagram: POWER8 Processor with CAPP, connected over PCIe to a CAPI FPGA containing the POWER Service Layer and AFU 0, AFU 1, AFU 2, ... AFU n]
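The difference between the two flows on this slide can be made explicit by listing the steps and counting those that are pure software overhead around the acceleration itself. This is only a bookkeeping sketch based on the step names from the slide; it does not call any CAPI library, and the names `STANDARD_IO_FLOW`, `COHERENT_FLOW`, and `overhead_steps` are hypothetical.

```python
# Step names follow the slide; "DD" is read here as device driver.
STANDARD_IO_FLOW = [
    "DD call",
    "copy/pin",
    "MMIO notify",
    "accelerate",
    "poll / interrupt",
    "copy/unpin",
    "return from DD",
]

COHERENT_FLOW = [
    "shared-memory notify accelerator",
    "accelerate",
    "shared-memory completion",
]

def overhead_steps(flow):
    """Steps that surround the acceleration itself."""
    return [step for step in flow if step != "accelerate"]

saved = len(overhead_steps(STANDARD_IO_FLOW)) - len(overhead_steps(COHERENT_FLOW))
print(f"coherent model removes {saved} of "
      f"{len(overhead_steps(STANDARD_IO_FLOW))} overhead steps")
```

The count says nothing about how long each step takes; it only shows that the coherent model drops the driver call and the copy or pin steps from the path.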
28 CAPI SNAP, OpenCAPI v3.0 and NVLINK 2.0 with POWER9
[Diagram labels: POWER9 CPU; P9 Core; Memory Bus; 2 CAPI v2 Proxy Cores (CAPP); PHB; NPU (OpenCAPI and NVLink processing unit); 25G; PCIe Gen4. Attached: PCIe Accelerator Card (FPGA with PSL, SNAP, Action0, Action1, Memory); OpenCAPI Accelerator Cards; OpenCAPI Storage Class Memory Card; NVIDIA Pascal GPUs; Device or Network.]
SNAP
- Streaming Layer for CAPI v
- Simplifies accelerator development and use
- Supports High-Level-Synthesis (HLS) for FPGA development
- Available as open source
Link bandwidth
- PCIe Gen 4 x 48 lanes: 192 GB/s duplex bandwidth
- 25G Link x 48 lanes: 300 GB/s duplex bandwidth
29 Available Accelerator Cards
Nallatech team explaining CAPI Flash card:
30 Dense Memory (distributed) Prototype
- Dense Memory (DM) integration software stack available
  - byte addressable, distributed, globally accessible DM resource
  - exports industry standard asynchronous RDMA API for DM read and write access
- Implements efficient local and remote DM access
  - zero copy local access via direct DMA: device - application buffer
  - zero copy remote access via IB RDMA: remote host - application buffer
- Performance measurements
  - local DM access at NVMe device performance limits (3.5 GB/s read, 1.8 GB/s write of 4k buffers)
  - remote DM access at network (100Gb/s InfiniBand) and device limits: 12.5 GB/s distributed DM random read with 4 storage nodes, each equipped with one NVMe SSD
  - close to 900k IOPs for single device short sequential read/write operations
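The performance figures on this slide can be related to each other with a few lines of arithmetic. The sketch below assumes 1 GB = 10^9 bytes and a 4k buffer of 4096 bytes, and it treats the four storage nodes as each delivering the local read rate; these are assumptions for illustration, and the function names are hypothetical.

```python
def iops(bytes_per_s, buffer_bytes):
    """Operations per second at a given throughput and buffer size."""
    return bytes_per_s / buffer_bytes

def link_gbytes_per_s(gbit_per_s):
    """Convert a link rate in Gb/s to GB/s, ignoring protocol overhead."""
    return gbit_per_s / 8

read_iops = iops(3.5e9, 4096)    # local read, 4k buffers
write_iops = iops(1.8e9, 4096)   # local write, 4k buffers

network_limit = link_gbytes_per_s(100)   # 100 Gb/s InfiniBand
device_aggregate = 4 * 3.5               # four nodes, one NVMe SSD each
remote_read_limit = min(network_limit, device_aggregate)

print(f"local read: about {read_iops / 1e3:.0f}k IOPS")
print(f"remote read limit: {remote_read_limit} GB/s")
```

With these assumptions the four devices together could supply more than the link carries, so the 12.5 GB/s remote read figure corresponds to the raw InfiniBand rate.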
31 But be willing to take incremental steps when you can!
IBM Research: AcceleratorTechnologies in HPC and Cognitive Computing
MaRS Workshop, Eurosys 2017, Belgrade April 23, 2017 IBM Research: AcceleratorTechnologies in HPC and Cognitive Computing Christoph Hagleitner, hle@zurich.ibm.com IBM Research - Zurich Lab IBM Research
More informationDeep Learning mit PowerAI - Ein Überblick
Stephen Lutz Deep Learning mit PowerAI - Open Group Master Certified IT Specialist Technical Sales IBM Cognitive Infrastructure IBM Germany Ein Überblick Stephen.Lutz@de.ibm.com What s that? and what s
More informationIBM CORAL HPC System Solution
IBM CORAL HPC System Solution HPC and HPDA towards Cognitive, AI and Deep Learning Deep Learning AI / Deep Learning Strategy for Power Power AI Platform High Performance Data Analytics Big Data Strategy
More informationPower Systems AC922 Overview. Chris Mann IBM Distinguished Engineer Chief System Architect, Power HPC Systems December 11, 2017
Power Systems AC922 Overview Chris Mann IBM Distinguished Engineer Chief System Architect, Power HPC Systems December 11, 2017 IBM POWER HPC Platform Strategy High-performance computer and high-performance
More informationS8765 Performance Optimization for Deep- Learning on the Latest POWER Systems
S8765 Performance Optimization for Deep- Learning on the Latest POWER Systems Khoa Huynh Senior Technical Staff Member (STSM), IBM Jonathan Samn Software Engineer, IBM Evolving from compute systems to
More informationOpen Innovation with Power8
2011 IBM Power Systems Technical University October 10-14 Fontainebleau Miami Beach Miami, FL IBM Open Innovation with Power8 Jeffrey Stuecheli Power Processor Development Copyright IBM Corporation 2013
More informationIBM Deep Learning Solutions
IBM Deep Learning Solutions Reference Architecture for Deep Learning on POWER8, P100, and NVLink October, 2016 How do you teach a computer to Perceive? 2 Deep Learning: teaching Siri to recognize a bicycle
More informationRevolutionizing Open. Cecilia Carniel IBM Power Systems Scale Out sales
Revolutionizing Open Cecilia Carniel IBM Power Systems Scale Out sales cecilia_carniel@it.ibm.com Copyright IBM Corporation 2015 Technical University/Symposia materials may not be reproduced in whole or
More informationPower Technology For a Smarter Future
2011 IBM Power Systems Technical University October 10-14 Fontainebleau Miami Beach Miami, FL IBM Power Technology For a Smarter Future Jeffrey Stuecheli Power Processor Development Copyright IBM Corporation
More informationPOWER9 Announcement. Martin Bušek IBM Server Solution Sales Specialist
POWER9 Announcement Martin Bušek IBM Server Solution Sales Specialist Announce Performance Launch GA 2/13 2/27 3/19 3/20 POWER9 is here!!! The new POWER9 processor ~1TB/s 1 st chip with PCIe4 4GHZ 2x Core
More informationPOWER9. Jeff Stuecheli POWER Systems, IBM Systems IBM Corporation
POWER9 Jeff Stuecheli POWER Systems, IM Systems 2018 IM Corporation Recent and Future POWER Processor Roadmap POWER7 45 nm 2010 POWER7+ 32 nm 2012 POWER8 Family 22nm 2014 2016 POWER9 Family 14nm 2H17 2H18+
More informationIBM Power AC922 Server
IBM Power AC922 Server The Best Server for Enterprise AI Highlights More accuracy - GPUs access system RAM for larger models Faster insights - significant deep learning speedups Rapid deployment - integrated
More informationUniversité IBM i 2017
Université IBM i 2017 17 et 18 mai IBM Client Center de Bois-Colombes S24 Architecture IBM POWER: tendances et stratégies Jeudi 18 mai 11h00-12h30 Jean-Luc Bonhommet IBM AGENDA IBM Power Systems - IBM
More informationBuilding the Most Efficient Machine Learning System
Building the Most Efficient Machine Learning System Mellanox The Artificial Intelligence Interconnect Company June 2017 Mellanox Overview Company Headquarters Yokneam, Israel Sunnyvale, California Worldwide
More informationHeterogeneous Computing Systems in Cloud Datacenters
FPL 2016 Lausanne, August 31 Heterogeneous Computing Systems in Cloud Datacenters Christoph Hagleitner, hle@zurich.ibm.com IBM Research - Zurich Lab IBM Research Zurich Lab (ZRL) Established in 1956 Two
More informationBuilding the Most Efficient Machine Learning System
Building the Most Efficient Machine Learning System Mellanox The Artificial Intelligence Interconnect Company June 2017 Mellanox Overview Company Headquarters Yokneam, Israel Sunnyvale, California Worldwide
More informationIBM Power Advanced Compute (AC) AC922 Server
IBM Power Advanced Compute (AC) AC922 Server The Best Server for Enterprise AI Highlights IBM Power Systems Accelerated Compute (AC922) server is an acceleration superhighway to enterprise- class AI. A
More informationIBM Power Systems Update. David Spurway IBM Power Systems Product Manager STG, UK and Ireland
IBM Power Systems Update David Spurway IBM Power Systems Product Manager STG, UK and Ireland Would you like to go fast? Go faster - win your race Doing More LESS With Power 8 POWER8 is the fastest around
More information19. prosince 2018 CIIRC Praha. Milan Král, IBM Radek Špimr
19. prosince 2018 CIIRC Praha Milan Král, IBM Radek Špimr CORAL CORAL 2 CORAL Installation at ORNL CORAL Installation at LLNL Order of Magnitude Leap in Computational Power Real, Accelerated Science ACME
More informationOpenCAPI and its Roadmap
OpenCAPI and its Roadmap Myron Slota, President OpenCAPI Speaker name, Consortium Title Company/Organization Name Join the Conversation #OpenPOWERSummit Industry Collaboration and Innovation OpenCAPI and
More informationInterconnect Your Future
#OpenPOWERSummit Interconnect Your Future Scot Schultz, Director HPC / Technical Computing Mellanox Technologies OpenPOWER Summit, San Jose CA March 2015 One-Generation Lead over the Competition Mellanox
More informationInterconnect Your Future
Interconnect Your Future Gilad Shainer 2nd Annual MVAPICH User Group (MUG) Meeting, August 2014 Complete High-Performance Scalable Interconnect Infrastructure Comprehensive End-to-End Software Accelerators
More informationRevolutionizing Data-Centric Transformation
2016 OpenPOWER Foundation Revolutionizing Data-Centric Transformation April 2016 Sumit Gupta Vice President, High Performance Computing and Analytics IBM Power Systems OpenPOWER: Catalyst for Open Innovation
More informationHow Might Recently Formed System Interconnect Consortia Affect PM? Doug Voigt, SNIA TC
How Might Recently Formed System Interconnect Consortia Affect PM? Doug Voigt, SNIA TC Three Consortia Formed in Oct 2016 Gen-Z Open CAPI CCIX complex to rack scale memory fabric Cache coherent accelerator
More informationPreparing GPU-Accelerated Applications for the Summit Supercomputer
Preparing GPU-Accelerated Applications for the Summit Supercomputer Fernanda Foertter HPC User Assistance Group Training Lead foertterfs@ornl.gov This research used resources of the Oak Ridge Leadership
More informationIndustry Collaboration and Innovation
Industry Collaboration and Innovation OpenCAPI Topics Industry Background Technology Overview Design Enablement OpenCAPI Consortium Industry Landscape Key changes occurring in our industry Historical microprocessor
More informationIBM Power Systems: Open innovation to put data to work Dexter Henderson Vice President IBM Power Systems
IBM Power Systems: Open innovation to put data to work Dexter Henderson Vice President IBM Power Systems 2014 IBM Corporation Powerful Forces are Changing the Way Business Gets Done Data growing exponentially
More informationLooking ahead with IBM i. 10+ year roadmap
Looking ahead with IBM i 10+ year roadmap 1 Enterprises Trust IBM Power 80 of Fortune 100 have IBM Power Systems The top 10 banking firms have IBM Power Systems 9 of top 10 insurance companies have IBM
More informationIndustry Collaboration and Innovation
Industry Collaboration and Innovation Industry Landscape Key changes occurring in our industry Historical microprocessor technology continues to deliver far less than the historical rate of cost/performance
More informationToward a Memory-centric Architecture
Toward a Memory-centric Architecture Martin Fink EVP & Chief Technology Officer Western Digital Corporation August 8, 2017 1 SAFE HARBOR DISCLAIMERS Forward-Looking Statements This presentation contains
More informationOpenCAPI Technology. Myron Slota Speaker name, Title OpenCAPI Consortium Company/Organization Name. Join the Conversation #OpenPOWERSummit
OpenCAPI Technology Myron Slota Speaker name, Title OpenCAPI Consortium Company/Organization Name Join the Conversation #OpenPOWERSummit Industry Collaboration and Innovation OpenCAPI Topics Computation
More informationPaving the Road to Exascale
Paving the Road to Exascale Gilad Shainer August 2015, MVAPICH User Group (MUG) Meeting The Ever Growing Demand for Performance Performance Terascale Petascale Exascale 1 st Roadrunner 2000 2005 2010 2015
More informationMapping MPI+X Applications to Multi-GPU Architectures
Mapping MPI+X Applications to Multi-GPU Architectures A Performance-Portable Approach Edgar A. León Computer Scientist San Jose, CA March 28, 2018 GPU Technology Conference This work was performed under
More informationFoundation Overview Mingzhi Christensen
Foundation Overview April, 2017 Mingzhi Christensen Manager, IBM OpenPOWER Global Alliances mingzhi@us.ibm.com Today s challenges demand innovation Full system and stack open innovation required Data holds
More informationPOWER8 for DB2 and SAP
July 2014 POWER8 for DB2 and SAP Walter Orb IBM SAP Competence Center, Walldorf Agenda OpenPOWER Foundation POWER8 POWER8 for SAP POWER8 for DB2 2 Important Disclaimer IBM s statements regarding its plans,
More informationOpenPOWER Performance
OpenPOWER Performance Alex Mericas Chief Engineer, OpenPOWER Performance IBM Delivering the Linux ecosystem for Power SOLUTIONS OpenPOWER IBM SOFTWARE LINUX ECOSYSTEM OPEN SOURCE Solutions with full stack
More informationBuilding NVLink for Developers
Building NVLink for Developers Unleashing programmatic, architectural and performance capabilities for accelerated computing Why NVLink TM? Simpler, Better and Faster Simplified Programming No specialized
More informationSYNERGIE VON HPC UND DEEP LEARNING MIT NVIDIA GPUS
SYNERGIE VON HPC UND DEEP LEARNING MIT NVIDIA S Axel Koehler, Principal Solution Architect HPCN%Workshop%Goettingen,%14.%Mai%2018 NVIDIA - AI COMPUTING COMPANY Computer Graphics Computing Artificial Intelligence
More informationThe Stampede is Coming: A New Petascale Resource for the Open Science Community
The Stampede is Coming: A New Petascale Resource for the Open Science Community Jay Boisseau Texas Advanced Computing Center boisseau@tacc.utexas.edu Stampede: Solicitation US National Science Foundation
More informationIBM Power Systems HPC Cluster
IBM Power Systems HPC Cluster Highlights Complete and fully Integrated HPC cluster for demanding workloads Modular and Extensible: match components & configurations to meet demands Integrated: racked &
More informationIBM POWER9 Server Update
IBM POWER9 Server Update Luc Cloutier Advisory I/T Specialist, Power Server luc@ca.ibm.com Charts by: Simon Porstendorfer Principal Offering Manager Cognitive Systems Dylan Boday, Ph.D. Offering Manager,
More informationDDN. DDN Updates. Data DirectNeworks Japan, Inc Shuichi Ihara. DDN Storage 2017 DDN Storage
DDN DDN Updates Data DirectNeworks Japan, Inc Shuichi Ihara DDN A Broad Range of Technologies to Best Address Your Needs Protection Security Data Distribution and Lifecycle Management Open Monitoring Your
More informationFC-NVMe. NVMe over Fabrics. Fibre Channel the most trusted fabric can transport NVMe natively. White Paper
FC-NVMe NVMe over Fabrics Fibre Channel the most trusted fabric can transport NVMe natively BACKGROUND AND SUMMARY Ever since IBM shipped the world s first hard disk drive (HDD), the RAMAC 305 in 1956,
More informationThe Future of Interconnect Technology
The Future of Interconnect Technology Michael Kagan, CTO HPC Advisory Council Stanford, 2014 Exponential Data Growth Best Interconnect Required 44X 0.8 Zetabyte 2009 35 Zetabyte 2020 2014 Mellanox Technologies
More informationCloud Acceleration with FPGA s. Mike Strickland, Director, Computer & Storage BU, Altera
Cloud Acceleration with FPGA s Mike Strickland, Director, Computer & Storage BU, Altera Agenda Mission Alignment & Data Center Trends OpenCL and Algorithm Acceleration Networking Acceleration Data Access
More informationThe Future of High Performance Interconnects
The Future of High Performance Interconnects Ashrut Ambastha HPC Advisory Council Perth, Australia :: August 2017 When Algorithms Go Rogue 2017 Mellanox Technologies 2 When Algorithms Go Rogue 2017 Mellanox
More informationEfficient Communication Library for Large-Scale Deep Learning
IBM Research AI Efficient Communication Library for Large-Scale Deep Learning Mar 26, 2018 Minsik Cho (minsikcho@us.ibm.com) Deep Learning changing Our Life Automotive/transportation Security/public safety
More informationCAPI SNAP framework, the tool for C/C++ programmers to accelerate by a 2 digit factor using FPGA technology
CAPI SNAP framework, the tool for C/C++ programmers to accelerate by a 2 digit factor using FPGA technology Bruno MESNET, Power CAPI Enablement IBM Power Systems Join the Conversation #OpenPOWERSummit
More informationConcurrent High Performance Processor design: From Logic to PD in Parallel
IBM Systems Group Concurrent High Performance design: From Logic to PD in Parallel Leon Stok, VP EDA, IBM Systems Group Mainframes process 30 billion business transactions per day The mainframe is everywhere,
More informationIBM Power 9 надежная платформа для развертывания облаков. Ташкент. Юрий Кондратенко Cross-Brand Sales Specialist
IBM Power 9 надежная платформа для развертывания облаков Ташкент Юрий Кондратенко Cross-Brand Sales Specialist Power Systems Family POWER9 servers and solutions are built to crush today s most advanced
More informationIntroduction to the OpenCAPI Interface
Introduction to the OpenCAPI Interface Brian Allison, STSM OpenCAPI Technology and Enablement Speaker name, Title Company/Organization Name Join the Conversation #OpenPOWERSummit Industry Collaboration
More informationGPUs and Emerging Architectures
GPUs and Emerging Architectures Mike Giles mike.giles@maths.ox.ac.uk Mathematical Institute, Oxford University e-infrastructure South Consortium Oxford e-research Centre Emerging Architectures p. 1 CPUs
More informationIBM POWER SYSTEMS: YOUR UNFAIR ADVANTAGE
IBM POWER SYSTEMS: YOUR UNFAIR ADVANTAGE Choosing IT infrastructure is a crucial decision, and the right choice will position your organization for success. IBM Power Systems provides an innovative platform
More informationCapturing value from an open ecosystem
Capturing value from an open ecosystem Tom Rosamilia Senior Vice President IBM Systems Forward-Looking Statement Certain comments made during this event and in the presentation materials may be characterized
More informationAll About the Cell Processor
All About the Cell H. Peter Hofstee, Ph. D. IBM Systems and Technology Group SCEI/Sony Toshiba IBM Design Center Austin, Texas Acknowledgements Cell is the result of a deep partnership between SCEI/Sony,
More informationMaximizing heterogeneous system performance with ARM interconnect and CCIX
Maximizing heterogeneous system performance with ARM interconnect and CCIX Neil Parris, Director of product marketing Systems and software group, ARM Teratec June 2017 Intelligent flexible cloud to enable
More informationIBM HPC Technology & Strategy
IBM HPC Technology & Strategy Hyperion HPC User Forum Stuttgart, October 1st, 2018 The World s Smartest Supercomputers Klaus Gottschalk gottschalk@de.ibm.com HPC Strategy Deliver End to End Solutions for
More informationS THE MAKING OF DGX SATURNV: BREAKING THE BARRIERS TO AI SCALE. Presenter: Louis Capps, Solution Architect, NVIDIA,
S7750 - THE MAKING OF DGX SATURNV: BREAKING THE BARRIERS TO AI SCALE Presenter: Louis Capps, Solution Architect, NVIDIA, lcapps@nvidia.com A TALE OF ENLIGHTENMENT Basic OK List 10 for x = 1 to 3 20 print
More informationS8688 : INSIDE DGX-2. Glenn Dearth, Vyas Venkataraman Mar 28, 2018
S8688 : INSIDE DGX-2 Glenn Dearth, Vyas Venkataraman Mar 28, 2018 Why was DGX-2 created Agenda DGX-2 internal architecture Software programming model Simple application Results 2 DEEP LEARNING TRENDS Application
More informationExploiting the OpenPOWER Platform for Big Data Analytics and Cognitive. Rajesh Bordawekar and Ruchir Puri IBM T. J. Watson Research Center
Exploiting the OpenPOWER Platform for Big Data Analytics and Cognitive Rajesh Bordawekar and Ruchir Puri IBM T. J. Watson Research Center 3/17/2015 2014 IBM Corporation Outline IBM OpenPower Platform Accelerating
More informationDDN. DDN Updates. DataDirect Neworks Japan, Inc Nobu Hashizume. DDN Storage 2018 DDN Storage 1
1 DDN DDN Updates DataDirect Neworks Japan, Inc Nobu Hashizume DDN Storage 2018 DDN Storage 1 2 DDN A Broad Range of Technologies to Best Address Your Needs Your Use Cases Research Big Data Enterprise
More informationInterconnect Your Future
Interconnect Your Future Smart Interconnect for Next Generation HPC Platforms Gilad Shainer, August 2016, 4th Annual MVAPICH User Group (MUG) Meeting Mellanox Connects the World s Fastest Supercomputer
More informationTop 5 Reasons to Consider
Top 5 Reasons to Consider NVM Express over Fabrics For Your Cloud Data Center White Paper Top 5 Reasons to Consider NVM Express over Fabrics For Your Cloud Data Center Major transformations are occurring
More informationIBM Power User Group - Atlanta
IBM Power User Group - Atlanta Wes Showfety Open Source Database & HPC strategist, North America showfety@us.ibm.com 770-617-7377 LinkedIn: https://www.linkedin.com/in/wes-showfety-2399444 Twitter: @Wes_Show
More informationEnabling Performance-per-Watt Gains in High-Performance Cluster Computing
WHITE PAPER Appro Xtreme-X Supercomputer with the Intel Xeon Processor E5-2600 Product Family Enabling Performance-per-Watt Gains in High-Performance Cluster Computing Appro Xtreme-X Supercomputer with
More informationCSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.
CSCI 402: Computer Architectures Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI 6.6 - End Today s Contents GPU Cluster and its network topology The Roofline performance
More informationPerformance Accelerated Mellanox InfiniBand Adapters Provide Advanced Data Center Performance, Efficiency and Scalability
Performance Accelerated Mellanox InfiniBand Adapters Provide Advanced Data Center Performance, Efficiency and Scalability Mellanox InfiniBand Host Channel Adapters (HCA) enable the highest data center
More informationHETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA
HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA STATE OF THE ART 2012 18,688 Tesla K20X GPUs 27 PetaFLOPS FLAGSHIP SCIENTIFIC APPLICATIONS
More informationOCP Engineering Workshop - Telco
OCP Engineering Workshop - Telco Low Latency Mobile Edge Computing Trevor Hiatt Product Management, IDT IDT Company Overview Founded 1980 Workforce Approximately 1,800 employees Headquarters San Jose,
More informationBirds of a Feather Presentation
Mellanox InfiniBand QDR 4Gb/s The Fabric of Choice for High Performance Computing Gilad Shainer, shainer@mellanox.com June 28 Birds of a Feather Presentation InfiniBand Technology Leadership Industry Standard
More informationCisco Unified Computing System Delivering on Cisco's Unified Computing Vision
Cisco Unified Computing System Delivering on Cisco's Unified Computing Vision At-A-Glance Unified Computing Realized Today, IT organizations assemble their data center environments from individual components.
More informationOncilla - a Managed GAS Runtime for Accelerating Data Warehousing Queries
Oncilla - a Managed GAS Runtime for Accelerating Data Warehousing Queries Jeffrey Young, Alex Merritt, Se Hoon Shon Advisor: Sudhakar Yalamanchili 4/16/13 Sponsors: Intel, NVIDIA, NSF 2 The Problem Big
More informationHPC Saudi Jeffrey A. Nichols Associate Laboratory Director Computing and Computational Sciences. Presented to: March 14, 2017
Creating an Exascale Ecosystem for Science Presented to: HPC Saudi 2017 Jeffrey A. Nichols Associate Laboratory Director Computing and Computational Sciences March 14, 2017 ORNL is managed by UT-Battelle
More informationLinuxCon Japan 2014 OpenPOWER Technical Overview. Jeff Scheel Chief Engineer Linux on Power May 21, IBM Corporation
LinuxCon Japan 2014 OpenPOWER Technical Overview Jeff Scheel Chief Engineer Linux on Power scheel@us.ibm.com May 21, 2014 Agenda 1. OpenPOWER Foundation Overview 2. OpenPOWER Hardware Technologies 3. OpenPOWER
More informationSolutions for Scalable HPC
Solutions for Scalable HPC Scot Schultz, Director HPC/Technical Computing HPC Advisory Council Stanford Conference Feb 2014 Leading Supplier of End-to-End Interconnect Solutions Comprehensive End-to-End
More informationInterconnect Your Future Enabling the Best Datacenter Return on Investment. TOP500 Supercomputers, November 2017
Interconnect Your Future Enabling the Best Datacenter Return on Investment TOP500 Supercomputers, November 2017 InfiniBand Accelerates Majority of New Systems on TOP500 InfiniBand connects 77% of new HPC
SoftFlash: Programmable Storage in Future Data Centers Jae Do Researcher, Microsoft Research
SoftFlash: Programmable Storage in Future Data Centers Jae Do Researcher, Microsoft Research 1 The world's most valuable resource Data is everywhere! May. 2017 Values from Data! Need infrastructures for
Realizing the Next Generation of Exabyte-scale Persistent Memory-Centric Architectures and Memory Fabrics
Realizing the Next Generation of Exabyte-scale Persistent Memory-Centric Architectures and Memory Fabrics Zvonimir Z. Bandic, Sr. Director, Next Generation Platform Technologies Western Digital Corporation
Architected for Performance. NVMe over Fabrics. September 20th, Brandon Hoff, Broadcom.
Architected for Performance NVMe over Fabrics September 20th, 2017 Brandon Hoff, Broadcom Brandon.Hoff@Broadcom.com Agenda NVMe over Fabrics Update Market Roadmap NVMe-TCP The benefits of NVMe over Fabrics
DRAM and Storage-Class Memory (SCM) Overview
DRAM and Storage-Class Memory (SCM) Overview Introduction/Motivation Looking forward, volatile and non-volatile memory will play a much greater role in future infrastructure solutions. Figure
John Unthank IBM Federal Sales IBM HPC Topics IBM Corporation
John Unthank IBM Federal Sales unthank@us.ibm.com IBM HPC Topics IBM Imperatives* Transform industries and professions with data Remake enterprise IT for the cloud Reimagine work through mobile and social
Networking at the Speed of Light
Networking at the Speed of Light Dror Goldenberg VP Software Architecture MaRS Workshop April 2017 Cloud The Software Defined Data Center Resource virtualization Efficient services VM, Containers µservices
Emerging Technologies for HPC Storage
Emerging Technologies for HPC Storage Dr. Wolfgang Mertz CTO EMEA Unstructured Data Solutions June 2018 The very definition of HPC is expanding Blazing Fast Speed Accessibility and flexibility 2 Traditional
VOLTA: PROGRAMMABILITY AND PERFORMANCE. Jack Choquette NVIDIA Hot Chips 2017
VOLTA: PROGRAMMABILITY AND PERFORMANCE Jack Choquette NVIDIA Hot Chips 2017 1 TESLA V100 21B transistors 815 mm² 80 SM 5120 CUDA Cores 640 Tensor Cores 16 GB HBM2 900 GB/s HBM2 300 GB/s NVLink *full GV100
Gen-Z Memory-Driven Computing
Gen-Z Memory-Driven Computing Our vision for the future of computing Patrick Demichel Distinguished Technologist Explosive growth of data More Data Need answers FAST! Value of Analyzed Data 2005 0.1ZB
POWER CAPI+SNAP+FPGA,
POWER CAPI+SNAP+FPGA, the powerful combination to accelerate routines explained through use cases Bruno MESNET, CAPI / OpenCAPI enablement IBM Systems Join the Conversation #OpenPOWERSummit Offload?...CAPI
FUJITSU Server PRIMERGY CX400 M4 Workload-specific power in a modular form factor. 0 Copyright 2018 FUJITSU LIMITED
FUJITSU Server PRIMERGY CX400 M4 Workload-specific power in a modular form factor 0 Copyright 2018 FUJITSU LIMITED FUJITSU Server PRIMERGY CX400 M4 Workload-specific power in a compact and modular form
Messaging Overview. Introduction. Gen-Z Messaging
Messaging Overview Introduction Gen-Z is a new data access technology that not only enhances memory and data storage solutions, but also provides a framework for both optimized and traditional
NVMe Takes It All, SCSI Has To Fall. Brave New Storage World. Lugano April Alexander Ruebensaal
Lugano April 2018 NVMe Takes It All, SCSI Has To Fall freely adapted from ABBA Brave New Storage World Alexander Ruebensaal 1 Design, Implementation, Support & Operating of optimized IT Infrastructures
Hewlett Packard Enterprise HPE GEN10 PERSISTENT MEMORY PERFORMANCE THROUGH PERSISTENCE
Hewlett Packard Enterprise HPE GEN10 PERSISTENT MEMORY PERFORMANCE THROUGH PERSISTENCE Digital transformation is taking place in businesses of all sizes Big Data and Analytics Mobility Internet of Things
Flash Storage with 24G SAS Leads the Way in Crunching Big Data
Flash Storage with 24G SAS Leads the Way in Crunching Big Data SCSI Trade Association August 8th, 2018 1 Today's Panel Dennis Martin Founder and President Demartek Mohamad El-Batal Sr. Director of Architecture,
IBM Leading High Performance Computing and Deep Learning Technologies
IBM Leading High Performance Computing and Deep Learning Technologies Yubo Li ( 李玉博 ) Chief Architect, on Cloud IBM Research -- China email: liyubobj@cn.ibm.com QQ: 395238640 GTC China 2016 Sept. 13, 2016
2008 International ANSYS Conference
2008 International ANSYS Conference Maximizing Productivity With InfiniBand-Based Clusters Gilad Shainer Director of Technical Marketing Mellanox Technologies 2008 ANSYS, Inc. All rights reserved. 1 ANSYS,
MICROWAY'S NVIDIA TESLA V100 GPU SOLUTIONS GUIDE
MICROWAY'S NVIDIA TESLA V100 GPU SOLUTIONS GUIDE LEVERAGE OUR EXPERTISE sales@microway.com http://microway.com/tesla NUMBERSMASHER TESLA 4-GPU SERVER/WORKSTATION Flexible form factor 4 PCI-E GPUs + 3 additional
Interconnect Your Future
Interconnect Your Future Paving the Path to Exascale November 2017 Mellanox Accelerates Leading HPC and AI Systems Summit CORAL System Sierra CORAL System Fastest Supercomputer in Japan Fastest Supercomputer
Oak Ridge National Laboratory Computing and Computational Sciences
Oak Ridge National Laboratory Computing and Computational Sciences OFA Update by ORNL Presented by: Pavel Shamis (Pasha) OFA Workshop Mar 17, 2015 Acknowledgments Bernholdt David E. Hill Jason J. Leverman
Industry Collaboration and Innovation
Industry Collaboration and Innovation Open Coherent Accelerator Processor Interface OpenCAPI™ - A New Standard for High Performance Memory, Acceleration and Networks Jeff Stuecheli April 10, 2017 What
Exascale: challenges and opportunities in a power constrained world
Exascale: challenges and opportunities in a power constrained world Carlo Cavazzoni c.cavazzoni@cineca.it SuperComputing Applications and Innovation Department CINECA CINECA non profit Consortium, made
SOFTWARE-DEFINED BLOCK STORAGE FOR HYPERSCALE APPLICATIONS
SOFTWARE-DEFINED BLOCK STORAGE FOR HYPERSCALE APPLICATIONS SCALE-OUT SERVER SAN WITH DISTRIBUTED NVME, POWERED BY HIGH-PERFORMANCE NETWORK TECHNOLOGY INTRODUCTION The evolution in data-centric applications,
Interconnect Your Future
Interconnect Your Future Paving the Road to Exascale August 2017 Exponential Data Growth The Need for Intelligent and Faster Interconnect CPU-Centric (Onload) Data-Centric (Offload) Must Wait for the Data