Porting Scientific Applications to OpenPOWER
|
|
- Ernest Whitehead
- 6 years ago
- Views:
Transcription
1 Porting Scientific Applications to OpenPOWER Dirk Pleiter Forschungszentrum Jülich / JSC #OpenPOWERSummit Join the conversation at #OpenPOWERSummit 1
2 JSC s HPC Strategy IBM Power 6 JUMP, 9 TFlop/s Intel Nehalem JUROPA 300 TFlop/s JURECA ~ 2 PFlop/s + Booster ~ 10 PFlop/s File Server Lustre GPFS IBM Blue Gene/L JUBL, 45 TFlop/s IBM Blue Gene/P JUGENE, 1 PFlop/s IBM Blue Gene/Q JUQUEEN 5.9 PFlop/s General-Purpose Cluster Highly Scalable System Join the conversation at #OpenPOWERSummit 2
3 Achieving Scalability Need for Research on Research on architectures and technologies Research on applications and algorithms Ingredients for HPC co-design Provide incentives to users: High-Q Club Showcase for codes that can utilize a 28-rack Blue Gene/Q at JSC Selected Club members dynqcd: simulation of particle theories KKRnano: DFT-based condensed matter PEPC: Tree-based N-body code... Join the conversation at #OpenPOWERSummit 3
4 Why OpenPOWER? Answer from a customer point of view Increasing share of Top500 are based on CPUs from single vendor Pure market observation, no statement about technology Lack of competition in processor technologies Usually higher prices Less incentive for innovations Need for promoting alternative technologies OpenPOWER Join the conversation at #OpenPOWERSummit 4
5 Why OpenPOWER? Answer from an architectural point of view Tight integration of high-performance processor and low-clocked, highly parallel compute devices Enable drastic improvement of power efficiency Preserve usability at tremendously increased level of parallelism Opportunity to improve overall balance of system Integration of non-volatile memory into fat compute nodes Increased reliability though reduced number of components and support of resilience Addresses exascale challenges Join the conversation at #OpenPOWERSummit 5
6 POWER Acceleration and Design Center PADC is a collaboration between IBM R&D Labs in Böblingen and Zürich Forschungszentrum Jülich NVIDIA Europe Mission statement Support scientists and engineers to target the grand challenges facing society using OpenPOWER technologies Grand challenges Energy and environment, e.g. plasma physics Information, e.g. condensed matter physics Healthcare, e.g. brain research Join the conversation at #OpenPOWERSummit 6
7 Applications PADC takes application driven approach Builds on previous work in NVIDIA Application Lab Previously targeted applications Regional Flood Model B-CALM PANDA... Ongoing and future applications KKRnano BigBrain... Join the conversation at #OpenPOWERSummit 7
8 Performance Analysis Performance analysis POWER8 memory hierarchy Performance analysis GPU-GPU data transport Join the conversation at #OpenPOWERSummit 8
9 Performance Characterization Characterization of applications on given hardware Methodology Identification of performance critical kernels Optimization of kernel at best effort with given constraints Performance characterization Measurement of extensive performance metrics Architectural analysis Question addressed in architectural analysis How does performance change with clock speed? How does it depend on memory hierarchy? Join the conversation at #OpenPOWERSummit 9
10 Performance Characterization Example: Regional Flood Model Key kernel: Solver for Saint-Venant equations Compute particle flow in 2 dimensions Selected performance metrics (on K20x) Arithmetic intensity AI acc (T) = 0.5 Memory rd/wr bandwidth = 80/86 GByte/s Warp execution efficiency ε warp = 80% Example analysis for changing boost clock Join the conversation at #OpenPOWERSummit 10
11 Performance Modelling Semi-empirical performance modelling methodology Methodology On basis of prior knowledge formulate scaling formulae describing dependence of execution time t(w) as function of work-load W Measure t(w ) for different W and fit scaling formulae to result Check fitted parameters for plausibility Considered example: B-CALM 1-dimensionally parallelized Finite Difference Time Domain approach for electro-magnetic simulations Join the conversation at #OpenPOWERSummit 11
12 Performance Modelling Semi-empirical performance modelling for B-CALM Model ansatz Calculation of boundary sites t bnd ~ N x N y Calculation of bulk sites t bulk ~ N x N y (N z / P) Communication of boundary t net ~ N x N y Overlapping calculations and communications: t = t bnd + max( t bulk, t net ) Weak scaling measured for fixed N x N y using P=2 GPUs attached to single processor Non-optimized MPI Join the conversation at #OpenPOWERSummit 12
13 Future opportunities Challenging applications with large memory capacity and high bandwidth requirements High bandwidth, smaller capacity memory attached to GPU Large capacity, smaller bandwidth memory attached to CPU Example: BigBrain project at FZ Jülich Goal: 3d brain model reconstructed from 2d slices Computational challenge: image registration Compute intensive computation of mutual information metric Large capacity required for storing high-resolution images Significant slow-down found using host memory on today s architectures [A. Adinets et al., HeteroPar 2013] Large benefit expected from NVLink Join the conversation at #OpenPOWERSummit 13
14 Conclusions OpenPOWER opens important opportunities for HPC infrastructure providers Exascale challenges are addressed No problems porting GPU-enabled applications to OpenPOWER Room for optimizations Support for porting more applications required Based on performance characterization and modelling Optimization and code restructuring within PADC Join the conversation at #OpenPOWERSummit 14
NVIDIA Application Lab at Jülich
Mitglied der Helmholtz- Gemeinschaft NVIDIA Application Lab at Jülich Dirk Pleiter Jülich Supercomputing Centre (JSC) Forschungszentrum Jülich at a Glance (status 2010) Budget: 450 mio Euro Staff: 4,800
More informationVon Antreibern und Beschleunigern des HPC
Mitglied der Helmholtz-Gemeinschaft Von Antreibern und Beschleunigern des HPC D. Pleiter Jülich 16 December 2014 Ein Dementi vorweg [c't, Nr. 25/2014, 15.11.2014] Ja: Das FZJ ist seit März Mitglieder der
More informationMPI RUNTIMES AT JSC, NOW AND IN THE FUTURE
, NOW AND IN THE FUTURE Which, why and how do they compare in our systems? 08.07.2018 I MUG 18, COLUMBUS (OH) I DAMIAN ALVAREZ Outline FZJ mission JSC s role JSC s vision for Exascale-era computing JSC
More informationSystems Architectures towards Exascale
Systems Architectures towards Exascale D. Pleiter German-Indian Workshop on HPC Architectures and Applications Pune 29 November 2016 Outline Introduction Exascale computing Technology trends Architectures
More informationJülich Supercomputing Centre
Mitglied der Helmholtz-Gemeinschaft Jülich Supercomputing Centre Norbert Attig Jülich Supercomputing Centre (JSC) Forschungszentrum Jülich (FZJ) Aug 26, 2009 DOAG Regionaltreffen NRW 2 Supercomputing at
More informationWelcome to the. Jülich Supercomputing Centre. D. Rohe and N. Attig Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich
Mitglied der Helmholtz-Gemeinschaft Welcome to the Jülich Supercomputing Centre D. Rohe and N. Attig Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich Schedule: Monday, May 18 13:00-13:30 Welcome
More informationWelcome to the. Jülich Supercomputing Centre. D. Rohe and N. Attig Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich
Mitglied der Helmholtz-Gemeinschaft Welcome to the Jülich Supercomputing Centre D. Rohe and N. Attig Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich Schedule: Thursday, Nov 26 13:00-13:30
More informationJÜLICH SUPERCOMPUTING CENTRE Site Introduction Michael Stephan Forschungszentrum Jülich
JÜLICH SUPERCOMPUTING CENTRE Site Introduction 09.04.2018 Michael Stephan JSC @ Forschungszentrum Jülich FORSCHUNGSZENTRUM JÜLICH Research Centre Jülich One of the 15 Helmholtz Research Centers in Germany
More informationI/O and Scheduling aspects in DEEP-EST
I/O and Scheduling aspects in DEEP-EST Norbert Eicker Jülich Supercomputing Centre & University of Wuppertal The research leading to these results has received funding from the European Community's Seventh
More informationI/O Monitoring at JSC, SIONlib & Resiliency
Mitglied der Helmholtz-Gemeinschaft I/O Monitoring at JSC, SIONlib & Resiliency Update: I/O Infrastructure @ JSC Update: Monitoring with LLview (I/O, Memory, Load) I/O Workloads on Jureca SIONlib: Task-Local
More informationParallel & Scalable Machine Learning Introduction to Machine Learning Algorithms
Parallel & Scalable Machine Learning Introduction to Machine Learning Algorithms Dr. Ing. Morris Riedel Adjunct Associated Professor School of Engineering and Natural Sciences, University of Iceland Research
More informationI/O at JSC. I/O Infrastructure Workloads, Use Case I/O System Usage and Performance SIONlib: Task-Local I/O. Wolfgang Frings
Mitglied der Helmholtz-Gemeinschaft I/O at JSC I/O Infrastructure Workloads, Use Case I/O System Usage and Performance SIONlib: Task-Local I/O Wolfgang Frings W.Frings@fz-juelich.de Jülich Supercomputing
More informationHigh Performance Computing at the Jülich Supercomputing Center
Mitglied der Helmholtz-Gemeinschaft High Performance Computing at the Jülich Supercomputing Center Jutta Docter Institute for Advanced Simulation (IAS) Jülich Supercomputing Centre (JSC) Overview Jülich
More informationPreparing GPU-Accelerated Applications for the Summit Supercomputer
Preparing GPU-Accelerated Applications for the Summit Supercomputer Fernanda Foertter HPC User Assistance Group Training Lead foertterfs@ornl.gov This research used resources of the Oak Ridge Leadership
More informationBuilding NVLink for Developers
Building NVLink for Developers Unleashing programmatic, architectural and performance capabilities for accelerated computing Why NVLink TM? Simpler, Better and Faster Simplified Programming No specialized
More informationTrends in HPC Architectures
Mitglied der Helmholtz-Gemeinschaft Trends in HPC Architectures Norbert Eicker Institute for Advanced Simulation Jülich Supercomputing Centre PRACE/LinkSCEEM-2 CyI 2011 Winter School Nikosia, Cyprus Forschungszentrum
More informationSoftware and Performance Engineering for numerical codes on GPU clusters
Software and Performance Engineering for numerical codes on GPU clusters H. Köstler International Workshop of GPU Solutions to Multiscale Problems in Science and Engineering Harbin, China 28.7.2010 2 3
More informationTrends in HPC (hardware complexity and software challenges)
Trends in HPC (hardware complexity and software challenges) Mike Giles Oxford e-research Centre Mathematical Institute MIT seminar March 13th, 2013 Mike Giles (Oxford) HPC Trends March 13th, 2013 1 / 18
More informationThe DEEP (and DEEP-ER) projects
The DEEP (and DEEP-ER) projects Estela Suarez - Jülich Supercomputing Centre BDEC for Europe Workshop Barcelona, 28.01.2015 The research leading to these results has received funding from the European
More informationVectorisation and Portable Programming using OpenCL
Vectorisation and Portable Programming using OpenCL Mitglied der Helmholtz-Gemeinschaft Jülich Supercomputing Centre (JSC) Andreas Beckmann, Ilya Zhukov, Willi Homberg, JSC Wolfram Schenck, FH Bielefeld
More informationTools and Methodology for Ensuring HPC Programs Correctness and Performance. Beau Paisley
Tools and Methodology for Ensuring HPC Programs Correctness and Performance Beau Paisley bpaisley@allinea.com About Allinea Over 15 years of business focused on parallel programming development tools Strong
More informationHPC Saudi Jeffrey A. Nichols Associate Laboratory Director Computing and Computational Sciences. Presented to: March 14, 2017
Creating an Exascale Ecosystem for Science Presented to: HPC Saudi 2017 Jeffrey A. Nichols Associate Laboratory Director Computing and Computational Sciences March 14, 2017 ORNL is managed by UT-Battelle
More informationExascale: challenges and opportunities in a power constrained world
Exascale: challenges and opportunities in a power constrained world Carlo Cavazzoni c.cavazzoni@cineca.it SuperComputing Applications and Innovation Department CINECA CINECA non profit Consortium, made
More informationResults from TSUBAME3.0 A 47 AI- PFLOPS System for HPC & AI Convergence
Results from TSUBAME3.0 A 47 AI- PFLOPS System for HPC & AI Convergence Jens Domke Research Staff at MATSUOKA Laboratory GSIC, Tokyo Institute of Technology, Japan Omni-Path User Group 2017/11/14 Denver,
More informationParallel I/O on JUQUEEN
Parallel I/O on JUQUEEN 4. Februar 2014, JUQUEEN Porting and Tuning Workshop Mitglied der Helmholtz-Gemeinschaft Wolfgang Frings w.frings@fz-juelich.de Jülich Supercomputing Centre Overview Parallel I/O
More informationInterconnect Your Future
#OpenPOWERSummit Interconnect Your Future Scot Schultz, Director HPC / Technical Computing Mellanox Technologies OpenPOWER Summit, San Jose CA March 2015 One-Generation Lead over the Competition Mellanox
More informationRecent Developments in Supercomputing
John von Neumann Institute for Computing Recent Developments in Supercomputing Th. Lippert published in NIC Symposium 2008, G. Münster, D. Wolf, M. Kremer (Editors), John von Neumann Institute for Computing,
More informationFujitsu s Approach to Application Centric Petascale Computing
Fujitsu s Approach to Application Centric Petascale Computing 2 nd Nov. 2010 Motoi Okuda Fujitsu Ltd. Agenda Japanese Next-Generation Supercomputer, K Computer Project Overview Design Targets System Overview
More informationAnalyzing the Performance of IWAVE on a Cluster using HPCToolkit
Analyzing the Performance of IWAVE on a Cluster using HPCToolkit John Mellor-Crummey and Laksono Adhianto Department of Computer Science Rice University {johnmc,laksono}@rice.edu TRIP Meeting March 30,
More informationExploiting the OpenPOWER Platform for Big Data Analytics and Cognitive. Rajesh Bordawekar and Ruchir Puri IBM T. J. Watson Research Center
Exploiting the OpenPOWER Platform for Big Data Analytics and Cognitive Rajesh Bordawekar and Ruchir Puri IBM T. J. Watson Research Center 3/17/2015 2014 IBM Corporation Outline IBM OpenPower Platform Accelerating
More informationIBM Deep Learning Solutions
IBM Deep Learning Solutions Reference Architecture for Deep Learning on POWER8, P100, and NVLink October, 2016 How do you teach a computer to Perceive? 2 Deep Learning: teaching Siri to recognize a bicycle
More informationPLAN-E Workshop Switzerland. Welcome! September 8, 2016
PLAN-E Workshop Switzerland Welcome! September 8, 2016 The Swiss National Supercomputing Centre Driving innovation in computational research in Switzerland Michele De Lorenzi (CSCS) PLAN-E September 8,
More informationUsing Automated Performance Modeling to Find Scalability Bugs in Complex Codes
Using Automated Performance Modeling to Find Scalability Bugs in Complex Codes A. Calotoiu 1, T. Hoefler 2, M. Poke 1, F. Wolf 1 1) German Research School for Simulation Sciences 2) ETH Zurich September
More information19. prosince 2018 CIIRC Praha. Milan Král, IBM Radek Špimr
19. prosince 2018 CIIRC Praha Milan Král, IBM Radek Špimr CORAL CORAL 2 CORAL Installation at ORNL CORAL Installation at LLNL Order of Magnitude Leap in Computational Power Real, Accelerated Science ACME
More informationOpenStaPLE, an OpenACC Lattice QCD Application
OpenStaPLE, an OpenACC Lattice QCD Application Enrico Calore Postdoctoral Researcher Università degli Studi di Ferrara INFN Ferrara Italy GTC Europe, October 10 th, 2018 E. Calore (Univ. and INFN Ferrara)
More informationLarge scale Imaging on Current Many- Core Platforms
Large scale Imaging on Current Many- Core Platforms SIAM Conf. on Imaging Science 2012 May 20, 2012 Dr. Harald Köstler Chair for System Simulation Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen,
More informationDynamical Exascale Entry Platform
DEEP Dynamical Exascale Entry Platform 2 nd IS-ENES Workshop on High performance computing for climate models 30.01.2013, Toulouse, France Estela Suarez The research leading to these results has received
More informationGPU-centric communication for improved efficiency
GPU-centric communication for improved efficiency Benjamin Klenk *, Lena Oden, Holger Fröning * * Heidelberg University, Germany Fraunhofer Institute for Industrial Mathematics, Germany GPCDP Workshop
More informationMELLANOX EDR UPDATE & GPUDIRECT MELLANOX SR. SE 정연구
MELLANOX EDR UPDATE & GPUDIRECT MELLANOX SR. SE 정연구 Leading Supplier of End-to-End Interconnect Solutions Analyze Enabling the Use of Data Store ICs Comprehensive End-to-End InfiniBand and Ethernet Portfolio
More informationWrite a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical
Identify a problem Review approaches to the problem Propose a novel approach to the problem Define, design, prototype an implementation to evaluate your approach Could be a real system, simulation and/or
More informationEarly Evaluation of the "Infinite Memory Engine" Burst Buffer Solution
Early Evaluation of the "Infinite Memory Engine" Burst Buffer Solution Wolfram Schenck Faculty of Engineering and Mathematics, Bielefeld University of Applied Sciences, Bielefeld, Germany Salem El Sayed,
More informationHigh Performance Computing Course Notes Course Administration
High Performance Computing Course Notes 2009-2010 2010 Course Administration Contacts details Dr. Ligang He Home page: http://www.dcs.warwick.ac.uk/~liganghe Email: liganghe@dcs.warwick.ac.uk Office hours:
More informationTECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 6 th CALL (Tier-0)
TECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 6 th CALL (Tier-0) Contributing sites and the corresponding computer systems for this call are: GCS@Jülich, Germany IBM Blue Gene/Q GENCI@CEA, France Bull Bullx
More informationTwo-Phase flows on massively parallel multi-gpu clusters
Two-Phase flows on massively parallel multi-gpu clusters Peter Zaspel Michael Griebel Institute for Numerical Simulation Rheinische Friedrich-Wilhelms-Universität Bonn Workshop Programming of Heterogeneous
More informationA Breakthrough in Non-Volatile Memory Technology FUJITSU LIMITED
A Breakthrough in Non-Volatile Memory Technology & 0 2018 FUJITSU LIMITED IT needs to accelerate time-to-market Situation: End users and applications need instant access to data to progress faster and
More informationDeep Learning mit PowerAI - Ein Überblick
Stephen Lutz Deep Learning mit PowerAI - Open Group Master Certified IT Specialist Technical Sales IBM Cognitive Infrastructure IBM Germany Ein Überblick Stephen.Lutz@de.ibm.com What s that? and what s
More informationParticle-in-Cell Simulations on Modern Computing Platforms. Viktor K. Decyk and Tajendra V. Singh UCLA
Particle-in-Cell Simulations on Modern Computing Platforms Viktor K. Decyk and Tajendra V. Singh UCLA Outline of Presentation Abstraction of future computer hardware PIC on GPUs OpenCL and Cuda Fortran
More informationInterconnect Your Future
Interconnect Your Future Gilad Shainer 2nd Annual MVAPICH User Group (MUG) Meeting, August 2014 Complete High-Performance Scalable Interconnect Infrastructure Comprehensive End-to-End Software Accelerators
More informationIntel Many Integrated Core (MIC) Architecture
Intel Many Integrated Core (MIC) Architecture Karl Solchenbach Director European Exascale Labs BMW2011, November 3, 2011 1 Notice and Disclaimers Notice: This document contains information on products
More informationNumerical Algorithms on Multi-GPU Architectures
Numerical Algorithms on Multi-GPU Architectures Dr.-Ing. Harald Köstler 2 nd International Workshops on Advances in Computational Mechanics Yokohama, Japan 30.3.2010 2 3 Contents Motivation: Applications
More informationCUDA Experiences: Over-Optimization and Future HPC
CUDA Experiences: Over-Optimization and Future HPC Carl Pearson 1, Simon Garcia De Gonzalo 2 Ph.D. candidates, Electrical and Computer Engineering 1 / Computer Science 2, University of Illinois Urbana-Champaign
More informationGeneral overview and first results.
French Technological Watch Group: General overview and first results gabriel.hautreux@genci.fr Fusion workshop at «Maison de la Simulation» 29/11/2016 FRENCH TECHNOLOGICAL WATCH GROUP Led by GENCI and
More informationSUPERMICRO, VEXATA AND INTEL ENABLING NEW LEVELS PERFORMANCE AND EFFICIENCY FOR REAL-TIME DATA ANALYTICS FOR SQL DATA WAREHOUSE DEPLOYMENTS
TABLE OF CONTENTS 2 THE AGE OF INFORMATION ACCELERATION Vexata Provides the Missing Piece in The Information Acceleration Puzzle The Vexata - Supermicro Partnership 4 CREATING ULTRA HIGH-PERFORMANCE DATA
More informationOverview of Parallel Computing. Timothy H. Kaiser, PH.D.
Overview of Parallel Computing Timothy H. Kaiser, PH.D. tkaiser@mines.edu Introduction What is parallel computing? Why go parallel? The best example of parallel computing Some Terminology Slides and examples
More informationHigh Performance Computing Course Notes HPC Fundamentals
High Performance Computing Course Notes 2008-2009 2009 HPC Fundamentals Introduction What is High Performance Computing (HPC)? Difficult to define - it s a moving target. Later 1980s, a supercomputer performs
More informationPresent and Future Leadership Computers at OLCF
Present and Future Leadership Computers at OLCF Al Geist ORNL Corporate Fellow DOE Data/Viz PI Meeting January 13-15, 2015 Walnut Creek, CA ORNL is managed by UT-Battelle for the US Department of Energy
More informationMapping MPI+X Applications to Multi-GPU Architectures
Mapping MPI+X Applications to Multi-GPU Architectures A Performance-Portable Approach Edgar A. León Computer Scientist San Jose, CA March 28, 2018 GPU Technology Conference This work was performed under
More informationInterconnection of Armenian e- Infrastructures with the pan- Euroepan Integrated Environments
Interconnection of Armenian e- Infrastructures with the pan- Euroepan Integrated Environments H. Astsatryan Institute for Informatics and Automation Problems, National Academy of Sciences of the Republic
More informationTechnologies for Information and Health
Energy Defence and Global Security Technologies for Information and Health Atomic Energy Commission HPC in France from a global perspective Pierre LECA Head of «Simulation and Information Sciences Dpt.»
More informationAccelerating Implicit LS-DYNA with GPU
Accelerating Implicit LS-DYNA with GPU Yih-Yih Lin Hewlett-Packard Company Abstract A major hindrance to the widespread use of Implicit LS-DYNA is its high compute cost. This paper will show modern GPU,
More informationHPC IN EUROPE. Organisation of public HPC resources
HPC IN EUROPE Organisation of public HPC resources Context Focus on publicly-funded HPC resources provided primarily to enable scientific research and development at European universities and other publicly-funded
More informationIBM POWER SYSTEMS: YOUR UNFAIR ADVANTAGE
IBM POWER SYSTEMS: YOUR UNFAIR ADVANTAGE Choosing IT infrastructure is a crucial decision, and the right choice will position your organization for success. IBM Power Systems provides an innovative platform
More informationAdaptive Mesh Astrophysical Fluid Simulations on GPU. San Jose 10/2/2009 Peng Wang, NVIDIA
Adaptive Mesh Astrophysical Fluid Simulations on GPU San Jose 10/2/2009 Peng Wang, NVIDIA Overview Astrophysical motivation & the Enzo code Finite volume method and adaptive mesh refinement (AMR) CUDA
More informationTHE PATH TO EXASCALE COMPUTING. Bill Dally Chief Scientist and Senior Vice President of Research
THE PATH TO EXASCALE COMPUTING Bill Dally Chief Scientist and Senior Vice President of Research The Goal: Sustained ExaFLOPs on problems of interest 2 Exascale Challenges Energy efficiency Programmability
More informationPerformance Tools for Technical Computing
Christian Terboven terboven@rz.rwth-aachen.de Center for Computing and Communication RWTH Aachen University Intel Software Conference 2010 April 13th, Barcelona, Spain Agenda o Motivation and Methodology
More informationThe IBM Blue Gene/Q: Application performance, scalability and optimisation
The IBM Blue Gene/Q: Application performance, scalability and optimisation Mike Ashworth, Andrew Porter Scientific Computing Department & STFC Hartree Centre Manish Modani IBM STFC Daresbury Laboratory,
More informationIBM CORAL HPC System Solution
IBM CORAL HPC System Solution HPC and HPDA towards Cognitive, AI and Deep Learning Deep Learning AI / Deep Learning Strategy for Power Power AI Platform High Performance Data Analytics Big Data Strategy
More informationFirst Steps of YALES2 Code Towards GPU Acceleration on Standard and Prototype Cluster
First Steps of YALES2 Code Towards GPU Acceleration on Standard and Prototype Cluster YALES2: Semi-industrial code for turbulent combustion and flows Jean-Matthieu Etancelin, ROMEO, NVIDIA GPU Application
More informationWorld s most advanced data center accelerator for PCIe-based servers
NVIDIA TESLA P100 GPU ACCELERATOR World s most advanced data center accelerator for PCIe-based servers HPC data centers need to support the ever-growing demands of scientists and researchers while staying
More informationCharacterizing Parallel I/O Behaviour Based on Server-Side I/O Counters
Characterizing Parallel I/O Behaviour Based on Server-Side I/O Counters SC16 - BoF Analyzing Parallel I/O SC16 BoF - Analyzing Parallel I/O, November 15, 2016 S. El Sayed JSC M. Bolten Kas D. Pleiter JSC
More informationHPC future trends from a science perspective
HPC future trends from a science perspective Simon McIntosh-Smith University of Bristol HPC Research Group simonm@cs.bris.ac.uk 1 Business as usual? We've all got used to new machines being relatively
More informationPortable Heterogeneous High-Performance Computing via Domain-Specific Virtualization. Dmitry I. Lyakh.
Portable Heterogeneous High-Performance Computing via Domain-Specific Virtualization Dmitry I. Lyakh liakhdi@ornl.gov This research used resources of the Oak Ridge Leadership Computing Facility at the
More informationsimulation framework for piecewise regular grids
WALBERLA, an ultra-scalable multiphysics simulation framework for piecewise regular grids ParCo 2015, Edinburgh September 3rd, 2015 Christian Godenschwager, Florian Schornbaum, Martin Bauer, Harald Köstler
More informationHybrid KAUST Many Cores and OpenACC. Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS
+ Hybrid Computing @ KAUST Many Cores and OpenACC Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS + Agenda Hybrid Computing n Hybrid Computing n From Multi-Physics
More informationPERFORMANCE PORTABILITY WITH OPENACC. Jeff Larkin, NVIDIA, November 2015
PERFORMANCE PORTABILITY WITH OPENACC Jeff Larkin, NVIDIA, November 2015 TWO TYPES OF PORTABILITY FUNCTIONAL PORTABILITY PERFORMANCE PORTABILITY The ability for a single code to run anywhere. The ability
More informationThe Future of High Performance Interconnects
The Future of High Performance Interconnects Ashrut Ambastha HPC Advisory Council Perth, Australia :: August 2017 When Algorithms Go Rogue 2017 Mellanox Technologies 2 When Algorithms Go Rogue 2017 Mellanox
More informationArm's role in co-design for the next generation of HPC platforms
Arm's role in co-design for the next generation of HPC platforms Filippo Spiga Software and Large Scale Systems What it is Co-design? Abstract: Preparations for Exascale computing have led to the realization
More informationHigh Performance Computing Data Management. Philippe Trautmann BDM High Performance Computing Global Research
High Performance Computing Management Philippe Trautmann BDM High Performance Computing Global Education @ Research HPC Market and Trends High Performance Computing: Availability/Sharing is key European
More informationAn Empirical Study of Computation-Intensive Loops for Identifying and Classifying Loop Kernels
An Empirical Study of Computation-Intensive Loops for Identifying and Classifying Loop Kernels Masatomo Hashimoto Masaaki Terai Toshiyuki Maeda Kazuo Minami 26/04/2017 ICPE2017 1 Agenda Performance engineering
More informationOverview. CS 472 Concurrent & Parallel Programming University of Evansville
Overview CS 472 Concurrent & Parallel Programming University of Evansville Selection of slides from CIS 410/510 Introduction to Parallel Computing Department of Computer and Information Science, University
More informationTESLA V100 PERFORMANCE GUIDE. Life Sciences Applications
TESLA V100 PERFORMANCE GUIDE Life Sciences Applications NOVEMBER 2017 TESLA V100 PERFORMANCE GUIDE Modern high performance computing (HPC) data centers are key to solving some of the world s most important
More informationHPC with GPU and its applications from Inspur. Haibo Xie, Ph.D
HPC with GPU and its applications from Inspur Haibo Xie, Ph.D xiehb@inspur.com 2 Agenda I. HPC with GPU II. YITIAN solution and application 3 New Moore s Law 4 HPC? HPC stands for High Heterogeneous Performance
More informationOak Ridge National Laboratory Computing and Computational Sciences
Oak Ridge National Laboratory Computing and Computational Sciences OFA Update by ORNL Presented by: Pavel Shamis (Pasha) OFA Workshop Mar 17, 2015 Acknowledgments Bernholdt David E. Hill Jason J. Leverman
More informationGeneral Plasma Physics
Present and Future Computational Requirements General Plasma Physics Center for Integrated Computation and Analysis of Reconnection and Turbulence () Kai Germaschewski, Homa Karimabadi Amitava Bhattacharjee,
More informationNetworks. Capital markets day 2017 N O V E M B E R 7-8, N E W Y O R K. Ericsson Internal Page 1
Networks Capital markets day 2017 N O V E M B E R 7-8, 2 0 1 7 N E W Y O R K Ericsson Internal 2017-10-06 Page 1 Fredrik Jejdling Executive Vice President & Head of Networks Ericsson Internal 2017-10-06
More informationTECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 11th CALL (T ier-0)
TECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 11th CALL (T ier-0) Contributing sites and the corresponding computer systems for this call are: BSC, Spain IBM System X idataplex CINECA, Italy The site selection
More informationIntroduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620
Introduction to Parallel and Distributed Computing Linh B. Ngo CPSC 3620 Overview: What is Parallel Computing To be run using multiple processors A problem is broken into discrete parts that can be solved
More informationHigh performance Computing and O&G Challenges
High performance Computing and O&G Challenges 2 Seismic exploration challenges High Performance Computing and O&G challenges Worldwide Context Seismic,sub-surface imaging Computing Power needs Accelerating
More informationFacilitating IP Development for the OpenCAPI Memory Interface Kevin McIlvain, Memory Development Engineer IBM. Join the Conversation #OpenPOWERSummit
Facilitating IP Development for the OpenCAPI Memory Interface Kevin McIlvain, Memory Development Engineer IBM Join the Conversation #OpenPOWERSummit Moral of the Story OpenPOWER is the best platform to
More informationIdentifying Working Data Set of Particular Loop Iterations for Dynamic Performance Tuning
Identifying Working Data Set of Particular Loop Iterations for Dynamic Performance Tuning Yukinori Sato (JAIST / JST CREST) Hiroko Midorikawa (Seikei Univ. / JST CREST) Toshio Endo (TITECH / JST CREST)
More informationBuilding the Most Efficient Machine Learning System
Building the Most Efficient Machine Learning System Mellanox The Artificial Intelligence Interconnect Company June 2017 Mellanox Overview Company Headquarters Yokneam, Israel Sunnyvale, California Worldwide
More informationThe Future of Interconnect Technology
The Future of Interconnect Technology Michael Kagan, CTO HPC Advisory Council Stanford, 2014 Exponential Data Growth Best Interconnect Required 44X 0.8 Zetabyte 2009 35 Zetabyte 2020 2014 Mellanox Technologies
More informationSolving Large Complex Problems. Efficient and Smart Solutions for Large Models
Solving Large Complex Problems Efficient and Smart Solutions for Large Models 1 ANSYS Structural Mechanics Solutions offers several techniques 2 Current trends in simulation show an increased need for
More informationFaster Innovation - Accelerating SIMULIA Abaqus Simulations with NVIDIA GPUs. Baskar Rajagopalan Accelerated Computing, NVIDIA
Faster Innovation - Accelerating SIMULIA Abaqus Simulations with NVIDIA GPUs Baskar Rajagopalan Accelerated Computing, NVIDIA 1 Engineering & IT Challenges/Trends NVIDIA GPU Solutions AGENDA Abaqus GPU
More informationVýpočetní zdroje IT4Innovations a PRACE pro využití ve vědě a výzkumu
Výpočetní zdroje IT4Innovations a PRACE pro využití ve vědě a výzkumu Filip Staněk Seminář gridového počítání 2011, MetaCentrum, Brno, 7. 11. 2011 Introduction I Project objectives: to establish a centre
More informationCUDA Kernel based Collective Reduction Operations on Large-scale GPU Clusters
CUDA Kernel based Collective Reduction Operations on Large-scale GPU Clusters Ching-Hsiang Chu, Khaled Hamidouche, Akshay Venkatesh, Ammar Ahmad Awan and Dhabaleswar K. (DK) Panda Speaker: Sourav Chakraborty
More informationPaving the Road to Exascale
Paving the Road to Exascale Gilad Shainer August 2015, MVAPICH User Group (MUG) Meeting The Ever Growing Demand for Performance Performance Terascale Petascale Exascale 1 st Roadrunner 2000 2005 2010 2015
More informationA Simulation of Global Atmosphere Model NICAM on TSUBAME 2.5 Using OpenACC
A Simulation of Global Atmosphere Model NICAM on TSUBAME 2.5 Using OpenACC Hisashi YASHIRO RIKEN Advanced Institute of Computational Science Kobe, Japan My topic The study for Cloud computing My topic
More informationThe RAMDISK Storage Accelerator
The RAMDISK Storage Accelerator A Method of Accelerating I/O Performance on HPC Systems Using RAMDISKs Tim Wickberg, Christopher D. Carothers wickbt@rpi.edu, chrisc@cs.rpi.edu Rensselaer Polytechnic Institute
More informationCALMIP : HIGH PERFORMANCE COMPUTING
CALMIP : HIGH PERFORMANCE COMPUTING Nicolas.renon@univ-tlse3.fr Emmanuel.courcelle@inp-toulouse.fr CALMIP (UMS 3667) Espace Clément Ader www.calmip.univ-toulouse.fr CALMIP :Toulouse University Computing
More information