GPU Computing with NVIDIA's new Kepler Architecture
Transcription
1 GPU Computing with NVIDIA's new Kepler Architecture. Axel Koehler, Sr. Solution Architect HPC. HPC Advisory Council Meeting, March , Lugano
2 NVIDIA: Parallel Computing Company. GPUs: GeForce, Quadro, Tesla. ARM SoCs: Tegra. VGX.
3 Tesla K20/K20X (Kepler GK110): Supercomputing; Weather / Climate Modeling; Molecular Dynamics; Computational Physics; Life Sciences; Manufacturing; Biochemistry; Bioinformatics; Material Science; Structural Mechanics; Comp Fluid Dynamics (CFD); Electromagnetics.
Tesla K10 (Kepler GK104): Defense / Govt; Oil and Gas; Signal Processing; Image Processing; Video Analytics; Reverse Time Migration; Kirchhoff Time Migration.
4 Product Name: K10; K20; K20X
GPU Architecture (Kepler): K10 GK104; K20 GK110; K20X GK110
# of GPUs: (values missing in the transcript)
Peak Single Flops: K10 4.58 TF (2.3 TF per GPU); K20 3.52 TF; K20X 3.95 TF
Peak SGEMM: K10 2.98 TF; K20 2.61 TF; K20X 2.90 TF
Peak Double Flops: K10 0.19 TF (0.095 TF per GPU); K20 1.17 TF; K20X 1.32 TF
Peak DGEMM: K10 0.12 TF; K20 1.10 TF; K20X 1.22 TF
Memory size: K10 8 GB (4 GB per GPU); K20 5 GB; K20X 6 GB
Memory BW (ECC off): K10 320 GB/s (160 GB/s per GPU); K20 208 GB/s; K20X 250 GB/s
New CUDA Features: K10 GPUDirect w/ RDMA; K20 and K20X GPUDirect (RDMA), Hyper-Q, Dynamic Parallelism
ECC Features: K10 External DRAMs only; K20 and K20X DRAM, Caches & Reg Files
# CUDA Cores: K10 3072 (1536 per GPU); K20 and K20X (values missing in the transcript)
Total Board Power: K10 225 W; K20 225 W; K20X 235 W
Board Type: K10 PCI-e Passive; K20 PCI-e Passive, Active, SXM; K20X PCI-e Passive, SXM
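The peak and GEMM figures on this slide can be related by simple arithmetic. The sketch below is not from the talk: it only reuses the numbers listed above, and the names `SPECS`, `gemm_efficiency` and `dp_flops_per_byte` are made up for illustration. It computes how close GEMM gets to the quoted peak and how many double-precision flops are available per byte of memory bandwidth.

```python
# Figures copied from the slide: TFLOPS for flops, GB/s for memory bandwidth.
# The dictionary layout and key names are illustrative assumptions.
SPECS = {
    "K10":  {"sp_peak": 4.58, "sgemm": 2.98, "dp_peak": 0.19, "dgemm": 0.12, "mem_bw": 320.0},
    "K20":  {"sp_peak": 3.52, "sgemm": 2.61, "dp_peak": 1.17, "dgemm": 1.10, "mem_bw": 208.0},
    "K20X": {"sp_peak": 3.95, "sgemm": 2.90, "dp_peak": 1.32, "dgemm": 1.22, "mem_bw": 250.0},
}


def gemm_efficiency(board, precision):
    """Return GEMM throughput as a fraction of the quoted peak ('sp' or 'dp')."""
    spec = SPECS[board]
    if precision == "sp":
        return spec["sgemm"] / spec["sp_peak"]
    if precision == "dp":
        return spec["dgemm"] / spec["dp_peak"]
    raise ValueError("precision must be 'sp' or 'dp'")


def dp_flops_per_byte(board):
    """Return peak double-precision flops per byte of memory bandwidth (ECC off)."""
    spec = SPECS[board]
    return (spec["dp_peak"] * 1e12) / (spec["mem_bw"] * 1e9)


if __name__ == "__main__":
    for board in SPECS:
        print(
            f"{board}: SGEMM {gemm_efficiency(board, 'sp'):.0%} of peak, "
            f"DGEMM {gemm_efficiency(board, 'dp'):.0%} of peak, "
            f"{dp_flops_per_byte(board):.2f} DP flops/byte"
        )
```

Read this way, the table suggests that the GK110 boards reach a larger share of their double-precision peak in DGEMM than the K10 does, while the K10 is positioned for single-precision workloads.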
5 Parallel Computing Platform
Multiple Programming Approaches: Libraries (Drop-in Acceleration); OpenACC Directives (Easily Accelerate Applications); Programming Languages (Maximum Flexibility).
Development Environment: Parallel Nsight IDE (Linux, Mac and Windows); GPU Debugging and Profiling; CUDA-GDB debugger; NVIDIA Visual Profiler; Third Party Tools (DDT, TotalView, Vampir).
Compiler: Open Compiler Tools (enables compiling new languages to the CUDA platform, and CUDA languages to other architectures); OpenACC Compiler.
Hardware Capabilities: SMX; Dynamic Parallelism; Hyper-Q; GPUDirect.
6 Kepler Features Make GPU Coding Easier
Hyper-Q: Speedup Legacy MPI Apps. Fermi: 1 Work Queue. Kepler: 32 Concurrent Work Queues.
Dynamic Parallelism: Less Back-Forth, Simpler Code (CPU and Fermi GPU vs. CPU and Kepler GPU).
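The queue counts on this slide (1 on Fermi, 32 on Kepler) can be illustrated with a deliberately crude model. The sketch below is not NVIDIA's scheduler and not part of the talk. It assumes that every MPI rank submits one equally long kernel, that each kernel is small enough for the GPU to run all of them side by side, and that a single work queue fully serializes independent work (a worst case). The function name `rounds_needed` is hypothetical.

```python
import math


def rounds_needed(num_independent_kernels, num_work_queues):
    """Toy model: each work queue feeds one independent kernel per round.

    Assumes equal-length kernels and enough GPU resources to run one kernel
    per queue concurrently; real hardware scheduling is more involved.
    """
    if num_independent_kernels < 0 or num_work_queues < 1:
        raise ValueError("need a non-negative kernel count and at least one queue")
    return math.ceil(num_independent_kernels / num_work_queues)


if __name__ == "__main__":
    fermi_queues = 1    # from the slide
    kepler_queues = 32  # from the slide
    for ranks in (4, 8, 32, 40):
        print(
            f"{ranks} MPI ranks: "
            f"{rounds_needed(ranks, fermi_queues)} rounds with 1 queue, "
            f"{rounds_needed(ranks, kepler_queues)} rounds with 32 queues"
        )
```

Under these assumptions, legacy MPI codes that launch many small kernels from separate ranks are the ones that gain most from the extra queues, which matches the slide's "Speedup Legacy MPI Apps" claim.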
7 Proxy: A Multi-Process Runtime for MPI
(Diagram: four MPI ranks, Rank 0 to Rank 3, each running CUDA on one GPU.)
Why: Speedups for MPI programs with low GPU utilization.
How: Multiple CPU processes on a single GPU simultaneously.
Proxy Server: Client-server architecture; client processes share the same CUDA context.
When: Currently on Cray; production on Linux with the next CUDA version.
8 Kepler Enables Full NVIDIA GPUDirect RDMA
(Diagram: Server 1 and Server 2, each with a CPU, system memory, and two GPUs (GPU1, GPU2) with GDDR5 memory, attached via PCI-e to a network card; the two network cards are linked over the network.)
GPUDirect RDMA is a general approach and can also be used in conjunction with other PCIe devices (e.g. flash memory devices).
9 CUDA Compiler Contributed to Open Source LLVM
Developers want to build front-ends for Java, Python, R, DSLs.
Target other processors like ARM, FPGA, GPUs, x86.
(Diagram labels: CUDA C, C++, Fortran; New Language Support; LLVM Compiler For CUDA; NVIDIA GPUs; x86 CPUs; New Processor Support.)
10 Open Compiler Architecture
(Diagram labels, in order: NVCC; .cu; CUDA FE; libcuda.lang; NVVM IR; LLVM Optimizer; NVPTX CodeGen; libnvvm; Open Sourced; PTX; PTXAS; Host Compiler; CUDA Runtime; CUDA Driver.)
11 Scenarios for the Compiler SDK
Three scenarios: Building Production Quality Compilers; Building Domain Specific Languages (DSL); Enabling Other Platforms.
(Diagram labels: NVCC; libcuda.LANG; CUDA Fortran; OpenACC; DSL Front End; x86 LLVM Backend; libnvvm; CUDA Runtime; DSL Runtime; x86 CUDA Runtime; JET; MATLAB; HALIDE.)
12 Enabling Research in GPU Computing
(Diagram labels: CU++; CLANG; x86 LLVM Backend; libnvvm; Custom Runtime.)
13 Proposed Additions for OpenACC 2.0
Address ambiguities in existing spec.
List of 30+ features to be added: Nested parallelism; Separate compilation; Function calls; Data directives for control, unstructured data, deep copy for C++ structures, noncontiguous memory; Multiple devices; Profiling interface.
Certification: OpenACC test suite.
14 Growing OpenACC Support
15 GPU-Accelerated Applications
16 Summary
The Kepler architecture focuses on performance, efficiency and programmability.
NVIDIA's Parallel Computing Platform is evolving.
Open Compiler Architecture and Compiler SDK play a very important role in broadening the GPU platform.
Strong growth in GPU-accelerated applications in academia and industry.
17 Thank you. Questions? Axel Koehler, Sr. Solution Architect HPC