GPU ACCELERATED COMPUTING. 1st AlsaCalcul GPU Challenge, 14-Jun-2016, Strasbourg. Frédéric Parienté, Tesla Accelerated Computing, NVIDIA Corporation
THE WORLD LEADER IN VISUAL COMPUTING. Gaming, Pro Visualization, Enterprise, Data Center, Auto
"It's time to start planning for the end of Moore's Law, and it's worth pondering how it will end, not just when." Robert Colwell, Director, Microsystems Technology Office, DARPA
HOW GPU ACCELERATION WORKS. Application code is split in two: the compute-intensive functions (about 5% of the code) run on the GPU, and the rest of the sequential code runs on the CPU. GPU + CPU work together.
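The slide gives the share of the code that is offloaded (about 5%) but not the share of the runtime those functions account for, nor how much faster they run on the GPU. As a rough sketch only, Amdahl's law shows how the two unknowns combine into a whole-application speedup. The function name `overall_speedup`, the runtime fractions, and the kernel speedups below are illustrative assumptions, not figures from the talk.

```python
def overall_speedup(accelerated_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: speedup of the whole application when only part of
    its runtime is accelerated.

    accelerated_fraction: share of the original runtime spent in the
        offloaded, compute-intensive functions (0.0 to 1.0). Assumed value.
    kernel_speedup: how many times faster those functions run on the GPU
        than on the CPU. Assumed value.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    # Runtime that stays on the CPU and is not sped up at all.
    sequential_fraction = 1.0 - accelerated_fraction
    # New runtime, relative to the original, is the untouched part plus
    # the shortened accelerated part.
    new_runtime = sequential_fraction + accelerated_fraction / kernel_speedup
    return 1.0 / new_runtime


if __name__ == "__main__":
    # Hypothetical scenarios: the offloaded functions take 50%, 90%, or 99%
    # of the runtime and run 10x or 100x faster on the GPU.
    for fraction in (0.50, 0.90, 0.99):
        for speedup in (10.0, 100.0):
            total = overall_speedup(fraction, speedup)
            print(f"runtime share {fraction:.0%}, kernel {speedup:.0f}x "
                  f"-> application {total:.2f}x")
```

The sequential remainder caps the gain at 1 / (1 - accelerated_fraction), so offloading pays off most when a small amount of code dominates the runtime.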
TESLA ACCELERATED COMPUTING PLATFORM: Focused on Co-Design from Top to Bottom. Fast GPU engineered for high throughput. Fast GPU + strong CPU (NVIDIA GPU, x86 CPU). Productive programming model & tools. Expert co-design. Accessibility. Co-design layers: application, middleware, system software, large systems, processor. [Chart: TFLOPS by Tesla generation (M1060, M2090, K20, K80, P100), 2008 to 2016, vertical axis 0.0 to 5.5 TFLOPS.]
DATA CENTER TODAY: well-suited for transactional workloads running on lots of nodes. Server racks of commodity computers, interconnected through a network fabric with vast network overhead. THE DREAM: for important workloads with infinite need for computing. Few lightning-fast nodes with the performance of thousands of commodity computers.
70% OF TOP HPC APPS ACCELERATED. Intersect360 survey of top apps: 9 of top 10 apps accelerated, 35 of top 50 apps accelerated. Top 25 apps in survey: GROMACS, SIMULIA Abaqus, NAMD, AMBER, ANSYS Mechanical, Exelis IDL, MSC NASTRAN, LAMMPS, NWChem, LS-DYNA, Schrodinger, Gaussian, GAMESS, ANSYS Fluent, WRF, VASP, OpenFOAM, CHARMM, Quantum Espresso, ANSYS CFX, Star-CD, CCSM, COMSOL, Star-CCM+, BLAST. Legend: all popular functions accelerated; some popular functions accelerated; in development; not supported. Source: Intersect360, Nov 2015, "HPC Application Support for GPU Computing".
400+ GPU-Accelerated Applications www.nvidia.com/appscatalog
TESLA K80: 10X FASTER ON REAL-WORLD APPS. [Chart: K80 speedup over CPU, vertical axis 0x to 15x, categories: Benchmarks, Molecular Dynamics, Quantum Chemistry, Physics.] CPU: 12 cores, E5-2697v2 @ 2.70GHz, 64GB system memory, CentOS 6.2. GPU: single Tesla K80, Boost enabled.
BIG PROBLEMS NEED FAST COMPUTERS: 2.5x Faster than the Largest CPU Data Center. AMBER simulation of CRISPR, Nature's tool for genome editing ("Biotech discovery of the century", MIT Technology Review, 12/2014). [Chart: ns/day (0 to 50) versus number of processors, CPUs and GPUs (0 to 100); 1 node with 4 P100 GPUs compared with 48 CPU nodes of the Comet supercomputer.] AMBER 16 pre-release, CRISPR based on PDB ID 5f9r, 336,898 atoms. CPU: dual-socket Intel E5-2680v3, 12 cores, 128 GB DDR4 per node, FDR IB.
TESLA ACCELERATES DISCOVERIES. Using a supercomputer powered by the Tesla Platform with over 3,000 Tesla accelerators, University of Illinois scientists performed the first all-atom simulation of the HIV virus and discovered the chemical structure of its capsid, the perfect target for fighting the infection. Without GPUs, the supercomputer would need to be 5x larger for similar performance.
A NEW COMPUTING MODEL. Traditional computer vision: experts + time. Deep learning object detection: DNN + data + HPC. Deep learning achieves superhuman results. [Chart: ImageNet, traditional CV versus deep learning, 2009 to 2016, vertical axis 0% to 100%.]
EVERY INDUSTRY WANTS DEEP LEARNING. Cloud Service Provider: image/video classification, speech recognition, natural language processing. Medicine: cancer cell detection, diabetic grading, drug discovery. Media & Entertainment: video captioning, content-based search, real-time translation. Security & Defense: face recognition, video surveillance, cyber security. Autonomous Machines: pedestrian detection, lane tracking, traffic sign recognition.
TESLA FOR SIMULATION. Accelerated Computing Toolkit: libraries, directives, languages. Tesla Accelerated Computing.
END-TO-END PRODUCT FAMILY. Hyperscale HPC (Tesla M4, M40): hyperscale deployment for DL training, inference, video & image processing. Mixed-apps HPC (Tesla K80): HPC data centers running a mix of CPU and GPU workloads. Strong-scaling HPC (Tesla P100): hyperscale & HPC data centers running apps that scale to multiple GPUs. Fully integrated DL supercomputer (DGX-1): for customers who need to get going now with a fully integrated solution.
NVIDIA COMPUTEWORKS. NVIDIA SDKs: ComputeWorks, GameWorks, VRWorks, DesignWorks, DriveWorks, JetPack. ComputeWorks: CUDA 8, cuDNN 5, nvGRAPH, IndeX plug-in for ParaView, and other technologies such as AmgX, cuSOLVER, cuSPARSE, OpenACC, Nsight, Thrust.
GTC EUROPE: EUROPE'S BRIGHTEST MINDS & BEST IDEAS. Sep 28-29, 2016, Amsterdam. www.gputechconf.eu #GTC16. Topics: deep learning & artificial intelligence, self-driving cars, virtual reality & augmented reality, supercomputing & HPC. GTC Europe is a two-day conference designed to expose the innovative ways developers, businesses and academics are using parallel computing to transform our world. 2 days, 800 attendees, 50+ exhibitors, 50+ speakers, 15+ tracks, 15+ workshops, 1-to-1 meetings.
Q&A