Multicore Computer, GPU 및 Cluster 환경에서의 MATLAB Parallel Computing 기능

Similar documents
Mit MATLAB auf der Überholspur Methoden zur Beschleunigung von MATLAB Anwendungen

Speeding up MATLAB Applications Sean de Wolski Application Engineer

Parallel and Distributed Computing with MATLAB Gerardo Hernández Manager, Application Engineer

Parallel and Distributed Computing with MATLAB The MathWorks, Inc. 1

Optimizing and Accelerating Your MATLAB Code

Mit MATLAB auf der Überholspur Methoden zur Beschleunigung von MATLAB Anwendungen

Speeding up MATLAB Applications The MathWorks, Inc.

Getting Started with MATLAB Francesca Perino

Large Data in MATLAB: A Seismic Data Processing Case Study U. M. Sundar Senior Application Engineer

Accelerating System Simulations

Parallel Computing with MATLAB

Modeling a 4G LTE System in MATLAB

Using Parallel Computing Toolbox to accelerate the Video and Image Processing Speed. Develop parallel code interactively

High Performance and GPU Computing in MATLAB

Technical Computing with MATLAB

Scaling MATLAB. for Your Organisation and Beyond. Rory Adams The MathWorks, Inc. 1

Daniel D. Warner. May 31, Introduction to Parallel Matlab. Daniel D. Warner. Introduction. Matlab s 5-fold way. Basic Matlab Example

NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI

Accelerating Simulink Optimization, Code Generation & Test Automation Through Parallelization

MATLAB Based Optimization Techniques and Parallel Computing

MATLAB. Senior Application Engineer The MathWorks Korea The MathWorks, Inc. 2

Scaling up MATLAB Analytics Marta Wilczkowiak, PhD Senior Applications Engineer MathWorks

MathWorks Products and Prices Euro Academic January 2018

Integrate MATLAB Analytics into Enterprise Applications

Parallel Computing with MATLAB

GPU Programming. Lecture 1: Introduction. Miaoqing Huang University of Arkansas 1 / 27

General Purpose GPU Computing in Partial Wave Analysis

Data Analytics with MATLAB. Tackling the Challenges of Big Data

MATLAB Parallel Computing Toolbox Benchmark for an Embarrassingly Parallel Application

Integrate MATLAB Analytics into Enterprise Applications

What s New in MATLAB and Simulink The MathWorks, Inc. 1

MathWorks Products and Prices Euro Academic September 2016

Integrate MATLAB Analytics into Enterprise Applications

Tesla Architecture, CUDA and Optimization Strategies

CUDA PROGRAMMING MODEL Chaithanya Gadiyam Swapnil S Jadhav

Control System Design and Rapid Prototyping Using Simulink Chirag Patel Sr. Application Engineer Modeling and Simulink MathWorks India

Accelerating a Simulation of Type I X ray Bursts from Accreting Neutron Stars Mark Mackey Professor Alexander Heger

INTRODUCTION TO MATLAB PARALLEL COMPUTING TOOLBOX

What s New in MATLAB and Simulink Young Joon Lee Principal Application Engineer

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

Parallel MATLAB at VT

GPU Accelerated Solvers for ODEs Describing Cardiac Membrane Equations

How Real-Time Testing Improves the Design of a PMSM Controller

System Requirements & Platform Availability by Product for R2016b

CUDA and OpenCL Implementations of 3D CT Reconstruction for Biomedical Imaging

TOOLS FOR IMPROVING CROSS-PLATFORM SOFTWARE DEVELOPMENT

NVIDIA T4 FOR VIRTUALIZATION

MATLAB AND PARALLEL COMPUTING

Big Data con MATLAB. Lucas García The MathWorks, Inc. 1

What s New in MATLAB and Simulink

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono

Conference The Data Challenges of the LHC. Reda Tafirout, TRIUMF

Parallel Computing with Matlab and R

Implementation of the finite-difference method for solving Maxwell`s equations in MATLAB language on a GPU

GPU-Accelerated Beat Detection for Dancing Monkeys

GPUs Open New Avenues in Medical MRI

Developing Optimization Algorithms for Real-World Applications

Supporting Data Parallelism in Matcloud: Final Report

Maximize automotive simulation productivity with ANSYS HPC and NVIDIA GPUs

MathWorks Products and Prices Euro Academic March 2014

Deep learning in MATLAB From Concept to CUDA Code

Advanced CUDA Optimization 1. Introduction

Developing, Debugging, and Optimizing GPU Codes for High Performance Computing with Allinea Forge

OpenACC Course. Office Hour #2 Q&A

컴퓨터비전의최신기술 : Deep Learning, 3D Vision and Embedded Vision

The MathWorks Products and Prices Euro Academic March 2010

CUDA. Matthew Joyner, Jeremy Williams

The Virtues of Virtualization and the Microsoft Windows 10 Window of Opportunity

Using Multiple Processors for Monte Carlo Analysis of System Models

Data Analytics with MATLAB

Lecture x: MATLAB - advanced use cases

Introduction to High-Performance Computing

GPU Computing: Development and Analysis. Part 1. Anton Wijs Muhammad Osama. Marieke Huisman Sebastiaan Joosten

Hybrid Implementation of 3D Kirchhoff Migration

Massively Parallel Architectures

What s New in MATLAB and Simulink

NVIDIA s Compute Unified Device Architecture (CUDA)

NVIDIA s Compute Unified Device Architecture (CUDA)

Designing a Pick and Place Robotics Application Using MATLAB and Simulink

designing a GPU Computing Solution

MATLAB Distributed Computing Server Release Notes

Windows Compute Cluster Server 2003 allows MATLAB users to quickly and easily get up and running with distributed computing tools.

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

LUNAR TEMPERATURE CALCULATIONS ON A GPU

A TALENTED CPU-TO-GPU MEMORY MAPPING TECHNIQUE

NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU

MATLAB Distributed Computing Server (MDCS) Training

Interpreting a genetic programming population on an nvidia Tesla

Tesla GPU Computing A Revolution in High Performance Computing

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono

Fast Hardware For AI

What s New with the MATLAB and Simulink Product Families. Marta Wilczkowiak & Coorous Mohtadi Application Engineering Group

Embarquez votre Intelligence Artificielle (IA) sur CPU, GPU et FPGA

COMPARISON OF PYTHON 3 SINGLE-GPU PARALLELIZATION TECHNOLOGIES ON THE EXAMPLE OF A CHARGED PARTICLES DYNAMICS SIMULATION PROBLEM

MATLAB on BioHPC. portal.biohpc.swmed.edu Updated for

Application Development and Deployment With MATLAB

GPU Programming Using NVIDIA CUDA

GPGPU on ARM. Tom Gall, Gil Pitney, 30 th Oct 2013

Real-Time Simulation of Simscape Models

Software and Performance Engineering for numerical codes on GPU clusters

Transcription:

Multicore Computer, GPU 및 Cluster 환경에서의 MATLAB Parallel Computing 기능 성호현 MathWorks Korea 2012 The MathWorks, Inc. 1

A Question to Consider Do you want to speed up your algorithms? If so Do you have a multi-core or multi-processor computer? Do you have a high-end graphics processing unit (GPU)? 2

Agenda Introduction to Parallel Computing Tools Using Multi-core/Multi-processor Machines Using Graphics Processing Units (GPUs) 3

Utilizing Additional Processing Power Built-in multithreading In core MATLAB For specific matrix operations Automatically enabled since R2008a Parallel Computing Tools Controlled by the MATLAB user For a variety of applications Leverage CPUs and GPUs to speed applications further 4

HKM Optimizes Just-in-Time Steel Manufacturing Schedule Challenge Optimize a steel production process to enable consistent, just-in-time delivery Solution Use MATLAB and Global Optimization Toolbox to maximize throughput of more than 5 million tonnes of steel annually Results Algorithm development accelerated by a factor of 10 Optimization time cut from 1 hour to 5 minutes Customer satisfaction increased Link to user story Manually reviewed plant schedule (left) and plant schedule automatically optimized with MATLAB genetic algorithms (right). The optimized schedule minimizes schedule conflicts (in red), meets delivery dates, and achieves the target utilization rate. C++, Java, or third-party optimization solutions would have required us to spend significantly more time in development or to simplify our constraints. Only MATLAB provided the flexibility, scalability, development speed, and level of optimization that we required. Alexey Nagaytsev Hüttenwerke Krupp Mannesmann 5

Lund University Develops an Artificial Neural Network for Matching Heart Transplant Donors with Recipients Challenge Improve long-term survival rates for heart transplant recipients by identifying optimal recipient and donor matches Solution Use MathWorks tools to develop a predictive artificial neural network model and simulate thousands of riskprofile combinations on a 56-processor computing cluster Results Prospective five-year survival rate raised by up to 10% Network training time reduced by more than twothirds Simulation time cut from weeks to days Link to user story Plots showing actual and predicted survival, best and worst donorrecipient match, best and worst simulated match (left); and survival rate by duration of ischemia and donor age (right). I spend a lot of time in the clinic, and don t have the time or the technical expertise to learn, configure, and maintain software. MATLAB makes it easy for physicians like me to get work done and produce meaningful results. Dr. Johan Nilsson Skåne University Hospital Lund University 6

Massachusetts Institute of Technology Integrates Cancer Research in the Lab and Classroom with MathWorks Tools Challenge Improve diagnostic techniques for cancer by identifying proteins and analyzing their interactions Solution Use MathWorks tools to enable students and researchers to analyze mass spectrometry data, model complex protein interactions, and visualize results Results Education integrated with research Computation time shortened by an order of magnitude Research grant won 3-D visualization of the human assome of protein interactions. Using a distributed approach with MATLAB code, we ran our analysis on a computer cluster and reduced computation time by an order of magnitude from about a week to much less than a day. Dr. Gil Alterovitz MIT and Harvard University Link to user story 7

Research Engineers Advance Design of the International Linear Collider with MathWorks Tools Challenge Design a control system for ensuring the precise alignment of particle beams in the International Linear Collider Solution Use MATLAB, Simulink, Parallel Computing Toolbox, and Instrument Control Toolbox software to design, model, and simulate the accelerator and alignment control system Results Simulation time reduced by an order of magnitude Development integrated Existing work leveraged Link to user story Queen Mary high-throughput cluster. Using Parallel Computing Toolbox, we simply deployed our simulation on a large group cluster. We saw a linear improvement in speed, and we could run 100 simulations at once. MathWorks tools have enabled us to accomplish work that was once impossible. Dr. Glen White Queen Mary, University of London 8

Going Beyond Serial MATLAB Applications TOOLBOXES BLOCKSETS 9

Parallel Computing on the Desktop Desktop Computer Parallel Computing Toolbox Use Parallel Computing Toolbox Speed up parallel applications on local computer Take full advantage of desktop power by using CPUs and GPUs Separate computer cluster not required 10

Scale Up to Clusters, Grids and Clouds Desktop Computer Parallel Computing Toolbox Computer Cluster MATLAB Distributed Computing Server Scheduler 11

Parallel Computing enables you to Speed up Computations Larger Compute Pool Work with Large Data Larger Memory Pool 11 26 41 12 27 42 13 28 43 14 29 44 15 30 45 16 31 46 17 32 47 17 33 48 19 34 49 20 35 50 21 36 51 22 37 52 12

Agenda Introduction to Parallel Computing Tools Using Multi-core/Multi-processor Machines Using Graphics Processing Units (GPUs) 13

Programming Parallel Applications 14 Greater Control Ease of Use

Using Additional Cores/Processors (CPUs) Ease of Use Support built into Toolboxes Greater Control 15

Example: Built-in Support for Parallelism in Other Tools Use built-in support for Parallel Computing Toolbox in Optimization Toolbox Run optimization in parallel Use pool of MATLAB workers 16

Other Tools Providing Parallel Computing Support Optimization Toolbox Global Optimization Toolbox Statistics Toolbox Simulink Design Optimization Bioinformatics Toolbox Model-Based Calibration Toolbox TOOLBOXES BLOCKSETS Directly leverage functions in Parallel Computing Toolbox 17

Using Additional Cores/Processors (CPUs) Ease of Use Support built into Toolboxes Simple programming constructs: parfor Greater Control 18

Running Independent Tasks or Iterations Ideal problem for parallel computing No dependencies or communications between tasks Examples include parameter sweeps and Monte Carlo simulations Time Time 19

Example: Parameter Sweep of ODEs Parameter sweep of ODE system Use pool of MATLAB workers Convert for to parfor Interleave serial and parallel code Damped spring oscillator 5 m x b x k x 0 1,2,... 1,2,... Sweep through different values of b and k Record peak value for each simulation 20

Using Additional Cores/Processors (CPUs) Ease of Use Support built into Toolboxes Simple programming constructs: parfor Full control of parallelization: jobs and tasks Greater Control 21

Agenda Introduction to Parallel Computing Tools Using Multi-core/Multi-processor Machines Using Graphics Processing Units (GPUs) 22

Gaining Performance with More Hardware Using More Cores (CPUs) Using GPUs Core 1 Core 2 Core 3 Core 4 Cache Device Memory 23

What is a Graphics Processing Unit (GPU) Originally for graphics acceleration, now also used for scientific calculations Massively parallel array of integer and floating point processors Typically hundreds of processors per card GPU cores complement CPU cores Dedicated high-speed memory * Parallel Computing Toolbox requires NVIDIA GPUs with Compute Capability 1.3 or greater, including NVIDIA Tesla 10-series and 20-series products. See http://www.nvidia.com/object/cuda_gpus.html for a complete listing 24

Example: GPU Computing in the Parallel Computing Toolbox Send and create data on the GPU Run calculations with built-in GPU functions Run a custom CUDA kernel 25

Benchmark: Solving 2D Wave Equation CPU vs GPU 26

Summary of Options for Targeting GPUs Ease of Use Use GPU array interface with MATLAB built-in functions Execute custom functions on elements of the GPU array Create kernels from existing CUDA code and PTX files Greater Control 27

Performance Acceleration Options in the Parallel Computing Toolbox Technology Example MATLAB s Execution Target matlabpool parfor Required CPU cores user-defined tasks (batch processing) createjob createtask Required CPU cores GPU-based parallelism GPUArray No NVIDIA GPU with Compute Capability 1.3 or greater 28

What else has changed? Enhanced job manager security (MATLAB Distributed Computing Server) Support for local workers (MATLAB Compiler) Integration with MATLAB desktop (MATLAB, Parallel Computing Toolbox) 2 Increase in the number of local workers to 12 (Parallel Computing Toolbox) Support for GPU computation (MATLAB Compiler) Job Monitor (Parallel Computing Toolbox) 29

Summary Speed up parallel applications on desktop by using Parallel Computing Toolbox Take full advantage of CPU and GPU hardware If required, use MATLAB Distributed Computing Server to Scale up to clusters, grids and clouds Work with data that is too large to fit on to desktop computers 30

For more information Visit www.mathworks.com/products/parallel-computing 31