Mit MATLAB auf der Überholspur Methoden zur Beschleunigung von MATLAB Anwendungen

Similar documents
Mit MATLAB auf der Überholspur Methoden zur Beschleunigung von MATLAB Anwendungen

Speeding up MATLAB Applications Sean de Wolski Application Engineer

Parallel and Distributed Computing with MATLAB Gerardo Hernández Manager, Application Engineer

Parallel and Distributed Computing with MATLAB The MathWorks, Inc. 1

Multicore Computer, GPU 및 Cluster 환경에서의 MATLAB Parallel Computing 기능

Optimizing and Accelerating Your MATLAB Code

Speeding up MATLAB Applications The MathWorks, Inc.

Getting Started with MATLAB Francesca Perino

Parallel Computing with MATLAB

Accelerating System Simulations

Scaling MATLAB. for Your Organisation and Beyond. Rory Adams The MathWorks, Inc. 1

INTRODUCTION TO MATLAB PARALLEL COMPUTING TOOLBOX

Modeling a 4G LTE System in MATLAB

Integrate MATLAB Analytics into Enterprise Applications

Large Data in MATLAB: A Seismic Data Processing Case Study U. M. Sundar Senior Application Engineer

High Performance and GPU Computing in MATLAB

Using Parallel Computing Toolbox to accelerate the Video and Image Processing Speed. Develop parallel code interactively

Integrate MATLAB Analytics into Enterprise Applications

MATLAB AND PARALLEL COMPUTING

CUDA and OpenCL Implementations of 3D CT Reconstruction for Biomedical Imaging

HPC with Multicore and GPUs

Introduction to Matlab GPU Acceleration for. Computational Finance. Chuan- Hsiang Han 1. Section 1: Introduction

Lecture x: MATLAB - advanced use cases

MATLAB Parallel Computing Toolbox Benchmark for an Embarrassingly Parallel Application

Scaling up MATLAB Analytics Marta Wilczkowiak, PhD Senior Applications Engineer MathWorks

Parallel Computing with MATLAB on Discovery Cluster

Technical Report TR

Parallel Computing with MATLAB

Parallel Computing with MATLAB

Parallel Computing with MATLAB

Accelerating Implicit LS-DYNA with GPU

Experiment 6 SIMULINK

MATLAB. Senior Application Engineer The MathWorks Korea The MathWorks, Inc. 2

Accelerating Simulink Optimization, Code Generation & Test Automation Through Parallelization

Integrate MATLAB Analytics into Enterprise Applications

Implementation of the finite-difference method for solving Maxwell`s equations in MATLAB language on a GPU

Embarquez votre Intelligence Artificielle (IA) sur CPU, GPU et FPGA

Accelerating Finite Element Analysis in MATLAB with Parallel Computing

Matlab for Engineers

Parallel MATLAB at VT

Speedup Altair RADIOSS Solvers Using NVIDIA GPU

Deep learning in MATLAB From Concept to CUDA Code

MATLAB Distributed Computing Server Release Notes

High-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs

What s New in MATLAB and Simulink The MathWorks, Inc. 1

Evaluating the MATLAB Parallel Computing Toolbox Kashif Hussain Computer Science 2012/2013

Data Analytics with MATLAB. Tackling the Challenges of Big Data

GPU Acceleration of SAR/ISAR Imaging Algorithms

Daniel D. Warner. May 31, Introduction to Parallel Matlab. Daniel D. Warner. Introduction. Matlab s 5-fold way. Basic Matlab Example

A Comprehensive Study on the Performance of Implicit LS-DYNA

GPU Accelerated Solvers for ODEs Describing Cardiac Membrane Equations

GPGPUs in HPC. VILLE TIMONEN Åbo Akademi University CSC

Particle-in-Cell Simulations on Modern Computing Platforms. Viktor K. Decyk and Tajendra V. Singh UCLA

MATLAB Based Optimization Techniques and Parallel Computing

High-Performance Data Loading and Augmentation for Deep Neural Network Training

Parallel Direct Simulation Monte Carlo Computation Using CUDA on GPUs

GPU-Accelerated Beat Detection for Dancing Monkeys

Vidushi: Parallel Implementation of Alpha Miner Algorithm and Performance Analysis on CPU and GPU Architecture

Handling and Processing Big Data for Biomedical Discovery with MATLAB

Experiment 8 SIMULINK

Improving the Performance of the Molecular Similarity in Quantum Chemistry Fits. Alexander M. Cappiello

Model-Based Design: Design with Simulation in Simulink

Parallel Computing with Matlab and R

ACCELERATING THE PRODUCTION OF SYNTHETIC SEISMOGRAMS BY A MULTICORE PROCESSOR CLUSTER WITH MULTIPLE GPUS

IBM Power AC922 Server

GPUs Open New Avenues in Medical MRI

VSC Users Day 2018 Start to GPU Ehsan Moravveji

Real-Time Simulation of Simscape Models

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 6. Parallel Processors from Client to Cloud

Data Analytics with MATLAB

Machine Learning for (fast) simulation

TESLA V100 PERFORMANCE GUIDE. Life Sciences Applications

Session S0069: GPU Computing Advances in 3D Electromagnetic Simulation

Matlab: Parallel Computing Toolbox

TOOLS FOR IMPROVING CROSS-PLATFORM SOFTWARE DEVELOPMENT

SEASHORE / SARUMAN. Short Read Matching using GPU Programming. Tobias Jakobi

High performance 2D Discrete Fourier Transform on Heterogeneous Platforms. Shrenik Lad, IIIT Hyderabad Advisor : Dr. Kishore Kothapalli

CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC. Guest Lecturer: Sukhyun Song (original slides by Alan Sussman)

General Purpose GPU Computing in Partial Wave Analysis

Introduction to Parallel Programming

Performance Analysis of Matlab Code and PCT

Summer 2009 REU: Introduction to Some Advanced Topics in Computational Mathematics

Technical Computing with MATLAB

A TALENTED CPU-TO-GPU MEMORY MAPPING TECHNIQUE

Approaches to acceleration: GPUs vs Intel MIC. Fabio AFFINITO SCAI department

Master Class: Diseño de Sistemas Mecatrónicos

ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016

Distributed Training of Deep Neural Networks: Theoretical and Practical Limits of Parallel Scalability

Matlab s Parallel Computation Toolbox

Stream Processing with CUDA TM A Case Study Using Gamebryo's Floodgate Technology

Arm Processor Technology Update and Roadmap

N-Body Simulation using CUDA. CSE 633 Fall 2010 Project by Suraj Alungal Balchand Advisor: Dr. Russ Miller State University of New York at Buffalo

What s New in MATLAB and Simulink Young Joon Lee Principal Application Engineer

ANSYS Improvements to Engineering Productivity with HPC and GPU-Accelerated Simulation

MATLAB Distributed Computing Server (MDCS) Training

Finite Element Integration and Assembly on Modern Multi and Many-core Processors

NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI

How GPUs can find your next hit: Accelerating virtual screening with OpenCL. Simon Krige

Simulation and Benchmarking of Modelica Models on Multi-core Architectures with Explicit Parallel Algorithmic Language Extensions

MICROWAY S NVIDIA TESLA V100 GPU SOLUTIONS GUIDE

Transcription:

Mit MATLAB auf der Überholspur Methoden zur Beschleunigung von MATLAB Anwendungen Frank Graeber Application Engineering MathWorks Germany 2013 The MathWorks, Inc. 1

Speed up the serial code within core MATLAB Use the MATLAB Profiler to identify performance bottlenecks Techniques for addressing performance Vectorization Preallocation Consider readability and maintainability Looping vs. matrix operations Subscripted vs. linear vs. logical and more Public Webinar: https://www.mathworks.com/company/events/webinars/wbnr71341.html 2

The Quest for Performance Once you improved the performance of your serial MATLAB code Do you need to reduce your run-time further? Do you need to solve larger problems? If so, do you have access to a Multi-core or multi-processor computer? Computer cluster or cloud? Graphics processing unit (GPU)? 3

Utilizing Additional Processing Power Built-in multithreading Automatically enabled in MATLAB since R2008a Multiple threads in a single MATLAB computation engine Parallel Computing using explicit techniques Multiple computation engines on CPU controlled by a single session Parallel programming constructs for interactive and batch processing GPU programming for massively parallel problems 4

Parallel Computing for the Desktop Desktop Computer Local Implement & test parallel applications Gain speed on multicore systems Up to 12 local workers (computation engines) Take advantage of GPUs MATLAB Desktop (Client) Parallel Computing Toolbox Prototype code for your cluster 5

Scheduler Scaling Up to Clusters and Clouds Desktop Computer Computer Cluster Local Cluster MATLAB Desktop (Client) Parallel Computing Toolbox MATLAB Distributed Computing Server 6

Going Beyond Serial MATLAB Applications MATLAB Desktop (Client) Worker Worker Worker Worker Worker Worker Pool of Workers 7

Ease of Use Programming Parallel Applications (CPU) Built-in support with Toolboxes Simple programming constructs: parfor, batch, distributed, Advanced programming constructs: createjob, labsend, spmd, Greater Control 8

Tools Providing Parallel Computing Support Optimization Toolbox, Global Optimization Toolbox Statistics Toolbox Signal Processing Toolbox Neural Network Toolbox Image Processing Toolbox and more BLOCKSETS Directly leverage functions in Parallel Computing Toolbox www.mathworks.com/products/parallel-computing/builtin-parallel-support.html 9

Displacement (x) Example: Parameter Sweep of ODEs Parallel for-loops Simulink Parameter sweep of ODE system Damped spring oscllator (also in Simulink) Sweep through different values of damping and stiffness Record peak value for each simulation 1.2 1 0.8 0.6 0.4 0.2 m = 5, b = 2, k = 2 k b x m 5 m x b x k x 0 1,2,... 1,2,... Convert for to parfor 0-0.2 m = 5, b = 5, k = 5-0.4 0 5 10 15 20 25 Time (s) Use pool of MATLAB workers 10

Benchmark: Parameter Sweep of ODEs Scaling case study Computation time (minutes) Workers Number of combinations 961 3844 15625 62500 250000 1 5 20 81 325 1292 16 0.6 2 6 25 98 32 0.4 1 3 12 47 64 0.5 0.6 1.7 6 23 96 0.5 0.7 1.3 4 16 128 0.6 0.7 1.1 3.1 12 Processor: Intel Xeon E5-2670 16 cores per node 11

GPU Programming with MATLAB One computation engine (client or worker) per GPU Massively parallel: Calculations can be broken into hundreds or thousands of independent units of work Problem size takes advantage of many GPU cores Computationally intensive: Computation time significantly exceeds CPU/GPU data transfer time 12

Ease of Use GPU Programming Options Built-in support with Toolboxes Simple programming constructs: gpuarray, gather Advanced programming constructs: arrayfun, bsxfun, spmd Interface for experts: CUDAKernel, MEX support Greater Control 13

Example: Solving 2D Wave Equation GPU Computing Solve 2 nd order wave equation using spectral methods: 2 u t 2 = 2 u x 2 + 2 u y 2 Run both on CPU and GPU Using gpuarray and overloaded functions (e.g., fft and ifft) 14

Comparison of Code 15

Benchmark: Solving 2D Wave Equation CPU vs GPU Grid Size CPU (s) GPU (s) Speedup 64 x 64 0.05 0.15 0.32 128 x 128 0.13 0.15 0.88 256 x 256 0.47 0.15 3.12 512 x 512 2.22 0.27 8.10 1024 x 1024 10.80 0.88 12.31 2048 x 2048 54.60 3.84 14.22 Intel Xeon Processor W3690 (3.47GHz), NVIDIA Tesla K20 GPU 16

Learn More http://www.mathworks.de/parallel-computing http://www.mathworks.de/products/parallel-computing/ 17

Key Takeaways Analyze your code for bottlenecks using the MATLAB profiler Consider performance benefit of vector and matrix operations in MATLAB Leverage parallel computing tools on Multicore CPUs Compute clusters or clouds Graphics processing units (GPUs) On the use of multiple GPUs: http://blogs.mathworks.com/loren/2013/06/24/running-monte-carlo-simulations-on-multiple-gpus/ 18