Accelerating System Simulations

1 Accelerating System Simulations
김용정, Manager (부장), Senior Applications Engineer
© 2013 The MathWorks, Inc.

2 Why simulation acceleration?
From algorithm exploration to system design:
  Size and complexity of models increase
  Time needed for a single simulation increases
  Number of test cases increases
  Test cases become larger
Need to reduce:
  Simulation time during design
  Simulation time for large-scale testing during prototyping

3 MATLAB is quite fast
Optimized and widely used libraries:
  BLAS: Basic Linear Algebra Subprograms (multithreaded)
  LAPACK: Linear Algebra Package
JIT (Just-In-Time) acceleration:
  On-the-fly multithreaded code generation for increased speed
Built-in support for vector and matrix operations

4 Application: LTE Physical Downlink Control Channel (PDCCH)

5 Workflow
Start with a baseline algorithm
Profile it to establish a performance yardstick
Introduce the following optimizations:
  Better MATLAB serial programming techniques
  System objects
  MATLAB to C code generation (MEX)
  Parallel computing
  GPU-optimized System objects
  Rapid Accelerator mode of simulation in Simulink

6 Simulation acceleration options in MATLAB
User's code can be accelerated with:
  Better MATLAB code
  System objects
  MATLAB to C
  Parallel computing
  GPU processing

7 Profiling MATLAB algorithms
The Profiler summarizes MATLAB code execution:
  Total time spent within each function
  Which lines of code use the most processing time
Helps identify algorithm bottlenecks
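A minimal sketch of a command-line profiling session. The workload `myAlgorithm` is a hypothetical stand-in for the baseline algorithm and is not part of the original slides; `profile on`, `profile off`, and `profile('info')` are the standard Profiler commands.

```matlab
% Hypothetical workload standing in for the baseline algorithm.
myAlgorithm = @(n) sum(svd(rand(n)));

profile on                  % start collecting execution statistics
result = myAlgorithm(200);
profile off                 % stop collecting

stats = profile('info');    % structure with per-function timing data

% List up to five functions with the largest total time.
[~, order] = sort([stats.FunctionTable.TotalTime], 'descend');
for k = order(1:min(5, numel(order)))
    fprintf('%-40s %8.4f s\n', ...
        stats.FunctionTable(k).FunctionName, ...
        stats.FunctionTable(k).TotalTime);
end
% profile viewer            % opens the interactive report (desktop only)
```

The interactive report adds the per-line breakdown that the slide refers to; the text listing above is only a quick way to rank functions from a script.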

8 Effective MATLAB programming techniques
Example of pre-allocation
Before (the array grows inside the loop):
    y=[];
    for n=1:len/tx
        G=[u(idx1(n)) u(idx2(n));...
           -conj(u(idx2(n))) conj(u(idx1(n)))];
        y=[y;G];
    end
After (pre-allocated and vectorized):
    y=complex(zeros(len,tx));
    y(idx1,1)=u(idx1);
    y(idx1,2)=u(idx2);
    y(idx2,1)=-conj(u(idx2));
    y(idx2,2)=conj(u(idx1));
Pre-allocation:
  Initialize an array using its final size
  Helps avoid dynamically resizing arrays in a loop
Vectorization:
  Convert code from using scalar loops to using matrix/vector operations
  Helps MATLAB leverage processor-optimized libraries for vector processing
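The two versions on this slide can be timed and checked against each other with a short script. This is a sketch under assumptions: the slide does not define `len`, `tx`, `idx1`, or `idx2`, so the values below (two transmit antennas, odd and even symbol positions) are chosen only to make the example runnable.

```matlab
len = 4000; tx = 2;                        % assumed sizes for illustration
u = complex(randn(len,1), randn(len,1));   % random complex input symbols
idx1 = 1:2:len;                            % assumed: odd-numbered positions
idx2 = 2:2:len;                            % assumed: even-numbered positions

% Version 1: the output array is resized on every iteration.
tic
yLoop = [];
for n = 1:len/tx
    G = [u(idx1(n)) u(idx2(n)); ...
         -conj(u(idx2(n))) conj(u(idx1(n)))];
    yLoop = [yLoop; G]; %#ok<AGROW>
end
tLoop = toc;

% Version 2: pre-allocated output filled with vectorized assignments.
tic
yVec = complex(zeros(len, tx));
yVec(idx1,1) = u(idx1);
yVec(idx1,2) = u(idx2);
yVec(idx2,1) = -conj(u(idx2));
yVec(idx2,2) = conj(u(idx1));
tVec = toc;

fprintf('loop: %.4f s, vectorized: %.4f s\n', tLoop, tVec);
assert(isequal(yLoop, yVec));              % both versions must agree
```

Single `tic`/`toc` measurements are noisy; repeating each version several times gives a more reliable comparison.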

9 Using System objects of DSP & Communications System Toolboxes
System objects facilitate stream processing
They can accelerate simulation because they:
  Decouple declaration from the execution of the algorithms
  Reduce overhead of parameter handling in the loop
  Are mostly implemented as MATLAB executables (MEX)
Example of System objects:
    function s = Alamouti_DecoderS(u,H) %#codegen
    % STBC Combiner
    persistent htddec
    if isempty(htddec)
        htddec = comm.OSTBCCombiner(...
            'NumTransmitAntennas',2,'NumReceiveAntennas',2);
    end
    s = step(htddec, u, H);

10 MATLAB to C code generation
MATLAB Coder workflow:
  Automatically generate a MEX function
  Call the generated MEX file within the testbench
  Verify that the numerical results are the same
  Compare the speed of the baseline function and the generated MEX function
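The four steps above can be sketched as follows, assuming MATLAB Coder is installed. The function `scaleAndSum` is hypothetical and is written to disk only so that the sketch is self-contained; in practice the function under test already exists as a file.

```matlab
% Write a small hypothetical function to disk (illustration only).
fid = fopen('scaleAndSum.m', 'w');
fprintf(fid, 'function y = scaleAndSum(x) %%#codegen\n');
fprintf(fid, 'y = 0;\nfor k = 1:numel(x)\n    y = y + 2*x(k);\nend\nend\n');
fclose(fid);
rehash

x = rand(1e6, 1);

% Step 1: generate a MEX function. -args supplies an example input that
% fixes the type and size; the default output name is scaleAndSum_mex.
codegen scaleAndSum -args {x}

% Step 2: call both versions from the testbench.
yRef = scaleAndSum(x);
yMex = scaleAndSum_mex(x);

% Step 3: verify that the numerical results agree.
assert(abs(yRef - yMex) <= 1e-9 * abs(yRef));

% Step 4: compare speed.
tRef = timeit(@() scaleAndSum(x));
tMex = timeit(@() scaleAndSum_mex(x));
fprintf('MATLAB: %.4f s, MEX: %.4f s\n', tRef, tMex);
```

Whether the MEX version is faster depends on the code: loops with scalar arithmetic tend to benefit, while code dominated by built-in vectorized functions often does not.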

11 Parallel Simulation Runs
[Figure: Task 1 to Task 4 distributed across four workers, each with toolboxes and blocksets; axis: Time]
>> Demo

12 Summary
matlabpool opens a pool of the available workers
No modification of the algorithm
Use a parfor loop instead of a for loop
Parallel computation or simulation leads to further acceleration
More cores = more speed
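A minimal sketch of replacing `for` with `parfor`, assuming Parallel Computing Toolbox and a release that provides `parpool` and `gcp` (these supersede `matlabpool` in newer releases). The per-iteration task is hypothetical and stands in for one independent simulation run.

```matlab
nTrials = 8;
results = zeros(1, nTrials);

% Open a pool of workers only if none is open yet.
if isempty(gcp('nocreate'))
    parpool;
end

% Iterations are independent, so they can run on different workers.
parfor k = 1:nTrials
    A = rand(300);                    % hypothetical per-run workload
    results(k) = max(abs(eig(A)));    % one scalar result per run
end

disp(results)
```

The loop body is unchanged from the serial version; only the keyword differs. The speedup is bounded by the number of workers and by how much data each iteration has to exchange with the client.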

13 Simulation acceleration options in MATLAB
User's code can be accelerated with:
  Better MATLAB code
  System objects
  MATLAB to C
  Parallel computing
  GPU processing

14 What is a Graphics Processing Unit (GPU)?
Originally for graphics acceleration, now also used for scientific calculations
Massively parallel array of integer and floating-point processors:
  Typically hundreds of processors per card
  GPU cores complement CPU cores
Dedicated high-speed memory

15 Why would you want to use a GPU?
Speed up execution of computationally intensive simulations
For example:
[Figure: Performance of A\b with double precision]

16 Options for Targeting GPUs
Ordered from ease of use to greater control:
  1) Use GPU with MATLAB built-in functions
  2) Execute MATLAB functions elementwise on the GPU
  3) Create kernels from existing CUDA code and PTX files

17 Data Transfer between MATLAB and GPU
    % Push data from CPU to GPU memory
    Agpu = gpuArray(A)
    % Bring results from GPU memory back to CPU
    B = gather(Bgpu)
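A short round-trip sketch built on the two calls above, assuming Parallel Computing Toolbox and a supported GPU. It uses the A\b solve mentioned on slide 15 as the GPU computation; the matrix size is an arbitrary choice for illustration.

```matlab
A = rand(2000);              % data in CPU (host) memory
b = rand(2000, 1);

Agpu = gpuArray(A);          % push data from CPU to GPU memory
bgpu = gpuArray(b);

xgpu = Agpu \ bgpu;          % built-in mldivide runs on the GPU
x = gather(xgpu);            % bring the result back to CPU memory

% Compare with the CPU solution; small rounding differences are expected.
xCpu = A \ b;
relErr = norm(x - xCpu) / norm(xCpu);
fprintf('relative difference CPU vs GPU: %.2e\n', relErr);
```

Keeping intermediate results on the GPU and calling `gather` once at the end avoids paying the transfer cost repeatedly.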

18 GPU Processing with Communications System Toolbox
Alternative implementations of many System objects take advantage of GPU processing
Use Parallel Computing Toolbox to execute many communications algorithms directly on the GPU
GPU System objects:
  comm.gpu.TurboDecoder
  comm.gpu.ViterbiDecoder
  comm.gpu.LDPCDecoder
  comm.gpu.PSKDemodulator
  comm.gpu.AWGNChannel
Easy-to-use syntax
Dramatically accelerate simulations

19 Example: Turbo Coding
Impressive coding gain
High computational complexity
Bit-error rate performance as a function of number of iterations:
    = comm.TurboDecoder('NumIterations', numIter,

20 Acceleration with GPU System objects
Version             Elapsed time   Acceleration
CPU                 8 hours        -
GPU                 40 minutes     12.0
Cluster of 4 GPUs   11 minutes     43.0
Same numerical results
Code changes (CPU to GPU):
    = comm.TurboDecoder(    becomes    = comm.gpu.TurboDecoder('NumIterations', N,
    = comm.AWGNChannel(     becomes    = comm.gpu.AWGNChannel(

21 Key Operations in Turbo Coding Function: CPU vs. GPU Version 1
Setup (CPU):
    % Turbo Encoder
    htenc = comm.TurboEncoder('TrellisStructure',poly2trellis(4, [13 15], 13),...
        'InterleaverIndices', intrlvrindices);
    % AWG Noise
    hawgn = comm.AWGNChannel('NoiseMethod', 'Variance');
    % BER measurement
    hber = comm.ErrorRate;
    % Turbo Decoder
    htdec = comm.TurboDecoder('TrellisStructure',poly2trellis(4, [13 15], 13),...
        'InterleaverIndices', intrlvrindices,'NumIterations', numiter);
Setup (GPU Version 1): identical to the CPU setup, except that the decoder is the GPU System object:
    % Turbo Decoder
    htdec = comm.gpu.TurboDecoder('TrellisStructure',poly2trellis(4, [13 15], 13),...
        'InterleaverIndices', intrlvrindices,'NumIterations', numiter);
Processing loop (the same for CPU and GPU Version 1):
    ber = zeros(3,1); % initialize BER output
    %% Processing loop
    % ber = [error rate; number of errors; number of bits compared]
    while ( ber(2) < MaxNumErrs && ber(3) < MaxNumBits)
        data = randn(blklength, 1)>0.5;
        % Encode random data bits
        yEnc = step(htenc, data);
        % Modulate, add noise to real bipolar data
        modout = 1-2*yEnc;
        rData = step(hawgn, modout);
        % Convert to log-likelihood ratios for decoding
        llrdata = (-2/noiseVar).*rData;
        % Turbo Decode
        decdata = step(htdec, llrdata);
        % Calculate errors
        ber = step(hber, data, decdata);
    end

22 Profile results in Turbo Coding Function: CPU vs. GPU Version 1
Profiler time per line of the code on slide 21 (the decoder call has no value in this transcript):
Line                                        CPU         GPU Version 1
Setup: htenc = comm.TurboEncoder(...)       <0.01       <0.01
Setup: hawgn = comm.AWGNChannel(...)        <0.01       <0.01
Setup: hber = comm.ErrorRate                <0.01       <0.01
Setup: htdec = turbo decoder (CPU or GPU)   <0.01       0.02
ber = zeros(3,1)                            <0.01       <0.01
data = randn(blklength, 1)>0.5              0.30        0.28
yEnc = step(htenc, data)                    2.33        2.38
modout = 1-2*yEnc                           0.05        0.05
rData = step(hawgn, modout)                 1.50        1.45
llrdata = (-2/noiseVar).*rData              0.03        0.04
decdata = step(htdec, llrdata)              (missing)   (missing)
ber = step(hber, data, decdata)             0.17        0.17

23 Key Operations in Turbo Coding Function: CPU vs. GPU Version 2
CPU: unchanged from the CPU code on slide 21 (comm.AWGNChannel and comm.TurboDecoder, one frame per step).
GPU Version 2: the noise channel also runs on the GPU, and the decoder processes several frames per step:
    % Turbo Encoder
    htenc = comm.TurboEncoder('TrellisStructure',poly2trellis(4, [13 15], 13),...
        'InterleaverIndices', intrlvrindices);
    % AWG Noise
    hawgn = comm.gpu.AWGNChannel('NoiseMethod', 'Variance');
    % BER measurement
    hber = comm.ErrorRate;
    % Turbo Decoder - setup for Multi-frame or Multi-user processing
    numFrames = 30;
    htdec = comm.gpu.TurboDecoder('TrellisStructure',poly2trellis(4, [13 15], 13),...
        'InterleaverIndices', intrlvrindices,'NumIterations',numiter,...
        'NumFrames',numFrames);
    ber = zeros(3,1); % initialize BER output
    %% Processing loop
    % ber = [error rate; number of errors; number of bits compared]
    while ( ber(2) < MaxNumErrs && ber(3) < MaxNumBits)
        data = randn(numFrames*blklength, 1)>0.5;
        % Encode random data bits
        yEnc = gpuArray(multiframestep(htenc, data, numFrames));
        % Modulate, add noise to real bipolar data
        modout = 1-2*yEnc;
        rData = step(hawgn, modout);
        % Convert to log-likelihood ratios for decoding
        llrdata = (-2/noiseVar).*rData;
        % Turbo Decode
        decdata = step(htdec, llrdata);
        % Calculate errors
        ber = step(hber, data, gather(decdata));
    end

24 Profile results in Turbo Coding Function: CPU vs. GPU Version 2
Profiler time per line of the code on slide 23 (the decoder call has no value in this transcript):
Line                                        CPU         GPU Version 2
Setup: htenc = comm.TurboEncoder(...)       <0.01       <0.01
Setup: hawgn = AWGN channel (CPU or GPU)    <0.01       0.03
Setup: hber = comm.ErrorRate                <0.01       <0.01
Setup: numFrames = 30                       -           0.01
Setup: htdec = turbo decoder (CPU or GPU)   <0.01       0.01
data = randn(...)>0.5                       0.30        0.22
yEnc = encode (multiframestep on GPU side)  2.33        2.45
modout = 1-2*yEnc                           0.05        0.02
rData = step(hawgn, modout)                 1.50        0.31
llrdata = (-2/noiseVar).*rData              0.03        0.01
decdata = step(htdec, llrdata)              (missing)   (missing)
ber = step(hber, data, ...)                 0.17        0.09

25 Things to note when targeting GPU
Minimize data transfer between CPU and GPU.
Using a GPU only makes sense if the data size is large.
Some functions in MATLAB are optimized and can be faster than the GPU equivalent (e.g. FFT).
Use arrayfun to explicitly specify elementwise operations.
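The first and last points can be illustrated together. This is a sketch assuming Parallel Computing Toolbox and a supported GPU; the `limiter` function is hypothetical and chosen only because it is purely elementwise.

```matlab
% Hypothetical elementwise operation: a soft limiter applied per sample.
limiter = @(v) v ./ (1 + abs(v));

x    = gpuArray(randn(1e6, 1));   % one transfer to the GPU
yGpu = arrayfun(limiter, x);      % evaluated elementwise on the GPU
y    = gather(yGpu);              % one transfer back to the CPU
```

With `arrayfun` on a `gpuArray`, the function body is applied to each element, so the whole expression runs as a single GPU operation instead of one operation per arithmetic step. For a small input the two transfers would dominate, which is why the slide recommends large data sizes.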

26 Summary: Acceleration methodologies in MATLAB & Simulink
1. Best Practices in Programming
     Technology: Vectorization & pre-allocation; environment tools (e.g. Profiler, Code Analyzer)
     Product: MATLAB, Toolboxes, System Toolboxes
2. Better Algorithms
     Technology: Ideal environment for algorithm exploration; rich set of functionality (e.g. System objects)
     Product: MATLAB, Toolboxes, System Toolboxes
3. More Processors or Cores
     Technology: High-level parallel constructs (e.g. parfor, matlabpool); utilize clusters, clouds, and grids
     Product: Parallel Computing Toolbox, MATLAB Distributed Computing Server
4. Refactoring the Implementation
     Technology: Compiled code (MEX); GPUs, FPGA-in-the-Loop
     Product: MATLAB, MATLAB Coder, Parallel Computing Toolbox

27 Thank You
Q & A


More information

2015 The MathWorks, Inc. 1

2015 The MathWorks, Inc. 1 2015 The MathWorks, Inc. 1 MATLAB 의 C 코드생성 워크플로우및최적화요령 정승혁과장 2015 The MathWorks, Inc. 2 MATLAB Coder User Story Using MATLAB Try a new idea quickly Evaluation of the system by testing and analysis High

More information

Designing and Targeting Video Processing Subsystems for Hardware

Designing and Targeting Video Processing Subsystems for Hardware 1 Designing and Targeting Video Processing Subsystems for Hardware 정승혁과장 Senior Application Engineer MathWorks Korea 2017 The MathWorks, Inc. 2 Pixel-stream Frame-based Process : From Algorithm to Hardware

More information

Using Intel Math Kernel Library with MathWorks* MATLAB* on Intel Xeon Phi Coprocessor System

Using Intel Math Kernel Library with MathWorks* MATLAB* on Intel Xeon Phi Coprocessor System Using Intel Math Kernel Library with MathWorks* MATLAB* on Intel Xeon Phi Coprocessor System Overview This guide is intended to help developers use the latest version of Intel Math Kernel Library (Intel

More information

Stream Processing with CUDA TM A Case Study Using Gamebryo's Floodgate Technology

Stream Processing with CUDA TM A Case Study Using Gamebryo's Floodgate Technology Stream Processing with CUDA TM A Case Study Using Gamebryo's Floodgate Technology Dan Amerson, Technical Director, Emergent Game Technologies Purpose Why am I giving this talk? To answer this question:

More information

CUDA Programming Model

CUDA Programming Model CUDA Xing Zeng, Dongyue Mou Introduction Example Pro & Contra Trend Introduction Example Pro & Contra Trend Introduction What is CUDA? - Compute Unified Device Architecture. - A powerful parallel programming

More information

2015 The MathWorks, Inc. 1

2015 The MathWorks, Inc. 1 2015 The MathWorks, Inc. 1 웨어러블디바이스의신호분석 Senior Application Engineer 김종남 2015 The MathWorks, Inc. 2 Agenda Internet Of Things Signal Analytics and Classification : On data from wareable and mobile device

More information

Design and Verify Embedded Signal Processing Systems Using MATLAB and Simulink

Design and Verify Embedded Signal Processing Systems Using MATLAB and Simulink Design and Verify Embedded Signal Processing Systems Using MATLAB and Simulink Giorgia Zucchelli, Application Engineer, MathWorks 10 January 2013, Technical University Eindhoven 2013 The MathWorks, Inc.

More information

CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC. Guest Lecturer: Sukhyun Song (original slides by Alan Sussman)

CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC. Guest Lecturer: Sukhyun Song (original slides by Alan Sussman) CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC Guest Lecturer: Sukhyun Song (original slides by Alan Sussman) Parallel Programming with Message Passing and Directives 2 MPI + OpenMP Some applications can

More information

SDACCEL DEVELOPMENT ENVIRONMENT. The Xilinx SDAccel Development Environment. Bringing The Best Performance/Watt to the Data Center

SDACCEL DEVELOPMENT ENVIRONMENT. The Xilinx SDAccel Development Environment. Bringing The Best Performance/Watt to the Data Center SDAccel Environment The Xilinx SDAccel Development Environment Bringing The Best Performance/Watt to the Data Center Introduction Data center operators constantly seek more server performance. Currently

More information

Model-Based Embedded System Design

Model-Based Embedded System Design Model-Based Embedded System Design Pieter J. Mosterman Senior Research Scientist The MathW orks, Inc. 2007 The MathWorks, Inc. Agenda Introduction Embedded Systems Design Demo A Design Activity Dynamic

More information

Supporting Data Parallelism in Matcloud: Final Report

Supporting Data Parallelism in Matcloud: Final Report Supporting Data Parallelism in Matcloud: Final Report Yongpeng Zhang, Xing Wu 1 Overview Matcloud is an on-line service to run Matlab-like script on client s web browser. Internally it is accelerated by

More information

Introducing Simulink R2012b for Signal Processing & Communications Graham Reith Senior Team Leader, UK Application Engineering

Introducing Simulink R2012b for Signal Processing & Communications Graham Reith Senior Team Leader, UK Application Engineering Introducing Simulink R2012b for Signal Processing & Communications Graham Reith Senior Team Leader, UK Application Engineering 2012 The MathWorks, Inc. 1 Simulink R2012b the most significant upgrade to

More information

RTW SUPPORT FOR PARALLEL 64bit ALPHA AXP-BASED PLATFORMS. Christian Vialatte, Jiri Kadlec,

RTW SUPPORT FOR PARALLEL 64bit ALPHA AXP-BASED PLATFORMS. Christian Vialatte, Jiri Kadlec, RTW SUPPORT FOR PARALLEL 64bit ALPHA AXP-BASED PLATFORMS Christian Vialatte, Jiri Kadlec, Introduction Presentation of software supporting the Real-Time Workshop (Matlab 5.3), targeting AD66 ISA and AD66-PCI

More information

Using a GPU in InSAR processing to improve performance

Using a GPU in InSAR processing to improve performance Using a GPU in InSAR processing to improve performance Rob Mellors, ALOS PI 152 San Diego State University David Sandwell University of California, San Diego What is a GPU? (Graphic Processor Unit) A graphics

More information

What s New in MATLAB and Simulink

What s New in MATLAB and Simulink What s New in MATLAB Simulink Fabrizio Sara 2015 The MathWorks, Inc. 1 Engineers scientists 2 Engineers scientists Develop algorithms Analyze data write MATLAB code. 3 Engineers scientists deploy algorithms

More information

Georgia Institute of Technology Center for Signal and Image Processing Steve Conover February 2009

Georgia Institute of Technology Center for Signal and Image Processing Steve Conover February 2009 Georgia Institute of Technology Center for Signal and Image Processing Steve Conover February 2009 Introduction CUDA is a tool to turn your graphics card into a small computing cluster. It s not always

More information

Accelerate FPGA Prototyping with

Accelerate FPGA Prototyping with Accelerate FPGA Prototyping with MATLAB and Simulink September 21 st 2010 Stephan van Beek Senior Application Engineer 1 From Idea to Implementation DESIGN Algorithm Development MATLAB Simulink Stateflow

More information

Spartan -6 LX150T Development Kit Hardware Co-Simulation Reference Design User Guide

Spartan -6 LX150T Development Kit Hardware Co-Simulation Reference Design User Guide Spartan -6 LX150T Development Kit H/W Co-Simulation Reference Design User Guide Spartan -6 LX150T Development Kit Hardware Co-Simulation Reference Design User Guide Version 0.8 Revision History Version

More information

Coarse Grain Reconfigurable Arrays are Signal Processing Engines!

Coarse Grain Reconfigurable Arrays are Signal Processing Engines! Coarse Grain Reconfigurable Arrays are Signal Processing Engines! Advanced Topics in Telecommunications, Algorithms and Implementation Platforms for Wireless Communications, TLT-9707 Waqar Hussain Researcher

More information

Data Analytics with MATLAB. Tackling the Challenges of Big Data

Data Analytics with MATLAB. Tackling the Challenges of Big Data Data Analytics with MATLAB Tackling the Challenges of Big Data How big is big? What characterises big data? Any collection of data sets so large and complex that it becomes difficult to process using traditional

More information

Higher Level Programming Abstractions for FPGAs using OpenCL

Higher Level Programming Abstractions for FPGAs using OpenCL Higher Level Programming Abstractions for FPGAs using OpenCL Desh Singh Supervising Principal Engineer Altera Corporation Toronto Technology Center ! Technology scaling favors programmability CPUs."#/0$*12'$-*

More information

Modeling and Simulating Social Systems with MATLAB

Modeling and Simulating Social Systems with MATLAB Modeling and Simulating Social Systems with MATLAB Lecture 6 Optimization and Parallelization Olivia Woolley, Tobias Kuhn, Dario Biasini, Dirk Helbing Chair of Sociology, in particular of Modeling and

More information

Parallel Processing Tool-box

Parallel Processing Tool-box Parallel Processing Tool-box Start up MATLAB in the regular way. This copy of MATLAB that you start with is called the "client" copy; the copies of MATLAB that will be created to assist in the computation

More information

Parallel Computing with Matlab and R

Parallel Computing with Matlab and R Parallel Computing with Matlab and R scsc@duke.edu https://wiki.duke.edu/display/scsc Tom Milledge tm103@duke.edu Overview Running Matlab and R interactively and in batch mode Introduction to Parallel

More information

Designing and Prototyping Digital Systems on SoC FPGA The MathWorks, Inc. 1

Designing and Prototyping Digital Systems on SoC FPGA The MathWorks, Inc. 1 Designing and Prototyping Digital Systems on SoC FPGA Hitu Sharma Application Engineer Vinod Thomas Sr. Training Engineer 2015 The MathWorks, Inc. 1 What is an SoC FPGA? A typical SoC consists of- A microcontroller,

More information

A Design Framework for Mapping Vectorized Synchronous Dataflow Graphs onto CPU-GPU Platforms

A Design Framework for Mapping Vectorized Synchronous Dataflow Graphs onto CPU-GPU Platforms A Design Framework for Mapping Vectorized Synchronous Dataflow Graphs onto CPU-GPU Platforms Shuoxin Lin, Yanzhou Liu, William Plishker, Shuvra Bhattacharyya Maryland DSPCAD Research Group Department of

More information

Hardware and Software Co-Design for Motor Control Applications

Hardware and Software Co-Design for Motor Control Applications Hardware and Software Co-Design for Motor Control Applications GianCarlo Pacitti Senior Application Engineer, MathWorks 2015 The MathWorks, Inc. 1 Agenda Why use Hardware and Software for motor control?

More information

What s New for MATLAB David Willingham

What s New for MATLAB David Willingham What s New for MATLAB David Willingham 2015 The MathWorks, Inc. 1 MATLAB Execution Engine Redesigned execution engine runs MATLAB code faster All MATLAB code is now JIT compiled A platform for future improvements

More information

Developing a Data Driven System for Computational Neuroscience

Developing a Data Driven System for Computational Neuroscience Developing a Data Driven System for Computational Neuroscience Ross Snider and Yongming Zhu Montana State University, Bozeman MT 59717, USA Abstract. A data driven system implies the need to integrate

More information

CUDA PROGRAMMING MODEL Chaithanya Gadiyam Swapnil S Jadhav

CUDA PROGRAMMING MODEL Chaithanya Gadiyam Swapnil S Jadhav CUDA PROGRAMMING MODEL Chaithanya Gadiyam Swapnil S Jadhav CMPE655 - Multiple Processor Systems Fall 2015 Rochester Institute of Technology Contents What is GPGPU? What s the need? CUDA-Capable GPU Architecture

More information

What's new in MATLAB and Simulink for Model-Based Design

What's new in MATLAB and Simulink for Model-Based Design What's new in MATLAB and Simulink for Model-Based Design Magnus Jung Application Engineer 2016 The MathWorks, Inc. 1 What s New? 2 Model-Based Design Workflow RESEARCH REQUIREMENTS DESIGN Scheduling Event

More information

Parallel Computing with MATLAB on Discovery Cluster

Parallel Computing with MATLAB on Discovery Cluster Parallel Computing with MATLAB on Discovery Cluster Northeastern University Research Computing: Nilay K Roy, MS Computer Science, Ph.D Computational Physics Lets look at the Discovery Cluster Matlab environment

More information

Advanced CUDA Optimization 1. Introduction

Advanced CUDA Optimization 1. Introduction Advanced CUDA Optimization 1. Introduction Thomas Bradley Agenda CUDA Review Review of CUDA Architecture Programming & Memory Models Programming Environment Execution Performance Optimization Guidelines

More information

Hardware Implementation and Verification by Model-Based Design Workflow - Communication Models to FPGA-based Radio

Hardware Implementation and Verification by Model-Based Design Workflow - Communication Models to FPGA-based Radio Hardware Implementation and Verification by -Based Design Workflow - Communication s to FPGA-based Radio Katsuhisa Shibata Industry Marketing MathWorks Japan 2015 The MathWorks, Inc. 1 Agenda Challenges

More information

CUDA and OpenCL Implementations of 3D CT Reconstruction for Biomedical Imaging

CUDA and OpenCL Implementations of 3D CT Reconstruction for Biomedical Imaging CUDA and OpenCL Implementations of 3D CT Reconstruction for Biomedical Imaging Saoni Mukherjee, Nicholas Moore, James Brock and Miriam Leeser September 12, 2012 1 Outline Introduction to CT Scan, 3D reconstruction

More information

Vidushi: Parallel Implementation of Alpha Miner Algorithm and Performance Analysis on CPU and GPU Architecture

Vidushi: Parallel Implementation of Alpha Miner Algorithm and Performance Analysis on CPU and GPU Architecture Vidushi: Parallel Implementation of Alpha Miner Algorithm and Performance Analysis on CPU and GPU Architecture Divya Kundra Computer Science Indraprastha Institute of Information Technology, Delhi (IIIT-D),

More information

Porting the NAS-NPB Conjugate Gradient Benchmark to CUDA. NVIDIA Corporation

Porting the NAS-NPB Conjugate Gradient Benchmark to CUDA. NVIDIA Corporation Porting the NAS-NPB Conjugate Gradient Benchmark to CUDA NVIDIA Corporation Outline! Overview of CG benchmark! Overview of CUDA Libraries! CUSPARSE! CUBLAS! Porting Sequence! Algorithm Analysis! Data/Code

More information

GPU-Accelerated Beat Detection for Dancing Monkeys

GPU-Accelerated Beat Detection for Dancing Monkeys GPU-Accelerated Beat Detection for Dancing Monkeys Philip Peng University of Pennsylvania Yanjie Feng University of Pennsylvania Abstract In music-based rhythm games, the game system needs to create patterns

More information

MATLAB to iphone Made Easy

MATLAB to iphone Made Easy MATLAB to iphone Made Easy Generating readable and portable C code from your MATLAB algorithms for your iphone or ipad app Bill Chou 2014 The MathWorks, Inc. 1 2 4 Quick Demo MATLAB Coder >> Demo 5 Agenda

More information

System Requirements & Platform Availability by Product for R2016b

System Requirements & Platform Availability by Product for R2016b & Platform Availability by Product for R2016b View general system requirements. Product Aerospace Blockset Requires Aerospace Control recommended Aerospace Antenna RF recommended Phased Array recommended

More information

The Use of Computing Clusters and Automatic Code Generation to Speed Up Simulation Tasks

The Use of Computing Clusters and Automatic Code Generation to Speed Up Simulation Tasks The Use of Computing Clusters and Automatic Code Generation to Speed Up Simulation Tasks Jason R. Ghidella 1, Amory Wakefield 2, Silvina Grad-Freilich 3, Jon Friedman 4 and Vinod Cherian 5 The MathWorks,

More information

Dynamic Cuda with F# HPC GPU & F# Meetup. March 19. San Jose, California

Dynamic Cuda with F# HPC GPU & F# Meetup. March 19. San Jose, California Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff daniel.egloff@quantalea.net +41 44 520 01 17 +41 79 430 03 61 About Us! Software development and consulting company!

More information