Advanced Simulation Library Expanding software ecosystem for the DSP/FPGA/GPU market. September 3, 2015

1 Advanced Simulation Library Expanding software ecosystem for the DSP/FPGA/GPU market September 3, 2015

2 ASL
Advanced Simulation Library (ASL) is a free and open-source, OpenCL-based multiphysics simulation software package.
Figure 1: CFD example, velocity field
ASL entered all major Linux distributions in record time!

3 ASL Features
1. Sophisticated algorithms and hardware acceleration
2. Pure C++ API, OpenCL-based engine
3. Cross-platform: Linux, Unix, Windows, Mac OS X
4. Free AGPLv3 license
5. Various physical and chemical phenomena supported
6. Generation and manipulation of geometric primitives
7. Interfacing with well-known software: VTK/ParaView, MATLAB

4 Value proposition I
1. Sophisticated algorithms and hardware acceleration
   - Uses only methods that allow efficient parallelization: Lattice Boltzmann method, explicit finite differences, matrix-free finite elements, etc.
   - Exploits all available hardware resources, such as SIMD and local cache
   - Carefully chosen numerical techniques spare the user the solid expertise in numerical modeling that was required in the past
   - Extraordinarily high performance!
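To illustrate why explicit schemes parallelize well, here is a minimal Python sketch of one explicit finite-difference step for 1D diffusion. It is not ASL code and does not use the ASL API; the function name `diffusion_step`, the parameter `alpha` (standing for D·dt/dx²) and the sample field are hypothetical, chosen only for illustration. Each new value depends solely on values from the previous step, so every interior point could be updated independently, for example one point per GPU work-item.

```python
def diffusion_step(u, alpha):
    """Return the field after one explicit finite-difference diffusion step.

    u     -- list of values on a uniform 1D grid (end points are held fixed)
    alpha -- D * dt / dx**2; the explicit scheme is stable only for alpha <= 0.5
    """
    if not 0.0 < alpha <= 0.5:
        raise ValueError("alpha must be in (0, 0.5] for a stable explicit step")
    new_u = list(u)  # boundary values are copied unchanged
    for i in range(1, len(u) - 1):
        # Reads only old values u[i-1], u[i], u[i+1]: iterations are independent,
        # so this loop body is what a parallel kernel would run per grid point.
        new_u[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new_u


# Illustrative use: a single spike spreading out over three steps.
field = [0.0, 0.0, 1.0, 0.0, 0.0]
for _ in range(3):
    field = diffusion_step(field, alpha=0.25)
print(field)
```

Because the update writes to a separate output list, no iteration ever reads a value written in the same step; this is the property that lets such methods map onto SIMD units and massively parallel devices.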

5 Value proposition II
2. Pure C++ API, OpenCL-based engine
   - The pure C++ API eliminates the long initial learning curve required to master OpenCL
   - High-level C++ routines: no need for the user to operate at the low hardware level
   - Implementing the engine in OpenCL enables deployment of applications on a variety of massively parallel architectures, ranging from inexpensive embedded FPGAs, DSPs and GPUs up to heterogeneous clusters/supercomputers (multiple multi-core CPUs, multiple GPUs/APUs, multiple nodes)
   - The flexible design allows new engine back-ends (such as CUDA) to be added in the future
3. Cross-platform: Linux, Unix, Windows, Mac OS X
   - Accelerates adoption

6 Value proposition III
4. Free AGPLv3 license
   - Free of charge
   - No limits on the number of deployed computing units, unlike many commercial packages
   - Availability of the source code ensures reproducibility of research (Open Science, important for academia)
   - Users feel part of a community: they can participate in and influence the development (social coding experience)
   - Stimulates external code contributions and extensive testing
   - AGPL protects contributors against exploitation
   - Companies get the code for free, with no upfront expenses and no risks

7 ASL Application Areas
1. High-performance scientific computing
2. Computer-aided engineering
3. Image-guided surgery
4. Non-invasive measurements
5. Industrial process control
6. Virtual sensing
7. Etc.

8 ASL Examples: Platform used
A regular desktop was used in all benchmarks below:
- CPU: AMD FX(tm)-6300 Six-Core Processor, 3.5 GHz
- RAM: 8 GB
- GPU: Sapphire AMD Radeon HD GB GDDR5 BoostOC
- OS: 64-bit Debian GNU/Linux

9 Application Examples and Relevant Hardware Markets I
1. High-performance scientific computing
   - Simulations in fundamental research by academia
   - Weather forecasting
   - Oil and gas industry (reservoir evaluation, identifying drilling targets, risk assessment, etc.)
   - Etc.
2. Computer-aided engineering
   - Crystal growth simulations by the semiconductor and nanotechnology industries
   - Aerodynamic/hydrodynamic simulations by the automotive, aerospace and shipbuilding industries
Required hardware: DSP/FPGA/GPU-based supercomputers/clusters

10 Example I: Aerodynamics of a Locomotive in a Tunnel
Simulation of a locomotive running through a tunnel.
In this benchmark, the air flow velocity and pressure fields were computed.

11 Example I: Aerodynamics of a Locomotive in a Tunnel
Numerical setup
- Domain size: 187 x 125 x 500
- Number of points:
- Resolution: 0.08 m
- Physical size: 14.96 x 10 x 40 m³
- Number of iterations:
- Time step: 48e-4 s
- Physical time: 96 s
- Computation time: 4191 s
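The grid figures in this setup can be cross-checked with a short Python sketch. It assumes the resolution reads as 0.08 m, the value under which the listed domain size reproduces the listed physical size; the helper `grid_summary` is a hypothetical name introduced here and is not part of ASL. The point count it prints is simply the product of the three domain dimensions, not a figure taken from the slide.

```python
def grid_summary(nodes, resolution):
    """Return (total grid points, physical extent per axis) for a uniform grid.

    nodes      -- (nx, ny, nz) number of grid points along each axis
    resolution -- grid spacing, in the same length unit as the returned extent
    """
    nx, ny, nz = nodes
    points = nx * ny * nz
    extent = tuple(n * resolution for n in nodes)
    return points, extent


# Locomotive-in-tunnel setup: 187 x 125 x 500 nodes, assumed spacing 0.08 m.
points, extent = grid_summary((187, 125, 500), 0.08)
print(points)                          # 11687500
print([round(e, 2) for e in extent])   # [14.96, 10.0, 40.0]
```

Under this reading the benchmark covers roughly 11.7 million grid points, which gives a sense of the problem size handled on the desktop platform described earlier.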

12 Application Examples and Relevant Hardware Markets II
3. Image-guided surgery
   - Real-time simulations of biophysical/biochemical processes in the human body by the biomedical industry
Required hardware: DSP/FPGA/GPU embedded into medical devices

13 Example II: Computer-assisted cryo- and neurosurgery
- CryoVision: 3D cryoablation planning and visualization system for planning and guiding cryosurgery
- BrainShift: simulation of the brain shift process during a craniotomy procedure
- Utilized in ACTIVE (European FP7 project)

14 Application Examples and Relevant Hardware Markets III
4. Non-invasive measurements
5. Industrial process control
6. Virtual sensing
   - Real-time non-invasive measurement and optimization of process output parameters
   - Simulation-based industrial process control
   - Replacing physical sensors with virtual or soft sensors where the former are unavailable, impractical or do not provide sufficient information
Required hardware: DSP/FPGA/GPU embedded into various devices and industrial equipment

15 Examples III: Multicomponent flow
Multicomponent flow problems are typical of a number of technologies: oil refining, natural-gas processing, food manufacturing, industrial wastewater treatment, passive mixers, lab-on-a-chip, etc.
This benchmark presents a 3D simulation of three-component flow in cross-coupled pipes.
All three components (white, red and blue) are water-based solutions.
The characteristic flow velocity is 2 cm/s.

16 Examples III: Multicomponent flow
Numerical setup
- Domain size: 503 x 203 x 103
- Number of points:
- Resolution: 0.5 mm
- Physical size: 25 x 10 x 5 cm³
- Number of iterations:
- Time step: s
- Physical time: 25 s
- Computation time: 2895 s

17 Examples III: Physical Vapor Deposition
Simulation of a coating procedure for circular saw discs using the Physical Vapor Deposition (PVD) method.
This metallization technique is used in a wide variety of other applications, such as semiconductor wafers, photovoltaic cells, thin-film batteries and optical coatings.
The present modeling can be used to optimize various output parameters of PVD: layer thickness uniformity, energy efficiency, process duration, costs and productivity.

18 Examples III: Microfluidic device
Microfluidic device for separating mixtures of proteins.

Overview of Project's Achievements

Overview of Project's Achievements PalDMC Parallelised Data Mining Components Final Presentation ESRIN, 12/01/2012 Overview of Project's Achievements page 1 Project Outline Project's objectives design and implement performance optimised,

More information

Two-Phase flows on massively parallel multi-gpu clusters

Two-Phase flows on massively parallel multi-gpu clusters Two-Phase flows on massively parallel multi-gpu clusters Peter Zaspel Michael Griebel Institute for Numerical Simulation Rheinische Friedrich-Wilhelms-Universität Bonn Workshop Programming of Heterogeneous

More information

SPC 307 Aerodynamics. Lecture 1. February 10, 2018

SPC 307 Aerodynamics. Lecture 1. February 10, 2018 SPC 307 Aerodynamics Lecture 1 February 10, 2018 Sep. 18, 2016 1 Course Materials drahmednagib.com 2 COURSE OUTLINE Introduction to Aerodynamics Review on the Fundamentals of Fluid Mechanics Euler and

More information

Hybrid KAUST Many Cores and OpenACC. Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS

Hybrid KAUST Many Cores and OpenACC. Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS + Hybrid Computing @ KAUST Many Cores and OpenACC Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS + Agenda Hybrid Computing n Hybrid Computing n From Multi-Physics

More information

Particleworks: Particle-based CAE Software fully ported to GPU

Particleworks: Particle-based CAE Software fully ported to GPU Particleworks: Particle-based CAE Software fully ported to GPU Introduction PrometechVideo_v3.2.3.wmv 3.5 min. Particleworks Why the particle method? Existing methods FEM, FVM, FLIP, Fluid calculation

More information

Introduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620

Introduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620 Introduction to Parallel and Distributed Computing Linh B. Ngo CPSC 3620 Overview: What is Parallel Computing To be run using multiple processors A problem is broken into discrete parts that can be solved

More information

GPUs and Emerging Architectures

GPUs and Emerging Architectures GPUs and Emerging Architectures Mike Giles mike.giles@maths.ox.ac.uk Mathematical Institute, Oxford University e-infrastructure South Consortium Oxford e-research Centre Emerging Architectures p. 1 CPUs

More information

GPU Clouds IGT Cloud Computing Summit Mordechai Butrashvily, CEO 2009 (c) All rights reserved

GPU Clouds IGT Cloud Computing Summit Mordechai Butrashvily, CEO 2009 (c) All rights reserved GPU Clouds IGT 2009 Cloud Computing Summit Mordechai Butrashvily, CEO moti@hoopoe-cloud.com 02/12/2009 Agenda Introduction to GPU Computing Future GPU architecture GPU on a Cloud: Visualization Computing

More information

Maximize automotive simulation productivity with ANSYS HPC and NVIDIA GPUs

Maximize automotive simulation productivity with ANSYS HPC and NVIDIA GPUs Presented at the 2014 ANSYS Regional Conference- Detroit, June 5, 2014 Maximize automotive simulation productivity with ANSYS HPC and NVIDIA GPUs Bhushan Desam, Ph.D. NVIDIA Corporation 1 NVIDIA Enterprise

More information

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller Entertainment Graphics: Virtual Realism for the Masses CSE 591: GPU Programming Introduction Computer games need to have: realistic appearance of characters and objects believable and creative shading,

More information

Laptop Requirement: Technical Specifications and Guidelines. Frequently Asked Questions

Laptop Requirement: Technical Specifications and Guidelines. Frequently Asked Questions Laptop Requirement: Technical Specifications and Guidelines As artists and designers, you will be working in an increasingly digital landscape. The Parsons curriculum addresses this by making digital literacy

More information

A Framework for Industrial Simulation and Data Analytics. Yann Debray Scilab Center of Excellence, ESI Group

A Framework for Industrial Simulation and Data Analytics. Yann Debray Scilab Center of Excellence, ESI Group A Framework for Industrial Simulation and Data Analytics Yann Debray Scilab Center of Excellence, ESI Group Copyright ESI Copyright Group, 2017. ESI All Group, rights reserved. 2017. All rights reserved.

More information

A TALENTED CPU-TO-GPU MEMORY MAPPING TECHNIQUE

A TALENTED CPU-TO-GPU MEMORY MAPPING TECHNIQUE A TALENTED CPU-TO-GPU MEMORY MAPPING TECHNIQUE Abu Asaduzzaman, Deepthi Gummadi, and Chok M. Yip Department of Electrical Engineering and Computer Science Wichita State University Wichita, Kansas, USA

More information

The Future of GPU Computing

The Future of GPU Computing The Future of GPU Computing Bill Dally Chief Scientist & Sr. VP of Research, NVIDIA Bell Professor of Engineering, Stanford University November 18, 2009 The Future of Computing Bill Dally Chief Scientist

More information

KEY STAR TECHNOLOGIES: DISPERSED MULTIPHASE FLOW AND LIQUID FILM MODELLING DAVID GOSMAN EXEC VP TECHNOLOGY, CD-adapco

KEY STAR TECHNOLOGIES: DISPERSED MULTIPHASE FLOW AND LIQUID FILM MODELLING DAVID GOSMAN EXEC VP TECHNOLOGY, CD-adapco KEY STAR TECHNOLOGIES: DISPERSED MULTIPHASE FLOW AND LIQUID FILM MODELLING DAVID GOSMAN EXEC VP TECHNOLOGY, CD-adapco INTRODUCTION KEY METHODOLOGIES AVAILABLE IN STAR-CCM+ AND STAR-CD 1. Lagrangian modelling

More information

New Technologies in CST STUDIO SUITE CST COMPUTER SIMULATION TECHNOLOGY

New Technologies in CST STUDIO SUITE CST COMPUTER SIMULATION TECHNOLOGY New Technologies in CST STUDIO SUITE 2016 Outline Design Tools & Modeling Antenna Magus Filter Designer 2D/3D Modeling 3D EM Solver Technology Cable / Circuit / PCB Systems Multiphysics CST Design Tools

More information

Trends in the Infrastructure of Computing

Trends in the Infrastructure of Computing Trends in the Infrastructure of Computing CSCE 9: Computing in the Modern World Dr. Jason D. Bakos My Questions How do computer processors work? Why do computer processors get faster over time? How much

More information

Parallel and Distributed Computing

Parallel and Distributed Computing Parallel and Distributed Computing NUMA; OpenCL; MapReduce José Monteiro MSc in Information Systems and Computer Engineering DEA in Computational Engineering Department of Computer Science and Engineering

More information

PhD Student. Associate Professor, Co-Director, Center for Computational Earth and Environmental Science. Abdulrahman Manea.

PhD Student. Associate Professor, Co-Director, Center for Computational Earth and Environmental Science. Abdulrahman Manea. Abdulrahman Manea PhD Student Hamdi Tchelepi Associate Professor, Co-Director, Center for Computational Earth and Environmental Science Energy Resources Engineering Department School of Earth Sciences

More information

OpenFOAM on POWER8. Stretching the performance envelope. A White Paper by OCF

OpenFOAM on POWER8. Stretching the performance envelope. A White Paper by OCF OpenFOAM on POWER8 Stretching the performance envelope A White Paper by OCF Executive Summary In this white paper, we will show that the IBM Power architecture provides a uniquely powerful platform for

More information

General Purpose GPU Computing in Partial Wave Analysis

General Purpose GPU Computing in Partial Wave Analysis JLAB at 12 GeV - INT General Purpose GPU Computing in Partial Wave Analysis Hrayr Matevosyan - NTC, Indiana University November 18/2009 COmputationAL Challenges IN PWA Rapid Increase in Available Data

More information

GLOBAL MARKETS AND TECHNOLOGIES FOR THIN-FILM SENSORS

GLOBAL MARKETS AND TECHNOLOGIES FOR THIN-FILM SENSORS GLOBAL MARKETS AND TECHNOLOGIES FOR THIN-FILM SENSORS IAS051A January 2014 Margareth Gagliardi Project Analyst ISBN: 1-56965-681-9 BCC Research 49 Walnut Park, Building 2 Wellesley, MA 02481 USA 866-285-7215

More information

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University CSE 591/392: GPU Programming Introduction Klaus Mueller Computer Science Department Stony Brook University First: A Big Word of Thanks! to the millions of computer game enthusiasts worldwide Who demand

More information

On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators

On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators Karl Rupp, Barry Smith rupp@mcs.anl.gov Mathematics and Computer Science Division Argonne National Laboratory FEMTEC

More information

Gradient Free Design of Microfluidic Structures on a GPU Cluster

Gradient Free Design of Microfluidic Structures on a GPU Cluster Gradient Free Design of Microfluidic Structures on a GPU Cluster Austen Duffy - Florida State University SIAM Conference on Computational Science and Engineering March 2, 2011 Acknowledgements This work

More information

ANSYS Discovery Live- Getting Started

ANSYS Discovery Live- Getting Started ANSYS Discovery Live- Getting Started Every engineer deserves the power of Discovery ANSYS Discovery Live provides instantaneous simulation, tightly coupled with direct geometry modeling, to enable interactive

More information

The Many-Core Revolution Understanding Change. Alejandro Cabrera January 29, 2009

The Many-Core Revolution Understanding Change. Alejandro Cabrera January 29, 2009 The Many-Core Revolution Understanding Change Alejandro Cabrera cpp.cabrera@gmail.com January 29, 2009 Disclaimer This presentation currently contains several claims requiring proper citations and a few

More information

Mellanox Technologies Maximize Cluster Performance and Productivity. Gilad Shainer, October, 2007

Mellanox Technologies Maximize Cluster Performance and Productivity. Gilad Shainer, October, 2007 Mellanox Technologies Maximize Cluster Performance and Productivity Gilad Shainer, shainer@mellanox.com October, 27 Mellanox Technologies Hardware OEMs Servers And Blades Applications End-Users Enterprise

More information

CUDA Conference. Walter Mundt-Blum March 6th, 2008

CUDA Conference. Walter Mundt-Blum March 6th, 2008 CUDA Conference Walter Mundt-Blum March 6th, 2008 NVIDIA s Businesses Multiple Growth Engines GPU Graphics Processing Units MCP Media and Communications Processors PESG Professional Embedded & Solutions

More information

Acer Aspire M series

Acer Aspire M series Acer Aspire M series Настоящето Aspire S3 In order to avoid any potential legal complications, please be reminded of the following guideline: all presentation slides, including all graphics and photos

More information

Fundamentals of Quantitative Design and Analysis

Fundamentals of Quantitative Design and Analysis Fundamentals of Quantitative Design and Analysis Dr. Jiang Li Adapted from the slides provided by the authors Computer Technology Performance improvements: Improvements in semiconductor technology Feature

More information

Lockheed Martin Nanosystems

Lockheed Martin Nanosystems Lockheed Martin Nanosystems National Nanotechnology Initiative at Ten: Nanotechnology Innovation Summit December 2010 Dr. Brent M. Segal Director & Chief Technologist, LM Nanosystems brent.m.segal@lmco.com

More information

Exploiting CUDA Dynamic Parallelism for low power ARM based prototypes

Exploiting CUDA Dynamic Parallelism for low power ARM based prototypes www.bsc.es Exploiting CUDA Dynamic Parallelism for low power ARM based prototypes Vishal Mehta Engineer, Barcelona Supercomputing Center vishal.mehta@bsc.es BSC/UPC CUDA Centre of Excellence (CCOE) Training

More information

HPC with GPU and its applications from Inspur. Haibo Xie, Ph.D

HPC with GPU and its applications from Inspur. Haibo Xie, Ph.D HPC with GPU and its applications from Inspur Haibo Xie, Ph.D xiehb@inspur.com 2 Agenda I. HPC with GPU II. YITIAN solution and application 3 New Moore s Law 4 HPC? HPC stands for High Heterogeneous Performance

More information

Photoresist with Ultrasonic Atomization Allows for High-Aspect-Ratio Photolithography under Atmospheric Conditions

Photoresist with Ultrasonic Atomization Allows for High-Aspect-Ratio Photolithography under Atmospheric Conditions Photoresist with Ultrasonic Atomization Allows for High-Aspect-Ratio Photolithography under Atmospheric Conditions 1 CONTRIBUTING AUTHORS Robb Engle, Vice President of Engineering, Sono-Tek Corporation

More information

FOR P3: A monolithic multigrid FEM solver for fluid structure interaction

FOR P3: A monolithic multigrid FEM solver for fluid structure interaction FOR 493 - P3: A monolithic multigrid FEM solver for fluid structure interaction Stefan Turek 1 Jaroslav Hron 1,2 Hilmar Wobker 1 Mudassar Razzaq 1 1 Institute of Applied Mathematics, TU Dortmund, Germany

More information

HPC Considerations for Scalable Multidiscipline CAE Applications on Conventional Linux Platforms. Author: Correspondence: ABSTRACT:

HPC Considerations for Scalable Multidiscipline CAE Applications on Conventional Linux Platforms. Author: Correspondence: ABSTRACT: HPC Considerations for Scalable Multidiscipline CAE Applications on Conventional Linux Platforms Author: Stan Posey Panasas, Inc. Correspondence: Stan Posey Panasas, Inc. Phone +510 608 4383 Email sposey@panasas.com

More information

AMD Elite A-Series APU Desktop LAUNCHING JUNE 4 TH PLACE YOUR ORDERS TODAY!

AMD Elite A-Series APU Desktop LAUNCHING JUNE 4 TH PLACE YOUR ORDERS TODAY! AMD Elite A-Series APU Desktop LAUNCHING JUNE 4 TH PLACE YOUR ORDERS TODAY! INTRODUCING THE APU: DIFFERENT THAN CPUS A new AMD led category of processor APUs Are Their OWN Category + = Up to 779 GFLOPS

More information

GPU Acceleration of the Longwave Rapid Radiative Transfer Model in WRF using CUDA Fortran. G. Ruetsch, M. Fatica, E. Phillips, N.

GPU Acceleration of the Longwave Rapid Radiative Transfer Model in WRF using CUDA Fortran. G. Ruetsch, M. Fatica, E. Phillips, N. GPU Acceleration of the Longwave Rapid Radiative Transfer Model in WRF using CUDA Fortran G. Ruetsch, M. Fatica, E. Phillips, N. Juffa Outline WRF and RRTM Previous Work CUDA Fortran Features RRTM in CUDA

More information

8/28/12. CSE 820 Graduate Computer Architecture. Richard Enbody. Dr. Enbody. 1 st Day 2

8/28/12. CSE 820 Graduate Computer Architecture. Richard Enbody. Dr. Enbody. 1 st Day 2 CSE 820 Graduate Computer Architecture Richard Enbody Dr. Enbody 1 st Day 2 1 Why Computer Architecture? Improve coding. Knowledge to make architectural choices. Ability to understand articles about architecture.

More information

Trends and Challenges in Multicore Programming

Trends and Challenges in Multicore Programming Trends and Challenges in Multicore Programming Eva Burrows Bergen Language Design Laboratory (BLDL) Department of Informatics, University of Bergen Bergen, March 17, 2010 Outline The Roadmap of Multicores

More information

! Readings! ! Room-level, on-chip! vs.!

! Readings! ! Room-level, on-chip! vs.! 1! 2! Suggested Readings!! Readings!! H&P: Chapter 7 especially 7.1-7.8!! (Over next 2 weeks)!! Introduction to Parallel Computing!! https://computing.llnl.gov/tutorials/parallel_comp/!! POSIX Threads

More information

Moore s Law. Computer architect goal Software developer assumption

Moore s Law. Computer architect goal Software developer assumption Moore s Law The number of transistors that can be placed inexpensively on an integrated circuit will double approximately every 18 months. Self-fulfilling prophecy Computer architect goal Software developer

More information

Embedded GPGPU and Deep Learning for Industrial Market

Embedded GPGPU and Deep Learning for Industrial Market Embedded GPGPU and Deep Learning for Industrial Market Author: Dan Mor GPGPU and HPEC Product Line Manager September 2018 Table of Contents 1. INTRODUCTION... 3 2. DIFFICULTIES IN CURRENT EMBEDDED INDUSTRIAL

More information

GPU Architecture. Alan Gray EPCC The University of Edinburgh

GPU Architecture. Alan Gray EPCC The University of Edinburgh GPU Architecture Alan Gray EPCC The University of Edinburgh Outline Why do we want/need accelerators such as GPUs? Architectural reasons for accelerator performance advantages Latest GPU Products From

More information

Virtualization Station. Brings an Efficient Virtualization Environment 4 essential aspects

Virtualization Station. Brings an Efficient Virtualization Environment 4 essential aspects Virtualization Station Brings an Efficient Virtualization Environment 4 essential aspects Core values of Virtualization Logically dividing the physical computer resource (CPU, memory, storage and network)

More information

Building supercomputers from embedded technologies

Building supercomputers from embedded technologies http://www.montblanc-project.eu Building supercomputers from embedded technologies Alex Ramirez Barcelona Supercomputing Center Technical Coordinator This project and the research leading to these results

More information

World s most advanced data center accelerator for PCIe-based servers

World s most advanced data center accelerator for PCIe-based servers NVIDIA TESLA P100 GPU ACCELERATOR World s most advanced data center accelerator for PCIe-based servers HPC data centers need to support the ever-growing demands of scientists and researchers while staying

More information

Exclusive pricing for UW students, faculty, staff and UWAA members!

Exclusive pricing for UW students, faculty, staff and UWAA members! Exclusive pricing for UW students, faculty, staff and UWAA members! MacBook Pro with Retina display (mid 2017) 720p FaceTime HD Camera; stereo speakers & dual microphones; backlit keyboard with ambient

More information

Implementation of OpenSource Structural Engineering Application OpenSees on GPU platform

Implementation of OpenSource Structural Engineering Application OpenSees on GPU platform Implementation of OpenSource Structural Engineering Application OpenSees on GPU platform Ms Gouri Kadam 1 Ms Shweta Nayak 2 1 CDAC, Pune University Campus Pune, India. Email: gourik@cdac.in 2 Department

More information

Applying OpenCL. IWOCL, May Andrew Richards

Applying OpenCL. IWOCL, May Andrew Richards Applying OpenCL IWOCL, May 2017 Andrew Richards The next generation of software will not be built on CPUs 2 On a 100 millimetre-squared chip, Google needs something like 50 teraflops of performance - Daniel

More information

Heterogenous Computing

Heterogenous Computing Heterogenous Computing Fall 2018 CS, SE - Freshman Seminar 11:00 a 11:50a Computer Architecture What are the components of a computer? How do these components work together to perform computations? How

More information

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing UNIVERSIDADE TÉCNICA DE LISBOA INSTITUTO SUPERIOR TÉCNICO Departamento de Engenharia Informática Architectures for Embedded Computing MEIC-A, MEIC-T, MERC Lecture Slides Version 3.0 - English Lecture 12

More information

Computer Architecture. What is it?

Computer Architecture. What is it? Computer Architecture Venkatesh Akella EEC 270 Winter 2005 What is it? EEC270 Computer Architecture Basically a story of unprecedented improvement $1K buys you a machine that was 1-5 million dollars a

More information

GAME PROGRAMMING ON HYBRID CPU-GPU ARCHITECTURES TAKAHIRO HARADA, AMD DESTRUCTION FOR GAMES ERWIN COUMANS, AMD

GAME PROGRAMMING ON HYBRID CPU-GPU ARCHITECTURES TAKAHIRO HARADA, AMD DESTRUCTION FOR GAMES ERWIN COUMANS, AMD GAME PROGRAMMING ON HYBRID CPU-GPU ARCHITECTURES TAKAHIRO HARADA, AMD DESTRUCTION FOR GAMES ERWIN COUMANS, AMD GAME PROGRAMMING ON HYBRID CPU-GPU ARCHITECTURES Jason Yang, Takahiro Harada AMD HYBRID CPU-GPU

More information

Higher Level Programming Abstractions for FPGAs using OpenCL

Higher Level Programming Abstractions for FPGAs using OpenCL Higher Level Programming Abstractions for FPGAs using OpenCL Desh Singh Supervising Principal Engineer Altera Corporation Toronto Technology Center ! Technology scaling favors programmability CPUs."#/0$*12'$-*

More information

1. Most people feel comfortable purchasing complex devices, such as cars, home theater systems, and computers.

1. Most people feel comfortable purchasing complex devices, such as cars, home theater systems, and computers. 1. Most people feel comfortable purchasing complex devices, such as cars, home theater systems, and computers. F PTS: 1 REF: 2 2. To make an informed choice when purchasing a computer, you must know your

More information

Chapter 0: IT Essentials Introduction Chapter 1: Introduction to the Personal Computer

Chapter 0: IT Essentials Introduction Chapter 1: Introduction to the Personal Computer Name Date Chapter 0: IT Essentials Introduction Chapter 1: Introduction to the Personal Computer After completion of this chapter, students should be able to: Explain IT industry certifications and technician

More information

Ward s Single Probes. When All You Need is One

Ward s Single Probes. When All You Need is One Ward s Single Probes When All You Need is One INTRODUCTION The Ward s Single Probes platform was designed for the classroom with ever-changing needs in mind. Users can select from three different operational

More information

Finite Element Integration and Assembly on Modern Multi and Many-core Processors

Finite Element Integration and Assembly on Modern Multi and Many-core Processors Finite Element Integration and Assembly on Modern Multi and Many-core Processors Krzysztof Banaś, Jan Bielański, Kazimierz Chłoń AGH University of Science and Technology, Mickiewicza 30, 30-059 Kraków,

More information

GPU Computing with NVIDIA s new Kepler Architecture

GPU Computing with NVIDIA s new Kepler Architecture GPU Computing with NVIDIA s new Kepler Architecture Axel Koehler Sr. Solution Architect HPC HPC Advisory Council Meeting, March 13-15 2013, Lugano 1 NVIDIA: Parallel Computing Company GPUs: GeForce, Quadro,

More information

Profiling and Debugging OpenCL Applications with ARM Development Tools. October 2014

Profiling and Debugging OpenCL Applications with ARM Development Tools. October 2014 Profiling and Debugging OpenCL Applications with ARM Development Tools October 2014 1 Agenda 1. Introduction to GPU Compute 2. ARM Development Solutions 3. Mali GPU Architecture 4. Using ARM DS-5 Streamline

More information

NVIDIA T4 FOR VIRTUALIZATION

NVIDIA T4 FOR VIRTUALIZATION NVIDIA T4 FOR VIRTUALIZATION TB-09377-001-v01 January 2019 Technical Brief TB-09377-001-v01 TABLE OF CONTENTS Powering Any Virtual Workload... 1 High-Performance Quadro Virtual Workstations... 3 Deep Learning

More information

Challenges for Non Volatile Memory (NVM) for Automotive High Temperature Operating Conditions Alexander Muffler

Challenges for Non Volatile Memory (NVM) for Automotive High Temperature Operating Conditions Alexander Muffler Challenges for Non Volatile Memory (NVM) for Automotive High Temperature Operating Conditions Alexander Muffler Product Marketing Manager Automotive, X-FAB Outline Introduction NVM Technology & Design

More information

GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS

GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS Agenda Forming a GPGPU WG 1 st meeting Future meetings Activities Forming a GPGPU WG To raise needs and enhance information sharing A platform for knowledge

More information
