HSA FOR APPLICATION PROGRAMMING
|
|
- Mae Allen
- 6 years ago
- Views:
Transcription
1 HSA FOR APPLICATION PROGRAMMING Wen-mei W. Hwu CTO, MulticoreWare, Inc. Professor University of Illinois, Urbana-Champaign
2 CURRENT GPU COMPUTING PAIN POINTS Kernel launch overhead Limited virtual and physical memory space Extra data movement in I/O and networking Tedious host code Multiple source code development
3 KERNEL LAUNCH OVERHEAD
4 DESIRED DATA TRANSFER BEHAVIOR Main Memory (DRAM) CPU Network I/O Disk I/O Device Memory DMA GPU card (or other Accelerator cards) SAMOS 2013
5 ACTUAL DATA TRANSFER BEHAVIOR Main Memory (DRAM) Each additional copy diminishes applicationperceived bandwidth Network I/O CPU Disk I/O Device Memory DMA GPU card (or other Accelerator cards) SAMOS 2013
6 HSA COMMON ADDRESS SPACE
7 STANDARDIZED USER-LEVEL QUEUES
8 LOW OVERHEAD KERNEL LAUNCH
9 DYNAMIC COMPILATION AND BINARY COMPATIBLITY
10 FAMILIAR COMPUTATION ORGANIZATION
11 EXAMPLE: COMPUTER VISION
12
13
14
15
16 EXAMPLE EXECUTION TIME
17 TOOLS WORKING GROUP Tools Group will be looking at foundation for Developer Tools (compilation, debugging and performance analysis) Working with the working group to insure the right interface are in place Compilation, Debugging, and Performance Analysis initially We act as stewards for LLVM backend for generating HSAIL (AMD/MCW) HSA functional simulator with GDB support (AMD/MCW) HSA performance simulator (AMD) Loader Library for Simulator - BRIG object loading support ( AMD) Longer Terms will spawn sub-group that will drive requirement for language support for HSA C, C++, C++ AMP Java (JVM, Dalvik) Python JavaScript DSL Hwu 2013
18 CURRENT TIME TABLE Kickoff Conference and Refined Charter November 2013 Tools Roadmap December 2013 Initial Tools December 2013 Initial Language Tool Chains January 2013 Please join us and contribute! Hwu 2013
19 HAS BOOK IN THE MAKING Heterogeneous System Architecture For Application Programming Audience Performance application developers System architects Component architects Software stack developers Hwu 2013
20 EDITORIAL BOARD Gaster, Benedict Qualcomm Hegde Manju - AMD Hwu, Wen-mei - MulticoreWare/UIUC Jablin, Thomas MultcoreWare Lokhmotov, Anton ARM Lu, Chien-Ping MediaTek Whitecotton, Bob - AMD Hwu 2013
21 CURRENT TABLE OF CONTENTS HSA Overview (50 pages) Chapter 1: What is HSA? (Hwu/Jablin) Chapter 2: The HSA Architecture (Hwu/Jablin/Others) Chapter 3: A Programmer's View of HSA (Hwu/Jablin) Chapter 4: The Tools Framework for HAS, (Hwu/Jablin/Others) Chapter 5: Mapping of Classical algorithms onto the HSA Architecture (Hwu/Jablin) Hwu 2013
22 CURRENT TABLE OF CONTENTS (CONT.) Representative Workloads: Chapter 6: Photography, Minh Do, Professor UIUC, CTO Personify Chapter 7: Video-Audio Search, Ren Wu, Baidu Chapter 8: Augmented Reality Chein-ping Lu, MeidaTek Chapter 9: Biometrics (Face Detection) Mike Jones, MERL & Harris Gasparakis, AMD Chapter 10: Audio-Video processing Bill Herz, Sr. Fellow AMD Hwu 2013
23 CURRENT TABLE OF CONTENTS (CONT.) Chapter 11: Ray-Tracing, TBD Chapter 12: Natural user Interfaces, Navneett Dallal, CEO Flutter Chapter 13: Physics Processing, Ronald Fedkiw, Professor Stanford U. and Eftychios Sifakis Chapter 14: Graphics enhancement, Ignacio Vargas, CTO, Nextlimit (Interested) Chapter 15: Communications and networking, Don Banks- Cisco (Interested) Chapter 16: Data/Business Analytics, Zubin Dowlaty, CTO Mu Sigma Hwu 2013
24 CURRENT TABLE OF CONTENTS (CONT.) Chapter 17: Hadoop, Memcached and cloud frameworks, Bharath Mundlapudi, CTO Orzota Chapter 18: Bioinformatics Wu Feng, VirginiaTech Chapter 19: Scientific Computations, Byunghyun Jang, Univ Miss. Chapter 20: Computer Aided Design and Engineering, Martin Wong, Professor UIUC Chapter 21: Oil and Gas, Nacho Navarro, UPC/BSC (interested) Hwu 2013
25 CURRENT TABLE OF CONTENTS (CONT.) Chapter 22: Financial Services & Analysis, Surra Yanamadala, CA VP Chapter 23: Computer Vision facial expression, Elnar Hajiyev, Realeye Chapter 24: Legacy Code Interoperability, Wenmei and Tom Jablin, MCW Chapter 25: Java for HSA, Gary Frost, AMD Hwu 2013
26 MAJOR UPCOMING MILESTONES Author Algorithm and Code Due February 2014 Author Chapters Due April 2014 Review Due Back to Authors May 2014 Chapters to Elsevier June 2014 Books available October 2014 Hwu 2013
27 THANK YOU! ANY MORE QUESTIONS? Hwu 2013
HSA foundation! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar Room A. Alario! 23 November, 2015!
Advanced Topics on Heterogeneous System Architectures HSA foundation! Politecnico di Milano! Seminar Room A. Alario! 23 November, 2015! Antonio R. Miele! Marco D. Santambrogio! Politecnico di Milano! 2
More informationHSA Foundation! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar Room (Bld 20)! 15 December, 2017!
Advanced Topics on Heterogeneous System Architectures HSA Foundation! Politecnico di Milano! Seminar Room (Bld 20)! 15 December, 2017! Antonio R. Miele! Marco D. Santambrogio! Politecnico di Milano! 2
More informationMapping C++ AMP to OpenCL / HSA Wen-Heng Jack Chung
Mapping C++ AMP to OpenCL / HSA Wen-Heng Jack Chung 1 MulticoreWare Founded in 2009 Largest Independent OpenCL Team Locations Changchun Champaign Beijing St. Louis Taiwan Sunnyvale
More informationHETEROGENEOUS SYSTEM ARCHITECTURE: PLATFORM FOR THE FUTURE
HETEROGENEOUS SYSTEM ARCHITECTURE: PLATFORM FOR THE FUTURE Haibo Xie, Ph.D. Chief HSA Evangelist AMD China OUTLINE: The Challenges with Computing Today Introducing Heterogeneous System Architecture (HSA)
More informationSIMULATOR AMD RESEARCH JUNE 14, 2015
AMD'S gem5apu SIMULATOR AMD RESEARCH JUNE 14, 2015 OVERVIEW Introducing AMD s gem5 APU Simulator Extends gem5 with a GPU timing model Supports Heterogeneous System Architecture in SE mode Includes several
More informationRenderscript Accelerated Advanced Image and Video Processing on ARM Mali T-600 GPUs. Lihua Zhang, Ph.D. MulticoreWare Inc.
Renderscript Accelerated Advanced Image and Video Processing on ARM Mali T-600 GPUs Lihua Zhang, Ph.D. MulticoreWare Inc. lihua@multicorewareinc.com Overview More & more mobile apps are beginning to require
More informationECE 8823: GPU Architectures. Objectives
ECE 8823: GPU Architectures Introduction 1 Objectives Distinguishing features of GPUs vs. CPUs Major drivers in the evolution of general purpose GPUs (GPGPUs) 2 1 Chapter 1 Chapter 2: 2.2, 2.3 Reading
More informationAMD ACCELERATING TECHNOLOGIES FOR EXASCALE COMPUTING FELLOW 3 OCTOBER 2016
AMD ACCELERATING TECHNOLOGIES FOR EXASCALE COMPUTING BILL.BRANTLEY@AMD.COM, FELLOW 3 OCTOBER 2016 AMD S VISION FOR EXASCALE COMPUTING EMBRACING HETEROGENEITY CHAMPIONING OPEN SOLUTIONS ENABLING LEADERSHIP
More informationPanel Discussion: The Future of I/O From a CPU Architecture Perspective
Panel Discussion: The Future of I/O From a CPU Architecture Perspective Brad Benton AMD, Inc. #OFADevWorkshop Issues Move to Exascale involves more parallel processing across more processing elements GPUs,
More informationProject Kickoff CS/EE 217. GPU Architecture and Parallel Programming
CS/EE 217 GPU Architecture and Parallel Programming Project Kickoff David Kirk/NVIDIA and Wen-mei W. Hwu, 2007-2012 University of Illinois, Urbana-Champaign! 1 Two flavors Application Implement/optimize
More informationSIGGRAPH Briefing August 2014
Copyright Khronos Group 2014 - Page 1 SIGGRAPH Briefing August 2014 Neil Trevett VP Mobile Ecosystem, NVIDIA President, Khronos Copyright Khronos Group 2014 - Page 2 Significant Khronos API Ecosystem Advances
More informationHeterogeneous SoCs. May 28, 2014 COMPUTER SYSTEM COLLOQUIUM 1
COSCOⅣ Heterogeneous SoCs M5171111 HASEGAWA TORU M5171112 IDONUMA TOSHIICHI May 28, 2014 COMPUTER SYSTEM COLLOQUIUM 1 Contents Background Heterogeneous technology May 28, 2014 COMPUTER SYSTEM COLLOQUIUM
More informationCLICK TO EDIT MASTER TITLE STYLE. Click to edit Master text styles. Second level Third level Fourth level Fifth level
CLICK TO EDIT MASTER TITLE STYLE Second level THE HETEROGENEOUS SYSTEM ARCHITECTURE ITS (NOT) ALL ABOUT THE GPU PAUL BLINZER, FELLOW, HSA SYSTEM SOFTWARE, AMD SYSTEM ARCHITECTURE WORKGROUP CHAIR, HSA FOUNDATION
More informationHeterogeneous Computing
Heterogeneous Computing Featured Speaker Ben Sander Senior Fellow Advanced Micro Devices (AMD) DR. DOBB S: GPU AND CPU PROGRAMMING WITH HETEROGENEOUS SYSTEM ARCHITECTURE Ben Sander AMD Senior Fellow APU:
More informationGPU Computing with NVIDIA s new Kepler Architecture
GPU Computing with NVIDIA s new Kepler Architecture Axel Koehler Sr. Solution Architect HPC HPC Advisory Council Meeting, March 13-15 2013, Lugano 1 NVIDIA: Parallel Computing Company GPUs: GeForce, Quadro,
More informationGPGPU on ARM. Tom Gall, Gil Pitney, 30 th Oct 2013
GPGPU on ARM Tom Gall, Gil Pitney, 30 th Oct 2013 Session Description This session will discuss the current state of the art of GPGPU technologies on ARM SoC systems. What standards are there? Where are
More informationHeterogeneous Architecture. Luca Benini
Heterogeneous Architecture Luca Benini lbenini@iis.ee.ethz.ch Intel s Broadwell 03.05.2016 2 Qualcomm s Snapdragon 810 03.05.2016 3 AMD Bristol Ridge Departement Informationstechnologie und Elektrotechnik
More informationUPCRC Overview. Universal Computing Research Centers launched at UC Berkeley and UIUC. Andrew A. Chien. Vice President of Research Intel Corporation
UPCRC Overview Universal Computing Research Centers launched at UC Berkeley and UIUC Andrew A. Chien Vice President of Research Intel Corporation Announcement Key Messages Microsoft and Intel are announcing
More informationDr. Yassine Hariri CMC Microsystems
Dr. Yassine Hariri Hariri@cmc.ca CMC Microsystems 03-26-2013 Agenda MCES Workshop Agenda and Topics Canada s National Design Network and CMC Microsystems Processor Eras: Background and History Single core
More informationEE382N (20): Computer Architecture - Parallelism and Locality Spring 2015 Lecture 09 GPUs (II) Mattan Erez. The University of Texas at Austin
EE382 (20): Computer Architecture - ism and Locality Spring 2015 Lecture 09 GPUs (II) Mattan Erez The University of Texas at Austin 1 Recap 2 Streaming model 1. Use many slimmed down cores to run in parallel
More informationThe Future of GPU Computing
The Future of GPU Computing Bill Dally Chief Scientist & Sr. VP of Research, NVIDIA Bell Professor of Engineering, Stanford University November 18, 2009 The Future of Computing Bill Dally Chief Scientist
More informationBig Data Systems on Future Hardware. Bingsheng He NUS Computing
Big Data Systems on Future Hardware Bingsheng He NUS Computing http://www.comp.nus.edu.sg/~hebs/ 1 Outline Challenges for Big Data Systems Why Hardware Matters? Open Challenges Summary 2 3 ANYs in Big
More informationTo hear the audio, please be sure to dial in: ID#
Introduction to the HPP-Heterogeneous Processing Platform A combination of Multi-core, GPUs, FPGAs and Many-core accelerators To hear the audio, please be sure to dial in: 1-866-440-4486 ID# 4503739 Yassine
More informationCopyright Khronos Group Page 1. Vulkan Overview. June 2015
Copyright Khronos Group 2015 - Page 1 Vulkan Overview June 2015 Copyright Khronos Group 2015 - Page 2 Khronos Connects Software to Silicon Open Consortium creating OPEN STANDARD APIs for hardware acceleration
More informationIntroduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono
Introduction to CUDA Algoritmi e Calcolo Parallelo References q This set of slides is mainly based on: " CUDA Technical Training, Dr. Antonino Tumeo, Pacific Northwest National Laboratory " Slide of Applied
More informationIntroduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono
Introduction to CUDA Algoritmi e Calcolo Parallelo References This set of slides is mainly based on: CUDA Technical Training, Dr. Antonino Tumeo, Pacific Northwest National Laboratory Slide of Applied
More informationTake GPU Processing Power Beyond Graphics with Mali GPU Computing
Take GPU Processing Power Beyond Graphics with Mali GPU Computing Roberto Mijat Visual Computing Marketing Manager August 2012 Introduction Modern processor and SoC architectures endorse parallelism as
More informationThe Evolution of Big Data Platforms and Data Science
IBM Analytics The Evolution of Big Data Platforms and Data Science ECC Conference 2016 Brandon MacKenzie June 13, 2016 2016 IBM Corporation Hello, I m Brandon MacKenzie. I work at IBM. Data Science - Offering
More informationExploring System Coherency and Maximizing Performance of Mobile Memory Systems
Exploring System Coherency and Maximizing Performance of Mobile Memory Systems Shanghai: William Orme, Strategic Marketing Manager of SSG Beijing & Shenzhen: Mayank Sharma, Product Manager of SSG ARM Tech
More informationOptimizing Cache Coherent Subsystem Architecture for Heterogeneous Multicore SoCs
Optimizing Cache Coherent Subsystem Architecture for Heterogeneous Multicore SoCs Niu Feng Technical Specialist, ARM Tech Symposia 2016 Agenda Introduction Challenges: Optimizing cache coherent subsystem
More informationAccelerating Data Centers Using NVMe and CUDA
Accelerating Data Centers Using NVMe and CUDA Stephen Bates, PhD Technical Director, CSTO, PMC-Sierra Santa Clara, CA 1 Project Donard @ PMC-Sierra Donard is a PMC CTO project that leverages NVM Express
More informationTHE HETEROGENEOUS SYSTEM ARCHITECTURE IT S BEYOND THE GPU
THE HETEROGENEOUS SYSTEM ARCHITECTURE IT S BEYOND THE GPU PAUL BLINZER AMD INC, FELLOW, SYSTEM SOFTWARE SYSTEM ARCHITECTURE WORKGROUP CHAIR HSA FOUNDATION THE HSA VISION MAKE HETEROGENEOUS PROGRAMMING
More informationA Productive Framework for Generating High Performance, Portable, Scalable Applications for Heterogeneous computing
A Productive Framework for Generating High Performance, Portable, Scalable Applications for Heterogeneous computing Wen-mei W. Hwu with Tom Jablin, Chris Rodrigues, Liwen Chang, Steven ShengZhou Wu, Abdul
More informationVulkan Launch Webinar 18 th February Copyright Khronos Group Page 1
Vulkan Launch Webinar 18 th February 2016 Copyright Khronos Group 2016 - Page 1 Copyright Khronos Group 2016 - Page 2 The Vulkan Launch Webinar Is About to Start! Kathleen Mattson - Webinar MC, Khronos
More informationMaster Informatics Eng.
Advanced Architectures Master Informatics Eng. 2018/19 A.J.Proença Data Parallelism 3 (GPU/CUDA, Neural Nets,...) (most slides are borrowed) AJProença, Advanced Architectures, MiEI, UMinho, 2018/19 1 The
More informationGeneral Purpose GPU Programming. Advanced Operating Systems Tutorial 7
General Purpose GPU Programming Advanced Operating Systems Tutorial 7 Tutorial Outline Review of lectured material Key points Discussion OpenCL Future directions 2 Review of Lectured Material Heterogeneous
More informationProgramming in CUDA. Malik M Khan
Programming in CUDA October 21, 2010 Malik M Khan Outline Reminder of CUDA Architecture Execution Model - Brief mention of control flow Heterogeneous Memory Hierarchy - Locality through data placement
More informationGPU ACCELERATED DATABASE MANAGEMENT SYSTEMS
CIS 601 - Graduate Seminar Presentation 1 GPU ACCELERATED DATABASE MANAGEMENT SYSTEMS PRESENTED BY HARINATH AMASA CSU ID: 2697292 What we will talk about.. Current problems GPU What are GPU Databases GPU
More informationGeneral Purpose GPU Programming (1) Advanced Operating Systems Lecture 14
General Purpose GPU Programming (1) Advanced Operating Systems Lecture 14 Lecture Outline Heterogenous multi-core systems and general purpose GPU programming Programming models Heterogenous multi-kernels
More informationCUDA 5 and Beyond. Mark Ebersole. Original Slides: Mark Harris 2012 NVIDIA
CUDA 5 and Beyond Mark Ebersole Original Slides: Mark Harris The Soul of CUDA The Platform for High Performance Parallel Computing Accessible High Performance Enable Computing Ecosystem Introducing CUDA
More informationCSE 4/521 Introduction to Operating Systems
CSE 4/521 Introduction to Operating Systems Lecture 3 Operating Systems Structures (Operating-System Services, User and Operating-System Interface, System Calls, Types of System Calls, System Programs,
More informationNext Generation OpenGL Neil Trevett Khronos President NVIDIA VP Mobile Copyright Khronos Group Page 1
Next Generation OpenGL Neil Trevett Khronos President NVIDIA VP Mobile Ecosystem @neilt3d Copyright Khronos Group 2015 - Page 1 Copyright Khronos Group 2015 - Page 2 Khronos Connects Software to Silicon
More informationTaipei Embedded Outreach OpenCL DSP Profile Proposals
Copyright 2018 The Khronos Group Inc. Page 1 Taipei Embedded Outreach OpenCL DSP Profile Proposals Prof. Jenq-Kuen Lee, NTHU Taipei, January 2018 Copyright 2018 The Khronos Group Inc. Page 2 Outline Speaker
More informationAMD s Unified CPU & GPU Processor Concept
Advanced Seminar Computer Engineering Institute of Computer Engineering (ZITI) University of Heidelberg February 5, 2014 Overview 1 2 Current Platforms: 3 4 5 Architecture 6 2/37 Single-thread Performance
More informationCompiling for HSA accelerators with GCC
Compiling for HSA accelerators with GCC Martin Jambor SUSE Labs 8th August 2015 Outline HSA branch: svn://gcc.gnu.org/svn/gcc/branches/hsa Table of contents: Very Brief Overview of HSA Generating HSAIL
More informationGeneral Purpose GPU Programming. Advanced Operating Systems Tutorial 9
General Purpose GPU Programming Advanced Operating Systems Tutorial 9 Tutorial Outline Review of lectured material Key points Discussion OpenCL Future directions 2 Review of Lectured Material Heterogeneous
More informationWelcome to the XSEDE Big Data Workshop
Welcome to the XSEDE Big Data Workshop John Urbanic Parallel Computing Scientist Pittsburgh Supercomputing Center Copyright 2017 Who are we? Your hosts: Pittsburgh Supercomputing Center Our satellite sites:
More informationPorting Performance across GPUs and FPGAs
Porting Performance across GPUs and FPGAs Deming Chen, ECE, University of Illinois In collaboration with Alex Papakonstantinou 1, Karthik Gururaj 2, John Stratton 1, Jason Cong 2, Wen-Mei Hwu 1 1: ECE
More informationC3SR Cloud Tools and Services for Heterogeneous Cognitive Computing Systems
C3SR Cloud Tools and Services for Heterogeneous Cognitive Computing Systems Wen-mei Hwu Professor and Sanders-AMD Chair, ECE, NCSA, CS University of Illinois at Urbana-Champaign with Jinjun Xiong (IBM),
More informationTesla GPU Computing A Revolution in High Performance Computing
Tesla GPU Computing A Revolution in High Performance Computing Gernot Ziegler, Developer Technology (Compute) (Material by Thomas Bradley) Agenda Tesla GPU Computing CUDA Fermi What is GPU Computing? Introduction
More informationStorage Networking Strategy for the Next Five Years
White Paper Storage Networking Strategy for the Next Five Years 2018 Cisco and/or its affiliates. All rights reserved. This document is Cisco Public Information. Page 1 of 8 Top considerations for storage
More informationAccelerate MySQL for Demanding OLAP and OLTP Use Case with Apache Ignite December 7, 2016
Accelerate MySQL for Demanding OLAP and OLTP Use Case with Apache Ignite December 7, 2016 Nikita Ivanov CTO and Co-Founder GridGain Systems Peter Zaitsev CEO and Co-Founder Percona About the Presentation
More informationEnabling a Richer Multimedia Experience with GPU Compute. Roberto Mijat Visual Computing Marketing Manager
Enabling a Richer Multimedia Experience with GPU Compute Roberto Mijat Visual Computing Marketing Manager 1 What is GPU Compute Operating System and most application processing continue to reside on the
More informationTracing and profiling dataflow applications
Tracing and profiling dataflow applications Pierre Zins, Michel Dagenais December, 2017 Polytechnique Montréal Laboratoire DORSAL Agenda Introduction Existing tools for profiling Available platforms Current
More informationTHE PROGRAMMER S GUIDE TO THE APU GALAXY. Phil Rogers, Corporate Fellow AMD
THE PROGRAMMER S GUIDE TO THE APU GALAXY Phil Rogers, Corporate Fellow AMD THE OPPORTUNITY WE ARE SEIZING Make the unprecedented processing capability of the APU as accessible to programmers as the CPU
More informationPerformance Characterization, Prediction, and Optimization for Heterogeneous Systems with Multi-Level Memory Interference
The 2017 IEEE International Symposium on Workload Characterization Performance Characterization, Prediction, and Optimization for Heterogeneous Systems with Multi-Level Memory Interference Shin-Ying Lee
More informationThe Role of Standards in Heterogeneous Programming
The Role of Standards in Heterogeneous Programming Multi-core Challenge Bristol UWE 45 York Place, Edinburgh EH1 3HP June 12th, 2013 Codeplay Software Ltd. Incorporated in 1999 Based in Edinburgh, Scotland
More informationNUMA-Aware Data-Transfer Measurements for Power/NVLink Multi-GPU Systems
NUMA-Aware Data-Transfer Measurements for Power/NVLink Multi-GPU Systems Carl Pearson 1, I-Hsin Chung 2, Zehra Sura 2, Wen-Mei Hwu 1, and Jinjun Xiong 2 1 University of Illinois Urbana-Champaign, Urbana
More informationData Center Virtualization: Xen and Xen-blanket
Data Center Virtualization: Xen and Xen-blanket Hakim Weatherspoon Assistant Professor, Dept of Computer Science CS 5413: High Performance Systems and Networking November 17, 2014 Slides from ACM European
More informationFCUDA: Enabling Efficient Compilation of CUDA Kernels onto
FCUDA: Enabling Efficient Compilation of CUDA Kernels onto FPGAs October 13, 2009 Overview Presenting: Alex Papakonstantinou, Karthik Gururaj, John Stratton, Jason Cong, Deming Chen, Wen-mei Hwu. FCUDA:
More informationPredictive Runtime Code Scheduling for Heterogeneous Architectures
Predictive Runtime Code Scheduling for Heterogeneous Architectures Víctor Jiménez, Lluís Vilanova, Isaac Gelado Marisa Gil, Grigori Fursin, Nacho Navarro HiPEAC 2009 January, 26th, 2009 1 Outline Motivation
More informationFCUDA: Enabling Efficient Compilation of CUDA Kernels onto
FCUDA: Enabling Efficient Compilation of CUDA Kernels onto FPGAs October 13, 2009 Overview Presenting: Alex Papakonstantinou, Karthik Gururaj, John Stratton, Jason Cong, Deming Chen, Wen-mei Hwu. FCUDA:
More informationChapter 2: Operating-System Structures. Operating System Concepts 9 th Edit9on
Chapter 2: Operating-System Structures Operating System Concepts 9 th Edit9on Silberschatz, Galvin and Gagne 2013 Chapter 2: Operating-System Structures 1. Operating System Services 2. User Operating System
More informationFunctional Programming and the Web
June 13, 2011 About Me Undergraduate: University of Illinois at Champaign-Urbana PhD: Penn State University Retrofitting Programs for Complete Security Mediation Static analysis, type-based compiler Racker:
More informationCS140 Operating Systems and Systems Programming Midterm Exam
CS140 Operating Systems and Systems Programming Midterm Exam October 28 th, 2002 (Total time = 50 minutes, Total Points = 50) Name: (please print) In recognition of and in the spirit of the Stanford University
More informationCloud Computing & Visualization
Cloud Computing & Visualization Workflows Distributed Computation with Spark Data Warehousing with Redshift Visualization with Tableau #FIUSCIS School of Computing & Information Sciences, Florida International
More informationAn introduction to Machine Learning silicon
An introduction to Machine Learning silicon November 28 2017 Insight for Technology Investors AI/ML terminology Artificial Intelligence Machine Learning Deep Learning Algorithms: CNNs, RNNs, etc. Additional
More informationSAS Platform Strategy Prepared for FANS usergroup. Mike Frost, Director, Product Management Fiona McNeill, Global Product Marketing
SAS Platform Strategy Prepared for FANS usergroup Mike Frost, Director, Product Management Fiona McNeill, Global Product Marketing Information is subject to change. Q1 2017 Q2 2017 Q3 2017 Q4 2017 H1
More informationGPU Programming for Mathematical and Scientific Computing
GPU Programming for Mathematical and Scientific Computing Ethan Kerzner and Timothy Urness Department of Mathematics and Computer Science Drake University Des Moines, IA 50311 ethan.kerzner@gmail.com timothy.urness@drake.edu
More informationDomain Specific Languages for Financial Payoffs. Matthew Leslie Bank of America Merrill Lynch
Domain Specific Languages for Financial Payoffs Matthew Leslie Bank of America Merrill Lynch Outline Introduction What, How, and Why do we use DSLs in Finance? Implementation Interpreting, Compiling Performance
More informationLIQUID METAL Taming Heterogeneity
LIQUID METAL Taming Heterogeneity Stephen Fink IBM Research! IBM Research Liquid Metal Team (IBM T. J. Watson Research Center) Josh Auerbach Perry Cheng 2 David Bacon Stephen Fink Ioana Baldini Rodric
More informationTesla GPU Computing A Revolution in High Performance Computing
Tesla GPU Computing A Revolution in High Performance Computing Mark Harris, NVIDIA Agenda Tesla GPU Computing CUDA Fermi What is GPU Computing? Introduction to Tesla CUDA Architecture Programming & Memory
More informationMaximizing heterogeneous system performance with ARM interconnect and CCIX
Maximizing heterogeneous system performance with ARM interconnect and CCIX Neil Parris, Director of product marketing Systems and software group, ARM Teratec June 2017 Intelligent flexible cloud to enable
More informationPioneering New Frontiers
Pioneering New Frontiers EEA Mission Statement The EEA is a member-led industry organization based on the goal of empowering the use of Ethereum blockchain technology as an open standard for the betterment
More informationROCm: An open platform for GPU computing exploration
UCX-ROCm: ROCm Integration into UCX {Khaled Hamidouche, Brad Benton}@AMD Research ROCm: An open platform for GPU computing exploration 1 JUNE, 2018 ISC ROCm Software Platform An Open Source foundation
More informationImplementing Long-term Recurrent Convolutional Network Using HLS on POWER System
Implementing Long-term Recurrent Convolutional Network Using HLS on POWER System Xiaofan Zhang1, Mohamed El Hadedy1, Wen-mei Hwu1, Nam Sung Kim1, Jinjun Xiong2, Deming Chen1 1 University of Illinois Urbana-Champaign
More informationLecture 1: Introduction and Computational Thinking
PASI Summer School Advanced Algorithmic Techniques for GPUs Lecture 1: Introduction and Computational Thinking 1 Course Objective To master the most commonly used algorithm techniques and computational
More informationLecture 1: Gentle Introduction to GPUs
CSCI-GA.3033-004 Graphics Processing Units (GPUs): Architecture and Programming Lecture 1: Gentle Introduction to GPUs Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com Who Am I? Mohamed
More informationClearSpeed Visual Profiler
ClearSpeed Visual Profiler Copyright 2007 ClearSpeed Technology plc. All rights reserved. 12 November 2007 www.clearspeed.com 1 Profiling Application Code Why use a profiler? Program analysis tools are
More informationXPU A Programmable FPGA Accelerator for Diverse Workloads
XPU A Programmable FPGA Accelerator for Diverse Workloads Jian Ouyang, 1 (ouyangjian@baidu.com) Ephrem Wu, 2 Jing Wang, 1 Yupeng Li, 1 Hanlin Xie 1 1 Baidu, Inc. 2 Xilinx Outlines Background - FPGA for
More informationMore performance options
More performance options OpenCL, streaming media, and native coding options with INDE April 8, 2014 2014, Intel Corporation. All rights reserved. Intel, the Intel logo, Intel Inside, Intel Xeon, and Intel
More informationTowards Automatic Heterogeneous Computing Performance Analysis. Carl Pearson Adviser: Wen-Mei Hwu
Towards Automatic Heterogeneous Computing Performance Analysis Carl Pearson pearson@illinois.edu Adviser: Wen-Mei Hwu 2018 03 30 1 Outline High Performance Computing Challenges Vision CUDA Allocation and
More informationBuilding blocks for 64-bit Systems Development of System IP in ARM
Building blocks for 64-bit Systems Development of System IP in ARM Research seminar @ University of York January 2015 Stuart Kenny stuart.kenny@arm.com 1 2 64-bit Mobile Devices The Mobile Consumer Expects
More informationKhronos Connects Software to Silicon
Press Pre-Briefing GDC 2015 Neil Trevett Khronos President NVIDIA Vice President Mobile Ecosystem All Materials Embargoed Until Tuesday 3 rd March, 12:01AM Pacific Time Copyright Khronos Group 2015 - Page
More informationChapter 2. Operating-System Structures
Chapter 2 Operating-System Structures 2.1 Chapter 2: Operating-System Structures Operating System Services User Operating System Interface System Calls Types of System Calls System Programs Operating System
More informationHigh Performance Memory Opportunities in 2.5D Network Flow Processors
High Performance Memory Opportunities in 2.5D Network Flow Processors Jay Seaton, VP Silicon Operations, Netronome Larry Zu, PhD, President, Sarcina Technology LLC August 6, 2013 2013 Netronome 1 Netronome
More informationPaper Summary Problem Uses Problems Predictions/Trends. General Purpose GPU. Aurojit Panda
s Aurojit Panda apanda@cs.berkeley.edu Summary s SIMD helps increase performance while using less power For some tasks (not everything can use data parallelism). Can use less power since DLP allows use
More informationOpen Standard APIs for Embedded Vision Processing
Copyright Khronos Group 2014 - Page 1 Open Standard APIs for Embedded Vision Processing Neil Trevett Vice President Mobile Ecosystem, NVIDIA President, Khronos Group Copyright Khronos Group 2014 - Page
More informationCross-architecture Virtualisation
Cross-architecture Virtualisation Tom Spink Harry Wagstaff, Björn Franke School of Informatics University of Edinburgh Virtualisation Many of you will be familiar with same-architecture virtualisation
More informationCIS 601 Graduate Seminar. Dr. Sunnie S. Chung Dhruv Patel ( ) Kalpesh Sharma ( )
Guide: CIS 601 Graduate Seminar Presented By: Dr. Sunnie S. Chung Dhruv Patel (2652790) Kalpesh Sharma (2660576) Introduction Background Parallel Data Warehouse (PDW) Hive MongoDB Client-side Shared SQL
More informationThe Architecture of Virtual Machines Lecture for the Embedded Systems Course CSD, University of Crete (April 29, 2014)
The Architecture of Virtual Machines Lecture for the Embedded Systems Course CSD, University of Crete (April 29, 2014) ManolisMarazakis (maraz@ics.forth.gr) Institute of Computer Science (ICS) Foundation
More informationOptimization Principles and Application Performance Evaluation of a Multithreaded GPU Using CUDA
Optimization Principles and Application Performance Evaluation of a Multithreaded GPU Using CUDA Shane Ryoo, Christopher I. Rodrigues, Sara S. Baghsorkhi, Sam S. Stone, David B. Kirk, and Wen-mei H. Hwu
More informationHandout 3. HSAIL and A SIMT GPU Simulator
Handout 3 HSAIL and A SIMT GPU Simulator 1 Outline Heterogeneous System Introduction of HSA Intermediate Language (HSAIL) A SIMT GPU Simulator Summary 2 Heterogeneous System CPU & GPU CPU GPU CPU wants
More informationWelcome to the XSEDE Big Data Workshop
Welcome to the XSEDE Big Data Workshop John Urbanic Parallel Computing Scientist Pittsburgh Supercomputing Center Copyright 2018 Who are we? Our satellite sites: Tufts University Purdue University Howard
More informationThe OpenVX Computer Vision and Neural Network Inference
The OpenVX Computer and Neural Network Inference Standard for Portable, Efficient Code Radhakrishna Giduthuri Editor, OpenVX Khronos Group radha.giduthuri@amd.com @RadhaGiduthuri Copyright 2018 Khronos
More information@Cisco. Welcome! By Rennie Allen, Cisco FAE. Welcome to the Machine By Rennie Allen, Cisco FAE. Q Volume 1 Issue 1
@Cisco QNX Software Systems, 900 East Hamilton Ave., Campbell CA, 95008 www.qnx.com rallen@qnx.com 951-704-3447 Q2 2008 Volume 1 Issue 1 Welcome! I N S I D E T H I S I S S U E 1 Welcome to @Cisco! 1 Welcome
More informationFour Components of a Computer System
Four Components of a Computer System Operating System Concepts Essentials 2nd Edition 1.1 Silberschatz, Galvin and Gagne 2013 Operating System Definition OS is a resource allocator Manages all resources
More informationIJRDTM Kailash ISBN No Vol.17 Issue
ABSTRACT ANDROID OPERATING SYSTEM : A CASE STUDY by Pankaj Research Associate, GGSIP University Android is a software stack for mobile devices that includes an operating system, middleware and key applications.
More informationA High-Performing Cloud Begins with a Strong Foundation. A solution guide for IBM Cloud bare metal servers
A High-Performing Cloud Begins with a Strong Foundation A solution guide for IBM Cloud bare metal servers 02 IBM Cloud Bare Metal Servers Bare metal and the bottom line Today s workloads are dynamic and
More informationEE382N (20): Computer Architecture - Parallelism and Locality Lecture 10 Parallelism in Software I
EE382 (20): Computer Architecture - Parallelism and Locality Lecture 10 Parallelism in Software I Mattan Erez The University of Texas at Austin EE382: Parallelilsm and Locality (c) Rodric Rabbah, Mattan
More information