Taipei Embedded Outreach OpenCL DSP Profile Proposals
1 Copyright 2018 The Khronos Group Inc. Page 1 Taipei Embedded Outreach OpenCL DSP Profile Proposals Prof. Jenq-Kuen Lee, NTHU Taipei, January 2018
2 Outline
- Speaker Vita
- OpenCL DSP Profile Proposal
- Reference Designs and Use Cases
- Current Status with OpenCL Roadmap
3 Prof. Jenq-Kuen Lee, NTHU
Our group contributed to the PoCL-based OpenCL runtime on HSA
- Code contributed to the HSA Foundation GitHub
- Also upstreamed to PoCL
Prototyping an OpenCL compiler (based on Open64) for PAC DSP multi-cores
- ESTIMedia 2013, ICPP EMS 2012, ACM TODAES 2015
Compiler optimizations for OpenCL programs
- Vector Data Flow Analysis for SIMD Optimizations on OpenCL Programs (Concurrency and Computation: Practice and Experience, 2016)
- Pointer-based Divergence for OpenCL (CPC 2013)
- Compilers for OpenCL with Affine Registers on GPGPU (ACM TODAES 2017)
DSP compiler optimizations (PAC DSP Compiler)
- SIMD Intrinsic supports for VLIW DSP (CPC 2010)
- PALF: compiler supports for distributed register files in VLIW DSP (Concurrency and Computation: Practice and Experience, 2007)
Contributing OpenCL DSP Profile Proposals
- Funding support from Taiwan MOST and MediaTek
6 OpenCL Research with HSA Architectures
Enable the OpenCL framework on the HSA platform
- Use pocl as our base OpenCL framework
- Extend pocl to support HSA platforms by lowering OpenCL APIs to HSA runtime APIs
Our enhanced PoCL-based OpenCL runtime for HSA is officially released on HSA Foundation.
Upstreamed to PoCL: ml/hsa_status.html (acknowledges our work)
J. System Architecture Nov (OpenCL 2.0 Runtime)
7 Support heterogeneous computing for OpenCL among CPU, GPU, DSP, and FPGA
- Previously, OpenCL was mainly for CPU and GPU
Low-power numerical precision to work with vision, deep learning, and signal processing applications
- Energy savings of up to 35% in memory energy; the computation of low-power DSP numerics in the OpenCL DSP proposal consumes only 1/7 of the energy of floating point in hardware
- Optional stochastic rounding mode benefits deep learning applications
Demonstrate a reference design for the Khronos SPIR-V extension with our DSP proposals
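The benefit of stochastic rounding for deep learning can be illustrated with a small sketch. It is not taken from the proposal: the function names, the 4 fractional bits, and the 0.01 update value are assumptions chosen for illustration. It shows that an update smaller than half a quantization step is always lost under round-to-nearest, while stochastic rounding preserves it on average.

```python
import math
import random


def quantize_nearest(x, frac_bits):
    """Round x to the nearest multiple of 2**-frac_bits (ties round up)."""
    return math.floor(x * (1 << frac_bits) + 0.5)


def quantize_stochastic(x, frac_bits, rng):
    """Round x down or up to a multiple of 2**-frac_bits.

    The probability of rounding up equals the fractional remainder,
    so the expected value of the result equals x.
    """
    scaled = x * (1 << frac_bits)
    lower = math.floor(scaled)
    remainder = scaled - lower  # in [0, 1)
    return lower + (1 if rng.random() < remainder else 0)


def to_real(raw, frac_bits):
    """Convert a raw fixed-point integer (or an average of them) back to a real value."""
    return raw / (1 << frac_bits)


# Hypothetical setting: 4 fractional bits (step 1/16) and a small update of 0.01.
rng = random.Random(0)
frac_bits = 4
update = 0.01
n = 100_000

nearest = to_real(quantize_nearest(update, frac_bits), frac_bits)
average = sum(quantize_stochastic(update, frac_bits, rng) for _ in range(n)) / n

print("round-to-nearest:", nearest)
print("stochastic mean: ", to_real(average, frac_bits))
```

Round-to-nearest maps the update to zero every time, whereas the mean of the stochastically rounded values stays close to 0.01. This is the usual argument for offering the mode as an option in low-precision training.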
8 Proposals to Khronos OpenCL DSP Profile
Goal is to integrate CPU/GPU/DSP
- Houston f2f, Seattle f2f, Frankfurt f2f, Seoul f2f, Vancouver f2f, Amsterdam f2f, Chicago f2f 2017
Contributors:
- NTHU: C. C. Yang, M. Y. Hsu, S. C. Wang, C Li, Y. Chang, BS Lu, S. Chien, T. Chen
- MediaTek: PeiChia Lin, Cheng-Wei Chen, Trent Lo, Diana Chen
The proposals are in collaboration with MediaTek
9 Comparison: Math Functions Support for Fixed-Point Types
Xilinx HLS (* = bit-width specification)
- Trigonometric: cos, cospi, sin, sinpi (*ap_fixed<W,2> where W<=32)
- Hyperbolic: None
- Exponential and Logarithmic: exp (*ap_fixed<16,8> and ap_fixed<8,4>)
- Power: sqrt (*ap_fixed<W,I> where W<=32)
- Others: None
Khronos OpenCL C++ proposal
- Trigonometric: cos, sin, tan, acos, asin, atan
- Hyperbolic: cosh, sinh, tanh, acosh, asinh, atanh
- Exponential and Logarithmic: exp, log, log10
- Power: pow, sqrt
- Others: None
ISO C++ proposal (John)
- Trigonometric: cos, sin
- Hyperbolic: None (Tentative)
- Exponential and Logarithmic: exp
- Power: pow, sqrt
- Others: abs
Other Fixed-Point Types Comparisons
Columns: Xilinx HLS, Khronos OpenCL C++ proposal, ISO C++ proposal (John), ISO C++ proposal (Lawrence), System C
- Console IO: Yes, No, No, No, Yes
- Fixed Point Width: Arbitrary; DSP: 8, 16, 32, FPGA: Arbitrary; 8, 16, 32, 64; sc_fxval: arbitrary width and binary point location, limited-precision fixed-point: 53
Courtesy of Ronan Keryell and Lin-Ya Yu. Copyright 2017 Xilinx.
11 SPIR-V Ecosystem
SPIR-V
- Khronos-defined and controlled cross-API intermediate language
- Native support for graphics and parallel constructs
- 32-bit word stream
- Extensible and easily parsed
- Retains data object and control flow information for effective code generation and translation
Khronos has open sourced these tools and translators
Khronos is coordinating liaison with the Clang/LLVM community, e.g. discussing SPIR-V as a supported Clang target
Diagram labels: GLSL; HLSL; third-party kernel and shader languages; glslang; SPIR-V Cross (MSL, HLSL, GLSL); OpenCL C front-end; OpenCL C++ front-end; SPIR-V (Dis)Assembler; SPIR-V Validator; LLVM to SPIR-V bi-directional translator; LLVM; other intermediate forms; IHV driver runtimes
12 Overview
Revision of the Khronos LLVM IR to SPIR-V converter on GitHub
Example: new opcode type OpTypeFixedPoint
13 Example: New Opcode Type - OpTypeFixedPoint
OpTypeFixedPoint declares a new fixed-point type.
- Width is the bit width of the significant digits (significand).
- Exponent is the absolute value (i.e. abs()) of the exponent value.
Operands: Result <id> | Literal Number Width | Literal Number Exponent...
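A minimal sketch of how a type described by Width and Exponent could be interpreted numerically. The slide does not state signedness, rounding, or overflow behaviour, so this sketch assumes a signed two's-complement significand, a negative binary exponent (value = raw * 2**-Exponent), Python's built-in round-half-to-even, and saturation on overflow. The class and method names are hypothetical and are not part of SPIR-V.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedPointType:
    """Hypothetical model of a type declared with (Width, Exponent)."""

    width: int     # bit width of the significand
    exponent: int  # absolute value of the (assumed negative) binary exponent

    @property
    def raw_min(self) -> int:
        # Assumption: signed two's-complement significand.
        return -(1 << (self.width - 1))

    @property
    def raw_max(self) -> int:
        return (1 << (self.width - 1)) - 1

    def encode(self, x: float) -> int:
        """Quantize x to a raw integer, saturating at the representable range."""
        raw = round(x * (1 << self.exponent))
        return max(self.raw_min, min(self.raw_max, raw))

    def decode(self, raw: int) -> float:
        """Return the real value represented by a raw integer."""
        return raw / (1 << self.exponent)


# A 16-bit type with 8 fractional bits: resolution 1/256, range [-128, 128).
q8_8 = FixedPointType(width=16, exponent=8)
raw = q8_8.encode(1.5)
print(raw, q8_8.decode(raw))
print(q8_8.decode(q8_8.encode(1000.0)))  # saturates at the largest representable value
```

Storing only the absolute value of the exponent, as the slide describes, keeps both operands unsigned literals. The cost is that the sign convention of the exponent has to be fixed by the type definition itself.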
14 OpenCL as Language/Library Backend
- C++ based neural network framework
- Language for image processing and computational photography
- MulticoreWare open source project on Bitbucket
- Single-source C++ programming for OpenCL
- Open source software library for machine learning
- Vision processing open source project
- Compiler directives for Fortran, C and C++
- Open compiler for AI framework
15 Use Case: Fixed Point on Tensorflow
Flow: Tensorflow -> Eigen -> ViennaCL -> OpenCL 2.2 C++ / OpenCL C
Khronos f2f Meeting, Chicago, 2017
16 Interfacing Tensorflow/Eigen with ViennaCL
- ViennaCL provides methods for copying data between Eigen and ViennaCL
- Therefore, we can enable the Tensorflow-to-OpenCL flow by interfacing Tensorflow/Eigen with ViennaCL
Diagram labels: Application, Tensorflow, Output, Eigen, ViennaCL, OpenCL
18 Enable Built-in Kernels in ViennaCL with Our Proposed OpenCL Fixed-Point
To enable the flow, we revised the .hpp header files that generate OpenCL kernels under the following path: ViennaCL-1.x.x/viennacl/linalg/opencl/kernels/
We enable the use-case flow for fixed point by:
- Enabling the use of OpenCL C++ kernels in ViennaCL
- Enabling the use of OpenCL C++ kernels with the proposed fixed-point types in ViennaCL
- Enabling the use of OpenCL C kernels with the proposed fixed-point types in ViennaCL
19 Enable Fixed-Point Type Simulation (GPGPUSim)
Extend the Clang/SPIR-V translator to accommodate fixed-point opcodes
- LLVM:
- Clang:
LLVM IR Fixer (or LLVM IR Khronos Adaptor)
- Fix IR differences
- Kernel identification (annotation vs. explicit IR semantics)
- Special types
- Other special IR annotations for NVPTX
Add/map fixed-point ISA extension for NVPTX
Flow: OpenCL kernel -> clang -> LLVM IR (SPIRV) -> SPIRV Translator -> SPIRV -> SPIRV Translator -> LLVM IR (SPIRV) -> LLVM IR Fixer -> LLVM IR (NVPTX) -> NVPTX Backend -> NVPTX
20 Enable SPIR-V to GPGPUSim
22 OpenCL Next Directions
Generalizing DSP Path to DSP, FPGA, Embedded GPU, DL
- Incorporate OpenCL DSP Profile Proposals
Feature Sets Capability
- Allow code specializations
- Optional features are captured as capability flags in the API
Enhance SPIR-V to accommodate OpenCL low-power numerics
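The idea of capturing optional features as capability flags and specializing code on them can be sketched as follows. Every name here is hypothetical: the flag set, the variant names, and the selection function are illustrative assumptions and do not correspond to any existing or proposed OpenCL API.

```python
from enum import IntFlag


class DspCapability(IntFlag):
    """Hypothetical capability bits a device might report for optional features."""

    NONE = 0
    FIXED_POINT = 1
    STOCHASTIC_ROUNDING = 2
    FP16 = 4


def select_kernel_variant(caps: DspCapability) -> str:
    """Pick the most specialized kernel variant the reported capabilities allow."""
    if DspCapability.FIXED_POINT in caps and DspCapability.STOCHASTIC_ROUNDING in caps:
        return "fixed_point_stochastic"
    if DspCapability.FIXED_POINT in caps:
        return "fixed_point"
    if DspCapability.FP16 in caps:
        return "half_float"
    return "float"


# A DSP-like device reporting two optional features, and a device reporting none.
dsp = DspCapability.FIXED_POINT | DspCapability.STOCHASTIC_ROUNDING
print(select_kernel_variant(dsp))
print(select_kernel_variant(DspCapability.NONE))
```

With this kind of query, an application can ship several kernel variants and fall back to the baseline floating-point path on devices that report no optional features.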
23 Suggestions and Comments
We welcome suggestions and input from DSP vendors and IP companies
We also hope to encourage companies to participate in Khronos activities and in research and design with Khronos APIs!