Taipei Embedded Outreach OpenCL DSP Profile Proposals

1 Taipei Embedded Outreach OpenCL DSP Profile Proposals
Prof. Jenq-Kuen Lee, NTHU
Taipei, January 2018
Copyright 2018 The Khronos Group Inc.

2 Outline
- Speaker Vita
- OpenCL DSP Profile Proposal
- Reference Designs and Use Cases
- Current Status with OpenCL Roadmap

3 Prof. Jenq-Kuen Lee, NTHU
Our group contributed to the PoCL-based OpenCL runtime on HSA
- Code contributed to the HSA Foundation GitHub
- Also upstreamed to PoCL
Prototyping an OpenCL compiler (based on Open64) for PAC DSP multi-cores
- ESTIMedia 2013, ICPP EMS 2012, ACM TODAES 2015
Compiler optimizations for OpenCL programs
- Vector Data Flow Analysis for SIMD Optimizations on OpenCL Programs (Concurrency and Computation: Practice and Experience, 2016)
- Pointer-based Divergence for OpenCL (CPC 2013)
- Compilers for OpenCL with Affine Registers on GPGPU (ACM TODAES 2017)
DSP compiler optimizations (PAC DSP Compiler)
- SIMD intrinsic support for VLIW DSP (CPC 2010)
- PALF: compiler support for distributed register files in VLIW DSP (Concurrency and Computation: Practice and Experience, 2007)
Contributing OpenCL DSP Profile Proposals
- Funding support from Taiwan MOST and MediaTek

6 OpenCL Research with HSA Architectures
- Enable the OpenCL framework on the HSA platform
- Use PoCL as our base OpenCL framework
- Extend PoCL to support HSA platforms by lowering OpenCL APIs to HSA runtime APIs
- Our enhanced PoCL-based OpenCL runtime for HSA is officially released on HSA Foundation.
- Upstreamed to PoCL: ml/hsa_status.html (acknowledges our work)
- J. System Architecture Nov (OpenCL 2.0 Runtime)

7
- Support heterogeneous computing for OpenCL across CPU, GPU, DSP, and FPGA. Previously, OpenCL mainly targeted CPUs and GPUs.
- Low-power numerical precision for vision, deep learning, and signal processing applications.
- Energy savings of up to 35% in memory energy; computation with the low-power DSP numerics in the OpenCL DSP proposal consumes only 1/7 of the energy of floating point in hardware.
- An optional stochastic rounding mode benefits deep learning applications.
- Demonstrate a reference design for a Khronos SPIR-V extension with our DSP proposals.
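
To make the stochastic rounding idea concrete, here is a minimal Python sketch, not taken from the proposal. The helper name `stochastic_round_fixed` and the parameter `frac_bits` are hypothetical, and the proposal's actual rounding semantics may differ. The sketch quantizes a real value onto a grid of step 2**-frac_bits and rounds up with probability equal to the fractional remainder, so the result is unbiased on average.

```python
import math
import random


def stochastic_round_fixed(x, frac_bits, rng=random):
    """Quantize x to a multiple of 2**-frac_bits using stochastic rounding.

    Hypothetical helper for illustration only; not part of the proposal.
    """
    scale = 1 << frac_bits           # number of grid steps per unit
    scaled = x * scale               # position of x measured in grid steps
    lower = math.floor(scaled)       # nearest grid point at or below x
    remainder = scaled - lower       # distance above that grid point, in [0, 1)
    # Round up with probability equal to the remainder, so E[result] == x.
    step = lower + (1 if rng.random() < remainder else 0)
    return step / scale


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [stochastic_round_fixed(0.3, 2, rng) for _ in range(10000)]
    # Each sample is either 0.25 or 0.5; the mean should be close to 0.3.
    print(sum(samples) / len(samples))
```

Round-to-nearest would always map 0.3 to 0.25 on this grid; the stochastic variant preserves the value in expectation, which is the property usually cited as helpful when accumulating many small updates in low precision.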

8 Proposals to Khronos OpenCL DSP Profile
Goal is to integrate CPU/GPU/DSP
- Houston f2f, Seattle f2f, Frankfurt f2f, Seoul f2f, Vancouver f2f, Amsterdam f2f, Chicago f2f 2017
Contributors:
- NTHU: C. C. Yang, M. Y. Hsu, S. C. Wang, C Li, Y. Chang, BS Lu, S. Chien, T. Chen
- MediaTek: PeiChia Lin, Cheng-Wei Chen, Trent Lo, Diana Chen
The proposals are in collaboration with MediaTek

9 Comparison: Math Functions Support for Fixed-Point Types
Xilinx HLS (* = bit-width specification)
- Trigonometric: cos, cospi, sin, sinpi (*ap_fixed<W,2> where W<=32)
- Hyperbolic: None
- Exponential and Logarithmic: exp (*ap_fixed<16,8> and ap_fixed<8,4>)
- Power: sqrt (*ap_fixed<W,I> where W<=32)
- Others: None
Khronos OpenCL C++ proposal
- Trigonometric: cos, sin, tan, acos, asin, atan
- Hyperbolic: cosh, sinh, tanh, acosh, asinh, atanh
- Exponential and Logarithmic: exp, log, log10
- Power: pow, sqrt
- Others: None
ISO C++ proposal (John) (Tentative)
- Trigonometric: cos, sin
- Hyperbolic: None
- Exponential and Logarithmic: exp
- Power: pow, sqrt
- Others: abs
Other Fixed-Point Types Comparisons
Console IO
- Xilinx HLS: Yes
- Khronos OpenCL C++ proposal: No
- ISO C++ proposal (John): No
- ISO C++ proposal (Lawrence): No
- System C: Yes
Fixed Point Width
- Xilinx HLS: Arbitrary
- Khronos OpenCL C++ proposal: DSP: 8, 16, 32; FPGA: Arbitrary
- ISO C++ proposal (John): 8, 16, 32, 64
- ISO C++ proposal (Lawrence): -
- System C: sc_fxval: arbitrary width and binary point location; limited-precision fixed-point: 53
Courtesy of Ronan Keryell and Lin-Ya Yu. Copyright 2017 Xilinx.

11 SPIR-V Ecosystem
SPIR-V: Khronos defined and controlled cross-API intermediate language
- Native support for graphics and parallel constructs
- 32-bit word stream
- Extensible and easily parsed
- Retains data object and control flow information for effective code generation and translation
Khronos has open sourced these tools and translators
- glslang (GLSL, HLSL)
- SPIR-V Cross (MSL, HLSL, GLSL)
- OpenCL C front-end
- OpenCL C++ front-end
- SPIR-V (dis)assembler
- SPIR-V validator
- LLVM to SPIR-V bi-directional translator
Other diagram labels: third-party kernel and shader languages, other intermediate forms, IHV driver runtimes, LLVM
Khronos is coordinating liaison with the Clang/LLVM community, e.g. discussing SPIR-V as a supported Clang target

12 Overview
- Revision of the Khronos LLVM IR to SPIR-V converter on GitHub
- Example: new opcode type OpTypeFixedPoint

13 Example: New Opcode Type - OpTypeFixedPoint
OpTypeFixedPoint: declares a new fixed-point type.
- Width is the bit width of the significant digits (significand).
- Exponent is the absolute value (i.e. abs()) of the exponent value.
Operands: Result <id>, Literal Number Width, Literal Number Exponent...
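
One way to read the Width and Exponent operands is sketched below in Python. This is an interpretation, not the proposal's definition: it assumes the stored bits form a two's-complement integer of Width bits (sign bit included) and that the represented value is that integer times 2**-Exponent. The function names `fixed_to_real` and `real_to_fixed` are hypothetical.

```python
def fixed_to_real(raw, width, exponent):
    """Interpret the low `width` bits of raw as two's complement, scaled by 2**-exponent.

    Assumption for illustration: the sign bit counts toward `width`.
    """
    raw &= (1 << width) - 1              # keep only the low `width` bits
    if raw & (1 << (width - 1)):         # sign bit set: value is negative
        raw -= 1 << width
    return raw / (1 << exponent)


def real_to_fixed(value, width, exponent):
    """Convert value to the nearest representable integer code, saturating at the range limits.

    Uses Python's round(), which rounds exact halves to the even neighbor.
    """
    lo = -(1 << (width - 1))
    hi = (1 << (width - 1)) - 1
    raw = round(value * (1 << exponent))
    return max(lo, min(hi, raw))


if __name__ == "__main__":
    code = real_to_fixed(1.5, 16, 8)     # Width = 16, Exponent = 8
    print(code, fixed_to_real(code, 16, 8))
```

Under these assumptions a type with Width 16 and Exponent 8 has a step of 1/256 and covers values from -128 up to just under 128.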

14 OpenCL as Language/Library Backend
- C++ based neural network framework
- Language for image processing and computational photography
- MulticoreWare open source project on Bitbucket
- Single-source C++ programming for OpenCL
- Open source software library for machine learning
- Vision processing open source project
- Compiler directives for Fortran, C and C++
- Open compiler for AI framework

15 Use Case: Fixed Point on Tensorflow
Flow: Tensorflow -> Eigen -> ViennaCL -> OpenCL 2.2 C++ / OpenCL C
Khronos f2f Meeting, Chicago, 2017

16 Interfacing Tensorflow/Eigen with ViennaCL
- ViennaCL provides methods for copying data between Eigen and ViennaCL.
- Therefore, we can enable the Tensorflow-to-OpenCL flow by interfacing Tensorflow/Eigen with ViennaCL.
Diagram labels: Application, Tensorflow, Output, Eigen, ViennaCL, OpenCL

18 Enable Built-in Kernels in ViennaCL with Our Proposed OpenCL Fixed-Point
To enable the flow, we revised the .hpp header files that generate OpenCL kernels, located under the following path: ViennaCL-1.x.x/viennacl/linalg/opencl/kernels/
We enable the fixed-point use case flow by:
- Enabling the use of OpenCL C++ kernels in ViennaCL
- Enabling the use of OpenCL C++ kernels with the proposed fixed-point types in ViennaCL
- Enabling the use of OpenCL C kernels with the proposed fixed-point types in ViennaCL

19 Enable Fixed-Point Type Simulation (GPGPUSim)
- Extend the Clang/SPIR-V translator to accommodate fixed-point opcodes
  - LLVM:
  - Clang:
- LLVM IR Fixer (or LLVM IR Khronos Adaptor)
  - Fix IR differences
  - Kernel identification (annotation vs. explicit IR semantics)
  - Special types
  - Other special IR annotations for NVPTX
- Add/map fixed-point ISA extension for NVPTX
Pipeline: OpenCL Kernel -> clang -> LLVM IR (SPIRV) -> SPIRV Translator -> SPIRV -> SPIRV Translator -> LLVM IR (SPIRV) -> LLVM IR Fixer -> LLVM IR (NVPTX) -> NVPTX Backend -> NVPTX

20 Enable SPIR-V to GPGPUSim

22 OpenCL Next Directions
Generalizing DSP Path to DSP, FPGA, Embedded GPU, DL
- Incorporate OpenCL DSP Profile Proposals
Feature Sets Capability
- Allow code specializations
- Optional features are captured as capability flags in the API
Enhance SPIR-V to accommodate OpenCL low-power numerics

23 Suggestions and Comments
- We welcome suggestions and input from DSP vendors and IP companies.
- We also hope to encourage companies to participate in Khronos activities and in research and design with Khronos APIs!


More information

Heterogeneous Computing

Heterogeneous Computing Heterogeneous Computing Featured Speaker Ben Sander Senior Fellow Advanced Micro Devices (AMD) DR. DOBB S: GPU AND CPU PROGRAMMING WITH HETEROGENEOUS SYSTEM ARCHITECTURE Ben Sander AMD Senior Fellow APU:

More information

Wu Zhiwen.

Wu Zhiwen. Wu Zhiwen zhiwen.wu@intel.com Agenda Background information OpenCV DNN module OpenCL acceleration Vulkan backend Sample 2 What is OpenCV? Open Source Compute Vision (OpenCV) library 2500+ Optimized algorithms

More information

Handout 3. HSAIL and A SIMT GPU Simulator

Handout 3. HSAIL and A SIMT GPU Simulator Handout 3 HSAIL and A SIMT GPU Simulator 1 Outline Heterogeneous System Introduction of HSA Intermediate Language (HSAIL) A SIMT GPU Simulator Summary 2 Heterogeneous System CPU & GPU CPU GPU CPU wants

More information

AMD ACCELERATING TECHNOLOGIES FOR EXASCALE COMPUTING FELLOW 3 OCTOBER 2016

AMD ACCELERATING TECHNOLOGIES FOR EXASCALE COMPUTING FELLOW 3 OCTOBER 2016 AMD ACCELERATING TECHNOLOGIES FOR EXASCALE COMPUTING BILL.BRANTLEY@AMD.COM, FELLOW 3 OCTOBER 2016 AMD S VISION FOR EXASCALE COMPUTING EMBRACING HETEROGENEITY CHAMPIONING OPEN SOLUTIONS ENABLING LEADERSHIP

More information

THE HETEROGENEOUS SYSTEM ARCHITECTURE IT S BEYOND THE GPU

THE HETEROGENEOUS SYSTEM ARCHITECTURE IT S BEYOND THE GPU THE HETEROGENEOUS SYSTEM ARCHITECTURE IT S BEYOND THE GPU PAUL BLINZER AMD INC, FELLOW, SYSTEM SOFTWARE SYSTEM ARCHITECTURE WORKGROUP CHAIR HSA FOUNDATION THE HSA VISION MAKE HETEROGENEOUS PROGRAMMING

More information

DEVELOPER DAY. Shader Toolchain: HLSL in Vulkan Lei Zhang, Google MONTRÉAL APRIL Copyright Khronos Group Page 1

DEVELOPER DAY. Shader Toolchain: HLSL in Vulkan Lei Zhang, Google MONTRÉAL APRIL Copyright Khronos Group Page 1 DEVELOPER DAY Shader Toolchain: HLSL in Vulkan Lei Zhang, Google MONTRÉAL APRIL 2018 Copyright Khronos Group 2018 - Page 1 Copyright Khronos Group 2018 - Page 2 Overview Shader toolchain - Projects - SPIR-V

More information

Starting MATLAB To logon onto a Temple workstation at the Tech Center, follow the directions below.

Starting MATLAB To logon onto a Temple workstation at the Tech Center, follow the directions below. What is MATLAB? MATLAB (short for MATrix LABoratory) is a language for technical computing, developed by The Mathworks, Inc. (A matrix is a rectangular array or table of usually numerical values.) MATLAB

More information

Copyright Khronos Group Page 1

Copyright Khronos Group Page 1 SYCL and OpenCL State of the Nation Michael Wong ISOCPP VP Codeplay Vice President of R & D SYCL Working Group Chair Chair C++ Standard SG5, SG14 michael@codeplay.com wongmichael.com Ronan Keryell Xilinx

More information

Script started on Thu 25 Aug :00:40 PM CDT

Script started on Thu 25 Aug :00:40 PM CDT Script started on Thu 25 Aug 2016 02:00:40 PM CDT < M A T L A B (R) > Copyright 1984-2014 The MathWorks, Inc. R2014a (8.3.0.532) 64-bit (glnxa64) February 11, 2014 To get started, type one of these: helpwin,

More information

CS-201 Introduction to Programming with Java

CS-201 Introduction to Programming with Java CS-201 Introduction to Programming with Java California State University, Los Angeles Computer Science Department Lecture V: Mathematical Functions, Characters, and Strings Introduction How would you estimate

More information

Outline. Introduction Intel Vector Math Library (VML) o Features and performance VML in Finance Useful links

Outline. Introduction Intel Vector Math Library (VML) o Features and performance VML in Finance Useful links Outline Introduction Intel Vector Math Library (VML) o Features and performance VML in Finance Useful links 2 Introduction VML is one component of Intel MKL Support HPC applications: o o Scientific & engineering

More information

From Application to Technology OpenCL Application Processors Chung-Ho Chen

From Application to Technology OpenCL Application Processors Chung-Ho Chen From Application to Technology OpenCL Application Processors Chung-Ho Chen Computer Architecture and System Laboratory (CASLab) Department of Electrical Engineering and Institute of Computer and Communication

More information

HETEROGENEOUS SYSTEM ARCHITECTURE: PLATFORM FOR THE FUTURE

HETEROGENEOUS SYSTEM ARCHITECTURE: PLATFORM FOR THE FUTURE HETEROGENEOUS SYSTEM ARCHITECTURE: PLATFORM FOR THE FUTURE Haibo Xie, Ph.D. Chief HSA Evangelist AMD China OUTLINE: The Challenges with Computing Today Introducing Heterogeneous System Architecture (HSA)

More information

1001ICT Introduction To Programming Lecture Notes

1001ICT Introduction To Programming Lecture Notes 1001ICT Introduction To Programming Lecture Notes School of Information and Communication Technology Griffith University Semester 1, 2015 1 M Environment console M.1 Purpose This environment supports programming

More information

GPU Programming Using NVIDIA CUDA

GPU Programming Using NVIDIA CUDA GPU Programming Using NVIDIA CUDA Siddhante Nangla 1, Professor Chetna Achar 2 1, 2 MET s Institute of Computer Science, Bandra Mumbai University Abstract: GPGPU or General-Purpose Computing on Graphics

More information

Package Brobdingnag. R topics documented: March 19, 2018

Package Brobdingnag. R topics documented: March 19, 2018 Type Package Title Very Large Numbers in R Version 1.2-5 Date 2018-03-19 Author Depends R (>= 2.13.0), methods Package Brobdingnag March 19, 2018 Maintainer Handles very large

More information

Multi2sim Kepler: A Detailed Architectural GPU Simulator

Multi2sim Kepler: A Detailed Architectural GPU Simulator Multi2sim Kepler: A Detailed Architectural GPU Simulator Xun Gong, Rafael Ubal, David Kaeli Northeastern University Computer Architecture Research Lab Department of Electrical and Computer Engineering

More information

Optimised OpenCL Workgroup Synthesis for Hybrid ARM-FPGA Devices

Optimised OpenCL Workgroup Synthesis for Hybrid ARM-FPGA Devices Optimised OpenCL Workgroup Synthesis for Hybrid ARM-FPGA Devices Mohammad Hosseinabady and Jose Luis Nunez-Yanez Department of Electrical and Electronic Engineering University of Bristol, UK. Email: {m.hosseinabady,

More information

Introduction to GPGPUs and to CUDA programming model: CUDA Libraries

Introduction to GPGPUs and to CUDA programming model: CUDA Libraries Introduction to GPGPUs and to CUDA programming model: CUDA Libraries www.cineca.it Marzia Rivi m.rivi@cineca.it NVIDIA CUDA Libraries http://developer.nvidia.com/technologies/libraries CUDA Toolkit includes

More information

Week 2: Console I/O and Operators Arithmetic Operators. Integer Division. Arithmetic Operators. Gaddis: Chapter 3 (2.14,3.1-6,3.9-10,5.

Week 2: Console I/O and Operators Arithmetic Operators. Integer Division. Arithmetic Operators. Gaddis: Chapter 3 (2.14,3.1-6,3.9-10,5. Week 2: Console I/O and Operators Gaddis: Chapter 3 (2.14,3.1-6,3.9-10,5.1) CS 1428 Fall 2014 Jill Seaman 1 2.14 Arithmetic Operators An operator is a symbol that tells the computer to perform specific

More information

Python. Olmo Zavala R. Python Exercises. Center of Atmospheric Sciences, UNAM. August 24, 2016

Python. Olmo Zavala R. Python Exercises. Center of Atmospheric Sciences, UNAM. August 24, 2016 Exercises Center of Atmospheric Sciences, UNAM August 24, 2016 NAND Make function that computes the NAND. It should receive two booleans and return one more boolean. logical operators A and B, A or B,

More information

An introduction to R WS 2013/2014

An introduction to R WS 2013/2014 An introduction to R WS 2013/2014 Dr. Noémie Becker (AG Metzler) Dr. Sonja Grath (AG Parsch) Special thanks to: Dr. Martin Hutzenthaler (previously AG Metzler, now University of Frankfurt) course development,

More information

Polly Polyhedral Optimizations for LLVM

Polly Polyhedral Optimizations for LLVM Polly Polyhedral Optimizations for LLVM Tobias Grosser - Hongbin Zheng - Raghesh Aloor Andreas Simbürger - Armin Grösslinger - Louis-Noël Pouchet April 03, 2011 Polly - Polyhedral Optimizations for LLVM

More information

Neural Network Exchange Format

Neural Network Exchange Format Copyright Khronos Group 2017 - Page 1 Neural Network Exchange Format Deploying Trained Networks to Inference Engines Viktor Gyenes, specification editor Copyright Khronos Group 2017 - Page 2 Outlook The

More information