OpenCL State of the Nation
Neil Trevett
Khronos President, NVIDIA Vice President Developer Ecosystem, OpenCL Working Group Chair
ntrevett@nvidia.com, @neilt3d
Toronto, May 2017
Copyright Khronos Group 2017 - Page 1

Topics
1. The Good: the amazing progress of OpenCL
2. The Bad: lessons learned from the first eight years
3. The Exciting: where do we go from here?

OpenCL 2.2 Finalized Here at IWOCL!
- 2011: OpenCL 1.2. Becomes industry baseline
- 2013: OpenCL 2.0. Enables new class of hardware: SVM, generic addresses, on-device dispatch
- 2015: OpenCL 2.1 and SPIR-V 1.0. SPIR-V in Core, kernel language flexibility
- 2017: OpenCL 2.2 and SPIR-V 1.2
OpenCL 2.2 features
- OpenCL C++ Kernel Language: static subset of C++14, templates and lambdas
- SPIR-V 1.2: OpenCL C++ support
- Pipes: efficient device-scope communication between kernels
- Code generation optimizations
  - Specialization constants at SPIR-V compilation time
  - Constructors and destructors of program scope global objects
  - User callbacks can be set at program release time
https://www.khronos.org/opencl/

New Open Source Engagement Model
Khronos is open sourcing specification sources, conformance tests, tools
- Merge requests welcome from the community (subject to review by OpenCL working group)
Deeper Community Enablement
- Mix your own documentation!
- Contribute and fix conformance tests
- Fix the specification, headers, ICD etc.
- Contribute new features (carefully)
Source materials for specifications and reference documentation
- CONTRIBUTED under Khronos IP Framework (you won't assert patents against conformant implementations, and license copyright for Khronos use)
- Spec and Ref Language Source: redistribution under CC-BY 4.0
- Spec Build System and Scripts: contributions and distribution under Apache 2.0
- Conformance Test Suite Source: contributions and distribution under Apache 2.0
Khronos builds and ratifies the canonical specification under Khronos IP Framework
- No changes or re-hosting allowed
Khronos Adopters Program
- Anyone can test any implementation at any time
- Conformant implementations can use trademark and are covered by Khronos IP Framework
Community built documentation and tools
- Spec and Ref Language Source and derivative materials are re-mixable under CC-BY by the industry and community

Shout Out to University of Windsor
The Windsor Testing Framework, also released today, enables developers to quickly install and configure the OpenCL Conformance Test Suite on their own systems.

SPIR-V Ecosystem
SPIR-V: Khronos defined and controlled cross-API intermediate language
- Native support for graphics and parallel constructs
- 32-bit word stream
- Extensible and easily parsed
- Retains data object and control flow information for effective code generation and translation
Source languages feeding SPIR-V
- GLSL and HLSL (glslang)
- OpenCL C and OpenCL C++ front-ends
- Third party kernel and shader languages
- Other intermediate forms, e.g. LLVM
Khronos has open sourced these tools and translators (https://github.com/khronosgroup/spirv-tools)
- glslang
- SPIR-V Cross (MSL, HLSL, GLSL)
- SPIR-V (Dis)Assembler
- SPIR-V Validator
- LLVM to SPIR-V bi-directional translator
Khronos coordinating liaison with Clang/LLVM community
- E.g. discussing SPIR-V as supported Clang target
SPIR-V is consumed by IHV driver runtimes
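To make the "32-bit word stream, easily parsed" point concrete, here is a minimal sketch in Python (not part of the original slides). It relies on the module layout defined in the SPIR-V specification: a five-word header starting with the magic number 0x07230203, followed by instructions whose first word packs the word count in the high 16 bits and the opcode in the low 16 bits. The function names `parse_spirv_header` and `iter_instructions` and the hand-built `demo` module are illustrative assumptions, not Khronos tooling.

```python
import struct

SPIRV_MAGIC = 0x07230203  # magic number from the SPIR-V specification


def parse_spirv_header(blob: bytes) -> dict:
    """Decode the five-word module header of a little-endian SPIR-V binary."""
    if len(blob) < 20:
        raise ValueError("a SPIR-V module needs at least five 32-bit words")
    magic, version, generator, bound, schema = struct.unpack_from("<5I", blob, 0)
    if magic != SPIRV_MAGIC:
        raise ValueError("bad magic number (or the module is big-endian)")
    return {
        # version word layout: 0 | major | minor | 0 (one byte each)
        "version": ((version >> 16) & 0xFF, (version >> 8) & 0xFF),
        "generator": generator,
        "bound": bound,    # all result ids are smaller than this value
        "schema": schema,  # reserved word
    }


def iter_instructions(blob: bytes):
    """Yield (opcode, operand_words) for each instruction after the header."""
    words = struct.unpack_from(f"<{len(blob) // 4}I", blob, 0)
    i = 5  # skip the header
    while i < len(words):
        word_count = words[i] >> 16   # high 16 bits: length in words
        opcode = words[i] & 0xFFFF    # low 16 bits: opcode
        if word_count == 0 or i + word_count > len(words):
            raise ValueError("malformed instruction")
        yield opcode, words[i + 1 : i + word_count]
        i += word_count


# Hypothetical hand-built module: header (version 1.2) plus two instructions.
demo = struct.pack("<5I", SPIRV_MAGIC, 0x00010200, 0, 1, 0)
demo += struct.pack("<2I", (2 << 16) | 17, 6)     # OpCapability Kernel
demo += struct.pack("<3I", (3 << 16) | 14, 2, 2)  # OpMemoryModel Physical64 OpenCL

print(parse_spirv_header(demo))
print(list(iter_instructions(demo)))
```

Because every instruction carries its own length, a tool can skip opcodes it does not understand, which is one way to read the "extensible" claim on the slide.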

SYCL Ecosystem
Single-source heterogeneous programming using STANDARD C++
- Use C++ templates and lambda functions for host & device code
- Layered over OpenCL
Fast and powerful path for bringing C++ apps and libraries to OpenCL
- C++ Kernel Fusion: better performance on complex software than hand-coding
- Halide, Eigen, Boost.Compute, SYCLBLAS, SYCL Eigen, SYCL TensorFlow, SYCL GTX
- triSYCL, ComputeCpp, VisionCpp, ComputeCpp SDK
More information at http://sycl.tech
Developer Choice
The development of the two specifications is aligned so code can be easily shared between the two approaches
- C++ Kernel Language: low-level control. GPGPU-style separation of device-side kernel source code and host code
- Single-source C++: programmer familiarity. Approach also taken by C++ AMP and OpenMP

OpenCL Adoption
100s of applications using OpenCL acceleration
- Rendering, visualization, video editing, simulation, image processing
Over 4,000 GitHub repositories using OpenCL
- Tools, applications, libraries, languages
- Up from 2,000 two years ago
Multiple silicon and open source implementations
- Increasingly used for embedded vision and neural network inferencing
Khronos Resource Hub: https://www.khronos.org/opencl/resources/opencl-applications-using-opencl
Basemark CL

OpenCL as Language/Library Backend
- C++ based neural network framework
- Language for image processing and computational photography
- MulticoreWare open source project on Bitbucket
- Single source C++ programming for OpenCL
- Java language extensions for parallelism
- Vision processing open source project
- Compiler directives for Fortran, C and C++
- Open source software library for machine learning
Hundreds of languages, frameworks and projects using OpenCL to access vendor-optimized, heterogeneous compute runtimes

Safety Critical APIs
Existing safety critical graphics APIs
- OpenGL ES 1.0 (2003), fixed function graphics: OpenGL SC 1.0 (2005) is the fixed function graphics subset
- OpenGL ES 2.0 (2007), shader programmable pipeline: OpenGL SC 2.0 (April 2016) is the shader programmable pipeline subset
Experience and guidelines feed new generation APIs for safety certifiable vision, graphics and compute, e.g. ISO 26262 and DO-178B/C
OpenVX SC 1.1, released 1st May 2017
- Restricted deployment implementation executes on the target hardware by reading the binary format and executing the pre-compiled graphs
Khronos SCAP (Safety Critical Advisory Panel)
- Guidelines for designing APIs that ease system certification
- Open to Khronos members AND industry experts
OpenCL SC TSG formed, working on OpenCL SC 1.2
- Eliminate undefined behavior
- Eliminate callback functions
- Static pool of event objects

OpenCL Conformant Implementations
[Figure: per-vendor conformance timelines grouped as Desktop, Mobile, Embedded and FPGA, spanning OpenCL 1.0 (first entries May09) through 2.1 (Jun16) and 2.0 on mobile (Apr17). Vendor labels did not survive extraction.]
Vendor timelines are first conformant submission for each spec generation
Specification releases
- Dec08: OpenCL 1.0 Specification
- Jun10: OpenCL 1.1 Specification
- Nov11: OpenCL 1.2 Specification
- Nov13: OpenCL 2.0 Specification
- Nov15: OpenCL 2.1 Specification

OpenCL - 1000s Man Years Effort
Dec08: OpenCL 1.0 Specification
(18 months)
Jun10: OpenCL 1.1 Specification
- 3-component vectors
- Additional image formats
- Multiple hosts and devices
- Buffer region operations
- Enhanced event-driven execution
- Additional OpenCL C built-ins
- Improved OpenGL data/event interop
(18 months)
Nov11: OpenCL 1.2 Specification
- Device partitioning
- Separate compilation and linking
- Enhanced image support
- Built-in kernels / custom devices
- Enhanced DX and OpenGL interop
(24 months)
Nov13: OpenCL 2.0 Specification
- Shared Virtual Memory
- On-device dispatch
- Generic Address Space
- Enhanced image support
- C11 atomics
- Pipes
- Android ICD
(24 months)
Nov15: OpenCL 2.1 Specification
- SPIR-V in Core
- Subgroups into core
- Subgroup query operations
- clCloneKernel
- Low-latency device timer queries
(18 months)
May17: OpenCL 2.2 Specification
- OpenCL C++ Kernel Language: static subset of C++14, templates and lambdas
- API and language specs: brings C++14-based Kernel Language into core specification
- SPIR-V 1.2 with C++ support: portable kernel intermediate language, support for C++14-based kernel language e.g. constructors/destructors
- Pipes: efficient device-scope communication between kernels
- Multiple code generation optimizations
- Single source C++ programming: SYCL 2.2 single source C++, full support for features in C++14-based Kernel Language
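As a small worked check of the release cadence on this slide, the sketch below (an illustration, not from the original deck) computes the whole-month gap between consecutive specification dates. The `RELEASES` table simply restates the month/year labels above, and `months_between` is a hypothetical helper. One gap comes out at 17 rather than 18 months, which suggests the interval labels on the slide are rounded.

```python
# (name, year, month) taken from the date labels on the slide
RELEASES = [
    ("OpenCL 1.0", 2008, 12),
    ("OpenCL 1.1", 2010, 6),
    ("OpenCL 1.2", 2011, 11),
    ("OpenCL 2.0", 2013, 11),
    ("OpenCL 2.1", 2015, 11),
    ("OpenCL 2.2", 2017, 5),
]


def months_between(earlier, later):
    """Whole calendar months from one (name, year, month) entry to the next."""
    return (later[1] - earlier[1]) * 12 + (later[2] - earlier[2])


gaps = [months_between(a, b) for a, b in zip(RELEASES, RELEASES[1:])]

for (prev, _, _), (cur, _, _), gap in zip(RELEASES, RELEASES[1:], gaps):
    print(f"{prev} -> {cur}: {gap} months")
```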

Google Trends

Embrace the Layered Ecosystem
What went wrong
- OpenCL mixed providing low-level hardware access with ease-of-use
- Didn't make it clear that low-level performance portability is impossible
- Did not focus on rapidly porting efficient libraries
The layered stack
- Applications
- Rich middleware ecosystem: libraries, languages, tools, engines
- Hardware
What middleware needs and provides
- Middleware just needs direct access to hardware. Driver should get out of the way
- Middleware can provide ease of use
- Middleware has the system/domain context to try to provide performance portability
Run-time hardware abstraction is needed
- Software vendors can't afford to port to every type/generation of hardware
- Hardware vendors want to keep innovating under an abstraction

Market Segments Need Deployment Flexibility
OpenCL has been over-monolithic
- E.g. DSP inferencing should not be forced to ship IEEE FP32
- Solution: feature sets enabling toggling capabilities within a coherent framework without losing conformance
Desktop (actual and cloud)
- Use cases: video and image processing, gaming compute, rendering, neural network training and inferencing
- Roadmap: Vulkan interop, dialable precision, pre-emption, collective programming and improved execution model, dynamic parallelism
Mobile
- Use cases: photo and vision processing, neural network inferencing
- Roadmap: SVM, dialable precision for inference engine and pixel processing efficiency, pre-emption and QoS scheduling for power efficiency
HPC
- Use cases: numerical simulation, neural network training, virtualization
- Roadmap: enhanced streaming processing, enhanced library support
FPGAs
- Use cases: network and stream processing
- Roadmap: enhanced execution model, self-synchronized and self-scheduled graphs, fine-grained synchronization between kernels, DSL in C++
Embedded
- Use cases: vision, signal and pixel processing, neural network inferencing
- Roadmap: arbitrary precision for power efficiency, hard real-time scheduling, asynch DMA
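The feature-set idea on this slide can be sketched as capability flags checked against named sets. Everything below is a hypothetical illustration in Python: the `Feature` flags, the two entries in `FEATURE_SETS`, and `conformant_sets` are assumptions made for the example and do not come from any OpenCL specification. The point is only that a device without FP32 can still be conformant to a set that does not require it.

```python
from enum import Flag, auto


class Feature(Flag):
    """Hypothetical capability flags, invented for this illustration."""
    FP32 = auto()
    FP16 = auto()
    INT8 = auto()
    SVM = auto()
    IMAGES = auto()


# Hypothetical feature sets: the capabilities each target market requires.
FEATURE_SETS = {
    "desktop": Feature.FP32 | Feature.FP16 | Feature.SVM | Feature.IMAGES,
    "dsp_inference": Feature.INT8 | Feature.FP16,
}


def conformant_sets(device_features: Feature) -> list:
    """Return the names of every feature set the device fully supports."""
    return sorted(
        name
        for name, required in FEATURE_SETS.items()
        if (required & device_features) == required
    )


# A DSP that ships no FP32 still conforms to the inference set.
dsp = Feature.INT8 | Feature.FP16
gpu = Feature.FP32 | Feature.FP16 | Feature.INT8 | Feature.SVM | Feature.IMAGES

print(conformant_sets(dsp))
print(conformant_sets(gpu))
```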

Other Lessons
Lesson: Language flexibility is good! Enable language innovation!
- How we learned it: OpenCL WG spent way too long designing OpenCL C and C++
- How we do better: Ingest SPIR-V! BUT vendors need to support it!
Lesson: Lack of tools and libraries
- How we learned it: Assumption that the Working Group's job is done once the specification is shipped
- How we do better: Hard launches, i.e. simultaneous availability of spec, libraries, implementations and engines
Lesson: Needs to be adopted/available on key platforms
- How we learned it: Apple are focused on Metal. OpenCL/RenderScript confusion. NVIDIA not pushing to 2.0
- How we do better: Add value to key platforms and/or develop viable portability solutions
Lesson: Middleware and application insights and prototyping are essential during standards design
- How we learned it: The OpenCL Working Group has lacked active software developer participation
- How we do better: Encourage ISVs to join Khronos to help steer the industry! AND OpenCL Advisory Panels

Khronos Advisory Panels
The Working Group invites input and shares draft specifications and other WG materials
Working Group Members
- Pay membership fee
- Sign NDA and IP Framework
- Directly participate in working groups
Advisory Panel Advisors
- Pay $0
- Sign Advisors Agreement = NDA and IP Framework
- Provide requirements and feedback on specification drafts to the working group
Members and advisors share an email list and repository
Advisory Panel membership is by invitation and renewed annually. No minimum workload commitment but we love input and feedback!
Please reach out if you wish to participate!

Requirements for OpenCL Next
- Low-level explicit API as foundation of multi-layer ecosystem
- Feature sets for market deployment flexibility
- SPIR-V ingestion for language flexibility
- Widely adopted: no market barriers to deployment
- Installable tools architecture for development flexibility
- Low-latency, multi-threaded dispatch for fine-grained, high-performance
- At least OpenCL 2.X-class compute capabilities
- Support for diverse processor types
Working Group Decision!
- Converge with and leverage Vulkan design!
- Expand on Vulkan supported processor types and compute capabilities

Vulkan Explicit GPU Control
OpenGL and OpenGL ES
- Complex drivers cause overhead and inconsistent behavior across vendors
- Always active error handling
- Full GLSL preprocessor and compiler in driver
- OpenGL vs. OpenGL ES
- Application: single thread per context
- High-level driver abstraction with layered GPU control: context management, memory allocation, full GLSL compiler, error detection
Vulkan
- Application: memory allocation, thread management, synchronization, multi-threaded generation of command buffers
- Thin driver with explicit GPU control
- Multiple front-end compilers (GLSL, HLSL etc.) producing SPIR-V pre-compiled shaders
- Loadable debug and validation layers
Benefits
- Resource management offloaded to app: low-overhead, low-latency driver
- Consistent behavior: no fighting with driver heuristics
- Validation and debug layers loaded only when needed
- SPIR-V intermediate language: shading language flexibility
- Multi-threaded command creation. Multiple graphics, command and DMA queues
- Unified API across all platforms with feature set flexibility
Vulkan 1.0 provides access to OpenGL ES 3.1 / OpenGL 4.X-class GPU functionality but with increased performance and flexibility
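The "multi-threaded generation of command buffers" pattern can be illustrated without any graphics API. The sketch below is a conceptual model in Python, not Vulkan code: `record_commands`, `build_frame` and the string "commands" are all hypothetical. Each worker records into its own private buffer, so recording needs no locks, and a single submission step then concatenates the buffers in a fixed order. Python threads are used only to show the structure. They are not a performance claim.

```python
from concurrent.futures import ThreadPoolExecutor


def record_commands(chunk_id, items):
    """Record one chunk of work into a private buffer (no shared state)."""
    return [f"draw(chunk={chunk_id}, item={item})" for item in items]


def build_frame(chunks, workers=4):
    """Record all chunks in parallel, then submit once in a fixed order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, whichever thread finishes first
        buffers = list(pool.map(record_commands, range(len(chunks)), chunks))
    # Single submission point: flatten the per-thread buffers into one queue.
    return [command for buffer in buffers for command in buffer]


frame = build_frame([[1, 2], [3]])
print(frame)
```

The contrast with the "single thread per context" model on the left of the slide is that here the application, not the driver, decides how recording work is split across threads.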

Vulkan Adoption
All major GPU companies shipping Vulkan drivers for desktop and mobile platforms
Mobile, embedded and console platforms supporting Vulkan
- Android 7.0
- Nintendo Switch
- Android TV
- Embedded Linux 7
Cross Platform

GPU Portability: Call For Participation
A portability solution needs to address APIs and shading languages
- No single API on all systems
- Shading and intermediate languages: MIR, DX IL, GLSL, MSL, HLSL
Vulkan Portability Solution
- API overlap analysis
- Vulkan is non-proprietary and is already designed to be portable
- Use Feature Sets to remove non-portable functionality
- Use Extensions to add functionality, e.g. security and robustness for the Web
- Portable Vulkan Subset API Specification
- Open source compilers/translators for shading and intermediate languages
C/C++ Portability API Library + Shading Language tools
- All open source
WebGL Next
- Lift Portability API to JavaScript and use in WebAssembly native code
- Next-gen graphics and GPU compute for the Web

OpenCL V - OpenCL and Vulkan Convergence
Converge OpenCL roadmap over time with Vulkan API and run-time
- Support more processor types, e.g. DSPs and FPGAs (graphics optional)
Layered ecosystem for backwards-compatibility and market flexibility
- Feature sets for target market agility
Single runtime stack for graphics and compute
- Streamline development, adoption and deployment for the entire industry
Stack, top to bottom
- Applications
- Vendor-supplied and open source middleware: OpenCL 1.X/2.X compatibility, math libraries, language front-ends, tool layers
- Vulkan API with installable tool & validation layers
- Thin, explicit Vulkan run-time with rigorous memory/execution model. Low-latency and predictable
Feature Sets can be enabled for particular target markets
- Dialable types and precision
- Real-time pre-emption and QoS scheduling
- Explicit asynch DMA
- Self-synchronized, self-scheduled graphs
- Stream processing

OpenCL Evolution Discussions
Timeline
- 2011: OpenCL 1.2
- 2015: OpenCL 2.1, SPIR-V in Core
- 2017: OpenCL 2.2, C++ Kernel Language
Single source C++ programming. Great for supporting C++ apps, libraries and frameworks
- SYCL 1.2: C++11 single source programming
- SYCL 2.2: C++14 single source programming
Industry working to bring heterogeneous compute to standard ISO C++
- C++17 Parallel STL hosted by Khronos
- Executors for scheduling work
- Managed pointers or channels for sharing data
- Hoping to target C++ 20 but timescales are tight
Directions under discussion
- OpenCL-V: converge Vulkan and OpenCL
- OpenCL 1.2++? Focus on embedded imaging, vision and inferencing. Make FP32 optional for DSPs and general power efficiency
- Your input!

Get Involved!
OpenCL is driving to a new level of community engagement
- Learning from the Vulkan experience
- We need to know what you need from OpenCL
- IWOCL is the perfect opportunity to find out!
Any company or organization is welcome to join Khronos
- For a voice and a vote in any of these standards
- www.khronos.org
If joining is not possible, ask about the OpenCL Advisory Panel
- Free of charge: enables design reviews, requirements and contributions
Neil Trevett - ntrevett@nvidia.com - @neilt3d