Kari Pulli Intel. Radhakrishna Giduthuri, AMD. Frank Brill NVIDIA. OpenVX Webinar. June 16, 2016

Size: px
Start display at page:

Download "Kari Pulli Intel. Radhakrishna Giduthuri, AMD. Frank Brill NVIDIA. OpenVX Webinar. June 16, 2016"

Transcription

1 Kari Pulli Intel Radhakrishna Giduthuri, AMD Frank Brill NVIDIA OpenVX Webinar June 16, 2016

2 Copyright Khronos Group Page 1 Vision Acceleration Kari Pulli Sr. Principal Engineer Intel

3 Copyright Khronos Group Page 2 Khronos Open Standards Software Silicon Khronos is an Industry Consortium of over 100 companies creating royalty-free, open standard APIs to enable software to access hardware acceleration for graphics, parallel compute and vision

4 Copyright Khronos Group Page 3 Vision Processing Power Efficiency Vision processing just on CPU is too expensive - Especially on battery-powered devices Advanced Sensors GPUs are more power-efficient - They were architected for efficient pixel handling Traditional cameras have dedicated hardware - ISP = Image Signal Processor on all SOCs today Wearables But how to program specialized processors? Performance and Functional Portability Power Efficiency X100 X10 X1 Dedicated Hardware Vision DSPs Vision Processing Efficiency GPU Compute Multi-core CPU Computation Flexibility

5 OpenVX Low-Power Vision Acceleration Higher-level abstraction API - Targeted at real-time mobile and embedded platforms Performance portability across diverse architectures - Multi-core CPUs, GPUs, DSPs, ISPs, Dedicated hardware, Extends portable vision acceleration to very low-power domains - Doesn t require high-power CPU/GPU Complex Power Efficiency X100 X10 X1 Dedicated Hardware Vision DSPs Vision Processing Efficiency GPU Compute Multi-core CPU Computation Flexibility Vision Engine Accelerator Middleware Accelerator Application Accelerator Copyright Khronos Group Page 4

6 Copyright Khronos Group Page 5 OpenVX Graphs OpenVX developers express a graph of image operations ( Nodes ) - Nodes can be on any hardware or processor coded in any language - For example, on GPU, nodes may implemented in OpenCL Minimizes host interaction during frame-rate graph execution - Host processor can setup graph which can then execute almost autonomously Camera Input OpenVX Nodes Pyr t Rendering Output RGB Frame Color Conversion YUV Frame Channel Extract Gray Frame Image Pyramid Optical Flow Array of Keypoints Harris Track OpenVX Graph Ftr t-1 Array of Features

7 Copyright Khronos Group Page 6 OpenVX Framework Efficiency Graph Scheduling Memory Management Kernel Merging Data Tiling Split the graph execution across the whole system: CPU / GPU / dedicated HW Reuse pre-allocated memory for multiple intermediate data Replace a subgraph with a single faster node Execute a subgraph at tile granularity instead of image granularity Faster execution or lower power consumption Less allocation overhead, more memory for other applications Better memory locality, less kernel launch overhead Better use of data cache and local memory

8 Copyright Khronos Group Page 7 OpenVX and OpenCV are Complementary Implementation Conformance Consistency Scope Efficiency Typical Use Case Community-driven open source library Extensive OpenCV Test Suite but no formal Adopters program Available functions can vary depending on implementation / platform Very wide 1000s of imaging and vision functions Memory-based architecture Each operation reads and writes to memory Rapid experimentation and prototyping - especially on desktop Open standard API designed to be implemented by hardware vendors Implementations must pass defined conformance test suite to use trademark All core functions must be available in all conformant implementations Tight focus on core hardware accelerated functions for mobile vision but extensible Graph-based execution Optimizable computation and data transfer Production development & deployment on wide range of mobile and embedded devices

9 Copyright Khronos Group Page 8 OpenVX 1.0 Shipping, OpenVX 1.1 Released! Multiple OpenVX 1.0 Implementations shipping spec in October Open source sample implementation and conformance tests available OpenVX 1.1 Specification released in May Expands node functionality AND enhances graph framework -Sample source and conformance tests will be updated to OpenVX 1.1 soon OpenVX is EXTENSIBLE -Implementers can add their own nodes at any time to meet customer and market needs = provided results for conformance tests

10 Copyright Khronos Group Page 1 OpenVX Technical Overview Khronos Webinar khronos.org/openvx Radhakrishna Giduthuri AMD

11 Copyright Khronos Group Page 2 OpenVX Components Context (vx_context) Data Objects vx_image, vx_pyramid, vx_array, vx_lut, vx_remap, vx_scalar, vx_threshold, vx_distribution, vx_matrix, vx_convolution, vx_delay, vx_object_array Graphs (vx_graph) Nodes (vx_node) Kernel instances, parameters, completion callback functions Kernels (vx_kernel) Built-in vision functions, Vendor extensions, User-defined Virtual Data vx_image, vx_pyramid, vx_array, vx_object_array Miscellaneous Directives, Hints, Logging, Performance Measurements Extensions Tiling, XML Schema

12 Copyright Khronos Group Page 3 Context Context - OpenVX world: need to be created first - All objects belong to a context #include <VX/vx.h>... vx_context context = vxcreatecontext(); * See VX/vx_api.h for framework API function definitions.

13 Error Management Methods return a status vx_status returned: VX_SUCCESS when no error if( vxprocessgraph( graph )!= VX_SUCCESS) { /* Error */ } Explicit status check Object creation: use vxgetstatus to check the object vx_context context = vxcreatecontext(); if( vxgetstatus( (vx_reference)context )!= VX_SUCCESS ) { /* Error */ } More info from the log callback void logcallback( vx_context c, vx_reference r, vx_status s, const vx_char string[] ) { /* Do something */ }... vxregisterlogcallback( context, logcallback, vx_false_e );... vxaddlogentry( reference, VX_INVALID_VALUE, specified value is out of range ); * See VX/vx_types.h for type definitions and error codes. Copyright Khronos Group Page 4

14 Data objects The application gets only references to objects, not the objects -References should be released by the application when not needed -Ref-counted object destroyed by OpenVX when not referenced any more vx_image img = vxcreateimage( context, 640, 400, VX_DF_IMAGE_RGB ); // Use the image vxreleaseimage( &img ); Object-Oriented Behavior -strongly typed (good for safety-critical applications) -OpenVX are really pointers to structs - any object may be down-cast to a vx_reference, e.g., for passing to vxgetstatus() Opaque -Access to content explicit and temporary (map/unmap or copy) - No permanent pointer to internal data -Needed to handle complex memory hierarchies - DSP local memory - GPU dedicated memory Copyright Khronos Group Page 5

15 Copyright Khronos Group Page 6 Enumerated Data Types C data type vx_uint8 (basic data type) vx_int16 vx_uint16 vx_int32 vx_float32 vx_enum Enumeration VX_TYPE_UINT8 VX_TYPE_INT16 VX_TYPE_UINT16 VX_TYPE_INT32 VX_TYPE_FLOAT32 VX_TYPE_ENUM vx_rectangle_t (struct) vx_keypoint_t vx_image (opaque object) VX_TYPE_RECTANGLE VX_TYPE_KEYPOINT VX_TYPE_IMAGE

16 Copyright Khronos Group Page 7 Data Object Creation vx_image img = vxcreateimage( ctx, 640, 400, VX_DF_IMAGE_UYVY ); // supports 13 standard formats vx_pyramid pyr = vxcreatepyramid( ctx, levels, VX_SCALE_PYRAMID_HALF, 640, 400, VX_DF_IMAGE_U8 ); vx_array arr = vxcreatearray( ctx, VX_TYPE_KEYPOINT, capacity ); // array of vx_keypoint_t[] vx_lut lut = vxcreatelut( ctx, VX_TYPE_UINT8, 256 ); // 8-bit look-up table vx_remap remap = vxcreateremap( ctx, src_width, src_height, dst_width, dst_height ); vx_matrix mat = vxcreatematrix( ctx, VX_TYPE_FLOAT32, columns, rows ); vx_distribution dist = vxcreatedistribution( ctx, num_bins, offset, range ); vx_float32 scalar_initial_value = 1.25f; vx_scalar scalar = vxcreatescalar( ctx, VX_TYPE_FLOAT32, &scalar_initial_value ); vx_delay delay = vxcreatedelay( ctx, (vx_reference)pyr, num_slots ); // pyr is an exemplar vx_object_array obj_arr = vxcreateobjectarray( ctx, (vx_reference)pyr, count );

17 OpenVX Graph vx_context context = vxcreatecontext(); vx_image input = vxcreateimage( context, 640, 480, VX_DF_IMAGE_U8 ); vx_image output = vxcreateimage( context, 640, 480, VX_DF_IMAGE_U8 ); vx_graph graph = vxcreategraph( context ); vx_image intermediate = vxcreatevirtualimage( graph, 640, 480, VX_DF_IMAGE_U8 ); vx_node F1 = vxf1node( graph, input, intermediate ); vx_node F2 = vxf2node( graph, intermediate, output ); vxverifygraph( graph ); while(...) { // write to input image vxprocessgraph( graph ); // read from output image } context graph * Use #include <VX/vx.h> for OpenVX header files intermediate input F1 F2 output

18 OpenVX 1.1 Built-in Vision Functions Kernels Pixel-wise Functions Add, Subtract, Multiply, AbsDiff, And, Or, Xor, Not, Magnitude, Phase, Threshold, TableLookup, ColorDepth, ChannelExtract, ChannelCombine, ColorConvert, AccumulateImage, AccumulateSquaredImage, AccumulateWeightedImage, Reduction Functions Histogram, MeanStdDev, MinMaxLoc Filtering Functions Box3x3, Convolve, Dilate3x3, Erode3x3, Gaussian3x3, Median3x3, Sobel3x3, GaussianPyramid, NonLinearFilter, LaplacianPyramid, LaplacianReconstruct Geometric Functions Remap, ScaleImage, WarpAffine, WarpPerspective, HalfScaleGaussian Complex Functions CannyEdgeDetector, EqualizeHist, FastCorners, HarrisCorners, IntegralImage, OpticalFlowPyrLK * See VX/vx_nodes.h for functions to create kernel instances (nodes) in a graph. Copyright Khronos Group Page 9

19 Copyright Khronos Group Page 10 Rectangle typedef struct _vx_rectangle_t { vx_uint32 start_x; /*!< \brief The Start X coordinate. */ vx_uint32 start_y; /*!< \brief The Start Y coordinate. */ vx_uint32 end_x; /*!< \brief The End X coordinate. */ vx_uint32 end_y; /*!< \brief The End Y coordinate. */ } vx_rectangle_t; Type enumeration: VX_TYPE_RECTANGLE Image start : inside rectangle end : outside

20 Copyright Khronos Group Page 11 Keypoints typedef struct _vx_keypoint_t { vx_int32 x; // keypoint x-coordinate vx_int32 y; // keypoint y-coordinate vx_float32 strength; // strength of keypoint vx_float32 scale; vx_float32 orientation; vx_int32 tracking_status; // zero indicates lost point. Initialized to 1 by detectors vx_float32 error; } vx_keypoint_t; Type enumeration: VX_TYPE_KEYPOINT key-point Image

21 Copyright Khronos Group Page 12 Array Data Object vx_array vxcreatearray ( vx_context context, vx_enum item_type, // VX_TYPE_KEYPOINT, VX_TYPE_UINT32,... vx_size capacity ); capacity-1 num_items vx_array array = vxcreatearray( context, VX_TYPE_RECTANGLE, 64 ); // remove all items from array and add 8 items vxtruncatearray( array, 0 ); vxaddarrayitems( array, 8, &rect[0], sizeof(vx_rectangle_t) ); // get number items in the array by querying array attribute vxqueryarray(array, VX_ARRAY_NUMITEMS, &num_items, sizeof(num_items));

22 Copyright Khronos Group Page 13 Array Data Access Copy using application controlled address and memory layout - vxcopyarrayrange: copy (Read or Write) vxqueryarray( arr, VX_ARRAY_NUMITEMS, &num_items, sizeof(num_items) ); vxcopyarrayrange( arr, 0, num_items, sizeof(my_array[0]), &my_array[0], VX_READ_ONLY, VX_MEMORY_TYPE_HOST ); Access limited in time - vxmaparrayrange: get access (Read, Write, Read & Write) - vxunmaparrayrange: release the access vx_map_id map_id; void * ptr; vxqueryarray( arr, VX_ARRAY_NUMITEMS, &num_items, sizeof(num_items) ); vxmaparrayrange( arr, 0, num_items, &map_id, &stride, &ptr, VX_READ_AND_WRITE, VX_MEMORY_TYPE_HOST, 0 ); // Access data in ptr vxunmaparrayrange( arr, map_id );

23 Copyright Khronos Group Page 14 Image Access (1/2) : Overview Copy using application controlled address and memory layout - vxcopyimagepatch: copy (Read or Write) vx_imagepatch_addressing_t addr = { /* to fill stride_x & stride_y */ }; vx_rectangle_t rect = { 0u, 0u, width, height }; vxcopyimagepatch( img, &rect, plane, &addr, my_array, VX_WRITE_ONLY, VX_MEMORY_TYPE_HOST, VX_NOGAP_X ); Access limited in time - vxmapimagepatch: get access (Read, Write, Read & Write) - vxunmapimagepatch: release the access vx_map_id map_id; void * ptr; vx_imagepatch_addressing_t addr; vx_rectangle_t rect = { 0u, 0u, width, height }; vxmapimagepatch( img, &rect, plane, &map_id, &addr, &ptr, VX_READ_AND_WRITE, VX_MEMORY_TYPE_HOST, VX_NOGAP_X ); // Access data in ptr vxunmapimagepatch( img, map_id );

24 Copyright Khronos Group Page 15 Image Access (2/2) : Memory Layout typedef struct _vx_imagepatch_addressing_t { vx_uint32 dim_x; vx_uint32 dim_y; vx_int32 stride_x; vx_int32 stride_y; vx_uint32 scale_x; vx_uint32 scale_y; vx_uint32 step_x; vx_uint32 step_y; } vx_imagepatch_addressing_t; Num of (logical) pixels in a row Num of (logical) pixels in a column Num of bytes between the beginning of 2 successive pixels Num of bytes between the beginning of 2 successive lines Sub-sampling : 1 physical pixel every step logical pixel scale = VX_SCALE_UNITY / step stride_y stride_x Patch

25 Feature Tracking Example vx_context vx_delay of pyramids Optical Flow (LK) vx_node Compute Pyramid Input Image from CAMERA vx_image Convert to Grayscale vx_pyramid at t=n-1 vx_pyramid at t=n keypoint array (x, y, ) at t=n-1 vx_delay of keypoints Copy of Data from Previous Iteration keypoint array (x, y, ) at t=n vx_array keep old copy Computed Data Harris Detector vx_node vx_image vx_graph vx_graph Copyright Khronos Group Page 16

26 Copyright Khronos Group Page 17 Pyramid Data Object vx_pyramid vxcreatepyramid ( vx_context context, vx_size levels, vx_float32 scale, // VX_SCALE_PYRAMID_HALF or VX_SCALE_PYRAMID_ORB vx_uint32 width, vx_uint32 height, vx_df_image format // VX_DF_IMAGE_U8 ); Level 3 Level 0 (base) Level 2 Level 1 Example: vx_pyramid pyramid = vxcreatepyramid(context, ); // get image at pyramid level 2 vx_image img2 = vxgetpyramidlevel( pyramid, 2 ); vxreleaseimage( &img2 ); vxreleasepyramid( &pyramid );

27 Copyright Khronos Group Page 18 Delay Data Object vx_delay vxcreatedelay ( vx_context context, vx_reference exemplar, vx_size count ); Example: vx_pyramid exemplar = vxcreatepyramid(context, ); vx_delay pyr_delay = vxcreatedelay(context, (vx_reference)exemplar, 2); vxreleasepyramid(&exemplar); vx_pyramid pyr_0 = (vx_pyramid)vxgetreferencefromdelay(pyr_delay, 0); vx_pyramid pyr_1 = (vx_pyramid)vxgetreferencefromdelay(pyr_delay, -1); vxagedelay(pyr_delay);

28 Data Objects vx_context Input Image from CAMERA vx_delay of pyramids vx_image vx_pyramid at t=n-1 keypoint array (x, y, ) at t=n-1 vx_delay of keypoints vx_pyramid at t=n keypoint array (x, y, ) at t=n vx_array Copy of Data from Previous Iteration keep old copy Computed Data Copyright Khronos Group Page 19

29 Harris Graph vx_context vx_delay of pyramids Gaussian Pyramid vx_node Input Image from CAMERA vx_pyramid at t=n-1 vx_pyramid at t=n vx_image (RGB) vx_image (virtual U008) vx_node Color Convert keypoint array (x, y, ) at t=n-1 vx_delay of keypoints Copy of Data from Previous Iteration keep old copy Computed Data keypoint array (x, y, ) at t=n vx_array additional parameters Harris Corners vx_node Channel Extract vx_node vx_image (virtual IYUV) vx_graph Copyright Khronos Group Page 20

30 Copyright Khronos Group Page 21 Vision Functions in a Graph RGB -> YUV vxcolorconvertnode( graph, input_rgb_image, harris_yuv_image ); YUV -> Y VX_DF_IMAGE_RGB VX_DF_IMAGE_YUV vxchannelextractnode( graph, harris_yuv_image, VX_CHANNEL_Y, harris_gray_image ); Harris corner - strength_thresh : f - min_distance : 5.0f - sensitivity : 0.04f - gradient_size : 3 - block_size : 3 VX_DF_IMAGE_YUV VX_DF_IMAGE_U8 vxharriscornersnode( graph, harris_gray_image, strength_thresh, min_distance, sensitivity, gradient_size, block_size, keypoint_array_output, NULL );

31 Optical Flow Graph Input Image from CAMERA vx_context vx_delay of pyramids additional parameters Optical FlowPyrLK vx_node Gaussian Pyramid vx_node vx_image (RGB) vx_image (virtual U008) vx_node Color Convert vx_pyramid at t=n-1 vx_pyramid at t=n keypoint array (x, y, ) at t=n-1 vx_delay of keypoints Copy of Data from Previous Iteration keypoint array (x, y, ) at t=n vx_array keep old copy Computed Data Channel Extract vx_node vx_image (virtual IYUV) vx_graph Copyright Khronos Group Page 22

32 Copyright Khronos Group Page 23 Execute a Graph in Loop to Process Input Before executing Harris & Optical Flow Graphs - vxverifygraph API should return VX_SUCCESS (outside the loop) Inside the loop -- process each image from input video sequence - write pixels from input video into input RGB image - Execute Graphs using vxprocessgraph API - Execute Harris Graph for the 1st image from video sequence - Execute Optical Flow Graph from 2nd image onwards - Read previous and current keypoints and draw each item - Use vxgetreferencefromdelay API to get previous and current keypoint arrays - Flip the previous and current pyramid and keypoints in delay objects - Use vxagedelay API - This will automatically trigger flipping of previous and current pyramids in all the graphs After the processing loop - Query VX_GRAPH_ATTRIBUTE_PERFORMANCE for performance measurements - Release all objects -- make sure to release context at the end

33 Copyright Khronos Group Page 24 Summary OpenVX is a low-level programming framework domain to enable software developers to efficiently access computer vision hardware acceleration with both functional and performance portability. OpenVX contains: - a library of predefined and customizable vision functions - a graph-based execution model to combine function enabling both task and data-independent execution, and; - a set of memory objects that abstract the physical memory. OpenVX is defined as a C API - object-oriented design - synchronous and asynchronous execution model - extend functionality using enums and callbacks Useful Links: and github.com/rgiduthuri/openvx_tutorial

Tutorial Practice Session

Tutorial Practice Session Tutorial Practice Session Step 1: OpenVX Basic Copyright Khronos Group 2016 - Page 1 Step 1: Keypoint detection PETS09-S1-L1-View001.avi Capture OpenCV cv::videocapture cv_src_bgr cv::cvtcolor cv_src_rgb

More information

Tutorial Practice Session

Tutorial Practice Session Copyright Khronos Group 2016 - Page 1 Tutorial Practice Session Step 2: Graphs Copyright Khronos Group 2016 - Page 2 Why graphs? Most APIs (e.g., OpenCV) are based on function calls - a function abstracts

More information

OpenVX 1.2 Quick Reference Guide Page 1

OpenVX 1.2 Quick Reference Guide Page 1 OpenVX 1.2 Quick Reference Guide Page 1 OpenVX (Open Computer Vision Acceleration API) is a low-level programming framework domain to access computer vision hardware acceleration with both functional and

More information

The OpenVX User Data Object Extension

The OpenVX User Data Object Extension The OpenVX User Data Object Extension The Khronos OpenVX Working Group, Editor: Jesse Villarreal Version 1.0 (provisional), Wed, 13 Feb 2019 16:07:15 +0000 Table of Contents 1. Introduction.............................................................................................

More information

The OpenVX Classifier Extension. The Khronos OpenVX Working Group, Editors: Tomer Schwartz, Radhakrishna Giduthuri

The OpenVX Classifier Extension. The Khronos OpenVX Working Group, Editors: Tomer Schwartz, Radhakrishna Giduthuri The OpenVX Classifier Extension The Khronos OpenVX Working Group, Editors: Tomer Schwartz, Radhakrishna Giduthuri Version 1.2.1 (provisional), Wed, 15 Aug 2018 06:03:09 +0000 Table of Contents 1. Classifiers

More information

The OpenVX Specification

The OpenVX Specification The OpenVX Specification Version 1.2 Document Revision: dba1aa3 Generated on Wed Oct 11 2017 20:00:10 Khronos Vision Working Group Editor: Stephen Ramm Copyright 2016-2017 The Khronos Group Inc. Copyright

More information

The OpenVX Computer Vision and Neural Network Inference

The OpenVX Computer Vision and Neural Network Inference The OpenVX Computer and Neural Network Inference Standard for Portable, Efficient Code Radhakrishna Giduthuri Editor, OpenVX Khronos Group radha.giduthuri@amd.com @RadhaGiduthuri Copyright 2018 Khronos

More information

Standards for Vision Processing and Neural Networks

Standards for Vision Processing and Neural Networks Copyright Khronos Group 2017 - Page 1 Standards for Vision Processing and Neural Networks Radhakrishna Giduthuri, AMD radha.giduthuri@ieee.org Agenda Why we need a standard? Khronos NNEF Khronos OpenVX

More information

The OpenVX Export and Import Extension

The OpenVX Export and Import Extension The OpenVX Export and Import Extension The Khronos OpenVX Working Group, Editors: Steve Ramm, Radhakrishna Giduthuri Version 1.1.1, Wed, 15 Aug 2018 06:03:12 +0000 Table of Contents 1. Export and Import

More information

Standards for Neural Networks Acceleration and Deployment

Standards for Neural Networks Acceleration and Deployment Copyright Khronos Group 2017 - Page 1 Standards for Neural Networks Acceleration and Deployment Radhakrishna Giduthuri Advanced Micro Devices radha.giduthuri@ieee.org Agenda Why we need a standard? Khronos

More information

Accelerating Vision Processing

Accelerating Vision Processing Accelerating Vision Processing Neil Trevett Vice President Mobile Ecosystem at NVIDIA President of Khronos and Chair of the OpenCL Working Group SIGGRAPH, July 2016 Copyright Khronos Group 2016 - Page

More information

Update on Khronos Open Standard APIs for Vision Processing Neil Trevett Khronos President NVIDIA Vice President Mobile Ecosystem

Update on Khronos Open Standard APIs for Vision Processing Neil Trevett Khronos President NVIDIA Vice President Mobile Ecosystem Update on Khronos Open Standard APIs for Vision Processing Neil Trevett Khronos President NVIDIA Vice President Mobile Ecosystem Copyright Khronos Group 2015 - Page 1 Copyright Khronos Group 2015 - Page

More information

The OpenVX Buffer Aliasing Extension

The OpenVX Buffer Aliasing Extension The OpenVX Buffer Aliasing Extension The Khronos OpenVX Working Group, Editor: Jesse Villarreal Version 1.0 (provisional), Wed, 13 Feb 2019 16:07:12 +0000 Table of Contents 1. Introduction.............................................................................................

More information

The OpenVX [Provisional] Specification

The OpenVX [Provisional] Specification The OpenVX [Provisional] Specification Version 1.0 Document Revision: r26495 Generated on Fri May 2 2014 16:13:28 Khronos Vision Working Group Editor: Susheel Gautam Editor: Erik Rainey Copyright 2014

More information

Open API Standards for Mobile Graphics, Compute and Vision Processing GTC, March 2014

Open API Standards for Mobile Graphics, Compute and Vision Processing GTC, March 2014 Open API Standards for Mobile Graphics, Compute and Vision Processing GTC, March 2014 Neil Trevett Vice President Mobile Ecosystem, NVIDIA President Khronos Copyright Khronos Group 2014 - Page 1 Khronos

More information

Vision Acceleration. Launch Briefing October Neil Trevett Vice President Mobile Ecosystem, NVIDIA President, Khronos Group

Vision Acceleration. Launch Briefing October Neil Trevett Vice President Mobile Ecosystem, NVIDIA President, Khronos Group Copyright Khronos Group 2014 - Page 1 Vision Acceleration Launch Briefing October 2014 Neil Trevett Vice President Mobile Ecosystem, NVIDIA President, Khronos Group Copyright Khronos Group 2014 - Page

More information

Copyright Khronos Group Page 1. Vulkan Overview. June 2015

Copyright Khronos Group Page 1. Vulkan Overview. June 2015 Copyright Khronos Group 2015 - Page 1 Vulkan Overview June 2015 Copyright Khronos Group 2015 - Page 2 Khronos Connects Software to Silicon Open Consortium creating OPEN STANDARD APIs for hardware acceleration

More information

Silicon Acceleration APIs

Silicon Acceleration APIs Copyright Khronos Group 2016 - Page 1 Silicon Acceleration APIs Embedded Technology 2016, Yokohama Neil Trevett Vice President Developer Ecosystem, NVIDIA President, Khronos ntrevett@nvidia.com @neilt3d

More information

Next Generation OpenGL Neil Trevett Khronos President NVIDIA VP Mobile Copyright Khronos Group Page 1

Next Generation OpenGL Neil Trevett Khronos President NVIDIA VP Mobile Copyright Khronos Group Page 1 Next Generation OpenGL Neil Trevett Khronos President NVIDIA VP Mobile Ecosystem @neilt3d Copyright Khronos Group 2015 - Page 1 Copyright Khronos Group 2015 - Page 2 Khronos Connects Software to Silicon

More information

A framework for optimizing OpenVX Applications on Embedded Many Core Accelerators

A framework for optimizing OpenVX Applications on Embedded Many Core Accelerators A framework for optimizing OpenVX Applications on Embedded Many Core Accelerators Giuseppe Tagliavini, DEI University of Bologna Germain Haugou, IIS ETHZ Andrea Marongiu, DEI University of Bologna & IIS

More information

April 4-7, 2016 Silicon Valley VISIONWORKS A CUDA ACCELERATED COMPUTER VISION LIBRARY S6783. Elif Albuz, April 4, 2016

April 4-7, 2016 Silicon Valley VISIONWORKS A CUDA ACCELERATED COMPUTER VISION LIBRARY S6783. Elif Albuz, April 4, 2016 April 4-7, 2016 Silicon Valley VISIONWORKS A CUDA ACCELERATED COMPUTER VISION LIBRARY S6783 Elif Albuz, April 4, 2016 Motivation Introduction to VisionWorks AGENDA VisionWorks Software Stack VisionWorks

More information

The OpenVX XML Schema User Guide. The Khronos OpenVX Working Group, Editors: Jesse Villarreal

The OpenVX XML Schema User Guide. The Khronos OpenVX Working Group, Editors: Jesse Villarreal The OpenVX XML Schema User Guide The Khronos OpenVX Working Group, Editors: Jesse Villarreal Version 1.1 (provisional), Mon, 10 Dec 2018 23:43:03 +0000 Table of Contents 1. XML Schema Extension User Guide.........................................................

More information

Open Standards for AR and VR Neil Trevett Khronos President NVIDIA VP Developer January 2018

Open Standards for AR and VR Neil Trevett Khronos President NVIDIA VP Developer January 2018 Copyright Khronos Group 2018 - Page 1 Open Standards for AR and Neil Trevett Khronos President NVIDIA VP Developer Ecosystem ntrevett@nvidia.com @neilt3d January 2018 Khronos Mission E.g. OpenGL ES provides

More information

The OpenVX Neural Network Extension

The OpenVX Neural Network Extension The OpenVX Neural Network Extension Version 1.0 (Provisional) Document Revision: 7505566 Khronos Vision Working Group Editor: Schwartz Tomer Editor: Hagog Mostafa Copyright 2016 The Khronos Group Inc.

More information

The OpenVX Graph Pipelining, Streaming, and Batch Processing Extension to OpenVX 1.1 and 1.2

The OpenVX Graph Pipelining, Streaming, and Batch Processing Extension to OpenVX 1.1 and 1.2 The OpenVX Graph Pipelining, Streaming, and Batch Processing Extension to OpenVX 1.1 and 1.2 Version 1.0 (Provisional) Document Revision: 950f130 Generated on Mon Dec 11 2017 14:38:52 Khronos Vision Working

More information

Open Standard APIs for Augmented Reality

Open Standard APIs for Augmented Reality Copyright Khronos Group 2014 - Page 1 Open Standard APIs for Augmented Reality Neil Trevett Vice President Mobile Ecosystem, NVIDIA President, Khronos Group Copyright Khronos Group 2014 - Page 2 Khronos

More information

SIGGRAPH Briefing August 2014

SIGGRAPH Briefing August 2014 Copyright Khronos Group 2014 - Page 1 SIGGRAPH Briefing August 2014 Neil Trevett VP Mobile Ecosystem, NVIDIA President, Khronos Copyright Khronos Group 2014 - Page 2 Significant Khronos API Ecosystem Advances

More information

Copyright Khronos Group Page 1

Copyright Khronos Group Page 1 Open Standards and Open Source Together How Khronos APIs Accelerate Fast and Cool Applications Neil Trevett Khronos President NVIDIA Vice President Mobile Ecosystem Copyright Khronos Group 2015 - Page

More information

NVJPEG. DA _v0.2.0 October nvjpeg Libary Guide

NVJPEG. DA _v0.2.0 October nvjpeg Libary Guide NVJPEG DA-06762-001_v0.2.0 October 2018 Libary Guide TABLE OF CONTENTS Chapter 1. Introduction...1 Chapter 2. Using the Library... 3 2.1. Single Image Decoding... 3 2.3. Batched Image Decoding... 6 2.4.

More information

NVJPEG. DA _v0.1.4 August nvjpeg Libary Guide

NVJPEG. DA _v0.1.4 August nvjpeg Libary Guide NVJPEG DA-06762-001_v0.1.4 August 2018 Libary Guide TABLE OF CONTENTS Chapter 1. Introduction...1 Chapter 2. Using the Library... 3 2.1. Single Image Decoding... 3 2.3. Batched Image Decoding... 6 2.4.

More information

Navigating the Vision API Jungle: Which API Should You Use and Why? Embedded Vision Summit, May 2015

Navigating the Vision API Jungle: Which API Should You Use and Why? Embedded Vision Summit, May 2015 Copyright Khronos Group 2015 - Page 1 Navigating the Vision API Jungle: Which API Should You Use and Why? Embedded Vision Summit, May 2015 Neil Trevett Khronos President NVIDIA Vice President Mobile Ecosystem

More information

Open Standards for Vision and AI Peter McGuinness NNEF WG Chair CEO, Highwai, Inc May 2018

Open Standards for Vision and AI Peter McGuinness NNEF WG Chair CEO, Highwai, Inc May 2018 Copyright Khronos Group 2018 - Page 1 Open Standards for Vision and AI Peter McGuinness NNEF WG Chair CEO, Highwai, Inc peter.mcguinness@gobrach.com May 2018 Khronos Mission E.g. OpenGL ES provides 3D

More information

Ecosystem Overview Neil Trevett Khronos President NVIDIA Vice President Developer

Ecosystem Overview Neil Trevett Khronos President NVIDIA Vice President Developer Ecosystem Overview Neil Trevett Khronos President NVIDIA Vice President Developer Ecosystem ntrevett@nvidia.com @neilt3d Copyright Khronos Group 2016 - Page 1 Khronos Mission Software Silicon Khronos is

More information

TIOVX TI s OpenVX Implementation

TIOVX TI s OpenVX Implementation TIOVX TI s OpenVX Implementation Aish Dubey Product Marketing, Automotive Processors Embedded Vision Summit, 3 May 2017 1 TI SOC platform heterogeneous cores High level processing Object detection and

More information

Open Standard APIs for Embedded Vision Processing

Open Standard APIs for Embedded Vision Processing Copyright Khronos Group 2014 - Page 1 Open Standard APIs for Embedded Vision Processing Neil Trevett Vice President Mobile Ecosystem, NVIDIA President, Khronos Group Copyright Khronos Group 2014 - Page

More information

NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU

NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU GPGPU opens the door for co-design HPC, moreover middleware-support embedded system designs to harness the power of GPUaccelerated

More information

OpenCL Overview. Shanghai March Neil Trevett Vice President Mobile Content, NVIDIA President, The Khronos Group

OpenCL Overview. Shanghai March Neil Trevett Vice President Mobile Content, NVIDIA President, The Khronos Group Copyright Khronos Group, 2012 - Page 1 OpenCL Overview Shanghai March 2012 Neil Trevett Vice President Mobile Content, NVIDIA President, The Khronos Group Copyright Khronos Group, 2012 - Page 2 Processor

More information

Enabling a Richer Multimedia Experience with GPU Compute. Roberto Mijat Visual Computing Marketing Manager

Enabling a Richer Multimedia Experience with GPU Compute. Roberto Mijat Visual Computing Marketing Manager Enabling a Richer Multimedia Experience with GPU Compute Roberto Mijat Visual Computing Marketing Manager 1 What is GPU Compute Operating System and most application processing continue to reside on the

More information

Neural Network Exchange Format

Neural Network Exchange Format Copyright Khronos Group 2017 - Page 1 Neural Network Exchange Format Deploying Trained Networks to Inference Engines Viktor Gyenes, specification editor Copyright Khronos Group 2017 - Page 2 Outlook The

More information

Maximizing Face Detection Performance

Maximizing Face Detection Performance Maximizing Face Detection Performance Paulius Micikevicius Developer Technology Engineer, NVIDIA GTC 2015 1 Outline Very brief review of cascaded-classifiers Parallelization choices Reducing the amount

More information

Khronos Connects Software to Silicon

Khronos Connects Software to Silicon Press Pre-Briefing GDC 2015 Neil Trevett Khronos President NVIDIA Vice President Mobile Ecosystem All Materials Embargoed Until Tuesday 3 rd March, 12:01AM Pacific Time Copyright Khronos Group 2015 - Page

More information

OpenCL. Matt Sellitto Dana Schaa Northeastern University NUCAR

OpenCL. Matt Sellitto Dana Schaa Northeastern University NUCAR OpenCL Matt Sellitto Dana Schaa Northeastern University NUCAR OpenCL Architecture Parallel computing for heterogenous devices CPUs, GPUs, other processors (Cell, DSPs, etc) Portable accelerated code Defined

More information

Design guidelines for embedded real time face detection application

Design guidelines for embedded real time face detection application Design guidelines for embedded real time face detection application White paper for Embedded Vision Alliance By Eldad Melamed Much like the human visual system, embedded computer vision systems perform

More information

Using SYCL as an Implementation Framework for HPX.Compute

Using SYCL as an Implementation Framework for HPX.Compute Using SYCL as an Implementation Framework for HPX.Compute Marcin Copik 1 Hartmut Kaiser 2 1 RWTH Aachen University mcopik@gmail.com 2 Louisiana State University Center for Computation and Technology The

More information

Copyright Khronos Group, Page 1. OpenCL. GDC, March 2010

Copyright Khronos Group, Page 1. OpenCL. GDC, March 2010 Copyright Khronos Group, 2011 - Page 1 OpenCL GDC, March 2010 Authoring and accessibility Application Acceleration System Integration Copyright Khronos Group, 2011 - Page 2 Khronos Family of Standards

More information

HSA Foundation! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar Room (Bld 20)! 15 December, 2017!

HSA Foundation! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar Room (Bld 20)! 15 December, 2017! Advanced Topics on Heterogeneous System Architectures HSA Foundation! Politecnico di Milano! Seminar Room (Bld 20)! 15 December, 2017! Antonio R. Miele! Marco D. Santambrogio! Politecnico di Milano! 2

More information

Intel CoFluent Studio in Digital Imaging

Intel CoFluent Studio in Digital Imaging Intel CoFluent Studio in Digital Imaging Sensata Technologies Use Case Sensata Technologies www.sensatatechnologies.com Formerly Texas Instruments Sensors & Controls, Sensata Technologies is the world

More information

Copyright Khronos Group Page 1

Copyright Khronos Group Page 1 Update on Khronos Standards for Vision and Machine Learning December 2017 Neil Trevett Khronos President NVIDIA VP Developer Ecosystem ntrevett@nvidia.com @neilt3d www.khronos.org Copyright Khronos Group

More information

OpenCV on Zynq: Accelerating 4k60 Dense Optical Flow and Stereo Vision. Kamran Khan, Product Manager, Software Acceleration and Libraries July 2017

OpenCV on Zynq: Accelerating 4k60 Dense Optical Flow and Stereo Vision. Kamran Khan, Product Manager, Software Acceleration and Libraries July 2017 OpenCV on Zynq: Accelerating 4k60 Dense Optical Flow and Stereo Vision Kamran Khan, Product Manager, Software Acceleration and Libraries July 2017 Agenda Why Zynq SoCs for Traditional Computer Vision Automated

More information

Higher Level Programming Abstractions for FPGAs using OpenCL

Higher Level Programming Abstractions for FPGAs using OpenCL Higher Level Programming Abstractions for FPGAs using OpenCL Desh Singh Supervising Principal Engineer Altera Corporation Toronto Technology Center ! Technology scaling favors programmability CPUs."#/0$*12'$-*

More information

OpenCL The Open Standard for Heterogeneous Parallel Programming

OpenCL The Open Standard for Heterogeneous Parallel Programming OpenCL The Open Standard for Heterogeneous Parallel Programming March 2009 Copyright Khronos Group, 2009 - Page 1 Close-to-the-Silicon Standards Khronos creates Foundation-Level acceleration APIs - Needed

More information

Renderscript Accelerated Advanced Image and Video Processing on ARM Mali T-600 GPUs. Lihua Zhang, Ph.D. MulticoreWare Inc.

Renderscript Accelerated Advanced Image and Video Processing on ARM Mali T-600 GPUs. Lihua Zhang, Ph.D. MulticoreWare Inc. Renderscript Accelerated Advanced Image and Video Processing on ARM Mali T-600 GPUs Lihua Zhang, Ph.D. MulticoreWare Inc. lihua@multicorewareinc.com Overview More & more mobile apps are beginning to require

More information

CS553 Lecture Dynamic Optimizations 2

CS553 Lecture Dynamic Optimizations 2 Dynamic Optimizations Last time Predication and speculation Today Dynamic compilation CS553 Lecture Dynamic Optimizations 2 Motivation Limitations of static analysis Programs can have values and invariants

More information

Handheld Devices. Kari Pulli. Research Fellow, Nokia Research Center Palo Alto. Material from Jyrki Leskelä, Jarmo Nikula, Mika Salmela

Handheld Devices. Kari Pulli. Research Fellow, Nokia Research Center Palo Alto. Material from Jyrki Leskelä, Jarmo Nikula, Mika Salmela OpenCL in Handheld Devices Kari Pulli Research Fellow, Nokia Research Center Palo Alto Material from Jyrki Leskelä, Jarmo Nikula, Mika Salmela 1 OpenCL 1.0 Embedded Profile Enables OpenCL on mobile and

More information

Khronos Data Format Specification

Khronos Data Format Specification Copyright Khronos Group 2015 - Page 1 Khronos Data Format Specification 1.0 release, July 2015 Andrew Garrard Spec editor Senior Software Engineer, Samsung Electronics Copyright Khronos Group 2015 - Page

More information

Marek Szyprowski Samsung R&D Institute Poland

Marek Szyprowski Samsung R&D Institute Poland Marek Szyprowski m.szyprowski@samsung.com Samsung R&D Institute Poland Quick Introduction to Linux DRM A few words on atomic KMS API Exynos DRM IPP subsystem New API proposal Some code examples Summary

More information

Unix Device Memory. James Jones XDC 2016

Unix Device Memory. James Jones XDC 2016 Unix Device Memory James Jones XDC 2016 Background Started with a Weston patch proposal Many strong views Much time invested in current software and APIs Thank you for keeping discussions civil Many areas

More information

Copyright Khronos Group Page 1

Copyright Khronos Group Page 1 Gaming Market Briefing Overview of APIs GDC March 2016 Neil Trevett Khronos President NVIDIA Vice President Developer Ecosystem ntrevett@nvidia.com @neilt3d Copyright Khronos Group 2016 - Page 1 Copyright

More information

Scale Invariant Feature Transform

Scale Invariant Feature Transform Scale Invariant Feature Transform Why do we care about matching features? Camera calibration Stereo Tracking/SFM Image moiaicing Object/activity Recognition Objection representation and recognition Image

More information

Neil Trevett Vice President, NVIDIA OpenCL Chair Khronos President. Copyright Khronos Group, Page 1

Neil Trevett Vice President, NVIDIA OpenCL Chair Khronos President. Copyright Khronos Group, Page 1 Neil Trevett Vice President, NVIDIA OpenCL Chair Khronos President Copyright Khronos Group, 2009 - Page 1 Introduction and aims of OpenCL - Neil Trevett, NVIDIA OpenCL Specification walkthrough - Mike

More information

OpenCL Press Conference

OpenCL Press Conference Copyright Khronos Group, 2011 - Page 1 OpenCL Press Conference Tokyo, November 2011 Neil Trevett Vice President Mobile Content, NVIDIA President, The Khronos Group Copyright Khronos Group, 2011 - Page

More information

Use cases. Faces tagging in photo and video, enabling: sharing media editing automatic media mashuping entertaining Augmented reality Games

Use cases. Faces tagging in photo and video, enabling: sharing media editing automatic media mashuping entertaining Augmented reality Games Viewdle Inc. 1 Use cases Faces tagging in photo and video, enabling: sharing media editing automatic media mashuping entertaining Augmented reality Games 2 Why OpenCL matter? OpenCL is going to bring such

More information

WebGL Meetup GDC Copyright Khronos Group, Page 1

WebGL Meetup GDC Copyright Khronos Group, Page 1 WebGL Meetup GDC 2012 Copyright Khronos Group, 2012 - Page 1 Copyright Khronos Group, 2012 - Page 2 Khronos API Ecosystem Trends Neil Trevett Vice President Mobile Content, NVIDIA President, The Khronos

More information

Copyright Khronos Group Page 1. OpenCL BOF SIGGRAPH 2013

Copyright Khronos Group Page 1. OpenCL BOF SIGGRAPH 2013 Copyright Khronos Group 2013 - Page 1 OpenCL BOF SIGGRAPH 2013 Copyright Khronos Group 2013 - Page 2 OpenCL Roadmap OpenCL-HLM (High Level Model) High-level programming model, unifying host and device

More information

EECS 487: Interactive Computer Graphics

EECS 487: Interactive Computer Graphics EECS 487: Interactive Computer Graphics Lecture 21: Overview of Low-level Graphics API Metal, Direct3D 12, Vulkan Console Games Why do games look and perform so much better on consoles than on PCs with

More information

High Quality Real Time Image Processing Framework on Mobile Platforms using Tegra K1. Eyal Hirsch

High Quality Real Time Image Processing Framework on Mobile Platforms using Tegra K1. Eyal Hirsch High Quality Real Time Image Processing Framework on Mobile Platforms using Tegra K1 Eyal Hirsch Established in 2009 and headquartered in Israel SagivTech Snapshot Core domain expertise: GPU Computing

More information

Image Contrast Adjustment using Nvidia Performance Primitives (NPP) Yang Song

Image Contrast Adjustment using Nvidia Performance Primitives (NPP) Yang Song Image Contrast Adjustment using Nvidia Performance Primitives (NPP) Yang Song Outline NPP Introduction Problem Statement Solution and Hands-on Coding Epilogue What is NPP? A library of image, signal and

More information

Applying OpenCL. IWOCL, May Andrew Richards

Applying OpenCL. IWOCL, May Andrew Richards Applying OpenCL IWOCL, May 2017 Andrew Richards The next generation of software will not be built on CPUs 2 On a 100 millimetre-squared chip, Google needs something like 50 teraflops of performance - Daniel

More information

Scale Invariant Feature Transform

Scale Invariant Feature Transform Why do we care about matching features? Scale Invariant Feature Transform Camera calibration Stereo Tracking/SFM Image moiaicing Object/activity Recognition Objection representation and recognition Automatic

More information

AR Standards Update Austin, March 2012

AR Standards Update Austin, March 2012 AR Standards Update Austin, March 2012 Neil Trevett President, The Khronos Group Vice President Mobile Content, NVIDIA Copyright Khronos Group, 2012 - Page 1 Topics Very brief overview of Khronos Update

More information

OpenCL Overview Benedict R. Gaster, AMD

OpenCL Overview Benedict R. Gaster, AMD Copyright Khronos Group, 2011 - Page 1 OpenCL Overview Benedict R. Gaster, AMD March 2010 The BIG Idea behind OpenCL OpenCL execution model - Define N-dimensional computation domain - Execute a kernel

More information

Stream Computing using Brook+

Stream Computing using Brook+ Stream Computing using Brook+ School of Electrical Engineering and Computer Science University of Central Florida Slides courtesy of P. Bhaniramka Outline Overview of Brook+ Brook+ Software Architecture

More information

Vulkan Launch Webinar 18 th February Copyright Khronos Group Page 1

Vulkan Launch Webinar 18 th February Copyright Khronos Group Page 1 Vulkan Launch Webinar 18 th February 2016 Copyright Khronos Group 2016 - Page 1 Copyright Khronos Group 2016 - Page 2 The Vulkan Launch Webinar Is About to Start! Kathleen Mattson - Webinar MC, Khronos

More information

INTEGRATING COMPUTER VISION SENSOR INNOVATIONS INTO MOBILE DEVICES. Eli Savransky Principal Architect - CTO Office Mobile BU NVIDIA corp.

INTEGRATING COMPUTER VISION SENSOR INNOVATIONS INTO MOBILE DEVICES. Eli Savransky Principal Architect - CTO Office Mobile BU NVIDIA corp. INTEGRATING COMPUTER VISION SENSOR INNOVATIONS INTO MOBILE DEVICES Eli Savransky Principal Architect - CTO Office Mobile BU NVIDIA corp. Computer Vision in Mobile Tegra K1 It s time! AGENDA Use cases categories

More information

B. Tech. Project Second Stage Report on

B. Tech. Project Second Stage Report on B. Tech. Project Second Stage Report on GPU Based Active Contours Submitted by Sumit Shekhar (05007028) Under the guidance of Prof Subhasis Chaudhuri Table of Contents 1. Introduction... 1 1.1 Graphic

More information

AFOSR BRI: Codifying and Applying a Methodology for Manual Co-Design and Developing an Accelerated CFD Library

AFOSR BRI: Codifying and Applying a Methodology for Manual Co-Design and Developing an Accelerated CFD Library AFOSR BRI: Codifying and Applying a Methodology for Manual Co-Design and Developing an Accelerated CFD Library Synergy@VT Collaborators: Paul Sathre, Sriram Chivukula, Kaixi Hou, Tom Scogland, Harold Trease,

More information

OpenACC 2.6 Proposed Features

OpenACC 2.6 Proposed Features OpenACC 2.6 Proposed Features OpenACC.org June, 2017 1 Introduction This document summarizes features and changes being proposed for the next version of the OpenACC Application Programming Interface, tentatively

More information

Computer Vision. Exercise 3 Panorama Stitching 09/12/2013. Compute Vision : Exercise 3 Panorama Stitching

Computer Vision. Exercise 3 Panorama Stitching 09/12/2013. Compute Vision : Exercise 3 Panorama Stitching Computer Vision Exercise 3 Panorama Stitching 09/12/2013 Compute Vision : Exercise 3 Panorama Stitching The task Compute Vision : Exercise 3 Panorama Stitching 09/12/2013 2 Pipeline Compute Vision : Exercise

More information

Automatic Pruning of Autotuning Parameter Space for OpenCL Applications

Automatic Pruning of Autotuning Parameter Space for OpenCL Applications Automatic Pruning of Autotuning Parameter Space for OpenCL Applications Ahmet Erdem, Gianluca Palermo 6, and Cristina Silvano 6 Department of Electronics, Information and Bioengineering Politecnico di

More information

IMPORTANT QUESTIONS IN C FOR THE INTERVIEW

IMPORTANT QUESTIONS IN C FOR THE INTERVIEW IMPORTANT QUESTIONS IN C FOR THE INTERVIEW 1. What is a header file? Header file is a simple text file which contains prototypes of all in-built functions, predefined variables and symbolic constants.

More information

Standards Update. Copyright Khronos Group Page 1

Standards Update. Copyright Khronos Group Page 1 Standards Update VR/AR, 3D, Web, Vision and Deep Learning Neil Trevett Khronos President NVIDIA VP Developer Ecosystem ntrevett@nvidia.com @neilt3d www.khronos.org Copyright Khronos Group 2017 - Page 1

More information

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller Entertainment Graphics: Virtual Realism for the Masses CSE 591: GPU Programming Introduction Computer games need to have: realistic appearance of characters and objects believable and creative shading,

More information

Fractal: A Software Toolchain for Mapping Applications to Diverse, Heterogeneous Architecures

Fractal: A Software Toolchain for Mapping Applications to Diverse, Heterogeneous Architecures Fractal: A Software Toolchain for Mapping Applications to Diverse, Heterogeneous Architecures University of Virginia Dept. of Computer Science Technical Report #CS-2011-09 Jeremy W. Sheaffer and Kevin

More information

Introduction to Multicore Programming

Introduction to Multicore Programming Introduction to Multicore Programming Minsoo Ryu Department of Computer Science and Engineering 2 1 Multithreaded Programming 2 Synchronization 3 Automatic Parallelization and OpenMP 4 GPGPU 5 Q& A 2 Multithreaded

More information

#include <tobii/tobii.h> char const* tobii_error_message( tobii_error_t error );

#include <tobii/tobii.h> char const* tobii_error_message( tobii_error_t error ); tobii.h Thread safety The tobii.h header file collects the core API functions of stream engine. It contains functions to initialize the API and establish a connection to a tracker, as well as enumerating

More information

08 - Address Generator Unit (AGU)

08 - Address Generator Unit (AGU) October 2, 2014 Todays lecture Memory subsystem Address Generator Unit (AGU) Schedule change A new lecture has been entered into the schedule (to compensate for the lost lecture last week) Memory subsystem

More information

CLICK TO EDIT MASTER TITLE STYLE. Click to edit Master text styles. Second level Third level Fourth level Fifth level

CLICK TO EDIT MASTER TITLE STYLE. Click to edit Master text styles. Second level Third level Fourth level Fifth level CLICK TO EDIT MASTER TITLE STYLE Second level THE HETEROGENEOUS SYSTEM ARCHITECTURE ITS (NOT) ALL ABOUT THE GPU PAUL BLINZER, FELLOW, HSA SYSTEM SOFTWARE, AMD SYSTEM ARCHITECTURE WORKGROUP CHAIR, HSA FOUNDATION

More information

Open Standards for Building Virtual and Augmented Realities. Neil Trevett Khronos President NVIDIA VP Developer Ecosystems

Open Standards for Building Virtual and Augmented Realities. Neil Trevett Khronos President NVIDIA VP Developer Ecosystems Open Standards for Building Virtual and Augmented Realities Neil Trevett Khronos President NVIDIA VP Developer Ecosystems Khronos Mission Asian Members Software Silicon Khronos is an International Industry

More information

HSA foundation! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar Room A. Alario! 23 November, 2015!

HSA foundation! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar Room A. Alario! 23 November, 2015! Advanced Topics on Heterogeneous System Architectures HSA foundation! Politecnico di Milano! Seminar Room A. Alario! 23 November, 2015! Antonio R. Miele! Marco D. Santambrogio! Politecnico di Milano! 2

More information

CISC 1600, Lab 3.1: Processing

CISC 1600, Lab 3.1: Processing CISC 1600, Lab 3.1: Processing Prof Michael Mandel 1 Getting set up For this lab, we will be using OpenProcessing, a site for building processing sketches online using processing.js. 1.1. Go to https://www.openprocessing.org/class/57767/

More information

CUDA Toolkit CUPTI User's Guide. DA _v01 September 2012

CUDA Toolkit CUPTI User's Guide. DA _v01 September 2012 CUDA Toolkit CUPTI User's Guide DA-05679-001_v01 September 2012 Document Change History Ver Date Resp Reason for change v01 2011/1/19 DG Initial revision for CUDA Tools SDK 4.0 v02 2012/1/5 DG Revisions

More information

The Path to Embedded Vision & AI using a Low Power Vision DSP. Yair Siegel, Director of Segment Marketing Hotchips August 2016

The Path to Embedded Vision & AI using a Low Power Vision DSP. Yair Siegel, Director of Segment Marketing Hotchips August 2016 The Path to Embedded Vision & AI using a Low Power Vision DSP Yair Siegel, Director of Segment Marketing Hotchips August 2016 Presentation Outline Introduction The Need for Embedded Vision & AI Vision

More information

Vulkan: Architecture positive How Vulkan maps to PowerVR GPUs Kevin sun Lead Developer Support Engineer, APAC PowerVR Graphics.

Vulkan: Architecture positive How Vulkan maps to PowerVR GPUs Kevin sun Lead Developer Support Engineer, APAC PowerVR Graphics. Vulkan: Architecture positive How Vulkan maps to PowerVR GPUs Kevin sun Lead Developer Support Engineer, APAC PowerVR Graphics www.imgtec.com Introduction Who am I? Kevin Sun Working at Imagination Technologies

More information

Copyright Khronos Group 2012 Page 1. OpenCL 1.2. August 2012

Copyright Khronos Group 2012 Page 1. OpenCL 1.2. August 2012 Copyright Khronos Group 2012 Page 1 OpenCL 1.2 August 2012 Copyright Khronos Group 2012 Page 2 Khronos - Connecting Software to Silicon Khronos defines open, royalty-free standards to access graphics,

More information

Overview and AR/VR Roadmap

Overview and AR/VR Roadmap Khronos Group Inc. 2018 - Page 1 Overview and AR/ Roadmap Neil Trevett Khronos President NVIDIA VP Developer Ecosystems ntrevett@nvidia.com @neilt3d Khronos Group Inc. 2018 - Page 2 Khronos Connects Software

More information

Vulkan 1.1 March Copyright Khronos Group Page 1

Vulkan 1.1 March Copyright Khronos Group Page 1 Vulkan 1.1 March 2018 Copyright Khronos Group 2018 - Page 1 Vulkan 1.1 Launch and Ongoing Momentum Strengthening the Ecosystem Improved developer tools (SDK, validation/debug layers) More rigorous conformance

More information

SIMULATOR AMD RESEARCH JUNE 14, 2015

SIMULATOR AMD RESEARCH JUNE 14, 2015 AMD'S gem5apu SIMULATOR AMD RESEARCH JUNE 14, 2015 OVERVIEW Introducing AMD s gem5 APU Simulator Extends gem5 with a GPU timing model Supports Heterogeneous System Architecture in SE mode Includes several

More information

INTRODUCTION TO OPENCL TM A Beginner s Tutorial. Udeepta Bordoloi AMD

INTRODUCTION TO OPENCL TM A Beginner s Tutorial. Udeepta Bordoloi AMD INTRODUCTION TO OPENCL TM A Beginner s Tutorial Udeepta Bordoloi AMD IT S A HETEROGENEOUS WORLD Heterogeneous computing The new normal CPU Many CPU s 2, 4, 8, Very many GPU processing elements 100 s Different

More information

Addressing the Memory Wall

Addressing the Memory Wall Lecture 26: Addressing the Memory Wall Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2015 Tunes Cage the Elephant Back Against the Wall (Cage the Elephant) This song is for the

More information

VSIPL Fundamentals. Naming Data Types Blocks & Views. Randall Judd SSC-SD VSIPL Forum 1

VSIPL Fundamentals. Naming Data Types Blocks & Views. Randall Judd SSC-SD VSIPL Forum 1 VSIPL Fundamentals Naming Data Types Blocks & Views Randall Judd SSC-SD 619 553 3086 judd@spawar.navy.mil 1 VSIPL Names VSIPL is an ANSI C API for signal processing. ANSI C does not support function overloading.

More information