Async Workgroup Update. Barthold Lichtenbelt

Goals

Provide a synchronization framework for OpenGL
- Provide base functionality as defined in NV_fence and GL2_async_core
- Build a framework for future, more complex functionality, some of which is discussed in GL2_async_core
- Initially support CPU <-> GPU synchronization
- Support synchronization across multiple OpenGL contexts
Resulted in the GL_ARB_sync spec
- Finished April
- Posted draft to opengl.org for feedback
- Not quite an official ARB extension yet

Functionality overview

ARB_sync provides synchronization primitives
- Can be tested, set and waited upon
Specifically, a Fence Synchronization Object and corresponding Fence command
Fence completion allows for a partial glFinish
- All commands prior to the fence are forced to complete before control is returned to the caller
Fence Sync Objects can be shared across contexts
- Allows for synchronization of OpenGL command streams across contexts
New data type: GLtime represents intervals in nanoseconds
- 64-bit integer, same encoding as the UST counter in OpenML
- Accuracy is implementation dependent, precision is in nanoseconds
If you have used the Windows Event model, this will feel familiar
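To make the nanosecond encoding concrete, here is a minimal sketch that models GLtime as a 64-bit integer count of nanoseconds. The type name `gltime_model` and the helper functions are hypothetical and not part of any extension; treating the value as unsigned is also an assumption, since the slide only says "64 bit integer".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GLtime: a 64-bit count of nanoseconds.
 * Unsignedness is an assumption made for this sketch. */
typedef uint64_t gltime_model;

#define NS_PER_MS UINT64_C(1000000)
#define NS_PER_S  UINT64_C(1000000000)

/* Convert whole milliseconds to nanoseconds, guarding against overflow. */
gltime_model gltime_from_ms(uint64_t ms) {
    assert(ms <= UINT64_MAX / NS_PER_MS);
    return ms * NS_PER_MS;
}

/* Convert whole seconds to nanoseconds, guarding against overflow. */
gltime_model gltime_from_s(uint64_t s) {
    assert(s <= UINT64_MAX / NS_PER_S);
    return s * NS_PER_S;
}

/* Truncate a nanosecond interval to whole milliseconds. */
uint64_t gltime_to_whole_ms(gltime_model t) {
    return t / NS_PER_MS;
}
```

A five-second interval already exceeds the range of a 32-bit integer when expressed in nanoseconds, which is one practical reason for the 64-bit width.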

Synchronization model in ARB_sync 1/2

A sync object is a primitive used for synchronization between CPU and GPU, CPU, or something else.
- Sync object has state: type, condition, status
A sync object's status can be signaled or non-signaled
- When created, status is signaled unless a flag is set, in which case it is non-signaled
A fence sync object is a specific type of sync object
- Provides partial finish semantics
- Only type of sync object currently defined
A fence is a token inserted in the GL command stream
- A sync object is not inserted into the command stream
- Fence has no state
A fence is associated with a fence sync object.
- Multiple fences can be associated with the same sync object
When a fence is inserted in the command stream, the status of its sync object is set to non-signaled
A fence, once completed, will set the status of its sync object to signaled

Synchronization model in ARB_sync 2/2

A wait function waits on a sync object, not on a fence
A poll function polls a sync object, not a fence
A wait function called on a sync object in the non-signaled state will block. It unblocks when the sync object transitions to the signaled state.
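The state transitions in this model can be sketched as a tiny state machine. This is an illustrative model only, not driver code or the extension's API: every name below (`model_sync`, `model_fence_insert`, and so on) is hypothetical. It captures three points from the slides: the sync object holds the status, inserting a fence sets it to non-signaled, and a completed fence sets it to signaled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { MODEL_UNSIGNALED = 0, MODEL_SIGNALED = 1 } model_status;

/* Hypothetical model of a fence sync object. The fence itself is just a
 * token in the command stream and carries no state of its own. */
typedef struct {
    model_status status;
} model_sync;

/* Created signaled unless the caller asks for the non-signaled start. */
void model_sync_init(model_sync *s, bool start_unsignaled) {
    assert(s != NULL);
    s->status = start_unsignaled ? MODEL_UNSIGNALED : MODEL_SIGNALED;
}

/* Inserting a fence into the command stream un-signals its sync object. */
void model_fence_insert(model_sync *s) {
    assert(s != NULL);
    s->status = MODEL_UNSIGNALED;
}

/* When the fence completes, its sync object becomes signaled. */
void model_fence_complete(model_sync *s) {
    assert(s != NULL);
    s->status = MODEL_SIGNALED;
}

/* Polling inspects the sync object, never the fence. */
bool model_sync_poll(const model_sync *s) {
    assert(s != NULL);
    return s->status == MODEL_SIGNALED;
}
```

Keeping all state on the sync object is what allows several fences to share one object: each fence only triggers a transition on it.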

Example: render to texture (RTT) with two contexts

Context A
    sync_objectA = glCreateSync(attrib);
    <render to texture that context B needs>
    glFence(sync_objectA);
    glFlush(); // prevent deadlock

Context B
    glClientWaitSync(sync_objectA, 0, GL_FOREVER);
    glBindTexture(.); // Just rendered
    <render using texture>
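Why the flush in context A prevents a deadlock can be shown with a small single-threaded simulation. This is a sketch under stated assumptions rather than real OpenGL: the `sim_*` names are hypothetical, commands are assumed to sit in a client-side buffer until flushed, and the "GPU" only executes what has been submitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_MAX_CMDS 8

typedef enum { SIM_CMD_RENDER, SIM_CMD_FENCE } sim_cmd_kind;
typedef struct { bool signaled; } sim_sync;
typedef struct { sim_cmd_kind kind; sim_sync *sync; } sim_cmd;

/* Hypothetical context: commands wait in `buffered` until flushed. */
typedef struct {
    sim_cmd buffered[SIM_MAX_CMDS];
    size_t  n_buffered;
    sim_cmd submitted[SIM_MAX_CMDS];
    size_t  n_submitted;
    int     texture_version;   /* bumped each time a render executes */
} sim_context;

static void sim_push(sim_context *c, sim_cmd cmd) {
    assert(c->n_buffered < SIM_MAX_CMDS);
    c->buffered[c->n_buffered++] = cmd;
}

/* Queue a render-to-texture command. */
void sim_render(sim_context *c) {
    sim_cmd cmd = { SIM_CMD_RENDER, NULL };
    sim_push(c, cmd);
}

/* Queue a fence; its sync object becomes non-signaled immediately. */
void sim_fence(sim_context *c, sim_sync *s) {
    sim_cmd cmd = { SIM_CMD_FENCE, s };
    s->signaled = false;
    sim_push(c, cmd);
}

/* Hand all buffered commands to the GPU queue. */
void sim_flush(sim_context *c) {
    assert(c->n_submitted + c->n_buffered <= SIM_MAX_CMDS);
    for (size_t i = 0; i < c->n_buffered; i++)
        c->submitted[c->n_submitted++] = c->buffered[i];
    c->n_buffered = 0;
}

/* Execute everything submitted so far, in order. */
void sim_gpu_run(sim_context *c) {
    for (size_t i = 0; i < c->n_submitted; i++) {
        if (c->submitted[i].kind == SIM_CMD_RENDER)
            c->texture_version++;
        else
            c->submitted[i].sync->signaled = true;
    }
    c->n_submitted = 0;
}
```

In this model, running the GPU before the flush leaves the sync object non-signaled, so a waiter in another context would block indefinitely. After the flush, the render and the fence execute in order and the waiter can proceed.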

OS specific functionality

Convert sync object to the window system native event primitive
- Allows applications to synchronize all events in a system using one API
All operations on <sync> are reflected in the OS event and vice versa
Both <sync> and the OS event are valid to use in your code
On Windows, convert to an Event
    HANDLE wglConvertSyncToEvent(object sync);
- Need to specify, when the sync object is created, that it can be converted to an OS event
- Separate extension: WGL_ARB_sync_event
On Unix, convert to a file descriptor, X event or semaphore?
- Still TBD

Possible future functionality

Add a WaitForMultipleSync(uint *sync_objects,.) command
- Synchronize with multiple sync objects at once
Add a payload to a fence
- For example, the time it completed
Allow one GPU stream to wait for another GPU stream
- WaitSync(sync_object);
A sync object whose status will pulse with every vblank
A sync object that can signal when data binding has completed
- As opposed to when rendering has completed using the data

Example: Streaming video processing

Loop
    Draw frame 1                 // To a FBO, for example
    glFence(sync_object1);       // Inserts a fence in the command stream
    Draw frame 2
    glFence(sync_object2);
    Read back data in frame 1
    Read back data in frame 2

Variation with asynchronous read back

Loop
    Draw frame 1                    // To a FBO, for example
    Read back frame 1 into PBO 1    // Asynchronous readback
    glFence(sync_object1);          // Inserts a fence in the command stream
    Draw frame 2
    Read back frame 2 into PBO 2
    glFence(sync_object2);
    glMapBuffer( );                 // Access the data of frame 1 in PBO 1
    glMapBuffer( );                 // Access the data of frame 2 in PBO 2

Differences with GL_NV_fence

No separation of sync objects and fences in NV_fence
- NV version only has fence objects
- Fence object has state
Creation of sync object and inserting a fence in one command
- SetFenceNV creates and inserts a fence (old object model)
NV fence objects are not shared across contexts

API Overview 1/2

Create a sync attribute object
    object CreateSyncAttrib();
- SYNC_TYPE has to be FENCE
- SYNC_CONDITION has to be SYNC_PRIOR_COMMANDS_COMPLETE
- SYNC_STATUS is SIGNALED or UNSIGNALED
Create the sync object
    object CreateSync(object attrib);
Insert a fence, associated with a sync object, into the command stream
    void Fence(object sync);

API Overview 2/2

Wait or test the status of a fence sync object
    enum ClientWaitSync(object sync, uint flags, time timeout);
- Blocks until sync is signaled or the timeout has expired
- If timeout == 0, does not block, returns the status of sync
- If timeout == FOREVER, the call does not time out
- Optionally will flush before blocking
- Returns one of three values: ALREADY_SIGNALED, TIMEOUT_EXPIRED, CONDITION_SATISFIED
Signal or unsignal a sync object
    void SignalSync(object sync, enum mode);
- If status transitions from unsignaled to signaled, ClientWaitSync will unblock
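The three return values of ClientWaitSync can be illustrated with a deterministic model that uses a simulated clock instead of real blocking. All names here are hypothetical. Three behaviors are assumptions of this sketch and are not stated on the slide: FOREVER is encoded as the largest 64-bit value, a zero timeout on a non-signaled sync reports TIMEOUT_EXPIRED, and a signal arriving exactly at the deadline counts as satisfied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    WAIT_ALREADY_SIGNALED,
    WAIT_TIMEOUT_EXPIRED,
    WAIT_CONDITION_SATISFIED
} wait_result;

/* Assumption: FOREVER is modeled as the maximum 64-bit timeout. */
#define WAIT_FOREVER UINT64_MAX

/* Hypothetical sync whose fence completes at a known simulated time. */
typedef struct {
    uint64_t signal_at_ns;
} timed_sync;

/* Model of a client-side wait. `now_ns` is the simulated clock; it is
 * advanced by however long the caller would have been blocked. */
wait_result model_client_wait(const timed_sync *s, uint64_t *now_ns,
                              uint64_t timeout_ns) {
    assert(s != NULL && now_ns != NULL);

    /* Signaled before the call: return immediately. */
    if (s->signal_at_ns <= *now_ns)
        return WAIT_ALREADY_SIGNALED;

    uint64_t remaining = s->signal_at_ns - *now_ns;
    if (timeout_ns == WAIT_FOREVER || remaining <= timeout_ns) {
        /* The fence completes while we are blocked. */
        *now_ns = s->signal_at_ns;
        return WAIT_CONDITION_SATISFIED;
    }

    /* Timeout runs out first; a zero timeout lands here without blocking. */
    *now_ns += timeout_ns;
    return WAIT_TIMEOUT_EXPIRED;
}
```

A zero timeout therefore acts as a poll: it never advances the clock and only distinguishes "already signaled" from "not yet".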


Sync Points in the Intel Gfx Driver. Jesse Barnes Intel Open Source Technology Center

Sync Points in the Intel Gfx Driver. Jesse Barnes Intel Open Source Technology Center Sync Points in the Intel Gfx Driver Jesse Barnes Intel Open Source Technology Center 1 Agenda History and other implementations Other I/O layers - block device ordering NV_fence, ARB_sync EGL_native_fence_sync,

More information

GDC 2014 Barthold Lichtenbelt OpenGL ARB chair

GDC 2014 Barthold Lichtenbelt OpenGL ARB chair GDC 2014 Barthold Lichtenbelt OpenGL ARB chair Agenda OpenGL 4.4, news and updates - Barthold Lichtenbelt, NVIDIA Low Overhead Rendering with OpenGL - Cass Everitt, NVIDIA Copyright Khronos Group, 2010

More information

OpenGL Status - November 2013 G-Truc Creation

OpenGL Status - November 2013 G-Truc Creation OpenGL Status - November 2013 G-Truc Creation Vendor NVIDIA AMD Intel Windows Apple Release date 02/10/2013 08/11/2013 30/08/2013 22/10/2013 Drivers version 331.10 beta 13.11 beta 9.2 10.18.10.3325 MacOS

More information

OpenGL BOF Siggraph 2011

OpenGL BOF Siggraph 2011 OpenGL BOF Siggraph 2011 OpenGL BOF Agenda OpenGL 4 update Barthold Lichtenbelt, NVIDIA OpenGL Shading Language Hints/Kinks Bill Licea-Kane, AMD Ecosystem update Jon Leech, Khronos Viewperf 12, a new beginning

More information

Kenneth Dyke Sr. Engineer, Graphics and Compute Architecture

Kenneth Dyke Sr. Engineer, Graphics and Compute Architecture Kenneth Dyke Sr. Engineer, Graphics and Compute Architecture 2 Supporting multiple GPUs in your application Finding all renderers and devices Responding to renderer changes Making use of multiple GPUs

More information

Vulkan (including Vulkan Fast Paths)

Vulkan (including Vulkan Fast Paths) Vulkan (including Vulkan Fast Paths) Łukasz Migas Software Development Engineer WS Graphics Let s talk about OpenGL (a bit) History 1.0-1992 1.3-2001 multitexturing 1.5-2003 vertex buffer object 2.0-2004

More information

SLICING THE WORKLOAD MULTI-GPU OPENGL RENDERING APPROACHES

SLICING THE WORKLOAD MULTI-GPU OPENGL RENDERING APPROACHES SLICING THE WORKLOAD MULTI-GPU OPENGL RENDERING APPROACHES INGO ESSER NVIDIA DEVTECH PROVIZ OVERVIEW Motivation Tools of the trade Multi-GPU driver functions Multi-GPU programming functions Multi threaded

More information

Discussion Week 8. TA: Kyle Dewey. Tuesday, November 15, 11

Discussion Week 8. TA: Kyle Dewey. Tuesday, November 15, 11 Discussion Week 8 TA: Kyle Dewey Overview Exams Interrupt priority Direct memory access (DMA) Different kinds of I/O calls Caching What I/O looks like Exams Interrupt Priority Process 1 makes an I/O request

More information

GeForce3 OpenGL Performance. John Spitzer

GeForce3 OpenGL Performance. John Spitzer GeForce3 OpenGL Performance John Spitzer GeForce3 OpenGL Performance John Spitzer Manager, OpenGL Applications Engineering jspitzer@nvidia.com Possible Performance Bottlenecks They mirror the OpenGL pipeline

More information

User Guide. TexturePerformancePBO Demo

User Guide. TexturePerformancePBO Demo User Guide TexturePerformancePBO Demo The TexturePerformancePBO Demo serves two purposes: 1. It allows developers to experiment with various combinations of texture transfer methods for texture upload

More information

December 11, 2001 Copyright 3Dlabs, Page 1

December 11, 2001 Copyright 3Dlabs, Page 1 Status Update December 11, 2001 Copyright 3Dlabs, 2001 - Page 1 OpenGL 2.0 Progress Update White Papers Followed timeline established at September ARB meeting Distributed to identified reviewers in mid-october

More information

Threaded OpenGL API Dispatch. Alexander Monakov. Institute for System Programming of Russian Academy of Sciences

Threaded OpenGL API Dispatch. Alexander Monakov. Institute for System Programming of Russian Academy of Sciences tangl and mangl Threaded OpenGL API Dispatch Alexander Monakov amonakov@ispras.ru Institute for System Programming of Russian Academy of Sciences X.Org Developers Conference, October 10, 2014 1 / 25 Talking

More information

Programmable Graphics Hardware

Programmable Graphics Hardware Programmable Graphics Hardware Outline 2/ 49 A brief Introduction into Programmable Graphics Hardware Hardware Graphics Pipeline Shading Languages Tools GPGPU Resources Hardware Graphics Pipeline 3/ 49

More information

Copyright Khronos Group, Page 1. OpenCL. GDC, March 2010

Copyright Khronos Group, Page 1. OpenCL. GDC, March 2010 Copyright Khronos Group, 2011 - Page 1 OpenCL GDC, March 2010 Authoring and accessibility Application Acceleration System Integration Copyright Khronos Group, 2011 - Page 2 Khronos Family of Standards

More information

Remote Invocation. Today. Next time. l Overlay networks and P2P. l Request-reply, RPC, RMI

Remote Invocation. Today. Next time. l Overlay networks and P2P. l Request-reply, RPC, RMI Remote Invocation Today l Request-reply, RPC, RMI Next time l Overlay networks and P2P Types of communication " Persistent or transient Persistent A submitted message is stored until delivered Transient

More information

Mention driver developers in the room. Because of time this will be fairly high level, feel free to come talk to us afterwards

Mention driver developers in the room. Because of time this will be fairly high level, feel free to come talk to us afterwards 1 Introduce Mark, Michael Poll: Who is a software developer or works for a software company? Who s in management? Who knows what the OpenGL ARB standards body is? Mention driver developers in the room.

More information

This process is a fundamental step for every USB device, fore without it, the device would never be able to be used by the OS.

This process is a fundamental step for every USB device, fore without it, the device would never be able to be used by the OS. What is USB Enumeration? Enumeration is the process by which a USB device is attached to a system and is assigned a specific numerical address that will be used to access that particular device. It is

More information

Superbuffers Workgroup Update. Barthold Lichtenbelt

Superbuffers Workgroup Update. Barthold Lichtenbelt Superbuffers Workgroup Update Barthold Lichtenbelt 1 EXT_framebuffer_object update Specification stable since September 2005 Version #118 posted to Registry April 2006 - Lay groundwork for R, RG rendering

More information

GLSL Overview: Creating a Program

GLSL Overview: Creating a Program 1. Create the OpenGL application GLSL Overview: Creating a Program Primarily concerned with drawing Preferred approach uses buffer objects All drawing done in terms of vertex arrays Programming style differs

More information

Introduction to Asynchronous Programming Fall 2014

Introduction to Asynchronous Programming Fall 2014 CS168 Computer Networks Fonseca Introduction to Asynchronous Programming Fall 2014 Contents 1 Introduction 1 2 The Models 1 3 The Motivation 3 4 Event-Driven Programming 4 5 select() to the rescue 5 1

More information

Best practices for effective OpenGL programming. Dan Omachi OpenGL Development Engineer

Best practices for effective OpenGL programming. Dan Omachi OpenGL Development Engineer Best practices for effective OpenGL programming Dan Omachi OpenGL Development Engineer 2 What Is OpenGL? 3 OpenGL is a software interface to graphics hardware - OpenGL Specification 4 GPU accelerates rendering

More information

20 Years of OpenGL. Kurt Akeley. Copyright Khronos Group, Page 1

20 Years of OpenGL. Kurt Akeley. Copyright Khronos Group, Page 1 20 Years of OpenGL Kurt Akeley Copyright Khronos Group, 2010 - Page 1 So many deprecations! Application-generated object names Color index mode SL versions 1.10 and 1.20 Begin / End primitive specification

More information

Micrium µc/os II RTOS Introduction EE J. E. Lumpp

Micrium µc/os II RTOS Introduction EE J. E. Lumpp Micrium µc/os II RTOS Introduction (by Jean Labrosse) EE599 001 Fall 2012 J. E. Lumpp μc/os II μc/os II is a highly portable, ROMable, very scalable, preemptive real time, deterministic, multitasking kernel

More information

Get the most out of the new OpenGL ES 3.1 API. Hans-Kristian Arntzen Software Engineer

Get the most out of the new OpenGL ES 3.1 API. Hans-Kristian Arntzen Software Engineer Get the most out of the new OpenGL ES 3.1 API Hans-Kristian Arntzen Software Engineer 1 Content Compute shaders introduction Shader storage buffer objects Shader image load/store Shared memory Atomics

More information

PERFORMANCE OPTIMIZATIONS FOR AUTOMOTIVE SOFTWARE

PERFORMANCE OPTIMIZATIONS FOR AUTOMOTIVE SOFTWARE April 4-7, 2016 Silicon Valley PERFORMANCE OPTIMIZATIONS FOR AUTOMOTIVE SOFTWARE Pradeep Chandrahasshenoy, Automotive Solutions Architect, NVIDIA Stefan Schoenefeld, ProViz DevTech, NVIDIA 4 th April 2016

More information

Asynchronous Events on Linux

Asynchronous Events on Linux Asynchronous Events on Linux Frederic.Rossi@Ericsson.CA Open System Lab Systems Research June 25, 2002 Ericsson Research Canada Introduction Linux performs well as a general purpose OS but doesn t satisfy

More information

CSE Traditional Operating Systems deal with typical system software designed to be:

CSE Traditional Operating Systems deal with typical system software designed to be: CSE 6431 Traditional Operating Systems deal with typical system software designed to be: general purpose running on single processor machines Advanced Operating Systems are designed for either a special

More information

Homework #3. CS 318/418/618, Fall Handout: 10/09/2017

Homework #3. CS 318/418/618, Fall Handout: 10/09/2017 CS 318/418/618, Fall 2017 Homework #3 Handout: 10/09/2017 1. Microsoft.NET provides a synchronization primitive called a CountdownEvent. Programs use CountdownEvent to synchronize on the completion of

More information

Last Class: Deadlocks. Today

Last Class: Deadlocks. Today Last Class: Deadlocks Necessary conditions for deadlock: Mutual exclusion Hold and wait No preemption Circular wait Ways of handling deadlock Deadlock detection and recovery Deadlock prevention Deadlock

More information

DRI Memory Management

DRI Memory Management DRI Memory Management Full strength manager wasn't required for traditional usage: Quake3 and glxgears. Perceived to be difficult. Fundamental for modern desktops, offscreen rendering. Talked about for

More information

3.1 Introduction. Computers perform operations concurrently

3.1 Introduction. Computers perform operations concurrently PROCESS CONCEPTS 1 3.1 Introduction Computers perform operations concurrently For example, compiling a program, sending a file to a printer, rendering a Web page, playing music and receiving e-mail Processes

More information

CSE325 Principles of Operating Systems. Processes. David P. Duggan February 1, 2011

CSE325 Principles of Operating Systems. Processes. David P. Duggan February 1, 2011 CSE325 Principles of Operating Systems Processes David P. Duggan dduggan@sandia.gov February 1, 2011 Today s Goal: 1. Process Concept 2. Process Manager Responsibilities 3. Process Scheduling 4. Operations

More information

Native D3D9 on Mesa Gallium Nine : the status

Native D3D9 on Mesa Gallium Nine : the status Native D3D9 on Mesa Gallium Nine : the status Axel Davy FOSDEM 2015 1 Introduction 2 Wine integration 3 Presenting to the screen D3D9 queue multi-gpu Misc 4 Gallium Nine internals 5 Performance Test configuration

More information

Real - Time Rendering. Pipeline optimization. Michal Červeňanský Juraj Starinský

Real - Time Rendering. Pipeline optimization. Michal Červeňanský Juraj Starinský Real - Time Rendering Pipeline optimization Michal Červeňanský Juraj Starinský Motivation Resolution 1600x1200, at 60 fps Hw power not enough Acceleration is still necessary 3.3.2010 2 Overview Application

More information

OpenGL ES for iphone Games. Erik M. Buck

OpenGL ES for iphone Games. Erik M. Buck OpenGL ES for iphone Games Erik M. Buck Topics The components of a game n Technology: Graphics, sound, input, physics (an engine) n Art: The content n Fun: That certain something (a mystery) 2 What is

More information

Free Downloads OpenGL ES 3.0 Programming Guide

Free Downloads OpenGL ES 3.0 Programming Guide Free Downloads OpenGL ES 3.0 Programming Guide OpenGLÂ Â ESâ is the industryâ s leading software interface and graphics library for rendering sophisticated 3D graphics on handheld and embedded devices.

More information

Working with Metal Overview

Working with Metal Overview Graphics and Games #WWDC14 Working with Metal Overview Session 603 Jeremy Sandmel GPU Software 2014 Apple Inc. All rights reserved. Redistribution or public display not permitted without written permission

More information

STREAMING VIDEO DATA INTO 3D APPLICATIONS Session Christopher Mayer AMD Sr. Software Engineer

STREAMING VIDEO DATA INTO 3D APPLICATIONS Session Christopher Mayer AMD Sr. Software Engineer STREAMING VIDEO DATA INTO 3D APPLICATIONS Session 2116 Christopher Mayer AMD Sr. Software Engineer CONTENT Introduction Pinned Memory Streaming Video Data How does the APU change the game 3 Streaming Video

More information

Concurrent Server Design Multiple- vs. Single-Thread

Concurrent Server Design Multiple- vs. Single-Thread Concurrent Server Design Multiple- vs. Single-Thread Chuan-Ming Liu Computer Science and Information Engineering National Taipei University of Technology Fall 2007, TAIWAN NTUT, TAIWAN 1 Examples Using

More information

CS179 GPU Programming Introduction to CUDA. Lecture originally by Luke Durant and Tamas Szalay

CS179 GPU Programming Introduction to CUDA. Lecture originally by Luke Durant and Tamas Szalay Introduction to CUDA Lecture originally by Luke Durant and Tamas Szalay Today CUDA - Why CUDA? - Overview of CUDA architecture - Dense matrix multiplication with CUDA 2 Shader GPGPU - Before current generation,

More information

Lecture 8: Other IPC Mechanisms. CSC 469H1F Fall 2006 Angela Demke Brown

Lecture 8: Other IPC Mechanisms. CSC 469H1F Fall 2006 Angela Demke Brown Lecture 8: Other IPC Mechanisms CSC 469H1F Fall 2006 Angela Demke Brown Topics Messages through sockets / pipes Receiving notification of activity Generalizing the event notification mechanism Kqueue Semaphores

More information

Topics. Lecture 8: Other IPC Mechanisms. Socket IPC. Unix Communication

Topics. Lecture 8: Other IPC Mechanisms. Socket IPC. Unix Communication Topics Lecture 8: Other IPC Mechanisms CSC 469H1F Fall 2006 Angela Demke Brown Messages through sockets / pipes Receiving notification of activity Generalizing the event notification mechanism Kqueue Semaphores

More information

Programming in the Simple Raster Graphics Package (SRGP)

Programming in the Simple Raster Graphics Package (SRGP) Programming in the Simple Raster Graphics Package (SRGP) Chapter 2 This chapter focuses on a graphics package called SRGP. SRGP was written by the authors to demonstrate some of the basics of Raster Graphics

More information

Copyright Khronos Group, Page Graphic Remedy. All Rights Reserved

Copyright Khronos Group, Page Graphic Remedy. All Rights Reserved Avi Shapira Graphic Remedy Copyright Khronos Group, 2009 - Page 1 2004 2009 Graphic Remedy. All Rights Reserved Debugging and profiling 3D applications are both hard and time consuming tasks Companies

More information

Processes. Process Management Chapter 3. When does a process gets created? When does a process gets terminated?

Processes. Process Management Chapter 3. When does a process gets created? When does a process gets terminated? Processes Process Management Chapter 3 1 A process is a program in a state of execution (created but not terminated) Program is a passive entity one on your disk (survivor.class, kelly.out, ) Process is

More information

A Deterministic Concurrent Language for Embedded Systems

A Deterministic Concurrent Language for Embedded Systems A Deterministic Concurrent Language for Embedded Systems Stephen A. Edwards Columbia University Joint work with Olivier Tardieu SHIM:A Deterministic Concurrent Language for Embedded Systems p. 1/38 Definition

More information

Blink: 3D Display Multiplexing for Virtualized Applications

Blink: 3D Display Multiplexing for Virtualized Applications : 3D Display Multiplexing for Virtualized Applications January 20, 2006 : 3D Display Multiplexing for Virtualized Applications Motivation Sprites and Tiles Lessons Learned GL in, GL out Communication Protocol

More information

Signals, Synchronization. CSCI 3753 Operating Systems Spring 2005 Prof. Rick Han

Signals, Synchronization. CSCI 3753 Operating Systems Spring 2005 Prof. Rick Han , Synchronization CSCI 3753 Operating Systems Spring 2005 Prof. Rick Han Announcements Program Assignment #1 due Tuesday Feb. 15 at 11:55 pm TA will explain parts b-d in recitation Read chapters 7 and

More information

Process Description and Control. Chapter 3

Process Description and Control. Chapter 3 Process Description and Control 1 Chapter 3 2 Processes Working definition: An instance of a program Processes are among the most important abstractions in an OS all the running software on a computer,

More information

1995 Paper 10 Question 7

1995 Paper 10 Question 7 995 Paper 0 Question 7 Why are multiple buffers often used between producing and consuming processes? Describe the operation of a semaphore. What is the difference between a counting semaphore and a binary

More information

The Kernel Abstraction

The Kernel Abstraction The Kernel Abstraction Debugging as Engineering Much of your time in this course will be spent debugging In industry, 50% of software dev is debugging Even more for kernel development How do you reduce

More information

GPU Memory Model. Adapted from:

GPU Memory Model. Adapted from: GPU Memory Model Adapted from: Aaron Lefohn University of California, Davis With updates from slides by Suresh Venkatasubramanian, University of Pennsylvania Updates performed by Gary J. Katz, University

More information

Processes. Process Concept

Processes. Process Concept Processes These slides are created by Dr. Huang of George Mason University. Students registered in Dr. Huang s courses at GMU can make a single machine readable copy and print a single copy of each slide

More information

Today CSCI Communication. Communication in Distributed Systems. Communication in Distributed Systems. Remote Procedure Calls (RPC)

Today CSCI Communication. Communication in Distributed Systems. Communication in Distributed Systems. Remote Procedure Calls (RPC) Today CSCI 5105 Communication in Distributed Systems Overview Types Remote Procedure Calls (RPC) Instructor: Abhishek Chandra 2 Communication How do program modules/processes communicate on a single machine?

More information

Get your port on! porting to Native Client as of Pepper 18. Colt "MainRoach" McAnlis

Get your port on! porting to Native Client as of Pepper 18. Colt MainRoach McAnlis Get your port on! porting to Native Client as of Pepper 18 Colt "MainRoach" McAnlis 3.05.2012 Getting Started gonacl.com It works! Native Client runs C++ code in a web page No plug-in required The Gist

More information

Coding OpenGL ES 3.0 for Better Graphics Quality

Coding OpenGL ES 3.0 for Better Graphics Quality Coding OpenGL ES 3.0 for Better Graphics Quality Part 2 Hugo Osornio Rick Tewell A P R 1 1 t h 2 0 1 4 TM External Use Agenda Exercise 1: Array Structure vs Vertex Buffer Objects vs Vertex Array Objects

More information

Scuola Superiore Sant Anna. I/O subsystem. Giuseppe Lipari

Scuola Superiore Sant Anna. I/O subsystem. Giuseppe Lipari Scuola Superiore Sant Anna I/O subsystem Giuseppe Lipari Input Output and Device Drivers ERI Gennaio 2008 2 Objectives of the I/O subsystem To hide the complexity From the variability of the devices Provide

More information

ANDROID APPS DEVELOPMENT FOR MOBILE AND TABLET DEVICE (LEVEL II)

ANDROID APPS DEVELOPMENT FOR MOBILE AND TABLET DEVICE (LEVEL II) ANDROID APPS DEVELOPMENT FOR MOBILE AND TABLET DEVICE (LEVEL II) Media Playback Engine Android provides a media playback engine at the native level called Stagefright that comes built-in with software-based

More information

Today: Distributed Middleware. Middleware

Today: Distributed Middleware. Middleware Today: Distributed Middleware Middleware concepts Case study: CORBA Lecture 24, page 1 Middleware Software layer between application and the OS Provides useful services to the application Abstracts out

More information

The Application Stage. The Game Loop, Resource Management and Renderer Design

The Application Stage. The Game Loop, Resource Management and Renderer Design 1 The Application Stage The Game Loop, Resource Management and Renderer Design Application Stage Responsibilities 2 Set up the rendering pipeline Resource Management 3D meshes Textures etc. Prepare data

More information

MICROKERNEL CONSTRUCTION 2014

MICROKERNEL CONSTRUCTION 2014 MICROKERNEL CONSTRUCTION 2014 THE FIASCO.OC MICROKERNEL Alexander Warg MICROKERNEL CONSTRUCTION 1 FIASCO.OC IN ONE SLIDE CAPABILITY-BASED MICROKERNEL API single system call invoke capability MULTI-PROCESSOR

More information

To Do. Computer Graphics (Fall 2008) Course Outline. Course Outline. Methodology for Lecture. Demo: Surreal (HW 3)

To Do. Computer Graphics (Fall 2008) Course Outline. Course Outline. Methodology for Lecture. Demo: Surreal (HW 3) Computer Graphics (Fall 2008) COMS 4160, Lecture 9: OpenGL 1 http://www.cs.columbia.edu/~cs4160 To Do Start thinking (now) about HW 3. Milestones are due soon. Course Course 3D Graphics Pipeline 3D Graphics

More information

Vulkan C++ Markus Tavenrath, Senior DevTech Software Engineer Professional Visualization

Vulkan C++ Markus Tavenrath, Senior DevTech Software Engineer Professional Visualization Vulkan C++ Markus Tavenrath, Senior DevTech Software Engineer Professional Visualization Who am I? Markus Tavenrath Senior Dev Tech Software Engineer - Professional Visualization Joined NVIDIA 8 years

More information

Mixing graphics and compute with multiple GPUs

Mixing graphics and compute with multiple GPUs Mixing graphics and compute with multiple GPUs genda Compute and Graphics Interoperability Interoperability at a system level pplication design considerations Putting Graphics & Compute together Compute

More information

Breaking Down Barriers: An Intro to GPU Synchronization. Matt Pettineo Lead Engine Programmer Ready At Dawn Studios

Breaking Down Barriers: An Intro to GPU Synchronization. Matt Pettineo Lead Engine Programmer Ready At Dawn Studios Breaking Down Barriers: An Intro to GPU Synchronization Matt Pettineo Lead Engine Programmer Ready At Dawn Studios Who am I? Ready At Dawn for 9 years Lead Engine Programmer for 5 I like GPUs and APIs!

More information

Achieving High-performance Graphics on Mobile With the Vulkan API

Achieving High-performance Graphics on Mobile With the Vulkan API Achieving High-performance Graphics on Mobile With the Vulkan API Marius Bjørge Graphics Research Engineer GDC 2016 Agenda Overview Command Buffers Synchronization Memory Shaders and Pipelines Descriptor

More information

Tuning CUDA Applications for Fermi. Version 1.2

Tuning CUDA Applications for Fermi. Version 1.2 Tuning CUDA Applications for Fermi Version 1.2 7/21/2010 Next-Generation CUDA Compute Architecture Fermi is NVIDIA s next-generation CUDA compute architecture. The Fermi whitepaper [1] gives a detailed

More information

Distributed Systems Theory 4. Remote Procedure Call. October 17, 2008

Distributed Systems Theory 4. Remote Procedure Call. October 17, 2008 Distributed Systems Theory 4. Remote Procedure Call October 17, 2008 Client-server model vs. RPC Client-server: building everything around I/O all communication built in send/receive distributed computing

More information
