Async Workgroup Update. Barthold Lichtenbelt
1 Async Workgroup Update
Barthold Lichtenbelt
2 Goals
Provide synchronization framework for OpenGL
- Provide base functionality as defined in NV_fence and GL2_async_core
- Build a framework for future, more complex functionality, some of which is discussed in GL2_async_core
- Initially support CPU <-> GPU synchronization
- Support synchronization across multiple OpenGL contexts
Resulted in GL_ARB_sync spec
- Finished April
Posted draft to opengl.org for feedback
- Not quite official ARB extension yet
3 Functionality overview
ARB_sync provides synchronization primitives
- Can be tested, set and waited upon
Specifically, a Fence Synchronization Object and corresponding Fence command
Fence completion allows for partial glFinish
- All commands prior to the fence are forced to complete before control is returned to caller
Fence Sync Objects can be shared across contexts
- Allows for synchronization of OpenGL command streams across contexts
New data type: GLtime represents intervals in nanoseconds
- 64 bit integer, same encoding as UST counter in OpenML
- Accuracy implementation dependent, precision in nanoseconds
If you have used the Windows Event model, this will feel familiar
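The "partial glFinish" idea can be sketched with a toy command stream. This is a minimal illustration, not ARB_sync itself: the names `cmd`, `finish_up_to_fence`, `finish_all` and `pending` are hypothetical, and the "GPU" is just a loop that marks commands as done.

```c
#include <assert.h>
#include <stddef.h>

/* Toy command stream: each entry is either a draw command or a fence token. */
enum cmd_kind { CMD_DRAW, CMD_FENCE };

struct cmd {
    enum cmd_kind kind;
    int fence_id;   /* meaningful only for CMD_FENCE entries */
    int done;       /* 1 once the simulated GPU has executed the entry */
};

/* Partial finish: execute pending commands in order and stop right after the
 * first pending fence with the given id. Returns the number of commands
 * executed by this call, or -1 if no pending fence with that id exists. */
int finish_up_to_fence(struct cmd *stream, size_t n, int fence_id)
{
    size_t i, target = n;
    int executed = 0;

    assert(stream != NULL);
    for (i = 0; i < n; i++) {
        if (!stream[i].done && stream[i].kind == CMD_FENCE &&
            stream[i].fence_id == fence_id) {
            target = i;
            break;
        }
    }
    if (target == n)
        return -1;
    for (i = 0; i <= target; i++) {
        if (!stream[i].done) {
            stream[i].done = 1;
            executed++;
        }
    }
    return executed;
}

/* Full finish: execute everything that is still pending. */
int finish_all(struct cmd *stream, size_t n)
{
    size_t i;
    int executed = 0;

    assert(stream != NULL);
    for (i = 0; i < n; i++) {
        if (!stream[i].done) {
            stream[i].done = 1;
            executed++;
        }
    }
    return executed;
}

/* Number of commands the simulated GPU has not executed yet. */
size_t pending(const struct cmd *stream, size_t n)
{
    size_t i, count = 0;

    assert(stream != NULL);
    for (i = 0; i < n; i++)
        if (!stream[i].done)
            count++;
    return count;
}
```

With a stream of draw, fence 1, draw, draw, fence 2, calling `finish_up_to_fence` for fence 1 completes only the first two entries and leaves three pending. `finish_all` would drain the whole stream.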
4 Synchronization model in ARB_sync 1/2
A sync object is a primitive used for synchronization between CPU and GPU, CPU, or something else.
- Sync object has state: type, condition, status
A sync object's status can be signaled or non-signaled
- When created, status is signaled unless a flag is set, in which case it is non-signaled
A fence sync object is a specific type of sync object
- Provides partial finish semantics
- Only type of sync object currently defined
A fence is a token inserted in the GL command stream
- A sync object is not inserted into the command stream
- Fence has no state
A fence is associated with a fence sync object.
- Multiple fences can be associated with the same sync object
When a fence is inserted in the command stream, the status of its sync object is set to non-signaled
A fence, once completed, will set the status of its sync object to signaled
5 Synchronization model in ARB_sync 2/2
A wait function waits on a sync object, not on a fence
A poll function polls a sync object, not a fence
A wait function called on a sync object in the non-signaled state will block. It unblocks when the sync object transitions to the signaled state.
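The status rules of the synchronization model can be written down as a small state machine. The sketch below is an illustration under assumptions, not the extension's API: `sync_create`, `fence_insert`, `fence_complete`, `sync_poll` and `sync_wait_would_block` are hypothetical names, and only the single-fence case is modeled.

```c
#include <assert.h>
#include <stddef.h>

enum sync_type      { SYNC_TYPE_FENCE };               /* only type defined so far */
enum sync_condition { SYNC_PRIOR_COMMANDS_COMPLETE };
enum sync_status    { SYNC_UNSIGNALED, SYNC_SIGNALED };

/* The sync object carries all the state: type, condition, status. */
struct sync_object {
    enum sync_type      type;
    enum sync_condition condition;
    enum sync_status    status;
};

/* A fence is only a token in the command stream. It has no state of its
 * own, just a reference to the sync object it is associated with. */
struct fence {
    struct sync_object *sync;
};

/* A new sync object is signaled unless the caller asks otherwise. */
struct sync_object sync_create(int start_unsignaled)
{
    struct sync_object s;
    s.type = SYNC_TYPE_FENCE;
    s.condition = SYNC_PRIOR_COMMANDS_COMPLETE;
    s.status = start_unsignaled ? SYNC_UNSIGNALED : SYNC_SIGNALED;
    return s;
}

/* Inserting a fence sets its sync object to non-signaled. */
struct fence fence_insert(struct sync_object *sync)
{
    struct fence f;
    assert(sync != NULL);
    sync->status = SYNC_UNSIGNALED;
    f.sync = sync;
    return f;
}

/* A completed fence sets its sync object to signaled. */
void fence_complete(struct fence f)
{
    assert(f.sync != NULL);
    f.sync->status = SYNC_SIGNALED;
}

/* Polling and waiting both operate on the sync object, never on the fence. */
enum sync_status sync_poll(const struct sync_object *sync)
{
    assert(sync != NULL);
    return sync->status;
}

int sync_wait_would_block(const struct sync_object *sync)
{
    assert(sync != NULL);
    return sync->status == SYNC_UNSIGNALED;
}
```

Keeping the status on the sync object rather than on the fence is what lets a waiter hold one handle while fences are inserted against it.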
6 Example RTT with two contexts
Context A
    sync_objectA = glCreateSync(attrib);
    <render to texture that context B needs>
    glFence(sync_objectA);
    glFlush();    // prevent deadlock
Context B
    glClientWaitSync(sync_objectA, 0, GL_FOREVER);
    glBindTexture(...);    // Just rendered
    <render using texture>
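Why the flush in context A matters can be shown with a toy model of one context's command path. The assumption here, stated loudly, is that commands may sit in a client-side buffer until flushed, so a fence that was never submitted can never complete and a waiter in another context would block forever. All names (`gl_context_model`, `issue_command`, `issue_fence`, `flush`, `fence_can_complete`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one context's command path. */
struct gl_context_model {
    int buffered;    /* commands issued but still in the client-side buffer */
    int submitted;   /* commands already handed to the GPU */
    int fence_pos;   /* 1-based position of the fence in issue order, 0 if none */
};

/* Issue an ordinary command: it only lands in the client-side buffer. */
void issue_command(struct gl_context_model *c)
{
    assert(c != NULL);
    c->buffered++;
}

/* Issue a fence: it is buffered like any other command. */
void issue_fence(struct gl_context_model *c)
{
    assert(c != NULL);
    c->buffered++;
    c->fence_pos = c->submitted + c->buffered;
}

/* Flush: hand every buffered command to the GPU. */
void flush(struct gl_context_model *c)
{
    assert(c != NULL);
    c->submitted += c->buffered;
    c->buffered = 0;
}

/* The GPU can only ever complete a fence that has been submitted to it. */
int fence_can_complete(const struct gl_context_model *c)
{
    assert(c != NULL);
    return c->fence_pos != 0 && c->fence_pos <= c->submitted;
}
```

In this model, a context that issues a render command and a fence but does not flush leaves `fence_can_complete` at 0. After `flush` it becomes 1, which is the situation the waiting context relies on.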
7 OS specific functionality
Convert sync object to the window system native event primitive
- Allows applications to synchronize all events in a system using one API
All operations on <sync> are reflected in OS event and vice-versa
Both <sync> and the OS event are valid to use in your code
On Windows, convert to an Event
    HANDLE wglConvertSyncToEvent(object sync);
- Need to specify, when sync object is created, that it can be converted to OS event
- Separate extension: WGL_ARB_sync_event
On Unix, convert to a file-descriptor, X event or semaphore?
- Still TBD
8 Possible future functionality
Add a WaitForMultipleSync(uint *sync_objects, ...) command
- Synchronize with multiple sync objects at once
Add a payload to a fence
- For example, the time it completed
Allow one GPU stream to wait for another GPU stream
- WaitSync(sync_object);
A sync object whose status will pulse with every vblank
A sync object that can signal when data binding has completed
- As opposed to when rendering has completed using the data
9 Example Streaming video processing
Loop
    Draw frame 1                 // To a FBO, for example
    glFence(sync_object1);       // Inserts a fence in the command stream
    Draw frame 2
    glFence(sync_object2);
    Read back data in frame 1
    Read back data in frame 2
10 Variation with asynchronous read back
Loop
    Draw frame 1                    // To a FBO, for example
    Read back frame 1 into PBO 1    // Asynchronous readback
    glFence(sync_object1);          // Inserts a fence in the command stream
    Draw frame 2
    Read back frame 2 into PBO 2
    glFence(sync_object2);
    glMapBuffer( );    // Access the data of frame 1 in PBO 1
    glMapBuffer( );    // Access the data of frame 2 in PBO 2
11 Differences with GL_NV_fence
No separation of sync objects and fences in NV_fence
- NV version only has fence objects
- Fence object has state
Creation of sync object and inserting a fence in one command
- SetFenceNV creates and inserts a fence (old object model)
NV fence objects not shared across contexts
12 API Overview 1/2
Create a sync attribute object
    object CreateSyncAttrib();
- SYNC_TYPE has to be FENCE
- SYNC_CONDITION has to be SYNC_PRIOR_COMMANDS_COMPLETE
- SYNC_STATUS SIGNALED or UNSIGNALED
Create the sync object
    object CreateSync(object attrib);
Insert a fence, associated with a sync object, into command stream
    void Fence(object sync);
13 API Overview 2/2
Wait or test the status of a fence sync object
    enum ClientWaitSync(object sync, uint flags, time timeout);
- Blocks until sync is signaled or the timeout expires
- If timeout == 0, does not block, returns the status of sync
- If timeout == FOREVER, call does not time out
- Optionally will flush before blocking
- Returns one of three values: ALREADY_SIGNALED, TIMEOUT_EXPIRED, CONDITION_SATISFIED
Signal or unsignal a sync object
    void SignalSync(object sync, enum mode);
- If status transitions from unsignaled to signaled, ClientWaitSync will unblock
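How the three return values of ClientWaitSync relate to the timeout can be sketched as a pure function over simulated time. This is a model under assumptions, not the draft API. `client_wait_model` and `WAIT_FOREVER` are hypothetical names, and FOREVER is assumed to be the largest 64-bit value. Returning TIMEOUT_EXPIRED for an unsignaled object with a zero timeout follows the behavior of the later `glClientWaitSync` rather than anything stated in the slides.

```c
#include <assert.h>
#include <stdint.h>

enum wait_result { ALREADY_SIGNALED, TIMEOUT_EXPIRED, CONDITION_SATISFIED };

/* Assumption: FOREVER is encoded as the largest 64-bit nanosecond value. */
#define WAIT_FOREVER UINT64_MAX

/* signaled_now:     status of the sync object when the call is made
 * timeout_ns:       how long the caller is willing to block, in nanoseconds
 * ns_until_signal:  simulated time until the sync object becomes signaled
 *                   (ignored when signaled_now is nonzero) */
enum wait_result client_wait_model(int signaled_now,
                                   uint64_t timeout_ns,
                                   uint64_t ns_until_signal)
{
    if (signaled_now)
        return ALREADY_SIGNALED;      /* no blocking needed */
    if (timeout_ns == 0)
        return TIMEOUT_EXPIRED;       /* pure poll: never blocks */
    /* Blocking wait: satisfied if the signal arrives within the timeout.
     * WAIT_FOREVER is >= any finite delay, so it always ends up satisfied. */
    return (ns_until_signal <= timeout_ns) ? CONDITION_SATISFIED
                                           : TIMEOUT_EXPIRED;
}
```

A caller can tell "was already done" (ALREADY_SIGNALED) apart from "had to wait for it" (CONDITION_SATISFIED), which is useful when measuring whether the CPU is running ahead of the GPU.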