Kenneth Dyke Sr. Engineer, Graphics and Compute Architecture

Similar documents
Best practices for effective OpenGL programming. Dan Omachi OpenGL Development Engineer

Working with Metal Overview

OpenGL ES for iphone Games. Erik M. Buck

Motivation Hardware Overview Programming model. GPU computing. Part 1: General introduction. Ch. Hoelbling. Wuppertal University

OpenCL overview. Ian Ollmann, Ph.D. Senior Scientist

Async Workgroup Update. Barthold Lichtenbelt

Power Efficiency for Software Algorithms running on Graphics Processors. Björn Johnsson Per Ganestam Michael Doggett Tomas Akenine-Möller

Dynamic Texturing. Mark Harris NVIDIA Corporation

Working With Metal Advanced

GeForce3 OpenGL Performance. John Spitzer

Lecture 13: OpenGL Shading Language (GLSL)

Copyright Khronos Group, Page Graphic Remedy. All Rights Reserved

GPU Programming and Architecture: Course Overview

What s New in Core Image

More frames per second. Alex Kan and Jean-François Roy GPU Software

OPENCL TM APPLICATION ANALYSIS AND OPTIMIZATION MADE EASY WITH AMD APP PROFILER AND KERNELANALYZER

Bringing it all together: The challenge in delivering a complete graphics system architecture. Chris Porthouse

Framebuffer Objects. Emil Persson ATI Technologies, Inc.

Windowing System on a 3D Pipeline. February 2005

Copyright Khronos Group, Page 1. OpenCL. GDC, March 2010

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

CS 179: GPU Programming

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

The Graphics Pipeline and OpenGL I: Transformations!

The Application Stage. The Game Loop, Resource Management and Renderer Design

PERFORMANCE OPTIMIZATIONS FOR AUTOMOTIVE SOFTWARE

Optimisation. CS7GV3 Real-time Rendering

Martin Kruliš, v

Lighting and Texturing

GPGPU IGAD 2014/2015. Lecture 1. Jacco Bikker

CS179: GPU Programming

UberFlow: A GPU-Based Particle Engine

Compute Shaders. Christian Hafner. Institute of Computer Graphics and Algorithms Vienna University of Technology

egfx Breakaway Box for AMD and NVIDIA GPUs

Technical Report. SLI Best Practices

Building a Game with SceneKit

Programmable GPUS. Last Time? Reading for Today. Homework 4. Planar Shadows Projective Texture Shadows Shadow Maps Shadow Volumes

SLICING THE WORKLOAD MULTI-GPU OPENGL RENDERING APPROACHES

Real - Time Rendering. Pipeline optimization. Michal Červeňanský Juraj Starinský

Bringing AAA graphics to mobile platforms. Niklas Smedberg Senior Engine Programmer, Epic Games

CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015

NVIDIA Parallel Nsight. Jeff Kiel

X. GPU Programming. Jacobs University Visualization and Computer Graphics Lab : Advanced Graphics - Chapter X 1

The GPGPU Programming Model

Maya , MotionBuilder & Mudbox 2018.x August 2018

User Guide. TexturePerformancePBO Demo

Sync Points in the Intel Gfx Driver. Jesse Barnes Intel Open Source Technology Center

What s New in Metal, Part 2

Performance OpenGL Programming (for whatever reason)

Mention driver developers in the room. Because of time this will be fairly high level, feel free to come talk to us afterwards

Cornell University CS 569: Interactive Computer Graphics. Introduction. Lecture 1. [John C. Stone, UIUC] NASA. University of Calgary

Technical Report. SLI Best Practices

CS 220: Introduction to Parallel Computing. Introduction to CUDA. Lecture 28

frame buffer depth buffer stencil buffer

Jomar Silva Technical Evangelist

! Readings! ! Room-level, on-chip! vs.!

GPU Programming and Architecture: Course Overview

General Purpose computation on GPUs. Liangjun Zhang 2/23/2005

CS179 GPU Programming Introduction to CUDA. Lecture originally by Luke Durant and Tamas Szalay

CSE 167: Lecture #17: Volume Rendering. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012

Martin Kruliš, v

Graphics and Imaging Architectures

What s New in Xcode App Signing

Antonio R. Miele Marco D. Santambrogio

MAPPING VIDEO CODECS TO HETEROGENEOUS ARCHITECTURES. Mauricio Alvarez-Mesa Techische Universität Berlin - Spin Digital MULTIPROG 2015

Building Visually Rich User Experiences

Creating Great App Previews

Mixing graphics and compute with multiple GPUs

EECS 487: Interactive Computer Graphics

Laptop Requirement: Technical Specifications and Guidelines. Frequently Asked Questions

CLU: Open Source API for OpenCL Prototyping

Dave Shreiner, ARM March 2009

The Ultimate Developers Toolkit. Jonathan Zarge Dan Ginsburg

GPGPU COMPUTE ON AMD. Udeepta Bordoloi April 6, 2011

WebCL Overview and Roadmap

20 Years of OpenGL. Kurt Akeley. Copyright Khronos Group, Page 1

CS335 Graphics and Multimedia. Slides adopted from OpenGL Tutorial

Lecture 07: Buffers and Textures

CME 213 S PRING Eric Darve

Graphics Architectures and OpenCL. Michael Doggett Department of Computer Science Lund university

Why modern versions of OpenGL should be used Some useful API commands and extensions

Direct3D 11 Performance Tips & Tricks

MSAA- Based Coarse Shading

CIS 665 GPU Programming and Architecture

Heterogeneous Computing

EE , GPU Programming

Introducing Metal 2. Graphics and Games #WWDC17. Michal Valient, GPU Software Engineer Richard Schreyer, GPU Software Engineer

Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)

Advanced OpenCL Event Model Usage

Computer Graphics. Bing-Yu Chen National Taiwan University

GPU Memory Model Overview

Lecture 25: Board Notes: Threads and GPUs

GPU Memory Model. Adapted from:

Automatic Intra-Application Load Balancing for Heterogeneous Systems

CS 432 Interactive Computer Graphics

vs. GPU Performance Without the Answer University of Virginia Computer Engineering g Labs

Could you make the XNA functions yourself?

Data Sheet. Desktop ESPRIMO. General

Real-time Graphics 9. GPGPU

Programming shaders & GPUs Christian Miller CS Fall 2011

Transcription:

Kenneth Dyke Sr. Engineer, Graphics and Compute Architecture 2

Supporting multiple GPUs in your application Finding all renderers and devices Responding to renderer changes Making use of multiple GPUs at the same time Shared contexts Resource management and synchronization Performance tips Using IOSurface with multiple GPUs 3

Motivation Systems with multiple GPUs are shipping today! Mac Pros can have up to four GPUs Some recent MacBook Pros also have multiple GPUs Better user experience on systems with multiple GPUs Increased performance Hot-plug support 4

5

Renderers and pixel formats A renderer represents a single GPU or the CPU-based software rasterizer Each renderer has a unique renderer id A pixel format represents both drawable attributes and a specific set of renderers Pixel Format 32-bit RGBA, 24-bit Depth, etc. 6

Pixel formats and contexts OpenGL contexts inherit the list of renderers from the pixel format Each renderer within an OpenGL context is assigned a virtual screen Pixel Format 32-bit RGBA, 24-bit Depth, etc. CGLCreateContext() 7

Pixel formats and contexts OpenGL contexts inherit the list of renderers from the pixel format Each renderer within an OpenGL context is assigned a virtual screen Pixel Format 32-bit RGBA, 24-bit Depth, etc. CGLCreateContext() 8

Virtual screens and renderers 9

Virtual screens and renderers The current virtual screen is used to select the context s renderer Virtual screen order matches pixel format No correlation with physical screens Multiple heterogenous renderers is unique to Mac OS X! 10

Different APIs, but similar concepts clgetdeviceids() instead of CGLChoosePixelFormat() clcreatecontext() passed list of devices instead of pixel format cl_command_queue selects the device for a command 11

Allowing for multiple renders Don t use NSOpenGLPFAScreenMask Guaranteed you ll only get a single hardware renderer Only necessary for legacy full-screen contexts displaymask = CGDisplayIDToOpenGLDisplayMask(kCGDirectMainDisplay); NSOpenGLPixelFormatAttribute attribs[] = { NSOpenGLPFAAccelerated, NSOpenGLPFADoubleBuffer, NSOpenGLPFAColorSize, 32, NSOpenGLPFAScreenMask, displaymask, 0 }; 12

Allowing for multiple renders Don t use NSOpenGLPFAScreenMask Guaranteed you ll only get a single hardware renderer Only necessary for legacy full-screen contexts NSOpenGLPixelFormatAttribute attribs[] = { NSOpenGLPFAAccelerated, NSOpenGLPFADoubleBuffer, NSOpenGLPFAColorSize, 32, }; 0 13

Allowing for multiple renders Do use NSOpenGLPFAAllowOfflineRenderers NSOpenGLPixelFormatAttribute attribs[] = { NSOpenGLPFAAccelerated, NSOpenGLPFADoubleBuffer, NSOpenGLPFAColorSize, 32, }; 0 14

Allowing for multiple renders Do use NSOpenGLPFAAllowOfflineRenderers This gets you all renderers, even ones without a display Important for hot plug NSOpenGLPixelFormatAttribute attribs[] = { NSOpenGLPFAAccelerated, NSOpenGLPFADoubleBuffer, NSOpenGLPFAColorSize, 32, NSOpenGLPFAAllowOfflineRenderers, 0 }; 15

Triggering renderer changes -[NSOpenGLContext update] Best renderer is chosen automatically Usually in response to -[NSOpenGLView update] Or, NSGlobalViewFrameDidChange [NSOpenGLContext setvirtualscreen:] Forces a specific renderer Useful for offscreen contexts @implementation MyOpenGLView... - (void)update { [[self openglcontext] update]; };... @end 16

Responding to renderer changes For some apps, nothing is required Otherwise, check for virtual screen changes @implementation MyOpenGLView... - (void)update { [[self openglcontext] update]; /* w00t! I m done! */ };... @end 17

Responding to renderer changes For some apps, nothing is required Otherwise, check for virtual screen changes @implementation MyOpenGLView... - (void)update { [[self openglcontext] update]; if(virtualscreen!= [[self openglcontext] currentvirtualscreen]) [self gpuchanged]; };... 18

Responding to renderer changes For some apps, nothing is required Otherwise, check for virtual screen changes Check for required extensions @implementation MyOpenGLView... - (void)gpuchanged { };... @end 19

Responding to renderer changes For some apps, nothing is required Otherwise, check for virtual screen changes Check for required extensions Verify hardware limits @implementation MyOpenGLView... - (void)gpuchanged { GLchar *extensions = glgetstring(gl_extensions); };... @end 20

Responding to renderer changes For some apps, nothing is required Otherwise, check for virtual screen changes Check for required extensions Verify hardware limits Synchronize offscreen contexts @implementation MyOpenGLView... - (void)gpuchanged { GLchar *extensions = glgetstring(gl_extensions); GLsizei maxtexturesize = glgetintegerv(gl_max_...); };... @end 21

Responding to renderer changes For some apps, nothing is required Otherwise, check for virtual screen changes Check for required extensions Verify hardware limits Synchronize offscreen contexts @implementation MyOpenGLView... - (void)gpuchanged { GLchar *extensions = glgetstring(gl_extensions); GLsizei maxtexturesize = glgetintegerv(gl_max_...); [self syncoffscreencontexts]; };... @end 22

Resource management during renderer changes OpenGL/OpenCL automatically track resources across GPUs Unmodified resources will be uploaded as needed Modified resources will be copied across via system memory Currently bound objects synchronized at render change Other objects synchronized lazily at bind time 23

Resource management during renderer changes Virtual Screen 0 AMD Radeon HD 4870... glbindtexture(...,1); glteximage2d(...); NVIDIA GeForce GTX 285 24

Resource management during renderer changes Virtual Screen 0 AMD Radeon HD 4870... glbindtexture(...,1); glteximage2d(...); NVIDIA GeForce GTX 285 25

Resource management during renderer changes Virtual Screen 0 AMD Radeon HD 4870... glbindtexture(...,1); glteximage2d(...); DrawTexture(); NVIDIA GeForce GTX 285 26

Resource management during renderer changes Virtual Screen 0 AMD Radeon HD 4870... glbindtexture(...,1); glteximage2d(...); DrawTexture(); NVIDIA GeForce GTX 285 27

Resource management during renderer changes Virtual Screen 0 AMD Radeon HD 4870... glbindtexture(...,1); glteximage2d(...); DrawTexture(); NVIDIA GeForce GTX 285 28

Resource management during renderer changes Virtual Screen 0 AMD Radeon HD 4870... glbindtexture(...,1); glteximage2d(...); DrawTexture(); NVIDIA GeForce GTX 285 29

Resource management during renderer changes Virtual Screen 0 AMD Radeon HD 4870... glbindtexture(...,1); glteximage2d(...); DrawTexture(); NVIDIA GeForce GTX 285 30

Resource management during renderer changes Virtual Screen 0 AMD Radeon HD 4870 glbindtexture(...,1); glteximage2d(...); DrawTexture(); gltexsubimage2d(); NVIDIA GeForce GTX 285 31

Resource management during renderer changes Virtual Screen 0 AMD Radeon HD 4870 glbindtexture(...,1); glteximage2d(...); DrawTexture(); gltexsubimage2d(); NVIDIA GeForce GTX 285 32

Resource management during renderer changes Virtual Screen 0 AMD Radeon HD 4870 glbindtexture(...,1); glteximage2d(...); DrawTexture(); gltexsubimage2d(); NVIDIA GeForce GTX 285 33

Resource management during renderer changes Virtual Screen 0 AMD Radeon HD 4870 glteximage2d(...); DrawTexture(); gltexsubimage2d(); [mycontext update]; NVIDIA GeForce GTX 285 34

Resource management during renderer changes Virtual Screen 1 AMD Radeon HD 4870 glteximage2d(...); DrawTexture(); gltexsubimage2d(); [mycontext update]; NVIDIA GeForce GTX 285 35

Resource management during renderer changes Virtual Screen 1 AMD Radeon HD 4870 glteximage2d(...); DrawTexture(); gltexsubimage2d(); [mycontext update]; NVIDIA GeForce GTX 285 36

Resource management during renderer changes Virtual Screen 1 AMD Radeon HD 4870 glteximage2d(...); DrawTexture(); gltexsubimage2d(); [mycontext update]; NVIDIA GeForce GTX 285 37

Resource management during renderer changes Virtual Screen 1 AMD Radeon HD 4870 glteximage2d(...); DrawTexture(); gltexsubimage2d(); [mycontext update]; NVIDIA GeForce GTX 285 38

Resource management during renderer changes Virtual Screen 1 AMD Radeon HD 4870 DrawTexture(); gltexsubimage2d(); [mycontext update]; DrawTexture(); NVIDIA GeForce GTX 285 39

Resource management during renderer changes Virtual Screen 1 AMD Radeon HD 4870 DrawTexture(); gltexsubimage2d(); [mycontext update]; DrawTexture(); NVIDIA GeForce GTX 285 40

Resource management during renderer changes Virtual Screen 1 AMD Radeon HD 4870 gltexsubimage2d(); [mycontext update]; DrawTexture(); NVIDIA GeForce GTX 285 41

Resource management during renderer changes Virtual Screen 1 AMD Radeon HD 4870 gltexsubimage2d(); [mycontext update]; DrawTexture(); NVIDIA GeForce GTX 285 42

Kenneth Dyke Sr. Mad Scientist 43

44

Using multiple GPUs simultaneously Motivations Increased performance Specific GPU feature requirements Issues to consider Context sharing Synchronization and resource management 45

Context sharing with OpenGL Multiple OpenGL contexts may share resources Text CGLContextObj cgl_ctx; err = CGLCreateContext(pix_fmt, NULL, &cgl_ctx); 46

Context sharing with OpenGL Multiple OpenGL contexts may share resources All contexts in the same share group must use the same set of renderers Be very careful when using different pixel format objects Safest way is to use the pixel format from the other context CGLContextObj cgl_ctx; err = CGLCreateContext(pix_fmt, share_ctx, &cgl_ctx); 47

Context sharing with OpenGL Multiple OpenGL contexts may share resources All contexts in the same share group must use the same set of renderers Be very careful when using different pixel format objects Safest way is to use the pixel format from the other context CGLContextObj cgl_ctx; CGLPixelFormatObj share_pix = CGLGetPixelFormat(share_ctx); err = CGLCreateContext(share_pix, share_ctx, &cgl_ctx); 48

Context sharing in OpenCL All queues created from the same OpenCL context will share resources Can also create OpenCL contexts that are compatible with OpenGL CGLContextObj cgl_ctx = [[self openglcontext] CGLContextObj]; cl_int err = 0; cl_context_properties properties[] = { CL_CONTEXT_PROPERTY_USE_CGL_SHAREGROUP_APPLE, (cl_context_properties)cglgetsharegroup(context), 0 }; computecontext = clcreatecontext(properties, 0, 0, 0, 0, &err); 49

Multicontext synchronization Maintaining order of operations is very important OpenGL uses flush then bind semantics Producer context must flush (glflush, glfinish, etc.) Consumer context must bind (glbindtexture, etc.) Applies to both single and multi-gpu cases OpenCL uses events for dependencies Data dependencies between GL and CL require glflush/clacquire 50

Multicontext resource management Virtual Screen 0 Virtual Screen 1... glbindtexture(...,1); glteximage2d(...); OpenGL Share Group AMD Radeon HD 4870 NVIDIA GeForce GTX 285 51

Multicontext resource management Virtual Screen 0 Virtual Screen 1... glbindtexture(...,1); glteximage2d(...); OpenGL Share Group AMD Radeon HD 4870 NVIDIA GeForce GTX 285 52

Multicontext resource management Virtual Screen 0 Virtual Screen 1... glbindtexture(...,1); glteximage2d(...); OpenGL Share Group AMD Radeon HD 4870 NVIDIA GeForce GTX 285 53

Multicontext resource management Virtual Screen 0 Virtual Screen 1 glbindtexture(...,1); glteximage2d(...); DrawTexture(); OpenGL Share Group AMD Radeon HD 4870 NVIDIA GeForce GTX 285 54

Multicontext resource management Virtual Screen 0 Virtual Screen 1 glbindtexture(...,1); glteximage2d(...); DrawTexture(); OpenGL Share Group AMD Radeon HD 4870 NVIDIA GeForce GTX 285 55

Multicontext resource management Virtual Screen 0 Virtual Screen 1 glteximage2d(...); DrawTexture(); OpenGL Share Group AMD Radeon HD 4870 NVIDIA GeForce GTX 285 56

Multicontext resource management Virtual Screen 0 Virtual Screen 1 glteximage2d(...); DrawTexture(); OpenGL Share Group AMD Radeon HD 4870 NVIDIA GeForce GTX 285 57

Multicontext resource management Virtual Screen 0 Virtual Screen 1 glteximage2d(...); DrawTexture(); OpenGL Share Group AMD Radeon HD 4870 NVIDIA GeForce GTX 285 58

Multicontext resource management Virtual Screen 0 Virtual Screen 1 DrawTexture(); gltexsubimage2d(...); OpenGL Share Group AMD Radeon HD 4870 NVIDIA GeForce GTX 285 59

Multicontext resource management Virtual Screen 0 Virtual Screen 1 DrawTexture(); gltexsubimage2d(...); OpenGL Share Group AMD Radeon HD 4870 NVIDIA GeForce GTX 285 60

Multicontext resource management Virtual Screen 0 Virtual Screen 1 DrawTexture(); gltexsubimage2d(...); OpenGL Share Group AMD Radeon HD 4870 NVIDIA GeForce GTX 285 61

Multicontext resource management Virtual Screen 0 Virtual Screen 1 gltexsubimage2d(...); OpenGL Share Group AMD Radeon HD 4870 NVIDIA GeForce GTX 285 62

Multicontext resource management Virtual Screen 0 Virtual Screen 1 gltexsubimage2d(...); OpenGL Share Group... glbindtexture(...,1); AMD Radeon HD 4870 NVIDIA GeForce GTX 285 63

Multicontext resource management Virtual Screen 0 Virtual Screen 1 gltexsubimage2d(...); OpenGL Share Group... glbindtexture(...,1); AMD Radeon HD 4870 NVIDIA GeForce GTX 285 64

Multicontext resource management Virtual Screen 0 Virtual Screen 1 gltexsubimage2d(...); OpenGL Share Group... glbindtexture(...,1); AMD Radeon HD 4870 NVIDIA GeForce GTX 285 65

Multicontext resource management Virtual Screen 0 Virtual Screen 1 gltexsubimage2d(...); OpenGL Share Group... glbindtexture(...,1); AMD Radeon HD 4870 NVIDIA GeForce GTX 285 66

Multicontext resource management Virtual Screen 0 Virtual Screen 1 gltexsubimage2d(...); OpenGL Share Group... glbindtexture(...,1); AMD Radeon HD 4870 NVIDIA GeForce GTX 285 67

Multicontext resource management Virtual Screen 0 Virtual Screen 1 gltexsubimage2d(...); OpenGL Share Group... glbindtexture(...,1); DrawTexture(); AMD Radeon HD 4870 NVIDIA GeForce GTX 285 68

Multicontext resource management Virtual Screen 0 Virtual Screen 1 gltexsubimage2d(...); OpenGL Share Group... glbindtexture(...,1); DrawTexture(); AMD Radeon HD 4870 NVIDIA GeForce GTX 285 69

Multicontext resource management Virtual Screen 0 Virtual Screen 1 gltexsubimage2d(...); OpenGL Share Group glbindtexture(...,1); DrawTexture(); AMD Radeon HD 4870 NVIDIA GeForce GTX 285 70

Multicontext resource management Virtual Screen 0 Virtual Screen 1 gltexsubimage2d(...); OpenGL Share Group glbindtexture(...,1); DrawTexture(); AMD Radeon HD 4870 NVIDIA GeForce GTX 285 71

Multicontext resource management Virtual Screen 0 Virtual Screen 1 gltexsubimage2d(...); OpenGL Share Group glbindtexture(...,1); DrawTexture(); AMD Radeon HD 4870 NVIDIA GeForce GTX 285 72

Multicontext resource management Virtual Screen 0 Virtual Screen 1 gltexsubimage2d(...); OpenGL Share Group DrawTexture(); glcopytexsubimage(); AMD Radeon HD 4870 NVIDIA GeForce GTX 285 73

Multicontext resource management Virtual Screen 0 Virtual Screen 1 gltexsubimage2d(...); OpenGL Share Group DrawTexture(); glcopytexsubimage(); AMD Radeon HD 4870 NVIDIA GeForce GTX 285 74

Multicontext resource management Virtual Screen 0 Virtual Screen 1 gltexsubimage2d(...); OpenGL Share Group DrawTexture(); glcopytexsubimage(); AMD Radeon HD 4870 NVIDIA GeForce GTX 285 75

Multicontext resource management Virtual Screen 0 Virtual Screen 1 gltexsubimage2d(...); OpenGL Share Group glcopytexsubimage(); AMD Radeon HD 4870 NVIDIA GeForce GTX 285 76

Performance tips Do enough work! You must take into account fill/compute rate vs. transfer cost Be mindful of 4x vs. 16x PCI-E slots in Mac Pro Decouple GPU workloads as much as possible Don t rely upon automatic resource synchronization for performance Consider using extra buffering between GPUs to avoid stalls Avoid display bottlenecks 77

Abe Stephens OpenCL Engineer 78

79

Resource sharing made easy High-level abstraction around shared memory Very efficient cross-process and cross-api data sharing Integrated directly into GPU software stack Hides details about moving data between CPU and GPUs 80

GPU integration OpenGL texture can be bound to an IOSurface Rendering is done by binding an IOSurface texture to an FBO Currently supported in OpenCL via OpenGL context sharing All textures bound to an IOSurface share the same video memory All GPUs share the same backing memory 81

OpenGL texture creation example iosurfacebuffer = IOSurfaceCreate((CFDictionaryRef) [NSDictionary dictionarywithobjectsandkeys: [NSNumber numberwithint:256], (id)kiosurfacewidth, [NSNumber numberwithint:256], (id)kiosurfaceheight, [NSNumber numberwithint:4], (id)kiosurfacebytesperelement, nil]); glgentextures(1, &name); glbindtexture(gl_texture_rectangle_ext, name); err = CGLTexImageIOSurface2D( cgl_ctx, GL_TEXTURE_RECTANGLE_EXT, GL_RGBA, /* Internal format */ 256, 256, /* width, height */ GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, /* format/type */ iosurfacebuffer, 0 /* plane */); 82

OpenGL texture creation example iosurfacebuffer = IOSurfaceCreate((CFDictionaryRef) [NSDictionary dictionarywithobjectsandkeys: [NSNumber numberwithint:256], (id)kiosurfacewidth, [NSNumber numberwithint:256], (id)kiosurfaceheight, [NSNumber numberwithint:4], (id)kiosurfacebytesperelement, nil]); glgentextures(1, &name); glbindtexture(gl_texture_rectangle_ext, name); err = CGLTexImageIOSurface2D( cgl_ctx, GL_TEXTURE_RECTANGLE_EXT, GL_RGBA, /* Internal format */ 256, 256, /* width, height */ GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, /* format/type */ iosurfacebuffer, 0 /* plane */); 83

OpenGL texture creation example iosurfacebuffer = IOSurfaceCreate((CFDictionaryRef) [NSDictionary dictionarywithobjectsandkeys: [NSNumber numberwithint:256], (id)kiosurfacewidth, [NSNumber numberwithint:256], (id)kiosurfaceheight, [NSNumber numberwithint:4], (id)kiosurfacebytesperelement, nil]); glgentextures(1, &name); glbindtexture(gl_texture_rectangle_ext, name); err = CGLTexImageIOSurface2D( cgl_ctx, GL_TEXTURE_RECTANGLE_EXT, GL_RGBA, /* Internal format */ 256, 256, /* width, height */ GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, /* format/type */ iosurfacebuffer, 0 /* plane */); 84

OpenGL texture creation example iosurfacebuffer = IOSurfaceCreate((CFDictionaryRef) [NSDictionary dictionarywithobjectsandkeys: [NSNumber numberwithint:256], (id)kiosurfacewidth, [NSNumber numberwithint:256], (id)kiosurfaceheight, [NSNumber numberwithint:4], (id)kiosurfacebytesperelement, nil]); glgentextures(1, &name); glbindtexture(gl_texture_rectangle_ext, name); err = CGLTexImageIOSurface2D( cgl_ctx, GL_TEXTURE_RECTANGLE_EXT, GL_RGBA, /* Internal format */ 256, 256, /* width, height */ GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, /* format/type */ iosurfacebuffer, 0 /* plane */); 85

Synchronization rules IOSurface follows the Mac OS X OpenGL flush/bind rule CPU writes must be completed before a bind IOSurfaceLock(mySurface, 0, NULL); /* Write to surface here */ IOSurfaceUnlock(mySurface, 0, NULL); glbindtexture(gl_texture_rectangle, mysurfacetexname); /* Do something with texture */ 86

Synchronization rules IOSurface follows the Mac OS X OpenGL flush/bind rule CPU writes must be completed before a bind GPU flush must happen before any CPU reads or writes /* Render to IOSurface */ IOSurfaceLock(mySurface, kiosurfacelockreadonly, NULL); /* Read from surface here */ IOSurfaceUnlock(mySurface, kiosurfacelockreadonly, NULL); 87

Performance tips and tricks Automatic synchronization is easy, but not asynchronous Take advantage of IOSurfaceLock() to implement double buffering Control over when data is written back to main memory Shared backing means no CPU copies Use IOSurface to cast from one format/type to another Example: Manipulate luminance data as quarter-width RGBA Total data sizes much match 88

Example usage cases Application plug-ins Common abstraction for CPU- or GPU-based plug-ins Client/server situations Pass IOSurfaces efficiently between processes Resources already on the GPU stay there Combine both Run plug-ins in a different address space, on a different architecture, and even on a different GPU! 89

Kenneth Dyke Sr. Mad Scientist 90

Support systems with multiple GPUs whenever possible Take advantage of multiple GPUs when beneficial Use IOSurface to help Read the sample code! 91

Allan Schaffer Graphics and Game Technologies Evangelist aschaffer@apple.com Apple Developer Forums http://devforums.apple.com 92

Harnessing OpenCL in Your Application Maximizing OpenCL Performance OpenGL for Mac OS X Russian Hill Wednesday 3:15PM Russian Hill Wednesday 4:30PM Nob Hill Thursday 9:00AM 93

OpenCL Lab OpenGL for Mac OS X Lab Graphics and Media Lab C Thursday 9:00AM Graphics and Media Lab C Thursday 2:00PM 94

95

96