BOF Siggraph 2012 Barthold Lichtenbelt OpenGL ARB chair
OpenGL BOF Agenda
Latest news and features in OpenGL - Barthold Lichtenbelt, NVIDIA
Cool things you never dreamed you could do with OpenGL? - Bill Licea-Kane, AMD
Left 4 Dead 2 Linux: From 6 to 300 FPS in OpenGL - Rich Geldreich, Valve
20 years of OpenGL - Kurt Akeley, co-founder of SGI and the OpenGL API
Followed by Party!
Trivia throughout
Copyright Khronos Group, 2010 - Page 2

Sponsors
Rob Barris
Tadamasa Teranishi
Tomohiro Matsumoto
Jesse Barker
Lingjun Chen
Glenn Fredericks
Masahito Hirose
John Kessenich
Arzhange Safdarzadeh
Tom Olson
Lawrence McDonough
Mark Kilgard
Takeshi Haga
Cass Everitt
Takeshi Hirai
Ian Romanick
Laurent Billy
Benj Lipchak
Sergey Kosarevsky
Christophe Riccio
Dominic Agoro-Ombaka
Vicki and Dave Shreiner
Kentaro Suzuki a.k.a. hole
Kentaro Oku "kioku/system K
The English Tiddlywinks Association
Several anonymous sponsors

OpenGL is 20 years old today!

OpenGL 20th Birthday - Then and Now
[Images: "Ideas in Motion" - SGI; "Rage" - id Software]
1992: Reality Engine (8 Geometry Engines, 4 Raster Manager boards)
2012 Mobile: NVIDIA Tegra 3 (Nexus 7 Android Tablet)
2012 PC: NVIDIA GeForce GTX 680 (Kepler GK104)

                                  1992      2012 Mobile     2012 PC
Triangles / sec (millions)        1         103 (x103)      1800 (x1800)
Pixel Fragments / sec (millions)  240       1040 (x4.3)     14,400 (x60)
GigaFLOPS                         0.64      15.6 (x25)      3090 (x4830)
Power                             1.5KW     <5W

OpenGL Latest Updates
Games
  - Steam's Left 4 Dead 2 on Linux uses OpenGL (7/2012)
  - http://www.extremetech.com/gaming/133824-valve-opengl-is-faster-than-directx-even-on-windows
  - Doom3 source code released (11/2011)
Books
  - OpenGL Insights released (8/2012)
  - OpenGL 4.0 Shading Language Cookbook released (1/2012)
  - Graphics Shaders: Theory and Practice, second edition released (11/2011)
  - Learning Modern 3D Graphics Programming (2012)
  - http://www.arcsynthesis.org/gltut/

OpenGL Ecosystem News
Tools updated to support OpenGL 4.2
  - GLView (2/2012)
  - GLEW (7/2012) and GL3W
  - GLIntercept (11/2011)
  - http://www.g-truc.net/ (8/2011)
New projects
  - GLCapsViewer - http://delphigl.de/glcapsviewer/listreports.php (8/2011)
  - Regal for OpenGL - https://github.com/p3/regal (2012)
  - Proland - http://proland.inrialpes.fr/index.html (5/2012)
New Tutorials
  - http://www.opengl-tutorial.org/

Announcing 4.3

Accelerating OpenGL Innovation
Bringing state-of-the-art functionality to cross-platform graphics
[Timeline figure, 2004 to 2012. OpenGL releases: 2.0, 2.1, 3.0, 3.1, 3.2, 3.3/4.0, 4.1, 4.2, 4.3. DirectX releases: 9.0c, 10.0, 10.1, 11, 11.1.]

What is new in OpenGL 4.3?
Texture functionality
  - ARB_texture_view
  - ARB_internalformat_query2
  - ARB_copy_image
  - ARB_texture_buffer_range
  - ARB_stencil_texturing
  - ARB_texture_storage_multisample
Buffer functionality
  - ARB_shader_storage_buffer_object
  - ARB_invalidate_subdata
  - ARB_clear_buffer_object
  - ARB_vertex_attrib_binding
  - ARB_robust_buffer_access_behavior

What is new in OpenGL 4.3?
Pipeline functionality
  - ARB_compute_shader
  - ARB_multi_draw_indirect
  - KHR_debug
  - ARB_program_interface_query
  - ARB_ES3_compatibility
Extensions
  - KHR_texture_compression_astc_ldr
  - ARB_robustness_isolation
GLSL 4.3 functionality
  - ARB_shader_image_size
  - ARB_explicit_uniform_location
  - ARB_texture_query_levels
  - ARB_arrays_of_arrays
  - ARB_fragment_layer_viewport

OpenGL 4.3 Pipelines
[Pipeline diagram. Arrows indicate data flow. Legend: Fixed Function Stage, Programmable Stage, "b" = Buffer Binding, "t" = Texture Binding.]
Graphics stages (from application): Vertex Puller, Vertex Shader, Tessellation Control Shader, Tessellation Primitive Gen., Tessellation Eval. Shader, Geometry Shader, Transform Feedback, Rasterization, Fragment Shader, Per-Fragment Operations, Framebuffer
Compute stages (from application): Dispatch, Compute Shader
Pixel path (from application): Pixel Assembly, Pixel Operations, Texture Image (t), Pixel Pack
Buffer bindings (b): Element Array Buffer, Draw Indirect Buffer, Dispatch Indirect Buffer, Vertex Buffer Object, Atomic Counter, Shader Storage, Transform Feedback Buffer, Uniform Block, Pixel Unpack Buffer, Pixel Pack Buffer
Texture or buffer bindings (t/b): Image Load / Store, Texture Fetch

Compute Shaders
Execute algorithmically general-purpose GLSL shaders
  - Operate on buffers, images and textures
Process graphics data in the context of the graphics pipeline
  - Easier than interoperating with a compute API if the processing is close to the pixel
Complementary to OpenCL
  - Not a full heterogeneous (CPU/GPU) programming framework using full ANSI C
Standard part of all OpenGL 4.3 implementations
  - Matches DirectX 11 functionality
Example uses: Image processing, AI Simulation, Ray Tracing, Wave Simulation, Global Illumination

Compute Shaders for Physics Processing
Credit: Dr. Mike Bailey, Oregon State University
Also: notes and sample code on OpenGL Compute Shader
  - http://web.engr.oregonstate.edu/~mjb/sig12/

Compute programming model
[Diagram: a dispatch of 3 x 2 work groups, Work Group (0,0) through Work Group (2,1). Work Group (1,1) is expanded into 4 x 2 invocations, Invocation (0,0) through Invocation (3,1).]
Values for the highlighted invocation:
    gl_WorkGroupSize = (4,2,0)
    gl_WorkGroupID = (1,1,0)
    gl_LocalInvocationID = (2,1,0)
    gl_GlobalInvocationID = (6,3,0)
Built-in variables:
    in uvec3 gl_NumWorkGroups;        // Number of work groups dispatched
    const uvec3 gl_WorkGroupSize;     // Size of each work group for current shader
    in uvec3 gl_WorkGroupID;          // Index of current work group being executed
    in uvec3 gl_LocalInvocationID;    // Index of current invocation in a work group
    in uvec3 gl_GlobalInvocationID;   // Unique ID across all work groups and invocations

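The numbers on this slide follow from one rule: the global invocation ID is the work group ID times the work group size, plus the local invocation ID, applied per component. Below is a minimal CPU-side sketch in C that reproduces the arithmetic. The `uvec3` struct and the function name are stand-ins invented for this illustration, not part of any OpenGL header. The sketch assumes a work group size of 1 in z, because a real work group cannot have a zero-sized dimension.

```c
#include <assert.h>

/* Hypothetical stand-in for GLSL's uvec3, for illustration only. */
typedef struct { unsigned x, y, z; } uvec3;

/* Per component:
 * gl_GlobalInvocationID = gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID */
static uvec3 global_invocation_id(uvec3 group_id, uvec3 group_size, uvec3 local_id)
{
    uvec3 g;

    /* A local ID must lie inside its work group. */
    assert(local_id.x < group_size.x);
    assert(local_id.y < group_size.y);
    assert(local_id.z < group_size.z);

    g.x = group_id.x * group_size.x + local_id.x;
    g.y = group_id.y * group_size.y + local_id.y;
    g.z = group_id.z * group_size.z + local_id.z;
    return g;
}
```

With work group ID (1,1,0), work group size (4,2,1) and local invocation ID (2,1,0), the function returns (6,3,0), matching the slide.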
Compute memory hierarchy
[Diagram: a dispatch made of work groups, each made of invocations (threads). Labels: Invocation Local Variables; Work Group Shared Variables; Shader Storage Buffer Object (SSBO); Uniform Buffer Object (UBO); Texture Buffer Object (TexBO); Texture; Image.]
Use void barrier() to synchronize invocations in a work group
Use memory barriers to order reads/writes accessible to other invocations
    void memoryBarrier();
    void memoryBarrierAtomicCounter();
    void memoryBarrierBuffer();
    void memoryBarrierImage();
    void memoryBarrierShared();   // Only for compute shaders
    void groupMemoryBarrier();    // Only for compute shaders

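To show where these barriers typically sit, here is a CPU-side sketch in C of a work group summing values through a shared array. It is an emulation under stated assumptions, not GLSL: the inner loops stand in for invocations running in parallel, `GROUP_SIZE` is a hypothetical local size, and the comments mark where a compute shader would plausibly call `memoryBarrierShared()` and `barrier()` between phases.

```c
#include <assert.h>

#define GROUP_SIZE 8u  /* hypothetical work group size; must be a power of two */

/* Emulates one work group reducing GROUP_SIZE inputs to a single sum. */
static float workgroup_sum(const float *input)
{
    float shared_data[GROUP_SIZE];  /* stands in for a GLSL `shared` array */
    unsigned id, stride;

    assert(input != 0);

    /* Phase 0: each invocation stores its own element. */
    for (id = 0; id < GROUP_SIZE; ++id)
        shared_data[id] = input[id];
    /* GLSL: memoryBarrierShared(); barrier(); */

    /* Each pass halves the number of active invocations. Within a pass,
     * invocation `id` writes slot id and reads slot id + stride, so no two
     * invocations touch the same slot; the barrier separates the passes. */
    for (stride = GROUP_SIZE / 2; stride > 0; stride /= 2) {
        for (id = 0; id < stride; ++id)
            shared_data[id] += shared_data[id + stride];
        /* GLSL: memoryBarrierShared(); barrier(); */
    }
    return shared_data[0];
}
```

Without the synchronization points, an invocation in a real shader could read a slot before the invocation responsible for it had finished writing.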
Texture Views
View texture data store multiple ways
  - Re-interpret the format/type
  - Clamp mip-map level range
  - Clamp array slice range
No new object types introduced
Conceptual split of a texture object
  - Data store holding texels
  - View state describing which part of data store to use
  - View state describing how to interpret elements in data store
  - An embedded sampler object
  - Texture parameters
Multiple textures share same data store
  - Data store ref counted

Texture Views
[Diagram: a Texture Object, created with TexStorage*(), holds Sampler Parameters (mutable), Texture Parameters and Texture View Parameters (immutable). A separate Sampler Object with Sampler Parameters (mutable) is used instead if bound. The Texture Lookup Hardware reads the texture levels selected by the view from the mipmap chain of Texel Data (mutable, ref counted) and feeds the rest of the pipeline.]

Creation of New Texture View
[Diagram: the original Texture Object, created with TexStorage*(), and a New Texture Object, created with TextureView(). Each has Sampler Parameters (mutable), Texture Parameters and Texture View Parameters (immutable), and each uses a Sampler Object if one is bound. In the New Texture Object, Texture Parameters and Sampler Parameters are reset to default. Both objects select texture levels by view from the same mipmap chain of Texel Data (mutable, ref counted) for the Texture Lookup Hardware, which feeds the rest of the pipeline.]
New texture state set with TextureView()
    enum internalformat   // base internal format
    enum target           // texture target
    uint minlevel         // first level of mipmap
    uint numlevels        // number of mipmap levels
    uint minlayer         // first layer of array texture
    uint numlayers        // number of layers in array

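The level and layer arguments can be read as offsets into the parent texture's own view. The C sketch below models that bookkeeping on the CPU. It assumes the behavior described in the ARB_texture_view extension as I understand it: `minlevel` and `minlayer` are relative to the parent's view, and `numlevels` and `numlayers` are clamped to what the parent has left. `ViewRange` and `make_view_range` are hypothetical names for this illustration, and no OpenGL call is made.

```c
#include <assert.h>

/* Hypothetical record of the level/layer window a texture view exposes. */
typedef struct {
    unsigned min_level, num_levels;
    unsigned min_layer, num_layers;
} ViewRange;

static unsigned min_u(unsigned a, unsigned b) { return a < b ? a : b; }

/* Derives the window of a new view from its parent's window.
 * Returns 0 on success, -1 if the requested start lies outside the parent
 * (where the real API would be expected to report an error). */
static int make_view_range(const ViewRange *parent,
                           unsigned minlevel, unsigned numlevels,
                           unsigned minlayer, unsigned numlayers,
                           ViewRange *out)
{
    assert(parent != 0 && out != 0);

    if (minlevel >= parent->num_levels || minlayer >= parent->num_layers)
        return -1;

    /* Offsets accumulate, so a view of a view still indexes the shared
     * texel data correctly. */
    out->min_level  = parent->min_level + minlevel;
    out->num_levels = min_u(numlevels, parent->num_levels - minlevel);
    out->min_layer  = parent->min_layer + minlayer;
    out->num_layers = min_u(numlayers, parent->num_layers - minlayer);
    return 0;
}
```

For a parent with 10 levels and 6 layers, requesting levels starting at 2 (4 of them) and layers starting at 1 (100 of them) yields levels 2 to 5 and layers 1 to 5, since only five layers remain.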
KHR_debug
Builds on ARB_debug_output
Callback with debug information
  - Or write to log
Messages grouped by {source, type, ID, severity}
  - Source: GL API, GLSL shader, application, third-party, debugger
  - Type: Error, performance, undefined behavior, portability
  - ID: Unique identifier for each message
  - Severity: High, medium, low
Label objects
  - Human-readable text
Annotate the command stream
  - Markers: Identify some event in your code
  - Groups: Encapsulate command stream and control debug verbosity

MultiDraw*Indirect()
MultiDraw{Arrays/Elements}Indirect
  - Combines MultiDraw with DrawIndirect
MultiDraw
  - MultiDraw functions can help reduce validation overhead, especially for many low-complexity draw calls, while keeping each sub-object addressable
  - CAD example: individual model features (bevels)
DrawIndirect
  - Store draw command inputs in host or GPU buffers
Provides efficient system for GPU to generate its own work
  - Use XFB or SSBO/compute to write the draw command buffers
  - For example for culling (setting count to zero), LOD picking (changing count/firstIndex)...
[Diagram: Host or Compute Shader -> MultiDraw Indirect Buffer -> Graphics Shader]
    struct DEICommand {
        uint count;
        uint instanceCount;
        uint firstIndex;
        int baseVertex;
        uint baseInstance;
    };

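The culling idea can be sketched on the CPU: each draw is one 20-byte record, and a draw is skipped by zeroing its `count`. The C sketch below mirrors the slide's `DEICommand` layout with fixed-width types. `cull_commands` and the `visible` flags are hypothetical names for this illustration; in practice a compute shader or transform feedback pass would write the same records into a buffer on the GPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field order as the slide's DEICommand: five tightly packed 32-bit values. */
typedef struct {
    uint32_t count;          /* number of indices to draw */
    uint32_t instanceCount;  /* number of instances */
    uint32_t firstIndex;     /* offset into the element array */
    int32_t  baseVertex;     /* value added to each index */
    uint32_t baseInstance;   /* first instance ID */
} DEICommand;

/* Zeroes `count` for every command whose visibility flag is 0, so the draw
 * still occupies its slot but produces no primitives.
 * Returns the number of commands left active. */
static size_t cull_commands(DEICommand *cmds, const unsigned char *visible, size_t n)
{
    size_t kept = 0;
    size_t i;

    assert(cmds != 0 && visible != 0);
    assert(sizeof(DEICommand) == 20);  /* no padding between records */

    for (i = 0; i < n; ++i) {
        if (visible[i])
            ++kept;
        else
            cmds[i].count = 0;
    }
    return kept;
}
```

Because culled records keep their position, the number of draws passed to the multi-draw call does not change, and each sub-object remains addressable by its index in the array.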
Shader Storage Buffer Objects (SSBO)
Read/write and atomic operations on variables stored in a buffer object
  - Think writeable UBOs
New binding point SHADER_STORAGE_BUFFER
  - Queryable limits on number of storage blocks per shader type
  - MAX_<SHADER>_STORAGE_BLOCKS
Support large buffers
  - Implementations must support storage blocks of at least 16 MB
New std430 memory layout
  - Pack scalar arrays more efficiently
Can use C-style code in a shader to read/write
  - Example on next slide
Especially useful for compute shaders
  - No built-in outputs
  - Data transfer has to be through buffers or images

Shader Storage Buffer Objects (SSBO)
Application code:
    glGenBuffers(1, &posssbo);
    glBindBuffer(GL_SHADER_STORAGE_BUFFER, posssbo);
    glBufferData(GL_SHADER_STORAGE_BUFFER, ...);
    glUseProgram(myshaderprogram);
    glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 4, posssbo);
Shader code:
    struct MyVertex {
        vec2 tex[2];      // tightly packed array in std430
        vec3 pos;
        int materialidx;
    };
    layout(std430, binding = 4) buffer {
        MyVertex Vertices[];   // unsized array allowed at end of buffer
    };
    ...                                  // compute data to store in Vertices[]
    Vertices[i].materialidx = idx;       // directly write to buffer content

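The "tightly packed" comment is easiest to see by computing member offsets by hand. The C sketch below works through `MyVertex` under both layouts, using the standard rules as I understand them: std140 rounds array element alignment and stride up to 16 bytes, std430 does not, and a `vec3` is aligned to 16 bytes in both. `MyVertexLayout` and `my_vertex_layout` are hypothetical helpers for this illustration; an application that needs exact offsets can also query them from the driver rather than rely on hand computation.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds `offset` up to the next multiple of `alignment`. */
static size_t align_up(size_t offset, size_t alignment)
{
    assert(alignment > 0);
    return (offset + alignment - 1) / alignment * alignment;
}

/* Byte offsets of MyVertex's members plus the stride between array elements. */
typedef struct { size_t tex, pos, materialidx, stride; } MyVertexLayout;

static MyVertexLayout my_vertex_layout(int use_std140)
{
    MyVertexLayout l;
    size_t offset = 0;

    /* vec2 tex[2]: a vec2 is 8 bytes. std430 keeps an 8-byte element stride;
     * std140 rounds the stride and alignment up to 16 bytes. */
    size_t tex_stride = use_std140 ? 16 : 8;
    offset = align_up(offset, tex_stride);
    l.tex = offset;
    offset += 2 * tex_stride;

    /* vec3 pos: 12 bytes of data, aligned to 16 in both layouts. */
    offset = align_up(offset, 16);
    l.pos = offset;
    offset += 12;

    /* int materialidx: 4 bytes, aligned to 4; it fits in the vec3's padding. */
    offset = align_up(offset, 4);
    l.materialidx = offset;
    offset += 4;

    /* The struct is aligned to its largest member alignment (16 here), which
     * also fixes the stride of the Vertices[] array. */
    l.stride = align_up(offset, 16);
    return l;
}
```

Under these rules each `MyVertex` occupies 32 bytes in std430 and 48 bytes in std140, so the std430 buffer is a third smaller for the same vertex count.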
New KHR and ARB extensions
KHR_texture_compression_astc_ldr
  - Adaptive Scalable Texture Compression (ASTC)
  - 1-4 components, low bit rates from <1 bit/pixel to 8 bits/pixel
ARB_robustness_isolation
  - If an application causes a GPU reset, no other application will be affected
All 4.3 functionality also available as ARB extensions
[Image comparison: Original 24bpp; ASTC Compression at 8bpp, 3.56bpp and 2bpp]

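The bit rates in the image comparison follow from how ASTC stores data: every compressed block is 128 bits, and the block footprint in texels sets the rate. The C sketch below shows the arithmetic. The function name is hypothetical, and the footprints named in the note (4x4, 6x6, 8x8, 12x12) are the ones I would expect to correspond to the slide's figures.

```c
#include <assert.h>

/* ASTC blocks are always 128 bits, so the rate depends only on how many
 * texels one block covers. */
static double astc_bits_per_texel(unsigned block_w, unsigned block_h)
{
    assert(block_w > 0 && block_h > 0);
    return 128.0 / (double)(block_w * block_h);
}
```

A 4x4 block gives 8 bits per texel, 6x6 gives about 3.56, 8x8 gives 2, and the largest 2D footprint, 12x12, gives about 0.89, which is where the "<1 bit/pixel" lower bound comes from.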
OpenGL 4.3 reference pages
Huge thanks to Graham Sellers!!!

Specification re-ordering
Shader and buffer centric
  - Fixed function interfaces described as alternates
Introduces concepts and objects at high level
  - Before being used later in document
Error summaries for commands
Removed duplication of language
Consistent uses of phrases and terminology
Aligned section numbering between Core and Compatibility profiles

Conclusion
OpenGL 4.3 adds major new functionality
  - Compute shaders
  - Advanced buffer management
  - Advanced texture management
  - Advanced GPU work creation
OpenGL usage is rising sharply
  - WebGL
  - Mobile platforms
  - Linux
OpenGL is 20 years old today!
  - Awesome achievement

Rest of the evening
Get Drink
Kurt Akeley Presentation
Toast
Party
  - LIVE DEMO: Viewperf 12
  - Play with the O2!
Get your drink and COME BACK to toast to OpenGL

OpenGL 4.3 details

OpenGL 4.3 new texture functionality
ARB_texture_view
  - Provide different ways to interpret texture data without duplicating the texture
  - Match DX11 functionality
ARB_internalformat_query2
  - Find out actual supported limits for most texture parameters
ARB_copy_image
  - Direct copy of pixels between textures and render buffers
ARB_texture_buffer_range
  - Create texture buffer object corresponding to a subrange of a buffer's data store
ARB_stencil_texturing
  - Read stencil bits of a packed depth-stencil texture
ARB_texture_storage_multisample
  - Immutable storage objects for multisampled textures

OpenGL 4.3 new buffer functionality
ARB_shader_storage_buffer_object
  - Enables all shader stages to read and write to very large buffers
  - structs, arrays, scalars, etc.
ARB_invalidate_subdata
  - Invalidate all or some of the contents of textures and buffers
ARB_clear_buffer_object
  - Clear a buffer object with a constant value
ARB_vertex_attrib_binding
  - Separate vertex attribute state from the data stores of each array
ARB_robust_buffer_access_behavior
  - Shader reads/writes to an object are only allowed to touch data owned by the application
  - Applies to out of bounds accesses

OpenGL 4.3 new pipeline functionality
ARB_compute_shader
  - Introduces new shader stage
  - Enables advanced processing algorithms that harness the parallelism of GPUs
ARB_multi_draw_indirect
  - Draw many GPU generated objects with one call
KHR_debug
  - Enhanced debug context support
ARB_program_interface_query
  - Generic API to enumerate active variables and interface blocks for each stage
  - Enumerate active variables in interfaces between separable program objects
ARB_ES3_compatibility
  - Features not previously present in OpenGL
  - Brings EAC and ETC2 texture compression formats

GLSL 4.3 new functionality
ARB_shader_image_size
  - Query size of an image in a shader
ARB_explicit_uniform_location
  - Set location of a default-block uniform in the shader
ARB_texture_query_levels
  - Query number of mipmap levels accessible through a sampler uniform
ARB_arrays_of_arrays
  - Allows multi-dimensional arrays in GLSL, e.g. float f[4][3];
ARB_fragment_layer_viewport
  - gl_Layer and gl_ViewportIndex now available to fragment shader

Texture object state
[Diagram: Texture Object containing Sampler Parameters (mutable), Texture Parameters and Texture View Parameters (immutable)]
Sampler Parameters
  - TEXTURE_BORDER_COLOR
  - TEXTURE_COMPARE_{FUNC,MODE}
  - TEXTURE_LOD_BIAS
  - TEXTURE_{MAX,MIN}_LOD
  - TEXTURE_{MAG,MIN}_FILTER
  - TEXTURE_WRAP_{S,T,R}
Texture View Parameters
  - <target>
  - TEXTURE_INTERNAL_FORMAT
  - TEXTURE_VIEW_{MIN,NUM}_LEVEL
  - TEXTURE_VIEW_{MIN,NUM}_LAYER
  - TEXTURE_IMMUTABLE_LEVELS
  - TEXTURE_SHARED_SIZE
  - TEXTURE_{RED,GREEN,BLUE,ALPHA,DEPTH,STENCIL}_SIZE
  - TEXTURE_{RED,GREEN,BLUE,ALPHA,DEPTH}_TYPE
  - IMAGE_FORMAT_COMPATIBILITY_TYPE
Texture Parameters
  - TEXTURE_WIDTH
  - TEXTURE_HEIGHT
  - TEXTURE_DEPTH
  - TEXTURE_SAMPLES
  - TEXTURE_FIXED_SAMPLE_LOCATIONS
  - TEXTURE_COMPRESSED
  - TEXTURE_COMPRESSED_IMAGE_SIZE
  - TEXTURE_IMMUTABLE_FORMAT
  - TEXTURE_SWIZZLE_{R,G,B,A}
  - TEXTURE_MAX_LEVEL
  - TEXTURE_BASE_LEVEL
  - DEPTH_STENCIL_TEXTURE_MODE
State is immutable, unless listed in italics