Working with Metal Overview
|
|
- Austin Grant
- 5 years ago
- Views:
Transcription
1 Graphics and Games #WWDC14 Working with Metal Overview Session 603 Jeremy Sandmel GPU Software 2014 Apple Inc. All rights reserved. Redistribution or public display not permitted without written permission from Apple.
2 Metal Dramatically reduced overhead Unified graphics and compute Precompiled shaders Efficient multithreading Designed for A7
3 Agenda Background API concepts Shading language Developer tools
4 Agenda Background API concepts Shading language Developer tools
5 10x more draw calls
6 About Draw Calls
7 About Draw Calls Each draw call requires its own state vector Shaders, states, textures, render targets, etc.
8 About Draw Calls Each draw call requires its own state vector Shaders, states, textures, render targets, etc. Changing state vectors can be expensive Translation to hardware commands
9 About Draw Calls Each draw call requires its own state vector Shaders, states, textures, render targets, etc. Changing state vectors can be expensive for the CPU Translation to hardware commands
10 About Draw Calls Your Application Each draw call requires its own state vector Shaders, states, textures, render targets, etc. Changing state vectors can be expensive for the CPU Translation to hardware commands Set Shaders Set Textures Set Vertex Buffers Draw #1 Set Shaders Set Blend Set Depth Test Draw #2
11 About Draw Calls Your Application Each draw call requires its own state vector Shaders, states, textures, render targets, etc. Set Shaders Set Textures Set Vertex Buffers Draw #1 Changing state vectors can be expensive Translation to hardware commands for the CPU Set Shaders Set Blend Set Depth Test Draw #2 Hardware commands
12 About Draw Calls Your Application Each draw call requires its own state vector Shaders, states, textures, render targets, etc. Set Shaders Set Textures Set Vertex Buffers Draw #1 Changing state vectors can be expensive Translation to hardware commands for the CPU Set Shaders Set Blend Set Depth Test Draw #2 Hardware commands
13 About Draw Calls Your Application Each draw call requires its own state vector Shaders, states, textures, render targets, etc. Set Shaders Set Textures Set Vertex Buffers Draw #1 Changing state vectors can be expensive Translation to hardware commands More draw calls per frame gives you for the CPU Set Shaders Set Blend Set Depth Test Draw #2 More unique objects More visual variety Hardware commands More freedom for game artists and designers
14 Before Metal
15 Before Metal Long history of GPU programming APIs Standards OpenGL, OpenCL Domains High level, low level, 2D, 3D Architectures Platforms, devices, GPUs
16 Before Metal Long history of GPU programming APIs Standards OpenGL, OpenCL Domains High level, low level, 2D, 3D Architectures Platforms, devices, GPUs Something was missing
17 Deep Integration
18 Deep Integration + +
19 Deep Integration + +
20 Deep Integration What if we took the same approach for GPU programming? + +
21 Deep Integration What if we took the same approach for GPU programming? + +
22 Deep Integration What if we took the same approach for GPU programming? + + =
23
24 Metal Design
25 Metal Design Thinnest possible API
26 Metal Design Thinnest possible API Modern GPU features
27 Metal Design Thinnest possible API Modern GPU features Do expensive tasks less often
28 Metal Design Thinnest possible API Modern GPU features Do expensive tasks less often Predictable performance
29 Metal Design Thinnest possible API Modern GPU features Do expensive tasks less often Predictable performance Explicit command submission
30 Metal Design Thinnest possible API Modern GPU features Do expensive tasks less often Predictable performance Explicit command submission Optimized for CPU behavior
31 Your App GPU
32 Your App Scene Graphs SceneKit SpriteKit GPU
33 Your App Scene Graphs 2D Graphics and Imaging SceneKit SpriteKit Core Animation Core Image Core Graphics GPU
34 Your App Scene Graphs 2D Graphics and Imaging Standards-Based 3D Graphics SceneKit SpriteKit Core Animation Core Image Core Graphics OpenGL ES GPU
35 Your App Scene Graphs 2D Graphics and Imaging Standards-Based 3D Graphics High Efficiency GPU Access SceneKit SpriteKit Core Animation Core Image Core Graphics OpenGL ES Metal GPU
36 So how did we do this?
37 Frame Times Many games target frame rate of 30 FPS (33.3 milliseconds/frame)
38 Frame Times Many games target frame rate of 30 FPS (33.3 milliseconds/frame) 0 ms 33.3 ms 66.7 ms 100 ms
39 Frame Times Many games target frame rate of 30 FPS (33.3 milliseconds/frame) 0 ms 33.3 ms 66.7 ms 100 ms CPU work frame N-1 CPU work frame N GPU work frame N-1 GPU work frame N
40 Frame Times Many games target frame rate of 30 FPS (33.3 milliseconds/frame) 0 ms 33.3 ms 66.7 ms 100 ms CPU work frame N-1 CPU work frame N CPU work frame N+1 CPU work frame N+2 GPU work frame N-1 GPU work frame N GPU work frame N+1 GPU work frame N+2
41 One Balanced Frame 0 ms 33.3 ms CPU CPU work frame N GPU GPU work frame N-1
42 CPU Can Take More Time Than GPU 0 ms 33.3 ms CPU CPU work frame N GPU GPU work frame N-1
43 CPU Can Take More Time Than GPU 0 ms 33.3 ms CPU CPU work frame N GPU GPU work frame N-1 GPU idle time GPU utilization = 67%
44 More GPU Work Requires More CPU Work 0 ms 33.3 ms 50 ms CPU CPU work frame N GPU GPU work frame N-1 GPU idle time
45 CPU Time Includes Application and GPU API 0 ms 33.3 ms 50 ms CPU Application frame N CPU work frame N GPU API frame N GPU GPU work frame N-1 GPU idle time
46 Targeting CPU Time Spent in GPU API 0 ms 33.3 ms 50 ms CPU Application frame N GPU API frame N GPU GPU work frame N-1 GPU idle time
47 Metal Dramatically Reduces GPU API Time 0 ms GPU API frame N 20 ms 33.3 ms CPU Application frame N CPU idle time GPU GPU work frame N-1
48 Metal Dramatically Reduces GPU API Time 0 ms GPU API frame N 20 ms 33.3 ms CPU Application frame N CPU idle time GPU GPU work frame N-1
49 Use CPU Time to Improve Your Game 0 ms 20 ms 33.3 ms CPU Application frame N More Physics More AI CPU idle time GPU GPU work frame N-1
50 Use CPU Time to Draw More Objects 0 ms 33.3 ms CPU Application frame N More Physics More AI Up to 10x more draw calls GPU GPU work frame N-1
51 Use CPU Time to Draw More Objects 0 ms 33.3 ms CPU Application frame N More Physics More AI Up to 10x more draw calls GPU GPU work frame N-1
52 Why Is GPU Programming Expensive?
53 Why Is GPU Programming Expensive? State validation Confirming API usage is valid Encoding API state to hardware state
54 Why Is GPU Programming Expensive? State validation Confirming API usage is valid Encoding API state to hardware state Shader compilation Run-time generation of shader machine code Interactions between state and shaders
55 Why Is GPU Programming Expensive? State validation Confirming API usage is valid Encoding API state to hardware state Shader compilation Run-time generation of shader machine code Interactions between state and shaders Sending work to GPU Managing resource residency Batching commands
56 Do Expensive Tasks Less Often When Frequency Application build Content loading Draw time
57 Do Expensive Tasks Less Often When Frequency Application build Never Content loading Draw time
58 Do Expensive Tasks Less Often When Frequency Application build Never Content loading Rare Draw time
59 Do Expensive Tasks Less Often When Frequency Application build Never Content loading Rare Draw time 1000s of times per frame
60 Do Expensive Tasks Less Often When Frequency Before Metal Application build Never Content loading Rare Draw time 1000s of times per frame Shader compilation State validation Start work on GPU
61 Do Expensive Tasks Less Often When Frequency Before Metal After Metal Application build Never Shader compilation Content loading Rare State validation Shader compilation Draw time 1000s of times per frame State validation Start work on GPU Start work on GPU
62 Agenda Background API concepts Shading language Developer tools
63 Metal Objects Objects Purpose
64 Metal Objects Objects Purpose Device The GPU
65 Metal Objects Objects Purpose Device The GPU Command Queue Serial sequence of command buffers
66 Metal Objects Objects Purpose Device The GPU Command Queue Serial sequence of command buffers Command Buffer Contains GPU hardware commands
67 Metal Objects Objects Purpose Device The GPU Command Queue Serial sequence of command buffers Command Buffer Contains GPU hardware commands Command Encoder Translates API commands to GPU hardware commands
68 Metal Objects Objects Purpose Device The GPU Command Queue Serial sequence of command buffers Command Buffer Contains GPU hardware commands Command Encoder Translates API commands to GPU hardware commands State Framebuffer configuration, blend, depth, samplers, etc.
69 Metal Objects Objects Purpose Device The GPU Command Queue Serial sequence of command buffers Command Buffer Contains GPU hardware commands Command Encoder Translates API commands to GPU hardware commands State Framebuffer configuration, blend, depth, samplers, etc. Code Shaders
70 Metal Objects Objects Purpose Device The GPU Command Queue Serial sequence of command buffers Command Buffer Contains GPU hardware commands Command Encoder Translates API commands to GPU hardware commands State Framebuffer configuration, blend, depth, samplers, etc. Code Shaders Resources Textures and Data Buffers (vertices, constants, etc.)
71 Key Resource Code State Device Descriptor System
72 Command Queue Device Key Resource Code State Descriptor System
73 Command Command Command Buffer Buffer Key Resource Code Command Queue Device State Descriptor System
74 Render Command Encoder Render Render Command Command Encoder Encoder Command Command Command Buffer Buffer Key Resource Code Command Queue Device State Descriptor System
75 Data Buffer Resource Data Data Buffer Buffer Resource Resource Render Command Encoder Render Render Command Command Encoder Encoder Command Command Command Buffer Buffer Key Resource Code Command Queue Device State Descriptor System
76 Data Buffer Resource Data Data Buffer Buffer Resource Resource Texture Resource Texture Texture Resource Resource Texture Descriptor Texture Texture Descriptor Descriptor Render Command Encoder Render Render Command Command Encoder Encoder Command Command Command Buffer Buffer Key Resource Code Command Queue Device State Descriptor System
77 Data Buffer Resource Data Data Buffer Buffer Resource Resource Render Command Encoder Render Render Command Command Encoder Encoder Texture Resource Texture Texture Resource Resource Render Pipeline State Texture Descriptor Texture Texture Descriptor Descriptor Render Pipeline Descriptor Vertex Format Vertex Function Fragment Function Vertex Descriptor Function Function Blend Blend Descriptor Command Command Command Buffer Buffer Key Resource Code Command Queue Device State Descriptor System
78 Data Buffer Resource Data Data Buffer Buffer Resource Resource Render Command Encoder Render Render Command Command Encoder Encoder Texture Resource Texture Texture Resource Resource Render Pipeline State Texture Descriptor Texture Texture Descriptor Descriptor Render Pipeline Descriptor Vertex Format Vertex Function Fragment Function Vertex Descriptor Function Function Depth Stencil State Depth Stencil Descriptor Blend Blend Descriptor Command Command Command Buffer Buffer Key Resource Code Command Queue Device State Descriptor System
79 Data Buffer Resource Data Data Buffer Buffer Resource Resource Render Command Encoder Render Render Command Command Encoder Encoder Texture Resource Texture Texture Resource Resource Render Pipeline State Texture Descriptor Texture Texture Descriptor Descriptor Render Pipeline Descriptor Vertex Format Vertex Function Fragment Function Vertex Descriptor Function Function Depth Stencil State Depth Stencil Descriptor Blend Blend Descriptor Render Pass Descriptor Attachment Descriptor Attachment Attachment Descriptor Descriptor Texture Resource Texture Texture Resource Resource Texture Descriptor Texture Texture Descriptor Descriptor Command Command Command Buffer Buffer Key Resource Code Command Queue Device State Descriptor System
80 Data Data Buffer Data Buffer Buffer Resource Resource Resource Texture Texture Texture Resource Resource Resource Render Command Encoder Render Render Command Command Encoder Encoder Render Pipeline State Depth Stencil State Texture Texture Texture Resource Resource Resource Command Command Command Buffer Buffer Command Queue Device
81 Data Data Buffer Data Buffer Buffer Resource Resource Resource Changeable Source Buffers Texture Texture Texture Resource Resource Resource Changeable Source Textures Render Command Encoder Render Render Command Command Encoder Encoder Render Pipeline State Depth Stencil State Texture Texture Texture Resource Resource Resource Command Command Command Buffer Buffer Command Queue Device
82 Data Data Buffer Data Buffer Buffer Resource Resource Resource Changeable Source Buffers Texture Texture Texture Resource Resource Resource Changeable Source Textures Render Command Encoder Render Render Command Command Encoder Encoder Render Pipeline State Depth Stencil State Texture Texture Texture Resource Resource Resource Fixed Render Target Textures Command Command Command Buffer Buffer Command Queue Device
83 Render Command Encoder Render Command Encoder Render Command Encoder Command Buffer Command Queue Device
84 Render Command Encoder Compute Command Encoder Render Command Encoder Command Buffer Command Queue Device
85 Thread #1 Thread #2 Render Command Encoder Compute Command Encoder Render Command Encoder Blit Command Encoder Compute Command Encoder Render Command Encoder Command Buffer Command Buffer Command Queue Device
86 Command Submission Model Command encoders convert API commands into hardware commands
87 Command Submission Model Command encoders convert API commands into hardware commands Hardware commands stored in command buffers
88 Command Submission Model Command encoders convert API commands into hardware commands Hardware commands stored in command buffers Three types of command encoders Render, Compute, Blit Can interleave different types into single command buffer Avoids implicit expensive state save and restore operations
89 Command Submission Model
90 Command Submission Model Explicit command buffer construction and submission App creates many lightweight command buffers App controls command buffer submission Metal signals app when command buffers finish execution
91 Command Submission Model Explicit command buffer construction and submission App creates many lightweight command buffers App controls command buffer submission Metal signals app when command buffers finish execution Command encoders generate commands immediately No deferred state validation Direct call to driver
92 Command Submission Model Multithreaded command encoding Multiple command buffers can be encoded in parallel App decides execution order Very efficient implementation to ensure scalable performance
93 Resource Update Model
94 Resource Update Model Designed for A7 s unified memory system CPU and GPU share same storage No implicit copies Automatic CPU/GPU coherency model - CPU and GPU observe writes at command buffer execution boundaries - No explicit CPU cache management required
95 Resource Update Model Designed for A7 s unified memory system CPU and GPU share same storage No implicit copies Automatic CPU/GPU coherency model - CPU and GPU observe writes at command buffer execution boundaries - No explicit CPU cache management required Significantly higher performance More synchronization responsibilities for you
96 Resource Update Model
97 Resource Update Model Two types of resources Textures (formatted images) - Render targets, texture sources Data buffers (unformatted memory) - a bag of bytes - Vertex data, shader constants, output memory, etc.
98 Resource Update Model Two types of resources Textures (formatted images) - Render targets, texture sources Data buffers (unformatted memory) - a bag of bytes - Vertex data, shader constants, output memory, etc. Resource structure (size, levels, format) can t be changed Avoids costly resource validation Resource contents can be changed
99 Resource Update Model
100 Resource Update Model Updating data buffers Direct access to storage by CPU No lock API needed
101 Resource Update Model Updating data buffers Direct access to storage by CPU No lock API needed Updating textures Implementation private storage Metal provides blazing fast texture update routines
102 Resource Update Model Updating data buffers Direct access to storage by CPU No lock API needed Updating textures Implementation private storage Metal provides blazing fast texture update routines GPU-accelerated and pipelined updates Via Blit command encoder
103 Resource Update Model
104 Resource Update Model Can share texture storage with other textures Interpret as different pixel formats with same pixel size - eg., srgb vs. RGB, or single 32-bit component vs. RGBA8888
105 Resource Update Model Can share texture storage with other textures Interpret as different pixel formats with same pixel size - eg., srgb vs. RGB, or single 32-bit component vs. RGBA8888 Can share texture storage with other buffers Assumes row-linear pixel data
106 Command Encoder Types
107 Command Encoder Types Render command encoder Graphics rendering
108 Command Encoder Types Render command encoder Graphics rendering Compute command encoder Data parallel computations
109 Command Encoder Types Render command encoder Graphics rendering Compute command encoder Data parallel computations Blit command encoder GPU-accelerated resource copy operations
110 Render Command Encoder
111 Render Command Encoder Generates hardware commands for single rendering pass All rendering to single framebuffer object Specifies states for vertex and fragment stages of 3D pipeline Interleaves resources, state changes, and draw calls
112 Render Command Encoder Generates hardware commands for single rendering pass All rendering to single framebuffer object Specifies states for vertex and fragment stages of 3D pipeline Interleaves resources, state changes, and draw calls No draw time compilation App controls when all significant compilation and state validation occurs
113 Render State Objects State Object Description DepthStencil DepthStencil comparison functions and write masks Sampler Filter states, addressing modes, LOD state Render Pipeline Everything else Vertex and pixel shader functions Vertex data layout Multisample state Blend state Color write masks
114 Render States States affecting compilation can t be changed after object creation Inexpensive states can be changed Changeable Render States Vertex and fragment shaders # of render targets, pixel format, color write masks Multisample state Blend state DepthStencil state and write masks Specification of buffers, textures, samplers Cull mode and facing orientation Depth clipping and depth bias Polygon mode Viewport and scissor Occlusion queries
115 Framebuffer Loads and Stores Framebuffer configuration designed for optimal behavior on A7 GPU Tile-based deferred-mode renderer
116 Framebuffer Loads and Stores Framebuffer configuration designed for optimal behavior on A7 GPU Tile-based deferred-mode renderer Explicit control of framebuffer tile cache Load and Store operations Load at start of render pass Store at end of render pass
117 Framebuffer Loads and Stores Framebuffer configuration designed for optimal behavior on A7 GPU Tile-based deferred-mode renderer Explicit control of framebuffer tile cache Load and Store operations Load at start of render pass Store at end of render pass App choses per render target Load Don t care, load, clear Store Don t care, store, multisample resolve
118 Before Metal Framebuffer loads and stores One Frame Color Framebuffer Read Render Pass #1 Write Color Framebuffer Read Render Pass #2 Write Color Framebuffer
119 Before Metal Framebuffer loads and stores One Frame Color Framebuffer Read Write Color Framebuffer Read Write Color Framebuffer Render Pass #1 Render Pass #2
120 Before Metal Framebuffer loads and stores One Frame Color Framebuffer Read Write Color Framebuffer Read Write Color Framebuffer Render Pass #1 Render Pass #2 Depth Framebuffer Read Write Read Write Depth Framebuffer Depth Framebuffer
121 Before Metal Framebuffer loads and stores One Frame Color Framebuffer Read Write Color Framebuffer Read Write Color Framebuffer Render Pass #1 Render Pass #2 Depth Framebuffer Read Write Read Write Depth Framebuffer Depth Framebuffer Framebuffer Memory Bandwidth Color: 2 reads + 2 writes Depth: 2 reads + 2 writes
122 Metal Framebuffer loads and stores One Frame Color Framebuffer Read Write Color Framebuffer Read Write Color Framebuffer Render Pass #1 Render Pass #2 Depth Framebuffer Read Write Read Write Depth Framebuffer Depth Framebuffer
123 Metal Framebuffer loads and stores One Frame Color Framebuffer Don t Care Store Write Color Framebuffer Read Write Color Framebuffer Render Pass #1 Render Pass #2 Depth Framebuffer Read Write Read Write Depth Framebuffer Depth Framebuffer
124 Metal Framebuffer loads and stores One Frame Color Framebuffer Don t Care Store Write Color Read Load Store Framebuffer Write Color Framebuffer Render Pass #1 Render Pass #2 Depth Framebuffer Read Write Read Write Depth Framebuffer Depth Framebuffer
125 Metal Framebuffer loads and stores One Frame Color Framebuffer Don t Care Store Write Color Read Load Store Framebuffer Write Color Framebuffer Render Pass #1 Render Pass #2 Depth Framebuffer Clear Don t Care Depth Framebuffer Read Write Depth Framebuffer
126 Metal Framebuffer loads and stores One Frame Color Framebuffer Don t Care Store Write Color Read Load Store Framebuffer Write Color Framebuffer Render Pass #1 Render Pass #2 Depth Framebuffer Clear Don t Care Depth Framebuffer Clear Don t Care Depth Framebuffer
127 Dramatic Reduction in Memory Traffic One Frame Color Framebuffer Don t Care Store Write Read Load Store Color Framebuffer Write Color Framebuffer Render Pass #1 Render Pass #2 Depth Framebuffer Clear Don t Care Depth Framebuffer Clear Don t Care Depth Framebuffer Framebuffer Memory Bandwidth Color: 1 reads + 2 writes Depth: 0 reads + 0 writes
128 Compute Command Encoder Familiar compute run-time and memory model Textures and data buffers Local and global memory Local atomics Barriers Memory loads and stores User-specified workgroup dimensions
129 Compute Command Encoder
130 Compute Command Encoder Fully integrated with graphics Unified API, shading language and developer tools Efficiently interleaves Compute commands with Render and Blit commands
131 Compute Command Encoder Fully integrated with graphics Unified API, shading language and developer tools Efficiently interleaves Compute commands with Render and Blit commands No execution-time compilation App controls when all significant compilation and validation occurs
132 Compute Command Encoder State Object Description Compute State Compute functions, workgroup configuration Sampler Filter states, addressing modes, LOD state
133 Blit Command Encoder Enables asynchronous copies In parallel with graphics and compute operations
134 Blit Command Encoder Enables asynchronous copies In parallel with graphics and compute operations Texture uploads Copy to/from other texture or data buffer Accelerated mipmap generation
135 Blit Command Encoder Enables asynchronous copies In parallel with graphics and compute operations Texture uploads Copy to/from other texture or data buffer Accelerated mipmap generation Data Buffer updates Copy to/from another data buffer or texture Fill with constant values
136 Agenda Background API concepts Shading language Developer tools
137 Shading Language
138 Shading Language Unified shading language for graphics and compute processing
139 Shading Language Unified shading language for graphics and compute processing Based on C++11 Static subset Built from LLVM and clang
140 Shading Language Unified shading language for graphics and compute processing Based on C++11 Static subset Built from LLVM and clang Additions GPU hardware features (texture sampling, rasterization, compute operations, etc.) Function overloading and templates
141 Shading Language
142 Shading Language Data types for graphics and compute features Scalar, vector and matrix types Samplers and textures
143 Shading Language Data types for graphics and compute features Scalar, vector and matrix types Samplers and textures Attributes Function arguments Sampling and interpolation qualifiers Per-instance inputs, outputs, and built-in graphics variables Programmable blending
144 Shading Language
145 Shading Language Multiple shaders per source file
146 Shading Language Multiple shaders per source file Metal shaders built by Xcode compiler into Metal library files Library contains archive of Metal shaders With run-time APIs - Load a Metal library - Finalize compilation to GPU machine code
147 Shading Language Multiple shaders per source file Metal shaders built by Xcode compiler into Metal library files Library contains archive of Metal shaders With run-time APIs - Load a Metal library - Finalize compilation to GPU machine code Metal includes standard library for graphics and compute functions
148 Argument Tables Textures, buffers, and samplers passed as arguments to functions Vertex, fragment, compute shaders
149 Argument Tables Textures, buffers, and samplers passed as arguments to functions Vertex, fragment, compute shaders Each command encoder includes set of argument tables One table per type (texture, buffer, sampler)
150 Argument Tables Textures, buffers, and samplers passed as arguments to functions Vertex, fragment, compute shaders Each command encoder includes set of argument tables One table per type (texture, buffer, sampler) Metal API and shading language use table index to reference arguments
151 Argument Tables Texture Index ID 0 Texture A 1 Texture B.. Texture.. n Texture Z Command Encoder
152 Argument Tables Texture Index ID 0 Texture A 1 Texture B.. Texture.. n Texture Z Buffer Index ID 0 Buffer A 1 Buffer B.. Buffer.. n Buffer Z Sampler Index ID 0 Sampler A 1 Sampler B.. Sampler.. n Sampler Z Command Encoder
153 Argument Tables vertex VertexOutput smoothtrianglevertex(constant float4 *pos_data [[ buffer(0) ]], constant float2 *uv_data [[ buffer(1) ]], uint vid [[ vertex_id ]]) { VertexOutput out; out.pos = pos_data[vid]; out.uv = uv_data[vid]; return out; }! fragment float4 smoothtrianglefragment(vertexoutput in [[ stage_in ]], texture2d<float> tex [[ texture(1) ]]) { return tex.sample(s, in.uv); } Shader Code Texture Index ID 0 Texture A 1 Texture B.. Texture.. n Texture Z Buffer Index ID 0 Buffer A 1 Buffer B.. Buffer.. n Buffer Z Sampler Index ID 0 Sampler A 1 Sampler B.. Sampler.. n Sampler Z Command Encoder
154 Argument Tables vertex VertexOutput smoothtrianglevertex(constant float4 *pos_data [[ buffer(0) ]], constant float2 *uv_data [[ buffer(1) ]], uint vid [[ vertex_id ]]) { VertexOutput out; out.pos = pos_data[vid]; out.uv = uv_data[vid]; return out; }! fragment float4 smoothtrianglefragment(vertexoutput in [[ stage_in ]], texture2d<float> tex [[ texture(1) ]]) { return tex.sample(s, in.uv); } Shader Code Texture Index ID 0 Texture A 1 Texture B.. Texture.. n Texture Z Buffer Index ID 0 Buffer A 1 Buffer B.. Buffer.. n Buffer Z Sampler Index ID 0 Sampler A 1 Sampler B.. Sampler.. n Sampler Z Command Encoder
155 Argument Tables vertex VertexOutput smoothtrianglevertex(constant float4 *pos_data [[ buffer(0) ]], constant float2 *uv_data [[ buffer(1) ]], uint vid [[ vertex_id ]]) { VertexOutput out; out.pos = pos_data[vid]; out.uv = uv_data[vid]; return out; }! fragment float4 smoothtrianglefragment(vertexoutput in [[ stage_in ]], texture2d<float> tex [[ texture(1) ]]) { return tex.sample(s, in.uv); } Shader Code Texture Index ID 0 Texture A 1 Texture B.. Texture.. n Texture Z Buffer Index ID 0 Buffer A 1 Buffer B.. Buffer.. n Buffer Z Sampler Index ID 0 Sampler A 1 Sampler B.. Sampler.. n Sampler Z Command Encoder
156 Argument Tables vertex VertexOutput smoothtrianglevertex(constant float4 *pos_data [[ buffer(0) ]], constant float2 *uv_data [[ buffer(1) ]], uint vid [[ vertex_id ]]) { VertexOutput out; out.pos = pos_data[vid]; out.uv = uv_data[vid]; return out; }! fragment float4 smoothtrianglefragment(vertexoutput in [[ stage_in ]], texture2d<float> tex [[ texture(1) ]]) { return tex.sample(s, in.uv); } Shader Code Texture Index ID 0 Texture A 1 Texture B.. Texture.. n Texture Z Buffer Index ID 0 Buffer A 1 Buffer B.. Buffer.. n Buffer Z Sampler Index ID 0 Sampler A 1 Sampler B.. Sampler.. n Sampler Z Command Encoder
157 Agenda Background API concepts Shading language Developer tools
158 Metal Shader Compiler Process
159 Metal Shader Compiler Process Metal shader sources compiled to libraries at application build time No need to ship source code with application Shading language errors reported at build time
160 Metal Shader Compiler Process Metal shader sources compiled to libraries at application build time No need to ship source code with application Shading language errors reported at build time Metal libraries compiled to device code at state object creation time No draw time compilation Device code cached after compilation
161 Metal Shader Compiler Process Metal shader sources compiled to libraries at application build time No need to ship source code with application Shading language errors reported at build time Metal libraries compiled to device code at state object creation time No draw time compilation Device code cached after compilation There is also a run-time shader compiler No draw time compilation For best performance, use offline compiler
162 Metal Shader Compiler Process In Xcode Metal Shader Compiler Your Application
163 Metal Shader Compiler Process In Xcode Metal Metal Metal Metal Metal Shader Metal Shader Compiler Your Application
164 Metal Shader Compiler Process In Xcode Metal Metal Metal Metal Metal Shader Metal Shader Compiler Metal Library Your Application
165 Metal Shader Compiler Process In Xcode Metal Metal Metal Metal Metal Shader Metal Shader Compiler Your Application Metal Library
166 Metal Shader Compiler Process In Xcode At application run-time Metal Metal Metal Metal Metal Shader Your Application Metal Library Metal Shader Compiler
167 Metal Shader Compiler Process In Xcode At application run-time Metal Metal Metal Metal Metal Shader Your Application Metal Library Metal Shader Compiler State Vertex Shader Fragment Shader Pipeline Object Creation
168 Metal Shader Compiler Process In Xcode At application run-time Metal Metal Metal Metal Metal Shader Your Application Metal Library Metal Shader Compiler State Vertex Shader Fragment Shader Pipeline Object Creation
169 Metal Shader Compiler Process In Xcode At application run-time Metal Metal Metal Metal Metal Shader Your Application Metal Library Metal Shader Compiler State Vertex Shader Fragment Shader Pipeline Object Creation Shader Cache
170 Metal Shader Compiler Process In Xcode At application run-time Metal Metal Metal Metal Metal Shader Your Application Metal Library Metal Shader Compiler State Vertex Shader Fragment Shader Pipeline Object Creation Shader Cache Metal Device Compiler
171 Metal Shader Compiler Process In Xcode At application run-time Metal Metal Metal Metal Metal Shader Your Application Metal Library Metal Shader Compiler State Vertex Shader Fragment Shader Pipeline Object Creation Shader Cache Metal Device Compiler GPU
172 Metal Tools Fully Integrated into Xcode Visual frame debugger API trace and navigation Shader edit-and-continue Rich source code editing (including shaders) Graphics and compute state inspection Shader compiler Debug mode for Metal framework
173
174 Frame Navigator
175 Framebuffer View
176 Resource View
177 State Inspector
178
179 Performance Report
180
181 Shader Profiler
182 Shader Profiler
183 Shader Profiler
184 Shader Profiler
185 Shader Profiler
186
187 Demo The Collectables on Metal Sean Tracey Crytek
188 Summary
189 Summary Low overhead, high performance GPU programming API
190 Summary Low overhead, high performance GPU programming API Up to 10x more draw calls
191 Summary Low overhead, high performance GPU programming API Up to 10x more draw calls Designed for A7 and ios
192 Summary Low overhead, high performance GPU programming API Up to 10x more draw calls Designed for A7 and ios Streamlined for modern GPU features
193 Summary Low overhead, high performance GPU programming API Up to 10x more draw calls Designed for A7 and ios Streamlined for modern GPU features Fine-grained control
194 Summary Low overhead, high performance GPU programming API Up to 10x more draw calls Designed for A7 and ios Streamlined for modern GPU features Fine-grained control Precompiled shaders
195 Summary Low overhead, high performance GPU programming API Up to 10x more draw calls Designed for A7 and ios Streamlined for modern GPU features Fine-grained control Precompiled shaders Fantastic developer tools
196 Summary Low overhead, high performance GPU programming API Up to 10x more draw calls Designed for A7 and ios Streamlined for modern GPU features Fine-grained control Precompiled shaders Fantastic developer tools Enables entirely new class of games
197 More Information Filip Iliescu Graphics and Games Technologies Evangelist Allan Schaffer Graphics and Games Technologies Evangelist Documentation Apple Developer Forums
198 Related Sessions Working with Metal Fundamentals Pacific Heights Wednesday 10:15AM Working with Metal Advanced Pacific Heights Wednesday 11:30AM
199 Labs Metal Lab Graphics and Games Lab A Wednesday 2:00PM Metal Lab Graphics and Games Lab B Thursday 10:15AM
200
Metal for OpenGL Developers
#WWDC18 Metal for OpenGL Developers Dan Omachi, Metal Ecosystem Engineer Sukanya Sudugu, GPU Software Engineer 2018 Apple Inc. All rights reserved. Redistribution or public display not permitted without
More informationWorking With Metal Advanced
Graphics and Games #WWDC14 Working With Metal Advanced Session 605 Gokhan Avkarogullari GPU Software Aaftab Munshi GPU Software Serhat Tekin GPU Software 2014 Apple Inc. All rights reserved. Redistribution
More informationIntroducing Metal 2. Graphics and Games #WWDC17. Michal Valient, GPU Software Engineer Richard Schreyer, GPU Software Engineer
Session Graphics and Games #WWDC17 Introducing Metal 2 601 Michal Valient, GPU Software Engineer Richard Schreyer, GPU Software Engineer 2017 Apple Inc. All rights reserved. Redistribution or public display
More informationMetal. GPU-accelerated advanced 3D graphics rendering and data-parallel computation. source rebelsmarket.com
Metal GPU-accelerated advanced 3D graphics rendering and data-parallel computation source rebelsmarket.com Maths The heart and foundation of computer graphics source wallpoper.com Metalmatics There are
More informationReal-Time Rendering (Echtzeitgraphik) Michael Wimmer
Real-Time Rendering (Echtzeitgraphik) Michael Wimmer wimmer@cg.tuwien.ac.at Walking down the graphics pipeline Application Geometry Rasterizer What for? Understanding the rendering pipeline is the key
More informationBuilding a Game with SceneKit
Graphics and Games #WWDC14 Building a Game with SceneKit Session 610 Amaury Balliet Software Engineer 2014 Apple Inc. All rights reserved. Redistribution or public display not permitted without written
More informationMore frames per second. Alex Kan and Jean-François Roy GPU Software
More frames per second Alex Kan and Jean-François Roy GPU Software 2 OpenGL ES Analyzer Tuning the graphics pipeline Analyzer demo 3 Developer preview Jean-François Roy GPU Software Developer Technologies
More informationThe Application Stage. The Game Loop, Resource Management and Renderer Design
1 The Application Stage The Game Loop, Resource Management and Renderer Design Application Stage Responsibilities 2 Set up the rendering pipeline Resource Management 3D meshes Textures etc. Prepare data
More informationReal - Time Rendering. Pipeline optimization. Michal Červeňanský Juraj Starinský
Real - Time Rendering Pipeline optimization Michal Červeňanský Juraj Starinský Motivation Resolution 1600x1200, at 60 fps Hw power not enough Acceleration is still necessary 3.3.2010 2 Overview Application
More informationPowerVR Hardware. Architecture Overview for Developers
Public Imagination Technologies PowerVR Hardware Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty of any kind.
More informationPowerVR Series5. Architecture Guide for Developers
Public Imagination Technologies PowerVR Series5 Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty of any kind.
More informationEECS 487: Interactive Computer Graphics
EECS 487: Interactive Computer Graphics Lecture 21: Overview of Low-level Graphics API Metal, Direct3D 12, Vulkan Console Games Why do games look and perform so much better on consoles than on PCs with
More informationBringing AAA graphics to mobile platforms. Niklas Smedberg Senior Engine Programmer, Epic Games
Bringing AAA graphics to mobile platforms Niklas Smedberg Senior Engine Programmer, Epic Games Who Am I A.k.a. Smedis Platform team at Epic Games Unreal Engine 15 years in the industry 30 years of programming
More informationProfiling and Debugging Games on Mobile Platforms
Profiling and Debugging Games on Mobile Platforms Lorenzo Dal Col Senior Software Engineer, Graphics Tools Gamelab 2013, Barcelona 26 th June 2013 Agenda Introduction to Performance Analysis with ARM DS-5
More informationNVIDIA Parallel Nsight. Jeff Kiel
NVIDIA Parallel Nsight Jeff Kiel Agenda: NVIDIA Parallel Nsight Programmable GPU Development Presenting Parallel Nsight Demo Questions/Feedback Programmable GPU Development More programmability = more
More informationMetal Feature Set Tables
Metal Feature Set Tables apple Developer Feature Availability This table lists the availability of major Metal features. OS ios 8 ios 8 ios 9 ios 9 ios 9 ios 10 ios 10 ios 10 ios 11 ios 11 ios 11 ios 11
More informationMetal for Ray Tracing Acceleration
Session #WWDC18 Metal for Ray Tracing Acceleration 606 Sean James, GPU Software Engineer Wayne Lister, GPU Software Engineer 2018 Apple Inc. All rights reserved. Redistribution or public display not permitted
More informationVulkan: Architecture positive How Vulkan maps to PowerVR GPUs Kevin sun Lead Developer Support Engineer, APAC PowerVR Graphics.
Vulkan: Architecture positive How Vulkan maps to PowerVR GPUs Kevin sun Lead Developer Support Engineer, APAC PowerVR Graphics www.imgtec.com Introduction Who am I? Kevin Sun Working at Imagination Technologies
More informationBifrost - The GPU architecture for next five billion
Bifrost - The GPU architecture for next five billion Hessed Choi Senior FAE / ARM ARM Tech Forum June 28 th, 2016 Vulkan 2 ARM 2016 What is Vulkan? A 3D graphics API for the next twenty years Logical successor
More informationOptimizing and Profiling Unity Games for Mobile Platforms. Angelo Theodorou Senior Software Engineer, MPG Gamelab 2014, 25 th -27 th June
Optimizing and Profiling Unity Games for Mobile Platforms Angelo Theodorou Senior Software Engineer, MPG Gamelab 2014, 25 th -27 th June 1 Agenda Introduction ARM and the presenter Preliminary knowledge
More informationWhat s New in Metal, Part 2
Graphics and Games #WWDC15 What s New in Metal, Part 2 Session 607 Dan Omachi GPU Software Frameworks Engineer Anna Tikhonova GPU Software Frameworks Engineer 2015 Apple Inc. All rights reserved. Redistribution
More informationBuilding Visually Rich User Experiences
Session App Frameworks #WWDC17 Building Visually Rich User Experiences 235 Noah Witherspoon, Software Engineer Warren Moore, Software Engineer 2017 Apple Inc. All rights reserved. Redistribution or public
More informationGeForce4. John Montrym Henry Moreton
GeForce4 John Montrym Henry Moreton 1 Architectural Drivers Programmability Parallelism Memory bandwidth 2 Recent History: GeForce 1&2 First integrated geometry engine & 4 pixels/clk Fixed-function transform,
More informationCopyright Khronos Group, Page Graphic Remedy. All Rights Reserved
Avi Shapira Graphic Remedy Copyright Khronos Group, 2009 - Page 1 2004 2009 Graphic Remedy. All Rights Reserved Debugging and profiling 3D applications are both hard and time consuming tasks Companies
More informationNext-Generation Graphics on Larrabee. Tim Foley Intel Corp
Next-Generation Graphics on Larrabee Tim Foley Intel Corp Motivation The killer app for GPGPU is graphics We ve seen Abstract models for parallel programming How those models map efficiently to Larrabee
More informationWindowing System on a 3D Pipeline. February 2005
Windowing System on a 3D Pipeline February 2005 Agenda 1.Overview of the 3D pipeline 2.NVIDIA software overview 3.Strengths and challenges with using the 3D pipeline GeForce 6800 220M Transistors April
More informationBest practices for effective OpenGL programming. Dan Omachi OpenGL Development Engineer
Best practices for effective OpenGL programming Dan Omachi OpenGL Development Engineer 2 What Is OpenGL? 3 OpenGL is a software interface to graphics hardware - OpenGL Specification 4 GPU accelerates rendering
More informationMali Developer Resources. Kevin Ho ARM Taiwan FAE
Mali Developer Resources Kevin Ho ARM Taiwan FAE ARM Mali Developer Tools Software Development SDKs for OpenGL ES & OpenCL OpenGL ES Emulators Shader Development Studio Shader Library Asset Creation Texture
More informationThe Bifrost GPU architecture and the ARM Mali-G71 GPU
The Bifrost GPU architecture and the ARM Mali-G71 GPU Jem Davies ARM Fellow and VP of Technology Hot Chips 28 Aug 2016 Introduction to ARM Soft IP ARM licenses Soft IP cores (amongst other things) to our
More informationOpenGL Status - November 2013 G-Truc Creation
OpenGL Status - November 2013 G-Truc Creation Vendor NVIDIA AMD Intel Windows Apple Release date 02/10/2013 08/11/2013 30/08/2013 22/10/2013 Drivers version 331.10 beta 13.11 beta 9.2 10.18.10.3325 MacOS
More informationGPU Memory Model Overview
GPU Memory Model Overview John Owens University of California, Davis Department of Electrical and Computer Engineering Institute for Data Analysis and Visualization SciDAC Institute for Ultrascale Visualization
More informationGraphics Performance Optimisation. John Spitzer Director of European Developer Technology
Graphics Performance Optimisation John Spitzer Director of European Developer Technology Overview Understand the stages of the graphics pipeline Cherchez la bottleneck Once found, either eliminate or balance
More informationParallel Programming on Larrabee. Tim Foley Intel Corp
Parallel Programming on Larrabee Tim Foley Intel Corp Motivation This morning we talked about abstractions A mental model for GPU architectures Parallel programming models Particular tools and APIs This
More informationReal - Time Rendering. Graphics pipeline. Michal Červeňanský Juraj Starinský
Real - Time Rendering Graphics pipeline Michal Červeňanský Juraj Starinský Overview History of Graphics HW Rendering pipeline Shaders Debugging 2 History of Graphics HW First generation Second generation
More informationVulkan (including Vulkan Fast Paths)
Vulkan (including Vulkan Fast Paths) Łukasz Migas Software Development Engineer WS Graphics Let s talk about OpenGL (a bit) History 1.0-1992 1.3-2001 multitexturing 1.5-2003 vertex buffer object 2.0-2004
More informationLecture 9: Deferred Shading. Visual Computing Systems CMU , Fall 2013
Lecture 9: Deferred Shading Visual Computing Systems The course so far The real-time graphics pipeline abstraction Principle graphics abstractions Algorithms and modern high performance implementations
More informationRendering Objects. Need to transform all geometry then
Intro to OpenGL Rendering Objects Object has internal geometry (Model) Object relative to other objects (World) Object relative to camera (View) Object relative to screen (Projection) Need to transform
More informationGraphics Processing Unit Architecture (GPU Arch)
Graphics Processing Unit Architecture (GPU Arch) With a focus on NVIDIA GeForce 6800 GPU 1 What is a GPU From Wikipedia : A specialized processor efficient at manipulating and displaying computer graphics
More informationARM. Mali GPU. OpenGL ES Application Optimization Guide. Version: 3.0. Copyright 2011, 2013 ARM. All rights reserved. ARM DUI 0555C (ID102813)
ARM Mali GPU Version: 3.0 OpenGL ES Application Optimization Guide Copyright 2011, 2013 ARM. All rights reserved. ARM DUI 0555C () ARM Mali GPU OpenGL ES Application Optimization Guide Copyright 2011,
More informationPROFESSIONAL. WebGL Programming DEVELOPING 3D GRAPHICS FOR THE WEB. Andreas Anyuru WILEY. John Wiley & Sons, Ltd.
PROFESSIONAL WebGL Programming DEVELOPING 3D GRAPHICS FOR THE WEB Andreas Anyuru WILEY John Wiley & Sons, Ltd. INTRODUCTION xxl CHAPTER 1: INTRODUCING WEBGL 1 The Basics of WebGL 1 So Why Is WebGL So Great?
More informationARM. Mali GPU. OpenGL ES Application Optimization Guide. Version: 2.0. Copyright 2011, 2013 ARM. All rights reserved. ARM DUI 0555B (ID051413)
ARM Mali GPU Version: 2.0 OpenGL ES Application Optimization Guide Copyright 2011, 2013 ARM. All rights reserved. ARM DUI 0555B () ARM Mali GPU OpenGL ES Application Optimization Guide Copyright 2011,
More informationOpenGL SUPERBIBLE. Fifth Edition. Comprehensive Tutorial and Reference. Richard S. Wright, Jr. Nicholas Haemel Graham Sellers Benjamin Lipchak
OpenGL SUPERBIBLE Fifth Edition Comprehensive Tutorial and Reference Richard S. Wright, Jr. Nicholas Haemel Graham Sellers Benjamin Lipchak AAddison-Wesley Upper Saddle River, NJ Boston Indianapolis San
More informationPowerVR Performance Recommendations. The Golden Rules
PowerVR Performance Recommendations Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty of any kind. Redistribution
More informationGet the most out of the new OpenGL ES 3.1 API. Hans-Kristian Arntzen Software Engineer
Get the most out of the new OpenGL ES 3.1 API Hans-Kristian Arntzen Software Engineer 1 Content Compute shaders introduction Shader storage buffer objects Shader image load/store Shared memory Atomics
More informationAchieving Console Quality Games on Mobile
Achieving Console Quality Games on Mobile Peter Harris, Senior Principal Engineer, ARM Unai Landa, CTO, Digital Legends Jon Kirkham, Staff Engineer, ARM GDC 2017 Agenda Premium smartphone in 2017 ARM Cortex
More informationEnhancing Traditional Rasterization Graphics with Ray Tracing. October 2015
Enhancing Traditional Rasterization Graphics with Ray Tracing October 2015 James Rumble Developer Technology Engineer, PowerVR Graphics Overview Ray Tracing Fundamentals PowerVR Ray Tracing Pipeline Using
More informationGrafica Computazionale: Lezione 30. Grafica Computazionale. Hiding complexity... ;) Introduction to OpenGL. lezione30 Introduction to OpenGL
Grafica Computazionale: Lezione 30 Grafica Computazionale lezione30 Introduction to OpenGL Informatica e Automazione, "Roma Tre" May 20, 2010 OpenGL Shading Language Introduction to OpenGL OpenGL (Open
More informationModule 13C: Using The 3D Graphics APIs OpenGL ES
Module 13C: Using The 3D Graphics APIs OpenGL ES BREW TM Developer Training Module Objectives See the steps involved in 3D rendering View the 3D graphics capabilities 2 1 3D Overview The 3D graphics library
More informationRSX Best Practices. Mark Cerny, Cerny Games David Simpson, Naughty Dog Jon Olick, Naughty Dog
RSX Best Practices Mark Cerny, Cerny Games David Simpson, Naughty Dog Jon Olick, Naughty Dog RSX Best Practices About libgcm Using the SPUs with the RSX Brief overview of GCM Replay December 7 th, 2004
More informationWhat s New in SpriteKit
Graphics and Games #WWDC14 What s New in SpriteKit Session 606 Norman Wang Game Technologies 2014 Apple Inc. All rights reserved. Redistribution or public display not permitted without written permission
More informationPortland State University ECE 588/688. Graphics Processors
Portland State University ECE 588/688 Graphics Processors Copyright by Alaa Alameldeen 2018 Why Graphics Processors? Graphics programs have different characteristics from general purpose programs Highly
More informationPOWERVR MBX. Technology Overview
POWERVR MBX Technology Overview Copyright 2009, Imagination Technologies Ltd. All Rights Reserved. This publication contains proprietary information which is subject to change without notice and is supplied
More informationLecture 6: Texture. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)
Lecture 6: Texture Kayvon Fatahalian CMU 15-869: Graphics and Imaging Architectures (Fall 2011) Today: texturing! Texture filtering - Texture access is not just a 2D array lookup ;-) Memory-system implications
More informationGraphics Architectures and OpenCL. Michael Doggett Department of Computer Science Lund university
Graphics Architectures and OpenCL Michael Doggett Department of Computer Science Lund university Overview Parallelism Radeon 5870 Tiled Graphics Architectures Important when Memory and Bandwidth limited
More informationSpring 2011 Prof. Hyesoon Kim
Spring 2011 Prof. Hyesoon Kim Application Geometry Rasterizer CPU Each stage cane be also pipelined The slowest of the pipeline stage determines the rendering speed. Frames per second (fps) Executes on
More informationProgramming Guide. Aaftab Munshi Dan Ginsburg Dave Shreiner. TT r^addison-wesley
OpenGUES 2.0 Programming Guide Aaftab Munshi Dan Ginsburg Dave Shreiner TT r^addison-wesley Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich Paris Madrid
More informationCopyright Khronos Group Page 1
Gaming Market Briefing Overview of APIs GDC March 2016 Neil Trevett Khronos President NVIDIA Vice President Developer Ecosystem ntrevett@nvidia.com @neilt3d Copyright Khronos Group 2016 - Page 1 Copyright
More informationDave Shreiner, ARM March 2009
4 th Annual Dave Shreiner, ARM March 2009 Copyright Khronos Group, 2009 - Page 1 Motivation - What s OpenGL ES, and what can it do for me? Overview - Lingo decoder - Overview of the OpenGL ES Pipeline
More informationProgramming Graphics Hardware
Tutorial 5 Programming Graphics Hardware Randy Fernando, Mark Harris, Matthias Wloka, Cyril Zeller Overview of the Tutorial: Morning 8:30 9:30 10:15 10:45 Introduction to the Hardware Graphics Pipeline
More informationWhat s New in SpriteKit
Graphics and Games #WWDC16 What s New in SpriteKit Session 610 Ross Dexter Games Technologies Engineer Clément Boissière Games Technologies Engineer 2016 Apple Inc. All rights reserved. Redistribution
More informationCopyright Khronos Group Page 1. Vulkan Overview. June 2015
Copyright Khronos Group 2015 - Page 1 Vulkan Overview June 2015 Copyright Khronos Group 2015 - Page 2 Khronos Connects Software to Silicon Open Consortium creating OPEN STANDARD APIs for hardware acceleration
More informationAchieving High-performance Graphics on Mobile With the Vulkan API
Achieving High-performance Graphics on Mobile With the Vulkan API Marius Bjørge Graphics Research Engineer GDC 2016 Agenda Overview Command Buffers Synchronization Memory Shaders and Pipelines Descriptor
More informationOptimisation. CS7GV3 Real-time Rendering
Optimisation CS7GV3 Real-time Rendering Introduction Talk about lower-level optimization Higher-level optimization is better algorithms Example: not using a spatial data structure vs. using one After that
More informationGraphics Hardware, Graphics APIs, and Computation on GPUs. Mark Segal
Graphics Hardware, Graphics APIs, and Computation on GPUs Mark Segal Overview Graphics Pipeline Graphics Hardware Graphics APIs ATI s low-level interface for computation on GPUs 2 Graphics Hardware High
More informationGPU Memory Model. Adapted from:
GPU Memory Model Adapted from: Aaron Lefohn University of California, Davis With updates from slides by Suresh Venkatasubramanian, University of Pennsylvania Updates performed by Gary J. Katz, University
More informationEvolution of GPUs Chris Seitz
Evolution of GPUs Chris Seitz Overview Concepts: Real-time rendering Hardware graphics pipeline Evolution of the PC hardware graphics pipeline: 1995-1998: Texture mapping and z-buffer 1998: Multitexturing
More informationBeyond Programmable Shading. Scheduling the Graphics Pipeline
Beyond Programmable Shading Scheduling the Graphics Pipeline Jonathan Ragan-Kelley, MIT CSAIL 9 August 2011 Mike s just showed how shaders can use large, coherent batches of work to achieve high throughput.
More informationRaise your VR game with NVIDIA GeForce Tools
Raise your VR game with NVIDIA GeForce Tools Yan An Graphics Tools QA Manager 1 Introduction & tour of Nsight Analyze a geometry corruption bug VR debugging AGENDA System Analysis Tracing GPU Range Profiling
More informationSceneKit: What s New Session 604
Graphics and Games #WWDC17 SceneKit: What s New Session 604 Thomas Goossens, SceneKit engineer Amaury Balliet, SceneKit engineer Anatole Duprat, SceneKit engineer Sébastien Métrot, SceneKit engineer 2017
More informationBuilding scalable 3D applications. Ville Miettinen Hybrid Graphics
Building scalable 3D applications Ville Miettinen Hybrid Graphics What s going to happen... (1/2) Mass market: 3D apps will become a huge success on low-end and mid-tier cell phones Retro-gaming New game
More informationMali-400 MP: A Scalable GPU for Mobile Devices Tom Olson
Mali-400 MP: A Scalable GPU for Mobile Devices Tom Olson Director, Graphics Research, ARM Outline ARM and Mobile Graphics Design Constraints for Mobile GPUs Mali Architecture Overview Multicore Scaling
More informationOptimizing Games for ATI s IMAGEON Aaftab Munshi. 3D Architect ATI Research
Optimizing Games for ATI s IMAGEON 2300 Aaftab Munshi 3D Architect ATI Research A A 3D hardware solution enables publishers to extend brands to mobile devices while remaining close to original vision of
More informationVulkan on Mobile. Daniele Di Donato, ARM GDC 2016
Vulkan on Mobile Daniele Di Donato, ARM GDC 2016 Outline Vulkan main features Mapping Vulkan Key features to ARM CPUs Mapping Vulkan Key features to ARM Mali GPUs 4 Vulkan Good match for mobile and tiling
More informationProject 1, 467. (Note: This is not a graphics class. It is ok if your rendering has some flaws, like those gaps in the teapot image above ;-)
Project 1, 467 Purpose: The purpose of this project is to learn everything you need to know for the next 9 weeks about graphics hardware. What: Write a 3D graphics hardware simulator in your language of
More informationCSE 167: Introduction to Computer Graphics Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015
CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015 Announcements Project 2 due tomorrow at 2pm Grading window
More informationLPGPU Workshop on Power-Efficient GPU and Many-core Computing (PEGPUM 2014)
A practitioner s view of challenges faced with power and performance on mobile GPU Prashant Sharma Samsung R&D Institute UK LPGPU Workshop on Power-Efficient GPU and Many-core Computing (PEGPUM 2014) SERI
More informationIntroduction to the Direct3D 11 Graphics Pipeline
Introduction to the Direct3D 11 Graphics Pipeline Kevin Gee - XNA Developer Connection Microsoft Corporation 2008 NVIDIA Corporation. Direct3D 11 focuses on Key Takeaways Increasing scalability, Improving
More informationLecture 25: Board Notes: Threads and GPUs
Lecture 25: Board Notes: Threads and GPUs Announcements: - Reminder: HW 7 due today - Reminder: Submit project idea via (plain text) email by 11/24 Recap: - Slide 4: Lecture 23: Introduction to Parallel
More informationCS8803SC Software and Hardware Cooperative Computing GPGPU. Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology
CS8803SC Software and Hardware Cooperative Computing GPGPU Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology Why GPU? A quiet revolution and potential build-up Calculation: 367
More informationProgramming Tips For Scalable Graphics Performance
Game Developers Conference 2009 Programming Tips For Scalable Graphics Performance March 25, 2009 ROOM 2010 Luis Gimenez Graphics Architect Ganesh Kumar Application Engineer Katen Shah Graphics Architect
More informationDeferred Rendering Due: Wednesday November 15 at 10pm
CMSC 23700 Autumn 2017 Introduction to Computer Graphics Project 4 November 2, 2017 Deferred Rendering Due: Wednesday November 15 at 10pm 1 Summary This assignment uses the same application architecture
More informationCornell University CS 569: Interactive Computer Graphics. Introduction. Lecture 1. [John C. Stone, UIUC] NASA. University of Calgary
Cornell University CS 569: Interactive Computer Graphics Introduction Lecture 1 [John C. Stone, UIUC] 2008 Steve Marschner 1 2008 Steve Marschner 2 NASA University of Calgary 2008 Steve Marschner 3 2008
More informationSpring 2009 Prof. Hyesoon Kim
Spring 2009 Prof. Hyesoon Kim Application Geometry Rasterizer CPU Each stage cane be also pipelined The slowest of the pipeline stage determines the rendering speed. Frames per second (fps) Executes on
More informationStreaming Massive Environments From Zero to 200MPH
FORZA MOTORSPORT From Zero to 200MPH Chris Tector (Software Architect Turn 10 Studios) Turn 10 Internal studio at Microsoft Game Studios - we make Forza Motorsport Around 70 full time staff 2 Why am I
More informationScanline Rendering 2 1/42
Scanline Rendering 2 1/42 Review 1. Set up a Camera the viewing frustum has near and far clipping planes 2. Create some Geometry made out of triangles 3. Place the geometry in the scene using Transforms
More informationCourse Recap + 3D Graphics on Mobile GPUs
Lecture 18: Course Recap + 3D Graphics on Mobile GPUs Interactive Computer Graphics Q. What is a big concern in mobile computing? A. Power Two reasons to save power Run at higher performance for a fixed
More informationWhiz-Bang Graphics and Media Performance for Java Platform, Micro Edition (JavaME)
Whiz-Bang Graphics and Media Performance for Java Platform, Micro Edition (JavaME) Pavel Petroshenko, Sun Microsystems, Inc. Ashmi Bhanushali, NVIDIA Corporation Jerry Evans, Sun Microsystems, Inc. Nandini
More informationGUERRILLA DEVELOP CONFERENCE JULY 07 BRIGHTON
Deferred Rendering in Killzone 2 Michal Valient Senior Programmer, Guerrilla Talk Outline Forward & Deferred Rendering Overview G-Buffer Layout Shader Creation Deferred Rendering in Detail Rendering Passes
More informationOpenGL ES for iphone Games. Erik M. Buck
OpenGL ES for iphone Games Erik M. Buck Topics The components of a game n Technology: Graphics, sound, input, physics (an engine) n Art: The content n Fun: That certain something (a mystery) 2 What is
More informationE.Order of Operations
Appendix E E.Order of Operations This book describes all the performed between initial specification of vertices and final writing of fragments into the framebuffer. The chapters of this book are arranged
More informationEnhancing Traditional Rasterization Graphics with Ray Tracing. March 2015
Enhancing Traditional Rasterization Graphics with Ray Tracing March 2015 Introductions James Rumble Developer Technology Engineer Ray Tracing Support Justin DeCell Software Design Engineer Ray Tracing
More informationRasterization Overview
Rendering Overview The process of generating an image given a virtual camera objects light sources Various techniques rasterization (topic of this course) raytracing (topic of the course Advanced Computer
More informationSoftware Occlusion Culling
Software Occlusion Culling Abstract This article details an algorithm and associated sample code for software occlusion culling which is available for download. The technique divides scene objects into
More informationApplications of Explicit Early-Z Z Culling. Jason Mitchell ATI Research
Applications of Explicit Early-Z Z Culling Jason Mitchell ATI Research Outline Architecture Hardware depth culling Applications Volume Ray Casting Skin Shading Fluid Flow Deferred Shading Early-Z In past
More informationIntroduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono
Introduction to CUDA Algoritmi e Calcolo Parallelo References q This set of slides is mainly based on: " CUDA Technical Training, Dr. Antonino Tumeo, Pacific Northwest National Laboratory " Slide of Applied
More informationMobile Performance Tools and GPU Performance Tuning. Lars M. Bishop, NVIDIA Handheld DevTech Jason Allen, NVIDIA Handheld DevTools
Mobile Performance Tools and GPU Performance Tuning Lars M. Bishop, NVIDIA Handheld DevTech Jason Allen, NVIDIA Handheld DevTools NVIDIA GoForce5500 Overview World-class 3D HW Geometry pipeline 16/32bpp
More informationGetting fancy with texture mapping (Part 2) CS559 Spring Apr 2017
Getting fancy with texture mapping (Part 2) CS559 Spring 2017 6 Apr 2017 Review Skyboxes as backdrops Credits : Flipmode 3D Review Reflection maps Credits : NVidia Review Decal textures Credits : andreucabre.com
More informationSIGGRAPH Briefing August 2014
Copyright Khronos Group 2014 - Page 1 SIGGRAPH Briefing August 2014 Neil Trevett VP Mobile Ecosystem, NVIDIA President, Khronos Copyright Khronos Group 2014 - Page 2 Significant Khronos API Ecosystem Advances
More informationShaders (some slides taken from David M. course)
Shaders (some slides taken from David M. course) Doron Nussbaum Doron Nussbaum COMP 3501 - Shaders 1 Traditional Rendering Pipeline Traditional pipeline (older graphics cards) restricts developer to texture
More informationHardware Accelerated Volume Visualization. Leonid I. Dimitrov & Milos Sramek GMI Austrian Academy of Sciences
Hardware Accelerated Volume Visualization Leonid I. Dimitrov & Milos Sramek GMI Austrian Academy of Sciences A Real-Time VR System Real-Time: 25-30 frames per second 4D visualization: real time input of
More information