Metal for OpenGL Developers

1 #WWDC18 Metal for OpenGL Developers
Dan Omachi, Metal Ecosystem Engineer
Sukanya Sudugu, GPU Software Engineer
© 2018 Apple Inc. All rights reserved. Redistribution or public display not permitted without written permission from Apple.

2 These legacy APIs are deprecated
- Still available in iOS 12, macOS and tvOS 12
- Begin transitioning to Metal

5 Choosing an Approach
- High-level Apple frameworks: SpriteKit, SceneKit, and Core Image
- Third-party engines: Unity, Unreal, Lumberyard, etc. Update to latest version
- Focus today: incrementally porting to Metal

6 Metal Concepts

11 Challenges with OpenGL
- OpenGL designed more than 25 years ago
  - Core architecture reflects the origins of 3D graphics
  - Extensions retrofitted some GPU features
- Fundamental design choices based on past principles
  - GPU pipeline has changed
  - Multithreaded operation not considered
  - Asynchronous processing, not core

18 Design Goals for Metal
- Efficient GPU interaction
  - Low CPU overhead
  - Multithreaded execution
  - Predictable operation
  - Resource and synchronization control
- Approachable to OpenGL developers
- Built for modern and Apple-designed GPUs

21 Key Conceptual Differences
Expensive operations less frequent
- Expensive CPU operations performed less often
- More GPU command generation during object creation
- Less needed when rendering

24 Key Conceptual Differences
Modern GPU pipeline
- Reflects modern GPU architectures
- Closer match yields less costly translation to GPU commands
- State grouped more efficiently
- This way there is much less cost to use objects later on when actually rendering

27 Key Conceptual Differences
Multithreaded execution
- Designed for multithreaded execution
- Clear rules for multithreaded usage
- Cross-thread object usability

30 Key Conceptual Differences
Execution model
- True interaction between software and GPU
- Predictable operation allows efficient designs
- Thinner stack between application and GPU

31 Diagram (OpenGL stack): Application, Renderer, OpenGL API, OpenGL Context

42 Diagram (Metal stack): Application, Renderer, Metal API, Command Queue, Device, GPU. Render Objects: Textures, Buffers, Pipelines

87 Diagram (Metal stack): Application, Renderer, Metal API, Command Encoders, Command Buffers, Command Queue, Device, GPU

90 Command Encoders
- Render Command Encoder
- Blit Command Encoder
- Compute Command Encoder

94 Render Command Encoders
Commands for a render pass
- Encodes a series of render commands
  - Also called a Render Pass
- Set render objects for the graphics pipeline
  - Buffers, textures, shaders
- Issue draw commands
  - Draw primitives, draw indexed primitives, instanced draws

100 Render Command Encoders
Render targets
- Associated with a set of render targets
  - Textures for rendering
- Specify a set of render targets upon creation
  - All draw commands directed to these for lifetime of encoder
- New render targets need a new encoder
- Clear delineation between sets of render targets
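To make the render-target rule concrete, here is a minimal Objective-C sketch. It assumes a command buffer already exists and that the target texture was created with render-target usage. `EncodeClearPass` is a hypothetical helper name that does not come from the talk, and the clear color is arbitrary.

```objc
#import <Metal/Metal.h>

// Hypothetical helper: encodes one render pass that draws into `target`.
// `target` is assumed to have been created with MTLTextureUsageRenderTarget.
static void EncodeClearPass(id<MTLCommandBuffer> commandBuffer, id<MTLTexture> target)
{
    // The render targets are chosen here, before the encoder exists.
    MTLRenderPassDescriptor *pass = [MTLRenderPassDescriptor renderPassDescriptor];
    pass.colorAttachments[0].texture = target;
    pass.colorAttachments[0].loadAction = MTLLoadActionClear;
    pass.colorAttachments[0].clearColor = MTLClearColorMake(0.0, 0.0, 0.0, 1.0);
    pass.colorAttachments[0].storeAction = MTLStoreActionStore;

    // Every command encoded below is directed at `target` until endEncoding.
    id<MTLRenderCommandEncoder> encoder =
        [commandBuffer renderCommandEncoderWithDescriptor:pass];

    // Pipeline state, buffers, textures, and draw calls would be encoded here.

    [encoder endEncoding];
}
```

Rendering into a different texture would mean filling in another render pass descriptor and asking the command buffer for another encoder.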

105 Render Objects
- Textures
- Buffers
- Samplers
- Render pipeline states
- Depth stencil states

108 Render Objects
Object creation
- Created from a device
  - Usable only on that device
- Object state set at creation
  - Descriptor object specifies properties for render object
- State set at creation fixed for the lifetime of the object
  - Image data of textures and values in buffers can change

110 Render Objects
Object creation
- Metal compiles objects into GPU state once
  - Never needs to check for changes and recompile
- Multithreaded usage more efficient
  - Metal does not need to protect state from changes on other threads
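The descriptor pattern can be illustrated with a sampler, one of the render objects in the list on slide 105. This is only a sketch: `MakeLinearSampler` is a hypothetical helper, and the filter and address modes are arbitrary choices rather than values from the talk.

```objc
#import <Metal/Metal.h>

// Hypothetical helper: builds an immutable sampler state from a descriptor.
static id<MTLSamplerState> MakeLinearSampler(id<MTLDevice> device)
{
    // The descriptor is a plain, mutable bag of properties.
    MTLSamplerDescriptor *desc = [MTLSamplerDescriptor new];
    desc.minFilter = MTLSamplerMinMagFilterLinear;
    desc.magFilter = MTLSamplerMinMagFilterLinear;
    desc.sAddressMode = MTLSamplerAddressModeRepeat;
    desc.tAddressMode = MTLSamplerAddressModeRepeat;

    // The device reads the descriptor once and returns a fixed state object.
    id<MTLSamplerState> sampler = [device newSamplerStateWithDescriptor:desc];

    // Editing the descriptor afterwards has no effect on `sampler`;
    // a different configuration requires creating another sampler state.
    desc.minFilter = MTLSamplerMinMagFilterNearest;

    return sampler;
}
```

The returned object exposes no setters for its filtering state, which is what lets Metal skip change checks at draw time.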

111 Metal Porting

118 Build, Initialize, Render
- Build: Shaders
- Initialize: Devices and Queues, Render Objects
- Render: Command Buffers, Resource Updates, Render Encoders, Display

122 Metal Shading Language
- Based on C++
  - Classes, templates, structs, enums, namespaces
- Built-in types for vectors and matrices
- Built-in functions and operators
- Built-in classes for textures and samplers

141
    // Uniforms is declared elsewhere (not shown here).
    struct VertexOutput {
        float4 clipPos [[position]];
        float2 texCoord;
    };
    struct Vertex {
        float4 modelPos;
        float2 texCoord;
    };
    vertex VertexOutput myVertexShader(uint vid [[ vertex_id ]],
                                       device Vertex * vertices [[ buffer(0) ]],
                                       constant Uniforms & uniforms [[ buffer(1) ]])
    {
        VertexOutput out;
        out.clipPos = vertices[vid].modelPos * uniforms.mvp;
        out.texCoord = vertices[vid].texCoord;
        return out;
    }

144
    vertex VertexOutput myVertexShader(uint vid [[ vertex_id ]],
                                       device Vertex * vertices [[ buffer(0) ]],
                                       constant Uniforms & uniforms [[ buffer(1) ]])
    {
        VertexOutput out;
        out.clipPos = vertices[vid].modelPos * uniforms.mvp;
        out.texCoord = vertices[vid].texCoord;
        return out;
    }
    fragment float4 myFragmentShader(VertexOutput in [[ stage_in ]],
                                     constant Uniforms & uniforms [[ buffer(3) ]],
                                     texture2d<float> colorTex [[ texture(0) ]],
                                     sampler texSampler [[ sampler(1) ]])
    {
        return colorTex.sample(texSampler, in.texCoord * uniforms.coordScale);
    }

153
    // Application code: bind resources at the indices the fragment shader declares
    [renderEncoder setFragmentBuffer:myUniformBuffer offset:0 atIndex:3];
    [renderEncoder setFragmentTexture:myColorTexture atIndex:0];
    [renderEncoder setFragmentSampler:mySampler atIndex:1];
    // Shader code: the indices match buffer(3), texture(0), and sampler(1)
    fragment float4 myFragmentShader(VertexOutput in [[ stage_in ]],
                                     constant Uniforms & uniforms [[ buffer(3) ]],
                                     texture2d<float> colorTex [[ texture(0) ]],
                                     sampler texSampler [[ sampler(1) ]])
    {
        return colorTex.sample(texSampler, in.texCoord * uniforms.coordScale);
    }

158 SIMD Type Library
Types for shader development
- Vector and matrix types
- Usable with Metal shading language and application code
MyShaderTypes.h:
    struct MyUniforms {
        matrix_float4x4 modelViewProjection;
        vector_float4 sunPosition;
    };
MyRenderer.m and MyShaders.metal each include the shared header:
    #include "MyShaderTypes.h"
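On the application side, the shared struct might be filled in and copied into a buffer roughly as follows. This is a sketch under assumptions: the struct is repeated inline so the example is self-contained (in a real project it would come from the shared header), `MakeUniformBuffer` is a hypothetical helper, and the matrix and position values are placeholders.

```objc
#import <Metal/Metal.h>
#include <simd/simd.h>

// Stand-in for the contents of MyShaderTypes.h.
struct MyUniforms {
    matrix_float4x4 modelViewProjection;
    vector_float4 sunPosition;
};

// Hypothetical helper: packs one MyUniforms value into a new Metal buffer.
static id<MTLBuffer> MakeUniformBuffer(id<MTLDevice> device)
{
    // In Objective-C (plain C rules) the `struct` keyword is required here;
    // the .metal file, being C++-based, can refer to the type as MyUniforms.
    struct MyUniforms uniforms;
    uniforms.modelViewProjection = matrix_identity_float4x4;           // placeholder matrix
    uniforms.sunPosition = (vector_float4){0.0f, 10.0f, 0.0f, 1.0f};   // placeholder position

    // Same types on both sides, so the bytes can be copied as they are.
    return [device newBufferWithBytes:&uniforms
                               length:sizeof(uniforms)
                              options:MTLResourceStorageModeShared];
}
```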

162 Shader Compilation
Building with Xcode
- Xcode compiles shaders into a Metal library (.metallib)
  - Front-end compilation to binary intermediate representation
  - Avoids parsing time on customer systems
- By default, all shaders built into default.metallib
  - Placed in app bundle for run time retrieval
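Retrieving a shader from default.metallib at run time could look like the sketch below. `LoadShaderFunction` is a hypothetical helper, and it assumes the app bundle actually contains a default library built by Xcode.

```objc
#import <Metal/Metal.h>

// Hypothetical helper: fetches one shader function from the app's default.metallib.
// Returns nil if the bundle has no default library or no function with that name.
static id<MTLFunction> LoadShaderFunction(id<MTLDevice> device, NSString *name)
{
    // The library was compiled at build time, so no source is parsed here.
    id<MTLLibrary> library = [device newDefaultLibrary];
    if (library == nil) {
        NSLog(@"No default.metallib found in the main bundle");
        return nil;
    }

    id<MTLFunction> function = [library newFunctionWithName:name];
    if (function == nil) {
        NSLog(@"Function %@ not found in default.metallib", name);
    }
    return function;
}
```

With the shaders from the earlier slides, a call might be `LoadShaderFunction(device, @"myVertexShader")`.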

165 Runtime Shader Compilation
- Can also build shaders from source at runtime
- Significant disadvantages
  - Full shader compilation occurs at runtime
  - Compilation errors less obvious
  - No header sharing between application and runtime-built shaders
- Build time compilation recommended
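For comparison, runtime compilation from a source string might be sketched as below. `CompileLibrary` is a hypothetical helper. Note that errors only surface through the `NSError` at run time, which is one of the disadvantages the slide lists.

```objc
#import <Metal/Metal.h>

// Hypothetical helper: compiles Metal Shading Language source at run time.
// Returns nil and logs the compiler message if compilation fails.
static id<MTLLibrary> CompileLibrary(id<MTLDevice> device, NSString *source)
{
    NSError *error = nil;
    // Passing nil for options uses the default compile options.
    id<MTLLibrary> library = [device newLibraryWithSource:source
                                                  options:nil
                                                    error:&error];
    if (library == nil) {
        NSLog(@"Shader compilation failed: %@", error.localizedDescription);
    }
    return library;
}
```

A caller would pass the full shader text, for example a string that begins with `#include <metal_stdlib>` and defines one `fragment` function, then look the function up by name in the returned library.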

173 Devices
- A device represents one GPU
- Creates render objects
  - Textures, buffers, pipelines
- macOS: multiple devices may be available
- Default device suitable for most applications
    // Getting a device
    id<MTLDevice> device = MTLCreateSystemDefaultDevice();
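Where more than one GPU is present, an application could choose among them instead of taking the default. The sketch below assumes macOS, since it relies on `MTLCopyAllDevices` and the `lowPower` property. `PickDevice` and its selection rule are hypothetical and only illustrate the idea.

```objc
#import <Metal/Metal.h>

// Hypothetical helper (macOS): returns the first device whose low-power flag
// matches the request, falling back to the system default device.
static id<MTLDevice> PickDevice(BOOL preferLowPower)
{
    NSArray<id<MTLDevice>> *devices = MTLCopyAllDevices();
    for (id<MTLDevice> candidate in devices) {
        if (candidate.lowPower == preferLowPower) {
            return candidate;
        }
    }
    // No match: use the device that suits most applications.
    return MTLCreateSystemDefaultDevice();
}
```

Render objects created from the chosen device are usable only on that device, so the choice is normally made once at startup.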

177 Command Queues
A queue is created from a device
Queues execute command buffers in order
Create the queue at initialization
Typically one queue is sufficient
// Creating a command queue
id<MTLCommandQueue> commandQueue = [device newCommandQueue];

182 Textures Buffers Pipelines

184 [Diagram: a texture descriptor (texture type 2D, width, height, number of mipmaps, pixel format RGBA8) is given to the device, which creates a texture object and allocates its memory]

197 Storage Modes
[Diagrams: Shared storage (GPU and CPU, one memory); Private storage (GPU and CPU, video memory and system memory); Managed storage (GPU and CPU, video memory and system memory)]

209 // Creating Textures
MTLTextureDescriptor *textureDescriptor = [MTLTextureDescriptor new];
textureDescriptor.pixelFormat = MTLPixelFormatBGRA8Unorm;
textureDescriptor.width = 512;
textureDescriptor.height = 512;
textureDescriptor.storageMode = MTLStorageModeShared;
id<MTLTexture> texture = [device newTextureWithDescriptor:textureDescriptor];

212 // Loading Image Data
NSUInteger bytesPerRow = 4 * image.width;
MTLRegion region = {
    { 0, 0, 0 },     // Origin
    { 512, 512, 1 }  // Size
};
[texture replaceRegion:region mipmapLevel:0 withBytes:imageData bytesPerRow:bytesPerRow];

215 Texture Differences
Sampler state is never part of the texture: wrap modes, filtering, min/max LOD
Texture image data is not flipped: OpenGL uses a bottom-left origin, Metal uses a top-left origin
Metal does not perform format conversion
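Since Metal places the texture origin at the top left and OpenGL at the bottom left, image data prepared for OpenGL may need its rows reversed before upload (adjusting texture coordinates is another option). The following is a minimal sketch in plain C, which also compiles inside an Objective-C source file. The names `tightly_packed_bytes_per_row` and `flip_rows` are hypothetical helpers, not Metal API, and the sketch assumes tightly packed 4-byte pixels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: row stride for tightly packed 4-byte pixels,
   mirroring the "4 * image.width" computation on the slide. */
size_t tightly_packed_bytes_per_row(size_t width_in_pixels) {
    return 4 * width_in_pixels;
}

/* Hypothetical helper: reverse the order of the rows in place, so that a
   bottom-left-origin image becomes a top-left-origin image. */
void flip_rows(unsigned char *pixels, size_t bytes_per_row, size_t height) {
    if (height < 2) {
        return; /* nothing to swap */
    }
    assert(pixels != NULL);
    for (size_t top = 0, bottom = height - 1; top < bottom; ++top, --bottom) {
        unsigned char *a = pixels + top * bytes_per_row;
        unsigned char *b = pixels + bottom * bytes_per_row;
        for (size_t i = 0; i < bytes_per_row; ++i) {
            unsigned char tmp = a[i];
            a[i] = b[i];
            b[i] = tmp;
        }
    }
}
```

The flipped bytes could then be handed to the texture upload call with the same bytes-per-row value.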

216 Textures Buffers Shaders

218 Buffers
Metal uses buffers for vertices, indices, and all uniform data
OpenGL's vertex, element, and uniform buffers are similar
Apps that have adopted these are easier to port

221 // Creating Buffers
id<MTLBuffer> buffer = [device newBufferWithLength:bufferDataByteSize options:MTLResourceStorageModeShared];
struct MyUniforms *uniforms = (struct MyUniforms *)buffer.contents;
uniforms->modelViewProjection = modelViewProjection;
uniforms->sunPosition = sunPosition;

224 Notes About Buffer Data: Pay attention to alignment
Type                       Alignment
float3, int3, uint3        16 bytes
float3x3, float4x3         16 bytes
half3, short3, ushort3     8 bytes
half3x3, half4x3           8 bytes
structures                 4 bytes

225 Notes About Buffer Data: SIMD and packed types
The SIMD library's vector and matrix types follow the same rules as Metal shaders
Special packed vector types are available to shaders
packed_float3 consumes 12 bytes
packed_half3 consumes 6 bytes
Packed types cannot be operated on directly; a cast to the non-packed type is required
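The effect of these alignment rules on a CPU-side struct can be sketched in plain C. The types below are stand-ins invented for illustration (`float3_like`, `packed_float3_like`, and the two uniform structs are hypothetical names, not Metal or SIMD library types), and the sketch assumes a 4-byte `float`. It shows how a 16-byte-aligned three-component vector introduces padding that a 12-byte packed vector does not.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Stand-in for a shader-side float3: three floats, 16-byte size and alignment. */
typedef struct { alignas(16) float x; float y; float z; } float3_like;

/* Stand-in for packed_float3: three floats, 12 bytes, 4-byte alignment. */
typedef struct { float x; float y; float z; } packed_float3_like;

/* The vector is pushed to offset 16, so 12 bytes of padding follow 'intensity'. */
typedef struct {
    float       intensity;
    float3_like sun_position;
} AlignedUniforms;

/* The packed vector starts right after 'intensity' at offset 4. */
typedef struct {
    float              intensity;
    packed_float3_like sun_position;
} PackedUniforms;

static_assert(sizeof(float3_like) == 16, "padded to 16 bytes");
static_assert(sizeof(packed_float3_like) == 12, "no padding");
```

If the CPU-side struct and the shader-side struct disagree about which of these two layouts is in use, every field after the vector is read from the wrong offset, which is why sharing one header between application and shader code is convenient.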

228 Storage Modes for Porting
Use the most convenient storage modes for easier access to data
On iOS:
Create all textures and buffers with MTLStorageModeShared
On macOS:
Create all textures with MTLStorageModeManaged
Make judicious use of MTLStorageModeShared for buffers; separate GPU-only data from CPU-accessible data

229 MetalKit: Texture and buffer utilities
Texture loading: textures from KTX, PVR, JPG, PNG, TIFF, etc.
Model loading: vertex buffers from USD, OBJ, Alembic, etc.

230 Textures Buffers Pipelines

232 [Diagram: a render pipeline descriptor (vertex layout, vertex shader, fragment shader, blend state, render target pixel formats) is given to the device, which creates a render pipeline state object]

244 // Creating Render Pipeline Objects
id<MTLLibrary> defaultLibrary = [device newDefaultLibrary];
id<MTLFunction> vertexFunction = [defaultLibrary newFunctionWithName:@"vertexshader"];
id<MTLFunction> fragmentFunction = [defaultLibrary newFunctionWithName:@"fragmentshader"];
MTLRenderPipelineDescriptor *pipelineStateDescriptor = [MTLRenderPipelineDescriptor new];
pipelineStateDescriptor.vertexFunction = vertexFunction;
pipelineStateDescriptor.fragmentFunction = fragmentFunction;
pipelineStateDescriptor.colorAttachments[0].pixelFormat = MTLPixelFormatRGBA8Unorm;
id<MTLRenderPipelineState> pipelineState;
pipelineState = [device newRenderPipelineStateWithDescriptor:pipelineStateDescriptor error:nil];

245 Pipeline Differences
OpenGL program objects: Vertex Shader, Fragment Shader
Metal: Vertex Layout, Vertex Shader, Fragment Shader, Blend State, Render Target Pixel Formats

246 Pipeline Building
Create at initialization
Full compilation is the key advantage of state grouping
Choose a canonical vertex layout for meshes
Use a limited set of render target formats

247 Pipeline Building
Lazy creation at draw time!
Store pipeline state objects in a dictionary using the descriptor as key
Construct the descriptor at draw time with the current state
Retrieve the existing pipeline from the dictionary OR build a new pipeline
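The lazy-creation pattern can be sketched without Metal as a small lookup table keyed by the state that would go into the descriptor. This is an illustrative model in plain C, not the talk's implementation: `PipelineKey`, `PipelineCache`, and `pipeline_cache_get` are hypothetical names, the integer "pipeline" stands in for a pipeline state object, and a linear search over a fixed array stands in for the dictionary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_CAPACITY 64

/* Hypothetical key: the state that would be written into the descriptor. */
typedef struct {
    int  vertex_function;
    int  fragment_function;
    int  color_pixel_format;
    bool blending_enabled;
} PipelineKey;

typedef struct {
    PipelineKey key;
    int         pipeline; /* stand-in for a pipeline state object */
} CacheEntry;

typedef struct {
    CacheEntry entries[CACHE_CAPACITY];
    size_t     count;
    int        builds;    /* number of expensive builds performed so far */
} PipelineCache;

bool pipeline_key_equal(PipelineKey a, PipelineKey b) {
    return a.vertex_function == b.vertex_function &&
           a.fragment_function == b.fragment_function &&
           a.color_pixel_format == b.color_pixel_format &&
           a.blending_enabled == b.blending_enabled;
}

/* Return the cached pipeline for 'key', building it only on first use. */
int pipeline_cache_get(PipelineCache *cache, PipelineKey key) {
    for (size_t i = 0; i < cache->count; ++i) {
        if (pipeline_key_equal(cache->entries[i].key, key)) {
            return cache->entries[i].pipeline; /* cache hit: no build */
        }
    }
    assert(cache->count < CACHE_CAPACITY);
    cache->builds += 1; /* in a real renderer the expensive build goes here */
    CacheEntry entry = { key, cache->builds };
    cache->entries[cache->count++] = entry;
    return entry.pipeline;
}
```

Only the first draw with a given combination of state pays the build cost; later draws with the same state reuse the stored object.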

251 Create Render Objects at Initialization
Object creation is expensive
Pipelines require backend compilation
Buffers and textures need allocations
Once created, usage during rendering is much faster

252 Porting the Render Loop Sukanya Sudugu, GPU Software Engineer

261 Command Buffers
Explicit control over command buffer submission
Start with one command buffer per frame
Optionally split a frame into multiple command buffers to:
Submit early and get the GPU started
Build commands on multiple threads
A completion handler is invoked when execution is finished

262 // Render Loop
// Obtain a command buffer at the beginning of each frame
id<MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
// Encoding commands...
// Commit the command buffer to the GPU for execution
[commandBuffer commit];

266 // Render Loop (synchronous wait)
// Obtain a command buffer at the beginning of each frame
id<MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
// Encoding commands...
// Commit the command buffer to the GPU for execution
[commandBuffer commit];
// Wait until the GPU has finished execution
[commandBuffer waitUntilCompleted];

268 // Render Loop (completion handler)
// Obtain a command buffer at the beginning of each frame
id<MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
// Encoding commands...
// Add a completion handler to tell me when the GPU is done
[commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> commandBuffer) {
    // GPU is done with my buffer!...
}];
// Commit the command buffer to the GPU for execution
[commandBuffer commit];

270 Resource Updates
Resources are explicitly managed in Metal
No implicit synchronization like OpenGL
Allows for fine-grained synchronization
Application has complete control
Best model depends on usage
Triple buffering recommended
[Diagram: the CPU writes and the GPU reads shared resources such as dynamic uniforms and dynamic vertices]

275 Resource Updates: Without synchronization
[Timeline: the CPU writes frame 2 data into the buffer while the GPU is still reading frame 1 data from it]

277 Temporary Solution: Synchronous wait after every frame
[Timeline: the CPU waits for the GPU to finish frame 1 before writing frame 2 data; the GPU is idle while the CPU works on frame 2]

286 Triple Buffering: Shared buffer pool
[Timeline: the CPU writes frame 1, frame 2, and frame 3 data into three separate buffers while the GPU reads frame 1. As the GPU finishes each frame, a completion handler signals the CPU, which reuses the freed buffer for frame 4 data, then frame 5 data, while the GPU moves on to frames 2 and 3]

289 // Triple Buffering Implementation
// Create FIFO queue of three dynamic data uniform buffers
id<MTLBuffer> myUniformBuffers[3];
// Create a semaphore that gets signaled at each frame boundary.
// The GPU signals the semaphore once it completes a frame's work,
// allowing the CPU to work on a new frame
dispatch_semaphore_t frameBoundarySemaphore = dispatch_semaphore_create(3);
// Current frame index
NSUInteger currentUniformIndex = 0;

295 // Wait until an in-flight frame is completed
dispatch_semaphore_wait(frameBoundarySemaphore, DISPATCH_TIME_FOREVER);
// Grab current frame and update its buffer
currentUniformIndex = (currentUniformIndex + 1) % 3;
[self updateUniformResource:myUniformBuffers[currentUniformIndex]];
// Encode commands and bind uniform buffer for GPU access
// Schedule frame completion handler
[commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> commandBuffer) {
    // GPU work is complete. Signal the semaphore to start CPU work
    dispatch_semaphore_signal(frameBoundarySemaphore);
}];
// Finalize and commit frame to GPU
[commandBuffer commit];

300 Render Pass Descriptor
[Diagram: a render pass descriptor with color, depth, and stencil attachments is used with a command buffer to create a render command encoder]

302 Render Pass Setup
// Metal render pass descriptor
MTLRenderPassDescriptor *desc = [MTLRenderPassDescriptor new];
desc.colorAttachments[0].texture = myColorTexture;
desc.depthAttachment.texture = myDepthTexture;
id<MTLRenderCommandEncoder> encoder = [commandBuffer renderCommandEncoderWithDescriptor:desc];

307 Render Pass Load and Store Actions
[Diagram: each attachment in the render pass descriptor has a load action applied before drawing and a store action applied after drawing]
Attachment   Load Action   Store Action
Color        Clear         Store
Depth        Clear         Don't care

310 Render Pass Load and Store Actions
// Color attachment load and store actions
MTLRenderPassDescriptor *desc = [MTLRenderPassDescriptor new];
desc.colorAttachments[0].texture = myColorTexture;
desc.colorAttachments[0].loadAction = MTLLoadActionClear;
desc.colorAttachments[0].clearColor = MTLClearColorMake(0.39f, 0.34f, 0.53f, 1.0f);
desc.colorAttachments[0].storeAction = MTLStoreActionStore;
id<MTLRenderCommandEncoder> encoder = [commandBuffer renderCommandEncoderWithDescriptor:desc];

317 Rendering with OpenGL
// Render Targets
glBindFramebuffer(GL_FRAMEBUFFER, myFramebuffer);
// Shaders
glUseProgram(myProgram);
// Vertex Buffers
glBindBuffer(GL_ARRAY_BUFFER, myVertexBuffer);
// Uniforms
glBindBuffer(GL_UNIFORM_BUFFER, myUniforms);
// Textures
glBindTexture(GL_TEXTURE_2D, myColorTexture);
// Draws
glDrawArrays(GL_TRIANGLES, 0, numVertices);

318 Rendering with Metal Render Targets encoder = [commandbuffer rendercommandencoderwithdescriptor:descriptor]; Shaders [encoder setpipelinestate:mypipeline]; Vertex Buffers [encoder setvertexbuffer:myvertexdata offset:0 atindex:0]; [encoder setvertexbuffer:myuniforms offset:0 atindex:1]; Uniforms [encoder setfragmentbuffer:myuniforms offset:0 atindex:1]; Textures [encoder setfragmenttexture:mycolortexture atindex:0]; Draws [encoder drawprimitives:mtlprimitivetypetriangle vertexstart:0 vertexcount:numvertices]; [encoder endencoding];

326 Porting roadmap
Build: Shaders
Initialize: Devices and Queues, Render Objects
Render: Command Buffers, Resource Updates, Render Encoders, Display

327 [Diagram: Application, Renderer, Metal API, and GPU. Metal objects shown: Device, Command Queue, Command Buffer, Command Encoder, Display, and Render Objects (Textures, Buffers, Pipelines)]

329 Display: Drawables and presentation
- Drawables are textures for on-screen display
- Each frame, MTKView provides:
  - A drawable texture
  - A render pass descriptor set up with the drawable
- Render to drawables like any other texture
- Present the drawable when done rendering

330
// Render offscreen passes...

// Acquire a render pass descriptor generated from the drawable's texture
MTLRenderPassDescriptor *renderPassDescriptor = view.currentRenderPassDescriptor;

// Encode your on-screen render passes
id<MTLRenderCommandEncoder> renderCommandEncoder =
    [commandBuffer renderCommandEncoderWithDescriptor:renderPassDescriptor];

// Encode render commands...
[renderCommandEncoder endEncoding];

// Register the drawable presentation
[commandBuffer presentDrawable:view.currentDrawable];
[commandBuffer commit];

334 Incrementally Porting
- Create shared Metal/OpenGL textures using IOSurface or CVPixelBuffer
- Render to a texture in one API and read it in the other
- Can enable mixed Metal/OpenGL applications
- Sample code available

335 Going Further: Multithreading
- Metal is designed to facilitate multithreading
- Consider multithreading if the application is CPU bound
- Encode multiple command buffers simultaneously
- Split a single render pass using MTLParallelRenderCommandEncoder
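
The last point above can be sketched roughly as follows. This is a minimal illustration, not code from the session: the helper name EncodePassInParallel, the sliceCount parameter, and the encodeDraws callback are hypothetical application-side names. Only the Metal and Grand Central Dispatch calls are real APIs.

```objc
#import <Metal/Metal.h>
#import <dispatch/dispatch.h>

// Hypothetical helper (not a Metal API): splits one render pass across threads.
// `encodeDraws` is an assumed application callback that records the draws for
// one slice of the scene into the encoder it is given.
static void EncodePassInParallel(id<MTLCommandBuffer> commandBuffer,
                                 MTLRenderPassDescriptor *descriptor,
                                 NSUInteger sliceCount,
                                 void (^encodeDraws)(id<MTLRenderCommandEncoder> encoder,
                                                     NSUInteger slice))
{
    id<MTLParallelRenderCommandEncoder> parallelEncoder =
        [commandBuffer parallelRenderCommandEncoderWithDescriptor:descriptor];

    // Create the sub-encoders serially on this thread. The GPU executes their
    // commands in the order the sub-encoders were created.
    NSMutableArray<id<MTLRenderCommandEncoder>> *encoders =
        [NSMutableArray arrayWithCapacity:sliceCount];
    for (NSUInteger i = 0; i < sliceCount; i++) {
        [encoders addObject:[parallelEncoder renderCommandEncoder]];
    }

    // Each sub-encoder is touched by exactly one thread. dispatch_apply
    // returns only after every iteration has finished.
    dispatch_apply(sliceCount,
                   dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0),
                   ^(size_t i) {
        id<MTLRenderCommandEncoder> encoder = encoders[i];
        encodeDraws(encoder, i);
        [encoder endEncoding];
    });

    // End the parallel encoder only after all of its sub-encoders have ended.
    [parallelEncoder endEncoding];
}
```

Each sub-encoder starts with its own state, so the callback is assumed to set the pipeline state and resources before drawing. Whether this pays off depends on how CPU bound the encoding actually is.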

336 Going Further: Staying on the GPU
- Metal natively supports compute
- Performance benefits
  - Reduces CPU utilization
  - Reduces GPU-CPU synchronization points
  - Frees data bandwidth to the GPU
- New algorithms possible
  - Particle systems, physics, object culling
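
As a rough illustration of keeping work such as a particle update on the GPU, a compute pass might be encoded along these lines. The helper name EncodeParticleUpdate, the particle buffer, and the buffer index are assumptions made for this sketch, and the pipeline is assumed to have been built elsewhere from an application-defined kernel. Only the Metal calls themselves are real APIs.

```objc
#import <Metal/Metal.h>

// Hypothetical helper (not a Metal API): encodes one compute dispatch that
// updates `particleCount` particles stored in `particles`.
static void EncodeParticleUpdate(id<MTLCommandBuffer> commandBuffer,
                                 id<MTLComputePipelineState> pipeline,
                                 id<MTLBuffer> particles,
                                 NSUInteger particleCount)
{
    id<MTLComputeCommandEncoder> encoder = [commandBuffer computeCommandEncoder];
    [encoder setComputePipelineState:pipeline];

    // Buffer index 0 is an assumption; it must match the kernel's argument table.
    [encoder setBuffer:particles offset:0 atIndex:0];

    // One thread per particle, grouped by the pipeline's execution width.
    NSUInteger width = pipeline.threadExecutionWidth;
    MTLSize threadsPerGroup = MTLSizeMake(width, 1, 1);

    // Round up so every particle is covered. The kernel is assumed to ignore
    // thread indices that are >= particleCount.
    MTLSize groupCount = MTLSizeMake((particleCount + width - 1) / width, 1, 1);

    [encoder dispatchThreadgroups:groupCount threadsPerThreadgroup:threadsPerGroup];
    [encoder endEncoding];
}
```

Because the updated buffer can be bound directly as a vertex buffer in a later render pass on the same command buffer, the CPU never has to read the results back.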

337 More Metal Features

338 More Metal Features
Sharable Textures, Tile Shaders, Tessellation, Resource Heaps, Indirect Command Buffers, Raster Order Groups, Programmable Sample Positions, Events, Layered Rendering, Argument Buffers, Image Blocks, Compute, Indirect Dispatch, SIMD Group Operators, Typed Buffers, Framebuffer Fetch, Metal Performance Shaders, Function Specialization, Multi Viewport Rendering, Texture Arrays, Array of Samplers, Wide Color, Resource Views, Memoryless Render Targets

339 Developer Tools: Debug and optimize your applications
- Xcode contains an advanced set of GPU tools
- Enable Metal's API validation layer
  - On by default when the target is run from Xcode


Hardware Accelerated Volume Visualization. Leonid I. Dimitrov & Milos Sramek GMI Austrian Academy of Sciences Hardware Accelerated Volume Visualization Leonid I. Dimitrov & Milos Sramek GMI Austrian Academy of Sciences A Real-Time VR System Real-Time: 25-30 frames per second 4D visualization: real time input of

More information

Copyright Khronos Group Page 1. Vulkan Overview. June 2015

Copyright Khronos Group Page 1. Vulkan Overview. June 2015 Copyright Khronos Group 2015 - Page 1 Vulkan Overview June 2015 Copyright Khronos Group 2015 - Page 2 Khronos Connects Software to Silicon Open Consortium creating OPEN STANDARD APIs for hardware acceleration

More information

Best practices for effective OpenGL programming. Dan Omachi OpenGL Development Engineer

Best practices for effective OpenGL programming. Dan Omachi OpenGL Development Engineer Best practices for effective OpenGL programming Dan Omachi OpenGL Development Engineer 2 What Is OpenGL? 3 OpenGL is a software interface to graphics hardware - OpenGL Specification 4 GPU accelerates rendering

More information

What s New in Core Image

What s New in Core Image Media #WWDC15 What s New in Core Image Session 510 David Hayward Engineering Manager Tony Chu Engineer Alexandre Naaman Lead Engineer 2015 Apple Inc. All rights reserved. Redistribution or public display

More information

PROFESSIONAL. WebGL Programming DEVELOPING 3D GRAPHICS FOR THE WEB. Andreas Anyuru WILEY. John Wiley & Sons, Ltd.

PROFESSIONAL. WebGL Programming DEVELOPING 3D GRAPHICS FOR THE WEB. Andreas Anyuru WILEY. John Wiley & Sons, Ltd. PROFESSIONAL WebGL Programming DEVELOPING 3D GRAPHICS FOR THE WEB Andreas Anyuru WILEY John Wiley & Sons, Ltd. INTRODUCTION xxl CHAPTER 1: INTRODUCING WEBGL 1 The Basics of WebGL 1 So Why Is WebGL So Great?

More information

CS4621/5621 Fall Computer Graphics Practicum Intro to OpenGL/GLSL

CS4621/5621 Fall Computer Graphics Practicum Intro to OpenGL/GLSL CS4621/5621 Fall 2015 Computer Graphics Practicum Intro to OpenGL/GLSL Professor: Kavita Bala Instructor: Nicolas Savva with slides from Balazs Kovacs, Eston Schweickart, Daniel Schroeder, Jiang Huang

More information

CS 432 Interactive Computer Graphics

CS 432 Interactive Computer Graphics CS 432 Interactive Computer Graphics Lecture 7 Part 2 Texture Mapping in OpenGL Matt Burlick - Drexel University - CS 432 1 Topics Texture Mapping in OpenGL Matt Burlick - Drexel University - CS 432 2

More information

Graphics Performance Optimisation. John Spitzer Director of European Developer Technology

Graphics Performance Optimisation. John Spitzer Director of European Developer Technology Graphics Performance Optimisation John Spitzer Director of European Developer Technology Overview Understand the stages of the graphics pipeline Cherchez la bottleneck Once found, either eliminate or balance

More information

OpenGL on Android. Lecture 7. Android and Low-level Optimizations Summer School. 27 July 2015

OpenGL on Android. Lecture 7. Android and Low-level Optimizations Summer School. 27 July 2015 OpenGL on Android Lecture 7 Android and Low-level Optimizations Summer School 27 July 2015 This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this

More information

CS451Real-time Rendering Pipeline

CS451Real-time Rendering Pipeline 1 CS451Real-time Rendering Pipeline JYH-MING LIEN DEPARTMENT OF COMPUTER SCIENCE GEORGE MASON UNIVERSITY Based on Tomas Akenine-Möller s lecture note You say that you render a 3D 2 scene, but what does

More information

Stream Computing using Brook+

Stream Computing using Brook+ Stream Computing using Brook+ School of Electrical Engineering and Computer Science University of Central Florida Slides courtesy of P. Bhaniramka Outline Overview of Brook+ Brook+ Software Architecture

More information

Why Study Assembly Language?

Why Study Assembly Language? Why Study Assembly Language? This depends on the decade in which you studied assembly language. 1940 s You cannot study assembly language. It does not exist yet. 1950 s You study assembly language because,

More information

CS452/552; EE465/505. Clipping & Scan Conversion

CS452/552; EE465/505. Clipping & Scan Conversion CS452/552; EE465/505 Clipping & Scan Conversion 3-31 15 Outline! From Geometry to Pixels: Overview Clipping (continued) Scan conversion Read: Angel, Chapter 8, 8.1-8.9 Project#1 due: this week Lab4 due:

More information

Adaptive Point Cloud Rendering

Adaptive Point Cloud Rendering 1 Adaptive Point Cloud Rendering Project Plan Final Group: May13-11 Christopher Jeffers Eric Jensen Joel Rausch Client: Siemens PLM Software Client Contact: Michael Carter Adviser: Simanta Mitra 4/29/13

More information

! Readings! ! Room-level, on-chip! vs.!

! Readings! ! Room-level, on-chip! vs.! 1! 2! Suggested Readings!! Readings!! H&P: Chapter 7 especially 7.1-7.8!! (Over next 2 weeks)!! Introduction to Parallel Computing!! https://computing.llnl.gov/tutorials/parallel_comp/!! POSIX Threads

More information

MXwendler Fragment Shader Development Reference Version 1.0

MXwendler Fragment Shader Development Reference Version 1.0 MXwendler Fragment Shader Development Reference Version 1.0 This document describes the MXwendler fragmentshader interface. You will learn how to write shaders using the GLSL language standards and the

More information

Blis: Better Language for Image Stuff Project Proposal Programming Languages and Translators, Spring 2017

Blis: Better Language for Image Stuff Project Proposal Programming Languages and Translators, Spring 2017 Blis: Better Language for Image Stuff Project Proposal Programming Languages and Translators, Spring 2017 Abbott, Connor (cwa2112) Pan, Wendy (wp2213) Qinami, Klint (kq2129) Vaccaro, Jason (jhv2111) [System

More information

The Bifrost GPU architecture and the ARM Mali-G71 GPU

The Bifrost GPU architecture and the ARM Mali-G71 GPU The Bifrost GPU architecture and the ARM Mali-G71 GPU Jem Davies ARM Fellow and VP of Technology Hot Chips 28 Aug 2016 Introduction to ARM Soft IP ARM licenses Soft IP cores (amongst other things) to our

More information

Rationale for Non-Programmable Additions to OpenGL 2.0

Rationale for Non-Programmable Additions to OpenGL 2.0 Rationale for Non-Programmable Additions to OpenGL 2.0 NVIDIA Corporation March 23, 2004 This white paper provides a rationale for a set of functional additions to the 2.0 revision of the OpenGL graphics

More information

CUDA Programming Model

CUDA Programming Model CUDA Xing Zeng, Dongyue Mou Introduction Example Pro & Contra Trend Introduction Example Pro & Contra Trend Introduction What is CUDA? - Compute Unified Device Architecture. - A powerful parallel programming

More information