3D Graphics Dev Day. Copyright Khronos Group Page 1


1 3D Graphics Dev Day

2 Khronos 3D Graphics Dev Day - Sessions
- 1:20pm - Vulkan Game Development on Mobile
- 2:40pm - Vulkan on Desktop Deep Dive
- 4:00pm - When Vulkan was One: Looking Forward, Looking Back
Collect them all!

3 Vulkan on Desktop Deep Dive
- Vulkan in Xenko - Jörg Wollenschläger (Silicon Studio)
- Unity - Jesse Barker (Unity)
- Vulkan Multi-GPU - Jeff Bolz (NVIDIA)
- Vulkan on the Desktop - Dan Baker (Oxide)

4 Vulkan Multi-GPU. Jeff Bolz, NVIDIA Corp. February 2017

5 What/Why?
- A device group is a set of physical devices that support multi-GPU rendering
- Assumes a certain system configuration
  - Similar/identical GPUs
  - WDDM must be in linked display adapter mode
  - In short, an SLI/Crossfire system
  - Not dGPU/iGPU
- New Device and Instance extensions:
  - KHX_device_group(_creation)
- Goal: Support the common multi-GPU rendering techniques (AFR/SFR/VR)
- Design philosophy:
  - Single logical device
  - Hide/share duplicated object creation
  - Make adding device group support as non-invasive as possible

6 Device Group Enumeration/Creation
- Each GPU is still advertised as a distinct VkPhysicalDevice
- Advertise groups of physical devices from which a single logical device can be created:
    typedef struct VkPhysicalDeviceGroupPropertiesKHX {
        ...
        uint32_t         physicalDeviceCount;
        VkPhysicalDevice physicalDevices[VK_MAX_DEVICE_GROUP_SIZE_KHX];
    } VkPhysicalDeviceGroupPropertiesKHX;
    vkEnumeratePhysicalDeviceGroupsKHX(..., /*out array*/ VkPhysicalDeviceGroupPropertiesKHX*);
- Figure: physical devices (GTX 1080, GTX 1080, iGPU) and the device groups built from them (Group 0: 1080 x 2; Group 1)
- Once the logical device is created, it is used for object creation and work submission for all GPUs

7 Memory Allocation
- Device-local heap has an instance for each GPU (still advertised as one heap)
- Each VkDeviceMemory has a number of memory instances
  - Common case: one instance for non-local, N instances for local
- Most device-local memory allocations should be made with N instances
  - For static resources, you want a fast local copy of the data
  - For dynamic resources, you need a separate copy for each GPU/frame (for AFR)
- Example: single-instance non-device-local allocation, multi-instance device-local allocation
- Figure: Alloc 0 lives in the non-local heap and is shared; Alloc 1 lives in the local heap with one instance on GPU 0 and one on GPU 1

8 Resource Binding
- Each resource has an instance for each GPU
- Mental model: a resource has a single virtual address across all GPUs, which can be bound to a local or peer memory instance on each GPU
- New (extensible) binding commands: vkBind{Image,Buffer}Memory2KHX
  - Add control over which memory instance each resource instance is bound to
    typedef struct VkBindImageMemoryInfoKHX {
        ...
        uint32_t        deviceIndexCount;
        const uint32_t* pDeviceIndices;
    } VkBindImageMemoryInfoKHX;
- Figure: with pDeviceIndices = {0,1}, Image Instance 0 is bound to Memory Instance 0 on GPU 0 and Image Instance 1 to Memory Instance 1 on GPU 1

9 Resource Binding
- Descriptor sets have a single instance
  - Expected use is for different bindings to be hidden behind the same virtual address
  - Put image X in the set, and X is bound to some memory instance on each GPU
- Example: peer binding for remote texturing
- Figure: a shared descriptor set holds one sampled image; with pDeviceIndices = {1,0}, Image Instance 0 on GPU 0 is bound to Memory Instance 1 and Image Instance 1 on GPU 1 is bound to Memory Instance 0
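The two binding examples ({0,1} for local, {1,0} for peer) can be sketched as index tables where entry i names the memory instance that resource instance i is bound to. The helper names below are hypothetical and nothing here calls Vulkan; the resulting array is what would be passed as pDeviceIndices. The "next GPU" choice for the peer case is just one possible peer assignment, chosen because it reproduces {1,0} for two GPUs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local binding: resource instance i uses memory instance i. */
static void fill_local_indices(uint32_t *indices, uint32_t device_count)
{
    assert(indices != NULL && device_count > 0);
    for (uint32_t i = 0; i < device_count; ++i)
        indices[i] = i;
}

/* Peer binding (assumption: each GPU reads the next GPU's memory instance).
 * For two GPUs this yields {1, 0}, matching the remote texturing example. */
static void fill_peer_indices(uint32_t *indices, uint32_t device_count)
{
    assert(indices != NULL && device_count > 0);
    for (uint32_t i = 0; i < device_count; ++i)
        indices[i] = (i + 1u) % device_count;
}
```

With deviceIndexCount set to the group size, either table could be handed to the bind call unchanged.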

10 Resource Binding
- Peer Memory feature bits advertise what operations are supported on peer bindings. VkPeerMemoryFeatureFlagBitsKHX defines:
    VK_PEER_MEMORY_FEATURE_COPY_SRC_BIT_KHX
    VK_PEER_MEMORY_FEATURE_COPY_DST_BIT_KHX
    VK_PEER_MEMORY_FEATURE_GENERIC_SRC_BIT_KHX
    VK_PEER_MEMORY_FEATURE_GENERIC_DST_BIT_KHX
  - COPY_DST is required; NVIDIA also supports COPY_SRC and GENERIC_SRC
- Copy to/from peer memory by using a resource with peer memory bound
- Figure: vkCmdCopyImage(src=imageA, dst=imageB) executed on GPU 0, with Image A Instance 0 and Image B Instance 0 bound to a local instance and a peer instance

11 SFR Image Binding
- Map rectangular regions of image instances to local/peer memory instances
    typedef struct VkBindImageMemoryInfoKHX {
        ...
        uint32_t        SFRRectCount;
        const VkRect2D* pSFRRects;
    } VkBindImageMemoryInfoKHX;
- SFRRectCount == N^2
  - One rect per (resource, memory) instance pair
  - Element i*N+j is the rect in resource instance i that maps to memory instance j
- Example: left/right split (figure: Image Instance 0 is Local | Peer, Image Instance 1 is Peer | Local)
    VkRect2D sfrRects[4] = {
        { { 0,   0 }, { w/2, h } },
        { { w/2, 0 }, { w/2, h } },
        { { 0,   0 }, { w/2, h } },
        { { w/2, 0 }, { w/2, h } },
    };
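The i*N+j indexing rule generalizes the left/right split to N vertical strips. The sketch below fills such an array for any group size. It uses local stand-in structs with the same field layout as VkRect2D so that it compiles without the Vulkan headers, and the function name fill_sfr_strips is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins mirroring VkOffset2D / VkExtent2D / VkRect2D. */
typedef struct { int32_t x, y; } Offset2D;
typedef struct { uint32_t width, height; } Extent2D;
typedef struct { Offset2D offset; Extent2D extent; } Rect2D;

/* Fill an N*N array for a vertical-strip split of a w x h image.
 * rects[i*n + j] is the region of resource instance i that is backed by
 * memory instance j. Every instance uses the same split, so strip j always
 * lives in GPU j's memory (local for GPU j, peer for the others). */
static void fill_sfr_strips(Rect2D *rects, uint32_t n, uint32_t w, uint32_t h)
{
    assert(rects != NULL && n > 0);
    for (uint32_t i = 0; i < n; ++i) {
        for (uint32_t j = 0; j < n; ++j) {
            /* Strip boundaries computed in 64 bits to avoid overflow. */
            uint32_t x0 = (uint32_t)(((uint64_t)w * j) / n);
            uint32_t x1 = (uint32_t)(((uint64_t)w * (j + 1u)) / n);
            Rect2D r = { { (int32_t)x0, 0 }, { x1 - x0, h } };
            rects[i * n + j] = r;
        }
    }
}
```

For n = 2 this reproduces the four rects of the left/right example, with SFRRectCount equal to n*n.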

12 Command Buffer Recording/Submission
- Commands in a command buffer can be directed to a subset of the devices:
    vkCmdSetDeviceMaskKHX(VkCommandBuffer commandBuffer, uint32_t deviceMask);
- Figure: one command buffer with commands recorded under Mask 0x1, then Mask 0x2, then Mask 0x3
- Furthermore, command buffers can be submitted to a subset of devices:
    typedef struct VkDeviceGroupSubmitInfoKHX {
        ...
        uint32_t        commandBufferCount;
        const uint32_t* pCommandBufferDeviceMasks;
    } VkDeviceGroupSubmitInfoKHX;
- Allows directing work to devices at coarse or fine granularity
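Device masks are plain bitfields in which bit i selects GPU i. A minimal sketch of two masks an application might compute: one covering the whole group, and one that alternates GPUs per frame for AFR. Both helper names are hypothetical; the result would be passed as deviceMask or as an entry of pCommandBufferDeviceMasks.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with one bit set per GPU in the group, e.g. 0x3 for two GPUs. */
static uint32_t all_devices_mask(uint32_t device_count)
{
    assert(device_count >= 1 && device_count <= 32);
    return device_count == 32 ? 0xFFFFFFFFu : (1u << device_count) - 1u;
}

/* AFR: frame k is rendered entirely by GPU (k mod N). */
static uint32_t afr_device_mask(uint64_t frame_index, uint32_t device_count)
{
    assert(device_count >= 1 && device_count <= 32);
    return 1u << (uint32_t)(frame_index % device_count);
}
```

Masking at submission time is the coarse option; setting the mask inside a command buffer is the fine-grained one.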

13 State Divergence
- Should state be allowed to diverge between GPUs?
  - Yes and no
- Rules:
  - State can be set on a subset of GPUs based on vkCmdSetDeviceMaskKHX
  - Scissor and Viewport state can vary between GPUs at Draw time
  - For other state, all draws/dispatches use the most recently set state
  - BUT: when recording a draw/dispatch on GPU <i>, the most recent setting of any relevant state must include GPU <i> in the device mask
- Valid:
    Mask 0x1, BindPipeline(A), Draw, Mask 0x2, BindPipeline(B), Draw
- Valid:
    BindPipeline(A), Mask 0x1, Draw, Mask 0x3, BindPipeline(B), Mask 0x2, Draw
- Invalid:
    Mask 0x1, BindPipeline(A), Mask 0x2, BindPipeline(B), Mask 0x3, Draw
  - Different pipeline for each GPU in a single draw
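The last rule can be checked mechanically. The sketch below is a simplified model, not driver or validation-layer code: it tracks only the pipeline binding, remembers the device mask that was active when the pipeline was last bound, and treats a draw as valid only if every GPU in the current mask was covered by that bind. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t current_mask;   /* mask set by the latest SetDeviceMask      */
    uint32_t last_bind_mask; /* mask in effect at the latest BindPipeline */
} PipelineTracker;

/* A fresh command buffer targets all GPUs and has no pipeline bound. */
static void tracker_init(PipelineTracker *t, uint32_t all_devices_mask)
{
    t->current_mask = all_devices_mask;
    t->last_bind_mask = 0;
}

static void tracker_set_mask(PipelineTracker *t, uint32_t mask)
{
    t->current_mask = mask;
}

static void tracker_bind_pipeline(PipelineTracker *t)
{
    t->last_bind_mask = t->current_mask;
}

/* Valid only if no GPU in the draw's mask is missing from the latest bind. */
static bool tracker_draw_is_valid(const PipelineTracker *t)
{
    return (t->current_mask & ~t->last_bind_mask) == 0;
}
```

Replaying the three sequences from the slide through this model accepts the two valid ones and rejects the invalid one, because the final draw under mask 0x3 includes GPU 0 while the latest bind covered only GPU 1.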

14 SFR/VR Features
- Split Frame Rendering:
  - Scissor state is allowed to diverge between GPUs (e.g. render to local instance)
  - Per-GPU render areas control loadOp, storeOp, AA resolve
    typedef struct VkDeviceGroupRenderPassBeginInfoKHX {
        ...
        uint32_t        deviceRenderAreaCount;
        const VkRect2D* pDeviceRenderAreas;
    } VkDeviceGroupRenderPassBeginInfoKHX;
    VkRect2D renderAreas[2] = {
        { { 0,   0 }, { w/2, h } },
        { { w/2, 0 }, { w/2, h } },
    };
  - Figure: Image Instance 0 is Local | Peer, Image Instance 1 is Peer | Local
- New GLSL input variable: gl_DeviceIndex
  - Zero-based index of the physical device within the logical device
  - E.g. used to select per-eye transform matrix in VR

15 Synchronization
- Fences/events/semaphores all have a single instance of the signaled state
  - Fence transitions to signaled when all GPUs complete the submission
  - Event can only be waited on by the GPU that signaled it
  - Each semaphore wait/signal operation occurs on a single GPU, but a semaphore can be used by multiple GPUs (e.g. signal on GPU 0 then wait on GPU 1)
  - Queue submission structures are extended to indicate which GPU(s) perform each operation
    typedef struct VkDeviceGroupSubmitInfoKHX {
        ...
        uint32_t        waitSemaphoreCount;
        const uint32_t* pWaitSemaphoreDeviceIndices;
        uint32_t        signalSemaphoreCount;
        const uint32_t* pSignalSemaphoreDeviceIndices;
    } VkDeviceGroupSubmitInfoKHX;
- Pipeline barriers normally perform synchronization within a GPU
  - Flag to opt in to pipeline barriers syncing between GPUs

16 Swapchain Creation and Binding
- Can use vkCreateImage + vkBindImageMemory2KHX in lieu of vkGetSwapchainImagesKHR
  - Think of the swapchain as owning the memory, and images are separately created/bound to that memory
  - Bound by swapchain/imageIndex rather than memory/memoryOffset
    typedef struct VkBindImageMemorySwapchainInfoKHX {
        ...
        VkSwapchainKHR swapchain;
        uint32_t       imageIndex;
    } VkBindImageMemorySwapchainInfoKHX;
  - Allows creating peer/SFR bindings of swapchain memory (for rendering, not presentation)
- Figure: Swapchain Images 0, 1 and 2 bound to swapchain memory at imageIndex 0, 1 and 2

17 Presentation Modes
- Support local/remote/split presentation on systems that support them
- Device group advertises supported modes/devices
    typedef struct VkDeviceGroupPresentCapabilitiesKHX {
        ...
        uint32_t                         presentMask[VK_MAX_DEVICE_GROUP_SIZE_KHX];
        VkDeviceGroupPresentModeFlagsKHX modes;
    } VkDeviceGroupPresentCapabilitiesKHX;
- vkGetDeviceGroupSurfacePresentModesKHX returns just-in-time supported modes for a given surface
  - Can vary based on window move/resize/etc.
  - Expect remote scanout only for fullscreen-exclusive

18 Acquire/Present
- Acquiring a swapchain image takes a mask and returns an index available on those devices
    typedef struct VkAcquireNextImageInfoKHX {
        ...
        uint32_t deviceMask;
    } VkAcquireNextImageInfoKHX;
- Presenting an image takes a mask (must equal the acquire mask) and a mode
    typedef struct VkDeviceGroupPresentInfoKHX {
        ...
        uint32_t                            swapchainCount;
        const uint32_t*                     pDeviceMasks;
        VkDeviceGroupPresentModeFlagBitsKHX mode;
    } VkDeviceGroupPresentInfoKHX;
- Typical combinations:
    Technique   Acquire mask     Present mode
    AFR         Alternate GPUs   REMOTE
    SFR         All GPUs         SUM
    VR          All GPUs         LOCAL_MULTI_DEVICE

19 Conclusion
- It sounds like a lot of work, but it's not
  - Most memory and resources want the default behavior
  - Submit render passes only to one GPU (AFR) or add per-device scissor (SFR)
  - A bit of WSI work to present the results
- Open Source vk_device_group sample in VRWorks (credit: Ingo Esser)
  - SFR stereo rendering

20 Vulkan in Xenko (Vulkan on Desktop Deep Dive). Jörg Wollenschläger, Software Engineer, Silicon Studio

21 What is Xenko?
- Next-Level Game Engine
- Developed by Silicon Studio, Tokyo
- Release in April (currently free beta)
- Open-source
- Written in C#
- Cross-platform (Windows, UWP, Android, iOS, etc.)
- OpenGL/ES, D3D 11, Vulkan, D3D 12

22 Rendering Architecture: Redesign for next-gen APIs
- Previously D3D11-like
- Fully accessible and extensible renderer
- Focus on multi-threading
- Separation of renderer into highly parallel phases
- Groundwork for completely asynchronous rendering
- New API abstraction is a mix of Vulkan and D3D 12
- Best of both worlds
- Emulating new concepts in legacy APIs where possible
- Abstracting differences on a higher level (renderer is more API aware)

23 Rendering Architecture: Newly exposed concepts
- First class: descriptor sets, pipelines, command buffers
  - Preferring Vulkan's descriptor sets over D3D 12's, as they are more intuitive
- Partially:
  - Only some explicit barriers. Many are still implicit and use state tracking
  - Mostly implicit fences. Recycling of per-frame resources is directly exposed
- Not yet: render passes, push constants, transfer queues, etc.
  - Render passes are similar to our render stages, which describe attachments and rendering technique, but will require more high-level changes

24 Rendering Architecture: Newly exposed concepts
- Emulated concepts
  - Staging resources and resource renaming are supported on Vulkan through suballocations of upload buffers
  - Descriptor sets and pipelines are faked in legacy APIs
- API-specific code paths
  - Special path on old APIs where resource renaming is needed, e.g. for constant buffer updates
  - Special path on Vulkan that uses dynamic descriptor offsets and skips updates

25 Shaders: Current infrastructure
- Xenko Shader Language
  - Extended HLSL
  - Modular (classes, inheritance, composition, mixins, etc.)
  - Resource groups (similar to cbuffers for resources)
- Problems
  - Mixer and GLSL converter are AST transforms
  - Reflection on AST
  - GLSL dialects (e.g. bindings for Vulkan)
  - Each shader permutation executes the complete pipeline
  - Hard to maintain, complex, costly
- Figure (pipeline diagram): XKSL, Mixer, HLSL, D3D Compiler, D3D bytecode, D3D; AST transform, GLSL, OpenGL, OpenGL ES; glslang, SPIR-V, Vulkan

26 Shaders: Investigation into new infrastructure
- Extended SPIR-V
  - Intermediate format for compilation steps
  - New OpCodes for custom features
  - Converter based on glslang
- Advantages
  - Compilation per shader class
  - Efficiently working at bytecode level
  - Reflection and optimization in bytecode
  - Lower maintenance due to open-source projects
- Figure (pipeline diagram): XKSL, Extended glslang, Extended SPIR-V, Mixer/Linker, SPIR-V, Vulkan; D3D bytecode, D3D; GLSL, OpenGL, OpenGL ES

27 Performance: CPU time savings
- Test scene with ~20000 draw calls
- Vulkan running at 60 FPS
- ~75% CPU load across cores
- Figure: time spent per frame, D3D 11 vs Vulkan, split into Simulation (transformation, culling, sorting, etc.), Preparation (descriptor updates, resource upload, etc.), Drawing, and Driver thread

28 Summary
- Requires explicitness and design considerations
- Works out of the box on Android and Linux
- May consolidate shader infrastructure
- Very good debugging experience
- Decent performance gains
- Lots of potential for improvements
xenko.com


30
- Established 2013
- Founders: leads of Civ V
- Cumulative 100+ years
- Independent studio
- Game technology
- Authors of Nitrous Engine

31 Key Differences
- Descriptor management differences
- Shader stack is HLSL -> SPIRV
- Render Passes
- No explicit memory management
- Pipeline Barriers
- Validator more thorough in Vulkan
- Most of the rest is similar to D3D12
- Never been easier to support both APIs!

32
- More descriptor types: buffers and textures are different
- Descriptors are not free-ranged like in D3D12; they live inside a specific descriptor set with a specific number of descriptors
- A descriptor set's layout must match the pipeline object's exactly; it cannot be a superset
- For Ashes, this meant some descriptor set waste: always create max-size descriptor sets and bind with empty descriptors

33 Figure: D3D12 global descriptors compared with Vulkan descriptor sets

34
- Similar to D3D12's root layout
- Conceptually simpler: a set of descriptor sets gets bound at certain slots
- The strict rules of D3D12 are not present, so need to be careful about hitting a glass jaw of the hardware
- We use a single dynamic descriptor set, binding and updating it with a new offset for each draw call
- Have one giant buffer filled with data per frame
- Don't use too many dynamic descriptors
- Figure: a dynamic descriptor pointing into constant memory
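The "one giant buffer per frame plus a new offset per draw" scheme amounts to a linear allocator. The sketch below is an assumption about how such an allocator could look, not Oxide's code: the struct and function names are hypothetical, and the alignment field stands for the device's minUniformBufferOffsetAlignment limit. Each returned offset is the kind of value that would be passed as a dynamic offset to vkCmdBindDescriptorSets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t capacity;  /* size of the per-frame buffer in bytes        */
    uint32_t alignment; /* required offset alignment, a power of two    */
    uint32_t head;      /* first unused byte                            */
} FrameConstants;

/* Called once per frame, after the GPU is done with the old contents. */
static void frame_constants_reset(FrameConstants *fc)
{
    fc->head = 0;
}

/* Reserve `size` bytes for one draw call; returns false when the buffer
 * is exhausted. On success *offset_out is the aligned dynamic offset. */
static bool frame_constants_alloc(FrameConstants *fc, uint32_t size,
                                  uint32_t *offset_out)
{
    assert(fc->alignment != 0 && (fc->alignment & (fc->alignment - 1u)) == 0);
    uint32_t aligned = (fc->head + fc->alignment - 1u) & ~(fc->alignment - 1u);
    if (aligned > fc->capacity || size > fc->capacity - aligned)
        return false;
    *offset_out = aligned;
    fc->head = aligned + size;
    return true;
}
```

Because only the offset changes between draws, the descriptor set itself is written once and never updated during the frame.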

35 Buffer -> Texture
- Need to create a buffer, and then do the copy via vkCmdCopyBufferToImage
- Must create a layout yourself; we use the layout of a DDS file
- There will probably be community code for this pretty soon. The exact layout doesn't matter, it just needs to be consistent
- Same applies for CPU readback of a texture: copy to a mappable buffer and read
- Advantage: no hidden stalls; once the semaphore is clear, can copy memory seamlessly back and forth
- Should be careful about the high water mark, e.g. lots of transient memory needed and discarded for memory upload
- We do a series of vkCmdCopyBufferToImage calls to copy the data in our CPU-mappable buffer into the actual texture
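Since the staging layout is up to the application, one simple consistent choice is to pack the mip levels back to back, largest first, as an uncompressed DDS file does. The sketch below computes the byte offset of each level under that assumption; the function name is hypothetical, block-compressed formats are not handled, and each offset would feed the bufferOffset field of one VkBufferImageCopy region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute tightly packed offsets for `mip_count` levels of an uncompressed
 * 2D texture. offsets[level] receives the start of that level in the
 * staging buffer; the return value is the total staging size in bytes. */
static uint64_t layout_mip_chain(uint32_t width, uint32_t height,
                                 uint32_t bytes_per_texel, uint32_t mip_count,
                                 uint64_t *offsets)
{
    assert(offsets != NULL && width > 0 && height > 0);
    assert(bytes_per_texel > 0 && mip_count <= 32);
    uint64_t cursor = 0;
    for (uint32_t level = 0; level < mip_count; ++level) {
        /* Each level halves the previous one, clamped to at least 1 texel. */
        uint32_t w = width >> level;
        uint32_t h = height >> level;
        if (w == 0) w = 1;
        if (h == 0) h = 1;
        offsets[level] = cursor;
        cursor += (uint64_t)w * h * bytes_per_texel;
    }
    return cursor;
}
```

The same table can be reused for readback, since the copy in the other direction needs the identical layout.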

36 Nitrous States
- Vulkan is more explicit than D3D12
- But allows switching to any mode if contents can be discarded
- We use our own states that are more explicit than D3D12, but less than Vulkan
- Have mapping tables to Vulkan pipeline barriers
- Engine can auto-track previous state to help make it simpler
- States:
    RSTATE_DEFAULT_SRV, RSTATE_DEFAULT_UAV, RSTATE_DATA_READ, RSTATE_DATA_WRITE,
    RSTATE_SHADER_READ_SRV, RSTATE_SHADER_READ_UAV, RSTATE_SHADER_WRITE, RSTATE_SHADER_READWRITE,
    RSTATE_COMPUTE_READ_SRV, RSTATE_COMPUTE_READ_UAV, RSTATE_COMPUTE_WRITE, RSTATE_COMPUTE_READWRITE,
    RSTATE_UNINITIALIZED, RSTATE_RENDER_OPTIMAL, RSTATE_DEPTH_OPTIMAL, RSTATE_SHADER_OPTIMAL,
    RSTATE_COLOR_CLEAR, RSTATE_DEPTH_CLEAR, RSTATE_UAV_CLEAR, RSTATE_INDICES, RSTATE_INDIRECT_ARGS,
    RSTATE_RT_RESOLVE_SOURCE, RSTATE_RT_RESOLVE_DEST, RSTATE_PRESENTABLE, RSTATE_COMMON,
    RSTATE_GENERIC_READ, RSTATE_PREINITIALIZED

37
- More important for mobile than desktop
- Main cause of structural changes in our code base
- Originally our draw sets had a separate texture and zbuffer
- Created the concept of a RenderGroup in the engine, which combines the resources needed for rendering; currently it mostly emulates the D3D12 method
- Don't currently do anything complex with render groups; we don't do much multi-pass rendering, as that is more common for a deferred renderer

38
- Somehow, the viewport in Vulkan is flipped vertically from D3D
- Can be solved with an extension that allows negating the height in the viewport
- depthClampEnable is the opposite of depth cull! Cull is the opposite of clamp
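A minimal sketch of the negative-height trick, assuming the extension in question is VK_KHR_maintenance1, where a flipped viewport moves the origin to the bottom edge and negates the height. The Viewport struct is a local stand-in with the same fields as VkViewport, and flip_viewport_y is a hypothetical helper.

```c
#include <assert.h>

/* Stand-in mirroring VkViewport so the sketch needs no Vulkan headers. */
typedef struct {
    float x, y, width, height, minDepth, maxDepth;
} Viewport;

/* Convert a D3D-style viewport to its vertically flipped equivalent:
 * start at the opposite edge and use a negated height. Applying the
 * function twice returns the original viewport. */
static Viewport flip_viewport_y(Viewport v)
{
    assert(v.width > 0.0f);
    v.y = v.y + v.height;
    v.height = -v.height;
    return v;
}
```

Doing the flip in the viewport keeps shaders and winding order identical between the D3D and Vulkan paths.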

39
- Traditionally the biggest hurdle in cross-API support
- Hurdle is (mostly) gone in Vulkan!
- Background: OpenGL used GLSL, accepting native text strings, which was generally difficult and a poor fit for production
- Vulkan introduces SPIRV, standardizing the front end of the shading language and opening the possibility for different languages

40
- Major effort from Google, Valve, LunarG and others
- Oxide helped a tiny, tiny bit, mostly freeloading off others
- Thanks to LunarG for helping us get 30k lines of HLSL code compiled!
- Still work to do, but good enough to start production work
- Special thanks to

41
- Pipe HLSL syntax into glslangValidator: currently about 99% compatible with native HLSL syntax
- HLSL syntax is not HLSL: it's a recursive descent parser with more context sensitivity, and thus has a few syntax features not included in HLSL but needed for Vulkan support
- Outputs SPIRV code from HLSL, which can be consumed by Vulkan
- Can use exactly the same shaders across platforms!

42
- Only issue is dealing with variable bindings: different declarations
- In D3D12, use a remapping table from old registers to descriptor locations
- Specify these natively in Vulkan
- Use a simple #define trick to define both styles of bindings
    #define layout(a,b)
    #else
    #define register(a) blank
    #endif
    layout(set=9,binding=0) sampler SS_DEFAULT : register(s0);
    layout(set=9,binding=1) sampler SS_DEFAULT_CLAMP : register(s1);
    layout(set=9,binding=2) sampler SS_ANISO : register(s2);
    layout(set=9,binding=3) sampler SS_ANISO_CLAMP : register(s3);
    layout(set=9,binding=4) sampler SS_POINT : register(s4);
    layout(set=9,binding=5) sampler SS_POINT_CLAMP : register(s5);
    layout(set=0,binding=0) cbuffer DecalMapParams : register(b0)
    {
        row_major float4x4 s_worldtolocal : packoffset(c0.x);
        float4 s_destoffsetandsize : packoffset(c4.x);
        float2 s_invdestres : packoffset(c5.x);
        float s_time : packoffset(c5.z);
    };
    //ResourceSet DecalVB
    layout(set=1,binding=0)buffer<float4> V_Position : register(t0);
    layout(set=1,binding=1)buffer<float4> V_TexCoord : register(t1);
    layout(set=1,binding=2)buffer<float4> V_UserData : register(t2);
    //EndTexture Set

43
- SPIRV is ~10-20x bigger than D3D bytecode natively
- Current pipeline: dead code strip (spirv-remap) -> SMOLV -> LZ4 = 2x size of D3D12
- More work happening here
- A SPIRV->SPIRV optimizer may be wanted; the high-level optimizer in HLSL cleans up some things the driver may miss

44 Ashes of the Singularity: Escalation in Vulkan
- Sorry, no perf data, just barely got everything debugged

45
- Vulkan is ready for primetime on desktop
- D3D11 to Vulkan is about the same complexity as D3D11 to D3D12
- For us, the application/engine level makes no distinction between APIs
- Only the base-level graphics layer knows what API is being used
- If moving from D3D11 to next-gen APIs, both could be supported simultaneously without massive extra effort


47 Democratize development!


49 Graphics Jobs!


53 Graphics Command Buffers Fixed render loop with callout points

54 Unity, Infinite Dreams, and ARM Thursday, March 2, 10am-11am West Hall 3022



More information

SLICING THE WORKLOAD MULTI-GPU OPENGL RENDERING APPROACHES

SLICING THE WORKLOAD MULTI-GPU OPENGL RENDERING APPROACHES SLICING THE WORKLOAD MULTI-GPU OPENGL RENDERING APPROACHES INGO ESSER NVIDIA DEVTECH PROVIZ OVERVIEW Motivation Tools of the trade Multi-GPU driver functions Multi-GPU programming functions Multi threaded

More information

Press Briefing SIGGRAPH 2015 Neil Trevett Khronos President NVIDIA Vice President Mobile Ecosystem. Copyright Khronos Group Page 1

Press Briefing SIGGRAPH 2015 Neil Trevett Khronos President NVIDIA Vice President Mobile Ecosystem. Copyright Khronos Group Page 1 Press Briefing SIGGRAPH 2015 Neil Trevett Khronos President NVIDIA Vice President Mobile Ecosystem Copyright Khronos Group 2015 - Page 1 Khronos Connects Software to Silicon Open Consortium creating ROYALTY-FREE,

More information

Khronos Connects Software to Silicon

Khronos Connects Software to Silicon Press Pre-Briefing GDC 2015 Neil Trevett Khronos President NVIDIA Vice President Mobile Ecosystem All Materials Embargoed Until Tuesday 3 rd March, 12:01AM Pacific Time Copyright Khronos Group 2015 - Page

More information

AMD RADEON GFX & DIRECTX 12 GRAPHICS CORE NEXT BETTER PREPARED FOR DIRECTX 12 ROBERT HALLOCK AMD TECHNICAL MARKETING APPROVED FOR ALL AUDIENCES

AMD RADEON GFX & DIRECTX 12 GRAPHICS CORE NEXT BETTER PREPARED FOR DIRECTX 12 ROBERT HALLOCK AMD TECHNICAL MARKETING APPROVED FOR ALL AUDIENCES GRAPHICS CORE NEXT BETTER PREPARED FOR DIRECTX 12 APPROVED FOR ALL AUDIENCES AMD RADEON GFX & DIRECTX 12 ROBERT HALLOCK AMD TECHNICAL MARKETING GAME DEVS & ASYNC SHADERS DAN BAKER, PARTNER, OXIDE GAMES

More information

Modern Processor Architectures (A compiler writer s perspective) L25: Modern Compiler Design

Modern Processor Architectures (A compiler writer s perspective) L25: Modern Compiler Design Modern Processor Architectures (A compiler writer s perspective) L25: Modern Compiler Design The 1960s - 1970s Instructions took multiple cycles Only one instruction in flight at once Optimisation meant

More information

GDC 2014 Barthold Lichtenbelt OpenGL ARB chair

GDC 2014 Barthold Lichtenbelt OpenGL ARB chair GDC 2014 Barthold Lichtenbelt OpenGL ARB chair Agenda OpenGL 4.4, news and updates - Barthold Lichtenbelt, NVIDIA Low Overhead Rendering with OpenGL - Cass Everitt, NVIDIA Copyright Khronos Group, 2010

More information

The Application Stage. The Game Loop, Resource Management and Renderer Design

The Application Stage. The Game Loop, Resource Management and Renderer Design 1 The Application Stage The Game Loop, Resource Management and Renderer Design Application Stage Responsibilities 2 Set up the rendering pipeline Resource Management 3D meshes Textures etc. Prepare data

More information

RSX Best Practices. Mark Cerny, Cerny Games David Simpson, Naughty Dog Jon Olick, Naughty Dog

RSX Best Practices. Mark Cerny, Cerny Games David Simpson, Naughty Dog Jon Olick, Naughty Dog RSX Best Practices Mark Cerny, Cerny Games David Simpson, Naughty Dog Jon Olick, Naughty Dog RSX Best Practices About libgcm Using the SPUs with the RSX Brief overview of GCM Replay December 7 th, 2004

More information

Vulkan Timeline Semaphores

Vulkan Timeline Semaphores Vulkan line Semaphores Jason Ekstrand September 2018 Copyright 2018 The Khronos Group Inc. - Page 1 Current Status of VkSemaphore Current VkSemaphores require a strict signal, wait, signal, wait pattern

More information

Keeping your GPU fed without getting bitten

Keeping your GPU fed without getting bitten Keeping your GPU fed without getting bitten Tobias Hector May 2017 Copyright Khronos Group 2017 - Page 1 Introduction You have delicious draw calls - Yummy! Copyright Khronos Group 2017 - Page 2 Introduction

More information

Parallel Programming on Larrabee. Tim Foley Intel Corp

Parallel Programming on Larrabee. Tim Foley Intel Corp Parallel Programming on Larrabee Tim Foley Intel Corp Motivation This morning we talked about abstractions A mental model for GPU architectures Parallel programming models Particular tools and APIs This

More information

DEVELOPER DAY MONTRÉAL APRIL Copyright Khronos Group Page 1

DEVELOPER DAY MONTRÉAL APRIL Copyright Khronos Group Page 1 DEVELOPER DAY MONTRÉAL APRIL 2018 Copyright Khronos Group 2018 - Page 1 DEVELOPER DAY Introduction and Overview Alon Or-bach, Samsung MONTRÉAL APRIL 2018 Copyright Khronos Group 2018 - Page 2 Copyright

More information

Lecture 25: Board Notes: Threads and GPUs

Lecture 25: Board Notes: Threads and GPUs Lecture 25: Board Notes: Threads and GPUs Announcements: - Reminder: HW 7 due today - Reminder: Submit project idea via (plain text) email by 11/24 Recap: - Slide 4: Lecture 23: Introduction to Parallel

More information

Going to cover; - Why we have SPIR-V - Brief history of SPIR-V - Some of the core required features we wanted - How OpenCL will use SPIR-V - How

Going to cover; - Why we have SPIR-V - Brief history of SPIR-V - Some of the core required features we wanted - How OpenCL will use SPIR-V - How 1 Going to cover; - Why we have SPIR-V - Brief history of SPIR-V - Some of the core required features we wanted - How OpenCL will use SPIR-V - How Vulkan will use SPIR-V - The differences between compute/graphics

More information

Day: Thursday, 03/19 Time: 16:00-16:50 Location: Room 212A Level: Intermediate Type: Talk Tags: Developer - Tools & Libraries; Game Development

Day: Thursday, 03/19 Time: 16:00-16:50 Location: Room 212A Level: Intermediate Type: Talk Tags: Developer - Tools & Libraries; Game Development 1 Day: Thursday, 03/19 Time: 16:00-16:50 Location: Room 212A Level: Intermediate Type: Talk Tags: Developer - Tools & Libraries; Game Development 2 3 Talk about just some of the features of DX12 that are

More information

EXPLICIT SYNCHRONIZATION

EXPLICIT SYNCHRONIZATION EXPLICIT SYNCHRONIZATION Lauri Peltonen XDC, 8 October, 204 WHAT IS EXPLICIT SYNCHRONIZATION? Fence is an abstract primitive that marks completion of an operation Implicit synchronization Fences are attached

More information

CSE 167: Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012

CSE 167: Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012 CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012 Announcements Homework project #2 due this Friday, October

More information

LIQUIDVR TODAY AND TOMORROW GUENNADI RIGUER, SOFTWARE ARCHITECT

LIQUIDVR TODAY AND TOMORROW GUENNADI RIGUER, SOFTWARE ARCHITECT LIQUIDVR TODAY AND TOMORROW GUENNADI RIGUER, SOFTWARE ARCHITECT Bootstrapping the industry for better VR experience Complimentary to HMD SDKs It s all about giving developers the tools they want! AMD LIQUIDVR

More information

Vulkan Subpasses. or The Frame Buffer is Lava. Andrew Garrard Samsung R&D Institute UK. UK Khronos Chapter meet, May 2016

Vulkan Subpasses. or The Frame Buffer is Lava. Andrew Garrard Samsung R&D Institute UK. UK Khronos Chapter meet, May 2016 Vulkan Subpasses or The Frame Buffer is Lava Andrew Garrard Samsung R&D Institute UK Vulkan: Making use of the GPU more efficient Vulkan aims to reduce the overheads of keeping the GPU busy Vulkan subpasses

More information

Vulkan and Animation 3/13/ &height=285&playerId=

Vulkan and Animation 3/13/ &height=285&playerId= https://media.oregonstate.edu/id/0_q2qgt47o?width= 400&height=285&playerId=22119142 Vulkan and Animation Natasha A. Anisimova (Particle systems in Vulkan) Intel Game Dev The Loop Vulkan Cookbook https://software.intel.com/en-us/articles/using-vulkan-graphics-api-to-render-acloud-of-animated-particles-in-stardust-application

More information

OpenGL ES 2.0 : Start Developing Now. Dan Ginsburg Advanced Micro Devices, Inc.

OpenGL ES 2.0 : Start Developing Now. Dan Ginsburg Advanced Micro Devices, Inc. OpenGL ES 2.0 : Start Developing Now Dan Ginsburg Advanced Micro Devices, Inc. Agenda OpenGL ES 2.0 Brief Overview Tools OpenGL ES 2.0 Emulator RenderMonkey w/ OES 2.0 Support OpenGL ES 2.0 3D Engine Case

More information

VULKAN AND NVIDIA: THE ESSENTIALS

VULKAN AND NVIDIA: THE ESSENTIALS Siggraph 2016 VULKAN AND NVIDIA: THE ESSENTIALS Tristan Lorach Manager of Developer Technology Group, NVIDIA US 7/25/2016 2 ANALOGY ON GRAPHIC APIS (getting ready for my 7 years old son s questions on

More information

! Readings! ! Room-level, on-chip! vs.!

! Readings! ! Room-level, on-chip! vs.! 1! 2! Suggested Readings!! Readings!! H&P: Chapter 7 especially 7.1-7.8!! (Over next 2 weeks)!! Introduction to Parallel Computing!! https://computing.llnl.gov/tutorials/parallel_comp/!! POSIX Threads

More information

PowerVR Series5. Architecture Guide for Developers

PowerVR Series5. Architecture Guide for Developers Public Imagination Technologies PowerVR Series5 Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty of any kind.

More information

Profiling and Debugging Games on Mobile Platforms

Profiling and Debugging Games on Mobile Platforms Profiling and Debugging Games on Mobile Platforms Lorenzo Dal Col Senior Software Engineer, Graphics Tools Gamelab 2013, Barcelona 26 th June 2013 Agenda Introduction to Performance Analysis with ARM DS-5

More information

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller Entertainment Graphics: Virtual Realism for the Masses CSE 591: GPU Programming Introduction Computer games need to have: realistic appearance of characters and objects believable and creative shading,

More information

Breaking Down Barriers: An Intro to GPU Synchronization. Matt Pettineo Lead Engine Programmer Ready At Dawn Studios

Breaking Down Barriers: An Intro to GPU Synchronization. Matt Pettineo Lead Engine Programmer Ready At Dawn Studios Breaking Down Barriers: An Intro to GPU Synchronization Matt Pettineo Lead Engine Programmer Ready At Dawn Studios Who am I? Ready At Dawn for 9 years Lead Engine Programmer for 5 I like GPUs and APIs!

More information

Mention driver developers in the room. Because of time this will be fairly high level, feel free to come talk to us afterwards

Mention driver developers in the room. Because of time this will be fairly high level, feel free to come talk to us afterwards 1 Introduce Mark, Michael Poll: Who is a software developer or works for a software company? Who s in management? Who knows what the OpenGL ARB standards body is? Mention driver developers in the room.

More information

Raise your VR game with NVIDIA GeForce Tools

Raise your VR game with NVIDIA GeForce Tools Raise your VR game with NVIDIA GeForce Tools Yan An Graphics Tools QA Manager 1 Introduction & tour of Nsight Analyze a geometry corruption bug VR debugging AGENDA System Analysis Tracing GPU Range Profiling

More information

GPU Quality and Application Portability

GPU Quality and Application Portability GPU Quality and Application Portability Kalle Raita Senior Software Architect, drawelements Copyright Khronos Group, 2010 - Page 1 drawelements Ltd. drawelements Based in Helsinki, Finland Founded in 2008

More information

OpenGL Status - November 2013 G-Truc Creation

OpenGL Status - November 2013 G-Truc Creation OpenGL Status - November 2013 G-Truc Creation Vendor NVIDIA AMD Intel Windows Apple Release date 02/10/2013 08/11/2013 30/08/2013 22/10/2013 Drivers version 331.10 beta 13.11 beta 9.2 10.18.10.3325 MacOS

More information

Enabling the Next Generation of Computational Graphics with NVIDIA Nsight Visual Studio Edition. Jeff Kiel Director, Graphics Developer Tools

Enabling the Next Generation of Computational Graphics with NVIDIA Nsight Visual Studio Edition. Jeff Kiel Director, Graphics Developer Tools Enabling the Next Generation of Computational Graphics with NVIDIA Nsight Visual Studio Edition Jeff Kiel Director, Graphics Developer Tools Computational Graphics Enabled Problem: Complexity of Computation

More information

SPU Render. Arseny Zeux Kapoulkine CREAT Studios

SPU Render. Arseny Zeux Kapoulkine CREAT Studios SPU Render Arseny Zeux Kapoulkine CREAT Studios arseny.kapoulkine@gmail.com http://zeuxcg.org/ Introduction Smash Cars 2 project Static scene of moderate size Many dynamic objects Multiple render passes

More information

VULKAN TECHNOLOGY UPDATE Christoph Kubisch, NVIDIA GTC 2017 Ingo Esser, NVIDIA

VULKAN TECHNOLOGY UPDATE Christoph Kubisch, NVIDIA GTC 2017 Ingo Esser, NVIDIA VULKAN TECHNOLOGY UPDATE Christoph Kubisch, NVIDIA GTC 2017 Ingo Esser, NVIDIA Device Generated Commands AGENDA API Interop VR in Vulkan NSIGHT Support 2 VK_NVX_device_generated_commands 3 DEVICE GENERATED

More information

PowerVR Hardware. Architecture Overview for Developers

PowerVR Hardware. Architecture Overview for Developers Public Imagination Technologies PowerVR Hardware Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty of any kind.

More information

Broken Age's Approach to Scalability. Oliver Franzke Lead Programmer, Double Fine Productions

Broken Age's Approach to Scalability. Oliver Franzke Lead Programmer, Double Fine Productions Broken Age's Approach to Scalability Oliver Franzke Lead Programmer, Double Fine Productions Content Introduction Platform diversity Game assets Characters Environments Shaders Who am I? Lead Programmer

More information

Introducing Metal 2. Graphics and Games #WWDC17. Michal Valient, GPU Software Engineer Richard Schreyer, GPU Software Engineer

Introducing Metal 2. Graphics and Games #WWDC17. Michal Valient, GPU Software Engineer Richard Schreyer, GPU Software Engineer Session Graphics and Games #WWDC17 Introducing Metal 2 601 Michal Valient, GPU Software Engineer Richard Schreyer, GPU Software Engineer 2017 Apple Inc. All rights reserved. Redistribution or public display

More information

Copyright Khronos Group 2012 Page 1. Teaching GL. Dave Shreiner Director, Graphics and GPU Computing, ARM 1 December 2012

Copyright Khronos Group 2012 Page 1. Teaching GL. Dave Shreiner Director, Graphics and GPU Computing, ARM 1 December 2012 Copyright Khronos Group 2012 Page 1 Teaching GL Dave Shreiner Director, Graphics and GPU Computing, ARM 1 December 2012 Copyright Khronos Group 2012 Page 2 Agenda Overview of OpenGL family of APIs Comparison

More information

PowerVR Performance Recommendations. The Golden Rules

PowerVR Performance Recommendations. The Golden Rules PowerVR Performance Recommendations Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty of any kind. Redistribution

More information

Bringing it all together: The challenge in delivering a complete graphics system architecture. Chris Porthouse

Bringing it all together: The challenge in delivering a complete graphics system architecture. Chris Porthouse Bringing it all together: The challenge in delivering a complete graphics system architecture Chris Porthouse System Integration & the role of standards Content Ecosystem Java Execution Environment Native

More information

Designing a Modern GPU Interface

Designing a Modern GPU Interface Designing a Modern GPU Interface Brooke Hodgman ( @BrookeHodgman) http://tiny.cc/gpuinterface How to make a wrapper for D3D9/11/12, GL2/3/4, GL ES2/3, Metal, Mantle, Vulkan, GNM & GCM without going (completely)

More information

NVIDIA Parallel Nsight. Jeff Kiel

NVIDIA Parallel Nsight. Jeff Kiel NVIDIA Parallel Nsight Jeff Kiel Agenda: NVIDIA Parallel Nsight Programmable GPU Development Presenting Parallel Nsight Demo Questions/Feedback Programmable GPU Development More programmability = more

More information

OpenCL Overview. Shanghai March Neil Trevett Vice President Mobile Content, NVIDIA President, The Khronos Group

OpenCL Overview. Shanghai March Neil Trevett Vice President Mobile Content, NVIDIA President, The Khronos Group Copyright Khronos Group, 2012 - Page 1 OpenCL Overview Shanghai March 2012 Neil Trevett Vice President Mobile Content, NVIDIA President, The Khronos Group Copyright Khronos Group, 2012 - Page 2 Processor

More information

Overview. Technology Details. D/AVE NX Preliminary Product Brief

Overview. Technology Details. D/AVE NX Preliminary Product Brief Overview D/AVE NX is the latest and most powerful addition to the D/AVE family of rendering cores. It is the first IP to bring full OpenGL ES 2.0/3.1 rendering to the FPGA and SoC world. Targeted for graphics

More information

Modern Processor Architectures. L25: Modern Compiler Design

Modern Processor Architectures. L25: Modern Compiler Design Modern Processor Architectures L25: Modern Compiler Design The 1960s - 1970s Instructions took multiple cycles Only one instruction in flight at once Optimisation meant minimising the number of instructions

More information

Mobile Performance Tools and GPU Performance Tuning. Lars M. Bishop, NVIDIA Handheld DevTech Jason Allen, NVIDIA Handheld DevTools

Mobile Performance Tools and GPU Performance Tuning. Lars M. Bishop, NVIDIA Handheld DevTech Jason Allen, NVIDIA Handheld DevTools Mobile Performance Tools and GPU Performance Tuning Lars M. Bishop, NVIDIA Handheld DevTech Jason Allen, NVIDIA Handheld DevTools NVIDIA GoForce5500 Overview World-class 3D HW Geometry pipeline 16/32bpp

More information

Ecosystem Overview Neil Trevett Khronos President NVIDIA Vice President Developer

Ecosystem Overview Neil Trevett Khronos President NVIDIA Vice President Developer Ecosystem Overview Neil Trevett Khronos President NVIDIA Vice President Developer Ecosystem ntrevett@nvidia.com @neilt3d Copyright Khronos Group 2016 - Page 1 Khronos Mission Software Silicon Khronos is

More information

Copyright Khronos Group, Page Graphic Remedy. All Rights Reserved

Copyright Khronos Group, Page Graphic Remedy. All Rights Reserved Avi Shapira Graphic Remedy Copyright Khronos Group, 2009 - Page 1 2004 2009 Graphic Remedy. All Rights Reserved Debugging and profiling 3D applications are both hard and time consuming tasks Companies

More information

Sync Points in the Intel Gfx Driver. Jesse Barnes Intel Open Source Technology Center

Sync Points in the Intel Gfx Driver. Jesse Barnes Intel Open Source Technology Center Sync Points in the Intel Gfx Driver Jesse Barnes Intel Open Source Technology Center 1 Agenda History and other implementations Other I/O layers - block device ordering NV_fence, ARB_sync EGL_native_fence_sync,

More information

Game Programming Lab 25th April 2016 Team 7: Luca Ardüser, Benjamin Bürgisser, Rastislav Starkov

Game Programming Lab 25th April 2016 Team 7: Luca Ardüser, Benjamin Bürgisser, Rastislav Starkov Game Programming Lab 25th April 2016 Team 7: Luca Ardüser, Benjamin Bürgisser, Rastislav Starkov Interim Report 1. Development Stage Currently, Team 7 has fully implemented functional minimum and nearly

More information

The Witness on Android Post Mortem. Denis Barkar 3 March, 2017

The Witness on Android Post Mortem. Denis Barkar 3 March, 2017 The Witness on Android Post Mortem Denis Barkar 3 March, 2017 Starting Point The Witness is in active development by Thekla Designed for PC and PS4/Xbox One Custom game engine Small codebase: about 1500

More information

OpenGL BOF Siggraph 2011

OpenGL BOF Siggraph 2011 OpenGL BOF Siggraph 2011 OpenGL BOF Agenda OpenGL 4 update Barthold Lichtenbelt, NVIDIA OpenGL Shading Language Hints/Kinks Bill Licea-Kane, AMD Ecosystem update Jon Leech, Khronos Viewperf 12, a new beginning

More information

Lecture 2. Shaders, GLSL and GPGPU

Lecture 2. Shaders, GLSL and GPGPU Lecture 2 Shaders, GLSL and GPGPU Is it interesting to do GPU computing with graphics APIs today? Lecture overview Why care about shaders for computing? Shaders for graphics GLSL Computing with shaders

More information

Swapchains Unchained!

Swapchains Unchained! Swapchains Unchained! (What you need to know about Vulkan WSI) Alon Or-bach, Chair, Vulkan System Integration Sub-Group May 2016 @alonorbach (disclaimers apply!) Copyright Khronos Group 2016 - Page 193

More information

Bifrost - The GPU architecture for next five billion

Bifrost - The GPU architecture for next five billion Bifrost - The GPU architecture for next five billion Hessed Choi Senior FAE / ARM ARM Tech Forum June 28 th, 2016 Vulkan 2 ARM 2016 What is Vulkan? A 3D graphics API for the next twenty years Logical successor

More information

Mobile HW and Bandwidth

Mobile HW and Bandwidth Your logo on white Mobile HW and Bandwidth Andrew Gruber Qualcomm Technologies, Inc. Agenda and Goals Describe the Power and Bandwidth challenges facing Mobile Graphics Describe some of the Power Saving

More information

Shaders. Slide credit to Prof. Zwicker

Shaders. Slide credit to Prof. Zwicker Shaders Slide credit to Prof. Zwicker 2 Today Shader programming 3 Complete model Blinn model with several light sources i diffuse specular ambient How is this implemented on the graphics processor (GPU)?

More information

Ultimate Graphics Performance for DirectX 10 Hardware

Ultimate Graphics Performance for DirectX 10 Hardware Ultimate Graphics Performance for DirectX 10 Hardware Nicolas Thibieroz European Developer Relations AMD Graphics Products Group nicolas.thibieroz@amd.com V1.01 Generic API Usage DX10 designed for performance

More information

Hi everyone! My name is Niklas and I ve been working at EA for the last 2.5 years. My main responsibility there have been to bring up the graphics

Hi everyone! My name is Niklas and I ve been working at EA for the last 2.5 years. My main responsibility there have been to bring up the graphics Hi everyone! My name is Niklas and I ve been working at EA for the last 2.5 years. My main responsibility there have been to bring up the graphics side of the Frostbite engine on mobile devices. This talk

More information