Achieving High-performance Graphics on Mobile With the Vulkan API

Marius Bjørge, Graphics Research Engineer
GDC 2016

Agenda
- Overview
- Command Buffers
- Synchronization
- Memory
- Shaders and Pipelines
- Descriptor sets
- Render passes
- Misc

Overview: OpenGL
- OpenGL is mainly single-threaded
  - Draw calls are normally only submitted on the main thread
  - Multiple threads with shared GL contexts are mainly used for texture streaming
- OpenGL has a lot of implicit behaviour
  - Dependency tracking of resources
  - Compiling shader combinations based on render state
  - Splitting up workloads
- All this adds API overhead!
- OpenGL has quite a small footprint in terms of lines of code

Overview: Vulkan
- Vulkan is designed from the ground up to allow efficient multi-threading behaviour
- Vulkan is explicit in nature
  - Applications must track resource dependencies to avoid deleting anything that might still be used by the GPU or CPU
- Little API overhead
- Vulkan is very verbose in terms of lines of code
  - Getting a simple Hello Triangle running requires ~1000 lines of code

Overview
- To get the most out of Vulkan you probably have to think about re-designing your graphics engine
- Migrating from OpenGL to Vulkan is not trivial
- Some things to keep in mind:
  - What performance level are you targeting?
  - Do you really need Vulkan?
  - How important is OpenGL support?
  - Portability?

Command Buffers
- Used to record commands which are later submitted to a device for execution
  - This includes draw/dispatch, texture uploads, etc.
- Primary and secondary command buffers
- Command buffers work independently from each other
  - Each contains all state
  - No inheritance of state between command buffers

Command Buffers
[Diagram: call sequence for a primary command buffer]
- vkBeginCommandBuffer
  - vkCmdBeginRenderPass
    - vkCmdExecuteCommands
      - Secondary commands
      - Secondary commands
      - Secondary commands
      - Secondary commands
  - vkCmdEndRenderPass
- vkEndCommandBuffer
- vkQueueSubmit

Command Buffers
- In order to have a common higher-level command buffer abstraction we also had to support the same interface in OpenGL
  - Record commands to a linear allocator and play them back later
- Uniform data is pushed to a separate linear allocator per command buffer
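
The record-and-playback idea on this slide can be sketched as follows. This is a minimal illustration, not the engine's code: the command set (`BindPipelineCmd`, `DrawCmd`), the `LinearCommandBuffer` class and the `LoggingBackend` are all hypothetical names, and a growable byte vector stands in for the linear allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical command set, for illustration only.
enum class CmdType : std::uint32_t { BindPipeline, Draw };

struct CmdHeader { CmdType type; std::uint32_t size; };
struct BindPipelineCmd { std::uint32_t pipelineId; };
struct DrawCmd { std::uint32_t vertexCount; std::uint32_t firstVertex; };

// Append-only byte stream: recording is a copy, reset is a clear.
class LinearCommandBuffer {
public:
    template <typename T>
    void record(CmdType type, const T& payload) {
        CmdHeader header{type, static_cast<std::uint32_t>(sizeof(T))};
        append(&header, sizeof(header));
        append(&payload, sizeof(T));
    }

    // Walk the stream in recording order and forward each command to the backend.
    template <typename Backend>
    void playback(Backend& backend) const {
        std::size_t offset = 0;
        while (offset < bytes_.size()) {
            assert(offset + sizeof(CmdHeader) <= bytes_.size());
            CmdHeader header;
            std::memcpy(&header, bytes_.data() + offset, sizeof(header));
            offset += sizeof(header);
            assert(offset + header.size <= bytes_.size());
            const std::uint8_t* payload = bytes_.data() + offset;
            switch (header.type) {
            case CmdType::BindPipeline: {
                BindPipelineCmd cmd;
                std::memcpy(&cmd, payload, sizeof(cmd));
                backend.bindPipeline(cmd.pipelineId);
                break;
            }
            case CmdType::Draw: {
                DrawCmd cmd;
                std::memcpy(&cmd, payload, sizeof(cmd));
                backend.draw(cmd.vertexCount, cmd.firstVertex);
                break;
            }
            }
            offset += header.size;
        }
    }

    void reset() { bytes_.clear(); }
    std::size_t sizeInBytes() const { return bytes_.size(); }

private:
    void append(const void* data, std::size_t size) {
        const auto* p = static_cast<const std::uint8_t*>(data);
        bytes_.insert(bytes_.end(), p, p + size);
    }
    std::vector<std::uint8_t> bytes_;
};

// Stand-in backend that only logs arguments; a real one would issue GL calls here.
struct LoggingBackend {
    std::vector<std::uint32_t> log;
    void bindPipeline(std::uint32_t id) { log.push_back(id); }
    void draw(std::uint32_t count, std::uint32_t first) {
        log.push_back(count);
        log.push_back(first);
    }
};
```

Recording `BindPipelineCmd{7}` followed by `DrawCmd{3, 0}` and then calling `playback` replays the two calls in order. Because the stream is flat, it can be recorded on any thread and replayed on the thread that owns the GL context.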

Synchronization
- Submitted work is completed out of order by the GPU
- Dependencies must be tracked by the application
  - Using output from a previous render pass
  - Using output from a compute shader
  - Etc.
- Synchronization primitives in Vulkan
  - Pipeline barriers and events
  - Fences
  - Semaphores

Allocating Memory
- Memory is first allocated and then bound to Vulkan objects
  - Different Vulkan objects may have different memory requirements
  - Allows for aliasing memory across different Vulkan objects
- Driver does no ref counting of any objects in Vulkan
  - Cannot free memory until you are sure it is never going to be used again
- Most of the memory allocated during run-time is transient
  - Allocate, write and use in the same frame
  - Block based memory allocator

Block Based Memory Allocator
- Relaxes memory reference counting
- Only entire blocks are freed/recycled
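
A minimal sketch of such an allocator, under stated assumptions: blocks are modelled as offsets only (in a real engine each block would be backed by one device memory allocation), and the frame counter passed to `recycle` is assumed to come from a fence the application waits on. `BlockAllocator`, `Allocation` and the frame-tagging scheme are hypothetical, not taken from the talk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A sub-allocation: which block it lives in and the byte offset inside that block.
struct Allocation { std::size_t block; std::size_t offset; };

class BlockAllocator {
public:
    explicit BlockAllocator(std::size_t blockSize) : blockSize_(blockSize) {}

    // Bump-allocate from the current block and open another block when it is full.
    // `alignment` must be a power of two; `frame` tags the block with its last user.
    Allocation allocate(std::size_t size, std::size_t alignment, std::uint64_t frame) {
        assert(size <= blockSize_);
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        if (current_ != kNone) {
            Block& b = blocks_[current_];
            std::size_t aligned = (b.used + alignment - 1) & ~(alignment - 1);
            if (aligned + size <= blockSize_) {
                b.used = aligned + size;
                b.lastUsedFrame = frame;
                return {current_, aligned};
            }
        }
        current_ = acquireBlock();
        blocks_[current_].used = size;
        blocks_[current_].lastUsedFrame = frame;
        return {current_, 0};
    }

    // Call once the GPU has finished `completedFrame`. Whole blocks are recycled;
    // individual allocations are never freed or reference counted.
    void recycle(std::uint64_t completedFrame) {
        for (std::size_t i = 0; i < blocks_.size(); ++i) {
            Block& b = blocks_[i];
            if (!b.free && b.lastUsedFrame <= completedFrame) {
                b.free = true;
                b.used = 0;
                freeList_.push_back(i);
                if (i == current_) current_ = kNone;
            }
        }
    }

    std::size_t blockCount() const { return blocks_.size(); }

private:
    struct Block {
        std::size_t used = 0;
        std::uint64_t lastUsedFrame = 0;
        bool free = false;
    };
    static constexpr std::size_t kNone = static_cast<std::size_t>(-1);

    // Reuse a recycled block if one exists, otherwise grow the pool.
    std::size_t acquireBlock() {
        if (!freeList_.empty()) {
            std::size_t i = freeList_.back();
            freeList_.pop_back();
            blocks_[i].free = false;
            return i;
        }
        blocks_.push_back(Block{});
        return blocks_.size() - 1;
    }

    std::size_t blockSize_;
    std::size_t current_ = kNone;
    std::vector<Block> blocks_;
    std::vector<std::size_t> freeList_;
};
```

The only lifetime question the application has to answer is "has the GPU finished the last frame that touched this block?", which is much cheaper than tracking every transient allocation.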

Image Layout Transitions
- Must match how the image is used at any time
- Pedantic or relaxed
  - Some implementations might require careful tracking of previous and new layout to achieve optimal performance
  - For Mali we can be quite relaxed with this: most of the time we can keep the image layout as VK_IMAGE_LAYOUT_GENERAL

Pipelines
- Vulkan bundles state into big monolithic pipeline state objects
- Driver has full knowledge during shader compilation

    vkCreateGraphicsPipelines(...);
    vkCmdBeginRenderPass(...);
    vkCmdBindPipeline(pipeline);
    vkCmdDraw(...);
    vkCmdEndRenderPass(...);

[Diagram: contents of the Pipeline State object]
- Dynamic State
- Blending State
- Raster State
- Pipeline Layout
- Shaders
- Depth Stencil
- Input Assembly
- Framebuffer Formats
- Vertex Input

Pipelines
- In an ideal world
  - All pipeline combinations should be created upfront
- ...but this requires detailed knowledge of every potential shader/state combination that you might have in your scene
  - As an example, one of our fragment shaders has ~9 000 combinations
  - Every one of these shaders can use different render state
  - We also have to make sure the pipelines are bound to compatible render passes
  - An explosion of combinations!
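
The slide does not say how the engine copes with this explosion. One common approach, shown here purely as an assumption, is to create pipelines on first use and look them up by a key made of shader variant, render state and render pass. `PipelineKey`, `PipelineLibrary` and the integer `PipelineHandle` are hypothetical stand-ins; the creation callback is where a real engine would call `vkCreateGraphicsPipelines`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical key: everything that selects a distinct pipeline.
struct PipelineKey {
    std::uint32_t shaderVariant;
    std::uint32_t renderState;
    std::uint32_t renderPass;
    bool operator==(const PipelineKey& o) const {
        return shaderVariant == o.shaderVariant &&
               renderState == o.renderState &&
               renderPass == o.renderPass;
    }
};

// Combine the three fields into one hash value.
struct PipelineKeyHash {
    std::size_t operator()(const PipelineKey& k) const {
        std::size_t h = std::hash<std::uint32_t>{}(k.shaderVariant);
        h ^= std::hash<std::uint32_t>{}(k.renderState) + 0x9e3779b9u + (h << 6) + (h >> 2);
        h ^= std::hash<std::uint32_t>{}(k.renderPass) + 0x9e3779b9u + (h << 6) + (h >> 2);
        return h;
    }
};

using PipelineHandle = std::uint64_t; // stand-in for a real pipeline object

class PipelineLibrary {
public:
    using CreateFn = std::function<PipelineHandle(const PipelineKey&)>;

    explicit PipelineLibrary(CreateFn create) : create_(std::move(create)) {}

    // Return the pipeline for this key, creating it only the first time it is requested.
    PipelineHandle get(const PipelineKey& key) {
        auto it = pipelines_.find(key);
        if (it != pipelines_.end()) return it->second;
        assert(create_);
        PipelineHandle handle = create_(key);
        pipelines_.emplace(key, handle);
        return handle;
    }

    std::size_t size() const { return pipelines_.size(); }

private:
    CreateFn create_;
    std::unordered_map<PipelineKey, PipelineHandle, PipelineKeyHash> pipelines_;
};
```

Only the combinations a scene actually uses are ever built. The cost is a possible hitch the first time a new combination appears, which is the trade-off against creating everything upfront.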

Pipeline Cache
- Result of the pipeline construction can be re-used between pipelines
- Can be stored out to disk and re-used next time you run the application
[Diagram: several pipeline state objects feed one pipeline cache, which is stored on disk]

Shaders
- Vulkan standardized on SPIR-V
  - No more pain with GLSL compilers behaving differently between vendors?
- Khronos reference compiler
  - GL_KHR_vulkan_glsl
  - Library that can be integrated into your graphics engine
  - Can output SPIR-V from GLSL
- We decided early to internally standardize the engine on SPIR-V
  - Use SPIR-V cross compiler to output GLSL

SPIR-V
- Why SPIR-V?
  - The SPIR-V ecosystem is currently very small, but we anticipate that this will change over the coming years as we are already seeing optimization tools in progress on GitHub
- SPIR-V cross compiler
  - We wrote this library in order to parse and cross compile SPIR-V binary source
  - Is available as open source on <INSERT LOCATION> (or hoping to open-source this at some point)

Shaders
[Diagram: shader toolchain]
- Offline: GLSL, glslangValidator
- Runtime: SPIR-V library, SPIR-V cross compiler
- Targets: Vulkan, OpenGL ES 2.0, OpenGL ES 3.2, OpenGL 4.5

SPIR-V
- Using SPIR-V directly we can retrieve information about bindings as well as inputs and outputs
  - This is useful information when creating or re-using existing pipeline layouts and descriptor set layouts
- Also allows us to easily re-use compatible pipeline layouts across a bunch of different shader combinations
  - Which also means fewer descriptor set layouts to maintain

Descriptor Sets
- Textures, uniform buffers, etc. are bound to shaders in descriptor sets
  - Hierarchical invalidation
  - Order descriptor sets by update frequency
- Ideally all descriptors are pre-baked during level load
  - Keep track of low level descriptor sets per material
  - ...but this is not trivial
- Our solution:
  - Keep track of bindings and update descriptor sets when necessary
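
"Keep track of bindings and update descriptor sets when necessary" could look roughly like the sketch below. It is an assumption about the shape of such tracking, not the engine's implementation: `BindingTracker`, the fixed limits and the use of plain 64-bit resource IDs (0 meaning "nothing bound") are all hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical limits, chosen only for the example.
constexpr std::size_t kMaxSets = 4;
constexpr std::size_t kMaxBindings = 8;

class BindingTracker {
public:
    // Record a binding; the set is flagged dirty only if the resource actually changed.
    void bind(std::size_t set, std::size_t binding, std::uint64_t resource) {
        assert(set < kMaxSets && binding < kMaxBindings);
        if (bound_[set][binding] != resource) {
            bound_[set][binding] = resource;
            dirty_ |= 1u << set;
        }
    }

    // Call before a draw: returns a bit mask of the sets that need a fresh
    // descriptor set, then clears the mask.
    std::uint32_t flush() {
        std::uint32_t mask = dirty_;
        dirty_ = 0;
        return mask;
    }

private:
    std::array<std::array<std::uint64_t, kMaxBindings>, kMaxSets> bound_{};
    std::uint32_t dirty_ = 0;
};
```

With sets ordered by update frequency, per-frame data in a low-numbered set rarely shows up in the mask, while a per-material set is rewritten only when a draw really switches textures or buffers.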

Descriptor Sets

    layout (set=0, binding=0) uniform ubo0 {
        // data
    };
    layout (set=0, binding=1) uniform sampler2D TexA;
    layout (set=1, binding=0) uniform sampler2D TexB;
    layout (set=1, binding=2) uniform sampler2D TexC;

Descriptor Set Emulation
- We also need to support this in OpenGL
- Our solution:
  - Added support for emulating descriptor sets in our OpenGL backend
  - Use SPIR-V cross compiler library to collapse and serialize bindings

Descriptor Set Emulation
[Diagram: shader descriptor sets collapsed to GL bindings]
Shader:
- Set 0: 0 GlobalVSData, 1 GlobalFSData
- Set 1: 0 MeshData
- Set 2: 0 MaterialData, 1 TexAlbedo, 2 TexNormal, 3 TexEnvmap
SPIR-V library to GLSL:
- Uniform block bindings: 0 GlobalVSData, 1 GlobalFSData, 2 MeshData
- Texture bindings: 0 TexAlbedo, 1 TexNormal, 2 TexEnvmap
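
The collapsing step in this diagram can be approximated with a small sketch: walk the resources in (set, binding) order and hand out consecutive binding points per resource kind. `Resource`, `FlatBindings` and `flattenBindings` are hypothetical names, and the actual cross compiler may assign slots differently.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

enum class ResourceKind { UniformBlock, Texture };

// One shader resource as reflected from SPIR-V (hypothetical layout).
struct Resource {
    std::string name;
    std::uint32_t set;
    std::uint32_t binding;
    ResourceKind kind;
};

// Flat GL-style binding points, one counter per resource kind.
struct FlatBindings {
    std::map<std::string, std::uint32_t> uniformBlocks;
    std::map<std::string, std::uint32_t> textures;
};

// Visit resources in (set, binding) order and assign consecutive slots per kind.
inline FlatBindings flattenBindings(const std::vector<Resource>& resources) {
    std::map<std::pair<std::uint32_t, std::uint32_t>, const Resource*> ordered;
    for (const Resource& r : resources) {
        bool inserted = ordered.emplace(std::make_pair(r.set, r.binding), &r).second;
        assert(inserted && "duplicate (set, binding) pair");
        (void)inserted;
    }
    FlatBindings flat;
    std::uint32_t nextBlock = 0;
    std::uint32_t nextTexture = 0;
    for (const auto& entry : ordered) {
        const Resource& r = *entry.second;
        if (r.kind == ResourceKind::UniformBlock) {
            flat.uniformBlocks[r.name] = nextBlock++;
        } else {
            flat.textures[r.name] = nextTexture++;
        }
    }
    return flat;
}
```

Fed the three sets from the diagram, this yields uniform block bindings 0 to 2 for GlobalVSData, GlobalFSData and MeshData, and texture bindings 0 to 2 for TexAlbedo, TexNormal and TexEnvmap. It also gives MaterialData uniform block binding 3, an entry the transcribed diagram does not list.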

Push Constants
- Push constants replace non-opaque uniforms
  - Think of them as small, fast-access uniform buffer memory
- Update in Vulkan with vkCmdPushConstants
- Directly mapped to registers on Mali GPUs

    // New
    layout(push_constant, std430) uniform PushConstants {
        mat4 MVP;
        vec4 MaterialData;
    } RegisterMapped;

    // Old, no longer supported in Vulkan GLSL
    uniform mat4 MVP;
    uniform vec4 MaterialData;

Push Constant Emulation
- Again, we need to support OpenGL as well
- Our solution:
  - Use SPIR-V cross compiler to turn push constants into regular non-opaque uniforms
  - Logic in our OpenGL/Vulkan backends redirects the push constant data appropriately

Render Passes
Knowing when to keep and when to discard
- Render passes in Vulkan are very explicit
  - Declare when a render pass begins
    - Load, discard or clear the framebuffer?
  - Declare when a render pass ends
    - Which parts do you need to be committed to memory?

Subpass Inputs
- Vulkan supports subpasses within render passes
- Standardized GL_EXT_shader_pixel_local_storage!

    // GLSL
    #extension GL_EXT_shader_pixel_local_storage : require
    __pixel_local_inEXT GBuffer {
        layout(rgba8) vec4 albedo;
        layout(rgba8) vec4 normal;
        ...
    } pls;

    // Vulkan
    layout(input_attachment_index = 0) uniform subpassInput albedo;
    layout(input_attachment_index = 1) uniform subpassInput normal;
    ...

Subpass Input Emulation
- Supporting subpasses in GL is not trivial, and probably not feasible on a lot of implementations
- Our solution:
  - Use the SPIR-V cross compiler library to rewrite subpass inputs to Pixel Local Storage variables
  - This will only support a subset of the Vulkan subpass features, but is good enough for our current use

Misc
- Yet another coordinate system
  - Similar to D3D except Y direction in clip-space is inverted
  - Simple solution: Invert gl_Position.y in your vertex shaders or use swapchain transform if the driver supports it
- Mipmap generation
  - No equivalent of glGenerateMipmap() in Vulkan
  - Roll your own using vkCmdBlitImage()
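
Both points can be illustrated on the CPU side. The sketch below computes the extent of every mip level, which gives the source and destination sizes for a chain of blits from level i-1 to level i, and shows the clip-space Y flip as a plain function. `Extent`, `mipChain`, `Vec4` and `flipClipSpaceY` are hypothetical helpers; the actual blits and image barriers are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct Extent { std::uint32_t width; std::uint32_t height; };

// Extents of every mip level, level 0 first. Each level halves the previous one,
// rounding down and never going below 1 texel per axis.
inline std::vector<Extent> mipChain(Extent base) {
    assert(base.width > 0 && base.height > 0);
    std::vector<Extent> levels{base};
    while (levels.back().width > 1 || levels.back().height > 1) {
        Extent prev = levels.back();
        Extent next{std::max<std::uint32_t>(prev.width / 2, 1),
                    std::max<std::uint32_t>(prev.height / 2, 1)};
        levels.push_back(next);
    }
    return levels;
}

struct Vec4 { float x, y, z, w; };

// CPU-side equivalent of writing gl_Position.y = -gl_Position.y in a vertex shader.
inline Vec4 flipClipSpaceY(Vec4 p) {
    p.y = -p.y;
    return p;
}
```

For a 256x64 image the chain has nine levels and ends at 1x1. Each blit reads level i-1 and writes level i, so the image needs a barrier between consecutive blits.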

Thank you!
The trademarks featured in this presentation are registered and/or unregistered trademarks of ARM Limited (or its subsidiaries) in the EU and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners.
Copyright 2016 ARM Limited

To Find Out More
- ARM Booth #1624 on Expo Floor:
  - Live demos of the techniques shown in this session
  - In-depth Q&A with ARM engineers
  - More tech talks at the ARM Lecture Theatre
- http://malideveloper.arm.com/gdc2016:
  - Revisit this talk in PDF and video format post GDC
  - Download the tools and resources

More Talks From ARM at GDC 2016
Available post-show at the Mali Developer Center: malideveloper.arm.com/
- Vulkan on Mobile with Unreal Engine 4 Case Study: Weds. 9:30am, West Hall 3022
- Making Light Work of Dynamic Large Worlds: Weds. 2pm, West Hall 2000
- Achieving High Quality Mobile VR Games: Thurs. 10am, West Hall 3022
- Optimize Your Mobile Games With Practical Case Studies: Thurs. 11:30am, West Hall 2404
- An End-to-End Approach to Physically Based Rendering: Fri. 10am, West Hall 2020