PERFORMANCE OPTIMIZATIONS FOR AUTOMOTIVE SOFTWARE
- Janel Turner
Transcription
1 PERFORMANCE OPTIMIZATIONS FOR AUTOMOTIVE SOFTWARE
April 4-7, 2016, Silicon Valley
Pradeep Chandrahasshenoy, Automotive Solutions Architect, NVIDIA
Stefan Schoenefeld, ProViz DevTech, NVIDIA
4th April 2016
2 SESSION OVERVIEW
- Overview of the methodologies for optimizing automotive HMI applications
- Introduction to the Tegra profiler tools
- Case study: Qt5 OSS samples
3 WHY OPTIMIZE? Performance is User Experience
SOFTWARE DEFINED CAR: [Chart: lines of source code (in millions) for a luxury car, an IVI system, Linux kernel 4.x, the Boeing 787, and the NASA Mars rover]
IDEAL VS REALITY: [Chart: used processing vs. available headroom, 0% to 100%, for ideal car computers and today's car computer]
Sources: Linux kernel Wikipedia page (*); Monitoring the Execution of Space Craft Flight Software, NASA (#); IEEE: Automotive Designline (**)
4 WHY OPTIMIZE? COMPLEXITY
- MULTI-TASKING & MULTI-RENDERING CONTEXTS
- PIXEL EXPLOSION & MULTI-DISPLAY SYNCHRONIZATION
5 HOW TO OPTIMIZE? METHODOLOGY
IDENTIFYING BOTTLENECKS: CPU, GPU, memory bandwidth, HW accelerators, other, system level
WHAT'S NEEDED:
- Your application & libraries: instrumentation. How much time is spent in every module?
- Third-party libraries and drivers: tools. What do they do?
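The instrumentation question on this slide (how much time is spent in every module?) can be approximated with a scoped timer before reaching for a full profiler. The following is a minimal sketch, not the presenters' code: `ModuleTimes` and `ScopedTimer` are hypothetical names introduced here, and the module labels are whatever strings the application chooses.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: accumulates wall-clock time and call counts per module name.
class ModuleTimes {
public:
    void add(const std::string &module, std::chrono::nanoseconds elapsed) {
        assert(elapsed.count() >= 0);  // steady_clock never runs backwards
        totals_[module] += elapsed;
        ++calls_[module];
    }
    std::chrono::nanoseconds total(const std::string &module) const {
        auto it = totals_.find(module);
        return it == totals_.end() ? std::chrono::nanoseconds{0} : it->second;
    }
    int calls(const std::string &module) const {
        auto it = calls_.find(module);
        return it == calls_.end() ? 0 : it->second;
    }

private:
    std::map<std::string, std::chrono::nanoseconds> totals_;
    std::map<std::string, int> calls_;
};

// Hypothetical RAII guard: measures the lifetime of the enclosing scope
// and reports it to a ModuleTimes sink when the scope exits.
class ScopedTimer {
public:
    ScopedTimer(ModuleTimes &sink, std::string module)
        : sink_(sink), module_(std::move(module)),
          start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        const auto elapsed = std::chrono::steady_clock::now() - start_;
        sink_.add(module_, std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed));
    }
    ScopedTimer(const ScopedTimer &) = delete;
    ScopedTimer &operator=(const ScopedTimer &) = delete;

private:
    ModuleTimes &sink_;
    std::string module_;
    std::chrono::steady_clock::time_point start_;
};
```

Placing `ScopedTimer t(times, "render");` at the top of a function attributes that call's duration to the "render" module. The sketch is single-threaded; a multi-threaded application would need a lock or per-thread sinks.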
6 HOW TO OPTIMIZE? KEY TEGRA TOOLS OVERVIEW
- TEGRA SYSTEM PROFILER
- TEGRA GRAPHICS DEBUGGER
7 TEGRA SYSTEM PROFILER (TSP): Multi-core CPU profiler for Tegra
- Easily prepare a device and deploy application for profiling
- Quickly identify CPU hot spots, hot paths and L1/L2 cache issues
- Visualize multi-core CPU activities with a new timeline view
- Maximize multi-core CPU utilization
- Visualize CPU, GPU and EMC frequencies
- Visualize thread state
8 TEGRA GRAPHICS DEBUGGER (TGD): A console-grade tool to debug & profile OpenGL ES
TGD enables graphics development, debugging & optimization on Tegra devices for OpenGL ES 2.0, 3.0 & 3.1 applications.
- Identifying performance bottlenecks and GPU utilization
- Interactive examination of GPU pipeline state
- Real-time examination of draw calls
9 PROFILING SETUP OVERVIEW: DRIVE CX WITH LINUX
[Diagram: HOST PC, SSH, DRIVE CX, Display Output, DISPLAY]
10 QT5: CASE STUDY With QT5 Samples BIG SCENE (qt3d) PLANETS (qt3d, qml, quick) Lots of small geometry Many draw calls Scene graph usage GPU intensive Optimizing GL call stack Tools showcase: TSP, NVTX Tools showcase: TGD 10
11 QT3D RENDER.CPP
Attribute *Renderer::updateBuffersAndAttributes(Geometry *geometry, RenderCommand *command, GLsizei &count, bool forceUpdate)
{
    Attribute *indexAttribute = Q_NULLPTR;
    uint estimatedCount = 0;
    m_dirtyAttributes.reserve(m_dirtyAttributes.size() + geometry->attributes().size());
    Q_FOREACH (const QNodeId &attributeId, geometry->attributes()) {
        Attribute *attribute = m_nodesManager->attributeManager() ;
        if (attribute == Q_NULLPTR)
            continue;
12 NVIDIA TOOLS EXTENSION (NVTX)
#include "nvToolsExt.h"

void hotspotfunc()
{
    nvtxMarkA("hotspot reached");  // drop a single marker on the profiler timeline
}

void render()
{
    nvtxRangeId_t r = nvtxRangeStartA("rendering scene");  // open a named range
    // render everything
    nvtxRangeEnd(r);  // close the range
}
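Explicit start/end pairs are easy to unbalance when a function has early returns. One common alternative is an RAII guard built on NVTX's push/pop range calls (`nvtxRangePushA`, `nvtxRangePop`). The sketch below rests on assumptions: `ScopedRange` and the `USE_NVTX` build flag are hypothetical names introduced here, and without that flag the guard only tracks nesting depth so that the example compiles on machines without the NVTX header.

```cpp
#include <cassert>

// USE_NVTX is a hypothetical opt-in flag for this sketch; define it and link
// against the NVTX library to emit real ranges into the profiler timeline.
#ifdef USE_NVTX
#include "nvToolsExt.h"
#endif

// Hypothetical RAII wrapper: opens a nested range on construction and
// closes it on destruction, so early returns cannot leave a range open.
class ScopedRange {
public:
    explicit ScopedRange(const char *name) {
        assert(name != nullptr);
#ifdef USE_NVTX
        nvtxRangePushA(name);
#else
        (void)name;  // no profiler attached: only track nesting depth
#endif
        ++depth();
    }
    ~ScopedRange() {
#ifdef USE_NVTX
        nvtxRangePop();
#endif
        --depth();
    }
    ScopedRange(const ScopedRange &) = delete;
    ScopedRange &operator=(const ScopedRange &) = delete;

    // Current nesting depth on the calling thread (useful for sanity checks).
    static int &depth() {
        thread_local int d = 0;
        return d;
    }
};
```

Usage would look like `ScopedRange frame("rendering scene");` at the top of `render()`. Push/pop ranges must begin and end on the same thread, which matches the scope-based lifetime used here.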
13 QT5: CASE STUDY With QT5 Samples BIG SCENE (qt3d) PLANETS (qt3d, qml, quick) Lots of small geometry Many draw calls Scene graph usage GPU intensive Optimizing GL call stack Tools showcase: TSP, NVTX Tools showcase: TGD 13
14 GL STATE CACHING 14
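The slide gives only the title, so the following is a minimal sketch of what GL state caching usually means rather than the code shown in the talk: remember the last value sent to the driver and skip calls that would not change it. `StateCache` and `Capability` are hypothetical names; the `apply` callback stands in for `glEnable`/`glDisable` so that the example runs without a GL context.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

using Capability = std::uint32_t;  // stand-in for GLenum in this sketch

// Hypothetical cache in front of glEnable/glDisable: forwards a call to the
// driver only when the requested state differs from the last known state.
class StateCache {
public:
    using ApplyFn = std::function<void(Capability, bool)>;

    explicit StateCache(ApplyFn apply) : apply_(std::move(apply)) {
        assert(apply_);  // a callable is required
    }

    void setEnabled(Capability cap, bool enabled) {
        auto it = known_.find(cap);
        if (it != known_.end() && it->second == enabled) {
            ++skipped_;  // redundant call filtered out
            return;
        }
        apply_(cap, enabled);  // in a real renderer: glEnable(cap) or glDisable(cap)
        known_[cap] = enabled;
    }

    // Forget everything, e.g. after third-party code touched the GL context.
    void invalidate() { known_.clear(); }

    int skipped() const { return skipped_; }

private:
    ApplyFn apply_;
    std::unordered_map<Capability, bool> known_;
    int skipped_ = 0;
};
```

The cache is only correct if every state change goes through it; any code that calls the driver directly should be followed by `invalidate()`.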
15 UNIFORM CACHING 15
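Uniform caching applies the same idea to shader constants: keep the last value uploaded to each uniform location and skip the upload when the new value is identical. This is a sketch under assumptions, not the talk's implementation. `UniformCache` and `Vec4` are hypothetical names, one cache instance is assumed per shader program, and the `upload` callback stands in for a call such as `glUniform4fv`.

```cpp
#include <array>
#include <cassert>
#include <functional>
#include <unordered_map>
#include <utility>

using Vec4 = std::array<float, 4>;

// Hypothetical per-program cache of vec4 uniforms, keyed by uniform location.
class UniformCache {
public:
    using UploadFn = std::function<void(int location, const Vec4 &value)>;

    explicit UniformCache(UploadFn upload) : upload_(std::move(upload)) {
        assert(upload_);  // a callable is required
    }

    // Returns true if the value was sent to the driver, false if it was skipped.
    bool set(int location, const Vec4 &value) {
        assert(location >= 0);  // -1 means "uniform not found" in GL
        auto it = last_.find(location);
        if (it != last_.end() && it->second == value) {
            return false;  // unchanged: no driver call needed
        }
        upload_(location, value);  // in a real renderer: glUniform4fv(location, 1, value.data())
        last_[location] = value;
        return true;
    }

private:
    UploadFn upload_;
    std::unordered_map<int, Vec4> last_;
};
```

The comparison is exact floating-point equality, so a value containing NaN never matches and is always uploaded. The cache must be discarded when the program is relinked, because uniform values are reset at that point.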
16 EFFICIENT GPU PROGRAMMING BEST PRACTICES: STATES AND GEOMETRY
- Do not set states redundantly
- Try to sort draw calls according to common states
- Disable unused vertex arrays
- Use buffer objects
- Pack small buffers into a single one and use one draw call
- Use indexed primitives
- Pack vertex attributes
- Use uniform winding (clockwise or counter-clockwise) for geometry
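Sorting draw calls by common state can be sketched as a stable sort on a state key. The example below is illustrative only: `DrawCall`, its fields, and the choice of shader program before texture as the sort order are assumptions introduced here, and `countStateChanges` is a hypothetical helper for comparing the number of state switches before and after sorting.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical draw call record: the state it needs plus the mesh to draw.
struct DrawCall {
    std::uint32_t program;
    std::uint32_t texture;
    std::uint32_t mesh;
};

// Order by program first, then texture (an assumed cost ranking).
inline bool stateLess(const DrawCall &a, const DrawCall &b) {
    return std::tie(a.program, a.texture) < std::tie(b.program, b.texture);
}

// Group draw calls that share state; stable_sort keeps the submission
// order of calls whose state is identical.
inline void sortByState(std::vector<DrawCall> &calls) {
    std::stable_sort(calls.begin(), calls.end(), stateLess);
    assert(std::is_sorted(calls.begin(), calls.end(), stateLess));
}

// Count how many program/texture switches a submission order would cause.
inline std::size_t countStateChanges(const std::vector<DrawCall> &calls) {
    std::size_t changes = 0;
    for (std::size_t i = 1; i < calls.size(); ++i) {
        if (calls[i].program != calls[i - 1].program) ++changes;
        if (calls[i].texture != calls[i - 1].texture) ++changes;
    }
    return changes;
}
```

This ordering is appropriate for opaque geometry only; blended geometry generally has to keep its depth order regardless of state.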
17 EFFICIENT GPU PROGRAMMING BEST PRACTICES: TEXTURES
- Use texture compression when possible
- Prefer immutable textures created with glTexStorage[23]D()
- Use mipmaps
- Consider using texture atlases/maps
- Avoid random access
- Update textures with glTexSubImage[23]D()
- Update dynamically generated textures through FBOs
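Immutable storage and mipmaps interact: `glTexStorage2D` takes the number of mip levels up front, so the application has to compute it before allocating. A full chain has floor(log2(max(width, height))) + 1 levels. The helper below is a small sketch of that calculation; `fullMipLevels` is a hypothetical name, not a GL function.

```cpp
#include <algorithm>
#include <cassert>

// Number of levels in a complete mip chain for a width x height texture:
// halve the larger dimension until it reaches 1, counting the base level.
inline int fullMipLevels(int width, int height) {
    assert(width > 0 && height > 0);
    int levels = 1;  // level 0 is the base image
    int size = std::max(width, height);
    while (size > 1) {
        size /= 2;  // each level halves the dimension, rounding down
        ++levels;
    }
    return levels;
}

// Intended use with a current GL context (not compiled in this sketch):
//   glTexStorage2D(GL_TEXTURE_2D, fullMipLevels(w, h), GL_RGBA8, w, h);
//   glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, w, h, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
```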
18 EFFICIENT GPU PROGRAMMING BEST PRACTICES: RENDERING
- If possible render front to back
- Avoid reading back from GPU
- Disable modes/tests that you do not need
- Clear buffers only if you need to
- Avoid memory management during runtime
- Update data only when needed
- Cull early and often
- Do computations as early as possible
- Use shader cache for faster application start
- Use instancing
- Use indirect draw calls
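Rendering front to back helps because the depth test can then reject fragments hidden behind geometry that has already been drawn. A minimal sketch of that ordering for opaque objects follows; `OpaqueItem`, `Vec3`, and the use of each object's center as its sort key are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 {
    float x, y, z;
};

// Hypothetical opaque renderable: an id plus the center of its bounds.
struct OpaqueItem {
    int id;
    Vec3 center;
};

inline float distanceSquared(const Vec3 &a, const Vec3 &b) {
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Sort opaque items so the ones nearest to the camera are drawn first.
// Squared distance gives the same order as distance and avoids a sqrt.
inline void sortFrontToBack(std::vector<OpaqueItem> &items, const Vec3 &camera) {
    // Coordinates are assumed finite; NaN would break the sort ordering.
    assert(std::isfinite(camera.x) && std::isfinite(camera.y) && std::isfinite(camera.z));
    std::sort(items.begin(), items.end(),
              [&camera](const OpaqueItem &a, const OpaqueItem &b) {
                  return distanceSquared(a.center, camera) <
                         distanceSquared(b.center, camera);
              });
}
```

Depth ordering competes with sorting by state, so a renderer typically picks one as the primary key per pass, or sorts coarse depth buckets by state within each bucket.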
19 CONCLUSION
- Optimize as you develop
- Identify your use cases
- Get an overview of the application modules: how much time is spent in every module?
- Profile the modules for hot spots
- Invest the most time in reducing the big hot spots
- Get the low-hanging fruit first
- Use Tegra Graphics Debugger to analyze your GPU usage
- Optimize your GL stream and minimize driver overhead
20 RECOMMENDED SESSIONS (TALK, TUTORIAL, HANDS ON LAB, HANGOUTS)
- S Memory Bandwidth Bootcamp: Collaborative Access Patterns
- S Developer Tools for Next Generation Graphics APIs
- S Nvpro-Pipeline: Handling Massive Transform Updates in a SceneGraph
- S Optimizing Application Performance with CUDA Profiling Tools
- S6111, S NVIDIA CUDA Optimization with NVIDIA Nsight Eclipse Edition
- L6135A, L6135B - Jetson Developer Tools Lab
- H6122, H Performance Optimization & Analysis
21 THANK YOU
JOIN THE NVIDIA DEVELOPER PROGRAM AT developer.nvidia.com/join