Introduction to PowerVR Ray Tracing Tuesday 18th March, GDC. James A. McCombe
|
|
- Clifton Bond
- 5 years ago
- Views:
Transcription
1 Introduction to PowerVR Tracing Tuesday 18th March, GDC James A. McCombe
2 What are we launching today?
3 Host CPU Interface Vertex Data Master Control and Register Bus Unified Shading Cluster Array System Memory Interface Pixel Data Master Compute Data Master Course Grain Scheduler 0 Texture Unit... 1 Core Management Data Master n-1 Texture Unit n Memory Cache Unit Tracing Unit Intersection Processor Array Coherency Engine Scene Frame Hierarchy Accumulator Generator Cache System Memory Bus Tiling Coprocessor Pixel Co- Processor 2D Core (PTLA)
4
5
6 What are the applications for this?
7
8 Augmented Reality Image and Diagram Copyright Peter Kán 2012
9 Virtual Reality a.k.a Multi-persective rendering Lens distortion and aberration correction Lenticular Displays Ultra-low latency rendering
10 Hybrid rendering in games Shadows Reflections Transparency Better scaling with multiple dynamic lights Real-time light map updates for the rasteriser Easy to bolt on
11 Fully ray traced graphics Brute force path tracing Produces photo realism easily Pretty much requires all the 3D content to be ray traced Possible in todays technology in console / desktop using Wizard Probably not practical for fully real-time use in mobile for a couple more generations. Fine for sub-realtime use.
12 Non-graphics Better in-game AI Collision Detection 3D spatial search
13 Overview of PowerVR
14 Power and Bandwidth 300W 300Gb/s 270Gb/s 240Gb/s 225W 210Gb/s 180Gb/s 150W 150Gb/s 120Gb/s 90Gb/s 75W 60Gb/s 30Gb/s 0W 0Gb/s Workstation Mobile Workstation Laptop Light Laptop Tablet Graphics Watts Graphics Bandwidth
15 Capability of our mobile graphics 150 Shader GFlops Almost 5x compute capability 4x vec4 MAD 16x Scalar 2xMAD Gtexels/s OpenGLES 3.0 Support
16 Host CPU Interface Vertex Data Master Control and Register Bus Unified Shading Cluster Array System Memory Interface Pixel Data Master Compute Data Master Course Grain Scheduler 0 Texture Unit... 1 Core Management n-1 Texture Unit n Memory Cache Unit Tiling Coprocessor Pixel Co- Processor 2D Core (PTLA) System Memory Bus
17 Vertex Data Master Vertex Task 32 Instances Unified Shading Cluster Array Pixel Data Master Pixel Task Compute Task 32 Instances CGS 0 n-1 Texture Unit Texture Unit 1 n Compute Data Master 32 Instances An instance is a vertex, pixel or OpenCL thread All instances in the task share Program, Uniforms, Parameters, etc.
18 Shader Uniforms and Constants Common Store Registers (Shared across instances in the task) Common Instruction ALU ALU ALU ALU ALU ALU ALU ALU ALU ALU ALU ALU ALU ALU ALU ALU Per Pixel / Vertex data and shader temps Unified Store Registers (Private registers per instance) (not exact)
19 Source 0 ABS, NEG, etc. 2 Flops Source 1 Source 2 ABS, NEG, etc. ABS, NEG, etc. Integer Math Format Conversions Output 0 Comparison Format Conversions Source 3 ABS, NEG, etc. Source 4 ABS, NEG, etc. 2 Flops Output 1 Source 5 ABS, NEG, etc. 12 Cycles of Latency (not exact)
20 Common Store Registers (Shared across instances in the task) Common Instruction ALU ALU ALU ALU ALU ALU ALU ALU ALU ALU ALU ALU ALU ALU ALU ALU varying vec2 texcoord; void main() { if (texcoord.x > 0.5) { } else { } Unified Store Registers (Private registers per instance)
21 Overall capability Series6 gives you over 100Gflops for shading on latest mobile devices OpenGL ES 3.0 support Massive capability gain allowing math heavy shaders Mobile graphics is where the actual innovation is happening Try it!
22 A Technical introduction to Wizard
23 Host CPU Interface Vertex Data Master Control and Register Bus Unified Shading Cluster Array System Memory Interface Pixel Data Master Compute Data Master Course Grain Scheduler 0 Texture Unit... 1 Core Management Data Master n-1 Texture Unit n Memory Cache Unit Tracing Unit Intersection Processor Array Coherency Engine Scene Frame Hierarchy Accumulator Generator Cache System Memory Bus Tiling Coprocessor Pixel Co- Processor 2D Core (PTLA)
24 Host CPU Interface Vertex Data Master Control and Register Bus Unified Shading Cluster Array System Memory Interface Pixel Data Master Compute Data Master Course Grain Scheduler 0 Texture Unit... 1 Core Management Data Master n-1 Texture Unit n Memory Cache Unit Tracing Unit Intersection Processor Array Coherency Engine Scene Frame Hierarchy Accumulator Generator Cache System Memory Bus Tiling Coprocessor Pixel Co- Processor 2D Core (PTLA)
25 Source 0 ABS, NEG, etc. 2 Flops Source 1 Source 2 ABS, NEG, etc. ABS, NEG, etc. Integer Math Format Conversions Output 0 RAM Comparison Format Conversions Source 3 ABS, NEG, etc. Source 4 ABS, NEG, etc. 2 Flops Output 1 RAM Source 5 ABS, NEG, etc. 12 Cycles of Latency (not exact)
26 New shader type Shaders are invoked when a ray intersects a triangle Shaders types can emit any number of rays OpenRL has a frame shader to emit primary rays Existing fragment / pixel shaders can emit rays also! GLSL Programming model is the same Built-in functions, parameters, etc.
27 Primitive Objects Encapsulate the rendering state for one mesh Includes VBOs, uniforms, shader programs, texture bindings, etc. tracing unit sorts rays into tasks with common primitive objects Persistent between frames Mutable objects by the client
28 Constraints / Limitations Shaders cannot wait for results of individual ray trace operations Shaders must provide a worst case estimate on number of child rays Per-ray user data payload must be carefully managed
29 Task 0 Parallelism is on rays NOT pixels Task 1 Common Instruction ALU ALU ALU ALU Task 2 (In this example, ALU width is 4 to make the diagram smaller) Task 3
30 Parallelism is on rays NOT pixels
31 Relief: Quick ray emission demo
32 Intersection Processor Array or Who needs Fixed Function?
33 :AABB Test Fast -Axis Aligned Bounding Overlap Tests with Plucker Coordinates. Jeffrey Mahovsky and Brian Wyvill 6 lines form the silhouette of the AABB 6 planes from the ray origin and each edge vector Dot product of plane normal and ray direction vector 6 signs must match and be negative.
34 distance setup Translation Plane Setup Plane tests Test Combine XB 0 >7 voxel_center_x origin_x! shift_center_x7+/! AX0min BX0max cov_convert Octant(0) XA XB XA ri(31) ri invert ri(31) >7 origin_x_miss invert ri(31) ri(31) dirx ray_distance diry ray_distance x x ri rj voxel_center_y origin_y! shift_center_y7+/! AY0min BY0max cov_convert Octant(1) YB YA YB YA rj(31) rj(31) 0 rj >7 invert rj(31) >7 invert rj(31) endpt_x_miss origin_y_miss endpt_y_miss origin_z_miss MISS Hit or Miss dirz ray_distance x rk! ZB 0 >7 endpt_z_miss voxel_center_z origin_z! shift_center_z7+/! AZ0min BZ0max cov_convert Octant(2) ZA ZB ZA rk(31) rk invert rk(31) >7 invert 3 MUL 6 SUB 6 ADD 12 MUL 6 bitwise OR rk(31) rk(31)
35 instruction group packing LD MUL ADD MUL OR LD MUL ADD MUL LD MUL ADD MUL OR LD SUB ADD MUL 26 Instructions LD LD SUB SUB SUB SUB ADD ADD MUL MUL MUL MUL OR OR (32 if using compressed data formats) SUB MUL OR MUL MUL OR MUL
36 Triangle Triangle Tester and s Tester and s Triangle Triangle Tester and s Tester and s Child Address Free Queue Hits Coherence Cache Old Count Update Existing Queue Existing Queue Queue Packer Inst Data Inst Data Update New Queue Coherence Queue RAM 44x less area for this function
37 The Tracing Unit and Coherence Engine
38 Data Master Course Grain Scheduler Unified Shading Cluster Array 0 Texture Unit 1 System Memory Interface n-1 Texture Unit n s grouped by PrimitiveObject (Shader + Resources) s in arbitrary order Tracing Unit Intersection Processor Array Coherency Engine System Memory Bus
39 AABB Blocks Vertex Blocks Varyings Ascending Virtual Memory Addresses ( around 100Mbytes for 1M triangles including varyings )
40 Primitive Object 1 Primitive Object 2 PrimitiveObject 0
41 node_ node_17094 node_13580 node_17097 node_13627 node_17101 node_13567 node_17104 node_13619 node_17108 node_13571 node_13574 node_17113 node_13623 node_22218 node_13227 node_18727 node_13233 node_18728 node_18732 node_13270 node_18735 node_13262 node_13064 node_18644 node_13182 node_18645 node_18649 node_13056 node_18652 node_13175 node_17181 node_11676 node_17186 node_11707 node_17188 node_11680 node_17191 node_11710 node_19786 node_13225 node_18690 node_18693 node_13223 node_18699 node_18706 node_13260 node_18720 node_18714 node_12173 node_22242 node_22251 node_18863 node_18868 node_5790 node_5792 node_18870 node_5788 node_5785 node_12226 node_5883 node_18875 node_6403 node_6405 node_6208 node_6212 node_6221 node_6216 node_18881 node_6401 node_6200 node_6204 node_13363 node_6399 node_6192 node_6196 node_18894 node_18899 node_5801 node_5805 node_18902 node_5794 node_5799 node_18906 node_6232 node_6419 node_6421 node_6238 node_6242 node_18914 node_6222 node_6228 node_6407 node_6417 node_22256 node_18804 node_18807 node_13257 node_18812 node_12204 node_5773 node_12202 node_5775 node_18815 node_6166 node_13331 node_6389 node_6162 node_13335 node_6387 node_18821 node_12196 node_12189 node_12192 node_18825 node_13323 node_13315 node_13319 node_6160 node_18836 node_18841 node_12220 node_5781 node_12218 node_5783 node_18844 node_1221 node_5777 node_12210 node_5779 node_22290 node_22294 node_22300 node_16820 node_16823 node_10766 node_22307 node_22311 node_22317 node_22323 node_22431 node_22438 node_11100 node_16838 node_11102 node_11104 node_16841 node_11106 node_22446 node_10782 node_16827 node_10804 node_10833 node_16831 node_10853 node_22451 node_11094 node_19643 node_10965 node_11031 node_11096 node_11098 node_22459 node_22467 node_16854 node_16857 node_22474 node_22480 node_16850 node_16843 node_16846 node_11108 node_19254 node_19259 node_7692 node_13760 node_7710 node_7688 node_13764 node_7706 node_19264 node_7634 node_13681 node_7672 node_13685 node_7670 node_7626 node_7628 node_19272 node_13750 node_7686 node_13754 node_7704 node_1 node_13671 nod node_13945
42
43 Primitive Object 1 Primitive Object 2 PrimitiveObject 0
44 (coherence queues) Primitive Object 1 Primitive Object 2 PrimitiveObject 0
45 Automatically finds coherence paths
46 The Scene Hierarchy Generator
47 Vertex Data Master Course Grain Scheduler Unified Shading Cluster Array 0 Texture Unit 1 System Memory Interface n-1 Texture Unit n World-space Triangles and associated varyings Scene Hierarchy Generator System Memory Bus
48 Limitations Scene is represented by triangles same as today BVH is in a defined format optimised for construction and traversal Triangle order must generally follow a spatially coherent flow An approximate scene scale estimation is needed Geometry shaders are not inline with the ray tracing pipeline
49 Strengths Shading cluster workload is no higher than a vertex shader Only needs to process geometry that actually moved in world space Unique algorithm constrains working set to internal registers only Single pass operation: in-line with vertex shader execution Handles the long skinny triangles problem well Streaming writes to external memory Losslessly compressed output formats due to build algorithm Compact logic
50 Built on sparse log2 oct-tree scaffolding Level = 1 Level = 2 Level = 3
51 For each triangle Select LOD VBExtents Foreach VBNode in LOD v2 Compute Parent Voxel v1 v0 VBNode Pool Per LOD VBNode linked list head pointers On VoxelCache HIT Leaf VoxelCache On eviction Add VBNode GenerateVBNode() On eviction Tree VoxelCache Output Scene Acceleration Structure ProcessTriangle() AssembleParents()
52 After some triangles... LOD10 LeafCount=3 TreeCount=0 Leaf Head Tree Head LOD14 LeafCount=0 TreeCount=0 Leaf Head Tree Head LOD15 LeafCount=3 TreeCount=0 Leaf Head Tree Head LOD16 LeafCount=8 TreeCount=0 Leaf Head Tree Head
53 Assembling parents... LOD10 LeafCount=3 TreeCount=0 Leaf Head Tree Head VBNode Pool Add VBNode to linked list of VBNodes for its LOD GenerateVBNode() LOD14 LeafCount=0 TreeCount=0 LOD15 LeafCount=3 TreeCount=0 LOD16 LeafCount=8 TreeCount=0 Leaf Head Tree Head Leaf Head Tree Head Leaf Head Tree Head On eviction Tree VoxelCache Flush Leaf Nodes
54 After parents for one level have been assembled... LOD10 LeafCount=3 TreeCount=0 Leaf Head Tree Head LOD14 LeafCount=0 TreeCount=0 Leaf Head Tree Head LOD15 LeafCount=3 TreeCount=0 Leaf Head Tree Head A B C LOD16 LeafCount=8 TreeCount=0 Leaf Head Tree Head A0 A1 B0 C0 C1 B1 B2 C2
55 Performance Expectations
56 Host CPU Interface Vertex Data Master Control and Register Bus Unified Shading Cluster Array System Memory Interface Pixel Data Master Compute Data Master Data Master Course Grain Scheduler 0 n-1 Texture Unit Inst/ray (512flops peak) Texture Unit n Core Management Memory Cache Unit Tracing Unit 1-2 clk/ray 6 clk/tri 1-2clk/accum Intersection Processor Array Coherency Engine 64 AABB/ray 16 Tri/ray Scene Frame Hierarchy Accumulator Generator Cache System Memory Bus Tiling Coprocessor Pixel Co- Processor 2D Core (PTLA)
57 Get early access to the programming concepts at: Talk immediately after this on Hybrid rendering See the demos at Imagination Booth #402 South Hall
58 QA
59 Thank you
Enhancing Traditional Rasterization Graphics with Ray Tracing. October 2015
Enhancing Traditional Rasterization Graphics with Ray Tracing October 2015 James Rumble Developer Technology Engineer, PowerVR Graphics Overview Ray Tracing Fundamentals PowerVR Ray Tracing Pipeline Using
More informationEnabling immersive gaming experiences Intro to Ray Tracing
Enabling immersive gaming experiences Intro to Ray Tracing Overview What is Ray Tracing? Why Ray Tracing? PowerVR Wizard Architecture Example Content Unity Hybrid Rendering Demonstration 3 What is Ray
More informationEnhancing Traditional Rasterization Graphics with Ray Tracing. March 2015
Enhancing Traditional Rasterization Graphics with Ray Tracing March 2015 Introductions James Rumble Developer Technology Engineer Ray Tracing Support Justin DeCell Software Design Engineer Ray Tracing
More informationINFOGR Computer Graphics. J. Bikker - April-July Lecture 11: Acceleration. Welcome!
INFOGR Computer Graphics J. Bikker - April-July 2015 - Lecture 11: Acceleration Welcome! Today s Agenda: High-speed Ray Tracing Acceleration Structures The Bounding Volume Hierarchy BVH Construction BVH
More informationPractical Techniques for Ray Tracing in Games. Gareth Morgan (Imagination Technologies) Aras Pranckevičius (Unity Technologies) March, 2014
Practical Techniques for Ray Tracing in Games Gareth Morgan (Imagination Technologies) Aras Pranckevičius (Unity Technologies) March, 2014 What Ray Tracing is not! Myth: Ray Tracing is only for photorealistic
More informationAcceleration Data Structures
CT4510: Computer Graphics Acceleration Data Structures BOCHANG MOON Ray Tracing Procedure for Ray Tracing: For each pixel Generate a primary ray (with depth 0) While (depth < d) { Find the closest intersection
More informationPowerVR Series5. Architecture Guide for Developers
Public Imagination Technologies PowerVR Series5 Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty of any kind.
More informationPowerVR Hardware. Architecture Overview for Developers
Public Imagination Technologies PowerVR Hardware Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty of any kind.
More informationPowerVR Graphics - Latest Developments and Future Plans
PowerVR Graphics - Latest Developments and Future Plans Latest Developments and Future Plans A brief introduction Joe Davis Lead Developer Support Engineer, PowerVR Graphics With Imagination s PowerVR
More informationScene Management. Video Game Technologies 11498: MSc in Computer Science and Engineering 11156: MSc in Game Design and Development
Video Game Technologies 11498: MSc in Computer Science and Engineering 11156: MSc in Game Design and Development Chap. 5 Scene Management Overview Scene Management vs Rendering This chapter is about rendering
More informationFast Stereoscopic Rendering on Mobile Ray Tracing GPU for Virtual Reality Applications
Fast Stereoscopic Rendering on Mobile Ray Tracing GPU for Virtual Reality Applications SAMSUNG Advanced Institute of Technology Won-Jong Lee, Seok Joong Hwang, Youngsam Shin, Jeong-Joon Yoo, Soojung Ryu
More informationRay Tracing. Computer Graphics CMU /15-662, Fall 2016
Ray Tracing Computer Graphics CMU 15-462/15-662, Fall 2016 Primitive-partitioning vs. space-partitioning acceleration structures Primitive partitioning (bounding volume hierarchy): partitions node s primitives
More informationINFOMAGR Advanced Graphics. Jacco Bikker - February April Welcome!
INFOMAGR Advanced Graphics Jacco Bikker - February April 2016 Welcome! I x, x = g(x, x ) ε x, x + S ρ x, x, x I x, x dx Today s Agenda: Introduction Ray Distributions The Top-level BVH Real-time Ray Tracing
More informationSpring 2009 Prof. Hyesoon Kim
Spring 2009 Prof. Hyesoon Kim Application Geometry Rasterizer CPU Each stage cane be also pipelined The slowest of the pipeline stage determines the rendering speed. Frames per second (fps) Executes on
More informationGeForce4. John Montrym Henry Moreton
GeForce4 John Montrym Henry Moreton 1 Architectural Drivers Programmability Parallelism Memory bandwidth 2 Recent History: GeForce 1&2 First integrated geometry engine & 4 pixels/clk Fixed-function transform,
More informationAnnouncements. Written Assignment2 is out, due March 8 Graded Programming Assignment2 next Tuesday
Announcements Written Assignment2 is out, due March 8 Graded Programming Assignment2 next Tuesday 1 Spatial Data Structures Hierarchical Bounding Volumes Grids Octrees BSP Trees 11/7/02 Speeding Up Computations
More informationGraphics Hardware, Graphics APIs, and Computation on GPUs. Mark Segal
Graphics Hardware, Graphics APIs, and Computation on GPUs Mark Segal Overview Graphics Pipeline Graphics Hardware Graphics APIs ATI s low-level interface for computation on GPUs 2 Graphics Hardware High
More informationLecture 4 - Real-time Ray Tracing
INFOMAGR Advanced Graphics Jacco Bikker - November 2017 - February 2018 Lecture 4 - Real-time Ray Tracing Welcome! I x, x = g(x, x ) ε x, x + න S ρ x, x, x I x, x dx Today s Agenda: Introduction Ray Distributions
More informationReal-Time Rendering (Echtzeitgraphik) Michael Wimmer
Real-Time Rendering (Echtzeitgraphik) Michael Wimmer wimmer@cg.tuwien.ac.at Walking down the graphics pipeline Application Geometry Rasterizer What for? Understanding the rendering pipeline is the key
More information6.837 Introduction to Computer Graphics Final Exam Tuesday, December 20, :05-12pm Two hand-written sheet of notes (4 pages) allowed 1 SSD [ /17]
6.837 Introduction to Computer Graphics Final Exam Tuesday, December 20, 2011 9:05-12pm Two hand-written sheet of notes (4 pages) allowed NAME: 1 / 17 2 / 12 3 / 35 4 / 8 5 / 18 Total / 90 1 SSD [ /17]
More informationRay Tracing with Multi-Core/Shared Memory Systems. Abe Stephens
Ray Tracing with Multi-Core/Shared Memory Systems Abe Stephens Real-time Interactive Massive Model Visualization Tutorial EuroGraphics 2006. Vienna Austria. Monday September 4, 2006 http://www.sci.utah.edu/~abe/massive06/
More informationNext-Generation Graphics on Larrabee. Tim Foley Intel Corp
Next-Generation Graphics on Larrabee Tim Foley Intel Corp Motivation The killer app for GPGPU is graphics We ve seen Abstract models for parallel programming How those models map efficiently to Larrabee
More informationGraphics Architectures and OpenCL. Michael Doggett Department of Computer Science Lund university
Graphics Architectures and OpenCL Michael Doggett Department of Computer Science Lund university Overview Parallelism Radeon 5870 Tiled Graphics Architectures Important when Memory and Bandwidth limited
More informationLPGPU Workshop on Power-Efficient GPU and Many-core Computing (PEGPUM 2014)
A practitioner s view of challenges faced with power and performance on mobile GPU Prashant Sharma Samsung R&D Institute UK LPGPU Workshop on Power-Efficient GPU and Many-core Computing (PEGPUM 2014) SERI
More informationB-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes
B-KD rees for Hardware Accelerated Ray racing of Dynamic Scenes Sven Woop Gerd Marmitt Philipp Slusallek Saarland University, Germany Outline Previous Work B-KD ree as new Spatial Index Structure DynR
More informationINFOMAGR Advanced Graphics. Jacco Bikker - February April Welcome!
INFOMAGR Advanced Graphics Jacco Bikker - February April 2016 Welcome! I x, x = g(x, x ) ε x, x + S ρ x, x, x I x, x dx Today s Agenda: Introduction : GPU Ray Tracing Practical Perspective Advanced Graphics
More informationParallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)
Lecture 2: Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload) Visual Computing Systems Analyzing a 3D Graphics Workload Where is most of the work done? Memory Vertex
More informationSpring 2011 Prof. Hyesoon Kim
Spring 2011 Prof. Hyesoon Kim Application Geometry Rasterizer CPU Each stage cane be also pipelined The slowest of the pipeline stage determines the rendering speed. Frames per second (fps) Executes on
More informationCS451Real-time Rendering Pipeline
1 CS451Real-time Rendering Pipeline JYH-MING LIEN DEPARTMENT OF COMPUTER SCIENCE GEORGE MASON UNIVERSITY Based on Tomas Akenine-Möller s lecture note You say that you render a 3D 2 scene, but what does
More informationLecture 9: Deferred Shading. Visual Computing Systems CMU , Fall 2013
Lecture 9: Deferred Shading Visual Computing Systems The course so far The real-time graphics pipeline abstraction Principle graphics abstractions Algorithms and modern high performance implementations
More informationReal-Time Voxelization for Global Illumination
Lecture 26: Real-Time Voxelization for Global Illumination Visual Computing Systems Voxelization to regular grid Input: scene triangles Output: surface information at each voxel in 3D grid - Simple case:
More informationTDA362/DIT223 Computer Graphics EXAM (Same exam for both CTH- and GU students)
TDA362/DIT223 Computer Graphics EXAM (Same exam for both CTH- and GU students) Saturday, January 13 th, 2018, 08:30-12:30 Examiner Ulf Assarsson, tel. 031-772 1775 Permitted Technical Aids None, except
More informationCS8803SC Software and Hardware Cooperative Computing GPGPU. Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology
CS8803SC Software and Hardware Cooperative Computing GPGPU Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology Why GPU? A quiet revolution and potential build-up Calculation: 367
More informationHardware Accelerated Volume Visualization. Leonid I. Dimitrov & Milos Sramek GMI Austrian Academy of Sciences
Hardware Accelerated Volume Visualization Leonid I. Dimitrov & Milos Sramek GMI Austrian Academy of Sciences A Real-Time VR System Real-Time: 25-30 frames per second 4D visualization: real time input of
More informationRay Tracing Acceleration. CS 4620 Lecture 20
Ray Tracing Acceleration CS 4620 Lecture 20 2013 Steve Marschner 1 Will this be on the exam? or, Prelim 2 syllabus You can expect emphasis on topics related to the assignment (Shaders 1&2) and homework
More informationCSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.
CSCI 402: Computer Architectures Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI 6.6 - End Today s Contents GPU Cluster and its network topology The Roofline performance
More informationThreading Hardware in G80
ing Hardware in G80 1 Sources Slides by ECE 498 AL : Programming Massively Parallel Processors : Wen-Mei Hwu John Nickolls, NVIDIA 2 3D 3D API: API: OpenGL OpenGL or or Direct3D Direct3D GPU Command &
More informationBifrost - The GPU architecture for next five billion
Bifrost - The GPU architecture for next five billion Hessed Choi Senior FAE / ARM ARM Tech Forum June 28 th, 2016 Vulkan 2 ARM 2016 What is Vulkan? A 3D graphics API for the next twenty years Logical successor
More informationNVIDIA Case Studies:
NVIDIA Case Studies: OptiX & Image Space Photon Mapping David Luebke NVIDIA Research Beyond Programmable Shading 0 How Far Beyond? The continuum Beyond Programmable Shading Just programmable shading: DX,
More informationPowerVR GPU IP from Wearables to Servers. Kristof Beets Director of Business Development May 2015
PowerVR GPU IP from Wearables to Servers Kristof Beets Director of Business Development May 2015 www.imgtec.com Expanding embedded GPU market opportunities Huge range of market opportunities equates to
More informationCS427 Multicore Architecture and Parallel Computing
CS427 Multicore Architecture and Parallel Computing Lecture 6 GPU Architecture Li Jiang 2014/10/9 1 GPU Scaling A quiet revolution and potential build-up Calculation: 936 GFLOPS vs. 102 GFLOPS Memory Bandwidth:
More informationIn the previous presentation, Erik Sintorn presented methods for practically constructing a DAG structure from a voxel data set.
1 In the previous presentation, Erik Sintorn presented methods for practically constructing a DAG structure from a voxel data set. This presentation presents how such a DAG structure can be accessed immediately
More informationSpatial Data Structures and Speed-Up Techniques. Tomas Akenine-Möller Department of Computer Engineering Chalmers University of Technology
Spatial Data Structures and Speed-Up Techniques Tomas Akenine-Möller Department of Computer Engineering Chalmers University of Technology Spatial data structures What is it? Data structure that organizes
More informationParallel Programming on Larrabee. Tim Foley Intel Corp
Parallel Programming on Larrabee Tim Foley Intel Corp Motivation This morning we talked about abstractions A mental model for GPU architectures Parallel programming models Particular tools and APIs This
More informationEfficient and Scalable Shading for Many Lights
Efficient and Scalable Shading for Many Lights 1. GPU Overview 2. Shading recap 3. Forward Shading 4. Deferred Shading 5. Tiled Deferred Shading 6. And more! First GPU Shaders Unified Shaders CUDA OpenCL
More informationChallenges for GPU Architecture. Michael Doggett Graphics Architecture Group April 2, 2008
Michael Doggett Graphics Architecture Group April 2, 2008 Graphics Processing Unit Architecture CPUs vsgpus AMD s ATI RADEON 2900 Programming Brook+, CAL, ShaderAnalyzer Architecture Challenges Accelerated
More informationCSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller
Entertainment Graphics: Virtual Realism for the Masses CSE 591: GPU Programming Introduction Computer games need to have: realistic appearance of characters and objects believable and creative shading,
More informationCourse Recap + 3D Graphics on Mobile GPUs
Lecture 18: Course Recap + 3D Graphics on Mobile GPUs Interactive Computer Graphics Q. What is a big concern in mobile computing? A. Power Two reasons to save power Run at higher performance for a fixed
More informationThe Traditional Graphics Pipeline
Last Time? The Traditional Graphics Pipeline Participating Media Measuring BRDFs 3D Digitizing & Scattering BSSRDFs Monte Carlo Simulation Dipole Approximation Today Ray Casting / Tracing Advantages? Ray
More informationOverview. Pipeline implementation I. Overview. Required Tasks. Preliminaries Clipping. Hidden Surface removal
Overview Pipeline implementation I Preliminaries Clipping Line clipping Hidden Surface removal Overview At end of the geometric pipeline, vertices have been assembled into primitives Must clip out primitives
More informationSpring 2010 Prof. Hyesoon Kim. AMD presentations from Richard Huddy and Michael Doggett
Spring 2010 Prof. Hyesoon Kim AMD presentations from Richard Huddy and Michael Doggett Radeon 2900 2600 2400 Stream Processors 320 120 40 SIMDs 4 3 2 Pipelines 16 8 4 Texture Units 16 8 4 Render Backens
More informationPortland State University ECE 588/688. Graphics Processors
Portland State University ECE 588/688 Graphics Processors Copyright by Alaa Alameldeen 2018 Why Graphics Processors? Graphics programs have different characteristics from general purpose programs Highly
More informationIntroduction to Parallel Programming Models
Introduction to Parallel Programming Models Tim Foley Stanford University Beyond Programmable Shading 1 Overview Introduce three kinds of parallelism Used in visual computing Targeting throughput architectures
More informationCS 4620 Program 4: Ray II
CS 4620 Program 4: Ray II out: Tuesday 11 November 2008 due: Tuesday 25 November 2008 1 Introduction In the first ray tracing assignment you built a simple ray tracer that handled just the basics. In this
More informationSpatial Data Structures
Spatial Data Structures Hierarchical Bounding Volumes Regular Grids Octrees BSP Trees Constructive Solid Geometry (CSG) [Angel 9.10] Outline Ray tracing review what rays matter? Ray tracing speedup faster
More informationCS 465 Program 5: Ray II
CS 465 Program 5: Ray II out: Friday 2 November 2007 due: Saturday 1 December 2007 Sunday 2 December 2007 midnight 1 Introduction In the first ray tracing assignment you built a simple ray tracer that
More informationCopyright Khronos Group Page 1
Gaming Market Briefing Overview of APIs GDC March 2016 Neil Trevett Khronos President NVIDIA Vice President Developer Ecosystem ntrevett@nvidia.com @neilt3d Copyright Khronos Group 2016 - Page 1 Copyright
More informationThe Traditional Graphics Pipeline
Last Time? The Traditional Graphics Pipeline Reading for Today A Practical Model for Subsurface Light Transport, Jensen, Marschner, Levoy, & Hanrahan, SIGGRAPH 2001 Participating Media Measuring BRDFs
More informationCSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University
CSE 591/392: GPU Programming Introduction Klaus Mueller Computer Science Department Stony Brook University First: A Big Word of Thanks! to the millions of computer game enthusiasts worldwide Who demand
More informationX. GPU Programming. Jacobs University Visualization and Computer Graphics Lab : Advanced Graphics - Chapter X 1
X. GPU Programming 320491: Advanced Graphics - Chapter X 1 X.1 GPU Architecture 320491: Advanced Graphics - Chapter X 2 GPU Graphics Processing Unit Parallelized SIMD Architecture 112 processing cores
More informationKinetic BV Hierarchies and Collision Detection
Kinetic BV Hierarchies and Collision Detection Gabriel Zachmann Clausthal University, Germany zach@tu-clausthal.de Bonn, 25. January 2008 Bounding Volume Hierarchies BVHs are standard DS for collision
More informationImproving Memory Space Efficiency of Kd-tree for Real-time Ray Tracing Byeongjun Choi, Byungjoon Chang, Insung Ihm
Improving Memory Space Efficiency of Kd-tree for Real-time Ray Tracing Byeongjun Choi, Byungjoon Chang, Insung Ihm Department of Computer Science and Engineering Sogang University, Korea Improving Memory
More informationThe Bifrost GPU architecture and the ARM Mali-G71 GPU
The Bifrost GPU architecture and the ARM Mali-G71 GPU Jem Davies ARM Fellow and VP of Technology Hot Chips 28 Aug 2016 Introduction to ARM Soft IP ARM licenses Soft IP cores (amongst other things) to our
More informationFast BVH Construction on GPUs
Fast BVH Construction on GPUs Published in EUROGRAGHICS, (2009) C. Lauterbach, M. Garland, S. Sengupta, D. Luebke, D. Manocha University of North Carolina at Chapel Hill NVIDIA University of California
More informationPoint Cloud Filtering using Ray Casting by Eric Jensen 2012 The Basic Methodology
Point Cloud Filtering using Ray Casting by Eric Jensen 01 The Basic Methodology Ray tracing in standard graphics study is a method of following the path of a photon from the light source to the camera,
More informationCS 354R: Computer Game Technology
CS 354R: Computer Game Technology Texture and Environment Maps Fall 2018 Texture Mapping Problem: colors, normals, etc. are only specified at vertices How do we add detail between vertices without incurring
More informationA Reconfigurable Architecture for Load-Balanced Rendering
A Reconfigurable Architecture for Load-Balanced Rendering Jiawen Chen Michael I. Gordon William Thies Matthias Zwicker Kari Pulli Frédo Durand Graphics Hardware July 31, 2005, Los Angeles, CA The Load
More informationAccelerating Ray Tracing
Accelerating Ray Tracing Ray Tracing Acceleration Techniques Faster Intersections Fewer Rays Generalized Rays Faster Ray-Object Intersections Object bounding volumes Efficient intersection routines Fewer
More informationUsing GPUs. Visualization of Complex Functions. September 26, 2012 Khaldoon Ghanem German Research School for Simulation Sciences
Visualization of Complex Functions Using GPUs September 26, 2012 Khaldoon Ghanem German Research School for Simulation Sciences Outline GPU in a Nutshell Fractals - A Simple Fragment Shader Domain Coloring
More informationLast Time: Acceleration Data Structures for Ray Tracing. Schedule. Today. Shadows & Light Sources. Shadows
Last Time: Acceleration Data Structures for Ray Tracing Modeling Transformations Illumination (Shading) Viewing Transformation (Perspective / Orthographic) Clipping Projection (to Screen Space) Scan Conversion
More informationThe Graphics Pipeline
The Graphics Pipeline Ray Tracing: Why Slow? Basic ray tracing: 1 ray/pixel Ray Tracing: Why Slow? Basic ray tracing: 1 ray/pixel But you really want shadows, reflections, global illumination, antialiasing
More informationReal - Time Rendering. Graphics pipeline. Michal Červeňanský Juraj Starinský
Real - Time Rendering Graphics pipeline Michal Červeňanský Juraj Starinský Overview History of Graphics HW Rendering pipeline Shaders Debugging 2 History of Graphics HW First generation Second generation
More informationBuilding scalable 3D applications. Ville Miettinen Hybrid Graphics
Building scalable 3D applications Ville Miettinen Hybrid Graphics What s going to happen... (1/2) Mass market: 3D apps will become a huge success on low-end and mid-tier cell phones Retro-gaming New game
More informationInteractive Isosurface Ray Tracing of Large Octree Volumes
Interactive Isosurface Ray Tracing of Large Octree Volumes Aaron Knoll, Ingo Wald, Steven Parker, and Charles Hansen Scientific Computing and Imaging Institute University of Utah 2006 IEEE Symposium on
More informationRay Casting of Trimmed NURBS Surfaces on the GPU
Ray Casting of Trimmed NURBS Surfaces on the GPU Hans-Friedrich Pabst Jan P. Springer André Schollmeyer Robert Lenhardt Christian Lessig Bernd Fröhlich Bauhaus University Weimar Faculty of Media Virtual
More informationAccelerating Geometric Queries. Computer Graphics CMU /15-662, Fall 2016
Accelerating Geometric Queries Computer Graphics CMU 15-462/15-662, Fall 2016 Geometric modeling and geometric queries p What point on the mesh is closest to p? What point on the mesh is closest to p?
More informationParallel Computing: Parallel Architectures Jin, Hai
Parallel Computing: Parallel Architectures Jin, Hai School of Computer Science and Technology Huazhong University of Science and Technology Peripherals Computer Central Processing Unit Main Memory Computer
More informationCS230 : Computer Graphics Lecture 4. Tamar Shinar Computer Science & Engineering UC Riverside
CS230 : Computer Graphics Lecture 4 Tamar Shinar Computer Science & Engineering UC Riverside Shadows Shadows for each pixel do compute viewing ray if ( ray hits an object with t in [0, inf] ) then compute
More informationScheduling the Graphics Pipeline on a GPU
Lecture 20: Scheduling the Graphics Pipeline on a GPU Visual Computing Systems Today Real-time 3D graphics workload metrics Scheduling the graphics pipeline on a modern GPU Quick aside: tessellation Triangle
More informationWhiz-Bang Graphics and Media Performance for Java Platform, Micro Edition (JavaME)
Whiz-Bang Graphics and Media Performance for Java Platform, Micro Edition (JavaME) Pavel Petroshenko, Sun Microsystems, Inc. Ashmi Bhanushali, NVIDIA Corporation Jerry Evans, Sun Microsystems, Inc. Nandini
More informationOpenGL BOF Siggraph 2011
OpenGL BOF Siggraph 2011 OpenGL BOF Agenda OpenGL 4 update Barthold Lichtenbelt, NVIDIA OpenGL Shading Language Hints/Kinks Bill Licea-Kane, AMD Ecosystem update Jon Leech, Khronos Viewperf 12, a new beginning
More informationSpatial Data Structures. Steve Rotenberg CSE168: Rendering Algorithms UCSD, Spring 2017
Spatial Data Structures Steve Rotenberg CSE168: Rendering Algorithms UCSD, Spring 2017 Ray Intersections We can roughly estimate the time to render an image as being proportional to the number of ray-triangle
More informationSpatial Data Structures
15-462 Computer Graphics I Lecture 17 Spatial Data Structures Hierarchical Bounding Volumes Regular Grids Octrees BSP Trees Constructive Solid Geometry (CSG) March 28, 2002 [Angel 8.9] Frank Pfenning Carnegie
More informationLecture 2 - Acceleration Structures
INFOMAGR Advanced Graphics Jacco Bikker - November 2017 - February 2018 Lecture 2 - Acceleration Structures Welcome! I x, x = g(x, x ) ε x, x + න S ρ x, x, x I x, x dx Today s Agenda: Problem Analysis
More informationEECS 487: Interactive Computer Graphics
EECS 487: Interactive Computer Graphics Lecture 21: Overview of Low-level Graphics API Metal, Direct3D 12, Vulkan Console Games Why do games look and perform so much better on consoles than on PCs with
More informationGPGPU Lessons Learned. Mark Harris
GPGPU Lessons Learned Mark Harris General-Purpose Computation on GPUs Highly parallel applications Physically-based simulation image processing scientific computing computer vision computational finance
More informationGraphics Hardware. Graphics Processing Unit (GPU) is a Subsidiary hardware. With massively multi-threaded many-core. Dedicated to 2D and 3D graphics
Why GPU? Chapter 1 Graphics Hardware Graphics Processing Unit (GPU) is a Subsidiary hardware With massively multi-threaded many-core Dedicated to 2D and 3D graphics Special purpose low functionality, high
More informationDave Shreiner, ARM March 2009
4 th Annual Dave Shreiner, ARM March 2009 Copyright Khronos Group, 2009 - Page 1 Motivation - What s OpenGL ES, and what can it do for me? Overview - Lingo decoder - Overview of the OpenGL ES Pipeline
More informationCOMP 4801 Final Year Project. Ray Tracing for Computer Graphics. Final Project Report FYP Runjing Liu. Advised by. Dr. L.Y.
COMP 4801 Final Year Project Ray Tracing for Computer Graphics Final Project Report FYP 15014 by Runjing Liu Advised by Dr. L.Y. Wei 1 Abstract The goal of this project was to use ray tracing in a rendering
More informationA distributed rendering architecture for ray tracing large scenes on commodity hardware. FlexRender. Bob Somers Zoe J.
FlexRender A distributed rendering architecture for ray tracing large scenes on commodity hardware. GRAPP 2013 Bob Somers Zoe J. Wood Increasing Geometric Complexity Normal Maps artifacts on silhouette
More informationInteractive Light Mapping with PowerVR Ray Tracing
Interactive Light Mapping with PowerVR Ray Tracing Jens Fursund Justin DeCell Light Map Basics A light map is a texture that stores lighting for objects in the scene 3 Generation of light maps for GI Charting
More informationParallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload)
Lecture 2: Parallelizing Graphics Pipeline Execution (+ Basics of Characterizing a Rendering Workload) Visual Computing Systems Today Finishing up from last time Brief discussion of graphics workload metrics
More informationSqueezing Performance out of your Game with ATI Developer Performance Tools and Optimization Techniques
Squeezing Performance out of your Game with ATI Developer Performance Tools and Optimization Techniques Jonathan Zarge, Team Lead Performance Tools Richard Huddy, European Developer Relations Manager ATI
More informationDirect Rendering of Trimmed NURBS Surfaces
Direct Rendering of Trimmed NURBS Surfaces Hardware Graphics Pipeline 2/ 81 Hardware Graphics Pipeline GPU Video Memory CPU Vertex Processor Raster Unit Fragment Processor Render Target Screen Extended
More informationRay Tracing Acceleration Data Structures
Ray Tracing Acceleration Data Structures Sumair Ahmed October 29, 2009 Ray Tracing is very time-consuming because of the ray-object intersection calculations. With the brute force method, each ray has
More informationComputer Graphics. - Spatial Index Structures - Philipp Slusallek
Computer Graphics - Spatial Index Structures - Philipp Slusallek Overview Last lecture Overview of ray tracing Ray-primitive intersections Today Acceleration structures Bounding Volume Hierarchies (BVH)
More informationUnderstanding Shaders and WebGL. Chris Dalton & Olli Etuaho
Understanding Shaders and WebGL Chris Dalton & Olli Etuaho Agenda Introduction: Accessible shader development with WebGL Understanding WebGL shader execution: from JS to GPU Common shader bugs Accessible
More informationMali Developer Resources. Kevin Ho ARM Taiwan FAE
Mali Developer Resources Kevin Ho ARM Taiwan FAE ARM Mali Developer Tools Software Development SDKs for OpenGL ES & OpenCL OpenGL ES Emulators Shader Development Studio Shader Library Asset Creation Texture
More informationFROM VERTICES TO FRAGMENTS. Lecture 5 Comp3080 Computer Graphics HKBU
FROM VERTICES TO FRAGMENTS Lecture 5 Comp3080 Computer Graphics HKBU OBJECTIVES Introduce basic implementation strategies Clipping Scan conversion OCTOBER 9, 2011 2 OVERVIEW At end of the geometric pipeline,
More informationGraphics Processing Unit Architecture (GPU Arch)
Graphics Processing Unit Architecture (GPU Arch) With a focus on NVIDIA GeForce 6800 GPU 1 What is a GPU From Wikipedia : A specialized processor efficient at manipulating and displaying computer graphics
More information