# SPARSE FLUID SIMULATION IN DIRECTX. Alex Dunn Graphics Dev. Tech.

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 SPARSE FLUID SIMULATION IN DIRECTX Alex Dunn Graphics Dev. Tech.

2 AGENDA We want more fluid in games! Eulerian Fluid Simulation. Sparse Eulerian Fluid. Feature Level 11.3 Enhancements. 2

3 WHY DO WE NEED FLUID IN GAMES? Replace particle kinematics! more realistic == better user immersion More than just eye candy? game mechanics? 3

4 EULERIAN SIMULATION #1 My (simple) DX11.0 eulerian fluid simulation: Inject 2x Velocity Advect Pressure Vorticity Evolve 2x Pressure 1x Vorticity 4

5 EULERIAN SIMULATION #2 Inject Advect Pressure Vorticity Evolve Add fluid to simulation Move data at, XYZ (XYZ+Velocity) Calculate localized pressure Calculates localized rotational flow Tick Simulation 5

6 **(some imagination required)** 6

7 7

8 TOO MANY VOLUMES SPOIL THE Fluid isn t box shaped. clipping wastage Simulated separately. authoring GPU state volume-to-volume interaction Tricky to render. 8

9 9

10 Memory (Mb) PROBLEM! N-order problem 64^3 = ~0.25m cells 128^3 = ~2m cells 256^3 = ~16m cells Applies to: computational complexity memory requirements Texture3D - 4x16F Dimensions (X = Y = Z) And that s just 1 texture 10

11 BRICKS Split simulation space into groups of cells (each known as a brick). Simulate each brick independently. 11

12 BRICK MAP Need to track which bricks contain fluid Texture3D<uint> 1 voxel per brick 0 Ignore 1 Simulate Could also use packed binary grids [Gruen15], but this requires atomics 12

13 TRACKING BRICKS #1 Initialise using fluid emitters. (easy with primitives) 13

14 TRACKING BRICKS #2 Simulating air is important for accuracy. Simulate? = Velocity > 0 14

15 TRACKING BRICKS #3 Expansion (ignore simulate) if { V x y z > D brick } expand simulation in that axis Reduction (simulate ignore) inverse of Expansion handled automatically by clear 15

16 SPARSE SIMULATION Clear BrickMap Inject Advect Pressure Vorticity Evolve* Fill List Reset all to 0 (ignore) in brick map. Read value from brick map. Append brick coordinate to list if occupied. Texture3D<uint> g_brickmapro; AppendStructredBuffer<uint3> g_listrw; if(g_brickmapro[idx]!= 0) { g_listrw.append(idx); } *Includes expansion 16

17 Memory (Mb) PROBLEM! N-order problem 64^3 = ~0.25m cells 128^3 = ~2m cells 256^3 = ~16m cells Applies to: computational complexity memory requirements Texture3D - 4x16F Dimensions (X = Y = Z) And that s just 1 texture 17

18 UNCOMPRESSED STORAGE Simulate Ignore Allocate everything; forget about unoccupied cells Pros: simulation is coherent in memory. works in DX11.0. Cons: no reduction in memory usage. 18

19 COMPRESSED STORAGE Indirection Table Similar to, List<Brick> Pros: good memory consumption. works in DX11.0. Physical Memory Cons: allocation strategies. indirect lookup. software translation filtering particularly costly 19

20 PADDING TO REDUCE EDGE CASES 1 Brick = (4) 3 = 64 20

21 PADDING TO REDUCE EDGE CASES 1 Brick = (1+4+1) 3 = 216 New problem; 6n 2 +12n + 8 problem. Can we do better? 21

22 ENTER; FEATURE LEVEL 11.3 Volume Tiled Resources (VTR)! Extends 2D functionality in FL11.2 Must query HW support: (DX11.3!= FL11.3): ID3D11Device3* pdevice3 = nullptr; pdevice->queryinterface(&pdevice3); D3D11_FEATURE_DATA_D3D11_OPTIONS2 support; pdevice3->checkfeaturesupport(d3d11_feature_d3d11_options2, &support, sizeof(support)); m_usetiledresources = support.tiledresourcestier == D3D11_TILED_RESOURCES_TIER_3; 22

23 TILED RESOURCES #1 Pros: only mapped memory is allocated in VRAM hardware translation logically a volume texture all samplers supported 1 Tile = 64KB (= 1 Brick) fast loads 23

24 TILED RESOURCES #2 1 Tile = 64KB (= 1 Brick) BPP Tile Dimensions 8 64x32x x32x x32x x16x x16x16 24

25 TILED RESOURCES #3 Letting the driver know which bricks/tiles should be resident: HRESULT ID3D11DeviceContext2::UpdateTileMappings( ID3D11Resource *ptiledresource, UINT NumTiledResourceRegions, const D3D11_TILED_RESOURCE_COORDINATE *ptiledresourceregionstartcoordinates, const D3D11_TILE_REGION_SIZE *ptiledresourceregionsizes, ID3D11Buffer *ptilepool, UINT NumRanges, const UINT *prangeflags, const UINT *ptilepoolstartoffsets, const UINT *prangetilecounts, UINT Flags ); 25

26 UPDATE TILE MAPPINGS TIP Don t update all tiles every frame. const UINT *prangeflags Track tile deltas and use the range flags; Ignore (unmapped) Simulate (mapped) Unchanged D3D11_TILE_RANGE_NULL D3D11_TILE_RANGE_REUSE_SINGLE_TILE D3D11_TILE_RANGE_SKIP 26

27 CPU READ BACKS Taboo in real time graphics CPU read backs are fine, if done correctly! (and bad if not) 2 frame latency (more for SLI) Profile map/unmap calls if unsure N; Data Ready N+1; Data Ready N+2; Data Ready CPU: GPU: Frame N Frame N+1 Frame N+2 Frame N+3 Frame N Frame N+1 Frame N+2 N; Tiles Mapped 27

28 LATENCY RESISTANT SIMULATION #1 Naïve Approach: clamp velocity to V max CPU Read-back: occupied bricks. 2 frames of latency! extrapolate probable tiles. 28

29 LATENCY RESISTANT SIMULATION #2 Better Approach: CPU Read-back: occupied bricks. max{ V } within brick. 2 frames of latency! extrapolate probable tiles. 29

30 LATENCY RESISTANT SIMULATION #3 Yes CPU Read back Ready? No Read back Brick List Prediction Engine Emitter Bricks CPU GPU Sparse Eulerian Simulation UpdateTile Mappings 30

31 DEMO 31

32 Sim. Time (ms) PERFORMANCE # Full Grid Sparse Grid Grid Resolution NOTE: Numbers captured on a GeForce GTX980 32

33 Memory (MB) PERFORMANCE #2 40, ,160 5, Full Grid 11 Sparse Grid Grid Resolution NOTE: Numbers captured on a GeForce GTX980 33

34 Other latency resistant techniques using tiled resources?? Thank you! Alex Dunn -

### Realtime Water Simulation on GPU. Nuttapong Chentanez NVIDIA Research

1 Realtime Water Simulation on GPU Nuttapong Chentanez NVIDIA Research 2 3 Overview Approaches to realtime water simulation Hybrid shallow water solver + particles Hybrid 3D tall cell water solver + particles

### NVIDIA AFTERMATH: A NEW WAY OF DEBUGGING CRASHES ON THE GPU. Alex Dunn, 2 nd March 2017

NVIDIA AFTERMATH: A NEW WAY OF DEBUGGING CRASHES ON THE GPU Alex Dunn, 2 nd March 2017 NVIDIA AFTERMATH What is it? New tool to diagnose GPU crashes, available on GeForce! Coming to D3D for broad availability

### Tiled shading: light culling reaching the speed of light. Dmitry Zhdan Developer Technology Engineer, NVIDIA

Tiled shading: light culling reaching the speed of light Dmitry Zhdan Developer Technology Engineer, NVIDIA Agenda Über Goal Classic deferred vs tiled shading How to improve culling in tiled shading? New

### Real-Time Voxelization for Global Illumination

Lecture 26: Real-Time Voxelization for Global Illumination Visual Computing Systems Voxelization to regular grid Input: scene triangles Output: surface information at each voxel in 3D grid - Simple case:

### CS 231. Fluid simulation

CS 231 Fluid simulation Why Simulate Fluids? Feature film special effects Computer games Medicine (e.g. blood flow in heart) Because it s fun Fluid Simulation Called Computational Fluid Dynamics (CFD)

### Scalable Multi Agent Simulation on the GPU. Avi Bleiweiss NVIDIA Corporation San Jose, 2009

Scalable Multi Agent Simulation on the GPU Avi Bleiweiss NVIDIA Corporation San Jose, 2009 Reasoning Explicit State machine, serial Implicit Compute intensive Fits SIMT well Collision avoidance Motivation

### CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

Entertainment Graphics: Virtual Realism for the Masses CSE 591: GPU Programming Introduction Computer games need to have: realistic appearance of characters and objects believable and creative shading,

### UberFlow: A GPU-Based Particle Engine

UberFlow: A GPU-Based Particle Engine Peter Kipfer Mark Segal Rüdiger Westermann Technische Universität München ATI Research Technische Universität München Motivation Want to create, modify and render

### Particle Simulation using CUDA. Simon Green

Particle Simulation using CUDA Simon Green sdkfeedback@nvidia.com July 2012 Document Change History Version Date Responsible Reason for Change 1.0 Sept 19 2007 Simon Green Initial draft 1.1 Nov 3 2007

### Low-Overhead Rendering with Direct3D. Evan Hart Principal Engineer - NVIDIA

Low-Overhead Rendering with Direct3D Evan Hart Principal Engineer - NVIDIA Ground Rules No DX9 Need to move fast Big topic in 30 minutes Assuming experienced audience Everything is a tradeoff These are

### Voxel Cone Tracing and Sparse Voxel Octree for Real-time Global Illumination. Cyril Crassin NVIDIA Research

Voxel Cone Tracing and Sparse Voxel Octree for Real-time Global Illumination Cyril Crassin NVIDIA Research Global Illumination Indirect effects Important for realistic image synthesis Direct lighting Direct+Indirect

### Octree-Based Sparse Voxelization for Real-Time Global Illumination. Cyril Crassin NVIDIA Research

Octree-Based Sparse Voxelization for Real-Time Global Illumination Cyril Crassin NVIDIA Research Voxel representations Crane et al. (NVIDIA) 2007 Allard et al. 2010 Christensen and Batali (Pixar) 2004

### Fluid Simulation. [Thürey 10] [Pfaff 10] [Chentanez 11]

Fluid Simulation [Thürey 10] [Pfaff 10] [Chentanez 11] 1 Computational Fluid Dynamics 3 Graphics Why don t we just take existing models from CFD for Computer Graphics applications? 4 Graphics Why don t

### CS GAME PROGRAMMING Question bank

CS6006 - GAME PROGRAMMING Question bank Part A Unit I 1. List the different types of coordinate systems. 2. What is ray tracing? Mention some applications of ray tracing. 3. Discuss the stages involved

### GPU-accelerated data expansion for the Marching Cubes algorithm

GPU-accelerated data expansion for the Marching Cubes algorithm San Jose (CA) September 23rd, 2010 Christopher Dyken, SINTEF Norway Gernot Ziegler, NVIDIA UK Agenda Motivation & Background Data Compaction

### New GPU Features of NVIDIA s Maxwell Architecture

New GPU Features of NVIDIA s Maxwell Architecture Holger Gruen Senior DevTech Engineer AGENDA 9:30 am 10:30 am 11:00 am 12:00 am 12:30 am 13:30 pm 14:00 pm 15:00 pm 15:30 pm 16:30 pm 17:00 pm 18:00 pm

### Adarsh Krishnamurthy (cs184-bb) Bela Stepanova (cs184-bs)

OBJECTIVE FLUID SIMULATIONS Adarsh Krishnamurthy (cs184-bb) Bela Stepanova (cs184-bs) The basic objective of the project is the implementation of the paper Stable Fluids (Jos Stam, SIGGRAPH 99). The final

### Simple and Powerful Animation Compression. Nicholas Fréchette Programming Consultant for Eidos Montreal

Simple and Powerful Animation Compression Nicholas Fréchette Programming Consultant for Eidos Montreal Contributors Frédéric Zimmer, co-designer Luke Mamacos, consultant Thank you Eidos Montreal! Presentation

### GCN Performance Tweets AMD Developer Relations

AMD Developer Relations Overview This document lists all GCN ( Graphics Core Next ) performance tweets that were released on Twitter during the first few months of 2013. Each performance tweet in this

### CUDA/OpenGL Fluid Simulation. Nolan Goodnight

CUDA/OpenGL Fluid Simulation Nolan Goodnight ngoodnight@nvidia.com Document Change History Version Date Responsible Reason for Change 0.1 2/22/07 Nolan Goodnight Initial draft 1.0 4/02/07 Nolan Goodnight

### FRUSTUM-TRACED RASTER SHADOWS: REVISITING IRREGULAR Z-BUFFERS

FRUSTUM-TRACED RASTER SHADOWS: REVISITING IRREGULAR Z-BUFFERS Chris Wyman, Rama Hoetzlein, Aaron Lefohn 2015 Symposium on Interactive 3D Graphics & Games CONTRIBUTIONS Full scene, fully dynamic alias-free

### PowerVR Hardware. Architecture Overview for Developers

Public Imagination Technologies PowerVR Hardware Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty of any kind.

### CLOUD GAMING AND NVIDIA GRID. Agatha Hu, GRID DevTech Engineer

CLOUD GAMING AND NVIDIA GRID Agatha Hu, GRID DevTech Engineer Why Cloud Gaming AGENDA Cloud Gaming Architecture Register and use AWS Onboard Games Onto GRID GRID Link SDK CLOUD GPUs FOR 6 MARKETS Cloud

### GDC 2014 Barthold Lichtenbelt OpenGL ARB chair

GDC 2014 Barthold Lichtenbelt OpenGL ARB chair Agenda OpenGL 4.4, news and updates - Barthold Lichtenbelt, NVIDIA Low Overhead Rendering with OpenGL - Cass Everitt, NVIDIA Copyright Khronos Group, 2010

### Render-To-Texture Caching. D. Sim Dietrich Jr.

Render-To-Texture Caching D. Sim Dietrich Jr. What is Render-To-Texture Caching? Pixel shaders are becoming more complex and expensive Per-pixel shadows Dynamic Normal Maps Bullet holes Water simulation

### CS-184: Computer Graphics Lecture #21: Fluid Simulation II

CS-184: Computer Graphics Lecture #21: Fluid Simulation II Rahul Narain University of California, Berkeley Nov. 18 19, 2013 Grid-based fluid simulation Recap: Eulerian viewpoint Grid is fixed, fluid moves

ing Hardware in G80 1 Sources Slides by ECE 498 AL : Programming Massively Parallel Processors : Wen-Mei Hwu John Nickolls, NVIDIA 2 3D 3D API: API: OpenGL OpenGL or or Direct3D Direct3D GPU Command &

### Complexity and Advanced Algorithms. Introduction to Parallel Algorithms

Complexity and Advanced Algorithms Introduction to Parallel Algorithms Why Parallel Computing? Save time, resources, memory,... Who is using it? Academia Industry Government Individuals? Two practical

### PVRTC & Texture Compression. User Guide

Public Imagination Technologies PVRTC & Texture Compression Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty

### Visualization Computer Graphics I Lecture 20

15-462 Computer Graphics I Lecture 20 Visualization Height Fields and Contours Scalar Fields Volume Rendering Vector Fields [Angel Ch. 12] November 20, 2003 Doug James Carnegie Mellon University http://www.cs.cmu.edu/~djames/15-462/fall03

### FRUSTUM-TRACED RASTER SHADOWS: REVISITING IRREGULAR Z-BUFFERS

FRUSTUM-TRACED RASTER SHADOWS: REVISITING IRREGULAR Z-BUFFERS Chris Wyman, Rama Hoetzlein, Aaron Lefohn 2015 Symposium on Interactive 3D Graphics & Games CONTRIBUTIONS Full scene, fully dynamic alias-free

### PowerVR Performance Recommendations. The Golden Rules

PowerVR Performance Recommendations Copyright Imagination Technologies Limited. All Rights Reserved. This publication contains proprietary information which is subject to change without notice and is supplied

### Lecture S3: File system data layout, naming

Lecture S3: File system data layout, naming Review -- 1 min Intro to I/O Performance model: Log Disk physical characteristics/desired abstractions Physical reality Desired abstraction disks are slow fast

### BACHELOR OF ARTS IN 3D ANIMATION AND VISUAL EFFECTS Term-End Theory Examination December, 2015 BNM-001 : ANIMATION PRODUCTION PIPELINE

No. of Printed Pages : 7 BNM-001 cx) BACHELOR OF ARTS IN 3D ANIMATION AND VISUAL EFFECTS Term-End Theory Examination December, 015 BNM-001 : ANIMATION PRODUCTION PIPELINE Time : 3 hours Maximum Marks :

### GPU ARCHITECTURE Chris Schultz, June 2017

Chris Schultz, June 2017 MISC All of the opinions expressed in this presentation are my own and do not reflect any held by NVIDIA 2 OUTLINE Problems Solved Over Time versus Why are they different? Complex

### The Illusion of Motion Making magic with textures in the vertex shader. Mario Palmero Lead Programmer at Tequila Works

The Illusion of Motion Making magic with textures in the vertex shader Mario Palmero Lead Programmer at Tequila Works Dark Ages before Textures in the Vertex Shader What is the Vertex Shader? A programmable

### Physics Parallelization. Erwin Coumans SCEA

Physics Parallelization A B C Erwin Coumans SCEA Takeaway Multi-threading the physics pipeline collision detection and constraint solver Refactoring algorithms and data for SPUs Maintain unified multi-platform

### CUDA (Compute Unified Device Architecture)

CUDA (Compute Unified Device Architecture) Mike Bailey History of GPU Performance vs. CPU Performance GFLOPS Source: NVIDIA G80 = GeForce 8800 GTX G71 = GeForce 7900 GTX G70 = GeForce 7800 GTX NV40 = GeForce

### Large and Sparse Mass Spectrometry Data Processing in the GPU Jose de Corral 2012 GPU Technology Conference

Large and Sparse Mass Spectrometry Data Processing in the GPU Jose de Corral 2012 GPU Technology Conference 2012 Waters Corporation 1 Agenda Overview of LC/IMS/MS 3D Data Processing 4D Data Processing

### Real-Time Graphics Architecture. Kurt Akeley Pat Hanrahan. Ray Tracing.

Real-Time Graphics Architecture Kurt Akeley Pat Hanrahan http://www.graphics.stanford.edu/courses/cs448a-01-fall Ray Tracing with Tim Purcell 1 Topics Why ray tracing? Interactive ray tracing on multicomputers

### Introduction to OpenGL ES 3.0

Introduction to OpenGL ES 3.0 Eisaku Ohbuchi Digital Media Professionals Inc. 2012 Digital Media Professionals Inc. All rights reserved. 12/Sep/2012 Page 1 Agenda DMP overview (quick!) OpenGL ES 3.0 update

### Streaming Massive Environments From Zero to 200MPH

FORZA MOTORSPORT From Zero to 200MPH Chris Tector (Software Architect Turn 10 Studios) Turn 10 Internal studio at Microsoft Game Studios - we make Forza Motorsport Around 70 full time staff 2 Why am I

### CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2017 Lecture 15

CS24: INTRODUCTION TO COMPUTING SYSTEMS Spring 2017 Lecture 15 LAST TIME: CACHE ORGANIZATION Caches have several important parameters B = 2 b bytes to store the block in each cache line S = 2 s cache sets

### Building an Inverted Index

Building an Inverted Index Algorithms Memory-based Disk-based (Sort-Inversion) Sorting Merging (2-way; multi-way) 2 Memory-based Inverted Index Phase I (parse and read) For each document Identify distinct

### Spatial Data Structures

15-462 Computer Graphics I Lecture 17 Spatial Data Structures Hierarchical Bounding Volumes Regular Grids Octrees BSP Trees Constructive Solid Geometry (CSG) April 1, 2003 [Angel 9.10] Frank Pfenning Carnegie

### Virtual Memory: Mechanisms. CS439: Principles of Computer Systems February 28, 2018

Virtual Memory: Mechanisms CS439: Principles of Computer Systems February 28, 2018 Last Time Physical addresses in physical memory Virtual/logical addresses in process address space Relocation Algorithms

### 1. Introduction. Introduction to Computer Graphics

1 1. Introduction Introduction to Computer Graphics 2 Display and Input devices Display and Input Technologies Physical Display Technologies 3 The first modern computer display devices we had were cathode

### NVIDIA GameWorks Animation Technologies in Unreal Engine 4

NVIDIA GameWorks Animation Technologies in Unreal Engine 4 Kier Storey, Viktor Makoviychuk NVIDIA Ori Cohen, Benn Gallagher - Epic Overview Introduce new Gameworks physics-based animation technologies

### DX10, Batching, and Performance Considerations. Bryan Dudash NVIDIA Developer Technology

DX10, Batching, and Performance Considerations Bryan Dudash NVIDIA Developer Technology The Point of this talk The attempt to combine wisdom and power has only rarely been successful and then only for

### Spatial Data Structures

CSCI 480 Computer Graphics Lecture 7 Spatial Data Structures Hierarchical Bounding Volumes Regular Grids BSP Trees [Ch. 0.] March 8, 0 Jernej Barbic University of Southern California http://www-bcf.usc.edu/~jbarbic/cs480-s/

### PowerVR Series5. Architecture Guide for Developers

Public Imagination Technologies PowerVR Series5 Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty of any kind.

### Stream Processing with CUDA TM A Case Study Using Gamebryo's Floodgate Technology

Stream Processing with CUDA TM A Case Study Using Gamebryo's Floodgate Technology Dan Amerson, Technical Director, Emergent Game Technologies Purpose Why am I giving this talk? To answer this question:

### Windowing System on a 3D Pipeline. February 2005

Windowing System on a 3D Pipeline February 2005 Agenda 1.Overview of the 3D pipeline 2.NVIDIA software overview 3.Strengths and challenges with using the 3D pipeline GeForce 6800 220M Transistors April

### Optimizing and Profiling Unity Games for Mobile Platforms. Angelo Theodorou Senior Software Engineer, MPG Gamelab 2014, 25 th -27 th June

Optimizing and Profiling Unity Games for Mobile Platforms Angelo Theodorou Senior Software Engineer, MPG Gamelab 2014, 25 th -27 th June 1 Agenda Introduction ARM and the presenter Preliminary knowledge

### VGA Card GV-NX78X256V-B

VGA Card GV-NX78X256V-B Jun. 29, 2005 VGA GA Card GV-NX78X256V -NX78X256V-B -B Jun. 29, 2005 VGA Card GV-NX78X256VP-B Jul. 28, 2005 VGA GA Card GV-NX78X256VP-B Jul. 28, 2005 GV-NX78X256V-B / GV-NX78X256VP-B

### How TokuDB Fractal TreeTM. Indexes Work. Bradley C. Kuszmaul. MySQL UC 2010 How Fractal Trees Work 1

MySQL UC 2010 How Fractal Trees Work 1 How TokuDB Fractal TreeTM Indexes Work Bradley C. Kuszmaul MySQL UC 2010 How Fractal Trees Work 2 More Information You can download this talk and others at http://tokutek.com/technology

### Particle-based Fluid Simulation

Simulation in Computer Graphics Particle-based Fluid Simulation Matthias Teschner Computer Science Department University of Freiburg Application (with Pixar) 10 million fluid + 4 million rigid particles,

### Coming to a Pixel Near You: Mobile 3D Graphics on the GoForce WMP. Chris Wynn NVIDIA Corporation

Coming to a Pixel Near You: Mobile 3D Graphics on the GoForce WMP Chris Wynn NVIDIA Corporation What is GoForce 3D? Licensable 3D Core for Mobile Devices Discrete Solutions: GoForce 3D 4500/4800 OpenGL

### Visualization and VR for the Grid

Visualization and VR for the Grid Chris Johnson Scientific Computing and Imaging Institute University of Utah Computational Science Pipeline Construct a model of the physical domain (Mesh Generation, CAD)

### Real-Time Rendering Architectures

Real-Time Rendering Architectures Mike Houston, AMD Part 1: throughput processing Three key concepts behind how modern GPU processing cores run code Knowing these concepts will help you: 1. Understand

### Speed up a Machine-Learning-based Image Super-Resolution Algorithm on GPGPU

Speed up a Machine-Learning-based Image Super-Resolution Algorithm on GPGPU Ke Ma 1, and Yao Song 2 1 Department of Computer Sciences 2 Department of Electrical and Computer Engineering University of Wisconsin-Madison

### Vector Visualization. CSC 7443: Scientific Information Visualization

Vector Visualization Vector data A vector is an object with direction and length v = (v x,v y,v z ) A vector field is a field which associates a vector with each point in space The vector data is 3D representation

### Voxels. Tech Team - Johnny Mercado, Michael Matonis, Glen Giffey, John Jackson

Voxels Tech Team - Johnny Mercado, Michael Matonis, Glen Giffey, John Jackson Pixel -> Voxel Appearance in Games Comanche: Maximum Overkill - 1992 Minecraft - 2011 Guncraft - 2013 CodeSpell https://www.youtube.com/watch?v=nn5mqxxzd0

### CUDA. Schedule API. Language extensions. nvcc. Function type qualifiers (1) CUDA compiler to handle the standard C extensions.

Schedule CUDA Digging further into the programming manual Application Programming Interface (API) text only part, sorry Image utilities (simple CUDA examples) Performace considerations Matrix multiplication

### GRAPHICS PROCESSING UNITS

GRAPHICS PROCESSING UNITS Slides by: Pedro Tomás Additional reading: Computer Architecture: A Quantitative Approach, 5th edition, Chapter 4, John L. Hennessy and David A. Patterson, Morgan Kaufmann, 2011

### Scan Primitives for GPU Computing

Scan Primitives for GPU Computing Agenda What is scan A naïve parallel scan algorithm A work-efficient parallel scan algorithm Parallel segmented scan Applications of scan Implementation on CUDA 2 Prefix

### VGA Card GV-NX66T256DE

VGA Card GV-NX66T256DE Sep. 6, 2005 VGA Card GV-NX66T256DE Sep. 6, 2005 GV-NX66T256DE GeForce 6600 GT Graphics Accelerator User's Manual Rev. 101 12MD-X66T2DE-101R Copyright 2005 GIGABYTE TECHNOLOGY CO.,

### CS195V Week 1. Introduction

CS195V Week 1 Introduction Welcome! This is CSCI1950V: Advanced GPU Programming Prerequisite: CSCI1230: Introduction to Computer Graphics CSCI2240: Interactive Computer Graphics may be helpful, but not

### CSM Scrolling: An acceleration technique for the rendering of cascaded shadow maps

CSM Scrolling: An acceleration technique for the rendering of cascaded shadow maps Abstract This talk will explain how a bitmap-scrolling technique can be combined with a shadow map caching scheme to significantly

### Massively Parallel Architectures

Massively Parallel Architectures A Take on Cell Processor and GPU programming Joel Falcou - LRI joel.falcou@lri.fr Bat. 490 - Bureau 104 20 janvier 2009 Motivation The CELL processor Harder,Better,Faster,Stronger

### Graphics Hardware, Graphics APIs, and Computation on GPUs. Mark Segal

Graphics Hardware, Graphics APIs, and Computation on GPUs Mark Segal Overview Graphics Pipeline Graphics Hardware Graphics APIs ATI s low-level interface for computation on GPUs 2 Graphics Hardware High

### Direct Rendering of Trimmed NURBS Surfaces

Direct Rendering of Trimmed NURBS Surfaces Hardware Graphics Pipeline 2/ 81 Hardware Graphics Pipeline GPU Video Memory CPU Vertex Processor Raster Unit Fragment Processor Render Target Screen Extended

### Next Generation OpenGL Neil Trevett Khronos President NVIDIA VP Mobile Copyright Khronos Group Page 1

Next Generation OpenGL Neil Trevett Khronos President NVIDIA VP Mobile Ecosystem @neilt3d Copyright Khronos Group 2015 - Page 1 Copyright Khronos Group 2015 - Page 2 Khronos Connects Software to Silicon

### Actions and Graphs in Blender - Week 8

Actions and Graphs in Blender - Week 8 Sculpt Tool Sculpting tools in Blender are very easy to use and they will help you create interesting effects and model characters when working with animation and

### Bigtable. A Distributed Storage System for Structured Data. Presenter: Yunming Zhang Conglong Li. Saturday, September 21, 13

Bigtable A Distributed Storage System for Structured Data Presenter: Yunming Zhang Conglong Li References SOCC 2010 Key Note Slides Jeff Dean Google Introduction to Distributed Computing, Winter 2008 University

### SCIENTIFIC VISUALIZATION ON GPU CLUSTERS PETER MESSMER, NVIDIA

SCIENTIFIC VISUALIZATION ON GPU CLUSTERS PETER MESSMER, NVIDIA Visualization Rendering Visualization Isosurfaces, Isovolumes Field Operators (Gradient, Curl,.. ) Coordinate transformations Feature extraction

### Spatial Data Structures

Spatial Data Structures Hierarchical Bounding Volumes Regular Grids Octrees BSP Trees Constructive Solid Geometry (CSG) [Angel 9.10] Outline Ray tracing review what rays matter? Ray tracing speedup faster

### NVIDIA Developer Tools for Graphics and PhysX

NVIDIA Developer Tools for Graphics and PhysX FX Composer Shader Debugger PerfKit Conference Presentations mental mill Artist Edition NVIDIA Shader Library Photoshop Plug ins Texture Tools Direct3D SDK

### Rendering Algorithms: Real-time indirect illumination. Spring 2010 Matthias Zwicker

Rendering Algorithms: Real-time indirect illumination Spring 2010 Matthias Zwicker Today Real-time indirect illumination Ray tracing vs. Rasterization Screen space techniques Visibility & shadows Instant

### The Exascale Architecture

The Exascale Architecture Richard Graham HPC Advisory Council China 2013 Overview Programming-model challenges for Exascale Challenges for scaling MPI to Exascale InfiniBand enhancements Dynamically Connected

### Practical Introduction to CUDA and GPU

Practical Introduction to CUDA and GPU Charlie Tang Centre for Theoretical Neuroscience October 9, 2009 Overview CUDA - stands for Compute Unified Device Architecture Introduced Nov. 2006, a parallel computing

### Multimedia in Mobile Phones. Architectures and Trends Lund

Multimedia in Mobile Phones Architectures and Trends Lund 091124 Presentation Henrik Ohlsson Contact: henrik.h.ohlsson@stericsson.com Working with multimedia hardware (graphics and displays) at ST- Ericsson

### LECTURE 5. Announcements

LECTURE 5 Announcements Falling Behind? Talk to us Number of retries / grading policy slightly different this year You still pass the class if you hand in all projects at the end of the semester Hand in

### A Different Approach for Continuous Physics. Vincent ROBERT Physics Programmer at Ubisoft

A Different Approach for Continuous Physics Vincent ROBERT vincent.robert@ubisoft.com Physics Programmer at Ubisoft A Different Approach for Continuous Physics Existing approaches Our method Limitations

### Flow Visualisation 1

Flow Visualisation Visualisation Lecture 13 Institute for Perception, Action & Behaviour School of Informatics Flow Visualisation 1 Flow Visualisation... so far Vector Field Visualisation vector fields

### Mesh Morphing and the Adjoint Solver in ANSYS R14.0. Simon Pereira Laz Foley

Mesh Morphing and the Adjoint Solver in ANSYS R14.0 Simon Pereira Laz Foley 1 Agenda Fluent Morphing-Optimization Feature RBF Morph with ANSYS DesignXplorer Adjoint Solver What does an adjoint solver do,

### CSE528 Computer Graphics: Theory, Algorithms, and Applications

CSE528 Computer Graphics: Theory, Algorithms, and Applications Hong Qin State University of New York at Stony Brook (Stony Brook University) Stony Brook, New York 11794--4400 Tel: (631)632-8450; Fax: (631)632-8334

### Simple and Fast Fluids

Simple and Fast Fluids Martin Guay, Fabrice Colin, Richard Egli To cite this version: Martin Guay, Fabrice Colin, Richard Egli. Simple and Fast Fluids. GPU Pro, A.K. Peters, Ltd., 2011, GPU Pro, pp.433-444.

### Khronos Data Format Specification

Copyright Khronos Group 2015 - Page 1 Khronos Data Format Specification 1.0 release, July 2015 Andrew Garrard Spec editor Senior Software Engineer, Samsung Electronics Copyright Khronos Group 2015 - Page

### Cell Based Servers for Next-Gen Games

Cell BE for Digital Animation and Visual F/X Cell Based Servers for Next-Gen Games Cell BE Online Game Prototype Bruce D Amora T.J. Watson Research Yorktown Heights, NY Karen Magerlein T.J. Watson Research

### Introduction to the Direct3D 11 Graphics Pipeline

Introduction to the Direct3D 11 Graphics Pipeline Kevin Gee - XNA Developer Connection Microsoft Corporation 2008 NVIDIA Corporation. Direct3D 11 focuses on Key Takeaways Increasing scalability, Improving

RADEON X1000 Memory Controller Ring Bus Memory Controller Supports today s fastest graphics memory devices GDDR3, 48+ GB/sec 512-bit Ring Bus Simplifies layout and enables extreme memory clock scaling

### Clipping. CSC 7443: Scientific Information Visualization

Clipping Clipping to See Inside Obscuring critical information contained in a volume data Contour displays show only exterior visible surfaces Isosurfaces can hide other isosurfaces Other displays can

### Numerical Modelling in Fortran: day 7. Paul Tackley, 2017

Numerical Modelling in Fortran: day 7 Paul Tackley, 2017 Today s Goals 1. Makefiles 2. Intrinsic functions 3. Optimisation: Making your code run as fast as possible 4. Combine advection-diffusion and Poisson

### Visualization. Images are used to aid in understanding of data. Height Fields and Contours Scalar Fields Volume Rendering Vector Fields [chapter 26]

Visualization Images are used to aid in understanding of data Height Fields and Contours Scalar Fields Volume Rendering Vector Fields [chapter 26] Tumor SCI, Utah Scientific Visualization Visualize large

### DiFi: Distance Fields - Fast Computation Using Graphics Hardware

DiFi: Distance Fields - Fast Computation Using Graphics Hardware Avneesh Sud Dinesh Manocha UNC-Chapel Hill http://gamma.cs.unc.edu/difi Distance Fields Distance Function For a site a scalar function f:r