Adaptive NURBS Tessellation with CUDA

Size: px
Start display at page:

Download "Adaptive NURBS Tessellation with CUDA"

Transcription

1 Adaptive NURBS with CUDA Presenter: Dr. Qiu Jie Multi-plAtform Game Innovation Centre (MAGIC), Nanyang Technological University (NTU), Singapore

2 About MAGIC Established in Nanyang Technological University (NTU), Singapore on 1 Nov 2012 Cross-disciplinary research with 26 professors, 33 researchers, 11 PhD To translate scientific ideas into technological products and services To increase the capability of Future Studios in media & entertainment Launched Future Studios Research Lab (FSR Lab) on 21 Jan

3 Adaptive NURBS Convert NURBS to a set of polygons Adaptive tessellation NURBS surface 3 According to Curvature View (camera distance to object) Non-adaptive tessellation Adaptive tessellation

4 Tessellating NURBS surface 4

5 Adaptive NURBS How to be adaptive? Divide the surface into patches Choose accurate, effective criterion to tessellate each patch How to handle the connection of neighbor parts? Extract neighbour info of each patch to check boundary consistency Stitch gaps if not consistent How to speed up? Implementation on GPU 5

6 Proposed Framework To be adaptive Handle boundaries Bézier Conversion Factors Calculation Stitch Gap 6 GPU implementation

7 Bézier Conversion Bézier Conversion Factors Calculation convert NURBS surface to a set of Bézier patches Stitch Gap 7 More accurate for tessellation factor calculation

8 Factors Calculation Bézier Conversion Factors Calculation Stitch Gap n m 8

9 Factors Calculation One traditional method 2 2 i i ε tess max( ( w P ) + ( r ) w ) i i P i : control points w i : weights of control points Related to the origin of the coordinate system Heuristic method: Center of bounding box as origin r 9 r

10 Factors Calculation Our method Using a new objective function to find origin O n n wp i i 2 ( Pj O) + wi ( O) 2 O w min( ) Model j= 0 i= 0 i Old method (ɛ =0.1) Our method (ɛ =0.1) 10 # of vertices

11 Bézier Conversion Factors Calculation Stitch Gap (s,t) Sst (,) = n m i= 0 j= 0 n m i= 0 j= 0 w P B() sb() t ij, ij, i j w B() sb() t ij, i j 11

12 Stitch Gap Bézier Conversion Factors Calculation Stitch Gap 12 Problem: inconsistent tessellation factors between patches gap

13 Stitch Gap 1 side 4 sides Re-design boundary connections 13

14 Implementation on GPU level of parallelism Thread assignment (16,16) (16,16) (16,16) (16,16) (16,16) (16,16) 14

15 Implementation on GPU data management Bézier Conversion Factors Calculation Stitch Gap Minimize data transfer 15

16 Implementation on GPU NURBS surface to Bézier patches Compute Bézier control points 16

17 Implementation on GPU Copy control points needed by each block to shared memory Avoid duplicate memory read for global memory 17 4 X 259 = 1036 points for a block 19 X 19 = 361 points for a block

18 Implementation on GPU Bézier Conversion Factors Calculation Stitch Gap Store tessellation factors of all patches in order to extract neighbor info 18

19 Implementation on GPU Calculate tessellation factors Calculate tessellation factors (2 directions) Store them in global memory n Patch ID 0 1 Direction 1 Direction 2 h 0 v 0 h 1 v 1 2 h 2 v 2 m 19 k h k v k

20 Implementation on GPU Bézier Conversion Factors Calculation Stitch Gap Pre-compute blending functions 20

21 Implementation on GPU using regular pattern Fast evaluation Each row has the same B i (s) Each column has the same (t) ij, ij, i j B j i= 0 j= 0 Sst (,) = n m n m i= 0 j= 0 w P B() sb() t w B() sb() t ij, i j 21

22 Implementation on GPU Bézier Conversion Factors Calculation Stitch Gap 22 Parallel scan to determine vertex/index offset for each patch

23 Implementation on GPU Vertex offset Determine # of vertices for each patch based on tessellation factors Determine vertex offset for each patch Patch ID # vertices vertex offset 0 (#v) (#v) 1 (#v) 2 Prefix sum (#v) 0 (#v) 0 +(#v) 1 23 k (#v) k SUM((#V) 0, (#V) k-1 )

24 Implementation on GPU Transition region and index offset Determine # of indices on each patch (include transition regions) Determine index offset for each patch Patch ID # indices index offset 0 (#id) (#id) 1 (#id) 2 Prefix sum (#id) 0 (#id) 0 +(#id) 1 24 k (#id) k SUM((#id) 0, (#id) k-1 )

25 Implementation on GPU Bézier Conversion Factors Calculation Stitch Gap Minimize data transfer Store patch info for easy neighbour info extraction Regular tessellation pattern Parallel scan 25

26 Video Demonstration 26

27 Results 27 Head Old man face Stegosaurus Patch No Triangle No

28 Results CPU (intel i GHz) vs GPU (GTX CUDA cores 160GB/s bandwidth) 28 Run time / ms Head Old man face Stegosaurus CPU GPU

29 Results Car model Error tolerance surfaces 3,081,930 triangles 10.4 ms 29

30 Results Submarine model Error tolerance surfaces 2,235,934 triangles 9.16 ms 30

31 Acknowledgement This research is supported by Multi-plAtform Game Innovation Centre (MAGIC), funded by the Singapore National Research Foundation under its IDM Futures Funding Initiative and administered by the Interactive & Digital Media Programme Office, Media Development Authority of Singapore, and the National Science Foundation of China (Grant no , ). 31

32 Thank you!

Future Studios Research Lab

Future Studios Research Lab GPU TECHNOLOGY WORKSHOP SOUTH EAST ASIA 2014 Future Studios Research Lab The Boy and His Robot Film Case Study Prof SEAH Hock Soon Director Multi-plAtform Game Innovation Centre (MAGIC) Nanyang Technological

More information

Information Coding / Computer Graphics, ISY, LiTH. Splines

Information Coding / Computer Graphics, ISY, LiTH. Splines 28(69) Splines Originally a drafting tool to create a smooth curve In computer graphics: a curve built from sections, each described by a 2nd or 3rd degree polynomial. Very common in non-real-time graphics,

More information

L10 Layered Depth Normal Images. Introduction Related Work Structured Point Representation Boolean Operations Conclusion

L10 Layered Depth Normal Images. Introduction Related Work Structured Point Representation Boolean Operations Conclusion L10 Layered Depth Normal Images Introduction Related Work Structured Point Representation Boolean Operations Conclusion 1 Introduction Purpose: using the computational power on GPU to speed up solid modeling

More information

Scan Conversion of Polygons. Dr. Scott Schaefer

Scan Conversion of Polygons. Dr. Scott Schaefer Scan Conversion of Polygons Dr. Scott Schaefer Drawing Rectangles Which pixels should be filled? /8 Drawing Rectangles Is this correct? /8 Drawing Rectangles What if two rectangles overlap? 4/8 Drawing

More information

Fast BVH Construction on GPUs

Fast BVH Construction on GPUs Fast BVH Construction on GPUs Published in EUROGRAGHICS, (2009) C. Lauterbach, M. Garland, S. Sengupta, D. Luebke, D. Manocha University of North Carolina at Chapel Hill NVIDIA University of California

More information

Spring 2011 Prof. Hyesoon Kim

Spring 2011 Prof. Hyesoon Kim Spring 2011 Prof. Hyesoon Kim Application Geometry Rasterizer CPU Each stage cane be also pipelined The slowest of the pipeline stage determines the rendering speed. Frames per second (fps) Executes on

More information

Higher Order Surfaces in OpenGL with NV_evaluators. Sébastien Dominé

Higher Order Surfaces in OpenGL with NV_evaluators. Sébastien Dominé Higher Order Surfaces in OpenGL with NV_evaluators Sébastien Dominé Why surfaces? Higher order primitives Animation Level of Detail Bandwidth Filtering 2 Overview What are the general evaluators defined

More information

GPGPU LAB. Case study: Finite-Difference Time- Domain Method on CUDA

GPGPU LAB. Case study: Finite-Difference Time- Domain Method on CUDA GPGPU LAB Case study: Finite-Difference Time- Domain Method on CUDA Ana Balevic IPVS 1 Finite-Difference Time-Domain Method Numerical computation of solutions to partial differential equations Explicit

More information

Duksu Kim. Professional Experience Senior researcher, KISTI High performance visualization

Duksu Kim. Professional Experience Senior researcher, KISTI High performance visualization Duksu Kim Assistant professor, KORATEHC Education Ph.D. Computer Science, KAIST Parallel Proximity Computation on Heterogeneous Computing Systems for Graphics Applications Professional Experience Senior

More information

Spring 2009 Prof. Hyesoon Kim

Spring 2009 Prof. Hyesoon Kim Spring 2009 Prof. Hyesoon Kim Application Geometry Rasterizer CPU Each stage cane be also pipelined The slowest of the pipeline stage determines the rendering speed. Frames per second (fps) Executes on

More information

Direct Rendering of Trimmed NURBS Surfaces

Direct Rendering of Trimmed NURBS Surfaces Direct Rendering of Trimmed NURBS Surfaces Hardware Graphics Pipeline 2/ 81 Hardware Graphics Pipeline GPU Video Memory CPU Vertex Processor Raster Unit Fragment Processor Render Target Screen Extended

More information

GPU & Reproductibility, predictability, Repetability, Determinism.

GPU & Reproductibility, predictability, Repetability, Determinism. GPU & Reproductibility, predictability, Repetability, Determinism. David Defour DALI/LIRMM, UPVD Phd Thesis proposal Subject Study the {numerical} predictability of GPUs Why?... fuzzy definition but useful

More information

Large-scale Deep Unsupervised Learning using Graphics Processors

Large-scale Deep Unsupervised Learning using Graphics Processors Large-scale Deep Unsupervised Learning using Graphics Processors Rajat Raina Anand Madhavan Andrew Y. Ng Stanford University Learning from unlabeled data Classify vs. car motorcycle Input space Unlabeled

More information

CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015

CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015 CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015 Announcements Project 2 due tomorrow at 2pm Grading window

More information

Ray Casting of Trimmed NURBS Surfaces on the GPU

Ray Casting of Trimmed NURBS Surfaces on the GPU Ray Casting of Trimmed NURBS Surfaces on the GPU Hans-Friedrich Pabst Jan P. Springer André Schollmeyer Robert Lenhardt Christian Lessig Bernd Fröhlich Bauhaus University Weimar Faculty of Media Virtual

More information

Real-Time Buffer Compression. Michael Doggett Department of Computer Science Lund university

Real-Time Buffer Compression. Michael Doggett Department of Computer Science Lund university Real-Time Buffer Compression Michael Doggett Department of Computer Science Lund university Project 3D graphics project Demo, Game Implement 3D graphics algorithm(s) C++/OpenGL(Lab2)/iOS/android/3D engine

More information

Accelerating Leukocyte Tracking Using CUDA: A Case Study in Leveraging Manycore Coprocessors

Accelerating Leukocyte Tracking Using CUDA: A Case Study in Leveraging Manycore Coprocessors Accelerating Leukocyte Tracking Using CUDA: A Case Study in Leveraging Manycore Coprocessors Michael Boyer, David Tarjan, Scott T. Acton, and Kevin Skadron University of Virginia IPDPS 2009 Outline Leukocyte

More information

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University CSE 591/392: GPU Programming Introduction Klaus Mueller Computer Science Department Stony Brook University First: A Big Word of Thanks! to the millions of computer game enthusiasts worldwide Who demand

More information

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller Entertainment Graphics: Virtual Realism for the Masses CSE 591: GPU Programming Introduction Computer games need to have: realistic appearance of characters and objects believable and creative shading,

More information

Design Workflow for AM: From CAD to Part

Design Workflow for AM: From CAD to Part Design Workflow for AM: From CAD to Part Sanjay Joshi Professor of Industrial and Manufacturing Engineering Penn State University Offered by: Center for Innovative Materials Processing through Direct Digital

More information

Simultaneous Solving of Linear Programming Problems in GPU

Simultaneous Solving of Linear Programming Problems in GPU Simultaneous Solving of Linear Programming Problems in GPU Amit Gurung* amitgurung@nitm.ac.in Binayak Das* binayak89cse@gmail.com Rajarshi Ray* raj.ray84@gmail.com * National Institute of Technology Meghalaya

More information

Using Virtual Texturing to Handle Massive Texture Data

Using Virtual Texturing to Handle Massive Texture Data Using Virtual Texturing to Handle Massive Texture Data San Jose Convention Center - Room A1 Tuesday, September, 21st, 14:00-14:50 J.M.P. Van Waveren id Software Evan Hart NVIDIA How we describe our environment?

More information

Tiny GPU Cluster for Big Spatial Data: A Preliminary Performance Evaluation

Tiny GPU Cluster for Big Spatial Data: A Preliminary Performance Evaluation Tiny GPU Cluster for Big Spatial Data: A Preliminary Performance Evaluation Jianting Zhang 1,2 Simin You 2, Le Gruenwald 3 1 Depart of Computer Science, CUNY City College (CCNY) 2 Department of Computer

More information

How to Work on Next Gen Effects Now: Bridging DX10 and DX9. Guennadi Riguer ATI Technologies

How to Work on Next Gen Effects Now: Bridging DX10 and DX9. Guennadi Riguer ATI Technologies How to Work on Next Gen Effects Now: Bridging DX10 and DX9 Guennadi Riguer ATI Technologies Overview New pipeline and new cool things Simulating some DX10 features in DX9 Experimental techniques Why This

More information

Real-Time Reyes: Programmable Pipelines and Research Challenges. Anjul Patney University of California, Davis

Real-Time Reyes: Programmable Pipelines and Research Challenges. Anjul Patney University of California, Davis Real-Time Reyes: Programmable Pipelines and Research Challenges Anjul Patney University of California, Davis Real-Time Reyes-Style Adaptive Surface Subdivision Anjul Patney and John D. Owens SIGGRAPH Asia

More information

Cross-Parameterization and Compatible Remeshing of 3D Models

Cross-Parameterization and Compatible Remeshing of 3D Models Cross-Parameterization and Compatible Remeshing of 3D Models Vladislav Kraevoy Alla Sheffer University of British Columbia Authors Vladislav Kraevoy Ph.D. Student Alla Sheffer Assistant Professor Outline

More information

Introduction to Multicore architecture. Tao Zhang Oct. 21, 2010

Introduction to Multicore architecture. Tao Zhang Oct. 21, 2010 Introduction to Multicore architecture Tao Zhang Oct. 21, 2010 Overview Part1: General multicore architecture Part2: GPU architecture Part1: General Multicore architecture Uniprocessor Performance (ECint)

More information

Adaptive Tessellation for Trimmed NURBS Surface

Adaptive Tessellation for Trimmed NURBS Surface Adaptive Tessellation for Trimmed NURBS Surface Ma YingLiang and Terry Hewitt 2 Manchester Visualization Centre, University of Manchester, Manchester, M3 9PL, U.K. may@cs.man.ac.uk 2 W.T.Hewitt@man.ac.uk

More information

: Mesh Processing. Chapter 8

: Mesh Processing. Chapter 8 600.657: Mesh Processing Chapter 8 Handling Mesh Degeneracies [Botsch et al., Polygon Mesh Processing] Class of Approaches Geometric: Preserve the mesh where it s good. Volumetric: Can guarantee no self-intersection.

More information

Optimizing Data Locality for Iterative Matrix Solvers on CUDA

Optimizing Data Locality for Iterative Matrix Solvers on CUDA Optimizing Data Locality for Iterative Matrix Solvers on CUDA Raymond Flagg, Jason Monk, Yifeng Zhu PhD., Bruce Segee PhD. Department of Electrical and Computer Engineering, University of Maine, Orono,

More information

NVIDIA Parallel Nsight. Jeff Kiel

NVIDIA Parallel Nsight. Jeff Kiel NVIDIA Parallel Nsight Jeff Kiel Agenda: NVIDIA Parallel Nsight Programmable GPU Development Presenting Parallel Nsight Demo Questions/Feedback Programmable GPU Development More programmability = more

More information

Analyze and Optimize Windows* Game Applications Using Intel INDE Graphics Performance Analyzers (GPA)

Analyze and Optimize Windows* Game Applications Using Intel INDE Graphics Performance Analyzers (GPA) Analyze and Optimize Windows* Game Applications Using Intel INDE Graphics Performance Analyzers (GPA) Intel INDE Graphics Performance Analyzers (GPA) are powerful, agile tools enabling game developers

More information

A Comparative Study on Exact Triangle Counting Algorithms on the GPU

A Comparative Study on Exact Triangle Counting Algorithms on the GPU A Comparative Study on Exact Triangle Counting Algorithms on the GPU Leyuan Wang, Yangzihao Wang, Carl Yang, John D. Owens University of California, Davis, CA, USA 31 st May 2016 L. Wang, Y. Wang, C. Yang,

More information

Fast and memory efficient viewdependent. rendering

Fast and memory efficient viewdependent. rendering Fast and memory efficient viewdependent trimmed NURBS rendering Michael Guthe,, Jan Meseth, Reinhard Klein Bonn University Computer Graphics Group Trimmed NURBS Screenshot from CATIA from Dassault Systemes

More information

Speed up a Machine-Learning-based Image Super-Resolution Algorithm on GPGPU

Speed up a Machine-Learning-based Image Super-Resolution Algorithm on GPGPU Speed up a Machine-Learning-based Image Super-Resolution Algorithm on GPGPU Ke Ma 1, and Yao Song 2 1 Department of Computer Sciences 2 Department of Electrical and Computer Engineering University of Wisconsin-Madison

More information

CS427 Multicore Architecture and Parallel Computing

CS427 Multicore Architecture and Parallel Computing CS427 Multicore Architecture and Parallel Computing Lecture 6 GPU Architecture Li Jiang 2014/10/9 1 GPU Scaling A quiet revolution and potential build-up Calculation: 936 GFLOPS vs. 102 GFLOPS Memory Bandwidth:

More information

Clipmap based Loading and Rendering of Large Terrain Texture

Clipmap based Loading and Rendering of Large Terrain Texture 2012 International Conference on Image, Vision and Computing (ICIVC 2012) IPCSIT vol. 50 (2012) (2012) IACSIT Press, Singapore DOI: 10.7763/IPCSIT.2012.V50.16 Clipmap based Loading and Rendering of Large

More information

Approximating Point Clouds by Generalized Bézier Surfaces

Approximating Point Clouds by Generalized Bézier Surfaces WAIT: Workshop on the Advances in Information Technology, Budapest, 2018 Approximating Point Clouds by Generalized Bézier Surfaces Péter Salvi 1, Tamás Várady 1, Kenjirō T. Miura 2 1 Budapest University

More information

Curves and Surfaces 2

Curves and Surfaces 2 Curves and Surfaces 2 Computer Graphics Lecture 17 Taku Komura Today More about Bezier and Bsplines de Casteljau s algorithm BSpline : General form de Boor s algorithm Knot insertion NURBS Subdivision

More information

Point Sample Rendering

Point Sample Rendering Point Sample Rendering Efficient Screen Space Approach for HW Accelerated Surfel Rendering VMV03, november 2003 Gaël GUENNEBAUD - Mathias PAULIN IRIT-CNRS-UPS TOULOUSE-FRANCE http://www.irit.fr/recherches/sirv/vis/surfel/index.html

More information

Spline Surfaces, Subdivision Surfaces

Spline Surfaces, Subdivision Surfaces CS-C3100 Computer Graphics Spline Surfaces, Subdivision Surfaces vectorportal.com Trivia Assignment 1 due this Sunday! Feedback on the starter code, difficulty, etc., much appreciated Put in your README

More information

Introduction to ANSYS FLUENT Meshing

Introduction to ANSYS FLUENT Meshing Workshop 04 CAD Import and Meshing from Conformal Faceting Input 14.5 Release Introduction to ANSYS FLUENT Meshing 2011 ANSYS, Inc. December 21, 2012 1 I Introduction Workshop Description: CAD files will

More information

On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators

On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators Karl Rupp, Barry Smith rupp@mcs.anl.gov Mathematics and Computer Science Division Argonne National Laboratory FEMTEC

More information

How to model a car body in T-Splines

How to model a car body in T-Splines How to model a car body in T-Splines My name is and I ll show you how to model complex cars like the Alfa Romeo 147 gta using the T-Splines Maya plugin and various techniques. This will be useful if you

More information

SIFT Descriptor Extraction on the GPU for Large-Scale Video Analysis. Hannes Fassold, Jakub Rosner

SIFT Descriptor Extraction on the GPU for Large-Scale Video Analysis. Hannes Fassold, Jakub Rosner SIFT Descriptor Extraction on the GPU for Large-Scale Video Analysis Hannes Fassold, Jakub Rosner 2014-03-26 2 Overview GPU-activities @ AVM research group SIFT descriptor extraction Algorithm GPU implementation

More information

Height field ambient occlusion using CUDA

Height field ambient occlusion using CUDA Height field ambient occlusion using CUDA 3.6.2009 Outline 1 2 3 4 Theory Kernel 5 Height fields Self occlusion Current methods Marching several directions from each fragment Sampling several times along

More information

GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS

GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS Agenda Forming a GPGPU WG 1 st meeting Future meetings Activities Forming a GPGPU WG To raise needs and enhance information sharing A platform for knowledge

More information

SIMPLY RHINO PNY NVIDIA Quadro Rhino v6 WIP Tests. Contents

SIMPLY RHINO PNY NVIDIA Quadro Rhino v6 WIP Tests. Contents SIMPLY RHINO PNY NVIDIA Quadro Rhino v6 WIP Tests Contents 01 Disclaimer 3 02 Introduction 3 03 Hardware and Settings 3 04 Testing Improvements from Rhino v5 to Rhino v6 4 05 Testing Outline scores for

More information

GPU ACCELERATED SELF-JOIN FOR THE DISTANCE SIMILARITY METRIC

GPU ACCELERATED SELF-JOIN FOR THE DISTANCE SIMILARITY METRIC GPU ACCELERATED SELF-JOIN FOR THE DISTANCE SIMILARITY METRIC MIKE GOWANLOCK NORTHERN ARIZONA UNIVERSITY SCHOOL OF INFORMATICS, COMPUTING & CYBER SYSTEMS BEN KARSIN UNIVERSITY OF HAWAII AT MANOA DEPARTMENT

More information

Current Trends in Computer Graphics Hardware

Current Trends in Computer Graphics Hardware Current Trends in Computer Graphics Hardware Dirk Reiners University of Louisiana Lafayette, LA Quick Introduction Assistant Professor in Computer Science at University of Louisiana, Lafayette (since 2006)

More information

Determinant Computation on the GPU using the Condensation AMMCS Method / 1

Determinant Computation on the GPU using the Condensation AMMCS Method / 1 Determinant Computation on the GPU using the Condensation Method Sardar Anisul Haque Marc Moreno Maza Ontario Research Centre for Computer Algebra University of Western Ontario, London, Ontario AMMCS 20,

More information

G 2 Interpolation for Polar Surfaces

G 2 Interpolation for Polar Surfaces 1 G 2 Interpolation for Polar Surfaces Jianzhong Wang 1, Fuhua Cheng 2,3 1 University of Kentucky, jwangf@uky.edu 2 University of Kentucky, cheng@cs.uky.edu 3 National Tsinhua University ABSTRACT In this

More information

Tessellations. Irena Swanson Reed College, Portland, Oregon. MathPath, Lewis & Clark College, Portland, Oregon, 24 July 2018

Tessellations. Irena Swanson Reed College, Portland, Oregon. MathPath, Lewis & Clark College, Portland, Oregon, 24 July 2018 Tessellations Irena Swanson Reed College, Portland, Oregon MathPath, Lewis & Clark College, Portland, Oregon, 24 July 2018 What is a tessellation? A tiling or a tessellation of the plane is a covering

More information

1 Introduction. 1.1 Raster-to-vector conversion

1 Introduction. 1.1 Raster-to-vector conversion 1 Introduction 1.1 Raster-to-vector conversion Vectorization (raster-to-vector conversion) consists of analyzing a raster image to convert its pixel representation to a vector representation The basic

More information

DIFFERENTIAL. Tomáš Oberhuber, Atsushi Suzuki, Jan Vacata, Vítězslav Žabka

DIFFERENTIAL. Tomáš Oberhuber, Atsushi Suzuki, Jan Vacata, Vítězslav Žabka USE OF FOR Tomáš Oberhuber, Atsushi Suzuki, Jan Vacata, Vítězslav Žabka Faculty of Nuclear Sciences and Physical Engineering Czech Technical University in Prague Mini workshop on advanced numerical methods

More information

An Evaluation of Unified Memory Technology on NVIDIA GPUs

An Evaluation of Unified Memory Technology on NVIDIA GPUs An Evaluation of Unified Memory Technology on NVIDIA GPUs Wenqiang Li 1, Guanghao Jin 2, Xuewen Cui 1, Simon See 1,3 Center for High Performance Computing, Shanghai Jiao Tong University, China 1 Tokyo

More information

Revolutionizing the Datacenter Join the Conversation #OpenPOWERSummit

Revolutionizing the Datacenter Join the Conversation #OpenPOWERSummit Redis Labs on POWER8 Server: The Promise of OpenPOWER Value Jeffrey L. Leeds, Ph.D. Vice President, Alliances & Channels Revolutionizing the Datacenter Join the Conversation #OpenPOWERSummit Who We Are

More information

Multi-View Stereo for Static and Dynamic Scenes

Multi-View Stereo for Static and Dynamic Scenes Multi-View Stereo for Static and Dynamic Scenes Wolfgang Burgard Jan 6, 2010 Main references Yasutaka Furukawa and Jean Ponce, Accurate, Dense and Robust Multi-View Stereopsis, 2007 C.L. Zitnick, S.B.

More information

Fast Tridiagonal Solvers on GPU

Fast Tridiagonal Solvers on GPU Fast Tridiagonal Solvers on GPU Yao Zhang John Owens UC Davis Jonathan Cohen NVIDIA GPU Technology Conference 2009 Outline Introduction Algorithms Design algorithms for GPU architecture Performance Bottleneck-based

More information

Abstract. Introduction. Kevin Todisco

Abstract. Introduction. Kevin Todisco - Kevin Todisco Figure 1: A large scale example of the simulation. The leftmost image shows the beginning of the test case, and shows how the fluid refracts the environment around it. The middle image

More information

Streaming Massive Environments From Zero to 200MPH

Streaming Massive Environments From Zero to 200MPH FORZA MOTORSPORT From Zero to 200MPH Chris Tector (Software Architect Turn 10 Studios) Turn 10 Internal studio at Microsoft Game Studios - we make Forza Motorsport Around 70 full time staff 2 Why am I

More information

High performance 2D Discrete Fourier Transform on Heterogeneous Platforms. Shrenik Lad, IIIT Hyderabad Advisor : Dr. Kishore Kothapalli

High performance 2D Discrete Fourier Transform on Heterogeneous Platforms. Shrenik Lad, IIIT Hyderabad Advisor : Dr. Kishore Kothapalli High performance 2D Discrete Fourier Transform on Heterogeneous Platforms Shrenik Lad, IIIT Hyderabad Advisor : Dr. Kishore Kothapalli Motivation Fourier Transform widely used in Physics, Astronomy, Engineering

More information

polygonal approximations

polygonal approximations An improved method for obtaining optimal polygonal approximations A. Carmona-Poyato; F. J. Madrid-Cuevas; N. L. Fernández-García; R. Medina-Carnicer Abstract In computer vision, polygonal approximations

More information

M4G - A Surface Representation for Adaptive CPU-GPU Computation

M4G - A Surface Representation for Adaptive CPU-GPU Computation M4G - A Surface Representation for Adaptive CPU-GPU Computation Vision and Graphics Lab Institute of Pure and Applied Mathematics Trimester Program on Computational Manifolds and Applications November

More information

N-Body Simulation using CUDA. CSE 633 Fall 2010 Project by Suraj Alungal Balchand Advisor: Dr. Russ Miller State University of New York at Buffalo

N-Body Simulation using CUDA. CSE 633 Fall 2010 Project by Suraj Alungal Balchand Advisor: Dr. Russ Miller State University of New York at Buffalo N-Body Simulation using CUDA CSE 633 Fall 2010 Project by Suraj Alungal Balchand Advisor: Dr. Russ Miller State University of New York at Buffalo Project plan Develop a program to simulate gravitational

More information

Accelerated Load Balancing of Unstructured Meshes

Accelerated Load Balancing of Unstructured Meshes Accelerated Load Balancing of Unstructured Meshes Gerrett Diamond, Lucas Davis, and Cameron W. Smith Abstract Unstructured mesh applications running on large, parallel, distributed memory systems require

More information

Sackler Course BMSC-GA 4448 High Performance Computing in Biomedical Informatics. Class 2: Friday February 14 th, :30PM 5:30PM AGENDA

Sackler Course BMSC-GA 4448 High Performance Computing in Biomedical Informatics. Class 2: Friday February 14 th, :30PM 5:30PM AGENDA Sackler Course BMSC-GA 4448 High Performance Computing in Biomedical Informatics Class 2: Friday February 14 th, 2014 2:30PM 5:30PM AGENDA Recap 1 st class & Homework discussion. Fundamentals of Parallel

More information

Appearance-Preserving Simplification

Appearance-Preserving Simplification Appearance-Preserving Simplification 7809 tris 975 tris 488 tris 1951 tris 3905 tris model courtesy of Stanford and Caltech Simplification Create representation for faster rendering Measure representation

More information

Large scale Imaging on Current Many- Core Platforms

Large scale Imaging on Current Many- Core Platforms Large scale Imaging on Current Many- Core Platforms SIAM Conf. on Imaging Science 2012 May 20, 2012 Dr. Harald Köstler Chair for System Simulation Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen,

More information

Introduction to Computer Graphics

Introduction to Computer Graphics Introduction to Computer Graphics 2016 Spring National Cheng Kung University Instructors: Min-Chun Hu 胡敏君 Shih-Chin Weng 翁士欽 ( 西基電腦動畫 ) Data Representation Curves and Surfaces Limitations of Polygons Inherently

More information

Debunking the 100X GPU vs CPU Myth: An Evaluation of Throughput Computing on CPU and GPU

Debunking the 100X GPU vs CPU Myth: An Evaluation of Throughput Computing on CPU and GPU Debunking the 100X GPU vs CPU Myth: An Evaluation of Throughput Computing on CPU and GPU The myth 10x-1000x speed up on GPU vs CPU Papers supporting the myth: Microsoft: N. K. Govindaraju, B. Lloyd, Y.

More information

Scalable Multi Agent Simulation on the GPU. Avi Bleiweiss NVIDIA Corporation San Jose, 2009

Scalable Multi Agent Simulation on the GPU. Avi Bleiweiss NVIDIA Corporation San Jose, 2009 Scalable Multi Agent Simulation on the GPU Avi Bleiweiss NVIDIA Corporation San Jose, 2009 Reasoning Explicit State machine, serial Implicit Compute intensive Fits SIMT well Collision avoidance Motivation

More information

3D Modeling techniques

3D Modeling techniques 3D Modeling techniques 0. Reconstruction From real data (not covered) 1. Procedural modeling Automatic modeling of a self-similar objects or scenes 2. Interactive modeling Provide tools to computer artists

More information

LS-ACTS 1.0 USER MANUAL

LS-ACTS 1.0 USER MANUAL LS-ACTS 1.0 USER MANUAL VISION GROUP, STATE KEY LAB OF CAD&CG, ZHEJIANG UNIVERSITY HTTP://WWW.ZJUCVG.NET TABLE OF CONTENTS 1 Introduction... 1-3 1.1 Product Specification...1-3 1.2 Feature Functionalities...1-3

More information

X10 specific Optimization of CPU GPU Data transfer with Pinned Memory Management

X10 specific Optimization of CPU GPU Data transfer with Pinned Memory Management X10 specific Optimization of CPU GPU Data transfer with Pinned Memory Management Hideyuki Shamoto, Tatsuhiro Chiba, Mikio Takeuchi Tokyo Institute of Technology IBM Research Tokyo Programming for large

More information

Improving performances of an embedded RDBMS with a hybrid CPU/GPU processing engine

Improving performances of an embedded RDBMS with a hybrid CPU/GPU processing engine Improving performances of an embedded RDBMS with a hybrid CPU/GPU processing engine Samuel Cremer 1,2, Michel Bagein 1, Saïd Mahmoudi 1, Pierre Manneback 1 1 UMONS, University of Mons Computer Science

More information

Tata Elxsi benchmark report: Unreal Datasmith

Tata Elxsi benchmark report: Unreal Datasmith This report and its findings were produced by Tata Elxsi. The report was sponsored by Unity Technologies. Tata Elxsi benchmark report: comparing PiXYZ Studio and Unreal Datasmith A Tata Elxsi perspective

More information

HETEROGENEOUS COMPUTE INFRASTRUCTURE FOR SINGAPORE

HETEROGENEOUS COMPUTE INFRASTRUCTURE FOR SINGAPORE HETEROGENEOUS COMPUTE INFRASTRUCTURE FOR SINGAPORE PHILIP HEAH ASSISTANT CHIEF EXECUTIVE TECHNOLOGY & INFRASTRUCTURE GROUP LAUNCH OF SERVICES AND DIGITAL ECONOMY (SDE) TECHNOLOGY ROADMAP (NOV 2018) Source

More information

Programmable GPUs. Real Time Graphics 11/13/2013. Nalu 2004 (NVIDIA Corporation) GeForce 6. Virtua Fighter 1995 (SEGA Corporation) NV1

Programmable GPUs. Real Time Graphics 11/13/2013. Nalu 2004 (NVIDIA Corporation) GeForce 6. Virtua Fighter 1995 (SEGA Corporation) NV1 Programmable GPUs Real Time Graphics Virtua Fighter 1995 (SEGA Corporation) NV1 Dead or Alive 3 2001 (Tecmo Corporation) Xbox (NV2A) Nalu 2004 (NVIDIA Corporation) GeForce 6 Human Head 2006 (NVIDIA Corporation)

More information

CUDA Optimizations WS Intelligent Robotics Seminar. Universität Hamburg WS Intelligent Robotics Seminar Praveen Kulkarni

CUDA Optimizations WS Intelligent Robotics Seminar. Universität Hamburg WS Intelligent Robotics Seminar Praveen Kulkarni CUDA Optimizations WS 2014-15 Intelligent Robotics Seminar 1 Table of content 1 Background information 2 Optimizations 3 Summary 2 Table of content 1 Background information 2 Optimizations 3 Summary 3

More information

Accelerating String Matching Using Multi-threaded Algorithm

Accelerating String Matching Using Multi-threaded Algorithm Accelerating String Matching Using Multi-threaded Algorithm on GPU Cheng-Hung Lin*, Sheng-Yu Tsai**, Chen-Hsiung Liu**, Shih-Chieh Chang**, Jyuo-Min Shyu** *National Taiwan Normal University, Taiwan **National

More information

Real Time Rendering Assignment 1 - Report 20/08/10

Real Time Rendering Assignment 1 - Report 20/08/10 Real Time Rendering Assignment 1 - Report 20/08/10 Tom Harris Alon Gal s3236050 s3216299 Introduction The aim of this assignment was to investigate the graphics performance of procedurally generated objects

More information

The Graphics Pipeline

The Graphics Pipeline The Graphics Pipeline Ray Tracing: Why Slow? Basic ray tracing: 1 ray/pixel Ray Tracing: Why Slow? Basic ray tracing: 1 ray/pixel But you really want shadows, reflections, global illumination, antialiasing

More information

Applications. Oversampled 3D scan data. ~150k triangles ~80k triangles

Applications. Oversampled 3D scan data. ~150k triangles ~80k triangles Mesh Simplification Applications Oversampled 3D scan data ~150k triangles ~80k triangles 2 Applications Overtessellation: E.g. iso-surface extraction 3 Applications Multi-resolution hierarchies for efficient

More information

Improving Memory Space Efficiency of Kd-tree for Real-time Ray Tracing Byeongjun Choi, Byungjoon Chang, Insung Ihm

Improving Memory Space Efficiency of Kd-tree for Real-time Ray Tracing Byeongjun Choi, Byungjoon Chang, Insung Ihm Improving Memory Space Efficiency of Kd-tree for Real-time Ray Tracing Byeongjun Choi, Byungjoon Chang, Insung Ihm Department of Computer Science and Engineering Sogang University, Korea Improving Memory

More information

A GPU-based Approximate SVD Algorithm Blake Foster, Sridhar Mahadevan, Rui Wang

A GPU-based Approximate SVD Algorithm Blake Foster, Sridhar Mahadevan, Rui Wang A GPU-based Approximate SVD Algorithm Blake Foster, Sridhar Mahadevan, Rui Wang University of Massachusetts Amherst Introduction Singular Value Decomposition (SVD) A: m n matrix (m n) U, V: orthogonal

More information

OpenACC/CUDA/OpenMP... 1 Languages and Libraries... 3 Multi-GPU support... 4 How OpenACC Works... 4

OpenACC/CUDA/OpenMP... 1 Languages and Libraries... 3 Multi-GPU support... 4 How OpenACC Works... 4 OpenACC Course Class #1 Q&A Contents OpenACC/CUDA/OpenMP... 1 Languages and Libraries... 3 Multi-GPU support... 4 How OpenACC Works... 4 OpenACC/CUDA/OpenMP Q: Is OpenACC an NVIDIA standard or is it accepted

More information

CS 179: GPU Programming

CS 179: GPU Programming CS 179: GPU Programming Introduction Lecture originally written by Luke Durant, Tamas Szalay, Russell McClellan What We Will Cover Programming GPUs, of course: OpenGL Shader Language (GLSL) Compute Unified

More information

(software agnostic) Computational Considerations

(software agnostic) Computational Considerations (software agnostic) Computational Considerations The Issues CPU GPU Emerging - FPGA, Phi, Nervana Storage Networking CPU 2 Threads core core Processor/Chip Processor/Chip Computer CPU Threads vs. Cores

More information

B-spline Curves. Smoother than other curve forms

B-spline Curves. Smoother than other curve forms Curves and Surfaces B-spline Curves These curves are approximating rather than interpolating curves. The curves come close to, but may not actually pass through, the control points. Usually used as multiple,

More information

CS452/552; EE465/505. Clipping & Scan Conversion

CS452/552; EE465/505. Clipping & Scan Conversion CS452/552; EE465/505 Clipping & Scan Conversion 3-31 15 Outline! From Geometry to Pixels: Overview Clipping (continued) Scan conversion Read: Angel, Chapter 8, 8.1-8.9 Project#1 due: this week Lab4 due:

More information

Skeleton-based Template Retrieval for Virtual Maize Modeling

Skeleton-based Template Retrieval for Virtual Maize Modeling Journal of Computational Information Systems6:5(2010) 1577-1582 Available at http://www.jofcis.com Skeleton-based Template Retrieval for Virtual Maize Modeling Boxiang XIAO 1,2, Chunjiang ZHAO 1,2,, Xinyu

More information

Dynamic Sparse Matrix Allocation on GPUs. James King

Dynamic Sparse Matrix Allocation on GPUs. James King Dynamic Sparse Matrix Allocation on GPUs James King Graph Applications Dynamic updates to graphs Adding edges add entries to sparse matrix representation Motivation Graph operations (adding edges) (e.g.

More information

newfasant PO Multiple effects analysis

newfasant PO Multiple effects analysis newfasant PO Multiple effects analysis Benchmark: PO Multiple effects analysis Software Version: 6.2.7 Date: 21 Nov 16:10 Index 1. BENCHMARK DESCRIPTION AND OBJECTIVES 1.1. GEOMETRY CREATION 2. SET-UP

More information

Graphics Architectures and OpenCL. Michael Doggett Department of Computer Science Lund university

Graphics Architectures and OpenCL. Michael Doggett Department of Computer Science Lund university Graphics Architectures and OpenCL Michael Doggett Department of Computer Science Lund university Overview Parallelism Radeon 5870 Tiled Graphics Architectures Important when Memory and Bandwidth limited

More information

High-Performance Graph Primitives on the GPU: Design and Implementation of Gunrock

High-Performance Graph Primitives on the GPU: Design and Implementation of Gunrock High-Performance Graph Primitives on the GPU: Design and Implementation of Gunrock Yangzihao Wang University of California, Davis yzhwang@ucdavis.edu March 24, 2014 Yangzihao Wang (yzhwang@ucdavis.edu)

More information

Applying Tessellation to Clipmap Terrain Rendering

Applying Tessellation to Clipmap Terrain Rendering Applying Tessellation to Clipmap Terrain Rendering COSC460 Research Project Report Corey Barnard 55066487 October 17, 2014 Department of Computer Science & Software Engineering University of Canterbury

More information

Parallelization of Shortest Path Graph Kernels on Multi-Core CPUs and GPU

Parallelization of Shortest Path Graph Kernels on Multi-Core CPUs and GPU Parallelization of Shortest Path Graph Kernels on Multi-Core CPUs and GPU Lifan Xu Wei Wang Marco A. Alvarez John Cavazos Dongping Zhang Department of Computer and Information Science University of Delaware

More information

Computergrafik. Matthias Zwicker Universität Bern Herbst 2016

Computergrafik. Matthias Zwicker Universität Bern Herbst 2016 Computergrafik Matthias Zwicker Universität Bern Herbst 2016 Today Curves NURBS Surfaces Parametric surfaces Bilinear patch Bicubic Bézier patch Advanced surface modeling 2 Piecewise Bézier curves Each

More information

Craig Peeper Software Architect Windows Graphics & Gaming Technologies Microsoft Corporation

Craig Peeper Software Architect Windows Graphics & Gaming Technologies Microsoft Corporation Gaming Technologies Craig Peeper Software Architect Windows Graphics & Gaming Technologies Microsoft Corporation Overview Games Yesterday & Today Game Components PC Platform & WGF 2.0 Game Trends Big Challenges

More information