Adaptive NURBS Tessellation with CUDA
|
|
- Milo Ray
- 5 years ago
- Views:
Transcription
1 Adaptive NURBS with CUDA Presenter: Dr. Qiu Jie Multi-plAtform Game Innovation Centre (MAGIC), Nanyang Technological University (NTU), Singapore
2 About MAGIC Established in Nanyang Technological University (NTU), Singapore on 1 Nov 2012 Cross-disciplinary research with 26 professors, 33 researchers, 11 PhD To translate scientific ideas into technological products and services To increase the capability of Future Studios in media & entertainment Launched Future Studios Research Lab (FSR Lab) on 21 Jan
3 Adaptive NURBS Convert NURBS to a set of polygons Adaptive tessellation NURBS surface 3 According to Curvature View (camera distance to object) Non-adaptive tessellation Adaptive tessellation
4 Tessellating NURBS surface 4
5 Adaptive NURBS How to be adaptive? Divide the surface into patches Choose accurate, effective criterion to tessellate each patch How to handle the connection of neighbor parts? Extract neighbour info of each patch to check boundary consistency Stitch gaps if not consistent How to speed up? Implementation on GPU 5
6 Proposed Framework To be adaptive Handle boundaries Bézier Conversion Factors Calculation Stitch Gap 6 GPU implementation
7 Bézier Conversion Bézier Conversion Factors Calculation convert NURBS surface to a set of Bézier patches Stitch Gap 7 More accurate for tessellation factor calculation
8 Factors Calculation Bézier Conversion Factors Calculation Stitch Gap n m 8
9 Factors Calculation One traditional method 2 2 i i ε tess max( ( w P ) + ( r ) w ) i i P i : control points w i : weights of control points Related to the origin of the coordinate system Heuristic method: Center of bounding box as origin r 9 r
10 Factors Calculation Our method Using a new objective function to find origin O n n wp i i 2 ( Pj O) + wi ( O) 2 O w min( ) Model j= 0 i= 0 i Old method (ɛ =0.1) Our method (ɛ =0.1) 10 # of vertices
11 Bézier Conversion Factors Calculation Stitch Gap (s,t) Sst (,) = n m i= 0 j= 0 n m i= 0 j= 0 w P B() sb() t ij, ij, i j w B() sb() t ij, i j 11
12 Stitch Gap Bézier Conversion Factors Calculation Stitch Gap 12 Problem: inconsistent tessellation factors between patches gap
13 Stitch Gap 1 side 4 sides Re-design boundary connections 13
14 Implementation on GPU level of parallelism Thread assignment (16,16) (16,16) (16,16) (16,16) (16,16) (16,16) 14
15 Implementation on GPU data management Bézier Conversion Factors Calculation Stitch Gap Minimize data transfer 15
16 Implementation on GPU NURBS surface to Bézier patches Compute Bézier control points 16
17 Implementation on GPU Copy control points needed by each block to shared memory Avoid duplicate memory read for global memory 17 4 X 259 = 1036 points for a block 19 X 19 = 361 points for a block
18 Implementation on GPU Bézier Conversion Factors Calculation Stitch Gap Store tessellation factors of all patches in order to extract neighbor info 18
19 Implementation on GPU Calculate tessellation factors Calculate tessellation factors (2 directions) Store them in global memory n Patch ID 0 1 Direction 1 Direction 2 h 0 v 0 h 1 v 1 2 h 2 v 2 m 19 k h k v k
20 Implementation on GPU Bézier Conversion Factors Calculation Stitch Gap Pre-compute blending functions 20
21 Implementation on GPU using regular pattern Fast evaluation Each row has the same B i (s) Each column has the same (t) ij, ij, i j B j i= 0 j= 0 Sst (,) = n m n m i= 0 j= 0 w P B() sb() t w B() sb() t ij, i j 21
22 Implementation on GPU Bézier Conversion Factors Calculation Stitch Gap 22 Parallel scan to determine vertex/index offset for each patch
23 Implementation on GPU Vertex offset Determine # of vertices for each patch based on tessellation factors Determine vertex offset for each patch Patch ID # vertices vertex offset 0 (#v) (#v) 1 (#v) 2 Prefix sum (#v) 0 (#v) 0 +(#v) 1 23 k (#v) k SUM((#V) 0, (#V) k-1 )
24 Implementation on GPU Transition region and index offset Determine # of indices on each patch (include transition regions) Determine index offset for each patch Patch ID # indices index offset 0 (#id) (#id) 1 (#id) 2 Prefix sum (#id) 0 (#id) 0 +(#id) 1 24 k (#id) k SUM((#id) 0, (#id) k-1 )
25 Implementation on GPU Bézier Conversion Factors Calculation Stitch Gap Minimize data transfer Store patch info for easy neighbour info extraction Regular tessellation pattern Parallel scan 25
26 Video Demonstration 26
27 Results 27 Head Old man face Stegosaurus Patch No Triangle No
28 Results CPU (intel i GHz) vs GPU (GTX CUDA cores 160GB/s bandwidth) 28 Run time / ms Head Old man face Stegosaurus CPU GPU
29 Results Car model Error tolerance surfaces 3,081,930 triangles 10.4 ms 29
30 Results Submarine model Error tolerance surfaces 2,235,934 triangles 9.16 ms 30
31 Acknowledgement This research is supported by Multi-plAtform Game Innovation Centre (MAGIC), funded by the Singapore National Research Foundation under its IDM Futures Funding Initiative and administered by the Interactive & Digital Media Programme Office, Media Development Authority of Singapore, and the National Science Foundation of China (Grant no , ). 31
32 Thank you!
Future Studios Research Lab
GPU TECHNOLOGY WORKSHOP SOUTH EAST ASIA 2014 Future Studios Research Lab The Boy and His Robot Film Case Study Prof SEAH Hock Soon Director Multi-plAtform Game Innovation Centre (MAGIC) Nanyang Technological
More informationInformation Coding / Computer Graphics, ISY, LiTH. Splines
28(69) Splines Originally a drafting tool to create a smooth curve In computer graphics: a curve built from sections, each described by a 2nd or 3rd degree polynomial. Very common in non-real-time graphics,
More informationL10 Layered Depth Normal Images. Introduction Related Work Structured Point Representation Boolean Operations Conclusion
L10 Layered Depth Normal Images Introduction Related Work Structured Point Representation Boolean Operations Conclusion 1 Introduction Purpose: using the computational power on GPU to speed up solid modeling
More informationScan Conversion of Polygons. Dr. Scott Schaefer
Scan Conversion of Polygons Dr. Scott Schaefer Drawing Rectangles Which pixels should be filled? /8 Drawing Rectangles Is this correct? /8 Drawing Rectangles What if two rectangles overlap? 4/8 Drawing
More informationFast BVH Construction on GPUs
Fast BVH Construction on GPUs Published in EUROGRAGHICS, (2009) C. Lauterbach, M. Garland, S. Sengupta, D. Luebke, D. Manocha University of North Carolina at Chapel Hill NVIDIA University of California
More informationSpring 2011 Prof. Hyesoon Kim
Spring 2011 Prof. Hyesoon Kim Application Geometry Rasterizer CPU Each stage cane be also pipelined The slowest of the pipeline stage determines the rendering speed. Frames per second (fps) Executes on
More informationHigher Order Surfaces in OpenGL with NV_evaluators. Sébastien Dominé
Higher Order Surfaces in OpenGL with NV_evaluators Sébastien Dominé Why surfaces? Higher order primitives Animation Level of Detail Bandwidth Filtering 2 Overview What are the general evaluators defined
More informationGPGPU LAB. Case study: Finite-Difference Time- Domain Method on CUDA
GPGPU LAB Case study: Finite-Difference Time- Domain Method on CUDA Ana Balevic IPVS 1 Finite-Difference Time-Domain Method Numerical computation of solutions to partial differential equations Explicit
More informationDuksu Kim. Professional Experience Senior researcher, KISTI High performance visualization
Duksu Kim Assistant professor, KORATEHC Education Ph.D. Computer Science, KAIST Parallel Proximity Computation on Heterogeneous Computing Systems for Graphics Applications Professional Experience Senior
More informationSpring 2009 Prof. Hyesoon Kim
Spring 2009 Prof. Hyesoon Kim Application Geometry Rasterizer CPU Each stage cane be also pipelined The slowest of the pipeline stage determines the rendering speed. Frames per second (fps) Executes on
More informationDirect Rendering of Trimmed NURBS Surfaces
Direct Rendering of Trimmed NURBS Surfaces Hardware Graphics Pipeline 2/ 81 Hardware Graphics Pipeline GPU Video Memory CPU Vertex Processor Raster Unit Fragment Processor Render Target Screen Extended
More informationGPU & Reproductibility, predictability, Repetability, Determinism.
GPU & Reproductibility, predictability, Repetability, Determinism. David Defour DALI/LIRMM, UPVD Phd Thesis proposal Subject Study the {numerical} predictability of GPUs Why?... fuzzy definition but useful
More informationLarge-scale Deep Unsupervised Learning using Graphics Processors
Large-scale Deep Unsupervised Learning using Graphics Processors Rajat Raina Anand Madhavan Andrew Y. Ng Stanford University Learning from unlabeled data Classify vs. car motorcycle Input space Unlabeled
More informationCSE 167: Introduction to Computer Graphics Lecture #5: Rasterization. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015
CSE 167: Introduction to Computer Graphics Lecture #5: Rasterization Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2015 Announcements Project 2 due tomorrow at 2pm Grading window
More informationRay Casting of Trimmed NURBS Surfaces on the GPU
Ray Casting of Trimmed NURBS Surfaces on the GPU Hans-Friedrich Pabst Jan P. Springer André Schollmeyer Robert Lenhardt Christian Lessig Bernd Fröhlich Bauhaus University Weimar Faculty of Media Virtual
More informationReal-Time Buffer Compression. Michael Doggett Department of Computer Science Lund university
Real-Time Buffer Compression Michael Doggett Department of Computer Science Lund university Project 3D graphics project Demo, Game Implement 3D graphics algorithm(s) C++/OpenGL(Lab2)/iOS/android/3D engine
More informationAccelerating Leukocyte Tracking Using CUDA: A Case Study in Leveraging Manycore Coprocessors
Accelerating Leukocyte Tracking Using CUDA: A Case Study in Leveraging Manycore Coprocessors Michael Boyer, David Tarjan, Scott T. Acton, and Kevin Skadron University of Virginia IPDPS 2009 Outline Leukocyte
More informationCSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University
CSE 591/392: GPU Programming Introduction Klaus Mueller Computer Science Department Stony Brook University First: A Big Word of Thanks! to the millions of computer game enthusiasts worldwide Who demand
More informationCSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller
Entertainment Graphics: Virtual Realism for the Masses CSE 591: GPU Programming Introduction Computer games need to have: realistic appearance of characters and objects believable and creative shading,
More informationDesign Workflow for AM: From CAD to Part
Design Workflow for AM: From CAD to Part Sanjay Joshi Professor of Industrial and Manufacturing Engineering Penn State University Offered by: Center for Innovative Materials Processing through Direct Digital
More informationSimultaneous Solving of Linear Programming Problems in GPU
Simultaneous Solving of Linear Programming Problems in GPU Amit Gurung* amitgurung@nitm.ac.in Binayak Das* binayak89cse@gmail.com Rajarshi Ray* raj.ray84@gmail.com * National Institute of Technology Meghalaya
More informationUsing Virtual Texturing to Handle Massive Texture Data
Using Virtual Texturing to Handle Massive Texture Data San Jose Convention Center - Room A1 Tuesday, September, 21st, 14:00-14:50 J.M.P. Van Waveren id Software Evan Hart NVIDIA How we describe our environment?
More informationTiny GPU Cluster for Big Spatial Data: A Preliminary Performance Evaluation
Tiny GPU Cluster for Big Spatial Data: A Preliminary Performance Evaluation Jianting Zhang 1,2 Simin You 2, Le Gruenwald 3 1 Depart of Computer Science, CUNY City College (CCNY) 2 Department of Computer
More informationHow to Work on Next Gen Effects Now: Bridging DX10 and DX9. Guennadi Riguer ATI Technologies
How to Work on Next Gen Effects Now: Bridging DX10 and DX9 Guennadi Riguer ATI Technologies Overview New pipeline and new cool things Simulating some DX10 features in DX9 Experimental techniques Why This
More informationReal-Time Reyes: Programmable Pipelines and Research Challenges. Anjul Patney University of California, Davis
Real-Time Reyes: Programmable Pipelines and Research Challenges Anjul Patney University of California, Davis Real-Time Reyes-Style Adaptive Surface Subdivision Anjul Patney and John D. Owens SIGGRAPH Asia
More informationCross-Parameterization and Compatible Remeshing of 3D Models
Cross-Parameterization and Compatible Remeshing of 3D Models Vladislav Kraevoy Alla Sheffer University of British Columbia Authors Vladislav Kraevoy Ph.D. Student Alla Sheffer Assistant Professor Outline
More informationIntroduction to Multicore architecture. Tao Zhang Oct. 21, 2010
Introduction to Multicore architecture Tao Zhang Oct. 21, 2010 Overview Part1: General multicore architecture Part2: GPU architecture Part1: General Multicore architecture Uniprocessor Performance (ECint)
More informationAdaptive Tessellation for Trimmed NURBS Surface
Adaptive Tessellation for Trimmed NURBS Surface Ma YingLiang and Terry Hewitt 2 Manchester Visualization Centre, University of Manchester, Manchester, M3 9PL, U.K. may@cs.man.ac.uk 2 W.T.Hewitt@man.ac.uk
More information: Mesh Processing. Chapter 8
600.657: Mesh Processing Chapter 8 Handling Mesh Degeneracies [Botsch et al., Polygon Mesh Processing] Class of Approaches Geometric: Preserve the mesh where it s good. Volumetric: Can guarantee no self-intersection.
More informationOptimizing Data Locality for Iterative Matrix Solvers on CUDA
Optimizing Data Locality for Iterative Matrix Solvers on CUDA Raymond Flagg, Jason Monk, Yifeng Zhu PhD., Bruce Segee PhD. Department of Electrical and Computer Engineering, University of Maine, Orono,
More informationNVIDIA Parallel Nsight. Jeff Kiel
NVIDIA Parallel Nsight Jeff Kiel Agenda: NVIDIA Parallel Nsight Programmable GPU Development Presenting Parallel Nsight Demo Questions/Feedback Programmable GPU Development More programmability = more
More informationAnalyze and Optimize Windows* Game Applications Using Intel INDE Graphics Performance Analyzers (GPA)
Analyze and Optimize Windows* Game Applications Using Intel INDE Graphics Performance Analyzers (GPA) Intel INDE Graphics Performance Analyzers (GPA) are powerful, agile tools enabling game developers
More informationA Comparative Study on Exact Triangle Counting Algorithms on the GPU
A Comparative Study on Exact Triangle Counting Algorithms on the GPU Leyuan Wang, Yangzihao Wang, Carl Yang, John D. Owens University of California, Davis, CA, USA 31 st May 2016 L. Wang, Y. Wang, C. Yang,
More informationFast and memory efficient viewdependent. rendering
Fast and memory efficient viewdependent trimmed NURBS rendering Michael Guthe,, Jan Meseth, Reinhard Klein Bonn University Computer Graphics Group Trimmed NURBS Screenshot from CATIA from Dassault Systemes
More informationSpeed up a Machine-Learning-based Image Super-Resolution Algorithm on GPGPU
Speed up a Machine-Learning-based Image Super-Resolution Algorithm on GPGPU Ke Ma 1, and Yao Song 2 1 Department of Computer Sciences 2 Department of Electrical and Computer Engineering University of Wisconsin-Madison
More informationCS427 Multicore Architecture and Parallel Computing
CS427 Multicore Architecture and Parallel Computing Lecture 6 GPU Architecture Li Jiang 2014/10/9 1 GPU Scaling A quiet revolution and potential build-up Calculation: 936 GFLOPS vs. 102 GFLOPS Memory Bandwidth:
More informationClipmap based Loading and Rendering of Large Terrain Texture
2012 International Conference on Image, Vision and Computing (ICIVC 2012) IPCSIT vol. 50 (2012) (2012) IACSIT Press, Singapore DOI: 10.7763/IPCSIT.2012.V50.16 Clipmap based Loading and Rendering of Large
More informationApproximating Point Clouds by Generalized Bézier Surfaces
WAIT: Workshop on the Advances in Information Technology, Budapest, 2018 Approximating Point Clouds by Generalized Bézier Surfaces Péter Salvi 1, Tamás Várady 1, Kenjirō T. Miura 2 1 Budapest University
More informationCurves and Surfaces 2
Curves and Surfaces 2 Computer Graphics Lecture 17 Taku Komura Today More about Bezier and Bsplines de Casteljau s algorithm BSpline : General form de Boor s algorithm Knot insertion NURBS Subdivision
More informationPoint Sample Rendering
Point Sample Rendering Efficient Screen Space Approach for HW Accelerated Surfel Rendering VMV03, november 2003 Gaël GUENNEBAUD - Mathias PAULIN IRIT-CNRS-UPS TOULOUSE-FRANCE http://www.irit.fr/recherches/sirv/vis/surfel/index.html
More informationSpline Surfaces, Subdivision Surfaces
CS-C3100 Computer Graphics Spline Surfaces, Subdivision Surfaces vectorportal.com Trivia Assignment 1 due this Sunday! Feedback on the starter code, difficulty, etc., much appreciated Put in your README
More informationIntroduction to ANSYS FLUENT Meshing
Workshop 04 CAD Import and Meshing from Conformal Faceting Input 14.5 Release Introduction to ANSYS FLUENT Meshing 2011 ANSYS, Inc. December 21, 2012 1 I Introduction Workshop Description: CAD files will
More informationOn Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators
On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators Karl Rupp, Barry Smith rupp@mcs.anl.gov Mathematics and Computer Science Division Argonne National Laboratory FEMTEC
More informationHow to model a car body in T-Splines
How to model a car body in T-Splines My name is and I ll show you how to model complex cars like the Alfa Romeo 147 gta using the T-Splines Maya plugin and various techniques. This will be useful if you
More informationSIFT Descriptor Extraction on the GPU for Large-Scale Video Analysis. Hannes Fassold, Jakub Rosner
SIFT Descriptor Extraction on the GPU for Large-Scale Video Analysis Hannes Fassold, Jakub Rosner 2014-03-26 2 Overview GPU-activities @ AVM research group SIFT descriptor extraction Algorithm GPU implementation
More informationHeight field ambient occlusion using CUDA
Height field ambient occlusion using CUDA 3.6.2009 Outline 1 2 3 4 Theory Kernel 5 Height fields Self occlusion Current methods Marching several directions from each fragment Sampling several times along
More informationGPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS
GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS Agenda Forming a GPGPU WG 1 st meeting Future meetings Activities Forming a GPGPU WG To raise needs and enhance information sharing A platform for knowledge
More informationSIMPLY RHINO PNY NVIDIA Quadro Rhino v6 WIP Tests. Contents
SIMPLY RHINO PNY NVIDIA Quadro Rhino v6 WIP Tests Contents 01 Disclaimer 3 02 Introduction 3 03 Hardware and Settings 3 04 Testing Improvements from Rhino v5 to Rhino v6 4 05 Testing Outline scores for
More informationGPU ACCELERATED SELF-JOIN FOR THE DISTANCE SIMILARITY METRIC
GPU ACCELERATED SELF-JOIN FOR THE DISTANCE SIMILARITY METRIC MIKE GOWANLOCK NORTHERN ARIZONA UNIVERSITY SCHOOL OF INFORMATICS, COMPUTING & CYBER SYSTEMS BEN KARSIN UNIVERSITY OF HAWAII AT MANOA DEPARTMENT
More informationCurrent Trends in Computer Graphics Hardware
Current Trends in Computer Graphics Hardware Dirk Reiners University of Louisiana Lafayette, LA Quick Introduction Assistant Professor in Computer Science at University of Louisiana, Lafayette (since 2006)
More informationDeterminant Computation on the GPU using the Condensation AMMCS Method / 1
Determinant Computation on the GPU using the Condensation Method Sardar Anisul Haque Marc Moreno Maza Ontario Research Centre for Computer Algebra University of Western Ontario, London, Ontario AMMCS 20,
More informationG 2 Interpolation for Polar Surfaces
1 G 2 Interpolation for Polar Surfaces Jianzhong Wang 1, Fuhua Cheng 2,3 1 University of Kentucky, jwangf@uky.edu 2 University of Kentucky, cheng@cs.uky.edu 3 National Tsinhua University ABSTRACT In this
More informationTessellations. Irena Swanson Reed College, Portland, Oregon. MathPath, Lewis & Clark College, Portland, Oregon, 24 July 2018
Tessellations Irena Swanson Reed College, Portland, Oregon MathPath, Lewis & Clark College, Portland, Oregon, 24 July 2018 What is a tessellation? A tiling or a tessellation of the plane is a covering
More information1 Introduction. 1.1 Raster-to-vector conversion
1 Introduction 1.1 Raster-to-vector conversion Vectorization (raster-to-vector conversion) consists of analyzing a raster image to convert its pixel representation to a vector representation The basic
More informationDIFFERENTIAL. Tomáš Oberhuber, Atsushi Suzuki, Jan Vacata, Vítězslav Žabka
USE OF FOR Tomáš Oberhuber, Atsushi Suzuki, Jan Vacata, Vítězslav Žabka Faculty of Nuclear Sciences and Physical Engineering Czech Technical University in Prague Mini workshop on advanced numerical methods
More informationAn Evaluation of Unified Memory Technology on NVIDIA GPUs
An Evaluation of Unified Memory Technology on NVIDIA GPUs Wenqiang Li 1, Guanghao Jin 2, Xuewen Cui 1, Simon See 1,3 Center for High Performance Computing, Shanghai Jiao Tong University, China 1 Tokyo
More informationRevolutionizing the Datacenter Join the Conversation #OpenPOWERSummit
Redis Labs on POWER8 Server: The Promise of OpenPOWER Value Jeffrey L. Leeds, Ph.D. Vice President, Alliances & Channels Revolutionizing the Datacenter Join the Conversation #OpenPOWERSummit Who We Are
More informationMulti-View Stereo for Static and Dynamic Scenes
Multi-View Stereo for Static and Dynamic Scenes Wolfgang Burgard Jan 6, 2010 Main references Yasutaka Furukawa and Jean Ponce, Accurate, Dense and Robust Multi-View Stereopsis, 2007 C.L. Zitnick, S.B.
More informationFast Tridiagonal Solvers on GPU
Fast Tridiagonal Solvers on GPU Yao Zhang John Owens UC Davis Jonathan Cohen NVIDIA GPU Technology Conference 2009 Outline Introduction Algorithms Design algorithms for GPU architecture Performance Bottleneck-based
More informationAbstract. Introduction. Kevin Todisco
- Kevin Todisco Figure 1: A large scale example of the simulation. The leftmost image shows the beginning of the test case, and shows how the fluid refracts the environment around it. The middle image
More informationStreaming Massive Environments From Zero to 200MPH
FORZA MOTORSPORT From Zero to 200MPH Chris Tector (Software Architect Turn 10 Studios) Turn 10 Internal studio at Microsoft Game Studios - we make Forza Motorsport Around 70 full time staff 2 Why am I
More informationHigh performance 2D Discrete Fourier Transform on Heterogeneous Platforms. Shrenik Lad, IIIT Hyderabad Advisor : Dr. Kishore Kothapalli
High performance 2D Discrete Fourier Transform on Heterogeneous Platforms Shrenik Lad, IIIT Hyderabad Advisor : Dr. Kishore Kothapalli Motivation Fourier Transform widely used in Physics, Astronomy, Engineering
More informationpolygonal approximations
An improved method for obtaining optimal polygonal approximations A. Carmona-Poyato; F. J. Madrid-Cuevas; N. L. Fernández-García; R. Medina-Carnicer Abstract In computer vision, polygonal approximations
More informationM4G - A Surface Representation for Adaptive CPU-GPU Computation
M4G - A Surface Representation for Adaptive CPU-GPU Computation Vision and Graphics Lab Institute of Pure and Applied Mathematics Trimester Program on Computational Manifolds and Applications November
More informationN-Body Simulation using CUDA. CSE 633 Fall 2010 Project by Suraj Alungal Balchand Advisor: Dr. Russ Miller State University of New York at Buffalo
N-Body Simulation using CUDA CSE 633 Fall 2010 Project by Suraj Alungal Balchand Advisor: Dr. Russ Miller State University of New York at Buffalo Project plan Develop a program to simulate gravitational
More informationAccelerated Load Balancing of Unstructured Meshes
Accelerated Load Balancing of Unstructured Meshes Gerrett Diamond, Lucas Davis, and Cameron W. Smith Abstract Unstructured mesh applications running on large, parallel, distributed memory systems require
More informationSackler Course BMSC-GA 4448 High Performance Computing in Biomedical Informatics. Class 2: Friday February 14 th, :30PM 5:30PM AGENDA
Sackler Course BMSC-GA 4448 High Performance Computing in Biomedical Informatics Class 2: Friday February 14 th, 2014 2:30PM 5:30PM AGENDA Recap 1 st class & Homework discussion. Fundamentals of Parallel
More informationAppearance-Preserving Simplification
Appearance-Preserving Simplification 7809 tris 975 tris 488 tris 1951 tris 3905 tris model courtesy of Stanford and Caltech Simplification Create representation for faster rendering Measure representation
More informationLarge scale Imaging on Current Many- Core Platforms
Large scale Imaging on Current Many- Core Platforms SIAM Conf. on Imaging Science 2012 May 20, 2012 Dr. Harald Köstler Chair for System Simulation Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen,
More informationIntroduction to Computer Graphics
Introduction to Computer Graphics 2016 Spring National Cheng Kung University Instructors: Min-Chun Hu 胡敏君 Shih-Chin Weng 翁士欽 ( 西基電腦動畫 ) Data Representation Curves and Surfaces Limitations of Polygons Inherently
More informationDebunking the 100X GPU vs CPU Myth: An Evaluation of Throughput Computing on CPU and GPU
Debunking the 100X GPU vs CPU Myth: An Evaluation of Throughput Computing on CPU and GPU The myth 10x-1000x speed up on GPU vs CPU Papers supporting the myth: Microsoft: N. K. Govindaraju, B. Lloyd, Y.
More informationScalable Multi Agent Simulation on the GPU. Avi Bleiweiss NVIDIA Corporation San Jose, 2009
Scalable Multi Agent Simulation on the GPU Avi Bleiweiss NVIDIA Corporation San Jose, 2009 Reasoning Explicit State machine, serial Implicit Compute intensive Fits SIMT well Collision avoidance Motivation
More information3D Modeling techniques
3D Modeling techniques 0. Reconstruction From real data (not covered) 1. Procedural modeling Automatic modeling of a self-similar objects or scenes 2. Interactive modeling Provide tools to computer artists
More informationLS-ACTS 1.0 USER MANUAL
LS-ACTS 1.0 USER MANUAL VISION GROUP, STATE KEY LAB OF CAD&CG, ZHEJIANG UNIVERSITY HTTP://WWW.ZJUCVG.NET TABLE OF CONTENTS 1 Introduction... 1-3 1.1 Product Specification...1-3 1.2 Feature Functionalities...1-3
More informationX10 specific Optimization of CPU GPU Data transfer with Pinned Memory Management
X10 specific Optimization of CPU GPU Data transfer with Pinned Memory Management Hideyuki Shamoto, Tatsuhiro Chiba, Mikio Takeuchi Tokyo Institute of Technology IBM Research Tokyo Programming for large
More informationImproving performances of an embedded RDBMS with a hybrid CPU/GPU processing engine
Improving performances of an embedded RDBMS with a hybrid CPU/GPU processing engine Samuel Cremer 1,2, Michel Bagein 1, Saïd Mahmoudi 1, Pierre Manneback 1 1 UMONS, University of Mons Computer Science
More informationTata Elxsi benchmark report: Unreal Datasmith
This report and its findings were produced by Tata Elxsi. The report was sponsored by Unity Technologies. Tata Elxsi benchmark report: comparing PiXYZ Studio and Unreal Datasmith A Tata Elxsi perspective
More informationHETEROGENEOUS COMPUTE INFRASTRUCTURE FOR SINGAPORE
HETEROGENEOUS COMPUTE INFRASTRUCTURE FOR SINGAPORE PHILIP HEAH ASSISTANT CHIEF EXECUTIVE TECHNOLOGY & INFRASTRUCTURE GROUP LAUNCH OF SERVICES AND DIGITAL ECONOMY (SDE) TECHNOLOGY ROADMAP (NOV 2018) Source
More informationProgrammable GPUs. Real Time Graphics 11/13/2013. Nalu 2004 (NVIDIA Corporation) GeForce 6. Virtua Fighter 1995 (SEGA Corporation) NV1
Programmable GPUs Real Time Graphics Virtua Fighter 1995 (SEGA Corporation) NV1 Dead or Alive 3 2001 (Tecmo Corporation) Xbox (NV2A) Nalu 2004 (NVIDIA Corporation) GeForce 6 Human Head 2006 (NVIDIA Corporation)
More informationCUDA Optimizations WS Intelligent Robotics Seminar. Universität Hamburg WS Intelligent Robotics Seminar Praveen Kulkarni
CUDA Optimizations WS 2014-15 Intelligent Robotics Seminar 1 Table of content 1 Background information 2 Optimizations 3 Summary 2 Table of content 1 Background information 2 Optimizations 3 Summary 3
More informationAccelerating String Matching Using Multi-threaded Algorithm
Accelerating String Matching Using Multi-threaded Algorithm on GPU Cheng-Hung Lin*, Sheng-Yu Tsai**, Chen-Hsiung Liu**, Shih-Chieh Chang**, Jyuo-Min Shyu** *National Taiwan Normal University, Taiwan **National
More informationReal Time Rendering Assignment 1 - Report 20/08/10
Real Time Rendering Assignment 1 - Report 20/08/10 Tom Harris Alon Gal s3236050 s3216299 Introduction The aim of this assignment was to investigate the graphics performance of procedurally generated objects
More informationThe Graphics Pipeline
The Graphics Pipeline Ray Tracing: Why Slow? Basic ray tracing: 1 ray/pixel Ray Tracing: Why Slow? Basic ray tracing: 1 ray/pixel But you really want shadows, reflections, global illumination, antialiasing
More informationApplications. Oversampled 3D scan data. ~150k triangles ~80k triangles
Mesh Simplification Applications Oversampled 3D scan data ~150k triangles ~80k triangles 2 Applications Overtessellation: E.g. iso-surface extraction 3 Applications Multi-resolution hierarchies for efficient
More informationImproving Memory Space Efficiency of Kd-tree for Real-time Ray Tracing Byeongjun Choi, Byungjoon Chang, Insung Ihm
Improving Memory Space Efficiency of Kd-tree for Real-time Ray Tracing Byeongjun Choi, Byungjoon Chang, Insung Ihm Department of Computer Science and Engineering Sogang University, Korea Improving Memory
More informationA GPU-based Approximate SVD Algorithm Blake Foster, Sridhar Mahadevan, Rui Wang
A GPU-based Approximate SVD Algorithm Blake Foster, Sridhar Mahadevan, Rui Wang University of Massachusetts Amherst Introduction Singular Value Decomposition (SVD) A: m n matrix (m n) U, V: orthogonal
More informationOpenACC/CUDA/OpenMP... 1 Languages and Libraries... 3 Multi-GPU support... 4 How OpenACC Works... 4
OpenACC Course Class #1 Q&A Contents OpenACC/CUDA/OpenMP... 1 Languages and Libraries... 3 Multi-GPU support... 4 How OpenACC Works... 4 OpenACC/CUDA/OpenMP Q: Is OpenACC an NVIDIA standard or is it accepted
More informationCS 179: GPU Programming
CS 179: GPU Programming Introduction Lecture originally written by Luke Durant, Tamas Szalay, Russell McClellan What We Will Cover Programming GPUs, of course: OpenGL Shader Language (GLSL) Compute Unified
More information(software agnostic) Computational Considerations
(software agnostic) Computational Considerations The Issues CPU GPU Emerging - FPGA, Phi, Nervana Storage Networking CPU 2 Threads core core Processor/Chip Processor/Chip Computer CPU Threads vs. Cores
More informationB-spline Curves. Smoother than other curve forms
Curves and Surfaces B-spline Curves These curves are approximating rather than interpolating curves. The curves come close to, but may not actually pass through, the control points. Usually used as multiple,
More informationCS452/552; EE465/505. Clipping & Scan Conversion
CS452/552; EE465/505 Clipping & Scan Conversion 3-31 15 Outline! From Geometry to Pixels: Overview Clipping (continued) Scan conversion Read: Angel, Chapter 8, 8.1-8.9 Project#1 due: this week Lab4 due:
More informationSkeleton-based Template Retrieval for Virtual Maize Modeling
Journal of Computational Information Systems6:5(2010) 1577-1582 Available at http://www.jofcis.com Skeleton-based Template Retrieval for Virtual Maize Modeling Boxiang XIAO 1,2, Chunjiang ZHAO 1,2,, Xinyu
More informationDynamic Sparse Matrix Allocation on GPUs. James King
Dynamic Sparse Matrix Allocation on GPUs James King Graph Applications Dynamic updates to graphs Adding edges add entries to sparse matrix representation Motivation Graph operations (adding edges) (e.g.
More informationnewfasant PO Multiple effects analysis
newfasant PO Multiple effects analysis Benchmark: PO Multiple effects analysis Software Version: 6.2.7 Date: 21 Nov 16:10 Index 1. BENCHMARK DESCRIPTION AND OBJECTIVES 1.1. GEOMETRY CREATION 2. SET-UP
More informationGraphics Architectures and OpenCL. Michael Doggett Department of Computer Science Lund university
Graphics Architectures and OpenCL Michael Doggett Department of Computer Science Lund university Overview Parallelism Radeon 5870 Tiled Graphics Architectures Important when Memory and Bandwidth limited
More informationHigh-Performance Graph Primitives on the GPU: Design and Implementation of Gunrock
High-Performance Graph Primitives on the GPU: Design and Implementation of Gunrock Yangzihao Wang University of California, Davis yzhwang@ucdavis.edu March 24, 2014 Yangzihao Wang (yzhwang@ucdavis.edu)
More informationApplying Tessellation to Clipmap Terrain Rendering
Applying Tessellation to Clipmap Terrain Rendering COSC460 Research Project Report Corey Barnard 55066487 October 17, 2014 Department of Computer Science & Software Engineering University of Canterbury
More informationParallelization of Shortest Path Graph Kernels on Multi-Core CPUs and GPU
Parallelization of Shortest Path Graph Kernels on Multi-Core CPUs and GPU Lifan Xu Wei Wang Marco A. Alvarez John Cavazos Dongping Zhang Department of Computer and Information Science University of Delaware
More informationComputergrafik. Matthias Zwicker Universität Bern Herbst 2016
Computergrafik Matthias Zwicker Universität Bern Herbst 2016 Today Curves NURBS Surfaces Parametric surfaces Bilinear patch Bicubic Bézier patch Advanced surface modeling 2 Piecewise Bézier curves Each
More informationCraig Peeper Software Architect Windows Graphics & Gaming Technologies Microsoft Corporation
Gaming Technologies Craig Peeper Software Architect Windows Graphics & Gaming Technologies Microsoft Corporation Overview Games Yesterday & Today Game Components PC Platform & WGF 2.0 Game Trends Big Challenges
More information