SCIENTIFIC COMPUTING ON COMMODITY GRAPHICS HARDWARE
RUIGANG YANG
Department of Computer Science, University of Kentucky, One Quality Street, Suite 854, Lexington, KY 40507, USA

Driven by the need for interactive entertainment, modern PCs are equipped with specialized graphics processors (GPUs) for creation and display of images. These GPUs have become increasingly programmable, to the point that they are now capable of efficiently executing a significant number of computational kernels from non-graphical applications. In this introductory paper we first present a high-level overview of modern graphics hardware's architecture, then introduce several applications in scientific computing that can be efficiently accelerated by GPUs. Finally, we list programming tools available for application development on GPUs.

1. Introduction

As the mass-market emphasis in computing has shifted from word processing and spreadsheets to interactive entertainment, computer hardware has evolved to better support these new applications. Most of the performance-limiting processing today involves the creation and display of images; thus, a new entity has appeared within most computer systems. Between the system's general-purpose processor and the video frame buffer, there is now a specialized Graphics Processing Unit (GPU). Early GPUs were not really processors, but hardwired pipelines for each of the most common rendering tasks. As more complex 3D transformations have become common in a wide range of applications, GPUs have become increasingly programmable, to the point that they are now capable of efficiently executing a significant number of computational kernels from non-graphical applications. A GPU is simpler and more efficient than a conventional PC processor (CPU) because a GPU only needs to perform a relatively simple set of array processing operations (but at a very high speed).

Many problems in scientific computing, such as physically-based simulation, information retrieval, and data mining, boil down to relatively simple matrix operations. This characteristic makes these problems ideal candidates for GPU acceleration. In this introductory paper we first present a high-level overview of modern graphics hardware's architecture and its phenomenal development in recent years. Then we introduce a large array of non-graphical computational tasks, in particular linear algebra operations, that have been successfully implemented on GPUs with significant performance improvements. Finally, we list programming tools available for application development on GPUs. Some of them are designed to allow programming GPUs with familiar C-like constructs and syntax, without worrying about the details of the hardware. They hold the promise of bringing the vast computational power of GPUs to the broad scientific computing community.

2. A Brief Overview of GPUs

In this section, we explain the basic architecture of GPUs and the potential advantages of using GPUs to solve scientific problems.

2.1. The Rendering Pipeline

GPUs are dedicated processors designed specifically to handle the intense computational requirements of display graphics, i.e., rendering text or images at over 30 frames per second. As depicted in Figure 1, a modern GPU can be abstracted as a rendering pipeline for 3D computer graphics (2D graphics is just a special case) [20].

Geometric primitives -> Vertex Processing -> Rasterization -> Fragment Processing -> Framebuffer
Figure 1. Rendering Pipeline

The inputs to the pipeline are geometric primitives, i.e., points, lines, and polygons; the output is the framebuffer, a two-dimensional array of pixels that will be displayed on screen. The first stage operates on geometric primitives described by vertices. In this vertex-processing stage, vertices are transformed and lit, and primitives are clipped to a viewing volume in preparation for the next stage, rasterization. The rasterizer produces a series of framebuffer addresses and color values; each is called a fragment, which represents the portion of a primitive that corresponds to a pixel in the framebuffer. Each fragment is fed to the next stage, fragment processing, before it finally alters the framebuffer.
Operations in this stage include texture mapping, depth test, alpha blending, etc.
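
To make the data flow through the four stages concrete, the following is a minimal software sketch of the pipeline described above. It is an illustration under stated assumptions, not how any particular GPU is implemented: the function names (`vertex_stage`, `rasterize`, `fragment_stage`, `render`) are hypothetical, the vertex stage is reduced to a 2D scale and offset, and the only fragment operation modeled is the depth test.

```python
# Toy software model of the rendering pipeline (illustrative only; all names
# here are hypothetical, and real GPUs run these stages in parallel hardware).

def vertex_stage(vertices, scale=1.0, offset=(0.0, 0.0)):
    """Transform each (x, y) vertex: a stand-in for 'transform and light'."""
    return [(x * scale + offset[0], y * scale + offset[1]) for x, y in vertices]

def edge(a, b, p):
    """Edge function: non-negative when p is on the interior side of a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(tri, width, height):
    """Yield one fragment address (x, y) per pixel centre covered by the triangle."""
    a, b, c = tri
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)
            if edge(a, b, p) >= 0 and edge(b, c, p) >= 0 and edge(c, a, p) >= 0:
                yield (x, y)

def fragment_stage(fragment, color, depth, depth_buffer):
    """Depth test: keep the fragment only if it is nearer than the stored depth."""
    x, y = fragment
    if depth < depth_buffer[y][x]:
        depth_buffer[y][x] = depth
        return color
    return None

def render(triangles, width, height):
    """Run every (vertices, color, depth) triangle through all stages."""
    framebuffer = [[0] * width for _ in range(height)]
    depth_buffer = [[float("inf")] * width for _ in range(height)]
    for vertices, color, depth in triangles:
        tri = vertex_stage(vertices)
        for frag in rasterize(tri, width, height):
            out = fragment_stage(frag, color, depth, depth_buffer)
            if out is not None:
                framebuffer[frag[1]][frag[0]] = out
    return framebuffer
```

In this sketch a triangle is given as three vertices ordered so that the interior lies on the non-negative side of every edge function, plus a single color and a single depth value. Calling `render` with two overlapping triangles shows the depth test at work: the nearer triangle's color is what ends up in the framebuffer, regardless of drawing order.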

2.2. Recent Trend in GPUs

Until a few years ago, commercial GPUs, such as the RealityEngine from SGI [2], implemented in hardware a fixed rendering pipeline with configurable parameters. As a result, their applications were restricted to graphical computations. Driven by the market demand for better realism, the current generation of commercial GPUs, such as the NVIDIA GeForce FX [19] and the ATI Radeon, added significant programmable functionality in both the vertex and the fragment processing stages (stages with double lines in Figure 1). They allow developers to write a sequence of instructions to modify the vertex or fragment output. These programs are executed directly on the GPU and achieve performance comparable to that of fixed-function GPUs.

In addition to the programmable functionality of modern GPUs, their support for floating-point output has been improving. GPUs on the market today support up to 32-bit floating-point output. Such precision is usable for many diverse applications other than computer graphics.

[Figure 2 plot: CPU performance (SPEC integer benchmark) and GPU performance (millions of triangles per second) versus date introduced, Jul-98 to Jan-04. Labeled points include P3-733MHz, P4-1.7GHz and P4-3.2GHz for CPUs, and Voodoo3, GeForce 256 and Radeon 8500 for GPUs.]
Figure 2. A graph of performance increase over time for CPUs and GPUs. GPU performance has increased at a faster rate than CPUs. (Data courtesy of Anselmo Lastra).

GPUs have also demonstrated a rapid improvement in performance during the past few years. In Figure 2, we plot the performance increase of both GPUs and commodity Central Processing Units (CPUs). Just as the number of integer operations per second is used for CPUs, a typical benchmark to gauge a GPU's performance is the number of triangles it can process every second. We can see that GPUs have maintained a performance improvement rate of approximately 3X/year, which exceeds the performance improvement of CPUs at 1.6X/year.
This is because CPUs are designed for low-latency computations, while GPUs are optimized for high throughput of vertices and fragments [14]. Low latency on memory-intensive applications typically requires large caches, which use a large silicon area. Additional transistors are used to greater effect in GPU architectures because they are applied to additional functional units that increase throughput.

3. Applications of GPUs for General-Purpose Computation

With the wide deployment of inexpensive yet powerful GPUs in the last several years, we have seen a surge of experimental research in using GPUs for tasks other than rendering. For example, Yang et al. have experimented with using GPUs to solve computer vision problems [24,23]; Holzschuch and Alonso to speed up visibility queries [7]; Hoff et al. to compute generalized Voronoi diagrams [8] and proximity information [9]; and Lok to reconstruct an object's visual hull given live video from multiple cameras [15]. Each of these applications obtained significant performance improvements by exploiting the speed and the inherent parallelism in modern graphics hardware.

Within the scope of this paper, we introduce several representative approaches to accelerating linear algebra operations on GPUs. Larsen and McAllister present a technique for large matrix-matrix multiplies using low-cost graphics hardware [12]. The method adapts a technique from parallel computing: the computation is distributed over a logically cube-shaped lattice of processors, each of which performs a portion of the computation. Graphics hardware is specialized in a manner that makes it well suited to this particular problem, giving faster results in some cases than a general-purpose processor. A more complete and up-to-date implementation of dense matrix algebra is presented by Moravánszky [1].

The paper of Bolz et al. shows two basic, broadly useful computational kernels implemented on GPUs: a sparse-matrix conjugate gradient solver and a regular-grid multigrid solver [4]. Performance analysis with realistic applications shows that a GPU-based implementation compares favorably with its CPU counterpart.

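The cube-shaped decomposition mentioned above for matrix multiplication can be sketched on the CPU as follows. This is a hedged illustration of the general idea only, not the implementation from [12]: it assumes that each slice k of the (i, j, k) index cube corresponds to one rendering pass, in which every output pixel (i, j) multiplies one element of A by one element of B and the result is accumulated into the framebuffer by additive blending. The function name `multiply_by_passes` and the list-of-lists matrix layout are hypothetical.

```python
def multiply_by_passes(A, B):
    """Compute C = A * B as a sum of rendering-style passes.

    Pass k combines column k of A with row k of B at every output
    pixel (i, j) and adds the product into the framebuffer, mimicking
    a texture multiply followed by additive blending.
    """
    n, m, p = len(A), len(B), len(B[0])
    if any(len(row) != m for row in A):
        raise ValueError("inner dimensions of A and B do not match")
    framebuffer = [[0.0] * p for _ in range(n)]
    for k in range(m):              # one pass per slice of the index cube
        for i in range(n):          # on a GPU, all (i, j) fragments of a
            for j in range(p):      # pass would be processed in parallel
                framebuffer[i][j] += A[i][k] * B[k][j]
    return framebuffer
```

The two inner loops are written sequentially here, but they carry no dependencies between pixels, which is what makes each pass a natural fit for fragment processing. Only the outer loop over k is inherently ordered, because each pass accumulates into the result of the previous one.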
A similar framework for implementing linear algebra operators on GPUs is presented by Krüger and Westermann [11]; it focuses on sparse and banded matrices. There are many other algorithms for scientific computing that have been implemented on GPUs, including the FFT [17], level sets [22,13], and various types of physically-based simulations [6,10,21]. Interested readers are referred to for other general-purpose applications on GPUs.

4. GPU Programming Languages

While many non-graphical applications on GPUs have obtained encouraging results by exploiting the GPU's speed and high bandwidth, the development process is not trivial. Many of the existing applications are written in low-level assembly languages that are directly executed on the GPU. Therefore, novice developers face a steep learning curve: they must master the graphics hardware and its programming interfaces, namely OpenGL [20] and DirectX [16].

Fortunately, this is rapidly changing with several high-level languages available. The first is Cg, a system for programming graphics hardware in a C-like language [18]. It is, however, still a programming language geared towards rendering tasks and tightly coupled with graphics hardware. There are other high-level languages, such as Brook for GPUs and Sh, which allow programming GPUs with familiar constructs and syntax, without worrying about the details of the hardware. Brook extends C to include simple data-parallel constructs, enabling the use of the GPU as a streaming coprocessor. Sh is a metaprogramming language that offers the convenient syntax of C++ and takes the burden of register allocation and other low-level issues away from the programmer. While these languages are not fully mature yet, they are the most promising ones for allowing non-graphics researchers and developers to tap into the vast computational power of GPUs.

5. Conclusion

The versatile programmability and improved floating-point precision now available in GPUs make them useful coprocessors for scientific computing. Many non-trivial computational kernels have been successfully implemented on GPUs with significant acceleration. As graphics hardware continues to evolve faster than CPUs and more user-friendly high-level programming languages become available, we believe communities outside computer graphics can also benefit from the fast processing speed and high bandwidth that GPUs offer. We hope this introductory paper will encourage further thinking along this direction.

Acknowledgments

The author would like to thank Hank Dietz for providing some of the materials in this paper.
This work is supported in part by funds from the Office of Research at the University of Kentucky and the Kentucky Science & Engineering Foundation (RDE-005).

References

1. Ádám Moravánszky. Dense Matrix Algebra on the GPU. In ShaderX2: Shader Programming Tips & Tricks with DirectX 9. Wordware.
2. K. Akeley. RealityEngine graphics. In Proceedings of SIGGRAPH.
3. ATI Technologies Inc. ATI Radeon 9800.
4. Jeff Bolz, Ian Farmer, Eitan Grinspun, and Peter Schröder. Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid. ACM Transactions on Graphics (SIGGRAPH 2003), 22(3), 2003.
5. M. Harris. Real-Time Cloud Simulation and Rendering. PhD thesis, Department of Computer Science, Univ. of North Carolina at Chapel Hill.
6. M. Harris, W. Baxter, T. Scheuermann, and A. Lastra. Simulation of Cloud Dynamics on Graphics Hardware. In Proceedings of Graphics Hardware.
7. Nicolas Holzschuch and Laurent Alonso. Using graphics hardware to speed-up visibility queries. Journal of Graphics Tools, 5(2):33-47.
8. Kenneth E. Hoff III, John Keyser, Ming C. Lin, Dinesh Manocha, and Tim Culver. Fast Computation of Generalized Voronoi Diagrams Using Graphics Hardware. In Proceedings of SIGGRAPH 99, August.
9. Kenneth E. Hoff III, Andrew Zaferakis, Ming C. Lin, and Dinesh Manocha. Fast and simple 2D geometric proximity queries using graphics hardware. In 2001 ACM Symposium on Interactive 3D Graphics, March.
10. T. Kim and M. Lin. Visual Simulation of Ice Crystal Growth. In Proceedings of ACM SIGGRAPH / Eurographics Symposium on Computer Animation 2003.
11. Jens Krüger and Rüdiger Westermann. Linear Algebra Operators for GPU Implementation of Numerical Algorithms. ACM Transactions on Graphics (SIGGRAPH 2003), 22(3).
12. E. Scott Larsen and David K. McAllister. Fast Matrix Multiplies using Graphics Hardware. In Proceeding of Super Computer 2001, November.
13. A. E. Lefohn, J. Kniss, C. Hansen, and R. T. Whitaker. Interactive Deformation and Visualization of Level Set Surfaces Using Graphics Hardware. In Proceedings of IEEE Visualization.
14. E. Lindholm, M. Kilgard, and H. Moreton. A User Programmable Vertex Engine. In Proceedings of SIGGRAPH.
15. B. Lok. Online Model Reconstruction for Interactive Virtual Environments. In Proceedings 2001 Symposium on Interactive 3D Graphics, pages 69-72, Chapel Hill, North Carolina, March.
16. Microsoft. DirectX.
17. K. Moreland and E. Angel. The FFT on a GPU. In SIGGRAPH/Eurographics Workshop on Graphics Hardware 2003 Proceedings.
18. NVIDIA. Cg: C for Graphics.
19. NVIDIA. GeForce FX, desktop.html.
20. M. Segal and K. Akeley. The OpenGL Graphics System: A Specification (Version 1.3).
21. S. Tomov, M. McGuigan, R. Bennett, G. Smith, and J. Spiletic. Benchmarking and Implementation of Probability-Based Simulations on Programmable Graphics Cards. Computers & Graphics.
22. R. Strzodka and M. Rumpf. Level set segmentation in graphics hardware. In Proceedings of the International Conference on Image Processing.
23. Ruigang Yang and Marc Pollefeys. Multi-Resolution Real-Time Stereo on Commodity Graphics Hardware. In Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR).
24. Ruigang Yang and Greg Welch. Fast Image Segmentation and Smoothing Using Commodity Graphics Hardware. Journal of Graphics Tools, special issue on Hardware-Accelerated Rendering Techniques, 7(4):91-100, 2003.
More informationLU-GPU: Efficient Algorithms for Solving Dense Linear Systems on Graphics Hardware
LU-GPU: Efficient Algorithms for Solving Dense Linear Systems on Graphics Hardware Nico Galoppo Naga K. Govindaraju Michael Henson Dinesh Manocha University of North Carolina at Chapel Hill {nico,naga,henson,dm}@cs.unc.edu
More informationShadow Mapping for Hemispherical and Omnidirectional Light Sources
Shadow Mapping for Hemispherical and Omnidirectional Light Sources Abstract Stefan Brabec Thomas Annen Hans-Peter Seidel Computer Graphics Group Max-Planck-Institut für Infomatik Stuhlsatzenhausweg 85,
More informationCornell University CS 569: Interactive Computer Graphics. Introduction. Lecture 1. [John C. Stone, UIUC] NASA. University of Calgary
Cornell University CS 569: Interactive Computer Graphics Introduction Lecture 1 [John C. Stone, UIUC] 2008 Steve Marschner 1 2008 Steve Marschner 2 NASA University of Calgary 2008 Steve Marschner 3 2008
More informationAn Improved Study of Real-Time Fluid Simulation on GPU
An Improved Study of Real-Time Fluid Simulation on GPU Enhua Wu 1, 2, Youquan Liu 1, Xuehui Liu 1 1 Laboratory of Computer Science, Institute of Software Chinese Academy of Sciences, Beijing, China 2 Department
More informationFast HDR Image-Based Lighting Using Summed-Area Tables
Fast HDR Image-Based Lighting Using Summed-Area Tables Justin Hensley 1, Thorsten Scheuermann 2, Montek Singh 1 and Anselmo Lastra 1 1 University of North Carolina, Chapel Hill, NC, USA {hensley, montek,
More informationCOMP Preliminaries Jan. 6, 2015
Lecture 1 Computer graphics, broadly defined, is a set of methods for using computers to create and manipulate images. There are many applications of computer graphics including entertainment (games, cinema,
More informationReal-Time Video-Based Rendering from Multiple Cameras
Real-Time Video-Based Rendering from Multiple Cameras Vincent Nozick Hideo Saito Graduate School of Science and Technology, Keio University, Japan E-mail: {nozick,saito}@ozawa.ics.keio.ac.jp Abstract In
More informationGPU Computation Strategies & Tricks. Ian Buck NVIDIA
GPU Computation Strategies & Tricks Ian Buck NVIDIA Recent Trends 2 Compute is Cheap parallelism to keep 100s of ALUs per chip busy shading is highly parallel millions of fragments per frame 0.5mm 64-bit
More informationOverview. A real-time shadow approach for an Augmented Reality application using shadow volumes. Augmented Reality.
Overview A real-time shadow approach for an Augmented Reality application using shadow volumes Introduction of Concepts Standard Stenciled Shadow Volumes Method Proposed Approach in AR Application Experimental
More informationA Data-Parallel Genealogy: The GPU Family Tree
A Data-Parallel Genealogy: The GPU Family Tree Department of Electrical and Computer Engineering Institute for Data Analysis and Visualization University of California, Davis Outline Moore s Law brings
More information1 Hardware virtualization for shading languages Group Technical Proposal
1 Hardware virtualization for shading languages Group Technical Proposal Executive Summary The fast processing speed and large memory bandwidth of the modern graphics processing unit (GPU) will make it
More informationThree-Dimensional Image Warping on Programmable Graphics Hardware
Three-Dimensional Image Warping on Programmable Graphics Hardware Zhongding Jiang Tien-Tsin Wong Hujun Bao State Key Laboratory of CAD&CG, Zhejiang University, Hangzhou, 310027, China Department of Computer
More informationGeneral Purpose GPU Programming. Advanced Operating Systems Tutorial 9
General Purpose GPU Programming Advanced Operating Systems Tutorial 9 Tutorial Outline Review of lectured material Key points Discussion OpenCL Future directions 2 Review of Lectured Material Heterogeneous
More informationCS451Real-time Rendering Pipeline
1 CS451Real-time Rendering Pipeline JYH-MING LIEN DEPARTMENT OF COMPUTER SCIENCE GEORGE MASON UNIVERSITY Based on Tomas Akenine-Möller s lecture note You say that you render a 3D 2 scene, but what does
More informationParallel Genetic Algorithms on Programmable Graphics Hardware
Parallel Genetic Algorithms on Programmable Graphics Hardware Qizhi Yu 1, Chongcheng Chen 2,andZhigengPan 1,2 1 College of Computer Science, Zhejiang University, Hangzhou 310027, P.R. China qizhi.yu@gmail.com
More informationScanline Rendering 2 1/42
Scanline Rendering 2 1/42 Review 1. Set up a Camera the viewing frustum has near and far clipping planes 2. Create some Geometry made out of triangles 3. Place the geometry in the scene using Transforms
More informationProgrammable Graphics Hardware
Programmable Graphics Hardware Outline 2/ 49 A brief Introduction into Programmable Graphics Hardware Hardware Graphics Pipeline Shading Languages Tools GPGPU Resources Hardware Graphics Pipeline 3/ 49
More informationJournal of Universal Computer Science, vol. 14, no. 14 (2008), submitted: 30/9/07, accepted: 30/4/08, appeared: 28/7/08 J.
Journal of Universal Computer Science, vol. 14, no. 14 (2008), 2416-2427 submitted: 30/9/07, accepted: 30/4/08, appeared: 28/7/08 J.UCS Tabu Search on GPU Adam Janiak (Institute of Computer Engineering
More informationLecture 4: Geometry Processing. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)
Lecture 4: Processing Kayvon Fatahalian CMU 15-869: Graphics and Imaging Architectures (Fall 2011) Today Key per-primitive operations (clipping, culling) Various slides credit John Owens, Kurt Akeley,
More informationAutomatic Tuning Matrix Multiplication Performance on Graphics Hardware
Automatic Tuning Matrix Multiplication Performance on Graphics Hardware Changhao Jiang (cjiang@cs.uiuc.edu) Marc Snir (snir@cs.uiuc.edu) University of Illinois Urbana Champaign GPU becomes more powerful
More informationScreen Space Ambient Occlusion TSBK03: Advanced Game Programming
Screen Space Ambient Occlusion TSBK03: Advanced Game Programming August Nam-Ki Ek, Oscar Johnson and Ramin Assadi March 5, 2015 This project report discusses our approach of implementing Screen Space Ambient
More informationCS230 : Computer Graphics Lecture 4. Tamar Shinar Computer Science & Engineering UC Riverside
CS230 : Computer Graphics Lecture 4 Tamar Shinar Computer Science & Engineering UC Riverside Shadows Shadows for each pixel do compute viewing ray if ( ray hits an object with t in [0, inf] ) then compute
More informationReal-Time Graphics Architecture
Real-Time Graphics Architecture Kurt Akeley Pat Hanrahan http://www.graphics.stanford.edu/courses/cs448a-01-fall Geometry Outline Vertex and primitive operations System examples emphasis on clipping Primitive
More informationRendering Objects. Need to transform all geometry then
Intro to OpenGL Rendering Objects Object has internal geometry (Model) Object relative to other objects (World) Object relative to camera (View) Object relative to screen (Projection) Need to transform
More informationA Bandwidth Effective Rendering Scheme for 3D Texture-based Volume Visualization on GPU
for 3D Texture-based Volume Visualization on GPU Won-Jong Lee, Tack-Don Han Media System Laboratory (http://msl.yonsei.ac.k) Dept. of Computer Science, Yonsei University, Seoul, Korea Contents Background
More informationIntroduction to Multicore architecture. Tao Zhang Oct. 21, 2010
Introduction to Multicore architecture Tao Zhang Oct. 21, 2010 Overview Part1: General multicore architecture Part2: GPU architecture Part1: General Multicore architecture Uniprocessor Performance (ECint)
More informationComputer Graphics CS 543 Lecture 1 (Part I) Prof Emmanuel Agu. Computer Science Dept. Worcester Polytechnic Institute (WPI)
Computer Graphics CS 543 Lecture 1 (Part I) Prof Emmanuel Agu Computer Science Dept. Worcester Polytechnic Institute (WPI) About This Course Computer graphics: algorithms, mathematics, data structures..
More informationShading Languages. Ari Silvennoinen Apri 12, 2004
Shading Languages Ari Silvennoinen Apri 12, 2004 Introduction The recent trend in graphics hardware has been to replace fixed functionality in vertex and fragment processing with programmability [1], [2],
More informationGrafica Computazionale: Lezione 30. Grafica Computazionale. Hiding complexity... ;) Introduction to OpenGL. lezione30 Introduction to OpenGL
Grafica Computazionale: Lezione 30 Grafica Computazionale lezione30 Introduction to OpenGL Informatica e Automazione, "Roma Tre" May 20, 2010 OpenGL Shading Language Introduction to OpenGL OpenGL (Open
More informationRendering Subdivision Surfaces Efficiently on the GPU
Rendering Subdivision Surfaces Efficiently on the GPU Gy. Antal, L. Szirmay-Kalos and L. A. Jeni Department of Algorithms and their Applications, Faculty of Informatics, Eötvös Loránd Science University,
More information