GPGPU IGAD 2014/2015. Lecture 4. Jacco Bikker


1 GPGPU IGAD 2014/2015 Lecture 4 Jacco Bikker

2 Today: Demo time! Parallel scan Parallel sort Assignment

3 Demo Time

4 Parallel scan What it is: a prefix sum. Each output element is the sum of all input elements that precede it. C++:
    out[0] = 0;
    for ( i = 1; i < n; i++ ) out[i] = in[i-1] + out[i-1];

5 Parallel scan What it is good for: Building block for many parallel algorithms: output to an array of a variable number of elements per thread; summed area tables; compaction.

6 Variable output Verlet fluid solver: Go over the cells, and create an array of all particle pairs. Process this array in a second pass (which will have full GPU utilization). Each cell will emit 0..MAXPARTICLES-1 entries in the output array. (Figure: the work-items of a warp.)
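The two-pass idea on this slide can be sketched on the CPU as follows. This is a minimal illustration, not the solver from the lecture: the names scanOffsets and emitPacked are hypothetical, and each "cell" simply emits its own index a given number of times. An exclusive scan over the per-cell counts gives every cell the start of its private slice in the packed output array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// counts[i] = number of entries producer (cell) i wants to emit.
// Returns offsets[i] = first output slot of producer i, which is the
// exclusive scan of counts; 'total' receives the packed output size.
std::vector<int> scanOffsets(const std::vector<int>& counts, int& total)
{
    std::vector<int> offsets(counts.size());
    int sum = 0;
    for (std::size_t i = 0; i < counts.size(); i++)
    {
        assert(counts[i] >= 0);
        offsets[i] = sum;
        sum += counts[i];
    }
    total = sum;
    return offsets;
}

// Second pass: every producer writes into its own slice of the output.
// For illustration, producer i emits its own index counts[i] times.
std::vector<int> emitPacked(const std::vector<int>& counts)
{
    int total = 0;
    const std::vector<int> offsets = scanOffsets(counts, total);
    std::vector<int> out(total);
    for (std::size_t i = 0; i < counts.size(); i++) // independent per i
        for (int k = 0; k < counts[i]; k++)
            out[offsets[i] + k] = static_cast<int>(i);
    return out;
}
```

Because no two producers share an output slot, the outer loop of the second pass could run as one work-item per producer without atomics.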

7 Summed Area Tables What it is: A table containing, for each pixel P of an image, the sum of all pixels between (0,0) and P. Using a SAT, we can calculate an arbitrary-width box filter in O(1), by combining four table lookups.
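A small CPU sketch of a summed area table and the constant-time box sum. The function names buildSAT and boxSum are hypothetical, and the table here is padded with one extra row and column of zeros (an assumption made to avoid boundary checks), so its indices are shifted by one compared to the definition on the slide.

```cpp
#include <cassert>
#include <vector>

// Padded summed area table: sat[y][x] holds the sum of all pixels
// img[j][i] with j < y and i < x. Row 0 and column 0 are all zeros.
std::vector<std::vector<long long>> buildSAT(const std::vector<std::vector<int>>& img)
{
    const int h = static_cast<int>(img.size());
    const int w = h ? static_cast<int>(img[0].size()) : 0;
    std::vector<std::vector<long long>> sat(h + 1, std::vector<long long>(w + 1, 0));
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            sat[y + 1][x + 1] = img[y][x] + sat[y][x + 1] + sat[y + 1][x] - sat[y][x];
    return sat;
}

// Sum over the half-open box [x0, x1) x [y0, y1): four lookups,
// independent of the size of the box.
long long boxSum(const std::vector<std::vector<long long>>& sat,
                 int x0, int y0, int x1, int y1)
{
    assert(x0 <= x1 && y0 <= y1);
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0];
}
```

Dividing boxSum by the box area gives the box-filtered value. On a GPU, the table itself can be built with scans: one prefix sum along every row, followed by one along every column.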

8 Compaction What it is: When not all data in a multi-pass algorithm requires the same number of passes, compaction ensures that subsequent passes have full warps. What it is good for: Whitted-style ray tracing. (Figure: the work-items of warp 0.)
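Compaction is commonly built on a scan, which can be sketched like this. The function name compact and the 0/1 flag array alive are hypothetical: an exclusive scan of the flags tells every surviving item where to go in the packed array, and the scatter step needs no synchronization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Keep only the items whose flag is non-zero, preserving their order.
std::vector<int> compact(const std::vector<int>& items, const std::vector<int>& alive)
{
    assert(items.size() == alive.size());
    const std::size_t n = items.size();

    // Step 1: exclusive scan of the flags gives each survivor its output index.
    std::vector<int> dest(n);
    int count = 0;
    for (std::size_t i = 0; i < n; i++)
    {
        dest[i] = count;
        count += alive[i] ? 1 : 0;
    }

    // Step 2: scatter. Every iteration writes to a unique slot.
    std::vector<int> out(count);
    for (std::size_t i = 0; i < n; i++)
        if (alive[i]) out[dest[i]] = items[i];
    return out;
}
```

In a ray tracer, items could be ray indices and alive could mark rays that still need another bounce; the next pass then runs over the packed array only.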

9 Parallel scan: Algorithm
    for ( d = 1; d <= log2( n ); d++ )
        for all k in parallel do
            if (k >= 2^(d-1)) x[k] += x[k - 2^(d-1)];
O(n log n)
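The pass structure above can be emulated on the CPU as a sanity check. This is a sketch under two assumptions: the hypothetical function naiveScan uses a second buffer to mimic all values of k updating in lockstep, and the variable offset plays the role of 2^(d-1). Note that this scheme yields an inclusive scan (element k includes x[k] itself).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inclusive scan in ceil(log2 n) passes. Each pass reads from x and
// writes to next, so no element sees a value updated in the same pass.
std::vector<int> naiveScan(std::vector<int> x)
{
    const std::size_t n = x.size();
    std::vector<int> next(n);
    for (std::size_t offset = 1; offset < n; offset <<= 1) // offset = 2^(d-1)
    {
        for (std::size_t k = 0; k < n; k++)                // "for all k in parallel"
            next[k] = (k >= offset) ? x[k] + x[k - offset] : x[k];
        x.swap(next);
    }
    return x;
}
```

Each pass performs up to n additions and there are about log2 n passes, which is where the O(n log n) work bound comes from.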

10 Algorithm (2) Work-efficient scan: an up-sweep (reduce) phase followed by a down-sweep phase. O(n)
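A CPU sketch of the up-sweep / down-sweep scheme, assuming the input length is a power of two. The function name workEfficientScan is hypothetical, and the inner loops stand in for the work-items that would run in parallel within each level. The result is an exclusive scan.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

std::vector<int> workEfficientScan(std::vector<int> x)
{
    const std::size_t n = x.size();
    assert(n > 0 && (n & (n - 1)) == 0); // sketch assumes a power-of-two length

    // Up-sweep (reduce): build partial sums in place, level by level.
    for (std::size_t stride = 1; stride < n; stride <<= 1)
        for (std::size_t k = 2 * stride - 1; k < n; k += 2 * stride)
            x[k] += x[k - stride];

    // Down-sweep: clear the root, then push prefix sums back down the tree.
    x[n - 1] = 0;
    for (std::size_t stride = n / 2; stride >= 1; stride >>= 1)
        for (std::size_t k = 2 * stride - 1; k < n; k += 2 * stride)
        {
            const int left = x[k - stride];
            x[k - stride] = x[k]; // left child receives the parent's prefix
            x[k] += left;         // right child adds the left subtree's sum
        }
    return x;
}
```

Both phases touch n/2 + n/4 + ... + 1 elements, so the total work is O(n), at the cost of twice as many levels as the O(n log n) variant.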

11 Today: Demo time! Parallel scan Parallel sort Assignment

12 Parallel sort Selection sort:
kernel void Sort( global int* in, global int* out )
{
    int i = get_global_id( 0 );
    int n = get_global_size( 0 );
    int ikey = in[i];
    // compute position of in[i] in output
    int pos = 0;
    for( int j = 0; j < n; j++ )
    {
        int jkey = in[j]; // broadcasted
        bool smaller = (jkey < ikey) || (jkey == ikey && j < i);
        pos += (smaller) ? 1 : 0;
    }
    out[pos] = ikey;
}
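To see why every work-item ends up with a unique output position, the kernel can be emulated on the CPU. This is a sketch with a hypothetical function name, rankSort, in which the outer loop takes the place of the work-items.

```cpp
#include <cassert>
#include <vector>

// Each element counts how many elements must precede it. Ties are broken
// by original index, so equal keys still receive distinct positions.
std::vector<int> rankSort(const std::vector<int>& in)
{
    const int n = static_cast<int>(in.size());
    std::vector<int> out(n);
    for (int i = 0; i < n; i++) // one work-item per i on the GPU
    {
        int pos = 0;
        for (int j = 0; j < n; j++)
            if (in[j] < in[i] || (in[j] == in[i] && j < i)) pos++;
        out[pos] = in[i];
    }
    return out;
}
```

The total work is O(n^2), but every work-item runs the same loop without branching on its neighbours, and the tie-break on the index also makes the sort stable.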

13 Parallel sort Merge sort:

14 Parallel sort Parallel merge sort: Main operation: merge if (*a < *b) *d++ = *a++; else *d++ = *b++;

15 Parallel sort Parallel merge sort: Main operation: merge
    while (a < a_end && b < b_end)
        if (*a < *b) *d++ = *a++; else *d++ = *b++;
    while (a < a_end) *d++ = *a++;
    while (b < b_end) *d++ = *b++;

16 Parallel merge What it is: Given two sorted sequences a and b, produce the sorted sequence c. Note: the position of a_i in c is i + f( b, a_i ), where f( b, x ) is the number of elements in b smaller than x. Since b is sorted, f( b, x ) can be found using a binary search.
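The position formula can be sketched with the standard library's binary searches. The function name mergeByRank is hypothetical. One detail goes beyond the slide and is an assumption of this sketch: for elements of b, ties are counted with "smaller than or equal", so that equal keys from a and b never map to the same slot.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

std::vector<int> mergeByRank(const std::vector<int>& a, const std::vector<int>& b)
{
    assert(std::is_sorted(a.begin(), a.end()) && std::is_sorted(b.begin(), b.end()));
    std::vector<int> c(a.size() + b.size());

    // a[i] lands at i + (number of elements of b strictly smaller than a[i]).
    for (std::size_t i = 0; i < a.size(); i++)
    {
        const std::size_t f = static_cast<std::size_t>(
            std::lower_bound(b.begin(), b.end(), a[i]) - b.begin());
        c[i + f] = a[i];
    }

    // b[j] lands at j + (number of elements of a smaller than or equal to b[j]).
    for (std::size_t j = 0; j < b.size(); j++)
    {
        const std::size_t f = static_cast<std::size_t>(
            std::upper_bound(a.begin(), a.end(), b[j]) - a.begin());
        c[j + f] = b[j];
    }
    return c;
}
```

Every iteration of both loops is independent, so each element could be handled by its own work-item at a cost of one binary search.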

17 Sorting Networks

18 Sorting Networks

19 Bitonic Sort

20 Bitonic Sort
kernel void Sort( global uint* data, const uint stage, const uint passofstage, const uint width, const uint direction )
{
    uint sortdir = direction;
    const uint idx = get_global_id( 0 );
    const uint pairdist = 1 << (stage - passofstage);
    const uint leftid = (idx % pairdist) + (idx / pairdist) * 2 * pairdist;
    const uint rightid = leftid + pairdist;
    const uint A = data[leftid];
    const uint B = data[rightid];
    sortdir = ((idx >> stage) & 1) == 1 ? (1 - sortdir) : sortdir;
    const uint greater = A > B ? A : B;
    const uint lesser = A > B ? B : A;
    data[leftid] = sortdir ? lesser : greater;
    data[rightid] = sortdir ? greater : lesser;
}
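The kernel performs a single pass; the host has to launch it once per (stage, pass) pair. The slide does not show that loop, so the following CPU emulation is a sketch under assumptions: n/2 work-items per launch, stages numbered from 0, and a power-of-two array length. The function names bitonicPass and bitonicSort are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One kernel launch: every work-item idx compares and swaps one pair.
void bitonicPass(std::vector<unsigned>& data, unsigned stage,
                 unsigned passOfStage, unsigned direction)
{
    const unsigned pairDist = 1u << (stage - passOfStage);
    const unsigned workItems = static_cast<unsigned>(data.size() / 2);
    for (unsigned idx = 0; idx < workItems; idx++)
    {
        const unsigned leftId = (idx % pairDist) + (idx / pairDist) * 2 * pairDist;
        const unsigned rightId = leftId + pairDist;
        const unsigned A = data[leftId], B = data[rightId];
        unsigned sortDir = direction;
        if (((idx >> stage) & 1) == 1) sortDir = 1 - sortDir; // alternate per block
        const unsigned greater = A > B ? A : B;
        const unsigned lesser = A > B ? B : A;
        data[leftId] = sortDir ? lesser : greater;
        data[rightId] = sortDir ? greater : lesser;
    }
}

// Host-side driver: log2(n) stages, stage s consisting of s + 1 passes.
// direction = 1 sorts ascending, direction = 0 sorts descending.
void bitonicSort(std::vector<unsigned>& data, unsigned direction = 1)
{
    const std::size_t n = data.size();
    assert(n >= 2 && (n & (n - 1)) == 0); // sketch assumes a power-of-two length
    unsigned stages = 0;
    while ((std::size_t(1) << stages) < n) stages++;
    for (unsigned stage = 0; stage < stages; stage++)
        for (unsigned pass = 0; pass <= stage; pass++)
            bitonicPass(data, stage, pass, direction);
}
```

Within one pass the pairs are disjoint, so the work-items never touch the same element; the only synchronization needed is between passes, which separate kernel launches provide.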

21 Today: Demo time! Parallel scan Parallel sort Updated Template Assignment

22 Template v3
#define CHECKCL(r) CheckCL( r, __FILE__, __LINE__ )
float GetTime();
void StartTimer();
float GetDuation();

23 Template v3
static cl_int getplatformid( cl_platform_id* platform )
{
    char chbuffer[1024];
    cl_uint num_platforms, devcount;
    cl_platform_id* clplatformids;
    cl_int error;
    *platform = NULL;
    CHECKCL( error = clGetPlatformIDs( 0, NULL, &num_platforms ) );
    if (num_platforms == 0) CHECKCL( -1 );
    clplatformids = (cl_platform_id*)malloc( num_platforms * sizeof( cl_platform_id ) );
    error = clGetPlatformIDs( num_platforms, clplatformids, NULL );
#ifdef USE_CPU_DEVICE
    cl_uint devicetype[2] = { CL_DEVICE_TYPE_CPU, CL_DEVICE_TYPE_CPU };
    char* deviceorder[2][3] = { { "", "", "" }, { "", "", "" } };
#else
    cl_uint devicetype[2] = { CL_DEVICE_TYPE_GPU, CL_DEVICE_TYPE_CPU };
    char* deviceorder[2][3] = { { "NVIDIA", "AMD", "" }, { "", "", "" } };
#endif
    ...

24 Template v3
    glTexImage2D( texturetype, 0, GL_RGBA32F, width, height, 0, GL_RGB, GL_FLOAT, data );
    glTexParameteri( texturetype, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE );
    glTexParameteri( texturetype, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE );
    glTexParameteri( texturetype, GL_TEXTURE_MIN_FILTER, GL_NEAREST );
    glTexParameteri( texturetype, GL_TEXTURE_MAG_FILTER, GL_NEAREST );

    glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE );
    glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE );
    glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST );
    glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST );
    glTexImage2D( GL_TEXTURE_2D, 0, GL_RGBA32F, width, height, 0, GL_RGB, GL_FLOAT, data );

25 Template v3
class Buffer
{
public:
    enum { DEFAULT = 0, TEXTURE };
    // constructor / destructor
    Buffer() : hostbuffer( 0 ) {}
    Buffer( unsigned int N, unsigned int t = DEFAULT );
    ~Buffer();
    cl_mem* GetDevicePtr() { return &devicebuffer; }
    unsigned int* GetHostPtr() { return hostbuffer; }
    void CopyToDevice();
    void CopyFromDevice();
    void CopyTo( Buffer* buffer );
    cl_int ParallelScan();
    cl_int ParallelSort();
    ...

26 Today: Demo time! Parallel scan Parallel sort Assignment

27 Assignment

28 Assignment Some options: fluid simulation with surface reconstruction; cloth simulation; flocking / boids; library of sorting functions for varying data sets, with analysis; ray traced shadows for a rasterizer; mesh compression / decompression.

29 Next week: Development tools Debugging Random numbers The End (for now)

30 Bonus material

31 Merge Sort in OpenCL
kernel void Sort( global const int* in, global int* out, local int* aux )
{
    int i = get_local_id(0); // index in workgroup
    int wg = get_local_size(0); // workgroup size = block size, power of 2
    // move in, out to block start
    int offset = get_group_id(0) * wg;
    in += offset; out += offset;
    // load block in aux[wg]
    aux[i] = in[i];
    barrier(CLK_LOCAL_MEM_FENCE); // make sure AUX is entirely up to date
    // now we will merge sub-sequences of length 1,2,...,wg/2
    for( int length = 1; length < wg; length <<= 1 )
    {
        uint ikey = aux[i];
        int ii = i & (length - 1); // index in our sequence in 0..length-1
        int sibling = (i - ii) ^ length; // beginning of the sibling sequence
        int pos = 0;
        for (int inc = length; inc > 0; inc >>= 1 ) // increment for dichotomic search
        {
            int j = sibling + pos + inc - 1;
            uint jkey = aux[j];
            bool smaller = (jkey < ikey) || ( jkey == ikey && j < i );
            pos += (smaller) ? inc : 0;
            pos = min( pos, length );
        }
        int bits = 2 * length - 1; // mask for destination
        int dest = ((ii + pos) & bits) | (i & ~bits); // dest idx in merged sequence
        barrier(CLK_LOCAL_MEM_FENCE);
        aux[dest] = ikey;
        barrier(CLK_LOCAL_MEM_FENCE);
    }
    out[i] = aux[i]; // write output
}

/INFOMOV/ Optimization & Vectorization. J. Bikker - Sep-Nov Lecture 10: GPGPU (3) Welcome!

/INFOMOV/ Optimization & Vectorization. J. Bikker - Sep-Nov Lecture 10: GPGPU (3) Welcome! /INFOMOV/ Optimization & Vectorization J. Bikker - Sep-Nov 2018 - Lecture 10: GPGPU (3) Welcome! Today s Agenda: Don t Trust the Template The Prefix Sum Parallel Sorting Stream Filtering Optimizing GPU

More information

GPGPU IGAD 2014/2015. Lecture 1. Jacco Bikker

GPGPU IGAD 2014/2015. Lecture 1. Jacco Bikker GPGPU IGAD 2014/2015 Lecture 1 Jacco Bikker Today: Course introduction GPGPU background Getting started Assignment Introduction GPU History History 3DO-FZ1 console 1991 History NVidia NV-1 (Diamond Edge

More information

Heterogeneous Computing

Heterogeneous Computing OpenCL Hwansoo Han Heterogeneous Computing Multiple, but heterogeneous multicores Use all available computing resources in system [AMD APU (Fusion)] Single core CPU, multicore CPU GPUs, DSPs Parallel programming

More information

General Purpose computation on GPUs. Liangjun Zhang 2/23/2005

General Purpose computation on GPUs. Liangjun Zhang 2/23/2005 General Purpose computation on GPUs Liangjun Zhang 2/23/2005 Outline Interpretation of GPGPU GPU Programmable interfaces GPU programming sample: Hello, GPGPU More complex programming GPU essentials, opportunity

More information

Real-time Graphics 9. GPGPU

Real-time Graphics 9. GPGPU 9. GPGPU GPGPU GPU (Graphics Processing Unit) Flexible and powerful processor Programmability, precision, power Parallel processing CPU Increasing number of cores Parallel processing GPGPU general-purpose

More information

Lecture 19: OpenGL Texture Mapping. CITS3003 Graphics & Animation

Lecture 19: OpenGL Texture Mapping. CITS3003 Graphics & Animation Lecture 19: OpenGL Texture Mapping CITS3003 Graphics & Animation E. Angel and D. Shreiner: Interactive Computer Graphics 6E Addison-Wesley 2012 Objectives Introduce the OpenGL texture functions and options

More information

ECE 574 Cluster Computing Lecture 17

ECE 574 Cluster Computing Lecture 17 ECE 574 Cluster Computing Lecture 17 Vince Weaver http://web.eece.maine.edu/~vweaver vincent.weaver@maine.edu 6 April 2017 HW#8 will be posted Announcements HW#7 Power outage Pi Cluster Runaway jobs (tried

More information

Lecture 07: Buffers and Textures

Lecture 07: Buffers and Textures Lecture 07: Buffers and Textures CSE 40166 Computer Graphics Peter Bui University of Notre Dame, IN, USA October 26, 2010 OpenGL Pipeline Today s Focus Pixel Buffers: read and write image data to and from

More information

Real-time Graphics 9. GPGPU

Real-time Graphics 9. GPGPU Real-time Graphics 9. GPGPU GPGPU GPU (Graphics Processing Unit) Flexible and powerful processor Programmability, precision, power Parallel processing CPU Increasing number of cores Parallel processing

More information

CS452/552; EE465/505. Texture Mapping in WebGL

CS452/552; EE465/505. Texture Mapping in WebGL CS452/552; EE465/505 Texture Mapping in WebGL 2-26 15 Outline! Texture Mapping in WebGL Read: Angel, Chapter 7, 7.3-7.5 LearningWebGL lesson 5: http://learningwebgl.com/blog/?p=507 Lab3 due: Monday, 3/2

More information

CSE 167: Lecture #17: Volume Rendering. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012

CSE 167: Lecture #17: Volume Rendering. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012 CSE 167: Introduction to Computer Graphics Lecture #17: Volume Rendering Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012 Announcements Thursday, Dec 13: Final project presentations

More information

OpenCL Overview Benedict R. Gaster, AMD

OpenCL Overview Benedict R. Gaster, AMD Copyright Khronos Group, 2011 - Page 1 OpenCL Overview Benedict R. Gaster, AMD March 2010 The BIG Idea behind OpenCL OpenCL execution model - Define N-dimensional computation domain - Execute a kernel

More information

OpenCL / OpenGL Texture Interoperability: An Image Blurring Case Study

OpenCL / OpenGL Texture Interoperability: An Image Blurring Case Study 1 OpenCL / OpenGL Texture Interoperability: An Image Blurring Case Study Mike Bailey mjb@cs.oregonstate.edu opencl.opengl.rendertexture.pptx OpenCL / OpenGL Texture Interoperability: The Basic Idea 2 Application

More information

Lecture 22 Sections 8.8, 8.9, Wed, Oct 28, 2009

Lecture 22 Sections 8.8, 8.9, Wed, Oct 28, 2009 s The s Lecture 22 Sections 8.8, 8.9, 8.10 Hampden-Sydney College Wed, Oct 28, 2009 Outline s The 1 2 3 4 5 The 6 7 8 Outline s The 1 2 3 4 5 The 6 7 8 Creating Images s The To create a texture image internally,

More information

Lecture Topic: An Overview of OpenCL on Xeon Phi

Lecture Topic: An Overview of OpenCL on Xeon Phi C-DAC Four Days Technology Workshop ON Hybrid Computing Coprocessors/Accelerators Power-Aware Computing Performance of Applications Kernels hypack-2013 (Mode-4 : GPUs) Lecture Topic: on Xeon Phi Venue

More information

Performing Reductions in OpenCL

Performing Reductions in OpenCL Performing Reductions in OpenCL Mike Bailey mjb@cs.oregonstate.edu opencl.reduction.pptx Recall the OpenCL Model Kernel Global Constant Local Local Local Local Work- ItemWork- ItemWork- Item Here s the

More information

Textures. Texture Mapping. Bitmap Textures. Basic Texture Techniques

Textures. Texture Mapping. Bitmap Textures. Basic Texture Techniques Texture Mapping Textures The realism of an image is greatly enhanced by adding surface textures to the various faces of a mesh object. In part a) images have been pasted onto each face of a box. Part b)

More information

Martin Kruliš, v

Martin Kruliš, v Martin Kruliš 1 GPGPU History Current GPU Architecture OpenCL Framework Example Optimizing Previous Example Alternative Architectures 2 1996: 3Dfx Voodoo 1 First graphical (3D) accelerator for desktop

More information

GPGPU COMPUTE ON AMD. Udeepta Bordoloi April 6, 2011

GPGPU COMPUTE ON AMD. Udeepta Bordoloi April 6, 2011 GPGPU COMPUTE ON AMD Udeepta Bordoloi April 6, 2011 WHY USE GPU COMPUTE CPU: scalar processing + Latency + Optimized for sequential and branching algorithms + Runs existing applications very well - Throughput

More information

CS4621/5621 Fall Basics of OpenGL/GLSL Textures Basics

CS4621/5621 Fall Basics of OpenGL/GLSL Textures Basics CS4621/5621 Fall 2015 Basics of OpenGL/GLSL Textures Basics Professor: Kavita Bala Instructor: Nicolas Savva with slides from Balazs Kovacs, Eston Schweickart, Daniel Schroeder, Jiang Huang and Pramook

More information

OpenCL. Computation on HybriLIT Brief introduction and getting started

OpenCL. Computation on HybriLIT Brief introduction and getting started OpenCL Computation on HybriLIT Brief introduction and getting started Alexander Ayriyan Laboratory of Information Technologies Joint Institute for Nuclear Research 05.09.2014 (Friday) Tutorial in frame

More information

Texture Mapping. CS 537 Interactive Computer Graphics Prof. David E. Breen Department of Computer Science

Texture Mapping. CS 537 Interactive Computer Graphics Prof. David E. Breen Department of Computer Science Texture Mapping CS 537 Interactive Computer Graphics Prof. David E. Breen Department of Computer Science 1 Objectives Introduce Mapping Methods - Texture Mapping - Environment Mapping - Bump Mapping Consider

More information

Texture Mapping. Mike Bailey.

Texture Mapping. Mike Bailey. Texture Mapping 1 Mike Bailey mjb@cs.oregonstate.edu This work is licensed under a Creative Commons Attribution-NonCommercial- NoDerivatives 4.0 International License TextureMapping.pptx The Basic Idea

More information

OpenCL in Action. Ofer Rosenberg

OpenCL in Action. Ofer Rosenberg pencl in Action fer Rosenberg Working with pencl API pencl Boot Platform Devices Context Queue Platform Query int GetPlatform (cl_platform_id &platform, char* requestedplatformname) { cl_uint numplatforms;

More information

Buffers. Angel and Shreiner: Interactive Computer Graphics 7E Addison-Wesley 2015

Buffers. Angel and Shreiner: Interactive Computer Graphics 7E Addison-Wesley 2015 Buffers 1 Objectives Introduce additional WebGL buffers Reading and writing buffers Buffers and Images 2 Buffer Define a buffer by its spatial resolution (n x m) and its depth (or precision) k, the number

More information

CISC 3620 Lecture 7 Lighting and shading. Topics: Exam results Buffers Texture mapping intro Texture mapping basics WebGL texture mapping

CISC 3620 Lecture 7 Lighting and shading. Topics: Exam results Buffers Texture mapping intro Texture mapping basics WebGL texture mapping CISC 3620 Lecture 7 Lighting and shading Topics: Exam results Buffers Texture mapping intro Texture mapping basics WebGL texture mapping Exam results Grade distribution 12 Min: 26 10 Mean: 74 8 Median:

More information

Martin Kruliš, v

Martin Kruliš, v Martin Kruliš 1 GPGPU History Current GPU Architecture OpenCL Framework Example (and its Optimization) Alternative Frameworks Most Recent Innovations 2 1996: 3Dfx Voodoo 1 First graphical (3D) accelerator

More information

Introduction to Parallel & Distributed Computing OpenCL: memory & threads

Introduction to Parallel & Distributed Computing OpenCL: memory & threads Introduction to Parallel & Distributed Computing OpenCL: memory & threads Lecture 12, Spring 2014 Instructor: 罗国杰 gluo@pku.edu.cn In this Lecture Example: image rotation GPU threads and scheduling Understanding

More information

CS 432 Interactive Computer Graphics

CS 432 Interactive Computer Graphics CS 432 Interactive Computer Graphics Lecture 7 Part 2 Texture Mapping in OpenGL Matt Burlick - Drexel University - CS 432 1 Topics Texture Mapping in OpenGL Matt Burlick - Drexel University - CS 432 2

More information

-=Catmull's Texturing=1974. Part I of Texturing

-=Catmull's Texturing=1974. Part I of Texturing -=Catmull's Texturing=1974 but with shaders Part I of Texturing Anton Gerdelan Textures Edwin Catmull's PhD thesis Computer display of curved surfaces, 1974 U.Utah Also invented the z-buffer / depth buffer

More information

PCAP Assignment II. 1. With a neat diagram, explain the various stages of fixed-function graphic pipeline.

PCAP Assignment II. 1. With a neat diagram, explain the various stages of fixed-function graphic pipeline. PCAP Assignment II 1. With a neat diagram, explain the various stages of fixed-function graphic pipeline. The host interface receives graphics commands and data from the CPU. The commands are typically

More information

OpenCL. Dr. David Brayford, LRZ, PRACE PATC: Intel MIC & GPU Programming Workshop

OpenCL. Dr. David Brayford, LRZ, PRACE PATC: Intel MIC & GPU Programming Workshop OpenCL Dr. David Brayford, LRZ, brayford@lrz.de PRACE PATC: Intel MIC & GPU Programming Workshop 1 Open Computing Language Open, royalty-free standard C-language extension For cross-platform, parallel

More information

CS 677: Parallel Programming for Many-core Processors Lecture 12

CS 677: Parallel Programming for Many-core Processors Lecture 12 1 CS 677: Parallel Programming for Many-core Processors Lecture 12 Instructor: Philippos Mordohai Webpage: www.cs.stevens.edu/~mordohai E-mail: Philippos.Mordohai@stevens.edu Final Project Presentations

More information

OpenCL parallel Processing using General Purpose Graphical Processing units TiViPE software development

OpenCL parallel Processing using General Purpose Graphical Processing units TiViPE software development TiViPE Visual Programming OpenCL parallel Processing using General Purpose Graphical Processing units TiViPE software development Technical Report Copyright c TiViPE 2012. All rights reserved. Tino Lourens

More information

Bullet Cloth Simulation. Presented by: Justin Hensley Implemented by: Lee Howes

Bullet Cloth Simulation. Presented by: Justin Hensley Implemented by: Lee Howes Bullet Cloth Simulation Presented by: Justin Hensley Implemented by: Lee Howes Course Agenda OpenCL Review! OpenCL Development Tips! OpenGL/OpenCL Interoperability! Rigid body particle simulation!

More information

INTRODUCTION TO OPENCL. Jason B. Smith, Hood College May

INTRODUCTION TO OPENCL. Jason B. Smith, Hood College May INTRODUCTION TO OPENCL Jason B. Smith, Hood College May 4 2011 WHAT IS IT? Use heterogeneous computing platforms Specifically for computationally intensive apps Provide a means for portable parallelism

More information

OpenGL Texture Mapping. Objectives Introduce the OpenGL texture functions and options

OpenGL Texture Mapping. Objectives Introduce the OpenGL texture functions and options OpenGL Texture Mapping Objectives Introduce the OpenGL texture functions and options 1 Basic Strategy Three steps to applying a texture 1. 2. 3. specify the texture read or generate image assign to texture

More information

Data Parallelism. CSCI 5828: Foundations of Software Engineering Lecture 28 12/01/2016

Data Parallelism. CSCI 5828: Foundations of Software Engineering Lecture 28 12/01/2016 Data Parallelism CSCI 5828: Foundations of Software Engineering Lecture 28 12/01/2016 1 Goals Cover the material in Chapter 7 of Seven Concurrency Models in Seven Weeks by Paul Butcher Data Parallelism

More information

CSE 167: Introduction to Computer Graphics Lecture #7: Textures. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2018

CSE 167: Introduction to Computer Graphics Lecture #7: Textures. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2018 CSE 167: Introduction to Computer Graphics Lecture #7: Textures Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2018 Announcements Project 2 due this Friday at 2pm Grading in

More information

Generating Performance Portable Code using Rewrite Rules

Generating Performance Portable Code using Rewrite Rules Generating Performance Portable Code using Rewrite Rules From High-Level Functional Expressions to High-Performance OpenCL Code Michel Steuwer Christian Fensch Sam Lindley Christophe Dubach The Problem(s)

More information

GPGPU Training. Personal Super Computing Competence Centre PSC 3. Jan G. Cornelis. Personal Super Computing Competence Center

GPGPU Training. Personal Super Computing Competence Centre PSC 3. Jan G. Cornelis. Personal Super Computing Competence Center GPGPU Training Personal Super Computing Competence Centre PSC 3 Jan G. Cornelis 1 Levels of Understanding Level 0 Host and device Level 1 Parallel execution on the device Level 2 Device model and work

More information

Graphics. Texture Mapping 고려대학교컴퓨터그래픽스연구실.

Graphics. Texture Mapping 고려대학교컴퓨터그래픽스연구실. Graphics Texture Mapping 고려대학교컴퓨터그래픽스연구실 3D Rendering Pipeline 3D Primitives 3D Modeling Coordinates Model Transformation 3D World Coordinates Lighting 3D World Coordinates Viewing Transformation 3D Viewing

More information

Texture Mapping CSCI 4229/5229 Computer Graphics Fall 2016

Texture Mapping CSCI 4229/5229 Computer Graphics Fall 2016 Texture Mapping CSCI 4229/5229 Computer Graphics Fall 2016 What are texture maps? Bitmap images used to assign fine texture to displayed surfaces Used to make surfaces appear more realistic Must move with

More information

Computational Strategies

Computational Strategies Computational Strategies How can the basic ingredients be combined: Image Order Ray casting (many options) Object Order (in world coordinate) splatting, texture mapping Combination (neither) Shear warp,

More information

CSE 167: Introduction to Computer Graphics Lecture #8: Textures. Jürgen P. Schulze, Ph.D. University of California, San Diego Spring Quarter 2016

CSE 167: Introduction to Computer Graphics Lecture #8: Textures. Jürgen P. Schulze, Ph.D. University of California, San Diego Spring Quarter 2016 CSE 167: Introduction to Computer Graphics Lecture #8: Textures Jürgen P. Schulze, Ph.D. University of California, San Diego Spring Quarter 2016 Announcements Project 2 due this Friday Midterm next Tuesday

More information

Discussion 3. PPM loading Texture rendering in OpenGL

Discussion 3. PPM loading Texture rendering in OpenGL Discussion 3 PPM loading Texture rendering in OpenGL PPM Loading - Portable PixMap format 1. 2. Code for loadppm(): http://ivl.calit2.net/wiki/images/0/09/loadppm.txt ppm file format: Header: 1. P6: byte

More information

GPU acceleration on IB clusters. Sadaf Alam Jeffrey Poznanovic Kristopher Howard Hussein Nasser El-Harake

GPU acceleration on IB clusters. Sadaf Alam Jeffrey Poznanovic Kristopher Howard Hussein Nasser El-Harake GPU acceleration on IB clusters Sadaf Alam Jeffrey Poznanovic Kristopher Howard Hussein Nasser El-Harake HPC Advisory Council European Workshop 2011 Why it matters? (Single node GPU acceleration) Control

More information

Towards Transparent and Efficient GPU Communication on InfiniBand Clusters. Sadaf Alam Jeffrey Poznanovic Kristopher Howard Hussein Nasser El-Harake

Towards Transparent and Efficient GPU Communication on InfiniBand Clusters. Sadaf Alam Jeffrey Poznanovic Kristopher Howard Hussein Nasser El-Harake Towards Transparent and Efficient GPU Communication on InfiniBand Clusters Sadaf Alam Jeffrey Poznanovic Kristopher Howard Hussein Nasser El-Harake MPI and I/O from GPU vs. CPU Traditional CPU point-of-view

More information

Masterpraktikum Scientific Computing

Masterpraktikum Scientific Computing Masterpraktikum Scientific Computing High-Performance Computing Michael Bader Alexander Heinecke Technische Universität München, Germany Outline Intel Cilk Plus OpenCL Übung, October 7, 2012 2 Intel Cilk

More information

OpenCL. Matt Sellitto Dana Schaa Northeastern University NUCAR

OpenCL. Matt Sellitto Dana Schaa Northeastern University NUCAR OpenCL Matt Sellitto Dana Schaa Northeastern University NUCAR OpenCL Architecture Parallel computing for heterogenous devices CPUs, GPUs, other processors (Cell, DSPs, etc) Portable accelerated code Defined

More information

Assignment #5: Scalar Field Visualization 3D: Direct Volume Rendering

Assignment #5: Scalar Field Visualization 3D: Direct Volume Rendering Assignment #5: Scalar Field Visualization 3D: Direct Volume Rendering Goals: Due October 4 th, before midnight This is the continuation of Assignment 4. The goal is to implement a simple DVR -- 2D texture-based

More information

CS671 Parallel Programming in the Many-Core Era

CS671 Parallel Programming in the Many-Core Era CS671 Parallel Programming in the Many-Core Era Lecture 3: GPU Programming - Reduce, Scan & Sort Zheng Zhang Rutgers University Review: Programming with CUDA An Example in C Add vector A and vector B to

More information

Overview. Goals. MipMapping. P5 MipMap Texturing. What are MipMaps. MipMapping in OpenGL. Generating MipMaps Filtering.

Overview. Goals. MipMapping. P5 MipMap Texturing. What are MipMaps. MipMapping in OpenGL. Generating MipMaps Filtering. Overview What are MipMaps MipMapping in OpenGL P5 MipMap Texturing Generating MipMaps Filtering Alexandra Junghans junghana@student.ethz.ch Advanced Filters You can explain why it is a good idea to use

More information

INFOGR Computer Graphics. J. Bikker - April-July Lecture 11: Acceleration. Welcome!

INFOGR Computer Graphics. J. Bikker - April-July Lecture 11: Acceleration. Welcome! INFOGR Computer Graphics J. Bikker - April-July 2015 - Lecture 11: Acceleration Welcome! Today s Agenda: High-speed Ray Tracing Acceleration Structures The Bounding Volume Hierarchy BVH Construction BVH

More information

三維繪圖程式設計 3D Graphics Programming Design 第七章基礎材質張貼技術嘉大資工系盧天麒

三維繪圖程式設計 3D Graphics Programming Design 第七章基礎材質張貼技術嘉大資工系盧天麒 三維繪圖程式設計 3D Graphics Programming Design 第七章基礎材質張貼技術嘉大資工系盧天麒 1 In this chapter, you will learn The basics of texture mapping Texture coordinates Texture objects and texture binding Texture specification

More information

CS 314 Principles of Programming Languages

CS 314 Principles of Programming Languages CS 314 Principles of Programming Languages Zheng Zhang Fall 2016 Dec 14 GPU Programming Rutgers University Programming with CUDA Compute Unified Device Architecture (CUDA) Mapping and managing computations

More information

Computer Architecture

Computer Architecture Jens Teubner Computer Architecture Summer 2017 1 Computer Architecture Jens Teubner, TU Dortmund jens.teubner@cs.tu-dortmund.de Summer 2017 Jens Teubner Computer Architecture Summer 2017 34 Part II Graphics

More information

CSE 167: Introduction to Computer Graphics Lecture #8: Textures. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2017

CSE 167: Introduction to Computer Graphics Lecture #8: Textures. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2017 CSE 167: Introduction to Computer Graphics Lecture #8: Textures Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2017 Announcements Project 2 is due this Friday at 2pm Next Tuesday

More information

Easy to adapt C code to kernel code

Easy to adapt C code to kernel code The language of OpenCL kernels A simplified version of C No recursion No pointers to functions Kernels have no return values Easy to adapt C code to kernel code Tal Ben-Nun, HUJI. All rights reserved.

More information

Josef Pelikán, Jan Horáček CGG MFF UK Praha

Josef Pelikán, Jan Horáček CGG MFF UK Praha GPGPU and CUDA 2012-2018 Josef Pelikán, Jan Horáček CGG MFF UK Praha pepca@cgg.mff.cuni.cz http://cgg.mff.cuni.cz/~pepca/ 1 / 41 Content advances in hardware multi-core vs. many-core general computing

More information

Ettention Developer Guide (1.0.4) Checkout. Building. Download the ettention software package from

Ettention Developer Guide (1.0.4) Checkout. Building. Download the ettention software package from Ettention Developer Guide (1.0.4) Checkout Download the ettention software package from www.ettention.org. Building General information: ettention uses cmake as a makefile generator. Visual studio solution

More information

Lighting and Texturing

Lighting and Texturing Lighting and Texturing Michael Tao Michael Tao Lighting and Texturing 1 / 1 Fixed Function OpenGL Lighting Need to enable lighting Need to configure lights Need to configure triangle material properties

More information

OpenCL Events. Mike Bailey. Oregon State University. OpenCL Events

OpenCL Events. Mike Bailey. Oregon State University. OpenCL Events 1 OpenCL Events Mike Bailey mjb@cs.oregonstate.edu opencl.events.pptx OpenCL Events 2 An event is an object that communicates the status of OpenCL commands Event Read Buffer dc Execute Kernel Write Buffer

More information

NVIDIA OpenCL JumpStart Guide. Technical Brief

NVIDIA OpenCL JumpStart Guide. Technical Brief NVIDIA OpenCL JumpStart Guide Technical Brief Version 1.0 February 19, 2010 Introduction The purposes of this guide are to assist developers who are familiar with CUDA C/C++ development and want to port

More information

PDF Document structure, that need for managing of PDF file. It uses in all functions from EMF2PDF SDK.

PDF Document structure, that need for managing of PDF file. It uses in all functions from EMF2PDF SDK. EMF2PDF SDK Pilot Structures struct pdf_document { PDFDocument4 *pdfdoc; }; PDF Document structure, that need for managing of PDF file. It uses in all functions from EMF2PDF SDK. typedef enum { conone

More information
