A fixed-point 3D graphics library with energy-efficient cache architecture for mobile multimedia system

1 MS Thesis: A fixed-point 3D graphics library with energy-efficient cache architecture for mobile multimedia system
Min-wuk Lee
Semiconductor System Laboratory, Department of Electrical Engineering and Computer Science
Korea Advanced Institute of Science and Technology (KAIST)

2 Outline
- Introduction
- Motivation
- MobileGL: Mobile 3D graphics library
- Energy-efficient CPU cache
- Energy-efficient texture cache
- Conclusion

3 Introduction (1/2)
- Embedded mobile system: mobile 3D graphics system
  - Optimized code for speed
  - Draw out the best H/W performance
  - Low energy consumption
  - Good quality, high performance
- [Figure labels: input (Model, Interface), software system, hardware system, output (Pixel); stages Transformation, Lighting, Perspective Projection, Screen Clipping, Triangle Setup, Gouraud Shading, Texture Mapping, Alpha Blending, Depth Compare; blocks CPU, MobileGL, 3D Rendering Engine, Memory]
- Performance-energy co-optimization for mobile 3D graphics
  - Software system: high-speed graphics library (MobileGL)
  - Hardware system: energy-efficient cache architecture

4 Introduction (2/2): Target system
- Low-cost target
  - High-speed graphics library
  - Energy-efficient CPU cache system
  - [Figure labels: application processor (graphics library, CPU cache), Mem]
- High-quality target
  - High-speed and good-quality graphics library
  - Energy-efficient CPU cache and texture cache system
  - [Figure labels: application processor (graphics library, CPU cache), 3D graphics SoC (R.E., texture cache, frame cache, depth cache), system bus controller, Mem]

5 Previous work
- For low-cost target
  - PC and workstation platform graphics libraries: too large; GL supported by FPU and special graphics engine
  - Embedded platform graphics library with fixed-point arithmetic: Yoshida's work [1]
    - Limited operation (without texturing)
    - No research on the memory bandwidth bottleneck
  - Previous work in our group: analysis with memory only, without cache system
- For high-quality target
  - Texture cache in PC platform: Hakura's work [2]
    - Analysis based on miss rate
    - Did not consider energy, execution time, or system limitation
[1] K. Yoshida, IEEE Transactions on Consumer Electronics, 1998.
[2] Ziyad S. Hakura, ISCA, 1997.

6 Motivation
- Graphics library of this work
  - Extended operation: lighting, texturing, alpha blending, face culling, etc.
  - Optimization of memory transaction
  - 3D graphics characteristic
  - Analysis with cache system
- Texture cache of this work
  - Energy-efficient texture cache in embedded system
  - With negligible performance degradation

8 MobileGL (1/6): Mobile 3D graphics library
- Fixed-point arithmetic: 32-bit integer
- Optimized memory transaction: to reduce instruction and data traffic
- Selective pipeline: to reduce branches
- 45 KB total code size
- [Figure: MobileGL block diagram, from Applications and Model to Pixel]
  - Lighting enabled: view transformation, lighting, perspective projection, clipping, perspective division, screen mapping, cull face, rendering stage (lighting and texturing, or lighting only; includes r,g,b,a calculation)
  - Lighting disabled: view transformation, perspective projection, clipping, perspective division, screen mapping, cull face, rendering stage (texture only; excludes r,g,b,a calculation)
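The slide states only that MobileGL uses fixed-point arithmetic on 32-bit integers. As a minimal sketch of what that implies, the helpers below assume a Q16.16 split (16 integer bits, 16 fractional bits); the split and every function name are illustrative assumptions, not taken from the thesis.

```python
FRAC_BITS = 16          # assumed Q16.16 format; the slides only say "32-bit integer"
ONE = 1 << FRAC_BITS    # fixed-point representation of 1.0


def to_fixed(x: float) -> int:
    """Convert a float to the assumed Q16.16 representation."""
    return int(round(x * ONE))


def to_float(a: int) -> float:
    """Convert a Q16.16 value back to a float (for inspection only)."""
    return a / ONE


def fx_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 values.

    The raw product carries 32 fractional bits, so it is shifted back by
    FRAC_BITS. On a 32-bit CPU the intermediate product needs 64 bits.
    """
    return (a * b) >> FRAC_BITS


def fx_div(a: int, b: int) -> int:
    """Divide two Q16.16 values by pre-shifting the numerator."""
    return (a << FRAC_BITS) // b


if __name__ == "__main__":
    print(to_float(fx_mul(to_fixed(1.5), to_fixed(2.25))))
    print(to_float(fx_div(to_fixed(3.0), to_fixed(4.0))))
```

Python integers never overflow, so this sketch hides the 64-bit intermediate that a real 32-bit target would need for the multiply.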

9 MobileGL (2/6)
- Disabling option for perspective correction of texture address
  - Due to small screen size
  - Trade-off between correctness and speedup
- [Figure: view transformation, perspective projection, clipping / perspective division, screen mapping, cull face, triangle setup, horizontal setup, pixel interpolation; with the option on, u/w and v/w are carried through the stages, with it off, u and v]
- [Chart: execution time per 1K polygons for triangle setup, horizontal setup and texturing, conventional versus this work, with % reduction; StrongARM at 200 MHz]
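To illustrate the trade-off named on this slide, the sketch below compares plain linear (affine) interpolation of a texture coordinate with perspective-correct interpolation, which interpolates u/w and 1/w and divides per pixel. The function names and sample values are assumptions for illustration only.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b, with t in [0, 1]."""
    return a + (b - a) * t


def affine_u(u0: float, u1: float, t: float) -> float:
    """Perspective correction off: interpolate u directly in screen space."""
    return lerp(u0, u1, t)


def perspective_u(u0: float, w0: float, u1: float, w1: float, t: float) -> float:
    """Perspective correction on: interpolate u/w and 1/w, then divide."""
    u_over_w = lerp(u0 / w0, u1 / w1, t)
    one_over_w = lerp(1.0 / w0, 1.0 / w1, t)
    return u_over_w / one_over_w


if __name__ == "__main__":
    # Edge from (u=0, w=1) to (u=1, w=3), sampled at the screen-space midpoint.
    print(affine_u(0.0, 1.0, 0.5))
    print(perspective_u(0.0, 1.0, 1.0, 3.0, 0.5))
```

The two results coincide when w is constant along the edge, which is why the error of the cheaper path shrinks for small, nearly screen-parallel triangles.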

10 MobileGL (3/6)
- Division reduction in interpolation
  - Use shift instead of reciprocal
  - High probability of 1, 2 or 4 in denominator value
- [Figure: triangle with vertices labelled 1st, 2nd, 3rd and Top, Mid, Bot; lines Line1 to Line4 from start to end; axes Direction_x, Direction_y; when the denominator is 1, 2 or 4, a shift is used]
- [Chart: execution time (ms) per 1000 polygons for triangle setup, horizontal setup and texturing, labelled ROD; 27% reduction at 200 MHz]
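The division reduction described here can be sketched as a fast path that replaces the divide with a shift when the denominator is 1, 2 or 4, and falls back to a general divide otherwise. The function name and the floor-division fallback are assumptions; the slides mention a reciprocal in the general case.

```python
def gradient(delta: int, denom: int) -> int:
    """Return delta / denom for an integer (fixed-point) delta.

    Fast path: denominators 1, 2 and 4 are handled with shifts.
    Fallback: ordinary floor division (stand-in for the reciprocal path).
    """
    if denom == 1:
        return delta
    if denom == 2:
        return delta >> 1
    if denom == 4:
        return delta >> 2
    return delta // denom


if __name__ == "__main__":
    print(gradient(1000, 4), gradient(1000, 3))
```

An arithmetic right shift rounds toward negative infinity, as does Python's `//`, so both paths agree for negative deltas too.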

11 MobileGL (4/6)
- Z comparison in advance
  - To avoid unnecessary shading and texturing [3]
  - Standard OpenGL pipeline: texturing, depth test, blending
  - Z comparison in advance: depth test, texturing, blending
  - [Chart: speed improvement (%) versus Z_fail / Z_access]
- Selective precision of matrix multiplication
  - #1: result should extend to 64 bits (A MULL B)
  - #2: the 64-bit extension is an unnecessary operation (A MUL B)
  - [Figure labels: 32 bit, 4 bit, 64 bit, A, B; 67% stage]
[3] Ramchan Woo, ISSCC 2003
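A small sketch of why moving the depth test ahead of texturing saves work: it counts texturing operations for the same fragment stream under both orderings. The fragment format, the "smaller z is nearer" convention and the function name are assumptions made for the illustration.

```python
import math


def count_texturing(fragments, num_pixels: int, z_first: bool) -> int:
    """Count texturing operations for a stream of (pixel_index, z) fragments.

    z_first=False models the standard order (texturing, then depth test).
    z_first=True models Z comparison in advance (depth test, then texturing).
    A smaller z is assumed to be nearer to the viewer.
    """
    depth = [math.inf] * num_pixels
    textured = 0
    for idx, z in fragments:
        if z_first:
            if z >= depth[idx]:
                continue            # Z fail: texturing is skipped entirely
            textured += 1
            depth[idx] = z
        else:
            textured += 1           # textured before visibility is known
            if z < depth[idx]:
                depth[idx] = z
    return textured


if __name__ == "__main__":
    frags = [(0, 5.0), (0, 7.0), (0, 3.0), (1, 2.0)]
    print(count_texturing(frags, 2, z_first=False))
    print(count_texturing(frags, 2, z_first=True))
```

The saving grows with the fraction of fragments that fail the depth test, which matches the Z_fail / Z_access axis on the slide's chart.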

12 MobileGL (5/6): Library performance
- 67K polygons/sec in a texture-only application, due to several optimization steps
- [Chart: polygons per millisecond for previous work [1], original C code and optimized code; bars labelled 80 MHz, 200 MHz, 200 MHz; 6.7 times performance improvement, 67K polygons/sec]

13 MobileGL (6/6): Implementation result

15 Energy-efficient CPU cache (1/4): Simulation environment
- [Figure labels: application programs, 3D graphics library, ARM SDK (ARM processor, memory), CPU execution time, memory transaction file, CACHE_MODEL, target hardware platform, memory_access_time, TOTAL EXECUTION TIME]
- From ARM SDK: term 1, memory transaction
- From cache model using memory transaction: terms 2, 3
- [Equation relating T_exe_total, T_exe_CPU, instruction_counts, T_exe_instruction, cache_hit_counts, CPU_cycle and memory_access_time, with terms numbered 1 to 3]

16 Energy-efficient CPU cache (2/4): Cache model for execution time
- [Figure: processor core, data cache and instruction cache (hit_time), memory (memory_access_time)]

$$\text{hit\_time} = \text{clock\_period}_{core}$$

$$\text{memory\_access\_time} = \begin{cases} t_{RCD} + \text{CAS\_latency} + \text{clock\_period}_{mem} & \text{(non-sequential access)} \\ \text{clock\_period}_{mem} & \text{(burst access)} \end{cases}$$

$$\text{miss\_time} = \begin{cases} \text{memory\_access\_time} + \text{hit\_time} & \text{(read)} \\ \text{memory\_access\_time} & \text{(write)} \end{cases}$$
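One reading of the timing formulas on this slide is that a non-sequential memory access costs tRCD plus CAS latency plus one memory clock, a burst access costs one memory clock, and a read miss additionally pays the hit time. The sketch below encodes that reading; the line-fill helper and all numeric values are assumptions added for illustration, not figures from the thesis.

```python
def memory_access_time(t_rcd: int, cas_latency: int, clk_mem: int, burst: bool) -> int:
    """Memory access time in ns for one access.

    Non-sequential access: tRCD + CAS latency + one memory clock.
    Burst access: one memory clock.
    """
    if burst:
        return clk_mem
    return t_rcd + cas_latency + clk_mem


def miss_time(mem_time: int, hit_time: int, is_read: bool) -> int:
    """Miss time: a read miss also pays the hit time, a write miss does not."""
    return mem_time + hit_time if is_read else mem_time


def line_fill_time(words: int, t_rcd: int, cas_latency: int, clk_mem: int) -> int:
    """Assumed helper: first word is non-sequential, the rest arrive in burst."""
    first = memory_access_time(t_rcd, cas_latency, clk_mem, burst=False)
    rest = (words - 1) * memory_access_time(t_rcd, cas_latency, clk_mem, burst=True)
    return first + rest


if __name__ == "__main__":
    # Hypothetical timings in ns: tRCD = 20, CAS latency = 20, memory clock = 10.
    print(line_fill_time(8, 20, 20, 10))
    print(miss_time(memory_access_time(20, 20, 10, burst=False), 5, is_read=True))
```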

17 Energy-efficient CPU cache (3/4): Energy modeling
- Model based on tools and research documentation
- Cache hit energy: from CACTI 3.0 [4]
- Cache miss energy: 4.70 nJ / bus_clock, from "Power and Energy Characterization of the Itsy Pocket Computer" by Compaq Western Research Laboratory [5], [6]
[4] CACTI 3.0: An integrated cache timing, power, and area model, Compaq Western Research Laboratory
[5] Power and energy characterization of the Itsy pocket computer
[6] A simulation framework for energy-consumption analysis of OS-driven embedded applications, TCAS 2003
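The energy model on this slide combines a per-hit energy (taken from CACTI 3.0 in the thesis) with a miss energy of 4.70 nJ per bus clock. A minimal sketch of how such a model could be evaluated follows; the per-hit energy value and the hit and bus-clock counts are placeholders, since the slides do not list them.

```python
MISS_ENERGY_PER_BUS_CLOCK_NJ = 4.70   # value stated on the slide


def cache_energy_nj(hits: int, miss_bus_clocks: int, hit_energy_nj: float) -> float:
    """Total energy in nJ: hit energy per access plus miss energy per bus clock.

    hit_energy_nj would come from a cache model such as CACTI; the value
    passed in the example below is purely hypothetical.
    """
    return hits * hit_energy_nj + miss_bus_clocks * MISS_ENERGY_PER_BUS_CLOCK_NJ


if __name__ == "__main__":
    # Hypothetical run: 1000 hits at 0.5 nJ each, 100 bus clocks spent on misses.
    print(cache_energy_nj(1000, 100, 0.5))
```

Because the miss term is charged per bus clock, a configuration with a lower miss rate can still cost more energy if each hit is more expensive, for example at higher associativity.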

18 Energy-efficient CPU cache (4/4): Simulation results
- [Charts: direct-mapped data cache, 8 entries/line (32 B line size); miss rate (%), normalized execution time and normalized energy consumption versus cache size (2 KB, 4 KB, 8 KB, 16 KB)]
- [Charts: 16 KB data cache, 32 B line size; miss rate (%), normalized execution time and normalized energy consumption versus associativity (DM, 2-way, 4-way, 8-way); annotations: 1% performance degradation, 13% energy saving]
- Using a 2-way cache gives 13% energy saving with 1% performance degradation compared with a conventional 4-way cache

20 Energy-efficient texture cache (1/12): Texture mapping (Introduction)
- F(x,y,z) = (s,t): map from 3D surface to 2D texel domain (image)
- Texture coordinate: look up color in image
- Lookup method
  - Nearest texel
  - Interpolation of surrounding texels
- MIPMAP: image pyramid
- [Figures: 2D texture diagram with axes x, y, z and s, t; image pyramid with Level 0 and d axis]

21 Energy-efficient texture cache (2/12): Texture filtering methods (Introduction)
- Point sampling, bilinear filtering, bilinear MIPMAP, trilinear MIPMAP
  1. Point sampling: LOD 0
  2. Bilinear filtering: bilinear interpolation in LOD 0
  3. Bilinear MIPMAP: bilinear interpolation in one level of the pyramid (LOD 1, LOD 2, LOD 3)
  4. Trilinear MIPMAP: bilinear interpolation in two levels (for example LOD = 1.XX), followed by linear interpolation
- [Figure: texture space and screen space, with 1st, 2nd and 3rd samples]
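The filtering methods listed on this slide can be sketched for a single-channel texture as follows. The conventions used here (texel centres at integer coordinates, clamp addressing, level n being half the resolution of level n-1) are assumptions for the example, not details taken from the thesis.

```python
import math


def texel(img, x: int, y: int):
    """Fetch one texel with clamp-to-edge addressing (an assumed convention)."""
    h, w = len(img), len(img[0])
    return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]


def point_sample(img, u: float, v: float):
    """Point sampling: one fetch of the nearest texel."""
    return texel(img, math.floor(u + 0.5), math.floor(v + 0.5))


def bilinear(img, u: float, v: float) -> float:
    """Bilinear filtering: four fetches, weighted by the fractional position."""
    x0, y0 = math.floor(u), math.floor(v)
    fx, fy = u - x0, v - y0
    top = texel(img, x0, y0) * (1 - fx) + texel(img, x0 + 1, y0) * fx
    bot = texel(img, x0, y0 + 1) * (1 - fx) + texel(img, x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bot * fy


def trilinear(pyramid, u: float, v: float, lod: float) -> float:
    """Trilinear MIPMAP: bilinear in two adjacent levels, then linear blend."""
    l0 = min(math.floor(lod), len(pyramid) - 1)
    l1 = min(l0 + 1, len(pyramid) - 1)
    f = lod - math.floor(lod)
    a = bilinear(pyramid[l0], u / 2 ** l0, v / 2 ** l0)
    b = bilinear(pyramid[l1], u / 2 ** l1, v / 2 ** l1)
    return a * (1 - f) + b * f


if __name__ == "__main__":
    level0 = [[0, 10], [20, 30]]
    level1 = [[40]]
    print(point_sample(level0, 0.6, 0.2))
    print(bilinear(level0, 0.5, 0.5))
    print(trilinear([level0, level1], 0.5, 0.5, 0.5))
```

The fetch counts (1, 4, 4 and 8 texels per pixel) are what drive the bank organisation and the energy ordering discussed on the later slides.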

22 Energy-efficient texture cache (3/12)
- Obstacle of texture mapping: requirement of extremely high bandwidth
- Texture cache: to reduce the off-chip memory access bottleneck
  - Image conversion (texture map representation): reduces conflict misses
  - Address conversion unit: a few logical operations and two additions
- [Figure: texture cache system with external memory, image conversion, texture cache, address conversion, 3D rendering engine]

23 Energy-efficient texture cache (4/12): Simulation models
- Tiny: 6833 polygons; LOD[0:1] 80%, LOD[1:2] 10%
- Stealth: 542 polygons; LOD[0:1] 67%, LOD[1:2] 15%
- Alien: 854 polygons; LOD[0:1] 48%, LOD[1:2]
- LOD shares are for trilinear MIPMAP

24 Energy-efficient texture cache (5/12): Proposed texture map representation
- Reduces conflict misses at bank change
- Miss rate reduction, energy saving (17.4%), execution time reduction (15.2%)
- [Figure: blocked representation versus Recursive Sub Block representation]

25 Energy-efficient texture cache (6/12): Address conversion unit for RSB2X2
- Use one-to-one correspondence and find the rule
- Hardware implementation: only thirteen 2:1 muxes in trilinear MIPMAP
- [Figure: address conversion unit of this work (RSB 2X2). Four panels, for 256 X 256, 128 X 128, 64 X 64 and 32 X 32 textures, each mapping the core request address bits old11 to old0 onto the converted address bits new11 to new0]

26 Energy-efficient texture cache (7/12): Texture cache model using bank interleaving
- Point sampling: texture cache with 1 bank (address A0, data D0)
- Bilinear filtering, bilinear MIPMAP: texture cache with 4 banks (addresses A0 to A3, data D0 to D3)
- Trilinear MIPMAP: texture caches for even and odd LOD, 4 banks each (EvenLOD$, OddLOD$; addresses A0 to A7, data D0 to D7)
- Morton order representation: previous work
- Proposed RSB2X2 is also free from bank conflict
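As background for the Morton order representation cited here as previous work, the sketch below interleaves the bits of the texel coordinates and uses the two least significant bits of the result as the bank number. With that choice the four texels of any 2x2 bilinear footprint fall into four different banks, whereas a plain row-major layout with the same bank rule does not. The function names and the row-major comparison are illustrative assumptions.

```python
def morton_index(x: int, y: int, bits: int = 8) -> int:
    """Interleave the low `bits` bits of x (even positions) and y (odd positions)."""
    idx = 0
    for i in range(bits):
        idx |= ((x >> i) & 1) << (2 * i)
        idx |= ((y >> i) & 1) << (2 * i + 1)
    return idx


def morton_bank(x: int, y: int) -> int:
    """Bank number taken from the 2 least significant bits of the Morton index."""
    return morton_index(x, y) & 0b11


def row_major_bank(x: int, y: int, width: int) -> int:
    """Same bank rule applied to a plain row-major address (for comparison)."""
    return (y * width + x) & 0b11


def footprint_banks(bank_of, x: int, y: int) -> set:
    """Banks touched by the 2x2 bilinear footprint whose top-left texel is (x, y)."""
    return {bank_of(x + dx, y + dy) for dy in (0, 1) for dx in (0, 1)}


if __name__ == "__main__":
    print(footprint_banks(morton_bank, 5, 9))
    print(footprint_banks(lambda x, y: row_major_bank(x, y, 16), 5, 9))
```

Four distinct banks mean the four texels of a bilinear fetch can be read in the same cycle, which is the conflict-free property the slide attributes to both Morton order and the proposed RSB2X2.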

27 Energy-efficient texture cache (8/12): Performance and energy comparison between filtering methods
- Energy consumption and execution time: point sampling < bilinear filtering < bilinear MIPMAP < trilinear MIPMAP
- Trade-off point: image quality (aliasing criterion)
- [Chart: normalized energy for P.S., B.F., B.M. and T.M. with D.M., 2-way and 4-way caches; 2 KB, 16 entries/line, Tiny model]

28 Energy-efficient texture cache (9/12): Image quality analysis
- Textile model: LOD[0:1] 44%, LOD[1:2] 40% in MIPMAP
- [Images: point sampling, bilinear filtering, bilinear MIPMAP, trilinear MIPMAP]
- DCT analysis: low-frequency term in top-left
- [Images: DCT results for point sampling, bilinear filtering, bilinear MIPMAP, trilinear MIPMAP]

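29 Energy-efficient texture cache (10/12): Image quality metric in terms of aliasing criterion

$$\text{image\_quality\_0.5} = \frac{\sum_{f_x=0}^{\pi/2} \sum_{f_y=0}^{\pi/2} \text{amplitude}}{\sum_{f_x=0}^{\pi} \sum_{f_y=0}^{\pi} \text{amplitude}}$$

$$\text{image\_quality\_0.75} = \frac{\sum_{f_x=0}^{3\pi/4} \sum_{f_y=0}^{3\pi/4} \text{amplitude}}{\sum_{f_x=0}^{\pi} \sum_{f_y=0}^{\pi} \text{amplitude}}$$

- [Figure: frequency plane with axes fx and fy, each from 0 to PI]
- Index Q, Index E: to find relative values, normalized from 0 to 1

$$\text{Index}\ Q = \frac{Q_{cur} - Q_{min}}{Q_{max} - Q_{min}}, \qquad \text{Index}\ E = \frac{E_{max} - E_{cur}}{E_{max} - E_{min}}$$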
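A sketch of how the quality metric and the two normalized indices on this slide could be computed, assuming the image has already been transformed into an N x N table of DCT amplitudes with the lowest frequency at [0][0]. Mapping the frequency limit to "the first N * cutoff rows and columns" is a discretization chosen for this example, and all names are hypothetical.

```python
def image_quality(amplitude, cutoff: float) -> float:
    """Share of total DCT amplitude that lies below the cutoff frequency.

    amplitude: N x N list of lists, lowest frequency at [0][0].
    cutoff: fraction of the frequency range, e.g. 0.5 for image_quality_0.5.
    """
    n = len(amplitude)
    k = int(n * cutoff)
    low = sum(abs(amplitude[fy][fx]) for fy in range(k) for fx in range(k))
    total = sum(abs(a) for row in amplitude for a in row)
    return low / total


def index_q(q_cur: float, q_min: float, q_max: float) -> float:
    """Quality index: 0 for the worst quality, 1 for the best."""
    return (q_cur - q_min) / (q_max - q_min)


def index_e(e_cur: float, e_min: float, e_max: float) -> float:
    """Energy index: 1 for the lowest energy, 0 for the highest."""
    return (e_max - e_cur) / (e_max - e_min)


if __name__ == "__main__":
    flat = [[1.0] * 4 for _ in range(4)]      # toy spectrum, equal amplitudes
    print(image_quality(flat, 0.5), image_quality(flat, 0.75))
    print(index_q(6, 2, 10) + index_e(2, 2, 10))
```

Summing the two indices rewards a filtering method for high quality and for low energy at the same time.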

30 Energy-efficient texture cache (11/12)
- Index = Index Q + Index E
- Almost the same quality between B.M. and T.M. in QVGA
- Large energy difference between B.M. and T.M.
- Poor image quality in P.S.
- Bilinear MIPMAP gets the largest score
- [Charts: Index Q, Index E and Index Q + Index E for P.S., B.F., B.M. and T.M., with 8E and 16E lines and Q_0.5 and Q_0.75 metrics; 2-way set associative, 2 KB texture cache]

31 Energy-efficient texture cache (12/12): Simulation results
- [Chart: normalized energy for Tiny, Stealth and Alien at 1K, 2K, 4K and 8K cache sizes, with series by entries per line (including 16E); energy comparison while changing the cache configuration, 2-way, using bilinear MIPMAP]
- 4 KB texture cache with 16 B line size (2 B per texel): energy-efficient, low cost, high quality

33 Conclusion
- Performance-energy co-optimization in mobile 3D graphics: MobileGL and cache architecture
- MobileGL: mobile 3D graphics library
  - 67K polygons/sec
  - 66.1% performance improvement on average
- Energy-efficient CPU cache
  - 2-way set associative cache to save energy
- Energy-efficient texture cache
  - Proposed texture map representation
  - Bilinear MIPMAP shows a good quality-to-energy ratio
  - A 4 KB cache with 16 B line size is the optimal point

34 Supplemental Materials

35 Graphics pipeline
- Geometry stage: view transform, projection, clipping, 1/w, screen mapping
  - [Figure: camera position and direction, view frustum, unit cube, axes x and z]
- Rendering stage
  - Triangle setup: for line1, line2, line3 using 1st, 2nd, 3rd
  - Horizontal setup: for line_x using start, end
  - Pixel interpolation: shading and texturing of each pixel
  - [Figure: triangle with vertices labelled 1st, 2nd, 3rd and Top, Mid, Bot; Line1 to Line3, Line_x from start to end; axes Direction_x, Direction_y]

36 Energy portion of cache
- ARM920T and M*CORE: caches consume 50% of total processor system power (Segars 01, Lee et al. 99)
- [Chart annotation: >50%]

37 Blocked representation
- Conventional texture map representation (16X16 blocked)
- Block: square region in which texels are ordered consecutively
- Conflict block changes: A path: 2, B path: 16, C path: 16
- Assumptions
  1. Bilinear filtering
  2. Cache size = block size
  3. 16 entries / 1 line
- [Figure: 16X16 conventional block texture map with paths A, B, C]
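The definition of a block given on this slide (a square region whose texels are stored consecutively) can be turned into an address function. The sketch assumes row-major order both among blocks and inside each block, and a texture width that is a multiple of the block size; both are assumptions, since the slides do not spell out the ordering.

```python
def blocked_address(u: int, v: int, tex_width: int, block: int = 16) -> int:
    """Texel address in a blocked texture map.

    Texels of one block x block square occupy consecutive addresses.
    Blocks are laid out row by row, and texels inside a block row by row
    (assumed orderings). tex_width must be a multiple of block.
    """
    blocks_per_row = tex_width // block
    block_index = (v // block) * blocks_per_row + (u // block)
    offset_in_block = (v % block) * block + (u % block)
    return block_index * block * block + offset_in_block


if __name__ == "__main__":
    # 64 x 64 texture with 16 x 16 blocks.
    print(blocked_address(15, 15, 64))   # last texel of the first block
    print(blocked_address(16, 0, 64))    # first texel of the next block
```

Crossing a block edge jumps the address by a whole block, which is where the conflict block changes counted on the slide come from.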

38 Proposed texture map representation
- Recursive Sub Block texture map representation (RSB4X4)
- Conflict changes: A path: 4, B path: 4, C path: 4
- Assumptions
  1. Bilinear filtering
  2. Cache size = block size
  3. entries / 1 line
- [Figure: recursive sub-block 4X4 method with blocks and paths A, B, C]

39 Simulation between representation methods
- Simulation results between texture representations: bilinear filtering, 2-way, 1 KB texture cache
- 27% performance improvement on average
- Low miss rate doesn't mean high performance
- 8 entries/line or 16 entries/line shows good performance
- [Charts: miss rate and normalized performance for Tiny, Stealth and Alien with 4X4, 8X8, 16X16 and RSB2X2 representations at 4E, 8E, 16E and 32E]

40 Simulation between representation methods
- Bilinear filtering, 1 KB cache, 2-way
  - Low miss rate doesn't mean low energy consumption
  - RSB accomplishes 17.4% energy saving compared to the best of the conventional representations
  - [Chart: normalized energy for Tiny, Stealth and Alien with 4X4, 8X8, 16X16 and RSB2X2 at 4E, 8E, 16E and 32E; annotation: 17.4% energy saving]
- Point sampling, 1 KB cache, 2-way
  - 25% performance improvement on average
  - [Chart: normalized performance for Tiny, Stealth and Alien with 4X4, 8X8, 16X16 and RSB2X2 at 4E, 8E, 16E and 32E]

41 Morton order
- Multi-ported cache: to access more than 1 texel in the same cycle
- Interleaving the cache lines across multiple banks
- Morton order
  - RSB4X4: not free from bank conflict
  - RSB2X2, trilinear filtering: not free from bank conflict; cache for even and odd LOD
- [Figure: 4X4 block map and Morton order, banks selected by LSB 2 bits, data D0 to D7; panels marked "not free from bank conflict" and "bank conflict free"]

42 Proposed texture map representation
- RSB2X2 map representation: bank conflict free
- Conflict bank changes: paths A, B, C
- [Figure: RSB 4X4 block and RSB recursive sub-block 2X2 method with paths A, B, C]

A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications

A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications Ju-Ho Sohn, Jeong-Ho Woo, Min-Wuk Lee, Hye-Jung Kim, Ramchan Woo, Hoi-Jun Yoo Semiconductor System

More information

Design and Optimization of Geometry Acceleration for Portable 3D Graphics

Design and Optimization of Geometry Acceleration for Portable 3D Graphics M.S. Thesis Design and Optimization of Geometry Acceleration for Portable 3D Graphics Ju-ho Sohn 2002.12.20 oratory Department of Electrical Engineering and Computer Science Korea Advanced Institute of

More information

2D/3D Graphics Accelerator for Mobile Multimedia Applications. Ramchan Woo, Sohn, Seong-Jun Song, Young-Don

2D/3D Graphics Accelerator for Mobile Multimedia Applications. Ramchan Woo, Sohn, Seong-Jun Song, Young-Don RAMP-IV: A Low-Power and High-Performance 2D/3D Graphics Accelerator for Mobile Multimedia Applications Woo, Sungdae Choi, Ju-Ho Sohn, Seong-Jun Song, Young-Don Bae,, and Hoi-Jun Yoo oratory Dept. of EECS,

More information

Real-Time Graphics Architecture. Kurt Akeley Pat Hanrahan. Texture

Real-Time Graphics Architecture. Kurt Akeley Pat Hanrahan.  Texture Real-Time Graphics Architecture Kurt Akeley Pat Hanrahan http://www.graphics.stanford.edu/courses/cs448a-01-fall Texture 1 Topics 1. Review of texture mapping 2. RealityEngine and InfiniteReality 3. Texture

More information

A 120mW Embedded 3D Graphics Rendering Engine with 6Mb Logically Local Frame-Buffer and 3.2GByte/s Run-time Reconfigurable Bus for PDA-Chip

A 120mW Embedded 3D Graphics Rendering Engine with 6Mb Logically Local Frame-Buffer and 3.2GByte/s Run-time Reconfigurable Bus for PDA-Chip A 120mW Embedded 3D Graphics Rendering Engine with 6Mb Logically Local Frame-Buffer and 3.2GByte/s Run-time Reconfigurable Bus for PDA-Chip Ramchan Woo*, Chi-Weon Yoon, Jeonghoon Kook, Se-Joong Lee, Kangmin

More information

CS 130 Final. Fall 2015

CS 130 Final. Fall 2015 CS 130 Final Fall 2015 Name Student ID Signature You may not ask any questions during the test. If you believe that there is something wrong with a question, write down what you think the question is trying

More information

Vertex Shader Design I

Vertex Shader Design I The following content is extracted from the paper shown in next page. If any wrong citation or reference missing, please contact ldvan@cs.nctu.edu.tw. I will correct the error asap. This course used only

More information

Mobile Performance Tools and GPU Performance Tuning. Lars M. Bishop, NVIDIA Handheld DevTech Jason Allen, NVIDIA Handheld DevTools

Mobile Performance Tools and GPU Performance Tuning. Lars M. Bishop, NVIDIA Handheld DevTech Jason Allen, NVIDIA Handheld DevTools Mobile Performance Tools and GPU Performance Tuning Lars M. Bishop, NVIDIA Handheld DevTech Jason Allen, NVIDIA Handheld DevTools NVIDIA GoForce5500 Overview World-class 3D HW Geometry pipeline 16/32bpp

More information

Development of a 3-D Graphics Rendering Engine with Lighting Acceleration for Handheld Multimedia Systems

Development of a 3-D Graphics Rendering Engine with Lighting Acceleration for Handheld Multimedia Systems 1020 IEEE Transactions on Consumer Electronics, Vol. 51, No. 3, AUGUST 2005 Development of a 3-D Graphics Rendering Engine with Lighting Acceleration for Handheld Multimedia Systems Byeong-Gyu Nam, Min-wuk

More information

Spring 2009 Prof. Hyesoon Kim

Spring 2009 Prof. Hyesoon Kim Spring 2009 Prof. Hyesoon Kim Application Geometry Rasterizer CPU Each stage cane be also pipelined The slowest of the pipeline stage determines the rendering speed. Frames per second (fps) Executes on

More information

Optimizing and Profiling Unity Games for Mobile Platforms. Angelo Theodorou Senior Software Engineer, MPG Gamelab 2014, 25 th -27 th June

Optimizing and Profiling Unity Games for Mobile Platforms. Angelo Theodorou Senior Software Engineer, MPG Gamelab 2014, 25 th -27 th June Optimizing and Profiling Unity Games for Mobile Platforms Angelo Theodorou Senior Software Engineer, MPG Gamelab 2014, 25 th -27 th June 1 Agenda Introduction ARM and the presenter Preliminary knowledge

More information

A Bandwidth Effective Rendering Scheme for 3D Texture-based Volume Visualization on GPU

A Bandwidth Effective Rendering Scheme for 3D Texture-based Volume Visualization on GPU for 3D Texture-based Volume Visualization on GPU Won-Jong Lee, Tack-Don Han Media System Laboratory (http://msl.yonsei.ac.k) Dept. of Computer Science, Yonsei University, Seoul, Korea Contents Background

More information

Mattan Erez. The University of Texas at Austin

Mattan Erez. The University of Texas at Austin EE382V: Principles in Computer Architecture Parallelism and Locality Fall 2008 Lecture 10 The Graphics Processing Unit Mattan Erez The University of Texas at Austin Outline What is a GPU? Why should we

More information

Building scalable 3D applications. Ville Miettinen Hybrid Graphics

Building scalable 3D applications. Ville Miettinen Hybrid Graphics Building scalable 3D applications Ville Miettinen Hybrid Graphics What s going to happen... (1/2) Mass market: 3D apps will become a huge success on low-end and mid-tier cell phones Retro-gaming New game

More information

Spring 2011 Prof. Hyesoon Kim

Spring 2011 Prof. Hyesoon Kim Spring 2011 Prof. Hyesoon Kim Application Geometry Rasterizer CPU Each stage cane be also pipelined The slowest of the pipeline stage determines the rendering speed. Frames per second (fps) Executes on

More information

E.Order of Operations

E.Order of Operations Appendix E E.Order of Operations This book describes all the performed between initial specification of vertices and final writing of fragments into the framebuffer. The chapters of this book are arranged

More information

Real - Time Rendering. Pipeline optimization. Michal Červeňanský Juraj Starinský

Real - Time Rendering. Pipeline optimization. Michal Červeňanský Juraj Starinský Real - Time Rendering Pipeline optimization Michal Červeňanský Juraj Starinský Motivation Resolution 1600x1200, at 60 fps Hw power not enough Acceleration is still necessary 3.3.2010 2 Overview Application

More information

Lecture 6: Texture. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)

Lecture 6: Texture. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011) Lecture 6: Texture Kayvon Fatahalian CMU 15-869: Graphics and Imaging Architectures (Fall 2011) Today: texturing! Texture filtering - Texture access is not just a 2D array lookup ;-) Memory-system implications

More information

Textures. Texture coordinates. Introduce one more component to geometry

Textures. Texture coordinates. Introduce one more component to geometry Texturing & Blending Prof. Aaron Lanterman (Based on slides by Prof. Hsien-Hsin Sean Lee) School of Electrical and Computer Engineering Georgia Institute of Technology Textures Rendering tiny triangles

More information

Module 13C: Using The 3D Graphics APIs OpenGL ES

Module 13C: Using The 3D Graphics APIs OpenGL ES Module 13C: Using The 3D Graphics APIs OpenGL ES BREW TM Developer Training Module Objectives See the steps involved in 3D rendering View the 3D graphics capabilities 2 1 3D Overview The 3D graphics library

More information

From Vertices to Fragments: Rasterization. Reading Assignment: Chapter 7. Special memory where pixel colors are stored.

From Vertices to Fragments: Rasterization. Reading Assignment: Chapter 7. Special memory where pixel colors are stored. From Vertices to Fragments: Rasterization Reading Assignment: Chapter 7 Frame Buffer Special memory where pixel colors are stored. System Bus CPU Main Memory Graphics Card -- Graphics Processing Unit (GPU)

More information

Graphics Processing Unit Architecture (GPU Arch)

Graphics Processing Unit Architecture (GPU Arch) Graphics Processing Unit Architecture (GPU Arch) With a focus on NVIDIA GeForce 6800 GPU 1 What is a GPU From Wikipedia : A specialized processor efficient at manipulating and displaying computer graphics

More information

CS451Real-time Rendering Pipeline

CS451Real-time Rendering Pipeline 1 CS451Real-time Rendering Pipeline JYH-MING LIEN DEPARTMENT OF COMPUTER SCIENCE GEORGE MASON UNIVERSITY Based on Tomas Akenine-Möller s lecture note You say that you render a 3D 2 scene, but what does

More information

PowerVR Hardware. Architecture Overview for Developers

PowerVR Hardware. Architecture Overview for Developers Public Imagination Technologies PowerVR Hardware Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty of any kind.

More information

Monday Morning. Graphics Hardware

Monday Morning. Graphics Hardware Monday Morning Department of Computer Engineering Graphics Hardware Ulf Assarsson Skärmen består av massa pixlar 3D-Rendering Objects are often made of triangles x,y,z- coordinate for each vertex Y X Z

More information

Baback Elmieh, Software Lead James Ritts, Profiler Lead Qualcomm Incorporated Advanced Content Group

Baback Elmieh, Software Lead James Ritts, Profiler Lead Qualcomm Incorporated Advanced Content Group Introduction ti to Adreno Tools Baback Elmieh, Software Lead James Ritts, Profiler Lead Qualcomm Incorporated Advanced Content Group Qualcomm HW Accelerated 3D: Adreno Moving content-quality forward requires

More information

GeForce4. John Montrym Henry Moreton

GeForce4. John Montrym Henry Moreton GeForce4 John Montrym Henry Moreton 1 Architectural Drivers Programmability Parallelism Memory bandwidth 2 Recent History: GeForce 1&2 First integrated geometry engine & 4 pixels/clk Fixed-function transform,

More information

Mali-400 MP: A Scalable GPU for Mobile Devices Tom Olson

Mali-400 MP: A Scalable GPU for Mobile Devices Tom Olson Mali-400 MP: A Scalable GPU for Mobile Devices Tom Olson Director, Graphics Research, ARM Outline ARM and Mobile Graphics Design Constraints for Mobile GPUs Mali Architecture Overview Multicore Scaling

More information

Tutorial on GPU Programming #2. Joong-Youn Lee Supercomputing Center, KISTI

Tutorial on GPU Programming #2. Joong-Youn Lee Supercomputing Center, KISTI Tutorial on GPU Programming #2 Joong-Youn Lee Supercomputing Center, KISTI Contents Graphics Pipeline Vertex Programming Fragment Programming Introduction to Cg Language Graphics Pipeline The process to

More information

Rendering Objects. Need to transform all geometry then

Rendering Objects. Need to transform all geometry then Intro to OpenGL Rendering Objects Object has internal geometry (Model) Object relative to other objects (World) Object relative to camera (View) Object relative to screen (Projection) Need to transform

More information

Optimizing Games for ATI s IMAGEON Aaftab Munshi. 3D Architect ATI Research

Optimizing Games for ATI s IMAGEON Aaftab Munshi. 3D Architect ATI Research Optimizing Games for ATI s IMAGEON 2300 Aaftab Munshi 3D Architect ATI Research A A 3D hardware solution enables publishers to extend brands to mobile devices while remaining close to original vision of

More information

Real-Time Rendering (Echtzeitgraphik) Michael Wimmer

Real-Time Rendering (Echtzeitgraphik) Michael Wimmer Real-Time Rendering (Echtzeitgraphik) Michael Wimmer wimmer@cg.tuwien.ac.at Walking down the graphics pipeline Application Geometry Rasterizer What for? Understanding the rendering pipeline is the key

More information

Real-World Applications of Computer Arithmetic

Real-World Applications of Computer Arithmetic 1 Commercial Applications Real-World Applications of Computer Arithmetic Stuart Oberman General purpose microprocessors with high performance FPUs AMD Athlon Intel P4 Intel Itanium Application specific

More information

Drawing Fast The Graphics Pipeline

Drawing Fast The Graphics Pipeline Drawing Fast The Graphics Pipeline CS559 Spring 2016 Lecture 10 February 25, 2016 1. Put a 3D primitive in the World Modeling Get triangles 2. Figure out what color it should be Do ligh/ng 3. Position

More information

An Architecture Extension for Efficient Geometry Processing

An Architecture Extension for Efficient Geometry Processing An Architecture Extension for Efficient Geometry Processing Radhika Thekkath, Mike Uhler, Chandlee Harrell, Ying-wai Ho MIPS Technologies, Inc. 1225 Charleston Road Mountain View, CA 94043 Talk Outline

More information

CS427 Multicore Architecture and Parallel Computing

CS427 Multicore Architecture and Parallel Computing CS427 Multicore Architecture and Parallel Computing Lecture 6 GPU Architecture Li Jiang 2014/10/9 1 GPU Scaling A quiet revolution and potential build-up Calculation: 936 GFLOPS vs. 102 GFLOPS Memory Bandwidth:

More information

LRU. Pseudo LRU A B C D E F G H A B C D E F G H H H C. Copyright 2012, Elsevier Inc. All rights reserved.

LRU. Pseudo LRU A B C D E F G H A B C D E F G H H H C. Copyright 2012, Elsevier Inc. All rights reserved. LRU A list to keep track of the order of access to every block in the set. The least recently used block is replaced (if needed). How many bits we need for that? 27 Pseudo LRU A B C D E F G H A B C D E

More information

Programming Graphics Hardware

Programming Graphics Hardware Tutorial 5 Programming Graphics Hardware Randy Fernando, Mark Harris, Matthias Wloka, Cyril Zeller Overview of the Tutorial: Morning 8:30 9:30 10:15 10:45 Introduction to the Hardware Graphics Pipeline

More information
