Culling
Don't draw what you can't see!
Thomas Larsson, Mälardalen University, April 7, 2016

Motivation
- Image correctness
- Rendering speed
- One day we will have enough processing power!?

Goals of real-time 3D graphics
- Higher frame rate (60-120 fps)
- Higher resolution (4800x3600)
- Improved geometric detail (1 billion polygons)
- Improved lighting and shading (global illumination)

Ideal frame rate
- Avoid flickering and jerky motion
- High and steady frame rate
- Render at the same rate as the refresh frequency of the monitor

What can't we see?
- Anything outside the view volume
- Anything occluded by another object
  - Objects, polygons, pixels, ...
- Anything smaller than a pixel
- Anything in complete shadow

Low-level culling
- Culling per primitive
  - Back-face culling: removes polygons facing away from the viewer
  - Clipping: ensures that no geometry is drawn outside the viewport
- Culling per pixel
  - Z-buffer algorithm: ensures that hidden geometry is not drawn over already drawn geometry
- OpenGL only supports low-level culling
- Clipping and the Z-buffer are important for image correctness
- Low-level culling is also used to speed up rendering
- Transform and lighting must still be performed for each primitive
Back-face culling
- Useful when back faces cannot be seen
  - Closed polygonal meshes
- Consider the normal of a given polygon
  - if it is pointing towards the eye, we may be able to see it
  - pointing away means the polygon is on the opposite side of the object
- We need a back-face test
  - If $n \cdot v \geq 0$, where $v$ is the viewing direction, then the primitive is a back face
  - This requires correct surface normals, not the normals often used for special lighting effects
  - In normalized device coordinates the test becomes very efficient, since $v = (0, 0, 1)$
- Back-face test in the OpenGL pipeline
  - Uses the signed area of the projected primitive in screen space rather than the dot product test
- Where in the pipeline does back-face culling fit in best?

Using back-face culling in OpenGL
- Enable back-face removal with glEnable(GL_CULL_FACE)
- Disable back-face removal with glDisable(GL_CULL_FACE)
- Select which polygons to remove with one of the following calls:
  - glCullFace(GL_BACK)
  - glCullFace(GL_FRONT)
  - glCullFace(GL_FRONT_AND_BACK)
- For polygonal models, the convention is to specify the triangles such that the outward face is the front face
- What if the camera happens to be inside the model? How would you set up the back-face removal?

Clipping
- What about polygons piercing one or more of the frustum planes?
  - Such polygons need clipping
  - Pixel-wise clipping is not a good idea, at least not for large triangles
- Clipping happens just prior to rasterization
  - almost always done by the graphics system

Polygon clipping
- Given an initial polygon, find the areas within the viewport
- Usually, the viewport is an axis-aligned rectangle
- The number of vertices may increase
  - Consider clipping a triangle. How many vertices can there be in the clipped polygon?
- Can yield one or more polygons in the general case
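Returning to the back-face test: the signed-area variant used in screen space can be sketched in a few lines of C. This is only an illustration, not OpenGL's actual implementation. The names `Vec2`, `signed_area2`, and `is_back_face` are hypothetical, and the sketch assumes counter-clockwise triangles are front faces (OpenGL's default winding, GL_CCW) with y pointing up.

```c
#include <stdbool.h>

/* Hypothetical 2D point type: a vertex after projection to screen space. */
typedef struct { float x, y; } Vec2;

/* Twice the signed area of triangle (a, b, c).
   Positive when the vertices appear in counter-clockwise order (y up). */
float signed_area2(Vec2 a, Vec2 b, Vec2 c)
{
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

/* Assumption of this sketch: counter-clockwise means front-facing.
   Triangles with zero or negative area are treated as back faces. */
bool is_back_face(Vec2 a, Vec2 b, Vec2 c)
{
    return signed_area2(a, b, c) <= 0.0f;
}
```

Swapping any two vertices of a triangle flips the sign of the area, which is why a consistent winding order for the outward face matters.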
Sutherland-Hodgman clipping
- This is a divide-and-conquer algorithm
  - repeated clipping against a half-space
  - walk around the boundary of the polygon
  - four cases for each edge transition
    - in to in: output next vertex
    - in to out: output intersection
    - out to out: output nothing
    - out to in: output intersection + next vertex
- The method can be used for any convex clipping region
- It can work in 3-D as well as 2-D

Clipping, cont.
- Implementation issues
  - Care is needed to get this right
  - We may need to create multiple separate polygons
  - We may get infinitely thin areas along the edges of the clipping region
  - Some other degeneracies
- Other polygon clipping algorithms
  - Cyrus-Beck, Liang-Barsky, and Weiler algorithms
  - Perhaps more efficient, but also more complex
- Clipping in the OpenGL pipeline

Z-buffer algorithm
- Pixel-wise hidden surface removal
  - Keep depth values for the currently visible pixels in the Z-buffer
  - Before drawing a pixel, check if it should be visible
- Advantages
  - The method is simple to implement in hardware
  - It's a natural part of the primitive rasterization process
- Memory requirement drawback
  - Suppose we need 32 bits per depth value at full HD resolution, 1920x1080. How much space is needed for the depth buffer?
- Accuracy problems
  - Only considers visibility at discrete pixels => aliasing artifacts
  - Depth buffer resolution issue => Z-fighting
- Potential inefficiency
  - High depth-complexity scenes
  - We might get many overdraws

Z-buffer and overdraws
- Suppose 10 triangles cover the same pixel, but with different depth values. How many times will the pixel be drawn?
- Clearly, it depends on the draw order
  - Minimum = 1
  - Maximum = 10
- Assume a random draw order. How many times will the pixel be drawn on average?
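The four edge-transition cases of Sutherland-Hodgman clipping listed earlier can be sketched as follows. This is a minimal 2D illustration, not a production clipper. It assumes a convex input polygon and an axis-aligned clip rectangle, and all names (`Point`, `HalfPlane`, `clip_half_plane`, `clip_to_rect`, `MAX_VERTS`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VERTS 64

typedef struct { double x, y; } Point;

/* Half-plane a*x + b*y + c >= 0; points satisfying it are "inside". */
typedef struct { double a, b, c; } HalfPlane;

double side(HalfPlane h, Point p) { return h.a * p.x + h.b * p.y + h.c; }

/* Intersection of segment pq with the boundary line of h.
   Only called when p and q lie on different sides, so sp != sq. */
Point intersect(HalfPlane h, Point p, Point q)
{
    double sp = side(h, p), sq = side(h, q);
    double t = sp / (sp - sq);
    Point r = { p.x + t * (q.x - p.x), p.y + t * (q.y - p.y) };
    return r;
}

/* One clipping pass: walk the polygon boundary and apply the four cases.
   out needs room for n + 1 vertices when the input is convex. */
size_t clip_half_plane(const Point *in, size_t n, HalfPlane h, Point *out)
{
    size_t m = 0;
    for (size_t i = 0; i < n; ++i) {
        Point cur = in[i], next = in[(i + 1) % n];
        int cur_in = side(h, cur) >= 0.0;
        int next_in = side(h, next) >= 0.0;
        if (cur_in && next_in) {            /* in to in */
            out[m++] = next;
        } else if (cur_in && !next_in) {    /* in to out */
            out[m++] = intersect(h, cur, next);
        } else if (!cur_in && next_in) {    /* out to in */
            out[m++] = intersect(h, cur, next);
            out[m++] = next;
        }                                   /* out to out: output nothing */
    }
    return m;
}

/* Repeated clipping against the four half-planes of a rectangle.
   Returns the vertex count written to out (at most n + 4 for convex input). */
size_t clip_to_rect(const Point *poly, size_t n,
                    double xmin, double ymin, double xmax, double ymax,
                    Point *out)
{
    const HalfPlane planes[4] = {
        {  1.0,  0.0, -xmin },   /* x >= xmin */
        { -1.0,  0.0,  xmax },   /* x <= xmax */
        {  0.0,  1.0, -ymin },   /* y >= ymin */
        {  0.0, -1.0,  ymax },   /* y <= ymax */
    };
    Point a[MAX_VERTS], b[MAX_VERTS];
    Point *src = a, *dst = b;
    assert(n + 4 <= MAX_VERTS);
    for (size_t i = 0; i < n; ++i) a[i] = poly[i];
    for (int k = 0; k < 4; ++k) {
        n = clip_half_plane(src, n, planes[k], dst);
        Point *tmp = src; src = dst; dst = tmp;
    }
    for (size_t i = 0; i < n; ++i) out[i] = src[i];
    return n;
}
```

Clipping the triangle (0,0), (4,0), (0,4) against the rectangle [0,3] x [0,3] produces a pentagon. If the diagonal edge passes exactly through a rectangle corner, the sketch emits a duplicate vertex, which is one of the degeneracies the slides warn about.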
Using the Z-buffer in OpenGL
- First, we need to tell GLUT that a depth buffer is needed:
  glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH)
- And we need to enable depth testing:
  glEnable(GL_DEPTH_TEST)
- Before drawing the next frame, we need to reset the depth buffer:
  glClear(GL_DEPTH_BUFFER_BIT)
- As needed, we can change the way depth values are compared with each other:
  glDepthFunc(GL_LESS)
  Note: GL_LESS is the default

Higher-level culling
- Obviously, complex scenes need higher-level culling methods than what OpenGL has to offer
- Otherwise, every primitive is processed, so the cost is linear in the number of primitives even when nothing is visible
- Main methods
  - View-Frustum Culling (VFC)
  - Occlusion Culling

View Frustum Culling (VFC)
- VFC tries to reject objects outside the viewing volume
  - typically done by the application
  - happens prior to lighting and transformation
- Discard any object outside the viewing volume early on
- The viewing volume is formed by 6 planes. All points inside satisfy
  $a_i x + b_i y + c_i z + d_i \geq 0$ for $i = 0, \ldots, 5$
- For each polygon P in polygon mesh M:
  - If P is completely outside the viewing volume, then throw away the polygon
- What's wrong with this simple algorithm?

Primitive grouping
- What if an object with a million polygons is completely outside the view volume?
- Per-polygon processing is inefficient!
- Use bounding volumes (BVs) to group nearby primitives together
- Now test the bounding volume first
  - if outside the frustum, reject the whole object
  - otherwise, consider the individual parts of the object
- Performance
  - Testing whether a BV is fully inside or outside takes constant time, O(1)
Bounding volume types
- Several types have become popular
  - Spheres
  - Axis-aligned bounding boxes (AABBs)
  - Oriented bounding boxes (OBBs)
  - k-DOPs
  - Convex hulls
- The choice of bounding volume (BV) is often a tradeoff between simplicity of use and tightness of fit
- How can these BVs be computed?

The bounding sphere
- A bounding sphere is a sphere enclosing an object completely
- An optimal bounding sphere is the minimal sphere enclosing an object completely
- Parameters: center point and radius
- Properties: not tight fitting, memory efficient, fast tests, rotationally invariant

Bounding sphere computation
- We only need to consider the vertices (corners) of polygonal objects
- Fast constant approximation heuristics:
  - AABB midpoint as center
  - Average point as center
  - Ritter's algorithm (see e.g. Graphics Gems I)
- Finding minimum volume spheres
  - Algorithm by Emo Welzl
  - Runs in expected O(n) time (in any fixed dimension)
  - Improved version by Gärtner
  - Miniball source code: http://www.inf.ethz.ch/personal/gaertner/miniball.html

Choice of frame for VFC
- We need to decide the frame of reference to be used for view frustum culling calculations
  - clip space (CS)
  - view space (VS)
  - world space (WS)
  - local object space (LS)
- We already know the planes of the canonical view volume, that is, in CS
- But BVs will be deformed when transformed to CS
- LS or WS seems nice. Can you see why?
Finding the frustum planes
- These are the known planes in CS:
  Left:   $L'_0 = (n_0, d_0) = (1, 0, 0, 1)$
  Right:  $L'_1 = (n_1, d_1) = (-1, 0, 0, 1)$
  Bottom: $L'_2 = (n_2, d_2) = (0, 1, 0, 1)$
  Top:    $L'_3 = (n_3, d_3) = (0, -1, 0, 1)$
  Near:   $L'_4 = (n_4, d_4) = (0, 0, 1, 1)$
  Far:    $L'_5 = (n_5, d_5) = (0, 0, -1, 1)$
- Let M be the combined modelview and projection matrix. A plane $L_i$ in model space is then found by applying the inverse transformation $M^{-1}$ to the corresponding plane $L'_i$ in CS:
  $L_i = [(M^{-1})^{-1}]^T L'_i = M^T L'_i$
- Note: Since we are transforming planes, the inverse transpose of $M^{-1}$ is used in this formula.

Simplification
- There are lots of zeros and ones in the plane equations $L'_i$. We can simplify a lot!
- It turns out that the planes are given as simple sums and differences of row vectors in M:
  Left:   L_0 = (m3+m0, m7+m4, m11+m8, m15+m12)
  Right:  L_1 = (m3-m0, m7-m4, m11-m8, m15-m12)
  Bottom: L_2 = (m3+m1, m7+m5, m11+m9, m15+m13)
  Top:    L_3 = (m3-m1, m7-m5, m11-m9, m15-m13)
  Near:   L_4 = (m3+m2, m7+m6, m11+m10, m15+m14)
  Far:    L_5 = (m3-m2, m7-m6, m11-m10, m15-m14)
- Note: It is assumed here that the elements of the matrix M are stored in a plain array with 16 elements, in column-major order as in OpenGL.

Simple sphere/frustum overlap test
- A sphere is outside the frustum if it is completely behind at least one of the six frustum planes
- Testing the overlap status between a sphere and a plane is extremely simple:
  - Insert the center point, C, of the sphere in the plane equation to get the signed distance from C to the plane, and compare it with the radius
- Using 4D vectors the test becomes:
  - A sphere with center $C = \langle x, y, z, w = 1 \rangle$ and radius $r$ is behind a plane $L = \langle n, d \rangle$ if $L \cdot C < -r$
- Note! Planes must be normalised to ensure correct distances: $L = \langle n / \|n\|, d / \|n\| \rangle$

False positives
- The described overlap test is conservative
  - Sometimes an object is reported as being inside the viewing frustum although it is not
  - Such cases are called false positives
- False positives do not lead to rendering errors, since clipping will ensure correct results anyway
- However, false positives may lead to a performance penalty in specific cases
- We can avoid false positives by using an exact sphere/frustum overlap test
  - But it's probably not worth the trouble... The focus is on the average rendering speed
- [Figure: a corner case giving rise to a false positive]
Hierarchical view-frustum culling
- What if an object is partly inside and partly outside the view volume?
- Then, an efficient culling scheme requires a hierarchy of volumes
  - For example, a bounding volume hierarchy
- Begin testing at the root node
  - if outside, nothing is visible, we're done
  - otherwise, recursively test sub-nodes
- This raises several questions
  - How can we build good hierarchies?
  - Which BVs, or combinations of BVs, are most advantageous?

Bounding volume hierarchies
- A tree structure
- Tightly fits objects
- Height of a balanced tree: $\lfloor \log_k n \rfloor$

Tree building
- Basic approaches
  - Top-down
  - Bottom-up
  - Incremental insertion
- Other decisions
  - BV type
  - Node degree
  - Primitives per leaf
  - Balanced tree? Not best in all cases

Occlusion culling
- Throw away non-visible objects inside the frustum
- Complicated
  - many different algorithms have been proposed
  - no single best solution
  - results depend on the inter-relation of the objects
- Two major forms
  - Point-based or cell-based
- Algorithm categories
  - Image space, object space, ray space
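The recursive test of hierarchical view-frustum culling can be sketched as below. The sketch assumes a binary hierarchy of bounding spheres and six normalised planes whose positive side is the inside of the frustum. It also stops testing planes once a node is found to be fully inside, a common refinement the slides do not spell out. All names (`BvhNode`, `Overlap`, `classify_sphere`, `collect_visible`) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct { float a, b, c, d; } Plane;

/* Hypothetical BVH node: a bounding sphere plus two children.
   A node with no children is a leaf that refers to one object. */
typedef struct BvhNode {
    float cx, cy, cz, radius;
    const struct BvhNode *left, *right;
    int object_id;
} BvhNode;

typedef enum { OUTSIDE, INTERSECTING, INSIDE } Overlap;

/* Classify a node's sphere against six normalised planes. */
Overlap classify_sphere(const Plane planes[6], const BvhNode *n)
{
    Overlap result = INSIDE;
    for (int i = 0; i < 6; ++i) {
        float dist = planes[i].a * n->cx + planes[i].b * n->cy
                   + planes[i].c * n->cz + planes[i].d;
        if (dist < -n->radius) return OUTSIDE;       /* fully behind */
        if (dist < n->radius) result = INTERSECTING; /* straddles plane */
    }
    return result;
}

/* Appends the ids of potentially visible leaves to visible[] and returns
   the new count. The caller must provide room for every leaf. */
size_t collect_visible(const BvhNode *node, const Plane planes[6],
                       bool skip_tests, int *visible, size_t count)
{
    if (node == NULL) return count;
    if (!skip_tests) {
        Overlap o = classify_sphere(planes, node);
        if (o == OUTSIDE) return count;       /* prune the whole subtree */
        if (o == INSIDE) skip_tests = true;   /* descendants need no tests */
    }
    if (node->left == NULL && node->right == NULL) {
        visible[count++] = node->object_id;
        return count;
    }
    count = collect_visible(node->left, planes, skip_tests, visible, count);
    return collect_visible(node->right, planes, skip_tests, visible, count);
}
```

A call such as `collect_visible(root, planes, false, ids, 0)` starts at the root. Subtrees whose sphere lies behind any plane are skipped without visiting their children, which is where the savings over per-object testing come from.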
General approach

    Let O_R be the occlusion representation
    for each model m do
        if not Occluded(O_R, m) then
            render(m)
            update(O_R, m)
        end if
    end for

Hierarchical Z-buffering
- Occlusion culling method by Greene et al.
- Operates in image space
- Uses an octree and a Z-pyramid
  - traverse the octree in rough front-to-back order while culling occluded nodes using the Z-pyramid
- [Figure: example Z-pyramid of depth values]

Hardware occlusion queries
- Nvidia's occlusion query
  - Scan convert a simple BV while comparing with the Z-buffer
  - returns n = number of visible pixels
  - if n is zero, the BV is completely occluded and we can safely skip the model inside the BV
  - when n > 0, n might determine the LOD
- Several queries can be tested in parallel
- Using bounding boxes, 12 triangles are rendered per query
  - Note: At least six of them will be back sides

Portal culling
- A special kind of occlusion culling algorithm
- Determine visibility through portals or doors
- Useful for architectural models
  - Walls serve as large occluders
- Can be seen as an extension of view-frustum culling
  - Do view-frustum culling through each portal
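The Z-pyramid used in hierarchical Z-buffering can be illustrated by its reduction step: each coarser level stores, per texel, the farthest depth of the corresponding 2x2 block of the finer level. The sketch below assumes square, power-of-two levels and that larger depth values are farther away. The names `zpyramid_reduce` and `region_is_occluded` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Builds one coarser pyramid level. fine holds size*size depth values in
   row-major order; coarse receives (size/2)*(size/2) values, each the
   maximum (farthest) depth of a 2x2 block. */
void zpyramid_reduce(const float *fine, size_t size, float *coarse)
{
    assert(size >= 2 && size % 2 == 0);
    size_t half = size / 2;
    for (size_t y = 0; y < half; ++y) {
        for (size_t x = 0; x < half; ++x) {
            float z = fine[(2 * y) * size + 2 * x];
            float z1 = fine[(2 * y) * size + 2 * x + 1];
            float z2 = fine[(2 * y + 1) * size + 2 * x];
            float z3 = fine[(2 * y + 1) * size + 2 * x + 1];
            if (z1 > z) z = z1;
            if (z2 > z) z = z2;
            if (z3 > z) z = z3;
            coarse[y * half + x] = z;
        }
    }
}

/* Conservative occlusion test against one pyramid texel: a region whose
   nearest depth is not in front of the farthest stored depth cannot
   contribute any visible pixel within that texel. */
int region_is_occluded(float farthest_stored_z, float nearest_region_z)
{
    return nearest_region_z >= farthest_stored_z;
}
```

Applying the reduction repeatedly until a single value remains gives the pyramid's top, the farthest depth in the whole image. A node can be rejected at a coarse level with one comparison, and only inconclusive cases need the finer levels.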
Portal culling, cont.
- We only need to render the geometry seen through the portals
- [Figure: portal culling example with labels eye, A, B, C, D, E]

Detail culling
- Sacrifices quality for speed
- Based on the size of the projected BV
  - if it is too small, discard it
- Fits nicely together with LOD-based rendering

Levels of detail (LODs)
- Suppose an object consists of lots of polygons (like millions)
  - It will be very slow to render this object
  - If the object is very far away, it will only project to a few pixels
  - Rendering millions of triangles is a waste of time!
- Therefore, build and store several different polygon representations of this object, at different levels of detail (LODs)
- Goal: Only use as much detail in a model as can be seen
  - When the object is near, use a high LOD
  - When the object is far, use a low LOD
- Need to do
  - LOD selection (based on size of projection)
  - LOD switching (smooth transition from one LOD to another)
- Continuous levels of detail (CLOD)
  - Fine-grained transitions
  - Based on the process of mesh simplification
  - Store edge collapses
  - The reverse operation is a vertex split

Scene graphs
- A higher-level data structure
- More than hierarchical geometry relationships
  - Textures, light sources, hierarchical animation, render states, etc.
- The scene graph also provides acceleration
  - Supports view-frustum traversal prior to rendering
  - Resembles BVHs: nodes often store a BV covering the entire sub-graph
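Detail culling and LOD selection based on projected size can share one estimate. The sketch below is one possible scheme, not the method prescribed by the slides. It approximates the projected radius of a bounding sphere as radius * proj_scale / distance, where proj_scale is assumed to be precomputed by the caller as (viewport height / 2) / tan(vertical field of view / 2). The thresholds and the names `projected_radius_px` and `select_lod` are hypothetical, and LOD 0 is taken to be the most detailed model.

```c
#include <assert.h>
#include <stddef.h>

/* Approximate on-screen radius in pixels of a bounding sphere at the
   given distance in front of the camera (perspective projection). */
float projected_radius_px(float radius, float distance, float proj_scale)
{
    assert(distance > 0.0f);
    return radius * proj_scale / distance;
}

/* min_radius_px[i] is the smallest projected radius for which LOD i is
   used; the array must be sorted in decreasing order. Returns the LOD
   index, or -1 if the object is too small to draw (detail culling). */
int select_lod(float radius_px, const float *min_radius_px,
               size_t lod_count, float cull_below_px)
{
    assert(lod_count > 0);
    if (radius_px < cull_below_px)
        return -1;
    for (size_t i = 0; i < lod_count; ++i) {
        if (radius_px >= min_radius_px[i])
            return (int)i;
    }
    return (int)lod_count - 1;   /* fall back to the coarsest model */
}
```

Switching abruptly between the returned indices can cause visible popping, which is why the slides list LOD switching as a separate task from LOD selection.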