Ray Intersection Acceleration Image Synthesis Torsten Möller
Reading Physically Based Rendering by Pharr&Humphreys Chapter 2 - rays and transformations Chapter 3 - shapes Chapter 4 - intersections and acceleration An Introduction to Ray tracing by Glassner
Topics today Review of ray-tracing acceleration approaches pbrt Object-space World-space 3
Review of ray-tracing 4
First imaging algorithm Directly derived from the synthetic camera model Follow rays of light from a point light source Determine which rays enter the lens of the camera through the imaging window Compute color of projection Why is this not a good idea? Machiraju/Zhang/Möller 5
Ray tracing When following light from a light source, many (reflected) rays do not intersect the image window and do not contribute to the scene reflected ray shadow ray rays eye pixel Image plane refracted ray Reverse the process Cast one ray per pixel from the eye and shoot it into the scene Machiraju/Zhang/Möller 6
Ray tracing: the basics A point on an object may be illuminated by Light source directly through shadow ray Light reflected off an object through reflected ray Light transmitted through a transparent object through refracted ray reflected ray shadow ray eye Machiraju/Zhang/Möller pixel Image plane refracted ray 7
Ray tracing: the algorithm for each pixel on screen determine ray from eye through pixel if ray shoots into infinity, return a background color if ray shoots into light source, return light color appropriately find closest intersection of ray with an object Most cast off shadow ray (towards light sources) expensive if shadow ray hits a light source, compute light contribution according to some illumination model cast reflected and refracted ray, recursively calculate pixel color contribution return pixel color after some absorption Machiraju/Zhang/Möller 8
Ray Tracing Shoot a ray through each pixel; Find first object intersected by ray Image plane Eye Compute ray. (More linear algebra.) Compute ray-object intersection. Spawn more rays for reflection and refraction
pbrt acceleration architecture 10
Ray Tracing Architecture Scene LRT Parse Sample Generator Film TIFF Image Primitives Camera Radiance Shape Rays intersect Eye rays Lights Material Surf Integrator Shadow, reflection, refraction rays
Optimizing Ray Tracing Main computation load is ray-object intersection 50-90% of run time when profiled Test for possible intersection before committing to computing intersections
Acceleration Strategies Fast/efficient intersection computation (single raysingle object) Not much leeway for acceleration Test fewer rays Trace fewer primary rays Fewer secondary rays Fewer intersection computations Intersection with thick rays Cone tracing Beam tracing Avoid computations with far away objects Space partitioning Direction partitioning 13
Pbrt and Intersections Primitive base class Shapes are subclasses of primitive Methods WorldBound CanIntersect Intersect IntersectP Refine First four return Intersection structures Last returns Primitives Primitives Shape Lights Material
Intersection Geometry Shape independent representation for intersections DifferentialGeometry Intersection::dg Point P Normal N Parametric (u,v) Partial derivatives Tangents: dpdu, dpdv change in normal: dndu, dndv
Object-Based Acceleration Reduce the intersection time for individual objects Objects can be complex! must intersect all triangles But we can quickly estimate if a ray could intersect the object 16
Bounding Volumes Surround object with a simple volume Test ray against volume first Test object-space or world-space bound? (pros and cons) Cost model - N*cb + pi*n*co N (number of rays) is given pi fraction of rays intersecting bounding volume Minimize cb (cost of intersecting bounding volume) and co (cost of intersecting object) Reduce ray path Bounding sphere Difficult to compute good one Easy to test for intersection Bounding box Easy to compute for given object More difficult to intersect Reduce ray path
Pbrt s Bounding Boxes Virtual BBox ObjectBound() const=0; Virtual BBox WorldBound() const { return ObjectToWorld(ObjectBound()); } Bool BBox::IntersectP(const Ray &ray, float *hitt0, float *hitt1) const { }
Bounding Box Compute min/max for x,y,z 3 options Compute in world space Chance of ill fitting b-box Compute in object space and transform w/ object Object space b-box probably better fit than world space Need to intersect ray with arbitrary hexagon in world sp. Compute in object space and test in object space Inverse transform ray into object space
Bounding Box Treat BB as three perpendicular slabs Clip the ray against each slab if the clipped region is valid, then intersection! 20
Bounding Box Find t near and t far for each pair of parallel planes. See PBRT p.194. 21
Bounding Sphere Find min/max points in x,y,z -> 3 pairs Use maximally separated pair to define initial sphere For each point If point is outside of current sphere, increase old sphere to just include new point
P R C R P P rad C P rad newrad = (R+rad)/2 R newc P P P newrad newrad newc C P R newc = P+(newrad/R)(C-P)
Bounding Sphere Intersection is very fast! Does not need to be axis aligned http://www.scratchapixel.com/lessons/3d-basic-rendering/minimal-ray-tracer-rendering-simple-shapes/ray-sphere-intersection 24
Bounding Slabs More complex to compute Better fit of object Use multiple pairs of parallel planes to bound object Can add more slabs to get tighter fit 2004 Pharr, Humphreys
Bounding Slabs Intersection is a generalization of the bounding box computation. Axis-aligned optimizations aren t possible. Potentially more slabs In practice, rarely used 26
Approximate Convex Hull Find highest vertex Find plane through vertex parallel to ground plane Find second vertex that makes minimum angle with first vertex and up vector Find third vertex that makes plane whose normal makes minimum angle with up vector 1989 Glassner More effort to compute, better fit
Convex Hull Intersection is fairly complex Use a sidedness test for all planes Rarely useful 28
Bounding Volume Hierarchies Create tree of bounding volumes Children are contained within parent Creation preprocess From model hierarchy Automatic clustering Each object only appears in one leaf volume.
BVH - Issues Subtrees overlap Does not contain all objects it overlaps Balance Tree Organization
BVH - Construction Take the bounding volume of the scene (or complex object) If too many primitive exist in the volume, create child volumes (split the volume) Recurse until the end criteria is met 31
Splitting strategies Choose the axis, that has the largest spread of midpoints Split based on the midpoint of the centroids of one axis Split in equal amounts Problem - overlaps 32
The Surface Area Heuristic cost of intersecting all N primitives: NX t isect (i) i=1 cost of splitting into groups N A +N B = N t trav - traversal time of node p A / p B - prob that ray passes through each child node 33
The Surface Area Heuristic assume, that t isect (i) is the same for all primitives for convex objects A contained in B, the probability of a ray intersecting A if it intersects B is: p(a B) = s A s B find the axis with the minimal SAH cost 34
35
BVH Intersection intersect(node,ray,hits) { if( intersectp(node->bound,ray) if( leaf(node) ) intersect(nodeprims,ray,hits) else for each child intersect(child,ray,hits) } Return the closest of all hits! 36
Spatial Subdivision Subdivide world space Intersectable objects are placed in spatial voxels A ray is traversed through the subdivisions in the order of near to far No overlapping voxels but objects may be in multiple voxels 37
Uniform Grids Simplest spatial subdivision Very easy to create and traverse Good, but not optimal performance much better than brute force 38
Uniform Grids Preprocess scene Find bounding box
Uniform Grids Preprocess scene Find bounding box Determine grid resolution
Uniform Grids Preprocess scene Find bounding box Determine grid resolution Place object in cell if its bounding box overlaps the cell
AABB/Voxel Intersection Easy to compute Compute which voxel contains each vertex All voxels between the min and max voxels contain the AABB 42
Uniform Grids Preprocess scene Find bounding box Determine grid resolution Place object in cell if its bounding box overlaps the cell Check that object overlaps cell (expensive!)
Triangle/Voxel overlap Search for a separating axis. 1. [3 tests] e0 = (1, 0, 0), e1 = (0, 1, 0), e2 = (0, 0, 1) (the normals of the AABB). Test the AABB against the minimal AABB around the triangle. 2. [1 test] n, the normal of. We use a fast plane/aabb overlap test [5, 6], which only tests the two diagonal vertices, whose direction is most closely aligned to the normal of the triangle. 3. [9 tests] aij = ei fj, i, j {0, 1, 2}, where f0 = v1 v0, f1 = v2 v1, and f2 = v0 v2. These tests are very similar and we will only show the derivation of the case where i = 0 and j = 0 (see below). Tomas Akenine-Möller. 2005. Fast 3D triangle-box overlap testing. In ACM SIGGRAPH 2005 Courses (SIGGRAPH '05), John Fujii (Ed.). ACM, New York, NY, USA,, Article 8. DOI=http://dx.doi.org/ 10.1145/1198555.1198747 44
Add Sorting If objects/voxels/cells are processed in front-to-back sorted order along the ray, stop processing when first intersection is detected
Spatial Subdivision Identifying voxels hit is like a line drawing algorithm
Uniform Grids Preprocess scene Traverse grid 3D line = 3D-DDA
Amanatides & Woo Algorithm J. Amanatides and A. Woo, "A Fast Voxel Traversal Algorithm for Ray Tracing", Proc. Eurographics '87, Amsterdam, The Netherlands, August 1987, pp 1-10.
A&W Algorithm X/Y/Z: Current voxel number. stepx/y/z: 1 or -1. Is the slope +/-? tmaxx/y/z: distance to cross next vert/horiz/depth plane. tdeltax/y/z: distance along ray to travel one whole width/height/depth of a voxel
Uniform Grid Shirley P., Marschner S. Fundamentals of Computer Graphics
A&W Algorithm Results Rendering time for different levels of subdivision
Objects Across Multiple Voxels Mailboxes mailboxes eliminate redundant intersection tests objects have mailboxes mailbox contains ID of last ray that intersected it check ray against objects last tested ray ID Object splitting Split objects at voxel boundaries Sub-objects only appear in one voxel 1989 Glassner
Complexity? Problems? Uniform Grid 53
Uniform Grid Increase resolution Increases traversal cost May only need higher resolution in a small area 54
Hierarchical Data structures http://www.keithlantz.net/2013/04/kd-tree-construction-using-the-surface-area-heuristicstack-based-traversal-and-the-hyperplane-separation-theorem/ 55
Comparison Scheme Spheres (time) Rings (time) Tree (time) Uniform grid D=1 244 129 1517 D=20 38 83 781 Hierarchical grid 34 116 34 See A Proposal for Standard Graphics Environments, IEEE Computer Graphics and Applications, vol. 7, no. 11, November 1987, pp. 3-5
Hierarchical Spatial Subdivision Recursive subdivision of space 1-1 Relationship between scene points and leaf nodes Solves the lack-of-adaptivity problem DDA works Effective in practice 140.129.20.249/~jmchen/compg/slides/Octree%20Traversal.ppt
Creating Spatial Hierarchies insert(node,prim) { if( overlap(node->bound,prim) ) if( leaf(node) ) { if( node->nprims > MAXPRIMS && node->depth < MAXDEPTH ) { } } else subdivide(node); The interesting bit foreach child in node insert(child,prim) } else list_insert(node->prims,prim); foreach child in node insert(child,prim)
Hughes J., et al. Computer Graphics Principles and Practice Oct-Tree
Oct-tree Similar to a binary tree Each node has 8 children Always split the node in the middle of each axis Allows fast skipping of empty space Better efficiency than uniform grid, but still not the best Traversal is still relatively easy Complexity? 60
Oct-tree Traversal can be implemented similar to 3D-DDA Use smallest voxel as a basis, and skip multiple while inside an internal node Top-down approach Intersect root and find intersected children recursively http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.987 Bottom-up approach Find first leaf voxel intersected. Use neighbour finding to find next voxels 61
Binary Space Partitioning (BSP) General case uses arbitrary splitting planes Space is partitioned into half spaces All bins are convex Difficult to build, traversal isn t too bad Generally not used for rendering (maybe for physics simulation) See: T. Ize, I. Wald and S. G. Parker, "Ray tracing with the BSP tree," Interactive Ray Tracing, 2008. RT 2008. IEEE Symposium on, Los Angeles, CA, 2008, pp. 159-166. doi: 10.1109/RT.2008.4634637 62
Octree BSP tree KD tree
KD-Tree A binary tree in K dimensions Split one dimension at a time Each node may only have 2 children Easy to build and traverse How to choose which dimension to split? How to choose where to split? 64
KD-tree 65
KD-tree Use SAH to choose the split location. Same as BVH 66
KD-tree 67
KD-tree Traversal (stack) intersect with scene bounding box while no intersection && in the scene if node is a leaf intersect with all contained primitives else compute near and far child intersect with near child, push far child to a stack if no intersection, then pop a node from the stack continue until intersection, or exit the scene 68
69
KD-tree Other traversal cases: ray traverses the children in reverse order ray starts inside the node ray only intersects one child Fast traversal is very important it will happen frequently recursive is possible, but directly using a stack can be faster Complexity? 70
KD-tree Possible improvements? Stackless traversal! each voxel contains pointers to it s neighbours traverse the tree directly, without using a stack avoid moving up and down the tree unnecessarily costs additional memory and precompute time very good for highly parallel (GPU) implementations 71
KD-tree Popov, S., Günther, J., Seidel, H.-P. and Slusallek, P. (2007), Stackless KD-Tree Traversal for High Performance GPU Ray Tracing. Computer Graphics Forum, 26: 415 424. doi:10.1111/j.1467-8659.2007.01064.x 72
Final Thoughts Acceleration structures have a HUGE influence on performance In practice BVH and KD-tree have similar performance Optimizations exist for both Memory organization can have a big performance impact. Especially for parallel computing 73
http://www.keithlantz.net/2013/04/kd-tree-construction-using-the-surface-area-heuristicstack-based-traversal-and-the-hyperplane-separation-theorem/ 74