Half-precision Floating-point Ray Traversal

Size: px
Start display at page:

Download "Half-precision Floating-point Ray Traversal"

Transcription

1 Half-precision Floating-point Ray Traversal Matias Koskela, Timo Viitanen, Pekka Jääskeläinen and Jarmo Takala Department of Pervasive Computing, Tampere University of Technology, Korkeakoulunkatu 1, 33720, Tampere, Finland Keywords: Abstract: Ray Tracing, Bounding Volume Hierarchy, Half-precision Floating-point Numbers. Ray tracing is a memory-intensive application. This paper presents a new ray traversal algorithm for bounding volume hierarchies. The algorithm reduces the required memory bandwidth and energy usage, but requires extra computations. This is achieved with a new kind of utilization of half-precision floating-point numbers, which are used to store axis aligned bounding boxes in a hierarchical manner. In the traversal the ray origin is moved to the edges of the intersected nodes. Additionally, in order to retain high accuracy for the intersection tests the moved ray origin is converted to the child s hierarchical coordinates, instead of converting the child s bound coordinates into world coordinates. This means that both storage and the ray intersection tests with axis aligned bounding boxes can be done in half-precision. The algorithm has better results with wider vector instructions. The measurements show that on a Mali-T628 GPU the algorithm increases frame rate by 3% and decreases power consumption by 9% when wide vector instructions are used. 1 INTRODUCTION Ray tracing has been a commonly used method to render off-line images. Due to active research in the field, there are now multiple frameworks which can achieve real-time frame rates on high-end hardware. However, significant progress must be made in the field before ray tracing is fast enough for consumer applications on mobile hardware. The basic operation in ray tracing is ray casting, which finds the closest intersection of a ray with the set of scene primitives. In order to reach high performance in large scenes with many primitives, ray tracers organize the scene using special acceleration data structures. The most popular of these structures is the Bounding Volume Hierarchy (BVH). In BVH, primitives are stored in the leaves of the tree, and each node contains the Axis Aligned Bounding Boxes (AABB) of its children. During ray casting, entire subtrees are then rapidly eliminated by finding that the ray does not intersect a high-level AABB, without making separate checks against each leaf primitive in the subtree. Ray tracing is an expensive operation both in terms of computation and memory traffic. In recent years, performance has been improved by leaps and bounds due to the increased parallel resources available in modern computing systems. Therefore, the memory system is often a bottleneck since the scene hierarchy may be more than tens of megabytes in size and is accessed in an almost unpredictable order. There is a large body of research on reducing memory traffic by representing the AABBs in BVHs in a more compact or approximate manner. For instance, in a Shared-Plane Bounding Volume Hierarchy (SPBVH) (Ernst and Woop, 2011), a tree node infers some of the AABB coordinates from the parent AABB. In practice this halves the memory footprint of the nodes. BVH compression techniques are of high interest for mobile systems due to their limited memory bandwidth and low power requirement. Several CPU and GPU vendors have introduced architectural support for a half-precision (16-bit) floating-point format (IEEE, 2008), which poses an interesting opportunity for the compression. Furthermore, current desktop hardware only supports half-floats as a storage format, but this is changing in the near future, as the next-generation of NVidia s GPUs is also scheduled to have half-precision computing support (Huang, 2015). In this paper, a novel hierarchical half-precision BVH traversal scheme which computes AABB intersection tests natively in half-precision format is proposed. This removes the format conversion overheads which slowed down prior work (Koskela et al., 2015), which exploited half-precision only as a storage format. In the measurements using wide vector instructions the proposed method improves frame rate by an average of 3% and energy efficiency by 9%. Koskela, M., Viitanen, T., Jääskeläinen, P. and Takala, J. Half-precision Floating-point Ray Traversal. DOI: / In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016) - Volume 1: GRAPP, pages ISBN: Copyright c 2016 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved 171

2 GRAPP International Conference on Computer Graphics Theory and Applications 2 BACKGROUND Ray tracing is well suited for many ways of parallel computations. Threads can be used, for example, to trace multiple rays in parallel. On the other hand, there are at least two ways in which ray traversal can use vector instructions. First, multiple rays can be traced together in parallel, which is called packet tracing. Second, multiple ray-aabb tests can be calculated in parallel. In order to use wider vectors, BVHs with a greater branching factor than two can be used, which is called Multi Bounding Volume Hierarchy (MBVH) (Ernst and Greiner, 2008). Usually MB- VHs are labelled based on their branching factor in a way that MBVH4 means that the tree has a branching factor of 4 and MBVH8 means that the tree has a branching factor of 8, et cetera. The two ways of using vector instructions can be combined. In that case multiple rays are traced in parallel and all of them test intersections with multiple AABBs. There have been many proposed algorithms and data structures to reduce the amount of data needed for the ray tracing traversal step. These algorithms can be divided into two categories. First, some of the child AABB bounds can be inferred from the parent AABB. Because in the BVH the AABBs are tightly bound to their child geometry, all of the parent bounds are used by at least one of the children. Some examples of this are Bounding Interval Hierarchy (BIH), which stores only two coordinate values and one axis in each node (Wächter and Keller, 2006) and the SPBVH (Ernst and Woop, 2011). Nevertheless, these ideas are not as good with MBVHs since there are more children, which means that fewer values for each child can be inferred. 3 RELATED WORK Another approach is to compress data to reduced precision data types, at the cost of decompression during traversal and reduced data quality. Low quality data causes extra work at traversal, as the ray-aabb intersection tests might give false positive hits. Moreover, algorithms must be carefully designed to avoid false negatives in intersection tests, which would cause visible gaps in the scene geometry. One way to compress the data is to use reduced precision integer formats (Mahovsky and Wyvill, 2006). Most of the hardware commonly used today is capable of doing calculations in different integer precisions. Using integers, the scene is divided into equally sized quantization steps. This is beneficial if the details of the geometry are equally divided around the whole scene. To avoid too big quantization steps, the data type can be used hierarchically. In this representation the value range of the data type is scaled to the parent node s AABB bounds, so that the smallest possible value of the data type corresponds to the parent s lower bound and the greatest possible value to the upper bound. In a deep tree, even with two integer bits, this kind of hierarchical encoding can achieve more accuracy per coordinate than single-precision floating-point format. This is possible because on every lower level of the tree the quantization steps get smaller in world coordinates. However, this extra precision is not useful since the triangles are usually stored in single-precision format. The previous work on half-precision floatingpoint BVHs (Koskela et al., 2015) only considers the use of half-precision floating-point numbers as a storage format. This reduces BVH inner node memory bandwidth and space usage by half, which corresponds to an average of 16% reduction in cache misses and 7% reduction in memory space usage with MBVH4. Their work evaluates both the BVH with world coordinate half-precision floating-point numbers and the hierarchical representation similar to the work on integers (Mahovsky and Wyvill, 2006). The world coordinate half-precision traversal performance is reduced by the extra traversal caused by the false positives and the overhead of format conversion when AABBs are loaded from memory. The hierarchical half-precision BVH avoids most of the false positives, at the cost of extra computational overhead. Instead of converting the hierarchical AABB bound coordinates into world coordinates, the ray origin can be converted into hierarchical coordinates (Keely, 2014). This allows intersection tests to be computed in a low-precision format, which reduces the required bit counts significantly. Moreover, this is beneficial since fewer coordinates need to be converted to different coordinate base (there are six values in the AABB bounds and only three values in the ray origin). To avoid overflows in the hierarchical coordinates, the ray origin has to be moved to the edge of the coordinate range on every traversal step. 4 PROPOSED ALGORITHM This paper follows the compression category by using half-precision floating-point numbers. The traversal algorithm used in this paper is based on a similar idea as the (Keely, 2014) algorithm. The difference is that instead of adding custom integer hardware to the GPU, the proposed algorithm uses ex- 172

3 Half-precision Floating-point Ray Traversal 00 int nodeindex = indexstack.pop(); 01 half3 rayorigin = rayoriginstack.pop(); 02 half scalesofar = scalestack.pop(); 03 half dissofar = distancestack.pop(); if(nodeindex is inner node index){ 06 // vectorised ray-aabb slab hit tests 07 // for every ray in the packet against 09 // every child node s AABB in parallel 10 foreach hitnode that is hit by any ray 11 in the packet and is closer than 12 closest found triangle{ indexstack.push(hitnode.index); // Move ray origin to edge of hitnode 17 half hitdis = max(hitdistance, 0.0); 18 half3 newrayorigin = rayorigin * hitdis * raydirection; // Convert to deeper hierarchical coords 22 half3 axisranges = hitnode.upperbound 23 - hitnode.lowerbound; 24 half range = max(axisranges.x, 25 axisranges.y, axisranges.z); 26 float3 mover = 0.5 * 27 (float3)(hitnode.lowerbound + 28 hitnode.upperbound - range); 29 newrayorigin = (half3)(val_range * 30 (((float3)newrayorigin - mover) / 31 (float)range) + VAL_MIN); 32 rayoriginstack.push(newrayorigin); // Update distance values 35 half newscale = scalesofar * 36 (range / VAL_RANGE); 37 distancestack.push(dissofar + hitdis * 38 scalesofar); 39 scalestack.push(newscale); 40 } 41 }else{ // Node is leaf node 42 // ray-triangle tests 43 } Figure 1: The proposed algorithm in pseudo code. This consists the inner loop of the BVH traversal. isting half-precision floating-point hardware, therefore, available power savings are more modest. This kind of hardware is currently found especially in mobile platforms. In addition, this paper describes how the hierarchical half-precision format, which was designed so that values are converted to world coordinates prior to intersection testing, is modified so that it better fits calculations in hierarchical coordinates. On high level the proposed traversal has two new stages. First, the ray origin is moved to the edges of the intersected nodes. Second, in order to retain high accuracy for the intersection tests the moved ray origin is converted to the child s hierarchical coordinates, instead of converting the child s bound coor- O 0 A B O B&OBa Ba O Bb Bb Figure 2: Ray origin movement through the hierarchy. The initial origin is O 0 and because the ray intersects the node B the origin is moved to O B. In the node B, the ray intersects both nodes Ba and Bb, so the origin is moved to their edges in corresponding branches of the tree traversal. dinates into world coordinates. The main point of these modifications is to keep the values within the range of the half-precision format. This way the ray- AABB tests can be done with native half-precision calculations and the AABBs can be stored in halfprecision. Some hardware floating-point units can compute twice as many half-precision values with same latency as single-precision values (Hariharakrishnan et al., 2013). The pseudo code of the proposed algorithm can be found in Figure 1. On lines the algorithm loads the index of the current node and the current ray origin. The ray origin is already in the same relative coordinate base as the child nodes AABB bounds, which means that the ray-aabb test can be performed for every child AABB with vector instructions without any extra preparations. In the measurements an unmodified version of the slab method (Smits, 1998) was used for ray-aabb hit tests in half-precision. When using the half-precision floating-point format, as small a number as is rounded up to infinity. In order to prevent values from being rounded up to infinity, the hierarchical storage format was modified to use the interval from VAL_MIN (-16384) to VAL_MAX (16384), which is one fourth of the maximum non-infinite half-precision interval. 4.1 Ray Origin Movement The code after line 10 is executed only for those children which are actually intersected. This part could also be done for all of the children with vector instructions, but in the authors experiments with MBVH4, a ray hits almost only one AABB in a inner node on average. In this case vectorizing the lines after 10 to all of the child AABBs would cause almost three of the four pieces of data to be trash values. This requires many registers for the trash values. For these reasons there was an improvement when the loop was used to go through the intersected children. Furthermore, if the hardware supports instruction level parallelism, i.e., calculating vector and scalar instructions at the same time, it might be possible to optimize the inner loop so that in the single ray case at least some parts of 173

4 GRAPP International Conference on Computer Graphics Theory and Applications (a) Shiny Bunny with one point light (69K triangles). (b) Normals of Dragon, in other words, just ray casting (871K triangles). (c) Sibenik with reflecting floor and two point lights (75K triangles). (d) Fairy with one point light (174K triangles). (e) Sponza with reflecting stone and metal materi- (f) Conference with all materials reflecting and als and with two point lights (278K triangles). with two point lights (331K triangles). Figure 3: The test scenes used in the measurements. 174

5 Half-precision Floating-point Ray Traversal the ray-aabb tests and coordinate conversions could be calculated simultaneously. The index of every intersected child is stored in the stack on line 14. Next, the ray origin is moved to the edge of the intersected child AABB on the lines 17 19, which can also be seen visualized in Figure 2. The distance to the child AABBs intersection is clamped from the lower end to zero, because negative distances mean that the ray origin is inside the child AABB. In this case the ray origin is not moved, but it is still converted to the child s hierarchical coordinates. The distance is shortened with an epsilon of 0.01, because if the child AABB is planar, i.e., the size is 0 on some axis, moving the ray origin inside of it causes false negatives. This was one working epsilon value found by testing different values. Since the algorithm avoids infinities by using one fourth of the total half-precision range, the value of the epsilon is not important. If the BVH tree is unbalanced and contains child AABBs which are extremely small compared to their parents, the epsilon value starts to have an effect. 4.2 Coordinate Conversion The coordinate conversion to the new ray origin is shown on lines This conversion changes the coordinates into a deeper level of hierarchical coordinates. The idea of the conversion is first to convert the new ray origin values so that the coordinates of the intersected child nodes lower bounds are represented by 0.0 and the coordinates of the upper bounds are represented by 1.0. Then this value is multiplied with the total used range VAL_RANGE = VAL_MAX - VAL_MIN, which makes it a value from 0 to Then the interval is moved with VAL_MIN to be from to 16384, which was the desired interval. A slightly different algorithm than in the previous work was used, because usually the whole range of the data type is used on every axis. This means that going deeper into the hierarchy scales different axes at different rates. Without correction, this would bend the rays. More importantly, correcting the direction would require more computations and one more stack to store the corrected ray directions. The issue was solved by scaling all of the axes with just one value range. The largest range of all of the axes was used for the scaling. An example is provided in Figure 4. In the previous work the variable mover was just set to VAL_MIN, but since in this work only one range was used on all of the axes, a more complicated mover was needed. The idea is to move the interval of the child node into the center of the half-precision total range. If conventional VAL_MIN was used as VAL_MIN A B -P 0 P VAL_MAX VAL_MAX 0 VAL_MIN Figure 4: In contrast to conventional hierarchical BVH storage the proposed approach scales every axis on the largest range. In the previous work x-axis value VAL MAX would have been in location P and value VAL MIN in -P. This reduces the need for ray direction correction but loses precision. mover, the value interval would have started from the VAL_MIN and the algorithm would have suffered from the poor precision of the floating-point format with values far from zero. In Figure 4, this would have meant that the bounding box had been on the left side of the dashed square, not in the center. Single-precision data types were used for the coordinate conversion, since half-precision was causing errors for small triangles. This part was vectorized only for the rays in the packet, not for the child nodes. The child nodes were handled by looping through the actually intersected ones. This means that even with the wider data type these lines do not require wider vector units and the performance impact was only the type conversions and the extra register space requirement. This impact was so small compared to memory bandwidth savings that it was unmeasurable. 4.3 Maintaining the Traversed Distance Finally in the lines the distance that has been traversed is maintained. This distance in world coordinates is required because the traversal can terminate as soon as it has found a triangle intersection and all of the AABBs in the stack are further away than it. In the lines the distance values are popped from the stacks. The world coordinate distance from the current moved ray origin to the found AABB can be calculated by multiplying the distance to the AABB, which is in the hierarchical coordinates, with the value popped from the scale stack. The value popped from the distance stack is the distance before intersecting current node. The sum of the distance before intersecting current node and the scaled distance to the intersected AABB equals the total world coordinate traversal distance to the AABB. This value is needed on line 10 to determine if the AABB is further than closest found triangle. 175

6 GRAPP International Conference on Computer Graphics Theory and Applications Table 1: Frame rates [frames per second] (more is better). The values are compared to single-precision and all results that improve on it are bolded. Other measured algorithms were the proposed algorithm and storing the values in half-precision and converting them to single-precision before computation. Model Algorithm BVH MBVH4 MBVH8 MBVH4 2 rays MBVH4 4 rays single Bunny half storage -5.9% -3.2% 0% -4.0% -2.9% proposed -30% -23% -11% -19% 0% single Dragon half storage -3.6% -3.8% 2.0% 3.7% 0% proposed -19% -12% -1.0% -8.8% 3.0% single Sibenik half storage -4.4% 0.4% 0.3% -4.3% -4.2% proposed -13% -4.6% -6.9% -5.5% 8.5% single Fairy half storage -2.8% -0.7% 2.9% -2.3% -1.9% proposed -26% -18% -10% -19% 1.0% single Sponza half storage -4.0% 0.2% 2.1% -4.6% -3.7% proposed -23% -15% -5.2% -12% 12% single Conference half storage -8.2% -6.2% -3.4% -9.1% -10% proposed -27% -21% -13% -20% -9.7% 5 TEST SETUP For the measurements a ray tracer was implemented using OpenCL. OpenCL was chosen because it has the cl khr fp16 extension, which provides support for native half-precision calculations. The tests were run on the ODROID XU3 board s Mali-T628 GPU because it supports most of the features of this extension. The board also has built-in sensors for power readings, which were used for the measurements. During the measurements, the software traced rays for as long as it took for the variations in the frame rates to settle. Then, 10 frames of frame rate and power usage was recorded. The reported numbers are averages of these numbers. The frame rate did not improve after the first frames, which indicates that the scenes were so massive that they overfilled the small caches of the Mali. The images of different test setups can be seen in Figure 3. The ray tracer used Whitted style ray tracing (Whitted, 1979). In the scenes with reflections, two bounces were allowed. In the scenes with lighting, every regular ray sent one shadow ray to each point light if the result was not already known from the normal of the surface. Explicit vector instructions were used to test all child nodes simultaneously in BVH, MBVH4 and MBVH8. MBVH16 measurements were not included, because it was already hard to fill all branches of the tree in a MBVH8. Therefore, MBVH16 measurements would not have been representative. To achieve wider vector instructions packet tracing was also used, yet it is not good for performance with incoherent secondary rays (Wald et al., 2008). Because the packet size is the same with all of the compared methods, all of them should have the same performance drawback. With packet tracing MBVH4 was used, since the implemented Surface Area Heuristic (SAH) (MacDonald and Booth, 1990) tree builder was best suited for building MBVH4 trees. The rays were assigned to packets based on their screen space coordinates. With 4 rays testing hits against 4 AABBs ray-aabb tests were done with 16 elements wide vector instructions, which is the widest vector width supported by the OpenCL framework. 6 RESULTS The measured frames per second can be found in Table 1 and the measured power usage can be found in Table 2. Both of the measurements show similar results. The wider the vector instruction used in the calculations, the better the proposed algorithm is. With narrow vector instructions such as BVH with one ray (which stores only two elements in a vector) the proposed algorithm performs much worse than regular single-precision traversal. This is because the little benefit from the smaller AABB memory footprint is hidden by the use of extra stacks and the computation overhead from the coordinate conversions. 176

7 Half-precision Floating-point Ray Traversal Table 2: Power readings [W] from GPU and memory, which is shared with the CPU (less is better). The values are compared to single-precision by dividing the values with the corresponding frame rate. This means that the results are comparable in terms of actual work performed by the ray tracer. All results that improve on single-precision are bolded. Other measured algorithms were the proposed algorithm and storing the values in half-precision and converting them to single-precision before computation. Model Algorithm BVH MBVH4 MBVH8 MBVH4 2 rays MBVH4 4 rays single Bunny half storage 8.9% 4.7% -5.9% 4.8% 10.8% proposed 46% 27% 4.6% 26% -0.6% single Dragon half storage 4.6% -3.3% -10% -6.4% 5.2% proposed 47% 18% -7.3% 7.1% -8.0% single Sibenik half storage 9.0% -0.1% 1.6% 3.5% 10% proposed 24% 9.5% -0.3% 7.0% -12% single Fairy half storage -0.2% -2.1% -4.5% 0.5% 3.7% proposed 36% 23% 4.0% 20% -11% single Sponza half storage 2.8% 3.6% -3.3% 5.3% -0.1% proposed 44% 29% 5.1% 17% -17% single Conference half storage 19% 0.3% -3.8% 5.4% 10% proposed 58% 30% 8.1% 21% -5.9% Avg. difference (%) MBV H packet Vector width (element count) Figure 5: The average difference of the proposed algorithm for regular single-precision traversal (less is better). Circles and solid lines are power usage. Squares and dashed lines are computation time. This shows a clear trend that the wider the vector instructions used the better the performance is with the proposed algorithm. In addition to actual coordinate calculations, the data for the conversion needs to be parsed from the stored BVH. Each AABB bound coordinate for every child node was stored in one vector, which makes the ray- AABB intersection test faster. However, the proposed algorithm needs to fetch one value for every intersected child node from all of the six bound vectors. With 16-element-wide vector instructions, the proposed algorithm starts to outperform its counterpart single-precision algorithm. These wide instructions were only run with the packet tracing, but the Figure 5 shows a similar trend without packets. This means that if a ray tracer successfully utilizes wide vector instructions and the targeted hardware has support for native half-precision computations the proposed algorithm should be faster. Of course the limit of 16 elements might be lower or higher on other platforms depending on the vector computation hardware and the structure of the whole memory hierarchy. The proposed algorithm s worse performance in the Conference scene is likely explained by the poor precision in the edges of the scene on the first levels of the BVH. Poor precision leads to many extra rays continuing traversal to the branches where the computationally heavy parts like the curtains and the chairs are stored. This only adds extra work to BVH traversal for finding out that a ray is not actually intersecting any primitives in an area of the scene. In the similar test set-up of the Sponza scene the curtains are located closer to the origin of the scene where there is more precision. Furthermore, computationally expensive parts have reflections in the Conference scene, which makes both the incoming rays and the outgoing rays heavier to compute than in single-precision. Comparing the half-precision storage and singleprecision methods shows they perform at a similar level. There is no clear pattern visible for which results half-precision storage outperforms singleprecision. In contrast, the proposed method is better when using widest vectors and also has far greater differences compared to single-precision. This is be- 177

8 GRAPP International Conference on Computer Graphics Theory and Applications cause the SAH tree construction has to make very different trees with far fewer quantization points provided by the non-hierarchical format. On the other hand, the proposed algorithm has even more precision in the lower levels of the tree than single-precision so it builds similar trees. 7 CONCLUSIONS This paper describes a new kind of ray traversal for BVHs, which utilizes half-precision floating-point computations. The traversal is done by moving the ray origin to the edge of the intersected child AABB and then converting the coordinates of the ray origin into the hierarchical coordinates of the child AABB. This way the traversal stays in a more suitable range for the lower precision format. In addition, the hierarchical half-precision coordinates can provide even more accuracy than single-precision world coordinates. This is possible since on every lower level the quantization step in world coordinates gets smaller. The proposed algorithm works better than singleprecision with wider vector instructions. This means that if a ray tracing application is fastest with wide vector instructions then changing to the proposed traversal should make it even faster. This requires that the hardware is capable of calculating twice the amount of half-precision values with same latency as single-precision calculations, which is quite common with the hardware that supports native half-precision calculations. In the future, the authors are interested in extending the use of half-precision floating-point numbers to the rest of the ray tracing algorithm. Another potential future work is to optimize the proposed algorithm for possible desktop GPUs with support for native halfprecision calculations. ACKNOWLEDGEMENTS The authors would like to thank their funding sources: TUT graduate school, Academy of Finland (funding decision ), Finnish Funding Agency for Technology and Innovation (project Parallel Acceleration 2, funding decision 40081/14) and ARTEMIS JU under grant agreement no (ALMARVI). In addition, the authors would like to thank Anat Grynberg and Greg Ward for Conference, Frank Meinl for Sponza, Utah 3D Animation Repository for Fairy, Marko Daprovic for Sibenik and Stanford 3D Scanning Repository for Bunny and Dragon models. The authors are also very grateful for the anonymous reviewer comments. REFERENCES Ernst, M. and Greiner, G. (2008). Multi bounding volume hierarchies. In Proceedings of the IEEE Symposium on Interactive Ray Tracing, pages Ernst, M. and Woop, S. (2011). Ray tracing with sharedplane bounding volume hierarchies. Journal of Graphics, GPU, and Game Tools, 15(3): Hariharakrishnan, K., Barbier, A., and Hedley, F. (2013). OpenCL on Mali FAQs. ARM. Huang, J.-H. (2015). Leaps in visual computing. In Opening Keynote of GPU Techonology Conference Available at: com/gtc/2015/presentation/s5715-keynote-jen- Hsun-Huang.pdf (Referenced: 9/18/2015). IEEE (2008). Standard for floating-point arithmetic. Std Keely, S. (2014). Reduced precision for hardware ray tracing in GPUs. In Proceedings of the High Performance Graphics Conference, pages Koskela, M., Viitanen, T., Jääskelainen, P., Takala, J., and Cameron, K. (2015). Using half-precision floatingpoint numbers for storing bounding volume hierarchies. In Proceedings of the 32nd Computer Graphics International Conference. MacDonald, J. and Booth, K. (1990). Heuristics for ray tracing using space subdivision. The Visual Computer, 6(3): Mahovsky, J. and Wyvill, B. (2006). Memory-conserving bounding volume hierarchies with coherent raytracing. Computer Graphics Forum, 25(2): Smits, B. (1998). Efficiency issues for ray tracing. Journal of Graphics Tools, 3(2):1 14. Wächter, C. and Keller, A. (2006). Instant ray tracing: The bounding interval hierarchy. In EuroGraphics Symposium on Rendering Techniques, pages Wald, I., Benthin, C., and Boulos, S. (2008). Getting rid of packets efficient SIMD single-ray traversal using multi-branching BVHs. In Proceedings of the IEEE Symposium on Interactive Ray Tracing, pages Whitted, T. (1979). An improved illumination model for shaded display. ACM SIGGRAPH Computer Graphics, 13:

Ray Tracing. Computer Graphics CMU /15-662, Fall 2016

Ray Tracing. Computer Graphics CMU /15-662, Fall 2016 Ray Tracing Computer Graphics CMU 15-462/15-662, Fall 2016 Primitive-partitioning vs. space-partitioning acceleration structures Primitive partitioning (bounding volume hierarchy): partitions node s primitives

More information

Multi Bounding Volume Hierarchies for Ray Tracing Pipelines

Multi Bounding Volume Hierarchies for Ray Tracing Pipelines Tampere University of Technology Multi Bounding Volume Hierarchies for Ray Tracing Pipelines Citation Viitanen, T., Koskela, M., Jääskeläinen, P., & Takala, J. (2016). Multi Bounding Volume Hierarchies

More information

Computer Graphics Ray Casting. Matthias Teschner

Computer Graphics Ray Casting. Matthias Teschner Computer Graphics Ray Casting Matthias Teschner Outline Context Implicit surfaces Parametric surfaces Combined objects Triangles Axis-aligned boxes Iso-surfaces in grids Summary University of Freiburg

More information

Fast BVH Construction on GPUs

Fast BVH Construction on GPUs Fast BVH Construction on GPUs Published in EUROGRAGHICS, (2009) C. Lauterbach, M. Garland, S. Sengupta, D. Luebke, D. Manocha University of North Carolina at Chapel Hill NVIDIA University of California

More information

INFOMAGR Advanced Graphics. Jacco Bikker - February April Welcome!

INFOMAGR Advanced Graphics. Jacco Bikker - February April Welcome! INFOMAGR Advanced Graphics Jacco Bikker - February April 2016 Welcome! I x, x = g(x, x ) ε x, x + S ρ x, x, x I x, x dx Today s Agenda: Introduction Ray Distributions The Top-level BVH Real-time Ray Tracing

More information

Lecture 4 - Real-time Ray Tracing

Lecture 4 - Real-time Ray Tracing INFOMAGR Advanced Graphics Jacco Bikker - November 2017 - February 2018 Lecture 4 - Real-time Ray Tracing Welcome! I x, x = g(x, x ) ε x, x + න S ρ x, x, x I x, x dx Today s Agenda: Introduction Ray Distributions

More information

Accelerated Entry Point Search Algorithm for Real-Time Ray-Tracing

Accelerated Entry Point Search Algorithm for Real-Time Ray-Tracing Accelerated Entry Point Search Algorithm for Real-Time Ray-Tracing Figure 1: Four of the scenes used for testing purposes. From the left: Fairy Forest from the Utah 3D Animation Repository, Legocar from

More information

INFOGR Computer Graphics. J. Bikker - April-July Lecture 11: Acceleration. Welcome!

INFOGR Computer Graphics. J. Bikker - April-July Lecture 11: Acceleration. Welcome! INFOGR Computer Graphics J. Bikker - April-July 2015 - Lecture 11: Acceleration Welcome! Today s Agenda: High-speed Ray Tracing Acceleration Structures The Bounding Volume Hierarchy BVH Construction BVH

More information

B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes

B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes B-KD rees for Hardware Accelerated Ray racing of Dynamic Scenes Sven Woop Gerd Marmitt Philipp Slusallek Saarland University, Germany Outline Previous Work B-KD ree as new Spatial Index Structure DynR

More information

Lecture 2 - Acceleration Structures

Lecture 2 - Acceleration Structures INFOMAGR Advanced Graphics Jacco Bikker - November 2017 - February 2018 Lecture 2 - Acceleration Structures Welcome! I x, x = g(x, x ) ε x, x + න S ρ x, x, x I x, x dx Today s Agenda: Problem Analysis

More information

The Minimal Bounding Volume Hierarchy

The Minimal Bounding Volume Hierarchy Vision, Modeling, and Visualization (2010) Reinhard Koch, Andreas Kolb, Christof Rezk-Salama (Eds.) The Minimal Bounding Volume Hierarchy Pablo Bauszat 1 and Martin Eisemann 1 and Marcus Magnor 1 1 TU

More information

Acceleration Data Structures

Acceleration Data Structures CT4510: Computer Graphics Acceleration Data Structures BOCHANG MOON Ray Tracing Procedure for Ray Tracing: For each pixel Generate a primary ray (with depth 0) While (depth < d) { Find the closest intersection

More information

Simpler and Faster HLBVH with Work Queues

Simpler and Faster HLBVH with Work Queues Simpler and Faster HLBVH with Work Queues Kirill Garanzha NVIDIA Keldysh Institute of Applied Mathematics Jacopo Pantaleoni NVIDIA Research David McAllister NVIDIA Figure 1: Some of our test scenes, from

More information

EDAN30 Photorealistic Computer Graphics. Seminar 2, Bounding Volume Hierarchy. Magnus Andersson, PhD student

EDAN30 Photorealistic Computer Graphics. Seminar 2, Bounding Volume Hierarchy. Magnus Andersson, PhD student EDAN30 Photorealistic Computer Graphics Seminar 2, 2012 Bounding Volume Hierarchy Magnus Andersson, PhD student (magnusa@cs.lth.se) This seminar We want to go from hundreds of triangles to thousands (or

More information

Improving Memory Space Efficiency of Kd-tree for Real-time Ray Tracing Byeongjun Choi, Byungjoon Chang, Insung Ihm

Improving Memory Space Efficiency of Kd-tree for Real-time Ray Tracing Byeongjun Choi, Byungjoon Chang, Insung Ihm Improving Memory Space Efficiency of Kd-tree for Real-time Ray Tracing Byeongjun Choi, Byungjoon Chang, Insung Ihm Department of Computer Science and Engineering Sogang University, Korea Improving Memory

More information

Universiteit Leiden Computer Science

Universiteit Leiden Computer Science Universiteit Leiden Computer Science Optimizing octree updates for visibility determination on dynamic scenes Name: Hans Wortel Student-no: 0607940 Date: 28/07/2011 1st supervisor: Dr. Michael Lew 2nd

More information

Holger Dammertz August 10, The Edge Volume Heuristic Robust Triangle Subdivision for Improved BVH Performance Holger Dammertz, Alexander Keller

Holger Dammertz August 10, The Edge Volume Heuristic Robust Triangle Subdivision for Improved BVH Performance Holger Dammertz, Alexander Keller Holger Dammertz August 10, 2008 The Edge Volume Heuristic Robust Triangle Subdivision for Improved BVH Performance Holger Dammertz, Alexander Keller Page 2 The Edge Volume Heuristic Robust Triangle Subdivision

More information

Interactive Ray Tracing: Higher Memory Coherence

Interactive Ray Tracing: Higher Memory Coherence Interactive Ray Tracing: Higher Memory Coherence http://gamma.cs.unc.edu/rt Dinesh Manocha (UNC Chapel Hill) Sung-Eui Yoon (Lawrence Livermore Labs) Interactive Ray Tracing Ray tracing is naturally sub-linear

More information

Spatial Data Structures

Spatial Data Structures Spatial Data Structures Hierarchical Bounding Volumes Regular Grids Octrees BSP Trees Constructive Solid Geometry (CSG) [Angel 9.10] Outline Ray tracing review what rays matter? Ray tracing speedup faster

More information

NVIDIA Case Studies:

NVIDIA Case Studies: NVIDIA Case Studies: OptiX & Image Space Photon Mapping David Luebke NVIDIA Research Beyond Programmable Shading 0 How Far Beyond? The continuum Beyond Programmable Shading Just programmable shading: DX,

More information

COMP 4801 Final Year Project. Ray Tracing for Computer Graphics. Final Project Report FYP Runjing Liu. Advised by. Dr. L.Y.

COMP 4801 Final Year Project. Ray Tracing for Computer Graphics. Final Project Report FYP Runjing Liu. Advised by. Dr. L.Y. COMP 4801 Final Year Project Ray Tracing for Computer Graphics Final Project Report FYP 15014 by Runjing Liu Advised by Dr. L.Y. Wei 1 Abstract The goal of this project was to use ray tracing in a rendering

More information

Ray Tracing with Spatial Hierarchies. Jeff Mahovsky & Brian Wyvill CSC 305

Ray Tracing with Spatial Hierarchies. Jeff Mahovsky & Brian Wyvill CSC 305 Ray Tracing with Spatial Hierarchies Jeff Mahovsky & Brian Wyvill CSC 305 Ray Tracing Flexible, accurate, high-quality rendering Slow Simplest ray tracer: Test every ray against every object in the scene

More information

RACBVHs: Random Accessible Compressed Bounding Volume Hierarchies

RACBVHs: Random Accessible Compressed Bounding Volume Hierarchies RACBVHs: Random Accessible Compressed Bounding Volume Hierarchies Published at IEEE Transactions on Visualization and Computer Graphics, 2010, Vol. 16, Num. 2, pp. 273 286 Tae Joon Kim joint work with

More information

MSBVH: An Efficient Acceleration Data Structure for Ray Traced Motion Blur

MSBVH: An Efficient Acceleration Data Structure for Ray Traced Motion Blur MSBVH: An Efficient Acceleration Data Structure for Ray Traced Motion Blur Leonhard Grünschloß Martin Stich Sehera Nawaz Alexander Keller August 6, 2011 Principles of Accelerated Ray Tracing Hierarchical

More information

Ray-Box Culling for Tree Structures

Ray-Box Culling for Tree Structures JOURNAL OF INFORMATION SCIENCE AND ENGINEERING XX, XXX-XXX (2012) Ray-Box Culling for Tree Structures JAE-HO NAH 1, WOO-CHAN PARK 2, YOON-SIG KANG 1, AND TACK-DON HAN 1 1 Department of Computer Science

More information

Real Time Ray Tracing

Real Time Ray Tracing Real Time Ray Tracing Programação 3D para Simulação de Jogos Vasco Costa Ray tracing? Why? How? P3DSJ Real Time Ray Tracing Vasco Costa 2 Real time ray tracing : example Source: NVIDIA P3DSJ Real Time

More information

Topics. Ray Tracing II. Intersecting transformed objects. Transforming objects

Topics. Ray Tracing II. Intersecting transformed objects. Transforming objects Topics Ray Tracing II CS 4620 Lecture 16 Transformations in ray tracing Transforming objects Transformation hierarchies Ray tracing acceleration structures Bounding volumes Bounding volume hierarchies

More information

Computer Graphics (CS 543) Lecture 13b Ray Tracing (Part 1) Prof Emmanuel Agu. Computer Science Dept. Worcester Polytechnic Institute (WPI)

Computer Graphics (CS 543) Lecture 13b Ray Tracing (Part 1) Prof Emmanuel Agu. Computer Science Dept. Worcester Polytechnic Institute (WPI) Computer Graphics (CS 543) Lecture 13b Ray Tracing (Part 1) Prof Emmanuel Agu Computer Science Dept. Worcester Polytechnic Institute (WPI) Raytracing Global illumination-based rendering method Simulates

More information

Topics. Ray Tracing II. Transforming objects. Intersecting transformed objects

Topics. Ray Tracing II. Transforming objects. Intersecting transformed objects Topics Ray Tracing II CS 4620 ations in ray tracing ing objects ation hierarchies Ray tracing acceleration structures Bounding volumes Bounding volume hierarchies Uniform spatial subdivision Adaptive spatial

More information

Intersection Acceleration

Intersection Acceleration Advanced Computer Graphics Intersection Acceleration Matthias Teschner Computer Science Department University of Freiburg Outline introduction bounding volume hierarchies uniform grids kd-trees octrees

More information

Ray Tracing Acceleration. CS 4620 Lecture 20

Ray Tracing Acceleration. CS 4620 Lecture 20 Ray Tracing Acceleration CS 4620 Lecture 20 2013 Steve Marschner 1 Will this be on the exam? or, Prelim 2 syllabus You can expect emphasis on topics related to the assignment (Shaders 1&2) and homework

More information

Ray Tracing with Multi-Core/Shared Memory Systems. Abe Stephens

Ray Tracing with Multi-Core/Shared Memory Systems. Abe Stephens Ray Tracing with Multi-Core/Shared Memory Systems Abe Stephens Real-time Interactive Massive Model Visualization Tutorial EuroGraphics 2006. Vienna Austria. Monday September 4, 2006 http://www.sci.utah.edu/~abe/massive06/

More information

Efficiency Issues for Ray Tracing

Efficiency Issues for Ray Tracing Efficiency Issues for Ray Tracing Brian Smits University of Utah February 19, 1999 Abstract Ray casting is the bottleneck of many rendering algorithms. Although much work has been done on making ray casting

More information

Spatial Data Structures

Spatial Data Structures 15-462 Computer Graphics I Lecture 17 Spatial Data Structures Hierarchical Bounding Volumes Regular Grids Octrees BSP Trees Constructive Solid Geometry (CSG) March 28, 2002 [Angel 8.9] Frank Pfenning Carnegie

More information

EFFICIENT BVH CONSTRUCTION AGGLOMERATIVE CLUSTERING VIA APPROXIMATE. Yan Gu, Yong He, Kayvon Fatahalian, Guy Blelloch Carnegie Mellon University

EFFICIENT BVH CONSTRUCTION AGGLOMERATIVE CLUSTERING VIA APPROXIMATE. Yan Gu, Yong He, Kayvon Fatahalian, Guy Blelloch Carnegie Mellon University EFFICIENT BVH CONSTRUCTION VIA APPROXIMATE AGGLOMERATIVE CLUSTERING Yan Gu, Yong He, Kayvon Fatahalian, Guy Blelloch Carnegie Mellon University BVH CONSTRUCTION GOALS High quality: produce BVHs of comparable

More information

CS580: Ray Tracing. Sung-Eui Yoon ( 윤성의 ) Course URL:

CS580: Ray Tracing. Sung-Eui Yoon ( 윤성의 ) Course URL: CS580: Ray Tracing Sung-Eui Yoon ( 윤성의 ) Course URL: http://sglab.kaist.ac.kr/~sungeui/gcg/ Recursive Ray Casting Gained popularity in when Turner Whitted (1980) recognized that recursive ray casting could

More information

Spatial Data Structures

Spatial Data Structures 15-462 Computer Graphics I Lecture 17 Spatial Data Structures Hierarchical Bounding Volumes Regular Grids Octrees BSP Trees Constructive Solid Geometry (CSG) April 1, 2003 [Angel 9.10] Frank Pfenning Carnegie

More information

Ray Tracing Acceleration. CS 4620 Lecture 22

Ray Tracing Acceleration. CS 4620 Lecture 22 Ray Tracing Acceleration CS 4620 Lecture 22 2014 Steve Marschner 1 Topics Transformations in ray tracing Transforming objects Transformation hierarchies Ray tracing acceleration structures Bounding volumes

More information

Ray Tracing Performance

Ray Tracing Performance Ray Tracing Performance Zero to Millions in 45 Minutes Gordon Stoll, Intel Ray Tracing Performance Zero to Millions in 45 Minutes?! Gordon Stoll, Intel Goals for this talk Goals point you toward the current

More information

GPU-BASED RAY TRACING ALGORITHM USING UNIFORM GRID STRUCTURE

GPU-BASED RAY TRACING ALGORITHM USING UNIFORM GRID STRUCTURE GPU-BASED RAY TRACING ALGORITHM USING UNIFORM GRID STRUCTURE Reza Fuad R 1) Mochamad Hariadi 2) Telematics Laboratory, Dept. Electrical Eng., Faculty of Industrial Technology Institut Teknologi Sepuluh

More information

Using Bounding Volume Hierarchies Efficient Collision Detection for Several Hundreds of Objects

Using Bounding Volume Hierarchies Efficient Collision Detection for Several Hundreds of Objects Part 7: Collision Detection Virtuelle Realität Wintersemester 2007/08 Prof. Bernhard Jung Overview Bounding Volumes Separating Axis Theorem Using Bounding Volume Hierarchies Efficient Collision Detection

More information

Ray Tracing with Sparse Boxes

Ray Tracing with Sparse Boxes Ray Tracing with Sparse Boxes Vlastimil Havran Czech Technical University Jiří Bittner Czech Technical University Vienna University of Technology Figure : (left) A ray casted view of interior of a larger

More information

Simple Nested Dielectrics in Ray Traced Images

Simple Nested Dielectrics in Ray Traced Images Simple Nested Dielectrics in Ray Traced Images Charles M. Schmidt and Brian Budge University of Utah Abstract This paper presents a simple method for modeling and rendering refractive objects that are

More information

Acceleration Structure for Animated Scenes. Copyright 2010 by Yong Cao

Acceleration Structure for Animated Scenes. Copyright 2010 by Yong Cao t min X X Y 1 B C Y 1 Y 2 A Y 2 D A B C D t max t min X X Y 1 B C Y 2 Y 1 Y 2 A Y 2 D A B C D t max t min X X Y 1 B C Y 1 Y 2 A Y 2 D A B C D t max t min A large tree structure change. A totally new tree!

More information

Acceleration Structures. CS 6965 Fall 2011

Acceleration Structures. CS 6965 Fall 2011 Acceleration Structures Run Program 1 in simhwrt Lab time? Program 2 Also run Program 2 and include that output Inheritance probably doesn t work 2 Boxes Axis aligned boxes Parallelepiped 12 triangles?

More information

improving raytracing speed

improving raytracing speed ray tracing II computer graphics ray tracing II 2006 fabio pellacini 1 improving raytracing speed computer graphics ray tracing II 2006 fabio pellacini 2 raytracing computational complexity ray-scene intersection

More information

Spatial Data Structures

Spatial Data Structures CSCI 420 Computer Graphics Lecture 17 Spatial Data Structures Jernej Barbic University of Southern California Hierarchical Bounding Volumes Regular Grids Octrees BSP Trees [Angel Ch. 8] 1 Ray Tracing Acceleration

More information

Dual-Split Trees Supplemental Materials

Dual-Split Trees Supplemental Materials Dual-Split Trees Supplemental Materials DAQI LIN, University of Utah KONSTANTIN SHKURKO, University of Utah IAN MALLETT, University of Utah CEM YUKSEL, University of Utah ACM Reference Format: Daqi Lin,

More information

Spatial Data Structures

Spatial Data Structures CSCI 480 Computer Graphics Lecture 7 Spatial Data Structures Hierarchical Bounding Volumes Regular Grids BSP Trees [Ch. 0.] March 8, 0 Jernej Barbic University of Southern California http://www-bcf.usc.edu/~jbarbic/cs480-s/

More information

INFOMAGR Advanced Graphics. Jacco Bikker - February April Welcome!

INFOMAGR Advanced Graphics. Jacco Bikker - February April Welcome! INFOMAGR Advanced Graphics Jacco Bikker - February April 2016 Welcome! I x, x = g(x, x ) ε x, x + S ρ x, x, x I x, x dx Today s Agenda: Introduction : GPU Ray Tracing Practical Perspective Advanced Graphics

More information

Accelerating Ray Tracing

Accelerating Ray Tracing Accelerating Ray Tracing Ray Tracing Acceleration Techniques Faster Intersections Fewer Rays Generalized Rays Faster Ray-Object Intersections Object bounding volumes Efficient intersection routines Fewer

More information

Object Partitioning Considered Harmful: Space Subdivision for BVHs

Object Partitioning Considered Harmful: Space Subdivision for BVHs Object Partitioning Considered Harmful: Space Subdivision for BVHs Stefan Popov Saarland University Iliyan Georgiev Saarland University Rossen Dimov Saarland University Philipp Slusallek DFKI Saarbrücken

More information

Spatial Data Structures. Steve Rotenberg CSE168: Rendering Algorithms UCSD, Spring 2017

Spatial Data Structures. Steve Rotenberg CSE168: Rendering Algorithms UCSD, Spring 2017 Spatial Data Structures Steve Rotenberg CSE168: Rendering Algorithms UCSD, Spring 2017 Ray Intersections We can roughly estimate the time to render an image as being proportional to the number of ray-triangle

More information

Shadows for Many Lights sounds like it might mean something, but In fact it can mean very different things, that require very different solutions.

Shadows for Many Lights sounds like it might mean something, but In fact it can mean very different things, that require very different solutions. 1 2 Shadows for Many Lights sounds like it might mean something, but In fact it can mean very different things, that require very different solutions. 3 We aim for something like the numbers of lights

More information

Evaluation and Improvement of GPU Ray Tracing with a Thread Migration Technique

Evaluation and Improvement of GPU Ray Tracing with a Thread Migration Technique Evaluation and Improvement of GPU Ray Tracing with a Thread Migration Technique Xingxing Zhu and Yangdong Deng Institute of Microelectronics, Tsinghua University, Beijing, China Email: zhuxingxing0107@163.com,

More information

Ray Tracing. Cornell CS4620/5620 Fall 2012 Lecture Kavita Bala 1 (with previous instructors James/Marschner)

Ray Tracing. Cornell CS4620/5620 Fall 2012 Lecture Kavita Bala 1 (with previous instructors James/Marschner) CS4620/5620: Lecture 37 Ray Tracing 1 Announcements Review session Tuesday 7-9, Phillips 101 Posted notes on slerp and perspective-correct texturing Prelim on Thu in B17 at 7:30pm 2 Basic ray tracing Basic

More information

Computer Graphics. - Ray-Tracing II - Hendrik Lensch. Computer Graphics WS07/08 Ray Tracing II

Computer Graphics. - Ray-Tracing II - Hendrik Lensch. Computer Graphics WS07/08 Ray Tracing II Computer Graphics - Ray-Tracing II - Hendrik Lensch Overview Last lecture Ray tracing I Basic ray tracing What is possible? Recursive ray tracing algorithm Intersection computations Today Advanced acceleration

More information

Exploiting Local Orientation Similarity for Efficient Ray Traversal of Hair and Fur

Exploiting Local Orientation Similarity for Efficient Ray Traversal of Hair and Fur 1 Exploiting Local Orientation Similarity for Efficient Ray Traversal of Hair and Fur Sven Woop, Carsten Benthin, Ingo Wald, Gregory S. Johnson Intel Corporation Eric Tabellion DreamWorks Animation 2 Legal

More information

Parallel Physically Based Path-tracing and Shading Part 3 of 2. CIS565 Fall 2012 University of Pennsylvania by Yining Karl Li

Parallel Physically Based Path-tracing and Shading Part 3 of 2. CIS565 Fall 2012 University of Pennsylvania by Yining Karl Li Parallel Physically Based Path-tracing and Shading Part 3 of 2 CIS565 Fall 202 University of Pennsylvania by Yining Karl Li Jim Scott 2009 Spatial cceleration Structures: KD-Trees *Some portions of these

More information

Announcements. Written Assignment2 is out, due March 8 Graded Programming Assignment2 next Tuesday

Announcements. Written Assignment2 is out, due March 8 Graded Programming Assignment2 next Tuesday Announcements Written Assignment2 is out, due March 8 Graded Programming Assignment2 next Tuesday 1 Spatial Data Structures Hierarchical Bounding Volumes Grids Octrees BSP Trees 11/7/02 Speeding Up Computations

More information

Collision Detection with Bounding Volume Hierarchies

Collision Detection with Bounding Volume Hierarchies Simulation in Computer Graphics Collision Detection with Bounding Volume Hierarchies Matthias Teschner Computer Science Department University of Freiburg Outline introduction bounding volumes BV hierarchies

More information

Comparison of hierarchies for occlusion culling based on occlusion queries

Comparison of hierarchies for occlusion culling based on occlusion queries Comparison of hierarchies for occlusion culling based on occlusion queries V.I. Gonakhchyan pusheax@ispras.ru Ivannikov Institute for System Programming of the RAS, Moscow, Russia Efficient interactive

More information

Row Tracing with Hierarchical Occlusion Maps

Row Tracing with Hierarchical Occlusion Maps Row Tracing with Hierarchical Occlusion Maps Ravi P. Kammaje, Benjamin Mora August 9, 2008 Page 2 Row Tracing with Hierarchical Occlusion Maps Outline August 9, 2008 Introduction Related Work Row Tracing

More information

Stackless Ray Traversal for kd-trees with Sparse Boxes

Stackless Ray Traversal for kd-trees with Sparse Boxes Stackless Ray Traversal for kd-trees with Sparse Boxes Vlastimil Havran Czech Technical University e-mail: havranat f el.cvut.cz Jiri Bittner Czech Technical University e-mail: bittnerat f el.cvut.cz November

More information

PantaRay: Fast Ray-traced Occlusion Caching of Massive Scenes J. Pantaleoni, L. Fascione, M. Hill, T. Aila

PantaRay: Fast Ray-traced Occlusion Caching of Massive Scenes J. Pantaleoni, L. Fascione, M. Hill, T. Aila PantaRay: Fast Ray-traced Occlusion Caching of Massive Scenes J. Pantaleoni, L. Fascione, M. Hill, T. Aila Agenda Introduction Motivation Basics PantaRay Accelerating structure generation Massively parallel

More information

Computer Graphics. - Spatial Index Structures - Philipp Slusallek

Computer Graphics. - Spatial Index Structures - Philipp Slusallek Computer Graphics - Spatial Index Structures - Philipp Slusallek Overview Last lecture Overview of ray tracing Ray-primitive intersections Today Acceleration structures Bounding Volume Hierarchies (BVH)

More information

Bounding Volume Hierarchies. CS 6965 Fall 2011

Bounding Volume Hierarchies. CS 6965 Fall 2011 Bounding Volume Hierarchies 2 Heightfield 3 C++ Operator? #include int main() { int x = 10; while( x --> 0 ) // x goes to 0 { printf("%d ", x); } } stackoverflow.com/questions/1642028/what-is-the-name-of-this-operator

More information

Real-time ray tracing

Real-time ray tracing Lecture 10: Real-time ray tracing (and opportunities for hardware acceleration) Visual Computing Systems Recent push towards real-time ray tracing Image credit: NVIDIA (this ray traced image can be rendered

More information

Accelerated Raytracing

Accelerated Raytracing Accelerated Raytracing Why is Acceleration Important? Vanilla ray tracing is really slow! mxm pixels, kxk supersampling, n primitives, average ray path length of d, l lights, 2 recursive ray casts per

More information

In the previous presentation, Erik Sintorn presented methods for practically constructing a DAG structure from a voxel data set.

In the previous presentation, Erik Sintorn presented methods for practically constructing a DAG structure from a voxel data set. 1 In the previous presentation, Erik Sintorn presented methods for practically constructing a DAG structure from a voxel data set. This presentation presents how such a DAG structure can be accessed immediately

More information

Introduction to PowerVR Ray Tracing Tuesday 18th March, GDC. James A. McCombe

Introduction to PowerVR Ray Tracing Tuesday 18th March, GDC. James A. McCombe Introduction to PowerVR Tracing Tuesday 18th March, 2014 @ GDC James A. McCombe What are we launching today? Host CPU Interface Vertex Data Master Control and Register Bus Unified Shading Cluster Array

More information

SAH guided spatial split partitioning for fast BVH construction. Per Ganestam and Michael Doggett Lund University

SAH guided spatial split partitioning for fast BVH construction. Per Ganestam and Michael Doggett Lund University SAH guided spatial split partitioning for fast BVH construction Per Ganestam and Michael Doggett Lund University Opportunistic triangle splitting for higher quality BVHs Bounding Volume Hierarchies (BVH)

More information

Accelerating Shadow Rays Using Volumetric Occluders and Modified kd-tree Traversal

Accelerating Shadow Rays Using Volumetric Occluders and Modified kd-tree Traversal Accelerating Shadow Rays Using Volumetric Occluders and Modified kd-tree Traversal Peter Djeu*, Sean Keely*, and Warren Hunt * University of Texas at Austin Intel Labs Shadow Rays Shadow rays are often

More information

Accelerating Ray-Tracing

Accelerating Ray-Tracing Lecture 9: Accelerating Ray-Tracing Computer Graphics and Imaging UC Berkeley CS184/284A, Spring 2016 Course Roadmap Rasterization Pipeline Core Concepts Sampling Antialiasing Transforms Geometric Modeling

More information

Sung-Eui Yoon ( 윤성의 )

Sung-Eui Yoon ( 윤성의 ) CS380: Computer Graphics Ray Tracing Sung-Eui Yoon ( 윤성의 ) Course URL: http://sglab.kaist.ac.kr/~sungeui/cg/ Class Objectives Understand overall algorithm of recursive ray tracing Ray generations Intersection

More information

Ray Intersection Acceleration

Ray Intersection Acceleration Ray Intersection Acceleration Image Synthesis Torsten Möller Reading Physically Based Rendering by Pharr&Humphreys Chapter 2 - rays and transformations Chapter 3 - shapes Chapter 4 - intersections and

More information

Early Split Clipping for Bounding Volume Hierarchies

Early Split Clipping for Bounding Volume Hierarchies Early Split Clipping for Bounding Volume Hierarchies Manfred Ernst Günther Greiner Lehrstuhl für Graphische Datenverarbeitung Universität Erlangen-Nürnberg, Germany ABSTRACT Despite their algorithmic elegance

More information

PowerVR Hardware. Architecture Overview for Developers

PowerVR Hardware. Architecture Overview for Developers Public Imagination Technologies PowerVR Hardware Public. This publication contains proprietary information which is subject to change without notice and is supplied 'as is' without warranty of any kind.

More information

Scene Management. Video Game Technologies 11498: MSc in Computer Science and Engineering 11156: MSc in Game Design and Development

Scene Management. Video Game Technologies 11498: MSc in Computer Science and Engineering 11156: MSc in Game Design and Development Video Game Technologies 11498: MSc in Computer Science and Engineering 11156: MSc in Game Design and Development Chap. 5 Scene Management Overview Scene Management vs Rendering This chapter is about rendering

More information

Parallel SAH k-d Tree Construction. Byn Choi, Rakesh Komuravelli, Victor Lu, Hyojin Sung, Robert L. Bocchino, Sarita V. Adve, John C.

Parallel SAH k-d Tree Construction. Byn Choi, Rakesh Komuravelli, Victor Lu, Hyojin Sung, Robert L. Bocchino, Sarita V. Adve, John C. Parallel SAH k-d Tree Construction Byn Choi, Rakesh Komuravelli, Victor Lu, Hyojin Sung, Robert L. Bocchino, Sarita V. Adve, John C. Hart Motivation Real-time Dynamic Ray Tracing Efficient rendering with

More information

CS 465 Program 5: Ray II

CS 465 Program 5: Ray II CS 465 Program 5: Ray II out: Friday 2 November 2007 due: Saturday 1 December 2007 Sunday 2 December 2007 midnight 1 Introduction In the first ray tracing assignment you built a simple ray tracer that

More information

Building a Fast Ray Tracer

Building a Fast Ray Tracer Abstract Ray tracing is often used in renderers, as it can create very high quality images at the expense of run time. It is useful because of its ability to solve many different problems in image rendering.

More information

Fast, Effective BVH Updates for Animated Scenes

Fast, Effective BVH Updates for Animated Scenes Fast, Effective BVH Updates for Animated Scenes Daniel Kopta Thiago Ize Josef Spjut Erik Brunvand Al Davis Andrew Kensler Pixar Abstract Bounding volume hierarchies (BVHs) are a popular acceleration structure

More information

Ray Genealogy. Raytracing: Performance. Bounding Volumes. Bounding Volume. Acceleration Classification. Acceleration Classification

Ray Genealogy. Raytracing: Performance. Bounding Volumes. Bounding Volume. Acceleration Classification. Acceleration Classification Raytracing: Performance OS 4328/5327 Scott. King Ray Genealogy 1 Ray/pixel at 1k 1k image = 1M? rays rays. Say on avg. 6 secondary rays = 7M?rays rays 100k objects with 10 ops for intersection =? Operations

More information

Ray Tracing III. Wen-Chieh (Steve) Lin National Chiao-Tung University

Ray Tracing III. Wen-Chieh (Steve) Lin National Chiao-Tung University Ray Tracing III Wen-Chieh (Steve) Lin National Chiao-Tung University Shirley, Fundamentals of Computer Graphics, Chap 10 Doug James CG slides, I-Chen Lin s CG slides Ray-tracing Review For each pixel,

More information

Computer Graphics. Bing-Yu Chen National Taiwan University The University of Tokyo

Computer Graphics. Bing-Yu Chen National Taiwan University The University of Tokyo Computer Graphics Bing-Yu Chen National Taiwan University The University of Tokyo Hidden-Surface Removal Back-Face Culling The Depth-Sort Algorithm Binary Space-Partitioning Trees The z-buffer Algorithm

More information

CPSC / Sonny Chan - University of Calgary. Collision Detection II

CPSC / Sonny Chan - University of Calgary. Collision Detection II CPSC 599.86 / 601.86 Sonny Chan - University of Calgary Collision Detection II Outline Broad phase collision detection: - Problem definition and motivation - Bounding volume hierarchies - Spatial partitioning

More information

Lecture 11 - GPU Ray Tracing (1)

Lecture 11 - GPU Ray Tracing (1) INFOMAGR Advanced Graphics Jacco Bikker - November 2017 - February 2018 Lecture 11 - GPU Ray Tracing (1) Welcome! I x, x = g(x, x ) ε x, x + න S ρ x, x, x I x, x dx Today s Agenda: Exam Questions: Sampler

More information

Two Optimization Methods for Raytracing. W. Sturzlinger and R. F. Tobler

Two Optimization Methods for Raytracing. W. Sturzlinger and R. F. Tobler Two Optimization Methods for Raytracing by W. Sturzlinger and R. F. Tobler Johannes Kepler University of Linz Institute for Computer Science Department for graphical and parallel Processing Altenbergerstrae

More information

CS 4620 Program 4: Ray II

CS 4620 Program 4: Ray II CS 4620 Program 4: Ray II out: Tuesday 11 November 2008 due: Tuesday 25 November 2008 1 Introduction In the first ray tracing assignment you built a simple ray tracer that handled just the basics. In this

More information

Spatial Data Structures and Speed-Up Techniques. Tomas Akenine-Möller Department of Computer Engineering Chalmers University of Technology

Spatial Data Structures and Speed-Up Techniques. Tomas Akenine-Möller Department of Computer Engineering Chalmers University of Technology Spatial Data Structures and Speed-Up Techniques Tomas Akenine-Möller Department of Computer Engineering Chalmers University of Technology Spatial data structures What is it? Data structure that organizes

More information

Dual-Split Trees. Konstantin Shkurko. Daqi Lin. Ian Mallett. Cem Yuksel University of Utah ABSTRACT INTRODUCTION CCS CONCEPTS KEYWORDS

Dual-Split Trees. Konstantin Shkurko. Daqi Lin. Ian Mallett. Cem Yuksel University of Utah ABSTRACT INTRODUCTION CCS CONCEPTS KEYWORDS Daqi in Konstantin Shkurko Ian Mallett Cem Yuksel University of Utah University of Utah University of Utah University of Utah Ray-Plane Intersections Ray-Triangle Intersections Dual-Split Tree Dual-Split

More information

Last Time: Acceleration Data Structures for Ray Tracing. Schedule. Today. Shadows & Light Sources. Shadows

Last Time: Acceleration Data Structures for Ray Tracing. Schedule. Today. Shadows & Light Sources. Shadows Last Time: Acceleration Data Structures for Ray Tracing Modeling Transformations Illumination (Shading) Viewing Transformation (Perspective / Orthographic) Clipping Projection (to Screen Space) Scan Conversion

More information

Ray Tracing. Kjetil Babington

Ray Tracing. Kjetil Babington Ray Tracing Kjetil Babington 21.10.2011 1 Introduction What is Ray Tracing? Act of tracing a ray through some scene Not necessarily for rendering Rendering with Ray Tracing Ray Tracing is a global illumination

More information

Hello, Thanks for the introduction

Hello, Thanks for the introduction Hello, Thanks for the introduction 1 In this paper we suggest an efficient data-structure for precomputed shadows from point light or directional light-sources. Because, in fact, after more than four decades

More information

ICS RESEARCH TECHNICAL TALK DRAKE TETREAULT, ICS H197 FALL 2013

ICS RESEARCH TECHNICAL TALK DRAKE TETREAULT, ICS H197 FALL 2013 ICS RESEARCH TECHNICAL TALK DRAKE TETREAULT, ICS H197 FALL 2013 TOPIC: RESEARCH PAPER Title: Data Management for SSDs for Large-Scale Interactive Graphics Applications Authors: M. Gopi, Behzad Sajadi,

More information

SAH guided spatial split partitioning for fast BVH construction

SAH guided spatial split partitioning for fast BVH construction EUROGRAPHICS 2016 / J. Jorge and M. Lin (Guest Editors) Volume 35 (2016), Number 2 SAH guided spatial split partitioning for fast BVH construction Per Ganestam and Michael Doggett Lund University Figure

More information

Lecture 11: Ray tracing (cont.)

Lecture 11: Ray tracing (cont.) Interactive Computer Graphics Ray tracing - Summary Lecture 11: Ray tracing (cont.) Graphics Lecture 10: Slide 1 Some slides adopted from H. Pfister, Harvard Graphics Lecture 10: Slide 2 Ray tracing -

More information

MULTI-LEVEL GRID STRATEGIES FOR RAY TRACING Improving Render Time Performance for Row Displacement Compressed Grids

MULTI-LEVEL GRID STRATEGIES FOR RAY TRACING Improving Render Time Performance for Row Displacement Compressed Grids MULTI-LEVEL GRID STRATEGIES FOR RAY TRACING Improving Render Time Performance for Row Displacement Compressed Grids Vasco Costa, João Madeiras Pereira INESC-ID / IST, Rua Alves Redol 9, Apartado 1369,

More information

Spatial Data Structures for Computer Graphics

Spatial Data Structures for Computer Graphics Spatial Data Structures for Computer Graphics Page 1 of 65 http://www.cse.iitb.ac.in/ sharat November 2008 Spatial Data Structures for Computer Graphics Page 1 of 65 http://www.cse.iitb.ac.in/ sharat November

More information