Real-time non-photorealistic rendering

Lauri Siljamäki
HUT
Lauri.Siljamaki@hut.fi

Abstract

This paper summarizes techniques used for real-time non-photorealistic rendering (NPR). Currently most NPR images and animations are computed prior to playback, which limits the possible uses of NPR imagery. It is, however, possible to overcome the computational barrier of NPR through clever use of modern graphics hardware. For a sketched look, Appel's silhouette edge tracing was optimized for minimal computation while retaining sufficient accuracy for real-time rendering. For painterly-looking virtual environments, an image-based rendering technique was created that filters the image-based rendering textures with NPR to create special mip-maps called art-maps.

1 INTRODUCTION

During recent years interest in non-photorealistic computer-generated graphics has risen dramatically. Artists, technical illustrators, architects, animators and others have found many ways to use computer-generated images that look as if they were created by humans. Rendering images in pen-and-ink style rather than the usual shaded style has been found to create simpler-looking images that are at the same time more understandable. For this reason architects prefer, sometimes even unintentionally, showing unfinished designs with a sketchy look, as in Figure 1, rather than the usual CAD hidden-line-removal graphics shown in Figure 2 (Schumann et al., 1996). It was found that the appearance of hand-drawn lines in a CAD rendering creates livelier conversations about the design, and in those conversations more suggestions for changes arose.
Figure 1: CAD drawing with a sketched look. (Schumann et al., 1996)

Being able to control the viewing angle of the shown object in a repair manual enables users to see exactly the information they need. Virtual environments created with non-photorealistic rendering (NPR) techniques can focus the user's attention on a specific location inside the environment. It is also possible to convey moods to the user better with NPR graphics. But creating interactive real-time animations with NPR graphics has its problems. Pen-and-ink methods are very expensive to compute, since silhouette edge tracing is an exhaustive task. Traditional ray-casting methods used to find the silhouette edges, such as Appel's algorithm, have to process a great deal of information for every frame, which makes them too slow for interactive graphics. Newer techniques that use the z-buffer to render images with a non-photorealistic look have also appeared (Lake et al., 2000). They are restricted to using the polygonal model of the object as the edges of the shown image, but produce very appealing imagery.

Figure 2: The same design with the usual CAD hidden-line-removal look. (Schumann et al., 1996)
Markosian et al. (1997) have created a system that sacrifices some accuracy in order to render pen-and-ink style graphics in real time. They use probabilistic methods to reduce the amount of edge tracing needed for each frame: they trace only some of the edges and therefore save a lot of time. To increase the probability of finding all of the edges they exploit the fact that silhouettes change only a little between subsequent frames, so the silhouette edges are most likely the same as in the previous frame. They therefore trace the previous frame's silhouette edges first to increase the chances of catching the correct edges.

Painterly rendering in real time has many problems. Frame-to-frame coherence of strokes is the most visible one: if the strokes move from frame to frame when the user changes the viewing position or direction, flickering effects occur. Hertzmann and Perlin (2000) solved this problem quite well for video and animation by painting over the last frame and using optical flow to translate the image according to the movement. Since Klein et al. (2000) were trying to achieve a style that would look as if a painter had created a painting of a moving scene, a constant stroke size for objects at different distances was required.

The rendering of non-photorealistic images is very slow, and it is therefore generally not possible to render them at interactive frame rates; to achieve acceptable performance the NPR calculations must be done off-line. Many virtual environments also suffer from very low visual detail. Using real images of a scene as the basis of a computer-generated display creates a visually compelling and rich environment with accurate lighting of objects. As a solution to both problems, Klein et al.
(2000) have created a system that uses an image-based representation of a real or synthetic environment and then filters it with some arbitrary NPR filter to create a combined non-photorealistic image-based representation. The general architecture used by Klein et al. (2000) is shown in Figure 3. Putting the NPR-generated imagery into the textures of the environment fixes the problem of frame-to-frame coherence of strokes. By using a special incarnation of mip-maps the authors have also been able to keep the size of the strokes approximately the same for all objects. This system allowed Klein et al. (2000) to create non-photorealistic virtual environments with a variety of styles and rich visual detail.

Figure 3: Klein et al.'s (2000) basic architecture for creating non-photorealistic virtual environments.
The following chapters are divided so that the implementation described by Klein et al. (2000) is detailed in chapter 2 and the solution for real-time pen-and-ink rendering by Markosian et al. (1997) is described in chapter 3. Chapter 4 provides a summary of the issues discussed in this paper and points to some future areas of research in this field of computer science.

2 PAINTERLY RENDERING IN REAL-TIME

Creating real-time NPR virtual environments has several challenges: interactivity, visual detail, controlled stroke size and frame-to-frame coherence of strokes. NPR methods usually take seconds to minutes to render just one frame, but a virtual environment requires high enough frame rates to be compelling for the user, which usually translates to more than 20 frames per second. Clearly the NPR processing has to happen off-line. The visual appearance of virtual environments usually lacks detail due to the limited size and detail of the textures and the global illumination model. By using image-based rendering techniques these problems can be solved with actual photographs of a real environment; Debevec et al. (1996) used a variant of this technique to create a stunning effect for the Campanile video shown at SIGGRAPH 1996. This, however, creates new problems when under-sampled textures are used. If the captured images are used as textures as they are, visible seams will appear on the objects' surfaces when the textures are filtered with NPR filters. Using an image-based representation with special filtering produces an acceptable level of visual detail and coherence. This method, used by Klein et al. (2000), is described in the following sections.

2.1 Image-based representation

Creating the image-based representation involves several steps, each of which is detailed below with an image. First a basic geometry for the environment is created, as shown in Figure 4.
It does not need to be complex; it just needs to represent the largest objects, since the image-based textures will later add detail to the scene. The purpose of this geometry is to serve as a basis to which the photographs captured later can be fixed.

Figure 4: Basic geometry for an image-based representation of the gallery created by Klein et al. (2000).
Figure 5: Computer-generated pictures for the image-based representation. (Klein et al., 2000)

The second step is to take either real or computer-generated pictures of the environment. There should be enough pictures to cover most of the environment so that as few empty spaces as possible are left in the scene. In Figure 5 we see computer-generated pictures of the gallery. These pictures are then mapped onto the geometry. At this point it is possible to check whether the coverage of the images is good enough or whether some additional pictures are needed to cover a specific area of the environment. The mapped textures have seams between them, since it is almost impossible to match the borders of the images perfectly. If the textures were filtered with an NPR filter now, there would be visible seams in the textures. The seams are also apparent in the unfiltered textures in Figure 6.

Figure 6: Pictures are then mapped to the basic geometry. (Klein et al., 2000)
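The mapping step itself is not spelled out in the source. As an illustration only (all names here are hypothetical, not Klein et al.'s actual code), a point on the scene geometry can be mapped to a texel of a captured photograph with a simple pinhole camera model; a point that projects outside the photo is a coverage hole:

```python
import numpy as np

def project_to_image(point, cam_pos, cam_R, focal, img_size):
    """Project a 3D point on the scene geometry into a captured photo,
    giving the texel to sample. Returns None if the point is behind the
    camera or falls outside the image (i.e. a coverage hole)."""
    # Transform the world-space point into the camera frame.
    p_cam = np.asarray(cam_R, float) @ (np.asarray(point, float) -
                                        np.asarray(cam_pos, float))
    if p_cam[2] <= 0:                      # behind the camera
        return None
    # Pinhole projection with the principal point at the image centre.
    u = focal * p_cam[0] / p_cam[2] + img_size[0] / 2
    v = focal * p_cam[1] / p_cam[2] + img_size[1] / 2
    if 0 <= u < img_size[0] and 0 <= v < img_size[1]:
        return (u, v)
    return None
```

For example, a point one unit in front of an axis-aligned camera projects to the image centre, while a point behind the camera yields None and must be covered by another photograph.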
Figure 7: Coverage is computed for the mapped pictures. (Klein et al., 2000)

The next step is to compute the coverage of the pictures and use a hole-filling algorithm to fill any remaining holes in the textures. This process is depicted in Figure 7. To avoid seams between the different textures of a face, Klein et al. chose to group the textures of each face together. The grouped texture for one wall of the gallery is shown in Figure 8. This way the NPR filter is applied to the whole face of an object and the result is a seamless NPR-filtered texture.

Figure 8: The textures of each face are grouped together to avoid seams in the NPR-filtered textures. (Klein et al., 2000)
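The source does not say which hole-filling algorithm Klein et al. used. A minimal, illustrative sketch (not necessarily their method) is diffusion-style in-painting: uncovered texels are repeatedly filled from the average of their covered 4-neighbours until the texture is complete:

```python
import numpy as np

def fill_holes(texture, mask, max_iters=100):
    """Fill uncovered texels (mask == False) by repeatedly averaging
    their covered 4-neighbours. A simple stand-in for the hole-filling
    step; real systems may use more sophisticated in-painting."""
    tex = texture.astype(float).copy()
    covered = mask.copy()
    for _ in range(max_iters):
        if covered.all():
            break
        # Zero-padded sums of covered neighbour values and counts.
        padded_t = np.pad(tex * covered, 1)
        padded_c = np.pad(covered.astype(float), 1)
        nbr_sum = (padded_t[:-2, 1:-1] + padded_t[2:, 1:-1] +
                   padded_t[1:-1, :-2] + padded_t[1:-1, 2:])
        nbr_cnt = (padded_c[:-2, 1:-1] + padded_c[2:, 1:-1] +
                   padded_c[1:-1, :-2] + padded_c[1:-1, 2:])
        # Fill only holes that already have at least one covered neighbour,
        # so the fill front grows inward one texel per iteration.
        fillable = (~covered) & (nbr_cnt > 0)
        tex[fillable] = nbr_sum[fillable] / nbr_cnt[fillable]
        covered |= fillable
    return tex
```

A single uncovered texel surrounded by covered ones is filled in one iteration with the mean of its four neighbours.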
2.2 Non-photorealistic filtering

When the image-based representation of the environment has been formed, the next step is to filter its textures with some non-photorealistic filter. This step produces the final textures to be used in the run-time walkthrough.

2.2.1 Art-maps

To avoid having strokes of different sizes appear in the imagery, Klein et al. (2000) used mip-map based textures filtered with the selected NPR filter so that the sizes of the strokes would appear constant when the object is viewed from different distances. As the graphics hardware reproduces the imagery in real time, the appropriate mip-map texture is selected for each object according to its distance from the viewer: the farther the object is from the viewer, the smaller the texture selected. Usually the two closest matching textures are blended together to better maintain frame-to-frame coherence. A screenshot of a real-time walkthrough is shown in Figure 10.

Figure 10: The real-time walkthrough shows the final look of the environment. (Klein et al., 2000)
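The art-map construction (downsample first, then run the NPR filter on every mip level, so strokes keep a fixed size in pixels) can be sketched as follows. The `npr_filter` here is a hypothetical stand-in that posterizes in fixed-size blocks; it is not one of Klein et al.'s actual NPR filters:

```python
import numpy as np

def npr_filter(img, stroke_px=4):
    """Placeholder NPR filter: average fixed-size blocks, so "strokes"
    occupy stroke_px pixels at every level. A real system would run a
    painterly or pen-and-ink filter here instead."""
    h, w = img.shape
    out = img.copy()
    for y in range(0, h, stroke_px):
        for x in range(0, w, stroke_px):
            out[y:y+stroke_px, x:x+stroke_px] = \
                img[y:y+stroke_px, x:x+stroke_px].mean()
    return out

def build_art_maps(texture, stroke_px=4):
    """Build an art-map pyramid: each mip level is downsampled from the
    source and then filtered independently, so the stroke size in pixels
    is constant per level and hence roughly constant on screen."""
    levels = []
    img = texture.astype(float)
    while min(img.shape) >= stroke_px:
        levels.append(npr_filter(img, stroke_px))
        # 2x2 box downsample for the next, coarser mip level.
        img = 0.25 * (img[0::2, 0::2] + img[1::2, 0::2] +
                      img[0::2, 1::2] + img[1::2, 1::2])
    return levels
```

At run time the hardware (or renderer) picks the level matching the object's distance and blends the two nearest levels, as described above.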
Figure 11: Wavy lines mask the seams at the edges of the environment. (Klein et al., 2000)

2.3 Results

Before the final image is shown to the user, the edges of the environment are masked with wavy lines in order to hide the seams apparent at object face boundaries. A sample screenshot is shown in Figure 11. This environment was generated on a modern graphics workstation with the sustained frame rate kept easily above 30 frames per second.

3 SKETCHY LOOK IN REAL-TIME

The main problem in rendering pen-and-ink images in real time is that these images contain thousands of pen strokes that must be tested for visibility, or rendered, for each frame to ensure correct visibility. Markosian et al. (1997) describe a method that increases the performance of pen-and-ink rendering to the level required by interactive graphics. Their method does, however, impose several limiting assumptions in order to work: the model must consist only of polygons and be non-self-intersecting, the light must come from the camera position, and no edge may have more than two adjacent faces. This basically limits their method to showing simple objects from the outside.

3.1 Specific problems

Keeping the drawn lines in approximately the same place from frame to frame basically requires that the strokes be drawn in object space. Because the wavy lines that Markosian et al. (1997) want to create, as shown in Figure 12, do not correspond directly to the polygonal model of the object, the edges cannot be evaluated with simple silhouette edge detection based on z-buffering; a silhouette edge trace is needed. But traditional methods such as Appel's algorithm require complete ray tests of the model, which are expensive and as such too slow to use in interactive graphics.
Figure 12: A part shown as a very rough sketch. (Markosian et al., 1997)

3.1.1 Appel's algorithm for finding silhouette edges

Appel's algorithm does a complete ray-casting visibility check to determine the visible and silhouette edges of the object. It starts from some edge position on the object surface and checks it for visibility; similar tests are conducted for all edges of the object. This kind of method is much too slow for interactive graphics when the edge count of the object is reasonably high.

3.2 Solution

Markosian et al. (1997) used probabilistic methods to identify the silhouette edges of an object. Their method also preserves the frame-to-frame coherence of the rendered strokes in image space. They start with a random sample of edges and test whether they are silhouette edges. If an edge is a silhouette edge, the method traverses it forwards and backwards to find adjacent silhouette edges. Since the silhouette edges change relatively little from frame to frame, the previous frame's silhouette edges are also tested for still being silhouette edges. This contributes to the frame-to-frame coherence and also increases the probability of finding all silhouette edges of the object.

This method has a problem when an edge lies right along the camera view direction: it creates a singularity that Markosian et al. call a cusp vertex. A cusp vertex is one that (Markosian et al., 1997):

1. is adjacent to exactly 2 silhouette edges, one front-facing and the other back-facing,
2. is adjacent to more than 2 silhouette edges, or
3. is adjacent to a border edge.

The method uses these cusp vertices to further increase the performance of the calculation by only testing for changes in the silhouette at the detected cusp vertices.
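The core silhouette test (an edge is a silhouette edge when one adjacent face is front-facing and the other back-facing) and the probabilistic search can be sketched as follows. This is a simplified illustration with hypothetical names, not Markosian et al.'s implementation; their full method also walks along the silhouette from each hit and handles cusp vertices, both omitted here:

```python
import random
import numpy as np

def face_normal(verts, face):
    """Normal of a triangular face given by three vertex indices."""
    a, b, c = (np.asarray(verts[i], float) for i in face)
    return np.cross(b - a, c - a)

def is_silhouette(verts, f1, f2, eye):
    """An edge is a silhouette edge iff its two adjacent faces lie on
    opposite sides as seen from the eye point (one front-, one back-facing)."""
    p1 = np.asarray(verts[f1[0]], float)        # any point on face f1
    d1 = np.dot(face_normal(verts, f1), np.asarray(eye, float) - p1)
    p2 = np.asarray(verts[f2[0]], float)
    d2 = np.dot(face_normal(verts, f2), np.asarray(eye, float) - p2)
    return d1 * d2 < 0

def find_silhouettes(verts, edges, eye, prev=(), samples=8):
    """Probabilistic search: test last frame's silhouette edges first
    (frame-to-frame coherence), then a random sample of all edges.
    `edges` maps an edge id to its two adjacent faces."""
    candidates = list(prev) + random.sample(list(edges),
                                            min(samples, len(edges)))
    found = set()
    for e in candidates:
        f1, f2 = edges[e]
        if is_silhouette(verts, f1, f2, eye):
            found.add(e)
    return found
```

Passing the previous frame's result as `prev` makes the search cheap and coherent: edges that were silhouettes a frame ago are almost always silhouettes still, exactly the observation the method relies on.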
Figure 13: A part shown as a sketch with some shading. (Markosian et al., 1997)

3.2.1 Shading

The algorithm can be extended to include hidden surface removal so that some simple shading can be implemented. The generated visibility information is used to place shading strokes on the surface of the object in a way that maintains the frame-to-frame coherence of the strokes. A stroke is rendered when a vertex is turned away from the camera by more than some predefined threshold angle; see Figure 13. The direction of the strokes is given by the cross product of the surface normal and the camera normal.

3.3 Results

The generated imagery was fast enough to display fairly complex images on a 1996 graphics workstation, and the selected shading algorithm generates quite good-looking images with a hand-drawn look. The main suggested use for the method was real-time previews of very complex objects, and in this role the described method works reasonably well; see Figure 14.

4 SUMMARY

The previous chapters described two techniques for creating real-time non-photorealistic renderings. This kind of imagery has many uses, such as CAD walkthroughs and object previews. These techniques are already being used as a basis for many new methods and will presumably be of great use in the field of computer-generated graphics.

Figure 14: Venus shaded in pen-and-ink style. (Markosian et al., 1997)
REFERENCES

Schumann, J.; Strothotte, T.; Raab, A.; Laser, S. 1996. Assessing the Effect of Non-photorealistic Rendered Images in CAD. Proceedings of CHI 96, pp. 35-41.

Klein, A. W.; Li, W.; Kazhdan, M. M.; Correa, W. T. 2000. Non-photorealistic virtual environments. Computer Graphics (SIGGRAPH 2000), pp. 527-534.

Markosian, L.; Kowalski, M. A.; Trychin, S. J.; Bourdev, L. D.; Goldstein, D.; Hughes, J. F. 1997. Real-time nonphotorealistic rendering. Computer Graphics (SIGGRAPH 97), pp. 415-420.

Debevec, P. E.; Taylor, C. J.; Malik, J. 1996. Modeling and rendering architecture from photographs: A hybrid geometry- and image-based approach. Computer Graphics (SIGGRAPH 96), pp. 11-20.

Lake, A.; Marshall, C.; Harris, M.; Blackstein, M. 2000. Stylized Rendering Techniques for Scalable Real-Time 3D Animation. Proceedings of NPAR 2000.

Hertzmann, A.; Perlin, K. 2000. Painterly Rendering for Video and Interaction. Proceedings of NPAR 2000.