Chapter Four: Feature Line Textures

Preceding chapters described the motivation for adding a sparse, opaque texture to an overlaid transparent surface, to help better communicate its shape and relative depth distance from underlying opaque structures while preserving its overall transparent character. The goal of this chapter is to describe the perceptual motivation for selectively opacifying valley and ridge regions and to present an implementation that I developed, independently and concurrently with similar efforts elsewhere, to do this.

Inspired by the ability of gifted artists to define a figure with just a few strokes, I would like to define a technique for illustrating layered transparent surfaces, in three dimensions, so that they can be both clearly seen and easily seen through at the same time. My aim is to efficiently and effectively communicate the essential features of the superimposed surface in an intuitively meaningful way, using a clear and simple representation that is appropriate for dynamic viewing conditions, minimizes extraneous detail and allows a largely unobstructed and undistorted view of underlying objects.

Research in pictorial representation and image understanding indicates that line drawings, and outline in particular, are a natural and universally understood means of communicating information about objects. A wide range of experiments (cited in [Kennedy 1974]) with animals, children and individuals from a variety of culturally diverse populations provides evidence that the ability to recognize objects from line drawings is inborn, as opposed to a learned skill. The ease with which we interpret line drawings may suggest an intrinsic relationship between this type of representation and the way our visual system processes and stores visual information. David Marr's "primal sketch" theory of visual information processing [Marr 1976], for example, is founded on the observation that many of the essential elements of image understanding can be derived from the type of information that is encoded in a line drawing representation of a scene; he postulates that the extraction of this primal sketch is a first step in visual information processing, at least for two-dimensional intensity images.

Experiments by Biederman and Ju [1988] showed that familiar objects could be identified slightly more quickly and accurately in simple line drawings than in color slides when the images were presented, with masking, for extremely brief (50ms) exposure durations. (The accuracy and speed of object identification were equivalent for the two presentation methods when exposure durations were longer, and the objects could be correctly identified in nearly all of the images when viewing times were extended to 100ms.) These results provide further evidence both for the importance of the information that a line representation can carry and for the merits of a display mode in which important information is highlighted while less essential detail is removed to improve clarity. Figure 4.1 illustrates some of the sample stimuli used in these object recognition experiments. Although the essential elements of many objects can be completely captured in a line drawing, it's important to recognize that simple lines (particularly contours or outlines) don't always provide sufficient information for object representation in general.
Biederman and Ju [1988] noted that objects whose projections were structurally very similar (for example, a peach and a plum) could not be easily differentiated on the basis of outline or contour information alone. They also observed that certain types of objects, such as hairbrushes, were poorly characterized by their outline.

Figure 4.1: Examples of sample stimuli used by Biederman and Ju [1988] in experiments showing slightly lower error rates and faster reaction times for naming (or verifying the identity of) objects represented by line drawings as opposed to color photographs, for very brief, masked exposure durations.

Perceptual studies in facial recognition have consistently shown that observers perform poorly on tasks measuring the ability to identify familiar people from line drawings (and particularly poorly when only outlines of depth and intensity discontinuity are provided), even though the same individuals can be easily identified in the photographs upon which the line drawings are based. In experiments by Davies et al. [1978], individuals were correctly identified 90% of the time in photographs but only 23% of the time in outline (depth discontinuity and feature boundary) representations and 47% of the time in more detailed line drawings (in which intensity discontinuities were represented in addition to the outline). Subsequent experiments by Bruce et al. [1992] showed that the inclusion of even the most basic form of bi-level shading (see figure 4.21-right for an example) could dramatically improve performance on facial recognition tasks.

Figure 4.2: Examples of photographic vs. line drawing facial recognition task stimuli, generated by [Rhodes et al. 1987]. Left: a black & white photograph. Center: a line tracing of the photograph. Right: a computer-generated caricature derived from the line drawing.

The images in figure 4.2, from [Rhodes et al. 1987], are indicative of the type of sample stimuli used in these kinds of facial recognition experiments. Interestingly, the studies by Rhodes et al. [1987] showed that familiar individuals could be more easily recognized in line drawings when their most atypical facial features were selectively exaggerated, in the style of a caricature.

It appears that simple outline drawings, in which lines are used only to mark intensity and depth discontinuities, lack certain essential information required for some higher-level object identification tasks. Although we can often easily categorize an object based on a simple line representation, experiments such as those by Price and Humphreys [1989] indicate that we do use other surface information, particularly shape, texture or characteristic color, to make finer distinctions between similar members of the same class. Price and Humphreys [1989] suggest that edge-based and surface-based recognition processes might operate in a cascaded manner, with the effects of the latter being most significant when the information encoded in the former is inadequate for the task. Despite the insufficiency of line drawings for some higher-level object differentiation tasks, it remains evident that line representations can convey a great deal of information about the objects in a scene.

For a number of reasons, which will be discussed in greater detail in chapter five, texturing methods in which the surface opacity is explicitly varied according to the surface shading parameters do not appear particularly promising; preliminary experiments described in the appendix appear to confirm this view. However, a sparse, opaque texture that approximates a three-dimensional line drawing might be useful for emphasizing the important features of a transparent surface without unduly occluding underlying objects. Recognizing the roles of various types of lines in visual perception, the question then becomes: how can we best define a set of descriptive lines to clearly and efficiently communicate the essential shape features of a transparent surface in a perceptually intuitive way?

4.1: Silhouette and contour curves

Silhouette and contour curves are the two-dimensional projections of points on a surface in 3-space whose surface normal is orthogonal to the line of sight. Silhouette curves form a closed outline around the projected form, while contour curves may appear within the projected form and may be discontinuous. Figure 4.3 illustrates the silhouette and contour lines in a projected image of a simple object, from [Koenderink 1990].

Figure 4.3: Examples of silhouette and contour curves in the projection of a simple object. Left: detailed image of a banana-shaped surface, from [Koenderink 1990]. Center: the silhouette curve. Right: the contour curve.
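For reference, the defining condition just given can be stated compactly. The notation below is my own restatement of that definition, not a formula reproduced from the original text:

    $$ \mathrm{rim}(S, c) \;=\; \{\, p \in S \;:\; \mathbf{n}(p) \cdot (p - c) = 0 \,\} $$

where S is the surface, n(p) its normal at p, and c the center of projection; under orthographic projection with fixed view direction v, the condition reduces to n(p) · v = 0. The silhouette and contour curves are then the image-plane projections of this set of rim points.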

Silhouette and contour curves are ubiquitous in two-dimensional line art and illustration; it's difficult to imagine a successful line drawing that isn't based in large part upon these curves. Silhouette and contour lines possess a number of features that make them useful for image understanding and object recognition. Of particular importance, contour lines mark the depth discontinuities in a two-dimensional image, and silhouette lines separate the figure from the ground. One popular method for automatically generating line drawings of three-dimensional models, proposed by Saito and Takahashi [1990], operates directly from this observation, defining the contour lines in a two-dimensional projection by using a gradient operator to locate discontinuities in the depths of the surfaces that map to adjacent pixels in the projected image. Figure 4.4, from [Saito and Takahashi 1990], shows examples of some perceptually-enhanced images produced by this method.

Figure 4.4: Enhanced images, generated by Saito and Takahashi [1990], in which various types of depth discontinuities, found using an edge operator on a two-dimensional depth map, are highlighted with black or white lines. Left: Upper left quadrant: the original nut image. Lower left quadrant: black lines represent the detected locations of zero-order depth discontinuities; white lines represent the detected locations of first-order (slope) discontinuities. Upper right quadrant: lines of zero-order depth discontinuity superimposed over the original image. Lower right quadrant: lines of both depth and slope discontinuity are superimposed. Right: an image generated using a depth map computed by a ray tracing program that stored, at every pixel location, the distance traveled by each ray. This results in depth discontinuities being signaled at the edges of the reflected images of the balls, but not at the shadow edges or specular highlights. (As we have seen in chapter three, however, it may be incorrect to represent specular highlights as residing at the same depth as the convex surface under which they appear in a stereo view.)

There is abundant psychophysical evidence that our visual system is adept at extracting information about the three-dimensional shape of an object from the outline of its two-dimensional projection. Pioneering work by Wallach and O'Connell [1953] showed that the three-dimensional nature of an object could be communicated through a sequentially-viewed series of two-dimensional silhouette images, even though each of the projections, when seen individually, appeared flat. In a series of experiments using shadows cast in nearly orthographic projection, they found that subjects who viewed silhouette images of smoothly rotating three-dimensional objects, constructed of bent wire or bounded by planar faces, consistently reported the perception of a rigidly rotating three-dimensional form, as long as both the figure and the axis of rotation were defined so that both the length and the orientation of the silhouette edges varied over time. When these conditions were not met, or when smoothly-curved solid objects were used, subjects reported the perception of a two-dimensional figure, shrinking and stretching or rotating in the plane. Although [Ullman 1979] later proved mathematically that three-dimensional structure (i.e., exact coordinates in three-space) could be uniquely described by as

few as three different views of four noncoplanar points on the surface of a rigid object, and [Braunstein et al. 1987] showed that a perception of three-dimensional form could be induced from stimuli at least as sparse, if not more so, the difficulty of inducing such a perception from the changing shape of smooth two-dimensional curves remained. As a smoothly curving surface rotates relative to the line of sight, the set of surface points whose normal directions are orthogonal to the viewing direction (namely, the set of points corresponding to the rim) will change; the result is that, in general, different surface points will project onto the outline in different views of an object. It does not appear, however, that the perception of three-dimensional shape from deforming silhouettes requires a stable correspondence of points on the projection to points on the surface. Todd [1985] found that subjects were able to perceive three-dimensional form from the silhouettes of slowly rotating, smoothly curved solid objects when other verbal or figural cues were introduced that explicitly defined the nature of the rotation, for example by describing the image generation technique ahead of time or by using multiple objects in the stimuli whose relative positions (in 3-space) were rigidly defined. Exactly how the visual system might infer three-dimensional shape from the changes over time in the shape of the silhouette of a single, smoothly curved, rotating solid object remains a topic of active investigation [Cortese and Andersen 1991].

Important information about the three-dimensional shape of an object can also be derived from various static features of the contour lines in a single two-dimensional projection of the surface [Koenderink 1984, Richards et al. 1987]. The following has been proven: where the curvature of the contour is positive, the surface will be locally elliptic (although for opaque surfaces only the convex patches will map to the contour); where the curvature of the contour is negative, the surface will be locally hyperbolic (or "saddle-shaped"); and where the contour is locally flat, the surface will also be flat, at least in the direction of the contour. Inflections in the curvature of the contour correspond to parabolic lines on the surface (where the Gaussian curvature changes from positive to negative, marking the boundary between an elliptic and a hyperbolic patch), and cusps in the contour occur when the direction of projection is aligned with an asymptotic direction on the surface. Figure 4.5, adapted from images in [Koenderink 1990], gives examples of some of these different types of surface patches and their projected contours. A discussion of the relevance of parabolic lines to our understanding of surface shape, and a description of other surface decomposition schemes, are given in later sections.

Figure 4.5: Surface patches of different shapes and their associated contour lines. Green: positively curving contour bounding an elliptic patch. Blue: negatively curving contour bounding a hyperbolic patch. Red: contour of zero curvature bounding a cylindrical patch. The red lines on the surface on the left mark the parabolic curves, across which the surface is locally flat and at which the contour inflects. Constructed using surfaces from Koenderink [1990].
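The sign relationships just described (and illustrated in figure 4.5) can be summarized in a single formula proved in [Koenderink 1984]. The statement below is my own paraphrase of that result, not an equation reproduced from this dissertation:

    $$ K(p) \;=\; \kappa_a(p)\,\kappa_r(p) $$

where K is the Gaussian curvature of the surface at a rim point p, kappa_a is the apparent curvature of the projected contour at the image of p, and kappa_r is the normal curvature of the surface at p along the line of sight. Because kappa_r is necessarily positive at the visible rim of an opaque object (the surface must bend away from the viewer there), the sign of the contour's curvature directly reveals the sign of the surface's Gaussian curvature.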

The presence of T-junctions in a contour has been shown to be a strong local cue for occlusion, and X-junctions in a contour locally suggest the possibility of surface transparency [Cavanagh 1987]. A variety of other, more global, contour features catalogued by Koenderink [1990] under names such as "lips", "beaks" and "swallowtails" appear only in the projections of transparent surfaces. Figure 4.6, from [Koenderink 1990], gives an example of one of the conditions under which either T- or X-junctions are typically found.

Figure 4.6: Some contour features. Left: T-junction in the contour of an opaque surface. Right: swallowtail in the contour of a transparent surface. Adapted from [Koenderink 1990].

Silhouette and contour curves are, very clearly, some of the most important elements in two-dimensional drawings; they communicate essential information about the three-dimensional structure of the scene in universally understandable and perceptually intuitive ways. When we turn our attention to three-dimensional perception, using more than a single projection with the introduction of dynamic motion and stereo cues, we find that the inherently two-dimensional character of silhouette and contour features limits their direct applicability, and the direction to take in extending these concepts for three-dimensional illustration is not at all clear.

Despite the importance of the information carried by silhouette and contour curves, the perceptual benefits of explicitly marking these lines as opaque stripes on a transparent surface dynamically viewed in three dimensions turn out to be limited, primarily because of their viewpoint-dependent nature. As mentioned earlier, silhouette and contour curves do not, in general, project from a fixed set of points on a smoothly curving surface under arbitrary viewing conditions. When I tried to highlight figure/ground distinctions and depth discontinuities by applying an opaque line texture to the surface along its rim, the first perception informally reported by a majority of casual observers who viewed the rotating object was the distracting and undesirable impression of seeing lines crawling around over the surface. A related problem with explicitly marking rim points on a surface to reemphasize the contour features is that, in a stereo pair of images, the set of surface points that map to the contour in the view from one eye will be different from the set of points that map to the contour from the other eye. When we look at a stereo pair in which these surface points have been explicitly highlighted, our visual system tries to establish a correspondence between these two distinct lines but cannot, and our ability to easily and accurately perceive surface shape is hampered rather than enhanced. This effect is illustrated in the images in figure 4.7, which I generated using a silhouette-finding algorithm whose implementation details are fairly straightforward and which I do not describe in this dissertation. If silhouette lines are to be at all useful, even in static two-dimensional images, some care must be taken in their presentation.
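Although the silhouette-finding algorithm itself is not described here, a minimal sketch may help make the viewpoint dependence concrete. The following fragment is my own illustration, not the dissertation's code: it flags, for a given eye position, the vertices of a surface mesh whose normals are nearly orthogonal to the line of sight, which is exactly the set that shifts, and therefore appears to crawl, as the object rotates.

    import numpy as np

    def rim_vertices(verts, normals, eye, eps=0.05):
        """Flag vertices whose normal is nearly orthogonal to the line of sight.

        verts, normals: (N, 3) float arrays; eye: (3,) viewpoint.
        Returns a boolean mask selecting a thin band of near-rim points.
        """
        view = verts - eye                                   # line-of-sight vectors
        view /= np.linalg.norm(view, axis=1, keepdims=True)
        n = normals / np.linalg.norm(normals, axis=1, keepdims=True)
        # n . v = 0 exactly on the rim; a tolerance eps gives a drawable band
        return np.abs(np.einsum('ij,ij->i', n, view)) < eps

Recomputing this mask for a new eye position (or for each eye of a stereo pair) selects a different set of surface points; this is the source of both the crawling-lines artifact and the stereo correspondence failure described above.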

Figure 4.7: A stereo pair of images in which the surface points mapping onto the contour in the perspective projection from each eye are highlighted.

If the line appears flat (belonging to the plane of the image) and attached neither to the figure nor to the background, the three-dimensional quality of the entire image will be diminished [Sullivan 1922]. Support for the general idea that line enhancements are more effective when rendered as if they are painted on a surface, rather than painted on an image of the surface, is also given by [van Wijk 1992]. From the above discussion, it seems that for the case of dynamically-viewed silhouette curves, attaching the lines to the object doesn't produce pleasing results. Marking the outline on the backplane might be preferable in this situation; unfortunately, I didn't follow up on this idea and have no images to support this presumption.

A number of other possible techniques exist for emphasizing figure/ground distinctions besides explicitly marking the silhouette line. One option, demonstrated by Kaneda et al. [1987] for superimposed transparent surfaces, is to artificially enhance the differences between foreground and background colors. Another option, proposed by Kay and Greenberg [1979], is to emphasize the rim area in a gradual way, incrementing the opacity of points on a transparent surface by a smoothly increasing function of the angle between their surface normal direction and the direction of view. This approach might offer greater promise than explicit line drawing for emphasizing contour regions (and thereby communicating figure/ground and depth discontinuities) on dynamically-viewed three-dimensional models in a more visually pleasing and less distracting or confusing manner. Although the problems of viewpoint dependency remain, the shifting opacity patterns defined by this method might bear a closer resemblance to natural shading effects, because of their gradual, as opposed to abrupt, nature and their association with surface areas through which, primarily because of refraction, light is not ordinarily directly transmitted. Because of this analogy with shading, it is possible that we might be able to intuitively accept the changing opacity of silhouette regions thus defined, in a manner similar to that in which we naturally ascribe variations in surface color to changing illumination rather than to flowing pigments. Nevertheless, if we are looking for stable, meaningful regions of a three-dimensional surface that, when highlighted on an outer transparent shell, allow its shape and relative depth to be better understood, it appears that we must look further.

4.2: Valley and ridge curves: an introduction

If the usefulness of silhouette and contour curves as a sparse opaque texture is hampered by their viewpoint-dependent character, can we define another set of lines that will be effective in communicating surface shape, based on features that are intrinsic to the object rather than to its two-dimensional projection? The answer is yes, and the lines are the ridges and valleys.

In his book Solid Shape, Jan Koenderink [1990] defines ridge and valley lines as the locus of points where the principal curvature assumes an extreme value along the line of curvature. As these standard differential geometry terms may not be familiar to all, a brief, more explicit description will be given. At any non-spherical point on a smoothly curving surface, there will be a single direction in which the normal curvature is greatest, which is called the first principal direction. The orthogonal direction, which has been proven for these points to be the direction of least curvature, is called the second principal direction. If we walk along a surface, starting at different points and moving in either the first or the second principal direction, we will trace out the two families of lines of curvature. Some of these lines are shown in black on the ellipsoid of figure 4.8. If we are walking along one of these lines and encounter a positive local maximum of curvature in our direction of motion, this will be a ridge point; a negative local minimum in the directional curvature along a line of curvature will mark a valley point. There are three ridge lines on the ellipsoid shown in figure 4.8, marked in red, green and blue. On this very regular surface the ridge lines happen to be planar curves and correspond to the axes of symmetry, although neither of these properties will hold true in general.

Figure 4.8: Selected lines of curvature and the three ridges of an ellipsoidal surface. The ridges coincide with the planes of symmetry in this example and have been individually highlighted in red, green and blue. Adapted from an illustration in [Hilbert and Cohn-Vossen 1952].

4.3: Perceptual motivation for highlighting valley lines

Because they are defined by the local surface geometry, ridge and valley lines are intrinsic shape features that remain fixed in their location on a surface, independent of the viewing angle. The idea of explicitly marking the set of valley lines as an opaque texture on a transparent surface to enable a better perception of surface shape shows promise because, in several different ways, valley lines appear to correspond to perceptually important features of a surface.

The question of which points on a surface might be most visually significant, or most useful for shape representation, has received a great deal of attention over the years. In 1954, Attneave proposed that information was distributed differently across different parts of an image, and that it was concentrated, in the case of a two-dimensional silhouette, along the edges or lines of intensity discontinuity and, in particular, at the points of sharpest curvature along an edge. Attneave supported this theory with an experiment in which subjects were asked to mark the 10 points that they felt best represented an abstract blobby figure. He reported that a

disproportionate number of the points specified were located at the places along the outline where the shape was least flat. Figure 4.9-left, reproduced from [Attneave 1954], gives an example of the experimental results. Attneave also presented a drawing of a sleeping cat, shown in figure 4.9-right, to illustrate the apparent ease with which a natural object can be represented by a few corner points and their connecting information.

Figure 4.9: Left: a summary of the frequency with which individual points along a contour were chosen as one of the 10 points best representing the shape. Right: a drawing of a sleeping cat, derived from 38 points of relatively sharper contour curvature and the straight line connections between them. Both figures are from [Attneave 1954].

Green and Courtis [1966], based on observations of the way cartoonists use incomplete lines to suggest figures, argued that it was only by fortuitous circumstance that information sometimes appeared to be more heavily concentrated in the highly curved regions of an object's contour. Using the same figure of the sleeping cat as an example, they suggested that the shape of an object was no more poorly characterized by partial contour segments between the highly curved points than by partial segments directly upon them. Figure 4.10, reproduced from this paper, illustrates these two different contour deletion strategies.

Figure 4.10: A comparison of figures in which deleted sections of the contour are located a) at the most highly curved regions of a figure or b) midway along the straight lines between the most highly curved regions, from [Green and Courtis 1966].

A series of experiments by Kennedy and Domander [1985] cast further doubt on Attneave's theory that the highly curved areas of the contour are the most important for shape representation and object recognition. If, as Attneave had hypothesized, the corner points carried most or all of the shape information, drawings in which these points were erased should have been much more difficult to recognize than drawings in which segments were erased from the

contour in flat areas only. Kennedy and Domander measured the ability of subjects to recognize six objects from line drawings in which the straight edges of the contour lines were represented by a pair of short dashes placed a) separately at each end of the edge (adjacent to the maximally curved corner points), b) together in the middle of the edge, or c) evenly spaced across the edge. Figure 4.11 shows some of the example stimuli. Kennedy and Domander found in these experiments that objects were correctly identified most often when the contour markings were evenly spaced, and that of the corner-only and midline-only representations, the latter were more often correctly identified than the former. In a separate preference experiment using the same stimuli, a majority of subjects indicated that the even-segment pictures showed the objects better than the pictures in which only the corner lines or only the midlines were shown.

Figure 4.11: Sample stimuli used by Kennedy and Domander [1985] to show that objects are more easily recognized in images in which the contour is represented by straight lines at the midpoints of flat areas than in images in which lines are drawn only at the corner regions.

The question of which contour segments are most critical for shape representation seems to have been finally laid to rest by Biederman and Blickle (cited in [Biederman 1985]). They observed that people appeared to reconstruct broken contours by smoothly interpolating, either linearly or curvilinearly, across the gaps, and hypothesized that objects would be most difficult to identify in images in which the deleted portions of the contour lines spanned a non-recoverable convexity. Through a carefully-designed sequence of experiments, in which they varied both the amount of contour deletion and the location of the deleted segments in images of 18 objects (defined by both straight and curved edges) presented over a variety of masked exposure durations, they were able to show convincingly that object recognition was impaired to a greater extent when the contour deletion was centered around the vertices of a figure than when it was centered midway between them. Figure 4.12 gives an example of the type of sample stimuli used in these experiments. Subjects were able to correctly identify the objects in almost all of the images, given enough time (error rates fell below 10% across the board when the exposure duration was extended to 750ms). However, when the masked exposure duration was brief (100ms) and a relatively large proportion (65%) of the contour was deleted, error rates climbed to 54% for the objects with vertices erased, as opposed to 31% for the objects with missing edge midsegments. In addition, contour deletion at the vertices had a consistently more detrimental effect on the mean correct response time than contour deletion at midsegments, for all exposure durations and all proportions of contour deletion.

Biederman conducted his contour-deletion experiments as part of a much larger effort to substantiate his theory of "recognition by components" [Biederman 1985], in which he proposes that object recognition is based on the perception of a complex shape as an arrangement of geometrical primitives, defines an elementary vocabulary of around 36 or so simple volumes, and describes how these component parts might be individually characterized by the properties of their opposing edges.

Figure 4.12: Sample stimuli used by Biederman and Blickle [Biederman 1985] in experiments measuring the effect of contour deletion on object recognition.

An essential element of this theory is the idea, originally proposed by Hoffman and Richards [1984], that people tend to perceive objects as collections of parts delimited by the regions of deep concavity in a surface. Earlier work by Marr and Nishihara [1978] had suggested that object recognition, at least for some shape forms, might be based on a modular, hierarchically organized, object-centered representation of three-dimensional shape; they used examples in which a multi-scale cylinder axis provided the skeletal definition of the form. Experiments by Biederman and Gerhardstein [1993] support the idea that object recognition is based on a viewpoint-invariant understanding of shape, compatible with a parts representation. The theory of recognition by parts has also been supported, in different forms, by the work of many others, including Pentland [1987], who proposed a part-and-process representation of shape based on combinations of primitives described by deformed superquadrics (mathematical surfaces that can be bent and twisted in various ways, like "lumps of clay"). Beusmans et al. [1987] suggest a theory of recognition by parts based on the construction of solid shapes from compact, convex subunits, while Blum and Nagel [1978] suggest that the segmentation of an object into subunits is based on the structure of its symmetric axis and on the shape of the contour relative to this axis. Researchers at the University of North Carolina have been developing a model of shape description based on "cores" [Burbeck and Pizer 1995], and many other theories abound. Unfortunately, a summary of research into the nature of shape description, object recognition and the many alternative theories of recognition by parts is beyond the scope of this dissertation.

Hoffman and Richards [1984] emphasized the importance of valley lines for defining the parts that make up an object and argued that the perceptual division of an object into component parts is based on the characteristics of the part boundaries rather than on the shapes of the individual parts themselves. (Such an approach would allow us to perceive an object as being made up of a collection of parts before, or without, obtaining a complete shape description of each of the parts.) Their argument that our visual system uses valley lines to define part boundaries stems from the observation that, in general, there will always be a concave discontinuity along the line where two separate surfaces are combined to form a third. When this discontinuity assumption is ever-so-slightly relaxed (as it must be, to apply to real-world problems in which generic as opposed to singular features are the rule), the corners turn into local minima of negative curvature along the first principal direction, or what Koenderink [1990]

refers to as valley lines. Hoffman and Richards provide a number of compelling examples, one of which is reproduced in figure 4.13, to bolster their argument that the perceptual division of an object into parts is mediated by the valley lines and not, as had been alternatively suggested, by the sign of the Gaussian curvature or the surface inflections (parabolic lines), by both the local maxima and local minima of curvature, or by surface curvature discontinuities in general.

Figure 4.13: A two-dimensional surface that appears to naturally partition along its valley lines, adapted from [Hoffman and Richards 1984]. (The image on the left was rotated by 180° to produce the image on the right; note how the apparent surface partitioning changes as the page is turned.)

Psychophysical experiments in visual shape recognition [Braunstein et al. 1989] support Hoffman and Richards' theory that people tend to naturally perceive objects as partitioning into more primitive subunits along their valley lines. In the studies by Braunstein et al. [1989], subjects were asked to specify which of four candidate shapes they recognized as having been a subunit of a previously displayed surface of revolution. In the cases where either choice would have been correct, subunits that had been defined by partitioning the surface along its valley lines were selected twice as often as subunits that had been defined by partitioning the surface along its ridge lines. The greater familiarity of the valley-defined parts would be expected if this was the decomposition that had been subjectively perceived. When the same subjects were later asked to explicitly mark the lines along which they perceived these test objects dividing into subparts, they placed the part divisions at or near negative minima of curvature 81% of the time, and these results appeared to be independent of the width of the object at the valley or ridge lines.

Another perceptual characteristic of valley regions, particularly those sharp valleys across which the curvature almost cusps, is that, primarily because of light source occlusion, these parts of a surface are more likely than others to be consistently less brightly illuminated under arbitrary lighting conditions. Highly specular, transparent surfaces don't ordinarily exhibit the same type of self-shadowing effects, but perhaps by drawing in the valley lines we might be able to incorporate some of the information described by this shape-from-shading cue into the surface representation in an intuitive and also orientation-independent way. Miller [1994] proposed a rendering algorithm called "accessibility shading", illustrated in figure 4.14, that represents this effect very nicely. In this approach, which can be thought of as a kind of first-order heuristic approximation to global illumination, shading computations are based on the proximity of opposing surface sections, an occurrence that is consistent with the existence of a valley line but not strictly dependent on it.
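As an aside, the accessibility computation that drives this kind of shading can be approximated quite simply on a sampled surface. The sketch below is my own simplified illustration of the idea, not Miller's published algorithm: for each sample point it estimates, by bisection, the radius of the largest sphere tangent at that point that touches no other sample.

    import numpy as np
    from scipy.spatial import cKDTree

    def accessibility(points, normals, r_max=1.0, iters=12):
        """Approximate per-point accessibility: the radius of the largest
        sphere tangent to the surface at p (center p + r*n) that comes no
        closer than r to any other surface sample. Bisection on r is a
        heuristic (emptiness is not strictly monotone in r) but behaves
        reasonably in practice."""
        tree = cKDTree(points)
        radii = np.empty(len(points))
        for i, (p, n) in enumerate(zip(points, normals)):
            lo, hi = 0.0, r_max
            for _ in range(iters):
                r = 0.5 * (lo + hi)
                d, j = tree.query(p + r * n, k=2)       # two nearest samples
                d_other = d[1] if j[0] == i else d[0]   # skip p itself
                if d_other >= r * 0.999:                # sphere is empty
                    lo = r
                else:                                   # sphere hits surface
                    hi = r
            radii[i] = lo
        return radii

Low accessibility values single out deep concavities, the same regions through which valley lines run, which is why the two representations produce such similar-looking darkening.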

Figure 4.14: An image that illustrates the effect of selectively darkening surface cavities, from [Miller 1994]. Left: the accessibility map. Luminance values represent, at each point on the surface, the radius of the largest just-fitting sphere (tangent to the surface at the point and not intersecting the surface at any other point). Center: the model rendered with Lambertian shading. Right: Lambertian shading with luminance decreased according to inaccessibility.

Despite the close association between valley regions and shaded regions of the surface, there are distinct advantages to using the former rather than the latter as the basis for defining geometrically as well as perceptually relevant surface features. The primary advantage of marking a surface along the valley lines, which are intrinsic surface features, rather than at the points of intensity minima after shading, which are features of both the object and its environment, is that while for a single image these two methods will produce very similar results, the latter is an inherently orientation-dependent technique and will not be perfectly stable under arbitrary viewing conditions.

It is evident that valley lines are perceptually important entities. By marking the valley lines as a sparse, opaque texture on a transparent surface, my aim is to concentrate occluding elements in the perceptually most important parts of a surface while generating, at the same time, an approximation to the type of line drawing that an artist might make. By specifically accenting the valley lines, I hope, if not to explicitly represent what people already at some level subjectively perceive, at least to illustratively supplement a naturally existing perceptual process.

Despite all of the above evidence of the perceptual importance of valley lines, and the indications of clear perceptual benefits to drawing them in when they are present, it is still the case that valley lines alone may not provide enough information to characterize a surface shape well. Under what circumstances can it be helpful to add some other lines to the texture definition, both to obtain better surface coverage and also to help more fully define the shape? (The topic of distributed textures I will leave until chapter five, and consider here only additional types of feature lines that might complement the valley line representation.) When the network of valley lines over a surface does not adequately represent enough of the prominent surface features, it may be the case that additional useful information can be conveyed by the explicit drawing of ridge lines, possibly differentiated from the valley lines by the intensity of their color.

4.4: The role of ridge lines

Ridge lines are the locus of points at which the normal curvature in the principal direction assumes a local maximum. Despite the fact that psychophysical evidence does not indicate an important role for ridges in the perceptual subdivision of a complex object into component parts [Braunstein et al. 1989], there are many indications that ridge lines may be useful for distinguishing other perceptually important features of a surface's shape, particularly

when they correspond to discontinuities or near-discontinuities in surface curvature in the transecting direction, or when they bound patches of otherwise approximately constant curvature.

Discontinuities of all kinds seem to be important features in visual perception, image understanding and pictorial representation. We have already seen some theoretical arguments [Marr 1976] that lines of intensity discontinuity may play an important role in two-dimensional visual information processing. Abundant pictorial evidence can be found in [Kennedy 1974] to support the idea that artists often choose the lines in a line drawing to represent some kind of discontinuity, be it of depth distance or of differences in material property. The experiments of Biederman and Blickle, cited in section 4.3, which demonstrated the relatively greater detrimental effect of deleting contour segments at the vertices of a line drawing than between them, may also support the importance of these features, both convex and concave.

When they are associated with sharp changes in curvature on a surface that is otherwise fairly smooth, ridge lines can be interpreted as marking the boundaries of distinct surface subregions. Sharp ridges in many cases mark the boundary (for a significant range of viewpoints) between the front-facing and back-facing portions of a surface. Sharp ridges also often correspond to locations of sharp changes in the apparent intensity of a directionally-illuminated surface, due to the locally rapid rate of change of the surface normal direction, but it is not clear that they necessarily derive their perceptual significance from this shading effect. (Is the edge between two faces of a cube less representationally important when the light hits each face equally than when it shines more predominantly on one face than on the other? Certainly we cannot see the dividing edge as well in the former instance, but in an effective illustration perhaps we ought to be able to.) It can be observed, in images such as the one shown in figure 4.15, that many of the interior edges represented in line drawing illustrations of manufactured objects correspond to the ridge lines on these surfaces.

Figure 4.15: An image in which ridge lines appear to correspond to perceptually relevant surface shape features, from [Ponce and Brady 1987].

The idea that ridge lines might be useful for demarcating surface features is supported by the work of Bookstein and Cutting [1988], in which they note the strong correspondence between ridge lines and important shape features of the skull and propose that ridge lines could be used to define landmark curves for morphometric analyses of three-dimensional craniofacial data. In an extension of this idea, Cutting et al. [1993] describe how a network of ridge lines and geodesic curves could be computed to partition a skull into meaningful surface patches that are in consistent correspondence between subjects, and they describe how this partitioning could be used for such purposes as computing an average shape across specimens or determining the relative abnormality of a specific shape. Figure 4.16, from [Cutting et al. 1995], gives an idea of what such a surface decomposition would look like.

Figure 4.16: A skull surface partitioned into patches using ridge and geodesic curves, from [Cutting et al. 1995].

An alternative surface parameterization method based on ridge lines is the "extremal mesh", proposed by Thirion [1994]. Thirion demonstrates the stability of this ridge line decomposition on complex biological shapes and presents compelling practical arguments for the preferability of a ridge-based as opposed to a parabolic line-based surface decomposition scheme, primarily because of the inherent instability of the latter representation. Figure 4.17 illustrates the parameterization of a skull computed using the extremal mesh. It should be pointed out that the extremal mesh was not intended to produce a perceptually intuitive surface decomposition but rather a mathematically stable representation of surface shape that could be used for data registration (either within the same patient or between different patients), surface simplification, surface reconstruction or surface parameterization.

Figure 4.17: An extremal mesh partitioning of a skull surface defined from acquired volume data, from [Thirion 1994].

The correspondence between ridge lines and important intrinsic surface features is further borne out by the proven utility of these lines for data registration tasks, as evidenced, for example, by the work of Declerck et al. [1995].

4.5: Parabolic curves: perceptually less-relevant shape feature lines

Many people, most notably the mathematician Felix Klein (as related by [Koenderink 1990]), have explored the possibility of enhancing the perception of a surface's shape by marking the locations of the parabolic curves. Parabolic curves are the locus of points on a surface where one of the principal curvatures is zero, and they correspond to places where the surface is in some sense locally flat. The Gaussian curvature, which is equal to the product of the two principal curvatures, is zero at these parabolic points, which on generic surfaces separate the hyperbolic (or "saddle-shaped") patches from the elliptic (convex or concave) regions. Although they are geometrically very meaningful [Koenderink and van Doorn 1982], the arguments for the perceptual significance of these curves are not particularly strong. Various attempts, over the years, to gain deeper perceptual insights into the shape of a surface from the explicit representation of the zero-crossings of Gaussian curvature have met with only marginal success, if any at all. Figure 4.18, from [Hilbert and Cohn-Vossen 1952], illustrates one of the most famous attempts to use parabolic lines to reveal the features of a surface shape; in this case the possible aim was, reputedly, to better understand the mathematical relations concerning artistic beauty [Koenderink 1990]. Figure 4.19 illustrates a very similar attempt by [Bruce et al. 1993] to gain insight into the shape features of the human face from a surface partitioning by parabolic lines.

Figure 4.18: A photograph of a statue of the Apollo Belvedere, upon which Felix Klein had the parabolic lines drawn. From [Hilbert and Cohn-Vossen 1952].

Thirion [1994] described one disadvantage of decomposing a surface into patches using the parabolic lines: the problem of instability (a relatively small, relatively distant change in surface shape may have a relatively large effect on the shape of a parabolic line). Ponce and Brady [1987] alluded to another practical problem that arises when defining parabolic lines on a surface. While principal curvature values can be used to help differentiate the most significant ridge or valley features, it is not clear, from purely local information, how to successfully determine which zero-crossings of Gaussian curvature correspond to important patch boundaries and which merely indicate insignificant detail.

Figure 4.19: Images of facial surfaces in which the values of the Gaussian and mean curvatures are used for shape description, from [Bruce et al. 1993].

The main difficulty I see with attempts to divide a surface into perceptually meaningful subunits based on zero-crossings of Gaussian curvature, however, is that this representation is just not very perceptually intuitive. The difficulty we have in estimating, or even verifying, the accuracy of parabolic line placement, confirmed by [Koenderink 1990] in his discussion of the lines on the Apollo Belvedere, is evidence of this gap between mathematical and perceptual understanding. I do not mean to say, by all of this, that parabolic lines are not useful shape descriptors. Displays of Gaussian curvature properties have been shown to be of great practical use in computer-aided surface design [Seidenberg et al. 1992] and in other computational geometry applications in which their values directly correspond to quantities of interest [Maekawa and Patrikalakis 1994]. It's just that illustrating the locations of the parabolic lines on a surface does not appear to be a promising technique for conveying perceptually meaningful shape information in an intuitive and immediately understandable way.

4.6: A proof-of-concept demonstration of the usefulness of explicitly marking ridge and valley lines on a transparent surface

We have seen, so far, a number of compelling arguments for the perceptual benefits of explicitly marking several types of feature lines on the surfaces of transparent objects. Valley lines have been shown, among other things, to reflect a natural partitioning of a surface, and sample illustrations have been presented that show how both ridge and valley features are often represented in line drawings, particularly where they mark surface curvature discontinuities. We have also seen that parabolic lines, which are often among the first of a surface's geometrical features to come to mind, seem to be of less potential usefulness as intuitively meaningful shape descriptors. Armed with the theoretical confidence that ridge and valley lines might be intuitively useful surface shape descriptors, we can begin experiments to ascertain the advantages of illustrating these lines as an opaque texture on a transparent surface.

Figure 4.20 provides a proof-of-concept illustration, showing the kind of results that we might hope to expect from explicitly drawing the ridge and valley lines on a transparent surface. This model was constructed by hand, by placing self-sticking colored drafting tape onto the surface of a plastic bear along the ridge and valley lines determined by sight (taking advantage of prior familiarity with the mathematical definition for their computation). A subjective assessment of these images seemed to support the idea that the shape of a transparent surface might be more effectively conveyed if ridge and valley lines were superimposed, so the algorithmic implementation was begun. The following sections describe my own implementation of a technique for automatically marking ridge and valley lines on a transparent surface, after briefly describing past work in this area.

Figure 4.20: A photograph of a transparent plastic bear, with colored tape applied along the ridge and valley lines. (The untextured bear is shown on the right for purposes of comparison.)

4.7: Previous implementations of ridge-finding algorithms

Valley and ridge lines have not, to my knowledge, previously been used in computer graphics to enhance communication of the shape and depth of transparent surfaces. But ridge- and valley-finding algorithms have previously been implemented for several other purposes.

Figure 4.21: Deriving a line drawing from a photograph, from [Pearson and Robinson 1985]. Left: a photograph. Center: a line drawing based on the luminance valleys in the photograph. Right: the line drawing augmented with bi-level shading (in which all pixels in the photograph with intensities below a certain threshold are set to black and all pixels with intensities above the threshold are set to white).

Pearson and Robinson [1985] demonstrated how the luminance valleys in a photograph of a human face could be used to derive a line drawing of that person. Their purpose was to design a method for efficiently and effectively transmitting the essential elements of the pictorial information over a low-speed data line. Their results, shown in figure 4.21, appear very similar to the type of drawing that might be produced by an artist. It can be noted that the lines in the center image correspond to figure/ground discontinuities, depth discontinuities, sharp changes in shading due to sharp changes in surface slope, and also to intensity discontinuities due to color differences between distinct, sometimes nearly coplanar surfaces. The success of this representation is very compelling, despite its intrinsic limitation to two dimensions.

I should also probably note at this point that shadow lines (as opposed to shadow patches) appear to be generally unhelpful for surface shape description and are only infrequently included in pure outline drawings. In fact, the use of line to mark boundaries between surface patches differentiated only by their apparent intensity appears to be somewhat problematic in general. Figure 4.22 shows an example, reproduced from [Kennedy 1974], in which line is used to mark the stripes on a zebra. Although the subject in this image can be easily identified, the effect of the lines is confusing.

Figure 4.22: An image in which lines are used to mark intensity discontinuities unrelated to surface boundaries or surface shape, from [Kennedy 1974].

One of the earliest ridge-finding algorithms for surfaces (in this case 2½D height surfaces) was implemented by Ponce and Brady [1985], who were concerned with defining a "primal sketch" representation of surface shape, based on measurements of differential geometry, that could ultimately be used for automatic object recognition. While the general mathematical definition of ridge lines is fairly uncomplicated, one of the biggest hurdles to be overcome in defining the ridges (or any other critical points) on acquired data is the problem of not having a smooth surface representation. Ponce and Brady [1987], who were interested in locating surface intersections, variously defined by positive maxima, negative minima and zero-crossings of principal curvature, on height surfaces acquired from actual objects using a laser scanner, reduced the incidence of spurious feature detection by first using a combination of edge detection and two-dimensional Gaussian blurring to smooth the surface without melting it into the backplane, and then tracing the lifespans of individual critical points across a variety of scales (as proposed by Witkin [1986]) to more accurately localize the features found at the coarsest level and also to eliminate features that were detected only at the finest. As a postprocess, they deleted connected collections of fewer than 3 or 4 points and filled in some gaps in broken lines. Figure 4.23 illustrates the type of results they were able to achieve using this approach.

The body of work in automated object recognition is quite extensive, and a variety of algorithms for representing objects according to features of their differential geometry have been proposed.

Figure 4.23: Surface intersection lines computed by Ponce and Brady [1985] from differential geometry measurements on range data acquired from a variety of objects. Left: hammer. Center: telephone handset. Right: automobile part.

For example, Haralick et al. [1983] proposed a "topographic primal sketch" representation of the intensity information in two-dimensional images for the purposes of object recognition, in which the individual pixels of an image are categorized according to the values of the first and second derivatives of the intensity function, while Besl and Jain [1986] use the signs of the Gaussian and mean curvatures to define eight different types of surface patches for object recognition tasks. As was the case with object recognition, a comprehensive survey of surface characterization methods is also beyond the scope of this dissertation; the interested reader can find a thorough explanation of many of the various approaches and a discussion of their relative merits in [Brady et al. 1989].

Techniques for using measurements of differential geometry to describe surfaces have been extensively explored, and they continue to be a topic of some current interest. For medical applications (such as the one we are concerned with in this dissertation), one of the biggest problems in obtaining meaningful measurements of the local surface geometry is that of first obtaining a continuous, smooth surface representation to base those computations on.

Figure 4.24: An image, by Subsol et al. [1994], that very nicely illustrates a set of ridge lines on an isointensity surface in acquired volume data, computed according to the approach of Monga et al. [1992].

Many smooth surface-fitting algorithms have been developed for other applications (for example, [Hoppe et al. 1994]), but they often obliterate finer detail or introduce subtle rippling patterns or other curvature artifacts, which reduces their usefulness as a basis for local shape description. As an alternative to this approach, Monga et al. [1992] proposed a method for computing ridge lines on isointensity surfaces in volume data directly from the partial derivatives of the three-dimensional density distribution, without requiring any sort of explicit smooth surface fitting. Figure 4.24 (from [Subsol et al. 1994]) illustrates a set of ridge lines computed by this method on an isointensity surface in acquired volume data.

My implementation, which was developed independently and concurrently, follows a very similar approach to that of Monga et al., although it might perhaps be more accurately characterized as a rather literal interpretation of the ridge definition given by Koenderink [1990], using surface normals defined by grey-level gradients according to the definition of Höhne and Bernstein [1986]. (Koenderink [1990] gives a clear and straightforward algorithm for detecting ridges on mathematically-defined surfaces in three dimensions and also suggests applications for which it would be useful to use ridge lines to divide a surface into meaningful patches.) The greatest practical distinction between my method, which I describe below, and that of the INRIA group [Monga et al. 1992, Subsol et al. 1994] is that, because my aim was merely to enhance the appearance of transparent surfaces through the selective opacification of valley (and in some cases ridge) shape features, and not to precisely define exact ridge lines, I was able to forego the potentially problematic computation of third derivatives in favor of a very stable and simple approximation that tests for the presence of a local curvature extremum in a subvoxel region. These two approaches produce similar but not identical results, the main distinction being that the "lines" detected by my method will actually be volume regions of some appreciable thickness.

4.8: A straightforward ridge-finding algorithm

The implementation I describe here, also summarized in [Interrante et al. 1995], integrates the marching cubes isosurface detection algorithm developed by Lorensen and Cline [1987] into the ray-casting volume rendering software of Levoy [1988]. It was designed as part of our overall effort to produce images of radiation therapy treatment planning data in which the essential features of the superimposed transparent skin surface, the underlying opaque treatment volume, and one or more transparent isointensity surfaces of radiation dose are all clearly visible. As described in chapter two, the skin surface is ordinarily displayed in conjunction with the treatment volume and the isointensity surfaces of radiation dose for the dual purposes a) of providing a scale and orientation context within which to interpret this underlying information and b) of explicitly reminding the physician of the locations of sensitive soft-tissue anatomical structures through which radiation beams should not pass. Although several layers of surfaces are rendered to produce the final image, the following explanation will describe the rendering of the outermost surface only. (Nothing new has been added to the ray-casting algorithm, described in detail by Levoy [1988], for displaying the underlying layers.)
4.8.1: Locating the surface in the volume

The first step in the implementation, for each pixel of the final image, is to locate the point at which the viewing ray through that pixel intersects the transparent surface. I begin by examining the data values at the eight corner voxels of successive data cells pierced by the ray as it traverses the volume for evidence of an isosurface crossing. For each data cell in which an isovalue crossing is detected, I compute the surface triangulation in that cell (details of the surface triangulation computations are specified in [Lorensen and Cline 1987]) and determine whether the viewing ray intersects any of these triangles. After the point of intersection, if any, of the viewing ray with the isointensity surface is determined, this local triangular representation of the surface is discarded.
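As a concrete illustration of the cell test described above, the following sketch (Python with numpy, assuming the volume is stored as a 3D array of scalar densities) checks whether a given data cell can contain a piece of the isosurface. The cells pierced by each ray would be enumerated with a standard voxel-stepping traversal, and only the cells that pass this test are triangulated and tested for a ray/triangle intersection:

    import numpy as np

    def cell_crosses_isovalue(volume, i, j, k, isovalue):
        # A cell straddles the isosurface only if some of its eight
        # corner voxels lie below the isovalue and some at or above it.
        corners = volume[i:i+2, j:j+2, k:k+2]
        return corners.min() < isovalue <= corners.max()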

To improve image clarity, I allow the option of ignoring all but the first ray intersection with a given isointensity surface; this option was used in the rendering of the skin surfaces shown in figures 4.25, 4.26 and 4.27.

4.8.2: Computing smoothed surface normals

As a preprocessing step, grey-level gradients [Höhne and Bernstein 1986] are computed at all of the voxel sample locations in the volume, using a Gaussian-weighted filter over a 3x3x3 neighborhood. During ray-casting, I compute the smoothed surface normal at a ray/surface intersection point in a data cell by trilinear interpolation from the values of the grey-level gradients at the eight surrounding voxels. The use of both Gaussian blurring and floating point normals turned out to be essential for approximating a smooth reconstruction of the surface; directional artifacts in surface shape, arising as much from inadequate surface reconstruction as from noise in the data, frustrated many earlier attempts to distinguish meaningful surface features using local differential geometry measurements based on integer normals obtained from the volume data using central differences (for example, $n_x = v_{x+1,y,z} - v_{x-1,y,z}$, where $v_{x,y,z}$ represents the voxel density value).

4.8.3: Defining an orthogonal frame

Once we have located a surface point $P_{x,y,z}$ and computed its normal direction, we can define an orthogonal frame $(\vec{e}_1, \vec{e}_2, \vec{e}_3)$ at that position, in which $\vec{e}_3$ points in the surface normal direction and $\vec{e}_1$ and $\vec{e}_2$ span the tangent plane (at this point, we are unconcerned with the rotational orientation of the frame in the tangent plane). In practice, $\vec{e}_1$ is obtained by choosing an arbitrary direction in the tangent plane, and $\vec{e}_2$ is derived from the cross-product of $\vec{e}_1$ and $\vec{e}_3$.

4.8.4: Computing principal directions and principal curvatures

From the orthogonal frame, we can determine the matrix representing the Second Fundamental Form,

$A = \begin{pmatrix} \omega_{13}^1 & \omega_{23}^1 \\ \omega_{13}^2 & \omega_{23}^2 \end{pmatrix}$,

which describes the local surface shape in terms of changes in the direction of the surface normal in the local area. The coefficients $\omega_{i3}^j$ specify the component in the $\vec{e}_i$ direction of how much the normal tips as you move across the surface in the $\vec{e}_j$ direction. (Specifically, the coefficients in the first row specify how the normal tips as you move in the $\vec{e}_1$ direction, while the coefficients in the second row specify how the normal tips as you move in the $\vec{e}_2$ direction. The tipping of the normal is broken down, across the first and second columns, into terms giving its components in the $\vec{e}_1$ and $\vec{e}_2$ directions respectively.) Koenderink describes the $\omega_{i3}^j$ terms as representing "nosedives" when $i = j$ and "twists" when $i \neq j$. I calculate each $\omega_{i3}^j$ by taking the dot product of $\vec{e}_i$ and the first derivative of the gradient in the $\vec{e}_j$ direction; this latter term is obtained by directionally applying a Gaussian-weighted derivative operator to data points that are resampled by trilinear interpolation from the precomputed gradient values in the volume. Next I calculate the "principal frame", in which the vectors $\vec{e}_1$ and $\vec{e}_2$ are aligned with the principal directions, by rotating the orthogonal frame in the tangent plane so that the twist terms $\omega_{23}^1$ and $\omega_{13}^2$ disappear. This rotation is done by diagonalizing the matrix $A$ to obtain $D = \begin{pmatrix} k_1 & 0 \\ 0 & k_2 \end{pmatrix}$ and $P = \begin{pmatrix} v_{1u} & v_{2u} \\ v_{1v} & v_{2v} \end{pmatrix}$, where $A = PDP^{-1}$ and $k_1 > k_2$.
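Once the 2x2 matrix $A$ has been assembled in the orthonormal tangent frame, the diagonalization step is a small eigenvalue problem. A minimal sketch follows, assuming $A$ is symmetric and that e1 and e2 are the 3-vector frame directions defined in section 4.8.3:

    import numpy as np

    def principal_curvatures(A, e1, e2):
        # Diagonalize the second fundamental form to obtain the
        # principal curvatures k1 > k2 and, from the eigenvectors
        # expressed in the (u, v) tangent-plane coordinates, the
        # principal directions in three-space.
        evals, evecs = np.linalg.eigh(A)    # eigenvalues in ascending order
        k2, k1 = evals                      # reorder so that k1 > k2
        (v2u, v1u), (v2v, v1v) = evecs      # columns are (u, v) eigenvectors
        d1 = v1u * e1 + v1v * e2            # first principal direction
        d2 = v2u * e1 + v2v * e2            # second principal direction
        return k1, k2, d1, d2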

The eigenvalues $k_1$ and $k_2$ specify the principal curvatures, and the principal directions in three-space are computed from the eigenvectors, defined in the $(u, v)$ coordinates of the tangent plane, as follows: $\vec{e}_1' = v_{1u}\vec{e}_1 + v_{1v}\vec{e}_2$, $\vec{e}_2' = v_{2u}\vec{e}_1 + v_{2v}\vec{e}_2$.

4.8.5: Computing Gaussian and mean curvatures

The Gaussian curvature $K$ of the surface at any point is equal to the product of the two principal curvatures and describes the solid angle spanned by the surface normals over a unit area of the surface centered at that point; the mean curvature $H$ is equal to the average of the two principal curvatures and represents the average normal curvature over all directions in the tangent plane at that point [Koenderink 1990]. If we only want to calculate the Gaussian or mean curvature at a surface point, however, it is not necessary to compute the principal directions and principal curvatures first. The Gaussian curvature is also given by the determinant of the Second Fundamental Form, and the mean curvature is equal to the average of the normal curvatures in any two orthogonal directions. I used these latter computations in preliminary experiments in which I illustrated parabolic lines or otherwise defined the color and/or opacity at each point on a surface as a function of the Gaussian or mean curvature there. As have many others before me, I found this type of representation to be generally unhelpful for enhancing a perceptually intuitive understanding of surface shape.

4.8.6: Determining whether a point lies on or near a ridge or valley line

Once the principal curvatures and principal directions at the ray/surface intersection point have been computed, we can use this information, along with curvature measurements at neighboring points, to determine whether or not this particular point should be displayed with the characteristics of a ridge or valley line. By definition, a point $P_{xyz}$ lies on a ridge line if the first principal curvature at the point is positive ($k_1 > 0$) and is a local maximum of normal curvature in the first principal direction. Similarly, the point is on a valley line if $k_1$ is negative and a local minimum of normal curvature along the line of curvature. In practice, because the probability is vanishingly small that any individual viewing ray will intersect the surface exactly at a ridge or valley point, I use an approximation to this criterion that allows me to mark points that are not only on an exact ridge or valley line but also near to one. In this approximation, I define the curvature $k_1$ to be a local maximum (minimum) if it is greater (less) than the normal curvatures in the $\vec{e}_1$ direction at each of the neighboring points found by stepping out a short distance in the volume from $P_{xyz}$ in the positive and negative $\vec{e}_1$ directions. I did not experiment much with step size and simply stepped one voxel unit in each direction. If the step size is too large, the tangent plane will no longer be an adequate local representation of the surface, and the similarity of the nested isosurfaces cannot be guaranteed. During ray-casting, the color and opacity values of all surfaces are composited in front-to-back order to obtain a final pixel value for display.
If a ray/surface intersection point is identified as being part of a ridge or valley region, it is assigned an additional amount of opacity and possibly a slightly different color in order to distinguish it more prominently in the image, while rendering of the other data objects proceeds in the usual manner. It should be noted in passing that in the images I show below, only the first intersection of a given ray with the external transparent surface is represented. Because the luminance contribution of the transparent skin surface is computed using an additive model of semitransparency, the surface is represented as a sheer fabric and would otherwise appear brighter in areas where, because of folds in the material, a single ray happened to intersect the surface multiple times.
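The local-extremum approximation of section 4.8.6 reduces to two neighbor comparisons per intersection point. The following is a hedged sketch: curvature_along is a caller-supplied (hypothetical) function returning the normal curvature in the $\vec{e}_1$ direction at an arbitrary nearby position, for example by resampling the precomputed gradients as in section 4.8.4, and the one-voxel default step follows the text above:

    def near_ridge_or_valley(k1, curvature_along, p, e1, step=1.0):
        # p and e1 are 3-vectors (e.g. numpy arrays); step is in voxel units.
        ka = curvature_along(p + step * e1)   # neighbor on one side
        kb = curvature_along(p - step * e1)   # neighbor on the other side
        if k1 > 0 and k1 > ka and k1 > kb:
            return 'ridge'                    # local maximum of convex curvature
        if k1 < 0 and k1 < ka and k1 < kb:
            return 'valley'                   # local minimum of concave curvature
        return None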

4.9: Improving the display of ridge and valley information

Figure 4.25-left shows an opaque rendering of the outer skin surface of a patient with cancer of the nasopharynx, a region behind the nose and at the top of the throat. Even after Gaussian smoothing, many subtle and not-so-subtle shape irregularities, including small ledges and depressions, are still quite apparent. This is typical of clinically-acquired data, and it makes the job of obtaining a clean surface representation very difficult. Figure 4.25-right shows the type of results that are achieved when all of the ridge and valley points identified by the preceding algorithm are displayed opaquely. Clearly this will not do. Ridge and valley lines appear almost everywhere, and the effectiveness of the representation is degraded by the clutter of excessive detail. Additional steps obviously need to be taken to selectively emphasize the most important ridge and valley regions while de-emphasizing or ignoring the others. Rather than simply assigning a fixed opacity increment to each point that falls within a ridge or valley region, it would be beneficial to weight the amount of additional opacity assigned to each ridge or valley point to reflect the relative importance of that feature in terms of the overall shape.

Figure 4.25: A demonstration of the insufficiency of a uniform representation of ridge and valley line opacity, on skin surface data acquired from a patient being treated for cancer of the nasopharynx. Left: opaque skin surface. Right: semi-transparent, forward-facing, outermost skin surface with all ridge and valley points rendered opaquely. Data courtesy of Dr. Julian Rosenman, Department of Radiation Oncology, UNC Hospitals.

4.9.1: Defining opacity as a function of normal curvature in the first principal direction

There are a number of reasons, discussed in section 4.4, to believe that explicitly drawing ridge lines on a surface for the purposes of illustrating its shape will be most useful when the ridge lines correspond to surface discontinuities or near-discontinuities. By defining the opacity increment as a function of the relative magnitude of the normal curvature in the first principal direction at a ridge or valley point, the presence of ridge or valley lines can be reinforced where they correspond to more sharply curved regions of the surface and de-emphasized in relatively flatter places. Figure 4.26-upper right illustrates some initial results of this approach. The specific functions I used to generate this image defined the opacity increment for ridges as $(k_1 / k_{\max})$ and for valleys as $(k_1 / k_{\min})$. While it initially appeared that I would need to precompute the global maximum and minimum values of $k_1$ over all ridge and valley points on the surface to calculate the opacity increment in this way, I found in practice that it was fairly easy, with some experience, to estimate suitable values for $k_{\min}$ and $k_{\max}$ based on a knowledge of values previously used with success and a subjective visual assessment of the relative sharpness of the curvature of the form; the results, in any event, appeared to be fairly consistent over a wide range of different weighting values.
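Read as ratios (my reading of the garbled expressions in the source; the published text may differ in detail), the weighting function can be sketched as follows, with k_max > 0 and k_min < 0 the estimated global curvature extremes:

    def feature_opacity_increment(k1, k_min, k_max, max_increment=1.0):
        # Weight the extra opacity by the relative magnitude of the first
        # principal curvature: sharper creases receive more emphasis.
        if k1 > 0:
            w = k1 / k_max      # ridges: k_max > 0, so w > 0
        elif k1 < 0:
            w = k1 / k_min      # valleys: k_min < 0, so w > 0
        else:
            return 0.0
        return max_increment * min(w, 1.0)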

Figure 4.26: Improving the display of ridge and valley information. Upper left: all detected ridge and valley points equally opacified. Upper right: opacity weighted by the relative absolute magnitude of the first principal curvature. Lower left: after application of a curvature-based opacity threshold. Lower right: results when an approximate feature width criterion is also used, in addition to the other techniques, to limit the opacification of spurious ridge and valley features.

As illustrated in figure 4.26-lower left, better results can be achieved by refining the definition of the opacity increment to incorporate a minimum curvature threshold criterion. Allowing ridge and valley points to be rendered with additional opacity only when the absolute magnitude of the maximum principal curvature exceeds a specified cutoff value eliminates from the display many of the seemingly spurious ridge or valley features, which appear to overwhelmingly correspond to the faintest lines in figure 4.26-upper right, or to the points at which the maximum principal curvature is relatively shallow. There is no obvious way to know, a priori, which threshold value will give the best results for a given dataset. As part of the experimental effort to design and test various techniques for differentiating the more significant ridge and valley lines, I implemented a program option to recompute images from a saved, intermediate data representation, which allowed me to test a variety of different parameter values in rapid succession. For this particular dataset I found that, as the curvature cutoff was increased, the lines along the more gradually curved ridge and valley regions, such as across the eyebrows and at the crease between the head and the neck, began to break up or disappear, and that this loss of relevant detail occurred at curvature cutoff values significantly lower than those at which all of the unwanted markings were eliminated. To choose the compromise curvature cutoff shown in figure 4.26-lower left, I generated a series of images using successively greater threshold values, starting from zero, animated the appearance and disappearance of the ridge and valley markings, and then made an empirical selection.
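One plausible form for the cached-recomputation option described above (a sketch only; the dissertation does not spell out the stored representation) is to save the first principal curvature at every detected feature pixel during a single ray-casting pass, and then regenerate the display under a series of cutoffs without re-casting any rays:

    def sweep_curvature_cutoffs(cached_k1, cutoffs):
        # cached_k1: hypothetical mapping from pixel -> k1 at detected
        # ridge/valley points, saved once during ray-casting.
        for cutoff in cutoffs:
            kept = {pixel: k1 for pixel, k1 in cached_k1.items()
                    if abs(k1) >= cutoff}
            yield cutoff, kept   # recomposite and display one frame per cutoff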

It is possible to shape the function by which opacity is determined from principal curvature using a variety of additive, multiplicative or even exponential constant parameters, as in $\mathrm{opacity\_increment} = a + b\,k_1^c$. Theoretically, such options could even be incorporated into an interactive display program so that they could be set by the user according to his own æsthetic preferences or any other criteria. I did not experiment any further with modifying the opacity definition function along these lines, however, because it appeared that this type of tweaking would bring only marginal further improvements beyond the results already achieved.

4.9.2: Predicting the significance of a ridge or valley by the extent of its flanks

While I have shown that it is possible to eliminate a number of spurious elements from the display by defining the ridge and valley region opacity as an appropriate function of the principal curvature at the ridge or valley point, this approach is not sufficient to completely solve the problem, as not all unwanted shape detail can be characterized purely on the basis of its shallow curvature, even after a significant amount of smoothing. Problems particularly arise in de-emphasizing unwanted ridge or valley features that happen to be especially sharp and deep although of extremely limited extent (in terms of the overall surface area of the protrusion or depression). It is certainly true that many tiny, sharp blemishes can be eliminated by Gaussian smoothing, but we have to be careful not to blur away important detail at the same time as we reduce the effects of noise. In practice, I have found that the curvature of the surface at the location of many undesired lines can be greater than the curvature of the surface at features I want to represent. The levels of blurring required to completely smooth over all of the noise will obliterate much of the relevant surface detail at the same time. If the spurious nature of some sharp features is defined more by their size than by their curvature, perhaps we need to incorporate some estimate of feature size into the ridge and valley line display algorithm. From a purely local standpoint, there is not much we can do to measure the length of a ridge, but estimating the width of a ridge is somewhat more feasible. The final local enhancement technique I will describe is concerned with eliminating spurious surface features by recognizing their limited width, which requires working with slightly larger local regions of the surface. The principle of this method is illustrated in figure 4.27, which shows a cross-section of a small ridge of the type that we might prefer to leave unenhanced by additional opacity. Although the sharp curvature at the ridge point makes this feature appear very strong from a purely local vantage point, we can see that its width is quite modest; this narrowness is reflected in the fact that the directions of the surface normals at $P_a$ and $P_b$ are very similar to the direction of the normal at $P_{xyz}$, meaning that the surface has begun to curve back around within a very short distance from the peak. Not all spurious features are well-characterized by this type of description, and many perceptually very important valley features in particular actually fit this description quite closely, but in some cases we can achieve discernible improvements in image quality by incorporating this criterion into the display algorithm, as evidenced by the image in figure 4.26-lower right.
There are a number of different ways in which we could implement the detection of this type of feature, and it is also possible to selectively apply any such criterion only to ridges and not to valleys. The image in figure 4.26-lower right was generated by a method in which I step away from the ridge point by some small amount in the first principal direction on either side, which is in general not orthogonal but merely transverse to the direction of the ridge line, and then compare the directions of the surface normals at these neighboring locations to the direction of the surface normal on the ridge. If the angles between $\vec{n}_{xyz}$ and $\vec{n}_a$ and between $\vec{n}_{xyz}$ and $\vec{n}_b$ are both very small, I conclude that the ridge was extremely narrow and skip opacification of the point. The width criterion used to define narrow ridges is, like the cutoff curvature threshold, a user-definable parameter; the step distance in figure 4.26-lower right was empirically chosen based on a visual examination of a series of images in which increasingly larger distance values were specified.
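In code, the flank test amounts to two angle comparisons. A minimal sketch follows, where the 10-degree default cutoff is an illustrative value of my own and not one taken from the dissertation:

    import numpy as np

    def is_narrow_ridge(n_xyz, n_a, n_b, angle_cutoff_deg=10.0):
        # If the normals a short step to either side of the ridge point
        # are both nearly parallel to the normal at the ridge itself,
        # the feature is judged too narrow and is not opacified.
        def unit(v):
            return v / np.linalg.norm(v)
        cutoff = np.cos(np.radians(angle_cutoff_deg))
        n0 = unit(n_xyz)
        def near(n):
            return float(np.dot(n0, unit(n))) > cutoff
        return near(n_a) and near(n_b)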

Figure 4.27: A diagram illustrating the concept of a narrow ridge feature.

While the quality of the ridge and valley line features extracted from acquired data can be considerably improved using purely local information, the extent to which the significance of a feature can be determined solely on the basis of its local shape characteristics is inherently limited. Another possible measure of the relative importance of a feature is its persistence across scale; one would expect that most of the spurious ridges and valleys (the ones due to noise, caused by reconstruction error, or corresponding to relatively minor surface perturbations) would be found only at the finest scales. If we were somehow able to detect ridge and valley features at a very coarse scale and then trace them back, through scale space, to the finer scale at which the surface is displayed (as proposed by [Ponce and Brady 1987]), we might be able to improve our line-drawing representation of the surface. This approach would certainly be useful for connecting all of the tiny fragments around the neckline, for example, both where it meets the shoulders and under the chin. Although I did not pursue this approach, primarily because of uncertainty about how to accurately define the surface at different scales (blurring volume data with a Gaussian not only smoothes the isovalue surfaces but also changes the isointensity level at which any particular surface is found) and also because of uncertainty about how to correctly trace the location of a feature through scale, I believe that further improvements in ridge definition for surfaces defined from acquired data will need to come from this direction. The extension to three dimensions of the recent accomplishments of Fidrich and Thirion [1995] in the multiscale extraction of geometrical features from two-dimensional medical data appears to describe a very promising direction for future work in this area.

4.9.3: The role of color

While my implementation allows distinct colors to be arbitrarily assigned to the ridge and valley regions, blended in along with the additional opacity, I have not experimented with the use of color in very much depth. Although I used a very subtly darker shade of the skin color for the valley lines and a subtly lighter shade for the ridges to reinforce their natural shading differences when computing the images shown in figures 4.25 through 4.28, this effect is hardly visible. It is my impression, based more on intuition than on experience, that there would be little value in explicitly differentiating ridge from valley lines through the use of different hues; to avoid overloading these already complex visualizations with too much extraneous information, it might be better to reserve the use of color for differentiating between the various underlying dose surfaces and anatomical structures.

4.10: Empirical evidence of the benefits of opacifying ridge and valley lines

Figure 4.28 shows a radiation therapy treatment plan for the patient whose skin surface only was shown in figures 4.25 and 4.26. In each of the images of this figure the outer skin surface is rendered semi-transparently, and through this surface we can see the opaque treatment region enclosed by a semitransparent isointensity surface of radiation dose.

This data represents a fairly typical case treated by the radiation oncologists at UNC. As previously described in chapter two, the primary purpose of displaying the skin surface is to provide a spatial context within which the size and location of the tumor and dose surfaces can be understood. A secondary purpose is to provide a reminder of the locations of those sensitive anatomical structures which can be readily interpreted from the skin surface, in this case the eyes and mouth.

Figure 4.28: Radiation therapy treatment plan for cancer of the nasopharynx. Upper left: outer skin surface rendered without additional enhancement. Upper right: outer skin surface rendered with opaque ridge and valley lines. Lower right: outer skin surface rendered with solid grid texture.

In figure 4.28-upper left, the skin surface is rendered with the common display method for representing transparent surfaces. One can observe that the details of the facial features are relatively difficult to discern clearly and that it is very difficult to estimate the distance between the isodose and skin surfaces. (This difficulty persists even when the data is viewed from a number of different angles in sequence, as can be verified from the videotape which accompanies this dissertation.) In figure 4.28-upper right, the ridge and valley regions computed by the methods described in section 4.9 (and illustrated in figure 4.26-lower right) are rendered with additional opacity and a slightly different color, which gives them greater prominence in the image and provides occluding regions whose motion relative to the underlying surfaces can make the distance between these structures easier to perceive. For the purposes of subjective comparison, I have included an image (figure 4.28-lower right) in which, rather than ridge and valley lines, a standard solid grid texture is applied to the skin surface. Although, for reasons I will go into below, it would be premature without further study to claim any relative practical advantage for the feature opacification technique, the obvious dissimilarities in the appearances of these three images inspire a closer look into the question of whether measurable performance differences might be expected, and of what nature these differences might be.

Figure 4.29-upper left shows the opaque skin surface of a patient being treated for prostate cancer. In figure 4.29-upper right, a more complete radiation therapy treatment plan is shown for this patient. In this image, the semi-transparent skin surface can be seen to enclose both the opaquely-rendered bones of the pelvis and, within them, an opaque treatment region surrounded by a semitransparent isointensity surface of radiation dose. The same dataset is shown in figure 4.29-lower right with opaque valley lines displayed on the outer skin surface. It was neither necessary nor desirable to display the ridge lines in this image, because the valley lines on this surface are sufficient to explicitly define the location of the soft tissue area that the radiation beams need to avoid.

Figure 4.29: Radiation therapy treatment plan for prostate cancer. Upper left: outer skin surface rendered opaquely. Upper right: outer skin surface rendered transparently, without additional enhancement. Lower right: outer skin surface rendered with opaque valley lines. Data courtesy of Dr. Julian Rosenman, Department of Radiation Oncology, UNC Hospitals.

My aim in showing these images has been to provide some empirical evidence that, by marking what I argue to be perceptually relevant feature lines, namely valleys and sometimes ridges, opaquely on the surface of a transparent object, we might better communicate its existence, form and location in depth. Although it is tempting to describe such pictures as proof of the advantageousness of this visualization technique, such a claim cannot reasonably be made on the basis of a subjective evaluation of these (or any other) images. The true merits of any scientific visualization paradigm can be accurately measured only in terms of the improvements in task performance that it enables. Preferences for a particular style of data representation, even when shown through controlled observer experiments to be fairly consistent and universally held, remain only introspective judgements and cannot be taken as reliable predictors of the practical benefits of the technique. I have not implemented controlled experiments to measure the task-related performance advantages of feature line enhancement, in comparison either with plain transparency or with an alternative texturing method such as the solid grid. However, I did


More information

Back to Flat Producing 2D Output from 3D Models

Back to Flat Producing 2D Output from 3D Models Back to Flat Producing 2D Output from 3D Models David Cohn Modeling in 3D is fine, but eventually, you need to produce 2D drawings. In this class, you ll learn about tools in AutoCAD that let you quickly

More information

CRF Based Point Cloud Segmentation Jonathan Nation

CRF Based Point Cloud Segmentation Jonathan Nation CRF Based Point Cloud Segmentation Jonathan Nation jsnation@stanford.edu 1. INTRODUCTION The goal of the project is to use the recently proposed fully connected conditional random field (CRF) model to

More information