ECEN 447 Digital Image Processing


ECEN 447 Digital Image Processing Lecture 8: Segmentation and Description Ulisses Braga-Neto ECE Department Texas A&M University

Image Segmentation and Description Image segmentation and description are the essential components of Image Analysis: the quantification of images for object recognition and image understanding. Segmentation partitions an image into its constituent regions or objects. This is a hard problem to solve, except in trivial cases. Segmentation accuracy determines the eventual success of any image analysis task, such as industrial inspection applications, which rely on correct identification of the objects in the image. Image description techniques, on the other hand, summarize the regions or objects found by segmentation into a few characteristics called features. These can be based on boundary or region properties.

Segmentation Approaches Segmentation can be based on discontinuity (edge-based) or on similarity (region-based). [Figure panels: original image, edge image, edge-based segmentation; original image, edge image (not a good idea), region-based segmentation.]

Edge Detection As we mentioned in connection with image sharpening, derivative operators are used to detect sharp variations (edges). Edge models: step, ramp, and "roof" edges. [Figure: profiles of the three edge models.]

Derivative Filters A variety of masks implement derivative filters for edge detection. [Figure: edge-detecting masks, including simple 1-D masks and diagonal edge-detecting masks.]

Gradient Edge-detecting operators $g_x$ and $g_y$, for horizontal and vertical edges, respectively, can be combined to form the gradient vector $\nabla f = [g_x, g_y]^T$. As we saw before, the gradient points in the direction of maximum change. Its magnitude $\|\nabla f\| = \sqrt{g_x^2 + g_y^2}$ (often approximated by $|g_x| + |g_y|$) is an image that gives the edge strength, while the gradient angle $\alpha = \tan^{-1}(g_y/g_x)$ is an image that gives the direction orthogonal to the edge.
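A minimal sketch of this computation in Python with numpy and scipy (the function name and the use of Sobel masks as the derivative filters are illustrative choices, not prescribed by the slide):

```python
import numpy as np
from scipy.ndimage import convolve

def sobel_gradient(f):
    """Gradient magnitude and angle of a grayscale image f."""
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)  # Sobel mask for g_x
    ky = kx.T                                 # Sobel mask for g_y
    gx = convolve(f.astype(float), kx)
    gy = convolve(f.astype(float), ky)
    mag = np.abs(gx) + np.abs(gy)     # fast approximation to sqrt(gx^2 + gy^2)
    angle = np.arctan2(gy, gx)        # direction orthogonal to the edge
    return mag, angle
```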

Gradient Example [Figure panels: original image; Sobel $g_x$; Sobel $g_y$; gradient $|g_x| + |g_y|$.]

The Role of Noise in Edge Detection Because the filters employed are derivative filters, they amplify noise. Prior application of image smoothing is thus essential.

Gradient with Smoothing Example [Figure panels: original image smoothed with a 5x5 averaging filter; Sobel $g_x$; Sobel $g_y$; gradient $|g_x| + |g_y|$.]

Combining Gradient with Thresholding The gradient image can be thresholded to produce a binary image indicating the location of edges. [Figure panels: gradient without smoothing, thresholded at 33%; gradient with smoothing, thresholded at 33%.]

Laplacian of Gaussian Following the same idea that the image should be smoothed prior to applying derivative filters, one can apply Gaussian smoothing prior to a Laplacian filter. The 2-D Gaussian with zero mean and standard deviation $\sigma$ is $$G(x,y) = e^{-\frac{x^2 + y^2}{2\sigma^2}}$$ The Laplacian of the Gaussian (LoG) is thus $$\nabla^2 G(x,y) = \left( \frac{x^2 + y^2 - 2\sigma^2}{\sigma^4} \right) e^{-\frac{x^2 + y^2}{2\sigma^2}}$$
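A hedged sketch of sampling the LoG as an n x n mask (the zero-sum adjustment at the end is a common practical choice, not from the slide):

```python
import numpy as np

def log_kernel(n, sigma):
    """Sample the Laplacian of Gaussian on an n x n grid."""
    ax = np.arange(n) - (n - 1) / 2.0
    x, y = np.meshgrid(ax, ax)
    r2 = x**2 + y**2
    k = ((r2 - 2 * sigma**2) / sigma**4) * np.exp(-r2 / (2 * sigma**2))
    return k - k.mean()   # force zero sum so constant regions give zero response
```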

Laplacian of Gaussian - II [Figure: the negative of the Laplacian of Gaussian.]

Marr-Hildreth Edge Detector The Marr-Hildreth algorithm is a classical procedure for edge detection. It consists of three steps: 1. Apply an n x n mask approximating a Gaussian lowpass filter. 2. Apply a Laplacian mask to the result. 3. Find the zero crossings of the result; these are the edge locations. The Marr-Hildreth edge detector thus consists of a Laplacian of Gaussian filter followed by zero-crossing detection. The latter step is the key feature of the procedure. The zero crossings are found by looking in a 3x3 neighborhood of each point for sign changes between opposite neighbors. A threshold is used to require changes of a certain minimum magnitude.
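A sketch of the three steps in Python (the wrap-around handling of image borders via np.roll and the particular zero-crossing test are simplifying assumptions; several variants exist):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, laplace

def marr_hildreth(f, sigma=4.0, thresh=0.0):
    log_img = laplace(gaussian_filter(f.astype(float), sigma))  # steps 1 and 2
    edges = np.zeros(log_img.shape, dtype=bool)
    # step 3: mark pixels where the LoG changes sign across any of the
    # four neighbor directions, with a jump of at least `thresh`
    for dy, dx in [(0, 1), (1, 0), (1, 1), (1, -1)]:
        shifted = np.roll(np.roll(log_img, dy, axis=0), dx, axis=1)
        edges |= (np.sign(log_img) != np.sign(shifted)) & \
                 (np.abs(log_img - shifted) > thresh)
    return edges
```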

Marr-Hildreth Edge Detector - Example [Figure panels: original image; LoG with sigma = 4 and n = 25; zero crossings with T = 0; zero crossings with T = 4%.]

Canny Edge Detector The Canny algorithm is based on the same basic principle. It consists of three steps: 1. Apply an n x n mask approximating a Gaussian lowpass filter. 2. Compute the gradient magnitude and direction of the result. 3. Process the gradient magnitude image using the direction information to detect, thin, and link edges. Other than step 3, the Canny edge detector is the same as the Marr-Hildreth edge detector, but it uses the gradient of the Gaussian rather than the zero crossings of the Laplacian of Gaussian.
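In practice one would usually reach for a library implementation; a usage sketch with scikit-image's built-in Canny (the sigma and hysteresis threshold values here are illustrative, not from the slide):

```python
from skimage import data, feature

img = data.camera()                        # sample grayscale test image
edges = feature.canny(img, sigma=2.0,
                      low_threshold=0.1, high_threshold=0.3,
                      use_quantiles=True)  # thresholds as gradient quantiles
```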

Canny Edge Detector - Example [Figure panels: original image; thresholded gradient; Marr-Hildreth edge detector; Canny edge detector.]

Canny Edge Detector - Another Example [Figure panels: original image; thresholded gradient; Marr-Hildreth edge detector; Canny edge detector.]

Edge Detection for Color Images The gradient and Laplacian cannot be applied directly to vector functions, only to scalar functions. One possibility is to apply edge detection to each band of a color image and then combine the results. For example, one can compute the magnitude of the gradient of the R, G, and B bands of an RGB color image, and then sum the results to obtain an edge image. Each band can be independently smoothed to improve its gradient, and thresholding can be applied after summation of the gradients, just as before.
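A compact sketch of the per-band scheme (the function name is illustrative; scipy's sobel computes one directional derivative per call):

```python
import numpy as np
from scipy.ndimage import sobel

def color_edges(rgb):
    """Sum of per-band Sobel gradient magnitudes of an RGB image."""
    total = np.zeros(rgb.shape[:2])
    for c in range(3):                       # R, G, B bands
        band = rgb[..., c].astype(float)
        gx = sobel(band, axis=1)
        gy = sobel(band, axis=0)
        total += np.hypot(gx, gy)            # gradient magnitude of this band
    return total
```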

Edge Detection for Color Images - Example [Figure panels: original image; sum of gradient magnitudes; red gradient magnitude; green gradient magnitude; blue gradient magnitude.]

Thresholding Thresholding is a region-based segmentation method which relies on pixel gray values. Thresholding produces a binary image according to the equation $$g(x,y) = \begin{cases} 1 & \text{if } f(x,y) > T \\ 0 & \text{if } f(x,y) \le T \end{cases}$$

Types of Thresholding In global thresholding, T is a constant. On the other hand, if the value of T changes over the image, we have variable thresholding. There are several kinds of strategies for variable thresholding. For example: Local thresholding: T depends on the grayscale values in a neighborhood of (x,y). Adaptive thresholding: T depends on the coordinates (x,y) themselves.

Multiple Thresholding In multiple thresholding, there are three or more modes in the histogram, requiring two or more threshold parameters. For two thresholds $T_1 < T_2$: $$g(x,y) = \begin{cases} a & \text{if } f(x,y) > T_2 \\ b & \text{if } T_1 < f(x,y) \le T_2 \\ c & \text{if } f(x,y) \le T_1 \end{cases}$$

Multiple Thresholding - Example [Figure panels: original image; histogram; thresholded image with $T_1 = 80$, $T_2 = 177$.]

Basic Global Thresholding The idea is to identify the mean intensity of each class and take T to be the midpoint between them.

Basic Global Thresholding - II In practice, the class means $m_1$ and $m_2$ are not known and need to be estimated from the histogram of the image. This can be done by means of the following algorithm. 1. Select an initial estimate for the global threshold T (for example, the overall mean intensity value of the image). 2. Apply threshold T to the image. 3. Compute the mean intensity value $m_1$ of the gray values below T and the mean intensity value $m_2$ of the gray values above T. 4. Compute a new threshold value $T = \frac{1}{2}(m_1 + m_2)$. 5. Repeat steps 2 through 4 until the difference between values of T in successive iterations is smaller than a tolerance $\Delta T$.
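A direct transcription of this algorithm into numpy (a sketch; the default tolerance is an illustrative choice):

```python
import numpy as np

def iterative_threshold(f, delta_T=0.5):
    f = f.astype(float)
    T = f.mean()                      # step 1: start at the overall mean
    while True:
        m1 = f[f <= T].mean()         # step 3: mean of values below T
        m2 = f[f > T].mean()          #         mean of values above T
        T_new = 0.5 * (m1 + m2)       # step 4: midpoint of the class means
        if abs(T_new - T) < delta_T:  # step 5: stop when the change is small
            return T_new
        T = T_new
```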

Basic Thresholding - Example Fingerprint imaging: applying the preceding algorithm with initial T equal to the overall image mean, and $\Delta T = 0$, leads to a final T = 125.4. [Figure panels: original image; histogram; thresholded image with T = 125.]

Otsu's Thresholding Otsu's method of thresholding automates the choice of the best threshold T as the value that maximizes a criterion of separability between the foreground and background pixel values. Let T = k and consider the normalized histogram of an image, $\{p_i;\ i = 0, \dots, L-1\}$. The probability that a pixel is assigned to the background is $$P_1(k) = \sum_{i=0}^{k} p_i$$ while the probability that a pixel is assigned to the foreground is $$P_2(k) = 1 - P_1(k)$$

Otsu's Thresholding - II The mean value of the pixels assigned to the background is a weighted average, with weights given by the histogram values: $$m_1(k) = \frac{1}{P_1(k)} \sum_{i=0}^{k} i\, p_i$$ Similarly, the mean value of the pixels assigned to the foreground is $$m_2(k) = \frac{1}{P_2(k)} \sum_{i=k+1}^{L-1} i\, p_i$$ The global mean is given simply by $$m_G = \sum_{i=0}^{L-1} i\, p_i = P_1(k)\, m_1(k) + P_2(k)\, m_2(k)$$

Otsu's Thresholding - III Otsu's method uses as the criterion of separability to be maximized the ratio of variances $$\eta(k) = \frac{\sigma_B^2(k)}{\sigma_G^2}$$ where the between-class variance is given by $$\sigma_B^2(k) = P_1(k)\,(m_1(k) - m_G)^2 + P_2(k)\,(m_2(k) - m_G)^2$$ while the total variance is simply $$\sigma_G^2 = \sum_{i=0}^{L-1} (i - m_G)^2\, p_i$$

Otsu's Thresholding - IV The ratio of variances $\eta(k)$ is dimensionless and between 0 and 1. It is maximal when $\sigma_B^2(k)$ is maximal, which occurs when the means of background and foreground are well separated. Otsu's method is therefore to pick the best threshold $k^*$ as the value that maximizes the between-class variance: $$\sigma_B^2(k^*) = \max_{0 \le k \le L-1} \sigma_B^2(k)$$ The ratio of variances evaluated at the best threshold, $\eta(k^*)$, serves as a measure of the effectiveness of thresholding.
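A sketch of the whole procedure in numpy, using the algebraically equivalent closed form $\sigma_B^2(k) = (m_G P_1(k) - m(k))^2 / (P_1(k)(1 - P_1(k)))$, where $m(k)$ is the cumulative mean up to level k (a standard rearrangement of the formula above):

```python
import numpy as np

def otsu_threshold(f, L=256):
    """Best threshold k* for an integer grayscale image f."""
    hist = np.bincount(f.ravel(), minlength=L).astype(float)
    p = hist / hist.sum()              # normalized histogram {p_i}
    i = np.arange(L)
    P1 = np.cumsum(p)                  # background probability P1(k)
    m = np.cumsum(i * p)               # cumulative mean m(k)
    mG = m[-1]                         # global mean
    with np.errstate(divide='ignore', invalid='ignore'):
        sigma_b2 = (mG * P1 - m) ** 2 / (P1 * (1.0 - P1))
    sigma_b2 = np.nan_to_num(sigma_b2) # one class is empty at the extremes
    return int(np.argmax(sigma_b2))    # k* maximizing the between-class variance
```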

Otsu's Thresholding - Example [Figure panels: original image; histogram; basic thresholding, T = 169; Otsu thresholding, T = 181, $\eta$ = 0.467.]

The Role of Noise in Thresholding Noise smears the histogram, making thresholding more difficult. Example: zero-mean Gaussian noise. [Figure panels: no noise; std = 10; std = 50.]

The Role of Illumination in Thresholding Nonuniform illumination also makes thresholding more difficult. We saw an example of this in the Mathematical Morphology lecture. Example: ramp illumination.

Using Smoothing to Improve Thresholding Smoothing reduces noise and makes thresholding easier. Example: zero-mean Gaussian noise (std = 50) and a 5x5 averaging filter. [Figure panels: noisy image, histogram, Otsu thresholding; smoothed image, histogram, Otsu thresholding.]

Using Edges to Improve Thresholding In some cases, it is necessary to compute a threshold value based only on the grayscale information at the edges of an image. Example: small object in noise. [Figure panels: noisy image, histogram, Otsu thresholding; smoothed image, histogram, Otsu thresholding.]

Using Edges to Improve Thresholding - II One can use an edge image to mask the original image, and then compute the optimal T using only the masked pixels. Example: gradient magnitude with thresholding. [Figure panels: noisy image; histogram; binary edge image; masked image; histogram; Otsu thresholding.]

Using Edges to Improve Thresholding - III Realistic example: segmentation of yeast cell nuclei. [Figure panels: original image; histogram; Otsu thresholding; thresholded Laplacian; histogram of masked image; Otsu thresholding.]

Variable Thresholding by Partitioning One can make the threshold T local simply by computing a different value for different regions of a partition of the image. [Figure panels: original image; histogram; basic thresholding; Otsu thresholding; partition; variable Otsu thresholding.]

Variable Thresholding by Moving Average One can also make the threshold local by computing the value of T based on an average of the values in a neighborhood of a pixel. [Figure panels: original image, Otsu thresholding, moving average; original image, Otsu thresholding, moving average.]
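A sketch of one simple local variant, thresholding each pixel against a fraction of the mean in an n x n window (the textbook's moving average runs along scan lines; the windowed form, the window size, and the factor c below are illustrative assumptions):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_mean_threshold(f, n=21, c=0.9):
    """Binary image where f exceeds c times its local n x n mean."""
    T_local = c * uniform_filter(f.astype(float), size=n)
    return (f > T_local).astype(np.uint8)
```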

Color Image Thresholding Thresholding of a color image can be accomplished by the rule $$g(z) = \begin{cases} 1 & \text{if } d(z,a) \le T \\ 0 & \text{otherwise} \end{cases}$$ where $d(z,a)$ is the distance of point z in RGB space to a given fixed point a. This defines an ROI with center at a. For example: Euclidean distance: $d(z,a) = \|z - a\| = [(z-a)^T (z-a)]^{1/2}$. Mahalanobis distance: $d(z,a) = [(z-a)^T C^{-1} (z-a)]^{1/2}$, where C is a given matrix (usually a covariance matrix). Maximum or "chessboard" distance: $d(z,a) = \max\{|z_R - a_R|,\, |z_G - a_G|,\, |z_B - a_B|\}$.
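A sketch of the three distance rules (the function and parameter names are illustrative; C_inv is the inverse of the matrix C):

```python
import numpy as np

def color_roi(rgb, a, T, metric="euclidean", C_inv=None):
    """Boolean mask of pixels within distance T of color a in RGB space."""
    z = rgb.reshape(-1, 3).astype(float) - np.asarray(a, dtype=float)
    if metric == "euclidean":
        d = np.sqrt((z ** 2).sum(axis=1))
    elif metric == "mahalanobis":
        d = np.sqrt(np.einsum('ij,jk,ik->i', z, C_inv, z))  # (z-a)^T C^-1 (z-a)
    else:                                # maximum / "chessboard" distance
        d = np.abs(z).max(axis=1)
    return (d <= T).reshape(rgb.shape[:2])
```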

Color Image Thresholding - II The previous distances correspond to the following ROIs in RGB space: a sphere for the Euclidean distance, an ellipsoid for the Mahalanobis distance, and a box for the chessboard distance. [Figure: the three ROIs in RGB space.]

Color Image Thresholding - Example The chessboard distance is used, with values of a and T derived directly from the image by specifying a region containing the desired colors. [Figure panels: original image; thresholded image.]

Morphological Watershed The watershed transformation is a method for image segmentation originally proposed in the context of Mathematical Morphology. It is based on the simple idea of the watersheds of a topographical surface. In geography, the main rivers and their tributary rivers and streams partition the land into catchment basins. A catchment basin is defined as a connected region such that any drop of water placed at a point of it falls to the same regional minimum, and does not fall into any other region. The borders between the catchment basins are the watershed lines. Watershed lines are therefore crest lines that separate the basins. Alternatively, the watershed lines can be found as dams in a flooding simulation, where water rises from each regional minimum, and a dam is built along the line where rising water from different basins would merge.

Watershed - Example The artificial image below has three regional minima, which produce a watershed segmentation with three catchment basins.

Flooding Simulation [Figure panels: original image; original image viewed as a topographic surface; beginning of flooding; further flooding.]

Flooding Simulation - II [Figure panels: more flooding; further flooding and beginning of dam construction; more flooding and longer dams; final watershed lines overlaid on the original image.]

Watershed of Gradient In practice, the watershed is applied to the (magnitude of the) gradient of an image, whose crests locate the boundaries between objects. [Figure panels: original image; gradient image; watershed lines; watershed lines overlaid on the original image.]

Oversegmentation In practice, due to noise and the fact that each minimum produces one catchment basin, direct application of the watershed method produces oversegmentation. [Figure panels: original image; watershed of the gradient, showing oversegmentation.]

Marker-Based Watershed The oversegmentation problem can be overcome by specifying ("imposing") the minima one wants on the image, while eliminating all other undesirable minima. This can be done by means of internal and external markers and a closing by reconstruction operation. This is sometimes called a homotopy modification, as the markers become the only minima of the image. The markers can be specified by a human operator (in which case the process is semi-automatic), or they can be obtained directly from the image itself for a fully automatic procedure. This is similar in spirit to the positive effect of smoothing applied prior to using derivative filters for edge detection.
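A usage sketch with scikit-image in the spirit of the slide: watershed of the gradient with markers derived from the regional minima of a smoothed image (the specific functions, the sample image, and the sigma value are illustrative assumptions, not the lecture's exact procedure):

```python
from scipy import ndimage as ndi
from skimage import data
from skimage.filters import sobel, gaussian
from skimage.morphology import local_minima
from skimage.segmentation import watershed

img = data.coins()                       # sample image of touching objects
grad = sobel(img.astype(float))          # gradient magnitude surface
# internal markers: labeled regional minima of a smoothed version of the image
markers, _ = ndi.label(local_minima(gaussian(img, sigma=4)))
labels = watershed(grad, markers)        # basins grow only from the markers
```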

Marker-Based Watershed - Example In this example, the internal markers are simply the regional minima of a smoothed version of the image, while the external markers are the watershed lines. After homotopy modification, the watershed is applied again to obtain the final result. [Figure panels: original image with overlaid markers; marker-based watershed.]

Marker-Based Watershed - Another Example Segmentation of the heel bone in a magnetic-resonance image of a foot. [Figure panels: original foot MR image; magnitude of the gradient using Sobel operators; original image with overlaid markers; gradient after imposition of minima; marker-based watershed line; result overlaid on the original image.]

Marker-Based Watershed - Yet Another Example Segmentation using markers specified manually. [Figure panels: original image; internal and external markers overlaid on the image; result overlaid on the original image. From Serge Beucher's watershed website at the Centre de Morphologie Mathematique, Paris.]

Watershed for Binary Segmentation Binary segmentation refers to the identification of overlapping objects in a binary image. Rather than to the image gradient, here the watershed is applied to the distance transform of the binary image. [Figure panels: original image; distance transform of the grains; result overlaid on the original image. From Serge Beucher's watershed website at the Centre de Morphologie Mathematique, Paris.]
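A sketch of this recipe: peaks of the distance transform seed a watershed of its negative, splitting touching objects (the peak-detection step and its min_distance parameter are illustrative choices):

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def split_touching_objects(binary):
    dist = ndi.distance_transform_edt(binary)        # distance to background
    peaks = peak_local_max(dist, min_distance=10, labels=binary.astype(int))
    markers = np.zeros(dist.shape, dtype=int)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    return watershed(-dist, markers, mask=binary)    # one label per object
```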

Image Description Basics The last step in the image processing and analysis pipeline is often image description. The objective is usually to produce a short numeric vector that quantifies the objects obtained by segmentation. This provides the raw material for image-based pattern recognition. The numeric vector is often called a feature vector, and image description is called feature extraction. Because it compresses information, image description is also called dimensionality reduction. Similarly to segmentation, there are two main approaches: boundary descriptors and regional descriptors. Regardless of the approach, it is important that the descriptors capture the shape of the object, rather than its translation, rotation, and scale. Thus, normalization and invariance with regard to these factors are important.

Boundary Following In all boundary description, it is necessary to obtain the sequence of pixels in the boundary of the object. The following boundary-tracking algorithm produces the ordered sequence of pixels in the outer boundary of a binary object: 1. Let $b_0$ (the starting point) be the uppermost, leftmost foreground point. 2. Let $c_0$ be the west neighbor of $b_0$ (this must be a background point). 3. Proceed through the 8-neighbors of $b_0$ clockwise, starting from $c_0$, until a foreground pixel is found. Call it $b_1$, and call the last background pixel visited $c_1$. 4. Obtain $b_2$ and $c_2$ from $b_1$ and $c_1$ in the same fashion. 5. Repeat steps 2 through 4 until $b_k = b_0$ and $b_{k+1} = b_1$. Stop and return the sequence of pixels $\{b_0, b_1, \dots, b_{k-1}\}$.
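A sketch of this algorithm for a binary numpy image containing a single 8-connected object of at least two pixels that does not touch the image border (both assumptions simplify the bookkeeping):

```python
import numpy as np

# 8-neighbor offsets (d_row, d_col) in clockwise order, starting at "west"
OFFS = [(0, -1), (-1, -1), (-1, 0), (-1, 1),
        (0, 1), (1, 1), (1, 0), (1, -1)]

def trace_boundary(img):
    rows, cols = np.nonzero(img)
    b = (rows[0], cols[0])            # b0: uppermost, leftmost foreground pixel
    c = (b[0], b[1] - 1)              # c0: its west neighbor (background)
    boundary = [b]
    while True:
        start = OFFS.index((c[0] - b[0], c[1] - b[1]))
        for k in range(1, 9):         # scan the 8-neighbors clockwise from c
            dy, dx = OFFS[(start + k) % 8]
            nb = (b[0] + dy, b[1] + dx)
            if img[nb]:               # found the next boundary pixel
                b = nb
                boundary.append(nb)
                break
            c = nb                    # last background pixel visited
        # stop when b_k = b_0 and b_{k+1} = b_1
        if len(boundary) > 3 and boundary[-2:] == boundary[:2]:
            return boundary[:-2]
```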

Boundary Following - Example [Figure: the first few steps of the algorithm, and an example showing that the stopping condition must be $b_k = b_0$ and $b_{k+1} = b_1$.]

Chain Codes Once the sequence of boundary pixels is found, one can code it by the directions of the displacements between each pixel and the next. [Figure: the 4-direction code, the 8-direction code, a boundary sub-sampling scheme, and a sample 8-code: 0766666453321212.]
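A sketch that converts a traced boundary (a list of (row, col) pixels, e.g. from the tracing sketch above) into an 8-direction chain code; the direction numbering (0 = east, increasing counterclockwise) follows the usual textbook convention:

```python
# (d_row, d_col) -> 8-direction code; rows grow downward, so north is d_row = -1
CODE = {(0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
        (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7}

def chain_code(boundary):
    closed = boundary + boundary[:1]          # wrap around to close the loop
    return [CODE[(b[0] - a[0], b[1] - a[1])]
            for a, b in zip(closed, closed[1:])]
```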

Chain Codes - Example The following example illustrates segmentation, followed by boundary extraction, subsampling, and chain-code representation. [Figure; the resulting 8-code is 00006066666666444444242222202202.]

Boundary Descriptors Once a suitable boundary representation has been obtained, the next step is to obtain a short feature vector that describes it. Several simple alternatives are possible: The length is the simplest descriptor. In a chain code, the number of horizontal and vertical components plus $\sqrt{2}$ times the number of diagonal components gives the length. The diameter is defined as $\text{Diam}(B) = \max_{i,j} D(p_i, p_j)$, where D is a distance measure and $p_i$, $p_j$ are boundary points. The line segment corresponding to the diameter is the major axis. The minor axis is the line perpendicular to it that passes through the centroid of the shape. The ratio of the two axes is the eccentricity. The axes also define the basic box.

Signatures A signature is a 1-D functional representation of a boundary. One common form is to plot the distance from the centroid to the boundary as a function of the angle.
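A sketch of the $r(\theta)$ signature for a traced boundary (assumes the (row, col) boundary list from the earlier sketches; sorting by angle is one simple way to obtain the plot):

```python
import numpy as np

def signature(boundary):
    """Distance from the centroid as a function of angle."""
    pts = np.asarray(boundary, dtype=float)
    d = pts - pts.mean(axis=0)               # offsets from the centroid
    r = np.hypot(d[:, 0], d[:, 1])           # distance from the centroid
    theta = np.arctan2(-d[:, 0], d[:, 1])    # angle (rows grow downward)
    order = np.argsort(theta)
    return theta[order], r[order]
```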

Signatures - Example [Figure: two shapes, their boundaries, and the corresponding signatures.]

Shape Numbers Shape numbers are based on the first difference of a chain code, treated as a circular sequence. The shape number is the circular rotation of the first difference that has the smallest magnitude. The order of a shape number is its number of digits.
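A sketch of the first difference and the shape number for an 8-direction chain code; taking the lexicographically smallest rotation is one way to realize "smallest magnitude":

```python
def first_difference(code):
    """Counterclockwise direction changes of a circular chain code, mod 8."""
    return [(code[i] - code[i - 1]) % 8 for i in range(len(code))]

def shape_number(code):
    d = first_difference(code)
    rotations = [d[i:] + d[:i] for i in range(len(d))]
    return min(rotations)     # rotation of smallest magnitude
```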

Shape Numbers - Example Suppose the order n = 18 is specified for the shape number. One should find the basic rectangle and then discretize the boundary on a 6 x 3 grid, whose perimeter contains 18 links. The chain code is then computed, and from it the shape number is derived.

Fourier Descriptors The idea behind Fourier descriptors is simple, but powerful. One represents the boundary pixels of an object as points in the complex plane and computes the DFT of the resulting sequence. Complex representation: $s(k) = x(k) + j\,y(k)$, for $k = 0, \dots, K-1$. Fourier descriptors: $a(u) = \sum_{k=0}^{K-1} s(k)\, e^{-j 2\pi u k / K}$.

Fourier Descriptors - II Fourier descriptors have nice properties that help deal with translation, rotation, and scaling issues. Not all K Fourier descriptors need to be kept. Keeping only P descriptors (setting the rest to zero) and inverting the DFT leads to a smoothed approximation of the boundary.
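A sketch of both steps in numpy; keeping the lowest positive and negative frequencies before the inverse DFT is one common reading of "the first P descriptors":

```python
import numpy as np

def fourier_descriptors(boundary):
    pts = np.asarray(boundary, dtype=float)
    s = pts[:, 1] + 1j * pts[:, 0]           # s(k) = x(k) + j*y(k)
    return np.fft.fft(s)

def smoothed_boundary(a, P):
    K = len(a)
    freq = np.abs(np.fft.fftfreq(K) * K)     # |signed frequency index|
    a_hat = np.where(freq <= P // 2, a, 0)   # keep ~P low-frequency descriptors
    return np.fft.ifft(a_hat)                # smoothed approximation of s(k)
```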

Fourier Descriptors - Example [Figure: boundary reconstructions using 2868 FD (100%), 1434 FD (50%), 286 FD (10%), 144 FD (5%), 72 FD (2.5%), 36 FD (1.25%), 18 FD (0.63%), and 8 FD (0.28%).]

Regional Descriptors Rather than extracting a boundary and describing that, it is possible to define descriptors of the region itself that corresponds to an object. Several simple alternatives are possible: The area (number of pixels) is the simplest descriptor. The compactness is the dimensionless ratio $\text{perimeter}^2 / \text{area}$, where the perimeter is the length of the boundary. The mean and median intensity levels of the pixels in the region are also simple descriptors.

Topological Descriptors The topological properties of regions are by definition invariant to translation, rotation, and scaling, and even to continuous stretching (as long as it does not involve tearing or joining). In particular, topological properties do not depend on any given distance measure. The simplest example is the number of holes. The following region has two holes, a number that does not change under any transformation that avoids tearing or joining.

Topological Descriptors - II The number of connected components is another useful topological descriptor. [Figure: a region with three connected components.] The Euler number is defined as the number of connected components minus the number of holes. [Figure: regions with Euler number 0 and Euler number -1.]

Texture As mentioned previously, texture is a very important cue in image analysis, both for computers and for humans. Textural descriptors provide information on the smoothness, coarseness, and regularity of textures. [Figure: smooth, coarse, and regular textures.]

Histogram-Based Texture Descriptors The histogram $\{p(z_i);\ i = 0, \dots, L-1\}$ of a texture contains valuable information about it. The n-th moment about the mean, or n-th central moment, is given by $$\mu_n = \sum_{i=0}^{L-1} (z_i - m)^n\, p(z_i)$$ where m is the mean $$m = \sum_{i=0}^{L-1} z_i\, p(z_i)$$ For example, the 2nd central moment is the familiar variance, while the 3rd central moment gives the skewness of the histogram.

Histogram-Based Texture Descriptors - II One can also define the uniformity $$U = \sum_{i=0}^{L-1} p^2(z_i)$$ as well as the entropy $$e = -\sum_{i=0}^{L-1} p(z_i) \log_2 p(z_i)$$ The following table gives several histogram-based descriptors for the textures in the previous figure.
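A sketch computing the mean, central moments, uniformity, and entropy from an integer grayscale texture patch:

```python
import numpy as np

def texture_stats(f, L=256):
    """Histogram-based texture descriptors of an integer grayscale image f."""
    p = np.bincount(f.ravel(), minlength=L) / f.size
    z = np.arange(L)
    m = (z * p).sum()                              # mean intensity
    mu = lambda n: (((z - m) ** n) * p).sum()      # n-th central moment
    uniformity = (p ** 2).sum()
    entropy = -(p[p > 0] * np.log2(p[p > 0])).sum()
    return m, mu(2), mu(3), uniformity, entropy
```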

Co-Occurrence Matrix Given a texture with L levels of intensity, the co-occurrence matrix G has at its general position $g_{ij}$ the number of pixel pairs with intensities $z_i$ and $z_j$ that occur in the relative position specified by a predicate Q. For example, let L = 8 and Q = "one pixel immediately to the right."

Co-Occurrence Matrix - II A co-occurrence matrix can be normalized by dividing all elements by the sum of all elements: $p_{ij} = g_{ij}/n$, where $n = \sum_{i,j} g_{ij}$. Based on this, several measures can be defined, such as the maximum probability, correlation, contrast, uniformity (energy), homogeneity, and entropy.
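A sketch building G for Q = "one pixel immediately to the right", normalizing it, and evaluating two of the usual measures (contrast and uniformity/energy):

```python
import numpy as np

def cooccurrence(f, L):
    """Co-occurrence matrix of integer image f for Q = 'pixel to the right'."""
    G = np.zeros((L, L))
    np.add.at(G, (f[:, :-1].ravel(), f[:, 1:].ravel()), 1)  # count pairs
    p = G / G.sum()                       # normalized matrix p_ij
    i, j = np.indices((L, L))
    contrast = (((i - j) ** 2) * p).sum()
    uniformity = (p ** 2).sum()           # also called energy
    return G, p, contrast, uniformity
```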

Co-Occurrence Matrix - Example [Figure: three textures (random, periodic, and mixed patterns) with their co-occurrence matrices.]

Co-Occurrence Matrix - Example (Cont'd) The following table gives some descriptors evaluated from the co-occurrence matrices for the textures on the previous slide.