DIGITAL IMAGE PROCESSING


AS SR INSTITUTE OF TECHNOLOGY, PRATHIPADU, TADEPALLIGUDEM
DEPARTMENT OF ECE

DIGITAL IMAGE PROCESSING

CONTENTS: UNIT-1, UNIT-3, UNIT-4, UNIT-6, UNIT-7

UNIT-I

Image processing involves changing the nature of an image in order to either:
1. Improve its pictorial information for human interpretation; or
2. Render it more suitable for processing, storage, transmission, and representation for autonomous machine perception.

Examples of condition 1 include: enhancing the edges of an image to make it appear sharper, removing noise from an image, and removing motion blur from an image. Examples of condition 2 include: obtaining the edges of an image and removing detail from an image.

ASPECTS OF IMAGE PROCESSING

Digital image processing is the use of computer algorithms to perform image processing on digital images. It is convenient to subdivide image-processing algorithms into broad subclasses:

Image Enhancement: Processing an image so that the result is more suitable for a particular application is called image enhancement. Examples include sharpening or deblurring an out-of-focus image, highlighting edges, improving image contrast, brightening an image, and removing noise.

Image Restoration: An image may be restored from damage done to it by a known cause, for example removal of blur caused by linear motion, removal of optical distortions, and removal of periodic interference.

Image Segmentation: Segmentation involves subdividing an image into constituent parts or isolating certain aspects of an image, such as finding lines, circles, or particular shapes in an image, and identifying cars, trees, buildings, or roads in aerial photographs.

DIGITAL IMAGE REPRESENTATION

An image may be defined as a two-dimensional function, f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point. When x, y, and the amplitude values of f are all finite, discrete quantities, we call the image a digital image. The field of digital image processing refers to processing digital images by means of a digital computer. A digital image is composed of a finite number of elements, each of which has a particular location and value. These elements are referred to as picture elements, image elements, pels, and pixels. Pixel is the term most widely used to denote the elements of a digital image.

A digital image can be considered as a matrix whose row and column indices identify a point in the image and whose corresponding matrix element value identifies the gray level at that point.

FUNDAMENTAL STEPS IN IMAGE PROCESSING

Let us consider a simple application of image processing techniques: automatically reading the address on pieces of mail. The following figure shows that the overall objective is to produce a result from a problem domain by means of image processing.

[Figure: block diagram of the fundamental steps - Problem Domain, Image Acquisition, Preprocessing, Segmentation, Representation & Description, and Recognition & Interpretation leading to the Result, with a Knowledge Base interacting with the processing modules.]

The problem domain in this example consists of pieces of mail, and the objective is to read the address on each piece. Thus the desired output in this case is a stream of alphanumeric characters.

Image Acquisition

The first step in this process is image acquisition, that is, to acquire a digital image. To do so requires an imaging sensor and the capability to digitize the signal produced by the sensor. The sensor could be a TV camera that produces an entire image of the problem domain every 1/30 sec. The imaging sensor can also be a line-scan camera that produces a single image line at a time. If the output of the camera or other imaging sensor is not already in digital form, an analog-to-digital converter digitizes it. The nature of the sensor and the image it produces are determined by the application.

Preprocessing

After a digital image has been obtained, the next step is preprocessing that image. The key function of preprocessing is to improve the image in ways that increase the chances for success of the other processes. Basically, the idea behind enhancement techniques is to bring out detail that is obscured, or simply to highlight certain features of interest in an image. In this example, preprocessing typically deals with techniques for enhancing contrast, removing noise, and isolating regions whose texture indicates a likelihood of alphanumeric information.

Segmentation

The next stage deals with segmentation. Segmentation partitions an input image into its constituent parts or objects. In general, autonomous segmentation is one of the most difficult tasks in digital image processing. A rugged segmentation procedure brings the process a long way toward successful solution of imaging problems that require objects to be identified individually. On the other hand, weak or erratic segmentation algorithms almost always guarantee eventual failure. In general, the more accurate the segmentation, the more likely recognition is to succeed. In our example of character recognition, the key role of segmentation is to extract individual characters and words from the background.

Representation and Description

The output of the segmentation stage usually is raw pixel data, constituting either the boundary of a region (i.e., the set of pixels separating one image region from another) or all the points in the region itself. In either case, converting the data to a form suitable for computer processing is necessary. The first decision that must be made is whether the data should be represented as a boundary or as a complete region. Boundary representation is appropriate when the focus is on external shape characteristics, such as corners and inflections. Regional representation is appropriate when the focus is on internal properties, such as texture or skeletal shape. In some applications, these representations complement each other. Our character recognition example requires algorithms based on boundary shape as well as skeletons and other internal properties. Choosing a representation is only part of the solution for transforming raw data into a form suitable for subsequent computer processing. A method must also be specified for describing the data so that features of interest are highlighted. Description, also called feature selection, deals with extracting attributes that result in some quantitative information of interest or are basic for differentiating one class of objects from another. In terms of character recognition, descriptors such as lakes (holes) and bays are powerful features that help differentiate one part of the alphabet from another.

Recognition and Interpretation

The last stage involves recognition and interpretation. Recognition is the process that assigns a label to an object based on its descriptors. Interpretation involves assigning meaning to an ensemble of recognized objects. In terms of our example, identifying a character as, say, a "c" requires associating the descriptors for that character with the label c. Interpretation attempts to assign meaning to a set of labeled entities. For example, a string of six numbers can be interpreted to be a ZIP code.

So far we have said nothing about the need for prior knowledge or about the interaction between the knowledge base and the processing modules. Knowledge about a problem domain is coded into an image processing system in the form of a knowledge database. In addition to guiding the operation of each processing module, the knowledge base also controls the interaction between modules.

ELEMENTS OF A DIGITAL IMAGE PROCESSING SYSTEM

[Figure: the principal elements of a digital image processing system - image acquisition, storage, processing, communication, and display.]

IMAGE ACQUISITION

Two elements are required to acquire digital images. The first is a physical device that is sensitive to a band in the electromagnetic energy spectrum (such as the X-ray, ultraviolet, visible, or infrared bands) and that produces an electrical signal output proportional to the level of energy sensed. The second, called a digitizer, is a device for converting the electrical output of the physical sensing device into digital form.

The types of images in which we are interested are generated by the combination of an "illumination" source and the reflection or absorption of energy from that source by the elements of the "scene" being imaged. We enclose illumination and scene in quotes to emphasize the fact that they are considerably more general than the familiar situation in which a visible light source illuminates a common everyday 3-D (three-dimensional) scene. For example, the illumination may originate from a source of electromagnetic energy such as radar, infrared, or X-ray energy.

The figure shows the three principal sensor arrangements used to transform illumination energy into digital images: the single imaging sensor, the line sensor, and the array sensor. The idea is simple: incoming energy is transformed into a voltage by the combination of input electrical power and sensor material that is responsive to the particular type of energy being detected. The output voltage waveform is the response of the sensor(s), and a digital quantity is obtained from each sensor by digitizing its response.

Image Acquisition Using a Single Sensor

The most familiar sensor of this type is the photodiode, which is constructed of silicon materials and whose output voltage waveform is proportional to light. The use of a filter in front of a sensor improves selectivity. In order to generate a 2-D image using a single sensor, there have to be relative displacements in both the x- and y-directions between the sensor and the area to be imaged.

The figure shows an arrangement used in high-precision scanning, where a film negative is mounted onto a drum whose mechanical rotation provides displacement in one dimension. The single sensor is mounted on a lead screw that provides motion in the perpendicular direction.

Microdensitometers

In microdensitometers the transparency or photograph is mounted on a flat bed or wrapped around a drum. Scanning is accomplished by focusing a beam of light (which could be a laser) on the image and translating the bed or rotating the drum in relation to the beam. In the case of transparencies, the beam passes through the film; in photographs the beam is reflected from the surface of the image. In both cases, the beam is focused on a photodetector, and the gray level at any point in the image is obtained by allowing only discrete values of intensity and position in the output.

Image Acquisition Using Sensor Strips

A geometry that is used much more frequently than single sensors consists of an in-line arrangement of sensors in the form of a sensor strip. The strip provides imaging elements in one direction. Motion perpendicular to the strip provides imaging in the other direction, as the figure above shows. This is the type of arrangement used in most flatbed scanners.

Solid-state arrays are composed of discrete silicon imaging elements, called photosites, that have a voltage output proportional to the intensity of the incident light. The figure below shows a typical line-scan sensor containing a row of photosites, two transfer gates used to clock the contents of the imaging elements into transport registers, and an output gate used to clock the contents of the transport registers into an amplifier. The amplifier outputs a voltage signal proportional to the contents of the row of photosites.

Sensor strips mounted in a ring configuration are used in medical and industrial imaging to obtain cross-sectional ("slice") images of 3-D objects, as the figure shows. A rotating X-ray source provides illumination, and the portion of the sensors opposite the source collects the X-ray energy that passes through the object (the sensors obviously have to be sensitive to X-ray energy). A 3-D digital volume consisting of stacked images is generated as the object is moved in a direction perpendicular to the sensor ring.

Image Acquisition Using Sensor Arrays

Numerous electromagnetic and some ultrasonic sensing devices frequently are arranged in an array format. This is also the predominant arrangement found in digital cameras. A typical sensor for these cameras is a CCD array. Charge-coupled device (CCD) arrays are similar to line-scan sensors, except that the photosites are arranged in a matrix form and a gate/transport-register combination separates columns of photosites.

The principal manner in which array sensors are used is shown in the figure. The figure shows the energy from an illumination source being reflected from a scene element but, as mentioned at the beginning of this section, the energy also could be transmitted through the scene elements. The first function performed by the imaging system shown in figure (c) is to collect the incoming energy and focus it onto an image plane. If the illumination is light, the front end of the imaging system is a lens, which projects the viewed scene onto the lens focal plane, as figure (d) shows. The sensor array, which is coincident with the focal plane, produces outputs proportional to the integral of the light received at each sensor. Digital and analog circuitry sweeps these outputs and converts them to a video signal, which is then digitized by another section of the imaging system.

STORAGE

Digital storage for image processing applications falls into three principal categories:
i) Short-term storage for use during processing,
ii) On-line storage for relatively fast recall, and
iii) Archival storage, characterized by infrequent access.

One method for providing short-term storage is computer memory. Another is specialized boards, called frame buffers, that store one or more images and can be accessed rapidly. On-line storage generally takes the form of magnetic disks. Jukeboxes that hold optical disks provide an effective solution for large-scale, on-line storage applications that require read-write capability. Archival storage is characterized by massive storage requirements but infrequent need for access. Magnetic tapes and optical disks are the usual media for archival applications.

PROCESSING

Processing of digital images involves procedures that are usually expressed in algorithmic form. Most image processing functions are implemented in software. The only reason for specialized image processing hardware is the need for speed in some applications or to overcome some fundamental computer limitations. Image processing is characterized by specific solutions; hence techniques that work well in one area can be totally inadequate in another.

COMMUNICATION

Communication in digital image processing primarily involves local communication between image processing systems and remote communication from one point to another. Hardware and software for local communication are readily available for most computers. Communication of images across vast distances presents a more serious challenge. A voice-grade telephone line is cheaper to use but slower. Wireless links using intermediate stations, such as satellites, are much faster, but they also cost considerably more.

DISPLAY

Monochrome and color monitors are the principal display devices used in modern image processing systems. Monitors are driven by the outputs of an image display module in the backplane of the host computer. The signals of the display module can also be fed into an image recording device that produces a hard copy of the image being viewed on the monitor screen. Other display media include random-access cathode ray tubes (CRTs) and printing devices.

A SIMPLE IMAGE FORMATION MODEL

The term image refers to a two-dimensional light-intensity function, denoted by f(x, y), where the value or amplitude of f at spatial coordinates (x, y) gives the intensity (brightness) of the image at that point. As light is a form of energy, f(x, y) must be nonzero and finite, that is,

0 < f(x, y) < infinity

The images people perceive in everyday visual activities normally consist of light reflected from objects. The function f(x, y) may be characterized by two components: (1) the amount of source illumination incident on the scene being viewed, and (2) the amount of illumination reflected by the objects in the scene. Appropriately, these are called the illumination and reflectance components and are denoted by i(x, y) and r(x, y), respectively. The two functions combine as a product to form f(x, y):

f(x, y) = i(x, y) r(x, y), where 0 < i(x, y) < infinity and 0 < r(x, y) < 1

The second condition indicates that reflectance is bounded by 0 (total absorption) and 1 (total reflectance). The nature of i(x, y) is determined by the illumination source, and r(x, y) is determined by the characteristics of the imaged objects. The values given above are theoretical bounds. The following average numerical figures illustrate some typical ranges of i(x, y) for visible light. On a clear day, the sun may produce in excess of 90,000 foot-candles of illumination on the surface of the Earth. This figure decreases to less than 10,000 foot-candles on a cloudy day. On a clear evening, a full moon yields about 0.1 foot-candles of illumination. The typical illumination level in a commercial office is about 1000 foot-candles. Similarly, the following are some typical values of r(x, y): 0.01 for black velvet, 0.65 for stainless steel, 0.80 for flat-white wall paint, 0.90 for silver-plated metal, and 0.93 for snow.
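
As an illustration that is not part of the original notes, the product model f(x, y) = i(x, y) r(x, y) can be simulated with a few lines of Python/NumPy. The illumination level (1000 foot-candles, the typical office figure quoted above) and the reflectance values are taken from the text; the 4x4 array shape is an arbitrary assumption.

import numpy as np

# Typical office illumination (foot-candles), assumed uniform over the scene
i = np.full((4, 4), 1000.0)

# Reflectance of the scene elements: a made-up patchwork of the materials
# listed above (black velvet, stainless steel, flat-white paint, snow)
r = np.array([[0.01, 0.01, 0.65, 0.65],
              [0.01, 0.01, 0.65, 0.65],
              [0.80, 0.80, 0.93, 0.93],
              [0.80, 0.80, 0.93, 0.93]])

# Product model: intensity (gray level) at each point
f = i * r
print(f.min(), f.max())   # 10.0 930.0 -- consistent with Lmin ~ 10, Lmax ~ 1000

The minimum and maximum of the simulated image agree with the indoor limits Lmin and Lmax discussed in the next paragraph.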

The intensity of a monochrome image at any coordinates (x, y) is called the gray level (l) of the image at that point. That is,

l = f(x, y)

From the equations for illumination and reflectance, it is evident that l lies in the range

Lmin <= l <= Lmax

In theory, the only requirement on Lmin is that it be positive, and on Lmax that it be finite. In practice, Lmin = i_min r_min and Lmax = i_max r_max. Using the preceding average office illumination and range of reflectance values as guidelines, we may expect Lmin of about 10 and Lmax of about 1000 to be typical limits for indoor values in the absence of additional illumination. The interval [Lmin, Lmax] is called the gray scale. Common practice is to shift this interval numerically to the interval [0, L-1], where l = 0 is considered black and l = L-1 is considered white on the gray scale. All intermediate values are shades of gray varying from black to white.

UNIFORM SAMPLING & QUANTIZATION

There are numerous ways to acquire images, but our objective in all of them is the same: to generate digital images from sensed data. The output of most sensors is a continuous voltage waveform whose amplitude and spatial behavior are related to the physical phenomenon being sensed. To create a digital image, we need to convert the continuous sensed data into digital form. This involves two processes: sampling and quantization.

Basic Concepts in Sampling and Quantization

The basic idea behind sampling and quantization is illustrated in the following figure. Figure (a) shows a continuous image, f(x, y), that we want to convert to digital form. An image may be continuous with respect to the x- and y-coordinates, and also in amplitude. To convert it to digital form, we have to sample the function in both coordinates and in amplitude. Digitizing the coordinate values is called sampling. Digitizing the amplitude values is called quantization.

[Figure: (a) continuous image with scan line AB; (b) gray-level profile along AB; (c) sampling and quantization of the profile; (d) the resulting digital scan line.]

The one-dimensional function shown in Fig. (b) is a plot of amplitude (gray-level) values of the continuous image along the line segment AB in Fig. (a). The random variations are due to image noise. To sample this function, we take equally spaced samples along line AB, as shown in Fig. (c). The location of each sample is given by a vertical tick mark in the bottom part of the figure. The samples are shown as small white squares superimposed on the function. The set of these discrete locations gives the sampled function. However, the values of the samples still span (vertically) a continuous range of gray-level values. In order to form a digital function, the gray-level values also must be converted (quantized) into discrete quantities. The right side of Fig. (c) shows the gray-level scale divided into eight discrete levels, ranging from black to white. The vertical tick marks indicate the specific value assigned to each of the eight gray levels. The continuous gray levels are quantized simply by assigning one of the eight discrete gray levels to each sample. The assignment is made depending on the vertical proximity of a sample to a vertical tick mark. The digital samples resulting from both sampling and quantization are shown in Fig. (d). Starting at the top of the image and carrying out this procedure line by line produces a two-dimensional digital image. Quantization of the sensor outputs completes the process of generating a digital image.

Representing Digital Images

To be suitable for computer processing, an image function f(x, y) must be digitized both spatially and in amplitude. Digitization of the spatial coordinates (x, y) is called image sampling, and amplitude digitization is called gray-level quantization. The result of sampling and quantization is a matrix of real numbers. Suppose that a continuous image f(x, y) is approximated by equally spaced samples arranged in the form of an N x M matrix, where each element of the array is a discrete quantity:

         [ f(0,0)     f(0,1)     ...  f(0,M-1)   ]
f(x,y) ~ [ f(1,0)     f(1,1)     ...  f(1,M-1)   ]
         [ ...        ...        ...  ...        ]
         [ f(N-1,0)   f(N-1,1)   ...  f(N-1,M-1) ]

The right side of this equation is by definition a digital image. Each element of this matrix array is called an image element, picture element, pixel, or pel. The terms image and pixel will be used throughout the rest of our discussions to denote a digital image and its elements.
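
A minimal Python sketch of sampling and quantization in one dimension (an added illustration, with an arbitrary synthetic signal standing in for the profile along line AB): equally spaced samples are taken along the line, and each sample is assigned to the nearest of L = 8 equally spaced gray levels.

import numpy as np

# A "continuous" intensity profile along line AB, modelled here by a smooth
# function plus noise (an arbitrary stand-in for the scanned scene)
x = np.linspace(0.0, 1.0, 1000)
profile = 0.5 + 0.4 * np.sin(2 * np.pi * x) + 0.02 * np.random.randn(x.size)

# Sampling: keep equally spaced samples along the line
num_samples = 32
samples = profile[:: profile.size // num_samples]

# Quantization: map each sample to the nearest of L = 8 equally spaced levels
L = 8
levels = np.clip(np.round(samples * (L - 1)), 0, L - 1).astype(int)

print(levels)   # the digital (sampled and quantized) scan line

Repeating this for every scan line of the image yields the two-dimensional digital image described above.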

Expressing sampling and quantization in more formal mathematical terms can be useful at times. Let Z and R denote the set of integers and the set of real numbers, respectively. The sampling process may be viewed as partitioning the xy-plane into a grid, with the coordinates of the center of each grid element being a pair of elements from the Cartesian product Z^2 (Z x Z), which is the set of all ordered pairs of elements (z_i, z_j), with z_i and z_j being integers from Z. Hence, f(x, y) is a digital image if (x, y) are integers from Z^2 and f is a function that assigns a gray-level value (that is, a real number from the set of real numbers, R) to each distinct pair of coordinates (x, y). This functional assignment obviously is the quantization process described earlier. If the gray levels also are integers (as usually is the case in this and subsequent chapters), Z replaces R, and a digital image then becomes a 2-D function whose coordinates and amplitude values are integers.

This digitization process requires decisions about values for M, N, and the number, L, of discrete gray levels allowed for each pixel. There are no requirements on M and N, other than that they have to be positive integers. However, due to processing, storage, and sampling hardware considerations, the number of gray levels typically is an integer power of 2:

L = 2^k

We assume that the discrete levels are equally spaced and that they are integers in the interval [0, L-1]. Sometimes the range of values spanned by the gray scale is called the dynamic range of an image, and we refer to images whose gray levels span a significant portion of the gray scale as having a high dynamic range. When an appreciable number of pixels exhibit this property, the image will have high contrast. Conversely, an image with low dynamic range tends to have a dull, washed-out gray look.

The number, b, of bits required to store a digitized image is

b = M x N x k

When M = N, this equation becomes

b = N^2 k

The table (not reproduced here) shows the number of bits required to store square images with various values of N and k; the number of gray levels corresponding to each value of k is 2^k.
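
Because the missing table is generated entirely by b = N^2 k, it can be reproduced with a short added script; the particular values of N and k below are assumed for illustration.

# Number of bits b = N*N*k needed to store a square N x N image with
# L = 2**k gray levels.
def storage_bits(N: int, k: int) -> int:
    return N * N * k

for N in (32, 64, 128, 256, 512, 1024):
    row = [f"{storage_bits(N, k):>12,}" for k in range(1, 9)]
    print(f"N={N:5d}  " + " ".join(row))

# Example: a 512 x 512 image with 256 gray levels (k = 8) needs
# 512*512*8 = 2,097,152 bits, i.e. 262,144 bytes.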

Spatial and Gray-Level Resolution

The resolution of an image strongly depends on two parameters: the number of samples and the number of gray levels. Sampling is the principal factor determining the spatial resolution of an image. Basically, spatial resolution is the smallest discernible detail in an image. Gray-level resolution similarly refers to the smallest discernible change in gray level. Due to hardware considerations, the number of gray levels is usually an integer power of 2. Now, let us consider the effect that variations in N and k have on image quality.

Effect of Reducing the Spatial Resolution:

Figure 2.19 shows an image of size 1024x1024 pixels whose gray levels are represented by 8 bits. The other images shown in the figure are the results of sub-sampling the 1024x1024 image. The sub-sampling was accomplished by deleting the appropriate number of rows and columns from the original image. For example, the 512x512 image was obtained by deleting every other row and column from the 1024x1024 image. The 256x256 image was generated by deleting every other row and column in the 512x512 image, and so on. The number of allowed gray levels was kept at 256. The simplest way to compare these effects is to bring all the sub-sampled images up to size 1024x1024 by row and column pixel replication. In the 512x512 image the level of detail lost is simply too fine to be seen on the printed page at the scale at which these images are shown. Next, the 256x256 image in Fig. 2.20(c) shows a very slight fine checkerboard pattern in the borders between flower petals and the black background. A slightly more pronounced graininess throughout the image also is beginning to appear. These effects are much more visible in the 128x128 image in Fig. 2.20(d), and they become pronounced in the 64x64 and 32x32 images in Figs. 2.20(e) and (f), respectively.
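
The sub-sampling and pixel-replication procedure described above can be sketched as follows (an added illustration; img is assumed to be a 2-D NumPy array such as a 1024x1024 gray-level image).

import numpy as np

def subsample(img: np.ndarray) -> np.ndarray:
    """Halve the spatial resolution by deleting every other row and column."""
    return img[::2, ::2]

def replicate(img: np.ndarray, factor: int) -> np.ndarray:
    """Bring a sub-sampled image back up to size by row/column pixel replication."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

# Example with a random 1024x1024 8-bit image standing in for Fig. 2.19
img = np.random.randint(0, 256, (1024, 1024), dtype=np.uint8)
small = subsample(subsample(img))          # 256 x 256
back  = replicate(small, 4)                # displayed again at 1024 x 1024
print(small.shape, back.shape)             # (256, 256) (1024, 1024)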

Effect of Reducing Gray-Level Resolution (GRAY TO BINARY CONVERSION)

In this example, we keep the number of samples constant and reduce the number of gray levels from 256 to 2, in integer powers of 2. Figure 2.21(a) is a 452x374 image displayed with k = 8 (256 gray levels).

Figures 2.21(b) through (h) were obtained by reducing the number of bits from k = 7 to k = 1 while keeping the spatial resolution constant at 452x374 pixels. The 256-, 128-, and 64-level images are visually identical for all practical purposes. The 32-level image shown in Fig. 2.21(d), however, has an almost imperceptible set of very fine ridge-like structures in areas of smooth gray levels (particularly in the skull). This effect, caused by the use of an insufficient number of gray levels in smooth areas of a digital image, is called false contouring, so named because the ridges resemble topographic contours in a map. False contouring generally is quite visible in images displayed using 16 or fewer uniformly spaced gray levels, as the images in Figs. 2.21(e) through (h) show.

Isopreference Curves

The results in the above examples illustrate the effects produced on image quality by varying N and k independently. However, these results only partially answer the question of how varying N and k affects images, because we have not yet considered any relationships that might exist between these two parameters. An early study attempted to quantify experimentally the effects on image quality produced by varying N and k simultaneously. The experiment consisted of a set of subjective tests. Images similar to those shown in the figure were used. The woman's face is representative of an image with relatively little detail; the picture of the cameraman contains an intermediate amount of detail; and the crowd picture contains, by comparison, a large amount of detail.

Sets of these three types of images were generated by varying N and k, and observers were then asked to rank them according to their subjective quality. Results were summarized in the form of so-called isopreference curves in the Nk-plane. Each point in the Nk-plane represents an image having values of N and k equal to the coordinates of that point. Points lying on an isopreference curve correspond to images of equal subjective quality. It was found in the course of the experiments that the isopreference curves tended to shift right and upward, but their shapes in each of the three image categories were similar to those shown in Fig. 2.23. This is not unexpected, since a shift up and to the right in the curves simply means larger values for N and k, which implies better picture quality.

The key point of interest here is that isopreference curves tend to become more vertical as the detail in the image increases. This result suggests that for images with a large amount of detail only a few gray levels may be needed. For example, the isopreference curve corresponding to the crowd is nearly vertical. This indicates that, for a fixed value of N, the perceived quality for this type of image is nearly independent of the number of gray levels used (for the range of gray levels shown in Fig. 2.23). It is also of interest to note that perceived quality in the other two image categories remained the same in some intervals in which the spatial resolution was increased but the number of gray levels actually decreased. The most likely reason for this result is that a decrease in k tends to increase the apparent contrast of an image, a visual effect that humans often perceive as improved quality in an image.
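
To make the gray-level reduction used in these experiments concrete, the following added sketch requantizes an assumed 8-bit image to L = 2^k levels for k = 8 down to 1, keeping the spatial resolution fixed.

import numpy as np

def reduce_gray_levels(img: np.ndarray, k: int) -> np.ndarray:
    """Requantize an 8-bit image to L = 2**k gray levels (k = 1..8),
    spreading the result back over the 0..255 display range."""
    L = 2 ** k
    step = 256 // L
    return ((img.astype(int) // step) * (255 // (L - 1))).astype(np.uint8)

# 452 x 374 random image standing in for Fig. 2.21(a)
img = np.random.randint(0, 256, (452, 374), dtype=np.uint8)
for k in range(8, 0, -1):                  # 256, 128, ..., 2 gray levels
    out = reduce_gray_levels(img, k)
    print(k, len(np.unique(out)))          # number of distinct levels in use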

NON-UNIFORM SAMPLING AND QUANTIZATION

Non-Uniform Sampling

For a fixed value of spatial resolution, the appearance of an image can be improved in many cases by using an adaptive scheme where the sampling process depends on the characteristics of the image. In general, fine sampling is required in the neighborhood of sharp gray-level transitions, whereas coarse sampling may be utilized in relatively smooth regions. Consider, for example, a simple image consisting of a face superimposed on a uniform background. Clearly, the background carries little detailed information and can be quite adequately represented by coarse sampling. The face, however, contains considerably more detail. If the additional samples not used in the background are used in this region of the image, the overall result tends to improve. In distributing the samples, greater sample concentration should be used at gray-level transition boundaries, such as the boundary between the face and the background.

Disadvantages or Drawbacks: The necessity of having to identify boundaries is a definite drawback of the non-uniform sampling approach. The method also is not practical for images containing relatively small uniform regions (e.g., the crowd image).

Non-Uniform Quantization

When the number of gray levels must be kept small, the use of unequally spaced levels in the quantization process usually is desirable. A method similar to the non-uniform sampling technique may be used for the distribution of gray levels in an image. As the eye is relatively poor at estimating shades of gray near abrupt level changes, the approach in this case is to use few gray levels in the neighborhood of boundaries. The remaining levels can then be used in regions where gray-level variations are smooth, thus avoiding or reducing the false contours that often appear in these regions if they are too coarsely quantized.

Disadvantages: This method is subject to the preceding observations about boundary detection and detail content.

An alternative technique that is particularly attractive for distributing gray levels consists of computing the frequency of occurrence of the allowed levels. If gray levels in a certain range occur frequently, while others occur rarely, the quantization levels are finely spaced in this range and coarsely spaced outside of it. This method is sometimes called TAPERED QUANTIZATION.
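
One plausible reading of tapered quantization (an added sketch, not a prescribed algorithm) is to place the quantization bins at equally spaced quantiles of the gray-level histogram, so that frequently occurring ranges receive finely spaced levels and rare ranges receive coarse ones.

import numpy as np

def tapered_quantize(img: np.ndarray, L: int) -> np.ndarray:
    """Quantize to L levels placed according to the frequency of occurrence
    of gray levels: bin edges at equally spaced quantiles of the histogram."""
    edges = np.quantile(img, np.linspace(0.0, 1.0, L + 1))
    edges[-1] += 1                                   # include the maximum value
    indices = np.digitize(img, edges[1:-1])          # bin index 0 .. L-1 per pixel
    centers = 0.5 * (edges[:-1] + edges[1:])         # representative level per bin
    return centers[indices]

# Image whose gray levels cluster in a narrow range, where uniform
# quantization would waste most of its levels
img = np.clip(np.random.normal(100, 10, (64, 64)), 0, 255)
print(np.unique(tapered_quantize(img, 8)).round(1))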

SOME BASIC RELATIONSHIPS BETWEEN PIXELS

An image is denoted by f(x, y). When referring in this section to a particular pixel, we use lowercase letters, such as p and q. A subset of pixels of f(x, y) is denoted by S.

Neighbors of a Pixel

A pixel p at coordinates (x, y) has four horizontal and vertical neighbors whose coordinates are given by

(x+1, y), (x-1, y), (x, y+1), (x, y-1)

This set of pixels, called the 4-neighbors of p, is denoted by N4(p). Each pixel is a unit distance from (x, y), and some of the neighbors of p lie outside the digital image if (x, y) is on the border of the image. The four diagonal neighbors of p have coordinates

(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)

and are denoted by ND(p). These points, together with the 4-neighbors, are called the 8-neighbors of p, denoted by N8(p). As before, some of the points in ND(p) and N8(p) fall outside the image if (x, y) is on the border of the image.

Connectivity, Adjacency, Regions and Boundaries

Connectivity between pixels is a fundamental concept that simplifies the definition of numerous digital image concepts, such as regions and boundaries. To establish whether two pixels are connected, it must be determined if they are neighbors and if their gray levels satisfy a specified criterion of similarity (say, if their gray levels are equal). For instance, in a binary image with values 0 and 1, two pixels may be 4-neighbors, but they are said to be connected only if they have the same value.

Let V be the set of gray-level values used to define connectivity. In a binary image, V = {1} if we are referring to connectivity of pixels with value 1. In a grayscale image the idea is the same, but the set V typically contains more elements. For example, for the connectivity of pixels with a range of intensity values, say 32 to 64, it follows that V = {32, 33, ..., 63, 64}. We consider three types of connectivity:

(a) 4-connectivity: Two pixels p and q with values from V are 4-connected if q is in the set N4(p).
(b) 8-connectivity: Two pixels p and q with values from V are 8-connected if q is in the set N8(p).
(c) m-connectivity (mixed connectivity): Two pixels p and q with values from V are m-connected if (i) q is in N4(p), or (ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.

Mixed connectivity is a modification of 8-connectivity. It is introduced to eliminate the ambiguities (multiple path connections) that often arise when 8-adjacency is used. For example, consider the pixel arrangement shown in figure (a) with V = {1}. The three pixels at the top of the figure show multiple (ambiguous) 8-adjacency, as indicated by the dashed lines in (b). This ambiguity is removed by using m-adjacency, as shown in (c).
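
The neighborhood sets and the m-adjacency test defined above can be written down directly. In this added sketch, coordinates are (x, y) pairs, value_at is an assumed lookup of a pixel's gray level, and no bounds checking against the image border is done.

def n4(p):
    """4-neighbors of pixel p = (x, y)."""
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(p):
    """Diagonal neighbors of pixel p = (x, y)."""
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def n8(p):
    """8-neighbors: union of the 4-neighbors and the diagonal neighbors."""
    return n4(p) | nd(p)

def m_adjacent(p, q, V, value_at):
    """m-adjacency of p and q for a value set V, given a lookup value_at(pixel)."""
    if value_at(p) not in V or value_at(q) not in V:
        return False
    if q in n4(p):
        return True
    common = {r for r in (n4(p) & n4(q)) if value_at(r) in V}
    return q in nd(p) and not common

print(sorted(n8((5, 5))))   # the eight pixels surrounding (5, 5)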

A pixel p is adjacent to a pixel q if they are connected. Correspondingly, there are three types of adjacency:

(a) 4-adjacency: Two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).
(b) 8-adjacency: Two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).
(c) m-adjacency (mixed adjacency): Two pixels p and q with values from V are m-adjacent if (i) q is in N4(p), or (ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.

A (digital) path (or curve) from pixel p with coordinates (x, y) to pixel q with coordinates (s, t) is a sequence of distinct pixels with coordinates

(x0, y0), (x1, y1), ..., (xn, yn)

where (x0, y0) = (x, y) and (xn, yn) = (s, t), and (xi, yi) is adjacent to (xi-1, yi-1) for 1 <= i <= n. In this case, n is the length of the path. If (x0, y0) = (xn, yn), the path is a closed path. We can define 4-, 8-, or m-paths depending on the type of adjacency specified. For example, the paths shown in figure (b) above between the northeast and southeast points are 8-paths, and the path in figure (c) is an m-path.

Let S represent a subset of pixels in an image. Two pixels p and q are said to be connected in S if there exists a path between them consisting entirely of pixels in S. For any pixel p in S, the set of pixels that are connected to it in S is called a connected component of S. If it has only one connected component, then the set S is called a connected set.

Let R be a subset of pixels in an image. We call R a region of the image if R is a connected set. The boundary (also called border or contour) of a region R is the set of pixels in the region that have one or more neighbors that are not in R. If R happens to be an entire image (which we recall is a rectangular set of pixels), then its boundary is defined as the set of pixels in the first and last rows and columns of the image. This extra definition is required because an image has no neighbors beyond its border. Normally, when we refer to a region, we are referring to a subset of an image, and any pixels in the boundary of the region that happen to coincide with the border of the image are included implicitly as part of the region boundary.

The concept of an edge is found frequently in discussions dealing with regions and boundaries. There is a key difference between these concepts, however. The boundary of a finite region forms a closed path and is thus a global concept. Edges are formed from pixels with derivative values that exceed a preset threshold. Thus, the idea of an edge is a local concept that is based on a measure of gray-level discontinuity at a point. It is possible to link edge points into edge segments, and sometimes these segments are linked in such a way that they correspond to boundaries, but this is not always the case. The one exception in which edges and boundaries correspond is in binary images. Depending on the type of connectivity and edge operators used, the edge extracted from a binary region will be the same as the region boundary. Conceptually, it is helpful to think of edges as intensity discontinuities and boundaries as closed paths.

Distance Measures

For pixels p, q, and z, with coordinates (x, y), (s, t), and (v, w), respectively, D is a distance function or metric if

(a) D(p, q) >= 0 (D(p, q) = 0 if and only if p = q),
(b) D(p, q) = D(q, p), and
(c) D(p, z) <= D(p, q) + D(q, z).

The Euclidean distance between p and q is defined as

De(p, q) = [(x - s)^2 + (y - t)^2]^(1/2)

For this distance measure, the pixels having a distance less than or equal to some value r from (x, y) are the points contained in a disk of radius r centered at (x, y).

The D4 distance (also called city-block distance) between p and q is defined as

D4(p, q) = |x - s| + |y - t|

In this case, the pixels having a D4 distance from (x, y) less than or equal to some value r form a diamond centered at (x, y). For example, the pixels with D4 distance less than or equal to 2 from (x, y) (the center point) form the following contours of constant distance:

        2
    2   1   2
2   1   0   1   2
    2   1   2
        2

The pixels with D4 = 1 are the 4-neighbors of (x, y).

The D8 distance (also called chessboard distance) between p and q is defined as

D8(p, q) = max(|x - s|, |y - t|)

In this case, the pixels with D8 distance from (x, y) less than or equal to some value r form a square centered at (x, y). For example, the pixels with D8 distance less than or equal to 2 from (x, y) (the center point) form the following contours of constant distance:

2   2   2   2   2
2   1   1   1   2
2   1   0   1   2
2   1   1   1   2
2   2   2   2   2

The pixels with D8 = 1 are the 8-neighbors of (x, y).

Note that the D4 and D8 distances between p and q are independent of any paths that might exist between the points, because these distances involve only the coordinates of the points. If we elect to consider m-adjacency, however, the Dm distance between two points is defined as the length of the shortest m-path between the points. In this case, the distance between two pixels will depend on the values of the pixels along the path, as well as on the values of their neighbors. For instance, consider the following arrangement of pixels and assume that p, p2, and p4 have value 1 and that p1 and p3 can have a value of 0 or 1:

        p3  p4
    p1  p2
    p

Suppose that we consider adjacency of pixels valued 1 (i.e., V = {1}). If p1 and p3 are 0, the length of the shortest m-path (the Dm distance) between p and p4 is 2. If p1 is 1, then p2 and p will no longer be m-adjacent (see the definition of m-adjacency) and the length of the shortest m-path becomes 3 (the path goes through the points p-p1-p2-p4). Similar comments apply if p3 is 1 (and p1 is 0); in this case, the length of the shortest m-path also is 3. Finally, if both p1 and p3 are 1, the length of the shortest m-path between p and p4 is 4. In this case, the path goes through the sequence of points p-p1-p2-p3-p4.

IMAGING GEOMETRY: SOME BASIC TRANSFORMATIONS

In this section, all transformations are expressed in a three-dimensional Cartesian coordinate system in which a point has coordinates denoted (X, Y, Z).

Translation

Suppose that the task is to translate a point with coordinates (X, Y, Z) to a new location by using displacements (X0, Y0, Z0). The translation is easily accomplished by using the equations

X* = X + X0
Y* = Y + Y0
Z* = Z + Z0

where (X*, Y*, Z*) are the coordinates of the new point. The above equations can be represented in matrix form as

[ X* ]   [ X + X0 ]
[ Y* ] = [ Y + Y0 ]
[ Z* ]   [ Z + Z0 ]

It is often useful to concatenate several transformations to produce a composite result, such as translation followed by scaling and then rotation. The use of square matrices simplifies the notational representation of this process considerably. So the translation above can be modified, using homogeneous coordinates, as

[ X* ]   [ 1  0  0  X0 ] [ X ]
[ Y* ] = [ 0  1  0  Y0 ] [ Y ]
[ Z* ]   [ 0  0  1  Z0 ] [ Z ]
[ 1  ]   [ 0  0  0  1  ] [ 1 ]

Let us consider the unified matrix representation

v* = Av

where A is a 4x4 transformation matrix, v is the column vector containing the original coordinates,

v = [ X  Y  Z  1 ]^T

and v* is a column vector whose components are the transformed coordinates,

v* = [ X*  Y*  Z*  1 ]^T

With this notation, the matrix used for translation is

    [ 1  0  0  X0 ]
T = [ 0  1  0  Y0 ]
    [ 0  0  1  Z0 ]
    [ 0  0  0  1  ]

and the translation process is accomplished by the equation v* = Tv.

Scaling

Scaling by factors Sx, Sy, Sz along the X, Y, and Z axes is given by the transformation matrix

    [ Sx  0   0   0 ]
S = [ 0   Sy  0   0 ]
    [ 0   0   Sz  0 ]
    [ 0   0   0   1 ]

Rotation

The transformations used for 3-D rotation are more complex. The simplest form of these transformations is for rotation of a point about the coordinate axes. To rotate a point about another arbitrary point in space requires three transformations: the first translates the arbitrary point to the origin, the second performs the rotation, and the third translates the point back to its original position.

[Figure: rotation angles - θ about the Z axis, α about the X axis, and β about the Y axis.]

With reference to the figure, rotation of a point about the Z coordinate axis by an angle θ is achieved by using the transformation

     [  cosθ  sinθ  0  0 ]
Rθ = [ -sinθ  cosθ  0  0 ]
     [  0     0     1  0 ]
     [  0     0     0  1 ]

The rotation angle θ is measured clockwise when looking toward the origin from a point on the Z axis. This transformation affects only the values of the X and Y coordinates.

Rotation of a point about the X axis by an angle α is performed by using the transformation

     [ 1   0     0     0 ]
Rα = [ 0   cosα  sinα  0 ]
     [ 0  -sinα  cosα  0 ]
     [ 0   0     0     1 ]

Finally, rotation of a point about the Y axis by an angle β is achieved by using the transformation

     [ cosβ  0  -sinβ  0 ]
Rβ = [ 0     1   0     0 ]
     [ sinβ  0   cosβ  0 ]
     [ 0     0   0     1 ]

Concatenation and Inverse Transformations

The application of several transformations can be represented by a single 4x4 transformation matrix. For example, translation, scaling, and rotation about the Z axis of a point v is given by

v* = Rθ(S(Tv)) = Av

where A is the 4x4 matrix A = RθST. These matrices generally do not commute, so the order of application is important.

The same ideas can be extended to transforming a set of m points simultaneously by using a single transformation. Let v1, v2, ..., vm represent the coordinates of m points. Arranging these column vectors as the columns of a 4xm matrix V, the simultaneous transformation of all the points by a 4x4 transformation matrix A is given by

V* = AV

The resulting matrix V* is 4xm. Its i-th column, v*_i, contains the coordinates of the transformed point corresponding to v_i.

Many of the transformations discussed above have inverse matrices that perform the opposite transformation and can be obtained by inspection. For example, the inverse translation matrix is

       [ 1  0  0  -X0 ]
T^-1 = [ 0  1  0  -Y0 ]
       [ 0  0  1  -Z0 ]
       [ 0  0  0   1  ]

QUESTIONS & ANSWERS

1. Why do we process images?

Image processing has been developed in response to three major problems concerned with pictures: picture digitization and coding to facilitate transmission, printing, and storage of pictures; picture enhancement and restoration in order, for example, to interpret more easily pictures of the surface of other planets taken by various probes; and picture segmentation and description as an early stage in machine vision.

2. What is the brightness of an image at a pixel position?

Each pixel of an image corresponds to a part of a physical object in the 3-D world. This physical object is illuminated by some light which is partly reflected and partly absorbed by it. Part of the reflected light reaches the sensor used to image the scene and is responsible for the value recorded for the specific pixel. The recorded value, of course, depends on the type of sensor used to image the scene and the way this sensor responds to the spectrum of the reflected light. However, as a whole scene is imaged by the same sensor, we usually ignore these details. What is important to remember is that the brightness values of different pixels have significance only relative to each other and are meaningless in absolute terms. So, pixel values between different images should be compared only if either care has been taken for the physical processes used to form the two images to be identical, or the brightness values of the two images have somehow been normalized so that the effects of the different physical processes have been removed.

3. Why are images often quoted as being 512 x 512, 256 x 256, 128 x 128, etc.?

Many calculations with images are simplified when the size of the image is a power of 2.

4. How many bits do we need to store an image?

The number of bits, b, we need to store an image of size N x N with 2^m different grey levels is

b = N x N x m

So, for a typical 512 x 512 image with 256 grey levels (m = 8) we need 2,097,152 bits, or 262,144 8-bit bytes. That is why we often try to reduce m and N, without significant loss in the quality of the picture.

5. Consider the two image subsets, S1 and S2, shown in the figure (not reproduced here). For V = {1}, determine whether these two subsets are (a) 4-adjacent, (b) 8-adjacent, or (c) m-adjacent.

Let p and q be as shown in the figure. Then:
(a) S1 and S2 are not 4-connected because q is not in the set N4(p).
(b) S1 and S2 are 8-connected because q is in the set N8(p).
(c) S1 and S2 are m-connected because (i) q is in ND(p), and (ii) the set N4(p) ∩ N4(q) is empty.

6. Consider the image segment shown (figure not reproduced here). (a) Let V = {0, 1} and compute the lengths of the shortest 4-, 8-, and m-path between p and q. If a particular path does not exist between these two points, explain why. (b) Repeat for V = {1, 2}.

(a) When V = {0, 1}, a 4-path does not exist between p and q because it is impossible to get from p to q by traveling along points that are both 4-adjacent and also have values from V; it is not possible to reach q.

The shortest 8-path has length 4, and the length of the shortest m-path is 5. Both of these shortest paths are unique in this case.

(b) One possibility for the shortest 4-path when V = {1, 2} has length 6; it is easily verified that another 4-path of the same length exists between p and q. One possibility for the shortest 8-path (it is not unique) has length 4. The length of a shortest m-path is 6; this path is not unique.

7. (a) Give the condition(s) under which the D4 distance between two points p and q is equal to the length of the shortest 4-path between these points. (b) Is this path unique?

(a) Consider a shortest 4-path between a point p with coordinates (x, y) and a point q with coordinates (s, t), where the assumption is that all points along the path are from V. The lengths of the two segments of such a path are |x - s| and |y - t|, respectively, so the total path length is |x - s| + |y - t|, which we recognize as the definition of the D4 distance. This distance is independent of any paths that may exist between the points. The D4 distance is therefore equal to the length of the shortest 4-path whenever we can get from p to q by following a path whose elements (1) are from V, and (2) are arranged in such a way that we can traverse the path from p to q by making turns in at most two directions (e.g., right and up).

(b) The path may or may not be unique, depending on V and the values of the points along the way.

8. Develop an algorithm for converting a one-pixel-thick 8-path to a 4-path.

The solution to this problem consists of determining all possible neighborhood shapes needed to go from a diagonal segment to a corresponding 4-connected segment. The algorithm then simply looks for the appropriate match every time a diagonal segment is encountered in the boundary.

9. Explain the basic principle of imaging in different bands of the electromagnetic spectrum.

Today, there is almost no area of technical endeavor that is not impacted in some way by digital image processing. One of the simplest ways to develop a basic understanding of the extent of image processing applications is to categorize images according to their source (e.g., visual, X-ray, and so on). The principal energy source for images in use today is the electromagnetic energy spectrum. Other important sources of energy include acoustic, ultrasonic, and electronic (in the form of electron beams used in electron microscopy). Images based on radiation from the EM spectrum are the most familiar, especially images in the X-ray and visual bands of the spectrum. Electromagnetic waves can be conceptualized as propagating sinusoidal waves of varying wavelengths, or they can be thought of as a stream of massless particles, each traveling in a wavelike pattern and moving at the speed of light. If spectral bands are grouped according to energy per photon, we obtain a spectrum ranging from gamma rays (highest energy) at one end to radio waves (lowest energy) at the other.

Gamma-Ray Imaging: Major uses of imaging based on gamma rays include nuclear medicine and astronomical observations. In nuclear medicine, the approach is to inject a patient with a radioactive isotope that emits gamma rays as it decays. Images are produced from the emissions collected by gamma-ray detectors.

X-ray Imaging: X-rays are among the oldest sources of EM radiation used for imaging. The best known use of X-rays is medical diagnostics, but they also are used extensively in industry and other areas, like astronomy. X-rays for medical and industrial imaging are generated using an X-ray tube, which is a vacuum tube with a cathode and anode. In digital radiography, digital images are obtained by one of two methods: (1) by digitizing X-ray films; or (2) by having the X-rays that pass through the patient fall directly onto devices (such as a phosphor screen) that convert X-rays to light. The light signal in turn is captured by a light-sensitive digitizing system. Angiography is another major application, in an area called contrast-enhancement radiography. This procedure is used to obtain images (called angiograms) of blood vessels.

Imaging in the Ultraviolet Band: Applications of ultraviolet light are varied. They include lithography, industrial inspection, microscopy, lasers, biological imaging, and astronomical observations. Ultraviolet light is used in fluorescence microscopy, one of the fastest growing areas of microscopy. Fluorescence microscopy is an excellent method for studying materials that can be made to fluoresce, either in their natural form (primary fluorescence) or when treated with chemicals capable of fluorescing (secondary fluorescence).

Imaging in the Visible and Infrared Bands: Considering that the visual band of the electromagnetic spectrum is the most familiar in all our activities, it is not surprising that imaging in this band outweighs by far all the others in terms of scope of application. The infrared band often is used in conjunction with visual imaging. The examples range from pharmaceuticals and micro-inspection to materials characterization. Another major area of visual processing is remote sensing, which usually includes several bands in the visual and infrared regions of the spectrum. A major area of imaging in the visual spectrum is automated visual inspection of manufactured goods.

Imaging in the Microwave Band: The dominant application of imaging in the microwave band is radar. The unique feature of imaging radar is its ability to collect data over virtually any region at any time, regardless of weather or ambient lighting conditions. Some radar waves can penetrate clouds, and under certain conditions can also see through vegetation, ice, and extremely dry sand. In many cases, radar is the only way to explore inaccessible regions of the Earth's surface. Instead of a camera lens, a radar uses an antenna and digital computer processing to record its images. In a radar image, one can see only the microwave energy that was reflected back toward the radar antenna.

Imaging in the Radio Band: As in the case of imaging at the other end of the spectrum (gamma rays), the major applications of imaging in the radio band are in medicine and astronomy. In medicine, radio waves are used in magnetic resonance imaging (MRI). This technique places a patient in a powerful magnet and passes radio waves through his or her body in short pulses. Each pulse causes a responding pulse of radio waves to be emitted by the patient's tissues. The location from which these signals originate and their strength are determined by a computer, which produces a two-dimensional picture of a section of the patient. MRI can produce pictures in any plane.

UNIT-III

IMAGE ENHANCEMENT IN SPATIAL DOMAIN

The principal objective of enhancement is to process an image so that the result is more suitable than the original image for a specific application. Image enhancement approaches fall into two broad categories: spatial domain methods and frequency domain methods. The term spatial domain refers to the image plane itself, and approaches in this category are based on direct manipulation of pixels in an image. Frequency domain processing techniques are based on modifying the Fourier transform of an image.

BACKGROUND

The term spatial domain refers to the aggregate of pixels composing an image. Spatial domain methods are procedures that operate directly on these pixels. Spatial domain processes will be denoted by the expression

g(x, y) = T[f(x, y)]

where f(x, y) is the input image, g(x, y) is the processed image, and T is an operator on f, defined over some neighborhood of (x, y). The principal approach in defining a neighborhood about a point (x, y) is to use a square or rectangular sub-image area centered at (x, y), as Fig. 3.1 shows. The center of the sub-image is moved from pixel to pixel, starting, say, at the top left corner. The operator T is applied at each location (x, y) to yield the output, g, at that location. The process utilizes only the pixels in the area of the image spanned by the neighborhood. Although other neighborhood shapes, such as approximations to a circle, sometimes are used, square and rectangular arrays are by far the most predominant because of their ease of implementation.
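
An added sketch of the general spatial-domain operation g(x, y) = T[f(x, y)]: a square window is slid from pixel to pixel and an operator T (here simply the neighborhood average, chosen only for illustration) is applied at each location.

import numpy as np

def spatial_op(f: np.ndarray, T, size: int = 3) -> np.ndarray:
    """Apply an operator T over a size x size neighborhood centered at every
    pixel of f; the image is edge-padded so border pixels have full windows."""
    r = size // 2
    padded = np.pad(f, r, mode="edge")
    g = np.empty_like(f, dtype=float)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            g[x, y] = T(padded[x:x + size, y:y + size])
    return g

f = np.random.randint(0, 256, (64, 64)).astype(float)
g = spatial_op(f, np.mean)      # neighborhood averaging as an example of T
print(g.shape)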

32 POINT PROCESSING TECHNIQUES The simplest form of T is when the neighborhood is of size 1x1 (that is, a single pixel). In this case, g depends only on the value of f at (x, y), and T becomes a gray-level (also called an intensity or mapping) transformation function of the form s=t(r) where, for simplicity in notation, r and s are variables denoting, respectively, the gray level of f(x, y) and g(x, y) at any point (x, y). For example, if T(r) has the form shown in Fig. 3.2(a), the effect of this transformation would be to produce an image of higher contrast than the original by darkening the levels below m and brightening the levels above m in the original image. In this technique, known as contrast stretching, the values of r below m are compressed by the transformation function into a narrow range of s, toward black. The opposite effect takes place for values of r above m. In the limiting case shown in Fig. 3.2(b), T(r) produces a two-level (binary) image. A mapping of this form is called a thresholding function. Some fairly simple, yet powerful, processing approaches can be formulated with gray-level transformations. Because enhancement at any point in an image depends only on the gray level at that point, techniques in this category often are referred to as point processing. Larger neighborhoods allow considerably more flexibility. The general approach is to use a function of the values of f in a predefined neighborhood of (x, y) to determine the value of g at (x, y). One of the principal approaches in this formulation is based on the use of so-called masks (also referred to as filters, kernels, templates, or windows). Basically, a mask is a small (say, 3*3) 2-D array, such as the one shown in Fig. 3.1, in which the values of the mask coefficients determine the nature of the process, such as image sharpening. Enhancement techniques based on this type of approach often are referred to as mask processing or filtering. SOME BASIC GRAY LEVEL TRANSFORMATIONS These are among the simplest of all image enhancement techniques. The values of pixels, before and after processing, will be denoted by r and s, respectively. These values are related by an expression of the form s=t(r), where T is a transformation that maps a pixel value r into a pixel value s. As an introduction to gray-level transformations, consider Fig. 3.3, which shows three basic types of functions used frequently for image enhancement: linear (negative and 32 Department of ECE

33 identity transformations), logarithmic (log and inverse-log transformations), and power-law (nth power and nth root transformations).the identity function is the trivial case in which output intensities are identical to input intensities. Image Negatives The negative of an image with gray levels in the range [0,L-1]is obtained by using the negative transformation shown in Fig. 3.3, which is given by the expression. s = L-1-r Reversing the intensity levels of an image in this manner produces the equivalent of a photographic negative. This type of processing is particularly suited for enhancing white or gray detail embedded in dark regions of an image, especially when the black areas are dominant in size. An example is shown in Fig The original image is a digital mammogram showing a small lesion. In spite of the fact that the visual content is the same in both images, note how much easier it is to analyze the breast tissue in the negative image in this particular case Department of ECE
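The point operations described above can be sketched in a few lines of Python/NumPy. This is a minimal illustration only: it assumes an 8-bit grayscale array img with L = 256 gray levels, and the function names, the threshold m, and the stretching slope are illustrative choices, not part of the original notes.

import numpy as np

def negative(img, L=256):
    # s = (L - 1) - r, applied to every pixel
    return (L - 1 - img.astype(np.int32)).astype(np.uint8)

def threshold(img, m=128):
    # Limiting case of contrast stretching: a two-level (binary) output image
    return np.where(img >= m, 255, 0).astype(np.uint8)

def contrast_stretch(img, m=128, slope=2.0):
    # Darken levels below m and brighten levels above m (simple piecewise-linear sketch)
    s = m + slope * (img.astype(np.float64) - m)
    return np.clip(s, 0, 255).astype(np.uint8)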

Log Transformations

The general form of the log transformation shown in Fig. 3.3 is

s = c log(1 + r)

where c is a constant, and it is assumed that r ≥ 0. The shape of the log curve in Fig. 3.3 shows that this transformation maps a narrow range of low gray-level values in the input image into a wider range of output levels. The opposite is true of higher values of input levels. We would use a transformation of this type to expand the values of dark pixels in an image while compressing the higher-level values. The opposite is true of the inverse log transformation.

The log function has the important characteristic that it compresses the dynamic range of images with large variations in pixel values. A classic illustration of an application in which pixel values have a large dynamic range is the Fourier spectrum. It is not unusual to encounter spectrum values that span several orders of magnitude. While processing numbers such as these presents no problems for a computer, image display systems generally will not be able to reproduce faithfully such a wide range of intensity values, and the net effect is that a significant degree of detail will be lost in the display of a typical Fourier spectrum.

Power-Law Transformations

Power-law transformations have the basic form

s = c r^γ

where c and γ are positive constants. Sometimes the equation is written as s = c(r + ε)^γ to account for an offset (that is, a measurable output when the input is zero). However, offsets typically are an issue of display calibration and as a result they are normally ignored. Plots of s versus r for various values of γ are shown in the corresponding figure. As in the case of the log transformation, power-law curves with fractional values of γ map a narrow range of dark input values into a wider range of output values, with the opposite being true for higher values of input levels. Unlike the log function, however, we notice here a family of possible transformation curves obtained simply by varying γ.
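A minimal NumPy sketch of the log and power-law (gamma) transformations follows. It assumes an 8-bit grayscale array img; the choice of c and γ, and the normalization to [0, 1] before applying the power, are illustrative assumptions.

import numpy as np

def log_transform(img, c=None):
    # s = c * log(1 + r); here c is chosen so that the output spans [0, 255]
    r = img.astype(np.float64)
    if c is None:
        c = 255.0 / np.log(1.0 + r.max())
    return (c * np.log(1.0 + r)).astype(np.uint8)

def gamma_correct(img, gamma=0.5, c=1.0):
    # s = c * r**gamma, with r first normalized to [0, 1]
    r = img.astype(np.float64) / 255.0
    s = c * np.power(r, gamma)
    return np.clip(255.0 * s, 0, 255).astype(np.uint8)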

A variety of devices used for image capture, printing, and display respond according to a power law. By convention, the exponent in the power-law equation is referred to as gamma. The process used to correct these power-law response phenomena is called gamma correction. In addition to gamma correction, power-law transformations are useful for general-purpose contrast manipulation.

ARITHMETIC & LOGICAL OPERATIONS

Arithmetic/logic operations involving images are performed on a pixel-by-pixel basis between two or more images (this excludes the logic operation NOT, which is performed on a single image). As an example, subtraction of two images results in a new image whose pixel at coordinates (x, y) is the difference between the pixels in that same location in the two images being subtracted.

When dealing with logic operations on gray-scale images, pixel values are processed as strings of binary numbers. For example, performing the NOT operation on a black, 8-bit pixel (a string of eight 0s) produces a white pixel (a string of eight 1s). Intermediate values are processed the same way, changing all 1s to 0s and vice versa. Thus, the NOT logic operator performs the same function as the negative transformation.

The AND and OR operations are used for masking; that is, for selecting sub-images in an image. In the AND and OR image masks, light represents a binary 1 and dark represents a binary 0. Masking sometimes is referred to as region of interest (ROI) processing. In terms of enhancement, masking is used primarily to isolate an area for processing. This is done to highlight that area and differentiate it from the rest of the image. Of the four arithmetic operations, subtraction and addition (in that order) are the most useful for image enhancement.

IMAGE SUBTRACTION

The difference between two images f(x, y) and h(x, y), expressed as

g(x, y) = f(x, y) - h(x, y)

is obtained by computing the difference between all pairs of corresponding pixels from f and h. The key usefulness of subtraction is the enhancement of differences between images. In practice, most images are displayed using 8 bits (even 24-bit color images consist of three separate 8-bit channels). Thus, we expect image values not to be outside the range from 0 to 255. The values in a difference image can range from a minimum of -255 to a maximum of 255, so some sort of scaling is required to display the results.

There are two principal ways to scale a difference image. One method is to add 255 to every pixel and then divide by 2. It is not guaranteed that the values will cover the entire 8-bit range from 0 to 255, but all pixel values definitely will be within this range. This method is fast and simple to implement, but it has the limitations that the full range of the display may not be utilized and, potentially more serious, the truncation inherent in the division by 2 will generally cause loss in accuracy.
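A minimal NumPy sketch of this first scaling method, assuming f and h are 8-bit grayscale arrays of the same shape (the function name is illustrative):

import numpy as np

def scale_difference_simple(f, h):
    # Difference values lie in [-255, 255]; add 255 and halve to bring them into [0, 255].
    # Integer division models the truncation mentioned in the text.
    d = f.astype(np.int16) - h.astype(np.int16)
    return ((d + 255) // 2).astype(np.uint8)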

If more accuracy and full coverage of the 8-bit range are desired, then we can resort to another approach. First, the value of the minimum difference is obtained and its negative is added to all the pixels in the difference image (this creates a modified difference image whose minimum value is 0). Then, all the pixels in the image are scaled to the interval [0, 255] by multiplying each pixel by the quantity 255/Max, where Max is the maximum pixel value in the modified difference image. This approach is considerably more complex and difficult to implement.

Application: One of the most commercially successful and beneficial uses of image subtraction is in the area of medical imaging called mask mode radiography. In this case h(x, y), the mask, is an X-ray image of a region of a patient's body captured by an intensified TV camera (instead of traditional X-ray film) located opposite an X-ray source. The procedure consists of injecting a contrast medium into the patient's bloodstream, taking a series of images of the same anatomical region as h(x, y), and subtracting this mask from the series of incoming images after injection of the contrast medium. The net effect of subtracting the mask from each sample in the incoming stream of TV images is that the areas that are different between f(x, y) and h(x, y) appear in the output image as enhanced detail. Because images can be captured at TV rates, this procedure in essence gives a movie showing how the contrast medium propagates through the various arteries in the area being observed.

IMAGE AVERAGING

Consider a noisy image g(x, y) formed by the addition of noise η(x, y) to an original image f(x, y); that is,

g(x, y) = f(x, y) + η(x, y)

where the assumption is that at every pair of coordinates (x, y) the noise is uncorrelated and has zero average value. The objective of the following procedure is to reduce the noise content by adding a set of noisy images, {g_i(x, y)}. If the noise satisfies the constraints just stated, it can be shown that if an image ḡ(x, y) is formed by averaging K different noisy images,

ḡ(x, y) = (1/K) Σ g_i(x, y),  with the sum taken over i = 1, 2, ..., K.

As K increases, Eqs. (3.4-5) and (3.4-6) indicate that the variability (noise) of the pixel values at each location (x, y) decreases. Because E{ḡ(x, y)} = f(x, y), this means that ḡ(x, y) approaches f(x, y) as the number of noisy images used in the averaging process increases. An important application of image averaging is in the field of astronomy, where imaging with very low light levels is routine, causing sensor noise frequently to render single images virtually useless for analysis.

HISTOGRAM PROCESSING

The histogram of a digital image with gray levels in the range [0, L-1] is a discrete function h(r_k) = n_k, where r_k is the kth gray level and n_k is the number of pixels in the image having gray level r_k. It is common practice to normalize a histogram by dividing each of its values by the total number of pixels in the image, denoted by n. Thus, a normalized histogram is given by p(r_k) = n_k/n, for k = 0, 1, ..., L-1. Loosely speaking, p(r_k) gives an estimate of the probability of occurrence of gray level r_k. Note that the sum of all components of a normalized histogram is equal to 1. The horizontal axis of each histogram plot corresponds to gray level values, r_k. The vertical axis corresponds to values of h(r_k) = n_k, or p(r_k) = n_k/n if the values are normalized. Histograms are the basis for numerous spatial domain processing techniques, and histogram manipulation can be used effectively for image enhancement.

For a dark image the components of the histogram are concentrated on the low (dark) side of the gray scale. Similarly, the components of the histogram of a bright image are biased toward the high side of the gray scale. An image with low contrast has a histogram that will be narrow and will be centered toward the middle of the gray scale; for a monochrome image this implies a dull, washed-out gray look. Finally, the components of the histogram of a high-contrast image cover a broad range of the gray scale and, further, the distribution of pixels is not too far from uniform, with very few vertical lines being much higher than the others. Intuitively, it is reasonable to conclude that an image whose pixels tend to occupy the entire range of possible gray levels and, in addition, tend to be distributed uniformly, will have an appearance of high contrast and will exhibit a large variety of gray tones. The net effect will be an image that shows a great deal of gray-level detail and has high dynamic range.

Histogram Equalization (Histogram Linearization)

Consider for a moment continuous functions, and let the variable r represent the gray levels of the image to be enhanced. Assume that r has been normalized to the interval [0, 1], with r = 0 representing black and r = 1 representing white. Later, we consider a discrete formulation and allow pixel values to be in the interval [0, L-1].
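Before developing histogram equalization, the normalized histogram p(r_k) = n_k/n defined above can be computed directly; a minimal NumPy sketch, assuming an 8-bit grayscale array img (the names are illustrative):

import numpy as np

def normalized_histogram(img, L=256):
    # n_k: number of pixels at each gray level r_k = 0, 1, ..., L-1
    n_k = np.bincount(img.ravel(), minlength=L).astype(np.float64)
    p_rk = n_k / img.size          # divide by the total number of pixels n
    return p_rk                    # the components sum to 1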

For any r satisfying the aforementioned conditions, we focus attention on transformations of the form

s = T(r)

that produce a level s for every pixel value r in the original image. Assume that the transformation function T(r) satisfies the following conditions: (a) T(r) is single-valued and monotonically increasing in the interval 0 ≤ r ≤ 1; and (b) 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1. The requirement in condition (a) that T(r) be single-valued is needed to guarantee that the inverse transformation will exist, and the monotonicity condition preserves the increasing order from black to white in the output image. A transformation function that is not monotonically increasing could result in at least a section of the intensity range being inverted, thus producing some inverted gray levels in the output image. Finally, condition (b) guarantees that the output gray levels will be in the same range as the input levels. Figure 3.16 gives an example of a transformation function that satisfies these two conditions. The inverse transformation from s back to r is denoted r = T⁻¹(s), 0 ≤ s ≤ 1.

The gray levels in an image may be viewed as random variables in the interval [0, 1]. One of the most fundamental descriptors of a random variable is its probability density function (PDF). Let p_r(r) and p_s(s) denote the probability density functions of random variables r and s, respectively, where the subscripts on p are used to denote that p_r and p_s are different functions. A basic result from elementary probability theory is that, if p_r(r) and T(r) are known and T(r) satisfies condition (a), then the probability density function p_s(s) of the transformed variable s can be obtained using a rather simple formula:

p_s(s) = p_r(r) |dr/ds|

Thus, the probability density function of the transformed variable, s, is determined by the gray-level PDF of the input image and by the chosen transformation function. A transformation function of particular importance in image processing has the form

s = T(r) = ∫₀ʳ p_r(w) dw

where w is a dummy variable of integration.

39 Given transformation function T(r),we find ps(s) by applying Eq. (3.3-3).We know from basic calculus (Leibniz s rule) that the derivative of a definite integral with respect to its upper limit is simply the integrand evaluated at that limit. In other words, Substituting this result for dr/ds into Eq. (3.3-3), and keeping in mind that all probability values are positive, yields Because p s (s) is a probability density function, it follows that it must be zero outside the interval [0, 1] in this case because its integral over all values of s must equal 1.We recognize the form of p s (s) given in Eq. (3.3-6) as a uniform probability density function. Simply stated, we have demonstrated that performing the transformation function given in Eq. (3.3-4) yields a random variable s characterized by a uniform probability density function. It is important to note from Eq. (3.3-4) that T(r) depends on p r (r), but, as indicated by Eq. (3.3-6), the resulting p s (s) always is uniform, independent of the form of p r (r). For discrete values we deal with probabilities and summations instead of probability density functions and integrals. The probability of occurrence of gray level r k in an image is approximated by where, n is the total number of pixels in the image, n k is the number of pixels that have gray level r k, and L is the total number of possible gray levels in the image. The discrete version of the transformation function given in Eq. (3.3-4) is Thus, a processed (output) image is obtained by mapping each pixel with level r k in the input image into a corresponding pixel with level s k in the output image via Eq. (3.3-8). As indicated earlier, a plot of p r (r k ) versus r k is called a histogram. The transformation (mapping) given in Eq. (3.3-8) is called histogram equalization or histogram linearization. In general that this discrete transformation will produce the discrete equivalent of a uniform probability density function, which would be a uniform histogram. However, as will be 39 Department of ECE

40 seen shortly, use of Eq. (3.3-8) does have the general tendency of spreading the histogram of the input image so that the levels of the histogram-equalized image will span a fuller range of the gray scale. In addition to producing gray levels that have this tendency, the method just derived has the additional advantage that it is fully automatic. In other words, given an image, the process of histogram equalization consists simply of implementing Eq. (3.3-8), which is based on information that can be extracted directly from the given image, without the need for further parameter specifications. Histogram Matching (Histogram Specification) Histogram equalization automatically determines a transformation function that seeks to produce an output image that has a uniform histogram. When automatic enhancement is desired, this is a good approach because the results from this technique are predictable and the method is simple to implement. But there are applications in which attempting to base enhancement on a uniform histogram is not the best approach. In particular, it is useful sometimes to be able to specify the shape of the histogram that we wish the processed image to have. The method used to generate a processed image that has a specified histogram is called histogram matching or histogram specification. Development of the method Let r and z denote continuous gray levels, and let p r (r) and p z (z) denote their corresponding continuous probability density functions. In this notation, r and z denote the gray levels of the input and output (processed) images, respectively. We can estimate p r (r) from the given input image, while p z (z) is the specified probability density function that we wish the output image to have. Let s be a random variable with the property where w is a dummy variable of integration. We recognize this expression as the continuous version of histogram equalization given in Eq. (3.3-4). Suppose next that we define a random variable z with the property. where t is a dummy variable of integration. It then follows from these two equations that G(z)=T(r) and, therefore, that z must satisfy the condition The transformation T(r) can be obtained from Eq. (3.3-10) once p r (r) has been estimated from the input image. Similarly, the transformation function G(z) can be obtained using Eq. (3.3-11) because p z (z) is given Department of ECE
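Before turning to the discrete formulation of histogram matching, the discrete equalization mapping referred to above as Eq. (3.3-8), s_k = T(r_k) = Σ_{j=0..k} n_j/n, can be sketched in a few lines of NumPy. The sketch assumes an 8-bit grayscale array img and scales the result back to [0, L-1]; names are illustrative.

import numpy as np

def histogram_equalize(img, L=256):
    # p_r(r_k) = n_k / n for each gray level r_k
    p_r = np.bincount(img.ravel(), minlength=L) / img.size
    # s_k = T(r_k) = sum_{j=0..k} p_r(r_j), scaled back to the range [0, L-1]
    cdf = np.cumsum(p_r)
    s = np.round((L - 1) * cdf).astype(np.uint8)
    return s[img]                  # map each input level r_k to its output level s_k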

41 The discrete formulation of above equations is,. (1) where n is the total number of pixels in the image, nj is the number of pixels with gray level r j, and L is the number of discrete gray levels. Similarly, the discrete formulation of Eq. (3.3-11) is obtained from the given histogram p z (z i ),i=0, 1, 2,,L-1, and has the form (2) (3).(4) Equations (1) to (3) are the foundation for implementing histogram matching for digital images. Equation (1) is a mapping from the levels in the original image into corresponding levels s k based on the histogram of the original image, which we compute from the pixels in the image. Equation (2) computes a transformation function G from the given histogram p z (z). Finally, Eq. (3) or its equivalent, Eq. (4), gives us (an approximation of) the desired levels of the image with that histogram. The above equations show that an image with a specified probability density function can be obtained from an input image by using the following procedure: (1) Obtain the transformation function T(r) using Eq. (1). (2) Use Eq. (2) to obtain the transformation function G(z). (3) Obtain the inverse transformation function G 1. (4) Obtain the output image by applying Eq. (3) to all the pixels in the input image. The result of this procedure will be an image whose gray levels, z, have the specified probability density function p z (z). Local Enhancement The histogram processing methods discussed in the previous two sections are global, in the sense that pixels are modified by a transformation function based on the gray-level content of an entire image. Although this global approach is suitable for overall enhancement, there are cases in which it is necessary to enhance details over small areas in an image. The number of pixels in these areas may have negligible influence on the computation of a global transformation whose shape does not necessarily guarantee the desired local enhancement. The solution is to devise transformation functions based on the gray-level distribution or other properties in the neighborhood of every pixel in the image. The histogram processing techniques previously described are easily adaptable to local enhancement. The procedure is to define a square or rectangular neighborhood and move the center of this area from pixel to pixel. At each location, the histogram of the points in the 41 Department of ECE

42 neighborhood is computed and either a histogram equalization or histogram specification transformation function is obtained. This function is finally used to map the gray level of the pixel centered in the neighborhood. The center of the neighborhood region is then moved to an adjacent pixel location and the procedure is repeated. Since only one new row or column of the neighborhood changes during a pixel-to-pixel translation of the region, updating the histogram obtained in the previous location with the new data introduced at each motion step is possible. This approach has obvious advantages over repeatedly computing the histogram over all pixels in the neighborhood region each time the region is moved one pixel location. Another approach used some times to reduce computation is to utilize non-overlapping regions, but this method usually produces an undesirable checkerboard effect. Figure 3.23(a) shows an image that has been slightly blurred to reduce its noise content (see Section regarding blurring).figure 3.23(b) shows the result of global histogram equalization. As is often the case when this technique is applied to smooth, noisy areas, Fig. 3.23(b) shows considerable enhancement of the noise, with a slight increase in contrast. Note that no new structural details were brought out by this method. However, local histogram equalization using a 7*7 neighborhood revealed the presence of small squares inside the larger dark squares.the small squares were too close in gray level to the larger ones, and their sizes were too small to influence global histogram equalization significantly. Note also the finer noise texture in Fig. 3.23(c), a result of local processing using relatively small neighborhoods. SPATIAL FILTERING Some neighborhood operations work with the values of the image pixels in the neighborhood and the corresponding values of a sub-image that has the same dimensions as the neighborhood. The sub-image is called a filter, mask, kernel, template, or window. The values in a filter sub-image are referred to as coefficients, rather than pixels. The mechanics of spatial filtering are illustrated in Fig The process consists simply of moving the filter mask from point to point in an image. At each point (x, y), the response of the filter at that point is calculated using a predefined relationship. For linear spatial filtering, 42 Department of ECE

43 the response is given by a sum of products of the filter coefficients and the corresponding image pixels in the area spanned by the filter mask. For the 3x3 mask shown in Fig. 3.32, the result (or response), R, of linear filtering with the filter mask at a point (x, y) in the image is In general, linear filtering of an image f of size MxN with a filter mask of size mxn is given by the expression: where, a=(m-1)/2 and b=(n-1)/2. To generate a complete filtered image this equation must be applied for x=0, 1, 2,, M-1 and y=0, 1, 2,, N-1. The process of linear filtering given in above equation is similar to a frequency domain concept called convolution. For this reason, linear spatial filtering often is referred to as convolving a mask with an image. Similarly, filter masks are sometimes called convolution masks. The term convolution kernel also is in common use. When interest lies on the response, R, of an mxn mask at any point (x, y), and not on the mechanics of implementing mask convolution, it is common practice to simplify the notation by using the following expression: 43 Department of ECE

44 where the w s are mask coefficients, the z s are the values of the image gray levels corresponding to those coefficients, and mn is the total number of coefficients in the mask. For the 3x3 general mask shown in figure below, the response at any point (x, y) in the image is given by An important consideration in implementing neighborhood operations for spatial filtering is the issue of what happens when the center of the filter approaches the border of the image. Consider for simplicity a square mask of size nxn. At least one edge of such a mask will coincide with the border of the image when the center of the mask is at a distance of (n-1)/2 pixels away from the border of the image. If the center of the mask moves any closer to the border, one or more rows or columns of the mask will be located outside the image plane. There are several ways to handle this situation. The simplest is to limit the excursions of the center of the mask to be at a distance no less than (n-1)/2 pixels from the border. The resulting filtered image will be smaller than the original, but all the pixels in the filtered imaged will have been processed with the full mask. If the result is required to be the same size as the original, then the approach typically employed is to filter all pixels only with the section of the mask that is fully contained in the image. With this approach, there will be bands of pixels near the border that will have been processed with a partial filter mask. Other approaches include padding the image by adding rows and columns of 0 s (or other constant gray level), or padding by replicating rows or columns. The padding is then stripped off at the end of the process. This keeps the size of the filtered image the same as the original, but the values of the padding will have an effect near the edges that becomes more prevalent as the size of the mask increases. The only way to obtain a perfectly filtered result is to accept a somewhat smaller filtered image by limiting the excursions of the center of the filter mask to a distance no less than (n-1)/2 pixels from the border of the original image Department of ECE
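A minimal sketch of linear spatial filtering of an image f with an m x n mask w, using zero padding at the borders as one of the options described above (pure NumPy; the function name is illustrative and the double loop is written for clarity rather than speed):

import numpy as np

def linear_filter(f, w):
    # f: 2-D grayscale image; w: m x n mask of filter coefficients (m and n odd)
    m, n = w.shape
    a, b = (m - 1) // 2, (n - 1) // 2
    padded = np.pad(f.astype(np.float64), ((a, a), (b, b)), mode='constant')  # pad with 0s
    g = np.zeros_like(f, dtype=np.float64)
    M, N = f.shape
    for x in range(M):
        for y in range(N):
            # Sum of products of the mask coefficients and the image pixels they span
            g[x, y] = np.sum(w * padded[x:x + m, y:y + n])
    return g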

45 SMOOTHING SPATIAL FILTERS Smoothing filters are used for blurring and for noise reduction. Blurring is used in preprocessing steps, such as removal of small details from an image prior to (large) object extraction, and bridging of small gaps in lines or curves. Noise reduction can be accomplished by blurring with a linear filter and also by nonlinear filtering. Smoothing Linear Filters The output (response) of a smoothing, linear spatial filter is simply the average of the pixels contained in the neighborhood of the filter mask. These filters sometimes are called averaging filters. They also are referred to a lowpass filters. The idea behind smoothing filters is straightforward. By replacing the value of every pixel in an image by the average of the gray levels in the neighborhood defined by the filter mask, this process results in an image with reduced sharp transitions in gray levels. Because random noise typically consists of sharp transitions in gray levels, the most obvious application of smoothing is noise reduction. However, edges (which almost always are desirable features of an image) also are characterized by sharp transitions in gray levels, so averaging filters have the undesirable side effect that they blur edges. Another application of this type of process includes the smoothing of false contours that result from using an insufficient number of gray levels. A major use of averaging filters is in the reduction of irrelevant detail in an image. By irrelevant we mean pixel regions that are small with respect to the size of the filter mask. Above figure shows two 3x3 smoothing filters. Use of the first filter yields the standard average of the pixels under the mask. The response of the first filter is given by which is the average of the gray levels of the pixels in the 3x3 neighborhood defined by the mask. A spatial averaging filter in which all coefficients are equal is sometimes called a box filter. The second mask shown in above figure is a little more interesting. This mask yields a so-called weighted average, terminology used to indicate that pixels are multiplied by different coefficients, thus giving more importance (weight) to some pixels at the expense of others. In the second mask shown, the pixel at the center of the mask is multiplied by a higher value than any other, thus giving this pixel more importance in the calculation of the average. The other pixels are inversely weighted as a function of their distance from the center of the mask. The diagonal terms are further away from the center than the orthogonal neighbors (by a factor of) and, thus, 45 Department of ECE

46 are weighed less than these immediate neighbors of the center pixel. The basic strategy behind weighing the center point the highest and then reducing the value of the coefficients as a function of increasing distance from the origin is simply an attempt to reduce blurring in the smoothing process. In practice, it is difficult in general to see differences between images smoothed by using either of the masks in above figure, or similar arrangements, because the area these masks span at any one location in an image is so small. The general implementation for filtering an MxN image with a weighted averaging filter of size mxn (m and n odd) is given by the expression. The denominator in above equation is simply the sum of the mask coefficients and, therefore, it is a constant that needs to be computed only once. The effects of smoothing as a function of filter size are illustrated in Fig. 3.35, which shows an original image and the corresponding smoothed results obtained using square averaging filters of sizes n=3, 5, 9, 15, and 35 pixels, respectively. The principal features of these results are as follows: For n=3, we note a general slight blurring throughout the entire image but, as expected, details that are of approximately the same size as the filter mask are affected considerably more. For example, the 3*3 and 5*5 squares, the small letter a, and the fine grain noise show significant blurring when compared to the rest of the image. A positive result is that the noise is less pronounced. Note that the jagged borders of the characters and gray circles have been pleasingly smoothed. The result for n=5 is somewhat similar, with a slight further increase in blurring. For n=9 we see considerably more blurring, and the 20% black circle is not nearly as distinct from the background as in the previous three images, illustrating the blending effect that blurring has on objects whose gray level content is close to that of its neighboring pixels. Note the significant further smoothing of the noisy rectangles. The results for n=15 and 35 are extreme with respect to the sizes of the objects in the image. This type of excessive blurring is generally used to eliminate small objects from an image. For instance, the three small squares, two of the circles, and most of the noisy rectangle areas have been blended into the background of the image in Fig. 3.35(f). Note also in this figure the pronounced black border. This is a result of padding the border of the original image with 0 s (black) and then trimming off the padded area. Some of the black was blended into all filtered images, but became truly objectionable for the images smoothed with the larger filters Department of ECE
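A minimal sketch of the two 3x3 smoothing masks discussed above, the box filter and the weighted-average mask, applied with scipy.ndimage (the mask values are the standard averaging masks; names and the border mode are illustrative):

import numpy as np
from scipy.ndimage import convolve

# 3x3 averaging (box) mask: every coefficient is 1/9
box_mask = np.ones((3, 3)) / 9.0

# Weighted-average mask: the center pixel is given the most importance
weighted_mask = np.array([[1, 2, 1],
                          [2, 4, 2],
                          [1, 2, 1]], dtype=np.float64) / 16.0   # divide by the sum of the coefficients

def smooth(img, mask):
    return convolve(img.astype(np.float64), mask, mode='nearest')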

47 An important application of spatial averaging is to blur an image for the purpose getting a gross representation of objects of interest, such that the intensity of smaller objects blends with the background and larger objects become blob-like and easy to detect. The size of the mask establishes the relative size of the objects that will be blended with the background. Smoothing Non-Linear Filters (Order-Statistics Filters) Order-statistics filters are nonlinear spatial filters whose response is based on ordering (ranking) the pixels contained in the image area encompassed by the filter, and then replacing the value of the center pixel with the value determined by the ranking result. The best-known example in this category is the median filter, which, as its name implies, replaces the value of a pixel by the median of the gray levels in the neighborhood of that pixel (the original value of the pixel is included in the computation of the median). Median filters are quite popular because, for certain types of random noise, they provide excellent noise-reduction capabilities, with considerably less blurring than linear smoothing filters of similar size. Median filters are particularly effective in the presence of impulse noise, also called salt-and-pepper noise because of its appearance as white and black dots superimposed on an image Department of ECE
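A minimal sketch of an n x n median filter, assuming a 2-D grayscale array img (pure NumPy; in practice a library routine such as scipy.ndimage.median_filter could be used instead):

import numpy as np

def median_filter(img, n=3):
    # Replace each pixel by the median of the gray levels in its n x n neighborhood
    a = n // 2
    padded = np.pad(img, a, mode='edge')   # replicate border rows/columns
    out = np.empty_like(img)
    M, N = img.shape
    for x in range(M):
        for y in range(N):
            out[x, y] = np.median(padded[x:x + n, y:y + n])
    return out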

48 The median, ξ, of a set of values is such that half the values in the set are less than or equal to ξ, and half are greater than or equal to ξ. In order to perform median filtering at a point in an image, we first sort the values of the pixel in question and its neighbors, determine their median, and assign this value to that pixel. For example, in a 3*3 neighborhood the median is the 5th largest value, in a 5*5 neighborhood the 13th largest value, and so on. The principal function of median filters is to force points with distinct gray levels to be more like their neighbors. In fact, isolated clusters of pixels that are light or dark with respect to their neighbors, and whose area is less than n 2 /2 (one-half the filter area), are eliminated by an nxn median filter. In this case eliminated means forced to the median intensity of the neighbors. Larger clusters are affected considerably less. Figure 3.37(a) shows an X-ray image of a circuit board heavily corrupted by salt-andpepper noise. To illustrate the point about the superiority of median filtering over average filtering in situations such as this, we show in Fig. 3.37(b) the result of processing the noisy image with a 3*3 neighborhood averaging mask, and in Fig. 3.37(c) the result of using a 3*3 median filter. The image processed with the averaging filter has less visible noise, but the price paid is significant blurring. The superiority in all respects of median over average filtering in this case is quite evident. In general, median filtering is much better suited than averaging for the removal of additive salt-and-pepper noise. SHARPENING SPATIAL FILTERS The principal objective of sharpening is to highlight fine detail in an image or to enhance detail that has been blurred, either in error or as a natural effect of a particular method of image acquisition. Uses of image sharpening vary and include applications ranging from electronic printing and medical imaging to industrial inspection and autonomous guidance in military systems. We saw that image blurring could be accomplished in the spatial domain by pixel averaging in a neighborhood. Since averaging is analogous to integration, it is logical to conclude that sharpening could be accomplished by spatial differentiation. Fundamentally, the strength of the response of a derivative operator is proportional to the degree of discontinuity of the image at the point at which the operator is applied. Thus, image differentiation enhances 48 Department of ECE

49 edges and other discontinuities (such as noise) and deemphasizes areas with slowly varying graylevel values. We consider in some detail sharpening filters that are based on first- and secondorder derivatives, respectively. Before proceeding with that discussion, however, we stop to look at some of the fundamental properties of these derivatives in a digital context. To simplify the explanation, we focus attention on one-dimensional derivatives. In particular, we are interested in the behavior of these derivatives in areas of constant gray level (flat segments), at the onset and end of discontinuities (step and ramp discontinuities), and along gray-level ramps. These types of discontinuities can be used to model noise points, lines, and edges in an image. The behavior of derivatives during transitions into and out of these image features also is of interest. The derivatives of a digital function are defined in terms of differences. There are various ways to define these differences. However, we require that any definition we use for a first derivative (1) must be zero in flat segments (areas of constant gray-level values); (2) must be nonzero at the onset of a gray-level step or ramp; and (3) must be nonzero along ramps. Similarly, any definition of a second derivative (1) must be zero in flat areas; (2) must be nonzero at the onset and end of a gray-level step or ramp; and (3) must be zero along ramps of constant slope. Since we are dealing with digital quantities whose values are finite, the maximum possible graylevel change also is finite, and the shortest distance over which that change can occur is between adjacent pixels. A basic definition of the first-order derivative of a one-dimensional function f(x) is the difference We used a partial derivative here in order to keep the notation the same as when we consider an image function of two variables, f(x, y), at which time we will be dealing with partial derivatives along the two spatial axes. Similarly, we define a second-order derivative as the difference It is easily verified that these two definitions satisfy the conditions stated previously regarding derivatives of the first and second order. To see this, and also to highlight the fundamental similarities and differences between first- and second- order derivatives in the context of image processing, consider the example shown below: Figure 3.38(a) shows a simple image that contains various solid objects, a line, and a single noise point. Figure 3.38(b) shows a horizontal gray-level profile (scan line) of the image along the center and including the noise point. This profile is the one-dimensional function we will use for illustrations regarding this figure. Figure 3.38(c) shows a simplification of the profile, with just enough numbers to make it possible for us to analyze how the first- and secondorder derivatives behave as they encounter a noise point, a line, and then the edge of an object. In 49 Department of ECE

50 our simplified diagram the transition in the ramp spans four pixels, the noise point is a single pixel, the line is three pixels thick, and the transition into the gray-level step takes place between adjacent pixels. The number of gray levels was simplified to only eight levels. Let us consider the properties of the first and second derivatives as we traverse the profile from left to right. First, we note that the first-order derivative is nonzero along the entire ramp, while the second-order derivative is nonzero only at the onset and end of the ramp. Because edges in an image resemble this type of transition, we conclude that first-order derivatives produce thick edges and second-order derivatives, much finer ones. Next we encounter the isolated noise point. Here, the response at and around the point is much stronger for the secondthan for the first-order derivative. Of course, this is not unexpected. A second-order derivative is much more aggressive than a first-order derivative in enhancing sharp changes. Thus, we can expect a second-order derivative to enhance fine detail (including noise) much more than a firstorder derivative. The thin line is a fine detail, and we see essentially the same difference between the two derivatives. If the maximum gray level of the line had been the same as the isolated point, the response of the second derivative would have been stronger for the latter. Finally, in this case, the response of the two derivatives is the same at the gray-level step (in most cases when the transition into a step is not from zero, the second derivative will be weaker).we also note that the second derivative has a transition from positive back to negative. In an image, this shows as a thin double line. This double-edge effect is an issue that will be important, where we use derivatives for edge detection. It is of interest also to note that if the gray level of the thin line had been the same as the step, the response of the second derivative would have been stronger for the line than for the step Department of ECE

In summary, comparing the response between first- and second-order derivatives, we arrive at the following conclusions: (1) First-order derivatives generally produce thicker edges in an image. (2) Second-order derivatives have a stronger response to fine detail, such as thin lines and isolated points. (3) First-order derivatives generally have a stronger response to a gray-level step. (4) Second-order derivatives produce a double response at step changes in gray level. We also note of second-order derivatives that, for similar changes in gray-level values in an image, their response is stronger to a line than to a step, and to a point than to a line. In most applications, the second derivative is better suited than the first derivative for image enhancement because of its ability to enhance fine detail.

Image Enhancement using Second Derivatives - The Laplacian

The approach basically consists of defining a discrete formulation of the second-order derivative and then constructing a filter mask based on that formulation. We are interested in isotropic filters, whose response is independent of the direction of the discontinuities in the image to which the filter is applied. In other words, isotropic filters are rotation invariant, in the sense that rotating the image and then applying the filter gives the same result as applying the filter to the image first and then rotating the result.

Development of the method: It can be shown that the simplest isotropic derivative operator is the Laplacian, which, for a function (image) f(x, y) of two variables, is defined as

∇²f = ∂²f/∂x² + ∂²f/∂y²

Because derivatives of any order are linear operations, the Laplacian is a linear operator. In order to be useful for digital image processing, this equation needs to be expressed in discrete form. There are several ways to define a digital Laplacian using neighborhoods. The partial second-order derivative in the x-direction is

∂²f/∂x² = f(x+1, y) + f(x-1, y) - 2f(x, y)

and similarly for the y-direction. The digital implementation of the two-dimensional Laplacian is obtained by summing these two components:

∇²f = [f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1)] - 4f(x, y)

This equation can be implemented using the mask shown in Fig. 3.39(a), which gives an isotropic result for rotations in increments of 90°.
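A minimal sketch of the 4-neighbor Laplacian mask of Fig. 3.39(a) and its application with scipy.ndimage (the helper name and border mode are illustrative):

import numpy as np
from scipy.ndimage import convolve

# 4-neighbor Laplacian mask; isotropic for rotations in increments of 90 degrees
laplacian_mask = np.array([[0,  1, 0],
                           [1, -4, 1],
                           [0,  1, 0]], dtype=np.float64)

def laplacian(img):
    # The mask is symmetric, so correlation and convolution give the same result
    return convolve(img.astype(np.float64), laplacian_mask, mode='nearest')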

52 The diagonal directions can be incorporated in the definition of the digital Laplacian by adding two more terms to Eq (3.7-4), one for each of the two diagonal directions. The form of each new term is the same as either Eq. (3.7-2). or (3.7-3), but the coordinates are along the diagonals. Since each diagonal term also contains a 2f(x, y) term, the total subtracted from the difference terms now would be 8f(x, y). The mask used to implement this new definition is shown in Fig. 3.39(b). This mask yields isotropic results for increments of 45. The other two masks shown in Fig also are used frequently in practice. They are based on a definition of the Laplacian that is the negative of the one we used here. As such, they yield equivalent results, but the difference in sign must be kept in mind when combining (by addition or subtraction) a Laplacian-filtered image with another image. Because the Laplacian is a derivative operator, its use highlights gray-level discontinuities in an image and deemphasizes regions with slowly varying gray levels. This will tend to produce images that have grayish edge lines and other discontinuities, all superimposed on a dark, featureless background. Background features can be recovered while still preserving the sharpening effect of the Laplacian operation simply by adding the original and Laplacian images. If the definition used has a negative center coefficient, then we subtract, rather than add, the Laplacian image to obtain a sharpened result. Thus, the basic way in which we use the Laplacian for image enhancement is as follows: Simplifications: Previously, we implemented Eq. (3.7-5) by first computing the Laplacian-filtered image and then subtracting it from the original image. In practice, Eq. (3.7-5) is usually implemented with one pass of a single mask. The coefficients of the single mask are easily obtained by substituting Eq. (3.7-4) for 2 (x, y )in the first line of Eq. (3.7-5): 52 Department of ECE

53 This equation can be implemented using the mask shown below. The second mask shown in below would be used if the diagonal neighbors also were included in the calculation of the Laplacian. Identical masks would have resulted if we had substituted the negative of Eq. (3.7-4) into the second line of Eq. (3.7-5). The results obtainable with the mask containing the diagonal terms usually are a little sharper than those obtained with the more basic mask of Fig. 3.41(a). This property is illustrated by the Laplacian-filtered images shown in Figs. 3.41(d) and (e), which were obtained by using the masks in Figs. 3.41(a) and (b), respectively. By comparing the filtered images with the original image shown in Fig. 3.41(c), we note that both masks produced effective enhancement, but the result using the mask in Fig. 3.41(b) is visibly sharper Unsharp masking and high-boost filtering A process used for many years in the publishing industry to sharpen images consists of subtracting a blurred version of an image from the image itself. This process, called unsharp masking, is expressed as 53 Department of ECE

54 where f s (x, y) denotes the sharpened image obtained by unsharp masking, and f (x,y) is a blurred version of f(x, y).the origin of unsharp masking is in darkroom photography, where it consists of clamping together a blurred negative to a corresponding positive film and then developing this combination to produce a sharper image. A slight further generalization of unsharp masking is called high-boost filtering. A highboost filtered image, f hb, is defined at any point (x, y) as High-boost filtering can be implemented with one pass using either of the two masks shown in Fig Note that, when A=1, high-boost filtering becomes standard Laplacian sharpening. As the value of A increases past 1, the contribution of the sharpening process becomes less and less important. Eventually, if A is large enough, the high-boost image will be approximately equal to the original image multiplied by a constant. One of the principal applications of boost filtering is when the input image is darker than desired. By varying the boost coefficient, it generally is possible to obtain an overall increase in average gray level of the image, thus helping to brighten the final result Department of ECE
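A minimal sketch of unsharp masking and high-boost filtering based on the definitions above, f_s(x, y) = f(x, y) - f̄(x, y) and f_hb(x, y) = A·f(x, y) - f̄(x, y). It assumes a grayscale float array img and uses a simple 3x3 average as the blurred version f̄(x, y); the blur choice and the value of A are illustrative assumptions.

import numpy as np
from scipy.ndimage import uniform_filter

def unsharp_mask(img):
    # f_s(x, y) = f(x, y) - blurred f(x, y)
    blurred = uniform_filter(img.astype(np.float64), size=3)
    return img - blurred

def high_boost(img, A=1.5):
    # f_hb(x, y) = A * f(x, y) - blurred f(x, y); larger A brightens the result overall
    blurred = uniform_filter(img.astype(np.float64), size=3)
    return A * img.astype(np.float64) - blurred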

55 Image Enhancement using First Derivatives The Gradient First derivatives in image processing are implemented using the magnitude of the gradient. For a function f(x, y), the gradient of f at coordinates (x, y) is defined as the twodimensional column vector. The components of the gradient vector itself are linear operators, but the magnitude of this vector obviously is not because of the squaring and square root operations. On the other hand, the partial derivatives in Eq. (3.7-12) are not rotation invariant (isotropic), but the magnitude of the gradient vector is. Although it is not strictly correct, the magnitude of the gradient vector often is referred to as the gradient. The computational burden of implementing Eq. (3.7-13) over an entire image is not trivial, and it is common practice to approximate the magnitude of the gradient by using absolute values instead of squares and square roots: This equation is simpler to compute and it still preserves relative changes in gray levels, but the isotropic feature property is lost in general. However, as in the case of the Laplacian, the isotropic properties of the digital gradient defined are preserved only for a limited number of rotational increments that depend on the masks used to approximate the derivatives. As it turns out, the most popular masks used to approximate the gradient give the same result only for vertical and horizontal edges and thus the isotropic properties of the gradient are preserved only for multiples of 90. These results are independent of whether Eq. (3.7-13) or (3.7-14) is used, so nothing of significance is lost in using the simpler of the two equations. As in the case of the Laplacian, we now define digital approximations to the preceding equations, and from there formulate the appropriate filter masks. In order to simplify the discussion that follows, we will use the notation in Fig. 3.44(a) to denote image points in a 3x3 region. For example, the center point, z 5, denotes f(x, y), z 1 denotes f(x-1, y-1), and so on. Roberts cross-gradient operators The simplest approximations to a first-order derivative that satisfy the conditions stated in that section are G x =(z 8 -z 5 ) and Gy=(z 6 -z 5 ). Two other definitions proposed by Roberts in the early development of digital image processing use cross differences: 55 Department of ECE

56 This equation can be implemented with the two masks shown below.these masks are referred to as the Roberts cross-gradient operators. Sobel Operators Masks of even size are awkward to implement. The smallest filter mask in which we are interested is of size 3x3. An approximation using absolute values, still at point z 5, but using a 3x3 mask, is The difference between the third and first rows of the 3x3 image region approximates the derivative in the x-direction, and the difference between the third and first columns approximates the derivative in the y-direction. The masks shown above are called the Sobel operators, can be used to implement Eq. (3.7-18). The idea behind using a weight value of 2 is to achieve some smoothing by giving more importance to the center point. Note that the coefficients in all the masks shown above sum to 0, indicating that they would give a response of 0 in an area of constant gray level, as expected of a derivative operator. The gradient is used frequently in industrial inspection, either to aid humans in the detection of defects or, what is more common, as a preprocessing step in automated inspection. The gradient can be used to enhance defects and eliminate slowly changing background features Department of ECE
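A minimal sketch of the Sobel gradient magnitude using the absolute-value approximation |Gx| + |Gy| discussed above (the mask coefficients are the standard Sobel values; names and the border mode are illustrative):

import numpy as np
from scipy.ndimage import convolve

# Standard Sobel masks: derivative along x (third row minus first row) and along y (third column minus first column)
sobel_x = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]], dtype=np.float64)
sobel_y = sobel_x.T

def gradient_magnitude(img):
    f = img.astype(np.float64)
    gx = convolve(f, sobel_x, mode='nearest')
    gy = convolve(f, sobel_y, mode='nearest')
    return np.abs(gx) + np.abs(gy)   # approximation to the gradient magnitude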

UNIT-IV IMAGE ENHANCEMENT IN FREQUENCY DOMAIN

The frequency domain is nothing more than the space defined by values of the Fourier transform and its frequency variables (u, v).

Some basic properties of the frequency domain

Each term of F(u,v) contains all values of f(x,y), modified by the values of the exponential terms. Some general statements can be made about the relationship between the frequency components of the Fourier transform and spatial characteristics of an image. For instance, since frequency is directly related to rate of change, we can associate frequencies in the Fourier transform with patterns of intensity variations in an image. The slowest varying frequency component (u = v = 0) corresponds to the average gray level of an image. As we move away from the origin of the transform, the low frequencies correspond to the slowly varying components of an image. In an image of a room, for example, these might correspond to smooth gray-level variations on the walls and floor. As we move further away from the origin, the higher frequencies begin to correspond to faster and faster gray-level changes in the image. These are edges of objects and other components of an image characterized by abrupt changes in gray level, such as noise.

Basics of filtering in the frequency domain

Filtering in the frequency domain consists of the following steps:
1. Multiply the input image f(x,y) by (-1)^(x+y) to center its transform, i.e. to obtain F(u-M/2, v-N/2) (see the figure in the original notes).
2. Compute F(u,v), the discrete Fourier transform of the image from step 1.
3. Multiply F(u,v) by a filter function H(u,v).
4. Compute the inverse discrete Fourier transform of the result in step 3.
5. Obtain the real part of the result in step 4.
6. Multiply the result in step 5 by (-1)^(x+y).

The reason that H(u,v) is called a filter is that it suppresses certain frequencies in the transform while leaving others unchanged. In equation form, let f(x,y) represent the input image

58 in step 1 and F(u,v) its Fourier transform. Then the Fourier transform of the output image is given by In general, the components of F are complex quantities, but the filters which we deal are real. In this case, each component of H multiplies both the real and imaginary parts of the corresponding component in F. Such filters are called Zero-Phase-Shift filters. As their name implies, these filters do not change the phase of the transform. The filtered image is obtained by simply taking the inverse fourier transform of G(u,v): The final image is obtained by taking the real part of thus result and multiplying it by (-1) x+y to cancel the multiplication of the input image by this quantity. In addition in the (-1) x+y process, examples of other pre-processing functions may include cropping of the input image to its closest even dimensions, gray level slicing, conversion to floating point on input, and conversion to an 8-bit integer format on the output. Multiple filtering stages and other pre- and post processing functions are possible. The important point here is that filtering process is based on modifying the transform of an image in some way via a filter function, and then taking the inverse of the result to obtain the processed output image. Some Basic filters and their properties According the following equation, the average value of an image is given by F(0,0). If we set this term to zero in the frequency domain and take the inverse transform, then the average value of the resulting image will be zero. Assuming that the transform has been centered, we can do this operation by multiplying all values of F(u,v) by the following filter function: 58 Department of ECE

The result of processing any image with the above transfer function is a drop in overall average gray level, resulting from forcing the average value to zero. Low frequencies in the Fourier transform are responsible for the general gray-level appearance of an image over smooth areas, while high frequencies are responsible for detail, such as edges and noise. A filter that attenuates high frequencies while passing low frequencies is called a lowpass filter. A lowpass-filtered image has less sharp detail than the original image because the high frequencies have been attenuated; such an image will appear smoother. A filter that attenuates low frequencies while passing high frequencies is called a highpass filter. A highpass-filtered image has fewer gray-level variations in smooth areas and emphasized transitional gray-level detail; such an image will appear sharper.

SMOOTHING FREQUENCY-DOMAIN FILTERS

Edges and other sharp transitions in the gray levels of an image contribute significantly to the high-frequency content of its Fourier transform. Hence smoothing (blurring) is achieved in the frequency domain by attenuating a specified range of high-frequency components in the transform of a given image. Our basic model for filtering in the frequency domain is given by

G(u,v) = H(u,v) . F(u,v)

where F(u,v) is the Fourier transform of the image to be smoothed. The objective is to select a filter transfer function H(u,v) that yields G(u,v) by attenuating the high-frequency components of F(u,v). We consider three types of lowpass filters:
1) Ideal Lowpass Filter (ILPF)
2) Butterworth Lowpass Filter (BLPF)
3) Gaussian Lowpass Filter (GLPF)

Ideal Lowpass Filter

The simplest lowpass filter we can visualize is a filter that cuts off all high-frequency components of the Fourier transform that are at a distance greater than a specified distance D0 from the origin of the (centered) transform. Such a filter is called a two-dimensional ideal lowpass filter, and its transfer function is

H(u,v) = 1  if D(u,v) <= D0
H(u,v) = 0  if D(u,v) > D0

where D0 is a specified non-negative quantity, and D(u,v) is the distance from point (u,v) to the origin of the frequency rectangle. If the image is of size MxN, we know that its transform is of the same size, so the center of the frequency rectangle is at (u,v) = (M/2, N/2). In this case, the distance from any point (u,v) to the center (origin) of the Fourier transform is given by

D(u,v) = [(u - M/2)² + (v - N/2)²]^(1/2)
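A minimal NumPy sketch of the six-step frequency-domain filtering procedure, using the ideal lowpass transfer function above. The image is assumed to be a 2-D grayscale array; the function names and the cutoff value in the usage line are illustrative.

import numpy as np

def ideal_lowpass(shape, d0):
    M, N = shape
    u = np.arange(M).reshape(-1, 1)
    v = np.arange(N).reshape(1, -1)
    d = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)     # D(u, v)
    return (d <= d0).astype(np.float64)

def filter_frequency_domain(img, H):
    M, N = img.shape
    x = np.arange(M).reshape(-1, 1)
    y = np.arange(N).reshape(1, -1)
    centering = (-1.0) ** (x + y)             # step 1: center the transform
    F = np.fft.fft2(img * centering)          # step 2: discrete Fourier transform
    G = H * F                                 # step 3: multiply by the filter function
    g = np.fft.ifft2(G)                       # step 4: inverse DFT
    g = np.real(g)                            # step 5: take the real part
    return g * centering                      # step 6: undo the centering

# Usage with a hypothetical cutoff:
# smoothed = filter_frequency_domain(img.astype(np.float64), ideal_lowpass(img.shape, d0=30))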

The name ideal filter indicates that all frequencies inside a circle of radius D0 are passed with no attenuation, whereas all frequencies outside this circle are completely attenuated. This filter is radially symmetric about the origin, so the complete filter transfer function can be visualized by rotating the cross section about the origin. For an ideal lowpass filter cross section, the point of transition between H(u,v) = 1 and H(u,v) = 0 is called the cutoff frequency, D0.

The lowpass filters can be compared by studying their behavior as a function of the same cutoff frequencies. One way to establish a set of standard cutoff frequency loci is to compute circles that enclose specified amounts of the total image power P_T. This quantity is obtained by summing the components of the power spectrum P(u,v) at each point (u,v), for u = 0, 1, 2, ..., M-1 and v = 0, 1, 2, ..., N-1:

P_T = Σ_u Σ_v P(u,v)

If the transform has been centered, a circle of radius r with origin at the center of the frequency rectangle encloses α percent of the power, where

α = 100 [ Σ P(u,v) / P_T ]

and the summation is taken over the points (u,v) that lie inside or on the circle.

It is clear from this example that ideal lowpass filtering is not very practical. The blurring and ringing properties of the ideal lowpass filter can be explained by reference to the convolution theorem. The Fourier transforms of the original image f(x,y) and the blurred image g(x,y) are related in the frequency domain by the equation G(u,v) = H(u,v)F(u,v), where H(u,v) is the filter function and F and G are the Fourier transforms of the two images just mentioned. The convolution theorem states that the corresponding process in the spatial domain is g(x,y) = h(x,y)*f(x,y), where h(x,y) is the inverse Fourier transform of the filter transfer function H(u,v). This h(x,y) has two major distinctive characteristics: a dominant component at the origin, and concentric, circular components about the center component. The center component is primarily responsible for blurring. The concentric components are responsible primarily for the ringing characteristic of ideal filters. Both the radius of the center component and the number of circles per unit distance from the origin are inversely proportional to the value of the cutoff frequency of the ideal filter. So, as the cutoff frequency increases, the blurring and ringing effects decrease. Butterworth Lowpass Filter The Butterworth filter has a parameter, called the filter order. For high values of this parameter the Butterworth filter approaches the form of the ideal filter. For lower-order values, the Butterworth filter has a smooth form similar to the Gaussian filter. The transfer function of the Butterworth lowpass filter of order n, with cutoff frequency at a distance D0 from the origin, is defined as H(u,v) = 1 / [1 + (D(u,v)/D0)^(2n)]

where D(u,v) is the distance from point (u,v) to the origin of the frequency rectangle. Unlike the ideal lowpass filter, the BLPF transfer function does not have a sharp discontinuity that establishes a clear cutoff between passed and filtered frequencies. For filters with smooth transfer functions, it is customary to define the cutoff frequency locus at points for which H(u,v) is down to a certain fraction of its maximum value. In this case H(u,v) = 0.5 when D(u,v) = D0. A Butterworth-filtered image has a smooth transition in blurring as a function of increasing cutoff frequency. A Butterworth filter of order 1 has no ringing. Ringing generally is imperceptible in filters of order 2, but can become a significant factor in filters of higher order. Spatial representations of BLPFs for n = 1, 2, 5 and 20 respectively are shown below. The BLPF of order 1 has neither ringing nor negative values. The filter of order 2 does show mild ringing and small negative values, but certainly less pronounced than in the ILPF. As the remaining images show, ringing in the BLPF becomes significant for higher-order filters; a Butterworth filter of order 20 exhibits the characteristics of the ILPF. In general, BLPFs of order 2 are a good compromise between effective lowpass filtering and acceptable ringing characteristics. Gaussian Lowpass Filter The transfer function of a two-dimensional Gaussian lowpass filter is given by H(u,v) = exp(-D^2(u,v)/2σ^2), where D(u,v) is the distance from point (u,v) to the origin of the frequency rectangle and σ is a measure of the spread of the Gaussian curve. By letting σ = D0, the transfer function changes to H(u,v) = exp(-D^2(u,v)/2D0^2).
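A short sketch (NumPy; the helper and function names are mine) of the Butterworth and Gaussian lowpass transfer functions just defined:

import numpy as np

def distance_grid(M, N):
    """D(u,v): distance of each frequency point to the center of the rectangle."""
    u = np.arange(M).reshape(-1, 1)
    v = np.arange(N).reshape(1, -1)
    return np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)

def butterworth_lowpass(M, N, D0, n=2):
    """BLPF of order n: H = 1 / (1 + (D/D0)^(2n)); H = 0.5 at D = D0."""
    D = distance_grid(M, N)
    return 1.0 / (1.0 + (D / D0) ** (2 * n))

def gaussian_lowpass(M, N, D0):
    """GLPF with sigma = D0: H = exp(-D^2 / (2 D0^2))."""
    D = distance_grid(M, N)
    return np.exp(-(D ** 2) / (2.0 * D0 ** 2))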

When D(u,v) = D0, the filter is down to 0.607 of its maximum value. The inverse Fourier transform of the Gaussian lowpass filter is also Gaussian. A spatial Gaussian filter, obtained by computing the inverse Fourier transform of the above equation, will therefore have no ringing. The Gaussian lowpass filter does not achieve as much smoothing as the BLPF of order 2 for the same value of cutoff frequency. This is because the profile of the GLPF is not as tight as the profile of the BLPF of order 2. SHARPENING FREQUENCY-DOMAIN FILTERS An image can be blurred by attenuating the high-frequency components of its Fourier transform. Because edges and other abrupt changes in gray levels are associated with high-frequency components, image sharpening can be achieved in the frequency domain by a highpass filtering process, which attenuates the low-frequency components without disturbing the high-frequency information in the Fourier transform. Because the intended function of the highpass filter is to perform the reverse operation of the lowpass filter, the transfer function of a highpass filter H_hp(u,v) can be obtained from the corresponding transfer function of a lowpass filter H_lp(u,v) as H_hp(u,v) = 1 - H_lp(u,v). We consider 3 types of highpass filters: 1) Ideal Highpass Filter (IHPF) 2) Butterworth Highpass Filter (BHPF) 3) Gaussian Highpass Filter (GHPF) Ideal Highpass Filter The transfer function of a 2-D ideal highpass filter is defined as H(u,v) = 0 if D(u,v) <= D0, and H(u,v) = 1 if D(u,v) > D0, where D0 is the cutoff distance measured from the origin of the frequency rectangle, and D(u,v) is the distance from point (u,v) to the origin of the frequency rectangle. This filter is the opposite of the ideal lowpass filter in the sense that it sets to zero all frequencies inside a circle of radius D0 while passing, without attenuation, all frequencies outside the circle.
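Because each highpass transfer function is simply one minus the corresponding lowpass function, a trivial sketch suffices (reusing, as an assumption, the lowpass functions from the earlier examples):

def highpass_from_lowpass(H_lp):
    """H_hp(u,v) = 1 - H_lp(u,v), valid for the ideal, Butterworth and Gaussian cases."""
    return 1.0 - H_lp

# e.g., assuming the earlier sketches:
# H_ihpf = highpass_from_lowpass(ideal_lowpass(M, N, D0))
# H_bhpf = highpass_from_lowpass(butterworth_lowpass(M, N, D0, n=2))
# H_ghpf = highpass_from_lowpass(gaussian_lowpass(M, N, D0))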

As in the case of the ILPF, the ideal highpass filter also has a ringing effect, because the spatial representation of the IHPF contains rings; it also contains a black spot at the center. Smaller objects in the image cannot be filtered properly because of this black spot in the spatial representation of the IHPF. Distortion of the edges is also a main problem of the ideal highpass filter. As the cutoff frequency increases, distortion in the output image decreases and the spot size in h(x,y) also decreases, resulting in better filtering of smaller objects in the image f(x,y). Butterworth Highpass Filter The Butterworth filter represents a transition between the sharpness of the ideal filter and the total smoothness of the Gaussian filter. The transfer function of a Butterworth highpass filter of order n and with cutoff frequency locus at a distance D0 from the origin is given by H(u,v) = 1 / [1 + (D0/D(u,v))^(2n)], where D(u,v) is the distance from point (u,v) to the origin of the frequency rectangle. Butterworth filters behave more smoothly than ideal highpass filters; with this filter, the distortion is less than that of the IHPF. Since the center spot sizes in the spatial representations of the IHPF and BHPF are similar, the performance of the two filters in terms of filtering smaller objects is comparable. The transition into higher values of cutoff frequency is much smoother with the BHPF. Gaussian Highpass Filter The transfer function of the Gaussian highpass filter with cutoff frequency locus at a distance D0 from the origin is given by H(u,v) = 1 - exp(-D^2(u,v)/2D0^2)

where D(u,v) is the distance from point (u,v) to the origin of the frequency rectangle. The results obtained with the Gaussian highpass filter are smoother than those of the IHPF and BHPF. Even the filtering of smaller objects and thin bars is cleaner with the Gaussian filter. Laplacian in the Frequency Domain It can be shown that F{ d^n f(x)/dx^n } = (ju)^n F(u) (1) From the above expression, it follows that F{ ∂^2 f(x,y)/∂x^2 + ∂^2 f(x,y)/∂y^2 } = (ju)^2 F(u,v) + (jv)^2 F(u,v) = -(u^2 + v^2) F(u,v) (2) The expression inside the brackets on the left side of the above equation is nothing but the Laplacian of f(x,y). Thus we get the important result F{ ∇^2 f(x,y) } = -(u^2 + v^2) F(u,v) (3) The above equation says that the Laplacian can be implemented in the frequency domain by using the filter H(u,v) = -(u^2 + v^2) (4) But we generally center F(u,v) by performing the operation f(x,y)(-1)^(x+y) prior to taking the transform of the image. If f or F are of size MxN, this operation shifts the center of the transform so that (u,v) = (0,0) is at the point (M/2, N/2) of the frequency rectangle. So, the center of the filter function also needs to be shifted: H(u,v) = -[ (u - M/2)^2 + (v - N/2)^2 ] (5) The Laplacian-filtered image in the spatial domain is obtained by computing the inverse Fourier transform of H(u,v)F(u,v): ∇^2 f(x,y) = F^{-1}{ -[ (u - M/2)^2 + (v - N/2)^2 ] F(u,v) } (6) Conversely, computing the Laplacian in the spatial domain and taking the Fourier transform of the result is equivalent to multiplying F(u,v) by H(u,v): F{ ∇^2 f(x,y) } = -[ (u - M/2)^2 + (v - N/2)^2 ] F(u,v) (7)
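A hedged sketch (NumPy; the function name is mine) of computing the Laplacian through the frequency domain with the shifted filter of equation (5):

import numpy as np

def laplacian_frequency(f):
    """Return the Laplacian of image f computed via the frequency domain."""
    M, N = f.shape
    u = np.arange(M).reshape(-1, 1)
    v = np.arange(N).reshape(1, -1)
    F = np.fft.fft2(f * ((-1.0) ** (u + v)))       # centered transform
    H = -((u - M / 2) ** 2 + (v - N / 2) ** 2)     # equation (5)
    # Inverse transform, real part, and undo the centering multiplication
    return np.real(np.fft.ifft2(H * F)) * ((-1.0) ** (u + v))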

The spatial-domain Laplacian filter function is obtained by taking the inverse Fourier transform of equation (5). The figure below shows the mask used to implement the definition of the Laplacian in the spatial domain. The enhanced image g(x,y) can be obtained by subtracting the Laplacian image from the original image: g(x,y) = f(x,y) - ∇^2 f(x,y) (8) Instead of enhancing the image in two steps (first calculating the Laplacian image and then subtracting it from the original image), the entire operation can be performed in the frequency domain with only one filter, obtained by substituting (7) in (8): H(u,v) = 1 + [ (u - M/2)^2 + (v - N/2)^2 ] The enhanced image can then be obtained with a single transformation operation: g(x,y) = F^{-1}{ [ 1 + (u - M/2)^2 + (v - N/2)^2 ] F(u,v) } Unsharp Masking, High Boost Filtering, High Frequency Emphasis Filtering The average background intensity in a highpass-filtered image is near black. This is due to the fact that highpass filters eliminate the zero-frequency component of the Fourier transform. The solution to this problem consists of adding a portion of the image back to the filtered result, as in the Laplacian technique. Sometimes it is advantageous to increase the contribution made by the original image to the overall filtered result. This approach, called high-boost filtering, is a generalization of unsharp masking. Unsharp masking consists of generating a sharp image by subtracting from an image a blurred version of itself; that is, obtaining a highpass-filtered image by subtracting from the image a lowpass-filtered version of itself: f_hp(x,y) = f(x,y) - f_lp(x,y) (1) High-boost filtering generalizes this by multiplying f(x,y) by a constant A >= 1: f_hb(x,y) = A f(x,y) - f_lp(x,y) (2) Thus, high-boost filtering gives us the flexibility to increase the contribution made by the image to the overall enhanced image. The above equation can be rewritten as f_hb = (A-1)f(x,y) + f(x,y) - f_lp(x,y), so that f_hb = (A-1)f(x,y) + f_hp(x,y) (3) This result expresses high-boost filtering in terms of a highpass rather than a lowpass image. When A = 1, high-boost filtering reduces to regular highpass filtering. As A increases past 1, the contribution made by the image itself becomes more dominant.
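A minimal sketch of equations (1)-(3), assuming SciPy's Gaussian blur as the lowpass step (the blur choice, parameter values and function name are mine, not prescribed by the text):

import numpy as np
from scipy.ndimage import gaussian_filter

def high_boost(f, A=1.5, sigma=3.0):
    """Unsharp masking / high-boost sharpening per equation (3).
    A = 1 gives plain unsharp masking (highpass); A > 1 boosts the original image."""
    f = f.astype(float)
    f_lp = gaussian_filter(f, sigma)   # blurred (lowpass) version of f
    f_hp = f - f_lp                    # equation (1): unsharp mask
    return (A - 1.0) * f + f_hp        # equation (3)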

We know that F_lp(u,v) = H_lp(u,v)F(u,v) (4) and F_hp(u,v) = H_hp(u,v)F(u,v) (5), where H_lp is the transfer function of a lowpass filter and H_hp is the transfer function of a highpass filter. Converting equation (1) into the frequency domain, F_hp(u,v) = F(u,v) - F_lp(u,v) (6) Substituting equation (4) in equation (6), F_hp(u,v) = F(u,v) - H_lp(u,v)F(u,v) => F_hp(u,v) = F(u,v)[1 - H_lp(u,v)] (7) Therefore, unsharp masking can be obtained directly in the frequency domain by using the composite filter H_hp(u,v) = 1 - H_lp(u,v) (8) Converting equation (3) into the frequency domain, F_hb(u,v) = (A-1)F(u,v) + F_hp(u,v) (9) Substituting equation (5) in equation (9), F_hb(u,v) = (A-1)F(u,v) + H_hp(u,v)F(u,v) => F_hb(u,v) = F(u,v)[(A-1) + H_hp(u,v)] (10) Therefore, high-boost filtering can be obtained directly in the frequency domain by using the composite filter H_hb(u,v) = (A-1) + H_hp(u,v) (11) Sometimes it is advantageous to accentuate the contribution to enhancement made by the high-frequency components of an image. In this case, we simply multiply a highpass filter function by a constant and add an offset so that the zero-frequency term is not eliminated by the filter. This process is called high-frequency emphasis filtering. It has a transfer function given by H_hfe(u,v) = a + b H_hp(u,v), where a >= 0 and b > a. Typical values of a range from 0.25 to 0.5 and typical values of b range from 1.5 to 2.0. High-frequency emphasis filtering reduces to high-boost filtering when a = (A-1) and b = 1. When b > 1, the high frequencies are emphasized (highlighted), thus giving this procedure its name. HOMOMORPHIC FILTERING The illumination-reflectance model can be used to develop a frequency domain procedure for improving the appearance of an image by simultaneous gray-level range compression and contrast enhancement. An image f(x,y) can be expressed as the product of illumination and reflectance components: f(x,y) = i(x,y) r(x,y) (1) The above equation cannot be used directly to operate separately on the frequency components of illumination and reflectance, because the Fourier transform of the product of two functions is not separable.

So, let us define z(x,y) = ln f(x,y) = ln i(x,y) + ln r(x,y) (2) Then F{z(x,y)} = F{ln i(x,y)} + F{ln r(x,y)} (3) or Z(u,v) = F_i(u,v) + F_r(u,v) (4) where F_i(u,v) and F_r(u,v) are the Fourier transforms of ln i(x,y) and ln r(x,y) respectively. If we process Z(u,v) by means of a filter function H(u,v), then S(u,v) = H(u,v)Z(u,v) = H(u,v)F_i(u,v) + H(u,v)F_r(u,v) (5) where S(u,v) is the Fourier transform of the result s(x,y). In the spatial domain, s(x,y) = F^{-1}{S(u,v)} = F^{-1}{H(u,v)F_i(u,v)} + F^{-1}{H(u,v)F_r(u,v)} (6) By letting i'(x,y) = F^{-1}{H(u,v)F_i(u,v)} (7) and r'(x,y) = F^{-1}{H(u,v)F_r(u,v)} (8) equation (6) can be expressed as s(x,y) = i'(x,y) + r'(x,y) (9) Finally, as z(x,y) was formed by taking the logarithm of the original image f(x,y), the inverse (exponential) operation yields the desired enhanced image, denoted by g(x,y): g(x,y) = e^{s(x,y)} = e^{i'(x,y)} e^{r'(x,y)} = i0(x,y) r0(x,y), where i0(x,y) = e^{i'(x,y)} and r0(x,y) = e^{r'(x,y)} are the illumination and reflectance components of the output image. The above operations can be represented by the block diagram f(x,y) -> ln -> DFT -> H(u,v) -> inverse DFT -> exp -> g(x,y). This method is based on a special case of a class of systems known as homomorphic systems. In this particular application, the key approach is the separation of the illumination and reflectance

components achieved in the form shown in equation (4). The homomorphic filter function H(u,v) can then operate on these components separately, as in equation (5). The illumination component of an image is generally characterized by slow spatial variations, while the reflectance component tends to vary abruptly, particularly at the junctions of dissimilar objects. So, we can associate the low frequencies of the Fourier transform of the logarithm of an image with the illumination, and the high frequencies with the reflectance. A good deal of control can be gained over the illumination and reflectance components with a homomorphic filter. This control requires specification of a filter function H(u,v) that affects the low- and high-frequency components of the Fourier transform in different ways. The figure below shows the cross section of such a filter. If the parameters γL and γH are chosen so that γL < 1 and γH > 1, the filter function tends to decrease the contribution made by the low frequencies (illumination) and amplify the contribution made by the high frequencies (reflectance). The net result is simultaneous dynamic range compression (by the log function) and contrast enhancement (by H(u,v)). Figure 4.33 is typical of the results that can be obtained with the homomorphic filter function. In the original image shown in figure 4.33(a), the details inside the shelter are obscured by the glare from the outside walls. Figure 4.33(b) shows the result of processing this image by homomorphic filtering, with γL = 0.5 and γH = 2. A reduction of dynamic range in the brightness, together with an increase in contrast, brought out the details of objects inside the shelter and balanced the gray levels of the outside wall. The enhanced image is also sharper.
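A hedged end-to-end sketch of the block diagram above (NumPy; the Gaussian-shaped emphasis curve, the parameter values and the function name are my choices, not prescribed by the text):

import numpy as np

def homomorphic_filter(f, gamma_L=0.5, gamma_H=2.0, D0=30.0, c=1.0):
    """ln -> DFT -> H(u,v) -> inverse DFT -> exp, as in the block diagram."""
    M, N = f.shape
    z = np.log1p(f.astype(float))             # ln(1 + f) avoids ln(0)
    Z = np.fft.fftshift(np.fft.fft2(z))       # centered transform
    u = np.arange(M).reshape(-1, 1)
    v = np.arange(N).reshape(1, -1)
    D2 = (u - M / 2) ** 2 + (v - N / 2) ** 2
    # Emphasis curve rising from gamma_L at low frequencies to gamma_H at high ones
    H = (gamma_H - gamma_L) * (1.0 - np.exp(-c * D2 / (D0 ** 2))) + gamma_L
    s = np.real(np.fft.ifft2(np.fft.ifftshift(H * Z)))
    return np.expm1(s)                        # undo the log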

UNIT-VI IMAGE RESTORATION Restoration attempts to reconstruct or recover an image that has been degraded, by using a priori knowledge of the degradation phenomenon. Thus restoration techniques are oriented towards modeling the degradation and applying the inverse process in order to recover the original image. We consider the restoration problem only from the point where a degraded, digital image is given; thus topics dealing with sensor, digitizer and display degradations are considered only superficially, if at all. A MODEL OF THE IMAGE DEGRADATION/RESTORATION PROCESS A degradation function H, together with an additive noise term, operates on an input image f(x,y) to produce a degraded image g(x,y). Given g(x,y), some knowledge about the degradation function H, and some knowledge about the additive noise term η(x,y), the objective of restoration is to obtain an estimate f̂(x,y) of the original image. We want the estimate to be as close as possible to the original image and, in general, the more we know about H and η, the closer f̂(x,y) will be to f(x,y). If H is a linear, position-invariant process, then the degraded image is given in the spatial domain by g(x,y) = h(x,y)*f(x,y) + η(x,y) (1) where h(x,y) is the spatial representation of the degradation function and the symbol * indicates spatial convolution. Convolution in the spatial domain is equal to multiplication in the frequency domain, so we may write the above model in an equivalent frequency domain representation, G(u,v) = H(u,v)F(u,v) + N(u,v), where the terms in capital letters are the Fourier transforms of the corresponding terms in equation (1). NOISE MODELS The principal sources of noise in digital images arise during image acquisition (digitization) and/or transmission. The performance of imaging sensors is affected by a variety of factors, such as environmental conditions during image acquisition, and by the quality of the sensing

elements themselves. For instance, in acquiring an image with a camera, light levels and sensor temperature are major factors affecting the amount of noise in the resulting image. Images are corrupted during transmission principally due to interference in the channel used for transmission. For example, an image transmitted over a wireless network might be corrupted as a result of lightning or other atmospheric disturbance. Spatial characteristics of noise refer to whether the noise is correlated with the image. Frequency properties refer to the frequency content of the noise in the Fourier sense. For example, when the Fourier spectrum of the noise is constant, the noise is usually called white noise. This terminology is a carry-over from the physical properties of white light, which contains nearly all frequencies in the visible spectrum in equal proportions. The noise models we are going to consider here are 1. Gaussian Noise 2. Rayleigh Noise 3. Erlang Noise 4. Exponential Noise 5. Uniform Noise 6. Impulse (salt-and-pepper) Noise 7. Periodic Noise With the exception of periodic noise, we assume here that noise is independent of spatial coordinates and that it is uncorrelated with respect to the image itself (that is, there is no correlation between pixel values and the values of noise components), because it is difficult to deal with noise that is spatially dependent and correlated. Gaussian Noise Because of its mathematical tractability in both the spatial and frequency domains, Gaussian (also called normal) noise models are used frequently in practice. In fact, this tractability is so convenient that it often results in Gaussian models being used in situations in which they are marginally applicable at best. The PDF of a Gaussian random variable, z, is given by

p(z) = [1/(sqrt(2π) σ)] exp[ -(z-µ)^2 / 2σ^2 ] where z represents gray level, µ is the mean (average) value of z, and σ is its standard deviation. The standard deviation squared, σ^2, is called the variance of z. When z is described by the above equation, about 70% of its values will be in the range [(µ-σ),(µ+σ)], and about 95% will be in the range [(µ-2σ),(µ+2σ)]. Rayleigh Noise The PDF of Rayleigh noise is given by p(z) = (2/b)(z-a) exp[ -(z-a)^2 / b ] for z >= a, and p(z) = 0 for z < a. The mean and variance of this density are given by µ = a + sqrt(πb/4) and σ^2 = b(4-π)/4. Note the displacement from the origin and the fact that the basic shape of this density is skewed to the right. The Rayleigh density can be quite useful for approximating skewed histograms. Erlang (Gamma) Noise The PDF of Erlang noise is given by p(z) = [ a^b z^(b-1) / (b-1)! ] exp(-az) for z >= 0, and p(z) = 0 for z < 0, where the parameters are such that a > 0, b is a positive integer, and ! indicates the factorial. The mean and variance of this density are given by µ = b/a and σ^2 = b/a^2.

Although the above equation is often referred to as the gamma density, strictly speaking this is correct only when the denominator is the gamma function, Γ(b). When the denominator is as shown, the density is more appropriately called the Erlang density. Exponential Noise The PDF of exponential noise is given by p(z) = a exp(-az) for z >= 0, and p(z) = 0 for z < 0, where a > 0. The mean and variance of this density function are given by µ = 1/a and σ^2 = 1/a^2. The PDF of exponential noise is a special case of the Erlang PDF, with b = 1. Uniform Noise The PDF of uniform noise is given by p(z) = 1/(b-a) for a <= z <= b, and p(z) = 0 otherwise. The mean and variance of this density function are given by µ = (a+b)/2 and σ^2 = (b-a)^2/12.
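The densities above are easy to sample numerically for generating test noise; a short NumPy sketch (the function names, parameter defaults and the mapping of b to NumPy's Rayleigh scale are my additions):

import numpy as np

rng = np.random.default_rng(0)

def gaussian_noise(shape, mu=0.0, sigma=10.0):
    return rng.normal(mu, sigma, shape)

def rayleigh_noise(shape, a=0.0, b=100.0):
    # Rayleigh PDF above, shifted by a; NumPy's parameterization uses scale = sqrt(b/2)
    return a + rng.rayleigh(np.sqrt(b / 2.0), shape)

def uniform_noise(shape, a=0.0, b=20.0):
    return rng.uniform(a, b, shape)

def exponential_noise(shape, a=0.1):
    return rng.exponential(1.0 / a, shape)

# Example: corrupt an image f with additive Gaussian noise
# g = f + gaussian_noise(f.shape, mu=0.0, sigma=15.0)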

Impulse (Salt-and-Pepper) Noise The PDF of (bipolar) impulse noise is given by p(z) = Pa for z = a, p(z) = Pb for z = b, and p(z) = 0 otherwise. If b > a, gray level b will appear as a light dot in the image and level a will appear as a dark dot. If either Pa or Pb is zero, the impulse noise is called unipolar. If neither probability is zero, and especially if they are approximately equal, impulse noise values will resemble salt-and-pepper granules randomly distributed over the image; this noise is also called shot noise or spike noise. DEGRADATION MODEL The degradation process can be modeled as an operator or system H which, together with an additive noise term η(x,y), operates on an input image f(x,y) to produce a degraded image g(x,y). Image restoration may be viewed as the process of obtaining an approximation to f(x,y), given g(x,y) and a knowledge of the degradation in the form of the operator H. f(x,y) -> H -> + η(x,y) -> g(x,y) The input-output relation in the above figure is expressed as g(x,y) = H[f(x,y)] + η(x,y) (1) For a moment, let us assume that η(x,y) = 0, so that g(x,y) = H[f(x,y)]. The operator H is said to be linear if H[k1 f1(x,y) + k2 f2(x,y)] = k1 H[f1(x,y)] + k2 H[f2(x,y)] (2) where k1 and k2 are constants and f1(x,y) and f2(x,y) are two input images. If k1 = k2 = 1, then equation (2) becomes H[f1(x,y) + f2(x,y)] = H[f1(x,y)] + H[f2(x,y)] (3) The above equation is called the property of additivity; it simply says that, if H is a linear operator, the response to a sum of two inputs is equal to the sum of the two individual responses. When f2(x,y) = 0, equation (2) becomes

H[k1 f1(x,y)] = k1 H[f1(x,y)] (4) The above equation is called the property of homogeneity. It says that the response to a constant multiple of any input is equal to the response to that input multiplied by the same constant. Thus a linear operator possesses both the property of additivity and the property of homogeneity. An operator having the input-output relation g(x,y) = H[f(x,y)] is said to be position (or space) invariant if H[f(x-α, y-β)] = g(x-α, y-β) (5) This definition indicates that the response at any point in the image depends only on the value of the input at that point and not on the position of the point. Degradation model for the continuous case: f(x,y) can be expressed in impulse form as f(x,y) = ∫∫ f(α,β) δ(x-α, y-β) dα dβ (6) Then, if η(x,y) = 0, substituting equation (6) in (1), g(x,y) = H[f(x,y)] = H[ ∫∫ f(α,β) δ(x-α, y-β) dα dβ ] (7) If H is a linear operator, the above equation changes, by the additivity property, to g(x,y) = ∫∫ H[ f(α,β) δ(x-α, y-β) ] dα dβ (8) Since f(α,β) is independent of x and y, from the homogeneity property g(x,y) = ∫∫ f(α,β) H[ δ(x-α, y-β) ] dα dβ (9) The term H[δ(x-α, y-β)] is called the impulse response of H and is denoted h(x,α,y,β) = H[δ(x-α, y-β)] (10) From equations (9) and (10) we can write g(x,y) = ∫∫ f(α,β) h(x,α,y,β) dα dβ (11) The above equation is called the superposition (or Fredholm) integral of the first kind. This expression states that if the response of H to an impulse is known, the response to any input f(α,β) can be calculated by means of equation (11). If H is position invariant, then from equation (5) H[δ(x-α, y-β)] = h(x-α, y-β) (12) Now, from equations (12), (11) and (10),

g(x,y) = ∫∫ f(α,β) h(x-α, y-β) dα dβ (13) This is nothing but the convolution integral. In the presence of additive noise, the expression describing the linear degradation model becomes g(x,y) = ∫∫ f(α,β) h(x-α, y-β) dα dβ + η(x,y) (15) Many types of degradations can be approximated by linear, position-invariant processes. The advantage of this approach is that the extensive tools of linear system theory then become available for the solution of image restoration problems. Degradation model for the discrete case: Suppose that f(x) and h(x) are sampled uniformly to form arrays of dimensions A and B respectively. In this case x is a discrete variable in the range 0,1,2,...,A-1 for f(x) and 0,1,2,...,B-1 for h(x). The discrete convolution is based on the assumption that the sampled functions are periodic, with a period M. Overlap in the individual periods of the resulting convolution is avoided by choosing M >= A+B-1 and extending the functions with zeroes so that their length is equal to M. Let f_e(x) and h_e(x) represent the extended functions. Their convolution is given by g_e(x) = Σ_{m=0}^{M-1} f_e(m) h_e(x-m) (1) for x = 0,1,2,...,M-1. As both f_e(x) and h_e(x) are assumed to have period equal to M, g_e(x) also has the same period. The above equation can be represented in matrix form as g = Hf (2) where f and g are M-dimensional column vectors f = [ f_e(0), f_e(1), ..., f_e(M-1) ]^T (3)

g = [ g_e(0), g_e(1), ..., g_e(M-1) ]^T (4) and H is the MxM matrix

H = [ h_e(0)    h_e(-1)   h_e(-2)  ...  h_e(-M+1)
      h_e(1)    h_e(0)    h_e(-1)  ...  h_e(-M+2)
      h_e(2)    h_e(1)    h_e(0)   ...  h_e(-M+3)
      ...
      h_e(M-1)  h_e(M-2)  h_e(M-3) ...  h_e(0)   ]

Because of the periodicity assumption on h_e(x), it follows that h_e(x) = h_e(x+M). Using this property the above matrix can be written as

H = [ h_e(0)    h_e(M-1)  h_e(M-2) ...  h_e(1)
      h_e(1)    h_e(0)    h_e(M-1) ...  h_e(2)
      h_e(2)    h_e(1)    h_e(0)   ...  h_e(3)
      ...
      h_e(M-1)  h_e(M-2)  h_e(M-3) ...  h_e(0)   ]

In this matrix, the rows are related by a circular shift to the right; that is, the right-most element in one row is equal to the left-most element in the row immediately below. The shift is called circular because an element shifted off the right end of a row reappears at the left end of the next row. Moreover, the circularity of H is complete in the sense that it extends from the last row back to the first row. A square matrix in which each row is a circular shift of the preceding row, and the first row is a circular shift of the last row, is called a circulant matrix. Extension of the discussion to a 2-D, discrete degradation model is straightforward. For two digitized images f(x,y) and h(x,y) of sizes AxB and CxD respectively, extended functions of size MxN may be formed by padding with zeroes. That is, f_e(x,y) = f(x,y) for 0 <= x <= A-1 and 0 <= y <= B-1, and f_e(x,y) = 0 for A <= x <= M-1 or B <= y <= N-1; similarly h_e(x,y) = h(x,y) for 0 <= x <= C-1 and 0 <= y <= D-1, and h_e(x,y) = 0 for C <= x <= M-1 or D <= y <= N-1.

Treating the extended functions f_e(x,y) and h_e(x,y) as periodic in two dimensions, with periods M and N in the x and y directions respectively, their convolution is g_e(x,y) = Σ_{m=0}^{M-1} Σ_{n=0}^{N-1} f_e(m,n) h_e(x-m, y-n) for x = 0,1,2,...,M-1 and y = 0,1,2,...,N-1. The convolution function g_e(x,y) is periodic with the same period as f_e(x,y) and h_e(x,y). Overlap of the individual convolution periods is avoided by choosing M >= A+C-1 and N >= B+D-1. Now, the complete discrete degradation model is given by adding an MxN extended discrete noise term η_e(x,y) to the above equation: g_e(x,y) = Σ_{m=0}^{M-1} Σ_{n=0}^{N-1} f_e(m,n) h_e(x-m, y-n) + η_e(x,y) for x = 0,1,2,...,M-1 and y = 0,1,2,...,N-1. The above equation can be represented in matrix form as g = Hf + n, where f, g and n are MN-dimensional column vectors formed by stacking the rows of the MxN functions f_e(x,y), g_e(x,y) and η_e(x,y). The first N elements of f, for example, are the elements in the first row of f_e(x,y), the next N elements are from the second row, and so on for all the M rows of f_e(x,y). So f, g and n are of dimension MNx1 and H is of dimension MNxMN. This matrix consists of M^2 partitions, each partition being of size NxN and ordered according to

H = [ H_0      H_{M-1}  H_{M-2} ...  H_1
      H_1      H_0      H_{M-1} ...  H_2
      H_2      H_1      H_0     ...  H_3
      ...
      H_{M-1}  H_{M-2}  H_{M-3} ...  H_0 ]

Each partition H_j is constructed from the jth row of the extended function h_e(x,y) as follows:

H_j = [ h_e(j,0)    h_e(j,N-1)  h_e(j,N-2) ...  h_e(j,1)
        h_e(j,1)    h_e(j,0)    h_e(j,N-1) ...  h_e(j,2)
        h_e(j,2)    h_e(j,1)    h_e(j,0)   ...  h_e(j,3)
        ...
        h_e(j,N-1)  h_e(j,N-2)  h_e(j,N-3) ...  h_e(j,0) ]

Here, each H_j is a circulant matrix, and the blocks of H are themselves subscripted in a circular manner. For these reasons, the matrix H is called a block-circulant matrix.
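A hedged sketch (NumPy; the function name and example values are mine) of building the 1-D circulant matrix of equation (2), whose multiplication with f_e reproduces the circular convolution; the 2-D block-circulant matrix is assembled from such blocks in the same circular pattern:

import numpy as np

def circulant(h_e):
    """MxM circulant matrix whose first column is the zero-padded, length-M
    impulse response h_e, so that H @ f_e is the circular convolution of f_e with h_e."""
    M = len(h_e)
    H = np.empty((M, M))
    for i in range(M):
        for m in range(M):
            H[i, m] = h_e[(i - m) % M]   # h_e(x - m), taken modulo M
    return H

# Example: circular convolution by matrix multiplication
h_e = np.array([1.0, 2.0, 0.0, 0.0])     # impulse response padded to M = 4
f_e = np.array([4.0, 3.0, 2.0, 1.0])
g = circulant(h_e) @ f_e
# Same result as np.real(np.fft.ifft(np.fft.fft(h_e) * np.fft.fft(f_e)))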

ALGEBRAIC APPROACH TO RESTORATION The objective of image restoration is to estimate an original image f from a degraded image g and some knowledge or assumption about H and n. Central to the algebraic approach is the concept of seeking an estimate of f, denoted f̂, that minimizes a predefined criterion of performance. Because of its simplicity, the least squares method is used here. Unconstrained Restoration From g = Hf + n, the noise term in the degradation model is n = g - Hf (1) In the absence of any knowledge about n, a meaningful criterion is to seek an f̂ such that Hf̂ approximates g in a least squares sense, by making the norm of the noise term as small as possible. In other words, we want to find an f̂ such that ||n||^2 = ||g - Hf̂||^2 (2) is minimum, where ||n||^2 = n^T n and ||g - Hf̂||^2 = (g - Hf̂)^T (g - Hf̂) are the squared norms of n and (g - Hf̂) respectively. Equation (2) allows the equivalent view of this problem as one of minimizing the criterion function J(f̂) = ||g - Hf̂||^2 (3) with respect to f̂. Aside from the requirement that it should minimize equation (3), f̂ is not constrained in any other way. To find the f̂ that minimizes J(f̂), we simply differentiate J with respect to f̂ and set the result equal to the zero vector: ∂J(f̂)/∂f̂ = 0 = -2H^T(g - Hf̂) => -2H^T g + 2H^T Hf̂ = 0 => H^T g = H^T Hf̂ Solving for f̂, f̂ = (H^T H)^{-1} H^T g Letting M = N so that H is a square matrix and assuming that H^{-1} exists, the above equation reduces to

f̂ = H^{-1} (H^T)^{-1} H^T g = H^{-1} g Constrained Restoration In this section we consider the least squares restoration problem as one of minimizing functions of the form ||Qf̂||^2, where Q is a linear operator on f, subject to the constraint ||g - Hf̂||^2 = ||n||^2. This approach introduces considerable flexibility in the restoration process because it yields different solutions for different choices of Q. The addition of an equality constraint to the minimization problem can be handled without difficulty by using the method of Lagrange multipliers. The procedure calls for expressing the constraint in the form α(||g - Hf̂||^2 - ||n||^2) and appending it to the function ||Qf̂||^2. In other words, we seek an f̂ that minimizes the criterion function J(f̂) = ||Qf̂||^2 + α(||g - Hf̂||^2 - ||n||^2) where α is a constant called the Lagrange multiplier. After the constraint has been appended, minimization is carried out in the usual way. Differentiating the above equation with respect to f̂ and setting the result equal to the zero vector yields ∂J(f̂)/∂f̂ = 0 = 2Q^T Qf̂ - 2αH^T(g - Hf̂) Now, solving for f̂, f̂ = (H^T H + γ Q^T Q)^{-1} H^T g, where γ = 1/α. The quantity γ must be adjusted so that the constraint is satisfied. INVERSE FILTERING The simplest approach to restoration is direct inverse filtering, where we compute an estimate F̂(u,v) of the transform of the original image simply by dividing the transform of the degraded image, G(u,v), by the degradation function: F̂(u,v) = G(u,v) / H(u,v) But we know that G(u,v) = F(u,v)H(u,v) + N(u,v). Substituting this in the above equation gives

F̂(u,v) = F(u,v) + N(u,v)/H(u,v) The image restoration approach in the above equations is commonly referred to as the inverse filtering method. This terminology arises from considering H(u,v) as a filter function that multiplies F(u,v) to produce the transform of the degraded image g(x,y). The above equation tells us that even if we know the degradation function we cannot recover the undegraded image exactly, because η(x,y) is a random function whose Fourier transform N(u,v) is not known. Furthermore, if the degradation function has zero or very small values, the ratio N(u,v)/H(u,v) can easily dominate the estimate F̂(u,v). One approach to get around the zero or small-value problem is to limit the filter frequencies to values near the origin. By limiting the analysis to frequencies near the origin, we reduce the probability of encountering zero values. LEAST MEAN SQUARE FILTER / MINIMUM MEAN SQUARE ERROR (WIENER) FILTERING Inverse filtering makes no explicit provision for handling noise. The Wiener filtering method incorporates both the degradation function and the statistical characteristics of image and noise, treated as random processes, and the objective is to find an estimate f̂ of the uncorrupted image f such that the mean square error between them is minimized. This error measure is given by e^2 = E{ (f - f̂)^2 } (1) where E{.} is the expected value of the argument. It is assumed that the noise and the image are uncorrelated; that one or the other has zero mean; and that the gray levels in the estimate are a linear function of the levels in the degraded image. Based on these conditions, the minimum of the error function in the above equation is given in the frequency domain by the expression F̂(u,v) = [ H*(u,v) S_f(u,v) / ( S_f(u,v)|H(u,v)|^2 + S_η(u,v) ) ] G(u,v) = [ (1/H(u,v)) |H(u,v)|^2 / ( |H(u,v)|^2 + S_η(u,v)/S_f(u,v) ) ] G(u,v) (2) The terms in the above equation are as follows: H(u,v) is the degradation function, H*(u,v) is its complex conjugate, |H(u,v)|^2 = H*(u,v)H(u,v), S_η(u,v) = |N(u,v)|^2 is the power spectrum of the noise, and S_f(u,v) = |F(u,v)|^2 is the power spectrum of the undegraded image. The result in equation (2) is known as the Wiener filter. It is also referred to as the minimum mean square error filter or the least square error filter. It does not have the same problem as the inverse filter with zeroes in the degradation function, unless both H(u,v) and S_η(u,v) are zero for the same values of u and v.
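A minimal frequency-domain sketch of both estimators (NumPy; eps and K are illustrative parameters I introduce, and the constant-K form of the Wiener filter anticipates the simplification discussed next):

import numpy as np

def inverse_filter(G, H, eps=1e-3):
    """Direct inverse filtering, with small |H| values clipped to avoid
    division by (near) zero."""
    H_safe = np.where(np.abs(H) < eps, eps, H)
    return G / H_safe

def wiener_filter(G, H, K=0.01):
    """Wiener filter of equation (2) with the noise-to-signal power ratio
    S_eta/S_f replaced by a constant K."""
    H2 = np.abs(H) ** 2
    return (np.conj(H) / (H2 + K)) * G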

If the noise is zero, the noise power spectrum vanishes and the Wiener filter reduces to the inverse filter. When we are dealing with spectrally white noise, the spectrum |N(u,v)|^2 is a constant, which simplifies things considerably. Since the power spectrum of the undegraded image is seldom known, the ratio S_η/S_f is often replaced by a specified constant K, and the above equation is written as F̂(u,v) = [ (1/H(u,v)) |H(u,v)|^2 / ( |H(u,v)|^2 + K ) ] G(u,v) Generally, the Wiener filter works better than inverse filtering in the presence of noise and a degradation function. CONSTRAINED LEAST SQUARES FILTERING The difficulty with Wiener filtering is that the power spectra of the undegraded image and of the noise must be known. The constrained least squares filtering method requires knowledge of only the mean and variance of the noise. These parameters can be calculated from a given degraded image, so this is an important advantage. Another difference is that the Wiener filter is based on minimizing a statistical criterion and, as such, is optimal only in an average sense; this method, by contrast, has the notable feature that it yields an optimal result for each image to which it is applied. The degraded image can be represented in matrix form as g = Hf + η (1) The problem here is that H is highly sensitive to noise. One way to lighten the noise sensitivity problem is to base optimality of restoration on a measure of smoothness, such as the second derivative of the image. To be meaningful, the restoration must be constrained by the parameters of the problem. Thus, what is desired is to find the minimum of a criterion function C, defined by C = Σ_x Σ_y [ ∇^2 f(x,y) ]^2 (2) subject to the constraint ||g - Hf̂||^2 = ||η||^2 (3) where ||.|| is the vector norm and f̂ is the estimate of the undegraded image. The frequency domain solution to this optimization problem is given by the following expression

F̂(u,v) = [ H*(u,v) / ( |H(u,v)|^2 + γ |P(u,v)|^2 ) ] G(u,v) where γ is a parameter that must be adjusted so that the constraint in equation (3) is satisfied, and P(u,v) is the Fourier transform of the function p(x,y) = [ 0 -1 0; -1 4 -1; 0 -1 0 ] We can recognize this function as the Laplacian operator. By comparing the constrained least squares and Wiener results, it is noted that the former yields slightly better results for the high and medium noise cases. It is not unexpected that the constrained least squares filter would outperform the Wiener filter when the parameters are selected manually for better visual results. The parameter γ is a scalar, while the value of K in Wiener filtering is an approximation to the ratio of two unknown frequency domain functions, a ratio that seldom is constant. Thus, it stands to reason that a result based on manually selecting γ would be a more accurate estimate of the undegraded image. The differences between the Wiener filtering and constrained least squares restoration methods are: 1. The Wiener filter is designed to optimize the restoration in an average statistical sense over a large ensemble of similar images. The constrained matrix inversion deals with one image only and imposes constraints on the solution sought. 2. The Wiener filter is based on the assumption that the random fields involved are homogeneous with known spectral densities. In the constrained matrix inversion it is assumed that we know only some statistical property of the noise. In the constrained matrix restoration approach, various filters may be constructed using the same formulation by simply changing the smoothing criterion. RESTORATION IN THE PRESENCE OF NOISE ONLY - SPATIAL FILTERING We know that the general equations for the degradation process in the spatial and frequency domains are g(x,y) = h(x,y)*f(x,y) + η(x,y) and G(u,v) = H(u,v)F(u,v) + N(u,v). When the only degradation present in an image is noise, the above equations become g(x,y) = f(x,y) + η(x,y) and G(u,v) = F(u,v) + N(u,v). The noise terms are unknown, so subtracting them from g(x,y) or G(u,v) is not a realistic option. Spatial filtering is the method of choice in situations where only additive noise is present. MEAN FILTERS

Arithmetic Mean Filter This is the simplest of the mean filters. Let S_xy represent the set of coordinates in a rectangular subimage window of size mxn, centered at point (x,y). The arithmetic mean filtering process computes the average value of the corrupted image g(x,y) in the area defined by S_xy. The value of the restored image f̂ at any point (x,y) is simply the arithmetic mean computed using the pixels in the region defined by S_xy: f̂(x,y) = (1/mn) Σ_{(s,t) in S_xy} g(s,t) This operation can be implemented using a convolution mask in which all coefficients have value 1/mn. A mean filter simply smooths local variations in the image; noise is reduced as a result of blurring. Geometric Mean Filter An image restored using a geometric mean filter is given by the expression f̂(x,y) = [ Π_{(s,t) in S_xy} g(s,t) ]^(1/mn) Here, each restored pixel is given by the product of the pixels in the subimage window, raised to the power 1/mn. A geometric mean filter achieves smoothing comparable to the arithmetic mean filter, but it tends to lose less image detail in the process. Harmonic Mean Filter The harmonic mean filtering operation is given by the expression f̂(x,y) = mn / Σ_{(s,t) in S_xy} [ 1/g(s,t) ] The harmonic mean filter works well for salt noise, but fails for pepper noise. It also does well with other types of noise, like Gaussian noise. Contraharmonic Mean Filter The contraharmonic mean filtering operation yields a restored image based on the expression f̂(x,y) = Σ_{(s,t) in S_xy} g(s,t)^(Q+1) / Σ_{(s,t) in S_xy} g(s,t)^Q where Q is called the order of the filter. This filter is well suited for reducing or virtually eliminating the effects of salt-and-pepper noise. For positive values of Q, the filter eliminates pepper noise. For negative values of Q, the filter eliminates salt noise.

For Q = 0, this filter reduces to the arithmetic mean filter; for Q = -1, it reduces to the harmonic mean filter. In general, the arithmetic mean and geometric mean filters are well suited for random noise like Gaussian or uniform noise. The contraharmonic filter is well suited for impulse noise, but it has the disadvantage that it must be known whether the noise is dark or light in order to select the proper sign for Q. The results of choosing the wrong sign for Q can be disastrous. ORDER-STATISTICS FILTERS Order-statistics filters are spatial filters whose response is based on ordering (ranking) the pixels contained in the image area encompassed by the filter. The response of the filter at any point is determined by the ranking result. Median Filter It replaces the value of a pixel by the median of the gray levels in the neighborhood of that pixel: f̂(x,y) = median_{(s,t) in S_xy} { g(s,t) } For certain types of noise, median filters provide excellent noise reduction capabilities, with considerably less blurring than linear smoothing filters of similar size. Median filters are particularly effective in the presence of both bipolar and unipolar impulse noise. Max and Min Filters The median filter represents the 50th percentile of a ranked set of numbers. The 100th percentile result is represented by the max filter, given by f̂(x,y) = max_{(s,t) in S_xy} { g(s,t) } The max filter is useful for finding the brightest points in an image. It can be used to reduce pepper noise in the image, but it removes (sets to a light gray level) some dark pixels from the borders of dark objects. The 0th percentile result is represented by the min filter, given by f̂(x,y) = min_{(s,t) in S_xy} { g(s,t) } The min filter is useful for finding the darkest points in an image. It can be used to reduce salt noise in the image, but it removes white points around the borders of light objects. Midpoint Filter The midpoint filter simply computes the midpoint between the maximum and minimum values in the area encompassed by the filter: f̂(x,y) = (1/2) [ max_{(s,t) in S_xy} { g(s,t) } + min_{(s,t) in S_xy} { g(s,t) } ]
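A hedged sketch of a few of the mean and order-statistics filters above (the sliding-window helpers are standard scipy.ndimage calls; the wrapper names, window sizes and the small guard constants are my choices):

import numpy as np
from scipy import ndimage

def arithmetic_mean(g, size=3):
    return ndimage.uniform_filter(g.astype(float), size)

def geometric_mean(g, size=3):
    # Product of window pixels raised to 1/mn, computed via the mean of logs
    logs = ndimage.uniform_filter(np.log(g.astype(float) + 1e-6), size)  # guard against log(0)
    return np.exp(logs)

def contraharmonic_mean(g, Q=1.5, size=3):
    g = g.astype(float) + 1e-6            # guard against division by zero for Q < 0
    num = ndimage.uniform_filter(g ** (Q + 1), size)
    den = ndimage.uniform_filter(g ** Q, size)
    return num / den

def median(g, size=3):
    return ndimage.median_filter(g, size)

def midpoint(g, size=3):
    return 0.5 * (ndimage.maximum_filter(g.astype(float), size) +
                  ndimage.minimum_filter(g.astype(float), size))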

The midpoint filter combines order statistics and averaging, and works best for randomly distributed noise like Gaussian noise. Alpha-Trimmed Mean Filter Suppose that we delete the d/2 lowest and the d/2 highest gray-level values of g(s,t) in the neighborhood S_xy. Let g_r(s,t) represent the remaining mn-d pixels. A filter formed by averaging these remaining pixels is called the alpha-trimmed mean filter: f̂(x,y) = [ 1/(mn-d) ] Σ_{(s,t) in S_xy} g_r(s,t) where the value of d can range from 0 to mn-1. When d = 0, this filter reduces to the arithmetic mean filter. When d = mn-1, this filter becomes the median filter. For other values of d, the alpha-trimmed filter is useful in situations involving multiple types of noise, such as a combination of salt-and-pepper and Gaussian noise. ADAPTIVE FILTERS Once selected, the mean filters and order-statistics filters are applied to an image without regard for how image characteristics vary from one point to another. Adaptive filters are those whose behavior changes based on the statistical characteristics of the image inside the filter region defined by the mxn rectangular window S_xy. Adaptive filters are capable of performance superior to that of the other filters, but at the cost of an increase in filter complexity. Adaptive Local Noise Reduction Filter The simplest statistical measures of a random variable are its mean and variance. These are reasonable parameters on which to base an adaptive filter because they are quantities closely related to the appearance of an image. The mean gives a measure of the average gray level in the region over which it is computed, and the variance gives a measure of the average contrast in that region. Our filter is to operate in a local region S_xy. The response of the filter at any point (x,y) on which the region is centered is to be based on four quantities: i) g(x,y), the value of the noisy image at (x,y); ii) σ^2_η, the variance of the noise corrupting f(x,y) to form g(x,y); iii) m_L, the local mean of the pixels in S_xy; and iv) σ^2_L, the local variance of the pixels in S_xy. The behavior of the filter is to be as follows: if σ^2_η is zero, the filter should return simply g(x,y); if the local variance σ^2_L is high relative to σ^2_η, the filter should return a value close to g(x,y), since a high local variance typically is associated with edges that should be preserved; and if the two variances are equal, the filter should return the arithmetic mean of the pixels in S_xy, since the local area then has the same properties as the overall image.

An adaptive expression for obtaining f̂(x,y) based on the above assumptions may be written as f̂(x,y) = g(x,y) - (σ^2_η / σ^2_L) [ g(x,y) - m_L ] The only quantity that needs to be known or estimated is the variance of the overall noise, σ^2_η. The other parameters are computed from the pixels in S_xy at each location (x,y) on which the filter window is centered. An implicit assumption in the above expression is that σ^2_η <= σ^2_L, because the noise in our model is additive and position independent. Adaptive Median Filter The median filter performs well as long as the spatial density of the impulse noise is not large. Adaptive median filtering can handle impulse noise even with large probabilities. An additional advantage of the adaptive median filter is that it seeks to preserve detail while smoothing non-impulse noise, something that the traditional median filter does not do. The adaptive median filter also works in a rectangular window area S_xy; unlike the other filters, however, it changes (increases) the size of S_xy during filter operation, depending on certain conditions. Consider the following notation: z_min = minimum gray level in S_xy, z_max = maximum gray level in S_xy, z_med = median of the gray levels in S_xy, z_xy = gray level at coordinates (x,y), and S_max = maximum allowed size of S_xy. The adaptive median filtering algorithm works in two levels, denoted level A and level B, as follows: Level A: if z_min < z_med < z_max, go to level B; otherwise increase the window size and repeat level A, outputting z_med if the window size would exceed S_max. Level B: if z_min < z_xy < z_max, output z_xy; otherwise output z_med.

The adaptive median filter has three main purposes: 1. To remove salt-and-pepper (or impulse) noise, 2. To provide smoothing of other noise that may not be impulsive, and 3. To reduce distortion, such as excessive thinning or thickening of object boundaries. Every time the algorithm outputs a value, the window S_xy is moved to the next location in the image. The algorithm is then reinitialized and applied to the pixels in the new location. PERIODIC NOISE REDUCTION BY FREQUENCY DOMAIN FILTERING Periodic Noise Periodic noise in an image arises typically from electrical and electromechanical interference during image acquisition. This is the only type of spatially dependent noise considered here. Periodic noise can be reduced significantly with frequency domain filtering. Band Reject Filters Band Pass Filters Notch Filters Optimum Notch Filtering / Interactive Restoration Clearly defined interference patterns are not common. Images derived from electro-optical scanners, such as those used in space and aerial imaging, sometimes are corrupted by coupling and amplification of low-level signals in the scanner's electronic circuitry. The resulting images tend to contain pronounced, 2-D periodic structures superimposed on the scene data, with more complex patterns. When several interference components are present, methods like bandpass and band-reject filtering are not always acceptable because they may remove too much image information in the filtering process. The method discussed here is optimum in the sense that it minimizes local variances of the restored image f̂(x,y). The procedure consists of first isolating the principal contributions of the interference pattern and then subtracting a variable, weighted portion of the pattern from the corrupted image. QUESTION AND ANSWERS 1. What is image restoration?

89 Image restoration is the improvement of an image using objective criteria and prior knowledge as to what the image should look like. 2. What is the difference between image enhancement and image restoration? In image enhancement we try to improve the image using subjective criteria, while in image restoration we are trying to reverse a specific damage suffered by the image,using objective criteria. 3. Why may an image require restoration? An image may be degraded because the grey values of individual pixels may be altered, or it may be distorted because the position of individual pixels may be shifted away from their correct position. The second case is the subject of geometric restoration. Geometric restoration is also called image registration because it helps in finding corresponding points between two images of the same region taken from different viewing angles. Image registration is very important in remote sensing when aerial photographs have to be registered against the map, or two aerial photographs of the same region have to be registered with each other. 4. What is the problem of image restoration? The problem of image restoration is: given the degraded image g, recover the original undegraded image f. 5. How can the problem of image restoration be solved? The problem of image restoration can be solved if we have prior knowledge of the point spread function or its Fourier transform (the transfer function) of the degradation process. 6. The white bars in the test pattern shown in figure are 7 pixels wide and 210 pixels high. The separation between bars is 17 pixels. What would this image look like after application of different filters of different sizes? Solution: The matrix representation of a portion of the given image at any end of a vertical bar is 89 Department of ECE

90 a) A 3x3 Min Filter: b) A 5x5 Min Filter: c) A 7x7 Min Filter: d) A 9x9 Min Filter: Explanation: The 0 th percentile result is represented by the Min filter, given by Min filter is useful for finding the darkest points in an image. It can be used to reduce the salt noise from the image. But it removes white points around the border of light objects. But for the given image, the effect of Min filter is decrease in the width and height of the white vertical bars. As the size of the filter increase, the width and height of the vertical bars decrease. (a) (c) (d) a) The resulting image consists of vertical bars of 5 pixels wide and 208 pixels height. There will be no deformation of the corners. The matrix after the application of 3x3 Min filter is shown below: 90 Department of ECE

91 b) The resulting image consists of vertical bars of 3 pixels wide and 206 pixels height. There will be no deformation of the corners. The matrix after the application of 5x5 Min filter is shown below: c) The resulting image consists of vertical bars of 1 pixels wide and 204 pixels height. There will be no deformation of the corners. The matrix after the application of 7x7 Min filter is shown below: Department of ECE

92 d) The resulting image consists of vertical bars of 0 pixels wide and 202 pixels height. There will be no deformation of the corners. The white bars completely disappear from the image. The matrix after the application of 9x9 Min filter is shown below: e) A 3x3 Max Filter: f) A 5x5 Max Filter: g) A 7x7 Max Filter: h) A 9x9 Max Filter: Explanation Max filter is useful for finding the brightest points in an image. It can be used to reduce the pepper noise from the image. But it removes (sets to a light gray level) some dark pixel from the borders of the dark objects. But for the given image, the effect of Max filter is increase in the width and height of the white vertical bars. As the size of the filter increase, the width and height of the vertical bars also increases. (e) (f) (g) e) The resulting image consists of vertical bars of 9 pixels wide and 212 pixels height. There will be no deformation of the corners. The matrix after the application of 3x3 Max filter is shown below: Department of ECE

93 f) The resulting image consists of vertical bars of 11 pixels wide and 214 pixels height. There will be no deformation of the corners. The matrix after the application of 5x5 Max filter is shown below: g) The resulting image consists of vertical bars of 13 pixels wide and 216 pixels height. There will be no deformation of the corners. The matrix after the application of 7x7 Max filter is shown below: Department of ECE

94 h) The resulting image consists of vertical bars of 15 pixels wide and 218pixels height. There will be no deformation of the corners. The matrix after the application of 9x9 Max filter is shown below: i) A 3x3 Arithmetic Mean Filter: j) A 5x5 Arithmetic Mean Filter: k) A 7x7 Arithmetic Mean Filter: k) A 9x9 Arithmetic Mean Filter: Explanation: (i) (j) (k) Arithmetic mean filter causes blurring. Burring increases with the size of the mask. i) Since the width of each vertical bar is 7 pixels wide, a 3x3 arithmetic mean filter slightly distorts the edges of the vertical bars. As a result, the edges of the vertical bars become a bit darker. There will be some deformation at the corners of the bars, they become rounded Department of ECE

95 j) As the size of the mask or filter increases, the vertical bars will distort more, and blurring increases. Since the size of the mask here is 5x5, after the application of the filter, only the 3 centre lines of the vertical bars remains white. As move we move from the center of the vertical bar to the either of the edge, the pixels become darker. There will be some deformation at the corners of the bars, they become rounded k) As the size of the mask or filter increases, the vertical bars will distort more, and blurring increases. Since the size of the mask here is 7x7, after the application of the filter, only the centre line of the vertical bars remains white. As move we move from the center of the vertical bar to the either of the edge, the pixels become darker. There will be some deformation at the corners of the bars, they become rounded. l) As the size of the mask is larger than the width of the bars, the vertical bars are completely distorted. The burring also increases compared to the previous case. The corners also become more rounded and deformed. m) A 3x3 Geometric Mean Filter n) A 5x5 Geometric Mean Filter o) A 7x7 Geometric Mean Filter p) A 9x9 Geometric Mean Filter Explanation An image restored using a geometric mean filter is given by the expression Here, each restored pixel is given by the product of the pixels in the subimage window, raised to the power 1/mn. A geometric mean filter achieves smoothing comparable to the arithmetic mean 95 Department of ECE

96 filter, nut it tends to lose less image detail in the process. But for the given image, the effect of Min filter is decrease in the width and height of the white vertical bars. As the size of the filter increase, the width and height of the vertical bars decrease. (n) (o) (p) m) The resulting image consists of vertical bars of 5 pixels wide and 208 pixels height. There will be no deformation of the corners. The matrix after the application of 3x3 Geometric Mean filter is shown below: n) The resulting image consists of vertical bars of 3 pixels wide and 206 pixels height. There will be no deformation of the corners. The matrix after the application of 5x5 Geometric Mean filter is shown below: o) The resulting image consists of vertical bars of 1 pixels wide and 204 pixels height. There will be no deformation of the corners. The matrix after the application of 7x7 Geometric Mean Filter is shown below: 96 Department of ECE

97 p) The resulting image consists of vertical bars of 0 pixels wide and 202 pixels height. There will be no deformation of the corners. The white bars completely disappear from the image. The matrix after the application of 9x9 Geometric Mean filter is shown below: 97 Department of ECE

UNIT-VII IMAGE SEGMENTATION DETECTION OF DISCONTINUITIES There are 3 types of gray level discontinuities in an image: points, lines and edges. Point Detection Line Detection

99 Edge Detection 99 Department of ECE

100 100 Department of ECE

101 101 Department of ECE

102 Gradient Operators Department of ECE

103 103 Department of ECE

104 The Laplacian Department of ECE

105 THRESHOLDING Department of ECE

106 Global Thresholding Department of ECE
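The worked figures and equations for this section are not reproduced in this text, so the following is only a hedged sketch (NumPy; the function name and the stopping tolerance delta are mine) of the standard iterative scheme for choosing a single global threshold:

import numpy as np

def global_threshold(f, delta=0.5):
    """Iterative global threshold selection: start from the overall mean and
    repeatedly set T to the midpoint of the two class means until it stabilizes."""
    T = f.mean()                                   # initial estimate
    while True:
        low, high = f[f <= T], f[f > T]
        T_new = 0.5 * (low.mean() + high.mean())   # midpoint of the class means
        if abs(T_new - T) < delta:
            return T_new
        T = T_new

# Segmentation: pixels above T form one class, the rest the other
# mask = f > global_threshold(f)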

107 Adaptive/Local Thresholding REGION BASED SEGMENTATION Basic Formulation Department of ECE

108 Region Growing Region Splitting and Merging Department of ECE

QUESTION AND ANSWERS 1. Suppose that an image has the following intensity distributions, where p1(z) corresponds to the intensity of the objects and p2(z) corresponds to the intensity of the background. Assume that P1 = P2 and find the optimal threshold between object and background pixels, as shown in figure 7. [16] Solution


More information

Introduction to Digital Image Processing

Introduction to Digital Image Processing Fall 2005 Image Enhancement in the Spatial Domain: Histograms, Arithmetic/Logic Operators, Basics of Spatial Filtering, Smoothing Spatial Filters Tuesday, February 7 2006, Overview (1): Before We Begin

More information

Intensity Transformations and Spatial Filtering

Intensity Transformations and Spatial Filtering 77 Chapter 3 Intensity Transformations and Spatial Filtering Spatial domain refers to the image plane itself, and image processing methods in this category are based on direct manipulation of pixels in

More information

All forms of EM waves travel at the speed of light in a vacuum = 3.00 x 10 8 m/s This speed is constant in air as well

All forms of EM waves travel at the speed of light in a vacuum = 3.00 x 10 8 m/s This speed is constant in air as well Pre AP Physics Light & Optics Chapters 14-16 Light is an electromagnetic wave Electromagnetic waves: Oscillating electric and magnetic fields that are perpendicular to the direction the wave moves Difference

More information

Chapter - 2 : IMAGE ENHANCEMENT

Chapter - 2 : IMAGE ENHANCEMENT Chapter - : IMAGE ENHANCEMENT The principal objective of enhancement technique is to process a given image so that the result is more suitable than the original image for a specific application Image Enhancement

More information

UNIT - 5 IMAGE ENHANCEMENT IN SPATIAL DOMAIN

UNIT - 5 IMAGE ENHANCEMENT IN SPATIAL DOMAIN UNIT - 5 IMAGE ENHANCEMENT IN SPATIAL DOMAIN Spatial domain methods Spatial domain refers to the image plane itself, and approaches in this category are based on direct manipulation of pixels in an image.

More information

specular diffuse reflection.

specular diffuse reflection. Lesson 8 Light and Optics The Nature of Light Properties of Light: Reflection Refraction Interference Diffraction Polarization Dispersion and Prisms Total Internal Reflection Huygens s Principle The Nature

More information

CHAPTER 3 IMAGE ENHANCEMENT IN THE SPATIAL DOMAIN

CHAPTER 3 IMAGE ENHANCEMENT IN THE SPATIAL DOMAIN CHAPTER 3 IMAGE ENHANCEMENT IN THE SPATIAL DOMAIN CHAPTER 3: IMAGE ENHANCEMENT IN THE SPATIAL DOMAIN Principal objective: to process an image so that the result is more suitable than the original image

More information

An Intuitive Explanation of Fourier Theory

An Intuitive Explanation of Fourier Theory An Intuitive Explanation of Fourier Theory Steven Lehar slehar@cns.bu.edu Fourier theory is pretty complicated mathematically. But there are some beautifully simple holistic concepts behind Fourier theory

More information

Digital Image Fundamentals II

Digital Image Fundamentals II Digital Image Fundamentals II 1. Image modeling and representations 2. Pixels and Pixel relations 3. Arithmetic operations of images 4. Image geometry operation 5. Image processing with Matlab - Image

More information

Chapter 11 Representation & Description

Chapter 11 Representation & Description Chain Codes Chain codes are used to represent a boundary by a connected sequence of straight-line segments of specified length and direction. The direction of each segment is coded by using a numbering

More information

Range Imaging Through Triangulation. Range Imaging Through Triangulation. Range Imaging Through Triangulation. Range Imaging Through Triangulation

Range Imaging Through Triangulation. Range Imaging Through Triangulation. Range Imaging Through Triangulation. Range Imaging Through Triangulation Obviously, this is a very slow process and not suitable for dynamic scenes. To speed things up, we can use a laser that projects a vertical line of light onto the scene. This laser rotates around its vertical

More information

EE795: Computer Vision and Intelligent Systems

EE795: Computer Vision and Intelligent Systems EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 WRI C225 Lecture 02 130124 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Basics Image Formation Image Processing 3 Intelligent

More information

Practical Image and Video Processing Using MATLAB

Practical Image and Video Processing Using MATLAB Practical Image and Video Processing Using MATLAB Chapter 18 Feature extraction and representation What will we learn? What is feature extraction and why is it a critical step in most computer vision and

More information

Motivation. Intensity Levels

Motivation. Intensity Levels Motivation Image Intensity and Point Operations Dr. Edmund Lam Department of Electrical and Electronic Engineering The University of Hong ong A digital image is a matrix of numbers, each corresponding

More information

Babu Madhav Institute of Information Technology Years Integrated M.Sc.(IT)(Semester - 7)

Babu Madhav Institute of Information Technology Years Integrated M.Sc.(IT)(Semester - 7) 5 Years Integrated M.Sc.(IT)(Semester - 7) 060010707 Digital Image Processing UNIT 1 Introduction to Image Processing Q: 1 Answer in short. 1. What is digital image? 1. Define pixel or picture element?

More information

Digital Image Processing Fundamentals

Digital Image Processing Fundamentals Ioannis Pitas Digital Image Processing Fundamentals Chapter 7 Shape Description Answers to the Chapter Questions Thessaloniki 1998 Chapter 7: Shape description 7.1 Introduction 1. Why is invariance to

More information

1.Some Basic Gray Level Transformations

1.Some Basic Gray Level Transformations 1.Some Basic Gray Level Transformations We begin the study of image enhancement techniques by discussing gray-level transformation functions.these are among the simplest of all image enhancement techniques.the

More information

IMAGE ENHANCEMENT IN THE SPATIAL DOMAIN

IMAGE ENHANCEMENT IN THE SPATIAL DOMAIN 1 Image Enhancement in the Spatial Domain 3 IMAGE ENHANCEMENT IN THE SPATIAL DOMAIN Unit structure : 3.0 Objectives 3.1 Introduction 3.2 Basic Grey Level Transform 3.3 Identity Transform Function 3.4 Image

More information

CoE4TN4 Image Processing. Chapter 5 Image Restoration and Reconstruction

CoE4TN4 Image Processing. Chapter 5 Image Restoration and Reconstruction CoE4TN4 Image Processing Chapter 5 Image Restoration and Reconstruction Image Restoration Similar to image enhancement, the ultimate goal of restoration techniques is to improve an image Restoration: a

More information

Image representation. 1. Introduction

Image representation. 1. Introduction Image representation Introduction Representation schemes Chain codes Polygonal approximations The skeleton of a region Boundary descriptors Some simple descriptors Shape numbers Fourier descriptors Moments

More information

Lecture 7 Notes: 07 / 11. Reflection and refraction

Lecture 7 Notes: 07 / 11. Reflection and refraction Lecture 7 Notes: 07 / 11 Reflection and refraction When an electromagnetic wave, such as light, encounters the surface of a medium, some of it is reflected off the surface, while some crosses the boundary

More information

Chapter 18. Geometric Operations

Chapter 18. Geometric Operations Chapter 18 Geometric Operations To this point, the image processing operations have computed the gray value (digital count) of the output image pixel based on the gray values of one or more input pixels;

More information

Image Restoration and Reconstruction

Image Restoration and Reconstruction Image Restoration and Reconstruction Image restoration Objective process to improve an image, as opposed to the subjective process of image enhancement Enhancement uses heuristics to improve the image

More information

Chapter 3: Intensity Transformations and Spatial Filtering

Chapter 3: Intensity Transformations and Spatial Filtering Chapter 3: Intensity Transformations and Spatial Filtering 3.1 Background 3.2 Some basic intensity transformation functions 3.3 Histogram processing 3.4 Fundamentals of spatial filtering 3.5 Smoothing

More information

PHY 222 Lab 11 Interference and Diffraction Patterns Investigating interference and diffraction of light waves

PHY 222 Lab 11 Interference and Diffraction Patterns Investigating interference and diffraction of light waves PHY 222 Lab 11 Interference and Diffraction Patterns Investigating interference and diffraction of light waves Print Your Name Print Your Partners' Names Instructions April 17, 2015 Before lab, read the

More information

CP467 Image Processing and Pattern Recognition

CP467 Image Processing and Pattern Recognition CP467 Image Processing and Pattern Recognition Instructor: Hongbing Fan Introduction About DIP & PR About this course Lecture 1: an overview of DIP DIP&PR show What is Digital Image? We use digital image

More information

EXAM SOLUTIONS. Image Processing and Computer Vision Course 2D1421 Monday, 13 th of March 2006,

EXAM SOLUTIONS. Image Processing and Computer Vision Course 2D1421 Monday, 13 th of March 2006, School of Computer Science and Communication, KTH Danica Kragic EXAM SOLUTIONS Image Processing and Computer Vision Course 2D1421 Monday, 13 th of March 2006, 14.00 19.00 Grade table 0-25 U 26-35 3 36-45

More information

Lecture 6: Edge Detection

Lecture 6: Edge Detection #1 Lecture 6: Edge Detection Saad J Bedros sbedros@umn.edu Review From Last Lecture Options for Image Representation Introduced the concept of different representation or transformation Fourier Transform

More information

DD2423 Image Analysis and Computer Vision IMAGE FORMATION. Computational Vision and Active Perception School of Computer Science and Communication

DD2423 Image Analysis and Computer Vision IMAGE FORMATION. Computational Vision and Active Perception School of Computer Science and Communication DD2423 Image Analysis and Computer Vision IMAGE FORMATION Mårten Björkman Computational Vision and Active Perception School of Computer Science and Communication November 8, 2013 1 Image formation Goal:

More information

Sensor Modalities. Sensor modality: Different modalities:

Sensor Modalities. Sensor modality: Different modalities: Sensor Modalities Sensor modality: Sensors which measure same form of energy and process it in similar ways Modality refers to the raw input used by the sensors Different modalities: Sound Pressure Temperature

More information

Lecture 4. Digital Image Enhancement. 1. Principle of image enhancement 2. Spatial domain transformation. Histogram processing

Lecture 4. Digital Image Enhancement. 1. Principle of image enhancement 2. Spatial domain transformation. Histogram processing Lecture 4 Digital Image Enhancement 1. Principle of image enhancement 2. Spatial domain transformation Basic intensity it tranfomation ti Histogram processing Principle Objective of Enhancement Image enhancement

More information

UNIT 2 2D TRANSFORMATIONS

UNIT 2 2D TRANSFORMATIONS UNIT 2 2D TRANSFORMATIONS Introduction With the procedures for displaying output primitives and their attributes, we can create variety of pictures and graphs. In many applications, there is also a need

More information

Lecture 1 Introduction & Fundamentals

Lecture 1 Introduction & Fundamentals Digital Image Processing Lecture 1 Introduction & Fundamentals Presented By: Diwakar Yagyasen Sr. Lecturer CS&E, BBDNITM, Lucknow What is an image? a representation, likeness, or imitation of an object

More information

Interpolation is a basic tool used extensively in tasks such as zooming, shrinking, rotating, and geometric corrections.

Interpolation is a basic tool used extensively in tasks such as zooming, shrinking, rotating, and geometric corrections. Image Interpolation 48 Interpolation is a basic tool used extensively in tasks such as zooming, shrinking, rotating, and geometric corrections. Fundamentally, interpolation is the process of using known

More information

Texture Analysis. Selim Aksoy Department of Computer Engineering Bilkent University

Texture Analysis. Selim Aksoy Department of Computer Engineering Bilkent University Texture Analysis Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr Texture An important approach to image description is to quantify its texture content. Texture

More information

Problem definition Image acquisition Image segmentation Connected component analysis. Machine vision systems - 1

Problem definition Image acquisition Image segmentation Connected component analysis. Machine vision systems - 1 Machine vision systems Problem definition Image acquisition Image segmentation Connected component analysis Machine vision systems - 1 Problem definition Design a vision system to see a flat world Page

More information

CHAPTER 2 TEXTURE CLASSIFICATION METHODS GRAY LEVEL CO-OCCURRENCE MATRIX AND TEXTURE UNIT

CHAPTER 2 TEXTURE CLASSIFICATION METHODS GRAY LEVEL CO-OCCURRENCE MATRIX AND TEXTURE UNIT CHAPTER 2 TEXTURE CLASSIFICATION METHODS GRAY LEVEL CO-OCCURRENCE MATRIX AND TEXTURE UNIT 2.1 BRIEF OUTLINE The classification of digital imagery is to extract useful thematic information which is one

More information

EEM 463 Introduction to Image Processing. Week 3: Intensity Transformations

EEM 463 Introduction to Image Processing. Week 3: Intensity Transformations EEM 463 Introduction to Image Processing Week 3: Intensity Transformations Fall 2013 Instructor: Hatice Çınar Akakın, Ph.D. haticecinarakakin@anadolu.edu.tr Anadolu University Enhancement Domains Spatial

More information

Image Restoration and Reconstruction

Image Restoration and Reconstruction Image Restoration and Reconstruction Image restoration Objective process to improve an image Recover an image by using a priori knowledge of degradation phenomenon Exemplified by removal of blur by deblurring

More information

Physics 1CL WAVE OPTICS: INTERFERENCE AND DIFFRACTION Fall 2009

Physics 1CL WAVE OPTICS: INTERFERENCE AND DIFFRACTION Fall 2009 Introduction An important property of waves is interference. You are familiar with some simple examples of interference of sound waves. This interference effect produces positions having large amplitude

More information

Chapter 37. Wave Optics

Chapter 37. Wave Optics Chapter 37 Wave Optics Wave Optics Wave optics is a study concerned with phenomena that cannot be adequately explained by geometric (ray) optics. Sometimes called physical optics These phenomena include:

More information

Chapter 24. Wave Optics

Chapter 24. Wave Optics Chapter 24 Wave Optics Wave Optics The wave nature of light is needed to explain various phenomena Interference Diffraction Polarization The particle nature of light was the basis for ray (geometric) optics

More information

Lecture 7: Most Common Edge Detectors

Lecture 7: Most Common Edge Detectors #1 Lecture 7: Most Common Edge Detectors Saad Bedros sbedros@umn.edu Edge Detection Goal: Identify sudden changes (discontinuities) in an image Intuitively, most semantic and shape information from the

More information

CS443: Digital Imaging and Multimedia Binary Image Analysis. Spring 2008 Ahmed Elgammal Dept. of Computer Science Rutgers University

CS443: Digital Imaging and Multimedia Binary Image Analysis. Spring 2008 Ahmed Elgammal Dept. of Computer Science Rutgers University CS443: Digital Imaging and Multimedia Binary Image Analysis Spring 2008 Ahmed Elgammal Dept. of Computer Science Rutgers University Outlines A Simple Machine Vision System Image segmentation by thresholding

More information

Introduction to Homogeneous coordinates

Introduction to Homogeneous coordinates Last class we considered smooth translations and rotations of the camera coordinate system and the resulting motions of points in the image projection plane. These two transformations were expressed mathematically

More information

What is it? How does it work? How do we use it?

What is it? How does it work? How do we use it? What is it? How does it work? How do we use it? Dual Nature http://www.youtube.com/watch?v=dfpeprq7ogc o Electromagnetic Waves display wave behavior o Created by oscillating electric and magnetic fields

More information

Fundamentals of Stereo Vision Michael Bleyer LVA Stereo Vision

Fundamentals of Stereo Vision Michael Bleyer LVA Stereo Vision Fundamentals of Stereo Vision Michael Bleyer LVA Stereo Vision What Happened Last Time? Human 3D perception (3D cinema) Computational stereo Intuitive explanation of what is meant by disparity Stereo matching

More information

(Refer Slide Time: 00:02:00)

(Refer Slide Time: 00:02:00) Computer Graphics Prof. Sukhendu Das Dept. of Computer Science and Engineering Indian Institute of Technology, Madras Lecture - 18 Polyfill - Scan Conversion of a Polygon Today we will discuss the concepts

More information

Biometrics Technology: Image Processing & Pattern Recognition (by Dr. Dickson Tong)

Biometrics Technology: Image Processing & Pattern Recognition (by Dr. Dickson Tong) Biometrics Technology: Image Processing & Pattern Recognition (by Dr. Dickson Tong) References: [1] http://homepages.inf.ed.ac.uk/rbf/hipr2/index.htm [2] http://www.cs.wisc.edu/~dyer/cs540/notes/vision.html

More information

9 length of contour = no. of horizontal and vertical components + ( 2 no. of diagonal components) diameter of boundary B

9 length of contour = no. of horizontal and vertical components + ( 2 no. of diagonal components) diameter of boundary B 8. Boundary Descriptor 8.. Some Simple Descriptors length of contour : simplest descriptor - chain-coded curve 9 length of contour no. of horiontal and vertical components ( no. of diagonal components

More information

Intensity Transformation and Spatial Filtering

Intensity Transformation and Spatial Filtering Intensity Transformation and Spatial Filtering Outline of the Lecture Introduction. Intensity Transformation Functions. Piecewise-Linear Transformation Functions. Introduction Definition: Image enhancement

More information

Chapter 12 Notes: Optics

Chapter 12 Notes: Optics Chapter 12 Notes: Optics How can the paths traveled by light rays be rearranged in order to form images? In this chapter we will consider just one form of electromagnetic wave: visible light. We will be

More information

Scanner Parameter Estimation Using Bilevel Scans of Star Charts

Scanner Parameter Estimation Using Bilevel Scans of Star Charts ICDAR, Seattle WA September Scanner Parameter Estimation Using Bilevel Scans of Star Charts Elisa H. Barney Smith Electrical and Computer Engineering Department Boise State University, Boise, Idaho 8375

More information

Computer Vision. Coordinates. Prof. Flávio Cardeal DECOM / CEFET- MG.

Computer Vision. Coordinates. Prof. Flávio Cardeal DECOM / CEFET- MG. Computer Vision Coordinates Prof. Flávio Cardeal DECOM / CEFET- MG cardeal@decom.cefetmg.br Abstract This lecture discusses world coordinates and homogeneous coordinates, as well as provides an overview

More information

Using Web Camera Technology to Monitor Steel Construction

Using Web Camera Technology to Monitor Steel Construction Using Web Camera Technology to Monitor Steel Construction Kerry T. Slattery, Ph.D., P.E. Southern Illinois University Edwardsville Edwardsville, Illinois Many construction companies install electronic

More information

Range Sensors (time of flight) (1)

Range Sensors (time of flight) (1) Range Sensors (time of flight) (1) Large range distance measurement -> called range sensors Range information: key element for localization and environment modeling Ultrasonic sensors, infra-red sensors

More information

Motivation. Gray Levels

Motivation. Gray Levels Motivation Image Intensity and Point Operations Dr. Edmund Lam Department of Electrical and Electronic Engineering The University of Hong ong A digital image is a matrix of numbers, each corresponding

More information

Digital Image Processing. Prof. P. K. Biswas. Department of Electronic & Electrical Communication Engineering

Digital Image Processing. Prof. P. K. Biswas. Department of Electronic & Electrical Communication Engineering Digital Image Processing Prof. P. K. Biswas Department of Electronic & Electrical Communication Engineering Indian Institute of Technology, Kharagpur Lecture - 21 Image Enhancement Frequency Domain Processing

More information

Image Acquisition + Histograms

Image Acquisition + Histograms Image Processing - Lesson 1 Image Acquisition + Histograms Image Characteristics Image Acquisition Image Digitization Sampling Quantization Histograms Histogram Equalization What is an Image? An image

More information

Color and Shading. Color. Shapiro and Stockman, Chapter 6. Color and Machine Vision. Color and Perception

Color and Shading. Color. Shapiro and Stockman, Chapter 6. Color and Machine Vision. Color and Perception Color and Shading Color Shapiro and Stockman, Chapter 6 Color is an important factor for for human perception for object and material identification, even time of day. Color perception depends upon both

More information

MET71 COMPUTER AIDED DESIGN

MET71 COMPUTER AIDED DESIGN UNIT - II BRESENHAM S ALGORITHM BRESENHAM S LINE ALGORITHM Bresenham s algorithm enables the selection of optimum raster locations to represent a straight line. In this algorithm either pixels along X

More information

(Refer Slide Time: 0:32)

(Refer Slide Time: 0:32) Digital Image Processing. Professor P. K. Biswas. Department of Electronics and Electrical Communication Engineering. Indian Institute of Technology, Kharagpur. Lecture-57. Image Segmentation: Global Processing

More information

CS 664 Segmentation. Daniel Huttenlocher

CS 664 Segmentation. Daniel Huttenlocher CS 664 Segmentation Daniel Huttenlocher Grouping Perceptual Organization Structural relationships between tokens Parallelism, symmetry, alignment Similarity of token properties Often strong psychophysical

More information

Operation of machine vision system

Operation of machine vision system ROBOT VISION Introduction The process of extracting, characterizing and interpreting information from images. Potential application in many industrial operation. Selection from a bin or conveyer, parts

More information

Chapter 26 Geometrical Optics

Chapter 26 Geometrical Optics Chapter 26 Geometrical Optics 26.1 The Reflection of Light 26.2 Forming Images With a Plane Mirror 26.3 Spherical Mirrors 26.4 Ray Tracing and the Mirror Equation 26.5 The Refraction of Light 26.6 Ray

More information

CS4733 Class Notes, Computer Vision

CS4733 Class Notes, Computer Vision CS4733 Class Notes, Computer Vision Sources for online computer vision tutorials and demos - http://www.dai.ed.ac.uk/hipr and Computer Vision resources online - http://www.dai.ed.ac.uk/cvonline Vision

More information

Hello, welcome to the video lecture series on Digital Image Processing. So in today's lecture

Hello, welcome to the video lecture series on Digital Image Processing. So in today's lecture Digital Image Processing Prof. P. K. Biswas Department of Electronics and Electrical Communications Engineering Indian Institute of Technology, Kharagpur Module 02 Lecture Number 10 Basic Transform (Refer

More information

Morphological Image Processing

Morphological Image Processing Morphological Image Processing Ranga Rodrigo October 9, 29 Outline Contents Preliminaries 2 Dilation and Erosion 3 2. Dilation.............................................. 3 2.2 Erosion..............................................

More information

3. Image formation, Fourier analysis and CTF theory. Paula da Fonseca

3. Image formation, Fourier analysis and CTF theory. Paula da Fonseca 3. Image formation, Fourier analysis and CTF theory Paula da Fonseca EM course 2017 - Agenda - Overview of: Introduction to Fourier analysis o o o o Sine waves Fourier transform (simple examples of 1D

More information

INTERFERENCE. where, m = 0, 1, 2,... (1.2) otherwise, if it is half integral multiple of wavelength, the interference would be destructive.

INTERFERENCE. where, m = 0, 1, 2,... (1.2) otherwise, if it is half integral multiple of wavelength, the interference would be destructive. 1.1 INTERFERENCE When two (or more than two) waves of the same frequency travel almost in the same direction and have a phase difference that remains constant with time, the resultant intensity of light

More information

PHYS:1200 LECTURE 32 LIGHT AND OPTICS (4)

PHYS:1200 LECTURE 32 LIGHT AND OPTICS (4) 1 PHYS:1200 LECTURE 32 LIGHT AND OPTICS (4) The first three lectures in this unit dealt with what is for called geometric optics. Geometric optics, treats light as a collection of rays that travel in straight

More information

SUPPLEMENTARY FILE S1: 3D AIRWAY TUBE RECONSTRUCTION AND CELL-BASED MECHANICAL MODEL. RELATED TO FIGURE 1, FIGURE 7, AND STAR METHODS.

SUPPLEMENTARY FILE S1: 3D AIRWAY TUBE RECONSTRUCTION AND CELL-BASED MECHANICAL MODEL. RELATED TO FIGURE 1, FIGURE 7, AND STAR METHODS. SUPPLEMENTARY FILE S1: 3D AIRWAY TUBE RECONSTRUCTION AND CELL-BASED MECHANICAL MODEL. RELATED TO FIGURE 1, FIGURE 7, AND STAR METHODS. 1. 3D AIRWAY TUBE RECONSTRUCTION. RELATED TO FIGURE 1 AND STAR METHODS

More information

Image Sampling and Quantisation

Image Sampling and Quantisation Image Sampling and Quantisation Introduction to Signal and Image Processing Prof. Dr. Philippe Cattin MIAC, University of Basel 1 of 46 22.02.2016 09:17 Contents Contents 1 Motivation 2 Sampling Introduction

More information

Chapter 37. Interference of Light Waves

Chapter 37. Interference of Light Waves Chapter 37 Interference of Light Waves Wave Optics Wave optics is a study concerned with phenomena that cannot be adequately explained by geometric (ray) optics These phenomena include: Interference Diffraction

More information

CoE4TN4 Image Processing

CoE4TN4 Image Processing CoE4TN4 Image Processing Chapter 11 Image Representation & Description Image Representation & Description After an image is segmented into regions, the regions are represented and described in a form suitable

More information

Image Sampling & Quantisation

Image Sampling & Quantisation Image Sampling & Quantisation Biomedical Image Analysis Prof. Dr. Philippe Cattin MIAC, University of Basel Contents 1 Motivation 2 Sampling Introduction and Motivation Sampling Example Quantisation Example

More information

10/5/09 1. d = 2. Range Sensors (time of flight) (2) Ultrasonic Sensor (time of flight, sound) (1) Ultrasonic Sensor (time of flight, sound) (2) 4.1.

10/5/09 1. d = 2. Range Sensors (time of flight) (2) Ultrasonic Sensor (time of flight, sound) (1) Ultrasonic Sensor (time of flight, sound) (2) 4.1. Range Sensors (time of flight) (1) Range Sensors (time of flight) (2) arge range distance measurement -> called range sensors Range information: key element for localization and environment modeling Ultrasonic

More information

Lecture 4: Spatial Domain Transformations

Lecture 4: Spatial Domain Transformations # Lecture 4: Spatial Domain Transformations Saad J Bedros sbedros@umn.edu Reminder 2 nd Quiz on the manipulator Part is this Fri, April 7 205, :5 AM to :0 PM Open Book, Open Notes, Focus on the material

More information

f. (5.3.1) So, the higher frequency means the lower wavelength. Visible part of light spectrum covers the range of wavelengths from

f. (5.3.1) So, the higher frequency means the lower wavelength. Visible part of light spectrum covers the range of wavelengths from Lecture 5-3 Interference and Diffraction of EM Waves During our previous lectures we have been talking about electromagnetic (EM) waves. As we know, harmonic waves of any type represent periodic process

More information

OPTIMIZING A VIDEO PREPROCESSOR FOR OCR. MR IBM Systems Dev Rochester, elopment Division Minnesota

OPTIMIZING A VIDEO PREPROCESSOR FOR OCR. MR IBM Systems Dev Rochester, elopment Division Minnesota OPTIMIZING A VIDEO PREPROCESSOR FOR OCR MR IBM Systems Dev Rochester, elopment Division Minnesota Summary This paper describes how optimal video preprocessor performance can be achieved using a software

More information

09/11/2017. Morphological image processing. Morphological image processing. Morphological image processing. Morphological image processing (binary)

09/11/2017. Morphological image processing. Morphological image processing. Morphological image processing. Morphological image processing (binary) Towards image analysis Goal: Describe the contents of an image, distinguishing meaningful information from irrelevant one. Perform suitable transformations of images so as to make explicit particular shape

More information

Physics 11. Unit 8 Geometric Optics Part 1

Physics 11. Unit 8 Geometric Optics Part 1 Physics 11 Unit 8 Geometric Optics Part 1 1.Review of waves In the previous section, we have investigated the nature and behaviors of waves in general. We know that all waves possess the following characteristics:

More information