Chapter 11 Image Processing


Low-level Image Processing

Operates directly on a stored image to improve or enhance it. The stored image consists of a two-dimensional array of pixels (picture elements), p(i, j), with the origin (0, 0) at the top left corner.

(Figure 11.1 Pixmap.)

Many low-level image-processing operations assume monochrome images and refer to pixels as having gray level values or intensities.

Computational Requirements

Suppose a pixmap has 1024 x 1024 pixels and 8-bit pixels. The storage requirement is 2^20 bytes (1 Mbyte).

Suppose each pixel must be operated upon just once. Then 2^20 operations are needed in the time of one frame. At 10^-8 second/operation (10 ns/operation), this would take 10 ms.

In real-time applications, the speed of computation must be at the frame rate (typically 60-85 frames/second). All pixels in the image must be processed in the time of one frame; that is, in 12-16 ms. Typically, many high-complexity operations must be performed, not just one operation.

Point Processing

Operations that produce an output based upon the value of a single pixel.

Thresholding: pixels with values above a predetermined threshold value are kept and the others are reduced to 0. Given a pixel, x_i, the operation on each pixel is

if (x_i < threshold) x_i = 0; else x_i = 1;

Contrast Stretching: the range of gray level values is extended to make details more visible. Given a pixel of value x_i within the range x_l to x_h, the contrast is stretched to the range x_L to x_H by

x_i' = (x_i - x_l)(x_H - x_L)/(x_h - x_l) + x_L

Gray Level Reduction: the number of bits used to represent the gray level is reduced. A simple method would be to truncate the less significant bits.
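The two point operations above can be sketched in C as follows (a minimal illustration; the function and variable names are my own, not from the text):

#include <stdio.h>

#define N 8

/* Threshold: pixels below the threshold become 0, the others 1. */
void threshold(unsigned char x[], int n, unsigned char t) {
    for (int i = 0; i < n; i++)
        x[i] = (x[i] < t) ? 0 : 1;
}

/* Contrast stretch: map the input range [xl, xh] onto [xL, xH]. */
void contrast_stretch(unsigned char x[], int n,
                      int xl, int xh, int xL, int xH) {
    for (int i = 0; i < n; i++)
        x[i] = (unsigned char)((x[i] - xl) * (xH - xL) / (xh - xl) + xL);
}

int main(void) {
    unsigned char img[N] = {10, 40, 80, 120, 160, 200, 240, 250};
    contrast_stretch(img, N, 10, 250, 0, 255);  /* stretch to full range */
    for (int i = 0; i < N; i++) printf("%d ", img[i]);
    printf("\n");
    return 0;
}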

Histogram

Shows the number of pixels in the image at each gray level.

(Figure 11.2 Image histogram: number of pixels against gray level, 0 to 255.)

Sequential code:

for (i = 0; i < height_max; i++)
  for (j = 0; j < width_max; j++)
    hist[p[i][j]] = hist[p[i][j]] + 1;

where the pixels are contained in the array p[][] and hist[k] will hold the number of pixels having the kth gray level. This is similar to adding numbers to an accumulating sum, and similar parallel solutions can be used for computing histograms, as sketched below.
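The text leaves the parallel version open; one common pattern, sketched here under the assumption of a shared-memory machine with OpenMP (the text does not prescribe an API), gives each thread a private histogram and then combines the partial histograms, just as partial sums are combined:

#include <omp.h>
#include <string.h>

#define GRAY_LEVELS 256

void histogram(const unsigned char *p, int n, long hist[GRAY_LEVELS]) {
    memset(hist, 0, GRAY_LEVELS * sizeof(long));
    #pragma omp parallel
    {
        long local[GRAY_LEVELS] = {0};     /* private partial histogram */
        #pragma omp for nowait
        for (int i = 0; i < n; i++)
            local[p[i]]++;
        #pragma omp critical               /* combine partial results */
        for (int k = 0; k < GRAY_LEVELS; k++)
            hist[k] += local[k];
    }
}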

Smoothing, Sharpening, and Noise Reduction

Smoothing suppresses large fluctuations in intensity over the image area and can be achieved by reducing the high-frequency content. Sharpening accentuates the transitions, enhancing the detail, and can be achieved in a number of ways. Noise reduction suppresses a noise signal present in the image.

These often require a local operation with access to a group of pixels around the pixel to be updated. A common group size is 3 x 3:

(Figure 11.3 Pixel values for a 3 x 3 group: x_0, x_1, x_2 / x_3, x_4, x_5 / x_6, x_7, x_8, with x_4 the center pixel.)

Mean

A simple smoothing technique is to take the mean or average of a group of pixels as the new value of the central pixel. Given a 3 x 3 group, the computation is

x_4' = (x_0 + x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8)/9

where x_4' is the new value for x_4.

Sequential Code

Nine steps to compute the average for each pixel, or 9n steps for n pixels; a sequential time complexity of O(n).

Parallel Code

The number of steps can be reduced by separating the computation into four data transfer steps performed in lock-step fashion:

Step 1: each pixel adds the pixel from its left.
Step 2: each pixel adds the pixel from its right.
Step 3: each pixel adds the pixel from above.
Step 4: each pixel adds the pixel from below.

(Figure 11.4 Four-step data transfer for the computation of mean.)

(Figure 11.5 Parallel mean data accumulation: the partial sums after steps 1 through 4, ending with every center pixel holding the full nine-pixel sum.)
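For concreteness, a minimal sequential sketch of the 3 x 3 mean computation described above (illustrative C; border pixels are simply skipped here, since the text does not specify a border policy):

/* New image written to a separate buffer so that already-updated
   pixels do not contaminate later averages. */
void mean_filter(const unsigned char *in, unsigned char *out, int w, int h) {
    for (int i = 1; i < h - 1; i++)            /* skip the border pixels */
        for (int j = 1; j < w - 1; j++) {
            int sum = 0;
            for (int di = -1; di <= 1; di++)
                for (int dj = -1; dj <= 1; dj++)
                    sum += in[(i + di) * w + (j + dj)];
            out[i * w + j] = (unsigned char)(sum / 9);  /* x_4' = sum/9 */
        }
}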

Median

Sequential Code

The median can be found by ordering the pixel values from smallest to largest and choosing the center pixel value (assuming an odd number of pixels). With a 3 x 3 group, suppose the values in ascending order are y_0, y_1, y_2, y_3, y_4, y_5, y_6, y_7, and y_8. The median is y_4.

This suggests that all the values in the group must first be sorted, and then the fifth element taken to replace the original value of the pixel. Using bubble sort, in which the lesser values are found first in order, sorting could in fact be terminated after the fifth lowest value is obtained. The number of steps for finding each median is then 8 + 7 + 6 + 5 + 4 = 30 steps, or 30n steps for n pixels.

Parallel Code - An Approximate Sorting Algorithm

First, a compare-and-exchange operation is performed on each of the rows, requiring three steps. For the ith row, we have

p_{i,j-1} <-> p_{i,j}    p_{i,j} <-> p_{i,j+1}    p_{i,j-1} <-> p_{i,j}

where <-> means compare and exchange if the left gray level is greater than the right gray level. Then the columns are sorted using three compare-and-exchange operations:

p_{i-1,j} <-> p_{i,j}    p_{i,j} <-> p_{i+1,j}    p_{i-1,j} <-> p_{i,j}

The value in p_{i,j} is taken to be the fifth largest pixel value. This does not always select the fifth largest value, but it is a reasonable approximation, and it requires only six steps.
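A sketch of the six-step approximate median just described (illustrative C; g[] holds the 3 x 3 group in row-major order):

static void cmpxchg(unsigned char *a, unsigned char *b) {
    if (*a > *b) { unsigned char t = *a; *a = *b; *b = t; }
}

unsigned char approx_median9(unsigned char g[9]) {
    for (int r = 0; r < 3; r++) {   /* rows: three compare-and-exchanges */
        cmpxchg(&g[3*r],     &g[3*r + 1]);
        cmpxchg(&g[3*r + 1], &g[3*r + 2]);
        cmpxchg(&g[3*r],     &g[3*r + 1]);
    }
    for (int c = 0; c < 3; c++) {   /* columns: three compare-and-exchanges */
        cmpxchg(&g[c],     &g[c + 3]);
        cmpxchg(&g[c + 3], &g[c + 6]);
        cmpxchg(&g[c],     &g[c + 3]);
    }
    return g[4];                    /* center value: approximate median */
}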

(Figure 11.6 Approximate median algorithm requiring six steps: largest in row, next largest in row, next largest in column.)

Weighted Masks

The mean method can be described by a weighted 3 x 3 mask. Suppose the weights are w_0, w_1, w_2, w_3, w_4, w_5, w_6, w_7, and w_8, and the pixel values are x_0, x_1, x_2, x_3, x_4, x_5, x_6, x_7, and x_8. The new center pixel value, x_4', is given by

x_4' = (w_0 x_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + w_5 x_5 + w_6 x_6 + w_7 x_7 + w_8 x_8)/k

The scale factor, 1/k, is set to maintain the correct grayscale balance in the image after the operation. Often, k is given by w_0 + w_1 + w_2 + ... + w_8. The summation of products, w_i x_i, of the two functions f (the image) and w (the mask) is the (discrete) cross-correlation of f with w (written as f ⊗ w).

(Figure 11.7 Using a 3 x 3 weighted mask: the mask w applied to the pixels x produces the result x_4'.)

(Figure 11.8 Mask to compute the mean: all nine weights are 1, with k = 9.)
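A generic sketch of applying a 3 x 3 weighted mask with scale factor 1/k, as in Figure 11.7 (illustrative C, not the book's code):

static int clamp(int v) { return v < 0 ? 0 : (v > 255 ? 255 : v); }

void apply_mask(const unsigned char *in, unsigned char *out,
                int w, int h, const int wgt[9], int k) {
    for (int i = 1; i < h - 1; i++)
        for (int j = 1; j < w - 1; j++) {
            int sum = 0, m = 0;            /* m indexes w_0 .. w_8 */
            for (int di = -1; di <= 1; di++)
                for (int dj = -1; dj <= 1; dj++)
                    sum += wgt[m++] * in[(i + di) * w + (j + dj)];
            out[i * w + j] = (unsigned char)clamp(sum / k);
        }
}

/* Usage: the mean mask of Figure 11.8 is all ones with k = 9:
   int mean_mask[9] = {1,1,1, 1,1,1, 1,1,1};
   apply_mask(in, out, width, height, mean_mask, 9);               */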

k = 6 8 Figure.9 A noise reduction mask. k = 9 8 Figure.0 High-pass sharpening filter mask. The computation done with this mask is x 4 ' = 8x 4 x 0 x x 2 x 3 x 5 x 6 x 7 x 8 -------------------------------------------------------------------------------------------------- 9 Page 280

Edge Detection

Highlighting the edges of objects, where an edge is a significant change in gray level intensity.

Gradient and Magnitude

Let us first consider a one-dimensional gray level function, f(x) (say along a row). If the function f(x) were differentiated, the first derivative, ∂f/∂x, measures the gradient of the transition and would be a positive-going or negative-going spike at a transition. The direction of the transition (either increasing or decreasing) can be identified by the polarity of the first derivative.

(Figure 11.11 Edge detection using differentiation: an intensity transition with its first and second derivatives.)

Image Function

A two-dimensional discretized gray level function, f(x, y).

Gradient (magnitude):

∇f = sqrt( (∂f/∂x)^2 + (∂f/∂y)^2 )

Gradient direction:

φ(x, y) = tan^-1( (∂f/∂y)/(∂f/∂x) )

where φ is the angle with respect to the y-axis. The gradient can be approximated to

∇f ≈ |∂f/∂y| + |∂f/∂x|

for reduced computational effort.

(Figure 11.12 Gray level gradient and direction: a contour of constant intensity in the image f(x, y), with the gradient at angle φ.)

Edge Detection Masks

For discrete functions, the derivatives are approximated by differences. The term ∂f/∂x is the difference in the x-direction and ∂f/∂y is the difference in the y-direction. We might consider computing the approximate gradient using pixel values x_5 and x_3 (to get ∂f/∂x) and x_7 and x_1 (to get ∂f/∂y); i.e.,

∂f/∂x ≈ x_5 - x_3
∂f/∂y ≈ x_7 - x_1

so that

∇f ≈ |x_7 - x_1| + |x_5 - x_3|

Two masks are needed, one to obtain x_7 - x_1 and one to obtain x_5 - x_3. The absolute values of the results of each mask are added together.

Prewitt Operator

The approximate gradient is obtained from

∂f/∂y ≈ (x_6 - x_0) + (x_7 - x_1) + (x_8 - x_2)
∂f/∂x ≈ (x_2 - x_0) + (x_5 - x_3) + (x_8 - x_6)

Then

∇f ≈ |x_6 - x_0 + x_7 - x_1 + x_8 - x_2| + |x_2 - x_0 + x_5 - x_3 + x_8 - x_6|

which requires using the two 3 x 3 masks of Figure 11.13.

(Figure 11.13 Prewitt operator masks.)

Sobel Operator

The derivatives are approximated to

∂f/∂y ≈ (x_6 + 2x_7 + x_8) - (x_0 + 2x_1 + x_2)
∂f/∂x ≈ (x_2 + 2x_5 + x_8) - (x_0 + 2x_3 + x_6)

Operators implementing first derivatives will tend to enhance noise. However, the Sobel operator also has a smoothing action.

(Figure 11.14 Sobel operator masks.)

(Figure 11.15 Edge detection with Sobel operator: (a) original image (Annabel); (b) effect of Sobel operator.)
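A sketch in C of the Sobel computation just described, using the |∂f/∂x| + |∂f/∂y| magnitude approximation from earlier and marking pixels above an arbitrary threshold (illustrative code, not the book's):

void sobel_edges(const unsigned char *in, unsigned char *out,
                 int w, int h, int thresh) {
    for (int i = 1; i < h - 1; i++)
        for (int j = 1; j < w - 1; j++) {
            const unsigned char *p = &in[i * w + j];
            /* x_0..x_8 laid out as in Figure 11.3 around p */
            int dy = (p[w - 1] + 2 * p[w] + p[w + 1])       /* x6+2x7+x8 */
                   - (p[-w - 1] + 2 * p[-w] + p[-w + 1]);   /* x0+2x1+x2 */
            int dx = (p[-w + 1] + 2 * p[1] + p[w + 1])      /* x2+2x5+x8 */
                   - (p[-w - 1] + 2 * p[-1] + p[w - 1]);    /* x0+2x3+x6 */
            int mag = (dx < 0 ? -dx : dx) + (dy < 0 ? -dy : dy);
            out[i * w + j] = (mag > thresh) ? 255 : 0;
        }
}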

Laplace Operator

The Laplace second-order derivative is defined as

∇^2 f = ∂^2 f/∂x^2 + ∂^2 f/∂y^2

which can be approximated to

∇^2 f = 4x_4 - (x_1 + x_3 + x_5 + x_7)

and obtained with a single mask.

(Figure 11.16 Laplace operator mask.)

(Figure 11.17 Pixels used in the Laplace operator: x_1 (upper), x_3 (left), x_5 (right), and x_7 (lower) around the center pixel x_4.)
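The Laplace approximation above is a one-liner in C (a sketch; the pixel layout follows Figure 11.17):

/* 4*x4 - (x1 + x3 + x5 + x7), computed at pixel (i, j) */
int laplace(const unsigned char *in, int w, int i, int j) {
    const unsigned char *p = &in[i * w + j];
    return 4 * p[0] - (p[-w] + p[-1] + p[1] + p[w]);
}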

Figure.8 Effect of Laplace operator. The Hough Transform Purpose is to find the parameters of equations of lines that most likely fit sets of pixels in an image. A line is described by the equation y = ax + b where the parameters, a and b, uniquely describe the particular line, a the slope and b the intercept on the y-axis. A search for those lines with the most pixels mapped onto them would be computationally prohibitively expensive [an Ο(n 3 ) algorithm]. Suppose the equation of the line is rearranged as: b = xa + y Every point that lies on a specific line in the x-y space will map into same point in the a- b space (parameter space). Therefore, we can find the number of points on each possible line in the x-y space by simply counting when a mapping maps into a point in the a-b space. Page 287

(Figure 11.19 Mapping a line into (a, b) space: a pixel (x_1, y_1) on the line y = ax + b in the (x, y) plane maps into the line b = -x_1 a + y_1 in the parameter space, and the line itself maps into the single point (a, b).)

Finding the Most Likely Lines

A single point (x_1, y_1) can be mapped into the points of a line in the a-b space:

b = -x_1 a + y_1

In the mapping process, discrete values will be used to a coarse prescribed precision, and the computation is rounded to the nearest possible a-b coordinates. The mapping process is done for every point in the x-y space. A record is kept of those a-b points that have been obtained by incrementing the corresponding accumulator. Each accumulator will have the number of pixels that map into a single point in the parameter space. The points in the parameter space with locally maximum numbers of pixels are chosen as lines.

The method will fail for vertical lines and for lines that approach this extreme. To avoid the problem, the line equation is rearranged to polar coordinates:

r = x cos θ + y sin θ

where r is the perpendicular distance to the origin in the original (x, y) coordinate system and θ is the angle between r and the x-axis.

(Figure 11.20 Mapping a line into (r, θ) space: the line y = ax + b in the (x, y) plane maps into the single point (r, θ).)

The θ value will be in degrees and, very conveniently, will also be the gradient angle of the line (with respect to the x-axis). Each line will map to a single point in the (r, θ) space. A vertical line will simply map into a point where θ = 0 if it has a positive intercept with the x-axis, or θ = 180 if it has a negative intercept with the x-axis. A horizontal line has θ = 90. Each point in the x-y space will map into a curve in the (r, θ) space; i.e., (x_1, y_1) maps into

r = x_1 cos θ + y_1 sin θ

Implementation

Assume the origin is at the top left corner.

(Figure 11.21 Normal representation using the image coordinate system: x across, y down, with the line at perpendicular distance r and angle θ.)

The parameter space is divided into small rectangular regions, with one accumulator provided for each region. The accumulators of those regions that a pixel maps into are incremented, and this process must be done for all the pixels in the image.

If all values of θ were tried (i.e., incrementing θ through all its values), the computational effort would be given by the number of discrete values of θ, say k intervals. With n pixels the complexity is O(kn). The computational effort can be reduced significantly by limiting the range of lines for individual pixels using some criterion. A single value of θ could be selected based upon the gradient of the line.

(Figure 11.22 Accumulators, acc[r][θ], for the Hough transform.)

Sequential Code

for (x = 0; x < xmax; x++)            /* for each pixel */
  for (y = 0; y < ymax; y++) {
    sobel(&x, &y, dx, dy);            /* find x and y gradients */
    magnitude = grad_mag(dx, dy);     /* find magnitude if needed */
    if (magnitude > threshold) {
      theta = grad_dir(dx, dy);       /* atan2() fn */
      theta = theta_quantize(theta);
      r = x * cos(theta) + y * sin(theta);
      r = r_quantize(r);
      acc[r][theta]++;                /* increment accumulator */
      append(r, theta, x, y);         /* append point to line */
    }
  }

Finally, when all pixels have been considered, the values held in each accumulator will give the number of pixels that can map onto the corresponding line. The most likely lines are those with the most pixels, and the accumulators with the local maxima are chosen. An algorithm for choosing the local maxima must be devised.

Parallel Code

Since the computation for each accumulator is independent of the other accumulations, it could be performed simultaneously, although each requires read access to the whole image. This is left as an exercise.

Transformation into the Frequency Domain

Fourier Transform

There are many applications for the Fourier transform in science and engineering. In the area of image processing, the Fourier transform is used in image enhancement, restoration, and compression. The image is a two-dimensional discretized function, f(x, y), but let us first start with the one-dimensional case. For completeness, let us first review the results of the Fourier series and Fourier transform concepts from first principles.

Fourier Series

The Fourier series is a summation of sine and cosine terms and can be written as

x(t) = a_0/2 + Σ_{j=1}^{∞} [ a_j cos(2πjt/T) + b_j sin(2πjt/T) ]

where T is the period (1/T = f, where f is a frequency). By some mathematical manipulation, we can get a more convenient description of the series:

x(t) = Σ_{j=-∞}^{∞} X_j e^{2πijt/T}

where X_j is the jth Fourier coefficient in complex form and i = sqrt(-1). The Fourier coefficients can also be computed from specific integrals.

Fourier Transform

Continuous Functions

The previous summation is developed into an integral:

x(t) = ∫_{-∞}^{∞} X(f) e^{2πift} df

where X(f) is a continuous function of frequency. The function X(f) can be obtained from

X(f) = ∫_{-∞}^{∞} x(t) e^{-2πift} dt

X(f) is the spectrum of x(t), or more simply the Fourier transform of x(t). The original function, x(t), can be obtained from X(f) using the first integral given, which is the inverse Fourier transform.

Discrete Functions

For functions having a set of N discrete values, the integral is replaced with a summation, leading to the discrete Fourier transform (DFT):

X_k = (1/N) Σ_{j=0}^{N-1} x_j e^{-2πijk/N}

and the inverse discrete Fourier transform:

x_k = Σ_{j=0}^{N-1} X_j e^{2πijk/N}

for 0 ≤ k ≤ N-1. The N (real) input values, x_0, x_1, x_2, ..., x_{N-1}, produce N (complex) transform values, X_0, X_1, X_2, ..., X_{N-1}.

Fourier Transforms in Image Processing

A two-dimensional Fourier transform is

X_{lm} = Σ_{j=0}^{N-1} Σ_{k=0}^{M-1} x_{jk} e^{-2πi(jl/N + km/M)}

where j and k are row and column coordinates, 0 ≤ j ≤ N-1 and 0 ≤ k ≤ M-1. Assume the image is square, where N = M. The equation can be rearranged into

X_{lm} = Σ_{j=0}^{N-1} [ Σ_{k=0}^{N-1} x_{jk} e^{-2πikm/N} ] e^{-2πijl/N}

The inner summation is a one-dimensional DFT operating on the N points of a row to produce a transformed row. The outer summation is a one-dimensional DFT operating on the N points of a column. We might write

X_{lm} = Σ_{j=0}^{N-1} X_{jm} e^{-2πijl/N}

Hence, the two-dimensional DFT can be divided into two sequential phases, one operating on rows of elements and one operating on columns of (transformed) elements:

(Figure 11.23 Two-dimensional DFT: transform the rows of x_{jk} to give X_{jm}, then transform the columns to give X_{lm}.)

Applications

Frequency filtering can be described by the convolution operation:

h(j, k) = g(j, k) ⊛ f(j, k)

the same operation as the cross-correlation operation for symmetrical masks, where g(j, k) describes the weighted mask (filter) and f(j, k) describes the image.

It can be shown that the Fourier transform of the convolution of two functions is given by the product of the transforms of the individual functions. Hence, the convolution of two functions can be obtained by taking the Fourier transforms of each function, multiplying the transforms

H(j, k) = G(j, k) x F(j, k)

(element-by-element multiplication), where F(j, k) is the Fourier transform of f(j, k) and G(j, k) is the Fourier transform of g(j, k), and then taking the inverse transform to return the result to the original spatial domain.
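The two-phase row-column structure of Figure 11.23 can be sketched directly: a naive 1D DFT is applied to every row and then to every column of the (transformed) result. This is an illustration of the decomposition only, assuming C99 complex arithmetic; the scale factor is omitted, as conventions vary:

#include <complex.h>
#include <math.h>
#include <stdlib.h>

#define PI 3.14159265358979323846

/* naive 1D DFT of n points taken with the given stride */
static void dft1d(const double complex *in, double complex *out,
                  int n, int stride) {
    for (int k = 0; k < n; k++) {
        double complex sum = 0;
        for (int j = 0; j < n; j++)
            sum += in[j * stride] * cexp(-2.0 * PI * I * j * k / n);
        out[k * stride] = sum;
    }
}

/* 2D DFT of an n x n image: transform rows, then columns */
void dft2d(const double complex *x, double complex *X, int n) {
    double complex *t = malloc((size_t)n * n * sizeof *t);
    for (int i = 0; i < n; i++)        /* phase 1: rows */
        dft1d(&x[i * n], &t[i * n], n, 1);
    for (int j = 0; j < n; j++)        /* phase 2: columns */
        dft1d(&t[j], &X[j], n, n);
    free(t);
}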

(Figure 11.24 Convolution using Fourier transforms: (a) direct convolution of the image f(j, k) with the filter g(j, k); (b) transform both, multiply F(j, k) by G(j, k), and inverse-transform to obtain h(j, k).)

Parallelizing the Discrete Fourier Transform

Starting from

X_k = (1/N) Σ_{j=0}^{N-1} x_j e^{-2πijk/N}

and using the notation w = e^{-2πi/N},

X_k = (1/N) Σ_{j=0}^{N-1} x_j w^{jk}

The w terms are called the twiddle factors. Each input value has to be multiplied by a twiddle factor. The inverse transform can be obtained by replacing w with w^{-1}.

Sequential Code

for (k = 0; k < N; k++) {         /* for every point */
  X[k] = 0;
  for (j = 0; j < N; j++)         /* compute summation */
    X[k] = X[k] + w^(j*k) * x[j];
}

where X[k] is the kth transformed point, x[k] is the kth input, and w = e^{-2πi/N}. The summation step requires complex number arithmetic. The code can be rewritten as

for (k = 0; k < N; k++) {
  X[k] = 0;
  a = 1;
  for (j = 0; j < N; j++) {
    X[k] = X[k] + a * x[j];
    a = a * w^k;
  }
}

where a is a temporary variable.

Elementary Master-Slave Implementation

One slave process of N slave processes could be assigned to produce one of the transformed values; i.e., the kth slave process produces X[k]. The parallel time complexity with N (slave) processes is O(N).

(Figure 11.25 Master-slave approach for implementing the DFT directly: the master sends w^0, w^1, ..., w^{N-1} to the slave processes computing X[0], X[1], ..., X[N-1].)
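The rewritten loop above translates almost directly into C99 complex arithmetic (a sketch; the names are illustrative and, as in the pseudocode, the 1/N scale factor is omitted):

#include <complex.h>
#include <math.h>

#define PI 3.14159265358979323846

void dft(const double complex *x, double complex *X, int N) {
    for (int k = 0; k < N; k++) {
        double complex wk = cexp(-2.0 * PI * I * k / N);  /* w^k */
        double complex a = 1.0;        /* a runs through w^(j*k) */
        X[k] = 0;
        for (int j = 0; j < N; j++) {
            X[k] += a * x[j];          /* X[k] = X[k] + a * x[j] */
            a *= wk;                   /* a = a * w^k */
        }
    }
}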

Pipeline Implementation

Unfolding the inner loop for X[k], we have

X[k] = 0;
a = 1;
X[k] = X[k] + a * x[0]; a = a * w^k;
X[k] = X[k] + a * x[1]; a = a * w^k;
X[k] = X[k] + a * x[2]; a = a * w^k;
X[k] = X[k] + a * x[3]; a = a * w^k;
...

Each pair of statements

X[k] = X[k] + a * x[0]; a = a * w^k;

could be performed by a separate pipeline stage.

(Figure 11.26 One stage of a pipeline implementation of the DFT algorithm: stage j takes X[k], a, and w^k, adds a * x[j] into X[k], multiplies a by w^k, and passes the values on for the next iteration.)

(Figure 11.27 Discrete Fourier transform with a pipeline: (a) pipeline structure, with X[k] = 0 and a = w^k fed in and x[0], x[1], x[2], x[3], ..., x[N-1] consumed by stages P_0 to P_{N-1}, producing the output sequence X[0], X[1], X[2], X[3], ...; (b) timing diagram of the stages.)

DFT as a Matrix-Vector Product

The kth element of the discrete Fourier transform is given by

X_k = x_0 w^0 + x_1 w^k + x_2 w^{2k} + x_3 w^{3k} + ... + x_{N-1} w^{(N-1)k}

and the whole transform can be described by a matrix-vector product:

[ X_0     ]          [ 1  1        1           1           ...  1               ] [ x_0     ]
[ X_1     ]          [ 1  w        w^2         w^3         ...  w^{N-1}         ] [ x_1     ]
[ X_2     ]          [ 1  w^2      w^4         w^6         ...  w^{2(N-1)}      ] [ x_2     ]
[ X_3     ] = (1/N)  [ 1  w^3      w^6         w^9         ...  w^{3(N-1)}      ] [ x_3     ]
[ ...     ]          [ ...                                                      ] [ ...     ]
[ X_k     ]          [ 1  w^k      w^{2k}      w^{3k}      ...  w^{(N-1)k}      ] [ x_k     ]
[ ...     ]          [ ...                                                      ] [ ...     ]
[ X_{N-1} ]          [ 1  w^{N-1}  w^{2(N-1)}  w^{3(N-1)}  ...  w^{(N-1)(N-1)}  ] [ x_{N-1} ]

(Note that w^0 = 1.) Hence, the parallel methods of producing a matrix-vector product as described in Chapter 10 can be used for the discrete Fourier transform.

Fast Fourier Transform

A method of obtaining the discrete Fourier transform with a time complexity of O(N log N) instead of O(N^2). Let us start with the discrete Fourier transform equation

X_k = (1/N) Σ_{j=0}^{N-1} x_j w^{jk}

where w = e^{-2πi/N}.

There are many formulations. This version starts by dividing the summation into two parts, where the first summation processes the x's with even indices and the second processes the x's with odd indices:

X_k = (1/N) [ Σ_{j=0}^{N/2-1} x_{2j} w^{2jk} + Σ_{j=0}^{N/2-1} x_{2j+1} w^{(2j+1)k} ]

Rearranging, we get

X_k = (1/2) [ (1/(N/2)) Σ_{j=0}^{N/2-1} x_{2j} w^{2jk} + w^k (1/(N/2)) Σ_{j=0}^{N/2-1} x_{2j+1} w^{2jk} ]

or

X_k = (1/2) [ (1/(N/2)) Σ_{j=0}^{N/2-1} x_{2j} e^{-2πijk/(N/2)} + w^k (1/(N/2)) Σ_{j=0}^{N/2-1} x_{2j+1} e^{-2πijk/(N/2)} ]

Each summation can now be recognized as an N/2-point discrete Fourier transform operating on the N/2 even points and the N/2 odd points, respectively:

X_k = (1/2) [ X_even + w^k X_odd ]

for k = 0, 1, ..., N-1, where X_even is the N/2-point DFT of the numbers with even indices, x_0, x_2, x_4, ..., and X_odd is the N/2-point DFT of the numbers with odd indices, x_1, x_3, x_5, ....

Now, suppose k is limited to 0, 1, ..., N/2-1, the first N/2 values of the total N values. The complete sequence can be divided into two parts:

X_k = (1/2) [ X_even + w^k X_odd ]

and

X_{k+N/2} = (1/2) [ X_even + w^{k+N/2} X_odd ] = (1/2) [ X_even - w^k X_odd ]

since w^{k+N/2} = -w^k, where 0 ≤ k < N/2. Hence, we could compute X_k and X_{k+N/2} using two N/2-point transforms:

(Figure 11.28 Decomposition of an N-point DFT into two N/2-point DFTs: the even-indexed inputs x_0, x_2, ... feed an N/2-point DFT producing X_even, the odd-indexed inputs feed another producing X_odd, and for k = 0, 1, ..., N/2-1 these combine, via the twiddle factor w^k, to give X_k and X_{k+N/2}.)

Each of the N/2-point DFTs can be decomposed into two N/4-point DFTs, and the decomposition can be continued until single points are to be transformed. A one-point DFT is simply the value of the point.

The computation is often depicted in butterfly form:

(Figure 11.29 Four-point discrete Fourier transform.)

For a sixteen-point DFT, the decomposition proceeds as

X_k = Σ(0,2,4,6,8,10,12,14) + w^k Σ(1,3,5,7,9,11,13,15)
    = {Σ(0,4,8,12) + w^k Σ(2,6,10,14)} + w^k {Σ(1,5,9,13) + w^k Σ(3,7,11,15)}
    = {[Σ(0,8) + w^k Σ(4,12)] + w^k [Σ(2,10) + w^k Σ(6,14)]} + w^k {[Σ(1,9) + w^k Σ(5,13)] + w^k [Σ(3,11) + w^k Σ(7,15)]}

(Figure 11.30 Sixteen-point DFT decomposition: the inputs end up in bit-reversed order, x_0, x_8, x_4, x_12, x_2, x_10, x_6, x_14, x_1, x_9, x_5, x_13, x_3, x_11, x_7, x_15, i.e., binary indices 0000, 1000, 0100, 1100, 0010, 1010, 0110, 1110, 0001, 1001, 0101, 1101, 0011, 1011, 0111, 1111.)

Sequential Code

The sequential time complexity is essentially O(N log N), since there are log N steps and each step requires a computation proportional to N, where there are N numbers. The algorithm can be implemented recursively or iteratively, as in the sketch below.
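A recursive sketch of the decomposition above (assuming C99 complex arithmetic and N a power of two; the 1/2 factors at each level combine to give the 1/N factor of the definition; names are illustrative):

#include <complex.h>
#include <math.h>

#define PI 3.14159265358979323846

/* Recursive FFT: stride selects every second element at each level,
   so the even/odd sub-sequences are never copied out of x. */
void fft(const double complex *x, double complex *X, int N, int stride) {
    if (N == 1) { X[0] = x[0]; return; }   /* a 1-point DFT is the point */
    double complex even[N / 2], odd[N / 2];       /* C99 VLAs: a sketch */
    fft(x, even, N / 2, 2 * stride);              /* even-indexed points */
    fft(x + stride, odd, N / 2, 2 * stride);      /* odd-indexed points */
    for (int k = 0; k < N / 2; k++) {
        double complex t = cexp(-2.0 * PI * I * k / N) * odd[k];  /* w^k X_odd */
        X[k]         = 0.5 * (even[k] + t);   /* X_k       */
        X[k + N / 2] = 0.5 * (even[k] - t);   /* X_{k+N/2} */
    }
}

/* Usage: fft(x, X, N, 1) for an N-point input array x. */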

Parallelizing the FFT Algorithm

Binary Exchange Algorithm

(Figure 11.31 Sixteen-point FFT computational flow: inputs x_0, ..., x_15 pass through log N stages to outputs X_0, ..., X_15.)

(Figure 11.32 Mapping processors onto the 16-point FFT computation, with N/p rows per processor: P_0 handles inputs 0000-0011 (x_0 to x_3) producing X_0 to X_3, P_1 handles 0100-0111 (x_4 to x_7), P_2 handles 1000-1011 (x_8 to x_11), and P_3 handles 1100-1111 (x_12 to x_15).)

Analysis

Computation. Given p processors and N data points, at each step each processor will compute N/p points, and each point requires one multiplication and one addition. With log N steps, the parallel time complexity is given by

t_comp = O((N/p) log N)

Communication. If p = N, communication occurs at each step, with one data exchange between pairs of processors at each of the log N steps. Suppose a hypercube or another interconnection network that allows simultaneous exchanges is used. In that case, the communication time complexity is given by

t_comm = O(log N)

If p < N, the communication time complexity is simply given by

t_comm = O(log p)

Transpose Algorithm

If the processors are organized as a two-dimensional array, communications would first take place between processors in each column, and then between processors in each row:

(Figure 11.33 FFT using transpose algorithm - first two steps: processors P_0 to P_3 each hold one column of the 4 x 4 arrangement of x_0 to x_15.)

During the first two steps, all communication would be within a processor, as shown. During the last two steps, the communication would be between processors. In the transpose algorithm, between the first two steps and the last two steps, the array elements are transposed:

(Figure 11.34 Transposing the array for the transpose algorithm.)

After the transpose, the last two steps proceed, but they now involve communication only within the processors. The only communication between processors is to transpose the array.

(Figure 11.35 FFT using transpose algorithm - last two steps: after the transpose, P_0 holds x_0, x_4, x_8, x_12; P_1 holds x_1, x_5, x_9, x_13; P_2 holds x_2, x_6, x_10, x_14; and P_3 holds x_3, x_7, x_11, x_15.)