Image Processing, Analysis and Machine Vision

Milan Sonka PhD
University of Iowa
Iowa City, USA

Vaclav Hlavac PhD
Czech Technical University
Prague, Czech Republic

and

Roger Boyle DPhil, MBCS, CEng
University of Leeds
Leeds, UK

CHAPMAN & HALL COMPUTING
London Glasgow Weinheim New York Tokyo Melbourne Madras
Contents

Colour plates appear between pages 268 and 269

List of Algorithms
List of symbols and abbreviations
Preface

1 Introduction
2 The digitized image and its properties
  2.1 Basic concepts
    2.1.1 Image functions
    2.1.2 The Dirac distribution and convolution
    2.1.3 The Fourier transform
    2.1.4 Images as a stochastic process
    2.1.5 Images as linear systems
  2.2 Image digitization
    2.2.1 Sampling
    2.2.2 Quantization
    2.2.3 Colour images
  2.3 Digital image properties
    2.3.1 Metric and topological properties of digital images
    2.3.2 Histograms
    2.3.3 Visual perception of the image
    2.3.4 Image quality
    2.3.5 Noise in images
3 Data structures for image analysis
  3.1 Levels of image data representation
  3.2 Traditional image data structures
    3.2.1 Matrices
    3.2.2 Chains
    3.2.3 Topological data structures
    3.2.4 Relational structures
  3.3 Hierarchical data structures
    3.3.1 Pyramids
    3.3.2 Quadtrees
4 Image pre-processing
  4.1 Pixel brightness transformations
    4.1.1 Position-dependent brightness correction
    4.1.2 Grey scale transformation
  4.2 Geometric transformations
    4.2.1 Pixel co-ordinate transformations
    4.2.2 Brightness interpolation
  4.3 Local pre-processing
    4.3.1 Image smoothing
    4.3.2 Edge detectors
    4.3.3 Zero crossings of the second derivative
    4.3.4 Scale in image processing
    4.3.5 Canny edge detection
    4.3.6 Edges in multispectral images
    4.3.7 Other local pre-processing operators
    4.3.8 Adaptive neighbourhood pre-processing
  4.4 Image restoration
    4.4.1 Image restoration as inverse convolution of the whole image
    4.4.2 Degradations that are easy to restore
    4.4.3 Inverse filtration
    4.4.4 Wiener filtration
5 Segmentation
  5.1 Thresholding
    5.1.1 Threshold detection methods
    5.1.2 Multispectral thresholding
    5.1.3 Thresholding in hierarchical data structures
  5.2 Edge-based segmentation
    5.2.1 Edge image thresholding
    5.2.2 Edge relaxation
    5.2.3 Border tracing
    5.2.4 Edge following as graph searching
    5.2.5 Edge following as dynamic programming
    5.2.6 Hough transforms
    5.2.7 Border detection using border location information
    5.2.8 Region construction from borders
  5.3 Region growing segmentation
    5.3.1 Region merging
    5.3.2 Region splitting
    5.3.3 Splitting and merging
  5.4 Matching
    5.4.1 Matching criteria
    5.4.2 Control strategies of matching
6 Shape representation and description
  6.1 Region identification
  6.2 Contour-based shape representation and description
    6.2.1 Chain codes
    6.2.2 Simple geometric border representation
    6.2.3 Fourier transforms of boundaries
    6.2.4 Boundary description using a segment sequence; polygonal representation
    6.2.5 B-spline representation
    6.2.6 Other contour-based shape description approaches
    6.2.7 Shape invariants
  6.3 Region-based shape representation and description
    6.3.1 Simple scalar region descriptors
    6.3.2 Moments
    6.3.3 Convex hull
    6.3.4 Graph representation based on region skeleton
    6.3.5 Region decomposition
    6.3.6 Region neighbourhood graphs
7 Object recognition
  7.1 Knowledge representation
  7.2 Statistical pattern recognition
    7.2.1 Classification principles
    7.2.2 Classifier setting
    7.2.3 Classifier learning
    7.2.4 Cluster analysis
  7.3 Neural nets
    7.3.1 Feed-forward nets
    7.3.2 Kohonen feature maps
    7.3.3 Hybrid neural nets
    7.3.4 Hopfield neural nets
  7.4 Syntactic pattern recognition
    7.4.1 Grammars and languages
    7.4.2 Syntactic analysis, syntactic classifier
    7.4.3 Syntactic classifier learning, grammar inference
  7.5 Recognition as graph matching
    7.5.1 Isomorphism of graphs and subgraphs
    7.5.2 Similarity of graphs
  7.6 Optimization techniques in recognition
    7.6.1 Genetic algorithms
    7.6.2 Simulated annealing
8 Image understanding
  8.1 Image understanding control strategies
    8.1.1 Parallel and serial processing control
    8.1.2 Hierarchical control
    8.1.3 Bottom-up control strategies
    8.1.4 Model-based control strategies
    8.1.5 Combined control strategies
    8.1.6 Non-hierarchical control
  8.2 Active contour models - snakes
  8.3 Pattern recognition methods in image understanding
    8.3.1 Contextual image classification
  8.4 Scene labelling and constraint propagation
    8.4.1 Discrete relaxation
    8.4.2 Probabilistic relaxation
    8.4.3 Searching interpretation trees
  8.5 Semantic image segmentation and understanding
    8.5.1 Semantic region growing
    8.5.2 Semantic genetic segmentation and interpretation
9 3D Vision
  9.1 Strategy
    9.1.1 Marr's theory
    9.1.2 Modelling strategies
  9.2 Line labelling
  9.3 Shape from X
    9.3.1 Shape from stereo
    9.3.2 Shape from shading
    9.3.3 Shape from motion
    9.3.4 Shape from texture
  9.4 Approaches to the recognition of 3D objects
    9.4.1 Goad's algorithm
    9.4.2 Features for model-based recognition of curved objects
  9.5 Depth map technologies
  9.6 Summary
10 Mathematical morphology
  10.1 Basic principles and morphological transformations
    10.1.1 Morphological transformations
    10.1.2 Dilation
    10.1.3 Erosion
    10.1.4 Opening and closing
  10.2 Skeleton and other topological processing
    10.2.1 Homotopic transformations
    10.2.2 Skeleton
    10.2.3 Thinning and thickening
    10.2.4 Conditional dilation and ultimate erosion
11 Linear discrete image transforms
  11.1 Basic theory
  11.2 The Fourier transform
  11.3 Hadamard transform
  11.4 Discrete cosine transform
  11.5 Other discrete image transforms
  11.6 Applications of discrete image transforms
12 Image data compression
  12.1 Image data properties
  12.2 Discrete image transforms in image data compression
  12.3 Predictive compression methods
  12.4 Vector quantization
  12.5 Pyramid compression methods
  12.6 Comparison of compression methods
  12.7 Other techniques
13 Texture
  13.1 Statistical texture description
    13.1.1 Methods based on spatial frequencies
    13.1.2 Co-occurrence matrices
    13.1.3 Edge frequency
    13.1.4 Primitive length (run length)
    13.1.5 Other statistical methods of texture description
  13.2 Syntactic texture description methods
    13.2.1 Shape chain grammars
    13.2.2 Graph grammars
    13.2.3 Primitive grouping in hierarchical textures
  13.3 Hybrid texture description methods
  13.4 Texture recognition method applications
14 Motion analysis
  14.1 Differential motion analysis methods
  14.2 Optical flow
    14.2.1 Optical flow computation
    14.2.2 Global and local optical flow estimation
    14.2.3 Optical flow in motion analysis
  14.3 Motion analysis based on detection of interest points
    14.3.1 Detection of interest points
    14.3.2 Correspondence of interest points
Index