Basic Pattern Recognition. Pattern Recognition Main Components. Introduction to PR. PR Example. Introduction to Pattern Recognition.


Basic Pattern Recognition
Xiaojun Qi

Introduction to Pattern Recognition
Pattern Recognition (PR): classify what is inside the image.
Applications: speech recognition and speaker identification; fingerprint and face identification; signature verification; character recognition; biomedical (DNA sequence identification); remote sensing; meteorology; industrial inspection; robot vision.

Introduction to PR
Pattern recognition deals with the classification, description, and analysis of measurements taken from physical or mental processes.
A pattern recognition system must: take in raw data; determine the category of the pattern; take an action based on the category of the pattern.

Pattern Recognition Main Components
Sensing: design of transducers.
Segmentation and grouping (into a composite object).
Feature extraction: a set of characteristic measurements (numerical or non-numerical) and their relations are extracted to represent patterns for further processing.
Classification: processes or events with the same or similar properties are grouped into a class. The number of classes is task-dependent.
Post-processing: considering the effects of context and the cost of errors.

How many classes are needed in an English character recognition system? How many classes are needed to distinguish Chinese from English?

PR Example
Fish-packing plant: species determination. Separate sea bass from salmon using optical sensing.
Image features: length, lightness, width, number and shape of fins, etc.
Establish models for the objects to be classified: descriptions in mathematical form, considering noise or variations in the population itself and in sensing.

Main procedures
Preprocessing: segmentation of the fish from each other and from the background.
Feature extraction.
Classification and decision.

Tentative model: sea bass have a typical length longer than that of salmon.
Design (training) samples: used for feature measurement and model identification.
Cost: the cost of actions (e.g., false positive and false negative). Symmetry in the cost is often assumed, but not invariably.
Decision theory: make a decision rule (i.e., set a decision boundary) to minimize a cost.

First Feature Extraction
Second Feature Extraction

Classification -- Classifier Design
Linear Classifier
Feature space: feature vector $\mathbf{x} = [x_1, x_2]^T$.
Scatter plot for training samples.
Classifier: design of a decision boundary on the scatter plot that partitions the feature space into several regions.
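The decision-theory step above (set a decision boundary that minimizes a cost) can be sketched for a single feature. This is a minimal illustration, not the method from the slides: the function name `best_threshold`, the sample lengths, and the cost values are all hypothetical. It assumes that a fish longer than the threshold is called a sea bass and searches the midpoints between observed lengths for the cheapest threshold.

```python
def best_threshold(salmon_lengths, bass_lengths,
                   cost_salmon_as_bass=1.0, cost_bass_as_salmon=1.0):
    """Return (threshold, total_cost); fish with length > threshold are called sea bass."""
    values = sorted(set(salmon_lengths) | set(bass_lengths))
    # Candidate boundaries: one below all samples, the midpoints, one above all samples.
    candidates = [values[0] - 1.0]
    candidates += [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    candidates.append(values[-1] + 1.0)

    best_t, best_cost = None, float("inf")
    for t in candidates:
        salmon_errors = sum(1 for x in salmon_lengths if x > t)   # salmon called sea bass
        bass_errors = sum(1 for x in bass_lengths if x <= t)      # sea bass called salmon
        cost = cost_salmon_as_bass * salmon_errors + cost_bass_as_salmon * bass_errors
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost


# Hypothetical training samples (lengths in arbitrary units).
salmon = [4.0, 5.0, 5.5, 6.0, 7.0]
bass = [6.5, 7.5, 8.0, 9.0, 10.0]

symmetric = best_threshold(salmon, bass)
# Making "salmon labeled as sea bass" five times more costly moves the boundary upward.
asymmetric = best_threshold(salmon, bass, cost_salmon_as_bass=5.0)
```

With symmetric costs the boundary falls where the fewest samples are misclassified; with asymmetric costs it shifts toward the side where errors are cheaper, which is the point of the cost slide.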

The Best Classifier
The Optimal Classifier

PR Feature Extraction
Seek distinguishing features that are invariant to irrelevant transformations.
Distinguishing features: feature values are similar within the same category and very different across categories.
Irrelevant transformations: rotation, scale, and translation (RST invariance, the major concern); occlusion; projective distortion; non-rigid deformations.
Feature selection: choose the features that are most effective.

PR Classification
Assign an object to a category by using the feature vector.
The difficulty of classification depends on the variability of the feature values within the same category relative to the difference between feature values in different categories.
The variability of feature values within the same category may come from noise.

PR Post-Processing
Consider the cost of action: minimize the classification error rate; minimize risk (total expected cost).
Exploit context (input-dependent information) to improve system performance, e.g., use context information for OCR or speech recognition.
Multiple classifiers (different from multiple features): each classifier operates on a different aspect of the input (e.g., speech recognition = acoustic recognition + lip reading).
Decision fusion.

PR Feature Extraction Techniques -- Object Representation

PR Classification Techniques -- Statistics (Distribution) Based PR
Define or find a probability model. Classify on the basis of the a posteriori probability.
Bayes classification; K-means algorithm; ISODATA algorithm; hierarchical clustering.

PR Classification Techniques -- Expert (Model) Based PR
Define or find a discriminant function G(x). Classify on the basis of the value (sign) of G(x).
Linear discriminant analysis: two-class problem; perceptron criterion; Fisher criterion; least-mean-square error.

PR Classification Techniques -- Prototype (Distance) Based PR
Define or find prototypical examples of classes. Classify on the basis of similarity with the prototypes.
Syntactic rule.

Areas Related to PR
Image processing; speech processing; artificial intelligence; associative memory; neural and fuzzy; probability and statistics (statistical).
Regression (find a functional description of data for new input prediction).
Interpolation (infer the function for intermediate ranges of input).
Density estimation (for maximum likelihood and maximum a posteriori classification).
Formal language (syntactic).
Neural network architecture design and training (neural).

Introduction to Computer Vision
Computer Vision (CV): focus on view analysis using techniques from image processing (IP), PR, and artificial intelligence (AI). It is the area of AI concerned with modeling and replicating human vision using computer software and hardware.
Applications: robotics; traffic monitoring; face identification; 3D modeling in medical imaging.

Concepts of Pattern Recognition
Pattern: a pattern is the description of an object.
According to the nature of the patterns to be recognized, we may divide our acts of recognition into two major types: the recognition of concrete items and the recognition of abstract items.

When a person perceives a pattern, he makes an inductive inference and associates this perception with some general concepts or clues which he has derived from his past experience. Thus, the problem of pattern recognition may be regarded as one of discriminating the input data, not between individual patterns but between populations, via the search for features or invariant attributes among members of a population.

The study of pattern recognition problems may be logically divided into two major categories:
The study of the pattern recognition capability of human beings and other living organisms (psychology, physiology, and biology).
The development of theory and techniques for the design of devices capable of performing a given recognition task for a specific application (engineering, computer, and information science).

Pattern recognition can be defined as the categorization of input data into identifiable classes via the extraction of significant features or attributes of the data from a background of irrelevant detail.

Task of Classification | Input Data | Output Response
Character recognition | Optical signals or strokes | Name of character
Speech recognition | Acoustic waveforms | Name of word
Speaker recognition | Voice | Name of speaker
Weather prediction | Weather maps | Weather forecast
Medical diagnosis | Symptoms | Disease
Stock market prediction | Financial news and charts | Predicted market ups and downs

Pattern class: a category determined by some given common attributes.
Pattern: the description of any member of a category representing a pattern class.
When a set of patterns falling into disjoint classes is available, it is desired to categorize these patterns into their respective classes through the use of some automatic device.
The basic functions of a pattern recognition system are to detect and extract common features from the patterns describing the objects that belong to the same pattern class, and to recognize this pattern in any new environment and classify it as a member of one of the pattern classes under consideration.

Object Feature -- Shape
What features can we get from an object?
Perimeter.
Area.
Eccentricity: the ratio of the major to the minor axis.
Curvature: the rate of change of slope. That is, use the difference between the slopes of adjacent boundary segments as a descriptor of curvature at the point of intersection of the segments.
Chain code.

Object shape representations (taxonomy figure)
Structural: syntactic, graph, tree, model-driven, data-driven.
Contour (non-structural): perimeter, compactness, eccentricity, Fourier descriptors, wavelet descriptors, curvature scale space, shape signature, chain code, Hausdorff distance, elastic matching.
Region: area, Euler number, eccentricity, geometric moments, Zernike moments, pseudo-Zernike moments, Legendre moments, grid method.

Object Representation and Recognition
Representing a region involves two choices:
We can represent the region in terms of its external characteristics (its boundary): focus on shape characteristics.
We can represent the region in terms of its internal characteristics (the pixels comprising the region): focus on regional properties such as color and texture.
Then describe the region based on the chosen representation.
A region may be represented by its boundary, and the boundary can be described by features such as: its length; the orientation of the straight line joining its extreme points; the number of concavities in the boundary.
The features selected as descriptors should be as insensitive as possible to variations in size, translation, and rotation.

Shape Signatures
A signature is a 1-D functional representation of a boundary and may be generated in various ways.
Regardless of how a signature is generated, the basic idea is to reduce the boundary representation to a 1-D function, which presumably is easier to describe than the original 2-D boundary.

Shape Signatures
1. Complex coordinates
2. Central distance
3. Central complex coordinates
4. Chord length
5. Cumulative angular function
6. Curvature function
7. Area function

Shape Signatures -- Complex Coordinates
The boundary can be represented as the sequence of coordinates $s(t) = [x(t), y(t)]$ for $t = 0, 1, 2, \ldots, N-1$, where $x(t) = x_t$ and $y(t) = y_t$; the points $(x_t, y_t)$ are encountered in traversing the boundary in the counterclockwise direction, and $N$ is the total number of points on the boundary.
$$z(t) = x(t) + j\,y(t)$$
Here $j$ is the imaginary unit.

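Shape Signatures -- Central Distance
$$r(t) = \left([x(t) - x_c]^2 + [y(t) - y_c]^2\right)^{1/2}, \qquad x_c = \frac{1}{N}\sum_{t=0}^{N-1} x(t), \quad y_c = \frac{1}{N}\sum_{t=0}^{N-1} y(t)$$

Shape Signatures -- Central Complex Coordinates
$$z(t) = [x(t) - x_c] + j\,[y(t) - y_c]$$
with $x_c$ and $y_c$ the centroid coordinates defined above.

Shape Signatures -- Chord Length
$r^*(t)$ is the length of the chord inside the object that is perpendicular to the tangent at boundary point $p$, taken as a function of $p$.
The chord length function $r^*(t)$ is derived from the shape boundary without using any reference point.

Shape Signatures -- Cumulative Angular Function
It is also called the turning angle function.
$$\varphi(t) = [\theta(t) - \theta(0)] \bmod 2\pi$$

Shape Signatures -- Curvature Function
$$K(t) = \theta(t) - \theta(t-1), \qquad \theta(t) = \arctan\frac{y(t) - y(t+w)}{x(t) - x(t+w)}$$
where $w$ is the jumping step used in selecting the next pixel.

Shape Signatures -- Area Function
$$A(t) = \tfrac{1}{2}\,\left|x_1(t)\,y_2(t) - x_2(t)\,y_1(t)\right|$$
where $(x_1(t), y_1(t))$ and $(x_2(t), y_2(t))$ are taken to be two successive boundary points expressed relative to the centroid, so that $A(t)$ is the area of the triangle they form with the centroid.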
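The centroid-based signatures above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the boundary is given as a list of (x, y) tuples in counterclockwise order, and the function names (`centroid`, `complex_coordinates`, `central_complex_coordinates`, `central_distance`) and the sample square are hypothetical, chosen only for this example.

```python
import math


def centroid(points):
    """Centroid (x_c, y_c) of the boundary points."""
    n = len(points)
    x_c = sum(x for x, _ in points) / n
    y_c = sum(y for _, y in points) / n
    return x_c, y_c


def complex_coordinates(points):
    """z(t) = x(t) + j*y(t)."""
    return [complex(x, y) for x, y in points]


def central_complex_coordinates(points):
    """z(t) = [x(t) - x_c] + j*[y(t) - y_c]."""
    x_c, y_c = centroid(points)
    return [complex(x - x_c, y - y_c) for x, y in points]


def central_distance(points):
    """r(t) = sqrt([x(t) - x_c]^2 + [y(t) - y_c]^2)."""
    x_c, y_c = centroid(points)
    return [math.hypot(x - x_c, y - y_c) for x, y in points]


# Hypothetical boundary: a 2x2 square sampled at corners and edge midpoints,
# traversed counterclockwise, plus a translated copy of it.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
shifted = [(x + 10, y - 3) for x, y in square]

r_square = central_distance(square)
r_shifted = central_distance(shifted)  # same values: the signature ignores translation
```

Because both centroid-based signatures subtract the centroid, translating the shape leaves them unchanged, whereas the plain complex coordinates change with translation.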

Possible Boundary Features (1)
Simple features:
Area: $A$
Circumference: $r$
Euler's number: #parts - #holes
Direction: $\varphi_{major}$
Eccentricity: $l_{major} / l_{minor}$
Elongatedness: $w_{BB} / h_{BB}$
Rectangularity: $A / A_{BB}$
Compactness: $r^2 / A$
Gray value, color, and texture statistics
Projections
Not all of these are invariant.
(Figure: object with its bounding box BB of width $w_{BB}$ and height $h_{BB}$, major and minor axis lengths $l_{major}$ and $l_{minor}$, and major-axis direction $\varphi_{major}$.)

Possible Boundary Features (2)
Moments:
$$m_{pq} = \int_{x=-\infty}^{\infty}\int_{y=-\infty}^{\infty} x^p\, y^q\, I(x, y)\, dx\, dy$$
0th order (i.e., $p + q = 0$): size.
1st order (i.e., $p + q = 1$): center of mass.
2nd order (i.e., $p + q = 2$): orientation.
Higher order: shape.

Possible Boundary Features (2) Cont.
Central moments (translation invariant):
$$\mu_{pq} = \int_{x=-\infty}^{\infty}\int_{y=-\infty}^{\infty} (x - \bar{x})^p (y - \bar{y})^q I(x, y)\, dx\, dy = \int_{x=-\infty}^{\infty}\int_{y=-\infty}^{\infty} \left(x - \frac{m_{10}}{m_{00}}\right)^p \left(y - \frac{m_{01}}{m_{00}}\right)^q I(x, y)\, dx\, dy$$
7 Hu moments: translation, rotation, and scale invariant.
47 Zernike moments: translation, rotation, and scale invariant.

1) Seven moment invariants are calculated for each of these images, and the logarithm of the results is taken to reduce the dynamic range.
2) As the table shows, the results are in reasonable agreement with the invariants computed for the original image.
3) The major cause of error can be attributed to the digital nature of the data, especially for the rotated images.

Possible Boundary Features (3)
Skeletons: eat away pixels from all sides until there is one left.
Sensitive to noise.

Possible Boundary Features (4)
Skeleton features: count the number of endpoints; count the number of branches.
(Figure: skeleton of the original object, and skeleton after dilation, i.e., after adding a pixel at the border.)
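For a digital image the moment integrals above become sums over pixels. The following is a minimal sketch under stated assumptions: the image is a list of rows, the column index plays the role of x and the row index the role of y, and the names `raw_moment`, `central_moment`, `image`, and `shifted` are hypothetical.

```python
def raw_moment(image, p, q):
    """Discrete m_pq = sum over pixels of x^p * y^q * I(x, y)."""
    return sum((x ** p) * (y ** q) * value
               for y, row in enumerate(image)
               for x, value in enumerate(row))


def central_moment(image, p, q):
    """Discrete mu_pq, computed about the center of mass (m10/m00, m01/m00)."""
    m00 = raw_moment(image, 0, 0)
    x_bar = raw_moment(image, 1, 0) / m00
    y_bar = raw_moment(image, 0, 1) / m00
    return sum(((x - x_bar) ** p) * ((y - y_bar) ** q) * value
               for y, row in enumerate(image)
               for x, value in enumerate(row))


# Hypothetical binary image: a 3x2 block of ones, and the same block translated.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
shifted = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

area = raw_moment(image, 0, 0)                 # 0th order: size
mu20_original = central_moment(image, 2, 0)
mu20_shifted = central_moment(shifted, 2, 0)   # equal to mu20_original
```

The raw first-order moments differ between the two images, but the central moments agree, which illustrates the translation invariance stated on the slide.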

Possible Boundary Features (5) -- Chain Codes
A chain code can be generated by following a boundary in a clockwise direction and assigning a direction to the segments connecting every pair of pixels.
Disadvantages:
The resulting chain of codes tends to be quite long.
Any small disturbance along the boundary due to noise or imperfect segmentation causes changes in the code that may not be related to the shape of the boundary.
The accuracy of the resulting code representation depends on the spacing of the sampling grid.

Contours
Freeman chain codes $f(p)$, $p = 1, \ldots, P$, with eight directions numbered 0 to 7.
f(p) = 7,,,7,6,4,4,4,4,3,4,,
Normalize by interpreting the code as a base-8 (octal) number and shifting until the minimum is reached.
Direct use is very sensitive to noise; perhaps smooth first with a Gaussian filter $f_{Gauss}(\sigma)$.

The chain code of a boundary depends on the starting point. Treat the chain code as a circular sequence of direction numbers and redefine the starting point so that the resulting sequence of numbers forms an integer of minimum magnitude.
Rotation invariance: normalize for rotation by using the first difference of the chain code instead of the code itself. This difference is obtained by counting the number of direction changes (in a counterclockwise direction) that separate two adjacent elements of the code.
Scaling invariance: normalization for scaling is achieved by altering the size of the resampling grid.

Example steps: eliminate the effect of the starting point; eliminate the effect of the rotation; eliminate the effect of the scaling.

Possible Boundary Features (6)
Based on the chain code $f(p)$, with $(x_c(p), y_c(p))$ the contour coordinates:
$$\theta(p) = \tan^{-1}\frac{y_c(p) - y_c(p-1)}{x_c(p) - x_c(p-1)}$$
Curvature $\kappa(p)$: obtained from $\theta(p)$ using the Gaussian filter $f_{Gauss}(\sigma)$.
Total curvature:
$$C = \sum_{p=1}^{P} \kappa(p)$$
Bending energy:
$$E_B = \sum_{p=1}^{P} \kappa(p)^2$$
Polygons: easily scalable.
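The two chain-code normalizations described above can be sketched as follows. This is an illustrative sketch only: the function names `first_difference` and `min_magnitude_rotation` are hypothetical, the code is treated as circular as the text suggests, and the sample code `10103322` is a made-up 4-direction example rather than one taken from the slides.

```python
def first_difference(code, directions=8):
    """Circular first difference: counterclockwise direction changes between neighbors.

    Element i is (code[i] - code[i-1]) mod directions; code[-1] precedes code[0].
    """
    return [(code[i] - code[i - 1]) % directions for i in range(len(code))]


def min_magnitude_rotation(code):
    """Circular shift of the code that forms the integer of minimum magnitude.

    For equal-length digit sequences, lexicographic order equals numeric order.
    """
    return min(code[i:] + code[:i] for i in range(len(code)))


# Hypothetical 4-direction chain code.
code = [1, 0, 1, 0, 3, 3, 2, 2]

# Starting-point normalization: any circular shift yields the same canonical form.
canonical = min_magnitude_rotation(code)
same_canonical = min_magnitude_rotation(code[3:] + code[:3])

# Rotation normalization: rotating the shape by 90 degrees adds 1 (mod 4) to every
# element, which leaves the first difference unchanged.
rotated = [(c + 1) % 4 for c in code]
diff_original = first_difference(code, directions=4)
diff_rotated = first_difference(rotated, directions=4)
```

In practice the two steps are often combined: take the first difference, then shift it to its minimum-magnitude form, so that the result depends neither on the starting point nor on rotations by multiples of the direction step.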

Possible Boundary Features (7) -- Fourier Descriptors
Fourier transform of the signature $s(t)$:
$$u_n = \frac{1}{N}\sum_{t=0}^{N-1} s(t)\, e^{-j 2\pi n t / N}, \qquad n = 0, 1, \ldots, N-1$$
The complex coefficients $u_n$ are called the Fourier descriptors of the boundary and are denoted $FD_n$.

The inverse Fourier transform of these coefficients restores $s(t)$. That is:
$$s(t) = \sum_{n=0}^{N-1} u_n\, e^{j 2\pi n t / N}, \qquad t = 0, 1, \ldots, N-1$$
Suppose that only the first $P$ coefficients are used (that is, setting $u_n = 0$ for $n > P-1$). The result is the following approximation to $s(t)$:
$$\hat{s}(t) = \sum_{n=0}^{P-1} u_n\, e^{j 2\pi n t / N}, \qquad t = 0, 1, \ldots, N-1$$
The goal is to use a few Fourier descriptors to capture the gross essence of a boundary. These coefficients carry shape information and can be used as the basis for differentiating between distinct boundary shapes.

Translation: $s_t(k) = [x(k) + \Delta x] + j\,[y(k) + \Delta y]$
Change of starting point: $s_p(k) = x(k - k_0) + j\,y(k - k_0)$
1) The magnitude $|FD_n|$ is translation and rotation invariant.
2) $|FD_0|$ carries the scale information.
3) Low-frequency terms ($n$ small): smooth behavior.
4) High-frequency terms ($n$ large): jaggy, bumpy behavior.

Normalized Fourier Descriptor
$$f = \left[\frac{|FD_1|}{|FD_0|}, \frac{|FD_2|}{|FD_0|}, \ldots, \frac{|FD_m|}{|FD_0|}\right]$$
Why? Dividing by $|FD_0|$ removes the scale information.
When two shapes are compared, $m = N/2$ coefficients are used for the central distance, curvature, and angular function signatures; $m = N$ coefficients are used for complex coordinates.
$$d = \left(\sum_{i=1}^{m} \left|f_q^i - f_t^i\right|^2\right)^{1/2}$$
where $f_q = (f_q^1, f_q^2, \ldots, f_q^m)$ and $f_t = (f_t^1, f_t^2, \ldots, f_t^m)$ are the feature vectors of the two shapes, respectively.

Complex FFT example (the fft shown here omits the $1/N$ factor):
A = [2 3 4 4];
B = [1 3 4 7];
C = A + B*i;
fft(A) = [13, -2+i, -1, -2-i];
fft(B) = [15, -3+4i, -5, -3-4i];
fft(C) = [13+15i, -6-2i, -1-5i, 2-4i];
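The descriptor formulas above can be checked numerically with a direct (non-fast) discrete Fourier transform. This is a minimal sketch, assuming the signature is a list of real or complex samples; the function names `fourier_descriptors`, `reconstruct`, and `normalized_magnitudes` are hypothetical, and the 1/N factor follows the definition of $u_n$ given above.

```python
import cmath


def fourier_descriptors(signature):
    """u_n = (1/N) * sum_t s(t) * exp(-j*2*pi*n*t/N), for n = 0..N-1."""
    n_points = len(signature)
    return [
        sum(s * cmath.exp(-2j * cmath.pi * n * t / n_points)
            for t, s in enumerate(signature)) / n_points
        for n in range(n_points)
    ]


def reconstruct(descriptors, keep):
    """Approximate s(t) from the first `keep` descriptors (the rest are set to zero)."""
    n_points = len(descriptors)
    return [
        sum(descriptors[n] * cmath.exp(2j * cmath.pi * n * t / n_points)
            for n in range(keep))
        for t in range(n_points)
    ]


def normalized_magnitudes(descriptors, m):
    """f = [|FD_1|/|FD_0|, ..., |FD_m|/|FD_0|]; requires 1 <= m <= N-1 and FD_0 != 0."""
    d0 = abs(descriptors[0])
    return [abs(descriptors[n]) / d0 for n in range(1, m + 1)]


# Hypothetical real-valued signature (for example, a short central-distance sequence).
signature = [2, 3, 4, 4]
u = fourier_descriptors(signature)

# Using all N descriptors restores the signature up to rounding error.
restored = reconstruct(u, keep=len(u))

# Scaling the signature scales every |FD_n| equally, so the normalized vector is unchanged.
f_original = normalized_magnitudes(u, m=2)
f_scaled = normalized_magnitudes(fourier_descriptors([3 * s for s in signature]), m=2)
```

Multiplying `u` by N reproduces an unnormalized FFT of the same sequence, which gives a convenient cross-check against any FFT routine that omits the 1/N factor.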

Criteria for shape representation
Rotation, scale, and translation invariant.
Compact and easy to derive.
Perceptual similarity.
Robust to shape variations.
Application independent.
FD satisfies all these criteria.
Problem: different shape signatures can be used to derive FD; which is the best?

Possible Regional Features (1) -- Color Histogram
Histogram (gray-scale images).
Invariant to translation, rotation, and small variations.
A normalized histogram is invariant to scale.
Not very sensitive to noise.
But: removes a lot of information!

Possible Regional Features (1) -- MPEG-7 Color Descriptors

MPEG-7 Scalable Color Descriptor
A color histogram in HSV color space.
Encoded by a Haar wavelet transform.

MPEG-7 Dominant Color Descriptor
Cluster colors into a small number of representative colors.
It can be defined for each object, for regions, or for the whole image.
$F = \{\{c_i, p_i, v_i\}, s\}$
$c_i$: representative colors.
$p_i$: their percentages in the region.
$v_i$: color variances.
$s$: spatial coherency.

MPEG-7 Color Layout Descriptor
Partition the image into 64 (8x8) blocks.
Derive the average color of each block.
Apply the DCT and encode.
Efficient for sketch-based image retrieval and for content filtering using image indexing.

MPEG-7 Color Structure Descriptor
Scan the image with an 8x8 pixel block.
Count the number of blocks containing each color.
Generate a color histogram in HMMD color space.
Main usages: still image retrieval; natural image retrieval.

Possible Regional Features (2) -- Texture Features
The three principal approaches used in image processing to describe the texture of a region are statistical, structural, and spectral.
Statistical approaches yield characterizations of textures as smooth, coarse, grainy, and so on.
Structural techniques deal with the arrangement of image primitives, such as the description of texture based on regularly spaced parallel lines.
Spectral techniques are based on properties of the Fourier spectrum and are used primarily to detect global periodicity in an image by identifying high-energy, narrow peaks in the spectrum.

Statistical Approaches
Let $z_i$, $i = 0, 1, \ldots, L-1$, be the gray levels and $p(z_i)$ the normalized histogram.
Mean: $m = \sum_{i=0}^{L-1} z_i\, p(z_i)$
Standard deviation: $\sigma = \sqrt{\mu_2(z)} = \sqrt{\sigma^2}$, with $\mu_2(z) = \sum_{i=0}^{L-1} (z_i - m)^2\, p(z_i)$
Smoothness: $R = 1 - \dfrac{1}{1 + \sigma^2(z)}$
Third moment: $\mu_3(z) = \sum_{i=0}^{L-1} (z_i - m)^3\, p(z_i)$
Uniformity: $U = \sum_{i=0}^{L-1} p^2(z_i)$
Entropy: $e = -\sum_{i=0}^{L-1} p(z_i) \log_2 p(z_i)$

The standard deviation is a measure of gray-level contrast that can be used to establish descriptors of relative smoothness.
R: normalized variance in the range [0, 1].
The third moment is a measure of the skewness of the histogram.
Uniformity: a histogram-based measure.
Average entropy is a measure of variability and is 0 for a constant image.

Structural Approaches

Spectral Approaches
$$S(r) = \sum_{\theta=0}^{\pi} S_\theta(r), \qquad S(\theta) = \sum_{r=1}^{R_0} S_r(\theta)$$
where $S$ is the spectrum function, and $r$ and $\theta$ are the radius and angle in polar coordinates.
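The histogram-based texture descriptors listed under the statistical approaches can be computed directly from a list of gray levels. This is a minimal sketch: the function name `texture_statistics`, the dictionary keys, and the sample pixel lists are hypothetical, and the variance is used as-is in R (some texts first normalize it by (L-1)^2, which is not done here).

```python
import math


def texture_statistics(pixels, levels=256):
    """Histogram-based texture descriptors for integer gray levels in [0, levels-1]."""
    n = len(pixels)
    histogram = [0] * levels
    for z in pixels:
        histogram[z] += 1
    p = [count / n for count in histogram]  # normalized histogram p(z_i)

    mean = sum(z * p[z] for z in range(levels))
    variance = sum((z - mean) ** 2 * p[z] for z in range(levels))
    third_moment = sum((z - mean) ** 3 * p[z] for z in range(levels))
    uniformity = sum(pz ** 2 for pz in p)
    # Terms with p(z_i) = 0 contribute nothing to the entropy and are skipped.
    entropy = -sum(pz * math.log2(pz) for pz in p if pz > 0)

    return {
        "mean": mean,
        "std": math.sqrt(variance),
        "R": 1.0 - 1.0 / (1.0 + variance),  # assumption: unnormalized variance
        "third_moment": third_moment,
        "uniformity": uniformity,
        "entropy": entropy,
    }


# Hypothetical regions: a constant patch and a two-level patch, both with 8 gray levels.
constant_patch = [7] * 16
two_level_patch = [0, 0, 4, 4]

stats_constant = texture_statistics(constant_patch, levels=8)
stats_two_level = texture_statistics(two_level_patch, levels=8)
```

A constant patch gives zero standard deviation, zero entropy, and uniformity 1, matching the remark that entropy is 0 for a constant image; the two-level patch has higher entropy and lower uniformity.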

MPEG-7 Homogeneous Texture Descriptor
Partition the frequency domain into 30 channels (each modeled by a 2D Gabor function).
Compute the energy and energy deviation for each channel.
Compute the mean and standard deviation of the frequency coefficients.
$F = \{f_{DC}, f_{SD}, e_1, \ldots, e_{30}, d_1, \ldots, d_{30}\}$
An efficient implementation: Radon transform followed by Fourier transform.

2D Gabor Function
It is a Gaussian-weighted sinusoid.
It is used to model individual channels.
Each channel filters a specific type of texture.

Radon Transform
Transforms images with lines into a domain of possible line parameters.
Each line is transformed to a peak point in the resulting image.

MPEG-7 Non-Homogeneous Texture Descriptor
Represents the spatial distribution of five types of edges: vertical, horizontal, 45 degrees, 135 degrees, and nondirectional.
Divide the image into 16 (4x4) blocks.
Generate a 5-bin histogram for each block.
It is scale invariant.

Feature Space (1)
End result: a k-dimensional space, in which each dimension is a feature, containing N (labeled) samples (objects).
(Figure: 3-D scatter plot over three feature axes with two labeled classes, camels and elephants.)

Feature Space (2)
Different features have different scales (e.g., area versus circumference).
Solution: normalize the variance in each direction:
$$x_1' = \frac{x_1}{\sqrt{\mathrm{var}(x_1)}}, \qquad x_2' = \frac{x_2}{\sqrt{\mathrm{var}(x_2)}}$$
(Figure: scatter plots of area versus circumference before and after normalization.)

Feature Space (3)
What is our basic problem? Pattern recognition:
Clustering: find natural groups of samples in unlabelled data.
Density estimation: make a statistical model of the data.
Classification: find functions separating the classes.
Regression: fit lines or other functions to data (not in this course).

Summary
Features are derived from measurements.
Application-dependent knowledge tells what features are important.
Invariance is important to make discrimination easier.
Recognition: noise removal; shading removal; segmentation and labeling.
Features: simple features, skeletons, moments, polygons, chain code, Fourier descriptors, ...

Fundamental Problems in Pattern Recognition System Design
The first problem is concerned with the representation of input data which can be measured from the objects to be recognized.
The pattern vectors contain all the measured information available about the patterns.
The measurements performed on the objects of a pattern class may be regarded as a coding process which consists of assigning to each pattern characteristic a symbol from the alphabet set.
When the measurements yield information in the form of real numbers, it is often useful to think of a pattern vector as a point in an n-dimensional Euclidean space.
The set of patterns belonging to the same class corresponds to an ensemble of points scattered within some region of the measurement space.

The second problem concerns the extraction of characteristic features or attributes from the received input data and the reduction of the dimensionality of pattern vectors. (This is often referred to as the preprocessing and feature extraction problem.)
The features of a pattern class are the characterizing attributes common to all patterns belonging to that class. Such features are often referred to as intraset features.
The features which represent the differences between pattern classes may be referred to as the interset features.
The elements of intraset features which are common to all pattern classes under consideration carry no discriminatory information and can be ignored.
The extraction of features has been recognized as an important problem in the design of pattern recognition systems.

The third problem involves the determination of optimum decision procedures, which are needed in the identification and classification process.
If complete a priori knowledge about the patterns to be recognized is available, the decision functions may be determined with precision on the basis of this information.
If only qualitative knowledge about the patterns is available, reasonable guesses of the forms of the decision functions can be made and adjusted as necessary.
If there exists little, if any, a priori knowledge about the patterns to be recognized, a training or learning procedure is needed.

The patterns to be recognized and classified by an automatic pattern recognition system must possess a set of measurable characteristics.
Correct recognition will depend on: the amount of discriminating information contained in the measurements; the effective utilization of this information.

Design Concepts and Methodologies

Membership-roster Concept
Characterization of a pattern class by a roster of its members suggests automatic pattern recognition by template matching.
The membership-roster approach will work satisfactorily under the condition of nearly perfect pattern samples.

Common-property Concept
Characterization of a pattern class by common properties shared by all of its members suggests automatic pattern recognition via the detection and processing of similar features.
The basic assumption in this method is that the patterns belonging to the same class possess certain common properties or attributes which reflect similarities among these patterns.

Advantage (Membership-roster Concept vs. Common-property Concept)
The storage requirement for the features of a pattern class is much less severe than that for all the patterns in the class.
Significant pattern variations cannot be tolerated in template matching.
If all the features of a class can be determined from sample patterns, the recognition process reduces simply to feature matching.

Clustering Concept
When the patterns of a class are vectors whose components are real numbers, a pattern class can be characterized by its clustering properties in the pattern space.
If the classes are characterized by clusters which are far apart, simple recognition schemes such as minimum-distance classifiers may be successfully employed.
When the clusters overlap, it becomes necessary to utilize more sophisticated techniques for partitioning the pattern space.
Overlapping clusters are the result of: a deficiency in observed information; the presence of measurement noise.
The degree of overlapping can often be minimized by increasing the number and the quality of measurements performed on the patterns of a class.
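The minimum-distance classifier mentioned under the clustering concept can be sketched as follows. This is an illustrative sketch only: each class is summarized by the mean of its training vectors and a new pattern is assigned to the class with the nearest mean in Euclidean distance. The function names `class_means` and `classify`, the class labels, and the sample data are hypothetical.

```python
import math


def class_means(samples_by_class):
    """Mean vector of each class, computed from its training patterns."""
    means = {}
    for label, samples in samples_by_class.items():
        dimension = len(samples[0])
        means[label] = [
            sum(sample[d] for sample in samples) / len(samples)
            for d in range(dimension)
        ]
    return means


def classify(pattern, means):
    """Assign the pattern to the class whose mean is closest (Euclidean distance)."""
    return min(means, key=lambda label: math.dist(pattern, means[label]))


# Hypothetical 2-D training patterns forming two well-separated clusters.
training = {
    "salmon": [(2.0, 1.0), (3.0, 2.0), (2.0, 2.0)],
    "sea bass": [(8.0, 7.0), (9.0, 8.0), (8.0, 8.0)],
}
means = class_means(training)

label_a = classify((3.0, 1.0), means)
label_b = classify((7.0, 9.0), means)
```

This simple scheme is adequate only when the clusters are far apart, as the text notes; with overlapping clusters the nearest mean is no longer a reliable guide.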

The basic design concepts for automatic pattern recognition described above may be implemented by three principal categories of methodology: heuristic; mathematical; linguistic or syntactic.

Heuristic Methods
The heuristic approach is based on human intuition and experience, making use of the membership-roster and common-property concepts.
A system designed using this principle generally consists of a set of ad hoc procedures developed for specialized recognition tasks.
The decision is based on ad hoc rules.
Example: character recognition (detection of features such as the number and sequence of particular strokes).

Mathematical Methods
The mathematical approach is based on classification rules which are formulated and derived in a mathematical framework, making use of the common-property and clustering concepts.
Deterministic approach: does not explicitly employ the statistical properties of the pattern classes.
Statistical approach: formulated and derived in a statistical framework.
Example: the Bayes classification rule and its variations. This rule yields an optimum classifier when the probability density function of each pattern population and the probability of occurrence of each pattern class are known.

Linguistic (Syntactic) Methods
Characterization of patterns by primitive elements (subpatterns) and their relationships suggests automatic pattern recognition by the linguistic or syntactic approach, making use of the common-property concept.
A pattern can be described by a hierarchical structure of subpatterns analogous to the syntactic structure of languages. This permits application of formal language theory to the pattern recognition problem.
This approach is particularly useful in dealing with patterns which cannot be conveniently described by numerical measurements or are so complex that local features cannot be identified and global properties must be used.

In a supervised learning environment, the system is taught to recognize patterns by means of various adaptive schemes. The essentials of this approach are a set of training patterns of known classification and the implementation of an appropriate learning procedure.
The unsupervised pattern recognition techniques are applicable to situations where only a set of training patterns of unknown classification may be available.

Examples of Automatic Pattern Recognition Systems
Character recognition. Technique used: rather than being compared with pre-stored patterns, handprinted characters are analyzed as combinations of common features, such as curved lines, vertical and horizontal lines, corners, and intersections.
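The Bayes classification rule mentioned under the statistical approach can be sketched for one feature. This is a minimal sketch under stated assumptions: the class-conditional densities are taken to be known 1-D Gaussians and the priors are given, which is the situation in which the text says the rule is optimum. The function names (`gaussian_pdf`, `posteriors`, `bayes_classify`), class labels, and parameter values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Value of the 1-D normal density with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def posteriors(x, classes):
    """A posteriori probability of each class given x.

    `classes` maps a label to (prior, mean, std) of its assumed Gaussian density.
    """
    joint = {
        label: prior * gaussian_pdf(x, mean, std)
        for label, (prior, mean, std) in classes.items()
    }
    evidence = sum(joint.values())
    return {label: value / evidence for label, value in joint.items()}


def bayes_classify(x, classes):
    """Choose the class with the largest a posteriori probability."""
    post = posteriors(x, classes)
    return max(post, key=post.get)


# Hypothetical length models: (prior, mean, std) for each class.
equal_priors = {"salmon": (0.5, 5.0, 1.0), "sea bass": (0.5, 8.0, 1.0)}
salmon_common = {"salmon": (0.9, 5.0, 1.0), "sea bass": (0.1, 8.0, 1.0)}

decision_equal = bayes_classify(6.6, equal_priors)
# A larger prior for salmon pushes the decision boundary toward the sea bass mean.
decision_skewed = bayes_classify(6.6, salmon_common)
```

The same measurement can be assigned to different classes depending on the priors, which shows why the probability of occurrence of each class must be known for the rule to be optimum.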

Object Recognition: Matching Technique

Automatic Classification of Remotely Sensed Data
Examples: land use, crop inventory, crop-disease detection, forestry, monitoring of air and water quality, geological and geographical studies, and weather prediction, plus a score of other applications of environmental significance.
Technique used: Bayes classifier.

Object Recognition: Bayes Classifier Technique

Biomedical Applications
Technique used: pattern primitives, such as long arcs, short arcs, and semi-straight segments, which characterize the chromosome boundaries, are defined. When combined, these primitives form a string or symbol sentence which can be associated with a so-called pattern grammar. There is one grammar for each type (class) of chromosome.

Fingerprint Recognition
Technique used: it detects tentative minutiae and records their precise locations and angles.

Nuclear Reactor Component Surveillance
Technique used: detect the clusters of pattern vectors by iterative applications of a cluster-seeking algorithm. The data cluster centers and associated descriptive parameters, such as cluster variances, can then be used as templates against which measurements are compared at any given time in order to determine the status of the plant. Significant deviations from the pre-established characteristic normal behavior are flagged as indications of an abnormal operating condition.