Multi prototype fuzzy pattern matching for handwritten character recognition

MILIND E. RANE, DHABE P. S AND J. B. PATIL
Dept. of Electronics and Computer, R.C. Patel Institute of Technology, Shirpur, Dist. Dhule, Maharashtra, Pin-425405, INDIA

Abstract: In this paper a novel method of multi-prototype fuzzy matching is proposed. It calculates multiple prototypes for a single class using a fuzzy parameter. Its classification and recognition ability is tested on a realistic database of handwritten digits. Its performance is found to be superior to traditional single-prototype-per-class matching, for classification as well as recognition, even with few prototypes per class. It also eliminates a serious handicap of the single-prototype method: once the class prototype has been calculated, recognition cannot be improved using the same training set that was used to calculate the prototype.

Key-Words: prototype matching, classification, recognition, handwritten character recognition.

1 Introduction

Handwritten character recognition remains an active and difficult research topic. Matching against a class prototype is the traditional method used for this task. This method calculates a single prototype per class, generally taken as the average of all the patterns used to calculate the class prototype. To decide the class label of a pattern, the applied pattern is matched with all the existing class prototypes, and the label of the prototype at minimum distance from the pattern is assigned to it. The method is thus a minimum distance classifier based on the Euclidean distance between prototype and applied pattern [1]. This approach is well suited to pattern recognition tasks in which the patterns within a class deviate little from one another. The single-prototype concept is less effective for handwritten character recognition, however, because the patterns belonging to a class vary considerably in shape. Figure 1 shows variations in the shape of the Devanagari (script used in India) digit 3. The following facts are immediately apparent from the figure.

i. Each digit varies considerably in shape.
ii. The angle of orientation of each digit is different.
iii. The thickness of the digit varies considerably, depending on the thickness of the pen tip.
iv. Each digit is written in a different style.

Fig. 1. Variations in shape of digit three.

If we calculate a single prototype for the class of patterns depicted in Figure 1, that prototype cannot capture the peculiarities of all the patterns within the class. It therefore seems better to calculate more than one prototype for a single class so as to capture the peculiarities of all possible shapes, which is the central idea of this paper. From Figure 1 we can see that if separate prototypes are created from the first, second and third digits and from the fourth, fifth, sixth and seventh digits, better recognition can be expected from prototype matching. Working in this direction led to a fuzzy system that can be used for classification and recognition of handwritten characters.

The remainder of this paper is organized as follows. Section 2 discusses the realistic image database of 1000 handwritten digits used for experimentation, along with image normalization with respect to translation and scale. Section 3 describes the method used for calculating multiple prototypes. A 2-D example is discussed in Section 4 to explain the behavior of the proposed method. Experimental results are given in Section 5, along with performance
comparisons with the traditional single-prototype method. Conclusions are drawn in Section 6 based on the empirical evidence. References are cited at the end.

2 Image database

The image database used in this work was collected from 100 writers using different pens. Each person wrote the ten digits from 0 to 9. These characters were scanned and stored as 65 × 65 binary .bmp images. The digit images have arbitrary rotation, translation and scale. Figure 2 depicts a set of ten digits written by a single person.

Fig. 2. A set of ten digits.

From Figures 1 and 2 it is clear that the characters vary in size, angle of orientation and position within the image frame. We therefore cannot extract features directly from such images, since the features would depend on size, position and rotation angle (i.e. the angle of the axis of least moment of inertia). Before feature extraction the images are normalized with respect to scale and translation. For feature extraction we used ring features, which are invariant to rotation, so the images are not normalized with respect to rotation [2]. To achieve normalization with respect to translation and scale we used the moment normalization discussed in [3].

To translate a digit to the center of the image frame we calculated the regular geometrical zeroth and first order moments to obtain the center of gravity (CG) of the object. The $(p+q)$-th order moment $M_{pq}$ of a digital image $f(x, y)$ is given by (1)

$$M_{pq} = \sum_{x=1}^{M} \sum_{y=1}^{M} x^{p} y^{q} f(x, y) \qquad (1)$$

where $p, q = 0, 1, 2, \ldots$ The CG of the object image is then given by (2)

$$CG = \left( \frac{M_{1,0}}{M_{0,0}}, \frac{M_{0,1}}{M_{0,0}} \right) \qquad (2)$$

The calculated CG is shifted to the center of the image frame (33, 33), along with all the object pixels, to obtain normalization with respect to translation. For scale normalization we count the total number of object pixels present in the image, which is given by $M_{0,0}$. Each image is then represented by 400 object pixels to compensate for its size.
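Equations (1) and (2) and the shift of the CG to the frame center can be sketched in Python with NumPy. This is a minimal illustration, not the authors' implementation: the function names, the convention that x indexes columns and y indexes rows, the rounding of the shift to whole pixels, and the toy rectangle image are all assumptions.

```python
import numpy as np

def geometric_moment(img, p, q):
    """M_pq = sum_x sum_y x^p y^q f(x, y), using 1-based pixel coordinates as in Eq. (1)."""
    rows, cols = img.shape
    # Assumed convention: x runs along columns, y along rows.
    y, x = np.mgrid[1:rows + 1, 1:cols + 1]
    return float(np.sum((x ** p) * (y ** q) * img))

def center_of_gravity(img):
    """CG = (M10 / M00, M01 / M00), Eq. (2)."""
    m00 = geometric_moment(img, 0, 0)
    return geometric_moment(img, 1, 0) / m00, geometric_moment(img, 0, 1) / m00

def translate_to_center(img, center=(33, 33)):
    """Shift all object pixels so that the CG lands on the frame center."""
    cx, cy = center_of_gravity(img)
    dx = int(round(center[0] - cx))  # whole-pixel shift (assumption)
    dy = int(round(center[1] - cy))
    out = np.zeros_like(img)
    ys, xs = np.nonzero(img)
    ys2, xs2 = ys + dy, xs + dx
    # Drop any pixel that would leave the frame.
    keep = (ys2 >= 0) & (ys2 < img.shape[0]) & (xs2 >= 0) & (xs2 < img.shape[1])
    out[ys2[keep], xs2[keep]] = img[ys[keep], xs[keep]]
    return out

# Toy binary image (hypothetical): a 5 x 7 rectangle away from the center.
img = np.zeros((65, 65), dtype=np.uint8)
img[5:10, 8:15] = 1

m00 = geometric_moment(img, 0, 0)          # number of object pixels
cg = center_of_gravity(img)
cg_centered = center_of_gravity(translate_to_center(img))
print(m00, cg, cg_centered)
```

For the toy rectangle, `m00` is simply the object pixel count, and the CG of the translated image coincides with the frame center (33, 33).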
The scaling factor $\beta$ is calculated as (3)

$$\beta = \sqrt{\frac{400}{M_{0,0}}} \qquad (3)$$

The image $f'(x, y)$ after scaling is obtained by (4)

$$f'(x, y) = f\left( \frac{x}{\beta}, \frac{y}{\beta} \right) \qquad (4)$$

Figure 3 shows (from left to right) an original image, its scaled version and the image after translation.

Fig. 3. Image normalization.

The translation is done after scaling, since scaling moves the CG of the object in the image. The ring features are extracted after these steps.

3 Multiple prototype calculation

For the calculation of multiple prototypes we used a fuzzy membership function. This function calculates the membership of an input pattern $R_h$ in a class prototype $C_i$ as shown in (5).

$$f(C_i) = \begin{cases} 0 & \text{if } \gamma d \ge 1 \\ 1 - \gamma d & \text{if } \gamma d < 1 \end{cases} \qquad (5)$$

where the distance $d$ is calculated as

$$d = \left( \sum_{j=1}^{n} \left( C_{ij} - R_{hj} \right)^{2} \right)^{1/2} \qquad (6)$$

with $n$ the dimension of the patterns, and $\gamma$ is the sensitivity parameter that regulates how fast the membership value decreases as the distance between cluster prototype and input pattern increases, where
$\gamma > 0$. The membership function defined in (5) possesses the properties of fuzzy sets, such as normality and convexity, described in [4]. Plots of the membership function for a 2-D class prototype (0.5, 0.5), with $\gamma = 1$ and $\gamma = 4$, are shown in Figure 4.

Fig. 4a. Membership function plot for $\gamma = 1$.
Fig. 4b. Membership function plot for $\gamma = 4$.

Prototype calculation requires finding the natural grouping of patterns within a class; the largest group within the class is then used to calculate a single prototype for that group. To find a pattern group we use a parameter called the grouping factor, $0 < \alpha \le 1$, which defines the size of a pattern group and thus controls the number of prototypes created within a class. A larger value of $\alpha$ leads to more prototypes within the given class.

Let the pattern set $R = \{ R_h \mid h = 1, 2, \ldots, k \}$ contain $k$ patterns of dimension $n$ belonging to $m$ classes. To calculate the pattern grouping of the $i$-th class, we take the set $S$ of all patterns belonging to the $i$-th class, where $S \subseteq R$. To determine the pattern groups of this class, each pattern in turn is treated as a class prototype and the fuzzy membership of every other pattern in it is calculated using (5). For each candidate pattern, the patterns that give a fuzzy membership value larger than $\alpha$ are counted. Let $R_j \in S$ be the pattern with the maximum count and $S^1$ be the set of these $p$ patterns falling close around $R_j$ with fuzzy membership $\ge \alpha$. The class prototype is then computed as

$$C_i = \frac{1}{p} \sum_{j=1}^{p} S^{1}_{ji} \quad \text{for } i = 1, 2, \ldots, n \qquad (7)$$

where $S^{1}_{ji}$ is the $i$-th component of the $j$-th pattern in $S^1$. The patterns that are already grouped and present in $S^1$ are now removed from $S$, and the patterns that are not yet grouped are considered for the calculation of the next prototype within the same class. This process repeats until all the patterns of the class are grouped and $S$ becomes empty. The same steps are repeated for each class $c = 1, 2, \ldots, m$. Thus for each class we obtain multiple prototypes.

4 A 2-D Example

To explain the calculation of multiple prototypes, an example in 2-D pattern space is used. The patterns are selected to exercise all the behavioral possibilities of the algorithm. They are listed in Table 1 and their scatter plot is shown in Figure 5. All the patterns belong to the same class. The scatter plot confirms that a single prototype cannot capture the peculiarities of all the patterns. At least three prototypes are expected to be needed for better recognition. The order of data presentation to the system is the same as that shown in Table 1.

Table 1. A set of 2-D patterns.
P1 = [0.5, 0.7]      P6 = [0.1, 0.2]      P11 = [0.7, 0.3]
P2 = [0.6, 0.7]      P7 = [0.2, 0.2]      P12 = [0.8, 0.3]
P3 = [0.55, 0.65]    P8 = [0.15, 0.15]    P13 = [0.75, 0.25]
P4 = [0.5, 0.6]      P9 = [0.1, 0.1]      P14 = [0.7, 0.2]
P5 = [0.6, 0.6]      P10 = [0.2, 0.1]     P15 = [0.8, 0.2]

With $\alpha = 0.85$, the patterns listed in Table 1 were presented to the algorithm. The first largest group selected for calculation of a prototype contains (P1, P2, P3, P4, P5) and the calculated prototype is (0.55, 0.65). These patterns are then removed from the dataset and the remaining ten
patterns P6, ..., P15 are considered for pattern grouping and prototype calculation. From these ten patterns the algorithm selected the largest group, containing (P6, P7, P8, P9, P10), and calculated the second prototype as (0.15, 0.15).

Fig. 5. Scatter plot of 2-D example.

After removing these patterns only five patterns remained in the dataset; the algorithm selected the largest group, containing (P11, P12, P13, P14, P15), and calculated the prototype as (0.75, 0.25). Once these patterns were removed the dataset was empty and the algorithm halted, having created three prototypes for patterns belonging to the same class. The pattern groups are circled and named c1, c2 and c3 in Figure 6. The calculated prototypes are indicated by square dots. Figure 6 shows that the algorithm found the natural pattern groupings within a single class, as expected.

Fig. 6. Pattern groups and multiple prototypes.

The number of prototypes created can be increased by increasing the value of $\alpha$. For a smaller value of $\alpha$ the algorithm calculates fewer prototypes.

5 Experimental results

We simulated the proposed method in MATLAB 5.x. For experimentation we used the database of handwritten numerals described in Section 2. Ring features are extracted from these images after normalization with respect to scale and translation. The scale normalization is done in such a way that the normalized image contains 400 pixels. Ring features are extracted with the ring width set to 2. Since the image size is 65 × 65, this creates 16 rings within an image frame.
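As an illustration of the simulated procedure, the grouping steps of Section 3 can be sketched in Python and run on the 2-D patterns of Table 1. This is a hedged sketch, not the authors' MATLAB code: the function names are hypothetical, the value $\gamma = 1$ is an assumption (the paper does not state the $\gamma$ used in the 2-D example), the threshold is applied as "membership $\ge \alpha$", and ties for the largest group are broken by order of presentation.

```python
import numpy as np

def membership(prototype, patterns, gamma):
    """Eq. (5): 1 - gamma*d when gamma*d < 1, otherwise 0; d is the Euclidean distance of Eq. (6)."""
    d = np.linalg.norm(patterns - prototype, axis=1)
    return np.maximum(0.0, 1.0 - gamma * d)

def multi_prototypes(patterns, alpha, gamma):
    """Return the prototypes of one class and the pattern indices grouped under each."""
    remaining = np.asarray(patterns, dtype=float)
    idx = np.arange(len(remaining))
    prototypes, groups = [], []
    while len(remaining) > 0:
        # Treat every remaining pattern as a candidate prototype and
        # compute the membership of all remaining patterns in it.
        memb = np.array([membership(c, remaining, gamma) for c in remaining])
        in_group = memb >= alpha          # each candidate always contains itself
        counts = in_group.sum(axis=1)
        best = int(np.argmax(counts))     # first maximum: presentation order breaks ties
        mask = in_group[best]
        prototypes.append(remaining[mask].mean(axis=0))   # Eq. (7)
        groups.append(idx[mask].tolist())
        remaining, idx = remaining[~mask], idx[~mask]      # remove grouped patterns
    return np.array(prototypes), groups

# Patterns P1..P15 of Table 1, in order of presentation.
table1 = [
    [0.5, 0.7], [0.6, 0.7], [0.55, 0.65], [0.5, 0.6], [0.6, 0.6],
    [0.1, 0.2], [0.2, 0.2], [0.15, 0.15], [0.1, 0.1], [0.2, 0.1],
    [0.7, 0.3], [0.8, 0.3], [0.75, 0.25], [0.7, 0.2], [0.8, 0.2],
]

prototypes, groups = multi_prototypes(table1, alpha=0.85, gamma=1.0)
print(np.round(prototypes, 2))
print(groups)
```

Under these assumptions the sketch yields three groups, (P1 to P5), (P6 to P10) and (P11 to P15), with prototypes (0.55, 0.65), (0.15, 0.15) and (0.75, 0.25), matching the values reported for the 2-D example.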
With 16 rings, the feature vector of a single image has size (1, 16) and the feature matrix for the complete database of 1000 images has size (1000, 16). The values in the resulting feature matrix $R$ are arbitrary integers. To bring all the values within the interval [0, 1] we used (8)

$$R = \frac{R}{\max(R)} \qquad (8)$$

where $\max(R)$ is the maximum value in the matrix $R$. Such normalization of the feature matrix makes the pattern space an n-dimensional hypercube and is more convenient for the recognition task. Although this normalization loses some absolute information about the patterns, it retains all the relative information about them [5].

The feature matrix contains patterns of 10 classes, i.e. 100 patterns of each class. We used the first 500 pattern features from $R$ for calculating prototypes, i.e. 50 pattern features per class, and the remaining 500 for testing recognition. The proposed method is compared with the traditional single-prototype method by applying the same patterns from the same database in the same order of presentation, to make the comparison valid.

For the first 500 patterns the single-prototype method created ten prototypes for the ten classes. We used the same 500 pattern features to measure classification performance. The classification rate is found to be 38 percent. The remaining 500 pattern features, which were not used for calculating the class prototypes, are then used for testing recognition. This gave a recognition rate of 27.6 percent. These results are shown in Table 2.
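The normalization of (8) and the single-prototype baseline can be sketched as follows. This is only an illustrative sketch on a tiny hypothetical 2-D feature matrix, not the 16-dimensional ring features of the paper; the function names and the toy data are assumptions.

```python
import numpy as np

def normalize(R):
    """Eq. (8): divide the whole feature matrix by its maximum value."""
    return R / R.max()

def class_means(X, y):
    """Single prototype per class: the mean of that class's training patterns."""
    labels = np.unique(y)
    return labels, np.array([X[y == c].mean(axis=0) for c in labels])

def classify(X, labels, prototypes):
    """Minimum distance classifier: label of the nearest prototype (Euclidean)."""
    d = np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)
    return labels[np.argmin(d, axis=1)]

# Hypothetical integer features: first four rows for training, last two for testing.
R = np.array([[2, 4], [4, 2], [20, 18], [18, 20], [3, 3], [19, 19]], dtype=float)
y = np.array([0, 0, 1, 1, 0, 1])

Rn = normalize(R)
labels, prototypes = class_means(Rn[:4], y[:4])

classification_rate = 100.0 * np.mean(classify(Rn[:4], labels, prototypes) == y[:4])
recognition_rate = 100.0 * np.mean(classify(Rn[4:], labels, prototypes) == y[4:])
print(prototypes, classification_rate, recognition_rate)
```

Here "classification" is measured on the patterns used to build the prototypes and "recognition" on the held-out patterns, mirroring the 500/500 split described above.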

Table 2. Performance of single prototype method
                  Percentage
Classification    38.0
Recognition       27.6

The time required to calculate the prototypes and the recall time per pattern during recognition, both in seconds, for the single-prototype method are shown in Table 3.

Table 3. Timing analysis of single prototype method
Prototype calculation time (s)    Recall time per pattern (s)
0.031                             0.0015

The proposed multi-prototype method is tested with the same pattern feature set by applying the patterns in the same order. For $\alpha = 0.6$ it created a total of 22 prototypes belonging to the ten classes. The number of prototypes created for each class is given by the row vector $V$, where $V = (5, 1, 1, 2, 1, 1, 3, 3, 3, 2)$. The value of $V(i)$ indicates the number of prototypes created for the $i$-th class, where $i = 1, 2, \ldots, 10$. The classification and recognition performance is shown in Table 4. For $\alpha = 0.7$ it created a total of 42 prototypes with $V = (8, 3, 4, 4, 3, 2, 5, 5, 4, 4)$. For $\alpha = 0.8$ it created a total of 120 prototypes with $V = (16, 8, 10, 15, 7, 7, 11, 18, 15, 13)$. Similarly, for $\alpha = 0.85$ and $\alpha = 0.88$ it created a total of 217 and 346 prototypes, respectively. Table 4 shows the classification and recognition performance for the different values of $\alpha$, where CC indicates the percentage classification and RR indicates the percentage recognition rate.

Table 4. Performance of multiprototype method.
Value of α    Total prototypes created    CC (%)    RR (%)
0.7           42                          41.6      32.2
0.8           120                         55        36.8
0.85          217                         75        39.2
0.88          346                         83        42

The time required to calculate the prototypes for 500 pattern features and the recall time per pattern during recognition, both in seconds, for the multi-prototype method are tabulated in Table 5.

Table 5. Timing analysis of multiprototype method
Value of α    Prototype calculation time (s)    Recall time per pattern (s)
0.7           4.36                              0.0086
0.8           7.42                              0.0231
0.85          14.61                             0.0413
0.88          30.578                            0.0661

6 Conclusions

The proposed multi-prototype method is found to be superior to the traditional method of a single prototype per class. It also adds flexibility: the classification and recognition rates can be increased by increasing the total number of prototypes created per class, which is not possible with a single class prototype and is the severe handicap of that method. By increasing the value of $\alpha$ we can increase the total number of prototypes created per class; in other words, we add more knowledge that is used for classification and recognition decisions. A moderate value of $\alpha$ should be chosen to obtain an acceptable recognition/classification rate. The proposed method performed better than the single-prototype method on the difficult task of realistic handwritten character recognition.

The time required to calculate the class prototypes with this method is larger than for the single-prototype method. This time increases with $\alpha$, but this cannot be treated as a serious drawback since prototype calculation needs to be done only once. The recall time per pattern, i.e. the time required to decide the class of a pattern, also grows as $\alpha$ increases. The value of $\alpha$ must therefore be chosen so that the method creates more than one prototype per class while keeping the recall time per pattern acceptable.

This method can also learn patterns on the fly: a new pattern can be added at any time by finding the class prototype in which the newly applied pattern has a fuzzy membership value greater than or equal to $\alpha$, and accommodating the new pattern in that prototype. If no prototype satisfies this condition, a new class prototype is created from the pattern.
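The on-the-fly learning described above can be sketched as follows for the prototypes of one class. The paper does not say how a new pattern is "accommodated" in an existing prototype, so updating the prototype as a running mean of its members is an assumption made here, as are the class name, the choice of the highest-membership prototype, and the parameter values.

```python
import numpy as np

class OnlineClassPrototypes:
    """Prototypes of a single class, updated one pattern at a time (illustrative only)."""

    def __init__(self, alpha, gamma):
        self.alpha = alpha
        self.gamma = gamma
        self.prototypes = []   # list of 1-D arrays
        self.counts = []       # number of patterns merged into each prototype

    def learn(self, pattern):
        """Merge the pattern into a matching prototype or create a new one; return its index."""
        x = np.asarray(pattern, dtype=float)
        if self.prototypes:
            d = np.linalg.norm(np.array(self.prototypes) - x, axis=1)
            memb = np.maximum(0.0, 1.0 - self.gamma * d)   # Eq. (5)
            best = int(np.argmax(memb))
            if memb[best] >= self.alpha:
                # Assumed accommodation rule: running mean of the group members.
                n = self.counts[best]
                self.prototypes[best] = (self.prototypes[best] * n + x) / (n + 1)
                self.counts[best] = n + 1
                return best
        # No prototype with membership >= alpha: create a new prototype.
        self.prototypes.append(x)
        self.counts.append(1)
        return len(self.prototypes) - 1

model = OnlineClassPrototypes(alpha=0.85, gamma=1.0)
ids = [model.learn(p) for p in ([0.5, 0.7], [0.6, 0.7], [0.1, 0.2])]
print(ids, model.prototypes, model.counts)
```

In this toy run the second pattern is absorbed into the first prototype, while the third is too far away and starts a new prototype.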

References:
[1] Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, 2nd ed., Pearson Education, 2003.
[2] Hung-Pin Chiu and Din-Chang Tseng, Invariant handwritten Chinese character recognition using fuzzy min-max neural networks, Pattern Recognition Letters, Vol. 18, pp. 481-491, 1997.
[3] S.J. Perantonis and P.J.G. Lisboa, Translation, rotation and scale invariant pattern recognition by high-order neural networks and moment classifiers, IEEE Trans. Neural Networks, Vol. 3, No. 2, pp. 241-251, 1992.
[4] P.M. Patil, P.S. Dhabe, U.V. Kulkarni and T.R. Sontakke, Recognition of handwritten characters using modified fuzzy hyperline segment neural network, in Proc. IEEE Int. Conference on Fuzzy Systems FUZZ-IEEE 03, May 2003.
[5] P. K. Simpson, Fuzzy min-max neural networks Part 1: Classification, IEEE Trans. on Neural Networks, Vol. 3, No. 5, pp. 776-786, 1992.