Multi prototype fuzzy pattern matching for handwritten character recognition

MILIND E. RANE, DHABE P. S AND J. B. PATIL
Dept. of Electronics and Computer, R.C. Patel Institute of Technology, Shirpur, Dist. Dhule, Maharashtra, Pin-425405 INDIA

Abstract: In this paper a novel method of multi-prototype fuzzy matching is proposed. It calculates multiple prototypes of a single class using a fuzzy parameter. Its classification and recognition ability is tested with a realistic database of handwritten digits. Its performance is found superior to traditional matching with a single prototype per class, for classification as well as recognition, even with only a few prototypes per class. It also eliminates a serious handicap of the single prototype method: once the class prototype has been calculated, recognition cannot be increased using the same training set that was used to calculate the prototype.

Key-Words: prototype matching, classification, recognition, handwritten character recognition.

1 Introduction

Handwritten character recognition remains an active and difficult research topic. Matching against a class prototype is the traditional method used for this task. This method calculates a single prototype for a class, generally taken as the average of all the patterns used to calculate the class prototype. To decide the class label of a pattern, the applied pattern is matched with all the existing class prototypes, and the class label of the prototype at minimum distance from the pattern is assigned to that pattern. The method thus uses a minimum distance classifier, based on the Euclidean distance between prototype and applied pattern [1]. This approach is well suited to pattern recognition tasks where the patterns within a class deviate little from one another. A single prototype is found less effective for handwritten character recognition, however, since the patterns belonging to a class vary considerably in shape. Figure 1 shows variations in the shape of the Devanagari (alphabet set used in India) digit 3. The following facts are immediately apparent from the figure.

i. The digits vary considerably in shape.
ii. The angle of orientation of each digit is different.
iii. The thickness of the digit varies considerably, depending on the thickness of the pen tip.
iv. Each digit is written in a different style.

Fig. 1. Variations in shape of digit three.

If we calculate a single prototype for the class of patterns depicted in Figure 1, that prototype cannot capture the peculiarities of all the patterns within the class. It therefore seems better to calculate more than one prototype for a single class, so as to capture the peculiarities of all possible shapes; this is the central idea of this paper. In Figure 1, for example, if we create one prototype from the first, second and third digits and another from the fourth, fifth, sixth and seventh digits, better recognition with prototype matching is expected. Working in this direction leads to a fuzzy system that can be used for classification and recognition of handwritten characters.

The remaining part of this paper is organized as follows. Section 2 discusses the realistic image database of 1000 handwritten digits used for experimentation, along with the image normalization with respect to translation and scale. Section 3 explains the method used for calculation of multiple prototypes. A 2-D example is discussed in Section 4 to illustrate the behavior of the proposed method. Experimental results are given in Section 5, along with performance comparisons with the traditional method using a single prototype. Conclusions are drawn in Section 6 based on the empirical evidence. References are cited at the end.

2 Image database

The image database used in this work was collected from 100 writers using different pens. Each person wrote the ten digits from 0 to 9. These characters were scanned and stored as 65 x 65 .bmp images in binary format. The digit images are in arbitrary rotation, translation and scale. Figure 2 depicts a set of ten digits written by a single person.

Fig. 2. A set of ten digits.

From Figures 1 and 2 it is clear that the characters vary in size, angle of orientation and position within the image frame. Thus we cannot extract image features directly from such images, since the features would depend on size, position and rotation angle (i.e. the angle of the axis of least moment of inertia). So before extraction of features these images are normalized with respect to scale and translation. For feature extraction we used ring features, which are invariant to rotation, so the images are not normalized with respect to rotation [2]. To achieve normalization with respect to translation and scale we used the moment normalization discussed in [3].

To translate a digit to the center of the image frame, we calculate the regular geometrical zeroth and first order moments to obtain the center of gravity (CG) of the object. The (p + q)th order moment $M_{pq}$ of a digital image $f(x, y)$ is given by (1)

$$M_{pq} = \sum_{x=1}^{M} \sum_{y=1}^{M} x^{p} y^{q} f(x, y) \tag{1}$$

where p, q = 0, 1, 2, ... Then the CG of the object image is given by (2)

$$CG = \left( \frac{M_{1,0}}{M_{0,0}}, \frac{M_{0,1}}{M_{0,0}} \right) \tag{2}$$

The calculated CG is shifted to the center of the image frame (33, 33), with all the object pixels shifted along with it, to achieve normalization with respect to translation. For scale normalization we count the total number of object pixels present in the image, which is given by $M_{0,0}$. Each image is then represented by 400 object pixels to compensate for its size.
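The moment and centroid steps in (1) and (2) can be illustrated with a minimal Python sketch. It assumes the image is a list of rows holding 0/1 values, treats the row index as x (a convention chosen here, not stated in the paper), and rounds the shift to whole pixels. The function names are hypothetical.

```python
def geometric_moment(img, p, q):
    """(p+q)th order moment M_pq of a binary image given as a list of rows.

    Pixel coordinates are 1-based, matching the summation limits in (1).
    Treating the row index as x is an assumption of this sketch.
    """
    return sum(
        (x ** p) * (y ** q) * value
        for x, row in enumerate(img, start=1)
        for y, value in enumerate(row, start=1)
    )


def center_of_gravity(img):
    """Centroid (M10/M00, M01/M00) as in (2)."""
    m00 = geometric_moment(img, 0, 0)
    return geometric_moment(img, 1, 0) / m00, geometric_moment(img, 0, 1) / m00


def translate_to_center(img, center=(33, 33)):
    """Shift all object pixels so the CG lands (to the nearest pixel) on `center`."""
    cx, cy = center_of_gravity(img)
    dx, dy = round(center[0] - cx), round(center[1] - cy)
    size_x, size_y = len(img), len(img[0])
    out = [[0] * size_y for _ in range(size_x)]
    for x, row in enumerate(img):
        for y, value in enumerate(row):
            nx, ny = x + dx, y + dy
            # Object pixels that would leave the frame are dropped.
            if value and 0 <= nx < size_x and 0 <= ny < size_y:
                out[nx][ny] = 1
    return out


# Usage: a 3 x 3 blob near the top-left corner of a 65 x 65 frame.
image = [[0] * 65 for _ in range(65)]
for x in range(4, 7):
    for y in range(10, 13):
        image[x][y] = 1
centered = translate_to_center(image)
print(center_of_gravity(image), center_of_gravity(centered))
```

The zeroth order moment returned by `geometric_moment(img, 0, 0)` is the object pixel count used for scale normalization.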
The scaling factor β is calculated as (3)

$$\beta = \sqrt{\frac{400}{M_{0,0}}} \tag{3}$$

The image $f'(x, y)$ after scaling is obtained by (4)

$$f'(x, y) = f\left( \frac{x}{\beta}, \frac{y}{\beta} \right) \tag{4}$$

Figure 3 shows (from left to right) an original image, its scaled version and the image after translation.

Fig. 3. Image normalization.

The translation is done after scaling, since scaling moves the CG of the object in the image. The ring features are extracted after these steps.

3 Multiple prototype calculation

For calculation of multiple prototypes we use a fuzzy membership function. This function calculates the membership of an input pattern $R_h$ in a class prototype $C_i$, as shown in (5).

$$f(C_i) = \begin{cases} 0 & \text{if } \gamma d \ge 1 \\ 1 - \gamma d & \text{if } \gamma d < 1 \end{cases} \tag{5}$$

where the distance d is calculated as

$$d = \left( \sum_{j=1}^{n} \left( C_{ij} - R_{hj} \right)^{2} \right)^{1/2} \tag{6}$$

and γ > 0 is the sensitivity parameter that regulates how fast the membership value decreases as the distance between the class prototype and the input pattern increases. The membership function designed in (5) possesses all the properties of fuzzy sets, like normality and convexity, described in [4]. The plots of the membership function for a 2-D class prototype (0.5, 0.5), with γ = 1 and γ = 4, are shown in Figure 4.

Fig. 4a. Membership function plot for γ = 1.
Fig. 4b. Membership function plot for γ = 4.

Prototype calculation requires finding the natural grouping of patterns within a class; the largest group within the class is then used to calculate a single prototype. To find a pattern group we use a parameter called the grouping factor, 0 < α ≤ 1, which defines the size of a pattern group and thus controls the number of prototypes created within a class. A larger value of α admits only closer patterns into a group and so leads to the creation of more prototypes within the given class.

Let the pattern set $R = \{ R_h \mid h = 1, 2, \ldots, k \}$ contain k n-dimensional patterns of m classes. To calculate the pattern grouping of the ith class, we form the set S of all patterns belonging to the ith class, where $S \subset R$. To determine the pattern groups of this class, each pattern in S is treated in turn as a class prototype, and the fuzzy membership of every other pattern in it is calculated using (5). For each pattern, the patterns that give a fuzzy membership value larger than α are counted. Let $R_j \in S$ be the pattern with the maximum count and $S^1$ be the set of these p patterns falling close around $R_j$ with fuzzy membership $\ge \alpha$. Then the class prototype is computed as

$$C^{1}_{i} = \frac{1}{p} \sum_{j=1}^{p} S^{1}_{ji} \quad \text{for } i = 1, 2, \ldots, n \tag{7}$$

Now the patterns that are already grouped and present in $S^1$ are removed from S, and the patterns that are not yet grouped are considered for calculation of the next prototype within the same class. This process repeats until all the patterns of the class are grouped and S becomes empty. The same steps are repeated for each class c = 1, 2, ..., m. Thus for each class we get multiple prototypes.

4 A 2-D Example

To explain the calculation of multiple prototypes, an example in 2-D pattern space is selected. The patterns are chosen to capture all the behavioral possibilities of the algorithm. The selected patterns are listed in Table 1 and their scatter plot is shown in Figure 5. All the patterns belong to the same class. The scatter plot confirms that we cannot capture the peculiarities of all the patterns using a single prototype; at least three prototypes are expected to be needed for better recognition. The order of data presentation to the system is the same as that shown in Table 1.

Table 1. A set of 2-D patterns.

P1 = [0.5, 0.7]      P6 = [0.1, 0.2]      P11 = [0.7, 0.3]
P2 = [0.6, 0.7]      P7 = [0.2, 0.2]      P12 = [0.8, 0.3]
P3 = [0.55, 0.65]    P8 = [0.15, 0.15]    P13 = [0.75, 0.25]
P4 = [0.5, 0.6]      P9 = [0.1, 0.1]      P14 = [0.7, 0.2]
P5 = [0.6, 0.6]      P10 = [0.2, 0.1]     P15 = [0.8, 0.2]

With α = 0.85, the patterns listed in Table 1 were presented to the algorithm. The first largest group selected for calculation of a prototype contains (P1, P2, P3, P4, P5), and the calculated prototype is (0.55, 0.65). These patterns are then removed from the dataset, and the remaining ten patterns P6, ..., P15 are considered for pattern grouping and prototype calculation. From these ten patterns the algorithm selected the largest group, containing (P6, P7, P8, P9, P10), and calculated the second prototype as (0.15, 0.15).

Fig. 5. Scatter plot of 2-D example.

After removing these patterns only five patterns remain in the dataset; the algorithm selected the largest group, containing (P11, P12, P13, P14, P15), and calculated the prototype as (0.75, 0.25). Once the patterns of this group are removed the dataset becomes empty and the algorithm halts, having created three prototypes for patterns belonging to the same class. The pattern groups are circled and named c1, c2 and c3 in Figure 6. The calculated prototypes are indicated by square dots. From Figure 6 it is clear that the algorithm did the expected job of finding the natural pattern groupings within a single class.

Fig. 6. Pattern groups and multiple prototypes.

We can increase the number of prototypes created by increasing the value of α. For smaller values of α the algorithm calculates fewer prototypes.

5 Experimental results

We have simulated the proposed method in MATLAB 5.x. For experimentation we used the database of handwritten numerals described in Section 2. Ring features are extracted from these images after normalization with respect to scale and translation. The scale normalization is performed so that the normalized image contains 400 pixels. Ring features are extracted with the ring width set to 2. Since the image size is 65 x 65, this creates 16 rings within an image frame.
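The paper does not define the ring features itself (it refers to [2]), so the following Python sketch rests on an assumption: each ring feature is taken to be the number of object pixels whose Euclidean distance from the frame center falls within one ring of the given width. The function name and the treatment of pixels beyond the last ring are likewise choices made here for illustration. It does show why a 65 x 65 frame with ring width 2 yields 16 values, and why such counts do not change when the image is rotated about the center.

```python
import math


def ring_features(img, ring_width=2):
    """Count object pixels in concentric rings around the frame center.

    Assumption: ring k collects pixels whose distance r from the center
    satisfies k * ring_width <= r < (k + 1) * ring_width.
    """
    size = len(img)
    center = size // 2              # 0-based index 32 for a 65 x 65 frame
    n_rings = center // ring_width  # 32 // 2 = 16 rings
    features = [0] * n_rings
    for x, row in enumerate(img):
        for y, value in enumerate(row):
            if value:
                k = int(math.hypot(x - center, y - center) // ring_width)
                if k < n_rings:     # pixels outside the last ring are ignored
                    features[k] += 1
    return features


# Usage: a few isolated object pixels in an otherwise empty frame.
image = [[0] * 65 for _ in range(65)]
for x, y in [(32, 32), (32, 35), (29, 32), (32, 63), (0, 0)]:
    image[x][y] = 1
features = ring_features(image)

# Rotating the frame by 90 degrees about its center leaves the counts unchanged.
rotated = [list(row) for row in zip(*image[::-1])]
print(features == ring_features(rotated))
```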
So the dimensionality of the feature vector for a single image is (1, 16), and for the complete database of 1000 images it is (1000, 16). The values in the resulting feature matrix R of size (1000, 16) are arbitrary integers. To bring all the values within the interval [0, 1], we used (8)

$$R = \frac{R}{\max(R)} \tag{8}$$

where max(R) is the maximum value in the matrix R. Such normalization of the feature matrix makes the pattern space an n-dimensional hypercube and is more convenient for the recognition task. Although this normalization loses some absolute information about the patterns, it retains all the relative information about them [5].

The feature matrix contains patterns of 10 classes, i.e. 100 patterns of each class. We used the first 500 pattern features from R for calculation of prototypes, i.e. 50 pattern features per class, and the remaining 500 for testing recognition. The proposed method is compared with the traditional single prototype method by applying the same patterns of the same database in the same order of data presentation, to make a valid comparison.

For the first 500 patterns the single prototype method created ten prototypes for the ten classes. We used the same 500 pattern features to measure classification performance, which was found to be 38 percent. The remaining 500 pattern features, which were not used for calculation of the class prototypes, were then used for testing recognition, giving a 27.6 percent recognition rate. These results are shown in Table 2.
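The baseline just described (normalization by the global maximum, one mean prototype per class, minimum Euclidean distance classification) can be sketched in Python as follows. This is an illustrative sketch on made-up toy data, not the MATLAB code used for the experiments; all function names are hypothetical.

```python
import math


def normalize_features(R):
    """Divide every entry by the global maximum, as in (8)."""
    peak = max(max(row) for row in R)
    return [[value / peak for value in row] for row in R]


def class_means(patterns, labels):
    """Single prototype per class: the mean of that class's training patterns."""
    grouped = {}
    for pattern, label in zip(patterns, labels):
        grouped.setdefault(label, []).append(pattern)
    return {
        label: [sum(column) / len(members) for column in zip(*members)]
        for label, members in grouped.items()
    }


def classify(pattern, prototypes):
    """Label of the prototype at minimum Euclidean distance from `pattern`."""
    return min(prototypes, key=lambda label: math.dist(prototypes[label], pattern))


def accuracy(patterns, labels, prototypes):
    """Percentage of patterns whose predicted label matches the true label."""
    hits = sum(classify(p, prototypes) == t for p, t in zip(patterns, labels))
    return 100.0 * hits / len(patterns)


# Usage on toy 2-D data (two classes, two training patterns each).
train = normalize_features([[0, 0], [2, 0], [8, 10], [10, 10]])
train_labels = [0, 0, 1, 1]
prototypes = class_means(train, train_labels)
print(classify([0.2, 0.1], prototypes), classify([0.7, 0.9], prototypes))
```

Evaluating `accuracy` on the training patterns corresponds to the classification figure, and on held-out patterns to the recognition figure.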

Table 2. Performance of single prototype method.

                  Percentage
Classification    38.0
Recognition       27.6

The time required to calculate the prototypes and the recall time per pattern during recognition, both in seconds, for the single prototype method are shown in Table 3.

Table 3. Timing analysis of single prototype method (times in seconds).

Prototype calculation time    Recall time per pattern
0.031                         0.0015

The proposed method of multiple prototypes is also tested with the same pattern feature set, applying the patterns in the same order. For α = 0.6 it created a total of 22 prototypes belonging to the ten classes. The number of prototypes created for each class is shown in the row vector V, where V = (5, 1, 1, 2, 1, 1, 3, 3, 3, 2). The value of V(i) indicates the number of prototypes created for the ith class, where i = 1, 2, ..., 10. The performance of classification and recognition is shown in Table 4. For α = 0.7 it created a total of 42 prototypes over all the classes, with V = (8, 3, 4, 4, 3, 2, 5, 5, 4, 4). For α = 0.8 it created a total of 120 prototypes over all the classes, with V = (16, 8, 10, 15, 7, 7, 11, 18, 15, 13). Similarly, for α = 0.85 and α = 0.88 the method created totals of 217 and 346 prototypes over all the classes, respectively. Table 4 shows the classification and recognition performance for the different values of α, where CC indicates the percentage classification and RR indicates the percentage recognition rate.

Table 4. Performance of multiprototype method.

Value of α    Total prototypes created    Value of CC    Value of RR
0.7           42                          41.6           32.2
0.8           120                         55             36.8
0.85          217                         75             39.2
0.88          346                         83             42

The time required to calculate the prototypes for 500 pattern features and the recall time per pattern during recognition, both in seconds, for the multiprototype method are tabulated in Table 5.

Table 5. Timing analysis of multiprototype method (times in seconds).

Value of α    Prototype calculation time    Recall time per pattern
0.7           4.36                          0.0086
0.8           7.42                          0.0231
0.85          14.61                         0.0413
0.88          30.578                        0.0661

6 Conclusions

The proposed multiprototype method is found superior to the traditional method of a single prototype per class. It also adds flexibility: the classification and recognition rates can be increased by increasing the total number of prototypes created per class, which is not possible with a single class prototype and is the severe handicap of that method. By increasing the value of α we can increase the total number of prototypes created per class; in other words, we can add more knowledge, which is then used in the classification and recognition decisions. It is suggested that a moderate value of α be chosen to get an acceptable recognition/classification rate. The proposed method performed better than the single prototype method on the difficult task of realistic handwritten character recognition.

The time required to calculate the class prototypes with this method is larger than with the single prototype method, and it increases as we increase the value of α. This cannot be treated as a serious drawback, since prototype calculation needs to be done only once. It is also observed that the recall time per pattern, i.e. the time required to decide the class of a pattern, grows as we increase the value of α. The value of α must be chosen so that the method creates more than one prototype per class while still exhibiting an acceptable recall time per pattern.

This method can also learn patterns on the fly, i.e. new patterns can be added at any time: the class prototype in which the newly applied pattern has a fuzzy membership value greater than or equal to α is found, and the new pattern is accommodated in that prototype. If no prototype satisfies this condition, a new class prototype is created from the pattern.
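The grouping procedure of Section 3 and the on-the-fly learning described above can be sketched together in Python. Several details are assumptions made for illustration rather than statements from the paper: γ = 1 is chosen for the 2-D example (the paper does not give the value used), ties between equally large groups go to the earliest presented pattern, and "accommodating" a new pattern is interpreted as a running-mean update of the matching prototype. The function names are hypothetical.

```python
import math


def membership(prototype, pattern, gamma):
    """Fuzzy membership (5): 1 - gamma*d when gamma*d < 1, else 0; d is the distance (6)."""
    return max(0.0, 1.0 - gamma * math.dist(prototype, pattern))


def mean(patterns):
    """Component-wise average of a list of patterns, as in (7)."""
    return [sum(column) / len(patterns) for column in zip(*patterns)]


def compute_prototypes(S, alpha, gamma):
    """Group the patterns S of one class; return prototypes and group sizes."""
    remaining = list(S)
    prototypes, counts = [], []
    while remaining:
        # Treat every remaining pattern as a candidate prototype and collect its group.
        groups = [
            [r for r in remaining if membership(candidate, r, gamma) >= alpha]
            for candidate in remaining
        ]
        largest = max(groups, key=len)  # assumption: ties go to the earliest pattern
        prototypes.append(mean(largest))
        counts.append(len(largest))
        remaining = [r for r in remaining if r not in largest]
    return prototypes, counts


def learn_on_the_fly(prototypes, counts, pattern, alpha, gamma):
    """Add one new pattern of the same class; return the index of its prototype."""
    scores = [membership(c, pattern, gamma) for c in prototypes]
    if scores and max(scores) >= alpha:
        best = scores.index(max(scores))
        counts[best] += 1
        # Assumption: "accommodated" is taken to mean a running-mean update.
        prototypes[best] = [
            c + (x - c) / counts[best] for c, x in zip(prototypes[best], pattern)
        ]
        return best
    prototypes.append(list(pattern))
    counts.append(1)
    return len(prototypes) - 1


# Usage: the 2-D patterns of Table 1, in presentation order P1..P15.
table1 = [
    [0.5, 0.7], [0.6, 0.7], [0.55, 0.65], [0.5, 0.6], [0.6, 0.6],
    [0.1, 0.2], [0.2, 0.2], [0.15, 0.15], [0.1, 0.1], [0.2, 0.1],
    [0.7, 0.3], [0.8, 0.3], [0.75, 0.25], [0.7, 0.2], [0.8, 0.2],
]
prototypes, counts = compute_prototypes(table1, alpha=0.85, gamma=1.0)
batch_prototypes = [list(p) for p in prototypes]  # snapshot before online updates
near = learn_on_the_fly(prototypes, counts, [0.56, 0.66], alpha=0.85, gamma=1.0)
far = learn_on_the_fly(prototypes, counts, [0.5, 0.1], alpha=0.85, gamma=1.0)
print(batch_prototypes, near, far)
```

Under these assumptions the batch step yields three groups of five patterns each, with means close to the prototypes reported in Section 4. Because every candidate is compared with every remaining pattern, the grouping cost grows quadratically with the number of patterns in a class.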

References:

[1] Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, 2nd ed., Pearson Education, 2003.
[2] Hung-Pin Chiu and Din-Chang Tseng, Invariant handwritten Chinese character recognition using fuzzy min-max neural networks, Pattern Recognition Letters, Vol. 18, pp. 481-491, 1997.
[3] S. J. Perantonis and P. J. G. Lisboa, Translation, rotation and scale invariant pattern recognition by high-order neural networks and moment classifiers, IEEE Trans. Neural Networks, Vol. 3, No. 2, pp. 241-251, 1992.
[4] P. M. Patil, P. S. Dhabe, U. V. Kulkarni and T. R. Sontakke, Recognition of handwritten characters using modified fuzzy hyperline segment neural network, in Proc. IEEE Int. Conference on Fuzzy Systems (FUZZ-IEEE 03), May 2003.
[5] P. K. Simpson, Fuzzy min-max neural networks Part 1: Classification, IEEE Trans. on Neural Networks, Vol. 3, No. 5, pp. 776-786, 1992.