CHAPTER 8 COMPOUND CHARACTER RECOGNITION USING VARIOUS MODELS


8.1 Introduction

The recognition systems developed so far were for simple characters comprising consonants and vowels. Marathi script, however, has one more category of characters, called compound characters, which are formed by joining two or more consonants. Their recognition is considerably more difficult because of the variety of shapes they take. In this chapter, we propose a system for unconstrained handwritten Marathi compound character recognition that does not require separating the constituent characters of the compound character. Some work on handwritten Bangla compound characters has been reported [117], but to the best of our knowledge no work on handwritten Marathi compound character recognition has been reported so far. This chapter discusses the design of various models, simple as well as hybrid, for the recognition of handwritten Marathi compound characters. Section 8.2 describes the features of compound characters. Section 8.3 presents the data collection, pre-processing and structural classification common to all the models. Section 8.4 presents the system based on template matching with various kernels. Section 8.5 discusses the design of the system using neural networks, while Section 8.6 describes a hybrid model for the same task. Section 8.7 presents the concluding remarks.

8.2 Features of compound characters

Marathi script has a complex system of compound characters in which two or more consonants combine to form a new special symbol, as shown in Figure 8.1.

Figure 8.1 Subset of compound characters with different joining strategies

The compound characters seen in Figure 8.1 are formed by combining specific consonants to give a meaningful combination. A compound character can have two or more characters joined together in various ways, as illustrated in Figure 8.1.

One way of forming a compound character is by removing the vertical line of a character and then joining it to the other character on its left-hand side; this type of joining is the most common. Another way of connecting characters in a compound character is by simply joining them side by side or one above the other. In yet another way, one of the characters completely changes its form and then gets connected to the other to form the compound character. More than two consonants can also join in various ways to form a compound character. As seen from Figure 8.1, the compound characters exhibit variation not only in shape but also in aspect ratio, depending on the joining strategy. One might be tempted to use features like aspect ratio or the number of end points, but the various joining strategies limit the usefulness of these features for achieving acceptable recognition accuracy. These challenges cannot be met by a single feature extractor or a single classifier; hence a multistage system is needed that can recognize the characters over a wide range of varying conditions.

Compound characters occur more frequently in Marathi than in the other languages derived from Devanagari. The occurrence of compound characters in Marathi is found to be about 11 to 12%, whereas in other Devanagari-based scripts and in Bangla script it is just 5 to 7% [17]. This percentage was estimated by going through books, newspapers and other literature in Marathi script. For example, the Marathi forms of several common Hindi words contain compound characters where the corresponding Hindi forms do not. The usage of English words in Marathi further adds to the compound characters in the script; for instance, the word "computer" is used more commonly than its Marathi equivalent, and when written in Marathi script it includes a compound character. The next section discusses the compound characters used for recognition in the proposed system, along with the pre-processing and structural classification steps.

8.3 Data Collection, Pre-processing and Structural Classification

In the proposed system, we aim at recognizing handwritten Marathi compound characters by employing multiple feature extraction and classification stages. At first, the character is pre-classified into one of 24 classes based upon its structural features, in the same way as discussed in Chapter 5.

This two-stage structural classification is followed by character resizing. The character is resized to a typical size of 16x16 or 32x32 and then given to the various systems discussed in the following sections. The handwritten compound characters used for recognition are shown in Figure 8.2; the set comprises 40 compound characters and 34 split characters, resulting in 74 characters in all.

Figure 8.2 Characters used in the proposed system

The database of handwritten compound characters is created by scanning the characters at 300 dpi using a flatbed scanner and storing the images in bmp file format. Here we assume that the consonants in a compound character are either touching or overlapping. Sometimes, however, they do not touch or overlap, or they separate and leave a gap after binarization and pre-processing. These split components of a compound character then appear as separate entities after segmentation. In order to handle such cases, the split components of the compound characters are also considered for recognition. Figure 8.3 shows a few examples of compound characters split into two entities during pre-processing.

Figure 8.3 Example of split characters
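The pre-processing chain itself follows Chapter 5; purely as an illustration of the binarization and size normalization mentioned above, a scanned character might be prepared along the following lines in MATLAB (Image Processing Toolbox functions; the file name and threshold choice are placeholders, not the actual implementation):

    % Illustrative sketch only: load a scanned character, binarize and resize.
    % 'sample_char.bmp' is a placeholder file name.
    I = imread('sample_char.bmp');          % scanned character image (300 dpi, bmp)
    if size(I, 3) == 3
        I = rgb2gray(I);                    % convert colour scans to grayscale
    end
    bw = im2bw(I, graythresh(I));           % global (Otsu) binarization
    bw = ~bw;                               % make character pixels 1, background 0 (if needed)
    charSmall = imresize(bw, [16 16]);      % normalized character for 16x16 kernels
    charLarge = imresize(bw, [32 32]);      % normalized character for 32x32 kernels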

The binarization, pre-processing and structural classification are carried out in the same manner as discussed in Chapter 5. At the end of the structural classification, there are 24 classes containing compound characters grouped by their structural features. These now form the database for generating kernels for template matching and for training the neural network. The handwritten Marathi compound character dataset was collected from different individuals. More than 500 samples per character were collected, resulting in more than 20,000 character samples. Two thirds of these were used for database creation, i.e. for storage as templates and for training the neural network, and the rest were used for testing. No standard database is available for handwritten Devanagari compound characters so far. Figure 8.4 shows some sample characters stored in the database after pre-processing.

Figure 8.4 Pre-processed compound characters

The compound characters classified into the 24 classes after pre-processing are shown in Table 8.1. Each structural class contains a different number of characters, depending on the writing style of the writers. The classes contain compound characters as well as split characters.

Table 8.1 Number of compound characters in each structural class

Sr. no.   Class      No. of characters classified
1         NBE/00     4
2         NBE/01     18
3         NBE/10     3
4         NBE/11     13
5         NBNE/00    2
6         NBNE/01    26
7         NBNE/10    6
8         NBNE/11    22
9         MBE/00     5
10        MBE/01     7
11        MBE/10     1
12        MBE/11     9
13        MBNE/00    4
14        MBNE/01    8
15        MBNE/10    4
16        MBNE/11    9
17        EBE/00     1
18        EBE/01     11
19        EBE/10     2
20        EBE/11     18
21        EBNE/00    1
22        EBNE/01    18
23        EBNE/10    3
24        EBNE/11    19

The compound characters, as discussed, are more complex and incorporate a large amount of detail: character shape, end points, junction points, curves and strokes. Various recognition techniques are developed step-wise for the recognition of these characters and the results are analyzed. The recognition of compound characters is carried out using the following models:

1. Template matching
2. Neural networks, and
3. Hybrid network

The cropped binary characters in the structural classes are used as the database for template matching as well as for training the neural network. The models listed above for handwritten compound character recognition are executed on an Intel Core 2 Duo CPU running at 2 GHz with 2 GB RAM. The next sections discuss these recognition techniques in detail.

8.4 Compound character recognition using template matching

In this system, the characters are recognized using template matching. As discussed before, the basic template matching technique performs cross-correlation of the image f(m, n) with the template g(m, n). The result contains peaks at the locations of matches between the template and the underlying object. The degree of matching is indicated by r, a value between 0 and 1, where a higher value indicates greater similarity:

    r = \frac{\sum_{m}\sum_{n} (f_{mn} - \bar{f})(g_{mn} - \bar{g})}{\sqrt{\Big(\sum_{m}\sum_{n} (f_{mn} - \bar{f})^{2}\Big)\Big(\sum_{m}\sum_{n} (g_{mn} - \bar{g})^{2}\Big)}}            (8.1)

where \bar{f} and \bar{g} are the means of the original image and the template respectively. Matching is done only with the kernels of the class to which the character is assigned by the two-stage structural classification. This reduces the number of comparisons and speeds up the recognition. The templates used for matching are of four types:

1. Binary templates
2. Convolved binary templates
3. Wavelet approximation templates, and
4. Modified wavelet approximation templates.

The block diagram of the template matching procedure is shown in Figure 8.5. The compound character is pre-processed, classified structurally and resized, a template of one of the types mentioned above is generated from it, and this template is matched with the corresponding templates in the database. The results obtained with each type of template are analyzed. The procedure for generating these templates is explained next.
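Before turning to template generation, the class-restricted matching step based on Eq. (8.1) can be illustrated with a minimal MATLAB sketch; corr2 (Image Processing Toolbox) computes exactly the coefficient of Eq. (8.1), and the variable names below are placeholders rather than the actual implementation:

    % Illustrative sketch of class-restricted template matching (Eq. 8.1).
    % classKernels is assumed to be a cell array of stored templates for the
    % structural class selected by the two-stage classification; testChar is
    % the resized test character, of the same size as the stored kernels.
    r = zeros(1, numel(classKernels));
    for k = 1:numel(classKernels)
        r(k) = corr2(double(testChar), double(classKernels{k}));   % Eq. (8.1)
    end
    [rMax, bestIndex] = max(r);     % index of the best-matching kernel
    % bestIndex is then mapped back to the character label stored with that kernel.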

Figure 8.5 Template matching using different templates

8.4.1 Generation of binary templates

Here the templates used for matching are the binary character images themselves, resized to either 16x16 or 32x32.

8.4.2 Generation of convolved templates

In this method, instead of using the resized binary images directly as templates, each image is convolved with itself and the result is stored as the template. Convolution is just like correlation, except that the filter is flipped before correlating. Convolution performs a filtering operation and generates a multidimensional space invariant to rotation, which is useful in improving the recognition accuracy with template matching. Convolution between an image I(x, y) and a template T(x, y) is defined as

    (I * T)(i, j) = \sum_{(x', y') \in W} I(x', y') \, T(i - x', j - y')            (8.2)

where W is the window over which the image and the template overlap. The equation shows that the template matrix is flipped in both the horizontal and the vertical direction before being multiplied with the overlapped input data. Convolution of both 16x16 and 32x32 images is carried out for matching.

8.4.3 Generation of wavelet approximation templates

The wavelet transform is another technique used for template generation. It exhibits features such as separability, scalability, translatability, orthogonality and multiresolution capability. Single-level wavelet decomposition is used for template generation. Given separable 2D scaling and wavelet functions, the scaled and translated basis functions are defined as

    \varphi_{j,m,n}(x, y) = 2^{j/2} \, \varphi(2^{j}x - m, \, 2^{j}y - n)            (8.3)

    \psi^{i}_{j,m,n}(x, y) = 2^{j/2} \, \psi^{i}(2^{j}x - m, \, 2^{j}y - n)            (8.4)

where the index i identifies the directional wavelets, i.e. the horizontal, vertical and diagonal details. The discrete wavelet transform of a function f(x, y) of size M x N is then

    W_{\varphi}(j_0, m, n) = \frac{1}{\sqrt{MN}} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y) \, \varphi_{j_0,m,n}(x, y)            (8.5)

    W^{i}_{\psi}(j, m, n) = \frac{1}{\sqrt{MN}} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y) \, \psi^{i}_{j,m,n}(x, y)            (8.6)

where the coefficients W_{\varphi}(j_0, m, n) define an approximation of f(x, y) at scale j_0, and W^{i}_{\psi}(j, m, n) add the horizontal, vertical and diagonal details for scales j >= j_0. The discrete wavelet transform can be implemented using digital filters and downsamplers. The high-pass (detail) components characterize the image's high-frequency information, while the low-pass (approximation) component contains its low-frequency information. The approximation coefficients obtained for every character after single-level decomposition are stored as templates in the database.
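In MATLAB, this single-level decomposition can be sketched as follows (Wavelet Toolbox assumed; the text specifies a Daubechies wavelet without giving its order, so 'db1' is used here merely so that the approximation sub-band comes out exactly 8x8; variable names are placeholders):

    % Illustrative sketch: single-level 2D wavelet decomposition of a character.
    % charSmall is assumed to be the 16x16 resized binary character.
    [cA, cH, cV, cD] = dwt2(double(charSmall), 'db1');   % approximation and detail coefficients
    waveletTemplate = cA;                                % 8x8 approximation kept as the template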

The characters are normalized to a fixed size of 16x16 after structural classification for kernel generation. A single-level wavelet decomposition of the resized character image generates the approximation and detail coefficients; the decomposition is done with respect to a Daubechies wavelet. Here, j0 = 0 and M = N = 16 = 2^j, hence j = 0, 1, 2, ..., 7 and m, n = 0, 1, 2, ..., 15. The 8x8 approximation coefficients generated after single-level wavelet decomposition are matched against kernels stored in the database that were obtained in the same manner. The wavelet decomposition halves the image size, which further reduces the computations and increases the speed of recognition. The characters are also resized to 32x32 to generate 16x16 approximation features, which are likewise used as templates, and the results are analyzed for both sizes.

8.4.4 Generation of modified wavelet approximation templates

The convolved templates obtained by convolving the binary templates with themselves gave improved results over plain binary templates. This was the inspiration for the modified wavelet approximation templates. Here the templates generated in the previous step are used as wavelet kernels: the approximation coefficients obtained by single-level wavelet decomposition are convolved with themselves to generate the modified wavelet template. Convolution between a 2D signal f(x, y) and a template g(x, y) is defined, as in Eq. (8.2), by

    (f * g)(i, j) = \sum_{(x', y') \in W} f(x', y') \, g(i - x', j - y')            (8.7)

The convolved output is stored in the database as the template for matching. In MATLAB, C = conv2(A, B) computes the two-dimensional convolution of matrices A and B; the size of C in each dimension is the sum of the corresponding dimensions of the inputs minus one, i.e. if A is [ma, na] and B is [mb, nb], then C is [ma+mb-1, na+nb-1]. The 16x16 and 32x32 images convolved with themselves thus generate 31x31 and 63x63 kernels respectively, which are used for matching.
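A rough sketch of this template generation step, continuing the earlier illustration (placeholders only; the sizes follow from the conv2 size rule quoted above):

    % Illustrative sketch: build a modified wavelet approximation template.
    % charSmall is assumed to be a 16x16 resized binary character (Section 8.4.3).
    [cA, ~, ~, ~] = dwt2(double(charSmall), 'db1');   % 8x8 approximation coefficients
    modifiedTemplate = conv2(cA, cA);                 % convolution of the approximation with itself
    % The full conv2 output of an m-by-m matrix with itself is (2m-1)-by-(2m-1).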

In this method, a wavelet kernel is thus generated for each character using 2D wavelet decomposition. When a separate kernel is generated for each character, the feature space is different for each character, which increases the inter-class and intra-class separation between the characters.

8.4.5 Results for compound character recognition using template matching

Compound character recognition is carried out using the four types of templates discussed above. The sample compound character shown in Figure 8.6 (tya) is used for recognition. After structural classification, this character is assigned to the EBNE/11 class, which contains 19 different characters with 564 samples in all. The recognition using all four template matching methods, with 16x16 and 32x32 resize factors respectively, is shown in Figure 8.7 (a) through (h). The graphs show the characters of the EBNE/11 class along the x-axis and the matching value r on the y-axis. Figures 8.7 (a), (b) and (e) give the maximum similarity at index 89, which corresponds to a misclassification: the character is misclassified as vya, which closely resembles the character under test. In the remaining figures, the maximum similarity is found at index 74, which is the correct classification result.

Figure 8.6 Sample compound character used for testing

Figure 8.7 Compound character similarity matching using different templates: (a) binary template with resize factor 16x16, (b) binary template with resize factor 32x32, (c) convolved template with resize factor 16x16, (d) convolved template with resize factor 32x32, (e) wavelet approximation template with resize factor 16x16, (f) wavelet approximation template with resize factor 32x32, (g) modified wavelet approximation template with resize factor 16x16, (h) modified wavelet approximation template with resize factor 32x32

The character is misclassified by both the 16x16 and the 32x32 binary templates, as shown in Figure 8.7 (a) and (b), while it is classified correctly by the convolved templates with both resize factors, as per Figure 8.7 (c) and (d). The character is misclassified by the wavelet approximation template with resize factor 16x16 but correctly classified by the wavelet approximation template with resize factor 32x32, as shown in Figures 8.7 (e) and (f) respectively. Finally, the character is correctly classified by the modified wavelet approximation templates with both the 16x16 and 32x32 resize factors, as shown in Figure 8.7 (g) and (h).

Table 8.2 Template matching performance for compound character recognition

Sr. no.   Template generation technique             Resize factor   Template size   Recognition time for EBNE/11 (sec)   Overall recognition rate (%)
1         Binary template                           16x16           16x16           4.15                                  87.65
                                                    32x32           32x32           5.79                                  89.00
2         Convolved binary template                 16x16           31x31           5.04                                  91.99
                                                    32x32           63x63           5.68                                  92.41
3         Wavelet approximation template            16x16           8x8             5.29                                  93.89
                                                    32x32           16x16           5.44                                  94.39
4         Modified wavelet approximation template   16x16           15x15           5.97                                  95.00
                                                    32x32           31x31           6.22                                  95.50

The results in Table 8.2 show that as the template size and the amount of computation increase, the time required for recognition also increases. The recognition time is further a function of the number of characters and the number of samples per character in the class. The recognition rate increases with the resize factor and with the number of samples per character. Increasing the resize factor further did not show considerable improvement in the recognition rate but did require more time to execute; hence only the 16x16 and 32x32 resize factors are considered.

8.5 Compound character recognition using neural network

The flow of the recognition scheme using a neural network is depicted in Figure 8.8.

Figure 8.8 Compound character recognition using neural network

The scheme consists of a training phase and a testing phase. At first, the characters are pre-classified based upon their structural features; a two-stage structural classification technique is implemented in both phases.

As seen in the template matching experiments, the modified wavelet approximation features gave the highest recognition accuracy among the template generation techniques. This motivated the use of these features with a neural network to improve the recognition rate further. In the training phase, the modified wavelet features are obtained by convolving the wavelet approximation features with themselves and are used to train the neural network built with the chosen parameters, thereby fixing the weights and biases for each character. In the testing phase, the same features are extracted from the test character, again after pre-processing and structural classification, and are applied as inputs to the neural network. The output of the neural network yields the final recognition result. The neural network implemented is a multilayer perceptron with the parameters given in Table 8.3.

8.5.1 Feature extraction for compound character recognition using neural network

The 16x16 resized binary images of the compound characters are decomposed using 2D single-level wavelet decomposition to extract the approximation features, as explained in Section 8.4.3. The 8x8 approximation coefficients thus obtained are convolved with themselves to generate 64 modified wavelet approximation features. The instruction used for the convolution operation is Cs = conv2(A, B, 'same'), where Cs is of the same size as A. The 8x8 approximation features convolved with themselves therefore yield 8x8 = 64 modified wavelet approximation features, which are then used for training and testing the neural network.

8.5.2 Neural network design for compound character recognition using neural network

A multilayer perceptron is used for compound character recognition. The parameters selected for the neural network are given in Table 8.3.

Table 8.3 Neural network parameter settings

Parameter                              Value
Number of inputs                       64: modified wavelet approximation features
Number of hidden layers                1
Number of neurons in hidden layer      Square root of the product of the number of inputs and the number of outputs
Hidden layer activation function       Hyperbolic tangent sigmoid transfer function
Number of neurons in output layer      Number of characters in the structural class
Output layer activation function       Linear
Goal                                   0.001
Error function                         mse
Maximum number of epochs               300
Training algorithm                     Levenberg-Marquardt algorithm

8.5.3 Results for compound character recognition using neural network

The neural network performance with modified wavelet features is given in Table 8.4. As indicated in the table, the overall recognition accuracy for handwritten Marathi compound characters is 96.23%, an improvement over the template matching technique. Moreover, the time required to recognize a character is approximately 0.05 sec irrespective of the number of characters in the structural class, i.e. it does not depend upon the number of characters or their samples in the class. This gives an additional advantage over the template matching technique.

Table 8.4 Neural network performance for compound character recognition

Sr. no.   Feature extraction technique    Recognition technique   Recognition rate (%)
1         64 modified wavelet features    Neural network          96.23
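As a concrete illustration only, the settings of Table 8.3 might be realized in MATLAB roughly as follows (Neural Network Toolbox assumed; the variable names, data layout and wavelet order are placeholders, not taken from the actual implementation):

    % Illustrative sketch: build and train an MLP with the Table 8.3 settings
    % for one structural class. X is 64-by-P (one modified wavelet feature
    % vector per column), T is C-by-P one-of-C target vectors, where C is the
    % number of characters in the class.
    % Feature extraction as in Section 8.5.1 (for a single character image):
    %   [cA, ~, ~, ~] = dwt2(double(charSmall), 'db1');
    %   feat = reshape(conv2(cA, cA, 'same'), [], 1);   % 64 modified wavelet features
    numInputs  = 64;
    numOutputs = size(T, 1);
    numHidden  = round(sqrt(numInputs * numOutputs));    % sqrt(inputs * outputs)

    net = feedforwardnet(numHidden, 'trainlm');           % Levenberg-Marquardt training
    net.layers{1}.transferFcn = 'tansig';                 % hyperbolic tangent sigmoid hidden layer
    net.layers{2}.transferFcn = 'purelin';                % linear output layer
    net.performFcn            = 'mse';                    % error function
    net.trainParam.goal       = 0.001;                    % training goal
    net.trainParam.epochs     = 300;                      % maximum number of epochs

    net = train(net, X, T);                               % training phase: fix weights and biases
    y   = net(xTest);                                     % testing phase: xTest is a 64x1 feature vector
    [~, recognizedIndex] = max(y);                        % index of the recognized character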

8.6 Proposed smart model for handwritten compound character recognition using hybrid network

The structure of and need for a hybrid model are studied in Section 7.2 and Section 7.3 respectively. Such a model is even more essential in the case of compound characters, as both the complexity and the number of characters increase. However, it was also found for the hybrid models in Chapter 7 that the template matching stage required much more time for recognition because it matched against all the templates in the database. Hence, in the hybrid network proposed for compound character recognition, the template matching classifier is replaced by a neural network so as to optimize both accuracy and speed.

Figure 8.9 Proposed smart model for compound character recognition using hybrid model

Figure 8.9 shows the proposed smart model for handwritten compound character recognition using the hybrid network. As shown in the figure, the character is first pre-classified into one of 24 classes based upon its structural features. This two-stage structural classification is followed by character resizing: the character is resized to a fixed size of 16x16 or 70x50, depending upon the feature extraction technique. Three different sets of features are extracted from the resized character and applied to three neural networks built for each structural class. In the testing phase, the weights and biases fixed during training are used to obtain a recognition result from each of the three branches, indicated by A, B and C in Figure 8.9. The final recognition decision is based upon a majority voting criterion; the concept of majority voting is studied in Chapter 7. The designs indicated by paths A and B in Figure 8.9 are implemented individually in Chapter 5, while the design indicated by path C is implemented in Chapter 6. The hybrid network is thus obtained by applying the majority voting criterion to all three paths. Its design is studied in detail in the following sub-sections.

8.6.1 Feature extraction for compound character recognition using hybrid network

The resized binary compound characters are used to extract three different types of features, as indicated in Table 8.5.

Table 8.5 Feature extraction parameter settings for the hybrid network

Sr. no.   Feature extraction technique              Resize factor   No. of features
1         Normalized pixel density features         70x50           35 (character partitioned into 35 non-overlapping blocks of 10x10 pixels)
2         Euclidean distance features               16x16           32 (16 in the horizontal and 16 in the vertical direction)
3         Modified wavelet approximation features   16x16           64 (obtained by convolving the 8x8 wavelet approximation features)
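The exact feature definitions are those of Chapter 5; purely as an illustration of the first row of Table 8.5, and assuming the 35 blocks are 10x10 pixels with the density taken as the fraction of foreground pixels per block (an inference from the numbers in the table, not the thesis definition), the computation might look as follows:

    % Illustrative sketch: normalized pixel density over 35 blocks of a 70x50 character.
    % bw is the binarized character image (as in the earlier sketch); block size 10x10 is assumed.
    charBig = imresize(bw, [70 50]);
    blocks  = mat2cell(double(charBig), repmat(10, 1, 7), repmat(10, 1, 5));   % 7x5 = 35 blocks
    density = cellfun(@(b) sum(b(:)) / numel(b), blocks);                      % foreground fraction per block
    densityFeatures = density(:);                                              % 35x1 feature vector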

The procedure for extraction of the normalized pixel density features and the Euclidean distance features is explained in Chapter 5, Section 5.4.6, and the procedure for extraction of the modified wavelet approximation features is discussed in Section 8.5.1. These features are applied to three different neural networks in the hybrid structure. The number of inputs of each network differs according to the number of features extracted. The architecture of the neural networks is discussed next.

8.6.2 Neural network design for compound character recognition using hybrid network

The hybrid network for compound characters, as discussed above, incorporates feature extraction using three different feature sets. These features are applied to three different neural networks designed as per the parameters given in Table 8.6.

Table 8.6 Neural network parameter settings for the hybrid network

Parameter                              Value
Number of inputs                       35: normalized pixel density features; 32: Euclidean distance features; 64: modified wavelet approximation features
Number of hidden layers                1
Number of neurons in hidden layer      Square root of the product of the number of inputs and the number of outputs
Hidden layer activation function       Hyperbolic tangent sigmoid transfer function
Number of neurons in output layer      Number of characters in the structural class
Output layer activation function       Linear
Goal                                   0.001
Error function                         mse
Maximum number of epochs               300
Training algorithm                     Levenberg-Marquardt algorithm

As seen in Table 8.6, the number of features applied to each neural network in the hybrid structure is different; this in turn results in different numbers of hidden neurons, since the hidden layer size is a function of the number of input neurons. Each network generates an index for the character under recognition. If the indices of any two networks match, that index is used to obtain the recognized character. If the indices of all three networks differ, the index of the neural network trained on the modified wavelet approximation features is taken as the final output, because the recognition accuracy of this branch is higher than that of the other two branches in the hybrid network.
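The decision rule just described can be sketched as follows (illustrative MATLAB only; idxA, idxB and idxC are placeholders for the indices produced by the three branches, with C taken as the modified wavelet branch):

    % Illustrative sketch of the voting rule used to fuse the three branches.
    % idxA: pixel density network, idxB: Euclidean distance network,
    % idxC: modified wavelet network (the most accurate branch).
    if idxA == idxB || idxA == idxC
        finalIndex = idxA;          % at least two branches agree on this index
    elseif idxB == idxC
        finalIndex = idxB;          % branches B and C agree
    else
        finalIndex = idxC;          % no agreement: trust the modified wavelet branch
    end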

8.6.3 Results for compound character recognition using hybrid network

As discussed earlier, the hybrid network is designed to improve upon the results of the individual feature extraction and classification combinations. The performance of the hybrid network for compound character recognition is presented in Table 8.7, which also summarizes the performance of the individual feature-classifier combinations. The observations in the table indicate that the hybrid network improves the recognition rate over each individual network. The time required to recognize a character is approximately 0.15 sec.

Table 8.7 Hybrid network performance for compound character recognition

Sr. no.   Recognition technique                              Overall recognition rate (%)
1         NN with Euclidean distance features                94.88
2         NN with normalized pixel density features          95.24
3         NN with modified wavelet approximation features    96.23
4         Hybrid network (majority voting of 1 to 3)         97.95

The techniques implemented for the recognition of handwritten Marathi compound characters and their performances are summarized in Table 8.8.

Table 8.8 Summary of results of various systems for compound character recognition

Sr. no.   Template/feature generation technique      Resize factor        Template/feature size   Recognition technique   Overall recognition rate (%)
1         Binary templates                           16x16                16x16                   Template matching       87.65
                                                     32x32                32x32                   Template matching       89.00
          Convolved binary templates                 16x16                31x31                   Template matching       91.99
                                                     32x32                63x63                   Template matching       92.41
          Wavelet approximation templates            16x16                8x8                     Template matching       93.89
                                                     32x32                16x16                   Template matching       94.39
          Modified wavelet approximation templates   16x16                31x31                   Template matching       95.00
                                                     32x32                63x63                   Template matching       95.50
2         Euclidean distance features                16x16                32                      Neural network          94.88
3         Normalized pixel density features          70x50                35                      Neural network          95.24
4         Modified wavelet approximation features    16x16                64                      Neural network          96.23
5         Multiple (normalized pixel density,        Differs as per the   Differs as per the      Hybrid network          97.95
          Euclidean distance and modified wavelet    feature extraction   feature extraction
          approximation features)                    technique            technique

8.7 Concluding Remarks

Handwritten Marathi compound character recognition without separation of the constituent consonants is studied in this chapter. To the best of our knowledge, no such system for handwritten Marathi compound character recognition has been reported so far. A large database of compound characters is created, and various recognition techniques, from template matching to a hybrid network, are implemented so as to improve the performance in a systematic manner. Table 8.9 presents a comparative study of the three techniques implemented for handwritten Marathi compound character recognition.

Table 8.9 Comparative study of various systems for compound character recognition

Sr. no.   Template/feature generation technique                   Resize factor        Template/feature size   Recognition technique   Overall recognition rate (%)
1         Modified wavelet approximation templates                32x32                63x63                   Template matching       95.50
2         Modified wavelet approximation features                 16x16                64                      Neural network          96.23
3         Multiple (normalized pixel density, Euclidean           Differs as per the   Differs as per the      Hybrid network          97.95
          distance and modified wavelet approximation features)   feature extraction   feature extraction
                                                                  technique            technique

According to these observations, although template matching gives acceptable results with the modified wavelet templates, the time required to recognize a character is large, as shown in Table 8.2. The neural network improves the accuracy (refer to Table 8.9) owing to its learning and generalization ability, and it requires much less time for recognition (approximately 0.05 sec per character). The hybrid network, which combines the outputs of three different neural networks by majority voting, gives the highest recognition accuracy of 97.95%.