Off-Line Sinhala Handwriting Recognition with an Application for Postal City Name Recognition

M.L.M. Karunanayaka, N.D. Kodikara, G.D.S.P. Wimalaratne
University of Colombo School of Computing, No. 35, Reid Avenue, Colombo 7, Sri Lanka.
Tel: +94-11-2581245/8, Fax: +94-11-2587239
{mlm,ndk,spw}@ucsc.cmb.ac.lk

Abstract

Sinhala is the national language of Sri Lanka, used by about 70% of the population for their day-to-day activities, yet very little research has been done on Sinhala handwriting recognition. The proposed system focuses on recognition of Sinhala handwriting, using postal city names as a case study. Training and testing are carried out on handwriting taken from real postal envelopes, so the research is not limited to a single writing style; the number of main post offices in Sri Lanka is limited to five hundred and one. One of the major impediments in such a system is touching characters, so segmentation of handwritten touching characters becomes a crucial step, and conventional segmentation methods cannot handle the complexity of Sinhala handwritten characters. The proposed method separates touching characters into isolated character models in two steps: a basic projection profile method and the water reservoir concept. Finally, recognition is carried out using a Kohonen artificial neural network. Over 300 patterns were tested in segmentation with 92% accuracy, and the recognition phase was tested on 400 patterns with an 84.5% success rate.

Keywords: Sinhala handwriting, character recognition, character segmentation, noise removal, artificial neural network, postal recognition

1. Introduction

Off-line recognition of handwriting has numerous practical applications in areas such as banking, census processing, mail sorting and commerce.
Many techniques in computational pattern recognition, such as artificial neural networks and hidden Markov models, are used to recognize handwritten characters. In a character recognition scheme, several steps precede the recognition process itself; they fall into two main groups: data acquisition and preprocessing. In data acquisition, the input (grey-level) image is converted to a binary image. All image enhancement prior to recognition is done in the preprocessing step: noise detection and removal, thinning and segmentation. The final step is recognition. In our Sinhala handwriting recognition system, data acquisition and noise removal are done using a thresholding technique. Segmentation and recognition are the most difficult tasks of the proposed system, because in most real Sinhala postal writing the characters touch each other and are written in widely varying handwriting styles. Two approaches are possible for a segmentation and recognition system: segment the image into characters prior to the recognition phase [2,3,6,10], or integrate the segmentation and recognition phases [1,7]. In the proposed system the two phases are carried out separately. Four touching-character groups have been identified according to the way the characters touch each other: overlapping, touching, connecting and intersecting (Figure 1). In this paper, a two-level segmentation algorithm is proposed. At the first level, touching characters are identified, if present in the input image, and separated into the appropriate groups described above. A suitable segmentation algorithm is then applied to each group of touching characters.
The recognition process of the proposed system uses a Kohonen artificial neural network. The paper is organized as follows: Section 2 describes the background of Sinhala handwriting and its different styles, together with previous research in this area. Section 3 describes the methods used to implement the system: binarization, noise reduction, segmentation and recognition. Section 4 evaluates the results of the present work, and Section 5 concludes.

Figure 1: Combined character groups

2. Background

Sinhala is the national language of the people living in Sri Lanka [9]. Thirteen million people, i.e. 70% of the population of Sri Lanka, use Sinhala characters to write their mother tongue. It is not spoken in any other country, except in enclaves of migrants. As a descendant of a spoken form (Pali) of the root Indic language, Sanskrit, it can be argued to belong to the large family of Indo-Aryan languages [11]. Sinhala is written from left to right and has curved-shaped scripts. The characters are written within three horizontal layers: upper, middle and lower. Some characters occupy all three layers, some are written only in the middle layer, and another set is written in the middle layer and optionally in the upper and lower layers (Figure 2). Only a few studies have been done on Sinhala handwritten characters [4, 5], and almost all of them focus on recognizing regular, well-defined Sinhala handwritten characters; no research has been done on cursive, unconstrained handwritten characters.

3. Methodology

This section explains the procedures for noise removal, separation of touching and individual characters, segmentation of touching characters, and recognition of the postal city names. Section 3.1 describes the noise removal techniques applied to the input image. Section 3.2 describes the methodology for separating touching characters from individual characters.
Section 3.3 describes the segmentation techniques for touching characters. Finally, Section 3.4 describes the recognition methodology based on the Kohonen artificial neural network.

3.1. Binarization and Noise Removal Techniques

This section introduces the combined binarization and noise removal techniques. After these steps the output image contains a background of grey value 255 and the original foreground pixel values. Three approaches are taken in the binarization and noise removal of images.

First, the grey levels of the input image are sorted in ascending order, assuming that the foreground occupies one fourth of all pixels in the image. The first quarter of the sorted pixels is taken and its maximum grey value (MaxPV) is found. MaxPV is used as the cutoff grey value between background and foreground. The following rule (Equation 1, reconstructed from the surrounding description) converts the background of the grey-level image to grey level 255:

I'(x,y) = 255 if I(x,y) > MaxPV, otherwise I'(x,y) = I(x,y)    (1)

Figure 2: Sinhala characters written within layers

In the second method a 3 x 3 kernel is applied exactly once to each pixel of the image, together with the respective grey-level values. If the number of pixels in a given kernel whose grey values are close to zero (i.e. black) is less than or equal to two, that area is considered to belong to the background of the image, and its grey values are set to 255 (i.e. white).

The third approach is the dynamic adaptive threshold (DTU) method. The adaptive threshold is determined by Equation 2, where I(x,y) is the grey level of a pixel, 1 <= x <= w and 1 <= y <= h, and w and h are the width and height of the image respectively:

Adaptive Threshold = ( min(x,y) I(x,y) + max(x,y) I(x,y) + MaxPV ) / 3    (2)

The threshold calculated by Equation 2 is compared with the grey value of each pixel; if the grey value is above the threshold, the pixel is set to 255. In the final output image the grey values of the background pixels are 255 and the grey values of the foreground pixels are the same as in the initial image. The foreground grey values are deliberately left unchanged in order to preserve the details and information of the characters as much as possible.

3.2. Separation of Touching and Individual Characters

This section describes the segmentation procedure carried out in the present work. Figure 3 depicts the flow of the segmentation and recognition procedure.

Figure 3: Hierarchy of segmentation and recognition (noise-removed image -> vertical projection profile (Section 3.2) -> labelling (Section 3.3) -> water reservoir concept (Section 3.3); the condition in Equation 4 routes units to labelling, the conditions in Equation 5 route units to Categories 1-3 or mark them non-segmentable, and the Kohonen ANN outputs the recognized character)

At the beginning, the image is segmented using the vertical projection profile (VPP) method; touching characters (TC) are treated as single entities at this level. At the next level, the segmented character entities are classified according to whether they are touching characters or single characters. Touching characters are further classified into the groups described in Section 1.
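The quartile-based cutoff (Equation 1) and the adaptive threshold (Equation 2) of Section 3.1 can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' implementation; the function names are ours.

```python
import numpy as np

def quartile_cutoff(img: np.ndarray) -> int:
    """MaxPV: the largest grey value among the darkest quarter of pixels,
    assuming (as the paper does) that roughly 1/4 of the image is foreground."""
    darkest_quarter = np.sort(img.ravel())[: img.size // 4]
    return int(darkest_quarter.max())

def suppress_background(img: np.ndarray, max_pv: int) -> np.ndarray:
    """Equation 1: pixels brighter than MaxPV become white (255);
    foreground pixels keep their original grey values."""
    out = img.copy()
    out[out > max_pv] = 255
    return out

def adaptive_threshold(img: np.ndarray, max_pv: int) -> float:
    """Equation 2: mean of the minimum grey value, the maximum grey value
    and MaxPV."""
    return (int(img.min()) + int(img.max()) + max_pv) / 3.0
```

Keeping the original foreground grey values, rather than binarizing to 0/1, is what later allows the grey-level distribution to locate light connection points in Category 1 characters.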
The above-mentioned steps proceed as follows. The width of the image and the number of characters in it, obtained from the VPP, are used to roughly estimate the average character width (Equation 3):

Average Character Width = Image Width / Number of Characters    (3)

The segmented character entities are then further classified according to Equation 4:

If ( Width > 3 x AvgWidth / 2 ) then Touching Character, else Segmented Character    (4)

3.3. Segmentation of Touching Characters

This section describes the touching-character segmentation procedures. The first step is labelling each touching character in order to distinguish overlapping touching characters from the other touching characters. The proposed labelling algorithm moves a 3 x 3 kernel horizontally through the input image. When the kernel finds the first dark (foreground) pixel, it moves along the object assigning it the label 1 (using a label counter). If the foreground object is discontinued, the label counter is incremented and the kernel moves on to find another object, which is assigned the incremented label value; this procedure is carried out until the width of the image is reached. If the label counter is greater than one, it is reasonable to deduce that the segmented unit contains more than one character, and each label then represents a separate character unit. If the label counter is equal to one, the character unit segmented in Section 3.2 belongs to one of the other three groups mentioned in Section 1, viz. touching, connecting and intersecting. These groups continue to the next segmentation procedure, namely the water reservoir concept discussed by Pal, Belaïd and Choisy [8]. Figure 4 shows how the water reservoir concept segments touching characters.
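The VPP segmentation and the width test of Equations 3 and 4 can be sketched as follows, again as an illustrative Python/NumPy fragment of our own (the paper gives no code), assuming a binary image where 1 denotes ink:

```python
import numpy as np

def vpp_segments(binary: np.ndarray):
    """Split a binary image (1 = ink) into character entities at columns
    whose vertical projection profile is zero. Returns (start, end) column
    ranges, end exclusive."""
    profile = binary.sum(axis=0)
    segments, start = [], None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # entity begins at first inked column
        elif count == 0 and start is not None:
            segments.append((start, x))    # entity ends at a blank column
            start = None
    if start is not None:                  # entity touching the right edge
        segments.append((start, binary.shape[1]))
    return segments

def is_touching(width: int, avg_width: float) -> bool:
    """Equation 4: an entity wider than 1.5x the average character width
    (Equation 3) is flagged as touching characters."""
    return width > 3 * avg_width / 2
```

Entities flagged by `is_touching` would then go to the labelling step and, if singly labelled, on to the water reservoir analysis.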
Figure 4: Applying the water reservoir concept to touching characters

Validation of the segmented units using the water reservoir concept is based on the following attributes:

1. Number of top reservoirs and their heights, volumes and centres of gravity.
2. Number of bottom reservoirs and their heights, volumes and centres of gravity.
3. Number of reservoirs in the unit.
4. Maximum depth of each reservoir, checking whether a reservoir has more than one maximum-depth point.
5. If the character unit has both top and bottom reservoirs, the angle of the centres of gravity (joining each top centre of gravity to each bottom centre of gravity) is calculated.

If ( number of top reservoirs = 1 and top reservoir height > 3 x character height / 4 and no single maximum-depth point ) then BIN 1
Else if ( number of top reservoirs = 1 and number of bottom reservoirs = 1 and the angle between the centres of gravity of the reservoirs is between -45 and 45 degrees and no single maximum-depth point ) then BIN 2
Else if ( number of top reservoirs >= 1 and number of bottom reservoirs >= 1 and only one maximum-depth point in each reservoir ) then BIN 3
Else CANNOT BE SEGMENTED    (Equation 5)

This is the final step of the segmentation method. Equation 5 separates the touching characters into the three groups mentioned in Section 1. Different segmentation techniques, suited to each character group, are then applied; the following paragraphs discuss the segmentation of each category.

Category 1: Grey-level distribution is used to segment characters in this category. In these characters, the grey values at the connection point are higher than the other grey values of the character; that is, the connection point is lighter than the other points in the character image, because the writer reduces the pressure on the pen tip through the connecting area while still continuing to write with a small amount of pressure. The highest grey-value points among the foreground pixels are the segmentation points of Category 1 character images. The segmentation point is also checked to confirm that it belongs to the reservoir's bottom line.

Category 2: For this type of character, the maximum-depth points of the top reservoir (MDTR) and of the bottom reservoir (MDBR) are used to choose the segmentation point. The maximum-depth points on the top and on the bottom are joined and the length of the connecting line is calculated.
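The decision rules of Equation 5 can be sketched as follows. This is our own illustrative Python fragment; the reservoir extraction itself is omitted, and we read the paper's phrase "no one maximum depth point" as meaning a flat reservoir floor, i.e. more than one point at the maximum depth (in contrast to BIN 3's "only one maximum depth point").

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Reservoir:
    height: float            # vertical extent of the trapped "water"
    max_depth_points: int    # number of points lying at the maximum depth

def classify(top: List[Reservoir], bottom: List[Reservoir],
             char_height: float, cog_angle: Optional[float]) -> str:
    """Equation-5 decision rules for a singly-labelled touching unit."""
    flat = lambda rs: all(r.max_depth_points > 1 for r in rs)
    single = lambda rs: all(r.max_depth_points == 1 for r in rs)
    if len(top) == 1 and top[0].height > 3 * char_height / 4 and flat(top):
        return "category 1"      # BIN 1: deep single top reservoir
    if (len(top) == 1 and len(bottom) == 1 and cog_angle is not None
            and -45 <= cog_angle <= 45 and flat(top + bottom)):
        return "category 2"      # BIN 2: paired reservoirs, CoG angle in range
    if top and bottom and single(top + bottom):
        return "category 3"      # BIN 3: one max-depth point per reservoir
    return "cannot segment"
```

As in Equation 5, the conditions are tried in order, so a unit satisfying the BIN 1 test never reaches the later tests.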
This category has only one top reservoir and one bottom reservoir. Then, the segmentation path occurs through this connected line and separates the combined character into the isolated two characters using the MDTR and MDBR joining line. Category 3 : Characters in category 3 has more than one top reservoirs and also more than one bottom reserviors. All maximum depth points in top and bottom reserviors are joined. Then, the minimum distance of connecting points in the top reserviors depth points and the bottom reservoirs depth points, is calculated. This minimum distance line is the best cutoff line of the combined characters in category 3. 3.4. Recognition The Kohenen Artificial Neural Network (KANN) is used for the recognition phase of the proposed system. In this KANN 32 x 32 input neurons and 1 output neuron is used. The pattern in the input neuron is shown in figure 5. Each square in this pattern is one input node of the KANN. This KANN has only one input layer and one output layer. For this proposed system, all available characters are divided into forty groups as shown in figure 6. In dividing the group, modifiers of the characters are ignored where modifiers can be separated like, but the other characters where modifiers cannot be separated are taken as a whole with the modifier as in. All the modifiers that can be separated are grouped in cage 9. In this proposed system, first pattern is to input into KANN and produced the output, which is one of the forty character group listed in figure 6. If the output is group 9, then this pattern is ignored and passed on to check the next pattern which will be one out of the other thirty nine groups and is selected as the first letter of the word. The second pattern is chosen from the second letter of the list of words categorized under one selected character that is the first letter of the word selected by KANN and set it as
the second letter of the word ensuring that the pattern is not included in the group 9. This processed can be continued to select the remaining character patterns in the word image. Figure 5 : Input pattern of KANN Identifying city image is as follows: for a example if the city image is,the KANN generate the output signal as AQTMX and then system search the database which city is equal to emitting symbol AQTMX. Finally, system can understand the input city is Anamaduwa. Symbol Character Symbol Character A Q B R C S D E F f G H I i J s T t U V W w X K Y L Z l 1 M 2 m 3 N 4 O 5 P 9 Figure 6 : Groups of Characters x 4. Results and Evaluation Proposed method was applied for Sinhala handwritten postal addresses because the postal addresses are written by many different people and with different educational levels which leads the sample data set to accommodate a wide variation of handwritten characters and there is no restriction applied when writing the city names. The Sinhala handwritten database which is available in NSF[4] in Sri Lanka is one of the sources of Sinhala handwriting where the real postal addresses used to train and test the proposed method. Another source used to testing and training of proposed system is manually collected postal city names which are digitized using HP Scanjet 5200C, written by selected students of University of Colombo. For training and testing of the proposed system, the collected data are categorized into four groups. Those groups are namely real postal addresses which are in NSF database(rpa), words written by the student of university of Colombo(WUOC) and a selected sample of combined characters(cc) and isolated characters(ic). Success rate of the segmentation method is 92% and recognition method is 80%. Segmentation and recognition results are shown in table 4.1 and table 4.2 respectively. 
Table 4.1: Results of segmentation

Group   No. of patterns   Segmented correctly   Success (%)
RPA     100               89                    89%
WUOC    100               93                    93%
CC      100               94                    94%
Total   300               276                   92%

Table 4.2: Results of recognition

Group   No. of patterns   Recognized correctly   Success (%)
RPA     100               76                     76%
WUOC    100               87                     87%
CC      100               85                     85%
IC      100               90                     90%
Total   400               338                    84.5%

5. Conclusions

From the observations of the present work it can be concluded that the approach presented in this paper is efficient in recognizing cursive, unconstrained Sinhala handwriting compared with conventional segmentation and recognition methods. The performance of the present system can be improved in
many ways by incorporating other segmentation and recognition methods. More complex touching-character groups and very noisy images that could not be handled in the present work can be addressed by improving this method as a future enhancement. The next major future enhancement is to introduce a hybrid recognition process, since some character groups are misrecognized when only the Kohonen artificial neural network is used. Such a hybrid recognition process can be built by combining the Kohonen artificial neural network with hidden Markov models as a post-processing technique.

References

1. T.M. Breuel. Segmentation of Handprinted Letter Strings using a Dynamic Programming Algorithm. 6th International Conference on Document Analysis and Recognition, volume 1, pages 821-826, 2001.
2. R.G. Casey and E. Lecolinet. A Survey of Methods and Strategies in Character Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 18, number 7, pages 690-706, 1996.
3. C.E. Dunn and P.S.P. Wang. Character Segmentation Techniques for Handwritten Text: A Survey. International Conference on Pattern Recognition, pages 577-580, 1992.
4. H.C. Fernando, N.D. Kodikara and S. Hewavitharana. A Database for Handwriting Recognition Research in Sinhala Language. Proceedings of the 7th International Conference on Document Analysis and Recognition, Edinburgh, UK.
5. S. Hewavitharana, H.C. Fernando and N.D. Kodikara. Off-line Sinhala Handwriting Recognition using Hidden Markov Models. Proceedings of the Third Indian Conference on Computer Vision, Graphics and Image Processing, 2002.
6. D. Lee, S. Lee and H. Park. A New Methodology for Gray-Scale Character Segmentation and Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 18, number 10, pages 1045-1050, 1996.
7. S. Messelodi and C.M. Modena. Context driven text segmentation and recognition. Pattern Recognition, volume 17, pages 47-56, 1996.
8. U. Pal, A. Belaïd and C. Choisy. Touching numeral segmentation using water reservoir concept. Pattern Recognition Letters, volume 24, pages 261-272, 2003.
9. H.L. Premaratne and J. Bigun. Recognition of Printed Sinhala Characters Using Linear Symmetry. 5th Asian Conference on Computer Vision, pages 23-25, 2002.
10. K. Romeo-Pakker, H. Miled and Y. Lecourtier. A New Approach for Latin/Arabic Character Segmentation. 3rd International Conference on Document Analysis and Recognition, pages 874-877, 1995.
11. R. Weerasinghe. A Statistical Translation Approach to Sinhala-Tamil Language Translation. 5th International Information Technology Conference, pages 136-141, 2003.