CHAPTER 2 LITERATURE REVIEW



2.1 Introduction

There is a great need for OCR-related research in Indian languages, despite the many technical challenges and the lack of a commercial market [1]. With the spread of computers in organizations and homes, automatic processing of paper documents is rapidly gaining importance in India [2]. A short description of the advancements in OCR of Indian scripts, including Bangla, Tamil, Telugu, Gurmukhi, Oriya, Gujarati, Kannada, and Devanagari, up to 2002 can be found in [3]. This chapter attempts to cover the advancements up to 2010 in printed as well as handwritten Devanagari script recognition, along with their reported performance.

Devanagari is the script used for writing many official languages in India, such as Hindi, Marathi, Sindhi, Nepali, Sanskrit, and Konkani; Marathi is the language spoken in the state of Maharashtra. Several other Indian languages, such as Gujarati, Punjabi, and Bengali, use scripts similar to Devanagari. More than 300 million people use the Devanagari script for documentation in the central and northern parts of India [4].

This chapter presents a comprehensive review of the work carried out in Devanagari OCR. Section 2.2 reviews the literature on machine-printed Devanagari script. Section 2.3 reviews handwritten character recognition. In both cases, the research carried out at each stage of OCR, namely pre-processing, feature extraction, and classification/recognition, is discussed in detail. Section 2.4 puts forth some observations, and the chapter ends with concluding remarks in the final section.

2.2 Recognition of machine-printed Devanagari script

Work on the automatic recognition of printed Devanagari script started in the early 1970s. The efforts were initiated by Sinha [9], [10] at the Indian Institute of Technology, Kanpur. A syntactic pattern analysis system for Devanagari script recognition is presented in Sinha's Ph.D. thesis [9].
Other OCR systems for printed Devanagari were developed by Palit and Chaudhuri [11] and by Pal and Chaudhuri [12]. A team comprising Prof. B. B. Chaudhuri, U. Pal, M. Mitra, and U. Garain of the Indian Statistical Institute, Kolkata, developed the first commercial-level product for printed Devanagari OCR. The technology was transferred to the Centre for Development of Advanced Computing (CDAC) in 2001 for commercialization and is marketed as Chitrankan [3]. The following sections discuss the pre-processing, feature-extraction, and classification techniques reported so far for machine-printed Devanagari OCR.

2.2.1 Pre-processing and segmentation techniques

When a document is scanned using an optical scanner, a small degree of skew (tilt) is unavoidable. The skew angle is the angle that the text lines in the digital image make with the horizontal direction. Skew estimation and correction are important pre-processing steps of document layout analysis. For documents containing Devanagari text, the most important characteristic for skew estimation is the header line (shirorekha) joining all the characters in a word. An approach based on the detection of the shirorekha is proposed by Chaudhuri and Pal [13] and in [14]. Das and Chanda [15] also proposed a fast, script-independent skew estimation technique based on mathematical morphology.

After layout pre-processing such as skew elimination, paragraphs, text lines, words, and characters must be separated for effective feature extraction. Text blocks in the document pages are extracted first, and then lines and words are separated. Separation of text lines from text blocks is called line segmentation, and separation of words from each text line is called word segmentation. Projection profiles and the spaces between words and lines are used to achieve this in [5]. Separating words into their constituent characters is called character segmentation. In [5] and [16], characters are segmented from each Devanagari word by removing the shirorekha (header line). Garain and Chaudhuri [17] presented another technique, based on fuzzy multifactorial analysis, for the identification and segmentation of touching machine-printed Devanagari characters. Bansal and Sinha [18] presented a two-pass algorithm for the segmentation of machine-printed composite characters into their constituent symbols.
This algorithm makes extensive use of the structural properties of the script. Kompalli et al. [19] used a graph representation method to segment characters from printed words. In the methodology described by Bansal and Sinha [20], segmentation by smearing leaves overlapping text lines and touching characters unsegmented. The selection of image regions for further segmentation is based on statistical analysis of height or width, depending on the context. Sharma et al. [21] presented a rule-based approach for skew correction along with the removal of insignificant data such as dark bands, thumb marks, and specks. In the method proposed by Kompalli et al. [22], the shirorekha is determined using the projection profile and run length. Once the shirorekha is removed, the top, middle, and bottom zones are easily identified. Components in the top and bottom zones are parts of vowel modifiers. Each of these components is then scaled to a standard size before feature extraction and classification [23]. To segment touching printed Devanagari characters in degraded documents, a technique based on fuzzy multifactorial analysis is proposed in [96], where a predictive algorithm selects the cut points used to segment the touching characters. For the binarization of natural scene images containing Devanagari text, an adaptive thresholding technique is proposed in [80]. A water-reservoir-based analogy is proposed in [39] to extract individual text lines from such documents.

The script must be identified before the corresponding recognition engine is applied. Many techniques for line-wise and word-wise script identification have been proposed in the literature [79], [82], [84], [86], [91], [95], [98], [106]. In [106], a line-wise script identification approach is proposed, in which different structural features are used. In [86], appearance-based models are employed for script identification of printed text. These models are based on principal component analysis (PCA) and linear discriminant analysis (LDA)/Fisher's linear discriminant (FLD). Words are identified in multilingual document images using a support vector machine (SVM) in [95]. In [98], for word-wise script identification, the document is first segmented into lines, and the lines are then segmented into words. Individual script words are identified from document images using different topological and structural features.
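
The line-then-word decomposition mentioned above for [98] is commonly carried out with projection profiles, as noted earlier for [5]. The following is a minimal sketch of word segmentation on a single binarized text line, not the procedure of any cited paper. It assumes the image is a list of rows with 1 for black pixels, and the function names and the `min_gap` parameter are hypothetical. Because the shirorekha joins the characters of a Devanagari word, runs of empty columns mostly correspond to gaps between words.

```python
def vertical_projection(line_image):
    """Count black (1) pixels in each column of a binary text-line image."""
    return [sum(column) for column in zip(*line_image)]


def segment_words(line_image, min_gap=2):
    """Return (start, end) column ranges of words, with end exclusive.

    A word ends when at least `min_gap` consecutive empty columns follow it.
    `min_gap` is an illustrative parameter, not a value from the literature.
    """
    profile = vertical_projection(line_image)
    words = []
    start = None   # first column of the word being tracked, if any
    gap = 0        # length of the current run of empty columns
    for col, count in enumerate(profile):
        if count > 0:
            if start is None:
                start = col
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                words.append((start, col - gap + 1))
                start = None
                gap = 0
    if start is not None:
        # Close the last word, trimming any trailing empty columns.
        words.append((start, len(profile) - gap))
    return words


# A one-row "line" with a single-column break inside the first word
# and a two-column gap before the second word.
line = [[1, 1, 0, 1, 0, 0, 1, 1]]
word_ranges = segment_words(line, min_gap=2)
```

In this toy input the single empty column at index 2 is shorter than `min_gap`, so it stays inside the first word, while the two empty columns at indices 4 and 5 separate the two words.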
Texture features have been applied in [84] for script identification. In [79], a technique to identify Kannada, Hindi, and English text lines from a printed document is presented. To achieve higher accuracy, a two-stage approach is proposed for printed script identification in [82].

2.2.2 Feature extraction techniques

Different features have been used for the recognition of Devanagari characters. The system described by Sinha and Mahabala [10] for printed Devanagari characters stores structural descriptions for each symbol of the script in terms of primitives and their relationships. Sinha [24] also demonstrated how the spatial association among the constituent symbols of Devanagari script plays an important role in understanding Devanagari words. In [5], a character is assigned to one of three groups, namely the basic, modifier, and compound character groups, and group-wise features are considered. It is also observed that the compound characters (around 250) in the script occupy only 6% of the text. The two major features considered for printed Devanagari characters by Jayanthi et al. [25] are the main horizontal line and the various vertical lines. The third feature tests whether vertical lines are present on the rightmost side of the character. The other features are the height-to-width (aspect) ratio of the character, whether the character is narrow or broad ended, and the number of free ends it has. Govindaraju et al. [16] used gradient features to represent the characters. Kompalli et al. [22], [26] used gradient, structural, and concavity (GSC) features for OCR of machine-printed, multifont Devanagari text. The gradient features were used to classify segmented images. In the method proposed by Dhurandhar et al. [27], the significant contours of the printed character are extracted and characterized as a contour set based on a reference coordinate system. Jawahar et al. [23] used PCA for feature extraction of printed characters. A word-level matching scheme for searching in printed document images is proposed by Meshesha and Jawahar [28]. The feature-extraction scheme extracts local features by scanning vertical strips of the word image and combines them automatically based on their discriminatory potential. The features considered are word profiles, moments, and transform-domain representations.
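
Two of the simpler features named above, the height-to-width ratio and column-wise word profiles, can be illustrated with a short sketch. This is an assumption-laden illustration rather than the feature set of [25] or [28]: the image is taken to be a list of rows with 1 for black pixels, each one-column strip yields one profile value, and all function names are hypothetical.

```python
def upper_profile(image):
    """For each column, the row index of the first black pixel from the top.

    Columns without any black pixel are assigned the image height.
    """
    height = len(image)
    width = len(image[0])
    profile = []
    for col in range(width):
        first = height
        for row in range(height):
            if image[row][col]:
                first = row
                break
        profile.append(first)
    return profile


def lower_profile(image):
    """For each column, the distance from the bottom edge to the last black pixel."""
    # Flipping the image vertically turns the lower profile into an upper profile.
    return upper_profile(image[::-1])


def aspect_ratio(image):
    """Height-to-width ratio of the image."""
    return len(image) / len(image[0])


glyph = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
]
features = upper_profile(glyph) + lower_profile(glyph) + [aspect_ratio(glyph)]
```

Concatenating the profiles with scalar features, as in the last line, is only one possible way to build a feature vector; the cited work combines local features according to their discriminatory potential.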
In [1], printed Hindi words are first identified from bilingual or multilingual documents based on features of the Devanagari script using SVM. The identified words are then segmented into individual characters, where the composite characters are identified and further segmented based on the structural properties of the script and statistical information. In [79], a technique to identify Kannada, Hindi, and English text lines from a printed document is presented. The features used for script identification of machine-printed text in [82] are 64-D chain-code histogram (CH) features and 400-D gradient features. For the purpose of indexing in [87], printed Devanagari word images are represented in the form of geometric feature graphs (GFG), a graph-based representation of the features extracted from the word image. A set of features including percentiles, horizontal and vertical derivatives of percentiles, angles, correlations, and energy was used for printed Devanagari character recognition in [94]. LDA was then used to reduce the dimensionality of the feature set from 81 to 15. Zernike moments and directional features are used as the features for printed characters in [95]. Using background and foreground information, a scheme for the recognition of Indian complex documents is proposed in [107].

2.2.3 Recognition/Classification techniques

Many classifiers, such as artificial neural networks (ANN) [22], [23], [61], [77], hidden Markov models (HMM) [42], support vector machines (SVM) [35], [61], and the modified quadratic discriminant function (MQDF) [50], [56], have been used for Devanagari character recognition. Several compound discriminant functions have been derived from the projection distance (PD), and the MQDF is one of them [35]. Some contemporary techniques, such as rough sets, fuzzy rules, evolutionary algorithms, and Mahalanobis and Hausdorff distances [54], [68], [69], [96], are also used for the recognition of Devanagari characters. A feature-based tree classifier is used in [5] to recognize the basic characters. A top-down, binary-tree-based recognition of printed Devanagari characters is proposed by Jayanthi et al. [25], since a binary tree is one of the fastest decision-making procedures for a computer program. Govindaraju et al. [16] considered 38 characters and 83 frequently occurring conjunct character classes in a multistage classification approach. The characters were first classified into four categories depending on their structural properties. Each category was then classified using a separate three-level ANN trained with the standard backpropagation algorithm. The recognition of printed characters in the method proposed by Dhurandhar et al. [27] involves comparing the contour sets with those in the enrolled database.
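
Distance-based matching of shapes against stored references, as with the Hausdorff distances listed above, can be sketched as follows. This is a generic illustration of the symmetric Hausdorff distance between two point sets with nearest-template classification. It is not claimed to be the comparison used in [27] or the generalized variant used in [1], and the template dictionary and function names are hypothetical.

```python
import math


def directed_hausdorff(points_a, points_b):
    """Largest distance from a point in A to its nearest point in B."""
    return max(min(math.dist(p, q) for q in points_b) for p in points_a)


def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(points_a, points_b),
               directed_hausdorff(points_b, points_a))


def classify(contour, templates):
    """Return the label of the enrolled template closest to the contour.

    `templates` maps a label to a list of (x, y) contour points.
    """
    return min(templates, key=lambda label: hausdorff(contour, templates[label]))


# Hypothetical enrolled shapes: a vertical bar and a horizontal dash.
templates = {
    "bar": [(0, 0), (0, 1), (0, 2)],
    "dash": [(0, 0), (1, 0), (2, 0)],
}
label = classify([(0, 0), (0, 1), (0, 2.2)], templates)
```

The plain Hausdorff distance is sensitive to single outlying points, which is one reason generalized or partial variants are preferred for noisy character images.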
In [10], the recognition of printed characters involves a search for primitives on the labeled pattern based on the stored description. Contextual constraints are also used to arrive at the correct interpretation. In [19], multiple hypotheses are obtained for each composite character by considering all possible combinations of the classifier results for the primitive components. Meshesha and Jawahar [28] proposed a dynamic time warping (DTW) based partial matching algorithm for morphological matching, which handles word-form variations at the beginning and at the end of a word. Kompalli et al. [26] outlined two different techniques for OCR of machine-printed, multifont Devanagari text. In [22], neural network classifiers are used for the recognition of printed characters and words. Jawahar et al. [23] used SVM for classifying printed characters. In [1], segmented printed characters are recognized using generalized Hausdorff image comparisons. In [29], the classification of printed Devanagari characters is done through five filters: 1) coverage of the region of the core strip; 2) vertical bar feature; 3) horizontal zero crossings; 4) number and position of vertex points; and 5) moments. In [94], for printed Devanagari character recognition, each basic glyph and ligature is modeled with a 14-state left-to-right HMM with a maximum of 256 Gaussians per HMM. The HMMs were trained using the standard expectation-maximization procedure. For the classification of printed characters in [95], generalized Hausdorff image comparison, a nearest neighbor classifier, weighted Euclidean distance, and a hierarchical classification technique were employed.

General OCR techniques produce poor results on noisy and degraded documents such as old books or newspapers, photocopied material, and faxed documents [31]. The degraded quality of old documents and books is mainly due to old print technology and poor paper quality. As a result, the main difficulty in recognizing images of such documents is the distortion of characters caused by the spreading of ink. Imperfections in scanning may also result in noisy images. To handle such degraded documents, Dhingra et al. [31] presented an approach for the development of a minimum classification error (MCE) based system. Gabor filters are used to extract the classification features directly, as they had been applied successfully to Chinese OCR in [32].
The MCE-based classifiers make the system robust to random noise by adjusting the system feature space according to the computed loss function. Dhingra et al. [31] used a degradation model [33] to simulate the distortions caused by imperfections in scanning. In [71] and [91], the effectiveness of Gabor and discrete cosine transform (DCT) features was independently evaluated using nearest neighbor, linear discriminant, and SVM classifiers for the blind recognition of 11 different printed scripts, including Devanagari. The experiments showed that the Gabor-SVM combination had an edge over the other combinations. The classification of a machine-printed word into a particular script was done in [82] using SVM, via majority voting over the recognized character components of the word. SVM is also used in [107] for the recognition of multi-oriented Devanagari characters.

Only a few works are reported on post-processing for Devanagari OCR. Bansal and Sinha [30] described a method for the correction of optically read printed character strings using a Hindi word dictionary. Pal and Chaudhuri [12] and [99] also proposed a suffix- and prefix-based error correction technique, which can handle different inflectional languages.

Only a few works are reported on document retrieval and word spotting. In [88], a search system for retrieving relevant documents from a large collection of document images is presented. A DTW-based partial matching scheme is employed to group similar words together for indexing. Word profiles, such as upper and lower word profiles and projection and transition profiles, are used as features for word representation. Two different approaches are proposed for spotting words in images of printed Sanskrit documents in [97]. The first approach uses a block adjacency graph (BAG) based scheme for word recognition. The second approach employs a moment-based word matching technique, which maintains a script-invariant representation of all word images. Word matching is then carried out using cosine similarity. A shape-code-based word-spotting technique for the retrieval of multilingual Indian documents is proposed by Tarafdar et al. [100], where different primitive shape codes are used, such as 1) zonal information of extreme points; 2) vertical-shape-based features; 3) crossing count (with respect to the position of the vertical bar); 4) loop shape and position; and 5) background information. An inexact matching technique is employed to measure the similarity for possible spotting.
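
The DTW-based word matching used for indexing in [88] builds on the standard dynamic time warping recurrence, which can be sketched as below. This is the basic full-sequence DTW on one-dimensional profile sequences, under the assumption that each word image has already been reduced to a list of per-column feature values. The partial matching extension that tolerates prefix and suffix variations is not shown, and the function name is hypothetical.

```python
def dtw_distance(seq_a, seq_b):
    """Dynamic time warping cost between two numeric sequences.

    cost[i][j] holds the cheapest alignment of the first i items of seq_a
    with the first j items of seq_b, using absolute difference as local cost.
    """
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch seq_b (repeat its current item)
                cost[i][j - 1],      # stretch seq_a (repeat its current item)
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


# Two hypothetical column profiles of the same word at different widths.
narrow = [1, 2, 3]
wide = [1, 2, 2, 3]
distance = dtw_distance(narrow, wide)
```

Because the warping path may repeat items, the two profiles above align with zero cost even though their lengths differ, which is the property that makes DTW attractive for comparing word images of different widths.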
The details of many printed Devanagari character and word recognition systems are summarized in Tables 2.1 and 2.2, respectively. Table 2.1 shows that, for printed Devanagari characters, the method proposed by Dhingra et al. [31] is superior to the other methods in terms of recognition accuracy. For printed word recognition, the method proposed by Kompalli et al. [19] has the highest accuracy, as shown in Table 2.2.

Table 2.1 Details of printed Devanagari character recognition systems

Method | Feature | Classifier | Data set (size) | Accuracy (%)
Govindaraju et al. [16] | Gradient | Neural networks | 4,… | …
Kompalli et al. [22] | GSC | Neural networks | 32,… | …
Bansal et al. [20] | Statistical and structural | Statistical knowledge sources | Unspecified | 87
Huanfeng Ma et al. [1] | Structural and statistical | Hausdorff image comparison | 2,… | …
Sinha et al. [10] | Structural | Syntactic pattern recognition | Unspecified | 90
Natarajan et al. [94] | Derivatives | HMM | 21,… | …
Bansal et al. [29] | Filters | Five filters | Unspecified | 93
Dhurandhar et al. [27] | Contours | Interpolation | … | …
Kompalli et al. [26] | GSC | K-nearest neighbor | 9,… | …
Jayanthi et al. [25] | Statistical | Binary tree | … | …
Chaudhuri et al. [5] | Statistical | Tree classifier and template matching | 10,… | …
Kompalli et al. [19] | SFSA | Stochastic finite state automaton | 10,… | …
Jawahar et al. [23] | PCA | Support vector machine | 2,00,… | …
Dhingra et al. [31] | Gabor | MCE | 30,… | …

Table 2.2 Details of printed Devanagari word recognition systems

Method | Feature | Classifier | Data set (size) | Accuracy (%)
Govindaraju et al. [16] | Gradient | Neural networks | 4,… | …
Kompalli et al. [26] | GSC | K-nearest neighbor | 1,… | …
Kompalli et al. [22] | GSC | Neural networks | 14,… | …
Huanfeng Ma et al. [1] | Statistical and structural | Hausdorff image comparison | … | …
Chaudhuri et al. [5] | Statistical | Tree classifier and template matching | 10,… | …
Kompalli et al. [19] | SFSA | Stochastic finite state automaton | 10,… | …

2.3 Recognition of handwritten Devanagari script

Research on Indian handwritten character recognition has received increased attention only in recent years, although the first research report on offline handwritten Devanagari characters was published in 1977 [34]. Many approaches have been proposed for handwritten Devanagari numeral, character, and word recognition in the past decade [35].

2.3.1 Pre-processing and segmentation techniques

Some handwritten documents (e.g., Indian postal documents) may contain non-text parts (such as stamps and seals). Before such a document is recognized, the text and non-text parts must be separated. Many techniques [37], [38] based on connected component analysis, the run-length smoothing approach (RLSA), and morphological operations are used for this. Many techniques are employed in the literature for converting gray-scale images to binary. In [38], images are binarized using a histogram-based global binarization algorithm [39]. In [41] and [42], the Devanagari word image is first smoothed using a median filter and then binarized by Otsu's [43] thresholding method. The binarized image is then smoothed again using a median filter. Both local and global methods are used in some works [37]. Noise removal is also an important step toward recognition. Bajaj et al. [44] used a median-filtering-based approach to remove noise from images of handwritten Devanagari numerals.

For skew angle detection of handwritten Devanagari words and characters, an extension to the work in [13] is proposed in [67]. The method treats the shirorekha (header line) as an inherent feature of Devanagari script. The authors assume that a handwritten Devanagari word will never have a straight shirorekha, and hence consider the straightest part of the shirorekha for skew determination. A heuristic approach is applied to detect the skew angle.
Initially, the document is scanned from all four sides to obtain the coordinates of the pixels encountered along the demarcation of the word boundaries. The first-order differential of the coordinate information gives the spatial-level curve. The various levels are then clustered using the nearest neighborhood algorithm to form regions. The biggest region is treated as the region of importance. The skew angle is then calculated through a heuristic weight assignment scheme. In [41], mathematical morphological operations, namely erosion and dilation, were used to detect the shirorekha of each Devanagari word. Under the assumption that the shirorekha is piecewise linear, the skew of the word is corrected after the shirorekha is detected. The skew angle is found using the eigenvectors of the scatter matrix of each component (piece) of the shirorekha. To correct the skew, the word is again divided into slabs of a particular number of columns. Each slab is pushed up or down depending on the skew angle of the shirorekha component of that slab.

Text-line segmentation is an important task in the automatic recognition of offline handwritten text documents. Variations in interline distance, inconsistent baseline skew, and touching and overlapping text lines make this task more crucial and complex. The correctness of text-line segmentation directly affects the accuracy of word/character segmentation, which in turn affects the accuracy of word/character recognition. Several techniques for text-line segmentation are reported in the literature [101], [102]. The techniques may be categorized into four groups: 1) projection-profile-based techniques; 2) Hough-transform-based techniques; 3) smearing techniques; and 4) methods based on the thinning operation. As a conventional technique for text-line segmentation, global horizontal projection analysis of black pixels has been used for line segmentation in printed documents [3]. However, this technique cannot be used directly on unconstrained handwritten text documents because of text-line skew variability, inconsistent interline distances, and overlapping and touching components of two consecutive text lines.
Partial or piecewise horizontal projection analysis of black pixels is employed by many researchers to separate handwritten text lines of different languages [60], [103], [104]. In the piecewise horizontal projection technique, a text-page image is first decomposed into a number of vertical stripes. The positions of potential piecewise separating lines (PSL) are obtained for each stripe using partial horizontal projection on that stripe. To compute the PSL, the row-wise sum of all black pixels of a stripe is calculated. A row where this sum is zero is a PSL. Extra pieces of lines are removed based on heuristic rules. The potential separating lines are then connected to obtain complete separating lines for all the text lines of the image [40], [104].
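
The stripe-wise PSL computation described above can be sketched as follows. The sketch covers only the decomposition into vertical stripes and the zero-sum row test. The heuristic pruning of extra pieces and the joining of PSLs into complete separating lines are omitted, and the function names and the `stripe_width` parameter are hypothetical.

```python
def split_into_stripes(image, stripe_width):
    """Cut a binary page image (list of rows) into vertical stripes."""
    width = len(image[0])
    return [
        [row[start:start + stripe_width] for row in image]
        for start in range(0, width, stripe_width)
    ]


def separating_rows(stripe):
    """Rows whose black-pixel sum is zero are candidate PSL positions."""
    return [index for index, row in enumerate(stripe) if sum(row) == 0]


def piecewise_separating_lines(image, stripe_width):
    """Candidate PSL row indices for every stripe, from left to right."""
    return [separating_rows(stripe)
            for stripe in split_into_stripes(image, stripe_width)]


# Two "text lines" whose gap drifts downward from left to right (skew).
page = [
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
]
global_gaps = separating_rows(page)                  # whole page as one stripe
stripe_gaps = piecewise_separating_lines(page, 2)    # two stripes of width 2
```

In this toy page the global projection finds no empty row, because the gap between the two lines is skewed, while each stripe on its own exposes a zero-sum row. This is the motivation for the piecewise variant on handwritten text.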

For line segmentation of handwritten Devanagari text, a method based on header line detection, baseline detection, and a contour-following technique is proposed in [83]. The method requires no pre-processing such as skew correction, thinning, or noise removal. Roy et al. [105] proposed morphology-based handwritten line segmentation using foreground and background information. Hanmandlu et al. [59] used a structural approach for the segmentation of handwritten Hindi text. In [81], a dual method based on the interdependency between text lines and interline gaps is proposed for identifying text lines in handwritten Devanagari text. The method draws curves simultaneously through the text and interline gap points found from strip-wise histogram peaks and inter-peak valleys. The curves stabilize after several iterations and then define the final text lines and interline gaps. Because of the upper and lower modifiers of Devanagari text, many touching components may occur between two consecutive lines, and more research is needed to solve these problems in Devanagari script.

After a text line is segmented, words are separated from it. Most of the existing techniques use the vertical projection profile for this purpose [3], [60]. For the segmentation of characters from words, there are two types of schemes: recognition-free and recognition-based segmentation. In recognition-free segmentation, a character string is divided into segments by rules, without recognition. In recognition-based segmentation, candidate segmentation points are verified with a recognizer. In past years, many algorithms for the segmentation of character strings have been proposed [3], [41], [59]. One class of approaches uses contour features for segmentation. By analyzing the contour of a connected pattern, the corresponding valley and mountain points are derived. A cutting path is then chosen to segment the connected pattern by joining valley and mountain points.
In general, contour-based methods do not provide accurate results. Some researchers use profile features for segmentation. Profile-based methods fail when the handwriting is strongly skewed or overlapped. A multi-agent-based approach to the segmentation of touching handwritten Hindi numerals is presented in [65]. The first agent locates possible touching points based on the thickness of the handwriting. The second agent works on the thinned image to locate possible touching points based on the rules that govern the connection of different segments to form digits. The two agents then negotiate and try to agree on the actual touching points. The distortions in handwritten Devanagari characters are removed in [72] using a thickening process followed by thinning and pruning operations.

Hanmandlu et al. [59] attempt to segment handwritten Devanagari words into constituent characters and modifiers. The handwritten text is first segmented into lines and words using the technique given in [60]. The segmentation of each word includes its separation into characters, lower modifiers, and upper modifiers, and the separation of compound (composite) characters into consonants and half consonants. First, the header line is located and removed after the skew is corrected. Analysis of the horizontal pixel density in the top half of a word gives the location of the header line. After the header line is removed, the upper modifiers and the characters below the header line are separated using connected component analysis. The characters below the header line are analyzed further for the presence of lower modifiers. This is done by horizontally scanning the thinned image from top to bottom. A window-based approach is used to determine whether a segmented character is a composite one. The segmentation of characters from a Devanagari word in [41] is based on the assumption that a shirorekha (header line) is always present in a word.

Some works are reported on script identification from handwritten documents. In [84], it is done using texture features. The texture features are extracted based on the co-occurrence histograms of wavelet-decomposed images, which capture information about the relationship between each high-frequency subband and the corresponding low-frequency subband of the transformed image. The correlation between the subbands at the same resolution is significant in characterizing a texture. For script identification in handwritten documents in [85], denoising, thinning, pruning, m-connectivity, and text size normalization are performed in sequence.
Afterward, multichannel Gabor filtering is used to extract texture features that characterize the visual appearance of the document image. There are also documents in which machine-printed and handwritten text appear together. In [92], a machine-printed and handwritten text classification scheme for Devanagari and Bangla is presented. The scheme is based on both the structural and statistical features of printed and handwritten text lines.
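
Several of the pre-processing pipelines reviewed in this subsection binarize the word image with Otsu's thresholding method [43]. As a rough illustration of that step, the sketch below picks the gray level that maximizes the between-class variance of the histogram. It assumes 8-bit gray levels supplied as a flat list of integers; the function names are hypothetical, and the median filtering used before and after binarization in [41] and [42] is not shown.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class, the remaining pixels the other.
    """
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold = 0
    best_variance = -1.0
    weight_low = 0   # number of pixels with value <= t
    sum_low = 0      # sum of gray levels of those pixels
    for t in range(levels):
        weight_low += histogram[t]
        sum_low += t * histogram[t]
        if weight_low == 0:
            continue          # no pixels in the low class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break             # all pixels are already in the low class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance = variance
            best_threshold = t
    return best_threshold


def binarize(pixels, threshold):
    """Mark dark pixels (ink) as 1 and light pixels (paper) as 0."""
    return [1 if value <= threshold else 0 for value in pixels]


# Hypothetical gray levels: dark ink around 10, light paper around 200.
gray = [10] * 50 + [200] * 50
threshold = otsu_threshold(gray)
ink_mask = binarize(gray, threshold)
```

When the histogram has an empty range between the two modes, every threshold in that range gives the same variance, and this sketch returns the lowest one.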

2.3.2 Feature extraction techniques

Although researchers have tested different features, statistical and structural features are the most widely used for handwritten numeral/character recognition. The feature-extraction methods in [8] for handwritten Devanagari numeral recognition are based on both statistical and structural features. Sethi and Chatterjee [34] described handwritten Devanagari numeral recognition based on a structural approach. The primitives used are horizontal and vertical line segments and right and left slants. For handwritten numerals in [2], a wavelet-filter-based multiresolution analysis of input numeral images is carried out in a cascaded manner. The authors argue that the Daubechies wavelet is well suited to digital computation because its basis functions are defined by multiplication and addition operators, with no derivatives or integrals involved. They considered high-level features based on contour representations of all four frequency components (high-high, high-low, low-high, and low-low) of the wavelet-filtered image. Bajaj et al. [44] represented each handwritten Devanagari numeral using three types of features: 1) density features; 2) moment features of the right, left, upper, and lower profile curves; and 3) descriptive component features. For extracting features from handwritten numerals, a box approach is proposed by Hanmandlu et al. [45], [46], which requires a spatial division of the numeral image into boxes. Ramteke and Mehrotra [47] evaluated the performance of various techniques based on moment invariants on handwritten Devanagari numerals. The extracted features are based on moments, image partition, principal component axes, correlation coefficients, and perturbed moments. Thinning-based features are also used in Devanagari handwritten character recognition.
From the thinned images of handwritten Hindi numerals, three types of feature points, namely end, branch, and cross points, are first extracted in [62]. The strokes between these feature points and their cavity information are also used for recognition. In [75], translation and scale invariance of handwritten Devanagari numerals are achieved using simple geometric moments. Higher-order Zernike moments are also used in the same work as shape descriptors. The feature used for classifying handwritten digits in [90] is the quad-tree-based longest run (QTLR) feature. Chain-code and gradient-based features are used for Devanagari numeral recognition in [56]. Fourier descriptors (FD), which are capable of representing shapes, have been used as features in [57] for handwritten numerals. A 64-dimensional FD, invariant to rotation, scale, and translation, represents each handwritten numeral.

Kumar [48] compared the performance of five feature-extraction methods on handwritten characters. The features covered are Kirsch directional edges, distance transform, chain code, gradient, and directional distance distribution. The experiments show that, with SVM classifiers, Kirsch directional edges perform worst and gradient performs best. With multilayer perceptrons (MLP), the performance of gradient and directional distance distribution is almost the same. The chain-code-based feature performs better than Kirsch directional edges and distance transform. A new feature is also proposed in the paper, in which the gradient direction is quantized into four directional levels and each gradient map is divided into 4 × 4 regions. This is combined with the total distances in four directions and the neighborhood pixel weights. Kaur [49] used Zernike moments along with zoning for feature extraction from handwritten Devanagari characters. Using moments as a feature extractor provides a way of describing the properties of an object in terms of its area, position, orientation, and other precisely defined parameters. For the recognition of handwritten Devanagari non-compound characters, shadow features and CH features are computed in [61]. In [63], the handwritten Devanagari characters are represented using chain-code features. In [68], the features are extracted from handwritten Devanagari characters using the box approach presented in [69]. Each character image is divided into 24 boxes. The features are represented using normalized vector distances for each character. The shirorekha and spine in a handwritten character are detected using a differential-distance-based technique in [72].
Features such as crossing points, end points, and corners are also considered in the same work. In [73], a feature-extraction technique to improve the recognition results for two similarly shaped handwritten characters is discussed. The technique is based on the Fisher ratio (F-ratio), a statistical measure defined as the ratio of the between-class variance to the within-class variance. The main features considered for handwritten Devanagari characters in [77] are the CH features, features based on four side views, and shadow-based features. The features used by Sharma et al. [50] for handwritten Devanagari characters are obtained from the directional chain-code information of the contour points of the characters. The bounding box of a character is segmented into blocks and a CH is computed in each of the blocks. Based on the CH, they have used 64-D features for recognition. The features used by Pal et al. [51] for handwritten characters are mainly based on directional information obtained from the arc tangent of the gradient and a Gaussian filter. In [35], a comparative study of Devanagari handwritten character recognition using 12 different classifiers and four sets of features is presented. The feature sets are computed from curvature and gradient information obtained from binary as well as gray-scale images. The histogram of chain-code directions in the image strips scanned from left to right by a sliding window is used by Shaw et al. [42] as a feature vector for handwritten Devanagari word recognition.

Recognition/Classification techniques

A decision tree is employed by Sethi and Chatterjee [34] to analyze hand-printed Devanagari numerals, depending on the presence or absence of primitives such as horizontal and vertical line segments, right and left slants, and their interconnections. A similar strategy is applied to constrained hand-printed characters in [52]. Bhattacharya and Chaudhuri [2] use a distinct MLP classifier at each stage of their recognition scheme for handwritten numerals. Each such classifier either classifies or rejects an input numeral at the corresponding resolution level. If the MLP classifier at a coarser resolution level rejects a numeral, the classifier of the following stage attempts to recognize it at the next higher resolution level. Finally, if rejection still occurs at the highest resolution level, the output vector of each of these three MLP classifiers is transformed into a kind of likelihood measurement. Another MLP classifier is then used to obtain the final decision by combining these three likelihood measurement vectors.
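The coarse-to-fine scheme with rejection described above for [2] can be illustrated with a minimal sketch. It is not the implementation of [2]: the classifiers are hypothetical callables standing in for trained MLPs, the rejection threshold of 0.8 is an assumed value, and the final MLP combiner is replaced here by a plain sum of normalized score vectors.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a trained MLP: maps the input at one
# resolution level to a list of per-class scores.
Classifier = Callable[[object], List[float]]


def normalise(scores: Sequence[float]) -> List[float]:
    """Scale scores to sum to 1, a crude likelihood-style measurement."""
    total = sum(scores)
    if total <= 0:
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def cascade_classify(
    views: Sequence[object],
    classifiers: Sequence[Classifier],
    threshold: float = 0.8,  # assumed rejection threshold
) -> Tuple[int, int]:
    """Try each resolution level from coarse to fine.

    views[i] is the input for classifiers[i]. Returns (label, stage);
    stage == len(classifiers) means every stage rejected the input and
    the fallback combination was used.
    """
    all_scores: List[List[float]] = []
    for stage, (view, clf) in enumerate(zip(views, classifiers)):
        scores = clf(view)
        all_scores.append(scores)
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            return best, stage  # accepted at this resolution level
    # All stages rejected: combine the normalized score vectors.
    # (The source uses another MLP here; a sum is a simplification.)
    n_classes = len(all_scores[0])
    combined = [0.0] * n_classes
    for scores in all_scores:
        for k, p in enumerate(normalise(scores)):
            combined[k] += p
    best = max(range(n_classes), key=combined.__getitem__)
    return best, len(classifiers)


if __name__ == "__main__":
    coarse = lambda v: [0.5, 0.4, 0.1]   # rejects (max score below 0.8)
    medium = lambda v: [0.2, 0.9, 0.1]   # accepts class 1
    fine = lambda v: [0.1, 0.1, 0.95]    # never reached
    print(cascade_classify([None, None, None], [coarse, medium, fine]))
```

The design point is that most inputs are resolved cheaply at a coarse level, and only the ambiguous ones reach the finer levels or the combination step.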
Patil and Sontakke [53] proposed a general fuzzy hyperline segment neural network for rotation-, scale-, and translation-invariant handwritten numeral recognition. It combines supervised and unsupervised learning in a single algorithm so that it can be used for pure classification, pure clustering, and hybrid classification/clustering. Bajaj et al. [44] combined the decisions of multiple classifiers for handwritten Devanagari numerals. A neural-network-based classification scheme is designed for this task: three different neural classifiers are used, and their outputs are combined using a connectionist scheme.

Hanmandlu et al. [46] proposed a fuzzy-model-based scheme for the recognition of handwritten Devanagari numerals, representing them in the form of exponential membership functions that serve as a fuzzy model. Recognition is performed by modifying the exponential membership functions fitted to the fuzzy sets. These fuzzy sets are derived from features consisting of normalized distances obtained using the box approach. The Gaussian distribution function has been adopted by Ramteke and Mehrotra [47] for the classification of handwritten numerals. In [76], a method based on cubic spline interpolation is proposed for determining smooth and continuous edges in the images of handwritten Devanagari numerals. In [90], a Hough-transform-based technique is used to localize the postal-code blocks in structured postal documents with a defined address block region. Isolated handwritten digits are then extracted from the localized postal-code region. In [58], a system is proposed to classify handwritten Devanagari characters into several groups based on a similarity measure. The header line (shirorekha) is located based on the end points and pixel positions in the top half of the character image. The header line is removed from every character image before coarse classification. Three different classifiers, namely nearest neighbor, k-NN, and SVM, were tested independently for recognizing handwritten Devanagari numerals in [57]. The SVM achieved better accuracy than the other two classifiers. A syntactic representation (SR) of features is used in [62] for handwritten numeral recognition. This representation is matched against the set of prototype SRs of handwritten numerals. Edge-direction histogram features are used along with PCA to improve the recognition accuracy for handwritten Devanagari numerals in [76]. Recognition of handwritten numeric postal codes in a multiscript environment is presented in [90].
Similarly shaped digit patterns of four scripts, namely Latin, Devanagari, Bangla, and Urdu, are grouped into 25 clusters. A script-independent SVM-based pattern classifier is designed to classify the numeric postal codes into one of these 25 clusters. Based on the classification decisions, a rule-based script inference engine infers the script of the numeric postal code. One of the four script-specific SVM-based classifiers is then invoked to recognize the digits of the corresponding script. The work in [64] explores the potential of a clonal selection algorithm (CSA) in recognition. In particular, a retraining scheme for the CSA is proposed for better recognition of handwritten Devanagari numerals. A size-normalized binary image matrix is used as the feature map. In [49], the feature vector is fed as input to a feed-forward back-propagation neural network for the classification of handwritten Devanagari characters. Kumar [48] compared the performance of SVM and MLP classifiers with six different features on handwritten characters and found that the SVM classifier was superior to the MLP in all six cases, although the classification time required by the SVM was greater than that of the MLP. Sharma et al. [50] proposed a quadratic-classifier-based scheme for the recognition of handwritten characters. A modified quadratic classifier is applied by Pal et al. [51] to the features of handwritten characters for recognition. In [55], two classifiers, SVM and MQDF, are combined to obtain higher character recognition accuracy with the same features. A comparative study was carried out by Pal et al. [35] on Devanagari handwritten character recognition using 12 different classifiers: PD, subspace method (SM), linear discriminant function (LDF), SVM, MQDF, mirror image learning (MIL), Euclidean distance (ED), nearest neighbor, k-NN, modified PD (MPD), compound PD (CPD), and compound MQDF (CMQDF). From the experiments, they noted that the MIL classifier gave the best results and ED the lowest among the 12 classifiers considered. A divide-and-conquer strategy is adopted in [58] for the recognition of handwritten Devanagari characters, where each category is divided into subcategories based on structural properties to simplify the classification process. The subcategories considered in the paper are connected characters, non-connected characters, end-bar characters, middle-bar characters, without-bar characters, end-bar characters with one closed loop, end-bar characters with two closed loops, and without-bar characters with a loop.
This coarse classification is done by identifying the presence and position of vertical line segments and closed loops. In [59], the top modifiers of Devanagari script are classified into one-touching-point and two-touching-point modifiers by checking whether a modifier touches the header line at two positions or not. Further classification of two-touching-point modifiers is done by analyzing the core strip of the word. Two MLPs and a minimum edit distance (MED) method are used for the classification of handwritten Devanagari non-compound characters in [61]. In the first stage of classification, characters with distinct shapes are classified using two MLPs. Shadow features are used for one MLP and CH features are used for the other. In the second stage, confused characters having similar shapes are classified using the MED method, which makes use of corners detected in a character image using a modified Harris corner detection technique. The work reported in [72] presents a two-stage classification approach for handwritten Devanagari characters. The first stage uses structural properties such as the shirorekha and spine in a character. The second stage exploits intersection features of characters, which are fed to a feed-forward neural network (FFNN) for further classification. In [77], three MLPs are designed for three types of features. Each MLP is trained with the back-propagation learning algorithm, and the results of the three MLPs are then combined using a weighted majority scheme. The work reported in [63] discusses the use of regular expressions (RE) in handwritten Devanagari character recognition, where a handwritten character is converted into an encoded string based on chain-code features. The REs of stored templates are then matched against it, and rejected samples are sent to a MED classifier for recognition. An elastic matching (EM) technique based on eigen deformation (ED) for the recognition of handwritten Devanagari characters is proposed in [66]. The method consists of two phases: a training phase for the estimation of EDs, and a recognition phase using the estimated EDs. EDs are the intrinsic deformations within each character category and can be estimated by the PCA of actual deformations collected through the EM. A coarse classification is done in [68] prior to recognition, where the handwritten Devanagari characters are classified into three major categories based on the presence of a vertical bar, namely end-bar characters, middle-bar characters, and characters without any bar.
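One simple way to realize a coarse grouping by vertical bar is sketched below. It is only an illustration under stated assumptions and not necessarily the procedure of [68]: the character is assumed to be a binary image stored as rows of 0/1 pixels, a column counts as a bar when its longest vertical ink run covers at least 80% of the image height, and a bar in the rightmost 30% of the width is treated as an end bar. The function names and both thresholds are hypothetical.

```python
from typing import List, Sequence

Image = List[List[int]]  # rows of pixels, 1 = ink, 0 = background


def longest_run(column: Sequence[int]) -> int:
    """Length of the longest contiguous run of ink pixels in a column."""
    best = run = 0
    for px in column:
        run = run + 1 if px else 0
        best = max(best, run)
    return best


def coarse_class(img: Image, min_fraction: float = 0.8,
                 end_zone: float = 0.3) -> str:
    """Return 'end-bar', 'middle-bar', or 'no-bar' for a character image.

    min_fraction and end_zone are assumed thresholds, not values taken
    from the literature.
    """
    height, width = len(img), len(img[0])
    bar_columns = [
        x for x in range(width)
        if longest_run([img[y][x] for y in range(height)])
        >= min_fraction * height
    ]
    if not bar_columns:
        return "no-bar"
    # A bar reaching into the rightmost part of the image is an end bar;
    # any other bar position is treated as a middle bar (simplification).
    if max(bar_columns) >= (1 - end_zone) * width:
        return "end-bar"
    return "middle-bar"


def blank(height: int, width: int) -> Image:
    """Helper for building small test images."""
    return [[0] * width for _ in range(height)]


if __name__ == "__main__":
    img = blank(10, 10)
    for y in range(10):
        img[y][8] = 1  # full-height stroke near the right edge
    print(coarse_class(img))
```

A real system would first remove or mask the header line and normalize the character size, since both affect the run lengths measured here.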
The recognition of handwritten characters in [68] is based on the modified exponential membership function fitted to the fuzzy sets derived from the features of the characters. A reuse policy that provides guidance from past policies is also utilized in the paper to speed up the learning process. Not much work has been reported on handwritten character string (word) recognition for Devanagari. A segmentation-based approach to handwritten Devanagari word recognition is proposed by Shaw et al. [41]. On the basis of the header line, a word image is segmented into pseudo-characters. HMMs are proposed to recognize the pseudo-characters, and word-level recognition is done on the basis of string edit distance.
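Word-level matching by string edit distance can be sketched as follows. This is a generic Levenshtein distance with unit costs for insertion, deletion, and substitution, which is an assumption because the cost scheme of [41] is not described here. The lexicon layout, the pseudo-character labels, and the function names are hypothetical.

```python
from typing import Dict, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two label sequences (unit costs)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def best_word(recognised: Sequence[str],
              lexicon: Dict[str, Sequence[str]]) -> str:
    """Return the lexicon word whose label sequence is closest."""
    return min(lexicon, key=lambda w: edit_distance(recognised, lexicon[w]))


if __name__ == "__main__":
    # Hypothetical lexicon: word identifier -> pseudo-character labels.
    lexicon = {
        "word_a": ["k", "a", "m", "l"],
        "word_b": ["k", "a", "r"],
    }
    # Output of the pseudo-character recogniser, with one label wrong.
    print(best_word(["k", "a", "n", "l"], lexicon))
```

Matching against a lexicon in this way lets the word decision tolerate a few misrecognized or missegmented pseudo-characters.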

A continuous-density HMM is also proposed by Shaw et al. [42] to recognize handwritten word images. The states of the HMM are not determined a priori, but are determined automatically based on a database of handwritten word images. An HMM is constructed for each word. To classify an unknown word image, its class-conditional probability for each HMM is computed, and the class that gives the highest such probability is selected. In [74], a dynamic programming (DP) based technique is proposed for pin code string recognition. Initially, the pin code string is segmented into primitives.

Table 2.3 Details of handwritten Devanagari numeral recognition systems

Method | Feature | Classifier | Data set (size) | Accuracy (%)
Bajaj et al. [44] | Statistical | Neural networks | 2, |
Ramteke et al. [47] | Moment invariants | Gaussian distribution | 2, |
Lakshmi et al. [76] | Gradient | PCA | 9, |
Hanmandlu et al. [46] | Box approach | Fuzzy model | 3, |
Hanmandlu et al. [54] | Box approach | Bacterial foraging | 3, |
Elnagar et al. [62] | Structural | Matching SR | 1, |
Garain et al. [64] | Binary image | Clonal selection | 12, |
Basu et al. [90] | QTLR | SVM | 3, |
Rajput et al. [57] | Fourier descriptor | SVM | 13, |
Pal et al. [74] | Gradient | MQDF | 23, |
Sharma et al. [50] | Chain code | Quadratic | 22, |
Bhattacharya et al. [2] | Wavelet | MLP | 22, |
Patil et al. [53] | Structural | General fuzzy neural network | 2, |
Pal et al. [56] | Gradient | MQDF | 22, |

The details of many handwritten Devanagari numeral, character, and word recognition systems are summarized in Tables 2.3, 2.4, and 2.5, respectively. It is evident from Table 2.3 that, in recognizing Devanagari handwritten numerals, the method proposed by Pal et al. [56] is superior in terms of accuracy. For recognizing Devanagari handwritten characters, the method proposed by Pal et al. [35] has the highest accuracy, as shown in Table 2.4.

Table 2.4 Details of handwritten Devanagari character recognition systems

Method | Feature | Classifier | Data set (size) | Accuracy (%)
Sharma et al. [16] | Chain code | Quadratic | 11, |
Deshpande et al. [63] | Chain code | RE and MED | 5, |
Arora et al. [72] | Structural | FFNN | 50, |
Arora et al. [77] | Combined | MLP | 1, |
Hanmandlu et al. [68] | Vector distance | Fuzzy sets | 4, |
Arora et al. [61] | Shadow and CH | MLP and MED | 7, |
Kumar et al. [48] | Gradient | SVM | 25, |
Pal et al. [51] | Gradient and Gaussian filter | Quadratic | 36, |
Mane et al. [66] | Eigen deformation | Elastic matching | 3, |
Pal et al. [55] | Gradient | SVM and MQDF | 36, |
Pal et al. [35] | Gradient | MIL | 36, |

Table 2.5 Details of handwritten Devanagari word recognition systems

Method | Feature | Classifier | Data set (size) | Accuracy (%)
Shaw et al. [42] | Chain code | HMM | 39, |
Shaw et al. [41] | Segments | HMM | 39, |

2.4 Some observations

In the recent past, the Department of Information Technology (DIT), Government of India, formed a consortium of several Indian institutions and universities involved in OCR activities and provided a considerable amount of funding to this consortium to improve the quality of research related to Indian-language OCR. Creation of benchmark databases for Devanagari script is essential for successful research. Efforts have been made in India and the U.S. to create test beds for printed and handwritten Devanagari character recognition [2], [36], [93]. The database developed by the Indian Statistical Institute, Kolkata [2] contains isolated handwritten Devanagari numeral samples collected from real-life situations, and is made available free of cost to researchers of other academic institutions. Setlur et al. [36] also made efforts toward creating data resources and designing an evaluation test bed for Devanagari script recognition. Jawahar et al. [93] of the International Institute of Information Technology, Hyderabad, succeeded in generating a corpus consisting of more than document images printed in Indian scripts. However, there is still no standard database for handwritten composite characters and words written in Devanagari. For segmentation-based recognition of handwritten text, the text first has to be separated into words, and the words then into individual characters and modifiers. Because recognition is performed on isolated characters, segmentation of words into characters is a critical step in handwritten text recognition: incorrect segmentation may lead to incorrect recognition. Most segmentation errors are due to the varied writing styles of different individuals. The presence of many touching characters is another major segmentation problem. As a result, much research is needed and expected in this area. Identifying compound and touching characters is also a challenging task. Some authors are of the opinion that the use of contextual information may improve segmentation results. Characters are joined together using a shirorekha to form words in Devanagari script. It has been observed that some people write words without a shirorekha. Thus, a word may or may not have a header line in handwritten text, and its absence creates problems in recognition. Skew detection is difficult in the absence of a shirorekha, as some existing skew-detection methods rely on its presence. The straightness of the shirorekha is also an issue of concern. A study on the irregularities in Devanagari handwriting is presented in [70].
These irregularities occur during the writing process and make word-level recognition of the text more complex. Some of them are: abnormal size of a vowel symbol, incomplete or inaccurate representation of a vowel symbol, merging of vowel symbols with the header line, intrusion of upper vowel symbols into the middle region, intrusion of lower vowel symbols into the middle region, improper attachment of lower vowel symbols, writing vowel symbols in isolation, wrong position of the header line, incomplete writing, presence of unwanted or extra strokes, narrow writing, and overwriting.
