CHAPTER 2 LITERATURE REVIEW


2.1 Introduction

There is a great need for OCR-related research in Indian languages, despite the many technical challenges and the lack of a commercial market [1]. With the spread of computers in organizations and homes, automatic processing of paper documents is rapidly gaining importance in India [2]. A short description of the advances in OCR of Indian scripts, including Bangla, Tamil, Telugu, Gurmukhi, Oriya, Gujarati, Kannada, and Devanagari, up to 2002 can be found in [3]. The present review attempts to cover the advances up to 2010 in printed as well as handwritten Devanagari script recognition, along with their reported performance. Devanagari is the script used for writing many official languages in India, such as Hindi, Marathi, Sindhi, Nepali, Sanskrit, and Konkani; Marathi is the language spoken in the state of Maharashtra. Several other Indian languages, such as Gujarati, Punjabi, and Bengali, use scripts similar to Devanagari. More than 300 million people use the Devanagari script for documentation in the central and northern parts of India [4].

This chapter presents a comprehensive review of the work carried out in Devanagari OCR. Section 2.2 reviews the literature on machine-printed Devanagari script. Section 2.3 reviews handwritten character recognition. In both cases, the research carried out at each stage of OCR, namely pre-processing, feature extraction, and classification/recognition, is discussed in detail. Section 2.4 puts forth some observations, and the chapter ends with concluding remarks.

2.2 Recognition of machine-printed Devanagari script

The work on automatic recognition of printed Devanagari script started in the early 1970s. The efforts were initiated by Sinha [9], [10] at the Indian Institute of Technology, Kanpur. A syntactic pattern analysis system for Devanagari script recognition is presented in Sinha's Ph.D. thesis [9].
Another OCR system for printed Devanagari was developed by Palit and Chaudhuri [11] as well as Pal and Chaudhuri [12]. A team comprising Prof. B. B. Chaudhuri, U. Pal, M. Mitra, and U. Garain of the Indian Statistical Institute, Kolkata, developed the first commercial-level product for printed Devanagari OCR. The technology was transferred to the Centre for Development of Advanced Computing (CDAC) in 2001 for commercialization and is marketed as Chitrankan [3]. The following sections discuss the preprocessing, feature-extraction, and classification techniques reported so far for machine-printed Devanagari OCR.

2.2.1 Pre-processing and segmentation techniques

When a document is scanned using an optical scanner, a small degree of skew (tilt) is unavoidable. The skew angle is the angle that the text lines in the digital image make with the horizontal direction. Skew estimation and correction are important preprocessing steps of document layout analysis. For documents containing Devanagari text, the most important characteristic for skew estimation is the header line (shirorekha) that joins all the characters in a word. An approach based on the detection of the shirorekha is proposed by Chaudhuri and Pal [13] and in [14]. Das and Chanda [15] also proposed a fast and script-independent skew estimation technique based on mathematical morphology.

After layout preprocessing such as skew elimination, paragraphs, text lines, words, and characters must be separated for effective feature extraction. Text blocks in the document pages are extracted first, and then lines and words are separated. Separation of text lines from text blocks is called line segmentation, and separation of words from each text line is called word segmentation. Projection profiles and the spaces between words and lines are used to achieve this in [5]. Separating words into constituent characters is called character segmentation. In [5], [16], the characters of each Devanagari word are segmented by removing the shirorekha (header line). Garain and Chaudhuri [17] presented another technique for the identification and segmentation of touching machine-printed Devanagari characters based on fuzzy multifactorial analysis. Bansal and Sinha [18] presented a two-pass algorithm for the segmentation of machine-printed composite characters into their constituent symbols.
The proposed algorithm extensively uses structural properties of the script. Kompalli et al. [19] used a graph representation method to segment characters from printed words. In the methodology described by Bansal and Sinha [20], segmentation by smearing leaves overlapping text lines and touching characters unsegmented. The selection of image regions for further segmentation is based on statistical analysis of height or width, depending on the context. Sharma et al. [21] presented a rule-based approach for skew correction along with the removal of insignificant data such as dark bands, thumb marks, and specks. In the method proposed by Kompalli et al. [22], the shirorekha is determined using the projection profile and run length. Once the shirorekha is removed, the top, middle, and bottom zones are easily identified. Components in the top and bottom zones are parts of vowel modifiers. Each of these components is then scaled to a standard size before feature extraction and classification [23]. To segment touching printed Devanagari characters in degraded documents, a technique based on fuzzy multifactorial analysis is proposed in [96], where a predictive algorithm effectively selects the cut points for segmenting the touching characters. For the binarization of natural scene images containing Devanagari text, an adaptive thresholding technique is proposed in [80]. A water-reservoir-based analogy is proposed in [39] to extract individual text lines from such documents.

It is necessary to identify the script before applying the corresponding recognition engine. Many techniques for line-wise and word-wise script identification have been proposed in the literature [79], [82], [84], [86], [91], [95], [98], [106]. In [106], a line-wise script identification approach is proposed in which different structural features are used. In [86], appearance-based models are employed for the script identification of printed text. These models are based on principal component analysis (PCA) and linear discriminant analysis (LDA)/Fisher's linear discriminant (FLD). Words are identified in multilingual document images using SVM in [95]. In [98], for word-wise script identification, the document is first segmented into lines, and the lines are then segmented into words. Individual script words are identified from document images using different topological and structural features.
Texture features have been applied in [84] for script identification. In [79], a technique to identify Kannada, Hindi, and English text lines from a printed document is presented. To obtain higher accuracy, a two-stage approach is proposed for printed script identification in [82].

2.2.2 Feature extraction techniques

Different features have been used for the recognition of Devanagari characters. The system described by Sinha and Mahabala [10] for printed Devanagari characters stores structural descriptions for each symbol of the script in terms of primitives and their relationships. Sinha [24] also demonstrated how the spatial association among the constituent symbols of Devanagari script plays an important role in understanding Devanagari words. In [5], a character is assigned to one of three groups, namely the basic, modifier, and compound character groups, and group-wise features are considered. It is also observed that the compound characters (around 250) in the script occupy only 6% of the text. The two major features considered for printed Devanagari characters by Jayanthi et al. [25] are the main horizontal line and the various vertical lines. The third feature tests whether vertical lines are present at the rightmost side of the character. The other features are the height-to-width (aspect) ratio of the character, whether the character is narrow or broad ended, and the number of free ends it has. Govindaraju et al. [16] used gradient features to represent the characters. Kompalli et al. [22], [26] used gradient, structural, and concavity (GSC) features for OCR of machine-printed, multifont Devanagari text. The gradient features were used to classify segmented images. In the method proposed by Dhurandhar et al. [27], the significant contours of the printed character are extracted and characterized as a contour set based on a reference coordinate system. Jawahar et al. [23] used PCA for feature extraction from printed characters. A word-level matching scheme for searching in printed document images is proposed by Meshesha and Jawahar [28]. The feature-extraction scheme extracts local features by scanning vertical strips of the word image and combines them automatically based on their discriminatory potential. The features considered are word profiles, moments, and transform-domain representations.
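The column-wise word profiles mentioned above can be illustrated with a short sketch. It assumes a binarized word image stored as a list of rows with 1 for ink and 0 for background; the function name column_profiles and the exact profile definitions (distance to the first ink pixel from the top and from the bottom, ink count, and background-to-ink transitions per column) are illustrative assumptions, not the definitions used in [28].

```python
from typing import Dict, List

Image = List[List[int]]  # 1 = ink pixel, 0 = background


def column_profiles(word: Image) -> Dict[str, List[int]]:
    """Compute simple per-column profiles of a binarized word image."""
    height = len(word)
    profiles: Dict[str, List[int]] = {
        "upper": [], "lower": [], "projection": [], "transitions": []
    }
    for column in zip(*word):  # iterate over columns, top to bottom
        ink_rows = [r for r, v in enumerate(column) if v]
        # Distance from the top/bottom edge to the nearest ink pixel;
        # an empty column is assigned the full image height.
        profiles["upper"].append(ink_rows[0] if ink_rows else height)
        profiles["lower"].append(height - 1 - ink_rows[-1] if ink_rows else height)
        # Number of ink pixels in the column (vertical projection).
        profiles["projection"].append(len(ink_rows))
        # Number of background-to-ink transitions when scanning downward.
        profiles["transitions"].append(
            sum(1 for a, b in zip((0,) + column, column) if a == 0 and b == 1)
        )
    return profiles


example = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
    [1, 1, 0],
]
print(column_profiles(example))
```

Each profile is a sequence with one value per column, so two word images of different widths yield sequences of different lengths that can be compared with an elastic matching method.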
In [1], printed Hindi words are first identified in bilingual or multilingual documents from features of the Devanagari script using SVM. The identified words are then segmented into individual characters; composite characters are identified and further segmented based on the structural properties of the script and statistical information. In [79], a technique to identify Kannada, Hindi, and English text lines from a printed document is presented. The features used for script identification of machine-printed text in [82] are 64-D CH features and 400-D gradient features. For the purpose of indexing in [87], printed Devanagari word images are represented in the form of geometric feature graphs (GFG), a graph-based representation of the features extracted from the word image. A set of features including percentiles, horizontal and vertical derivatives of percentiles, angles, correlations, and energy was used for printed Devanagari character recognition in [94]. LDA was then used to reduce the dimensionality of the feature set from 81 to 15. Zernike moments and directional features are used as the features for printed characters in [95]. A scheme for the recognition of complex Indian documents using background and foreground information is proposed in [107].

2.2.3 Recognition/Classification techniques

Many classifiers, such as the artificial neural network (ANN) [22], [23], [61], [77], hidden Markov model (HMM) [42], support vector machine (SVM) [35], [61], and modified quadratic discriminant function (MQDF) [50], [56], have been used for Devanagari character recognition. Several compound discriminant functions have been derived from the projection distance (PD), and the MQDF is one of them [35]. Some contemporary techniques, such as rough sets, fuzzy rules, evolutionary algorithms, and Mahalanobis and Hausdorff distances [54], [68], [69], [96], are also used for recognizing Devanagari characters. A feature-based tree classifier has been used in [5] to recognize the basic characters. A top-down, binary-tree-based recognition of printed Devanagari characters is proposed by Jayanthi et al. [25], as a binary tree is one of the fastest decision-making processes for a computer program. Govindaraju et al. [16] considered 38 characters and 83 frequently occurring conjunct character classes in a multistage classification approach. Initially, they were classified into four categories depending on their structural properties. Each category was then classified using a separate three-level ANN trained with the standard backpropagation algorithm. The recognition of printed characters in the method proposed by Dhurandhar et al. [27] involves comparing the contour sets with those in the enrolled database.
In [10], the recognition of printed characters involves a search for primitives on the labeled pattern based on the stored description. Contextual constraints are also utilized to arrive at the correct interpretation. In [19], multiple hypotheses are obtained for each composite character by considering all possible combinations of the classifier results for the primitive components. A dynamic time warping (DTW) based partial matching algorithm for morphological matching, which handles word-form variations at the beginning and at the end of a word, is proposed by Meshesha and Jawahar [28]. Kompalli et al. [26] outlined two different techniques for OCR of machine-printed, multifont Devanagari text. In [22], neural network classifiers are used for the recognition of printed characters and words. Jawahar et al. [23] used SVM for classifying printed characters. In [1], segmented printed characters are recognized using generalized Hausdorff image comparisons. In [29], the classification of printed Devanagari characters is done through five filters: 1) coverage of the region of the core strip; 2) vertical bar feature; 3) horizontal zero crossings; 4) number and position of vertex points; and 5) moments. In [94], for printed Devanagari character recognition, each basic glyph and ligature is modeled with a 14-state left-to-right HMM with a maximum of 256 Gaussians per HMM. The HMMs were trained using the standard expectation-maximization procedure. For classification of printed characters in [95], generalized Hausdorff image comparison, a nearest neighbor classifier, weighted Euclidean distance, and a hierarchical classification technique were employed.

General OCR techniques produce poor results on noisy and degraded documents such as old books or newspapers, photocopied material, and faxed documents [31]. The quality degradation of old documents and books is mainly due to older print technology and poor paper quality. As a result, the main difficulty in recognizing images of such documents is the distortion of characters caused by the spreading of ink. Imperfections in scanning may also result in noisy images. To handle such degraded documents, Dhingra et al. [31] presented an approach for the development of a minimum classification error (MCE) based system. Gabor filters are used to extract the classification features directly, as they have been successfully applied to Chinese OCR in [32].
The MCE-based classifiers make the system robust against random noise by adjusting the system feature space according to the computed loss function. Dhingra et al. [31] used a degradation model [33] to simulate the distortions caused by imperfections in scanning. In [71] and [91], the effectiveness of Gabor and discrete cosine transform (DCT) features was independently evaluated using nearest neighbor, linear discriminant, and SVM classifiers for the blind recognition of 11 different printed scripts, including Devanagari. The experiments showed that the Gabor-SVM combination had an edge over the other combinations. The classification of a machine-printed word to a particular script was done in [82] using SVM via majority voting over the recognized character components of the word. SVM is also used in [107] for the recognition of multi-oriented Devanagari characters.

Only a few works are reported on the post-processing of Devanagari OCR. Bansal and Sinha [30] described a method for the correction of optically read printed character strings using a Hindi word dictionary. Pal and Chaudhuri [12] and [99] also proposed a suffix- and prefix-based error correction technique, which can handle different inflectional languages.

Only a few works are reported on document retrieval and word spotting. In [88], a search system for the retrieval of relevant documents from a large collection of document images is presented. A DTW-based partial matching scheme is employed to group similar words together for indexing. Word profiles, such as upper and lower word profiles and projection and transition profiles, are used as features for word representation. Two different approaches are proposed for spotting words in images of printed Sanskrit documents in [97]. In the first approach, a block adjacency graph (BAG) based scheme for word recognition is used. In the second approach, a moment-based word matching technique, which maintains a script-invariant representation of all word images, is employed. Word matching is then carried out using cosine similarity. A shape-code-based word-spotting technique for the retrieval of multilingual Indian documents is proposed by Tarafdar et al. [100], where different primitive shape codes are used, such as 1) zonal information of extreme points; 2) vertical-shape-based features; 3) crossing count (with respect to the position of the vertical bar); 4) loop shape and position; and 5) background information. An inexact matching technique is employed to measure the similarity for possible spotting.
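The DTW-based matching used in [28] and [88] builds on the basic dynamic time warping recurrence. The following is a minimal sketch of that basic recurrence only, assuming each word image has already been reduced to a one-dimensional sequence of column features. The function name dtw_distance and the absolute-difference local cost are illustrative assumptions, and the partial-matching extension described in the cited works is not reproduced here.

```python
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW alignment cost between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost: List[List[float]] = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched to b[j-1] again
                cost[i][j - 1],      # b[j-1] is matched to a[i-1] again
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# A stretched copy of the same profile aligns at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because the warping path may repeat elements, two renderings of the same word with different widths can still obtain a low cost, which is the property that makes DTW attractive for word-image matching.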
The details of many printed Devanagari character and word recognition systems are summarized in Tables 2.1 and 2.2, respectively. It is evident from Table 2.1 that, for printed Devanagari characters, the method proposed by Dhingra et al. [31] is superior to the other methods in terms of recognition accuracy. For printed word recognition, the method proposed by Kompalli et al. [19] has the highest accuracy, as shown in Table 2.2.

Table 2.1 Details of printed Devanagari character recognition systems

Method                    Feature                     Classifier                              Data set (size)  Accuracy (%)
Govindaraju et al. [16]   Gradient                    Neural networks                         4,
Kompalli et al. [22]      GSC                         Neural networks                         32,
Bansal et al. [20]        Statistical and structural  Statistical knowledge sources           Unspecified      87
Huanfeng Ma et al. [1]    Structural and statistical  Hausdorff image comparison              2,
Sinha et al. [10]         Structural                  Syntactic pattern recognition           Unspecified      90
Natarajan et al. [94]     Derivatives                 HMM                                     21,
Bansal et al. [29]        Filters                     Five filters                            Unspecified      93
Dhurandhar et al. [27]    Contours                    Interpolation
Kompalli et al. [26]      GSC                         K-nearest neighbor                      9,
Jayanthi et al. [25]      Statistical                 Binary tree
Chaudhuri et al. [5]      Statistical                 Tree classifier and template matching   10,
Kompalli et al. [19]      SFSA                        Stochastic finite state automaton       10,
Jawahar et al. [23]       PCA                         Support vector machine                  2,00,
Dhingra et al. [31]       Gabor                       MCE                                     30,

Table 2.2 Details of printed Devanagari word recognition systems

Method                    Feature                     Classifier                              Data set (size)  Accuracy (%)
Govindaraju et al. [16]   Gradient                    Neural networks                         4,
Kompalli et al. [26]      GSC                         K-nearest neighbor                      1,
Kompalli et al. [22]      GSC                         Neural networks                         14,
Huanfeng Ma et al. [1]    Statistical and structural  Hausdorff image comparison
Chaudhuri et al. [5]      Statistical                 Tree classifier and template matching   10,
Kompalli et al. [19]      SFSA                        Stochastic finite state automaton       10,

2.3 Recognition of handwritten Devanagari script

Research on Indian handwritten character recognition has received increased attention only in recent years, although the first research report on offline handwritten Devanagari characters was published in 1977 [34]. Many approaches have been proposed for handwritten Devanagari numeral, character, and word recognition in the past decade [35].

2.3.1 Pre-processing and segmentation techniques

Some handwritten documents (e.g., Indian postal documents) may contain non-text parts (such as stamps and seals). Before such a document is recognized, its text and non-text parts need to be separated. Many techniques [37], [38] based on connected component analysis, the run-length smoothing approach (RLSA), and morphological operations are used for this. Many techniques are employed in the literature for converting gray-scale images to binary. In [38], images are binarized using a histogram-based global binarization algorithm [39]. In [41] and [42], the Devanagari word image is first smoothed using a median filter and then binarized by Otsu's [43] thresholding method. The binarized image is then smoothed again using a median filter. Both local and global methods are used in some of the works [37]. Noise removal is also an important step toward recognition. Bajaj et al. [44] used a median-filtering-based approach for noise removal from images of handwritten Devanagari numerals.

For skew angle detection of handwritten Devanagari words and characters, an extension to the work in [13] is proposed in [67]. The method treats the shirorekha (header line) as an inherent feature of Devanagari script. The authors assume that a handwritten Devanagari word will never have a straight shirorekha and hence consider the straightest part of the shirorekha for skew determination. A heuristic approach is applied to detect the skew angle.
Initially, the document is scanned from all four sides to obtain the coordinates of the pixels encountered along the word boundaries. The first-order differential of the coordinate information gives the spatial-level curve. The various levels are then clustered using the nearest neighborhood algorithm to form regions. The biggest region is treated as the region of importance. The skew angle is then calculated through a heuristic weight assignment scheme. In [41], the mathematical morphological operations of erosion and dilation were used to detect the shirorekha of each Devanagari word. Assuming that the shirorekha is piecewise linear, the skew of the word is corrected after the shirorekha is detected. The skew angle is found using the eigenvectors of the scatter matrix of each component (piece) of the shirorekha. To correct the skew, the word is divided into slabs of a fixed number of columns, and each slab is pushed up or down depending on the skew angle of the shirorekha component in that slab.

Text-line segmentation is an important task in the automatic recognition of offline handwritten text documents. Variations in interline distance, inconsistent baseline skew, and touching and overlapping text lines make this task more crucial and complex. The correctness of text-line segmentation directly affects the accuracy of word/character segmentation, which in turn affects the accuracy of word/character recognition. Several techniques for text-line segmentation are reported in the literature [101], [102]. They may be categorized into four groups: 1) projection-profile-based techniques; 2) Hough-transform-based techniques; 3) smearing techniques; and 4) methods based on the thinning operation. As a conventional technique, global horizontal projection analysis of black pixels has been used for line segmentation in printed documents [3]. However, this technique cannot be used directly on unconstrained handwritten text documents because of text-line skew variability, inconsistent interline distances, and overlapping and touching components of two consecutive text lines.
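The eigenvector-based skew estimate reported above for [41] can be illustrated as follows. This is a minimal sketch, assuming the ink pixels of one shirorekha piece are available as (x, y) coordinates. For a 2 × 2 scatter matrix the orientation of the dominant eigenvector has a closed form, which is used here in place of a general eigen-decomposition. The function name principal_axis_angle is hypothetical, and the cited work may differ in detail.

```python
import math
from typing import List, Tuple


def principal_axis_angle(points: List[Tuple[float, float]]) -> float:
    """Angle (degrees) of the dominant eigenvector of the 2x2 scatter matrix.

    The angle is measured from the x-axis in the coordinate system of the
    input points. In image coordinates, where y grows downward, a positive
    value therefore corresponds to a line that descends to the right.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the scatter matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # Closed-form orientation of the eigenvector with the largest eigenvalue.
    return math.degrees(0.5 * math.atan2(2.0 * sxy, sxx - syy))


print(principal_axis_angle([(0, 0), (1, 1), (2, 2)]))  # points on the line y = x
```

Applying the function to each piece of a piecewise-linear header line gives one angle per piece, which matches the slab-wise correction described in the text.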
Partial or piecewise horizontal projection analysis of black pixels is employed by many researchers to separate handwritten text lines of different languages [60], [103], [104]. In the piecewise horizontal projection technique, a text-page image is first decomposed into a number of vertical stripes. The positions of potential piecewise separating lines (PSL) are obtained for each stripe using partial horizontal projection on that stripe. To compute the PSLs, the row-wise sum of all black pixels of a stripe is calculated; a row where this sum is zero is a PSL. Extra pieces of lines are removed based on heuristic rules. The potential separating lines are then connected to obtain complete separating lines for all text lines of the image [40], [104].
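The PSL computation described above can be sketched in a few lines. The sketch assumes a binarized page stored as a list of rows with 1 for black pixels and 0 for background, and stripes of nearly equal width; the function name piecewise_separating_rows is hypothetical. The heuristic pruning and the joining of PSLs across stripes are not shown.

```python
from typing import List

Image = List[List[int]]  # 1 = black (ink) pixel, 0 = background


def piecewise_separating_rows(page: Image, n_stripes: int) -> List[List[int]]:
    """For each vertical stripe, list the rows whose black-pixel sum is zero."""
    width = len(page[0])
    result: List[List[int]] = []
    for s in range(n_stripes):
        # Column range [lo, hi) covered by stripe s.
        lo = s * width // n_stripes
        hi = (s + 1) * width // n_stripes
        blank_rows = [r for r, row in enumerate(page) if sum(row[lo:hi]) == 0]
        result.append(blank_rows)
    return result


# A tiny "skewed" page: the first text line sits one row lower on the right.
page = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
print(piecewise_separating_rows(page, 2))
```

In the example, a global projection finds only row 2 empty, whereas the two stripes additionally expose rows 1 and 0 respectively, which illustrates why stripe-wise projection tolerates skewed lines better.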

For line segmentation of handwritten Devanagari text, a method based on header line detection, base line detection, and a contour-following technique is proposed in [83]. The proposed method does not require preprocessing steps such as skew correction, thinning, and noise removal. Roy et al. [105] proposed morphology-based handwritten line segmentation using foreground and background information. Hanmandlu et al. [59] used a structural approach for the segmentation of handwritten Hindi text. In [81], a dual method based on the interdependency between text lines and interline gaps is proposed for the identification of handwritten Devanagari text. The method draws curves simultaneously through the text and interline gap points found from strip-wise histogram peaks and inter-peak valleys. The curves stabilize after several iterations and then define the final text lines and interline gaps. Moreover, because of the upper and lower modifiers of Devanagari text, many touching components may occur between two consecutive lines, and more research is needed to solve these problems in Devanagari script.

After a text line is segmented, words are separated from it. Most of the existing techniques use the vertical projection profile for this purpose [3], [60]. For the segmentation of characters from words, there are two types of segmentation schemes: recognition-free and recognition-based segmentation. In recognition-free segmentation, a character string is divided into segments by rules, without recognition. In recognition-based segmentation, candidate segmentation points are verified with a recognizer. In past years, many algorithms for the segmentation of character strings have been proposed [3], [41], [59]. One class of approaches uses contour features for segmentation. By analyzing the contour of a connected pattern, the corresponding valley and mountain points are derived. A cutting path is then chosen to segment the connected pattern by joining valley and mountain points.
In general, contour-based methods do not provide accurate results. Some researchers use profile features for segmentation. Profile-based methods fail when the handwriting is strongly skewed or overlapped. A multiagent-based approach to the segmentation of touching handwritten Hindi numerals is presented in [65]. The first agent locates possible touching points based on the thickness of the handwriting. The second agent works on the thinned image to locate possible touching points based on the rules that govern the connection of different segments to form digits. The two agents then negotiate and try to agree on the actual touching points. The distortions in handwritten Devanagari characters are removed in [72] using a thickening process followed by thinning and pruning operations.

Hanmandlu et al. [59] attempt to segment handwritten Devanagari words into constituent characters and modifiers. Initially, the handwritten text is segmented into lines and words using the technique given in [60]. The segmentation of each word includes its separation into characters, lower modifiers, and upper modifiers, and the separation of compound (composite) characters into consonants and half consonants. First, the header line is located and removed after correcting the skew. Analysis of the horizontal pixel density in the top half of a word gives the location of the header line. After the header line is removed, the upper modifiers and the characters below the header line are separated using connected component analysis. The characters below the header line are analyzed further for the presence of lower modifiers by horizontally scanning the thinned image from top to bottom. A window-based approach is used to determine whether a segmented character is a composite one. The segmentation of characters from a Devanagari word in [41] is based on the assumption that a shirorekha (header line) is always present in a word.

Some works are reported on script identification from handwritten documents. It is done using texture features in [84]. The texture features are extracted from the co-occurrence histograms of wavelet-decomposed images, which capture information about the relationship between each high-frequency subband and the corresponding low-frequency subband of the transformed image. The correlation between the subbands at the same resolution is significant in characterizing a texture. For script identification in handwritten documents in [85], denoising, thinning, pruning, m-connectivity, and text size normalization are performed in sequence.
Afterward, multichannel Gabor filtering is used to extract texture features that characterize the visual appearance of the document image. There are also documents in which machine-printed and handwritten text appear together. In [92], a classification scheme for machine-printed and handwritten text in Devanagari and Bangla is presented. The scheme is based on both the structural and statistical features of printed and handwritten text lines.
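As an illustration of the binarization step mentioned earlier in this section, the sketch below computes Otsu's threshold [43] from a flat list of gray levels by maximizing the between-class variance. It is a minimal sketch under stated assumptions: 8-bit gray levels, dark ink on a light background, and hypothetical function names otsu_threshold and binarize. The median filtering applied before and after thresholding in [41], [42] is not shown.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int], levels: int = 256) -> int:
    """Return the gray level t maximizing the between-class variance,
    where pixels <= t form one class and pixels > t form the other."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0  # pixel count and gray-level sum of the darker class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def binarize(pixels: Sequence[int], threshold: int) -> List[int]:
    """Mark pixels at or below the threshold as ink (1), others as 0."""
    return [1 if p <= threshold else 0 for p in pixels]


gray = [10] * 5 + [200] * 5
t = otsu_threshold(gray)
print(t, binarize(gray, t))
```

When several thresholds give the same variance, this sketch keeps the lowest one; other implementations may resolve such ties differently.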

2.3.2 Feature extraction techniques

Even though researchers have tested different features, statistical and structural features are the most widely used for handwritten numeral/character recognition. The feature-extraction methods in [8] for handwritten Devanagari numeral recognition are based on both statistical and structural features. Sethi and Chatterjee [34] described handwritten Devanagari numeral recognition based on a structural approach. The primitives used are horizontal and vertical line segments and right and left slants. For handwritten numerals in [2], a wavelet-filter-based multiresolution analysis of input numeral images is carried out in a cascaded manner. It is noted there that the Daubechies wavelet, as a problem-solving tool, fits digital computers well because its basis functions are defined by multiplication and addition operators, with no derivatives or integrals involved. The authors considered high-level features based on contour representations of all four frequency components (high-high, high-low, low-high, and low-low) of the wavelet-filtered image. Bajaj et al. [44] represented each handwritten Devanagari numeral using three types of features: 1) density features; 2) moment features of the right, left, upper, and lower profile curves; and 3) descriptive component features. For feature extraction, a box approach is proposed by Hanmandlu et al. [45], [46] for handwritten numerals, which requires a spatial division of the numeral image into boxes. Ramteke and Mehrotra [47] evaluated the performance of various techniques based on moment invariants on handwritten Devanagari numerals. The extracted features are based on moments, image partition, principal component axes, correlation coefficient, and perturbed moments. Thinning-based features are also used in Devanagari handwritten character recognition.
From the thinned images of handwritten Hindi numerals, three different types of feature points, namely end, branch, and cross points, are first extracted in [62]. The strokes between these feature points and their cavity information are also used for recognition. In [75], translation and scale invariance of handwritten Devanagari numerals are achieved using simple geometric moments. Higher-order Zernike moments are also used in the same work as shape descriptors. The feature used for classifying handwritten digits in [90] is the quad-tree-based longest run (QTLR) feature. Chain-code and gradient-based features are used for Devanagari numeral recognition in [56]. Fourier descriptors (FD), which are capable of representing shapes, have been used as features in [57] for handwritten numerals. Each handwritten numeral is represented by a 64-dimensional FD that is invariant to rotation, scale, and translation.

Kumar [48] compared the performance of five feature-extraction methods on handwritten characters. The features covered are Kirsch directional edges, distance transform, chain code, gradient, and directional distance distribution. The experiments show that, with SVM classifiers, Kirsch directional edges perform worst and gradient features perform best. With multilayer perceptrons (MLP), the performance of gradient and directional distance distribution is almost the same. The chain-code-based feature performs better than Kirsch directional edges and distance transform. A new feature is also proposed in the paper, in which the gradient direction is quantized into four directional levels and each gradient map is divided into 4 × 4 regions. This is combined with the total distances in four directions and the neighborhood pixel weights. Kaur [49] used Zernike moments along with zoning for feature extraction from handwritten Devanagari characters. The application of moments as a feature extractor provides a method for describing the properties of an object in terms of its area, position, orientation, and other precisely defined parameters. For the recognition of handwritten Devanagari noncompound characters, shadow features and CH features are computed in [61]. In [63], the handwritten Devanagari characters are represented using chain-code features. In [68], the features are extracted from handwritten Devanagari characters using the box approach presented in [69]. Each character image is divided into 24 boxes, and the features are represented by normalized vector distances for each character. The shirorekha and spine in a handwritten character are detected using a differential-distance-based technique in [72].
Features like crossing points, end points, and corners are also considered in the same work. In [73], a feature-extraction technique to improve the recognition results of two similar-shaped handwritten characters is discussed. The technique is based on the Fisher ratio (F-ratio), a statistical measure defined as the ratio of the between-class variance to the within-class variance. The main features for handwritten Devanagari characters considered in [77] are the CH features, four-side-view-based features, and shadow-based features. Features used by Sharma et al. [50] for handwritten Devanagari characters are obtained from the directional chain-code information of the contour points of the characters. The bounding box of a character is segmented into blocks and a CH is computed in each of the blocks. Based on the CH, they have used 64-D features for recognition. The features used by Pal et al. [51] for handwritten characters are mainly based on directional information obtained from the arc tangent of the gradient and a Gaussian filter. In [35], a comparative study of Devanagari handwritten character recognition using 12 different classifiers and four sets of features is presented. Feature sets used in the classifiers are computed based on curvature and gradient information obtained from binary as well as gray-scale images. The histogram of chain-code directions in the image strips scanned from left to right by a sliding window is used by Shaw et al. [42] as a feature vector for handwritten Devanagari word recognition.

Recognition/Classification techniques

A decision tree is employed by Sethi and Chatterjee [34] to analyze hand-printed Devanagari numerals depending on the presence or absence of primitives like horizontal and vertical line segments, right and left slants, and their interconnections. A similar strategy is applied to constrained hand-printed characters in [52]. Bhattacharya and Chaudhuri [2] use a distinct MLP classifier at each stage of their recognition scheme for handwritten numerals. Each such classifier either classifies or rejects an input numeral at the corresponding resolution level. If the MLP classifier at a coarser resolution level rejects a numeral, the classifier of the following stage attempts to recognize it at the next higher resolution level. Finally, if rejection still occurs at the highest resolution level, the output vector of each of these three MLP classifiers is transformed into a kind of likelihood measurement. Another MLP classifier has been used to obtain the final decision by combining these three likelihood measurement vectors.
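The multistage scheme with rejection described for [2] can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each stage is a hypothetical callable returning a score vector, a stage "rejects" when its best softmax probability falls below an assumed threshold, softmax stands in for the likelihood transformation, and a simple averaging function stands in for the final combining MLP.

```python
import math


def softmax(scores):
    """Turn a score vector into values that sum to 1 (assumed likelihoods)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cascade_classify(sample, stages, combiner, threshold=0.8):
    """Try each stage from coarse to fine; fall back to the combiner.

    stages: callables (hypothetical classifiers), coarsest resolution first.
    combiner: callable that maps the list of likelihood vectors to a class.
    threshold: assumed rejection threshold on the best class probability.
    """
    likelihoods = []
    for stage in stages:
        probs = softmax(stage(sample))
        likelihoods.append(probs)
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            return best  # this stage accepts the sample
        # otherwise the stage rejects and the next resolution is tried
    return combiner(likelihoods)  # every stage rejected the sample


def average_combiner(likelihoods):
    """Stand-in for the final MLP: average the vectors and take the argmax."""
    n = len(likelihoods[0])
    mean = [sum(v[k] for v in likelihoods) / len(likelihoods) for k in range(n)]
    return max(range(n), key=mean.__getitem__)
```

In this sketch the finer stages are only evaluated when the coarser ones reject, which keeps the cost low for easy samples.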
Patil and Sontakke [53] proposed a general fuzzy hyperline segment neural network for rotation-, scale-, and translation-invariant handwritten numeral recognition. It combines supervised and unsupervised learning in a single algorithm so that it can be used for pure classification, pure clustering, and hybrid classification/clustering. Bajaj et al. [44] combined the decisions of multiple classifiers for handwritten Devanagari numerals. A neural-network-based classification scheme is designed for this task. Three different neural classifiers have been used for classification. The outputs of the three classifiers are combined using a connectionist scheme.

Hanmandlu et al. [46] proposed a fuzzy model-based scheme for recognition of handwritten Devanagari numerals by representing them in the form of exponential membership functions, which serve as a fuzzy model. Recognition is performed by modifying the exponential membership functions fitted to the fuzzy sets. These fuzzy sets are derived from features consisting of normalized distances obtained using the box approach. The Gaussian distribution function has been adopted by Ramteke and Mehrotra [47] for classification of handwritten numerals. In [76], a method based on cubic spline interpolation is proposed for determining smooth and continuous edges in the images of handwritten Devanagari numerals. In [90], a Hough-transformation-based technique is used to localize the postal-code blocks from structured postal documents with a defined address block region. Isolated handwritten digits are then extracted from the localized postal-code region. In [58], a system is proposed to classify handwritten Devanagari characters into several groups based on a similarity measure. The header line (shirorekha) is located based on end points and pixel positions in the top half of the character image. The header line is removed from the image of every character before coarse classification. Three different classifiers, namely nearest neighbor, k-NN, and SVM, were tested independently to recognize handwritten Devanagari numerals in [57]. The SVM achieved better accuracy than the other two classifiers. A syntactic representation (SR) of features is used in [62] for handwritten numeral recognition. This representation is matched against the set of prototype SRs of handwritten numerals for a possible match. Edge direction histogram features are used along with PCA for enhancing recognition accuracies of handwritten Devanagari numerals in [76]. Recognition of handwritten numeric postal codes in a multiscript environment is presented in [90].
Similar-shaped digit patterns of four scripts, namely Latin, Devanagari, Bangla, and Urdu, are grouped into 25 clusters. A script-independent SVM-based pattern classifier is designed to classify the numeric postal codes into one of these 25 clusters. Based on the classification decisions, a rule-based script inference engine is developed to infer the script of the numeric postal code. One of the four script-specific SVM-based classifiers is then invoked to recognize the digits of the corresponding script. The work in [64] explores the potential of a clonal selection algorithm (CSA) in recognition. In particular, a retraining scheme for the CSA is proposed for better recognition of handwritten Devanagari numerals. A size-normalized binary image matrix is used as the feature map.

In [49], the feature vector is given as input to a feed-forward back-propagation neural network for the classification of handwritten Devanagari characters. Kumar [48] compared the performance of SVM and MLP classifiers with six different features on handwritten characters and found that the SVM classifier was superior to the MLP in all six cases. However, the classification time required for the SVM was greater than that of the MLP. Sharma et al. [50] proposed a quadratic-classifier-based scheme for the recognition of handwritten characters. A modified quadratic classifier is applied by Pal et al. [51] on the features of handwritten characters for recognition. In [55], two classifiers, SVM and MQDF, are combined to get higher character-recognition accuracy with the same features. A comparative study was done by Pal et al. [35] on Devanagari handwritten character recognition using 12 different classifiers, namely PD, subspace method (SM), linear discriminant function (LDF), SVM, MQDF, mirror image learning (MIL), Euclidean distance (ED), nearest neighbor, k-NN, modified PD (MPD), compound PD (CPD), and compound MQDF (CMQDF). From the experiment, they noted that the MIL classifier provided the best results and the ED classifier the lowest among the 12 classifiers considered.

A divide-and-conquer strategy is adopted in [58] for the recognition of handwritten Devanagari characters, where each category is divided into subcategories based on structural properties to make the classification process simpler. The subcategories considered in the paper are connected characters, non-connected characters, end-bar characters, middle-bar characters, without-bar characters, end-bar characters with one closed loop, end-bar characters with two closed loops, and without-bar characters with a loop.
This coarse classification is done by identifying the presence and position of vertical line segments and closed loops. In [59], the top modifiers of Devanagari script are classified into one-touching-point and two-touching-point modifiers by checking whether a modifier touches the header line at two positions or not. Further classification of two-touching-point modifiers is done by analyzing the core strip of the word.

Two MLPs and a minimum edit distance (MED) method are used for classification of handwritten Devanagari non-compound characters in [61]. In the first stage of classification, characters with distinct shapes are classified using two MLPs. Shadow features are used for one MLP and CH features are used for the other MLP. In the second stage of classification, confused characters having similar shapes are classified using a MED method. This method makes use of corners detected in a character image using a modified Harris corner detection technique. The work reported in [72] presents a two-stage classification approach for handwritten Devanagari characters. The first stage uses structural properties like the shirorekha and spine in a character. The second stage exploits intersection features of characters, which are then fed to a feed-forward neural network (FFNN) for further classification. In [77], three MLPs are designed for three types of features. Each MLP is trained with a back-propagation learning algorithm. The results of the three MLPs are then combined using a weighted majority scheme.

The work reported in [63] discusses the use of regular expressions (RE) in handwritten Devanagari character recognition, where a handwritten character is converted into an encoded string based on chain-code features. The REs of stored templates are then matched against it. Rejected samples are sent to a MED classifier for recognition. An elastic matching (EM) technique based on eigen deformation (ED) for recognition of handwritten Devanagari characters is proposed in [66]. The method consists of two phases: a training phase for the estimation of EDs, and a recognition phase using the estimated EDs. EDs are the intrinsic deformations within each character category and can be estimated by PCA of actual deformations collected through the EM. A coarse classification is done in [68] prior to recognition, where the handwritten Devanagari characters are classified into three major categories, namely end-bar characters, middle-bar characters, and characters without any bar, based on the presence of a vertical bar.
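The weighted majority scheme used in [77] to combine the three MLPs is not specified in detail above, so the following Python sketch shows only the general idea. The label strings and weights are hypothetical; in practice the weights might be derived from each classifier's validation accuracy, which is an assumption here.

```python
from collections import defaultdict


def weighted_majority(votes, weights):
    """Return the label whose supporting classifiers have the largest total weight.

    votes: one predicted label per classifier.
    weights: one non-negative weight per classifier (hypothetical values).
    """
    totals = defaultdict(float)
    for label, weight in zip(votes, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Two lower-weight classifiers agreeing can outvote one higher-weight classifier.
decision = weighted_majority(["ka", "kha", "kha"], [0.5, 0.3, 0.3])
```

With equal weights this reduces to plain majority voting.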
The recognition of handwritten characters in [68] is based on the modified exponential membership function fitted to the fuzzy sets derived from the features of the characters. A reuse policy that provides guidance from past policies is also utilized in the paper to improve the speed of the learning process.

Not much work has been reported on handwritten character string (word) recognition for Devanagari. A segmentation-based approach to handwritten Devanagari word recognition is proposed by Shaw et al. [41]. On the basis of the header line, a word image is segmented into pseudo characters. HMMs are proposed to recognize the pseudo characters. The word-level recognition is done on the basis of string edit distance.
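Word-level recognition by string edit distance, as mentioned for [41], can be illustrated with the standard Levenshtein distance. In this sketch a recognized word is a sequence of pseudo-character labels and is matched to the closest entry of a lexicon. The label names, the lexicon, and the helper nearest_word are hypothetical, and unit costs for insertion, deletion, and substitution are assumed.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences with unit costs."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def nearest_word(recognized, lexicon):
    """Pick the lexicon entry with the smallest edit distance to the input."""
    return min(lexicon, key=lambda word: edit_distance(recognized, word))


# Hypothetical pseudo-character labels produced by the character recognizer.
lexicon = [("p1", "p2", "p3"), ("p4", "p5")]
best = nearest_word(("p1", "p7", "p3"), lexicon)
```

Because the distance is computed over label sequences, a single misrecognized pseudo character still maps to the intended word as long as no other lexicon entry is closer.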

A continuous density HMM is also proposed by Shaw et al. [42] to recognize handwritten word images. The states of the HMM are not determined a priori, but are determined automatically based on a database of handwritten word images. An HMM is constructed for each word. To classify an unknown word image, its class-conditional probability for each HMM is computed. The class that gives the highest such probability is finally selected. In [74], a dynamic programming (DP) based technique is proposed for pin code string recognition. Initially, the pin code string is segmented into primitives.

Table 2.3 Details of handwritten Devanagari numeral recognition systems

Method                    Feature                       Classifier                      Data set (size)   Accuracy (%)
Bajaj et al. [44]         Statistical                   Neural networks                 2,
Ramteke et al. [47]       Moment invariants             Gaussian distribution           2,
Lakshmi et al. [76]       Gradient                      PCA                             9,
Hanmandlu et al. [46]     Box approach                  Fuzzy model                     3,
Hanmandlu et al. [54]     Box approach                  Bacterial foraging              3,
Elnagar et al. [62]       Structural                    Matching SR                     1,
Garain et al. [64]        Binary image                  Clonal selection                12,
Basu et al. [90]          QTLR                          SVM                             3,
Rajput et al. [57]        Fourier descriptor            SVM                             13,
Pal et al. [74]           Gradient                      MQDF                            23,
Sharma et al. [50]        Chain code                    Quadratic                       22,
Bhattacharya et al. [2]   Wavelet                       MLP                             22,
Patil et al. [53]         Structural                    General fuzzy neural network    2,
Pal et al. [56]           Gradient                      MQDF                            22,

The details of many handwritten Devanagari numeral, character, and word recognition systems are summarized in Tables 2.3, 2.4, and 2.5, respectively. It is evident from Table 2.3 that, in recognizing Devanagari handwritten numerals, the method proposed by Pal et al. [56] is superior in terms of accuracy. For recognizing Devanagari handwritten characters, the method proposed by Pal et al. [35] has the highest accuracy, as shown in Table 2.4.

Table 2.4 Details of handwritten Devanagari character recognition systems

Method                    Feature                       Classifier                      Data set (size)   Accuracy (%)
Sharma et al. [16]        Chain code                    Quadratic                       11,
Deshpande et al. [63]     Chain code                    RE and MED                      5,
Arora et al. [72]         Structural                    FFNN                            50,
Arora et al. [77]         Combined                      MLP                             1,
Hanmandlu et al. [68]     Vector distance               Fuzzy sets                      4,
Arora et al. [61]         Shadow and CH                 MLP and MED                     7,
Kumar et al. [48]         Gradient                      SVM                             25,
Pal et al. [51]           Gradient and Gaussian filter  Quadratic                       36,
Mane et al. [66]          Eigen deformation             Elastic matching                3,
Pal et al. [55]           Gradient                      SVM and MQDF                    36,
Pal et al. [35]           Gradient                      MIL                             36,

Table 2.5 Details of handwritten Devanagari word recognition systems

Method                    Feature                       Classifier                      Data set (size)   Accuracy (%)
Shaw et al. [42]          Chain code                    HMM                             39,
Shaw et al. [41]          Segments                      HMM                             39,

2.4 Some observations

In the recent past, the Department of Information Technology (DIT), Government of India, formed a consortium of several institutions and universities of India involved in OCR activities and provided a considerable amount of funding to this consortium to improve the quality of research related to Indian language OCR.

Creation of benchmark databases for Devanagari script is essential for successful research. Efforts have been made in India and the U.S. to create test beds for printed and handwritten Devanagari character recognition [2], [36], [93]. The database developed by the Indian Statistical Institute, Kolkata [2] contains isolated handwritten Devanagari numeral samples collected from real-life situations, and is made available free of cost to researchers of other academic institutions. Setlur et al. [36] also made some efforts in creating data resources and designing an evaluation test bed for Devanagari script recognition. Jawahar et al. [93] of the International Institute of Information Technology, Hyderabad were successful in generating a corpus of document images printed in Indian scripts. However, there is still no standard database for handwritten composite characters and words written in Devanagari.

For segmentation-based recognition of handwritten text, the text first has to be separated into words, and the words then into individual characters and modifiers. As recognition has to be performed on isolated characters, segmentation of words into characters is a critical step for handwritten text recognition, since incorrect segmentation of words may lead to incorrect recognition. Most segmentation errors are due to the various writing styles of different individuals. The presence of many touching characters is another major problem for segmentation. As a result, much research is needed and expected in this area. Identifying compound and touching characters is also a challenging task. Some authors are of the opinion that the use of contextual information may improve the results of segmentation.

Characters are joined together using a shirorekha to form words in Devanagari script. It has been observed that some people write words without a shirorekha. Thus, a word may or may not have a header line in handwritten text, and its absence creates problems in recognition. Skew detection is difficult in the absence of a shirorekha, as some of the existing skew-detection methods rely on the presence of a shirorekha on words. The straightness of the shirorekha is also an issue of concern. A study on the irregularities in Devanagari handwriting is presented in [70].
These irregularities occur during the writing process and make the word-level recognition of the text more complex. Some of them are: abnormal size of a vowel symbol, incomplete and inaccurate representation of a vowel symbol, merging of vowel symbols with the header line, intrusion of upper vowel symbols into the middle region, intrusion of lower vowel symbols into the middle region, improper attachment of lower vowel symbols, writing vowel symbols in isolation, wrong position of the header line, incomplete writing, presence of unwanted or extra strokes, narrow writing, and overwriting.


More information

A HYBRID FEATURE EXTRACTION AND RECOGNITION TECHNIQUE FOR OFFLINE DEVNAGRI HADWRITING

A HYBRID FEATURE EXTRACTION AND RECOGNITION TECHNIQUE FOR OFFLINE DEVNAGRI HADWRITING A HYBRID FEATURE EXTRACTION AND RECOGNITION TECHNIQUE FOR OFFLINE DEVNAGRI HADWRITING Poonam Sharma Department of Computer Science The NorthCap University Email-Id: poonamsharma@ncuindia.edu Shivani Sihmar

More information

A Simple Text-line segmentation Method for Handwritten Documents

A Simple Text-line segmentation Method for Handwritten Documents A Simple Text-line segmentation Method for Handwritten Documents M.Ravi Kumar Assistant professor Shankaraghatta-577451 R. Pradeep Shankaraghatta-577451 Prasad Babu Shankaraghatta-5774514th B.S.Puneeth

More information

Online Handwritten Devnagari Word Recognition using HMM based Technique

Online Handwritten Devnagari Word Recognition using HMM based Technique Online Handwritten Devnagari Word using HMM based Technique Prachi Patil Master of Engineering Dept. of Electronics & Telecommunication Dr. D. Y. Patil SOE, Pune, India Saniya Ansari Professor Dept. of

More information

Online Bangla Handwriting Recognition System

Online Bangla Handwriting Recognition System 1 Online Bangla Handwriting Recognition System K. Roy Dept. of Comp. Sc. West Bengal University of Technology, BF 142, Saltlake, Kolkata-64, India N. Sharma, T. Pal and U. Pal Computer Vision and Pattern

More information

Handwritten Devanagari Character Recognition

Handwritten Devanagari Character Recognition Handwritten Devanagari Character Recognition Akhil Deshmukh, Rahul Meshram, Sachin Kendre, Kunal Shah Department of Computer Engineering Sinhgad Institute of Technology (SIT) Lonavala University of Pune,

More information

Structural Feature Extraction to recognize some of the Offline Isolated Handwritten Gujarati Characters using Decision Tree Classifier

Structural Feature Extraction to recognize some of the Offline Isolated Handwritten Gujarati Characters using Decision Tree Classifier Structural Feature Extraction to recognize some of the Offline Isolated Handwritten Gujarati Characters using Decision Tree Classifier Hetal R. Thaker Atmiya Institute of Technology & science, Kalawad

More information

A Review on Different Character Segmentation Techniques for Handwritten Gurmukhi Scripts

A Review on Different Character Segmentation Techniques for Handwritten Gurmukhi Scripts WWJMRD2017; 3(10): 162-166 www.wwjmrd.com International Journal Peer Reviewed Journal Refereed Journal Indexed Journal UGC Approved Journal Impact Factor MJIF: 4.25 e-issn: 2454-6615 Manas Kaur Research

More information

Opportunities and Challenges of Handwritten Sanskrit Character Recognition System

Opportunities and Challenges of Handwritten Sanskrit Character Recognition System Opportunities and Challenges of Handwritten System Shailendra Kumar Singh Research Scholar, CSE Department SLIET Longowal, Sangrur, Punjab, India Sks.it2012@gmail.com Manoj Kumar Sachan Assosiate Professor,

More information

Off-line Recognition of Hand-written Bengali Numerals using Morphological Features

Off-line Recognition of Hand-written Bengali Numerals using Morphological Features Off-line Recognition of Hand-written Bengali Numerals using Morphological Features Pulak Purkait and Bhabatosh Chanda ECSU, Indian Statistical Institute, Kolkata, India {pulak r, chanda}@isical.ac.in Abstract

More information

Marathi Handwritten Numeral Recognition using Fourier Descriptors and Normalized Chain Code

Marathi Handwritten Numeral Recognition using Fourier Descriptors and Normalized Chain Code Marathi Handwritten Numeral Recognition using Fourier Descriptors and Normalized Chain Code G. G. Rajput Department of Computer Science Gulbarga University, Gulbarga 585106 Karnataka, India S. M. Mali

More information

Character Recognition

Character Recognition Character Recognition 5.1 INTRODUCTION Recognition is one of the important steps in image processing. There are different methods such as Histogram method, Hough transformation, Neural computing approaches

More information

Segmentation of Bangla Handwritten Text

Segmentation of Bangla Handwritten Text Thesis Report Segmentation of Bangla Handwritten Text Submitted By: Sabbir Sadik ID:09301027 Md. Numan Sarwar ID: 09201027 CSE Department BRAC University Supervisor: Professor Dr. Mumit Khan Date: 13 th

More information

Image Normalization and Preprocessing for Gujarati Character Recognition

Image Normalization and Preprocessing for Gujarati Character Recognition 334 Image Normalization and Preprocessing for Gujarati Character Recognition Jayashree Rajesh Prasad Department of Computer Engineering, Sinhgad College of Engineering, University of Pune, Pune, Mahaashtra

More information

Recognition of Gurmukhi Text from Sign Board Images Captured from Mobile Camera

Recognition of Gurmukhi Text from Sign Board Images Captured from Mobile Camera International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 17 (2014), pp. 1839-1845 International Research Publications House http://www. irphouse.com Recognition of

More information

Segmentation of Isolated and Touching characters in Handwritten Gurumukhi Word using Clustering approach

Segmentation of Isolated and Touching characters in Handwritten Gurumukhi Word using Clustering approach Segmentation of Isolated and Touching characters in Handwritten Gurumukhi Word using Clustering approach Akashdeep Kaur Dr.Shaveta Rani Dr. Paramjeet Singh M.Tech Student (Associate Professor) (Associate

More information

Research Article Development of Comprehensive Devnagari Numeral and Character Database for Offline Handwritten Character Recognition

Research Article Development of Comprehensive Devnagari Numeral and Character Database for Offline Handwritten Character Recognition Applied Computational Intelligence and Soft Computing Volume 2012, Article ID 871834, 5 pages doi:10.1155/2012/871834 Research Article Development of Comprehensive Devnagari Numeral and Character base

More information

Degraded Text Recognition of Gurmukhi Script. Doctor of Philosophy. Manish Kumar

Degraded Text Recognition of Gurmukhi Script. Doctor of Philosophy. Manish Kumar Degraded Text Recognition of Gurmukhi Script A Thesis Submitted in fulfilment of the requirements for the award of the degree of Doctor of Philosophy Submitted by Manish Kumar (Registration No. 9000351)

More information

Handwritten Character Recognition: A Comprehensive Review on Geometrical Analysis

Handwritten Character Recognition: A Comprehensive Review on Geometrical Analysis IOSR Journal of Computer Engineering (IOSRJCE) eissn: 22780661,pISSN: 22788727, Volume 17, Issue 2, Ver. IV (Mar Apr. 2015), PP 8388 www.iosrjournals.org Handwritten Character Recognition: A Comprehensive

More information

Handwritten character and word recognition using their geometrical features through neural networks

Handwritten character and word recognition using their geometrical features through neural networks Handwritten character and word recognition using their geometrical features through neural networks Sudarshan Sawant 1, Prof. Seema Baji 2 1 Student, Department of electronics and Tele-communications,

More information

Handwritten Hindi Numerals Recognition System

Handwritten Hindi Numerals Recognition System CS365 Project Report Handwritten Hindi Numerals Recognition System Submitted by: Akarshan Sarkar Kritika Singh Project Mentor: Prof. Amitabha Mukerjee 1 Abstract In this project, we consider the problem

More information

SEGMENTATION OF CHARACTERS WITHOUT MODIFIERS FROM A PRINTED BANGLA TEXT

SEGMENTATION OF CHARACTERS WITHOUT MODIFIERS FROM A PRINTED BANGLA TEXT SEGMENTATION OF CHARACTERS WITHOUT MODIFIERS FROM A PRINTED BANGLA TEXT ABSTRACT Rupak Bhattacharyya et al. (Eds) : ACER 2013, pp. 11 24, 2013. CS & IT-CSCP 2013 Fakruddin Ali Ahmed Department of Computer

More information

A Technique for Classification of Printed & Handwritten text

A Technique for Classification of Printed & Handwritten text 123 A Technique for Classification of Printed & Handwritten text M.Tech Research Scholar, Computer Engineering Department, Yadavindra College of Engineering, Punjabi University, Guru Kashi Campus, Talwandi

More information

Automatic Recognition and Verification of Handwritten Legal and Courtesy Amounts in English Language Present on Bank Cheques

Automatic Recognition and Verification of Handwritten Legal and Courtesy Amounts in English Language Present on Bank Cheques Automatic Recognition and Verification of Handwritten Legal and Courtesy Amounts in English Language Present on Bank Cheques Ajay K. Talele Department of Electronics Dr..B.A.T.U. Lonere. Sanjay L Nalbalwar

More information

Time Stamp Detection and Recognition in Video Frames

Time Stamp Detection and Recognition in Video Frames Time Stamp Detection and Recognition in Video Frames Nongluk Covavisaruch and Chetsada Saengpanit Department of Computer Engineering, Chulalongkorn University, Bangkok 10330, Thailand E-mail: nongluk.c@chula.ac.th

More information

Region-based Segmentation

Region-based Segmentation Region-based Segmentation Image Segmentation Group similar components (such as, pixels in an image, image frames in a video) to obtain a compact representation. Applications: Finding tumors, veins, etc.

More information

Layout Segmentation of Scanned Newspaper Documents

Layout Segmentation of Scanned Newspaper Documents , pp-05-10 Layout Segmentation of Scanned Newspaper Documents A.Bandyopadhyay, A. Ganguly and U.Pal CVPR Unit, Indian Statistical Institute 203 B T Road, Kolkata, India. Abstract: Layout segmentation algorithms

More information

Character Segmentation for Telugu Image Document using Multiple Histogram Projections

Character Segmentation for Telugu Image Document using Multiple Histogram Projections Global Journal of Computer Science and Technology Graphics & Vision Volume 13 Issue 5 Version 1.0 Year 2013 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals Inc.

More information

CHAPTER 6 DETECTION OF MASS USING NOVEL SEGMENTATION, GLCM AND NEURAL NETWORKS

CHAPTER 6 DETECTION OF MASS USING NOVEL SEGMENTATION, GLCM AND NEURAL NETWORKS 130 CHAPTER 6 DETECTION OF MASS USING NOVEL SEGMENTATION, GLCM AND NEURAL NETWORKS A mass is defined as a space-occupying lesion seen in more than one projection and it is described by its shapes and margin

More information

Word-wise Script Identification from Video Frames

Word-wise Script Identification from Video Frames Word-wise Script Identification from Video Frames Author Sharma, Nabin, Chanda, Sukalpa, Pal, Umapada, Blumenstein, Michael Published 2013 Conference Title Proceedings 12th International Conference on

More information

N.Priya. Keywords Compass mask, Threshold, Morphological Operators, Statistical Measures, Text extraction

N.Priya. Keywords Compass mask, Threshold, Morphological Operators, Statistical Measures, Text extraction Volume, Issue 8, August ISSN: 77 8X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Combined Edge-Based Text

More information

Chapter 2. Literature Survey and Objectives. 2.1 Literature Survey

Chapter 2. Literature Survey and Objectives. 2.1 Literature Survey Chapter 2 Literature Survey and Objectives 2.1 Literature Survey In India, there are 18 official (Indian constitution accepted) languages. Two or more of these languages may be written in one script. Twelve

More information

Lecture 8 Object Descriptors

Lecture 8 Object Descriptors Lecture 8 Object Descriptors Azadeh Fakhrzadeh Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University 2 Reading instructions Chapter 11.1 11.4 in G-W Azadeh Fakhrzadeh

More information

Image Processing, Analysis and Machine Vision

Image Processing, Analysis and Machine Vision Image Processing, Analysis and Machine Vision Milan Sonka PhD University of Iowa Iowa City, USA Vaclav Hlavac PhD Czech Technical University Prague, Czech Republic and Roger Boyle DPhil, MBCS, CEng University

More information

Keywords Connected Components, Text-Line Extraction, Trained Dataset.

Keywords Connected Components, Text-Line Extraction, Trained Dataset. Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Language Independent

More information

HMM-Based Handwritten Amharic Word Recognition with Feature Concatenation

HMM-Based Handwritten Amharic Word Recognition with Feature Concatenation 009 10th International Conference on Document Analysis and Recognition HMM-Based Handwritten Amharic Word Recognition with Feature Concatenation Yaregal Assabie and Josef Bigun School of Information Science,

More information

Chapter 2. Components

Chapter 2. Components Chapter 2 [2]OCR: General Architecture and Components In some areas which require the automation of human intelligence, such as chess playing, tremendous improvements are achieved over the last few decades.

More information

SKEW DETECTION AND CORRECTION

SKEW DETECTION AND CORRECTION CHAPTER 3 SKEW DETECTION AND CORRECTION When the documents are scanned through high speed scanners, some amount of tilt is unavoidable either due to manual feed or auto feed. The tilt angle induced during

More information

A Document Image Analysis System on Parallel Processors

A Document Image Analysis System on Parallel Processors A Document Image Analysis System on Parallel Processors Shamik Sural, CMC Ltd. 28 Camac Street, Calcutta 700 016, India. P.K.Das, Dept. of CSE. Jadavpur University, Calcutta 700 032, India. Abstract This

More information

ABJAD: AN OFF-LINE ARABIC HANDWRITTEN RECOGNITION SYSTEM

ABJAD: AN OFF-LINE ARABIC HANDWRITTEN RECOGNITION SYSTEM ABJAD: AN OFF-LINE ARABIC HANDWRITTEN RECOGNITION SYSTEM RAMZI AHMED HARATY and HICHAM EL-ZABADANI Lebanese American University P.O. Box 13-5053 Chouran Beirut, Lebanon 1102 2801 Phone: 961 1 867621 ext.

More information

Review of Automatic Handwritten Kannada Character Recognition Technique Using Neural Network

Review of Automatic Handwritten Kannada Character Recognition Technique Using Neural Network Review of Automatic Handwritten Kannada Character Recognition Technique Using Neural Network 1 Mukesh Kumar, 2 Dr.Jeeetendra Sheethlani 1 Department of Computer Science SSSUTMS, Sehore Abstract Data processing

More information

Digital Image Processing

Digital Image Processing Digital Image Processing Third Edition Rafael C. Gonzalez University of Tennessee Richard E. Woods MedData Interactive PEARSON Prentice Hall Pearson Education International Contents Preface xv Acknowledgments

More information

HMM-based Indic Handwritten Word Recognition using Zone Segmentation

HMM-based Indic Handwritten Word Recognition using Zone Segmentation HMM-based Indic Handwritten Word Recognition using Zone Segmentation a Partha Pratim Roy*, b Ayan Kumar Bhunia, b Ayan Das, c Prasenjit Dey, d Umapada Pal a Dept. of CSE, Indian Institute of Technology

More information

An Efficient Character Segmentation Based on VNP Algorithm

An Efficient Character Segmentation Based on VNP Algorithm Research Journal of Applied Sciences, Engineering and Technology 4(24): 5438-5442, 2012 ISSN: 2040-7467 Maxwell Scientific organization, 2012 Submitted: March 18, 2012 Accepted: April 14, 2012 Published:

More information

Handwriting segmentation of unconstrained Oriya text

Handwriting segmentation of unconstrained Oriya text Sādhanā Vol. 31, Part 6, December 2006, pp. 755 769. Printed in India Handwriting segmentation of unconstrained Oriya text N TRIPATHY and U PAL Computer Vision and Pattern Recognition Unit, Indian Statistical

More information

Robust line segmentation for handwritten documents

Robust line segmentation for handwritten documents Robust line segmentation for handwritten documents Kamal Kuzhinjedathu, Harish Srinivasan and Sargur Srihari Center of Excellence for Document Analysis and Recognition (CEDAR) University at Buffalo, State

More information

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CHAPTER 4 CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS 4.1 Introduction Optical character recognition is one of

More information

On Segmentation of Documents in Complex Scripts

On Segmentation of Documents in Complex Scripts On Segmentation of Documents in Complex Scripts K. S. Sesh Kumar, Sukesh Kumar and C. V. Jawahar Centre for Visual Information Technology International Institute of Information Technology, Hyderabad, India

More information

Localization, Extraction and Recognition of Text in Telugu Document Images

Localization, Extraction and Recognition of Text in Telugu Document Images Localization, Extraction and Recognition of Text in Telugu Document Images Atul Negi Department of CIS University of Hyderabad Hyderabad 500046, India atulcs@uohyd.ernet.in K. Nikhil Shanker Department

More information

2: Image Display and Digital Images. EE547 Computer Vision: Lecture Slides. 2: Digital Images. 1. Introduction: EE547 Computer Vision

2: Image Display and Digital Images. EE547 Computer Vision: Lecture Slides. 2: Digital Images. 1. Introduction: EE547 Computer Vision EE547 Computer Vision: Lecture Slides Anthony P. Reeves November 24, 1998 Lecture 2: Image Display and Digital Images 2: Image Display and Digital Images Image Display: - True Color, Grey, Pseudo Color,

More information

Multiple Classifier Combination for Off-line Handwritten Devnagari Character Recognition

Multiple Classifier Combination for Off-line Handwritten Devnagari Character Recognition Multiple Combination for Off-line Handwritten Devnagari Character Recognition Sandhya Arora Department of CSE & T Meghnad Saha nstitute of Technology Kolkata-700107 sandhyabhagat@yahoo.com Debotosh Bhattacharjee,

More information

A Recognition System for Devnagri and English Handwritten Numerals

A Recognition System for Devnagri and English Handwritten Numerals A Recognition System for Devnagri and English Handwritten Numerals G S Lehal 1 and Nivedan Bhatt 2 1 Department of Computer Science & Engineering, Thapar Institute of Engineering & Technology, Patiala,

More information

A Hierarchical Pre-processing Model for Offline Handwritten Document Images

A Hierarchical Pre-processing Model for Offline Handwritten Document Images International Journal of Research Studies in Computer Science and Engineering (IJRSCSE) Volume 2, Issue 3, March 2015, PP 41-45 ISSN 2349-4840 (Print) & ISSN 2349-4859 (Online) www.arcjournals.org A Hierarchical

More information

Enhancing the Character Segmentation Accuracy of Bangla OCR using BPNN

Enhancing the Character Segmentation Accuracy of Bangla OCR using BPNN Enhancing the Character Segmentation Accuracy of Bangla OCR using BPNN Shamim Ahmed 1, Mohammod Abul Kashem 2 1 M.S. Student, Department of Computer Science and Engineering, Dhaka University of Engineering

More information

Signature Based Document Retrieval using GHT of Background Information

Signature Based Document Retrieval using GHT of Background Information 2012 International Conference on Frontiers in Handwriting Recognition Signature Based Document Retrieval using GHT of Background Information Partha Pratim Roy Souvik Bhowmick Umapada Pal Jean Yves Ramel

More information

II. WORKING OF PROJECT

II. WORKING OF PROJECT Handwritten character Recognition and detection using histogram technique Tanmay Bahadure, Pranay Wekhande, Manish Gaur, Shubham Raikwar, Yogendra Gupta ABSTRACT : Cursive handwriting recognition is a

More information