Alternatives for Page Skew Compensation in Writer Identification
Jin Chen and Daniel Lopresti
Department of Computer Science & Engineering
Lehigh University
Bethlehem, PA 18015, USA
{jic207,

Abstract

Traditionally, page images undergo pre-processing before the later stages of document analysis are applied. One common pre-processing step is to calculate and correct for the presence of simple page skew through a compensating rotation. Such operations modify the original input image, however, and in doing so may discard or obscure useful information. In this paper, we examine the impact of page deskewing on the task of writer identification for complicated handwritten documents. As an alternative to rotating the page image, we demonstrate a method that compensates for page skew during feature extraction. Experimental evaluation involving 61 Arabic writers and 610 page images shows that handling page skew during feature extraction can benefit writer ID, with a statistically significant 1.4% gain in accuracy. In addition, we obtain a 4.7% gain after improving an existing contour-based feature extraction method.

I. INTRODUCTION

Traditional techniques for document image analysis (DIA) follow a paradigm of pre-processing data, extracting features from it, and training a classifier to decode testing data. Noise and artifacts are normalized or removed during pre-processing so that the feature extraction and classification modules can work on improved data. However, such pre-processing usually modifies the original image and may discard information that could be exploited in later stages of document analysis. For example, slant correction, stroke-width normalization, and vertical scaling are commonly used for offline handwriting recognition, but Schlapbach and Bunke found that slant correction, and its combination with vertical scaling or width normalization, can be harmful to writer ID [1].

Writer identification, the problem of assigning a sample of unknown handwriting to one of a list of known writers, has been studied for decades. Over the years, a number of classifiers have proven useful in identifying writers, including Neural Networks [2], [3], K-nearest-neighbors (KNNs) [4], [5], [6], [7], [8], Hidden Markov Models (HMMs) [9], [10], Gaussian Mixture Models (GMMs) [11], [12], Support Vector Machines (SVMs) [13], and weighted Euclidean Distance classifiers (WED) [4]. Features have been based on connected-component contours [7], grapheme codebooks [7], [14], Gabor filtering [4], chain-code encoding [15], morphological operations [3], and many others [5], [13], [6]. Given carefully prepared datasets, researchers are able to achieve reasonably high accuracy with the help of discriminative classifiers and feature sets [16].

Recently, however, researchers have become interested in dealing with challenging data, which may contain various types of noise and artifacts. One type of artifact in handwritten documents is page skew, which is introduced during document scanning. In practice, pre-printed ruling lines, which are designed to help people write neatly, may help determine the page skew, as shown in Figure 1. On the other hand, rulings can interfere with efforts to segment handwritten strokes [17], [18], [19], [20]. Thus, to examine the impact of page skew on writer ID, we have to take pre-printed rulings into account.

In general, there are two ways to handle pre-printed rulings: remove them during pre-processing, or detect them and compensate for them during later processing stages. Following the pre-processing paradigm, Arvind et al. detect the ruling lines within segmented handwritten blocks by computing horizontal projection profiles, and design several rules to remove them [19]. Cao et al. design a set of heuristics to recover broken handwritten strokes after removing rulings [20]. Other methods based on machine learning include [17], [18].

None of these approaches seems ideal. First, training-based approaches face the difficulty of creating ground truth: labeling at the pixel level is tedious, and synthetic datasets are less convincing. Second, removal-based approaches use simple shape analysis and tend to create false-alarm strokes and/or to miss broken ruling segments. Nevertheless, ruling line removal has been widely used in applications such as check processing and form processing.

In our previous work [21], we avoided the problem of recovering broken strokes after removing rulings by handling the impact of rulings during feature extraction instead. This paradigm differs from the pre-processing one in that we do not modify any part of the image, but detect different image components and deal with them during later processing, e.g., feature extraction. It turned out that this paradigm lets us make use of the fact that different writers may treat rulings differently, which can be exploited for writer ID.

In this paper, we examine the impact of correcting page skew during pre-processing and instead try to compensate for the page skew during feature extraction. First, we detect the pre-printed rulings using a model-based method [22]. Next, we overcome the effects of ruling lines during the extraction of contour-hinge based features [7]. Then, we propose our ways of handling page skew during feature extraction rather than rotating the image during pre-processing. Finally, we discuss several issues that arise when implementing the feature extraction module and examine their impact on writer ID performance.

Fig. 1: An Arabic document with negative page-wise skew.

II. WRITER ID SYSTEM

A. Page Skew Detection

Our model-based ruling line detection algorithm exploits characteristics of pre-printed rulings such as consistent spacing β1 and approximately the same length L, skew angle β2, and thickness H [22]. It formulates ruling detection as a multi-line linear regression problem. One advantage is that it guarantees a globally optimal solution under the least-squares error (LSE) criterion. The result of the algorithm is a set of parameters of the ruling model, which can be used to render the rulings again. In our experiments, these pre-printed rulings are represented as lists of pixel sequences, and the skew angle β2 is defined to be the skew of the document image.

B. Feature Extraction

The contour-hinge feature is a bivariate probability distribution function (PDF) that captures both the orientation and the curvature of contours [7]. After extracting contours through connected-component analysis, we examine each pair of adjacent segments (each 10 pixels long) along the contours and compute their angles against the horizontal axis. Quantizing the angular plane [0, 2π) into 24 bins, we vote in these bins while traversing all contours. As shown in Figure 2, this quantization strategy separates positive and negative skew angles, so it may cause jumps between bins in the PDF matrix when dealing with ruling contours. We shall discuss this in detail in Section IV. In our experiments, all the features are computed on a text line basis.

Fig. 2: Quantization of the angular plane in feature extraction. (Figure labels: X and Y axes, contour direction, ruling contour, bin indices (12, 0) and (0, 12).)

We observe that the testing datasets in Bulacu and Schomaker's work [7] do not contain pre-printed rulings. Ruling lines complicate feature extraction because handwriting tends to overlap them. In this work, we follow the same strategy as in our previous work to deal with ruling lines, as shown in Figure 3.
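
The contour-hinge histogram described in Section II-B can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes a closed contour given as an ordered list of (x, y) points, and the names `angle_bin` and `hinge_pdf` are hypothetical. Only the 24-bin quantization of [0, 2π) and the 10-pixel hinge legs come from the text.

```python
import numpy as np

N_BINS = 24   # 2n angular bins over [0, 2*pi), with n = 12
LEG = 10      # hinge leg length in contour pixels

def angle_bin(dx, dy, n_bins=N_BINS):
    """Quantize the direction of the vector (dx, dy) into one of n_bins bins."""
    theta = np.arctan2(dy, dx) % (2 * np.pi)            # map to [0, 2*pi)
    return int(theta // (2 * np.pi / n_bins)) % n_bins  # guard the 2*pi edge

def hinge_pdf(contour, leg=LEG, n_bins=N_BINS):
    """Return the normalized (n_bins x n_bins) contour-hinge histogram.

    `contour` is an (m, 2) sequence of (x, y) points along a closed contour
    (an assumption of this sketch).
    """
    pts = np.asarray(contour, dtype=float)
    m = len(pts)
    pdf = np.zeros((n_bins, n_bins))
    for i in range(m):
        leg1 = pts[(i + leg) % m] - pts[i]   # segment ahead of the pivot
        leg2 = pts[(i - leg) % m] - pts[i]   # segment behind the pivot
        pdf[angle_bin(*leg1), angle_bin(*leg2)] += 1
    return pdf / pdf.sum()

# Usage: histogram of a synthetic ellipse contour.
t = np.linspace(0.0, 2 * np.pi, 400, endpoint=False)
ellipse = np.column_stack([60 * np.cos(t), 25 * np.sin(t)])
pdf = hinge_pdf(ellipse)
```

Each pivot casts one vote at (φ1, φ2), so the matrix sums to one after normalization and can be read as an estimate of the joint distribution of the two hinge directions.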

First, rulings are extracted by the detection algorithm described above. Then, when traversing the contours, we check whether the current pivot pixel lies on any ruling line. If so, we simply skip the computation of the pivot's angular indices in the PDF matrix. We show the effect of ruling compensation in Figure 3, where blue marks the valid contour pixels that contribute to the PDF matrix and red marks the ruling contour pixels.

In their paper [7], Bulacu and Schomaker use only half of the matrix as a feature vector (φ2 >= φ1), considering the other half redundant, which results in 300-D feature vectors (n(2n + 1) = 300, n = 12). We shall examine other options for generating feature vectors in Section IV, which may provide more discriminating power.

In general, there are two different ways to handle page skew during feature extraction. First, when traversing contours, we can explicitly subtract the page skew from the hinge segment angles. Second, we can transform the coordinates of the extracted contour pixels against the skew direction before the traversal. We investigate both methods in the experimental evaluation.

C. Writer Identification

Support Vector Machines (SVMs) are often used for writer ID. SVMs construct a maximum-margin hyperplane in a higher-dimensional vector space, where a classification problem that is not linearly separable in the original vector space may become linearly separable after the feature vectors are projected by a mapping function. The functions that define these mappings are called kernels in the literature. We
use the Radial Basis Function (RBF) kernel because it offers better discriminability than the linear kernel, while using fewer parameters than the polynomial kernel. In our experiments, we employ the libsvm tool [24] for writer ID, where we set the cost c = [...] and we normalize feature vectors into the unit hyper-cube.

Fig. 3: Dealing with rulings during feature extraction. In the lower half, blue pixels are valid contour pixels that contribute to the PDF matrix, while red pixels are ruling contours that are not counted in the PDF matrix.

Fig. 4: A workflow diagram showing the processing modules (page skew detection, page rotation, contour extraction, contour transformation) used by the Deskew Page, Subtract Skew, and Transform Contour methods during pre-processing and feature extraction. (φ1, φ2) denotes the actual angles used to index into the PDF matrix.

Fig. 5: Accumulated feature vector distances between the three methods in comparison (Deskew Page vs. Subtract Skew, Deskew Page vs. Transform Contour, Subtract Skew vs. Transform Contour), plotted against page skew in degrees.

III. EXPERIMENTAL SETUP

Our Arabic dataset was provided by the Linguistic Data Consortium (LDC) [25]. We randomly selected a subset with 61 writers in total, each of whom contributed 10 handwritten pages. Each page was scanned at 600 DPI with a bitonal setting. A typical page image is 5104 × 6600 pixels (width × height). We randomly divided the dataset into five folds, each having two pages from each writer. Each page was annotated with polygon bounding boxes for handwritten text lines. Using 5-fold cross-validation, each text line was tested once. In total, we used 4,893 text lines for experimental evaluation.

First, we examined different ways of handling page skew:

Deskew Page: this served as the baseline system, which rotates the image against the page skew direction, as in traditional pre-processing.

Subtract Skew: when traversing contours, this method subtracted the page skew from the angles of hinge segments, and then computed the indices in the PDF matrix.

Transform Contour: this method compensated for the page skew by transforming the coordinates of extracted contours during feature extraction.

As shown in Figure 4, all three methods involved page skew detection, contour extraction, and traversal. The baseline rotated the image during pre-processing while the two proposed methods did not. The three methods differ in the stage at which they handle page skew. Deskew Page compensates for page skew during pre-processing, by rotating the bitmap directly. Transform Contour first extracts the contours and then rotates them before computing the contour-hinge angles. Subtract Skew ignores the page skew until indexing into the PDF matrix. In theory, these methods should generate the same feature vector. Due to the discrete 2-D digital grid, however, they may generate significantly different feature vectors, as shown in Figure 5. We generated this figure by extracting features from a standard ellipse shape under different page skews in [−1.0°, 1.0°]. After extracting feature vectors with the three methods, we computed the accumulated distances between pairs of methods and plotted their distributions. As we can see, the distances are quite significant, and thus the methods may result in different writer ID performance.
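
The two feature-extraction-time methods can be sketched side by side. The code below is an illustrative reading of Subtract Skew and Transform Contour, not the authors' code: the function names, the optional `on_ruling` mask (a hypothetical per-pixel flag standing in for the ruling check of Section II-B), and the use of real-valued coordinates are all assumptions. With real-valued coordinates the two variants agree up to rounding, so this sketch does not reproduce the pixel-grid effects reported in Figure 5.

```python
import numpy as np

N_BINS = 24  # 24-bin quantization of [0, 2*pi)

def angle_bin(theta, n_bins=N_BINS):
    """Quantize an angle in radians into one of n_bins bins."""
    theta = theta % (2 * np.pi)
    return int(theta // (2 * np.pi / n_bins)) % n_bins

def hinge_pdf(contour, leg=10, angle_offset=0.0, on_ruling=None):
    """Contour-hinge histogram; angle_offset is subtracted from both angles."""
    pts = np.asarray(contour, dtype=float)
    m = len(pts)
    pdf = np.zeros((N_BINS, N_BINS))
    for i in range(m):
        if on_ruling is not None and on_ruling[i]:
            continue  # pivot lies on a ruling line: skip its vote
        d1 = pts[(i + leg) % m] - pts[i]
        d2 = pts[(i - leg) % m] - pts[i]
        phi1 = np.arctan2(d1[1], d1[0]) - angle_offset
        phi2 = np.arctan2(d2[1], d2[0]) - angle_offset
        pdf[angle_bin(phi1), angle_bin(phi2)] += 1
    total = pdf.sum()
    return pdf / total if total > 0 else pdf

def subtract_skew(contour, skew, **kwargs):
    """Subtract Skew: leave the contour alone, shift the hinge angles."""
    return hinge_pdf(contour, angle_offset=skew, **kwargs)

def transform_contour(contour, skew, **kwargs):
    """Transform Contour: rotate contour points by -skew, then extract."""
    c, s = np.cos(-skew), np.sin(-skew)
    rotation = np.array([[c, -s], [s, c]])
    rotated = np.asarray(contour, dtype=float) @ rotation.T
    return hinge_pdf(rotated, **kwargs)

# Usage: an ellipse contour skewed by 1 degree.
skew = np.radians(1.0)
t = np.linspace(0.0, 2 * np.pi, 400, endpoint=False) + 0.003
upright = np.column_stack([60 * np.cos(t), 25 * np.sin(t)])
c, s = np.cos(skew), np.sin(skew)
skewed = upright @ np.array([[c, -s], [s, c]]).T
gap = np.abs(subtract_skew(skewed, skew) - transform_contour(skewed, skew)).sum()
```

Rounding the rotated coordinates to the pixel grid, or rotating the bitmap before contour extraction as Deskew Page does, is where the variants would start to diverge.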

TABLE I: Writer ID performance on different methods.

            Deskew    Subtract Skew    Transform Contour
Fold 1      [...]     77.76%           75.73%
Fold 2      [...]     71.58%           78.48%
Fold 3      [...]     79.45%           80.19%
Fold 4      [...]     77.29%           81.81%
Fold 5      [...]     82.48%           81.40%
Average     78.14%    77.71%           79.52%

TABLE II: Options in implementing feature extraction.

                       Deskew    Rotate Contour
Half PDF               73.47%    74.89%
Full PDF               78.14%    79.52%
Adjust Quantization    81.42%    82.00%

Second, we examined the benefits of using the full PDF matrix as feature vectors:

Half PDF: this served as the baseline method, which used half of the PDF matrix as feature vectors (n(2n + 1) = 300-D, n = 12), as in [7].

Full PDF: this used the full matrix as feature vectors, so the feature vectors are (2n)^2 = 576-D, n = 12.

Finally, we discuss the effect of rulings on the quantization strategy, and also their combined impact on writer ID. In the experimental evaluation, all the experiments used the same SVM configuration and the full PDF matrix for features, except for the one that addresses this issue explicitly.

IV. EXPERIMENTAL RESULTS

A. Evaluation

First of all, we show the performance of our proposed systems, which compensated for page skew during feature extraction rather than rotating images during pre-processing. The experimental results are summarized in Table I. The baseline seemed to outperform the subtract-skew based system, but the statistical significance test showed that this performance difference (0.43%) is not significant. In other words, these two systems performed similarly. When we instead rotated the contours during feature extraction, we obtained a performance gain (1.4%) with statistical significance (at a confidence level of 99%). This result validated our hypothesis that it is possible to avoid the damage caused by rotating bitmaps during traditional pre-processing and to compensate for skew during feature extraction instead.
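
The Half PDF and Full PDF options differ only in which matrix entries are kept. The sketch below assumes a 24 × 24 PDF matrix indexed as `pdf[φ1][φ2]`. The helper names are hypothetical, and `transpose_distance` is one plausible way to quantify the asymmetry that the half-matrix option discards, taken here as the summed absolute difference between transposed entries.

```python
import numpy as np

def full_pdf_vector(pdf):
    """Full PDF: flatten the whole matrix, (2n)^2 = 576 values for n = 12."""
    return np.asarray(pdf, dtype=float).ravel()

def half_pdf_vector(pdf):
    """Half PDF: keep entries with phi2 >= phi1, n(2n + 1) = 300 values."""
    pdf = np.asarray(pdf, dtype=float)
    rows, cols = np.triu_indices(pdf.shape[0])  # upper triangle with diagonal
    return pdf[rows, cols]

def transpose_distance(pdf):
    """Sum of |M[i][j] - M[j][i]| over all pairs with j > i."""
    pdf = np.asarray(pdf, dtype=float)
    return np.abs(np.triu(pdf - pdf.T, k=1)).sum()

# Usage with a placeholder matrix (not real handwriting data).
demo = np.arange(576, dtype=float).reshape(24, 24)
full_vec = full_pdf_vector(demo)
half_vec = half_pdf_vector(demo)
```

A perfectly symmetric matrix gives a transpose distance of zero, in which case the half vector loses nothing; any nonzero distance is information that only the full vector retains.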
In the following discussions, we only used the transform-contour based system for performance comparison.

Next, we show why we chose to use the full PDF matrix as the feature vector for writer ID. In their work [7], Bulacu and Schomaker used half of the matrix as a feature vector, considering that the other half contains only redundant information. This idea assumes the contours are symmetric with respect to the horizontal axis, so that the PDF matrix is symmetric. We investigated this by rotating a standard ellipse by different angles for feature extraction. For each skew angle in [−1.0°, 1.0°], we computed the accumulated transpose matrix distance D in the PDF matrix M:

D = \sum_{i=1}^{2n} \sum_{j=i+1}^{2n} \left| M[i][j] - M[j][i] \right|    (1)

where n = 12. Then, we computed the average distance for each bin in the skew range. Likewise, we also computed this metric using all the text line images in our evaluation dataset. The difference between the two is shown in Figure 6. Although the PDF matrix of the text lines seemed symmetric, the distance between its transposed elements is significantly larger than that from a truly symmetric object. Hence, we considered that this difference might be useful information to exploit. The results in Table II validated our hypothesis, showing that the subtle differences in the transposed entries (shown in Figure 6) played an important role in identifying writers. All methods obtained a large performance gain over the baseline, which used only half of the PDF matrix.

Fig. 6: Transposed PDF matrix distance of different objects (y-axis: accumulated transpose distance in the PDF matrix; x-axis: page skew in degrees; curves: ellipse and real handwriting images). The ellipse is rotated at different angles in [−1.0°, 1.0°] and the other curve is summarized from the evaluation dataset.

Finally, we discuss a subtle but practical issue when dealing with page skew using angle subtraction during feature extraction. One strategy for quantizing the angular plane is shown in Figure 2; it separates positive and negative angles, and this works fine when no rulings are present.
It will, however, cause problems when rulings are prevalent, as in our evaluation dataset. Without loss of generality, suppose the contours on ruling lines are traversed in the counterclockwise direction. Since the page skew is usually small in our dataset (±1°), locally the contour hinge segments are always nearly horizontal, thus the matrix indices are (φ1, φ2) = (0, 12) and (φ1, φ2) = (12, 0) (Figure 2). For positive skew, subtracting it shifts the indices to (23, 11) and (11, 23), respectively. For negative skew, however, subtracting it from the hinge segment angles does not change the indices. This is undesirable because the PDF matrices now vary significantly just because of the direction of the page skew. The same holds when the contours are
traversed in the other direction.

There are two ways to solve this issue. One is to detect rulings and exclude their contours, as we did in our experiments. The other, which we also tried, is to adjust the quantization strategy so that the 0th bin covers both positive and negative angles. This was done by rotating the x-/y-axes counterclockwise by 15°/2 = 7.5°. After this adjustment, we conducted the experiment with the baseline and also the transform-contour based system, and found that we obtained significant performance gains, as shown in Table II. Again, the transform-contour based system outperformed the deskewing based system with statistical significance.

B. Statistical Significance Test

In the experimental evaluation, all the performance losses and gains were validated using the McNemar test [26]:

Z^2 = \frac{\left( |n_{01} - n_{10}| - 1 \right)^2}{n_{10} + n_{01}}    (2)

where we first divided the misclassified samples into two groups and then stated the hypothesis test (F_0 and F_1 denote the performance of the baseline system and the proposed system, respectively):

n_{01}: number of samples misclassified by the proposed system, but not by the baseline system.
n_{10}: number of samples misclassified by the baseline system, but not by the proposed system.
Null Hypothesis H_0: F_0 = F_1.
Alternative Hypothesis H_1: F_0 < F_1.

The test statistic Z^2 approximately follows the χ^2 distribution with 1 degree of freedom. Looking this up in the χ^2 table, we concluded that the performance gains we obtained and reported here are statistically significant at a confidence level of 99%.

V. CONCLUSION

Traditional pre-processing techniques usually modify the original image before later stages of document analysis are applied. For example, images are modified by rotating bitmaps during deskewing. In this paper, we investigated the impact of image rotation and proposed methods to compensate for page skew while retaining all the information in the original image.
Experimental results involving 61 writers with 610 Arabic handwritten documents showed that our methods performed better than the deskewing-based method. In addition, we examined the complexity of feature extraction when dealing with pre-printed rulings, and showed how to adapt the quantization strategy as well as the benefits of exploiting the full PDF matrix for writer ID. For future work, we plan to examine the fundamental reasons why deskewing tends to modify the contour characteristics of handwritten text lines.

ACKNOWLEDGEMENT

The authors acknowledge insightful discussions with George Nagy on the idea of not altering input images through pre-processing. This work is supported by a DARPA IPTO grant administered by Raytheon BBN Technologies.

REFERENCES

[1] A. Schlapbach and H. Bunke, "Writer identification using an HMM-based handwriting recognition system: to normalize the input or not?" in Proc. of the 12th International Graphonomics Society, 2005.
[2] R. Sabourin and J. Drouhard, "Off-line signature verification using directional PDF and Neural Networks," in Proc. of the International Conference on Pattern Recognition, Vancouver, BC, Canada, 1992.
[3] E. Zois and V. Anastassopoulos, "Morphological waveform coding for writer identification," Pattern Recognition, vol. 33.
[4] H. Said, T. Tan, and K. Baker, "Personal identification based on handwriting," Pattern Recognition, vol. 33.
[5] C. Hertel and H. Bunke, "A set of novel features for writer identification," J. Kittler and M. Nixon, Eds. Springer.
[6] B. Li, Z. Sun, and T. Tan, "Hierarchical shape primitive features for online text-independent writer identification," in Proc. of the 10th International Conference on Document Analysis and Recognition, Barcelona, Spain, August 2009.
[7] M. Bulacu and L. Schomaker, "Text-independent writer identification and verification using textural and allographic features," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29.
[8] S. Fiel and R. Sablatnig, "Writer retrieval and writer identification using local features," in Proc. of the 10th International Workshop on Document Analysis Systems, 2012.
[9] A. Schlapbach and H. Bunke, "A writer identification and verification system using HMM based recognizers," Pattern Analysis and Applications, vol. 10.
[10] Y. Yamazaki, T. Nagao, and N. Komatsu, "Text-indicated writer verification using Hidden Markov Models," in Proc. of the International Conference on Document Analysis and Recognition, 2003.
[11] A. Schlapbach and H. Bunke, "Off-line writer identification using Gaussian Mixture Models," in Proc. of the 18th International Conference on Pattern Recognition, 2006.
[12] A. Schlapbach and H. Bunke, "Off-line writer identification and verification using Gaussian Mixture Models," Studies in Computational Intelligence, vol. 90.
[13] E. Justino, F. Bortolozzi, and R. Sabourin, "A comparison of SVM and HMM classifiers in the off-line signature verification," Pattern Recognition Letters, vol. 26.
[14] A. Bensefia, T. Paquet, and L. Heutte, "A writer identification and verification system," Pattern Recognition Letters, vol. 26.
[15] I. Siddiqi and N. Vincent, "A set of chain code based features for writer recognition," in Proc. of the 10th International Conference on Document Analysis and Recognition, 2009.
[16] G. Louloudis, N. Stamatopoulos, and B. Gatos, "ICDAR 2011 writer identification contest," in Proc. of the 11th International Conference on Document Analysis and Recognition, 2011.
[17] H. Cao and V. Govindaraju, "Handwritten carbon form preprocessing based on Markov random field," in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition.
[18] W. Abd-Almageed, J. Kumar, and D. Doermann, "Page rule-line removal using linear subspaces in monochromatic handwritten Arabic documents," in Proc. of the 10th International Conference on Document Analysis and Recognition, 2009.
[19] K. Arvind, J. Kumar, and A. Ramakrishnan, "Line removal and restoration of handwritten strokes," in Proc. of the 7th International Conference on Computational Intelligence and Multimedia Applications, 2007.
[20] H. Cao, R. Prasad, and P. Natarajan, "A stroke regeneration method for cleaning rule-lines in handwritten document images," in Proc. of the MOCR workshop at the 10th International Conference on Document Analysis and Recognition.
[21] J. Chen and D. Lopresti, "Exploiting ruling line artifacts in writer identification," in Proc. of the International Conference on Pattern Recognition, September 2012.
[22] J. Chen and D. Lopresti, "A model-based ruling line detection algorithm for noisy handwritten documents," in Proc. of the 11th International Conference on Document Analysis and Recognition, September 2011.
[23] B. Gatos, D. Danatsas, I. Pratikakis, and S. Perantonis, "Automatic table detection in document images," in Proc. of the Third International Conference on Advances in Pattern Recognition, 2005.
[24] C.-C. Chang and C.-J. Lin, "LIBSVM: a library for support vector machines," 2001, software available at cjlin/libsvm.
[25] S. Strassel, "Linguistic resources for Arabic handwriting recognition," in Proc. of the Second International Conference on Arabic Language Resources and Tools, Cairo, Egypt.
[26] T. Dietterich, "Approximate statistical tests for comparing supervised classification learning algorithms," Neural Computation, vol. 10, 1998.
Invariant Recognition of Hand-Drawn Pictograms Using HMMs with a Rotating Feature Extraction Stefan Müller, Gerhard Rigoll, Andreas Kosmala and Denis Mazurenok Department of Computer Science, Faculty of
More informationFully Automatic Methodology for Human Action Recognition Incorporating Dynamic Information
Fully Automatic Methodology for Human Action Recognition Incorporating Dynamic Information Ana González, Marcos Ortega Hortas, and Manuel G. Penedo University of A Coruña, VARPA group, A Coruña 15071,
More informationAvailable online at ScienceDirect. Procedia Computer Science 45 (2015 )
Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 45 (2015 ) 205 214 International Conference on Advanced Computing Technologies and Applications (ICACTA- 2015) Automatic
More informationAutomatic removal of crossed-out handwritten text and the effect on writer verification and identification
Automatic removal of crossed-out handwritten text and the effect on writer verification and identification (The original paper was published in: Proc. of Document Recognition and Retrieval XV, IS&T/SPIE
More informationA Non-Rigid Feature Extraction Method for Shape Recognition
A Non-Rigid Feature Extraction Method for Shape Recognition Jon Almazán, Alicia Fornés, Ernest Valveny Computer Vision Center Dept. Ciències de la Computació Universitat Autònoma de Barcelona Bellaterra,
More informationFacial Expression Classification with Random Filters Feature Extraction
Facial Expression Classification with Random Filters Feature Extraction Mengye Ren Facial Monkey mren@cs.toronto.edu Zhi Hao Luo It s Me lzh@cs.toronto.edu I. ABSTRACT In our work, we attempted to tackle
More informationAn Improvement Study for Optical Character Recognition by using Inverse SVM in Image Processing Technique
An Improvement Study for Optical Character Recognition by using Inverse SVM in Image Processing Technique I Dinesh KumarVerma, II Anjali Khatri I Assistant Professor (ECE) PDM College of Engineering, Bahadurgarh,
More informationA Novel Smoke Detection Method Using Support Vector Machine
A Novel Smoke Detection Method Using Support Vector Machine Hidenori Maruta Information Media Center Nagasaki University, Japan 1-14 Bunkyo-machi, Nagasaki-shi Nagasaki, Japan Email: hmaruta@nagasaki-u.ac.jp
More information1 Case study of SVM (Rob)
DRAFT a final version will be posted shortly COS 424: Interacting with Data Lecturer: Rob Schapire and David Blei Lecture # 8 Scribe: Indraneel Mukherjee March 1, 2007 In the previous lecture we saw how
More informationTracing and Straightening the Baseline in Handwritten Persian/Arabic Text-line: A New Approach Based on Painting-technique
Tracing and Straightening the Baseline in Handwritten Persian/Arabic Text-line: A New Approach Based on Painting-technique P. Nagabhushan and Alireza Alaei 1,2 Department of Studies in Computer Science,
More informationOnline Mathematical Symbol Recognition using SVMs with Features from Functional Approximation
Online Mathematical Symbol Recognition using SVMs with Features from Functional Approximation Birendra Keshari and Stephen M. Watt Ontario Research Centre for Computer Algebra Department of Computer Science
More informationCursive Handwriting Recognition System Using Feature Extraction and Artificial Neural Network
Cursive Handwriting Recognition System Using Feature Extraction and Artificial Neural Network Utkarsh Dwivedi 1, Pranjal Rajput 2, Manish Kumar Sharma 3 1UG Scholar, Dept. of CSE, GCET, Greater Noida,
More informationObject Classification Using Tripod Operators
Object Classification Using Tripod Operators David Bonanno, Frank Pipitone, G. Charmaine Gilbreath, Kristen Nock, Carlos A. Font, and Chadwick T. Hawley US Naval Research Laboratory, 4555 Overlook Ave.
More informationTable of Contents. Recognition of Facial Gestures... 1 Attila Fazekas
Table of Contents Recognition of Facial Gestures...................................... 1 Attila Fazekas II Recognition of Facial Gestures Attila Fazekas University of Debrecen, Institute of Informatics
More informationSoftware Documentation of the Potential Support Vector Machine
Software Documentation of the Potential Support Vector Machine Tilman Knebel and Sepp Hochreiter Department of Electrical Engineering and Computer Science Technische Universität Berlin 10587 Berlin, Germany
More informationPreliminary Local Feature Selection by Support Vector Machine for Bag of Features
Preliminary Local Feature Selection by Support Vector Machine for Bag of Features Tetsu Matsukawa Koji Suzuki Takio Kurita :University of Tsukuba :National Institute of Advanced Industrial Science and
More informationConservative preprocessing of document images
IJDAR DOI 10.1007/s10032-016-0273-3 ORIGINAL PAPER Conservative preprocessing of document images Jin Chen 1 Daniel Lopresti 2 George Nagy 3 Received: 14 October 2015 / Revised: 16 August 2016 / Accepted:
More informationRadial Basis Function Neural Network Classifier
Recognition of Unconstrained Handwritten Numerals by a Radial Basis Function Neural Network Classifier Hwang, Young-Sup and Bang, Sung-Yang Department of Computer Science & Engineering Pohang University
More informationWriter Identification In Music Score Documents Without Staff-Line Removal
Writer Identification In Music Score Documents Without Staff-Line Removal Anirban Hati, Partha P. Roy and Umapada Pal Computer Vision and Pattern Recognition Unit Indian Statistical Institute Kolkata,
More informationStructure in On-line Documents
Structure in On-line Documents Anil K. Jain and Anoop M. Namboodiri Department of Comp. Sci. and Engg. Michigan State University East Lansing, MI 4884 fjain, anoopg@cse.msu.edu Jayashree Subrahmonia IBM
More informationThe Interpersonal and Intrapersonal Variability Influences on Off- Line Signature Verification Using HMM
The Interpersonal and Intrapersonal Variability Influences on Off- Line Signature Verification Using HMM EDSON J. R. JUSTINO 1 FLÁVIO BORTOLOZZI 1 ROBERT SABOURIN 2 1 PUCPR - Pontifícia Universidade Católica
More informationFace Recognition Using Vector Quantization Histogram and Support Vector Machine Classifier Rong-sheng LI, Fei-fei LEE *, Yan YAN and Qiu CHEN
2016 International Conference on Artificial Intelligence: Techniques and Applications (AITA 2016) ISBN: 978-1-60595-389-2 Face Recognition Using Vector Quantization Histogram and Support Vector Machine
More informationLearning to Recognize Faces in Realistic Conditions
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
More informationRecognition of online captured, handwritten Tamil words on Android
Recognition of online captured, handwritten Tamil words on Android A G Ramakrishnan and Bhargava Urala K Medical Intelligence and Language Engineering (MILE) Laboratory, Dept. of Electrical Engineering,
More informationRobust PDF Table Locator
Robust PDF Table Locator December 17, 2016 1 Introduction Data scientists rely on an abundance of tabular data stored in easy-to-machine-read formats like.csv files. Unfortunately, most government records
More informationENSEMBLE RANDOM-SUBSET SVM
ENSEMBLE RANDOM-SUBSET SVM Anonymous for Review Keywords: Abstract: Ensemble Learning, Bagging, Boosting, Generalization Performance, Support Vector Machine In this paper, the Ensemble Random-Subset SVM
More informationThe Effects of Outliers on Support Vector Machines
The Effects of Outliers on Support Vector Machines Josh Hoak jrhoak@gmail.com Portland State University Abstract. Many techniques have been developed for mitigating the effects of outliers on the results
More informationLinear Discriminant Analysis in Ottoman Alphabet Character Recognition
Linear Discriminant Analysis in Ottoman Alphabet Character Recognition ZEYNEB KURT, H. IREM TURKMEN, M. ELIF KARSLIGIL Department of Computer Engineering, Yildiz Technical University, 34349 Besiktas /
More informationAccelerometer Gesture Recognition
Accelerometer Gesture Recognition Michael Xie xie@cs.stanford.edu David Pan napdivad@stanford.edu December 12, 2014 Abstract Our goal is to make gesture-based input for smartphones and smartwatches accurate
More informationRobust Shape Retrieval Using Maximum Likelihood Theory
Robust Shape Retrieval Using Maximum Likelihood Theory Naif Alajlan 1, Paul Fieguth 2, and Mohamed Kamel 1 1 PAMI Lab, E & CE Dept., UW, Waterloo, ON, N2L 3G1, Canada. naif, mkamel@pami.uwaterloo.ca 2
More informationUnsupervised Feature Selection Using Multi-Objective Genetic Algorithms for Handwritten Word Recognition
Unsupervised Feature Selection Using Multi-Objective Genetic Algorithms for Handwritten Word Recognition M. Morita,2, R. Sabourin 3, F. Bortolozzi 3 and C. Y. Suen 2 École de Technologie Supérieure, Montreal,
More informationMobile Human Detection Systems based on Sliding Windows Approach-A Review
Mobile Human Detection Systems based on Sliding Windows Approach-A Review Seminar: Mobile Human detection systems Njieutcheu Tassi cedrique Rovile Department of Computer Engineering University of Heidelberg
More informationKeywords Connected Components, Text-Line Extraction, Trained Dataset.
Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Language Independent
More informationModeling of High-Dimensional Data in Object Recognition
International Journal of Innovative Research in Electronics and Communications (IJIREC) Volume 4, Issue 1, 2017, PP 27-41 ISSN 2349-4042 (Print) & ISSN 2349-4050 (Online) DOI: http://dx.doi.org/10.20431/2349-4050.0401005
More informationRecognition of Animal Skin Texture Attributes in the Wild. Amey Dharwadker (aap2174) Kai Zhang (kz2213)
Recognition of Animal Skin Texture Attributes in the Wild Amey Dharwadker (aap2174) Kai Zhang (kz2213) Motivation Patterns and textures are have an important role in object description and understanding
More informationA Framework for Efficient Fingerprint Identification using a Minutiae Tree
A Framework for Efficient Fingerprint Identification using a Minutiae Tree Praveer Mansukhani February 22, 2008 Problem Statement Developing a real-time scalable minutiae-based indexing system using a
More informationMachine Learning and Pervasive Computing
Stephan Sigg Georg-August-University Goettingen, Computer Networks 17.12.2014 Overview and Structure 22.10.2014 Organisation 22.10.3014 Introduction (Def.: Machine learning, Supervised/Unsupervised, Examples)
More informationTexture Analysis of Painted Strokes 1) Martin Lettner, Paul Kammerer, Robert Sablatnig
Texture Analysis of Painted Strokes 1) Martin Lettner, Paul Kammerer, Robert Sablatnig Vienna University of Technology, Institute of Computer Aided Automation, Pattern Recognition and Image Processing
More informationA Fast Caption Detection Method for Low Quality Video Images
2012 10th IAPR International Workshop on Document Analysis Systems A Fast Caption Detection Method for Low Quality Video Images Tianyi Gui, Jun Sun, Satoshi Naoi Fujitsu Research & Development Center CO.,
More informationFace Detection using Hierarchical SVM
Face Detection using Hierarchical SVM ECE 795 Pattern Recognition Christos Kyrkou Fall Semester 2010 1. Introduction Face detection in video is the process of detecting and classifying small images extracted
More informationAdaptive Learning of an Accurate Skin-Color Model
Adaptive Learning of an Accurate Skin-Color Model Q. Zhu K.T. Cheng C. T. Wu Y. L. Wu Electrical & Computer Engineering University of California, Santa Barbara Presented by: H.T Wang Outline Generic Skin
More informationA System for Joining and Recognition of Broken Bangla Numerals for Indian Postal Automation
A System for Joining and Recognition of Broken Bangla Numerals for Indian Postal Automation K. Roy, U. Pal and B. B. Chaudhuri CVPR Unit; Indian Statistical Institute, Kolkata-108; India umapada@isical.ac.in
More informationNOVATEUR PUBLICATIONS INTERNATIONAL JOURNAL OF INNOVATIONS IN ENGINEERING RESEARCH AND TECHNOLOGY [IJIERT] ISSN: VOLUME 2, ISSUE 1 JAN-2015
Offline Handwritten Signature Verification using Neural Network Pallavi V. Hatkar Department of Electronics Engineering, TKIET Warana, India Prof.B.T.Salokhe Department of Electronics Engineering, TKIET
More informationCharacter Recognition
Character Recognition 5.1 INTRODUCTION Recognition is one of the important steps in image processing. There are different methods such as Histogram method, Hough transformation, Neural computing approaches
More informationCS 231A Computer Vision (Fall 2012) Problem Set 3
CS 231A Computer Vision (Fall 2012) Problem Set 3 Due: Nov. 13 th, 2012 (2:15pm) 1 Probabilistic Recursion for Tracking (20 points) In this problem you will derive a method for tracking a point of interest
More informationA Model-based Line Detection Algorithm in Documents
A Model-based Line Detection Algorithm in Documents Yefeng Zheng, Huiping Li, David Doermann Laboratory for Language and Media Processing Institute for Advanced Computer Studies University of Maryland,
More informationRESTORATION OF DEGRADED DOCUMENTS USING IMAGE BINARIZATION TECHNIQUE
RESTORATION OF DEGRADED DOCUMENTS USING IMAGE BINARIZATION TECHNIQUE K. Kaviya Selvi 1 and R. S. Sabeenian 2 1 Department of Electronics and Communication Engineering, Communication Systems, Sona College
More informationWriter Identification from Gray Level Distribution
Writer Identification from Gray Level Distribution M. WIROTIUS 1, A. SEROPIAN 2, N. VINCENT 1 1 Laboratoire d'informatique Université de Tours FRANCE vincent@univ-tours.fr 2 Laboratoire d'optique Appliquée
More informationLearning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009
Learning and Inferring Depth from Monocular Images Jiyan Pan April 1, 2009 Traditional ways of inferring depth Binocular disparity Structure from motion Defocus Given a single monocular image, how to infer
More informationA Method of Annotation Extraction from Paper Documents Using Alignment Based on Local Arrangements of Feature Points
A Method of Annotation Extraction from Paper Documents Using Alignment Based on Local Arrangements of Feature Points Tomohiro Nakai, Koichi Kise, Masakazu Iwamura Graduate School of Engineering, Osaka
More informationFacial expression recognition using shape and texture information
1 Facial expression recognition using shape and texture information I. Kotsia 1 and I. Pitas 1 Aristotle University of Thessaloniki pitas@aiia.csd.auth.gr Department of Informatics Box 451 54124 Thessaloniki,
More informationTypes of Edges. Why Edge Detection? Types of Edges. Edge Detection. Gradient. Edge Detection
Why Edge Detection? How can an algorithm extract relevant information from an image that is enables the algorithm to recognize objects? The most important information for the interpretation of an image
More informationBagging and Boosting Algorithms for Support Vector Machine Classifiers
Bagging and Boosting Algorithms for Support Vector Machine Classifiers Noritaka SHIGEI and Hiromi MIYAJIMA Dept. of Electrical and Electronics Engineering, Kagoshima University 1-21-40, Korimoto, Kagoshima
More informationHuman Motion Detection and Tracking for Video Surveillance
Human Motion Detection and Tracking for Video Surveillance Prithviraj Banerjee and Somnath Sengupta Department of Electronics and Electrical Communication Engineering Indian Institute of Technology, Kharagpur,
More informationA New Algorithm for Detecting Text Line in Handwritten Documents
A New Algorithm for Detecting Text Line in Handwritten Documents Yi Li 1, Yefeng Zheng 2, David Doermann 1, and Stefan Jaeger 1 1 Laboratory for Language and Media Processing Institute for Advanced Computer
More informationCase-Based Reasoning. CS 188: Artificial Intelligence Fall Nearest-Neighbor Classification. Parametric / Non-parametric.
CS 188: Artificial Intelligence Fall 2008 Lecture 25: Kernels and Clustering 12/2/2008 Dan Klein UC Berkeley Case-Based Reasoning Similarity for classification Case-based reasoning Predict an instance
More informationCS 188: Artificial Intelligence Fall 2008
CS 188: Artificial Intelligence Fall 2008 Lecture 25: Kernels and Clustering 12/2/2008 Dan Klein UC Berkeley 1 1 Case-Based Reasoning Similarity for classification Case-based reasoning Predict an instance
More informationHMM-based Indic Handwritten Word Recognition using Zone Segmentation
HMM-based Indic Handwritten Word Recognition using Zone Segmentation a Partha Pratim Roy*, b Ayan Kumar Bhunia, b Ayan Das, c Prasenjit Dey, d Umapada Pal a Dept. of CSE, Indian Institute of Technology
More informationCHAPTER 8 COMPOUND CHARACTER RECOGNITION USING VARIOUS MODELS
CHAPTER 8 COMPOUND CHARACTER RECOGNITION USING VARIOUS MODELS 8.1 Introduction The recognition systems developed so far were for simple characters comprising of consonants and vowels. But there is one
More informationWord Slant Estimation using Non-Horizontal Character Parts and Core-Region Information
2012 10th IAPR International Workshop on Document Analysis Systems Word Slant using Non-Horizontal Character Parts and Core-Region Information A. Papandreou and B. Gatos Computational Intelligence Laboratory,
More information2. LITERATURE REVIEW
2. LITERATURE REVIEW CBIR has come long way before 1990 and very little papers have been published at that time, however the number of papers published since 1997 is increasing. There are many CBIR algorithms
More informationHMM-Based Handwritten Amharic Word Recognition with Feature Concatenation
009 10th International Conference on Document Analysis and Recognition HMM-Based Handwritten Amharic Word Recognition with Feature Concatenation Yaregal Assabie and Josef Bigun School of Information Science,
More informationExperimentation on the use of Chromaticity Features, Local Binary Pattern and Discrete Cosine Transform in Colour Texture Analysis
Experimentation on the use of Chromaticity Features, Local Binary Pattern and Discrete Cosine Transform in Colour Texture Analysis N.Padmapriya, Ovidiu Ghita, and Paul.F.Whelan Vision Systems Laboratory,
More informationFace Recognition using SURF Features and SVM Classifier
International Journal of Electronics Engineering Research. ISSN 0975-6450 Volume 8, Number 1 (016) pp. 1-8 Research India Publications http://www.ripublication.com Face Recognition using SURF Features
More informationMachine Learning for NLP
Machine Learning for NLP Support Vector Machines Aurélie Herbelot 2018 Centre for Mind/Brain Sciences University of Trento 1 Support Vector Machines: introduction 2 Support Vector Machines (SVMs) SVMs
More informationCS 223B Computer Vision Problem Set 3
CS 223B Computer Vision Problem Set 3 Due: Feb. 22 nd, 2011 1 Probabilistic Recursion for Tracking In this problem you will derive a method for tracking a point of interest through a sequence of images.
More informationExploring Similarity Measures for Biometric Databases
Exploring Similarity Measures for Biometric Databases Praveer Mansukhani, Venu Govindaraju Center for Unified Biometrics and Sensors (CUBS) University at Buffalo {pdm5, govind}@buffalo.edu Abstract. Currently
More information