ANALYTIC WORD RECOGNITION WITHOUT SEGMENTATION BASED ON MARKOV RANDOM FIELDS

CHRISTOPHE CHOISY AND ABDEL BELAID
LORIA/CNRS, Campus scientifique, BP 239, 54506 Vandoeuvre-les-Nancy cedex, France
Christophe.Choisy@loria.fr, Abdel.Belaid@loria.fr

In this paper, a method for analytic handwritten word recognition based on causal Markov random fields is described. The word models are HMMs in which each state corresponds to a letter; each letter is modelled by an NSHP-HMM (a Markov field). Global models are built dynamically and are used for recognition and learning with the Baum-Welch algorithm. Letter and word models are learned using the parameters reestimated on the generated global models. No segmentation is necessary: the system determines by itself the best boundaries between letters during learning. First experiments on a real base of French check amount words give an encouraging recognition rate of 83.4%.

Keywords: HMM, NSHP-HMM, Cross-learning, Meta-models, Baum-Welch algorithm.

1 Introduction

In recent years, research on handwriting recognition has shown that 2D approaches give better results than 1D ones (1,2,3,4,5,6), because they take better into account the essentially two-dimensional nature of writing. The 2D estimator can be either a Neural Network (NN) or a planar HMM (PHMM). The NN can be applied either on letters (6) or on segments (7); the major drawback of these classifiers is their lack of elasticity. The PHMM has been applied with success in many works (2,3,4). This model, based on HMMs, has 2D elasticity properties but requires an independence hypothesis between columns which is not always true in practice.

SAON proposed, in our team, a new model based on Markov fields: the NSHP-HMM (8). Its architecture, based on an HMM, gives it a horizontal elasticity that allows it to adapt to the length of the analysed patterns. Using a 2D neighborhood of pixels, it overcomes the column-independence hypothesis of the PHMMs. It applies to binary patterns and is easy to use.

All these models were applied in a model-discriminant, global approach to word recognition. However, this approach has some limits. In particular, the NSHP-HMM uses a large number of parameters (cf. Section 3). Another classical limit is the restricted and fixed vocabulary. To overcome these limits, an analytic approach is proposed: models for letters are smaller than models for words, and working with letters allows the vocabulary to be extended without limit. Many works on analytic word recognition are based on a segmentation into graphemes (5,9,10).

This segmentation is generally based on topological criteria and cannot be 100% reliable (9). It seems better to let the system decide the best letter boundaries itself. While many works use dynamic time warping algorithms to learn and recognize words, our system is based on the Baum-Welch algorithm, which guarantees convergence to a local optimum. The proposed method is a dynamic generation of word models, based on letter models and on HMMs in which states represent letters. The reestimation of the letter models and of the transitions between letters is made by cross-learning. This technique, directly derived from the Baum-Welch reestimation formulas, consists in crossing the information relative to the letters over the different word models. The use of the Baum-Welch algorithm allows the system to find the best repartition of parameters in the models, knowing only the label of each word learned.

2 HMM Structure

HMMs are already well known in automatic handwriting recognition. RABINER (11) gives the bases of these stochastic models. Following RABINER's notation, we define a discrete first-order HMM by:

- S = {s_1, ..., s_N}: the N states of the model; the state at time t is denoted q_t ∈ S.
- V = {v_1, ..., v_M}: the M observation symbols; the observation at time t is denoted O_t ∈ V.
- A = {a_ij}, 1 ≤ i, j ≤ N: the transition probability matrix, with a_ij = P(q_{t+1} = s_j | q_t = s_i).
- B = {b_j(k)}, 1 ≤ j ≤ N, 1 ≤ k ≤ M: the observation probability matrix, with b_j(k) = P(O_t = v_k | q_t = s_j).

Two specific states D and F are introduced: the probabilities of starting or ending in a state are modelled by the transitions between D (respectively F) and that state.

3 Non-Symmetric Half-Plane Hidden Markov Model

The NSHP-HMM is a stochastic model of the Markov field type. It has shown very good performance on French check amount word recognition (8,12). It runs directly on binary images, which are analysed column by column. The use of 2D pixel neighborhoods enables the system to take better into account the 2D nature of writing. Its architecture, based on an HMM, gives it a horizontal elasticity that makes it easy to adapt to various image widths.

Each column is observed in one state. Its observation probability is computed as the product of elementary probabilities, one per pixel of the column. Each elementary probability is conditioned on a fixed neighborhood lying in the half-plane analysed before the pixel, which overcomes the problem of correlation between adjacent columns. Training and recognition methods are described in (8). The parameters of an NSHP-HMM are the height of the analysed columns, the size of the neighborhood (the order of the model) and the number of states of the underlying HMM.
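To make the column-wise observation concrete, here is a minimal sketch of how one state of a letter NSHP-HMM could score a column of a binary image: a product, over the pixels of the column, of Bernoulli probabilities, each conditioned on a small neighborhood of already-analysed pixels. The paper gives no code; the function names, the NEIGHBORHOOD layout and the state_pixel_probs table are illustrative assumptions (a neighborhood of 3 pixels is used here, the size chosen in the experiments of Section 7).

```python
import numpy as np

# Illustrative neighborhood: for pixel (row r, column c), condition on pixels
# already analysed in the non-symmetric half-plane (the pixel above and two
# pixels of the previous column). The exact shape is a design parameter.
NEIGHBORHOOD = [(-1, 0), (0, -1), (-1, -1)]  # (dr, dc) offsets, order 3

def neighborhood_context(image, r, c):
    """Encode the binary values of the neighborhood pixels as an integer index."""
    h, _ = image.shape
    bits = 0
    for dr, dc in NEIGHBORHOOD:
        rr, cc = r + dr, c + dc
        v = image[rr, cc] if (0 <= rr < h and cc >= 0) else 0  # outside image -> white
        bits = (bits << 1) | int(v)
    return bits

def column_log_prob(image, c, state_pixel_probs):
    """Log-probability of column c under one state of a letter NSHP-HMM.

    state_pixel_probs[r, context] = P(pixel (r, c) is black | context, state);
    the column probability is the product of these per-pixel terms.
    """
    h, _ = image.shape
    logp = 0.0
    for r in range(h):
        ctx = neighborhood_context(image, r, c)
        p_black = state_pixel_probs[r, ctx]
        logp += np.log(p_black if image[r, c] else 1.0 - p_black)
    return logp
```

With a column height of 20 and an order of 3, each state stores a 20 x 8 table of pixel probabilities, which is why the neighborhood size directly drives the number of parameters discussed in Section 7.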

4 Word Modeling

For word modeling we use meta-HMMs in which each meta-state represents a letter. Starting from a meta-model, we build a global NSHP-HMM by connecting the NSHP-HMMs associated with the letter meta-states. Let i_x denote state i of the model associated with meta-state x, and let D_x and F_x be the initial and final states of this model. D_m and F_m are the specific states associated with the meta-model. Each state sequence of the form i_x → F_x → D_y → j_y is replaced by a single transition i_x → j_y, whose value is the product of the transitions between these states:

P(j_y | i_x) = P(j_y | D_y) · P(D_y | F_x) · P(F_x | i_x)

Following the same idea, we obtain:

P(i_x | D_m) = P(i_x | D_x) · P(D_x | D_m)
P(F_m | i_x) = P(F_m | F_x) · P(F_x | i_x)

The model obtained is an NSHP-HMM which can be applied like a global model. The meta-states must not loop on themselves, because such a loop would create transitions between states of the associated letter HMM and overwrite the existing ones.
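This construction amounts to building one large transition structure from the letter matrices and the meta-transitions. The following sketch applies the three product formulas of this section; LetterHMM, compose_word_model and the argument layout are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class LetterHMM:
    A: np.ndarray        # (n, n) internal transitions a_{i_x j_x}
    start: np.ndarray    # (n,)  P(i_x | D_x)
    end: np.ndarray      # (n,)  P(F_x | i_x)

def compose_word_model(letters, meta_A, meta_start, meta_end):
    """Build the global transition structure of a word model.

    letters:    list of LetterHMM, one per meta-state (letter) of the word
    meta_A:     (L, L) meta-transitions, meta_A[x, y] = P(D_y | F_x)
    meta_start: (L,)   P(D_x | D_m)
    meta_end:   (L,)   P(F_m | F_x)
    """
    sizes = [m.A.shape[0] for m in letters]
    offs = np.cumsum([0] + sizes)            # offset of each letter's state block
    n = offs[-1]
    A = np.zeros((n, n))
    start = np.zeros(n)                       # P(i_x | D_m)
    end = np.zeros(n)                          # P(F_m | i_x)
    for x, mx in enumerate(letters):
        sl_x = slice(offs[x], offs[x + 1])
        A[sl_x, sl_x] = mx.A                   # internal transitions unchanged
        start[sl_x] = mx.start * meta_start[x]            # P(i_x|D_x) · P(D_x|D_m)
        end[sl_x] = mx.end * meta_end[x]                   # P(F_m|F_x) · P(F_x|i_x)
        for y, my in enumerate(letters):
            if y == x:
                continue
            sl_y = slice(offs[y], offs[y + 1])
            # i_x -> j_y replaces i_x -> F_x -> D_y -> j_y:
            # P(j_y|i_x) = P(j_y|D_y) · P(D_y|F_x) · P(F_x|i_x)
            A[sl_x, sl_y] += np.outer(mx.end, my.start) * meta_A[x, y]
    return A, start, end
```

The returned triple (A, start, end) plays the role of an ordinary HMM transition structure, with the column observation probabilities supplied by the letter NSHP-HMMs, so it can be fed directly to forward-backward or Viterbi routines.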

5 Cross-learning

Cross-learning consists in crossing the information of the words to determine the information corresponding to each letter. The reestimation of the letter models is made using the reestimation of the word models. The method is derived from Baum-Welch training, considering that each state of each word model also belongs to a letter model. The reestimation of the transitions between specific and normal states of the letter models is made through the transitions generated between letters.

Let m be the meta-model of a word, with S_m its set of normal states and D_m and F_m its specific states. With each meta-state x ∈ S_m is associated the NSHP-HMM of a letter, with S_x normal states and the specific states D_x and F_x. During the construction of the global model, the specific states D_x and F_x are removed, so the reestimation of the transitions a_{D_x i_x} and a_{i_x F_x} is made through the generated transitions. For a meta-model m, let K be the number of images analysed, O_k the k-th image, T_k the number of columns of image k, and P_k = P(O_k | m). For the model associated with meta-state x:

- the transition a_{D_x i_x} is used to build the transitions a_{j_y i_x} (y ≠ x) and a_{D_m i_x};
- the transition a_{i_x F_x} is used to build the transitions a_{i_x j_y} (y ≠ x) and a_{i_x F_m};
- the internal transitions a_{i_x j_x} are left unchanged.

The principle of cross-reestimation is to gather this information over all the models associated with the same letter in the various meta-models. For the internal transitions, the Baum-Welch formula can be applied directly by summing the paths containing the transition over all occurrences of the letter model in all the word models (the reestimation of the observation probabilities follows the same principle). For the transitions a_{D_x i_x} and a_{i_x F_x}, this sum is made through the sum of the paths containing the transitions built from them.

6 Meta-model Reestimation

Global models are built from meta-models. Since meta-models are HMMs, the transitions between meta-states can be reestimated from the information of the generated global models. Indeed, for x, y ∈ S_m we obtain by construction: a_xy = a_{F_x D_y}, a_{D_m x} = a_{D_m D_x}, a_{x F_m} = a_{F_x F_m}. Then:

- the transition a_xy (x ≠ y) is used to build the transitions a_{i_x j_y};
- the transition a_{D_m x} is used to build the transitions a_{D_m i_x};
- the transition a_{x F_m} is used to build the transitions a_{i_x F_m}.

As for cross-learning, according to the Baum-Welch formulas, the reestimation of a meta-transition is made by summing over all the paths containing the transitions built from it.
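A minimal sketch of the pooling idea, under assumed data structures (not the authors' code): the xi/start/end expected counts are taken from a standard forward-backward pass on each generated word model, accumulated per letter over all word models containing that letter, and renormalized to give new internal transitions, entry probabilities (a_{D_x i_x}) and exit probabilities (a_{i_x F_x}). The names cross_learning_step and the dictionary layout are illustrative.

```python
import numpy as np
from collections import defaultdict

def cross_learning_step(word_models, expected_counts):
    """Pool Baum-Welch expected counts over all word models sharing a letter.

    word_models:     list of dicts {"letters": [letter_id, ...], "offsets": [(lo, hi), ...]}
                     describing which letter occupies which block of global states.
    expected_counts: list of dicts per word model with
                     "xi":    (n, n) expected transition counts,
                     "start": (n,)   expected counts of starting in state i (from D_m),
                     "end":   (n,)   expected counts of ending in state i (to F_m).
    Returns {letter_id: (A_new, start_new, end_new)}.
    """
    acc = defaultdict(lambda: {"xi": None, "start": None, "end": None})
    for wm, cnt in zip(word_models, expected_counts):
        for letter_id, (lo, hi) in zip(wm["letters"], wm["offsets"]):
            xi = cnt["xi"]
            internal = xi[lo:hi, lo:hi]
            # Counts of leaving the letter block (to other letters or F_m)
            # reestimate a_{i_x F_x}; counts of entering it reestimate a_{D_x i_x}.
            leave = xi[lo:hi, :].sum(axis=1) - internal.sum(axis=1) + cnt["end"][lo:hi]
            enter = xi[:, lo:hi].sum(axis=0) - internal.sum(axis=0) + cnt["start"][lo:hi]
            a = acc[letter_id]
            if a["xi"] is None:
                a["xi"], a["start"], a["end"] = internal.copy(), enter.copy(), leave.copy()
            else:
                a["xi"] += internal; a["start"] += enter; a["end"] += leave
    new_params = {}
    for letter_id, a in acc.items():
        total = a["xi"].sum(axis=1) + a["end"]        # all ways of leaving each state
        A_new = a["xi"] / total[:, None]
        end_new = a["end"] / total
        start_new = a["start"] / a["start"].sum()     # P(i_x | D_x) sums to 1
        new_params[letter_id] = (A_new, start_new, end_new)
    return new_params
```

The same accumulation over the blocks (summing the counts of the inter-letter transitions a_{i_x j_y}, a_{D_m i_x} and a_{i_x F_m}) would feed the meta-transition reestimation of Section 6.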

7 First Experiments

The system was tested on a base of 7031 French bank check words provided by the SRTP (Service de Recherche Technique de la Poste, the French Post research department); the vocabulary contains 26 words. The parameters of the letter NSHP-HMMs are a column height of 20 pixels and a neighborhood of 3 pixels; the number of normal states of the NSHP-HMM corresponding to a letter is n/2 + 1, where n is the average number of columns of the samples of that letter. The meta-models synthesize the frequent errors found in the words. Two preprocessing steps are applied to reduce the variability of the writing. The first is a slant correction, as proposed in (12). The second normalizes the height of the words by normalizing the three writing bands into three equal vertical parts.

A test was performed to validate the cross-learning principle: the interest of this method is that all the models and meta-models can theoretically be learnt at the same time, knowing only the labels of the words. The base was split into approximately 66% (4626 words) for cross-learning and 34% (2405 words) for recognition tests. Word meta-models and letter models with uniform transition and observation probabilities were generated, and cross-learning was applied for several numbers of steps. The results are reported in Table 1. They show the efficiency of cross-learning at gathering letter information from various words without any initialization of the system.

Table 1. Average word recognition rates for different numbers of learning steps

cross-learning   top 1     top 2     top 3     top 5
5 steps          80.96%    88.48%    91.43%    94.47%
10 steps         83.12%    89.23%    92.35%    95.14%
15 steps         83.41%    89.31%    92.02%    94.84%
20 steps         82.83%    89.15%    91.56%    94.43%

This approach reduces the complexity of the system proposed by SAON (8,12). The number of floating-point operations is proportional to the number of states. The global approach proposed by SAON has a high number of states, based on the mean width of words; our approach considers the mean width of letters, reducing the number of states by a factor of 7, which divides the floating-point operations needed for a word analysis by 7. At the same time, we observed that a neighborhood of size 4 is too large for letter recognition. We chose a size of 3, which halves the number of parameters to estimate for each model. The combination of these factors reduces the number of parameters to estimate by a factor of 14.
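For context on the top-N rates in Table 1, a standard way to obtain them in this model-discriminant setting is to score the test image against each of the 26 dynamically generated word models with the forward algorithm and rank the log-likelihoods. The sketch below is illustrative (the paper does not detail its decision rule); it reuses the (A, start, end) structure from the earlier composition sketch, and the function names are assumptions.

```python
import numpy as np

def forward_log_likelihood(A, start, end, column_log_probs):
    """Standard forward algorithm in log space over the columns of an image.

    A, start, end:     global word-model transition structure (cf. compose_word_model)
    column_log_probs:  (T, n) array of log P(column t | state i), from the letter NSHP-HMMs
    """
    logA = np.log(A + 1e-300)
    log_alpha = np.log(start + 1e-300) + column_log_probs[0]
    for t in range(1, column_log_probs.shape[0]):
        log_alpha = column_log_probs[t] + \
            np.logaddexp.reduce(log_alpha[:, None] + logA, axis=0)
    return np.logaddexp.reduce(log_alpha + np.log(end + 1e-300))

def recognize_top_n(scored_models, n=5):
    """scored_models: dict word -> (model dict with 'A','start','end', (T, n) log-prob matrix).

    Returns the n best word hypotheses, ranked by forward log-likelihood."""
    scores = {word: forward_log_likelihood(m["A"], m["start"], m["end"], lp)
              for word, (m, lp) in scored_models.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

A top-k rate then simply counts how often the true label appears among the first k entries of this ranking.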

8 Conclusion

We have proposed a new approach to analytic word recognition based on the dynamic generation of global models. This approach divides the number of parameters of SAON's system by 14. The first tests give an encouraging recognition rate of 83.4%. The learning of the letter models is shared between the word models, and the Baum-Welch algorithm ensures that learning converges properly. More tests on larger databases are needed to evaluate the approach more thoroughly.

The word models are generated dynamically from meta-models. As a first remark, this method could easily be extended to entire amounts with another level of meta-models. Such an extension requires finding the best path between words. The same problem arises when generalising our method to unconstrained vocabulary recognition: for such a task, we need to find the sequence of states in the meta-model that best describes the analysed word. Further studies are necessary to find the best method to obtain this path in the meta-models.

References

1. H. S. Park and S. W. Lee. An HMMRF-Based Statistical Approach for Off-line Handwritten Character Recognition. In IEEE Proceedings of ICPR'96, volume 2, pages 320-324, 1996.
2. R. Bippus. 1-Dimensional and Pseudo 2-Dimensional HMMs for the Recognition of German Literal Amounts. In Fourth International Conference on Document Analysis and Recognition (ICDAR'97), volume 2, pages 487-490, Ulm, Germany, Aug. 1997.
3. O. E. Agazzi and S. Kuo. Hidden Markov Model Based Optical Character Recognition in the Presence of Deterministic Transformation. Pattern Recognition, 26(12):1813-1826, 1993.
4. M. Gilloux. Reconnaissance de chiffres manuscrits par modèle de Markov pseudo-2D (Recognition of handwritten digits with pseudo-2D Markov models). In Actes du 3ème Colloque National sur l'Écrit et le Document, pages 11-17, Rouen, France, 1994.
5. M. Gilloux, B. Lemarié, and M. Leroux. A Hybrid Radial Basis Function Network/Hidden Markov Model Handwritten Word Recognition System. In Third International Conference on Document Analysis and Recognition (ICDAR'95), pages 394-397, Montréal, 1995.
6. J. C. Simon, O. Baret, and N. Gorski. A System for the Recognition of Handwritten Literal Amounts of Checks. In International Association for Pattern Recognition Workshop on Document Analysis Systems (DAS'94), Kaiserslautern, Germany, pages 135-155, September 1994.
7. B. Lemarié, M. Gilloux, and M. Leroux. Un modèle neuro-markovien contextuel pour la reconnaissance de l'écriture manuscrite (A contextual neuro-Markovian model for handwriting recognition). In Actes du 10ème Congrès AFCET de Reconnaissance des Formes et Intelligence Artificielle, Rennes, France, 1996.
8. G. Saon. Modèles markoviens uni- et bidimensionnels pour la reconnaissance de l'écriture manuscrite hors-ligne (One- and two-dimensional Markov models for off-line handwriting recognition). PhD thesis, Université Henri Poincaré - Nancy I, Vandœuvre-lès-Nancy, 1997.
9. Mou-Yen Chen, Amlan Kundu, and Jian Zhou. Off-Line Handwritten Word Recognition Using a Hidden Markov Model Type Stochastic Network. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(5):481-497, 1994.
10. F. Kimura, M. Shridhar, and G. Houle. Handwritten Word Recognition Using Lexicon Free and Lexicon Directed Word Recognition Algorithms. In Fourth International Conference on Document Analysis and Recognition (ICDAR'97), Ulm, Germany, Aug. 1997.
11. L. R. Rabiner. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proceedings of the IEEE, 77(2):257-286, February 1989.
12. G. Saon and A. Belaïd. Off-line Handwritten Word Recognition Using a Mixed HMM-MRF Approach. In Fourth International Conference on Document Analysis and Recognition (ICDAR'97), volume 1, pages 118-122, Ulm, Germany, Aug. 1997.