Natural Language Inspired Approach for Handwritten Text Line Detection in Legacy Documents
1 Natural Language Inspired Approach for Handwritten Text Line Detection in Legacy Documents
Vicente Bosch, Alejandro Hector Toselli, Enrique Vidal
Pattern Recognition and Human Language Technologies group, Universidad Politécnica de Valencia
2 Presentation Outline
- Introduction
- Statistical Framework
- Modelling
- System Architecture
- Experimental Setup and Results
- Conclusions & Future Work
3 Introduction
- Document layout analysis (DLA) is an important task for any Document Retrieval Information System.
- Text line detection (TLD) is a DLA task that yields the physical locations of text lines in an input page image; it is needed by any system that takes text line images as input.
- TLD in handwritten legacy documents is difficult because of:
  - Handwritten text issues: variable inter-line spacing, overlapping and touching strokes of adjacent lines, etc.
  - Legacy document issues: smears, background variations, uneven illumination, humidity spots and bleed-through marks.
- Current state-of-the-art methods are mostly based on heuristic approaches and require parameter tuning.
- We present a formal approach to text line detection in handwritten text, based on machine-learning techniques.
4 Statistical Framework (1)
We assume that the input image contains one or more paragraphs of single-column parallel text, with no images or diagrams. We represent it mathematically as a sequence of observations o = o_1, o_2, ..., o_m.
The TLD problem can then be formulated as finding the most likely sequence of text lines, \hat{h} = h_1, h_2, ..., h_n, for a given input sequence o. Using Bayes' rule, this decomposes as:
\[ \hat{h} = \arg\max_{h} P(h \mid o) = \arg\max_{h} P(o \mid h)\, P(h) \]
5 Statistical Framework (2)
Because the actual physical location of each line is required (and not just the best label sequence), we rewrite the formula to make the location variable b explicit:
\[ \hat{h} = \arg\max_{h} P(h) \sum_{b} P(o, b \mid h) \]
The sum over b can be approximated by its dominant term, \max_{b} P(o, b \mid h):
\[ (\hat{b}, \hat{h}) \approx \arg\max_{b,h} P(h)\, P(o, b \mid h) \]
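The dominant-term approximation is the quantity that Viterbi decoding maximises: it returns a single best label sequence together with the positions where the labels change. The following is a minimal Python sketch under simplifying assumptions that are not in the slides: one state per region type, a per-row ink density as the only observation, and invented probabilities. The names `viterbi`, `segments`, `log_emit`, `log_trans` and `log_init` are hypothetical.

```python
import math

def viterbi(log_emit, log_trans, log_init):
    """Return the most likely state path.

    log_emit[t][s]  : log P(o_t | state s)
    log_trans[r][s] : log P(s | r), playing the role of P(h)
    log_init[s]     : log P(s) for the first observation
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []
    for t in range(1, len(log_emit)):
        prev, score, ptr = score, [], []
        for s in range(n_states):
            best_r = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            score.append(prev[best_r] + log_trans[best_r][s] + log_emit[t][s])
            ptr.append(best_r)
        back.append(ptr)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

def segments(path, labels):
    """Collapse a row-level path into (label, start, end) regions; end is exclusive."""
    out, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            out.append((labels[path[start]], start, t))
            start = t
    return out

# Toy data (assumed): ink density per image row, two region types.
LABELS = ["IL", "NL"]
ink = [0.0, 0.1, 0.8, 0.9, 0.7, 0.1, 0.0]
clamp = lambda x: min(max(x, 0.05), 0.95)
log_emit = [[math.log(1 - clamp(x)), math.log(clamp(x))] for x in ink]
log_trans = [[math.log(0.8), math.log(0.2)],
             [math.log(0.2), math.log(0.8)]]
log_init = [math.log(0.5), math.log(0.5)]

regions = segments(viterbi(log_emit, log_trans, log_init), LABELS)
print(regions)
```

The `(start, end)` pairs play the role of b and the labels the role of h; a system like the one described would use multi-state HMMs per region type and a trained LM instead of these toy tables.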
6 Modelling (1)
7 Modelling (2)
Two levels of modelling:
- Morphological: models the vertical regions of the page, usually approximated by HMMs.
- Syntactical: a language model (LM) that restricts how those vertical regions are composed to form an actual page, modelled as a stochastic finite-state grammar (SFSG).
Vertical region types: Blank Space (BS), Normal Line (NL), Inter Line (IL) and Non-Text Line (NT).
The LM allows us to enforce restrictions: NL+IL, NT+IL, BS, etc.
Both modelling levels are represented by finite-state automata, so they can be integrated into a single global model on which our problem can be easily solved.
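To illustrate how an SFSG can restrict the order of region types, the sketch below writes one as a table of allowed transitions with probabilities and scores a label sequence against it. The grammar is an assumption made for illustration: the start and end symbols, the transitions and all probabilities are invented, not taken from the slides.

```python
import math

# Hypothetical SFSG over region types. Each row sums to 1;
# a missing entry means the transition is forbidden.
SFSG = {
    "<s>": {"BS": 1.0},
    "BS":  {"NL": 0.6, "NT": 0.1, "</s>": 0.3},
    "NL":  {"IL": 1.0},                      # a normal line is followed by an inter line
    "NT":  {"IL": 1.0},                      # so is a non-text line
    "IL":  {"NL": 0.7, "NT": 0.1, "BS": 0.2},
}

def log_prob(labels):
    """Log-probability of a label sequence, or -inf if the grammar forbids it."""
    total, state = 0.0, "<s>"
    for nxt in list(labels) + ["</s>"]:
        p = SFSG.get(state, {}).get(nxt, 0.0)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
        state = nxt
    return total

valid = log_prob(["BS", "NL", "IL", "NL", "IL", "BS"])
invalid = log_prob(["BS", "NL", "NL", "BS"])   # NL directly after NL is not allowed
print(valid, invalid)
```

Because both this table and an HMM are finite-state machines, the grammar's transitions can serve directly as the transitions between region-level HMMs in one combined decoding network.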
8 Modelling (3)
[Figures: HMM sample; SFSG sample]
9 System Overview
[Diagram blocks: Page Images; Page layout corpus; Preprocessing; Cleaned Page Images; Feature Extraction; Feature Vectors; Training (HMM Training, LM Training); Off-line line HMMs; LM Model; Decoding; output: type label and region position coordinates]
10 Preprocessing
Slide footer: Natural Language Inspired Approach for Handwritten Text Line Detection in Legacy Documents. Valencia, April 18, 2012.
11 Feature Extraction
[Figure labels: b0 ... bf; X1 ... XL; h1 ... hf; D]
12 Feature Extraction
13 Feature Extraction
14 Corpus Description (1)
- Experiments were carried out on a corpus compiled from a 19th-century Spanish manuscript identified as Cristo-Salvador (CS), kindly provided by the Biblioteca Valenciana Digital (BiVaLDi).
- It is a small document of 53 colour images of text pages, scanned at 300 dpi and written by a single writer.
- We employ the so-called book partition:
  - The training set is formed by the first 33 page images.
  - The test set contains the remaining 20 pages.
15 Corpus Description (2)
16 Corpus Description (3)
Table columns: Training, Test, Total. Rows (number of): Pages; Normal-text lines (NL); Blank Lines (BL); Non-text Lines (NT); Inter Lines (IL).
- Each page was annotated with a succession of reference labels (NL, NT, BL and IL).
- Vertical regions were delimited by running standard text line detection methods based on vertical projection profiles; the result was manually verified and corrected.
- The different regions were labelled manually by an operator.
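The projection-profile step mentioned above can be sketched generically as follows. This is not the tool used to annotate the corpus: the function name `text_bands`, the threshold and the binary-image representation are assumptions. The profile here counts ink pixels in each image row, and consecutive rows with enough ink are grouped into one vertical region.

```python
def text_bands(image, threshold=1):
    """Return (top, bottom) row ranges whose ink count reaches the threshold.

    image is a list of rows, each a list of 0 (background) or 1 (ink);
    bottom is exclusive.
    """
    profile = [sum(row) for row in image]   # ink pixels per row
    bands, top = [], None
    for y, count in enumerate(profile):
        if count >= threshold and top is None:
            top = y                         # a band starts here
        elif count < threshold and top is not None:
            bands.append((top, y))          # the band ended on the previous row
            top = None
    if top is not None:                     # band running to the bottom edge
        bands.append((top, len(profile)))
    return bands

# Toy binary page (assumed data): two text bands separated by blank rows.
page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
]
bands = text_bands(page)
print(bands)
```

A fixed threshold like this is exactly the kind of hand-tuned parameter that fails on touching or overlapping lines, which is why the slides describe the output as manually verified and corrected.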
17 Evaluation Measures
- The quality of text line detection was measured using the line error rate (LER).
- LER is obtained by comparing the sequences of automatically obtained region labels with the corresponding reference label sequences.
- It is computed in the same way as the well-known word error rate (WER).
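Since LER is computed like WER, one way to sketch it is as the edit distance between the hypothesis and reference label sequences, divided by the reference length. The function names below are hypothetical, and whether the original evaluation scored every region label or only a subset is not stated in the slides; this sketch scores all labels.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two label sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution or match
        prev = cur
    return prev[-1]

def line_error_rate(ref, hyp):
    """LER in percent: edit operations per reference label."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)

reference = ["BS", "NL", "IL", "NL", "IL", "BS"]
hypothesis = ["BS", "NL", "IL", "NL", "BS"]      # one inter-line region missed
ler = line_error_rate(reference, hypothesis)
print(f"{ler:.2f}")
```

As with WER, the measure can exceed 100% when the hypothesis contains many insertions.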
18 Experiments and Results
- Training and decoding parameters were tuned empirically.
- Three language models were tested: Prior, Conditional and Line-number (LN) constrained.
LM              LER (%)
Prior           0.86
Conditional     0.70
LN-Constrained
19 Conclusions & Future Work
- We have presented a new approach to text line detection that uses a statistical framework similar to those already employed in many areas of NLP.
- It avoids the heuristic approaches traditionally adopted for this task.
- The proposed approach not only achieves detection accuracy on a par with current state-of-the-art solutions, but also yields baselines of better quality (visually closer to the actual line).
- Future work: extend this approach not only to detect but also to classify line-region types, in order to determine, for example, titles, short lines, and the beginnings and ends of paragraphs.
- We envision the proposed stochastic framework serving as a cornerstone for interactive approaches to line detection, similar to those used for handwritten text transcription in multimodal interactive transcription systems.
20 Questions