Continuous Chinese Handwriting Recognition with Language Model
Yanming Zou, Kun Yu, Kongqiao Wang
Nokia Research Centre, Beijing, BDA, P.R. China

Abstract

In this paper, we propose a method to recognize the handwriting of several Chinese characters, or even a full sentence, in a single pass. Using a common single Chinese character recognition engine and a language model, the whole recognition process is divided into two optimization processes. The first finds the best grouping scheme for segmenting the written strokes into characters, and the second finds the best character sequence corresponding to those stroke groups. Several measures are used to speed up both optimization processes so that the method is applicable on portable devices. In our test on over [...] characters (2200 sentences), the overall performance is quite promising, and positive feedback from the testers has confirmed the validity of the proposed method.

Keywords: segmentation scheme, language model, HWR.

1. Introduction

Chinese characters are composed of multiple strokes arranged compactly, bearing some heritage from pictographs. This makes it somewhat difficult to input those characters into digital devices from a keyboard. Although there are keyboard-based methods such as Pinyin, many people have come to realize that writing is the most natural and efficient way of entering text. Handwriting recognition (HWR) technology has received intensive attention, and most of the research focuses on improving recognition accuracy and input speed [1][2].

Following the pen and paper metaphor, people are used to writing a complete sentence without stopping to see whether the previous character has been correctly recognized. Here new challenges appear: if the whole sentence is written continuously, the characters must be correctly segmented before each of them is recognized. In practice this has been a persistent problem, since the computer has to adapt to the wide variety of handwriting traces [3][4] before producing plausible segmentation results.

After segmentation, the individual characters are recognized with a single character HWR engine. To keep text entry efficient, it is desirable that the first candidate for each character is correct, because users do not want to check the recognition result of every character to complete the sentence. Much effort has gone into improving the first-hit rate of single character HWR engines, but no satisfactory result has been achieved, especially on portable devices with many hardware restrictions.

Naturally, the context can be used to group handwritten strokes into characters and then to select the result from the candidate list. Most publications assume that a lexicon is available [5][6]. A lexicon does help to improve recognition accuracy if only words are input, but for casual input, especially mixed text entry with punctuation, its effect is quite limited. The problem is even worse for Chinese, since there is no clear definition of a word in Chinese and no clear space between words in writing.

A language model is another way to use context information [7]. Generally speaking, a language model assigns a probability to a sequence of characters, and this probability can be used to adjust the output of a common HWR engine. There are already some publications on applying language models to western languages [8][9], but not yet to Chinese. In this paper we propose a continuous handwriting segmentation and recognition method based on a bigram language model and a common single character HWR engine (Figure 1).
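To make the role of the bigram language model concrete, here is a minimal Python sketch of how such a model can assign a log probability to a character sequence. The toy corpus, the `<s>` start marker, the add-one smoothing, and the function names are all assumptions made for illustration; they are not taken from the paper, whose model was trained on newspaper text.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count character unigrams and bigrams, with a sentence-start marker."""
    unigrams, bigrams, vocab = Counter(), Counter(), set()
    for s in sentences:
        chars = ["<s>"] + list(s)
        vocab.update(chars)
        for prev, cur in zip(chars, chars[1:]):
            unigrams[prev] += 1          # how often prev starts a bigram
            bigrams[(prev, cur)] += 1    # how often cur follows prev
    return unigrams, bigrams, vocab

def log_prob(sentence, unigrams, bigrams, vocab):
    """Sum of log P(c_k | c_{k-1}), using add-one smoothing (an assumption)."""
    chars = ["<s>"] + list(sentence)
    v = len(vocab)
    total = 0.0
    for prev, cur in zip(chars, chars[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + v))
    return total

# Hypothetical toy corpus; a real model would need far more text.
corpus = ["今天天气好", "今天很好", "天气很好"]
model = train_bigram(corpus)
seen = log_prob("今天很好", *model)        # character order seen in the corpus
scrambled = log_prob("好很天今", *model)   # same characters, implausible order
print(seen > scrambled)
```

The plausible ordering receives the higher score, which is the property a recognizer can exploit when choosing among candidate characters.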
In the proposed scheme, the raw pen traces are first segmented according to spatio-temporal information, and the language model is then used to find optimized segmentation schemes. The language model is also used to fine-tune the recognition results. As there is no close coupling between the language model and the single character recognition engine, different single character HWR engines can be used in this solution.

Figure 1. Recognition process of the proposed method.

In our method the whole optimization process is divided into two stages: handwriting segmentation and recognition result adjustment. The aim of this division is to speed up the whole search and make it applicable on portable devices. In fact, the computing complexity of our method as a common HWR engine is much higher than that of an engine for special applications, e.g. mail address recognition [10]. The method could be implemented in one step if enough computing resources were available.

The rest of the paper introduces the proposed approach, with emphasis on the language model for character segmentation and recognition. In section 2, the continuous HWR problem is separated into two optimization tasks. Details of solving each of them are given in section 3. Section 4 presents experiments and evaluations of the proposed method. Section 5 concludes the paper.

2. Proposed Method

As the purpose of handwriting recognition can be described as extracting the most probable text string from a sequence of written strokes, the optimization problem can be expressed as

$$C_{\max} = \arg\max_{C_i \in C} P(C_i \mid S) \tag{1}$$

where $S = \{s_1, s_2, \ldots, s_i, \ldots, s_n\}$ is the collection of written strokes, $s_i$ is the $i$th stroke in the writing process, and $C_i$ represents one possible character string as a recognition result for the whole stroke sequence. If $G = \{G_1, G_2, \ldots, G_i, \ldots, G_p\}$ and $G_i$ represents the $i$th scheme of grouping the written strokes, the optimization target becomes

$$P(C_i \mid S) = \max_{G_j \in G} P(C_i \mid G_j) \tag{2}$$

With formula (2), the optimization process can be written as

$$\max P(C_i \mid S) = \max_{C_i \in C} \max_{G_j \in G} P(C_i \mid G_j) \tag{3}$$

Since changing the order of the two maximum operators does not affect the result, formula (3) can be rewritten as

$$\max P(C_i \mid S) = \max_{G_j \in G} \max_{C_i \in C} P(C_i \mid G_j) \tag{4}$$

If $P(G_j)$ is defined as

$$P(G_j) = \max_{C_i \in C} P(C_i \mid G_j) \tag{5}$$

formula (4) becomes

$$G_{\max} = \arg\max_{G_i \in G} P(G_i) \tag{6}$$

Since there is an efficient method to calculate a rough value of $P(G_j)$, we propose a two-stage method to solve the whole optimization problem (1): first find the best segmentation scheme defined in (6), and then find the most probable text string for the best scheme $G_{\max}$.

Let a segment $g^k$ be a collection of strokes. A segmentation scheme $G$ can then be defined as $G = \{g^0, \ldots, g^k, \ldots, g^{N-1}\}$. As one possible recognition result of the scheme, a text string $C_i$ can be described as $C_i = \{c_i^0, \ldots, c_i^k, \ldots, c_i^{N-1}\}$, where $c_i^k$ is the recognized character in the string corresponding to the segment $g^k$. With these definitions, the probability, based on a bigram language model, that the given segmentation $G$ is recognized as the text string $C_i$ becomes

$$P(C_i \mid G) = \prod_{k=0}^{N-1} P(c_i^k \mid g^k)\, P(c_i^k \mid c_i^{k-1}) \tag{7}$$

where $P(c_i^k \mid c_i^{k-1})$ is the transition probability from character $c_i^{k-1}$ to $c_i^k$, and $P(c_i^k \mid g^k)$ is the probability that stroke group $g^k$ is recognized as character $c_i^k$. Thus $P(G)$ can be written as

$$P(G) = \max_{C_i \in C} \prod_{k=0}^{N-1} P(c_i^k \mid g^k)\, P(c_i^k \mid c_i^{k-1}) \tag{8}$$

To speed up the search, we do not want to go through all possible text strings. Instead, we calculate a reasonable maximum of each factor in (8) and use the product of those maximums as an approximate value of $P(G)$. Thus $P(G)$ can be calculated as

$$P(G) \approx \prod_{k=0}^{N-1} P(g^k)\, P(g^k \mid g^{k-1}) \tag{9}$$

where

$$P(g^k) = \max_i P(c_i^k \mid g^k) \tag{10}$$

and

$$P(g^k \mid g^{k-1}) = P(c_n^k \mid c_n^{k-1}) \tag{11}$$

with

$$n = \arg\max_{i:\, C_i \in C} P(c_i^k, c_i^{k-1} \mid g^k, g^{k-1}) \tag{12}$$

Here $P(g^k)$ can be understood as the likelihood that a segment is a character, and $P(g^k \mid g^{k-1})$ as the plausibility of forming a meaningful sentence with the segmentation scheme. We do not use the maximum value of $P(c_i^k \mid c_i^{k-1})$ as an approximation of $P(g^k \mid g^{k-1})$, since $\max_i P(c_i^k \mid c_i^{k-1})$ may be meaningless when the corresponding $P(c_i^k)$ and $P(c_i^{k-1})$ are very small.

With the approximation (9), the computing complexity of evaluating one segmentation scheme is reduced dramatically. Assume that the segmentation scheme $G$ has $N$ segments and each segment has $M$ recognition candidates. Calculating the probability of all strings requires $O(M^N)$ multiplications. With our approximation, the complexity decreases to the level of $O(M \times M \times (N-1))$.

In conclusion, our proposed method can be written as

$$G_{\max} = \arg\max_{G_i \in G} \prod_{k} P(g_i^k)\, P(g_i^k \mid g_i^{k-1}) \tag{13}$$

$$C_{\max} = \arg\max_{C_i \in C} \prod_{k} P(c_i^k \mid g_{\max}^k)\, P(c_i^k \mid c_i^{k-1}) \tag{14}$$

3. Algorithm Implementation

Figure 2 illustrates the recognition flow of the proposed scheme. The key concept underlying this approach is the language model built over the single character recognition engine. As the model can fine-tune the recognition process to reduce the requirements on memory usage and efficiency, this architecture is applicable on portable devices.

Figure 2. The workflow of the proposed method.

Presegmentation

In preparation for the segmentation task, presegmentation is used to obtain rough segments from the raw pen traces. In the original algorithm as defined in equation (9), the basic element of a grouping scheme is a single stroke. The number of possible schemes for $n$ written strokes is $2^{n-1}$, which leads to a high computational cost in the grouping process. To improve efficiency, obviously unreasonable schemes should be excluded from the search space as early as possible.

Some strokes can be allocated to a single character based only on their spatio-temporal relationship [11]. It is unreasonable to allocate them to different characters, so these strokes can be treated as a whole in the segmentation process and are not segmented further. Huge groups with too many strokes are also unreasonable, so if a threshold is set on the maximum number of strokes a character can have, the search space for segmentation can be much smaller than $2^{n-1}$.

In fact, the maximum number of presegmented groups in one character can be much smaller than the number of strokes. Usually, a simplified Chinese character has no more than 40 strokes and a traditional character fewer than 65 strokes. With our method, however, most traditional and simplified characters can be presegmented into at most 5 groups. So if the threshold is set on the maximum number of presegmented groups, the search space shrinks drastically.

Ideally, one presegmented result is exactly one character. However, when characters are written continuously, there is no clear space between adjacent characters. Moreover, some Chinese characters are composed of more than one component arranged horizontally (Figure 3), and thus the presegmentation result may contain two types of errors [6]:

- The group is only a part of one character;
- The group contains strokes from more than one character.

To keep the correct scheme in the search space, we should prevent the second type of error while reducing the first type as much as possible.
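As an illustration only, the two-stage search of equations (13) and (14) over presegmented groups can be sketched in Python as follows. The names `recognize`, `bigram_logp`, `max_groups`, and the toy candidate tables are hypothetical and not from the paper. The joint probability in equation (12) is assumed here to be the product of the two recognition scores and the bigram transition, and the sentence-start transition is ignored. All scores are kept as logarithms, so products become sums.

```python
import math

def best_segmentation(n, max_groups, recognize, bigram_logp):
    """Stage 1, eq. (13): segment (i, j) covers presegmented groups i..j-1.
    best[(i, j)] is the best approximate log score of a scheme for groups
    0..j-1 whose last segment is (i, j)."""
    def seg_score(i, j):                      # log P(g^k), eq. (10)
        return max(lp for _, lp in recognize(i, j))

    def trans_score(h, i, j):                 # log P(g^k | g^{k-1}), eqs. (11)-(12)
        # Assumed joint: both recognition scores plus the bigram transition.
        _, a, b = max(
            ((lp_a + lp_b + bigram_logp(a, b), a, b)
             for a, lp_a in recognize(h, i)
             for b, lp_b in recognize(i, j)),
            key=lambda t: t[0],
        )
        return bigram_logp(a, b)

    best, back = {}, {}
    for j in range(1, n + 1):
        for i in range(max(0, j - max_groups), j):   # cap groups per character
            if i == 0:
                best[(i, j)], back[(i, j)] = seg_score(i, j), None
                continue
            score, prev = max(
                ((best[(h, i)] + trans_score(h, i, j), (h, i))
                 for h in range(max(0, i - max_groups), i)),
                key=lambda t: t[0],
            )
            best[(i, j)], back[(i, j)] = score + seg_score(i, j), prev
    last = max((s for s in best if s[1] == n), key=lambda s: best[s])
    scheme = []
    while last is not None:                   # follow back-pointers
        scheme.append(last)
        last = back[last]
    return scheme[::-1]

def best_string(scheme, recognize, bigram_logp):
    """Stage 2, eq. (14): Viterbi search over candidates of the fixed scheme."""
    paths = {c: (lp, c) for c, lp in recognize(*scheme[0])}
    for seg in scheme[1:]:
        new_paths = {}
        for c, lp in recognize(*seg):
            prev = max(paths, key=lambda p: paths[p][0] + bigram_logp(p, c))
            new_paths[c] = (paths[prev][0] + bigram_logp(prev, c) + lp,
                            paths[prev][1] + c)
        paths = new_paths
    return max(paths.values())[1]

# Hypothetical toy data: three presegmented groups, written as 你 | 女 | 子.
CANDIDATES = {
    (0, 1): [("你", 0.9), ("称", 0.1)],
    (1, 2): [("女", 0.6), ("文", 0.4)],
    (2, 3): [("子", 0.7), ("于", 0.3)],
    (1, 3): [("好", 0.8), ("妤", 0.2)],
    (0, 2): [("做", 0.05)],
}
BIGRAMS = {("你", "好"): 0.5, ("女", "子"): 0.1}

def recognize(i, j):
    return [(c, math.log(p)) for c, p in CANDIDATES.get((i, j), [("?", 1e-6)])]

def bigram_logp(prev, cur):
    return math.log(BIGRAMS.get((prev, cur), 1e-3))

scheme = best_segmentation(3, max_groups=2, recognize=recognize,
                           bigram_logp=bigram_logp)
text = best_string(scheme, recognize, bigram_logp)
print(scheme, text)
```

In this toy case the second and third groups are merged into one character because the merged candidate scores well with the bigram model, while `max_groups` bounds how many presegmented groups one character may span and so limits the number of segments examined.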
Figure 3. Incorrect segmentations and combinations.

Segmentation and recognition

The whole optimization process depends on two kinds of probabilities. One is the transition probability between characters, which can be retrieved directly from the language model. The other is the probability that a group of strokes is one specific character. Generally speaking, a common single character recognition engine can only output conditional probabilities that assume the input stroke group really corresponds to one character within the range of the engine. This assumption is not always satisfied in our case. To obtain the probability our method requires, the probability output by the single character engine is multiplied by a coefficient. This coefficient can be estimated from geometric information such as the height and width of the segment, the height of the text line, and the space between the strokes, as shown in Figure 4.

Figure 4. Geometric information from the strokes.

In the optimization process, the standard dynamic programming method is used. Since the output of the language model and the likelihood evaluation from the HWR engine are both in logarithmic form, all other likelihoods are transformed to logarithms as well, so all the multiplications in the formulas above become additions, which reduces the computational complexity. It is not necessary to search the whole recognition range for each stroke group in equation (12); evaluating the top ten candidates with the highest probability is enough. This further speeds up the optimization process.

Since an approximate probability is used to search for the best segmentation scheme, it cannot be guaranteed that the optimized scheme corresponds to the best text string. Consequently, the best 10 segmentation schemes are retained for the overall evaluation of the recognition result before the final text string is decided.

4. Experiments and Evaluation

To test the performance of the proposed method, we collected over 200 continuous handwriting samples from 170 people. The samples were collected on the touch-sensitive screen of a PDA. Each sample includes 6 to 10 pages, and each page contains 3 to 15 Chinese characters including punctuation. Samples from 80 people were used as reference to fine-tune the parameters of the algorithm, and the remaining samples from 90 people were used for testing. The writers were advised to avoid intersections between adjacent characters to avoid ambiguities. Figure 5 gives some examples of the collected handwriting samples.

Figure 5. Samples in the dataset.

In the evaluation tests, the common single character engine was licensed from an outside company and its recognition range was set to the GB2312 character set. The language model was trained on text from prevalent Chinese newspapers and covers the same character set. For comparison, the single character HWR engine and the engine for continuous recognition (CHWR) were tested separately. As shown in Table 1, the overall recognition rate of the proposed method is 89.7% for mixed input of Chinese characters and punctuation, which is much better than the single HWR engine alone (76.4%). Even in the examples of Figure 5(b) and Figure 5(d), where there is little spacing between adjacent characters, the proposed method recognizes the characters correctly. A higher recognition rate is achieved for pure Chinese characters (92.1%) than for mixed input of Chinese and punctuation, because most punctuation marks are written with a single stroke and are easily merged into adjacent characters. We also obtained promising results for punctuation between characters (from 16.7% to 71.1%), because the proposed method drastically improves accuracy by using context and spatial information.

Table 1. Comparison of performance.

Recognition rate (%)    Single HWR    CHWR
Mixed                   76.4          89.7
Character               [...]         92.1
Punctuation             16.7          71.1

Besides the objective evaluation of the algorithm, a series of subjective tests was conducted to measure the level of user acceptance. Over 50 users were asked to write a collection of sentences, including characters and punctuation. During the writing process, they were kept aware of the recognition result. When writing was finished, each user was asked to rate the performance of the system with a score from 1 to 7, where 1 is the most unacceptable level and 7 the most desirable level. The average score was 5.15. According to the experiential criterion, a system with a score of no less than 5 is expected to be accepted by the major user group. Moreover, from the users' perspective, the speed of text entry was notably improved, because they could write as fluently as they do on paper.

5. Conclusion

In this paper, an efficient scheme for continuous handwriting recognition is proposed to improve recognition accuracy while giving users an intuitive and efficient writing experience. As a practical method for online entry of Chinese sentences by casual users, the most significant features of this approach include:

- High recognition accuracy for written input of multiple characters;
- Reliable recognition speed to ensure application on mobile devices.

Moreover, as there is no close coupling between the language model and the single character HWR engine, most other single character HWR engines could be used as well. These advantages not only demonstrate the effectiveness of the proposed method, but also create new opportunities for text entry by means of natural writing.
According to the comparison test, both the average recognition rate and the recognition speed are much better than those of single character recognition algorithms.

Acknowledgements

The authors owe special thanks to the colleagues in the Visual Systems team and Dr. Guohong Ding of Nokia Research Centre, who have facilitated the research with generous help and advice. We also express our appreciation to the Centre for Intelligent Image and Document Information Processing in Tsinghua University for their technical discussions.

References

[1] Chenglin Liu, S. Jaeger and M. Nakagawa, Online recognition of Chinese characters: the state-of-the-art, IEEE Trans. PAMI, vol. 26, no. 2, 2004, pp.
[2] Hiromichi Fujisawa, A view on the past and future of character and document recognition, Proceedings of the Ninth International Conference on Document Analysis and Recognition, Curitiba, Brazil, 2007, pp.
[3] S. Wesolkowski, Cursive script recognition: a survey, Handwriting and Drawing Research: Basic and Applied Issues, IOS Press, Amsterdam, pp.
[4] Zhao S, Chi Z, Shi P, et al., Handwritten Chinese character segmentation using a two-stage approach, Proc. 6th Int. Conf. Document Analysis and Recognition, Seattle, USA, Sep. 2001, IEEE Computer Society Press, pp.
[5] G. Kim, V. Govindaraju, A lexicon driven approach to handwritten word recognition for real-time applications, IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, 1997, pp.
[6] Zhengbin Yao, Xiaoqing Ding, Changsong Liu, On-line handwritten Chinese word recognition based on lexicon, Proc. 18th Int. Conf. Pattern Recognition, Hong Kong, China, Aug. 2006, pp.
[7] John F. Pitrelli, Amit Roy, Creating word-level language models for large-vocabulary handwriting recognition, International Journal on Document Analysis and Recognition, vol. 5, no. 2-3, Apr. 2003, pp.
[8] Freddy Perraud, Christian Viard-Gaudin, Emmanuel Morin and Pierre-Michel Lallican, N-Gram and N-Class models for online handwriting recognition, Proceedings of the Seventh International Conference on Document Analysis and Recognition, Washington, USA, 2003, pp.
[9] F. Pitrelli, Jayashree Subrahmonia and P. Perrone, Confidence modeling for handwriting recognition: algorithms and applications, International Journal on Document Analysis and Recognition, vol. 8, no. 1, Mar. 2006, pp.
[10] Chenglin Liu, M. Koga and H. Fujisawa, Lexicon-Driven Segmentation and Recognition of Handwritten Character Strings for Japanese Address Reading, IEEE Trans. PAMI, vol. 24, no. 11, 2002, pp.
[11] Naohiro Furukawa, Junko Tokuno and Hisashi Ikeda, Online character segmentation method for unconstrained handwriting strings using off-stroke features, Proceedings of the International Workshop on Frontiers in Handwriting Recognition, Rennes, France.
More informationOnline Bangla Handwriting Recognition System
1 Online Bangla Handwriting Recognition System K. Roy Dept. of Comp. Sc. West Bengal University of Technology, BF 142, Saltlake, Kolkata-64, India N. Sharma, T. Pal and U. Pal Computer Vision and Pattern
More informationA System for Joining and Recognition of Broken Bangla Numerals for Indian Postal Automation
A System for Joining and Recognition of Broken Bangla Numerals for Indian Postal Automation K. Roy, U. Pal and B. B. Chaudhuri CVPR Unit; Indian Statistical Institute, Kolkata-108; India umapada@isical.ac.in
More informationAn Objective Evaluation Methodology for Handwritten Image Document Binarization Techniques
An Objective Evaluation Methodology for Handwritten Image Document Binarization Techniques K. Ntirogiannis, B. Gatos and I. Pratikakis Computational Intelligence Laboratory, Institute of Informatics and
More informationHidden Loop Recovery for Handwriting Recognition
Hidden Loop Recovery for Handwriting Recognition David Doermann Institute of Advanced Computer Studies, University of Maryland, College Park, USA E-mail: doermann@cfar.umd.edu Nathan Intrator School of
More informationHandwriting Character Recognition as a Service:A New Handwriting Recognition System Based on Cloud Computing
2011 International Conference on Document Analysis and Recognition Handwriting Character Recognition as a Service:A New Handwriting Recognition Based on Cloud Computing Yan Gao, Lanwen Jin +, Cong He,
More informationNews-Oriented Keyword Indexing with Maximum Entropy Principle.
News-Oriented Keyword Indexing with Maximum Entropy Principle. Li Sujian' Wang Houfeng' Yu Shiwen' Xin Chengsheng2 'Institute of Computational Linguistics, Peking University, 100871, Beijing, China Ilisujian,
More informationMachine Learning Final Project
Machine Learning Final Project Team: hahaha R01942054 林家蓉 R01942068 賴威昇 January 15, 2014 1 Introduction In this project, we are asked to solve a classification problem of Chinese characters. The training
More informationRecognition of Unconstrained Malayalam Handwritten Numeral
Recognition of Unconstrained Malayalam Handwritten Numeral U. Pal, S. Kundu, Y. Ali, H. Islam and N. Tripathy C VPR Unit, Indian Statistical Institute, Kolkata-108, India Email: umapada@isical.ac.in Abstract
More informationHIGH RESOLUTION REMOTE SENSING IMAGE SEGMENTATION BASED ON GRAPH THEORY AND FRACTAL NET EVOLUTION APPROACH
HIGH RESOLUTION REMOTE SENSING IMAGE SEGMENTATION BASED ON GRAPH THEORY AND FRACTAL NET EVOLUTION APPROACH Yi Yang, Haitao Li, Yanshun Han, Haiyan Gu Key Laboratory of Geo-informatics of State Bureau of
More informationProduction of Video Images by Computer Controlled Cameras and Its Application to TV Conference System
Proc. of IEEE Conference on Computer Vision and Pattern Recognition, vol.2, II-131 II-137, Dec. 2001. Production of Video Images by Computer Controlled Cameras and Its Application to TV Conference System
More informationICDAR2007 Handwriting Segmentation Contest
ICDAR2007 Handwriting Segmentation Contest B. Gatos 1, A. Antonacopoulos 2 and N. Stamatopoulos 1 1 Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Center
More informationToward Interlinking Asian Resources Effectively: Chinese to Korean Frequency-Based Machine Translation System
Toward Interlinking Asian Resources Effectively: Chinese to Korean Frequency-Based Machine Translation System Eun Ji Kim and Mun Yong Yi (&) Department of Knowledge Service Engineering, KAIST, Daejeon,
More informationOn-line One Stroke Character Recognition Using Directional Features
On-line One Stroke Character Recognition Using Directional Features Blind for review process 1 1 Blind for review process Abstract. This paper presents a method based on directional features for recognizing
More informationSegmentation of Characters of Devanagari Script Documents
WWJMRD 2017; 3(11): 253-257 www.wwjmrd.com International Journal Peer Reviewed Journal Refereed Journal Indexed Journal UGC Approved Journal Impact Factor MJIF: 4.25 e-issn: 2454-6615 Manpreet Kaur Research
More informationHierarchical Shape Primitive Features for Online Text-independent Writer Identification
2009 10th International Conference on Document Analysis and Recognition Hierarchical Shape Primitive Features for Online Text-independent Writer Identification Bangy Li, Zhenan Sun and Tieniu Tan Center
More informationA Segmentation Free Approach to Arabic and Urdu OCR
A Segmentation Free Approach to Arabic and Urdu OCR Nazly Sabbour 1 and Faisal Shafait 2 1 Department of Computer Science, German University in Cairo (GUC), Cairo, Egypt; 2 German Research Center for Artificial
More informationOn-Line Recognition of Mathematical Expressions Using Automatic Rewriting Method
On-Line Recognition of Mathematical Expressions Using Automatic Rewriting Method T. Kanahori 1, K. Tabata 1, W. Cong 2, F.Tamari 2, and M. Suzuki 1 1 Graduate School of Mathematics, Kyushu University 36,
More informationSegmentation of Kannada Handwritten Characters and Recognition Using Twelve Directional Feature Extraction Techniques
Segmentation of Kannada Handwritten Characters and Recognition Using Twelve Directional Feature Extraction Techniques 1 Lohitha B.J, 2 Y.C Kiran 1 M.Tech. Student Dept. of ISE, Dayananda Sagar College
More informationExtraction and segmentation of tables from Chinese ink documents based on a matrix model
Pattern Recognition 40 (2007) 1855 1867 www.elsevier.com/locate/pr Extraction and segmentation of tables from Chinese ink documents based on a matrix model Xi-wen Zhang a,b,, Michael R. Lyu b, Guo-zhong
More informationComparing Natural and Synthetic Training Data for Off-line Cursive Handwriting Recognition
Comparing Natural and Synthetic Training Data for Off-line Cursive Handwriting Recognition Tamás Varga and Horst Bunke Institut für Informatik und angewandte Mathematik, Universität Bern Neubrückstrasse
More informationImage retrieval based on region shape similarity
Image retrieval based on region shape similarity Cheng Chang Liu Wenyin Hongjiang Zhang Microsoft Research China, 49 Zhichun Road, Beijing 8, China {wyliu, hjzhang}@microsoft.com ABSTRACT This paper presents
More informationLocal Segmentation of Touching Characters using Contour based Shape Decomposition
2012 10th IAPR International Workshop on Document Analysis Systems Local Segmentation of Touching Characters using Contour based Shape Decomposition Le Kang, David Doermann Institute for Advanced Computer
More informationPenpower Handwriter for Mac User Manual
Penpower Handwriter for Mac User Manual Version: 6.1 Release: February, 2009 Penpower Technology Ltd. Software User License Agreement You are licensed to legally use this software program ( the Software
More informationThe Comparative Study of Machine Learning Algorithms in Text Data Classification*
The Comparative Study of Machine Learning Algorithms in Text Data Classification* Wang Xin School of Science, Beijing Information Science and Technology University Beijing, China Abstract Classification
More informationA Skew-tolerant Strategy and Confidence Measure for k-nn Classification of Online Handwritten Characters
A Skew-tolerant Strategy and Confidence Measure for k-nn Classification of Online Handwritten Characters Vandana Roy and Sriganesh Madhvanath Hewlett-Packard Labs, Bangalore, India vandana.roy, srig@hp.com
More informationA reversible data hiding based on adaptive prediction technique and histogram shifting
A reversible data hiding based on adaptive prediction technique and histogram shifting Rui Liu, Rongrong Ni, Yao Zhao Institute of Information Science Beijing Jiaotong University E-mail: rrni@bjtu.edu.cn
More informationDocument-Form Identification Using Constellation Matching of Keywords Abstracted by Character Recognition
Document-Form Identification Using Constellation Matching of Keywords Abstracted by Character Recognition Hiroshi Sako 1, Naohiro Furukawa 1, Masakazu Fujio 1, and Shigeru Watanabe 2 1 Central Research
More informationAn evaluation of HMM-based Techniques for the Recognition of Screen Rendered Text
An evaluation of HMM-based Techniques for the Recognition of Screen Rendered Text Sheikh Faisal Rashid 1, Faisal Shafait 2, and Thomas M. Breuel 1 1 Technical University of Kaiserslautern, Kaiserslautern,
More informationFine Classification of Unconstrained Handwritten Persian/Arabic Numerals by Removing Confusion amongst Similar Classes
2009 10th International Conference on Document Analysis and Recognition Fine Classification of Unconstrained Handwritten Persian/Arabic Numerals by Removing Confusion amongst Similar Classes Alireza Alaei
More informationLARGE-VOCABULARY CHINESE TEXT/SPEECH INFORMATION RETRIEVAL USING MANDARIN SPEECH QUERIES
LARGE-VOCABULARY CHINESE TEXT/SPEECH INFORMATION RETRIEVAL USING MANDARIN SPEECH QUERIES Bo-ren Bai 1, Berlin Chen 2, Hsin-min Wang 2, Lee-feng Chien 2, and Lin-shan Lee 1,2 1 Department of Electrical
More informationA Novel Texture Classification Procedure by using Association Rules
ITB J. ICT Vol. 2, No. 2, 2008, 03-4 03 A Novel Texture Classification Procedure by using Association Rules L. Jaba Sheela & V.Shanthi 2 Panimalar Engineering College, Chennai. 2 St.Joseph s Engineering
More informationTemplate-based Synthetic Handwriting Generation for the Training of Recognition Systems
Template-based Synthetic Handwriting Generation for the Training of Recognition Systems Tamás VARGA, Daniel KILCHHOFER and Horst BUNKE Department of Computer Science, University of Bern Neubrückstrasse
More informationHandwriting Recognition of Diverse Languages
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology ISSN 2320 088X IMPACT FACTOR: 6.017 IJCSMC,
More informationStroke Fragmentation based on Geometry Features and HMM
Stroke Fragmentation based on Geometry Features and HMM Guihuan Feng, Christian Viard-Gaudin To cite this version: Guihuan Feng, Christian Viard-Gaudin. Stroke Fragmentation based on Geometry Features
More informationSlant Correction using Histograms
Slant Correction using Histograms Frank de Zeeuw Bachelor s Thesis in Artificial Intelligence Supervised by Axel Brink & Tijn van der Zant July 12, 2006 Abstract Slant is one of the characteristics that
More informationSCUT-COUCH2008: A Comprehensive Online Unconstrained Chinese Handwriting Dataset
SCUT-COUCH2008: A Comprehensive Online Unconstrained Chinese Handwriting Dataset Yunyang Li, Lianwen Jin *, Xinhua Zhu, Teng Long Department of Electronics and Information Engineering, South China University
More informationOff-line Character Recognition using On-line Character Writing Information
Off-line Character Recognition using On-line Character Writing Information Hiromitsu NISHIMURA and Takehiko TIMIKAWA Dept. of Information and Computer Sciences, Kanagawa Institute of Technology 1030 Shimo-ogino,
More informationAutomatic Detection of Change in Address Blocks for Reply Forms Processing
Automatic Detection of Change in Address Blocks for Reply Forms Processing K R Karthick, S Marshall and A J Gray Abstract In this paper, an automatic method to detect the presence of on-line erasures/scribbles/corrections/over-writing
More informationFAST REGISTRATION OF TERRESTRIAL LIDAR POINT CLOUD AND SEQUENCE IMAGES
FAST REGISTRATION OF TERRESTRIAL LIDAR POINT CLOUD AND SEQUENCE IMAGES Jie Shao a, Wuming Zhang a, Yaqiao Zhu b, Aojie Shen a a State Key Laboratory of Remote Sensing Science, Institute of Remote Sensing
More informationAUTOMATIC EXTRACTION OF BUILDING FEATURES FROM TERRESTRIAL LASER SCANNING
AUTOMATIC EXTRACTION OF BUILDING FEATURES FROM TERRESTRIAL LASER SCANNING Shi Pu and George Vosselman International Institute for Geo-information Science and Earth Observation (ITC) spu@itc.nl, vosselman@itc.nl
More informationAutomatic Generation of Personal Chinese Handwriting by Capturing the Characteristics of Personal Handwriting
Proceedings of the Twenty-First Innovative Applications of Artificial Intelligence Conference (2009) Automatic Generation of Personal Chinese Handwriting by Capturing the Characteristics of Personal Handwriting
More informationPenpower Handwriter for Mac User Manual
Penpower Handwriter for Mac User Manual Version: 6.2 Release: July, 2011 Edition: 3 Penpower Technology Ltd. Software User License Agreement You are licensed to legally use this software program ( the
More informationA Novel Image Transform Based on Potential field Source Reverse for Image Analysis
A Novel Image Transform Based on Potential field Source Reverse for Image Analysis X. D. ZHUANG 1,2 and N. E. MASTORAKIS 1,3 1. WSEAS Headquarters, Agiou Ioannou Theologou 17-23, 15773, Zografou, Athens,
More informationParallel-computing approach for FFT implementation on digital signal processor (DSP)
Parallel-computing approach for FFT implementation on digital signal processor (DSP) Yi-Pin Hsu and Shin-Yu Lin Abstract An efficient parallel form in digital signal processor can improve the algorithm
More informationHMM-Based On-Line Recognition of Handwritten Whiteboard Notes
HMM-Based On-Line Recognition of Handwritten Whiteboard Notes Marcus Liwicki and Horst Bunke Institute of Computer Science and Applied Mathematics University of Bern, Neubrückstrasse 10, CH-3012 Bern,
More informationPaper ID: NITETE&TC05 THE HANDWRITTEN DEVNAGARI NUMERALS RECOGNITION USING SUPPORT VECTOR MACHINE
Paper ID: NITETE&TC05 THE HANDWRITTEN DEVNAGARI NUMERALS RECOGNITION USING SUPPORT VECTOR MACHINE Rupali Vitthalrao Suryawanshi Department of Electronics Engineering, Bharatratna Indira Gandhi College,
More informationSupport for word-by-word, non-cursive handwriting
Decuma Latin 3.0 for SONY CLIÉ / PalmOS 5 Support for word-by-word, non-cursive handwriting developed by Decuma AB Copyright 2003 by Decuma AB. All rights reserved. Decuma is a trademark of Decuma AB in
More informationInvarianceness for Character Recognition Using Geo-Discretization Features
Computer and Information Science; Vol. 9, No. 2; 2016 ISSN 1913-8989 E-ISSN 1913-8997 Published by Canadian Center of Science and Education Invarianceness for Character Recognition Using Geo-Discretization
More informationFUZZY C-MEANS ALGORITHM BASED ON PRETREATMENT OF SIMILARITY RELATIONTP
Dynamics of Continuous, Discrete and Impulsive Systems Series B: Applications & Algorithms 14 (2007) 103-111 Copyright c 2007 Watam Press FUZZY C-MEANS ALGORITHM BASED ON PRETREATMENT OF SIMILARITY RELATIONTP
More informationAn ELM-based traffic flow prediction method adapted to different data types Wang Xingchao1, a, Hu Jianming2, b, Zhang Yi3 and Wang Zhenyu4
6th International Conference on Information Engineering for Mechanics and Materials (ICIMM 206) An ELM-based traffic flow prediction method adapted to different data types Wang Xingchao, a, Hu Jianming2,
More informationA Novel Field-source Reverse Transform for Image Structure Representation and Analysis
A Novel Field-source Reverse Transform for Image Structure Representation and Analysis X. D. ZHUANG 1,2 and N. E. MASTORAKIS 1,3 1. WSEAS Headquarters, Agiou Ioannou Theologou 17-23, 15773, Zografou, Athens,
More informationIJSER. Real Time Object Visual Inspection Based On Template Matching Using FPGA
International Journal of Scientific & Engineering Research, Volume 4, Issue 8, August-2013 823 Real Time Object Visual Inspection Based On Template Matching Using FPGA GURURAJ.BANAKAR Electronics & Communications
More information