OPTICAL CHARACTER RECOGNITION FOR SCRIPTS AND DOCUMENTS
Volume 120 No , ISSN: (on-line version) url:

P. Ramani 1, R.Keshav 2, Vishal.D 3, Vigneshwaran.S 4, Isaac Mathew 5
1 (PhD), 1,2,3,4,5 Department of ECE, SRM Institute of Science and Technology, Chennai
1 ramani.p@srmuniv.ac.in, 2 rkeshav2810@gmail.com, 3 visddvs307@gmail.com, 4 svigneshwaran230@gmail.com, 5 isaacmathew96@gmail.com
July 23, 2018

Abstract

The optical character recognition (OCR) system automatically converts scanned images of handwritten, typed or printed text into machine-encoded text while respecting the structure of the text in the scanned image. The system can recognize multilingual images by using a structural algorithm and shape analysis. This paper is organized into six major sections: introduction, architecture of the multilingual OCR, existing system, proposed model, future prospects and conclusion. The proposed OCR system addresses the time and space complexities of existing systems to achieve higher recognition accuracy, and adds new functionality.

Key Words: Digital image processing, OCR, Multilingual, Tamil, Hindi and English scripts, Structure analysis
1 Introduction

Optical character recognition (OCR) is a system that recognizes image content and converts it to text. It has received considerable attention in recent years because it lets a computer recognize optically scanned and digitized character images and convert them into an electronic text document. The input to the system is obtained by scanning the given data.

Figure 1. Sample Multilingual Document

The output of the system is an editable text document that can be stored as a secured file. The character recognition model plays a vital role in text-to-speech conversion, as it identifies and understands the text in a document so that it can later be read out by a computer. We propose a model that recognizes English and Indian-language scripts such as Tamil, Telugu, Hindi, Kannada, Marathi and Sanskrit.

Some of the practical uses of OCR are (i) processing cheques without human intervention, (ii) a reading aid for the visually impaired, (iii) automatically entering text into the computer for publishing, library catalogue management and ledgering, (iv) automatically reading city names and addresses for postal mail, (v) natural language processing and (vi) reading inscriptions from heritage sites.

Most of the task depends on the structural form of the input. The separation of languages depends on the curves and lines involved. The task of separating lines and words in the document is independent of the script and is achieved with conventional projection profile techniques. The system is also based on the concept of Support Vector Machines. The steps of the structural algorithm include segmentation steps such as line and word segmentation. The accuracy for a particular language can also be determined.
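The projection profile technique mentioned above can be sketched as follows. This is a minimal illustration in plain Python rather than the authors' MATLAB code. It assumes an inverted binary image in which 1 marks ink and 0 marks background, and the names `find_runs`, `segment_lines`, `segment_words` and the `min_gap` parameter are hypothetical.

```python
def find_runs(profile, min_count=1):
    """Return (start, end) pairs, end exclusive, where profile >= min_count."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= min_count and start is None:
            start = i                      # a run of inked rows/columns begins
        elif value < min_count and start is not None:
            runs.append((start, i))        # the run ends at the first empty entry
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(image):
    """Split a page into text lines using the horizontal projection profile.

    Assumption: image is a list of rows with 1 = ink and 0 = background.
    """
    horizontal_profile = [sum(row) for row in image]
    return find_runs(horizontal_profile)


def segment_words(image, line, min_gap=2):
    """Split one text line into words using the vertical projection profile.

    Column gaps narrower than min_gap (an illustrative threshold) are treated
    as gaps between characters and merged into the same word.
    """
    top, bottom = line
    width = len(image[0])
    vertical_profile = [
        sum(image[r][c] for r in range(top, bottom)) for c in range(width)
    ]
    words = []
    for start, end in find_runs(vertical_profile):
        if words and start - words[-1][1] < min_gap:
            words[-1] = (words[-1][0], end)   # merge with the previous run
        else:
            words.append((start, end))
    return words
```

Because both functions only count ink pixels per row or per column, they do not depend on the script, which matches the claim that line and word separation is script independent.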
2 Architecture of Multilingual OCR System

The input provided by the user is a scanned image of handwritten text or an image uploaded from the database. Each part of the system can be controlled by the user by changing the parameters or image effects. Before the software can recognize the image, the image undergoes some pre-processing steps. They are:

- Image pre-processing
- Filtering noise
- RGB to Gray conversion
- Image orientation
- Segmentation
- Image Separation

Input Image Processing

The document to be analyzed is converted into an image format and given as input in MATLAB. Using specific sub-functions, the image is converted from RGB to grey and filtered to reduce noise and enhance its clarity.

Line Segmentation

For the given input image, the borders are created (upper line, lower line, headline, base line). Then the image is split into two images, with the desired line in one image and the rest of the lines in the other. Rows that contain very few or no pixels are eliminated.

Word Segmentation

Every word in each line is separated and the respective images are formed irrespective of the language.

Figure 2. (a) Sample Input Image for Word Segmentation
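The pre-processing stage described in this section can be sketched as follows. This is an illustrative plain-Python version, not the authors' MATLAB sub-functions. It assumes the standard luminance weights (0.299, 0.587, 0.114), a 3x3 median filter as one common noise filter, and a fixed threshold of 128; the paper does not name its filter or threshold, and all function names are hypothetical.

```python
def rgb_to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples in 0..255 to grey levels in 0..255."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def median_filter_3x3(gray):
    """Replace each interior pixel by the median of its 3x3 neighbourhood.

    Border pixels are copied unchanged to keep the sketch short.
    """
    height, width = len(gray), len(gray[0])
    out = [row[:] for row in gray]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = sorted(
                gray[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            )
            out[y][x] = window[4]          # middle of the 9 sorted values
    return out


def binarize(gray, threshold=128):
    """Map grey levels to 0 (black ink) or 1 (white background)."""
    return [[1 if value >= threshold else 0 for value in row] for row in gray]
```

A median filter is used here because it removes isolated speckles without blurring stroke edges as much as an averaging filter would; any other noise filter could be substituted.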
Figure 2. (b) Outputs for Word Segmentation for the given sample input

Separation of Scripts

Based on the scripts and patterns (tick components in Telugu, upper line in Hindi, curves in Tamil, horizontal lines in English, etc.), all the words of a single language are collected as separate images. Thus, different images for different languages are identified and formed.

3 Existing System

The first generation of Indic OCRs used rule-based solutions and intuitive features for the recognition of characters. Second-generation OCRs continued with the definition of characters but moved forward with more principled features based on signal processing or statistical techniques. The existing work includes research by Peake and Tan, who proposed a method for automatic script and language identification from document images using multiple-channel (Gabor) filters and gray level co-occurrence matrices for seven languages: Chinese, English, Greek, Korean, Malayalam, Persian and Russian. Tan developed a rotation-invariant texture feature extraction method for automatic script identification for six languages: Chinese, Greek, English, Russian, Persian and Malayalam.

Traditional machine learning methods for Indic OCRs used multiclass classification schemes implemented with neural networks or SVMs. However, such solutions demanded two separate modules. Indian language OCRs developed until recently used a segmentation scheme prior to recognition. From a machine learning perspective, OCR is more of a structured prediction problem, where the output is a sequence of characters/symbols of arbitrary length and the input is a sequence of feature vectors of varying length.

Our survey of previous research in this area shows that very few attempts have been made to focus on these seven languages: Telugu, Tamil, Hindi, English, Sanskrit, Marathi and Kannada. A few basic features analyzed from the script patterns are: top row, bottom row, horizontal line, vertical line, tick components, holes, curves, dots and spacing.

4 Proposed Model

The proposed concept can be implemented in MATLAB using a structural algorithm or connected component approach, which specifies the relationship between neighbouring characters or pixels. The Tamil script has a set of unique patterns such as curves and dots. The dots occur frequently in the upper line. The letters contain a combination of curves, horizontal lines and vertical lines. These specific features can be used to distinguish Tamil from other languages.

Figure 3. Sample Hindi word after segmentation

The top row of the image shows the number of top segments where the intensity of dark pixels is maximum, and the bottom row shows the number of bottom segments where the intensity of dark pixels is maximum. In both cases a black pixel has the value 0 and a white pixel has the value 1.
Figure 4. Image Processing Block Diagram

When more than 50% of a connected component lies below the bottom row, that connected component is considered a descender. Dots are the most important feature in Tamil script as they occur very frequently in the upper row. They occur in Hindi as well, but much less frequently.

The majority of Telugu characters have tick-mark-shaped structures at the top segment of the characters. The majority of Telugu characters also have upward curves at their bottom portion. These distinct characteristics help separate Telugu characters from Kannada, Hindi, Tamil, Sanskrit and English.

Devanagari script contains characters that have a horizontal line at the top, called the headline (sirorekha in Devanagari), as seen in Fig. 3. It joins two or more basic or compound characters to form a word. These headlines are present at the top segment of the characters and can be used to identify Devanagari text. Another strong feature of Devanagari script is that most of the pixels in the headline and bottom line appear to be similar. As a result, both the top profile and the bottom profile of a Hindi text line lie at the top part of the characters. This distinct feature is absent in both Kannada and English text lines, where the dense parts of the top and bottom profiles occur at different positions. Using these characteristics, a Hindi text line can be separated from Kannada, Tamil and English.

The distribution of pixels in English characters is found to be regular and symmetric. Because of this uniform pixel distribution, the densities of the top and bottom profiles are almost the same. In contrast, such uniformity is not found in the other five languages: Sanskrit, Kannada, Tamil, Telugu and Hindi. Thus, this structural attribute is used to support the character recognition features in the proposed model.
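The headline feature described for Devanagari can be illustrated with a short sketch. It follows the paper's pixel convention (0 = black, 1 = white), but the function names and the `min_coverage` and `top_fraction` thresholds are assumptions chosen for illustration; the paper does not give its decision thresholds.

```python
def row_ink_counts(word):
    """Count black pixels (value 0) in each row of a segmented word image."""
    return [sum(1 for pixel in row if pixel == 0) for row in word]


def has_headline(word, min_coverage=0.8, top_fraction=0.5):
    """Return True if some row in the upper part of the word is almost fully inked.

    min_coverage: fraction of the word width a row must cover (assumed value).
    top_fraction: portion of the word height searched from the top (assumed value).
    """
    counts = row_ink_counts(word)
    width = len(word[0])
    top_rows = max(1, int(len(word) * top_fraction))
    return any(count >= min_coverage * width for count in counts[:top_rows])


def densest_row(word):
    """Index of the row with the most black pixels (first one on ties)."""
    counts = row_ink_counts(word)
    return counts.index(max(counts))
```

For a word with a headline, `densest_row` tends to fall near the top of the image, whereas for scripts without a headline the dense rows are spread more evenly. A real classifier would combine this with the other listed features (dots, tick components, curves) rather than rely on one test.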
5 Conclusion and Future Prospects

The proposed approach could successfully identify the seven different language scripts (Telugu, Tamil, English, Sanskrit, Kannada, Marathi and Hindi). It is based on the features extracted from the given text words. Currently we have developed shared linked libraries of each module (script-independent or script-dependent) received from other consortia members. We are testing each language OCR for character-level and word-level accuracy, and are in the process of testing at least a thousand pages of each script.

It can be powerful enough to select, from among the prototypes, those that are most likely to match the sample. Once this subset has been found, a more detailed description is computed and the main classification step is entered. For this purpose, a multilevel description of the character is given in terms of the features provided by the feature extractor. At the intermediate level, the character is decomposed into components by removing the branch points. The main classifier chooses which of the selected prototypes best matches the sample. Experiments have proved that the method is correct and efficient.

As a future prospect of the OCR project, we are trying to tune the recognition to respond well to a variety of fonts and font/point sizes. Multilingual OCR could also be integrated with a Braille interface for all Indian scripts addressed here. We also aim to develop large document management systems.

6 Result

Implementing the listed steps in MATLAB for Tamil, Kannada, English, Telugu, Marathi, Sanskrit and Hindi gave the accuracy results shown in the efficiency table below.
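The paper does not state how character-level and word-level accuracy are computed. One common definition, shown here purely as an assumption, is one minus the edit distance between the OCR output and the ground truth divided by the ground-truth length. The function names are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(
                previous[j] + 1,          # deletion
                current[j - 1] + 1,       # insertion
                previous[j - 1] + cost,   # substitution or match
            ))
        previous = current
    return previous[-1]


def accuracy(reference, hypothesis):
    """Return 1 - distance / len(reference), clipped below at 0."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    return max(0.0, 1.0 - edit_distance(reference, hypothesis) / len(reference))


def character_accuracy(reference_text, ocr_text):
    """Accuracy over individual characters."""
    return accuracy(reference_text, ocr_text)


def word_accuracy(reference_text, ocr_text):
    """Accuracy over whitespace-separated words."""
    return accuracy(reference_text.split(), ocr_text.split())
```

Word-level accuracy is usually lower than character-level accuracy for the same page, because a single wrong character makes the whole word count as an error.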
Table: Efficiency
2009 10th International Conference on Document Analysis and Recognition Fine Classification of Unconstrained Handwritten Persian/Arabic Numerals by Removing Confusion amongst Similar Classes Alireza Alaei
More informationSegmentation Based Optical Character Recognition for Handwritten Marathi characters
Segmentation Based Optical Character Recognition for Handwritten Marathi characters Madhav Vaidya 1, Yashwant Joshi 2,Milind Bhalerao 3 Department of Information Technology 1 Department of Electronics
More informationColor based segmentation using clustering techniques
Color based segmentation using clustering techniques 1 Deepali Jain, 2 Shivangi Chaudhary 1 Communication Engineering, 1 Galgotias University, Greater Noida, India Abstract - Segmentation of an image defines
More informationSkew Detection and Correction of Document Image using Hough Transform Method
Skew Detection and Correction of Document Image using Hough Transform Method [1] Neerugatti Varipally Vishwanath, [2] Dr.T. Pearson, [3] K.Chaitanya, [4] MG JaswanthSagar, [5] M.Rupesh [1] Asst.Professor,
More informationSegmentation of Kannada Handwritten Characters and Recognition Using Twelve Directional Feature Extraction Techniques
Segmentation of Kannada Handwritten Characters and Recognition Using Twelve Directional Feature Extraction Techniques 1 Lohitha B.J, 2 Y.C Kiran 1 M.Tech. Student Dept. of ISE, Dayananda Sagar College
More informationInternational Journal of Advance Research in Engineering, Science & Technology
Impact Factor (SJIF): 3.632 International Journal of Advance Research in Engineering, Science & Technology e-issn: 2393-9877, p-issn: 2394-2444 (Special Issue for ITECE 2016) Analysis and Implementation
More informationGradient-Angular-Features for Word-Wise Video Script Identification
Gradient-Angular-Features for Word-Wise Video Script Identification Author Shivakumara, Palaiahnakote, Sharma, Nabin, Pal, Umapada, Blumenstein, Michael, Tan, Chew Lim Published 2014 Conference Title Pattern
More informationPCA-based Offline Handwritten Character Recognition System
Smart Computing Review, vol. 3, no. 5, October 2013 346 Smart Computing Review PCA-based Offline Handwritten Character Recognition System Munish Kumar 1, M. K. Jindal 2, and R. K. Sharma 3 1 Computer Science
More informationEnhancing the Character Segmentation Accuracy of Bangla OCR using BPNN
Enhancing the Character Segmentation Accuracy of Bangla OCR using BPNN Shamim Ahmed 1, Mohammod Abul Kashem 2 1 M.S. Student, Department of Computer Science and Engineering, Dhaka University of Engineering
More informationOnline Handwritten Devnagari Word Recognition using HMM based Technique
Online Handwritten Devnagari Word using HMM based Technique Prachi Patil Master of Engineering Dept. of Electronics & Telecommunication Dr. D. Y. Patil SOE, Pune, India Saniya Ansari Professor Dept. of
More informationMobile Application with Optical Character Recognition Using Neural Network
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 1, January 2015,
More informationMOMENT AND DENSITY BASED HADWRITTEN MARATHI NUMERAL RECOGNITION
MOMENT AND DENSITY BASED HADWRITTEN MARATHI NUMERAL RECOGNITION S. M. Mali Department of Computer Science, MAEER S Arts, Commerce and Science College, Pune Shankarmali007@gmail.com Abstract In this paper,
More informationA two-stage approach for segmentation of handwritten Bangla word images
A two-stage approach for segmentation of handwritten Bangla word images Ram Sarkar, Nibaran Das, Subhadip Basu, Mahantapas Kundu, Mita Nasipuri #, Dipak Kumar Basu Computer Science & Engineering Department,
More informationComplementary Features Combined in a MLP-based System to Recognize Handwritten Devnagari Character
Journal of Information Hiding and Multimedia Signal Processing 2011 ISSN 2073-4212 Ubiquitous International Volume 2, Number 1, January 2011 Complementary Features Combined in a MLP-based System to Recognize
More informationAn Efficient Character Segmentation Based on VNP Algorithm
Research Journal of Applied Sciences, Engineering and Technology 4(24): 5438-5442, 2012 ISSN: 2040-7467 Maxwell Scientific organization, 2012 Submitted: March 18, 2012 Accepted: April 14, 2012 Published:
More informationMorphological Approach for Segmentation of Scanned Handwritten Devnagari Text
Abstract In this paper we present a system towards the of Hindi Handwritten Devnagari Text. Segmentation of script is essential for handwritten script recognition. This system deals with of (matras) and
More informationPlant Leaf Disease Detection using K means Segmentation
Volume 119 No. 15 2018, 3477-3483 ISSN: 1314-3395 (on-line version) url: http://www.acadpubl.eu/hub/ http://www.acadpubl.eu/hub/ Plant Leaf Disease Detection using K means Segmentation 1 T. Gayathri Devi,
More informationFPGA IMPLEMENTATION FOR REAL TIME SOBEL EDGE DETECTOR BLOCK USING 3-LINE BUFFERS
FPGA IMPLEMENTATION FOR REAL TIME SOBEL EDGE DETECTOR BLOCK USING 3-LINE BUFFERS 1 RONNIE O. SERFA JUAN, 2 CHAN SU PARK, 3 HI SEOK KIM, 4 HYEONG WOO CHA 1,2,3,4 CheongJu University E-maul: 1 engr_serfs@yahoo.com,
More informationarxiv: v1 [cs.cv] 9 Aug 2017
Anveshak - A Groundtruth Generation Tool for Foreground Regions of Document Images Soumyadeep Dey, Jayanta Mukherjee, Shamik Sural, and Amit Vijay Nandedkar arxiv:1708.02831v1 [cs.cv] 9 Aug 2017 Department
More informationText-Line Extraction from Handwritten Document images using Histogram and Connected Component Analysis
Text-Line Extraction from Handwritten Document images using Histogram and Connected Component Analysis G. G. Rajput Rani Channamma University Belagavi, Karnataka Suryakant B. Ummapure Dept. of Computer
More informationSpectral Analysis of Projection Histogram for Enhancing Close matching character Recognition in Malayalam
Spectral Analysis of Projection Histogram for Enhancing Close matching character Recognition in Malayalam Sajilal Divakaran University of Kerala, Thiruvananthapuram, Kerala, India 6981 sajilald@gmail.com
More informationA New Algorithm for Detecting Text Line in Handwritten Documents
A New Algorithm for Detecting Text Line in Handwritten Documents Yi Li 1, Yefeng Zheng 2, David Doermann 1, and Stefan Jaeger 1 1 Laboratory for Language and Media Processing Institute for Advanced Computer
More informationA System for Joining and Recognition of Broken Bangla Numerals for Indian Postal Automation
A System for Joining and Recognition of Broken Bangla Numerals for Indian Postal Automation K. Roy, U. Pal and B. B. Chaudhuri CVPR Unit; Indian Statistical Institute, Kolkata-108; India umapada@isical.ac.in
More informationRecognition of Gurmukhi Text from Sign Board Images Captured from Mobile Camera
International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 17 (2014), pp. 1839-1845 International Research Publications House http://www. irphouse.com Recognition of
More informationAn Improvement Study for Optical Character Recognition by using Inverse SVM in Image Processing Technique
An Improvement Study for Optical Character Recognition by using Inverse SVM in Image Processing Technique I Dinesh KumarVerma, II Anjali Khatri I Assistant Professor (ECE) PDM College of Engineering, Bahadurgarh,
More informationK S Prasanna Kumar et al,int.j.computer Techology & Applications,Vol 3 (1),
Optical Character Recognition (OCR) for Kannada numerals using Left Bottom 1/4 th segment minimum features extraction K.S. Prasanna Kumar Research Scholar, JJT University, Jhunjhunu, Rajasthan, India prasannakumarks@acharya.ac.in
More informationVolume 2, Issue 5, May 2014 International Journal of Advance Research in Computer Science and Management Studies
Volume 2, Issue 5, May 2014 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com Fast
More informationOptical Recognition of Digital Characters Using Machine Learning
International Journal of Research Studies in Computer Science and Engineering (IJRSCSE) Volume 5, Issue 1, 2018, PP 9-16 ISSN 2349-4840 (Print) & ISSN 2349-4859 (Online) DOI: http://dx.doi.org/10.20431/2349-4859.0501002
More informationKeyword Spotting in Document Images through Word Shape Coding
2009 10th International Conference on Document Analysis and Recognition Keyword Spotting in Document Images through Word Shape Coding Shuyong Bai, Linlin Li and Chew Lim Tan School of Computing, National
More informationScript Characterization in the Old Slavic Documents
Script Characterization in the Old Slavic Documents Darko Brodić 1 2, Zoran N. Milivojević,andČedomir A. Maluckov1 1 University of Belgrade, Technical Faculty in Bor, Vojske Jugoslavije 12, 19210 Bor,
More informationHandwritten character and word recognition using their geometrical features through neural networks
Handwritten character and word recognition using their geometrical features through neural networks Sudarshan Sawant 1, Prof. Seema Baji 2 1 Student, Department of electronics and Tele-communications,
More informationA Fast Recognition System for Isolated Printed Characters Using Center of Gravity and Principal Axis
Applied Mathematics, 2013, 4, 1313-1319 http://dx.doi.org/10.4236/am.2013.49177 Published Online September 2013 (http://www.scirp.org/journal/am) A Fast Recognition System for Isolated Printed Characters
More informationA New Technique for Segmentation of Handwritten Numerical Strings of Bangla Language
I.J. Information Technology and Computer Science, 2013, 05, 38-43 Published Online April 2013 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijitcs.2013.05.05 A New Technique for Segmentation of Handwritten
More informationMarathi Handwritten Numeral Recognition using Fourier Descriptors and Normalized Chain Code
Marathi Handwritten Numeral Recognition using Fourier Descriptors and Normalized Chain Code G. G. Rajput Department of Computer Science Gulbarga University, Gulbarga 585106 Karnataka, India S. M. Mali
More information