LECTURE 6 TEXT PROCESSING

1 SCIENTIFIC DATA COMPUTING 1 MTAT
LECTURE 6: TEXT PROCESSING
Prepared by: Amnir Hadachi, Institute of Computer Science, University of Tartu, amnir.hadachi@ut.ee

2 OUTLINE
- Aims
- Character typology
- OCR systems
  - Preprocessing
  - Segmentation
  - Feature extraction
  - Classification

3 1. AIM

4 AIMS
Text recognition is part of the document analysis domain. Its objective is to extract text from images, or from sequences of character codes, and recognise it.

5 2. TYPOLOGY

6 CHARACTER TYPOLOGY
- Machine-printed text
- Limited or restricted vocabulary (city names, street names, ...) and contextual language knowledge
- Handwritten text
- Various alphabets: Western languages (Roman, Greek, ...), Asian (Chinese, Japanese, Korean, Thai, ...), Arabic alphabets
- Isolated characters, connected characters, cursive text

7 3. OCR

8 PRE-PROCESSING -> SEGMENTATION -> FEATURE EXTRACTION -> CLASSIFICATION

9 3.1 PRE-PROCESSING

10 PRE-PROCESSING
Any raw data is subject to preprocessing. The preprocessing stage makes sure that the data is clean and usable. The main targets of preprocessing are:
- Binarization
- Noise reduction
- Stroke width normalisation
- Skew correction
- Slant removal

11 PRE-PROCESSING
Binarization: binarization, or thresholding, is the conversion of a grayscale image into a binary image. We can distinguish two categories:
- Global: picks one threshold value and applies it to the entire document.
- Adaptive: uses a different threshold for each pixel, computed from the pixels in its local area.
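The two categories can be sketched in a few lines of NumPy. This is only an illustration under assumptions that are not in the slides: dark ink on a light background, a fixed global threshold `t`, and an adaptive rule that compares each pixel with the mean of an odd-sized square window minus a hypothetical `offset`.

```python
import numpy as np

def global_threshold(gray, t=128):
    """Global binarization: one threshold for the whole image (dark ink -> 1)."""
    return (np.asarray(gray) < t).astype(np.uint8)

def adaptive_threshold(gray, window=15, offset=10):
    """Adaptive binarization: compare each pixel with its local window mean.

    `window` must be odd; `offset` is an illustrative safety margin.
    """
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    pad = window // 2
    padded = np.pad(gray, pad, mode="edge")
    # Integral image with a leading zero row/column gives O(1) window sums.
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    total = (integral[window:window + h, window:window + w]
             - integral[:h, window:window + w]
             - integral[window:window + h, :w]
             + integral[:h, :w])
    local_mean = total / (window * window)
    return (gray < local_mean - offset).astype(np.uint8)
```

On a page with uneven illumination, a single global threshold tends to merge the darker half of the background with the ink, while the local rule follows the background level.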

12 PRE-PROCESSING
Noise reduction: improves the quality of the document. Two main approaches:
- Filtering or masks
- Morphological operations (erosion, dilation, ...)
Normalisation: reduces the data size, which can help in extracting shape information from the data.
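As a rough illustration of the morphological approach, the sketch below implements erosion and dilation with a 3x3 square structuring element and combines them into an opening. The element size and the zero-padded border are assumptions made for the example, not something the slides prescribe.

```python
import numpy as np

def _neighbourhood_stack(binary):
    """Stack the 3x3 neighbourhood of every pixel (zero padding at the border)."""
    binary = np.asarray(binary)
    padded = np.pad(binary, 1)
    h, w = binary.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(3) for dx in range(3)])

def erode(binary):
    """A pixel stays foreground only if its whole 3x3 neighbourhood is foreground."""
    return _neighbourhood_stack(binary).min(axis=0)

def dilate(binary):
    """A pixel becomes foreground if any pixel in its 3x3 neighbourhood is foreground."""
    return _neighbourhood_stack(binary).max(axis=0)

def opening(binary):
    """Erosion followed by dilation: removes specks smaller than the element."""
    return dilate(erode(binary))
```

An opening deletes isolated noise pixels while restoring the shape of strokes that are at least as large as the structuring element.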

13 PRE-PROCESSING
Skew correction: aligns the paper document with the coordinate system of the scanner. Skew methods rely on techniques such as projection profiles, transformations, ...
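One common way to use projection profiles for skew estimation is to try a range of candidate angles and keep the one that gives the sharpest horizontal profile. The sketch below follows that idea; the candidate range, the 0.5 degree step, and the sum-of-squares sharpness score are assumptions chosen for illustration.

```python
import numpy as np

def estimate_skew(binary, angles=np.arange(-10, 10.5, 0.5)):
    """Return the angle (degrees) whose horizontal projection profile is sharpest.

    The result is the angle theta for which mapping each foreground pixel
    (x, y) to y*cos(theta) - x*sin(theta) flattens the text lines.
    """
    ys, xs = np.nonzero(binary)
    if ys.size == 0:
        return 0.0
    best_angle, best_score = 0.0, -1.0
    for angle in angles:
        theta = np.deg2rad(angle)
        # Row coordinate of every foreground pixel after rotating by `angle`.
        rotated_y = ys * np.cos(theta) - xs * np.sin(theta)
        rows = np.round(rotated_y - rotated_y.min()).astype(int)
        profile = np.bincount(rows).astype(np.float64)
        # Concentrated profiles (few tall peaks) give a large sum of squares.
        score = np.sum(profile ** 2)
        if score > best_score:
            best_angle, best_score = float(angle), score
    return best_angle
```

Once the angle is known, the page can be rotated by it with any image library to bring the lines back to horizontal.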

14 PRE-PROCESSING
Slant removal: the slant of handwriting varies from one writer to another. Slant removal techniques are used to normalise all the characters in the document. Examples:
- Calculation of the average angle of near-vertical elements
- Entropy

15 3.2 SEGMENTATION

16 SEGMENTATION
- Text line detection (projections, Hough transform, ...)
- Word extraction (vertical projections, connected component analysis, ...)
- Explicit segmentation
- Implicit segmentation
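The projection-based items above can be illustrated with a minimal sketch: text lines are taken as runs of rows that contain ink, and candidate words or characters as runs of occupied columns inside one line. The function names are hypothetical, and a practical system would also merge small gaps and handle touching lines.

```python
import numpy as np

def find_runs(profile):
    """Return (start, end) pairs of consecutive non-zero entries (end exclusive)."""
    occupied = np.concatenate(([0], (np.asarray(profile) > 0).astype(np.int8), [0]))
    changes = np.diff(occupied)
    starts = np.flatnonzero(changes == 1)
    ends = np.flatnonzero(changes == -1)
    return list(zip(starts.tolist(), ends.tolist()))

def segment_lines(binary):
    """Text lines: runs of rows containing at least one foreground pixel."""
    return find_runs(np.asarray(binary).sum(axis=1))

def segment_columns(line_img):
    """Candidate words or characters: runs of occupied columns within one line."""
    return find_runs(np.asarray(line_img).sum(axis=0))
```

Each returned pair can be used directly as a slice, for example `binary[start:end]` for one text line.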

17 3.3 FEATURE EXTRACTION

18 FEATURE EXTRACTION
The characters are represented by feature vectors, which act as their identity or fingerprint. The objective is to maximise the recognition rate with the smallest possible number of elements. The variability and imprecision of handwriting make this a very difficult task. Types of features:
- Statistical
- Structural
- Global transformations and moments

19 FEATURE EXTRACTION
Statistical features: represent the character image by the statistical distribution of points. The major statistical features used are:
- Zoning
- Projections and profiles
- Crossings and distances

20 FEATURE EXTRACTION
Statistical features: zoning. The image is split into small zones, and features are extracted from each zone to form the feature vector. Objective: to obtain local characteristics instead of global characteristics.

21 FEATURE EXTRACTION
Statistical features: zoning, density features. The number of foreground pixels in each zone is considered a feature.
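A minimal sketch of zoning with density features follows. The 4x4 grid is an arbitrary choice, and dividing the pixel count by the zone area (so that zones of different size are comparable) is an assumption added here; the slide only speaks of the number of foreground pixels.

```python
import numpy as np

def zoning_density(char_img, zones=(4, 4)):
    """Split a binary character image into a grid and return one density per zone."""
    char_img = (np.asarray(char_img) > 0)
    rows, cols = zones
    h, w = char_img.shape
    row_edges = np.linspace(0, h, rows + 1).astype(int)
    col_edges = np.linspace(0, w, cols + 1).astype(int)
    features = []
    for r in range(rows):
        for c in range(cols):
            zone = char_img[row_edges[r]:row_edges[r + 1],
                            col_edges[c]:col_edges[c + 1]]
            # Fraction of foreground pixels in this zone (0 for an empty zone).
            features.append(zone.sum() / max(zone.size, 1))
    return np.array(features)
```

The zones are visited row by row, so the feature vector has `rows * cols` entries in reading order.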

22 FEATURE EXTRACTION
Statistical features: zoning, direction features. These focus on the contour of the character image. For each zone the contour is followed, and a directional histogram is obtained by analysing the adjacent pixels.
Reference: Off line handwritten OCR by G. Vamvakas

23 FEATURE EXTRACTION
Statistical features: projection histograms. They represent the 2-D image signal as a 1-D signal. These features are:
- Independent of noise and deformation
- Dependent on rotation
A projection histogram is the number of foreground pixels in each column and each row of the image.
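The definition translates almost directly into NumPy. The sketch below is illustrative, with hypothetical function names; concatenating the two histograms into one vector is one possible way to use them as features.

```python
import numpy as np

def projection_histograms(binary):
    """Return (horizontal, vertical) projections of a binary character image."""
    b = (np.asarray(binary) > 0)
    horizontal = b.sum(axis=1)  # foreground pixels in each row
    vertical = b.sum(axis=0)    # foreground pixels in each column
    return horizontal, vertical

def projection_features(binary):
    """Concatenate both projections into a single 1-D feature vector."""
    horizontal, vertical = projection_histograms(binary)
    return np.concatenate([horizontal, vertical])
```

Rotating the image by 90 degrees swaps the two histograms, which shows the rotation dependence mentioned above.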

24 FEATURE EXTRACTION
Statistical features: profiles. A profile counts the number of pixels between the bounding box and the edge of the character. Profiles can be used to:
- extract the contour of the character
- locate upper or lower points on the contour
- calculate in/out profiles of the contour
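As an illustration, the left and right profiles of a character can be computed as below, assuming the image has already been cropped to the character's bounding box. Reporting the full width for rows without ink is a convention chosen for this sketch.

```python
import numpy as np

def left_profile(binary):
    """Background pixels between the left edge and the first ink pixel, per row.

    Rows without ink get the full image width.
    """
    b = (np.asarray(binary) > 0)
    width = b.shape[1]
    has_ink = b.any(axis=1)
    first = b.argmax(axis=1)  # index of the first True in each row
    return np.where(has_ink, first, width)

def right_profile(binary):
    """Same measurement taken from the right edge."""
    return left_profile(np.asarray(binary)[:, ::-1])
```

Top and bottom profiles follow the same pattern applied to the transposed image.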

25 FEATURE EXTRACTION
Statistical features: crossings and distances. Crossings count the transitions from background to foreground pixels along vertical and horizontal lines.
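A possible way to count crossings is to look at the differences between neighbouring pixels along each row and column. The sketch below pads the image with background so that a stroke touching the border still counts as one transition; that padding is an assumption of this example.

```python
import numpy as np

def crossings(binary):
    """Count background-to-foreground transitions along every row and column."""
    b = (np.asarray(binary) > 0).astype(np.int8)
    padded = np.pad(b, 1)  # background border around the character
    # A difference of +1 marks a step from background (0) to foreground (1).
    row_cross = (np.diff(padded, axis=1) == 1).sum(axis=1)[1:-1]
    col_cross = (np.diff(padded, axis=0) == 1).sum(axis=0)[1:-1]
    return row_cross, col_cross
```

The first array has one entry per image row (horizontal scan lines), the second one entry per image column (vertical scan lines).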

26 FEATURE EXTRACTION
Structural features: structural features can have a high tolerance to distortions and style variations. They focus on topological and geometrical properties of the character, such as branch points, strokes and their directions, inflections between points, loops, crossing points, etc.

27 FEATURE EXTRACTION
Global transformation: transformations that affect the contour of the image and can make recognition effective, such as:
- Fourier Transform (FT)
- Central moments
- Zernike moments

28 3.4 CLASSIFICATION

29 CLASSIFICATION
Applying classification techniques such as:
- k-Nearest Neighbour (k-NN)
- Bayes classifier
- Neural Networks (NN)
- Hidden Markov Models (HMM)
- Support Vector Machines (SVM)
- etc.
Reading: Pooja Kamavisdar, Sonam Saluja, Sonu Agrawal, A Survey on Image Classification Approaches and Techniques, International Journal of Advanced Research in Computer and Communication Engineering, Vol. 2, Issue 1, January 2013
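To make the last stage concrete, here is a minimal k-NN sketch operating on feature vectors such as those produced in the feature extraction stage. Euclidean distance, `k=3`, and simple majority voting are assumptions for illustration; the slides do not fix any of these choices.

```python
from collections import Counter

import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    """Label a feature vector by majority vote among its k nearest training samples."""
    train_x = np.asarray(train_x, dtype=float)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query to every training vector.
    distances = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(distances, kind="stable")[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

In an OCR pipeline, `train_x` would hold the feature vectors of labelled character images and `train_y` the corresponding character labels.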

30 READINGS AND REFERENCES
- Teddy Mantoro, Abdul Muis Sobri, Wendi Usino, Optical Character Recognition (OCR) Performance in Server-Based Mobile Environment, in International Conference on Advanced Computer Science Applications and Technologies.
- G. Vamvakas, N. Stamatopoulos, B. Gatos, I. Pratikakis and S.J. Perantonis, "Standard Database and Methods for Handwritten Greek Character Recognition", in Proc. of the 11th Panhellenic Conference on Informatics (PCI 2007), Patras, May
- G. Vamvakas, B. Gatos, I. Pratikakis, N. Stamatopoulos, A. Roniotis and S.J. Perantonis, "Hybrid Off-Line OCR for Isolated Handwritten Greek Characters", The Fourth IASTED International Conference on Signal Processing, Pattern Recognition, and Applications (SPPRA 2007), ISBN: , pp , Innsbruck, Austria, February 2007.


Robust Phase-Based Features Extracted From Image By A Binarization Technique IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 18, Issue 4, Ver. IV (Jul.-Aug. 2016), PP 10-14 www.iosrjournals.org Robust Phase-Based Features Extracted From

More information

Development of an Automated Fingerprint Verification System

Development of an Automated Fingerprint Verification System Development of an Automated Development of an Automated Fingerprint Verification System Fingerprint Verification System Martin Saveski 18 May 2010 Introduction Biometrics the use of distinctive anatomical

More information

Text Detection in Indoor/Outdoor Scene Images

Text Detection in Indoor/Outdoor Scene Images Text Detection in Indoor/Outdoor Scene Images B. Gatos, I. Pratikakis, K. Kepene and S.J. Perantonis Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Center

More information

MURDOCH RESEARCH REPOSITORY.

MURDOCH RESEARCH REPOSITORY. MURDOCH RESEARCH REPOSITORY http://researchrepository.murdoch.edu.au This is the author's final version of the work, as accepted for publication following peer review but without the publisher's layout

More information

An Arabic Baseline Estimation Method Based on Feature Points Extraction

An Arabic Baseline Estimation Method Based on Feature Points Extraction , July 5-7, 2017, London, U.K. An Arabic Baseline Estimation Method Based on Feature Points Extraction Arwa AL-Khatatneh, Sakinah Ali Pitchay and Musab Al-qudah Abstract Baseline estimation is an important

More information

Neural Network Classifier for Isolated Character Recognition

Neural Network Classifier for Isolated Character Recognition Neural Network Classifier for Isolated Character Recognition 1 Ruby Mehta, 2 Ravneet Kaur 1 M.Tech (CSE), Guru Nanak Dev University, Amritsar (Punjab), India 2 M.Tech Scholar, Computer Science & Engineering

More information

ADVANCES in NATURAL and APPLIED SCIENCES

ADVANCES in NATURAL and APPLIED SCIENCES ADVANCES in NATURAL and APPLIED SCIENCES ISSN: 1995-0772 Published BYAENSI Publication EISSN: 1998-1090 http://www.aensiweb.com/anas 2017 May 11(7):pages 57-63 Open Access Journal English Cursive Hand

More information

Recognition of Off-Line Handwritten Devnagari Characters Using Quadratic Classifier

Recognition of Off-Line Handwritten Devnagari Characters Using Quadratic Classifier Recognition of Off-Line Handwritten Devnagari Characters Using Quadratic Classifier N. Sharma, U. Pal*, F. Kimura**, and S. Pal Computer Vision and Pattern Recognition Unit, Indian Statistical Institute

More information

Automatic Static Signature Verification Systems: A Review

Automatic Static Signature Verification Systems: A Review Automatic Static Signature Verification Systems: A Review 1 Vitthal K. Bhosale1 Dr. Anil R. Karwankar2 1 PG Student, Government College of Engineering, Aurangabad (M.S.), 2 Assistant Professor, Dept. Of

More information

Handwritten Text Recognition

Handwritten Text Recognition Handwritten Text Recognition M.J. Castro-Bleda, S. España-Boquera, F. Zamora-Martínez Universidad Politécnica de Valencia Spain Avignon, 9 December 2010 Text recognition () Avignon Avignon, 9 December

More information

An Efficient Character Segmentation Based on VNP Algorithm

An Efficient Character Segmentation Based on VNP Algorithm Research Journal of Applied Sciences, Engineering and Technology 4(24): 5438-5442, 2012 ISSN: 2040-7467 Maxwell Scientific organization, 2012 Submitted: March 18, 2012 Accepted: April 14, 2012 Published:

More information

Segmentation free Bangla OCR using HMM: Training and Recognition

Segmentation free Bangla OCR using HMM: Training and Recognition Segmentation free Bangla OCR using HMM: Training and Recognition Md. Abul Hasnat, S.M. Murtoza Habib, Mumit Khan BRAC University, Bangladesh mhasnat@gmail.com, murtoza@gmail.com, mumit@bracuniversity.ac.bd

More information

Segmentation of Bangla Handwritten Text

Segmentation of Bangla Handwritten Text Thesis Report Segmentation of Bangla Handwritten Text Submitted By: Sabbir Sadik ID:09301027 Md. Numan Sarwar ID: 09201027 CSE Department BRAC University Supervisor: Professor Dr. Mumit Khan Date: 13 th

More information

EE 584 MACHINE VISION

EE 584 MACHINE VISION EE 584 MACHINE VISION Binary Images Analysis Geometrical & Topological Properties Connectedness Binary Algorithms Morphology Binary Images Binary (two-valued; black/white) images gives better efficiency

More information

Optical Character Recognition

Optical Character Recognition Chapter 2 Optical Character Recognition 2.1 Introduction Optical Character Recognition (OCR) is one of the challenging areas of pattern recognition. It gained popularity among the research community due

More information

Word Matching of handwritten scripts

Word Matching of handwritten scripts Word Matching of handwritten scripts Seminar about ancient document analysis Introduction Contour extraction Contour matching Other methods Conclusion Questions Problem Text recognition in handwritten

More information

Chapter 2. Components

Chapter 2. Components Chapter 2 [2]OCR: General Architecture and Components In some areas which require the automation of human intelligence, such as chess playing, tremendous improvements are achieved over the last few decades.

More information

Printed Arabic Text Recognition using Linear and Nonlinear Regression

Printed Arabic Text Recognition using Linear and Nonlinear Regression Printed Arabic Text Recognition using Linear and Nonlinear Regression Ashraf A. Shahin 1,2 1 College of Computer and Information Sciences, Al Imam Mohammad Ibn Saud Islamic University (IMSIU) Riyadh, Kingdom

More information

A Decision Tree Based Method to Classify Persian Handwritten Numerals by Extracting Some Simple Geometrical Features

A Decision Tree Based Method to Classify Persian Handwritten Numerals by Extracting Some Simple Geometrical Features A Decision Tree Based Method to Classify Persian Handwritten Numerals by Extracting Some Simple Geometrical Features Hamidreza Alvari, Seyed Mehdi Hazrati Fard, and Bahar Salehi Abstract Automatic recognition

More information

IJSER. Abstract : Image binarization is the process of separation of image pixel values as background and as a foreground. We

IJSER. Abstract : Image binarization is the process of separation of image pixel values as background and as a foreground. We International Journal of Scientific & Engineering Research, Volume 7, Issue 3, March-2016 1238 Adaptive Local Image Contrast in Image Binarization Prof.Sushilkumar N Holambe. PG Coordinator ME(CSE),College

More information

Hand Written Character Recognition using VNP based Segmentation and Artificial Neural Network

Hand Written Character Recognition using VNP based Segmentation and Artificial Neural Network International Journal of Emerging Engineering Research and Technology Volume 4, Issue 6, June 2016, PP 38-46 ISSN 2349-4395 (Print) & ISSN 2349-4409 (Online) Hand Written Character Recognition using VNP

More information

Recognition of Printed Arabic Words with Fuzzy ARTMAP Neural Network

Recognition of Printed Arabic Words with Fuzzy ARTMAP Neural Network Recognition of Printed Arabic Words with Fuzzy ARTMAP Neural Network Adnan Amin' and Nabeel Murshed2 'School of Computer Science and Engineering University of New South Wales, Sydney-Australia amin@cse.unsw.edu.au

More information

Feature Extraction and Image Processing, 2 nd Edition. Contents. Preface

Feature Extraction and Image Processing, 2 nd Edition. Contents. Preface , 2 nd Edition Preface ix 1 Introduction 1 1.1 Overview 1 1.2 Human and Computer Vision 1 1.3 The Human Vision System 3 1.3.1 The Eye 4 1.3.2 The Neural System 7 1.3.3 Processing 7 1.4 Computer Vision

More information

IDIAP. Martigny - Valais - Suisse IDIAP

IDIAP. Martigny - Valais - Suisse IDIAP R E S E A R C H R E P O R T IDIAP Martigny - Valais - Suisse Off-Line Cursive Script Recognition Based on Continuous Density HMM Alessandro Vinciarelli a IDIAP RR 99-25 Juergen Luettin a IDIAP December

More information

Seminar. Topic: Object and character Recognition

Seminar. Topic: Object and character Recognition Seminar Topic: Object and character Recognition Tse Ngang Akumawah Lehrstuhl für Praktische Informatik 3 Table of content What's OCR? Areas covered in OCR Procedure Where does clustering come in Neural

More information

NEW ALGORITHMS FOR SKEWING CORRECTION AND SLANT REMOVAL ON WORD-LEVEL

NEW ALGORITHMS FOR SKEWING CORRECTION AND SLANT REMOVAL ON WORD-LEVEL NEW ALGORITHMS FOR SKEWING CORRECTION AND SLANT REMOVAL ON WORD-LEVEL E.Kavallieratou N.Fakotakis G.Kokkinakis Wire Communication Laboratory, University of Patras, 26500 Patras, ergina@wcl.ee.upatras.gr

More information

Handwritten Arabic Digits Recognition Using Bézier Curves

Handwritten Arabic Digits Recognition Using Bézier Curves www.ijcsi.org 57 Handwritten Arabic Digits Recognition Using Bézier Curves Aissa Kerkour El Miad and Azzeddine Mazroui University Mohammed First, Faculty of Sciences, Oujda, Morocco Abstract In this paper

More information

Handwritten Numeral Recognition of Kannada Script

Handwritten Numeral Recognition of Kannada Script Handwritten Numeral Recognition of Kannada Script S.V. Rajashekararadhya Department of Electrical and Electronics Engineering CEG, Anna University, Chennai, India svr_aradhya@yahoo.co.in P. Vanaja Ranjan

More information