IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY A REVIEW: OPTICAL CHARACTER RECOGNITION Swati Tomar *1 & Amit Kishore 2 *1

Similar documents
Arabic Handwritten Recognition using Hybrid Transform

Cursive Handwriting Recognition System Using Feature Extraction and Artificial Neural Network

Argha Roy* Dept. of CSE Netaji Subhash Engg. College West Bengal, India.

HANDWRITTEN GURMUKHI CHARACTER RECOGNITION USING WAVELET TRANSFORMS

Segmentation of Characters of Devanagari Script Documents

Isolated Curved Gurmukhi Character Recognition Using Projection of Gradient

Machine Learning: Handwritten Character Recognition

Simplifying Handwritten Characters Recognition Using a Particle Swarm Optimization Approach

Handwritten Gurumukhi Character Recognition by using Recurrent Neural Network

Text Extraction from Images

Review on Optical Character Recognition and Signature Recognition and Verification Technique

Simulation of Back Propagation Neural Network for Iris Flower Classification

CHAPTER 1 INTRODUCTION

A Survey of Problems of Overlapped Handwritten Characters in Recognition process for Gurmukhi Script

Mobile Application with Optical Character Recognition Using Neural Network

IMPLEMENTING ON OPTICAL CHARACTER RECOGNITION USING MEDICAL TABLET FOR BLIND PEOPLE

Neural Network Classifier for Isolated Character Recognition

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

NOVATEUR PUBLICATIONS INTERNATIONAL JOURNAL OF INNOVATIONS IN ENGINEERING RESEARCH AND TECHNOLOGY [IJIERT] ISSN: VOLUME 5, ISSUE

Tamil Image Text to Speech Using Rasperry PI

A Literature Survey on Handwritten Character Recognition

Handwritten Character Recognition: A Comprehensive Review on Geometrical Analysis

Segmentation of Kannada Handwritten Characters and Recognition Using Twelve Directional Feature Extraction Techniques

Optical Character Recognition (OCR) for Printed Devnagari Script Using Artificial Neural Network

Handwritten Devanagari Character Recognition Model Using Neural Network

International Journal of Advance Research in Engineering, Science & Technology

A Technique for Classification of Printed & Handwritten text

FREEMAN CODE BASED ONLINE HANDWRITTEN CHARACTER RECOGNITION FOR MALAYALAM USING BACKPROPAGATION NEURAL NETWORKS

A Technique for Offline Handwritten Character Recognition

ABJAD: AN OFF-LINE ARABIC HANDWRITTEN RECOGNITION SYSTEM

Problems in Extraction of Date Field from Gurmukhi Documents

Segmentation of Isolated and Touching characters in Handwritten Gurumukhi Word using Clustering approach

PCA-based Offline Handwritten Character Recognition System

OCR For Handwritten Marathi Script

A Brief Study of Feature Extraction and Classification Methods Used for Character Recognition of Brahmi Northern Indian Scripts

LECTURE 6 TEXT PROCESSING

OFFLINE SIGNATURE VERIFICATION

Multi-Layer Perceptron Network For Handwritting English Character Recoginition

Opportunities and Challenges of Handwritten Sanskrit Character Recognition System

Handwritten Marathi Character Recognition on an Android Device

International Journal of Advance Research in Engineering, Science & Technology

A Review on Handwritten Character Recognition

A THREE LAYERED MODEL TO PERFORM CHARACTER RECOGNITION FOR NOISY IMAGES

Recognition of Gurmukhi Text from Sign Board Images Captured from Mobile Camera

9. Lecture Neural Networks

2. Basic Task of Pattern Classification

Toward Part-based Document Image Decoding

IJESRT. Scientific Journal Impact Factor: (ISRA), Impact Factor: 1.852

Fine Classification of Unconstrained Handwritten Persian/Arabic Numerals by Removing Confusion amongst Similar Classes

Optimization of Benchmark Functions Using Artificial Bee Colony (ABC) Algorithm

DEVANAGARI SCRIPT SEPARATION AND RECOGNITION USING MORPHOLOGICAL OPERATIONS AND OPTIMIZED FEATURE EXTRACTION METHODS

Hand Written Character Recognition using VNP based Segmentation and Artificial Neural Network

Handwritten Hindi Character Recognition System Using Edge detection & Neural Network

Handwritten Character Recognition with Feedback Neural Network

Khmer Character Recognition using Artificial Neural Network

Research Article Path Planning Using a Hybrid Evolutionary Algorithm Based on Tree Structure Encoding

Character Recognition Using Matlab s Neural Network Toolbox

DATABASE DEVELOPMENT OF HISTORICAL DOCUMENTS: SKEW DETECTION AND CORRECTION

Preprocessing of Gurmukhi Strokes in Online Handwriting Recognition

A Fast Personal Palm print Authentication based on 3D-Multi Wavelet Transformation

Handwriting Recognition of Diverse Languages

A Fast Recognition System for Isolated Printed Characters Using Center of Gravity and Principal Axis

A Review on Different Character Segmentation Techniques for Handwritten Gurmukhi Scripts

An Improvement Study for Optical Character Recognition by using Inverse SVM in Image Processing Technique

HCR Using K-Means Clustering Algorithm

Gender Classification Technique Based on Facial Features using Neural Network

wavelet packet transform

Handwritten Script Recognition at Block Level

Density Based Clustering using Modified PSO based Neighbor Selection

IMPROVED FACE RECOGNITION USING ICP TECHNIQUES INCAMERA SURVEILLANCE SYSTEMS. Kirthiga, M.E-Communication system, PREC, Thanjavur

SEVERAL METHODS OF FEATURE EXTRACTION TO HELP IN OPTICAL CHARACTER RECOGNITION

INTERNATIONAL RESEARCH JOURNAL OF MULTIDISCIPLINARY STUDIES

A Combined Method for On-Line Signature Verification

A study of hybridizing Population based Meta heuristics

Face Detection using Hierarchical SVM

Image Compression: An Artificial Neural Network Approach

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS

A New Approach to Detect and Extract Characters from Off-Line Printed Images and Text

International Journal of Advance Engineering and Research Development OPTICAL CHARACTER RECOGNITION USING ARTIFICIAL NEURAL NETWORK

An Algorithm based on SURF and LBP approach for Facial Expression Recognition

LITERATURE REVIEW. For Indian languages most of research work is performed firstly on Devnagari script and secondly on Bangla script.

Renu Dhir C.S.E department NIT Jalandhar India

Unit V. Neural Fuzzy System

Keywords Handwritten alphabet recognition, local binary pattern (LBP), feature Descriptor, nearest neighbor classifier.

Classification of Printed Chinese Characters by Using Neural Network

Learning to Recognize Faces in Realistic Conditions

Robust Optical Character Recognition under Geometrical Transformations

Design and Development of BFO Based Robust Watermarking Algorithm for Digital Image. Priyanka*, Charu Jain** and Geet Sandhu***

Keywords Clustering, Goals of clustering, clustering techniques, clustering algorithms.

Recognition of Unconstrained Malayalam Handwritten Numeral

Invarianceness for Character Recognition Using Geo-Discretization Features

A Document Image Analysis System on Parallel Processors

PATTERN RECOGNITION USING NEURAL NETWORKS

1. Introduction. 2. Motivation and Problem Definition. Volume 8 Issue 2, February Susmita Mohapatra

Handwritten Character Recognition A Review

Linear Discriminant Analysis in Ottoman Alphabet Character Recognition

Handwritten character and word recognition using their geometrical features through neural networks

Performance Analysis of Data Mining Classification Techniques

A Survey And Comparative Analysis Of Data

Volume 1, Issue 3 (2013) ISSN International Journal of Advance Research and Innovation

Transcription:

IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY A REVIEW: OPTICAL CHARACTER RECOGNITION Swati Tomar *1 & Amit Kishore 2 *1 M. Tech Scholar, Cyber Security, Site, Subharti University 2 Assistant professor, Cyber Security, Site, Subharti University DOI: 10.5281/zenodo.1213078 ABSTRACT This paper presents detailed review in the field of Optical Character Recognition. Various techniques are determine that have been proposed to realize the center of character recognition in an optical character recognition system. Even though, sufficient studies and papers are describes the techniques for converting textual content from a paper document into machine readable form. Optical character recognition is a process where the computer understands automatically the image of handwritten script and transfer into classify character. This material use as a guide and update for readers working in the Character Recognition area. Selection of a relevant feature extraction method is probably the single most important factor in achieving high character recognition with much better accuracy in character recognition systems without any variation. KEYWORDS: Neural Network, Feature extraction, Classification, OCR. I. INTRODUCTION Optical Character acknowledgment has been a subject of research. Example acknowledgment has three principle steps: perception, design division, and example order. Optical Character Recognition (OCR) frameworks is changing expansive measure of reports, either printed letters in order or manually written into machine encoded content with no change, clamor, determination varieties and different variables. In general, handwriting recognition is classified into two types as off-line and On-line character recognition. Off-line handwriting recognition involves automatic conversion of text into an image into letter codes which are usable within computer and text-processing applications. Off-line handwriting recognition is more difficult, as different people have different handwriting styles. But, in the on-line system, On-line character recognition deals with a data stream which comes from a transducer while the user is writing. The typical hardware to collect data is a digitizing tablet which is electromagnetic or pressure sensitive. When the user writes on the tablet, the successive movements of the pen are transformed to a series of electronic signal which is memorized and analyzed by the computer. Optical Character Recognition (OCR) is a field of research in pattern recognition, artificial intelligence and machine vision, signal processing. Optical character recognition (OCR) is usually referred to as an off-line character recognition process to mean that the system scans and recognizes static images of the characters. It refers to the mechanical or electronic translation of images of handwritten character or printed text into machine code without any variation. [233]

OCR consists of many phases such as Pre-processing, Segmentation, Feature Extraction, Classifications and Recognition. The input of one step is the output of next step. The task of preprocessing relates to the removal of noise and variation in handwritten. Several area where ocr used including mail sorting, bank processing, document reading and postal address recognition require offline handwriting recognition systems, pattern recognition. II. PHASES OF GENERAL CHARACTER RECOGNITION SYSTEM Digitization Digitization is the process of converting a paper-based handwritten document into electronic format. Here, each document consists of only one character. The electronic conversion is accomplished by using a method whereby a document is scanned and an electronic representation of the original document as a image file format is produced. We used various scanner for digitization, and the digital image was go for next step that is preprocessing phase. Pre-processing In The pre-processing phase, there is a series of operations performed on the scanned input image. It enhances the image rendering it suitable for segmentation the gray-level character image is normalized into a window sized. After noise reduction, we produced a bitmap image. Then, the bitmap image was transformed into a thinned image. [234]

Segmentation The Segmentation phase is the most important process. Segmentation is done by separation from the individual characters of an image. Segmentation of handwritten characters into different zones (upper, middle and lower zone) and characters is more difficult than that of printed documents that are in standard form. This is mainly because of variability in paragraph, words of line and characters of a word, skew, slant, size and curved. Sometimes components of two adjacent characters may be touched or overlapped and this situation create difficulties in the segmentation task. Touching or overlapping problem occurs frequently because of modified characters in upper-zone and lowerzone. Segmentation is an important stage. Feature Extraction In this phase, features of individual character are extracted. The performance of an each character recognition system that depends on the features that are extracted. The extracted features from input character should allow classification of a character in a unique way. We used diagonal features, intersection and open end points features, transition features, zoning features, directional features, parabola curve fitting based features, and power curve fitting based features in order to find the feature set for a given character. III. LITERATURE REVIEW ON OPTICAL CHARACTER RECOGNITION The review process was adopted by surveying the research in last 10 years (2005-2013) for extraction of information about some issues. The 31 research articles were reviewed to cover the review of character recognition technique. Various issues: Review papers on different technique to recognize the handwritten cursive characters. The review and discussion of issues is ranging from year 2005-2013. MAJIDA ALI ABED HAMID ALI ABED ALASADI [2005][1] This manuscript considers a new approach to Simplifying Handwritten Characters Recognition based on simulation of the behaviour of schools of fish and flocks of birds, called the Particle Swarm Optimization Approach (PSOA). We present an overview of the proposed approaches to be optimized and tested on a number of handwritten characters in the experiments. Our experimental results demonstrate the higher degree of performance of the proposed approaches. It is noted that the PSOA in general generates an optimized comparison between the input samples and database samples which improves the final recognition rate. Experimental results show that the PSOA is convergent and more accurate in solutions that minimize the error recognition rate. Mohammed Z. Khedher, Gheith A. Abandah, and Ahmed M. Al-Khawaldeh2005 [2] This paper describe that Recognition of characters greatly depends upon the features used. Several features of the handwritten Arabic characters are selected and discussed. An off-line recognition system based on the selected features was built. The system was trained and tested with realistic samples of handwritten Arabic characters. Evaluation of the importance and accuracy of the selected features is made. The recognition based on the selected features give average accuracies of 88% and 70% for the numbers and letters, respectively. Further improvements are achieved by using feature weights based on insights gained from the accuracies of individual features. Ivan Dervisevic [2006][3] Success of optical character recognition depends on a number of factors, two of which are feature extraction and classification algorithms. In this paper we look at the results of the application of a set of classifiers to datasets obtained through various basic feature extraction methods. Diego J. Romero, Leticia M. Seijas, Ana M. Ruedin[2007][4] The recognition of handwritten numerals has many important applications, such as automatic le cture of zip codes in post offices, and automatic lecture of numbers in checknotes. In this paper we present a preprocessing method for handwritten numerals recognition, based on a directional two dimensional continuous wavelet transform. The wavelet chosen is the Mexican hat. It is given a principal orientation by stretching one of its axes, and adding a rotation angle. The resulting transform has 4 parameters: scale, angle (orientation), and position (x,y) in the image. By fixing some of its parameters we obtain wavelet descriptors that form a feature vector for each digit image. We use these for the recognition of the handwritten numerals in the Concordia University data base We input the preprocessed samples into a multilayer feed forward neural network, trained with backpropagation. Our results are promising. Chirag I Patel, Ripal Patel, Palak Patel [2011][5] Objective is this paper is recognize the characters in a given scanned [235]

documents and study the effects of changing the Models of ANN.Today Neural Networks are mostly used for Pattern Recognition task. The paper describes the behaviors of different Models of Neural Network used in OCR. OCR is widespread use of Neural Network. We have considered parameters like number of Hidden Layer, size of Hidden Layer and epochs. We have used Multilayer Feed Forward network with Back propagation. In Preprocessing we have applied some basic algorithms for segmentation of characters, normalizing of characters and De-skewing. We have used different Models of Neural Network and applied the test set on each to find the accuracy of the respective Neural Network. Sushree Sangita Patnaik and Anup Kumar Panda May 2011[6] This paper proposes the implementation of particle swarm optimization (PSO) and bacterial foraging optimization (BFO) algorithms which are intended for optimal harmonic compensation by minimizing the undesirable losses occurring inside the APF itself. The efficiency and effectiveness of the implementation of two approaches are compared for two different conditions of supply. The total harmonic distortion (THD) in the source current which is a measure of APF performance is reduced drastically to nearly 1% by employing BFO. The results demonstrate that BFO outperforms the conventional and PSO-based approaches by ensuring excellent functionality of APF and quick prevail over harmonics in the source current even under unbalanced supply. Dileep Kumar Patel, Tanmoy Som1, Sushil Kumar Yadav Manoj Kumar Singh [2012][7] In the present paper, the problem of handwritten character recognition has been tackled with multiresolution technique using Discrete wavelet transform (DWT) and Euclidean distance metric (EDM). The technique has been tested and found to be more accurate and faster. Characters is classified into 26 pattern classes based on appropriate properties. Features of the handwritten character images are extracted by DWT used with appropriate level of multiresolution tech- nique, and then each pattern class is characterized by a mean vector. Distances from input pattern vector to all the mean vectors are computed by EDM. Minimum distance determines the class membership of input pattern vector. The pro- posed method provides good recognition accuracy of 90% for handwritten characters even with fewer samples. Vijay Laxmi Sahu, Babita Kubde (January 2013) [8] This paper explains that classification methods based on learning from examples have been widely applied to character recognition from the 1990s and have brought forth significant improvements of recognition accuracies. This class of methods includes statistical methods, artificial neural networks, support vector machines, multiple classifier combination, etc. In this paper, the characteristics of the classification methods that have been successfully applied to character recognition, and show the remaining problems that can be potentially solved by learning methods have been discussed. Gurpreet Singh Chandan Jyoti Kumar Rajneesh Rani Dr. Renu Dhir (January 2013) [9] This paper presents detailed review in the field of Off-line Handwritten Character Recognition.. The recognition of handwriting can, however, still is considered an open research problem due to its substantial variation in appearance. Even though, sufficient studies have performed from history to this era, paper describes the techniques for converting textual content from a paper document into machine readable form. Offline handwritten character recognition is a process where the computer understands automatically the image of handwritten script. This material serves as a guide and update for readers working in the Character Recognition area. Selection of a relevant feature extraction method is probably the single most important factor in achieving high recognition performance with much better accuracy in character recognition systems. Majida Ali Abed, Hamid Ali Abed Alasadi, (August 2013)[10] This manuscript considers a new approach to Simplifying Handwritten Characters Recognition based on simulation of the behavior of schools of fish and flocks of birds, called the Particle Swarm Optimization Approach (PSOA). We present an overview of the proposed approaches to be optimized and tested on a number of handwritten characters in the experiments. Our experimental results demonstrate the higher degree of performance of the proposed approaches. It is noted that the PSOA in general generates an optimized comparison between the input samples and database samples which improves the final recognition rate. Experimental results show that the PSOA is convergent and more accurate in solutions that minimize the error recognition rate.. Argha Roy, Diptam Dutta KAustav, Choudhury (March 2013)11] This paper, the adaptation of network weights using Particle Swarm Optimization (PSO) was proposed as a mechanism to improve the performance of Artificial Neural Network (ANN) in classification of IRIS dataset. Classification is a machine learning [236]

technique used to predict group membership for data instances. To simplify the problem of classification neural networks are being introduced. This paper focuses on IRIS plant classification using Neural Network. The problem concerns the identification of IRIS plant species on the basis of plant attribute measurements. Classification of IRIS data set would be discovering patterns from examining petal and sepal size of the IRIS plant and how the prediction was made from analyzing the pattern to form the class of IRIS plant. By using this pattern and classification, in future upcoming years the unknown data can be predicted more precisely. Artificial neural networks have been successfully applied to problems in pattern classification, function approximations, optimization, and associative memories. In this work, Multilayer feed- forward networks are trained using back propagation learning algorithm. Amir Bahador Bayat[2013] [12] Automatic recognition of handwritten characters has long been a goal of many research efforts in the pattern recognition field. This paper investigates the design of a high efficient system for recognition of handwritten digits. First it proposes an efficient system that includes two main modules: the feature extraction module and the classifier module. In the feature extraction module, seven sets of discriminative features are extracted and used in the recognition system. In the classifier module, as the first time in this area, the adaptive neuro-fuzzy inference system (ANFIS) is investigated. Experimental results show that the proposed system has good Recognition Accuracy (RA). However, the results show that in ANFIS training, the vector of radius has very important role for its recognition accuracy. At the second fold, it proposes an intelligence system in which a novel optimization module, i.e., improved bees algorithm (IBA) is proposed for finding the best parameters of the classifier. In test stage, 3-fold cross validation method was applied to the MNIST handwritten numeral database to evaluate the proposed system performances. Simulation results show that the proposed system has high recognition accuracy. Swagatam Das, Arijit Biswas, Sambarta Dasgupta, and Ajith Abraham[13] This paper proposed Bacterial foraging optimization algorithm (BFOA) has been widely accepted as a global optimization algorithm of current interest for distributed optimization and control. BFOA is inspired by the social foraging behaviour of Escherichia coli. It starts with a lucid outline of the classical BFOA. It then analyses the dynamics of the simulated chemo taxis step in BFOA with the help of a simple mathematical model. It presents a new adaptive variant of BFOA, where the chemo tactic step size is adjusted on the run according to the current fitness of a virtual bacterium. And, analysis of the dynamics of reproduction in BFOA is also discussed and also provides an account of most of the significant applications of BFOA until date. IV. CONCLUSION This is detailed discussion about handwritten character recognize and include various concepts involved, and boost further advances in the area. The accurate recognition is directly depending on the nature of the material to be read and by its quality. Current research is not directly concern to cursive handwriting and to recognize the child handwriting which require high supervised system. From various studies we have seen that selection of relevant feature extraction and classification technique plays an important role in performance of character recognition rate. This review establishes a complete system that converts scanned images of handwritten characters to text documents. In this paper, we have studied various papers with different algorithm. Each technique have own porn s and con s. But, still there are many premature problems, when multiple optima exist. The performance is degrading. So, In future there is lots of work to remove drawbacks. BFO WITH BPN should give us good accuracy and increase performance. It may exist multiple optima. It is use for global optimization This material serves as a guide and update for readers working in the Character Recognition area V. REFERENCES [1] MAJIDA ALI ABED HAMID ALI ABED ALASADI Simplifying Handwritten Characters Recognition Using a Particle Swarm Optimization Approach EUROPEAN ACADEMIC RESEARCH, VOL. I, ISSUE 5/ AUGUST 2013 ISSN 2286-4822 [2] Mohammed Z. Khedher, Gheith A. Abandah, and Ahmed M. Al-Khawaldeh Optimizing Feature Selection for Recognizing Handwritten Arabic Characters PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY VOLUME 4 FEBRUARY 2005 ISSN 1307-6884 [3] Ivan Dervisevic Machine Learning Methods for Optical Character Recognition December 18, 2006 [4] Diego J. Romero, Leticia M. Seijas, Ana M. Ruedin Directional Continuous Wavelet Transform Applied to Handwritten Numerals Recognition Using Neural Networks JCS&T Vol. 7 No. 1 April 2007 [237]

[5] Chirag I Patel, Ripal Patel, Palak Patel Handwritten Character Recognition using Neural Network International Journal of Scientific & Engineering Research Volume 2, Issue 5, May-2011 [6] Sushree Sangita Patnaik and Anup Kumar Panda Particle Swarm Optimization and Bacterial Foraging Optimization Techniques for Optimal Current Harmonic Mitigation by Employing Active Power Filter Applied Computational Intelligence and Soft Computing Volume 2012, Article ID 897127. [7] Dileep Kumar Patel, Tanmoy Som1, Sushil Kumar Yadav, Manoj Kumar Singh, Handwritten Character Recognition Using Multiresolution Technique and Euclidean Distance Metric JSIP 2012, 208-214 [8] Vijay Laxmi Sahu, Babita Kubde Offline Handwritten Character Recognition Techniques using Neural Network: A Review International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064 Volume 2 Issue 1, January 2013 [9] Gurpreet Singh Chandan Jyoti Kumar Rajneesh Rani Dr. Renu Dhir, Feature Extraction of Gurmukhi Script and Numerals: A Review of Offline Techniques IJARCSSE Volume 3, Issue 1, January 2013 pp 257-263 [10] Majida Ali Abed, Hamid Ali Abed Alasadi, Simplifying Handwritten Characters Recognition Using a Particle Swarm Optimization Approac h European Academic Research, Vol. I, Issue 5/ August 2013 pp-532-552 [11] Argha Roy, Diptam Dutta, Kaustav Choudhury, Training Artificial Neural Network using Particle Swarm Optimization Algorithm IJARCS SE Volume 3, Issue 3, March 2013 pp 43 434 [12] Amir Bahador Bayat Recognition of Handwritten Digits Using Optimized Adaptive Neuro- Fuzzy Inference Systems and Effective Features Journal of Pattern Recognition and Intelligent Systems Aug. 2013, Vol. 1 [13] Swagatam Das, Arijit Biswas, Sambarta Dasgupta, and Ajith Abraham Bacterial Foraging Optimization Algorithm: Theoretical Foundations, Analysis, and Applications Hindawi Publishing Corporation Applied Computational Intelligence and Soft Computing Volume 2012, Article ID 897127 CITE AN ARTICLE Tomar, S., & Kishore, A. (n.d.). A REVIEW: OPTICAL CHARACTER RECOGNITION. INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY, 7(4), 233-238. [238]