Segmentation of Kannada Handwritten Characters and Recognition Using Twelve Directional Feature Extraction Techniques

Similar documents
Cursive Handwriting Recognition System Using Feature Extraction and Artificial Neural Network

OCR For Handwritten Marathi Script

HANDWRITTEN GURMUKHI CHARACTER RECOGNITION USING WAVELET TRANSFORMS

LECTURE 6 TEXT PROCESSING

SEVERAL METHODS OF FEATURE EXTRACTION TO HELP IN OPTICAL CHARACTER RECOGNITION

Handwritten Character Recognition A Review

Handwritten Gurumukhi Character Recognition by using Recurrent Neural Network

Isolated Curved Gurmukhi Character Recognition Using Projection of Gradient

HCR Using K-Means Clustering Algorithm

A Survey of Problems of Overlapped Handwritten Characters in Recognition process for Gurmukhi Script

Extraction and Recognition of Alphanumeric Characters from Vehicle Number Plate

Hand Written Character Recognition using VNP based Segmentation and Artificial Neural Network

Handwritten Devanagari Character Recognition Model Using Neural Network

K S Prasanna Kumar et al,int.j.computer Techology & Applications,Vol 3 (1),

DEVANAGARI SCRIPT SEPARATION AND RECOGNITION USING MORPHOLOGICAL OPERATIONS AND OPTIMIZED FEATURE EXTRACTION METHODS

Structural Feature Extraction to recognize some of the Offline Isolated Handwritten Gujarati Characters using Decision Tree Classifier

Segmentation of Isolated and Touching characters in Handwritten Gurumukhi Word using Clustering approach

Enhancing the Character Segmentation Accuracy of Bangla OCR using BPNN

A Technique for Offline Handwritten Character Recognition

An Efficient Character Segmentation Based on VNP Algorithm

Character Recognition Using Matlab s Neural Network Toolbox

Segmentation of Characters of Devanagari Script Documents

FREEMAN CODE BASED ONLINE HANDWRITTEN CHARACTER RECOGNITION FOR MALAYALAM USING BACKPROPAGATION NEURAL NETWORKS

International Journal of Advance Research in Engineering, Science & Technology

MOMENT AND DENSITY BASED HADWRITTEN MARATHI NUMERAL RECOGNITION

Convolution Neural Networks for Chinese Handwriting Recognition

Character Recognition of High Security Number Plates Using Morphological Operator

OFFLINE SIGNATURE VERIFICATION

Neural Network Classifier for Isolated Character Recognition

Automatic Recognition and Verification of Handwritten Legal and Courtesy Amounts in English Language Present on Bank Cheques

Review of Automatic Handwritten Kannada Character Recognition Technique Using Neural Network

Finger Print Enhancement Using Minutiae Based Algorithm

A Review on Handwritten Character Recognition

Handwritten Character Recognition: A Comprehensive Review on Geometrical Analysis

A two-stage approach for segmentation of handwritten Bangla word images

A Feature based on Encoding the Relative Position of a Point in the Character for Online Handwritten Character Recognition

A Multimodal Framework for the Recognition of Ancient Tamil Handwritten Characters in Palm Manuscript Using Boolean Bitmap Pattern of Image Zoning

NOVATEUR PUBLICATIONS INTERNATIONAL JOURNAL OF INNOVATIONS IN ENGINEERING RESEARCH AND TECHNOLOGY [IJIERT] ISSN: VOLUME 2, ISSUE 1 JAN-2015

International Journal of Electrical, Electronics ISSN No. (Online): and Computer Engineering 3(2): 85-90(2014)

An Improvement Study for Optical Character Recognition by using Inverse SVM in Image Processing Technique

Recognition of Unconstrained Malayalam Handwritten Numeral

CHAPTER 1 INTRODUCTION

Image Normalization and Preprocessing for Gujarati Character Recognition

IJESRT. Scientific Journal Impact Factor: (ISRA), Impact Factor: 1.852

Segmentation Based Optical Character Recognition for Handwritten Marathi characters

Image Processing: Final Exam November 10, :30 10:30

Fine Classification of Unconstrained Handwritten Persian/Arabic Numerals by Removing Confusion amongst Similar Classes

Encryption of Text Using Fingerprints

Available online at ScienceDirect. Procedia Computer Science 45 (2015 )

Recognition of Handwritten Digits using Machine Learning Techniques

A System for Joining and Recognition of Broken Bangla Numerals for Indian Postal Automation

Short Survey on Static Hand Gesture Recognition

Time Stamp Detection and Recognition in Video Frames

EXTRACTING TEXT FROM VIDEO

Mixture of Printed and Handwritten Kannada Numeral Recognition Using Normalized Chain Code and Wavelet Transform

Improving License Plate Recognition Rate using Hybrid Algorithms

ISSN: [Mukund* et al., 6(4): April, 2017] Impact Factor: 4.116

Morphological Image Processing GUI using MATLAB

Optical Character Recognition (OCR) for Printed Devnagari Script Using Artificial Neural Network

Motion Detection Algorithm

DATABASE DEVELOPMENT OF HISTORICAL DOCUMENTS: SKEW DETECTION AND CORRECTION

Offline Signature verification and recognition using ART 1

Moving Object Detection for Video Surveillance

Handwritten character and word recognition using their geometrical features through neural networks

Mobile Application with Optical Character Recognition Using Neural Network

Skeletonization Algorithm for Numeral Patterns

A Technique for Classification of Printed & Handwritten text

CITS 4402 Computer Vision

Automatic Detection of Change in Address Blocks for Reply Forms Processing

NOVATEUR PUBLICATIONS INTERNATIONAL JOURNAL OF INNOVATIONS IN ENGINEERING RESEARCH AND TECHNOLOGY [IJIERT] ISSN: VOLUME 5, ISSUE

HANDWRITTEN SIGNATURE VERIFICATION USING NEURAL NETWORK & ECLUDEAN APPROACH

Segmentation of Bangla Handwritten Text

Anale. Seria Informatică. Vol. XVII fasc Annals. Computer Science Series. 17 th Tome 1 st Fasc. 2019

Handwritten Hindi Character Recognition System Using Edge detection & Neural Network

A Robust Automated Process for Vehicle Number Plate Recognition

Extracting and Segmenting Container Name from Container Images

Building Multi Script OCR for Brahmi Scripts: Selection of Efficient Features

Skew Detection and Correction of Document Image using Hough Transform Method

LITERATURE REVIEW. For Indian languages most of research work is performed firstly on Devnagari script and secondly on Bangla script.

Online Signature Verification Technique

RESEARCH ON OPTIMIZATION OF IMAGE USING SKELETONIZATION TECHNIQUE WITH ADVANCED ALGORITHM

Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers

I. INTRODUCTION. Figure-1 Basic block of text analysis

Image Processing Fundamentals. Nicolas Vazquez Principal Software Engineer National Instruments

Tamil Image Text to Speech Using Rasperry PI

Machine Learning: Handwritten Character Recognition

The Vehicle Logo Location System based on saliency model

Journal of Applied Research and Technology ISSN: Centro de Ciencias Aplicadas y Desarrollo Tecnológico.

INTERNATIONAL RESEARCH JOURNAL OF MULTIDISCIPLINARY STUDIES

Preprocessing of Gurmukhi Strokes in Online Handwriting Recognition

Handwritten Script Recognition at Block Level

EE 584 MACHINE VISION

Optical Character Recognition

II. WORKING OF PROJECT

Spatial Adaptive Filter for Object Boundary Identification in an Image

Problems in Extraction of Date Field from Gurmukhi Documents

DESIGNING A REAL TIME SYSTEM FOR CAR NUMBER DETECTION USING DISCRETE HOPFIELD NETWORK

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

Handwritten Numeral Recognition of Kannada Script

Signature Recognition by Pixel Variance Analysis Using Multiple Morphological Dilations

Transcription:

Segmentation of Kannada Handwritten Characters and Recognition Using Twelve Directional Feature Extraction Techniques 1 Lohitha B.J, 2 Y.C Kiran 1 M.Tech. Student Dept. of ISE, Dayananda Sagar College of Engineering, Bangalore 2 Research Scholar, CSE Dayananda Sagar College of Engineering, Jain University, Bangalore, Associate Professor, Dept. of ISE Dayananda Sagar College of Engineering, Bangalore Abstract: The OCR system provides an efficient way to translate human readable characters to machine readable characters. It is to identify and analyze a document image by dividing the page into line, words, and then characters. Handwriting character recognition is a challenging and interesting task in the field of pattern recognition. In this paper we are segmenting the Kannada handwritten characters. For the feature extraction we are using twelve directional feature extraction techniques and for the recognition back propagation neural network is used. Experimental results show the good recognition rate towards segmented handwritten Kannada characters. Keywords: Handwritten Kannada Character Recognition Feed forward neural network, Recognition accuracy rate. 1. INTRODUCTION Nowadays, recognition systems are used in many fields that have different nature. The optical character recognition (OCR) was started from the recognition of machine printed digits and characters and then it was developed to the recognition of machine printed words. Gradually, handwritten digit, character and word recognition were introduced into this domain. Several research works have been focusing towards evolving the newer techniques that would reduce the preprocessing time and to provide higher recognition accuracy. The applications of OCR is in bank cheque processing, form data entry, vehicle plate recognition, postal address block detection and recognition, Automatic insurance documents key information extraction, Extracting business card information into a contact list. The handwritten character recognition system is classified into two types. They are off-line system and on-line system. In off-line system the writing is usually captured optically by scanner and it is available as image. It involves automatic conversion of text in an image into letters code that is usable within a computer. The probability of recognizing handwriting recorded with digitizer as a time sequence of pen co-ordinates is known as on-line character recognition. The handwritten character recognition system has following steps 1. Image acquisition 2. Segmentation 3. Pre-processing 4. Feature extraction 5. Classification Page 1135

In this paper the features of directions of pixels of the characters with admiration to their neighbouring pixels are extracted with the help of gradient values. The direction has been divided into 12 regions with each region covering angle of 30 degree, hence direction value of any pixel may have only 12 values assigned from 1 to 12. This approach increases the information content and gives better recognition rate with reduced recognition time. 2. LITERATURE REVIEW M. Amrouch, Y. es-saady et al [1] proposed a global approach to the recognition of handwritten Amazigh characters using directional features and Hidden Markov Model. Feature vector is extracted from an image of a character using sliding window technique based on Hough transform and translated into sequence of observations. Finally the class of the character is determined by the classifier. K.S Prasanna Kumar [3] introduces a novel way of feature extraction for Optical Character Recognition (OCR) customized for Kannada characters. The algorithm described here relies on breaking the character into four equal parts and using one of the quarters for extraction. The algorithm is deliberately kept away from all the complexities and the number of features to be extracted is also minimized so as to increase the efficiency and speed of recognition. The algorithm also describes a conflict resolution technique helpful in effectively utilizing the algorithm. Bindu S. Moni, G. Raju [2] implemented a fixed meshing for the off-line recognition of handwritten isolated Malayalam characters. The 12-directional features are extracted to form a feature vector. Classification has been carried out by implementing the Quadratic discriminant Function and Modified Quadratic discriminant Function. Cheng-Lin Liu [4] represented selected feature extraction methods in off-line handwritten Chinese character recognition. The experimental results showed, among the 8-directional, 12-directional and 16-directional feature extraction techniques, 12-directional method has better trade off between accuracy and complexity. 3. PRE-PROCESSING The raw data of handwritten characters no matter how it is acquired will be subjected to a number of preprocessing steps to make it useable. First convert image input to gray scale image and then convert gray scale image to binary image. The preprocessing phase aims to extract the relevant textual parts and prepares them for segmentation and recognition. Noise presented in the image is removed by using filtering and morphological operations. Normalization is used to converts the random character into the standard or specific size. 4. SEGMENTATION Segmentation is an important task of any Optical Character Recognition (OCR) system. It separates the image text documents into lines, words and characters. In this paper segmentation stage consists of two stages. In the first stage, mathematical morphology technique is used for constructing bridge between the components.in the next stage, projection technique is proposed for the segmentation of the text into line, words and characters. 4.1 Morphology: Initially, all the connected components in a document image are detected and removed from the binary image using connected component analysis algorithm. For a component, if the number of on pixels is very small compared to a preset threshold then we remove that component. After this process, the proposed method uses morphology operation that is by using appropriate size of structure element, erosion and dilation will be applied to the binary image. In erosion the last zero value pixel present at the boundary of the image is converted into 1 and in dilation last one value pixel present at the boundary is converted to zero. After dilation, the dilated image is inverted and then the content present in the image is cropped by identifying the rows. The rows are identified by finding the minimum and maximum positions of the zero valued pixels. 4.2 Projection technique: After the completion of first stage, the next stage is to extract individual character present in the image. In order to extract individual character, a technique based on projection is used. A projection profile is a histogram giving the number of ON pixels accumulated along parallel lines. A vertical projection profile is a one-dimensional array where each element Page 1136

denotes the number of ON pixels along a column in the image. The horizontal projection profile is a one-dimensional array where each element denotes the number of ON pixels along a row in the image. To segment each text word into several characters, we use the valleys of the vertical projection of each text word obtained by computing the column-wise sum of black pixels. The position between two consecutive vertical projections where the histogram height is least denotes one boundary line. Using these boundary lines, every text word is segmented into several characters. 5. FEATURE EXTRACTION This is also called as data extraction & gives data from perspective areas. Features are a set of numbers that capture the salient characteristics of the segmented image. The statistical features are derived from the statistical distributions of pixels, such as zoning, moments, projection histograms or direction histograms. Structural features are based on the topological and geometrical properties of the character, like strokes and their directions, end-points or intersection of segments and loops. Mathematically, the gradient of a two-variable function, here the image intensity function is at each image point, a 2D vector with the components given by the derivatives in the horizontal and vertical directions. At each image point, the gradient vector points in the direction of largest possible intensity increase, and the length of the gradient vector corresponds to the rate of change in that direction. This implies that the result of the Sobel operator at an image point which is in a region of constant image intensity is a zero vector and at a point on an edge is a vector which points across the edge, from darker to brighter values. To extract gradient feature, 3 3 Sobel operators are used. It uses two templates to compute the gradient components in horizontal and vertical directions, respectively. -1 0 +1-2 0 +2-1 0 +1 Horizontal +1 +2 +1 0 0 0-1 -2-1 Vertical The two gradient components at location (i, j) are calculated by: ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) Grad = arctan [ ( ) ( )] After computing the gradient of each pixel of the character, the gradient values are mapped onto 12 direction values to the angle span of 30 degree between any two adjacent direction values. The orientations of these 12 directional values are shown in Figure 1. Gradient values Direction g = -1 0 0 1 2 3 4 5 6 7 8 9 10 11 12 Figure 1: Orientation of 12 directions Figure 2: mapping of gradients on 12 directions Page 1137

To calculate directional distribution values of background pixels for each foreground pixel, we have used the masks for each direction. The pixel at centre is foreground pixel under consideration to calculate directional distribution values of background. The weight for each direction is computed by using specific mask in particular direction. The mapping of gradient values on 12 directional values can be given as in Figure 2. 6. CLASSIFICATION A feed forward back propagation neural network is used in this work for classifying and recognizing the Kannada handwritten characters. The neural classifier consists of a hidden layer besides an input layer and an output layer. The hidden layer use tansig activation function and the output layer is a competitive layer as one of the characters is required to be identified at any point in time. Once the neural network has been trained successfully it is then required to identify the same corresponding character. In addition, the network should also be able to handle noise. In practice, the network does not receive a perfect Kannada character as an input. 7. EXPERIMENTAL RESULTS The database is created by collecting data from 25 different writers so that recognition engine could be trained with different styles of handwriting. The method has been implemented in MATLAB 13b on Dual Core 3 GHz with 4 GB RAM. For experimental purpose, we have considered 500 handwritten words collected from different individuals of various professions. 8. CONCLUSION andwritten character recognition is the complex task in document image processing. Several reasons for this are quality of the scanner, degraded image, thickness of the pen, handwriting style. Compare to English like languages identifying the Kannada handwritten characters is the most challenging task. This is because of the huge number of characters and shapes in the Kannada language. Although several works has been done in this Kannada character recognition field, still it has got open problem for the researchers. Lack of availability of standard dataset is more difficult to start with the Kannada handwritten character recognition process. Lot of time is required to collect the handwritten words, scanning, cropping and resizing. Also subscripts are can t be identified correctly. Handwritten characters recognition involves data acquisition, preprocessing, segmentation, feature extraction and classification stages. Good combination of data set, feature extraction method and classifiers gives the better accuracy rate. Experimental result shows the good recognition rate towards handwritten Kannada characters. Page 1138

REFERENCES [1] M. Amrouch, Y. es-saady, Handwritten Amazigh character recognition system based on continuous HMMs and directional features, IJMER, vol. 2, issue 2, 2012, pp-436-411. [2] Bindu S Moni,G Raju Modified quadratic classifier and directional features for handwritten Malayalam character recognition, Computational Science-New Dimensions and Perspectives, NCCSE, 2011. [3] K.S. Prasanna Kumar Algorithm to Identify Kannada Vowels using Minimum Features Extraction method International Journal of Innovative Technology and Exploring Engineering (IJITEE) ISSN: 2278-3075, Volume-2, Issue-2, January 2013 [4] Cheng-Lin Liu Handwritten Chinese character recognition: effects of shape normalization and feature extraction, inria 00120408, Dec 2006. [5] Mamatha H.R, Srikantamurthy K. Morphological Operations and Projection Profiles based Segmentation of Handwritten Kannada Document, International Journal of Applied Information Systems (IJAIS) ISSN: 2249-0868 Foundation of Computer Science FCS, New York, USA Vol. 4 No.5, and October 2012 [6] Shraddha S. Gundal & P. P. Narwade, Handwritten character recognition using neural Network with four, eight & twelve directional feature extraction techniques, International Journal of Electronics, Communication & Instrumentation Engineering Research and Development (IJECIERD) ISSN(P): 2249-684X; ISSN(E):2249-7951Vol. 4, Issue 2, Apr 2014, 5-12. [7] Sangame S.K., Ramteke R.J., Rajkumar Benne, Recognition of Isolated Handwritten Kannada Vowels, Advances in Computational Research, ISSN: 0975 3273, Vol. 1, Issue 2, 2009, pp-52-55. [8] Sumedha B. Hallale, Geeta D. Salunke Twelve Directional Feature Extraction for Handwritten English Character Recognition International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-2, Issue-2, May 2013. Page 1139