
Comparing Tesseract results with and without Character localization for Smartphone application

Snehal Charjan 1, Prof. R. V. Mante 2, Dr. P. N. Chatur 3
M.Tech 2nd year 1, Asst. Professor 2, Head of Department 3
snehalcharjan@gmail.com, mante.ravi@gmail.com, chatur.prashant@gcoea.ac.in
Department of Computer Science and Engineering, Government College of Engineering, Amravati (Maharashtra), India.

www.ijrcct.org Page 298

Abstract: Tesseract is considered the most accurate free OCR engine in existence. In the Android-based Smartphone application described here, images taken with the camera of a mobile device, or browsed from its gallery, are first pre-processed. The text in these images is accurately localized on the device using a dedicated localization method, and the localized text sub-image is then fed to the Tesseract OCR engine for text extraction. In this paper, Tesseract results with and without character localization are compared on the basis of computation time in milliseconds. Each image is processed 10 times, the time for each run is recorded, and the computation time is taken as the average of these 10 values. There is a drastic difference in both time and accuracy between localized and non-localized images. We conclude by noting the importance of localization in an OCR system, especially for Smartphone applications where only a few words are OCRed and high accuracy is needed.

Keywords: Pre-processing, localization, OCR, Android, Smartphone

Introduction

An Android-platform based image text search application is developed that is able to recognize the text captured by a mobile phone camera and display the web search result on the screen of the phone. The user can thus get current information about a product, place or sign board. The only assumption we make regarding the text is that it is written on a more or less uniform background using a contrasting color. [1]

OCR stands for Optical Character Recognition, the process of taking an image, interpreting it, and obtaining textual data from it. OCR was developed to translate scanned images of handwritten, typewritten or printed text into machine-encoded text, and a lot of OCR software has been developed to accomplish this mission. Tesseract, used in this method, is one of the most accurate open-source OCR engines currently available. [2] After lying dormant for more than 10 years, Tesseract is now behind the leading commercial engines in terms of its accuracy. Its key strength is probably its unusual choice of features; its key weakness is probably its use of a polygonal approximation as input to the classifier instead of the raw outlines. [3]

Many objects in natural images, such as tree branches or electrical wires, are easily confused for text by existing optical character recognition (OCR) algorithms. For this reason, applying OCR to an unprocessed natural image is computationally expensive and may produce erroneous results. Hence, robust and efficient methods are needed to identify the text-containing regions within natural images before performing OCR. [1]

I. Methodology

The handheld device, equipped with a 3-megapixel camera, a 330 MHz CPU and 128 MB of RAM, acquires images of text that are legible to a human viewer. Images taken with the camera of the mobile device, or browsed from its gallery, are pre-processed. The text in these images is accurately localized within the device in a fraction of a second, and the localized text sub-image is fed to the OCR engine for text extraction. The text output is then submitted to the web as a search query, and the result is displayed in the mobile browser.

In order to identify signs, we begin by locating seed points in areas with homogeneous luminance. In order

to do this, the image is first divided into a grid of non-overlapping K x K blocks, where K is a power of two. Each block is tested for homogeneity to determine whether it is part of the background of a sign.

II. Proposed Scheme

The homogeneity of each block is calculated, and blocks which meet a given threshold are labeled as homogeneous. The set of pixels from these homogeneous blocks is then used as seed points.

We determine the homogeneity of a block as follows. For each block, let x be the vector of dimension K^2 containing the luminance values of the block. We compute M homogeneity features for each block, given by

δ^(m) = (2 / K^2) | Σ_i x_i w_i^(m) |,

where w^(m), for 0 ≤ m < M, is a weight vector with binary entries (i.e., each entry is ±1) that sum to zero. With this scaling, δ^(m) falls into the same range as the pixels: since the maximum intensity difference between individual pixels is 255, the average difference in pixel intensity, δ^(m), is a value in the range 0 to 255. We use M = 3 features corresponding to the three binary weight vectors shown in Fig. 1 (Fig. 1: graphical representation of the M = 3 weight matrices used to quantify homogeneity for each K x K block in the image; the colors white and black represent +1 and -1, respectively, in each K x K region). Notice that each of the three features quantifies the variation of pixel values within a block, with a smaller value of δ^(m) signifying a more homogeneous texture.

After all values have been computed, we classify a block as homogeneous if the L∞ norm of the vector δ = (δ^(0), ..., δ^(M-1))^T is less than a chosen threshold T_u and at least one of its four neighboring blocks also meets the same condition. The seed points are exactly the set of all pixels in the homogeneous blocks. Smaller block sizes have the advantage of identifying smaller homogeneous regions, while larger blocks are less susceptible to noise. Notice that homogeneous blocks are generally not located in areas where edges reside. [1]

Fig. 2: Software architecture.

Figure 2 is an overview of the software architecture, which is divided into boxes that represent portions of code called Activities. A specific Activity communicates through an Intent; the Intents are the lines relating the Activities in Fig. 2. Inside each Activity are the functions that operate on that particular Activity. The figure thus shows Activities and Intents, the fundamental components of an Android application. The Home Activity is the first screen in the application, and the user can choose to acquire images either through the file system, in the Gallery Activity, or through the camera Preview Activity. The Gallery Activity is built into the operating system and only required coding of the Intent that retrieves image files. The Preview Activity contains code to preview images through the camera before the Capture Intent is sent upon pressing the image-capture button. Upon Capture or Open, each sends a specific Intent to the Localize-and-OCR function of the Home Activity, where the image processing occurs and editable text is displayed on the screen of the Home Activity.

III. Results Analysis

The computation time was analyzed by processing each image 10 times and taking the average value, as shown in the tables below. Table 1 shows the non-localized image results, while Table 2 shows the localized image results from Tesseract.
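The block-homogeneity test described in Section II can be sketched in Python with NumPy. This is a minimal illustration, not the paper's implementation: the three ±1 weight matrices (left/right split, top/bottom split, quadrant pattern) are an assumption about the patterns in Fig. 1, and the values K = 8 and T_u = 8 are illustrative only.

```python
import numpy as np

def homogeneity_features(block, weights):
    """delta^(m) = (2/K^2) * |sum_i x_i * w_i^(m)| for each +/-1 weight matrix."""
    k = block.shape[0]
    x = block.astype(np.float64).ravel()
    return np.array([abs(x @ w.ravel()) * 2.0 / (k * k) for w in weights])

def seed_blocks(gray, k=8, t_u=8.0):
    """Label each KxK block homogeneous if the L-infinity norm of its feature
    vector is below T_u and at least one 4-neighbour block also passes."""
    h, w = gray.shape
    rows, cols = h // k, w // k
    # M = 3 binary (+1/-1) weight matrices, each summing to zero (assumed
    # patterns: left/right halves, top/bottom halves, quadrant checkerboard).
    w0 = np.ones((k, k)); w0[:, k // 2:] = -1   # horizontal split
    w1 = np.ones((k, k)); w1[k // 2:, :] = -1   # vertical split
    w2 = w0 * w1                                # quadrant pattern
    weights = [w0, w1, w2]
    passes = np.zeros((rows, cols), dtype=bool)
    for r in range(rows):
        for c in range(cols):
            block = gray[r * k:(r + 1) * k, c * k:(c + 1) * k]
            d = homogeneity_features(block, weights)
            passes[r, c] = np.max(d) < t_u      # L-infinity norm threshold test
    # Keep a block only if at least one 4-neighbour block also passed.
    nb = np.zeros_like(passes)
    nb[1:, :] |= passes[:-1, :]; nb[:-1, :] |= passes[1:, :]
    nb[:, 1:] |= passes[:, :-1]; nb[:, :-1] |= passes[:, 1:]
    return passes & nb
```

Because each weight matrix sums to zero, a perfectly uniform block scores δ^(m) = 0 on every feature, while a block split between dark and bright halves scores up to 255; the neighbour condition then suppresses isolated false positives.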

Table 1. Non-localized results

No. | Recognized text without localization | Time (in ms)
1 | F l ff lf vf l V Osborne Garages I J nf N V s I 3 H f I axis m | 14785
2 | L d v1 M V | 2823
3 | A 2 f Z d y a m n Ffa fi f m HI xffeu A w x V | 15515
4 | i 55 | 2375
5 | C rw M v P s v House Seatrae Communic I g istered Q | 6143
6 | F | 2107
7 | RE ERVED ibn WfcLuB SYECRETARY | 6407
8 | EsPAISloL INGNLES IncLEs ESPANOL | 4561
9 | Imighrnne | 3079
10 | Panasonic VNe x | 2141
11 | f f k i | 3674
12 | qinl s | 1685
13 | W 1r1fff WfliSfgith | 3769
14 | I gf g AI U p for 450 yds 9 7 | 4469

Table 2. Localized results

No. | Recognized text with localization | Time (in ms)
1 | Osborne Garages | 1154
2 | London Chelmsford | 974
3 | g Colchester | 906
4 | Central Eff | 1576
5 | Seatrade House | 1188
6 | Sal ers | 1564
7 | RESERVED FOR CLUB SECRETARY | 1552
8 | EsPASoL INcLEs ESPANOL | 4820
9 | Nighrnne | 1639
10 | fanasonic | 1373
11 | MmDLEBoucHA | 2791
12 | PosT QFHCE | 2316
13 | WHSmith | 1767
14 | F40 fooiway for 450 yds | 2062

Fig. 3: Comparison between localized and non-localized OCR results.

The comparison between the OCR results with and without localization for the 14 images is shown in the graph of Fig. 3, where the red line is the processing time for non-localized images and the blue line is the processing time for localized images with Tesseract. It clearly shows that non-localized images take more than double the time to process with Tesseract, and their accuracy is also very low: only 5 of the 14 non-localized outputs are nearly correct, and even those need post-processing for correction, giving an accuracy of 35.71%. Localized images, in contrast, give fast and accurate output, which is most important for sending an internet query that returns the right search results. Almost all localized images give nearly accurate results: 7 of the 14 localized outputs are correct, and a further 5 can give correct output after post-processing, giving an accuracy of 85.71%.

IV. Conclusion

The goal of this project is to compare the time complexity and accuracy of localized and non-localized image processing with Tesseract. Non-localized images take more than double the processing time, which makes a Smartphone application slower, and their accuracy is very low. Accuracy matters most when the text is used by the next function or application; in our proposed scheme, for example, the text is fed to a search engine, and an accurate query is needed to get the right results. Without localization the accuracy is 35.71%, while localized images give 85.71% accuracy with Tesseract. So for Smartphone applications, where accuracy and speed matter most, localization is essential.
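The timing procedure of Section III (each image processed 10 times, times averaged in milliseconds) can be sketched with a small harness. The harness is generic over the OCR callable; `fake_ocr` below is a stand-in introduced for illustration, since wiring up Tesseract itself (e.g., through a wrapper library) is outside the scope of this sketch.

```python
import time

def average_time_ms(fn, image, runs=10):
    """Run fn(image) `runs` times and return (last result, mean wall-clock
    time in milliseconds), mirroring the paper's averaging of 10 runs."""
    total = 0.0
    result = None
    for _ in range(runs):
        t0 = time.perf_counter()
        result = fn(image)
        total += (time.perf_counter() - t0) * 1000.0
    return result, total / runs

# In the paper, fn would invoke Tesseract on the full image (non-localized)
# or on the localized text sub-image; this stand-in just exercises the harness.
def fake_ocr(image):
    return image.upper()
```

Comparing `average_time_ms(tesseract_full, img)` against `average_time_ms(tesseract_localized, img)` for each of the 14 test images would reproduce the two Time columns of Tables 1 and 2.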
Future work is to decrease the localization time and to attempt post-processing in order to obtain more accurate results.
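One way the suggested post-processing could work is to snap each OCR token to the closest entry in a lexicon of expected words. Below is a minimal sketch using Python's difflib; the lexicon and the 0.6 similarity cutoff are hypothetical illustrative choices, not part of the paper's system.

```python
import difflib

# Hypothetical lexicon of expected sign words; a real system might use a
# dictionary or the vocabulary behind the search engine's query interface.
LEXICON = ["PANASONIC", "POST", "OFFICE", "RESERVED", "FOR", "CLUB",
           "SECRETARY", "SEATRADE", "HOUSE", "WHSMITH"]

def correct(ocr_text, lexicon=LEXICON, cutoff=0.6):
    """Replace each OCR token with its closest lexicon entry if one is
    similar enough; otherwise keep the token unchanged."""
    out = []
    for token in ocr_text.split():
        match = difflib.get_close_matches(token.upper(), lexicon,
                                          n=1, cutoff=cutoff)
        out.append(match[0] if match else token)
    return " ".join(out)
```

With this sketch, an output such as "fanasonic" from Table 2 snaps to "PANASONIC", while tokens with no sufficiently close lexicon entry pass through unchanged.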

References

[1] Katherine L. Bouman, Golnaz Abdollahian, "A Low Complexity Sign Detection and Text Localization Method for Mobile Applications," IEEE Transactions on Multimedia, Vol. 13, No. 5, October 2011.
[2] Derek Ma, Qiuhau Lin, Tong Zhang, "Mobile Camera Based Text Detection and Translation," Stanford University, Nov 2000.
[3] Ray Smith, "An Overview of the Tesseract OCR Engine," Google Inc., IEEE 0-7695-2822-8/07, 2007.

Snehal Charjan received her B.E. degree in Computer Science and Engineering from G. H. Raisoni College of Engineering, Nagpur, Maharashtra, India, in 2011, and is pursuing the M.Tech degree in Computer Science and Engineering at Government College of Engineering, Amravati, Maharashtra, India. Her research interests include pattern recognition and optical character recognition. At present, she is engaged in a character localization and recognition application for Smartphones.

Ravi V. Mante received his B.E. degree in Computer Science and Engineering from Government College of Engineering, Amravati, Maharashtra, India, in 2006, and the M.Tech degree in Computer Science and Engineering from the same college in 2011. He has been an Assistant Professor at Government College of Engineering, Amravati, since 2007. His research interests include ECG signal analysis, soft computing techniques and cloud computing. At present, he is working with artificial neural networks.

P. N. Chatur received his M.E. degree in Electronics Engineering from Government College of Engineering, Amravati, India, and his Ph.D. degree from Amravati University. He has published twenty papers in national and ten papers in international journals. His areas of research include neural networks and data mining. Currently he is head of the Computer Science and Engineering department at Government College of Engineering, Amravati.