IMPLEMENTING ON OPTICAL CHARACTER RECOGNITION USING MEDICAL TABLET FOR BLIND PEOPLE

Similar documents
International Journal of Advance Research in Engineering, Science & Technology

Tamil Image Text to Speech Using Rasperry PI

Mobile Application with Optical Character Recognition Using Neural Network

HCR Using K-Means Clustering Algorithm

Comparing Tesseract results with and without Character localization for Smartphone application

Recognition of Gurmukhi Text from Sign Board Images Captured from Mobile Camera

Portable, Robust and Effective Text and Product Label Reading, Currency and Obstacle Detection For Blind Persons

Optical Character Recognition Based Speech Synthesis System Using LabVIEW

Journal of Applied Research and Technology ISSN: Centro de Ciencias Aplicadas y Desarrollo Tecnológico.

Cursive Handwriting Recognition System Using Feature Extraction and Artificial Neural Network

A Technique for Classification of Printed & Handwritten text

Segmentation of Characters of Devanagari Script Documents

Mono-font Cursive Arabic Text Recognition Using Speech Recognition System

HANDWRITTEN GURMUKHI CHARACTER RECOGNITION USING WAVELET TRANSFORMS

Text Extraction from Natural Scene Images and Conversion to Audio in Smart Phone Applications

IJESRT. Scientific Journal Impact Factor: (ISRA), Impact Factor: 1.852

Anale. Seria Informatică. Vol. XVII fasc Annals. Computer Science Series. 17 th Tome 1 st Fasc. 2019

CHAPTER 1 INTRODUCTION

OPTICAL CHARACTER RECOGNITION FOR BLIND USING RASPBERRY PI

OCR and OCV. Tom Brennan Artemis Vision Artemis Vision 781 Vallejo St Denver, CO (303)

Technical Disclosure Commons

NOVATEUR PUBLICATIONS INTERNATIONAL JOURNAL OF INNOVATIONS IN ENGINEERING RESEARCH AND TECHNOLOGY [IJIERT] ISSN: VOLUME 5, ISSUE

A Survey of Problems of Overlapped Handwritten Characters in Recognition process for Gurmukhi Script

Optical Character Recognition

Scene Text Detection Using Machine Learning Classifiers

Review on Optical Character Recognition and Signature Recognition and Verification Technique

Bus Detection and recognition for visually impaired people

II. WORKING OF PROJECT

DATABASE DEVELOPMENT OF HISTORICAL DOCUMENTS: SKEW DETECTION AND CORRECTION

Book Reader for Blind

A Study to Recognize Printed Gujarati Characters Using Tesseract OCR

A Document Image Analysis System on Parallel Processors

A Smart App for Mobile Phones to Top-Up User Accounts for Any Network Service Provider in SriLanka

Multi-Layer Perceptron Network For Handwritting English Character Recoginition

SKEW DETECTION AND CORRECTION

International Journal of Scientific & Engineering Research, Volume 8, Issue 3, March ISSN

AN APPROACH FOR SCANNING BASIC BUSINESS CARD IN ANDROID DEVICE

6. Applications - Text recognition in videos - Semantic video analysis

Optical Character Recognition (OCR) for Printed Devnagari Script Using Artificial Neural Network

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

Note: TAO OCR is a derivitive of the same OCR engine used in Microsoft's "Seeing AI" application!

I. INTRODUCTION. Figure-1 Basic block of text analysis

Hand Written Character Recognition using VNP based Segmentation and Artificial Neural Network

Mobile Camera Based Text Detection and Translation

OCR For Handwritten Marathi Script

Seminar. Topic: Object and character Recognition

Suresha M Department of Computer Science, Kuvempu University, India

System Based on Voice and Biometric Authentication

An Efficient Character Segmentation Based on VNP Algorithm

Discovering Computers Chapter 5 Input. CSA 111 College of Applied Studies UOB

Skew Detection and Correction of Document Image using Hough Transform Method

Portable Camera Based Assistive Text and Label Reading From Hand-Held Object for Visually Impaired Person

I. INTRODUCTION ABSTRACT

Block Diagram. Physical World. Image Acquisition. Enhancement and Restoration. Segmentation. Feature Selection/Extraction.

Automatic Recognition and Verification of Handwritten Legal and Courtesy Amounts in English Language Present on Bank Cheques

Character Recognition from Google Street View Images

Discovering Computers Chapter 5 Input

Strategic White Paper

Handwriting Recognition of Diverse Languages

GABOR WAVELETS FOR HUMAN BIOMETRICS

A New Approach to Detect and Extract Characters from Off-Line Printed Images and Text

A VOICE BASED PRODUCT IDENTIFICATION FOR BLIND PERSONS

LECTURE 6 TEXT PROCESSING

RECOGNITION OF DRIVING MANEUVERS BASED ACCELEROMETER SENSOR

OPTICAL CHARACTER RECOGNITION Ku. Priti V. Manjare 1, Vaishali B. Farkade 2, Mr. Jha 3

Indian Currency Recognition Based on ORB

An Improvement Study for Optical Character Recognition by using Inverse SVM in Image Processing Technique

Simulation of Zhang Suen Algorithm using Feed- Forward Neural Networks

Face Cyclographs for Recognition

Handwritten Hindi Character Recognition System Using Edge detection & Neural Network

Image Normalization and Preprocessing for Gujarati Character Recognition

The PAGE (Page Analysis and Ground-truth Elements) Format Framework

Skew Angle Detection of Bangla Script using Radon Transform

International Journal of Signal Processing, Image Processing and Pattern Recognition Vol.9, No.2 (2016) Figure 1. General Concept of Skeletonization

Biometric Palm vein Recognition using Local Tetra Pattern

Text Extraction from Images

Optical Character Recognition

Carmen Alonso Montes 23rd-27th November 2015

Handwritten Character Recognition with Feedback Neural Network

2. Basic Task of Pattern Classification

Unicode. Standard Alphanumeric Formats. Unicode Version 2.1 BCD ASCII EBCDIC

2009 International Conference on Emerging Technologies

ABSTRACT I. INTRODUCTION. Dr. J P Patra 1, Ajay Singh Thakur 2, Amit Jain 2. Professor, Department of CSE SSIPMT, CSVTU, Raipur, Chhattisgarh, India

A Survey on Portable Camera-Based Assistive Text and Product Label Reading From Hand-Held Objects for Blind Persons

Automatically Algorithm for Physician s Handwritten Segmentation on Prescription

Adobe Acrobat X Pro Accessibility Guide: Best Practices for PDF Accessibility

Summary Table Criteria Supporting Features Remarks and explanations. Supports with exceptions. Supports with exceptions. Supports with exceptions

Recognizing Deformable Shapes. Salvador Ruiz Correa Ph.D. UW EE

Cursive Handwriting Recognition

Character Recognition Using Matlab s Neural Network Toolbox

A Generalized Method to Solve Text-Based CAPTCHAs

Ethiopic Document Image Database for Testing Character Recognition Systems

Handwritten Marathi Character Recognition on an Android Device

International Journal of Computer Engineering and Applications, Volume XII, Special Issue, March 18, ISSN

NEW ALGORITHMS FOR SKEWING CORRECTION AND SLANT REMOVAL ON WORD-LEVEL

OCR and Automated Translation for the Navigation of non-english Handsets: A Feasibility Study with Arabic

Chapter 7. Input and Output. McGraw-Hill/Irwin. Copyright 2008 by The McGraw-Hill Companies, Inc. All rights reserved.

Solving Word Jumbles

Cloud Based Mobile Business Card Reader in Tamil

On-line handwriting recognition using Chain Code representation

Transcription:

Impact Factor (SJIF): 5.301 International Journal of Advance Research in Engineering, Science & Technology e-issn: 2393-9877, p-issn: 2394-2444 Volume 5, Issue 3, March-2018 IMPLEMENTING ON OPTICAL CHARACTER RECOGNITION USING MEDICAL TABLET FOR BLIND PEOPLE Mrs.M.M Bidwe 1, Muskan Jodhwani 2, Diksha Jadhav 3, Muskan Baig 4, Priyam Tiwari 5 1 madhuribidwe@gmail.com, 2 muskanjodhwani1@gmail.com, 3 dikhajadhav57958@gmail.com, 4 muskanbaig1709@gmail.com, 5 priyamtiwarisbc@gmail.com 1 Dr.D.Y Patil Polytechnic,Akurdi 2 Dr.D.Y Patil Polytechnic,Akurdi 3 Dr.D.Y Patil Polytechnic,Akurdi 4 Dr.D.Y Patil Polytechnic,Akurdi 5 Dr.D.Y Patil Polytechnic,Akurdi Abstract At the present time, keyboarding remains the most common way of inputting data into computers. This is probably the most time consuming and labor intensive operation. OCR is the machine replication of human reading and has been the subject of intensive research for more than three decades. OCR can be described as Mechanical or electronic conversion of scanned images where images can be handwritten, typewritten or printed text. It is a method of digitizing printed texts so that they can be electronically searched and used in machine processes. This system presents a simple, efficient, and less costly approach to construct OCR for reading any document that has fix font size and style or handwritten style using the android application for medical prescription. To achieve efficiency and less computational cost, OCR in this system uses on medical tablets for blind peoples which makes this OCR very simple to manage. Keywords - document image analysis(dia), Raspberry PI, audio output, OCR based book reader I. INTRODUCTION The advancements in pattern recognition has accelerated recently due to the many emerging applications which are not only challenging, but also computationally more demanding, such evident in Optical Character Recognition (OCR), Document Classification, Computer Vision, Data Mining, Shape Recognition, and Biometric Authentication, for instance. The area of OCR is becoming an integral part of document scanners, and is used in many applications such as postal processing, script recognition, banking, security (i.e. passport authentication) and language identification. The research in this area has been ongoing for over half a century and the outcomes have been astounding with successful recognition rates for printed characters exceeding 99%, with significant improvements in performance for handwritten cursive character recognition where recognition rates have exceeded the 90% mark. Nowadays, many organizations are depending on OCR systems to eliminate the human interactions for better performance and efficiency. Optical Character Recognition also referred to as OCR is a system that provides a full alphanumeric recognition of printed or handwritten characters at electronic speed by simply scanning the document. Documents are scanned using a scanner and are given to the OCR systems which recognizes the characters in the scanned documents and converts them into ASCII data. Visually impaired people report numerous difficulties with accessing printed text using existing technology, including problems with alignment, focus, accuracy, mobility and efficiency. We present a smart device that assists the visually impaired which effectively and efficiently reads paper-printed text. The proposed project uses the methodology of a camera based assistive device that can be used by people to read Text document. The framework is on implementing image capturing technique in an embedded system based on Raspberry Pi board. The design is motivated by preliminary studies with visually impaired people, and it is small-scale and mobile, which enables a more manageable operation with little setup. In this project we have proposed a text read out system for the visually challenged. The proposed fully integrated system has a camera as an input device to feed the printed text document for digitization and the scanned document is processed by a software module the OCR (optical character recognition engine). A methodology is 639

implemented to recognition sequence of characters and the line of reading. As part of the software development the Open CV (Open source Computer Vision) libraries is utilized to do image capture of text, to do the character recognition. Most of the access technology tools built for people with blindness and limited vision are built on the two basic building blocks of OCR software and Text-to-Speech (TTS) engines. Optical character recognition (OCR) is the translation of captured images of printed text into machine encoded text. OCR is a process which associates a symbolic meaning with objects (letters, symbols an number) with the image of a character. It is defined as the process of converting scanned images of machine printed into a computer process able format. Optical Character recognition is also useful for visually impaired people who cannot read Text document, but need to access the content of the Text documents. Optical Character recognition is used to digitize and reproduce texts that have been produced with non-computerized system. Digitizing texts also helps reduce storage space. Editing and Reprinting of Text document that were printed on paper are time consuming and labor intensive. It is widely used to convert books and documents into electronic files for use in storage and document analysis. OCR makes it possible to apply techniques such as machine translation, text-to-speech and text mining to the capture / scanned page. The final recognized text document is fed to the output devices depending on the choice of the user. The output device can be a headset connected to the raspberry pi board or a speaker which can spell out the text document aloud Gives an algorithm for detecting and reading text in natural images for the use of blind and visually impaired subjects walking through city scenes. The overall algorithm has a success rate of over 90% on the test set and the unread text is typically small and distant from the viewer. have proposed a novel scheme for the extraction of textual areas of an image using globally matched wavelet filters. A clustering based technique has been devised for estimating globally matched wavelet filters using a collection of ground truth images. proposes a support vector machine (SVM) is used to analyses the textual properties of texts. The combination of CAMSHIFT and SVMs produces both robust and efficient text detection. tells about the navigational technologies available to blind individuals to support independent travel, our focus is on blind navigation on large scale. II. ARTIFICIAL INTELLIGENCE It is the field of computer science which makes the system to behave intelligently by various process of training and cognitive learning. It is the one of the branch of computer science which aims to imitate human vision and form the basis of image acquisition, processing, document understanding and recognition. A far more streamlined field of Document Recognition and understanding is Optical Character Recognition which attempts to identify a single character from an optically read text image as a part of a word that can be then used to process further information on. The area gains rising significance as more and more information each day needs to store processed and retrieved rather than being keyed in from an already present printed or handwritten source. III. CHARACTER RECOGNITION Character recognition is a sub-field of pattern recognition in which images of characters from a text image are recognized and as a result of recognition respective character codes are returned, these when rendered give the text in the image. The problem of character recognition is the problem of automatic recognition of raster images as being letters, digits or some other symbol and it is like any other problem in computer vision. IV. FLOW OF THE PROJECT The below fig is illustrates the overall functioning of Optical Character Recognition (OCR). The input image can be any document, live text, journals, magazines etc. The functioning of OCR contains the following steps: scanning, segmentation, pre-processing, feature extraction, recognition. The input is first scanned using an Android mobile camera. This is done to digitize the document. Segmentation extracts any symbols in the text region. Noise is removed by preprocessing each symbol, and the characteristics of each symbol are extracted using feature extraction to finally recognize the text. 640

IMAGE CAPTURING The first step in which the device is moved over the printed page and the inbuilt camera captures the images of the text. The quality of the image captured will be high so as to have fast and clear recognition due to the high resolution camera PRE-PROCESSING Pre-processing stage consists of three steps: Skew Correction, Linearization and Noise removal. The captured image is checked for skewing. There are possibilities of image getting skewed with either left or right orientation. Here the image is first brightened and binaries The function for skew detection checks for an angle of orientation between ±15 degrees and if detected then a simple image rotation is carried out till the lines match with the true horizontal axis, which produces a skew corrected image. The noise introduced during capturing or due to poor Quality of page has to be cleared before further processing of an image patch into a feature vector. Adjacent character grouping is performed to calculate candidates of text patches prepared for text classification. An Adaboost learning model is employed to localize text in camera-based images. Off-the-shelf OCR is used to perform word recognition on the localized text regions and transform into audio output for blind users. In this research, the camera acts as input for the paper. As the Raspberry Pi board is powered the camera starts streaming. The streaming data will be displayed on the screen using GUI application. When the object for label reading is placed in front of the camera then the capture button is clicked to provide image to the board. Using Tesseract library the image will be converted into data and the data detected from the image will be shown on the status bar. The obtained data will be pronounced through the ear phones using Flite library. V. THE PROJECT OUTPUT SCREENSHOTS 641

642

VI. ADVANTAGES 1. Much faster than someone manually entering large amounts of text 2. Easy to implement. 3. Easy to use because android application. VII. DRAWBACKS 1. All documents need to be checked over carefully and then manually corrected 2. Sometimes not recognized properly. VIII. FUTURE WORK The proposed system is that it is easily portable and its scalability which can recognize English languages. In future scope we implementing different languages for optical character recognition. IX. CONCLUSION This system tells about Optical character recognition for handheld devices in recognizing characters in offline mode. The system has the ability to recognize characters with accuracy exceeding 90% mark. The advantage of the system is that it is easily portable and its scalability which can recognize English languages. Recognition is often followed by a postprocessing stage. If post-processing is done on the output image, the accuracy can be increased. The future scope is to develop software for automatic editing and searching. REFERENCES [1] Heuristic-Based OCR Post-Correction for Smart Phone Applications, the University of North Carolina at Chapel Hill department of computer science honors thesis Author: Wing-Soon Wilson Lian 2009. [2] R.W. Smith, The Extraction and Recognition of Text from Multimedia Document Images, PhD Thesis, University of Bristol, November 1987. [3] The Tesseract open source OCR engine, http://code.google.com/p/tesseract-ocr. [4]R. Smith. An overview of the Tesseract OCR Engine. Proc 9th Int.Conf. on Document Analysis and Recognition, IEEE, Curitiba, Brazil,Sep 2007 [5] α-soft: An English Language OCR, 2010 Second InternationalConference on Computer Engineering and Applications. Junaid Tariq,Umar Nauman Muhammad UmairNaru 643