Automatically Algorithm for Physician s Handwritten Segmentation on Prescription

Similar documents
Scene Text Detection Using Machine Learning Classifiers

OCR For Handwritten Marathi Script

A Laplacian Based Novel Approach to Efficient Text Localization in Grayscale Images

OTCYMIST: Otsu-Canny Minimal Spanning Tree for Born-Digital Images

Segmentation Framework for Multi-Oriented Text Detection and Recognition

Topic 4 Image Segmentation

HCR Using K-Means Clustering Algorithm

Text Extraction from Natural Scene Images and Conversion to Audio in Smart Phone Applications

Motion Detection. Final project by. Neta Sokolovsky

Recognition of Gurmukhi Text from Sign Board Images Captured from Mobile Camera

CHAPTER 1 INTRODUCTION

Time Stamp Detection and Recognition in Video Frames

Fine Classification of Unconstrained Handwritten Persian/Arabic Numerals by Removing Confusion amongst Similar Classes

Detecting Printed and Handwritten Partial Copies of Line Drawings Embedded in Complex Backgrounds

Extraction Characters from Scene Image based on Shape Properties and Geometric Features

Text Information Extraction And Analysis From Images Using Digital Image Processing Techniques

An ICA based Approach for Complex Color Scene Text Binarization

II. WORKING OF PROJECT

Texture Image Segmentation using FCM

Handwritten Devanagari Character Recognition Model Using Neural Network

AN EFFICIENT BINARIZATION TECHNIQUE FOR FINGERPRINT IMAGES S. B. SRIDEVI M.Tech., Department of ECE

Bus Detection and recognition for visually impaired people

An Approach to Detect Text and Caption in Video

CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM

A Fast Caption Detection Method for Low Quality Video Images

Optical Character Recognition (OCR) for Printed Devnagari Script Using Artificial Neural Network

A reversible data hiding based on adaptive prediction technique and histogram shifting

A New Approach to Detect and Extract Characters from Off-Line Printed Images and Text

An Introduction to Content Based Image Retrieval

Handwritten Script Recognition at Block Level

TRANSPARENT OBJECT DETECTION USING REGIONS WITH CONVOLUTIONAL NEURAL NETWORK

I. INTRODUCTION. Figure-1 Basic block of text analysis

EXTRACTING TEXT FROM VIDEO

An Efficient Character Segmentation Based on VNP Algorithm

Multi-scale Techniques for Document Page Segmentation

Analysis of Image and Video Using Color, Texture and Shape Features for Object Identification

Recognition-based Segmentation of Nom Characters from Body Text Regions of Stele Images Using Area Voronoi Diagram

Moving Object Segmentation Method Based on Motion Information Classification by X-means and Spatial Region Segmentation

A Hierarchical Pre-processing Model for Offline Handwritten Document Images

Overlay Text Detection and Recognition for Soccer Game Indexing

International Journal of Advance Research in Engineering, Science & Technology

SUMMARY: DISTINCTIVE IMAGE FEATURES FROM SCALE- INVARIANT KEYPOINTS

Image Segmentation Techniques

Cardio-Thoracic Ratio Measurement Using Non-linear Least Square Approximation and Local Minimum

Text Detection and Extraction from Natural Scene: A Survey Tajinder Kaur 1 Post-Graduation, Department CE, Punjabi University, Patiala, Punjab India

Ulrik Söderström 16 Feb Image Processing. Segmentation

Robotics Programming Laboratory

Connected Component Analysis and Change Detection for Images

EXTRACTING GENERIC TEXT INFORMATION FROM IMAGES

TRAFFIC SIGN RECOGNITION USING A MULTI-TASK CONVOLUTIONAL NEURAL NETWORK

Blood Microscopic Image Analysis for Acute Leukemia Detection

HANDWRITTEN GURMUKHI CHARACTER RECOGNITION USING WAVELET TRANSFORMS

Color-Texture Segmentation of Medical Images Based on Local Contrast Information

Binarization of Color Character Strings in Scene Images Using K-means Clustering and Support Vector Machines

ABSTRACT 1. INTRODUCTION 2. RELATED WORK

Texture Segmentation by Windowed Projection

Content based Image Retrieval Using Multichannel Feature Extraction Techniques

Speeding up the Detection of Line Drawings Using a Hash Table

Random spatial sampling and majority voting based image thresholding

Artifacts and Textured Region Detection

Spam Filtering Using Visual Features

International Journal of Electrical, Electronics ISSN No. (Online): and Computer Engineering 3(2): 85-90(2014)

Identifying Layout Classes for Mathematical Symbols Using Layout Context

Including the Size of Regions in Image Segmentation by Region Based Graph

Real-Time Document Image Retrieval for a 10 Million Pages Database with a Memory Efficient and Stability Improved LLAH

RESTORATION OF DEGRADED DOCUMENTS USING IMAGE BINARIZATION TECHNIQUE

A Generalized Method to Solve Text-Based CAPTCHAs

A Model-based Line Detection Algorithm in Documents

Texture Sensitive Image Inpainting after Object Morphing

Available online at ScienceDirect. Procedia Computer Science 45 (2015 )

Keywords Connected Components, Text-Line Extraction, Trained Dataset.

A New Algorithm for Detecting Text Line in Handwritten Documents

Content Based Medical Image Retrieval Using Fuzzy C- Means Clustering With RF

A New Feature Local Binary Patterns (FLBP) Method

LECTURE 6 TEXT PROCESSING

Layout Segmentation of Scanned Newspaper Documents

IMPLEMENTING ON OPTICAL CHARACTER RECOGNITION USING MEDICAL TABLET FOR BLIND PEOPLE

An Edge Detection Algorithm for Online Image Analysis

MICROARRAY IMAGE SEGMENTATION USING CLUSTERING METHODS

Part 3: Image Processing

An Objective Evaluation Methodology for Handwritten Image Document Binarization Techniques

Extraction and Recognition of Alphanumeric Characters from Vehicle Number Plate

Cursive Handwriting Recognition System Using Feature Extraction and Artificial Neural Network

Unique Journal of Engineering and Advanced Sciences Available online: Research Article

Features Points. Andrea Torsello DAIS Università Ca Foscari via Torino 155, Mestre (VE)

Available Online through

Color Image Segmentation

Chapter 11 Representation & Description

IMAGE S EGMENTATION AND TEXT EXTRACTION: APPLICATION TO THE EXTRACTION OF TEXTUAL INFORMATION IN SCENE IMAGES

12/12 A Chinese Words Detection Method in Camera Based Images Qingmin Chen, Yi Zhou, Kai Chen, Li Song, Xiaokang Yang Institute of Image Communication

Adaptive Local Thresholding for Fluorescence Cell Micrographs

Signboard Optical Character Recognition

Content Based Image Retrieval (CBIR) Using Segmentation Process

Motion Detection Algorithm

Mingle Face Detection using Adaptive Thresholding and Hybrid Median Filter

Dot Text Detection Based on FAST Points

A Review of various Text Localization Techniques

EDGE BASED REGION GROWING

Object Tracking Algorithm based on Combination of Edge and Color Information

Discovering Visual Hierarchy through Unsupervised Learning Haider Razvi

Transcription:

Automatically Algorithm for Physician s Handwritten Segmentation on Prescription Narumol Chumuang 1 and Mahasak Ketcham 2 Department of Information Technology, Faculty of Information Technology, King Mongkut's University of Technology North Bangkok, Thailand 1 lecho20@hotmail.com, s5407011910031@email.kmutnb.ac.th 2 mahasak.k@it.kmutnb.ac.th Abstract - Text detection is an important prerequisite for many content-based image analysis tasks. In this paper proposed a novel approach for physician's handwritten text region detection algorithm based on the threshold and the basis of 4 and 8-connected component. The handwriting style of people are privacy. Especially, the physician's handwritten is unique and particular word like medical name, symbolic, symptoms of disease. The prescription form from different health care are different layout then using blank form to delete the recorded form is not enough because that is fix and restrict with form. Our method do not adhered to lay out the form but we used interest of region. The good experimental result shows handwritten text detection and extraction. intensity of light during the pixel. Text detection and complete is a challenging issue. In particular, he main reasons are as follows: first the text detection on low quality or the intensity of light is not uniform across the entire image. The less difference between the foreground and the background is a cause. The last there are a lot of lines are a closer proximity between the adjacent pixel resolutions they make the complete text detection are more difficult. For text detection also developing continuously and text detection in natural scene [3] text line detection [4] and extraction in video frames [5], the text detection on document [6-7] with variety techniques. All of these papers used print texts detected which have extracted pattern that is not difficult. Keywords - Intelligent, Algorithm, Handwritten, Text Detection, Prescription I. INTRODUCTION Text detection has been a popular area of research since few decades under the purview of image processing and pattern recognition. Text detection is a technique in image processing. The purpose to find out the extent of characters on the image or video [1]. This method uses the basic principles of the edge of the image is more pronounced in order to make the boundaries [2]. The regions of image are obvious difference from the intensity of light from one pixel to another pixel on image and the contiguous. The edges are more pronounced or not it s depending on the Fig. 1 A Sample of Prescription. 123

Automatically Algorithm for Physician s Handwritten Segmentation on Prescription Prescription in agencies that provide health care, such as hospitals and public hospitals and private health, including public health clinics open during special treatment. It found that each prescription that looks placement are different record, see in Fig. 1. Along with the detail in the note such as the diagnosis, the patient's symptoms, drug name, dose, characteristics of drugs, physician s signature these are recorded by physicians who patient owner. All of details on a prescription has the printed text, the symbols and the handwritten text. The challenge of this paper is focus on handwritten [5-6] text detection on prescription form. The remaining part of this paper is organized as follows. The next section reviews the work done on text detection for various techniques domains. Section third introduces the proposed method. Section fourth the experiment for physician's handwritten text detection on prescription. Section fifth, we demonstrate for discussion in this paper. Then the conclusion and future work are illustrated. II. RELATED WORK First review methods for text detection in natural scenes and point out their inadequacies all of image and video. Then we survey the methods generally used for video text detection to show their deficiency in dealing with natural scene text. This leads us to the research gap that we will address by means of our proposed method. Shi C. et al. [7], use graph model built upon maximally stable extreme regions (MSER) for text detection. The four contributions are detected in the original image, which are shown to be suitable for text detection. Special features effective was designed for MSERs and used to train a classifier to estimate the probability text. The graph model that cost function incorporates region-based as well as context-relevant information. The different information carried by the function optimally balanced to get labeling result by minimizing the cost function via graph algorithm. This method shown encouraging performance, especially in the precision. The recall still needs improvement because by disturb of the illumination condition. Yin, Xu-Cheng, et al. [8], design a fast and effective pruning algorithm to extract MSERs as character detection using the strategy of minimizing. The proposed system is evaluated on the ICDAR 2011 regularized variations dataset. The result shown the f-measure is over 76%, much better than the state-of-the-art performance of 71%. Three limitations of this algorithm can t detected well on highly blurred natural scene images. The non-english characters detect poorly. The detection multiorientation, especially for similar multiple text lines with a seriously skewed distortion. Fraz, Muhammad et al. [9], proposed propose an end-to-end scene text recognition system using exploited color information for better scene text detection and recognition. The main contribution of this paper is to demonstrate that the color information within the images if efficiently exploited is good enough to identify text regions from the surrounding noise. In the part of text detection this algorithm can detected and recognition in various print text using color-based connected components generation and color clustering. All the detected text regions are analyzed with respect to their adjacent text regions. Two nearby regions possess similarities in color and size, the bounding boxes of these text regions are merged into a single bounding box. They solved this problem with a post-processing where the words are separated from each other on the basis of horizontal distance. The distance between the letters of each word is computed. If the distance between letters is found to be greater than a threshold, then it is an indication for breaking the chain into two words and so on. The threshold for separating the letters is empirically computed using the window sizes. Wu, Liang, et al. [10], presented technique for detecting and tracking video texts of any orientation by using spatial and temporal information, respectively. They combined potential text candidates regardless of 124

Narumol Chumuang and Mahasak Ketcham orientation based on the nearest neighbor criterion. To tackle the problems of multi-font and multi-sized texts, they propose multi-scale integration by a pyramid structure, which helps in extracting full text lines. The result of precision, recall and f measure are well for print text. Kumar, Punith et al. [11], suggested detect and extract the moving text from the news video using hybrid technology in association of edge detection along with the component connectivity analysis. This algorithm is easy to understand and the results in experiment shown well but they did not test this algorithm with the TV news channel in other characters. III. PROPOSED METHOD of physician's handwritten text detection on prescription shown in Fig. 2. The prescription form has printed and handwritten texts. Moreover Fig. 2, displayed the diagram which can expand the detail follow each block step as: Step 1: Input Prescription Image The input prescription image should be standard size 1024x768 pixels and converted to gray scale with (1), f = (0.3xR)+(0.59xG)+(0.11xB) (1) where f is the prescription written form image. R represents red color, G and B represent green and blue color, respectively and convert gray scale to binary with (2), 1; f ( x, y) 0 P ( x, y) (2) 0; f ( x, y) 0 where P(x, y) is the coordinates of each pixel in the binary image, f(x, y) is the coordinates of each pixel in the grayscale. Step 2: Interested Region Detection The prescription form are different layout then the region of interest on prescription form was set. Our interested area was set to table see in Fig. 3 and 4. Fig. 2 A Block Diagram of Physician's Handwritten Text Detection. The handwritten text detection on form has quite impact and influence on lot many image processing applications for extract handwritten text to recording into database like detection and extraction the handwritten text on immigration form, detection and extraction the handwritten text form, etc. The block diagram Fig. 3 Region of Interest. 125

Automatically Algorithm for Physician s Handwritten Segmentation on Prescription In Fig. 5(b) and (c) indicated to remove grid and line which remain the group of text only. Fig. 4 Sample of Converted Images Gray Scale to Binary Image. Step 4: Text Segmentation Our text segmentation method the histogram and connection components are used in this step. Histogram represent the frequency of the pixel area of interest. The horizontal histogram used for splitting line break and the vertical used for splitting character. The splitting decision at the min point which it has minimum or no has black pixel, see in Fig. 6. This step set the interest region that is the handwritten text area but it still has noise like grid, lines etc. Step 3: Noise Reduction The noise reduction step, we propose reduce noise follow as: Removable noise at the edges. Eliminate interference with less than threshold. Eliminate interference is larger than threshold. Removing the noise, the lines and grids. We set threshold and compute on the binary image. Threshold in our system has two value and the result displayed in Fig. 5. If T > 340 pixels then delete and if T< 5 pixel then delete where T is intensity of black pixels. Min for Split Fig. 6 The Example of Text Segmentation. Then connection component is used to search the pixels are continue connection. If their edges touch that s means that pair of adjoining pixels is part of the same character only if they are both on and are connected along the horizontal or vertical aspect. The connected components in our system there are two methods that are 4 and 8 connection components shown in Fig. 7. Step 5: Text Clustering Afterwards, finish splitting line breaks and splitting character with histogram and considered the connection component pixel. We cogitated text clustering for crop handwritten text image. The measurements for the set of properties specified for each connected component in the binary. The contiguous regions are label matrix containing contiguous regions might look like this as Fig. 8. Fig. 5 Reduction Noise (a) Original, (b) Threshold > 340, and (c) Threshold <5. 126

Narumol Chumuang and Mahasak Ketcham dataset in our experiment are the recorded prescription form. The results in text detection and extraction illustrate in Fig. 9. We enlarge the image in each line break to obvious. Fig. 7 Connected Components 4 Connected Component and 8 Connected Component. Elements of L equal to 1 belong to the first continuous region or connected component. The elements of L equal to 2 belong to the second connected component and so on. Discontinuous regions are regions that might contain multiple connected components. A label matrix containing discontinuous regions might look like this: Elements of L equal to 1 belong to the first region, which is discontinuous and contains two connected components. Elements of L equal to 2 belong to the second region, which is a single connected component. Then all text will crop shown in Fig. 9. Finally, the last output text that were crop by our algorithm can isolate to text image. We enlarge the image in each line break to obvious up and display the text extracted in Fig. 8. Fig. 8 Handwritten Text Detection. Fig. 8, shown the example split both of handwritten and print text, the numeral and symbols images that it can detected. This is a problem is our future task. IV. THE EXPERIMENT AND RESULTS The proposed technique consists of a major contributions, that is, handwritten text detection. The Fig. 9 The Example of Handwritten Text Detection and Extracted. Furthermore, 131 texts on this prescription are detected. We separated 73 print texts and 58 handwritten texts were detected. The correctly rate for the handwritten text detection and extraction was 42 texts or 87.34%. V. CONCLUSION In this paper, we propose a new method for handwritten text detecting. The contribution of this paper is the first algorithm for physician s handwritten text detection in prescribing on form. This method used the histogram and connected components with challenge physician s handwritten on prescription. The results displayed the accuracy rate 87.34%. We found many the problems for future work: noise reduction made data lose task, adaptive threshold task, handwritten different style task, the lexicon about medicine, the symbol concerns in prescribing, giving complete endto-end performance. These are pre-problem that want to improvement for detect handwritten. The recognition is an important technique for handwritten style and it can elongate rate of accuracy. REFERENCES (Arranged in the order of citation in the same fashion as the case of Footnotes.) [1] Cai-Xia, D., Gui-Bin, W., and Xin-Rui, Y. (2013). Image edge detection algorithm based on improved canny operator. In: Wavelet Analysis and Pattern Recognition (ICWAPR), 2013 International Conference on, IEEE, p. 168-172. 127

Automatically Algorithm for Physician s Handwritten Segmentation on Prescription [2] Phan, T.Q., Shivakumara, P., Tian, S., and Tan, C.L. (2013). Recognizing text with perspective distortion in natural scenes. In: Proc of ICCV. [3] Uchida, S. (2014). Text localization and recognition in images and video. Handbook of Document Image Processing and Recognition, p. 843-883. [4] Rege, P.P. and Chandrakar, C.A. (2012). Text-Image Separation in Document Images Using Boundary Perimeter Detection. ACEEE Int. 1, On Signal & Image Processing, Vol. 3(1). [5] Chumuang, N. and Ketcham, M. (2014). Intelligent handwriting Thai Signature Recognition System based on artificial neuron network. TENCON 2014-2014 IEEE Region 10 Conference, Bangkok, p. 1-6. [6] Hnoohom, N., Chumuang, N., and Ketcham, M. (2015). Thai Handwritten Verification System on Documents for the Investigation. 2015 11 th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), Bangkok, p. 617-622. [7] Cunzhao, S. (2013). Scene text detection using graph model built upon maximally stable extremal regions. Pattern Recognition Letters, Vol. 34(2), p. 107-116. [8] Xu-Cheng, Y. (2014). Robust text detection in natural scene images. Pattern Analysis and Machine Intelligence, IEEE Transactions on, Vol. 36(5), p. 970-983. [9] Muhammad, F., Sarfraz, M.S., and Edirisinghe, E.A. (2015). Exploiting colour information for better scene text detection and recognition. International Journal on Document Analysis and Recognition (IJDAR), Vol. 18(2), p. 153-167. [10] Liang, W. (2015). A New Technique for Multi-Oriented Scene Text Lines Detection and Tracking in Video. [11] Punith, K. and Puttaswamy, P.S. (2015). Moving text line detection and extraction in TV video frames. In: Advance Computing Conference (IACC), 2015 IEEE International, IEEE, p. 24-28. 128