Document Image Restoration Using Binary Morphological Filters. Jisheng Liang, Robert M. Haralick. Seattle, Washington Ihsin T.


Document Image Restoration Using Binary Morphological Filters

Jisheng Liang, Robert M. Haralick
University of Washington, Department of Electrical Engineering, Seattle, Washington 98195

Ihsin T. Phillips
Seattle University, Department of Computer Science, Seattle, Washington 98122

ABSTRACT

This paper discusses a method for binary morphological filter design to restore document images degraded by subtractive or additive noise, given a constraint on the size of the filters. With a filter size restriction (for example, 3 × 3), each pixel in the output image depends only on its 3 × 3 neighborhood in the input image. Therefore, we can construct a look-up table between input and output, and each output image pixel is determined by this table. Filter design thus becomes the search for the optimal look-up table. By considering the degradation condition of the input image, we provide a methodology for knowledge-based look-up table design that achieves computational tractability. The methodology can be applied iteratively, so that the final output image is the input image after being transformed through successive 3 × 3 operations. An experimental protocol is developed for restoring degraded document images and improving the corresponding recognition accuracy rates of an OCR algorithm. We present results for a set of real images which are manually ground-truthed. The performance of each filter is evaluated by the OCR accuracy.

Keywords: document image restoration, morphological filtering, OCR, look-up table.

1 INTRODUCTION

During document image generation processes, such as scanning and digitization, the images are usually corrupted by subtractive as well as additive noise. We wish to design a filter to restore a class of document images with similar structural features and degradation conditions using morphological operations.

Earlier work on image restoration can be divided into two groups. Linear filters are based on the assumption of linear, space-invariant degradations. The restoration techniques can be carried out in the frequency domain by means of a 2-D FFT algorithm. The linear filter is easy to design and analyze, and classical Wiener filtering provides a mathematically simple and computationally efficient optimization process. But document images often have sharp edges, binary structures, and a limited number of bits per pixel. The restriction that the estimation rule be a linear combination of observed values is typically unsuitable [2].

Mathematical formalisms for practical design of nonlinear optimal filters are not generally available. Optimal nonlinear filters are often found by computationally intensive search procedures. To mitigate the search problem, Loce [2] has employed various optimization constraints, such as a window constraint and a library constraint.

In this paper we discuss a method for binary morphological filter design to restore document images degraded by subtractive or additive noise, given a constraint on the size of the filters. With the filter size restriction (for example, 3 × 3), each pixel in the output image depends only on its 3 × 3 neighborhood in the input image. Therefore, we can construct a look-up table between input and output, and each output image pixel is determined by this table. Filter design thus becomes the search for the optimal look-up table. By considering the degradation condition of the input image, we provide a methodology for knowledge-based look-up table design that achieves computational tractability.

The earlier restoration techniques are oriented toward modeling the degradation and applying the inverse process in order to recover the original image. This approach usually involves formulating a criterion of goodness that will yield some optimal estimate of the desired result [1]. The main problem is the validation of the degradation model.

2 PROBLEM STATEMENT

Let I denote the input real image, and let R denote the output restored image. Any morphological filtering can be expressed as a mapping from I to R:

$R = F(I)$.

Our objective is to search for an optimal filter F that restores the degraded input image and improves the corresponding recognition accuracy rate of an OCR algorithm. The filter is constrained to a 3 × 3 size.
3 FILTER DESIGN ALGORITHM

With the 3 × 3 size restriction, each pixel k in the output image depends only on its 3 × 3 neighborhood in the input image (see Figure 1):

$k = f(k_1, k_2, k_3, k_4, k_5, k_6, k_7, k_8, k_9)$,

where $k_1, \dots, k_9$ are the elements of the 3 × 3 neighborhood, shown in Table 1. We call each possible 3 × 3 neighborhood a pattern. Each input pattern is associated with a binary 1 or 0 on the output. Therefore, we can construct a look-up table between input and output (see Table 2).

Figure 1: Mapping between input and output images (a 3 × 3 neighborhood in input image I determines a binary 1 or 0 in output image R).

    k1  k2  k3
    k4  k5  k6
    k7  k8  k9

Table 1: 3 × 3 pattern

    k1, k2, k3, k4, k5, k6, k7, k8, k9    k
    000000000                             0 or 1
    100000000                             0 or 1
    ...                                   0 or 1
    111111111                             0 or 1

Table 2: The look-up table between input and output

Each output image pixel is determined by this look-up table, so filter design becomes the search for an optimal look-up table. However, there are $2^9 = 512$ different patterns and $2^{512}$ possible tables. Searching through all the possible combinations is computationally burdensome.

3.1 Generation of Knowledge-Based Table

We have developed a methodology to generate a knowledge-based table by considering the degradation types of the input image. We partition the 512 possible input patterns into L different subsets by considering the number of 4-connected components in each pattern. 4-connected means that only the north, south, east, and west neighbors of a pixel are considered connected to it. All patterns in a subset should produce the same output.

If we have some knowledge of the degradation process and the nature of the ideal input image, we can determine m subsets which must produce binary-1 outputs and n subsets which must produce binary-0 outputs, by considering the contribution of each subset to restoration. There are then $L - m - n$ remaining subsets for which we must determine whether it is better for the output to be a binary-1 or a binary-0. We search through all the possible combinations and choose the optimal one. The size of the search thus decreases from $2^L$ to $2^{L-m-n}$.

3.1.1 Table Generation for Subtractive Noise

We assume that the noise process is one that creates holes in the image, so whether a 3 × 3 pattern produces output 1 or 0 depends on its contribution to hole filling. For subtractive noise, all 3 × 3 patterns whose central pixel is 1 should produce binary-1 outputs. Filters with this property are called foreground preservers.
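The look-up-table view of a 3 × 3 filter described in Section 3 can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the helper names `neighborhood`, `apply_table`, and `toy_subtractive_table` are hypothetical, pixels outside the image are treated as 0, and the single hole-filling rule (fill a background pixel whose four 4-neighbors are all 1) is only a stand-in for the hole-filler subsets of Figures 2 and 3, which are not reproduced here. The table does satisfy the foreground-preserver property: every pattern whose central pixel is 1 maps to 1.

```python
from itertools import product
from typing import Dict, List, Tuple

Image = List[List[int]]      # binary image stored as rows of 0/1 values
Pattern = Tuple[int, ...]    # (k1, ..., k9): the 3x3 neighborhood in row-major order


def neighborhood(img: Image, r: int, c: int) -> Pattern:
    """Return the 3x3 pattern centred on (r, c).

    Assumption: pixels outside the image are treated as background (0).
    """
    h, w = len(img), len(img[0])
    return tuple(
        img[rr][cc] if 0 <= rr < h and 0 <= cc < w else 0
        for rr in (r - 1, r, r + 1)
        for cc in (c - 1, c, c + 1)
    )


def apply_table(img: Image, table: Dict[Pattern, int]) -> Image:
    """Compute every output pixel by looking up its input neighborhood."""
    return [
        [table[neighborhood(img, r, c)] for c in range(len(img[0]))]
        for r in range(len(img))
    ]


def toy_subtractive_table() -> Dict[Pattern, int]:
    """Hypothetical table: preserve foreground, fill a pixel whose 4-neighbors are all 1."""
    table = {}
    for p in product((0, 1), repeat=9):
        center = p[4]
        north, west, east, south = p[1], p[3], p[5], p[7]
        fills_hole = north == west == east == south == 1
        table[p] = 1 if center == 1 or fills_hole else 0
    return table


if __name__ == "__main__":
    noisy = [[1, 1, 1],
             [1, 0, 1],
             [1, 1, 1]]
    for row in apply_table(noisy, toy_subtractive_table()):
        print(row)
```

Because the table is an explicit dictionary over all 512 patterns, swapping in a different candidate table changes the filter without touching the filtering loop, which is what makes the table search of Section 3.1 straightforward to organize.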
Next, we partition all remaining possible patterns into 50 subsets, by the number of 4-connected components in each pattern. We determine some subsets which must produce binary-1 outputs by considering their contribution to the hole filling process. These subsets are called hole fillers. All the hole fillers are listed in Figure 2 and Figure 3. In order to prevent overfilling, some subsets must produce binary-0 outputs to preserve the background. A list of background preservers is shown in Figure 4. The remaining subsets, for which we do not know whether they should produce binary-1 or binary-0, are listed in Figure 5.

Figure 2: Subsets of 3 × 3 patterns which are hole fillers (1 component and 2 components). The number below each 3 × 3 pattern represents the number of rotations of each pattern.

Figure 3: Subsets of 3 × 3 patterns which are hole fillers (2, 3, and 4 components).

The generated knowledge-based table for subtractive-noise-corrupted images is shown in Table 3. For the remaining subsets, we must check whether it is better for the outputs to be a binary-1 or a binary-0. We search through all the possible combinations and choose the optimal one. But the size of the search ($2^{15}$) may still be too large. Subsets which make a similar contribution to hole filling can be combined into one set. If there are N sets, we search through $2^N$ combinations. The remaining subsets, combined into 6 sets, are shown in Figure 6. The size of the search now decreases to $2^6 = 64$. In practice, one can employ a different partition based on knowledge of the degradation process.
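The partition criterion used above, the number of 4-connected components in a pattern, can be computed directly. The sketch below is illustrative only: it assumes that components are counted over the 1-pixels of the pattern, the function names `count_4_components` and `group_by_components` are hypothetical, and the grouping is coarser than the 50 subsets of the paper, which also distinguish the shapes shown in the figures.

```python
from collections import defaultdict
from itertools import product
from typing import Dict, List, Tuple

Pattern = Tuple[int, ...]  # (k1, ..., k9) in row-major order


def count_4_components(pattern: Pattern) -> int:
    """Count 4-connected components of 1-pixels in a 3x3 pattern.

    Only north, south, east, and west neighbors are considered connected.
    """
    cells = {(i // 3, i % 3) for i, v in enumerate(pattern) if v == 1}
    count = 0
    while cells:
        stack = [cells.pop()]   # start a new component from any remaining pixel
        count += 1
        while stack:
            r, c = stack.pop()
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if nb in cells:
                    cells.remove(nb)
                    stack.append(nb)
    return count


def group_by_components(patterns: List[Pattern]) -> Dict[int, List[Pattern]]:
    """Group patterns by their number of 4-connected components."""
    groups: Dict[int, List[Pattern]] = defaultdict(list)
    for p in patterns:
        groups[count_4_components(p)].append(p)
    return dict(groups)


# Patterns left after the foreground-preserver rule: central pixel is 0.
center_zero = [p for p in product((0, 1), repeat=9) if p[4] == 0]
groups = group_by_components(center_zero)

if __name__ == "__main__":
    for n in sorted(groups):
        print(n, "component(s):", len(groups[n]), "patterns")
```

With the central pixel fixed at 0, a pattern can have at most four components (for example, the four corners alone), which is consistent with the 1 to 4 component groupings in the figure captions.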

Figure 4: The subsets of 3 × 3 patterns which are foreground preservers and background preservers (1 component and 2 components), for subtractive-noise-corrupted images.

Figure 5: The remaining subsets (1, 2, and 3 components) for subtractive-noise-corrupted images.

3.1.2 Table Generation for Additive Noise

With additive noise, all the points of the ideal image belong to the noisy image, and we wish to clean the additive points. To do this, we first require that all patterns whose central pixel is 0 produce binary-0 outputs. Filters with this property are called background preservers. Then we partition all the remaining possible patterns into a number of different subsets. Since the noise process is one that creates additive points, we determine some subsets which can be used to remove these points. These subsets must produce binary-0 outputs (see Figure 7). In order to preserve the structural shape of the ideal characters during noise removal, we also determine a list of foreground preservers which must produce binary-1 outputs (see Figure 7). The remaining subsets, combined into 7 sets, are shown in Figure 8. We search through all the combinations of these 7 sets, so the size of the search is $2^7 = 128$. Again, in practice, one can employ a different partition based on knowledge of the degradation process. The generated knowledge-based table for additive-noise-corrupted images is shown in Table 4.
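The search sizes quoted in Section 3.1 ($2^6 = 64$ candidate tables for subtractive noise and $2^7 = 128$ for additive noise) follow from assigning a free binary output to each remaining set while the knowledge-based outputs stay fixed. A minimal sketch of that enumeration is given below. It works at the level of set labels only: the mapping from each set to its actual 3 × 3 patterns comes from the figures and is not reproduced, and the function name `candidate_assignments` and the label strings are assumptions made for illustration.

```python
from itertools import product
from typing import Dict, Iterator, List


def candidate_assignments(
    fixed: Dict[str, int], free_sets: List[str]
) -> Iterator[Dict[str, int]]:
    """Yield every table, as a {set label: output bit} mapping.

    `fixed` holds the outputs decided from knowledge of the degradation;
    each label in `free_sets` is tried with both 0 and 1.
    """
    for bits in product((0, 1), repeat=len(free_sets)):
        assignment = dict(fixed)
        assignment.update(zip(free_sets, bits))
        yield assignment


# Fixed rows for additive noise, following the description in Section 3.1.2.
additive_fixed = {
    "central pixel is 0": 0,
    "noise removers": 0,
    "foreground preservers": 1,
}
additive_free = [f"set {i}" for i in range(1, 8)]   # the 7 combined sets
additive_candidates = list(candidate_assignments(additive_fixed, additive_free))

# Fixed rows for subtractive noise, following Section 3.1.1.
subtractive_fixed = {
    "central pixel is 1": 1,
    "hole fillers": 1,
    "background preservers": 0,
}
subtractive_free = [f"set {i}" for i in range(1, 7)]  # the 6 combined sets
subtractive_candidates = list(candidate_assignments(subtractive_fixed, subtractive_free))

if __name__ == "__main__":
    print(len(subtractive_candidates), len(additive_candidates))
```

Each yielded mapping corresponds to one candidate look-up table, so evaluating a filtering level amounts to iterating over this list.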

Figure 6: The remaining subsets for subtractive-noise-corrupted images, combined into 6 sets (Set 1 to Set 6).

    Subsets                  Output pixel
    Central pixel is 1       1
    Hole fillers             1
    Background preservers    0
    Remaining subsets        0 or 1

Table 3: Look-up table for subtractive-noise-corrupted images

    Subsets                  Output pixel
    Central pixel is 0       0
    Noise removers           0
    Foreground preservers    1
    Remaining subsets        0 or 1

Table 4: Look-up table for additive-noise-corrupted images

4 EXPERIMENTAL PROTOCOL

We select some real document images as the input images. They have different degradation types and degradation conditions. The zone-based ground truth files for the input images are generated following the protocol used in UW English Document Image Database I. We draw the bounding box for each text zone and extract the box coordinates. We then have a list of identifiers for the zones and the corresponding bounding box coordinates. The generated ground truth file can be considered the ideal reference for restoring the input images and for testing.

Figure 7: The foreground preservers and additive noise removers for additive-noise-corrupted images.

The following are the procedures for filter design:

1. Find the degradation type of the input image.
2. Generate knowledge-based look-up tables, using the algorithm discussed in Section 3.
3. For each possible table, there is one output image. Run the OCR algorithm on each of the output images.
4. Evaluate each look-up table by comparing the OCR outputs with the ground truth of the input image, using the OPE (OCR Performance Evaluation) software.
5. A new filter can be designed to work on the output of the previous filtered image.

The experiment flow chart is shown in Figure 9.

4.1 Testing

We use Caereocr OCR Version 109a, from Caere Systems, and specify the coordinates of the zones to be recognized. Script files to run the OCR algorithm on each of the zones in an image are generated and executed.

The OPE software [6] is provided by UWEDID-I. It compares the OCR outputs with the corresponding ground truth information, generating symbol statistics, such as the number of matches, changes, insertions, and deletions, as well as line statistics. In addition, it optionally generates a contingency table which tells how each character is interpreted by the OCR algorithm.

We use symbol match accuracy as the primary index in filter testing. Let G(I) and O(R) denote the ground truth of the input image and the output of the OCR algorithm over the restored image, respectively. Let M(G(I), O(R)) represent the symbol match accuracy between G(I) and O(R). Our testing criterion is therefore:

$\max \{ M(G(I), O(R)) \}$.
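The design procedure and the testing criterion $\max \{ M(G(I), O(R)) \}$ can be sketched as a simple search loop. The code below is a hedged illustration, not the OPE software: symbol statistics are approximated with Python's standard `difflib.SequenceMatcher`, whose alignment is not guaranteed to match OPE's, and the names `symbol_statistics`, `symbol_match_accuracy`, and `select_best` are hypothetical. The callable `restore_and_read` stands for "filter the image with this table, then run OCR", which depends on an external OCR system and is therefore passed in rather than implemented.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")  # type of a candidate look-up table


def symbol_statistics(truth: str, ocr: str) -> Dict[str, int]:
    """Count matches, changes, insertions, and deletions between two strings."""
    stats = {"matches": 0, "changes": 0, "insertions": 0, "deletions": 0}
    matcher = SequenceMatcher(None, truth, ocr, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            stats["matches"] += i2 - i1
        elif tag == "replace":
            # Pair symbols one-to-one as changes; any surplus is an insert/delete.
            paired = min(i2 - i1, j2 - j1)
            stats["changes"] += paired
            stats["deletions"] += (i2 - i1) - paired
            stats["insertions"] += (j2 - j1) - paired
        elif tag == "delete":
            stats["deletions"] += i2 - i1
        elif tag == "insert":
            stats["insertions"] += j2 - j1
    return stats


def symbol_match_accuracy(truth: str, ocr: str) -> float:
    """Percentage of ground-truth symbols matched in the OCR output."""
    if not truth:
        return 0.0
    return 100.0 * symbol_statistics(truth, ocr)["matches"] / len(truth)


def select_best(
    candidates: Iterable[T],
    restore_and_read: Callable[[T], str],
    truth: str,
) -> Tuple[Optional[T], float]:
    """Return the candidate table with the highest symbol match accuracy."""
    best, best_acc = None, -1.0
    for table in candidates:
        acc = symbol_match_accuracy(truth, restore_and_read(table))
        if acc > best_acc:
            best, best_acc = table, acc
    return best, best_acc


if __name__ == "__main__":
    # Stand-in OCR outputs for two hypothetical tables; not real experimental data.
    fake_outputs = {"table A": "abxd", "table B": "abcd"}
    print(select_best(fake_outputs, fake_outputs.__getitem__, "abcd"))
```

Recursive filtering fits the same loop: the image restored by the best table of one level becomes the input of the next level, and the search is repeated.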

Figure 8: The remaining subsets for additive-noise-corrupted images, combined into 7 sets (Set 1 to Set 7).

Figure 9: Experiment flow chart (blocks: input image, filter, restored image, OCR, text, ground truth, OPE, symbol match accuracy).

5 EXPERIMENTAL RESULTS

We choose 15 real images with different degradation types and conditions to verify our algorithm. The degradation type of each input image has been defined. When either subtractive or additive noise alone is present, the appropriate filter can be used. A new filter can be designed to work on the output of the previous filtered image. For example, four levels of recursive filtering are performed on image 1 in our experiment. If the input has mixed noise, alternating sequential filters can be used. For example, an iterated additive noise filter and subtractive noise filter are applied to image 6.

During each filtering level, we search through all the possible look-up tables (64 for subtractive noise and 128 for additive noise; see Section 3). For each table, there is one output image and a corresponding symbol match accuracy of the OCR algorithm. The optimal look-up table is determined from 6 (7) input subsets for subtractive (additive) noise. For each input image, the output pixel value (binary 1 or 0) of each input subset is listed in a table.

The following are the experimental results for the 15 input images with different dominant noise types. In each table, the set lists are given as sets with output 1 / sets with output 0, and accuracies are OCR symbol match accuracies in percent.

1. Image 1: subtractive noise.

    Level   Before    Sets with output 1 / output 0   After
    1       65.98     1,4,6 / 2,3                     71.39
    2                 1,3,4,6 / 2                     77.06
    3                 2,4,6 / 1,3                     78.09
    4                 2,4,6 / 1,3                     78.61

2. Image 2: subtractive noise.

    Level   Before    Sets with output 1 / output 0   After
    1       90.38     1,2,4,6 / 3,5                   93.77

3. Image 3: additive noise.

    Level   Before    Sets with output 1 / output 0   After
    1       39.7      6 / 1,2,3,4,5,7                 57.89

4. Image 4: additive noise.

    Level   Before    Sets with output 1 / output 0   After
    1       93.28     1,4,5,7 / 2,3,6                 96.97
    2                 2,5 / 1,3,4,6,7                 97.05

5. Image 5: additive noise.

    Level   Before    Sets with output 1 / output 0   After
    1       51.78     2,4,6 / 1,3,5,7                 68.76

6. Image 6: mixed noise.

Two levels of recursive filtering are performed on image 6 in our experiment. The first one is used to remove the additive noise. The second one is used to clean the subtractive noise. This method is comparable to morphological opening and closing filtering. The optimal filters are listed in the following table.

    Level   Before    Sets with output 1 / output 0   After
    1       59.87     2,3,7 / 1,4,5,6                 71.71
    2                 1,3,5 / 2,4,6                   73.6

7. Image 7: subtractive noise.

    Level   Before    Sets with output 1 / output 0   After
    1       7.03      2,3,4,5,6 / 1                   8.76
    2                 1,2,4,5,6 / 3                   85.8

8. Image 8: additive noise.

    Level   Before    Sets with output 1 / output 0   After
    1       87.85     2,4,7 / 1,3,5,6                 91.81

9. Image 9: additive noise.

    Level   Before    Sets with output 1 / output 0   After
    1       67.3      3,4,5,6,7 / 1,2                 68.51

10. Image 10: subtractive noise.

    Level   Before    Sets with output 1 / output 0   After
    1       55.79     2,3,4,5 / 1,6                   6.19
    2                 4,5,6 / 1,2,3                   66.67

11. Image 11: subtractive noise.

    Level   Before    Sets with output 1 / output 0   After
    1       73.86     1,2,3,5,6                       82.63

12. Image 12: subtractive noise.

    Level   Before    Sets with output 1 / output 0   After
    1       88.20     2,3,4,5 / 1,6                   95.28

13. Image 13: mixed noise.

Two levels of recursive filtering are performed on image 13 in our experiment. The first one is used to remove the additive noise. The second one is used to clean the subtractive noise.

    Level   Before    Sets with output 1 / output 0   After
    1       87.61     1,2,3,4,5,6 / 7                 90.19
    2                 2,3,4,5 / 1,6                   92.77

14. Image 14: additive noise.

    Level   Before    Sets with output 1 / output 0   After
    1       63.03     1,2,4,6 / 3,5,7                 70.77

15. Image 15: additive noise.

    Level   Before    Sets with output 1 / output 0   After
    1       53.12     1,2,3,5,6,7                     66.36
    2                 1,3,5,6,7                       66.2

We have worked on the individual images and found, for each, the optimal image-specific filter which produces the highest OCR accuracy. In real applications, however, it is not realistic to search for the best filter for each image. We want to find the optimal filter for a set of images with similar properties. Therefore, we divided the images into two sets, one with subtractive noise and another with additive noise. The OCR accuracies from the 64 3 × 3 filters for subtractive noise and the 128 3 × 3 filters for additive noise were computed, and the average OCR accuracy (for the first filtering level only) was computed for the additive and subtractive cases. The filter which produces the highest average accuracy for each set of images was chosen as the global filter (Table 5). For each image, the accuracy obtained with the global filter is given in Table 6 and Table 7.

    Image type           Sets with output 1    Sets with output 0
    Subtractive noise    1,2,3,4,6             5
    Additive noise       1                     2,3,4,5,6,7

Table 5: Global filters for subtractive noise images and additive noise images

    Image number       2       7      10      11      12
    After filtering    93.77   83.8   63.36   81.30   9.9

Table 6: OCR accuracy for subtractive noise images

    Image number       3       4       5       6       8      9       13      14     15
    After filtering    57.11   96.07   67.92   70.61   91.2   68.51   89.67   70.3   65.90

Table 7: OCR accuracy for additive noise images

Let $x_i$ be the OCR accuracy obtained after employing the best image-specific filter on image i, and let $\hat{x}_i$ be the OCR accuracy obtained after employing the best global filter on image i. The mean accuracy loss (MAL) over n images can be calculated as:

$\mathrm{MAL} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{x}_i)$

The mean accuracy loss for employing the global subtractive noise filter rather than the best image-specific filter was 0.76%. The mean accuracy loss for employing the global additive noise filter rather than the best image-specific filter was 0.62%.
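The mean accuracy loss defined above is a plain average of per-image differences. A minimal sketch follows; the function name `mean_accuracy_loss` is hypothetical, and the numbers in the usage example are made up for illustration and are not taken from Tables 6 and 7.

```python
from typing import Sequence


def mean_accuracy_loss(specific: Sequence[float], global_: Sequence[float]) -> float:
    """MAL = (1/n) * sum_i (x_i - x_hat_i).

    `specific` holds x_i, the accuracy with the best image-specific filter;
    `global_` holds x_hat_i, the accuracy with the global filter, same image order.
    """
    if len(specific) != len(global_) or not specific:
        raise ValueError("need two non-empty accuracy lists of equal length")
    return sum(x - x_hat for x, x_hat in zip(specific, global_)) / len(specific)


if __name__ == "__main__":
    # Illustrative values only (not the paper's results).
    print(mean_accuracy_loss([90.0, 80.0], [89.0, 78.0]))
```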
These small losses indicate that the global filter works well on the individual images.

6 DISCUSSION

An algorithm for binary morphological filter design has been given to restore document images degraded by subtractive or additive noise, given design constraints on the size of the filters. We provided a methodology for knowledge-based look-up table design that achieves computational tractability. An experimental protocol has been developed for restoring degraded document images and improving the corresponding recognition accuracy rates of an OCR algorithm.

Our future work will include investigating the effect on ideal characters when each kind of pattern is applied, and the effect of constraints on filter properties. We will also try to develop a classifier that classifies the input images into different subsets and selects the appropriate filter for each subset.

7 REFERENCES

[1] R.C. Gonzalez and R.E. Woods, Digital Image Processing, Addison Wesley, 1993.
[2] R.P. Loce, "Morphological Filter Mean-Absolute-Error Representation Theorems and Their Application to Optimal Morphological Filter Design", Ph.D. Thesis, Rochester Inst. of Tech., 1994.
[3] I.R. Joughin, R.M. Haralick, and E.R. Dougherty, "Model-based Algorithm for Designing Suboptimal Morphological Filters for Restoring Subtractive-noise-corrupted Images", Journal of Electronic Imaging, vol. 2, no. 4, pp. 314-25, Oct. 1993.
[4] S. Chen, S. Subramaniam, I.T. Phillips, and R.M. Haralick, "Performance Evaluation of Two OCR Systems", Third Annual Symposium on Document Analysis and Information Retrieval, pp. 299-317, April 1994.
[5] "Reference Manual for UW English Document Image Database", ISL Report, E.E. Dept., U. of Washington.
[6] Su Chen, "OCR Performance Evaluation Software User's Manual", ISL Report, E.E. Dept., U. of Washington.