Text segmentation for MRC Document compression using a Markov Random Field model

Similar documents
Test Segmentation of MRC Document Compression and Decompression by Using MATLAB

MR IMAGE SEGMENTATION

International Journal of Electrical, Electronics ISSN No. (Online): and Computer Engineering 3(2): 85-90(2014)

Multilayer Document Compression Algorithm

A New Algorithm for Detecting Text Line in Handwritten Documents

Document image binarisation using Markov Field Model

An ICA based Approach for Complex Color Scene Text Binarization

RESTORATION OF DEGRADED DOCUMENTS USING IMAGE BINARIZATION TECHNIQUE

Markov/Conditional Random Fields, Graph Cut, and applications in Computer Vision

Undirected Graphical Models. Raul Queiroz Feitosa

Handwritten Devanagari Character Recognition Model Using Neural Network

CRFs for Image Classification

Digital Image Processing Laboratory: Markov Random Fields and MAP Image Segmentation

Handwritten Word Recognition using Conditional Random Fields

OCR For Handwritten Marathi Script

Image Restoration using Markov Random Fields

A Model-based Line Detection Algorithm in Documents

Multi-scale Techniques for Document Page Segmentation

AUTOMATIC LOGO EXTRACTION FROM DOCUMENT IMAGES

Segmentation and labeling of documents using Conditional Random Fields

Robust Phase-Based Features Extracted From Image By A Binarization Technique

ISSN: An Efficient Fully Exploiting Spatial Correlation of Compress Compound Images in Advanced Video Coding

Limited View Angle Iterative CT Reconstruction

Enhanced Hybrid Compound Image Compression Algorithm Combining Block and Layer-based Segmentation

Mesh segmentation. Florent Lafarge Inria Sophia Antipolis - Mediterranee

Layout Segmentation of Scanned Newspaper Documents

Available online at ScienceDirect. Procedia Computer Science 45 (2015 )

Image Segmentation Techniques for Object-Based Coding

Digital Image Processing Laboratory: MAP Image Restoration

3D Computer Vision. Structured Light II. Prof. Didier Stricker. Kaiserlautern University.

Segmentation of Distinct Homogeneous Color Regions in Images

Structured Light II. Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov

Bus Detection and recognition for visually impaired people

HCR Using K-Means Clustering Algorithm

Processing of binary images

Application of MRF s to Segmentation

Mixed Raster Content for Compound Image Compression

Pattern Recognition. Kjell Elenius. Speech, Music and Hearing KTH. March 29, 2007 Speech recognition

Tak-Shing Wong, Charles A. Bouman, Ilya Pollak, and Zhigang Fan. Abstract

Markov Networks in Computer Vision. Sargur Srihari

Mixed Raster Content for Compound Image Compression

Markov Random Fields and Segmentation with Graph Cuts

Contextual High-Resolution Image Classification by Markovian Data Fusion, Adaptive Texture Extraction, and Multiscale Segmentation

DjVu Technology Primer

Region-based Segmentation

MRF-based Algorithms for Segmentation of SAR Images

Scene Text Detection Using Machine Learning Classifiers

Markov Random Fields and Gibbs Sampling for Image Denoising

NONLINEAR BACK PROJECTION FOR TOMOGRAPHIC IMAGE RECONSTRUCTION

Biometrics Technology: Image Processing & Pattern Recognition (by Dr. Dickson Tong)

Advances in Natural and Applied Sciences. Efficient Illumination Correction for Camera Captured Image Documents

Estimating Human Pose in Images. Navraj Singh December 11, 2009

Keywords: Image segmentation, EM algorithm, MAP Estimation, GMM-HMRF, color image segmentation

Markov Networks in Computer Vision

Application of Principal Components Analysis and Gaussian Mixture Models to Printer Identification

Analysis: TextonBoost and Semantic Texton Forests. Daniel Munoz Februrary 9, 2009

Regularization and Markov Random Fields (MRF) CS 664 Spring 2008

Binarization of Color Character Strings in Scene Images Using K-means Clustering and Support Vector Machines

I. INTRODUCTION. Figure-1 Basic block of text analysis

Segmentation. Bottom up Segmentation Semantic Segmentation

Structured Models in. Dan Huttenlocher. June 2010

Enhancement and Artifact Removal for Transform Coded Document Images

Improving Latent Fingerprint Matching Performance by Orientation Field Estimation using Localized Dictionaries

Handwritten Script Recognition at Block Level

JPEG: An Image Compression System

Modern Medical Image Analysis 8DC00 Exam

Norbert Schuff VA Medical Center and UCSF

Generative and discriminative classification techniques

Spatial Regularization of Functional Connectivity Using High-Dimensional Markov Random Fields

Lesion Segmentation and Bias Correction in Breast Ultrasound B-mode Images Including Elastography Information

Machine Learning A W 1sst KU. b) [1 P] Give an example for a probability distributions P (A, B, C) that disproves

CS6716 Pattern Recognition

Unsupervised refinement of color and stroke features for text binarization

CS 664 Segmentation. Daniel Huttenlocher

Computer Vision Group Prof. Daniel Cremers. 4. Probabilistic Graphical Models Directed Models

A Method of Annotation Extraction from Paper Documents Using Alignment Based on Local Arrangements of Feature Points

II. WORKING OF PROJECT

Combining Top-down and Bottom-up Segmentation

Random spatial sampling and majority voting based image thresholding

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University

Algorithms for Markov Random Fields in Computer Vision

Segmentation of Isolated and Touching characters in Handwritten Gurumukhi Word using Clustering approach

Histograms. h(r k ) = n k. p(r k )= n k /NM. Histogram: number of times intensity level rk appears in the image

Scanner Parameter Estimation Using Bilevel Scans of Star Charts

Fundamentals of Digital Image Processing

Mobile Camera Based Text Detection and Translation

Introduction to Medical Imaging (5XSA0) Module 5

An Introduction To Automatic Tissue Classification Of Brain MRI. Colm Elliott Mar 2014

Morphological Image Processing

CS 490: Computer Vision Image Segmentation: Thresholding. Fall 2015 Dr. Michael J. Reale

DYNAMIC HIERARCHICAL DICTIONARY DESIGN FOR MULTI-PAGE BINARY DOCUMENT IMAGE COMPRESSION**

Ground Tracking in Ground Penetrating Radar

Random projection for non-gaussian mixture models

Locating 1-D Bar Codes in DCT-Domain

Tri-modal Human Body Segmentation

EFFICIENT BAYESIAN INFERENCE USING FULLY CONNECTED CONDITIONAL RANDOM FIELDS WITH STOCHASTIC CLIQUES. M. J. Shafiee, A. Wong, P. Siva, P.

Image Segmentation Using Iterated Graph Cuts Based on Multi-scale Smoothing

CAP 6412 Advanced Computer Vision

Last week. Multi-Frame Structure from Motion: Multi-View Stereo. Unknown camera viewpoints

Invariant Recognition of Hand-Drawn Pictograms Using HMMs with a Rotating Feature Extraction

Transcription:

Text segmentation for MRC Document compression using a Markov Random Field model Final dissertation by Eri Haneda Advisory committee: Prof. Charles Bouman (Chair) Prof. Jan Allebach Prof. Peter Doerschuk Prof. George Chiu February 25th, 2011 1

Primary contributions 2

1) Text segmentation for MRC document with Samsung Co. Ltd. Multiscale-COS/CCC algorithm 2) Next generation image capture device development with Samsung Co. Ltd. Motion/Lamp control Scanner image quality bench mark Snap-to-White contrast enhancement 3) CT baggage reconstruction (In progress) Multislice helical scan CT reconstruction code development Image reconstruction using a substitute prior model 3

The background of scanned images sometimes appears too dark because of the paper material (e.g. newspapers) EPSON HP Samsung Low color contrast Background is not white 4

Input image Auto-cropping Paper-white estimation Black-colorant estimation Estimated paper-white Estimated black-colorant Linear contrast stretch Output image Auto-cropping removes extra white or black which is sometimes included around the borders of the image Paper-white/black-colorant estimation determines respective RGB values using extrema and region growing Linear contrast stretch snaps the estimated paper-white and blackcolorant to the largest and smallest encoded value to increase the dynamic range 5

Original Samsung Adjusted Samsung 6

Short-term goals Write preliminary multislice helical CT reconstruction code Develop a statistical model for accurate CT baggage reconstruction/segmentation Forward projection Sinogram (Projection) Image voxels Reconstruction Multislice helical CT (Source & Detector) Multislice CT, M.F. Reiser (2004) Weighting Smoothing 7

We are trying to develop a new generalized prior model for MAP estimate In general, the prior term p(x) needs to be modeled under strong assumption Our approach is to generate a second order polynomial approximation to the function log p(x) about an initial image x with the form If we assume dependencies only on neighbors, the expression is dramatically simplified 8

Text segmentation for MRC Document compression using a Markov Random Field model 9

Text segmentation is extracting text components from a document White: Non-text Black: Text Input image (Color) Fig. 1. An example of text segmentation Segmentation result (Binary) 10

Useful for layer based document compression Layer based document compression is defined in ITU-T. T.44, Mixed Raster Content (MRC) encoding Good text segmentation achieves high document compression and preserves high image quality Useful for other applications such as OCR etc. Our goal is to generate segmentation which is: Accurate Robust Computationally inexpensive 11

Mixed Raster Content (MRC) is a standard for layer based document compression defined in ITU-T. T.44 She sells se ashells on the seas hore = + She sells se ashells + on the seas hore INPUT DOCUMENT IMAGE FOREGROUND JPEG2000 BINARY MASK JBIG2 BACKGROUND JPEG2000 MRC compressed document Fig. 2. Illustration of MRC document compression 12

Top-down/Bottom-Up approach X-Y cut algorithm [1], Run Length Smearing Algorithm [2] (RLSA) Thresholding approach Otsu, Niblack, Sauvola, Kapur, and Tsai method [3] Statistical approach Hidden Markov Model (HMM) The most known commercial software DjVu uses HMM model [4] Markov Random Field (MRF) Zhen et al. used an MRF model to exploit the contextual information for noise removal [5] Kumar and Kuk also incorporates their proposed prior model to MAP- MRF text segmentation Conditional Random Field (CRF) New emerging model, originally proposed by Lafferty [6] It directly models the posterior distribution of labels given observations It has been applied to pixel-wise segmentation 13

Three novel algorithms have been developed COS algorithm (Cost Optimized Segmentation) Used for initial segmentation. Formulated in a global cost optimization framework. CCC algorithm (Connected Component Classification) Refines segmentation by classifying connected components into text and non-text using a Markov Random Field (MRF) model Multiscale-COS/CCC algorithm Comprehensive segmentation scheme using multiple resolutions. This improves simultaneous detection of large and small text. 14

Global segmentation Input image Block-wise segmentation Segmentation class definition Cost minimization Segmentation result Block-wise segmentation Image is divided into overlapping blocks Each block segmented independently using clustering procedure Global segmentation Four possible classes are defined for each block The class of each block is chosen to minimize a global cost 15

After initial block segmentation, four possible classes are defined for each block Class 0 : Original segmentation Class 1 : Reversed Class 2 : All zeros Class 3 : All ones class 0 class 1 class 2 class 3 Black = Foreground White = Background 16

: Class of block at location (i,j). : Total variance of gray levels of each group (0 or 1) : Number of mismatches in horizontal overlap region : Number of mismatches in vertical overlap region : Number of 1 pixels inside block : Weight coefficients, k=1,2,3 Cost function may be minimized using dynamic programming 17

Optimal parameters determined by minimizing weighted error Original Errors equally weighted =0.5 Greater weighting on missed detections, =0.09 Missed detections Approach: Minimize missed detections, and eliminate false detections in a later stage. Motivation for CCC algorithm False detections 18

Refines the COS results by eliminating non-text components using a Markov Random Field (MRF) model Classification procedure Step1: Extract connected component CCi Step2: Calculate feature vector Yi Step3: Each CCi is classified as either text (Xi=1) or non-text (Xi=0) using MRF-MAP framework 19

Classification of each determined by maximizing posterior density (MAP) Y 1 Y 2 Y 3 Y 4 Data term p(y x) assumed to be conditionally independent Prior term p(x) for true segmentation labels is modeled by an MRF X CC 1 3 X 1 X 2 CC 2 CC 3 CC4 Y={Y 1, Y 2, Y N } ~ Observed data (feature vectors) Neighbors X 4 X={X 1, X 2, X N } ~ Classification of CC 20

Feature vector, Yi Boundary edge depth statistics Color uniformity Yi are conditionally independent given associated Xi Feature vector for both text and non-text modeled as a multivariate Gaussian mixture 21

MRF used to model local interaction between neighboring elements An MRF is a density satisfying the Markov property: MRF may be expressed as a Gibbs distribution (Hammersley-Clifford theorem): 22

Neighborhood system k-nearest neighbors, based on physical distance Enforce neighbors to be mutual Clique potential Dissimilarity measure between neighboring components 23

Augmented feature vector, Zi Original feature vector, concatenated with center location of component Dissimilarity measure, Normalized Mahalanobis distance between feature vectors Zi and Zj : feature vector covariance 24

Let P denote all neighboring component pairs. Then the labels, X, are distributed as a, b, and p are scalar parameters Class probability p(x) decreases from neighboring pairs having different class labels Decrease is more pronounced when distance Di,j is small 25

Combining data and prior models, compute the MAP estimate for the optimal set of classification labels X Ctxt controls the trade-off between missed and false detections Approximate solution using iterative conditional modes (ICM) Step1 : Initialize each class label with ML estimate Step2 : For each component, update label Step3: If no change occurs to the labels, then stop. Otherwise go to Step2 26

Gaussian mixture parameters in data term estimated using expectation maximization (EM) algorithm Prior model parameters, = [p, a, b], estimated using pseudolikelihood maximization (Besag, 1975) 27

Incorporation of COS/CCC algorithms into a multiscale framework to improve detection of varying size text Progress from coarse to fine scales, where coarser scales use larger COS block size Segmentation for each layer incorporates result from previous (coarser) layer Input image y Coarser layer COS (block size = 2 (L-1) m) CCC x (L-1) Segmentation x (L-1) Coarse layers use larger block size Fine layers use smaller block size COS (block size = 2 (L-2) m). CCC x (L-2). x (L-2) Finer layer COS (block size = m) CCC x (0) x (0) Final segmentation 28

New term in the COS cost function represents the number of pixel mismatches between current and previous layers New term : Class of block at location (i,j) on n th layer. The new term V 4 enforces consistency with coarser segmentation results 29

Comparison of multiscale-cos/ccc and MRC commercial products: DjVu (LizardTech) and LuraDocument (LuraTech) Compound test image (400dpi) 30

COS only COS/CCC Multiscale-COS/CCC 31

Multiscale-COS/CCC DjVu LuraDocument 32

Original DjVu Luratech Multiscale-COS/CCC 33

Original DjVu Luratech Multiscale-COS/CCC 34

Missed text detection % (Averaged over EPSON, HP, Samsung scanners) Component = (# missed components) / (# components in ground truth) Pixel = (# pixels of missed components) / (image size) Multiscale-COS/CCC has fewer missed detection than the other algorithms 35

False detection % (Averaged over EPSON, HP, Samsung scanners) Component = (# false detection) / (# components in ground truth) Pixel = (# pixels of false detection) / (image size) Multiscale-COS/CCC has fewer missed detection than the other algorithms 36

Multiscale-COS/CCC vs. DjVu (Best commercial product) Averaged over EPSON, HP, and Samsung scanner data Multiscale-COS/CCC has superior error rates over DjVu 37

Multiscale-COS/CCC (289:1) DjVu (281:1) LuraDocument (242:1)

Multiscale- COS/CCC (289:1) DjVu (281:1) LuraDocument (242:1) 39

Developed three novel algorithms for text segmentation: COS, COS/CCC, Multiscale-COS/CCC Accurate text extraction compared to commercial products Flexible for future developments Robust over various paper materials, scanner types, and various image backgrounds Can extend segmentation to other applications such as Optical Character Recognition (OCR) 40

Multiscale text segmentation for MRC document (Two conference papers, One journal paper, One patent) Segmentation for MRC compression, in Proc. of SPIE Conf. on Color Imaging XII, 2007 Multiscale segmentation for MRC compression using a Markov Random Field (MRF) model in IEEE ICASSP, March 2010 Text segmentation for MRC document compression accepted by IEEE Trans. on Image Processing on Oct 2010 Patents: combined declaration by Samsung Co. Ltd and Purdue University, United States Next generation image capture device development (Two patents for snap-to-white algorithm) Apparatus and method of segmenting an image in an image coding and/or decoding system Application 20080175477 Auto-cropping method for image capture device Application 20090323129 41

1. G. Nagy, S. Seth, and M. Viswanathan, A prototype document image analysis system for technical journals, Computer, vol. 25, no. 7, pp. 10 22, 1992. 2. J. Fisher, A rule-based system for document image segmentation, in Pattern Recognition, 10th international conference, 1990, pp. 567 572. 3. P. Stathis, E. Kavallieratou, and N. Papamarkos, An evaluation survey of binarization algorithms on historical documents, in 19th International Conference on Pattern Recognition, 2008, pp. 1 4. 4. P. Haffner, L. Bottou, and Y. Lecun, A general segmentation scheme for DjVu document compression, in Proc. of ISMM 2002, Sydney, Australia, April 2002. 5. Y. Zheng and D. Doermann, Machine printed text and handwriting identification in noisy document images, vol. 26, no. 3, pp. 337 353, 2004. 6. J. Lafferty, A. McCallum, and F. Pereira, Conditional random fields: Probabilistic models for segmenting and labeling sequence data, in Proc. 18th International Conf. on Machine Learning. organ Kaufmann, 2001, pp. 282 289. 42