
Mixed Raster Content for Compound Image Compression

SPRING 2009 PROJECT REPORT
EE-5359 MULTIMEDIA PROCESSING

Submitted to: Dr. K. R. Rao
Submitted by: (1000555858)

INDEX

(1) Motivation
(2) Abstract
(3) Introduction
(4) Mixed Raster Content (MRC) Imaging Model
    Decomposition and Compression Analysis
    RD Plot Modification
(5) Resolution Enhanced Rendering
(6) Iterative Pre- and Post-Processing for MRC Layers
    Edge Sharpening
    Data-Filling
    Edge Softening
(7) JPEG-Matched MRC Compression of Compound Documents
    Pictorial and Text Compression Comparison
(8) Shape Primitive Extraction and Coding (SPEC)
    (1) The System
    (2) Segmentation
    (3) Coding of Pixels
(9) References
(10) Acronyms

Submitted to: Dr. K. R. Rao
Submitted by: (UTA ID: 1000555858; E-mail: priteshrasiklal.shah@mavs.uta.edu)
Title: Implementation of Mixed Raster Content (MRC) for Compound Image Compression

Motivation: In today's world it is impossible to imagine a day without information, or without information exchange in digital form over the Internet. With advances in data processing systems and scanning devices, documents now pass through a wide variety of printing and imaging systems. Documents in digital form are easy to store and edit, and can be transmitted within seconds. These documents may contain text, graphics and pictures, so storing them requires large amounts of memory, and transmitting them requires strong compression to avoid expense and delay.

Abstract: Mixed raster content (MRC) is described in the following report. It is a method for compressing compound images (images containing both continuous-tone pictures and binary text) [1]. It uses a multi-layer, multi-resolution representation of a compound document. Instead of using a single algorithm, it uses multiple compression algorithms, including ones developed specifically for text or for images, so it can combine the best of new or existing algorithms and can provide various quality/compression-ratio tradeoffs. The problems faced by MRC compression are also described, along with their solutions and related topics. These topics include the problems with soft edges, edge sharpening, data-filling (with algorithms) [6], edge softening [4], resolution enhanced rendering (RER) [2] and optimizing block-threshold segmentation for MRC compression [5], [9]. A comparison is also made with the JPEG standard [7], [8]. Tables and images obtained after applying the different

compression techniques are shown. Most of the algorithms use the standard MRC 3-layer representation, which has some image-processing overhead due to the management of multiple algorithms and the imaging model, and is also computationally expensive. A hybrid image coding scheme, shape primitive extraction and coding (SPEC), is also described; it provides an accurate segmentation algorithm that separates the image into two classes of pixels, text/graphics and pictures [10], [11]. SPEC also provides a lossless coding scheme with a low-complexity segmentation algorithm and a good compression ratio along with excellent visual quality.

Introduction: With the advancement of scanning devices and data processing systems, documents are now present in a variety of digital forms. These documents may contain pictures, text and graphics. In many cases documents with images are stored at high resolution, so their sizes are invariably large and they consume large amounts of storage space. Transferring such huge amounts of data is difficult and expensive, so data compression becomes inevitable. Different algorithms have been proposed and are in use for different applications; there is no single best algorithm for all image types or applications [1]. Different types of images require different coding fidelities. The preservation of the edges and shapes of characters is important to facilitate reading, and once text is binarized it should be compressed losslessly, since coding errors in text are easily noticed. For continuous-tone images, on the other hand, the human visual system works differently because of the richness of patterns and frequency content. Text also requires much higher resolution than pictures. When compressing a compound image (a scanned image containing both text and pictures), the differences between text and continuous-tone images become significant. For pictures, lossy

continuous-tone compression is suitable, while for text, lossless binary compression is appropriate. MRC allows both to be used within a single raster image: the raster is decomposed into several image layers with different image classes, so the different layers can be compressed with individually efficient compression systems.

Mixed Raster Content (MRC) Imaging Model

Encoding of compound images in MRC is carried out using a multi-layered, multi-resolution imaging model. The 3-layer MRC model contains two colored image layers (foreground (FG) and background (BG)) and one binary image layer (the mask) [1]. The mask layer is the decisive layer in reconstructing the image from the FG and BG: when a pixel value in the mask layer is 1, the corresponding pixel from the FG is selected, and when it is 0, the corresponding pixel from the BG is selected, and the final image is reconstructed as shown in Fig. 1. The final image can be thought of as the FG poured onto the BG through the mask layer. This is MRC's most common form. There can also be more FG planes poured through different mask planes onto the same BG plane, and this process can be repeated several times.

Fig. 1: Basic 3-plane configuration for the MRC imaging model. The foreground plane is poured into the background through the mask plane [2].
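To make the selection rule concrete, the sketch below composes a final image from the three layers. It is a minimal illustration in Python/NumPy, assuming all three layers have already been brought to the same resolution (in a real MRC stream the FG and BG are often stored sub-sampled and must be interpolated first); the function name is ours.

```python
import numpy as np

def compose_mrc(mask: np.ndarray, fg: np.ndarray, bg: np.ndarray) -> np.ndarray:
    """Reconstruct a compound image from a basic 3-layer MRC model.

    mask: (H, W) binary array; 1 selects the foreground, 0 the background.
    fg, bg: (H, W, 3) color layers, assumed already interpolated to the
    mask resolution.
    """
    # Broadcast the mask over the color channels and pick per pixel.
    return np.where(mask[..., None] == 1, fg, bg)
```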

Fig. 2: The basic element of the MRC imaging model is the combination of mask and foreground planes [1].

Fig. 3: Block diagram of the plane decomposition, compression and composing processes for a basic 3-layer model [1].

After the decomposition of the original image into layers, each layer can be individually processed and compressed using different algorithms, as per Fig. 3. Resolution change or color mapping can be part of this processing. For good compression and reduced distortion visibility, the resolution and compression algorithm used for a particular layer should be matched to the layer's content [1]. The compressed layers, after being packaged in a format such as TIFF-FX or an ITU-T MRC data-stream, are delivered to the decoder, where the individual planes are retrieved, decompressed and processed, and the image is composed using the MRC imaging model. Depending on its content, a page may be represented as one, two, three or more layers. A page containing only black-and-white text can use FG and BG layers defaulted to black and white along with the mask layer.

Compression algorithms such as JPEG or wavelet encoders are designed for natural images and yield very poor quality/bit-rate tradeoffs for typical document images. This is observed when the document contains pictures, graphic content, text and line art. In recent years MRC has been used for document encoding, as it improves image quality or reduces bit rate to a great extent compared to conventional picture encoders.

Decomposition and compression analysis

Fig. 4: Typical decomposition approaches yielding the same reconstructed image (in the absence of processing or compression) [1].

Fig. 3 shows how the decomposition process, the compressors and their associated parameters for each plane provide a good degree of freedom in MRC-based compression. The operation of the encoder is affected by the decomposition, while the decoder remains unaffected. Fig. 4 illustrates the decomposition process. The basic approaches are region classification (RC) and transition identification (TI). In RC decomposition, the regions containing graphics and text are represented in a separate FG plane. Everything in those regions is represented in the FG plane, including the spaces between letters and other blank spaces. The mask, as shown in the figure, is uniform with large patches, clearly differentiating the text and graphic regions, and the background contains the document background, complex graphics and/or continuous-tone pictures [1]. TI decomposition is quite similar to RC decomposition, as can be seen from Fig. 4; however, in TI, the mask and FG planes represent graphics and text in a different manner. The FG plane pours the ink of text and graphics through the mask onto the BG plane, so the mask must carry the text contours. Hence the mask layer

contains text characters, line art and filled regions, while the FG layer contains the colors of the text letters and graphics. In RC decomposition, the mask layer is very uniform and compresses very well, but the FG can contain edges and continuous-tone details, so it cannot be compressed well with typical continuous-tone coders such as JPEG [1]. Since the mask layer in TI contains text objects and edges, it can be efficiently encoded using standard binary coders such as JBIG and JBIG2 [1]. The FG plane can be coded very efficiently even with coders such as JPEG, because it contains large uniform patches; it can also be sub-sampled without much loss in image quality [1].

RD plot modification

The benefit of using the MRC model for compression can be observed by analysing its rate-distortion (RD) characteristics. As shown in Fig. 5, if image (a) is compressed with a generic coder A whose parameters are fixed except for a compression parameter, it operates along an RD plot as shown in (b). Another coder B under the same circumstances performs better than coder A if its RD plot is shifted to the left, as shown in (c). The logic of MRC is to split the image into multiple planes as shown in (d), and to apply coders C, D and E to each plane, each with an RD plot similar to that of coder B. The equivalent coder may thus have a better RD plot than A, although there is an overhead associated with the multi-plane representation.

Fig. 5: Interpretation of the multi-plane approach as a means to modify the RD characteristics of coding mixed images. (a) Mixed image; (b) RD plot for a given coder A; (c) modified RD plot for a resulting coder B; (d) coder B may be achieved by plane decomposition, where each plane undergoes a better-tailored coder, thus achieving better RD plots [1].

Resolution Enhanced Rendering

Conventional MRC encoder: There is some inefficiency in the MRC model because all the layers are compressed independently, so layers (and portions of layers) that are not used are also encoded. The FG and BG layers are compressed at lower resolutions with lossy image coders such as JPEG or JPEG2000, and the mask layer is encoded with JBIG or JBIG2 [2].

TABLE I: List of existing MRC-based encoders [2].

Resolution enhanced rendering method: The RER algorithm [2] functions as described below. The FG and BG are segmented using an adaptive error-diffusion method. The binary mask is dithered along the edges of the characters to represent the gradual transitions of the scanned text characters. The true values of the document pixels are then estimated by the RER decoder with the help of the mask, FG and BG layers. This estimation is done using a non-linear, tree-structured predictor, trained to identify the characteristic patterns of the RER encoder so that it can estimate the true pixel values more accurately.

Fig. 6: Training model of the optimized encoder and decoder. Once training is complete, the encoder and decoder function independently [2].

Fig. 6 shows the minimization of the distortion of the decoded document, carried out jointly by the encoder and decoder. Both the encoder and decoder have parameters that can be trained to produce very good results. In each step, the parameters of either the encoder or the decoder are held fixed (alternately) while the parameters of the other are optimized. Two disjoint sets of documents are used for training the encoder and the decoder. Simulation results of this process show that parameters can be reached which reduce the distortion of the decoded document, and that optimizing the encoder and decoder together performs better than optimizing each individually [2].
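The alternating training loop can be sketched as below. This is a schematic outline only: `encoder.fit` and `decoder.fit` are hypothetical stand-ins for the parameter-estimation steps of [2], and the two disjoint training sets follow the description above.

```python
def train_rer(encoder, decoder, docs_for_encoder, docs_for_decoder, n_rounds=10):
    """Alternating optimization: hold one side's parameters fixed while
    fitting the other's, then swap, repeating for a fixed number of rounds
    (a convergence test on decoded-document distortion would also work)."""
    for _ in range(n_rounds):
        # Decoder fixed: fit encoder parameters on its training set.
        encoder.fit(docs_for_encoder, decoder)
        # Encoder fixed: fit decoder parameters on the disjoint set.
        decoder.fit(docs_for_decoder, encoder)
    return encoder, decoder
```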

Fig. 7: (a) Structure of an MRC encoder. (b) Basic structure of an RER-enhanced MRC encoder [2].

In Fig. 7(b), the RER encoding module produces the dithered mask (D_s) as output, taking the FG, BG, binary mask layer and the original document as input. The D_s, FG and BG are compressed separately, using the same binary image encoder and continuous-tone encoder as a conventional MRC encoder. In the encoding module, an edge-detection procedure is performed on the binary mask layer to determine the pixels that lie on the boundary between FG and BG, in order to compute the dithered mask. If K_s is 1, then s is a boundary pixel; otherwise it is a non-boundary pixel. λ_s is obtained by linear projection. At each boundary pixel, D_s is generated from λ_s by switching on an error-diffusion procedure. At each non-boundary pixel, the error diffusion is switched off and the binary mask value M_s is output as D_s. Thus the binary image of the dithered mask is produced.

In Fig. 7, X_s is the pixel color of the raster document at location s. F_s and B_s are the decimated FG and BG colors at s. M_s and D_s are the binary and dithered masks, respectively. F̂_s and B̂_s are the interpolated FG and BG colors at s. K_s is the binary output of the boundary-detection procedure, and λ_s is the output of the linear projector [2]. The projection places X_s on the line through the interpolated colors B̂_s and F̂_s; with t denoting transpose, its solution is

    λ_s = [(F̂_s − B̂_s)^t (X_s − B̂_s)] / [(F̂_s − B̂_s)^t (F̂_s − B̂_s)]    (1)
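A minimal sketch of the encoder-side computation follows: the per-pixel projection of equation (1), and a simple raster-order error diffusion for the boundary pixels. The function names are ours, and the single-neighbor diffusion is a stand-in for the adaptive error-diffusion procedure of [2].

```python
import numpy as np

def project_lambda(X, F, B, eps=1e-9):
    """Equation (1): project each scanned color X_s onto the line through
    the interpolated background B_s and foreground F_s colors.
    X, F, B are (H, W, 3) float arrays; returns lambda in [0, 1]."""
    d = F - B
    num = np.sum(d * (X - B), axis=-1)
    den = np.sum(d * d, axis=-1)
    return np.clip(num / np.maximum(den, eps), 0.0, 1.0)

def dither_mask(lam, M, K):
    """Dithered mask D: threshold lambda with error diffusion at boundary
    pixels (K == 1); elsewhere diffusion is off and D copies the mask M."""
    H, W = lam.shape
    D = M.astype(np.uint8).copy()
    err = 0.0
    for i in range(H):
        for j in range(W):
            if K[i, j]:
                v = lam[i, j] + err
                D[i, j] = 1 if v >= 0.5 else 0
                err = v - D[i, j]   # carry quantization error to the next pixel
            else:
                err = 0.0           # diffusion switched off at non-boundary pixels
    return D
```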

Fig. 8: (a) Structure of an MRC decoder. (b) Basic structure of an RER-enhanced MRC decoder [2].

Figs. 8(a) and 8(b) show the working of an MRC decoder and an RER decoder, respectively. In the RER decoder, the binary mask M_s is first input to a boundary-detection module, which works exactly as in the RER encoder. The decoded FG and BG layers are assumed to have lower resolution than the original, and any interpolation algorithm can be applied to bring them back up; let F̂_s and B̂_s be the interpolated decoded FG and BG colors at s, at the full document resolution. For each non-boundary pixel, the reconstructed color X̂_s is chosen between the FG and BG colors according to the binary mask, as in normal MRC. For a pixel on the boundary, a nonlinear prediction algorithm is used instead [2]: it computes λ̂_s, a minimum mean-squared error (MSE) estimate of λ_s, and the pixel is reconstructed as

    X̂_s = λ̂_s F̂_s + (1 − λ̂_s) B̂_s    (2)

Fig. 9: RER predictor using tree-based nonlinear estimation [2].

Fig. 9 shows a tree-based classifier and a set of class-dependent least-squares estimators. A 5 x 5 window of binary mask values is extracted about the pixel in question. The binary vector formed, Z_s, is then input to a binary regression tree predictor known as tree-based resolution synthesis (TBRS). The TBRS predictor uses a two-step process, classification and prediction, to estimate the value of λ_s. First it classifies the vector Z_s into one of M classes using a binary decision tree. Each class m then has a corresponding linear prediction filter that estimates λ_s from Z_s via

    λ̂_s = A_m Z_s + β_m    (3)

where m is the determined class of the vector Z_s, and A_m and β_m are the linear prediction parameters of class m.

Fig. 10: Example of the binary tree prediction model. (a) The regression tree structure: e_i and µ_i specify the binary decision rule for splitting node i; A_m and β_m are the linear filter parameters of the least-squares estimator for leaf (class) m. (b) The recursive classification procedure of the binary tree in the input vector space of Z_s [2].

Fig. 10 shows the structure of the binary tree prediction model and its classification procedure. In Fig. 10(a), the tree is traversed down from the root at the top, and a binary decision is made at each non-terminal node that the input vector Z_s passes through [2]. Traversal terminates at one of the leaves at the bottom, where each leaf represents a distinct class of the input vector space. At any non-terminal node i, the input vector is sent to the left or right child according to the splitting rule

    e_i^t (Z_s − µ_i) < 0  →  left child of node i;  otherwise  →  right child    (4)

where e_i is a pre-computed vector specifying the orientation of a splitting hyperplane, e_i^t is its transpose, and µ_i is a reference point on the hyperplane. If the projection of Z_s − µ_i onto e_i is negative, then Z_s falls on the left side of the hyperplane passing through µ_i and perpendicular to e_i, and consequently goes down to the left child of node i; otherwise it falls on the right side and goes down to the right child. Fig. 10(b) illustrates an example of this recursive classification in the input vector space. Each polygonal region at the bottom level represents a different class of Z_s. The classification step separates distinct regions of the document corresponding to mask edges of different shapes and orientations [2]. Thus RER minimizes document image distortion: simulation results indicate that at a fixed bit rate, RER reduces distortion to a great extent. RER is also fully compatible with standard MRC encoders and decoders.
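To make the TBRS prediction concrete, here is a minimal sketch of the classify-then-predict step, using the splitting rule (4) and the leaf filters (3); the reconstruction at a boundary pixel then follows equation (2). The class and function names are ours, and fitting the tree itself (choosing e_i, µ_i, A_m, β_m by least squares) is not shown.

```python
import numpy as np

class Split:
    """Non-terminal node: rule (4), e^t (Z - mu) < 0 sends Z to the left."""
    def __init__(self, e, mu, left, right):
        self.e, self.mu, self.left, self.right = e, mu, left, right

class Leaf:
    """Leaf m: affine least-squares predictor of equation (3)."""
    def __init__(self, A, beta):
        self.A, self.beta = A, beta

def predict_lambda(root, Z):
    """Classify the 25-dim binary window vector Z_s down the tree, then
    apply the leaf's linear filter: lambda_hat = A_m Z + beta_m."""
    node = root
    while isinstance(node, Split):
        node = node.left if node.e @ (Z - node.mu) < 0 else node.right
    return float(np.clip(node.A @ Z + node.beta, 0.0, 1.0))

def reconstruct_boundary_pixel(root, Z, F_hat, B_hat):
    """Equation (2): blend interpolated FG/BG colors by the predicted lambda."""
    lam = predict_lambda(root, Z)
    return lam * F_hat + (1.0 - lam) * B_hat
```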

Iterative Pre- and Post-Processing for MRC Layers

The binary nature of the mask layer makes it difficult for MRC compression to deal with scanned data and soft edges. A "halo" is observed at object edges when the MRC model is used, because the edge transitions do not belong fully to either the FG or the BG [4].

In MRC, after the original single-resolution image is decomposed into layers, the layers are processed and compressed using different algorithms. Fig. 11 shows how the original document X is input to the pre-processor, which yields the output Y (a 3-layer MRC model). The pre-processor also constructs an edge-sharpening map and estimates the original edge softness; this information is termed side information. The data are then encoded, and the encoded data go to the decoder, where the reconstructed data Ŷ, along with the side information, are used by the post-processor to assemble a reconstructed version X̂ of the original document.

Fig. 11: Working of MRC with scanned data [4].

Problem with soft edges

In the MRC mask layer, the size and shape of the text are the same as in the imaged text. The edges of text and graphics are not very sharp when scanned; they are referred to as soft edges. The selector plane (mask) is binary while the edge transitions are smooth, so it is not possible to place all of the BG in one plane and all of the FG in the other [4]. This causes a halo around the text in the FG/BG planes, which is very damaging to compression.

Edge Sharpening

This halo cannot be removed with data-filling, which forces a change to the data itself. Not all image edges cause problems, and not all mask transitions produce a halo in scanned material [4]; the halo occurs only where an image edge and a mask transition coincide.

Data-filling

There may be unused pixels in the FG or BG layers, because at each location the final pixel is selected from only one layer, depending on the mask. To increase the compression ratio, these unused pixels can be replaced by any convenient color, as shown in Fig. 12 [4], [6]. Once the pre-processor and input are fixed, the segmenter finds a binary mask, with which the pre-processor derives the output layers from the input.

Fig. 12: MRC decomposer (based on a segmenter and a plane-filling algorithm) [6].
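As a minimal illustration of data-filling, the sketch below replaces a layer's unused pixels with the mean color of its used pixels, which keeps the filled regions smooth and cheap to code; the data-filling algorithms studied in [6] choose fill values more carefully to maximize compressibility. The function name is ours.

```python
import numpy as np

def fill_unused(layer, used):
    """Fill pixels of an FG/BG layer that the mask never selects.

    layer: (H, W, 3) color layer; used: (H, W) boolean map of selected
    pixels (mask == 1 for the FG layer, mask == 0 for the BG layer).
    Unused pixels get the mean color of the used ones."""
    out = layer.astype(np.float64).copy()
    if used.any():
        out[~used] = layer[used].mean(axis=0)   # one smooth fill color
    return out
```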

Edge softening

Intentionally blurring an image is not common practice, but since the edges in the image were originally soft, soft edges at the decoder are what a good compression system should reproduce. The edges were sharpened to accommodate the binary mask, so an estimate of the original edge transition is necessary [4]. A few parameters describing the soft edges that coincide with the mask are sent to the decoder, so that it can soften the image around the mask borders. One parameter can be the width of the edge, i.e., how soft the edges are, while another can give the offset of the mask border relative to the image edges and is related to the thresholding parameters used to find the mask. Alternatively, a linear filter can be used: the encoder has access to both the recovered image and the original unsharpened image, so it can find the optimal filter coefficients, say by least squares, and send all filter parameters to the decoder at low cost.

JPEG-matched MRC compression of compound documents

Fig. 13 shows the MRC encoder. To keep the run-time memory requirement tractable, the algorithm works on independent stripes of image data rather than on the full image [7]. For each stripe there is a practically unlimited number of possible decompositions and associated stripe-encoding parameters, but a near-optimal decomposition must be found in terms of compression and reconstructed image quality.

Fig. 13: Schematic of the MRC encoder [7].

The analysis algorithm must analyze the input stripe thoroughly and must also consider the characteristics of the particular coders that will code the decomposed layers after the analysis. JPEG [7] has been chosen for coding the FG and BG layers, as it is a ubiquitous and efficient image codec, and the decomposition scheme is matched to the way JPEG operates. The mask layer is coded with JBIG [7].

Fig. 14: (a) Original 300 dpi full-color National Geographic scan and (b) reconstructed document, with file sizes, along with (c) the binary mask layer. The MRC codec uses 2x2 sub-sampling of the image layers [7].

Fig. 14 presents compression results for a 300 dpi scan from National Geographic. As can be seen, the mask layer does a good job of distinguishing text and other features of the document. This full-color compound-document codec is compliant with ITU standard T.44 [7]; JPEG was used for the FG and BG layers and JBIG for the mask layer. This approach keeps the encoding complexity low enough to make scan-to-email, scan-to-web, or scan-and-distribute applications feasible [7]. To enable viewing documents on machines with varying capabilities, resolution scalability is implemented at the decoder.

Fig. 15: Images representing the magnitude of the error after decompression (scaled by 30; white = zero): (a) original compound image (600 dpi page); (b) after JPEG compression; (c) after MRC-JPEG compression [8].

Table II: Results of different coding methods applied to the compound image Toy Store of Fig. 15(a) [8].

Fig. 15 evaluates visual quality by showing the error after decompression [8]; these results correspond to the compression ratios in Table II. The figure shows that JPEG produces large errors around the text, whereas MRC decomposition and coding improve both the compression ratio and the quality to a great extent.

Pictorial and text compression comparison between JPEG and JPEG-matched MRC compression

Fig. 16 and Fig. 17 compare the JPEG and JPEG-matched MRC images with the original, in both text and picture regions. The simulation results show that JPEG-matched MRC is better than JPEG in both the text case and the pictorial case.

Fig. 16: Enlarged portion of the text region of the original and reconstructed images at a compression ratio of 70:1. (a) Original, (b) JPEG, and (c) MRC using JPEG + MMR [12] + JPEG [1].

Fig. 17: Enlarged portion of the pictorial region of the original and reconstructed images at a compression ratio of 70:1. (a) Original, (b) JPEG, and (c) MRC using JPEG + MMR [12] + JPEG [1].

SPEC (Shape Primitive Extraction and Coding)

For compound image compression, most algorithms use the standard 3-layer MRC model, but this model is computationally expensive and carries some image-processing overhead to manage multiple algorithms and the imaging model. SPEC [10] instead decomposes the image into two sets of pixels: text/graphics and pictures. Its distinctive features are that segmentation and coding are tightly integrated in the SPEC algorithm, and that shape and color serve as the two basic features for both the segmentation and the lossless coding.

SPEC Algorithm

(1) The System:

Fig. 18 shows the proposed SPEC algorithm, consisting of two stages: segmentation and coding. First, the algorithm segments 16 x 16 non-overlapping blocks of pixels into a text/graphics class and a picture class [11]. The text/graphics pixels are then compressed with a new lossless coding algorithm, and the remaining pixels are compressed with lossy JPEG.

Fig. 18: Flow chart of the SPEC system [10].

SPEC uses four types of shape primitives: isolated pixels, horizontal lines, vertical lines and rectangles. Each shape primitive is represented by a color tag and its position information: (x, y) for an isolated pixel, (x, y, w) for a horizontal line, (x, y, h) for a vertical line, and (x, y, w, h) for a rectangle.

(2) Segmentation:

Segmentation comprises two steps, block classification and refinement segmentation, as shown in Fig. 19. In step one, the 16 x 16 non-overlapping blocks are scanned to count the number of different colors. A block is classified as a picture block if the color

number is larger than a threshold T1 (T1 = 32 is used for SPEC) [11]; otherwise it is classified as a text/graphics block. This classification is extremely fast. Because the first step is coarse, refinement segmentation is performed to extract text and graphics pixels and improve performance. It is carried out by scanning each picture block to extract the four types of shape primitives. The policy implemented here is size-first: the classification of a shape primitive depends on its size and color. A shape primitive is extracted as text/graphics pixels if its size is larger than a threshold T2 (T2 = 3 for SPEC); otherwise its color is compared against a dynamic palette of recent text/graphics colors, and the shape primitive is considered text/graphics if an exact color match is found [11]. A shape primitive's color is put into the dynamic color palette if its size is larger than a threshold T3 (T3 = 5 for SPEC). The dynamic color palette is implemented as a first-in first-out buffer with eight entries; it is kept small for computational efficiency. A shape primitive with pure black or pure white color is directly classified as text/graphics. The segmentation process thus separates the image into two parts: text/graphics pixels (shape-primitive pixels and all pixels of text/graphics blocks) and pictorial pixels (the remaining pixels in the picture blocks).
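A minimal sketch of the first, block-classification step follows; T1 = 32 is the threshold quoted above, and the function name is ours.

```python
import numpy as np

def classify_block(block, t1=32):
    """SPEC step one: count distinct colors in a 16 x 16 block.

    block: (16, 16, 3) uint8 array. More than T1 distinct colors means
    a picture block; otherwise a text/graphics block."""
    n_colors = len(np.unique(block.reshape(-1, 3), axis=0))
    return "picture" if n_colors > t1 else "text/graphics"
```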

Fig. 19: Flow chart of segmentation [10].

Fig. 20: Flow chart of coding [10].

(3) Coding of Pixels:

The SPEC coding process is shown in Fig. 20. Pictorial pixels are compressed with the lossy JPEG algorithm, while text/graphics pixels are compressed with a new lossless coding algorithm based on shape primitives.

Lossless coding of text/graphics pixels: To compress a text/graphics block, the shape primitives are extracted first, by a procedure similar to the one used for picture blocks. For each color, an encoding scheme is used that represents the counts of the four types of shape primitives and their corresponding position values. Too many small shape primitives in a complex block make shape-based coding inefficient, so palette-based coding is a good alternative: to attain minimal code length, the most shape-complex color is palette-coded while the other colors are shape-coded. (When all colors are shape-coded, the most shape-complex color is the one that generates the maximum coding length.) SPEC applies a color-table reuse technique to represent the colors of shape primitives. In general, most colors in the color tables of two consecutive blocks are the same, so if a color of the current block matches a color of the previous block, it is represented by a 1-byte index; otherwise it is represented in a 3-byte (R, G, B) format.

Pictorial pixel coding with JPEG: A simple JPEG coder is used to compress the pictorial pixels of picture blocks. The text/graphics pixels of a picture block are removed before JPEG coding in order to reduce ringing artifacts and achieve a higher compression ratio; these pixels are coded by the lossless algorithm. Their replacement values can be chosen arbitrarily, but it is preferable that they be similar to the neighboring pictorial pixels so as to produce a smooth picture block, so these holes are filled with the average color of the pictorial pixels in the block.
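The color-table reuse decision can be sketched as below. The exact bitstream layout is not specified in the description above, so the leading flag byte that distinguishes the two cases is our assumption, purely for illustration.

```python
def encode_color(color, prev_table):
    """Emit a color as a 1-byte index into the previous block's color
    table when possible, else as a 3-byte (R, G, B) literal.

    color: (r, g, b) tuple of ints in 0..255; prev_table: list of such
    tuples. The flag byte distinguishing the cases is an assumption."""
    if color in prev_table:
        return bytes([0x01, prev_table.index(color)])   # index form
    r, g, b = color
    return bytes([0x00, r, g, b])                       # literal form
```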

Thus an accurate algorithm has been developed to separate text/graphics from pictures, and a lossless coding method has been designed for text/graphics compression.

References:

[1] R. de Queiroz, R. Buckley and M. Xu, Mixed Raster Content (MRC) Model for Compound Image Compression, Conference on Visual Communications and Image Processing, SPIE, Vol. 3653, pp. 1106-1117, Jan. 1999.
[2] G. Feng and C. A. Bouman, High-Quality MRC Document Coding, IEEE Trans. on Image Processing, Vol. 15, pp. 3152-3169, Oct. 2006.
[3] A. Zaghetto, R. L. de Queiroz and D. Mukherjee, MRC Compression of Compound Documents using Threshold Segmentation, Iterative Data-Filling and H.264/AVC-INTRA, Sixth Indian Conference on Computer Vision, Graphics & Image Processing, pp. 679-686, Dec. 2008.
[4] A. Zaghetto and R. L. de Queiroz, Iterative Pre- and Post-Processing for MRC Layers of Scanned Documents, IEEE ICIP 2008, pp. 1009-1012, Oct. 2008.
[5] R. L. de Queiroz, Z. Fan and T. D. Tran, Optimizing Block-Thresholding Segmentation for Multilayer Compression of Compound Images, IEEE Trans. on Image Processing, Vol. 9, pp. 1461-1471, Sept. 2000.
[6] R. L. de Queiroz, On Data-Filling Algorithms for MRC Layers, IEEE ICIP 2000, Vol. 2, pp. 586-589, Vancouver, Canada, Sept. 2000.
[7] D. Mukherjee, N. Memon and A. Said, JPEG-Matched MRC Compression of Compound Documents, IEEE ICIP 2001, Vol. 3, pp. 434-437, Oct. 2001.
[8] A. Said, Compression of Compound Images and Video for Enabling Rich Media in Embedded Systems, VCIP 2004, SPIE Vol. 5308, pp. 69-82, Jan. 2004.

[9] R. L. de Queiroz, Z. Fan and T. D. Tran, Optimizing Block-Threshold Segmentation for MRC Compression, IEEE ICIP 2000, Vol. 2, pp. 597-600, Sept. 2000.
[10] T. Lin and P. Hao, Compound Image Compression for Real-Time Computer Screen Image Transmission, IEEE Trans. on Image Processing, Vol. 14, pp. 993-1005, Aug. 2003.
[11] R. Ramya and K. Mala, A Hybrid Compression Algorithm for Compound Images, International Conference on Computational Intelligence and Multimedia Applications, Vol. 3, pp. 68-72, Dec. 2007.
[12] Z. Ding, L. Liu, M. Fan and Y. Chen, Correction of Random and Outburst Error in Facsimile MMR Code, International Conference on Signal Processing, Vol. 2, Nov. 2006.
[13] R. L. de Queiroz, Compression of Compound Documents, IEEE ICIP 1999, Vol. 1, pp. 209-213, Oct. 1999.
[14] D. Mukherjee, C. Chrysafis and A. Said, JPEG 2000-Matched MRC Compression of Compound Documents, IEEE ICIP 2002, Vol. 3, pp. 73-76, Jun. 2002.
[15] R. L. de Queiroz, Pre-Processing for MRC Layers of Scanned Images, IEEE ICIP 2006, pp. 3093-3096, Oct. 2006.

Acronyms:

BG: Background
CCITT, ITU-T: Telecommunication Standardization Sector of the International Telecommunication Union
dpi: Dots per inch
FG: Foreground
HP: Hewlett-Packard
ICIP: International Conference on Image Processing
ITU-T: Telecommunication Standardization Sector; coordinates telecommunication standards on behalf of the ITU (International Telecommunication Union)
JBIG: Joint Bi-level Image Experts Group
JPEG: Joint Photographic Experts Group
M: Mask
MMR: Modified Modified READ
MRC: Mixed Raster Content
MSE: Mean Squared Error
RC: Region Classification
RD: Rate Distortion
RER: Resolution Enhanced Rendering
SPEC: Shape Primitive Extraction and Coding
TBRS: Tree-Based Resolution Synthesis
TI: Transition Identification
TIFF-FX: Tagged Image File Format Fax eXtended