
DYNAMIC HIERARCHICAL DICTIONARY DESIGN FOR MULTI-PAGE BINARY DOCUMENT IMAGE COMPRESSION**

Yandong Guo, Dejan Depalov, Peter Bauer, Brent Bradburn, Jan P. Allebach, and Charles A. Bouman

School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907-2035, U.S.A.; Hewlett-Packard Co., Boise, ID 83714, U.S.A.

**Research supported by the Hewlett-Packard Company.

ABSTRACT

The JBIG2 standard is widely used for binary document image compression, primarily because it achieves much higher compression ratios than conventional facsimile encoding standards. In this paper, we propose a dynamic hierarchical dictionary design method (DH) for multi-page binary document image compression with JBIG2. Our DH method outperforms other methods for multi-page compression by exploiting the information redundancy among pages with the following technologies. First, we build a hierarchical dictionary to keep more information per page for future use. Second, we dynamically update the dictionary in memory to keep as much information as possible subject to the memory constraint. Third, we incorporate our conditional entropy estimation algorithm to utilize the saved information more effectively. Our experimental results show that the compression ratio improvement by our DH method is about 15% compared to the best existing multi-page encoding method.

Index Terms: Binary document image compression, multi-page, JBIG2, dynamic hierarchical dictionary design, conditional entropy estimation.

1. INTRODUCTION

Binary document image compression is widely used for document scanning, storage, and transmission. Very often, people compress multi-page binary document images. Since these images usually come from consecutive pages of the same document source, there is typically information redundancy among pages. In this paper, we focus on how to exploit this redundancy to improve the compression ratio for multi-page binary document image compression.

The JBIG2 compression standard, developed by the Joint Bi-level Image Experts Group [1], is considered to be the best existing standard for binary image compression because it can achieve much higher compression ratios than previous methods such as T.4, T.6, and T.82 [2, 3, 4, 5, 6]. The high compression ratio of JBIG2 comes from its dictionary-symbol encoding procedure. A typical JBIG2 encoder works by first separating the document into connected components, or symbols. Next it creates a dictionary by encoding a subset of symbols from the image, and finally it encodes all the remaining symbols using the dictionary entries as references [7, 8, 9, 10, 11].

There are two straightforward ways to compress multi-page document images with a JBIG2 encoder. The first method creates an independent and static dictionary (IS) for each of the pages. Since the IS method does not utilize the information redundancy among pages, it generally results in the highest bit rate. Alternatively, the global-dictionary approach builds a single dictionary for the entire document, so it can generally achieve the lowest bit rate. However, the global-dictionary approach is not practical, since it requires an enormous amount of memory to buffer the entire multi-page document before any compression can occur. In fact, practical JBIG2 encoders usually load only one page (sometimes even part of one page) to compress, and do not load the next page until the compression is finished [12].
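To make the dictionary-symbol procedure above concrete, here is a minimal, hedged sketch in Python of a single-page (IS-style) encoder loop. It is not the JBIG2 bitstream or the authors' implementation: symbols are modeled as equally sized binary arrays, the dissimilarity is a plain Hamming distance, and the "encoded" output is just a list of records standing in for dictionary and text-region segments.

import numpy as np

def xor_distance(a, b):
    # Hamming distance between two equally sized binary bitmaps.
    return int(np.count_nonzero(a != b))

def encode_page(symbols, threshold):
    # Greedy single-page dictionary construction: symbols close to an existing
    # dictionary entry are coded as a cheap reference; the rest become new entries.
    dictionary = []        # reference bitmaps
    coded = []             # (mode, ...) records standing in for the bitstream
    for sym in symbols:
        best_idx, best_d = None, None
        for i, entry in enumerate(dictionary):
            d = xor_distance(sym, entry)
            if best_d is None or d < best_d:
                best_idx, best_d = i, d
        if best_idx is not None and best_d <= threshold:
            coded.append(("refine", best_idx, best_d))   # index plus a small residual
        else:
            dictionary.append(sym)
            coded.append(("new", sym.size))              # full bitmap must be coded
    return dictionary, coded

# Toy usage: three 4x4 "symbols", two of which are near-duplicates.
rng = np.random.default_rng(0)
base = rng.random((4, 4)) > 0.5
d, c = encode_page([base, base.copy(), ~base], threshold=2)
print(len(d), [m for m, *_ in c])   # 2 entries; the repeated symbol is refined

Under this toy model, the multi-page question reduces to what happens to the dictionary between pages, which is exactly where the methods discussed next differ.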
Because practical encoders process pages one at a time, the information redundancy among pages is instead utilized by sharing dictionaries among pages, which is supported by the standard [1]. The first and widely cited practical algorithm, called LDM, was proposed by Ye and Cosman in [13, 14]. The LDM algorithm forms a dictionary for each of the pages by starting with the dictionary from the previous page, adding new dictionary entries needed for the current page, and expunging the dictionary entries (from the previous page) not used for the current page. The LDM algorithm improves the compression ratio compared to the IS method, and is memory efficient. Figuera, Yi, and Bouman also proposed a multi-page encoding algorithm referred to as dynamic symbol caching (DSC) in [15, 16]. The DSC algorithm is claimed to achieve a higher compression ratio than LDM, because DSC only discards dictionary entries when there is no space available in the memory. The discarded dictionary entries are the least recently used ones.
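For illustration only, the two update policies just described can be read as follows. This is a hedged paraphrase, not the published LDM or DSC algorithms; the entry fields (id, size, last_used) are assumptions of this sketch, and refreshing last_used for entries re-used on the current page is omitted for brevity.

def ldm_update(prev_dict, used_on_current_page, new_entries):
    # LDM-style: keep a previous-page entry only if the current page used it,
    # then add the entries newly required by the current page.
    kept = [e for e in prev_dict if e["id"] in used_on_current_page]
    return kept + new_entries

def dsc_update(cache, new_entries, page_index, memory_budget_bytes):
    # DSC-style: keep everything until the memory budget is exceeded, then
    # evict the least recently used entries.
    for e in new_entries:
        e["last_used"] = page_index
    cache = cache + new_entries
    while sum(e["size"] for e in cache) > memory_budget_bytes:
        cache.remove(min(cache, key=lambda e: e["last_used"]))
    return cache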

In our paper, we introduce a dynamic hierarchical (DH) dictionary design method for efficient compression of multi-page documents. Improving the encoding efficiency of the document symbols requires better utilization of the information redundancy, which can be accomplished by constructing and sharing large dictionaries among pages. However, large dictionaries also require a large number of bits to encode, and a great deal of memory to retain. Our DH method addresses these competing demands by employing both a hierarchical structure that efficiently encodes large dictionaries, and a dynamic strategy that retains as much information as possible subject to the memory constraint. In addition, we use the conditional entropy estimation technique of [17, 18] to measure the information redundancy between two dictionary entries. This results in further improvements to the encoding efficiency of the dictionary design and symbol encoding processes. Our experimental results show that the DH method improves the compression ratio by approximately 15% relative to the best existing method.

The rest of the paper is organized as follows. We present the dynamic hierarchical dictionary design method in detail in Sec. 2. The experimental results are shown in Sec. 3, and the conclusion is in Sec. 4.

2. DYNAMIC HIERARCHICAL DESIGN

In this section, we introduce the dynamic update approach for encoding of hierarchical dictionaries. Sec. 2.1 explains how the first page is encoded. Sec. 2.2 and Sec. 2.3 then explain the dynamic updating strategy for successive pages.

2.1. Encoding of First Page
The hierarchical dictionary structure efficiently encodes a large refinement dictionary, which in turn more efficiently encodes the document symbols. Fig. 1(b) illustrates how the hierarchical dictionary structure is used to encode the first page of the document. First, we encode a smaller dictionary, called the direct dictionary, denoted as D_1. This direct dictionary is encoded using the direct coding mode of the JBIG2 standard [1]. Next, we use the refinement coding mode of the JBIG2 standard [1] to encode a much larger refinement dictionary, denoted by D^r_1. The refinement dictionary is compressed very efficiently because refinement coding uses a reference symbol from the direct dictionary to encode each new symbol in the refinement dictionary. Finally, we encode all the symbols in the first document page by using the refinement dictionary as a reference.

In order to construct the hierarchical dictionary of Fig. 1(b), we use a bottom-up procedure. First, we extract all the distinct symbols on the first page. Then, for each distinct symbol, we construct one refinement dictionary entry. Next, we group the similar refinement dictionary entries into clusters, and create one representative for each of the clusters. These representatives are the dictionary entries which form the direct dictionary. In order to perform the clustering, we use the conditional entropy estimation-based dictionary indexing and design algorithm (CEE-DI) of [17].

Fig. 1. Single dictionary and hierarchical dictionary structure for the first page: (a) single dictionary; (b) hierarchical dictionary.

2.2. Encoding of Successive Pages

Fig. 2 illustrates the dynamic hierarchical dictionary construction for successive pages of the document. For successive pages, we introduce the stored dictionary, denoted as D^s_k, for the k-th page (k > 1). When there is no memory constraint, the stored dictionary D^s_k is the union of all dictionaries from the previous pages (those with indices less than k). The case with a memory constraint will be discussed in the next subsection. Once again, the refinement dictionary D^r_k is formed from the unique symbols in the k-th page. For each of the refinement dictionary entries, we try to find a good match in the stored dictionary D^s_k to encode it efficiently. Typically, most of the entries in the refinement dictionary will have a good match in the stored dictionary, so the refinement dictionary is encoded very efficiently. However, there will typically be some refinement dictionary entries that do not have a good match in the stored dictionary. In order to encode these unmatched refinement dictionary entries, we form a new direct dictionary D_k. We build this direct dictionary using the conditional entropy estimation-based dictionary indexing algorithm (CEE-I) [17].

Fig. 2. Hierarchical dictionary structure for the k-th page. The stored dictionary D^s_k is the pruned union of all the dictionaries from previous pages. Some entries in the refinement dictionary are encoded using the stored dictionary, while the rest are encoded using the direct dictionary.
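The construction in Secs. 2.1 and 2.2 can be condensed into the following hedged sketch. A simple greedy clustering pass stands in for the CEE-DI and CEE-I algorithms of [17], whose details are given in that reference, and dis stands in for the conditional-entropy dissimilarity; all function and parameter names are illustrative, not the authors' code.

def build_hierarchy(distinct_symbols, dis, radius):
    # Sec. 2.1 (sketch): one refinement entry per distinct symbol; greedily
    # cluster the entries and keep one representative per cluster as the
    # direct dictionary.
    refinement, direct, parent = list(distinct_symbols), [], []
    for entry in refinement:
        best = min(range(len(direct)), key=lambda i: dis(entry, direct[i]), default=None)
        if best is not None and dis(entry, direct[best]) <= radius:
            parent.append(best)                  # encode by refining direct[best]
        else:
            direct.append(entry)                 # new representative, coded directly
            parent.append(len(direct) - 1)
    return direct, refinement, parent

def encode_page_k(distinct_symbols, stored, dis, t_r, radius):
    # Sec. 2.2 (sketch): match each refinement entry against the stored
    # dictionary; entries without a good match are covered by a new direct
    # dictionary built over them.
    refinement = list(distinct_symbols)
    matched, unmatched = [], []
    for entry in refinement:
        best = min(range(len(stored)), key=lambda i: dis(entry, stored[i]), default=None)
        if best is not None and dis(entry, stored[best]) <= t_r:
            matched.append((entry, best))        # refined from stored[best]
        else:
            unmatched.append(entry)
    direct_k, _, _ = build_hierarchy(unmatched, dis, radius)
    return matched, unmatched, direct_k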

The criterion used to determine whether a good match for a given refinement dictionary entry can be found in the stored dictionary is based on the conditional entropy estimation (CEE) in [17]. Let d^r_{k,j} denote the j-th entry in D^r_k, and d^s_{k,i} denote the i-th entry in D^s_k. The best match for d^r_{k,j} in D^s_k is found by

    d^s_{k,\hat{i}(j)} = \arg\min_{d^s_{k,i} \in D^s_k} \hat{H}(d^r_{k,j} \mid d^s_{k,i}),    (1)

where \hat{H}(d^r_{k,j} \mid d^s_{k,i}) is the estimate of the conditional entropy of d^r_{k,j} given d^s_{k,i}. If the conditional entropy of d^r_{k,j} given d^s_{k,\hat{i}(j)} is smaller than the predefined threshold T_R,

    \hat{H}(d^r_{k,j} \mid d^s_{k,\hat{i}(j)}) \le T_R,    (2)

we encode d^r_{k,j} using the stored dictionary entry d^s_{k,\hat{i}(j)} as a reference. Otherwise, we do not encode d^r_{k,j} using the stored dictionary.

2.3. Dynamic updating strategy

In order to make our algorithm practical, we must ensure that the size of the dictionaries retained in memory does not grow beyond the available memory. The method we use is to discard some of the stored dictionary entries whenever the memory size of all the dictionaries for the k-th page is larger than 1 Mbyte. We choose this threshold value because the standard requires a decoder to have at least 1 Mbyte of storage for the dictionary [12]. The memory size of the dictionaries for the k-th page is the sum of the memory sizes of D_k, D^r_k, and D^s_k, which can be calculated using the formula in [12]. The entry to be discarded is the stored dictionary entry d^s_{k,\hat{m}} satisfying both of the following two conditions.

1. The entry d^s_{k,\hat{m}} is not referred to by any entry in D^r_k.

2. The entry d^s_{k,\hat{m}} is least distinct, defined as

    \hat{m} = \arg\min_m d_{\mathrm{XOR}}(d^s_{k,m}, d_{k,n}),    (3)

where d_{k,n} is any dictionary entry different from d^s_{k,m} that belongs to D_k, D^r_k, or D^s_k. The function d_{\mathrm{XOR}} calculates the Hamming distance between two dictionary entries. Similar dictionary entries typically have more mutual information. Therefore, by the above strategy, we maintain as much total information as possible in memory under the memory size constraint.
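As a hedged illustration of Eqs. (1)-(3), the following sketch puts the match test and the pruning rule side by side. Here cond_entropy stands in for the estimator \hat{H}(. | .) of [17], hamming for d_XOR, and entry_bytes for the memory formula of [12]; entries are assumed to be equally sized binary arrays, which real JBIG2 symbols are not, and all names are illustrative.

import numpy as np

def hamming(a, b):
    # d_XOR: Hamming distance between two equally sized binary bitmaps.
    return int(np.count_nonzero(a != b))

def best_stored_match(entry, stored, cond_entropy, t_r):
    # Eqs. (1)-(2): index of the best stored entry, or None if even the best
    # match exceeds the threshold T_R.
    if not stored:
        return None
    i_hat = min(range(len(stored)), key=lambda i: cond_entropy(entry, stored[i]))
    return i_hat if cond_entropy(entry, stored[i_hat]) <= t_r else None

def prune_stored(stored, referenced_ids, all_entries, entry_bytes, budget_bytes):
    # Sec. 2.3 / Eq. (3): while the dictionaries exceed the budget, discard a
    # stored entry that is (a) not referenced by the current refinement
    # dictionary and (b) least distinct, i.e. closest in Hamming distance to
    # some other dictionary entry.
    while sum(entry_bytes(e) for e in all_entries) > budget_bytes:
        candidates = [e for e in stored if id(e) not in referenced_ids]
        if not candidates:
            break
        victim = min(candidates,
                     key=lambda c: min((hamming(c, o) for o in all_entries if o is not c),
                                       default=0))
        stored[:] = [e for e in stored if e is not victim]
        all_entries[:] = [e for e in all_entries if e is not victim]
    return stored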
3. EXPERIMENTAL RESULTS

In this section, we compare the DH method with the best existing method, DSC [15, 16]. The DSC method in our experiments is based on the widely used weighted Hamming distance (WXOR) [19, 20] for the dissimilarity measurement between symbols and dictionary entries. For the DH method, we investigated two versions. The first is as described in Sec. 2, called DH-CEE, since it uses the conditional entropy estimation (CEE) as the dissimilarity measure. For the second, we substitute the CEE dissimilarity measure with the WXOR dissimilarity measure, in order to see the benefit due only to the dynamic hierarchical dictionary design. We call this method DH-WXOR.

The test images consist of three sets of document images: EEPaper, Vita, and I9intro. Each set was scanned from consecutive pages of the same document at 300 dpi. The set EEPaper contains 19 images with 3275 x 2525 pixels; Vita contains 10 images with 3300 x 2550 pixels; and I9intro contains 16 images with 3300 x 2550 pixels. These test images contain mainly text; some of them also contain line art, tables, and graphical elements, but no halftones. The JBIG2 lossless text mode was used for all experiments. We limited the memory usage for dictionaries to less than 1 MB, as described in Sec. 2.3. Unless otherwise stated, we adjusted the parameters of all the methods so that each method achieved its optimal compression ratio.

3.1. Compression ratio improvement

In this subsection, the compression ratio is investigated by compressing the three documents. Fig. 3 shows a comparison of the DH-CEE method to the alternative methods in encoding EEPaper. In the plot, everything is relative to the independent and static dictionary constructed with the WXOR dissimilarity measure (IS-WXOR).

Notice that our method DH-CEE has the highest compression ratio for all the pages. For the entire document EEPaper, DH-CEE improved the compression ratio by 14% compared to DSC, while the improvement for Vita was 15% and for I9intro 17%.

Fig. 3. Compression ratio improvements by different dictionary design methods (relative to IS-WXOR) for EEPaper.

The main reason for the compression ratio improvement of DH-CEE over DSC is that DH-CEE produced a much larger dictionary for each of the pages, as shown in Fig. 4(a), and the larger dictionaries can more efficiently encode the documents. Meanwhile, using DH-CEE, only a small overhead is needed to encode the large dictionary. The efficient encoding of large dictionaries is demonstrated with an example in the next subsection.

Fig. 4. The number of dictionary entries obtained by using different dictionary design methods: (a) DH-CEE vs. DSC; (b) DH-WXOR vs. DSC. The large dictionaries produced by the DH methods encode the documents efficiently. For DH-CEE and DH-WXOR, the dynamic updating controls the size of their dictionaries after the 17th page due to the memory constraint.

3.2. Efficient encoding of large dictionaries

In this subsection, we show that the DH method encodes a large dictionary with relatively little overhead. Different methods were used to create large dictionaries for the first page of EEPaper. The refinement dictionary produced by the DH method is naturally large, because the DH method creates one refinement dictionary entry for each distinct symbol in the page. For comparison, in the experiment in this subsection, we adjusted the parameters of DSC to enforce its single dictionary to be as large as the refinement dictionary of DH-CEE. Please note that in the experiment in the last subsection, the dictionary size of DSC was optimized to achieve its highest compression ratio and, as shown in Fig. 4, was much smaller than that obtained with the DH method.

As shown in Fig. 5, the bitstream file size obtained by using DH-CEE is significantly smaller than that obtained with DSC. This is due to the hierarchical structure of DH-CEE: DH-CEE builds the direct dictionary using CEE-DI to encode the refinement dictionary efficiently.

Fig. 5. Comparison of bit rates (file size in KB) using three methods for dictionary compression, broken down into the bits for the direct dictionary, the refinement dictionary, the single dictionary, and the remaining symbols. Different from the last subsection, the dictionary with DSC was enforced to be as large as the refinement dictionary of DH-CEE. Note that DH-CEE results in the lowest bit rate in encoding the dictionary.

3.3. Conditional entropy estimation

The compression ratio improvement by DH-CEE also comes from the conditional entropy estimation (CEE). For comparison, we investigate the DH-WXOR method, which substitutes CEE with WXOR.

First, we repeated the single-page experiment in Sec. 3.2 with the DH-WXOR method. The refinement dictionaries obtained by using DH-WXOR and DH-CEE are identical, since they use the same method to create their refinement dictionaries. As shown in Fig. 5, the bit rate obtained by using DH-WXOR is smaller than that with DSC, due to the hierarchical dictionary design. On the other hand, the bit rate with DH-WXOR is larger than that of DH-CEE. This is because the CEE used in DH-CEE provides a better measurement of the information redundancy between dictionary entries than the WXOR used in DH-WXOR [17], and thus DH-CEE creates a better direct dictionary to encode the refinement dictionary.

Then, we repeated the multi-page experiment in Sec. 3.1 with the DH-WXOR method. As shown in Fig. 3, DH-WXOR improved the compression ratio by 11% compared to DSC. This improvement comes purely from the large dictionaries produced by the dynamic hierarchical design of DH-WXOR, shown in Fig. 4(b). On the other hand, the compression ratio with DH-WXOR is about 4% less than that with DH-CEE. This is because, based on CEE, DH-CEE creates better direct dictionaries and selects better stored dictionary entries to encode the refinement dictionary than DH-WXOR does.
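Since the comparison above turns on the choice of dissimilarity measure, a sketch of a weighted-XOR (WXOR) score may help. The exact weighting used in [19, 20] is not reproduced here; the version assumed below weights each mismatched pixel by the number of mismatches in its 3x3 neighborhood, so clustered errors cost more than isolated ones.

import numpy as np

def wxor(a, b):
    # Weighted XOR between two equally sized binary bitmaps: each mismatch is
    # weighted by the number of mismatches in its 3x3 neighborhood (assumed form).
    err = (a != b).astype(np.int32)
    padded = np.pad(err, 1)                       # zero border for the 3x3 window
    weight = sum(padded[i:i + err.shape[0], j:j + err.shape[1]]
                 for i in range(3) for j in range(3))
    return int(np.sum(err * weight))

# Example: clustered mismatches score higher than scattered ones.
a = np.zeros((4, 4), dtype=bool)
b1, b2 = a.copy(), a.copy()
b1[0:2, 0:2] = True                               # a 2x2 block of mismatches
b2[0, 0] = b2[0, 3] = b2[3, 0] = b2[3, 3] = True  # four isolated mismatches
print(wxor(a, b1), wxor(a, b2))                   # clustered: 16, scattered: 4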
Please note that all the above experiments were conducted subject to the 1 MB memory constraint. As shown in Figs. 4(a) and (b), the dynamic updating kept discarding stored dictionary entries from the 17th page onward. If we relax the memory constraint, no dictionary entries are discarded and extra memory is consumed. However, according to our experimental results, the extra memory usage improved the compression ratio by less than 1%. This is because the dynamic updating minimizes the information loss caused by discarding dictionary entries by selecting the least distinct stored dictionary entries.

4. CONCLUSION

In this paper, we proposed the dynamic hierarchical (DH) dictionary design method for multi-page binary document image compression. A typical dictionary design method for multi-page images improves the encoding efficiency by maintaining and utilizing the information from previous pages to encode successive pages. The DH method outperforms the previously existing methods using the following technologies. First, the hierarchical design allows us to maintain more information per page. Second, the dynamic updating helps us maintain as much information as possible subject to the memory constraint. Third, the conditional entropy estimation helps us utilize the maintained information more efficiently.

5. REFERENCES

[1] JBIG2 Final Draft International Standard, ISO/IEC JTC1/SC29/WG1 N1545, Dec. 1999.

[2] Standardization of Group 3 Facsimile Apparatus for Document Transmission, CCITT Recommendation T.4, 1980.

[3] Facsimile Coding Schemes and Coding Control Functions for Group 4 Facsimile Apparatus, CCITT Recommendation T.6, 1984.

[4] Progressive Bi-level Image Compression, CCITT Recommendation T.82, 1993.

[5] Roy Hunter and Harry Robinson, "International digital facsimile coding standards," Proc. of the IEEE, vol. 68, pp. 854-867, July 1980.

[6] Ronald B. Arps and Thomas K. Truong, "Comparison of international standards for lossless still image compression," Proc. of the IEEE, vol. 82, pp. 889-899, 1994.

[7] Fumitaka Ono, William Rucklidge, Ronald Arps, and Corneliu Constantinescu, "JBIG2 - the ultimate bi-level image coding standard," in Proc. of IEEE Int'l Conf. on Image Proc., 2000, pp. 140-143.

[8] Paul G. Howard, Faouzi Kossentini, Bo Martins, Søren Forchhammer, and William J. Rucklidge, "The emerging JBIG2 standard," IEEE Trans. on Circuits and Systems for Video Technology, vol. 8, pp. 838-848, 1998.

[9] Corneliu Constantinescu and Ronald Arps, "Fast residue coding for lossless textual image compression," in Proc. 1997 IEEE Data Compression Conf. (DCC), 1997, pp. 397-406.

[10] Paul G. Howard, "Lossless and lossy compression of text images by soft pattern matching," in Proc. 1996 IEEE Data Compression Conf. (DCC), 1996, pp. 210-219.

[11] Yan Ye and Pamela C. Cosman, "Dictionary design for text image compression with JBIG2," IEEE Trans. on Image Processing, vol. 10, pp. 818-828, 2001.

[12] Application Profiles for Recommendation T.88: Lossy/Lossless Coding of Bi-level Images (JBIG2) for Facsimile, ITU-T Recommendation T.89, 2001.

[13] Yan Ye and Pamela C. Cosman, "Fast and memory efficient text image compression with JBIG2," IEEE Trans. on Image Processing, vol. 12, pp. 944-956, 2003.

[14] Yan Ye and Pamela C. Cosman, "Fast and memory efficient JBIG2 encoder," in Proc. of IEEE Int'l Conf. on Acoust., Speech and Sig. Proc., 2001, vol. 3, pp. 1753-1756.

[15] Maribel Figuera, Jonghyon Yi, and Charles A. Bouman, "A new approach to JBIG2 binary image compression," in Proc. SPIE 6493, Color Imaging XII: Processing, Hardcopy, and Applications, 2007, pp. 649305-1-649305-12.

[16] Maribel Figuera, Memory-Efficient Algorithms for Raster Document Image Compression, Ph.D. thesis, Purdue University, West Lafayette, IN, USA, 2008.

[17] Yandong Guo, Dejan Depalov, Peter Bauer, Brent Bradburn, Jan P. Allebach, and Charles A. Bouman, "Binary image compression using conditional entropy-based dictionary design and indexing," in Proc. SPIE 8652, Color Imaging XVIII: Displaying, Processing, Hardcopy, and Applications, 2013, to be published.

[18] Guangzhi Cao, Yandong Guo, and Charles A. Bouman, "High dimensional regression using the sparse matrix transform (SMT)," in Proc. of IEEE Int'l Conf. on Acoust., Speech and Sig. Proc., 2010, pp. 1870-1873.

[19] William K. Pratt, Patrice J. Capitant, Wen-Hsiung Chen, Eric R. Hamilton, and Robert H. Wallis, "Combined symbol matching facsimile data compression system," Proc. of the IEEE, vol. 68, pp. 786-796, 1980.

[20] Stuart J. Inglis, Lossless Document Image Compression, Ph.D. thesis, University of Waikato, New Zealand, 1999.