EE678 Wavelets Application Assignment, March 2005

JPEG2000: Wavelets In Image Compression

Group Members: Qutubuddin Saifee 01d07009, Ankur Gupta 01d, Nishant Singh 01d07019

Abstract

During the past decade, with the birth of wavelet theory and multiresolution analysis, image processing techniques based on the wavelet transform have been extensively studied and tremendously improved. JPEG2000 uses the wavelet transform and provides an integrated toolbox to better address increasing needs for compression. In this report, we study the basic concepts of JPEG2000, the LeGall 5/3 and Daubechies 9/7 wavelets used in it, and finally Embedded Zerotree Wavelet (EZW) coding and Set Partitioning in Hierarchical Trees (SPIHT).

Index Terms: Wavelets, JPEG2000, Image Compression, LeGall, Daubechies, EZW, SPIHT.

I. INTRODUCTION

Since the mid 1980s, members from both the International Telecommunication Union (ITU) and the International Organization for Standardization (ISO) have been working together to establish a joint international standard for the compression of grayscale and color still images. This effort has been known as JPEG, the Joint Photographic Experts Group. After evaluating a number of coding schemes, the JPEG members selected a discrete cosine transform (DCT)-based method. From 1988 to 1990, the JPEG group continued its work by simulating, testing and documenting the algorithm. JPEG became a Draft International Standard (DIS) in 1991 and subsequently an International Standard (IS).

With the continual expansion of multimedia and Internet applications, the needs and requirements of the technologies used grew and evolved. In March 1997, a new call for contributions was launched for the development of a new standard for the compression of still images, the JPEG2000 standard. The JPEG2000 standard makes use of the Discrete Wavelet Transform (DWT); its block diagram is shown in Fig. 1.

Fig. 1. Block diagram for JPEG2000

Prior to the transformation, some preprocessing is required: image tiling, DC level shifting and component transformation.

Tiling refers to the partitioning of the original image into rectangular nonoverlapping blocks called tiles, which are compressed independently, as though they were entirely distinct images. Wavelet transform, quantization and entropy coding are performed independently on the image tiles. Tiling reduces memory requirements, and since the tiles are also reconstructed independently, they can be used for decoding specific parts of the image instead of the whole image. Larger tiles also perform visually better than smaller tiles.

DC level shifting refers to shifting all samples of an image tile component by subtracting the same quantity $2^{P-1}$, where $P$ is the component's precision. Level shifting does not affect variances. It actually converts an unsigned representation to a two's complement representation, or vice versa. If a color transformation is used, DC level shifting is performed prior to the computation of the forward component transform. At the decoder side, inverse DC level shifting is performed on the reconstructed samples by adding the bias $2^{P-1}$ to them after the computation of the inverse component transform.

The forward transform block is where the wavelet transform takes place. The wavelet transform analyzes the tile components into different decomposition levels. To perform the forward DWT, the standard uses a one-dimensional (1-D) subband decomposition of a 1-D set of samples into low-pass and high-pass samples. Low-pass samples represent a down-sampled, low-resolution version of the original set. High-pass samples represent a down-sampled residual version of the original set, needed for the perfect reconstruction of the original set from the low-pass set. The DWT can be irreversible or reversible. The default irreversible transform is implemented by means of the Daubechies 9-tap/7-tap filter. The default reversible transform is implemented by means of the Le Gall 5-tap/3-tap filter. The standard supports two filtering modes: convolution based and lifting based. Convolution-based filtering consists of performing a series of dot products between the two filter masks and the extended 1-D signal.
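
To make the convolution-based mode concrete, the following is a minimal sketch of one level of 1-D analysis with the 5/3 filter pair. It is an illustration under stated assumptions, not the normative procedure of the standard: the taps are the floating-point (non-integer) 5/3 analysis filters, the signal is extended by mirroring about its end samples, and the names `symmetric_extend` and `analyze_53` are hypothetical.

```python
# Assumed floating-point 5/3 analysis taps (illustrative normalisation).
LOW_PASS = [-1 / 8, 2 / 8, 6 / 8, 2 / 8, -1 / 8]  # centred on even samples
HIGH_PASS = [-1 / 2, 1.0, -1 / 2]                  # centred on odd samples


def symmetric_extend(x, pad):
    """Mirror the signal about its first and last samples (edges not repeated)."""
    n = len(x)
    left = [x[i] for i in range(pad, 0, -1)]
    right = [x[n - 1 - i] for i in range(1, pad + 1)]
    return left + list(x) + right


def analyze_53(x):
    """One level of convolution-based 5/3 analysis on an even-length signal."""
    if len(x) < 4 or len(x) % 2:
        raise ValueError("expected an even-length signal with at least 4 samples")
    pad = 2
    ext = symmetric_extend(x, pad)
    low, high = [], []
    for n in range(len(x) // 2):
        even = 2 * n + pad  # position of x[2n] inside the extended signal
        odd = even + 1      # position of x[2n + 1] inside the extended signal
        # Each output sample is a dot product between a filter mask and the signal.
        low.append(sum(h * ext[even - 2 + k] for k, h in enumerate(LOW_PASS)))
        high.append(sum(g * ext[odd - 1 + k] for k, g in enumerate(HIGH_PASS)))
    return low, high


low, high = analyze_53([0, 1, 2, 3, 4, 5, 6, 7])
print(low)   # down-sampled low-resolution version
print(high)  # down-sampled residual (detail) version
```

For the linear ramp used here, the high-pass samples vanish everywhere except at the right boundary, where the mirrored extension breaks the ramp. This reflects the fact that the 3-tap high-pass filter annihilates constant and linear signals.
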
Lifting-based filtering consists of a sequence of very simple filtering operations in which, alternately, odd sample values of the signal are updated with a weighted sum of even sample values, and even sample values are updated with a weighted sum of odd sample values.

After transformation, all coefficients are quantized. Quantization is the process by which the coefficients are reduced in precision. This operation is lossy, unless the quantization step is 1 and the coefficients are integers, as produced by the reversible integer 5/3 wavelet. Each transform coefficient $a_b(u,v)$ of the subband $b$ is quantized to the value $q_b(u,v)$ according to the formula

\[ q_b(u,v) = \operatorname{sign}\big(a_b(u,v)\big) \left\lfloor \frac{|a_b(u,v)|}{\Delta_b} \right\rfloor \tag{1} \]

where $\Delta_b$ is the quantization step size of subband $b$.

Finally, entropy coding is achieved by means of an arithmetic coding system that compresses binary symbols relative to an adaptive probability model.

II. BACKGROUND THEORY

The wavelet decomposition algorithm uses two analysis filters $\hat{H}(z)$ (lowpass) and $\hat{G}(z)$ (highpass). The reconstruction algorithm applies the complementary synthesis filters $H(z)$ (refinement filter) and $G(z)$ (wavelet filter). These four filters constitute a perfect reconstruction filter bank. The wavelet transform has a continuous-time domain interpretation that involves the scaling functions $\hat{\varphi}(x)$ and $\varphi(x)$, which are solutions of two-scale relations with filters $\hat{H}(z)$ and $H(z)$, respectively. The scaling function $\varphi(x)$ associated with the filter $H(z)$ is the $L_2$-solution (if it exists) of the two-scale relation

\[ \varphi(x) = \frac{2}{H(1)} \sum_{k \in \mathbb{Z}} h_k \, \varphi(2x - k) \tag{2} \]

While it is usually difficult to obtain an explicit characterization of $\varphi(x)$ in the time domain, one can express its Fourier transform, written here as $\mathcal{F}\varphi(\omega)$ to avoid confusion with the analysis function $\hat{\varphi}$, as a convergent infinite product

\[ \mathcal{F}\varphi(\omega) = \prod_{k=1}^{\infty} \frac{H\big(e^{j\omega/2^k}\big)}{H(1)} \tag{3} \]

A simple way to generate a scaling function is to run the synthesis part of the wavelet transform algorithm starting with an impulse; this is often referred to as the cascade algorithm.
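
The cascade algorithm can be sketched in a few lines. The example below is an illustration only: it starts from a unit impulse and repeatedly up-samples by two and filters with the refinement filter scaled by $2/H(1)$, as in the two-scale relation. The 3-tap filter (1/2, 1, 1/2) is used as input because it is the LeGall synthesis filter of JPEG2000; the function names are hypothetical.

```python
def upsample2(c):
    """Insert a zero between consecutive samples."""
    out = [0.0] * (2 * len(c) - 1)
    out[::2] = c
    return out


def convolve(a, b):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def cascade(h, iterations):
    """Approximate samples of the scaling function on a grid of step 2**-iterations."""
    gain = sum(h)                        # H(1), the filter evaluated at z = 1
    g = [2.0 * hk / gain for hk in h]    # coefficients (2 / H(1)) * h_k
    c = [1.0]                            # unit impulse
    for _ in range(iterations):
        c = convolve(upsample2(c), g)    # one synthesis step: up-sample, then filter
    return c


samples = cascade([0.5, 1.0, 0.5], 3)
print(samples)  # samples of a triangular (hat) function, peak value 1.0
```

With this filter every iteration simply interpolates linearly between the existing samples, so the iterates are exact samples of the linear B-spline. For longer filters such as the 9/7 pair the same loop converges to the scaling function only in the limit.
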
Much of the early work in wavelet theory has been devoted to working out the mathematical properties (convergence, regularity, order, etc.) of these scaling functions. The wavelets themselves usually do not pose a problem because they are linear combinations of the scaling functions, i.e.,

\[ \psi(x) = \frac{2}{H(1)} \sum_{k \in \mathbb{Z}} g_k \, \varphi(2x - k) \tag{4} \]

\[ \hat{\psi}(x) = \frac{2}{\hat{H}(1)} \sum_{k \in \mathbb{Z}} \hat{g}_k \, \hat{\varphi}(2x - k) \tag{5} \]

The corresponding analysis and synthesis wavelet basis functions are $\hat{\psi}_{i,k} = 2^{-i/2} \hat{\psi}(x/2^i - k)$ and $\psi_{i,k} = 2^{-i/2} \psi(x/2^i - k)$, respectively, where $k$ and $i$ are the translation and scale indices.

A necessary condition for the convergence of (3) to an $L_2$-stable function is that the filter has a zero at $z = -1$. More generally, the refinement filters will have a specified number of regularity factors $(1 + z^{-1})$. The number of such factors that divide $H(z)$ determines the order of approximation.

Fig. 2. Synthesis LeGall JPEG filter of length 3 and corresponding scaling function.

Another important characteristic of a scaling function is its smoothness in the sense of degree of differentiability. The Besov regularity essentially specifies the fractional degree of differentiability of the function in the $L_p$-sense. The most stringent measure corresponds to the case $p = \infty$ and coincides with the Hölder regularity, which is a classical measure of pointwise continuity. The other most commonly used measure is the Sobolev regularity, which corresponds to the intermediate Besov case $p = 2$; it is a more global indicator of smoothness that is entirely specified in the Fourier domain. Most researchers agree that a minimum of regularity (typically, continuity) is desirable for good convergence behavior of the iterated filterbank.

The next point concerns the stability of the wavelet representation and of its underlying multiresolution bases. The crucial mathematical property is that the translates of the scaling functions and wavelets form Riesz bases. Thus, one needs to characterize their Riesz bounds and other related quantities. The tightest upper and lower bounds, $B < \infty$ and $A > 0$, of the autocorrelation filter of $\varphi(x)$ are the Riesz bounds of $\varphi(x)$, i.e. $A^2 = \inf_{\omega \in [0, 2\pi]} a_\varphi(\omega)$ and $B^2 = \sup_{\omega \in [0, 2\pi]} a_\varphi(\omega)$, where $a_\varphi(\omega)$ is the autocorrelation filter. The existence of the Riesz bounds ensures that the underlying basis functions are in $L_2$ and that they are linearly independent.

JPEG2000 restricts the user's choice to two wavelet transforms: Daubechies 9/7 for lossy compression, and the 5/3 LeGall wavelet, which has rational coefficients, for reversible or lossless compression. It also specifies that these should be implemented using the lifting scheme.

The JPEG2000 LeGall 5/3 scaling filters are given by

\[ \hat{H}(z) = \frac{1}{8} \, z \, (1 + z^{-1})^2 \, (-z - z^{-1} + 4) \tag{6} \]

\[ H(z) = \frac{1}{2} \, z \, (1 + z^{-1})^2 \tag{7} \]

The analysis filter has approximation order 2, Hölder regularity 0 and a lower Riesz bound of 1; the synthesis filter has approximation order 2, Hölder regularity 1 and an upper Riesz bound of 1. The synthesis function (Fig. 2) is the linear B-spline, which is boundedly differentiable. The analysis function (Fig. 3) is also in $L_2$ but has no smoothness at all; it is merely bounded. The Riesz bounds are not very tight, indicating that the functions are far from orthogonal. A less favorable aspect of this transform is the relative magnitude of the wavelet constant, which comes as a consequence of the lack of tightness of the Riesz bounds. While this may compromise the efficacy of bit allocation in the wavelet domain, it is not really a problem here because this wavelet transform is intended to be used for lossless coding.

The JPEG2000 Daubechies 9/7 scaling filters are given by

\[ \hat{H}(z) = \frac{2^{-5}}{\frac{64}{5\rho} - 6 + \rho} \, z^2 \, (1 + z^{-1})^4 \left( z^2 + z^{-2} - (8 - \rho)(z + z^{-1}) + \frac{128}{5\rho} + 2 \right) \tag{8} \]

\[ H(z) = \frac{2^{-3}}{\rho - 2} \, z^2 \, (1 + z^{-1})^4 \, (-z - z^{-1} + \rho) \tag{9} \]

where $\rho$ is a real constant. These filters result from the factorization of the same polynomial as the orthonormal Daubechies filters.
The main difference is that the 9/7 filters are symmetric. The shortest basis functions are placed on the synthesis side. All functions are at least once continuously differentiable, with a fair amount of extra smoothness on the synthesis side. The analysis filter has approximation order 4 and Hölder regularity 1.07; the synthesis filter has approximation order 4 and Hölder regularity 1.70. Perhaps the most important property that is truly specific to the 9/7 is the tightness of its Riesz bounds, indicating that the basis functions are very nearly orthogonal.

Fig. 3. Analysis LeGall JPEG filter of length 5 and corresponding scaling function.

III. APPLICATION

A. Embedded Zerotree Wavelet (EZW)

EZW is a simple, yet remarkably effective, image compression algorithm, having the property that the bits in the bit stream are generated in order of importance, yielding a fully embedded code. This compression algorithm is based on four key concepts: 1) wavelet transform or hierarchical subband decomposition, 2) prediction of the absence of significant information across scales by exploiting the self-similarity inherent in images, 3) entropy-coded successive-approximation quantization, and 4) universal lossless data compression, which is achieved via adaptive arithmetic coding.

The compression algorithm contains the following features:
1) A discrete wavelet transform, which provides a compact multiresolution representation of the image.
2) Zerotree coding, which provides a compact multiresolution representation of significance maps, i.e. maps indicating the locations of the significant samples. Zerotrees allow the successful prediction of insignificant samples across scales to be efficiently represented as part of exponentially growing trees.
3) Successive approximation, which provides a compact multiprecision representation of the significant coefficients and facilitates the embedding algorithm.
4) A prioritization protocol whereby the ordering of importance is determined by the magnitudes of the reconstructed wavelet coefficients regardless of their scale.
5) Adaptive multilevel arithmetic coding, which provides a fast and efficient method for entropy coding strings of symbols, and requires no training or prestored tables. The arithmetic coder used in the experiments is a customized version of a previously published coder.
6) The algorithm runs sequentially and stops whenever a target bit rate or a target distortion is met. A target bit rate can be met exactly, and an operational rate vs. distortion function (RDF) can be computed point by point.

The image is first transformed using a discrete wavelet transform. The discrete wavelet transform used here is identical to a hierarchical subband system, where the subbands are logarithmically spaced in frequency and represent an octave-band decomposition. To begin the decomposition, the image is decomposed into four subbands by cascading horizontal and vertical two-channel critically sampled filterbanks. The filters used in the decomposition are scaled so that the squares of the filter coefficients sum to one. At this point in the decomposition, in terms of tiling the image, each coefficient in each subband represents a spatial area corresponding to approximately a 2x2 area of the original picture. In terms of tiling the 2-D frequency domain, the low frequencies represent a bandwidth in each dimension approximately corresponding to $0 < \omega < \pi/2$, whereas the high frequencies represent the band $\pi/2 < \omega < \pi$. To obtain the next coarser scale of wavelet coefficients, the lowest frequency subband is further decomposed and critically sampled. The process continues until some final scale is reached. To perform the embedded coding, successive-approximation quantization (SAQ) is applied.
As will be seen, SAQ is related to bit-plane encoding of the magnitudes. Given an amplitude threshold $T$, a wavelet coefficient $x$ is said to be insignificant with respect to $T$ if $|x| < T$. The SAQ sequentially applies a sequence of thresholds $T_0, \ldots, T_{N-1}$ to determine significance, where the thresholds are chosen so that $T_i = T_{i-1}/2$. The initial threshold $T_0$ is chosen so that $|x_j| < 2T_0$ for all transform coefficients $x_j$.

During the encoding (and decoding), two separate lists of coordinates of wavelet coefficients are maintained. At any point in the process, the dominant list contains the coordinates of those coefficients that have not yet been found to be significant, whereas the subordinate list contains the coordinates of those coefficients that have been found to be significant. For each threshold, each list is scanned once. During a dominant pass, coefficients with coordinates on the dominant list, i.e. those that have not yet been found to be significant, are compared to the threshold $T_i$ to determine their significance, and if a coefficient is significant, its sign is also encoded. A dominant pass is followed by a subordinate pass in which all coefficients on the subordinate list are refined to an additional bit of precision.

A parent-child relationship can be defined between wavelet coefficients at different scales corresponding to the same location. With the exception of the highest frequency subbands, every coefficient at a given scale can be related to a set of coefficients at the next finer scale of similar orientation. The coefficient at the coarse scale is called the parent, and all coefficients corresponding to the same spatial location at the next finer scale of similar orientation are called children. The scanning of the coefficients during a dominant pass is performed in such a way that no child is scanned before its parent. Given a threshold level $T_i$ to determine whether or not a coefficient is significant, a coefficient is said to be an element of a zerotree if it is insignificant and all of its descendants are also insignificant.
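
The threshold sequence and the zerotree membership test defined above can be sketched as follows. This is a minimal illustration under stated assumptions: the initial threshold is taken as the largest power of two not exceeding the largest magnitude, which is one choice that satisfies $|x_j| < 2T_0$ but is not prescribed by the text, and the coefficient tree is represented by two hypothetical dictionaries (`values` and `children`) rather than by subband coordinates.

```python
from math import floor, log2


def initial_threshold(coeffs):
    """Assumed choice of T_0: a power of two with |x| < 2 * T_0 for all x."""
    largest = max(abs(x) for x in coeffs)
    if largest == 0:
        raise ValueError("all coefficients are zero")
    return 2.0 ** floor(log2(largest))


def threshold_sequence(t0, passes):
    """Thresholds T_0, T_1, ... with T_i = T_{i-1} / 2."""
    return [t0 / 2 ** i for i in range(passes)]


def is_significant(x, t):
    """A coefficient is significant with respect to t when |x| >= t."""
    return abs(x) >= t


def in_zerotree(node, values, children, t):
    """True if the node and all of its descendants are insignificant w.r.t. t."""
    if is_significant(values[node], t):
        return False
    return all(in_zerotree(ch, values, children, t) for ch in children.get(node, ()))


# Hypothetical toy tree: one parent, two children, two grandchildren.
values = {"root": 10, "a": 3, "b": -5, "a1": 1, "a2": -40}
children = {"root": ["a", "b"], "a": ["a1", "a2"]}

t0 = initial_threshold(values.values())
print(threshold_sequence(t0, 3))
print(in_zerotree("b", values, children, t0))     # leaf, insignificant
print(in_zerotree("root", values, children, t0))  # has a significant descendant
```

In this toy tree the node `root` is itself insignificant at the first threshold, but its descendant `a2` is not, so `root` is not an element of a zerotree.
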
A coefficient is said to be a zerotree root for a threshold $T_i$ if 1) the coefficient is insignificant, 2) the coefficient is not the descendant of a previously found zerotree root for $T_i$, i.e. it is not predictably insignificant from the discovery of a zerotree root at a coarser scale, and 3) all of its descendants are insignificant. During a dominant pass, each scanned coefficient that is not predictably insignificant is encoded with a symbol from a four-symbol alphabet: 1) zerotree root, 2) isolated zero (the coefficient is insignificant but has a significant descendant), 3) positive significant, and 4) negative significant. The string of symbols is then encoded using a multilevel adaptive arithmetic coder.

Once a coefficient is found to be significant, it is moved onto the subordinate list. Note also that once a coefficient is determined to be significant, its value is treated as zero when deciding whether one of its ancestors is a zerotree root on future dominant passes, so that it does not prevent a zerotree from occurring. For the subordinate pass, each binary decision to refine a significant coefficient is arithmetically encoded. The process continues to alternate between dominant passes and subordinate passes, with the threshold halved before each dominant pass. The encoding stops when some target stopping condition is met, such as when the bit budget is exhausted or when a target quality has been achieved.

Here, compression is achieved both by eliminating a large number of predictably insignificant coefficients from consideration through zerotree coding, and by adaptive arithmetic coding of a string of symbols from a small alphabet.

B. Set Partitioning in Hierarchical Trees (SPIHT)

Embedded zerotree wavelet (EZW) coding, introduced by J. M. Shapiro, is a very effective and computationally simple technique for image compression.
An implementation of these ideas based on set partitioning in hierarchical trees (SPIHT) provides even better results. The complete algorithm can be studied in four parts:
1) Embedded coding or progressive transmission scheme
2) Partial ordering by coefficient magnitude and ordered bit-plane transmission
3) Set partitioning sorting procedure
4) Spatial orientation trees

Progressive Image Transmission: Suppose that the original image is defined by a set of pixel values $p_{i,j}$, where $(i,j)$ is the pixel coordinate. The coding is actually applied to the array $c = R(p)$, where $R(\cdot)$ represents a unitary hierarchical subband transformation. The 2-D array $c$ has the same dimensions as $p$, and each element $c_{i,j}$ is called the transform coefficient at coordinate $(i,j)$. In a progressive transmission scheme, the decoder initially sets the reconstruction vector $\hat{c}$ to zero and updates its components according to the coded message. After receiving the value (approximate or exact) of some coefficients, the decoder can obtain a reconstructed image $\hat{p} = R^{-1}(\hat{c})$. A major objective in a progressive transmission scheme is to transmit first the most important information, i.e. the information that yields the largest distortion reduction. For this selection, the mean squared-error (MSE) distortion measure is used.

Set Partitioning Sorting Algorithm: One important fact used in the design of the sorting algorithm is that we do not need to sort all coefficients. Actually, we need an algorithm that simply selects the coefficients such that $2^n \le |c_{i,j}| < 2^{n+1}$, with $n$ decremented in each pass. Given $n$, if $|c_{i,j}| \ge 2^n$ then we say that a coefficient is significant; otherwise it is called insignificant. The sorting algorithm divides the set of pixels into partitioning subsets $T_m$ and performs the magnitude test $\max_{(i,j) \in T_m} |c_{i,j}| \ge 2^n$ on each subset. If the decoder receives a "no" as the answer (the subset is insignificant), then it knows that all coefficients in $T_m$ are insignificant. If the answer is "yes" (the subset is significant), then a certain rule shared by the encoder and the decoder is used to partition $T_m$ into new subsets $T_{m,l}$, and the significance test is then applied to the new subsets. This set division process continues until the magnitude test has been applied to all single-coordinate significant subsets, so that each significant coefficient is identified.

To reduce the number of magnitude comparisons (message bits), we define a set partitioning rule that uses an expected ordering in the hierarchy defined by the subband pyramid. The objective is to create new partitions such that subsets expected to be insignificant contain a large number of elements, and subsets expected to be significant contain only one element.
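
The set-level magnitude test and the recursive subdivision can be illustrated with a short sketch. It is not the SPIHT partitioning rule: as a stand-in for the tree-based rule, the hypothetical function `find_significant` simply splits a coordinate list into two halves, which is enough to show how an insignificant subset is dismissed with a single test. It also returns the number of tests performed, which corresponds to the number of message bits spent on sorting.

```python
def is_significant(coeffs, subset, n):
    """Set-level test: does any coefficient in the subset reach magnitude 2**n?"""
    return any(abs(coeffs[i][j]) >= 2 ** n for (i, j) in subset)


def find_significant(coeffs, subset, n):
    """Return (significant coordinates, number of magnitude tests).

    Assumed partitioning rule for illustration only: split the list in half.
    """
    if not is_significant(coeffs, subset, n):
        return [], 1                       # whole subset dismissed with one test
    if len(subset) == 1:
        return list(subset), 1             # single significant coefficient found
    mid = len(subset) // 2
    left, left_tests = find_significant(coeffs, subset[:mid], n)
    right, right_tests = find_significant(coeffs, subset[mid:], n)
    return left + right, 1 + left_tests + right_tests


# Hypothetical 4x4 coefficient array with one dominant coefficient.
coeffs = [
    [34, 0, 1, -2],
    [0, 3, 0, 1],
    [1, -1, 0, 0],
    [0, 2, -1, 0],
]
coords = [(i, j) for i in range(4) for j in range(4)]

found, tests = find_significant(coeffs, coords, 5)
print(found, tests)
```

For this array and n = 5, the only significant coefficient is located with 9 set tests instead of 16 individual comparisons. The saving grows when large insignificant subsets can be predicted, which is what the tree-based partitioning aims for.
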
Spatial Orientation Trees: The following sets of coordinates are used to present the new coding method:
O(i, j): set of coordinates of all offspring of node (i, j);
D(i, j): set of coordinates of all descendants of node (i, j);
H: set of coordinates of all spatial orientation tree roots (nodes in the highest pyramid level);
C(i, j) = D(i, j) − O(i, j).

For instance, except at the highest and lowest pyramid levels, we have O(i, j) = {(2i, 2j), (2i, 2j + 1), (2i + 1, 2j), (2i + 1, 2j + 1)}. We use parts of the spatial orientation trees as the partitioning subsets in the sorting algorithm. The set partitioning rules are simply the following.
1) The initial partition is formed with the sets {(i, j)} and D(i, j), for all (i, j) ∈ H.
2) If D(i, j) is significant, then it is partitioned into C(i, j) plus the four single-element sets {(k, l)} with (k, l) ∈ O(i, j).
3) If C(i, j) is significant, then it is partitioned into the four sets D(k, l), with (k, l) ∈ O(i, j).

C. Embedded Block Coding with Optimized Truncation (EBCOT)

Like earlier algorithms, which include Shapiro's EZW and Said and Pearlman's SPIHT, the EBCOT algorithm uses a wavelet transform to generate the subband samples that are to be quantized and coded. The usual dyadic decomposition structure attributed to Mallat is typical, but other packet decompositions are also supported and occasionally preferable.

IV. CONCLUSION

We have studied the application of wavelets to image compression in the JPEG2000 standard. We looked at the two wavelets used by JPEG2000, namely LeGall 5/3 and Daubechies 9/7, and studied their properties. Daubechies 9/7 was found to have tight Riesz bounds, indicating that the basis functions are very nearly orthogonal. Thereafter we studied the Embedded Zerotree Wavelet (EZW) and the Set Partitioning in Hierarchical Trees (SPIHT) algorithms, which code the wavelet coefficients.
Embedded zerotree wavelet (EZW) coding is a very effective and computationally simple technique for image compression, but the SPIHT algorithm, based on set partitioning in hierarchical trees, was found to provide better results than EZW.

ACKNOWLEDGMENT

We would like to thank Prof. V. M. Gadre for giving us this opportunity to look at wavelets in depth and thus appreciate the course.
REFERENCES

[1] X. Liu, "Comparison between wavelet image compression methods," University of South Carolina.
[2] A. Skodras, C. Christopoulos and T. Ebrahimi, "The JPEG2000 still image compression standard," IEEE Signal Processing Magazine, September 2001.
[3] M. Unser and T. Blu, "Mathematical properties of the JPEG2000 wavelet filters," IEEE Trans. Image Proc., vol. 12, no. 9, Sept. 2003.
[4] J. M. Shapiro, "An embedded hierarchical image coder using zerotrees of wavelet coefficients," The David Sarnoff Research Center, Princeton.
[5] D. Sinha and A. H. Tewfik, "Low bit rate transparent audio compression using adapted wavelets," IEEE Transactions on Signal Processing, 41:12.
[6] A. Said and W. A. Pearlman, "A new, fast, and efficient image codec based on set partitioning in hierarchical trees," IEEE Trans. Circuits and Systems for Video Tech., vol. 6, no. 3, June 1996.