# IMAGE CODING USING WAVELET TRANSFORM, VECTOR QUANTIZATION, AND ZEROTREES

## 2.2 - Vector Quantization

Vector Quantization (VQ) has been applied to image compression, either by coding the image itself or by coding some transformation of it. In VQ, a group of pixels, called a vector, is approximated by another one taken from a table of admissible vectors, the codebook, and is coded simply by the index of this vector in the table. The codebook is created by an optimization algorithm applied over a training set, a series of images similar to the ones to be coded. As discussed in [3], application of VQ to the WT of an image requires the creation of one codebook for each scale and orientation in the transformed image.

## Embedded zerotree wavelet

The EZW algorithm is based on the construction of two lists for a given image that has previously been decorrelated with a wavelet transform. In the first list, called the dominant list, the information about the significance of each coefficient is coded. In the second, or significant list, only the values of the significant coefficients are kept, up to a given degree of precision.

In Shapiro's scheme, the significance of a coefficient at a given iteration is determined by comparing it with a threshold (T). If the magnitude of the coefficient is greater than T, the coefficient is significant; if it is smaller than T, it is considered insignificant. A significant coefficient can be positive (P) or negative (N). When the coefficient is below the threshold, its descendants, which are the corresponding coefficients in lower (finer) scales, are analyzed. The parent-child relationships are described in Figure 1. If all the descendants are insignificant, the coefficient is coded as a zerotree root (ZTR) and all its descendants are discarded from further processing. When some of the descendants are significant, however, we have an isolated zero (IZ), and the descendants have to be coded individually. In summary, four symbols (2 bits each) are needed to code the dominant list completely.
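The four-symbol classification described above can be illustrated with a minimal sketch. The `Node` class, the function names, the breadth-first visiting order and the toy coefficient values are all assumptions made for illustration; they are not taken from Shapiro's paper or from this work. For simplicity, insignificant coefficients without children are also reported as ZTR.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical container: one coefficient and its children at the next finer scale."""
    value: float
    children: List["Node"] = field(default_factory=list)


def has_significant_descendant(node: Node, threshold: float) -> bool:
    """True if any descendant of `node` has magnitude above the threshold."""
    return any(
        abs(child.value) > threshold or has_significant_descendant(child, threshold)
        for child in node.children
    )


def dominant_pass(roots: List[Node], threshold: float) -> List[str]:
    """Emit one symbol (P, N, IZ or ZTR) per visited coefficient.

    Coefficients are visited coarse to fine (breadth-first, an assumed order);
    descendants of a zerotree root are never visited.
    """
    symbols = []
    queue = list(roots)
    while queue:
        node = queue.pop(0)
        if abs(node.value) > threshold:
            symbols.append("P" if node.value > 0 else "N")
            queue.extend(node.children)
        elif has_significant_descendant(node, threshold):
            symbols.append("IZ")  # insignificant, but some descendant is not
            queue.extend(node.children)
        else:
            symbols.append("ZTR")  # whole subtree insignificant: skip it
    return symbols


# Toy data (made up): two coarse coefficients with a few descendants.
tree = Node(3.0, [Node(40.0, [Node(-2.0), Node(1.0)]),
                  Node(-5.0, [Node(4.0), Node(-1.0)])])
other = Node(-50.0)
print(dominant_pass([tree, other], threshold=32.0))
```

In this toy case the subtree under the coefficient -5.0 costs a single ZTR symbol, which is where the scheme saves bits.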
The same procedure is performed in all scales in a prefixed order (given in Figure 2) until the dominant list is completed. The same scheme is then repeated iteratively, reducing the threshold at each iteration, so that the values of the coefficients are successively approximated.

Figure 1: Parent-child dependencies of sub-bands. The arrows point from parents to children. This scheme can be extended to a larger number of sub-bands.

Figure 2: Scanning order of the sub-bands for encoding a significance map. The lower frequency sub-band is at the top left and the higher frequency sub-band at the bottom right. This scheme can be extended to a larger number of sub-bands.

## METHODOLOGY

The proposed procedure has four steps:

i) Obtaining the Wavelet Transform of the image.

ii) Each sub-band in the transform is independently coded using vector quantization, except for the lowest frequency band, on which scalar quantization to eight bits is used. In this preliminary work, all sub-bands are coded using a 256-vector codebook, which gives the same compression for the three sub-bands of each scale.
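The vector quantization of a sub-band in step ii) can be sketched as a nearest-codeword search. This is only an illustration under stated assumptions: the function names are hypothetical, the 2x2 block shape is the one reported for the first experiment, the squared Euclidean distance is an assumed distortion measure, and the three-vector codebook is a toy stand-in for a trained 256-vector codebook. Codebook training is not shown.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def blocks_2x2(band: List[List[float]]) -> List[List[float]]:
    """Split a sub-band with even height and width into 2x2 blocks, row-major."""
    rows, cols = len(band), len(band[0])
    return [
        [band[r][c], band[r][c + 1], band[r + 1][c], band[r + 1][c + 1]]
        for r in range(0, rows, 2)
        for c in range(0, cols, 2)
    ]


def vq_encode(band: List[List[float]], codebook: List[Vector]) -> List[int]:
    """Replace each 2x2 block by the index of its nearest codebook vector."""
    return [
        min(range(len(codebook)), key=lambda i: squared_distance(block, codebook[i]))
        for block in blocks_2x2(band)
    ]


def vq_decode(indexes: List[int], codebook: List[Vector]) -> List[List[float]]:
    """Look each index up in the codebook to recover the approximated blocks."""
    return [list(codebook[i]) for i in indexes]


# Toy codebook and sub-band (made-up values, not trained data).
codebook = [[0.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0], [-9.0, -9.0, -9.0, -9.0]]
band = [[0.1, -0.2, 9.0, 8.5],
        [0.0, 0.3, 10.0, 9.5]]
indexes = vq_encode(band, codebook)
print(indexes, vq_decode(indexes, codebook))
```

With a 256-vector codebook each index fits in 8 bits, so a four-element block costs 2 bits per coefficient before any further elimination or lossless coding.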

iii) The VQ-coded sub-bands are subjected to an information elimination procedure based on the EZW. We define a significant vector as one whose distance to the origin is greater than a selectable threshold, T. Sub-bands are scanned as in Figure 2, maintaining two lists: the significance map (SM) and the significant vectors (SV). The SM has positional information for each sub-band. For the lowest frequency sub-bands, the SM has a one-to-one correspondence with the sub-band vectors, with three possible values: significant vector, isolated zero (less than T but with significant descendants) or zerotree root (this vector and all its descendants are less than T). For higher frequency sub-bands, the SM includes only those vectors that do not belong to a zerotree. Vectors in the highest frequency sub-bands have no children and can only be significant or non-significant, requiring only two symbols. When a vector in any scale is significant, its index is added to the significant vector list. Only one pass is made through the image, as the saved vector indexes represent the full precision of the significant vectors. In this procedure, T is fixed and controls the quality of the reconstructed image. A file is generated which contains both the SM and SV lists and the scalar-coded low frequency image.

iv) Finally, this file is further compressed by arithmetic coding, using the standard application "zip".

Although our procedure sacrifices the embedded encoding properties of Shapiro's algorithm, it still maintains the good properties of the WT for a progressive transmission scheme.

## IV. RESULTS

Two sets of experiments were carried out, using the length-8 Daubechies wavelet.

In the first set, an image (a thorax radiograph) of size 512x512 was transformed to three scales, and each sub-band was vector quantized using codebooks trained over ten similarly transformed radiographs; the quantization for all sub-bands uses 8-bit indexes to represent vectors of four elements (2x2 blocks). The threshold is a fraction of the greatest vector magnitude found in all sub-bands, and varies between 1/64 and 1/16 of this value. Figure 3 shows the original image and the reconstructed one for a threshold of 1/64. Figure 4 shows an amplified view of a section of the original image and the reconstructed images for thresholds 1/64 and 1/16. The results are summarized in Table I. The compression rates range from 30:1, with very good subjective quality, to 53:1, with acceptable quality. As is frequently found, the SNR is a poor guide to the visual quality of the reconstructed image.

[Table I: threshold, compression rate and SNR (db) for Experiment 1.]

Figure 3.- Experiment 1. Results for the 512x512 image. (a) Original image. (b) Reconstructed image, threshold 1/64, compression 30:1.
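The single-pass elimination of step iii), with the threshold set as a fraction of the greatest vector magnitude as in the experiments, might be sketched as follows. The `VNode` class, the symbol names (`SIG`, `IZ`, `ZTR`, `NS`), the breadth-first order standing in for the scan of Figure 2, and the toy vectors are assumptions for illustration only; the fraction 1/4 is chosen to keep the toy example small and is not one of the values used in the experiments.

```python
import math
from dataclasses import dataclass, field
from typing import Iterator, List, Sequence, Tuple


@dataclass
class VNode:
    """Hypothetical container: a VQ index, its codebook vector and its children."""
    index: int
    vector: Sequence[float]
    children: List["VNode"] = field(default_factory=list)


def norm(v: Sequence[float]) -> float:
    """Distance of a vector to the origin."""
    return math.sqrt(sum(x * x for x in v))


def all_nodes(node: VNode) -> Iterator[VNode]:
    yield node
    for child in node.children:
        yield from all_nodes(child)


def significant_below(node: VNode, t: float) -> bool:
    """True if any descendant vector lies farther than t from the origin."""
    return any(norm(c.vector) > t or significant_below(c, t) for c in node.children)


def significance_pass(roots: List[VNode], fraction: float) -> Tuple[List[str], List[int], float]:
    """Return the significance map (SM), the significant vector list (SV) and T."""
    t = fraction * max(norm(n.vector) for r in roots for n in all_nodes(r))
    sm, sv = [], []
    queue = list(roots)
    while queue:
        node = queue.pop(0)
        if norm(node.vector) > t:
            sm.append("SIG")
            sv.append(node.index)  # the VQ index carries the full precision
            queue.extend(node.children)
        elif not node.children:
            sm.append("NS")  # highest frequency sub-band: two symbols only
        elif significant_below(node, t):
            sm.append("IZ")
            queue.extend(node.children)
        else:
            sm.append("ZTR")  # descendants are dropped from the SM
    return sm, sv, t


# Toy trees (made-up indexes and vectors).
roots = [
    VNode(7, (8.0, 6.0, 0.0, 0.0), [VNode(3, (0.1, 0.0, 0.0, 0.0)),
                                    VNode(12, (3.0, 4.0, 0.0, 0.0))]),
    VNode(1, (0.5, 0.0, 0.0, 0.0), [VNode(2, (0.2, 0.0, 0.0, 0.0))]),
    VNode(4, (0.3, 0.0, 0.0, 0.0), [VNode(9, (6.0, 0.0, 0.0, 0.0))]),
]
print(significance_pass(roots, fraction=1 / 4))
```

Raising the fraction turns more vectors into zerotree roots or non-significant entries, which shortens both lists at the cost of reconstruction quality.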

Figure 4.- Experiment 1. Amplified 256x256 section. (a) Original image. (b) Reconstructed image, threshold 1/64, compression 30:1. (c) Reconstructed image, threshold 1/16, compression 53:1.

The second set of experiments was similar, but used a bigger image, 864x864 pixels, transformed to four scales. We expected better compression rates, as the low frequency band (which is compressed only arithmetically) now represents a lower percentage of the total data. However, the magnitude of the coefficients in the WT rises with each new scale added; this introduces greater errors in the quantization of the low frequency band, and more high frequency information is discarded as non-significant, with an overall loss in visual quality. The threshold varies between 1/256 and 1/64 of the maximum vector magnitude. Figure 5 shows about 80% of the images, both the original and the one reconstructed with threshold 1/90. Figure 6 shows an amplified section of the original and three reconstructed images, compressed with different thresholds. Numerical results are summarized in Table II. The compression rate goes from 37:1 (good quality) to 100:1 (acceptable).

[Table II: numerical results for Experiment 2.]

## V. CONCLUSIONS

The results presented here compare very favorably against those obtained at the ETSIT-UPM in another project, using VQ over the Discrete Cosine Transform of the image. In that work, a compression rate of 14:1 gave acceptable quality of the reconstructed images, with SNR around 40.8 db and notable "block" artifacts.

A test run using Shapiro's algorithm gave slightly better results at lower compression rates (for the 512x512 image at 30:1, 45.7 db, and for the 864x864 one at 37:1, 46.3 db, although subjectively the differences are almost unnoticeable), but did not perform as well as the proposed method at higher rates (for the 512x512 image at 53:1, 43.4 db, and for the 864x864 one at 100:1, 42.1 db, with notably lower visual quality). This is probably due to the SNR of the reconstructed image being limited by the quality of the VQ step when few vectors are discarded. This suggests that fine tuning of the VQ step can give better performance for the proposed method in every case.

Many parameters in the proposed method can be changed, for example the wavelet type, the filter length, the threshold magnitude for each sub-band, and the definition of significant vector. Especially critical, as discussed in [3], is the number of bits assigned to each sub-band in the VQ step. Finding an optimal combination of these parameters is the subject of our present research, as is the medical validation of the results. Also of interest is the application of this method to normal black-and-white and color photographs.

Figure 5.- Experiment 2. [...]x768 pixel section of the images. (a) Original image. (b) Reconstructed image, threshold 1/90, compression 78:1.

## ACKNOWLEDGMENT

This work was supported in part by the project BID-CONICIT E-18 (New Technologies Program), and by Universidad Simon Bolivar.

## REFERENCES

[1] Jayant, N., Speech and image coding, special issue of IEEE J. Select. Areas Comm., SAC-10(5).

[2] Vetterli, M., Herley, C., "Wavelets and filter banks: Theory and Design", IEEE Trans. Signal Processing, Vol. 40, No. 9, Sep.

[3] Antonini, M., Barlaud, M., Mathieu, P., Daubechies, I., "Image Coding Using Wavelet Transform", IEEE Trans. Image Proc., Vol. 1, Apr.

[4] Shapiro, J. M., "Embedded Image Coding Using Zerotrees of Wavelet Coefficients", IEEE Trans. Signal Proc., Vol. 41, No. 12, Dec.

[5] Field, D. J., "Scale invariance and self-similar wavelet transform: an analysis of natural images and mammalian visual systems", in "Wavelets, Fractals and Fourier Transform", D. M. Farge, J. C. R. Hunt and J. C. Vassilicos, Clarendon Press, Oxford.

Figure 6.- Experiment 2. Amplified 256x256 section. (a) Original image. (b) Reconstructed image, threshold 1/256, compression 37:1. (c) Reconstructed image, threshold 1/90, compression 78:1. (d) Reconstructed image, threshold 1/64, compression 100:1.

