DEPARTMENT OF ELECTRICAL ENGINEERING SIGNAL PROCESSING LABORATORY. Professor Murat Kunt - Director LTS-DE-EPFL CH-1015 LAUSANNE. Oce:

Size: px

Start display at page:

Download "DEPARTMENT OF ELECTRICAL ENGINEERING SIGNAL PROCESSING LABORATORY. Professor Murat Kunt - Director LTS-DE-EPFL CH-1015 LAUSANNE. Oce:"

Dale Whitehead
5 years ago
Views:

DEPARTMENT OF ELECTRICAL ENGINEERING SIGNAL PROCESSING LABORATORY Professor Murat Kunt - Director LTS-DE-EPFL CH-1015 LAUSANNE Oce: +41-1-693-66 Secretariat:

1 DEPARTMENT OF ELECTRICAL ENGINEERING SIGNAL PROCESSING LABORATORY Professor Murat Kunt - Director LTS-DE-EPFL CH-1015 LAUSANNE Oce: Secretariat: Fax: kunt@epfl.ch Subband Image Coding An Overview O. Egger, V. Vaerman and T. Ebrahimi Technical Report LTS Lausanne, June 0, 1997

2 Contents 1 Introduction Generalities : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 3 1. Statement of the Problem : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Investigated Approaches : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Organization of the Report : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 4 Linear Subband Decompositions 5.1 Introduction : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 5. Linear Subband Decomposition : : : : : : : : : : : : : : : : : : : : : : : : : : : : 7..1 Analysis/Synthesis Block : : : : : : : : : : : : : : : : : : : : : : : : : : : 7.. Asymmetrical Filter Bank : : : : : : : : : : : : : : : : : : : : : : : : : : : 8.3 Subband Quantization Noise : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Quantization Noise in the Low-Frequency Subband : : : : : : : : : : : : : Roberts Pseudo-Random Noise Technique : : : : : : : : : : : : : : : : : : Quantization Noise Filtering : : : : : : : : : : : : : : : : : : : : : : : : : Noise Removal in the Low-Frequency Subband : : : : : : : : : : : : : : : Quantization Noise in the High-Frequency Subbands : : : : : : : : : : : : 19.4 Zerotree Successive-Approximation Quantization : : : : : : : : : : : : : : : : : :.5 Conclusions : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 5 3 Morphological Subband Decompositions Introduction : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 6 3. Multiresolution Decomposition : : : : : : : : : : : : : : : : : : : : : : : : : : : : Pyramidal Decomposition : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Morphological Subband Decomposition : : : : : : : : : : : : : : : : : : : : : : : : Introduction : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Coding Gain : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Design of the Morphological Subband Filters : : : : : : : : : : : : : : : : Context-Adaptive Subband Decomposition : : : : : : : : : : : : : : : : : : : : : Decomposition using Linear Filters : : : : : : : : : : : : : : : : : : : : : : Context-Adaptive Subband Decomposition : : : : : : : : : : : : : : : : : New Subband Decomposition : : : : : : : : : : : : : : : : : : : : : : : : : Texture Detection : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Embedded Zerotree Lossless Algorithm : : : : : : : : : : : : : : : : : : : : : : : : Lossless Decomposition : : : : : : : : : : : : : : : : : : : : : : : : : : : : Lossless Successive-Approximation Quantization : : : : : : : : : : : : : : Conclusions : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 47 4 Summary 49

3 1 Introduction 1.1 Generalities This paper aims to provide the reader with the technical description of state-of-the-art subband image coding techniques, developed at the Signal Processing Laboratory of the Swiss Federal Institute of Technology, Lausanne, Switzerland. 1. Statement of the Problem Every digital image acquisition system produces pictures in its canonical form. This means that the analog scene is sampled in space, and quantized in brightness. If the sampling step size is small enough, the integration ability of the human visual system will give the illusion of a continuous picture to the human observer. In that sense, a digital image is an N 1 N array of integer numbers. However, this canonic form needs a large number of bits for its representation. For example, a picture using 8 bits per pixel needs half a million bits for its representation. Image data compression aims at minimizing the number of bits required to represent an image. Data compression has a wide area of applications. Video telephony between two speakers or teleconferencing on two PCs via the normal twisted-pair phone lines is only possible with very low bit rate compression systems. As well, digital image transmission between a satellite and the earth will require compression techniques providing an ecient and relatively fast communication. Nearly all multimedia applications such as interactive databases (encyclopedias, electronic newspaper, travel information, and so on) need a strong compression of the huge input data consisting of text, audio and visual information. Some other applications are in remote sensing, education, and entertainment. Some applications allow for visible distortions of the input images in exchange for high compression ratios. This is typically the case for applications such as teleconferencing or accessing images from a distant server (e.g. applications related to the World Wide Web). Such coding schemes are called lossy. In other applications, however, no distortion of the input image is tolerated. Compression of medical images is a typical example of this category. Coding schemes that introduce no distortion are termed lossless. In the past, most of the eort has been dedicated to improve the compression ratios of the compression scheme. Functionalities of the compression schemes were considered as less important. Second-generation image coding techniques [15] attempt to split an image into visual primitives. An important functionality for compression schemes is the possibility of progressive transmission or progressive decoding of the bitstream. A typical application is data browsing. A picture has been compressed at a good quality (perhaps even lossless) and the user wants to visualize the picture at a lower quality to save transmission time. This is only possible if the stored bitstream is embedded in the sense that one can reconstruct a coarse version of the image using only a portion of the entire bitstream. This paper deals with the problem of coding digital images with subband-based techniques at any target bitrate. The problems of high compression coding and of lossless image compression 3

4 are addressed. 1.3 Investigated Approaches Most standards dealing with compression of visual information are based on block processing. Typically, transform coding based on the discrete cosine transform (DCT) is applied. The major artifact occuring at high compression ratios with such an approach is the blocking eect. The human eye is very sensitive to this distortion. Another type of distortion is due to the Gibbs phenomenon of linear lters and is called ringing eect. All known subband image coders suer from this distortion. This eect is most visible around high-contrast contours and is also very annoying. One approach presented in this paper attempts to minimize the ringing eect in a subband coding scheme by designing a new family of lter banks. The possibility of minimizing the quantization noise in the subbands is explored. The reason for the ringing eect is the Gibbs phenomenon that aects all linear lters. Known lters which do not suer from the Gibbs phenomenon are morphological lters. It is clear therefore, that a morphological subband decomposition would not suer from the annoying ringing artifact. The design of a morphological subband decomposition lead to a third category of artifacts: morphological lters are very poor at representing textured regions in an image. The possibility of a hybrid linear/morphological decomposition is explored. 1.4 Organization of the Report Section discusses linear subband coding. The general framework and the theoretical foundations are discussed. Existing linear lter banks are reviewed and their advantages and their drawbacks are discussed. A new family of two-band lter banks is then proposed. It aims at reducing as eectively as possible the ringing eect inherent to all linear lter banks. The quantization of the subbands is discussed and an ecient noise reduction system is proposed. An implementation of a complete subband coding scheme called embedded zerotree wavelet (EZW) algorithm is described and its parameterization is proposed. The possibility of using morphological lters for subband decomposition is then discussed in Sec. 3. The aim of using morphological lters instead of linear lters is to completely avoid the annoying ringing eect. A novel subband decomposition scheme based only on morphological lters is proposed. In order to overcome the problem of the bad representation of textured regions with morphological lters, an adaptive decomposition is proposed. This scheme uses a linear lter for textured regions and morphological lters for edges and homogeneous regions. A suitable texture detection algorithm for that purpose is proposed. Another advantage of using morphological lters instead of linear lters is highlighted. A possibility to use this advantage for the denition of a progressive lossless coding scheme is also described. Finally, Sec. 4 summarizes the proposed subband coding schemes, depending on the desired features: lossy or lossless compression. 4

5 Linear Subband Decompositions.1 Introduction In recent years, subband coding (SBC) of images has become a domain of intensive research. In such a scheme, subband transforms are computed by ltering the input image with a set of bandpass lters and decimating the results, thus obtaining the subband images, each representing a particular portion of the frequency spectrum. The power of this technique resides in its capability to code each subband separately with a bit rate that matches the visual importance of that subband. Subband coding leads to visually pleasing image reconstruction and does not produce blocking artifacts. In addition, it allows a progressive multiresolution transmission. Subband-based coding can be divided into three steps: 1. subband decomposition,. quantization, 3. and entropy coding of the subbands. The decoding process involves the inverse of each of these steps, in order to reconstruct the desired signal. The concept of subband decomposition was introduced rst by Crochiere [5] in the context of speech coding. Then Smith and Barnwell [6] solved the problem of perfect reconstruction lter banks for a one-dimensional multi-rate system. Following these studies, substantial research eort has been devoted to perfect reconstruction lter banks theory [9, 30, 33], which was then extended to the two-dimensional (-D) case by Vetterli [3]. Applications of subband coding to images was introduced by Woods and O'Neil [35] by means of -D separable quadrature mirror lter (QMF) banks. Several methods have been proposed in the last decade for designing lter banks. The most well-known lters are the QMFs introduced by Johnston [1]. They are two-band lter banks. The design process is based on the minimization of a weighted sum of the reconstruction error and the stop band energy of each lter. QMFs are not perfect reconstruction lters, but they have a linear phase. An alternative solution is the two-band lters proposed by Smith and Barnwell [6], called conjugate quadrature lters (CQF), which allow for perfect reconstruction but have a nonlinear phase. Vaidyanathan et al. [30] solved the design problem of M-band perfect reconstruction lter banks. They use perfect reconstruction blocks based on lossless polyphase matrices and optimize adequate parameters. Nayebi et al. [18] have developed a framework based on temporal analysis. A design technique leading to numerically perfect reconstruction lter banks has been developed as well [19]. This technique is very exible and addresses the inclusion of additional constraints for specic applications such as low delay, linear phase and so on. Although the above techniques exist for designing M-band and/or perfect reconstruction lter banks, the QMFs designed by Johnston are still the most cited and utilized in the image coding community. This is partly due to the simplicity of the design technique and the published tables of lter coecients. On the other hand, lters proposed in [31] and in [19] are relatively long, and hence, are not suitable for image coding applications. 5

6 Image coding applications require lter banks with specic features that dier from the classical perfect reconstruction problem or from the design of lter banks for other applications. Psycho-visual properties of the human visual system, and typical spectral characteristics of natural images have to be taken into consideration. Images are highly non-stationary sources. In general, they are composed of large homogeneous regions and edges which have a small spatial support. Typical natural images have a highly asymmetrical power spectrum with respect to. A good model is a spectrum proportional to f? [7], with f being the frequency. Filter banks that take into account the statistics of natural images and reduce the ringing eect have been proposed by Caglar et al. [1, 3]. One advantage of SBC of images over block transform coding such as the discrete cosine transform (DCT) used in the JPEG standard [11] is the absence of the blocking eect. However, one major artifact is still remaining. It is the ringing eect which occurs around high-contrast edges due to the Gibbs phenomenon of linear lters. This artifact can be reduced or even removed by an appropriate design of the lter bank. The second artifact introduced by SBC appears at high compression ratios and is due to the coarse quantization of the low-frequency subband. It is noticed as blurred false contours in the reconstructed image. In this section, a family of lter banks called asymmetrical lter banks (AFB) is proposed. The analysis lowpass lter is long, while its highpass counterpart is very short, thus matching the statistics and the non-stationarity of natural images. Maximum regularity is also imposed, the importance of which is well recognized in [1]. At the analysis stage, regularity allows for a better signal representation by their wavelet coecients, whereas at the synthesis stage regularity leads to a smooth perturbation. The AFBs are almost free of the \ringing eect" which occurs systematically in reconstructed images using QMFs. Two techniques are proposed to reduce the quantization noise in subband image coding. In the rst one, an adaptive Wiener ltering is applied for the low-frequency subband. In the second one, the optimal quantization reconstruction levels are estimated for all the highfrequency subbands. For natural images most of the energy of the decomposed image is contained in the lowfrequency subband. Moreover, this subband has nearly the same statistics as the original image. Based on this observation a noise reduction lter is developed. The proposed lter is a Wiener lter with adaptive directional support. It has the advantage of ltering out the noise without blurring the edges. This approach completely eliminates false contours and augments the quality of the reconstructed image. Furthermore, this process is independent of the employed lter bank. The quantization strategy of the high-frequency subbands is important for the coding performance. This boils down to the problem of entropy-constrained quantization for which there exists no optimal solution in general. However, it has been shown that for a Laplacian source, the optimal entropy-constrained quantizer is a uniform quantizer []. Moreover, experience shows that, for most natural sources, uniform quantization is quasi-optimal. In this section, it is proposed to estimate the optimal reconstruction levels as they do not coincide in general with the mid-points of the quantization decision levels. For compression purposes it is important to exploit the existing zero-correlation across the subbands. One powerful approach is the embedded zerotree wavelet (EZW) algorithm proposed 6

7 X(z) H 1 (z) G 1 (z) ^X(z) H (z) G (z) Figure 1: Two-band analysis/synthesis system. H 1 (z) is the lowpass analysis lter, H (z) is the highpass analysis lter, G 1 (z) is the lowpass synthesis lter and G (z) is the highpass synthesis lter. in [5]. The EZW algorithm is based on successive-approximation quantization which leads to a completely embedded bitstream. That is, the bitstream of the compressed image can be progressively decoded. This feature allows for important functionalities such as image browsing for example. The proposed techniques for noise reduction can be incorporated in the EZW algorithm. The outline of this section is as follows. In Sec.. the theory of the subband transform is explained. In Sec..3 a detailed analysis of the subband quantization noise is given and approaches to reduce it are proposed. The EZW algorithm is described in Sec..4. Finally, Sec..5 concludes this section.. Linear Subband Decomposition..1 Analysis/Synthesis Block Subband decomposition divides the input signal into dierent subbands. The choice of the lters is an important issue. It is shown in [3, 6, 16] that the lters represent an important factor in the performance of the decomposition for compression purposes. Figure 1 shows a two-band lter bank. The lters H 1 (z) and H (z) are the analysis lowpass and highpass lters respectively, while G 1 (z) and G (z) are the synthesis lters. In this system, the input/output relationship is given by where and ^X(z) = X(z)T (z) + X(?z)A(z) (1) T (z) = 1 [H 1(z)G 1 (z) + H (z)g (z)] () A(z) = 1 [H 1(?z)G 1 (z) + H (?z)g (z)] : (3) Perfect reconstruction can be achieved by removing the aliasing distortion, A(z), and imposing the transfer function, T (z), to be a pure delay of the form T (z) = z?, where is the delay of the system. By choosing the synthesis lters as G 1 (z) = H (?z) and G (z) =?H 1 (?z) the aliasing component is removed. Under these constraints the system transfer function becomes T (z) = F (z)? F (?z) (4) 7

8 where F (z) = H 1 (z)h (?z) is the product lter. Perfect reconstruction is obtained when the product lter, f(k) = Z?1 [F (z)], Z?1 () being the inverse z-transform, is a powercomplementary half-band lter. This means that every odd sample, except one sample f(), is equal to zero, that is f(n + 1) = ( 1 n + 1 = 0 otherwise n = 0; 1; ; :::; L? 1 (5) where L is the length of the product lter F (z) and the delay of the system... Asymmetrical Filter Bank In image processing it is important to have linear phase analysis and synthesis lters. If the analysis lters H 1 (z) and H (z) have a linear phase, then F (z) has a linear phase, too. An impulse response of a linear phase lter is either symmetric or antisymmetric. From Eq. (5), it is clear that the product lter f(k) is symmetric and that it has an odd number of samples L. Furthermore, the sample on the symmetry axis of the lter is = L? 1 ; (6) where is the delay of the system. The classical lters QMF [1] and CQF [6] constrain H 1 (z) and H (?z) to have the same magnitude response. If this constrain is used for image compression, the following questions arise: A good frequency partition is the wavelet decomposition. This partition is obtained using a highly asymmetrical tree. Therefore, why constrain the lowpass and highpass magnitude responses to be mirrors of each other? High-frequency information is represented by edges in a natural image. They have a very limited spatial support. On the other hand, low-frequency information comes from homogeneous regions which have a large spatial extension. Therefore, why use equal length lters to lter two completely dierent structures? Consequently, we propose to use a very short highpass analysis lter and a longer lowpass counterpart. The proposed lters have a linear phase, are maximally regular, and are of unequal length. Notice that equal length lters allow a polyphase implementation of the lter bank which divides the computational complexity roughly by a factor of two. This will not be possible with the proposed lter banks. However, we will see that the proposed lters are very short and, hence, are computationally less expensive than a standard QMF lter bank even though no polyphase implementation is possible. Due to the linearity of the phase and the half-band property, the length of the lowpass product lter F (z) must satisfy L = 4n? 1 n = 1; ; 3; ::: (7) 8

9 Let N i be the length of H i (z). Since L = N 1 + N? 1, we have N 1 = 4n? N (8) with n = d N 4 e; d N e + 1; ::: 4 It is shown in [] that a lowpass lter is regular if its Z-transform has at least one zero at z =?1. In order to make the product lter maximally regular, we impose a maximum number of zeros at z =?1 on the Z-transform of the product lter. Since the half-band constraint of a perfect reconstruction product lter allows only L+1 degrees of freedom, F (z) can be represented as F (z) = (z?1 + 1) L+1 Q(z) (9) where Q(z) can be computed by resolving a system of L+1 linear equations which impose the 4 half-band property and the linear phase of F (z). If the impulse response of Q(z) is denoted by q(k) and p(k) is the impulse response of the binomial lter (z?1 + 1) L+1, then, the impulse response of the product lter f(k) is given by the following convolution f(k) = p(k) q(k): (10) Equation (5) can be written in terms of the coecients of p(k) and q(k) as follows where P is the matrix P = 0 B P 0 q(0) q(1). q L?3 4 1 C A = C A p(1) p(0) p(3) p() p(1) p(0) 0 0 p. L?1 p L?3 1 C C (11).. A : (1) p() p(1) Using Eq. (11) the rst L+1 coecients of q(k) can be computed. The remaining coecients 4 are determined by noting that Q(z) is a symmetric linear phase signal, and hence L? 3 q(k) = q? k k = L ; L By spectral decomposition of F (z) the lowpass analysis lter becomes and its highpass counterpart H (z) becomes ; :::; L? 3 : (13) H 1 (z) = (z?1 + 1) N 1?N?1 Q 1 (z) (14) H (z) = (z?1? 1) N?1 Q (z?1 ) (15) 9

10 db normalized frequency Figure : Frequency response of the analysis lters with N 1 = 1 et N = 4. where Q 1 (z)q (z) = Q(z). We have experimentally determined that the best factorization of Q(z) is Q 1 (z) = Q(z) and Q (z) = 1. For such a case, the highpass analysis lter is constrained to be short and the term Q(z) is not factorized at all and is therefore attributed to H 1 (z). A consequence of this factorization is the linearity of the phase of all subband lters. The best coding results are obtained by imposing N 1 = 3 N. Figure shows the frequency response for lters of length N 1 = 1 and N = 4. The coecients of two lter banks are given in Tab. 1 and Tab. for the 1 4 and 9 3 cases. Note that the symmetric short kernel lters (SSKF) [8] are a special case of the proposed lters with N 1 = 5 and N = 3, respectively N 1 = 4 and N = 4..3 Subband Quantization Noise In order to compress the image data, the subband signals have to be quantized. To perform an optimal quantization, the type of quantization (uniform, non-uniform) and the number of the quantization levels in the dierent subbands must be dened. Let X be a real random variable with probability density function (pdf) p X (). The cumulative distribution function (cdf) F () is then F (x) = Z x?1 Let us recall the following denition of a quantizer []. p X () d: (16) Denition 1 A device with input X and output Y = Q(X) is called a quantizer if a i?1 < X a i implies Y = i. The a i 's are called the decision levels (quantization thresholds) and the i 's are called the reconstruction levels. The probability p i that the output Y is i depends on the cumulative distribution of X by p i = P (Y = i ) = F (a i )? F (a i?1 ): (17) 10

11 1-tap h 1 (k) e e e e e e e e e e e e-03 4-tap h (k) 1.5e e e e-01 Table 1: Unequal length analysis lters with N 1 = 1 and N = 4. 9-tap h 1 (k).34375e e e e e e e e e-0 3-tap h (k).5e e-01.5e-01 Table : Unequal length analysis lters with N 1 = 9 and N = 3. 11

12 Hence, the knowledge of F () allows us to compute the zeroth-order entropy rate, R, of the output: R =? X i p i log p i (18) and the average distortion, D r, as follows: D r = E (j Y? X j) r = X i Z ai a i?1 j x? i j r p X (x) dx; (19) where r is a positive integer. After quantization a source coding algorithm is applied. The smaller the entropy of the data, the more ecient the source coding algorithm. In this work, any quantizer +that minimizes D r for a xed entropy rate R is considered as an optimal quantizer. It can be observed from Eq. (18) that the entropy rate R does not depend on the reconstruction levels i. Therefore, they can be chosen to minimize the distortion D r. It can be shown [] that the optimal i 's are uniquely specied by a i?1, a i and r as follows: Z i a i?1 For r = 1, Eq. (0) reduces to ( i? x) r?1 p X (x) dx = Z ai i (x? i ) r?1 p X (x) dx: (0) or equivalently, Z i a i?1 p X (x) dx = Z ai i p X (x) dx; (1) E(X < i j a i?1 < X a i ) = 1 : () Equation () denes the median point of the interval. For r =, Eq. (0) implies which is the same as i = R ai a i?1 xp X(x) dx R ai a i?1 p X(x) dx ; (3) i = E(X j a i?1 < X a i ): (4) Equation (4) denes the expected value of the interval, which, in general is not the same as the median point. The design of a quantizer in the minimum mean square error (MMSE) sense reduces to choosing the thresholds a i that minimize D under the constraints of Eq. (4). This yields a set of nonlinear equations which are dicult to solve in general. The simplest quantizer, the uniform quantizer, is dened as follows. Denition A quantizer is said to be uniform if the separation between consecutive decision levels is a constant value, hence a i? a i?1 = ; 8i: (5) 1

13 amplitude Figure 3: Histogram of the low-frequency subband of the picture \Lena". A very surprising fact is that this simple quantizer is optimal or quasi-optimal in most of the cases. In fact, it is optimal if the input X has a Laplacian or exponential pdf. It has been demonstrated that, in other cases, the optimal quantizer performs only negligibly better than the uniform quantizer []. In the case of subband coding schemes, two types of subbands have to be distinguished: The low-frequency subband. It has nearly the same statistics as the scaled original image. Hence, it can be modeled as a natural image. All the other subbands. They contain mostly contour and texture information. The statistics are strongly dierent from the low-frequency subband and the original image. In the next sections an analysis of the quantization noise in these two dierent types of subbands is proposed..3.1 Quantization Noise in the Low-Frequency Subband The pdf of the samples in the low-frequency subband is not computable. Figure 3 shows the histogram of a typical low-frequency subband. One can observe that the samples do not follows a known distribution. In order to solve analytically Eq. (4) we would need to know the true pdf of X. In the absence of such information, clearly our best estimation for any type of input data is given by i = E(X j a i?1 < X a i ) = a i?1 + a i : (6) This denes the simplest possible quantizer where the quantization thresholds and the reconstruction levels are uniformly separated. This uniform quantization is guaranteed to be optimal under the entropy constraint, assuming an approximatively uniformly distributed statistics of this subband. However, from a visual point of view, this approach is not satisfactory. In fact, by quantizing the low-frequency subband uniformly we introduce signal-dependent quantization noise which 13

14 x(k 1 ; k ) y(k 1 ; k ) Q() Q?1 () ^x(k 1 ; k ) r(k 1 ; k ) r(k 1 ; k ) Figure 4: Flow diagram of the Roberts pseudo-random noise technique. r(k 1 ; k ) is the known pseudorandom noise and the operator Q() denotes quantization. gives articial contours in the coarsely quantized image. This artifact is still present in the reconstructed image after the synthesis stage. Roberts pseudo-random noise technique [3] can be applied to this subband to avoid this problem..3. Roberts Pseudo-Random Noise Technique Let us denote x(k 1 ; k ) the image to be quantized. In our case the input image x(k 1 ; k ) is the low-frequency subband. Suppose that we have access to a known white noise sequence, r(k 1 ; k ), with uniform pdf r(k 1 ; k ) U? ; where is the quantization step and means that the signal r(k 1 ; k ) is a random process with a distribution U? ;. The quantization is performed on x(k 1 ; k ) + r(k 1 ; k ): (7) y(k 1 ; k ) = Q [x(k 1 ; k ) + r(k 1 ; k )] : (8) The inverse operation dequantizes y(k 1 ; k ) and subtracts r(k 1 ; k ) ^x(k 1 ; k ) = Q?1 [y(k 1 ; k )]? r(k 1 ; k ) (9) Figure 4 shows the ow diagram of this procedure. This quantization method does not decrease the mean square error (MSE), however, it signicantly improves the subjective quality of the quantized images. Moreover, the power spectrum of the quantization noise becomes white and, hence all articial contours are removed. Furthermore, having a known noise power spectral density allows us to apply a noise reduction technique such as Wiener ltering. This technique is described in the next section..3.3 Quantization Noise Filtering Let us assume the image x(k 1 ; k ) is degraded by additive signal-independent random noise n(k 1 ; k ): g(k 1 ; k ) = x(k 1 ; k ) + n(k 1 ; k ) (30) 14

15 x(k 1 ; k ) + n(k 1 ; k ) H(e j! 1 ; e j! ) ^x(k 1 ; k ) E [x(k 1 ; k )] E [x(k 1 ; k )] Figure 5: Flow diagram of the Wiener ltering. The lter H(e j!1 ; e j! ) denotes the Wiener lter and E [x(k 1 ; k )] is the average of the original input signal. with E [x(k 1 ; k )n(k 1 ; k )] = E [x(k 1 ; k )] E [n(k 1 ; k )] : (31) These two conditions are satised when Roberts pseudo-random noise technique is applied. It can be shown that the optimal linear MMSE estimate of x(k 1 ; k ) is given by ltering the degraded image with a Wiener lter [17], H(e j! 1 ; e j! ), whose frequency response is H(e j! 1 ; e j! ) = P x (e j! 1 ; e j! ) P x (e j! 1; e j! ) + Pn (e j! 1; e j! ) (3) where P x (e j! 1 ; e j! ) is the power spectrum of x(k 1; k ) and P n (e j! 1 ; e j! ) is the power spectrum of n(k 1 ; k ). Equation (3) is only valid under the assumption that the signals x(k 1 ; k ) and n(k 1 ; k ) are zero-mean. However, this is readily achieved by subtracting the estimated mean of x from the signal g(k 1 ; k ). The ow diagram of Wiener ltering is shown in Figure 5. This method gives us the best estimate in the MSE sense. However, image quality does not depend solely on the MSE. In fact the MSE does not take into account visual problems such as sensitivity of the human eye across the noise spectrum and image sharpness. The Wiener lter in its raw form is not adequate for image noise reduction because it blurs the ltered image. One major disadvantage of a classical Wiener lter dened by Eq. (3) is that a xed lter is used throughout the whole image. This is due to the fact that the Wiener lter has been designed under the assumption that x(k 1 ; k ) and n(k 1 ; k ) are stationary random processes. However, natural images are strongly non-stationary. Therefore, the Wiener lter should be adapted to the local image statistics. Assume the noise spectrum is white with a constant variance, n, throughout the whole image. The noise power spectrum is then P n (e j! 1 ; e j! ) = n : (33) Consider a small region of the input signal x(k 1 ; k ). The region is chosen small enough (for example 3 3) so that x(k 1 ; k ) may be considered stationary within this region. This portion of the input signal may be modeled as follows: x(k 1 ; k ) = x + x w(k 1 ; k ); (34) 15

16 where x is the local mean of the signal, x is the local variance of the signal, and w(k 1 ; k ) is a zero-mean white noise with unit variance. For this case the local Wiener lter reduces to H(e j! 1 ; e j! ) = x ; (35) x + n and, consequently, in the space domain h(k 1 ; k ) = x x + n (k 1 ; k ); (36) where (k 1 ; k ) is a function being zero everywhere except at the origin where it has a value of 1. Thus, the Wiener lter is just a scaled impulse. We can compute the processed image y(k 1 ; k ) with which becomes y(k 1 ; k ) = x (k 1 ; k ) + [g(k 1 ; k )? x (k 1 ; k )] h(k 1 ; k ); (37) y(k 1 ; k ) = n x + n x (k 1 ; k ) + x x + n g(k 1 ; k ): (38) In order to compute Eq. (38) we have to estimate the local signal mean x (k 1 ; k ) and the local signal variance x(k 1 ; k ). Observe that the expectation of the original signal x(k 1 ; k ) is identical to the expectation of the degraded image g(k 1 ; k ) since the noise is assumed to be zero-mean. Therefore, we can compute the estimate of x (k 1 ; k ) by ^ x (k 1 ; k ) = 1 N X (n 1 ;n )S g(n 1 ; n ) (39) where N is the number of pixels in the local region S. Assuming the noise is signal-independent, the variance of the degraded image g is and, therefore, the estimate of x (k 1; k ) becomes g = x + n (40) ^ x(k 1 ; k ) = ^ g(k 1 ; k )? n (41) where ^ g(k 1 ; k ) = 1 N X (n 1 ;n )S [g(n 1 ; n )? ^ x (k 1 ; k )] : (4) In the Eqs. (33-4) we assume that the noise power n is known. From the point of view of subjective image quality, the adaptive Wiener lter method performs better than the global Wiener lter. However, a slight blur can still be detected in the processed image. Variations of this method will give us the desired noise reduction method. 16

17 (a) (b) (c) (d) (e) Figure 6: The ve dierent regions of support of the proposed noise reduction lter Proposed method 60 Adaptive Wiener filtering Error variance Unprocessed subband Quantization step Figure 7: Comparison of the eciency of the noise reduction technique on the \Lena" image. Natural images have most of the energy concentrated in the lower frequencies and are highly non-stationary. For a human observer it is very important that the edges are well preserved. Therefore, we would like to design a Wiener type lter, that does not destroy edge information. Edges typically have a dual character: they are generally highpass in one direction and lowpass in the orthogonal direction. The basic idea of the proposed algorithm consists in the ltering of the noise along the lowpass direction. The proposed algorithm proceeds as follows: Algorithm 1 If the region does not contain any edges, use the N N -D lter. Otherwise, use the 1-D lter which has the region of support with the smallest signal variance. This approach augments the accuracy of the signal model dened in Eq. (34). Using this approach we reduce the noise especially in the low-frequency region without aecting the sharpness of the edges. The noise is also ltered out of the edge region. Figure 6 shows the ve dierent lter regions of support. This ltering method produces better results than the adaptive Wiener ltering method. This approach is more ecient in terms of error variance as well as image quality. Figure 7 shows a comparison of the eciency of the noise reduction technique on the \Lena" image. On can clearly observe the lower MSE of the proposed technique in comparison to a classical Wiener ltering and in comparison with the unprocessed subband..3.4 Noise Removal in the Low-Frequency Subband The low-frequency subband contains most of the energy and is more or less a scaled down version of the original image. Hence, this subband may be assumed to possess the typical properties 17

18 Signal variance Figure 8: Typical histogram of the estimated signal variance. The amplitude of the histogram has been normalized to give a maximum value of 1. of a natural image. First, we propose to quantize this subband using Roberts pseudo-random noise technique. By applying this quantization strategy we can model the noise n ll (k 1 ; k ) in this subband accurately by having a uniform pdf, that is, n ll (k 1 ; k ) U? ; where is the quantization step. The variance of this noise is given by ; (43) ll = 1 : (44) Furthermore, when using Roberts technique it is reasonable to assume this noise to be white. Hence, the noise power spectrum becomes P ll = ll : (45) After dequantization, the noise reduction system described before is applied with the noise variance estimated from Eq. (44). Whether a local region contains edge information or not is determined as follows. The variance of the N N local region is computed. A threshold is used to classify the local variance. Figure 8 shows a typical histogram of the estimated signal variance. It can be seen that there is a strong increase in the number of pixels when decreasing the signal variance. These pixels correspond to homogeneous regions. The other pixels belong to an edge region. Based on experimental results, the threshold must lie between 700 and 000. The optimal threshold varies very little from image to image. In our experiments, a threshold of 1000 is used. This approach has the advantage of reducing very eciently the noise in the low-frequency subband. This subband contains the information to which the human visual system is most sensitive. Moreover, the low-frequency subband is typically quite small (64 64), hence the computational load is quite low. 18

19 amplitude Figure 9: Typical histogram of high-frequency subbands. The proposed noise reduction lter may be applied to the entire reconstructed image as well. However, the eectiveness of reducing the noise after synthesis is not high enough to justify the additional computational load. This is due to the fact that, after synthesis, the noise is not white anymore, but signal-dependent and with an unknown variance..3.5 Quantization Noise in the High-Frequency Subbands The highpass subbands have a completely dierent statistics than their lowpass counterpart. In fact, no DC component comes through the highpass analysis lter, hence they have zero-mean values. Figure 9 shows the histogram of a typical highpass subband. One can clearly observe a high concentration of zero-valued pixels. The density of the histogram drops very fast as the amplitude of the coecient increases. In order to exploit the characteristics of these subbands, one possible approach is to model the subbands by a Laplacian pdf. That is, the values in these subbands are assumed to be a realization of a random process with pdf p X (x) From Eq. (47) we can compute the mean and the variance X = V ar(x) = Z 1 X L(); (46) p X (x) = e?jxj : (47) X = E(X) = 0 (48)?1 x p X (x) dx = 4 : (49) The parameter is estimated from the data. One way to estimate it is to perform the maximum likelihood estimation (MLE) which is dened as follows: 19

20 Denition 3 The maximum likelihood estimation ^ = ^(x 1 ; :::; x N ) maximizes the likelihood function L() = Q N i=1 f X(x i j ) treated as a function of [10]. The MLE has the following properties. Property 1 If there exists an ecient unbiased estimator, then when certain regularity conditions are satised, the maximum likelihood method of estimation will produce it. This property is very useful because it gives us an operational method for determining the best unbiased estimator. It suces to nd the MLE and then check whether it is unbiased and whether it is ecient. Let us compute the MLE of the parameter of a Laplacian pdf. Assume for the moment that we know the original data with samples x i. Hence, we can compute the likelihood function by L() = NY i=1 f X (x i j ) = N e? P N i=1 jx ij In order to compute the MLE of, ^, the following equation has to be solved Consequently hence, This estimator is unbiased, since L() j =^ = L() = N? X N j x i j= 0; ^ = i=1 N P N i=1 j x i j : (53) E(^) = : (54) Suppose a coecient has been quantized in an uncertainty interval dened by [q min ; q max ]. The reconstruction levels in this case can be computed from Eqs. (3), (47) and (53) to be as follows: e?^q min (qmin + 1 = )? e?^qmax (q max + 1 ) : (55) e?^q min? e?^qmax For simplicity purposes we assume q min and q max to be positive. Since the pdf is symmetric, one can compute negative reconstruction levels with Eq. (55) just by negating the result. The model of a Laplacian pdf for the highpass subband signals seems reasonable. However, it is not optimal. A more accurate model for the highpass subband signals is a generalization of the density function called the generalized Gaussian pdf [34] which is dened as follows: p X (x) = a(b; )e?jbxj ; (56) 0

21 with a(b; ) = b (57)?( 1 ); where?() is the gamma-function. Equation (56) is a generalization of the Gaussian density ( = ) and a generalization of the Laplacian density ( = 1). The pdf in Eq. (56) depends on two parameters, namely and b. It is dicult to nd the MLE for both parameters simultaneously. However, experimental results show that does not vary strongly from subband to subband. Indeed, is between 0:4 and 0:8. In the proposed method is xed to = 0:5 for all subbands. Assuming known, then the MLE of b can be computed. The likelihood function becomes or, L(b) = NY i=1 a(b; )e?b jx i j (58) L(b) = a(b; ) N e?b P N i=1 jx ij : (59) As before, it is easier to maximize the log-likelihood function which is log L(b) = N log a(b; )? b N X i=1 Setting the derivative with respect to b to zero, gives L(b) N b b?1 and, hence b = N # NX j x i j i=1 j x i j : (60) j b=^b= 0; (61) P N i=1 j x i j : (6) Equation (6) gives the best estimate of the parameter b with a xed when the exact data is known. Two approaches can be investigated: 1. The parameter b is estimated for each subband in the decomposition stage before quantization and is transmitted to the receiver.. b is estimated from the quantized data. The performance of the two approaches do not dier signicantly. Here, the rst approach is chosen for simplicity reasons. Indeed, for typical image sizes such as only very few dierent parameters have to be sent. The reconstruction levels can be computed using Eq. (3). Since there exists no analytical solution, a numerical integration method has been used based on ve sampling points per interval. Figure 10 shows the eectiveness of this estimation technique. It can be seen that the variance of the error is reduced using the proposed technique in comparison with setting the reconstruction levels in the middle of the two quantization thresholds. 1

22 Proposed method Error variance Proposed method Uniform reconstruction levels Error variance Uniform reconstruction levels Quantization step (a) Quantization step (b) Figure 10: Examples of the eectiveness of the non-uniform reconstruction levels on two dierent subbands of the picture \Lena"..4 Zerotree Successive-Approximation Quantization For compression purposes it is important to exploit the existing zero-correlation across the subbands. One powerful approach is the EZW algorithm proposed in [5]. This approach is based on four main blocks: 1. a hierarchical subband decomposition,. prediction of the absence of signicant information across scales using zerotrees, 3. entropy-coded successive-approximation quantization, 4. and lossless source coding via adaptive arithmetic coding. The rst subsystem is a subband decomposition of any type. All decompositions described in Sec.. can be used for that purpose. The second block is the heart of the EZW algorithm and goes as follows. For each subband a parent-children relationship is dened. The low-frequency subband is dened to be the parent of all other subbands. Then, a parent-children relationship is recursively dened for all subbands. An example of a parent-children relationship for the wavelet decomposition is shown in Fig. 11. After the subband transform, a scan is performed to detect the highest valued coecient (in magnitude) c max of all the subbands. This coecient is used to compute an initial threshold T 0 by the following equation T 0 = c max (63) where is a parameter of the system. A symbol stream is generated for the coecients in the subbands using the following symbols: 1. POS: If the coecient is signicant with respect to the threshold and positive.

23 Figure 11: Parent-children relationship for the wavelet decomposition of the EZW algorithm. NEG 0 POS Min Max Figure 1: Illustration of the primary scan of the zerotree wavelet algorithm. The unquantized coecients range from Min to Max and are quantized to NEG, POS or 0.. NEG: If the coecient is signicant with respect to the threshold and negative. 3. ZTR: This symbol (zerotree root) is used if the wavelet coecient is insignicant with respect to the threshold and all its children are insignicant, too. 4. IZ: The symbol isolated zero is used for all other wavelet coecients. This quantization procedure is illustrated in Fig. 1 after one primary scan. One can see that the uncertainty of the quantization intervals is not uniform. Indeed, it is twice as large for the zero-quantized coecients as for the values quantized to the values corresponding to POS or NEG. This symbol stream is called primary stream. On the next scan all the coecients which have not been quantized to zero so far are rened. This is done by transmitting, for each of these coecients, a binary value dening if the original value of the coecient is in the lower or upper half of the interval. This renement is illustrated in Fig. 13. This symbol stream is call secondary stream. After a scan producing a secondary stream a new threshold is computed as: T i+1 = T i : (64) where i is the iteration number. The algorithm is iterated by applying successive primary and secondary symbol streams. These are then encoded using an entropy coder. The result is 3

24 POS LOW HIGH Figure 13: Illustration of the secondary scan of the zerotree wavelet algorithm. Each coecient which has been quantized to a signicant value in the previous primary scan is rened by a factor of. NEG 0 POS Min Max Figure 14: Illustration of the primary scan of the zerotree wavelet algorithm with Q = 3:0. The unquantized coecients range from Min to Max and are quantized to NEG, POS or 0. The three uncertainty intervals are equally large. a completely embedded bitstream. The EZW algorithm has shown excellent performances in compressing natural images. However, this quantization procedure does not lead to a uniform quantization of the wavelet coecients. Indeed, the zero-quantized coecient have a correspond to a larger input dynamic range than the other coecients. It should be noted that the proposed quantization noise reduction scheme (Sec..3.5) can still be applied since it does not depend on a uniform quantization. Wiener ltering of the low-frequency subband can be adapted to this noise since it is a small local lter. The assumption that the noise is white is still valid on a small region of support (3 3). Therefore, Eq. (44) can be generalized to ll = h 1 N P N i=1 i i where i is the quantization step of each pixel. This takes into account the fact that not all the samples have the same quantization step. One approach for generalizing the EZW algorithm is as follows. At each iteration, instead of dividing the threshold by, let us dene the new threshold by the general formula 1 (65) T i+1 = T i Q (66) where Q is a oating point value. Moreover, instead of splitting the uncertainty interval of the non-zero coecients into two bins, a renement into M distinct values is produced in the secondary stream. If Q = 3:0 and M = 3, one approaches a uniform quantization very closely. This is illustrated in Fig. 14. One can observe that after the rst primary stream, a uniform quantization is achieved if Q is chosen to be 3:0. Closer examination shows that this is indeed the case after each primary 4

25 stream. The EZW algorithm proposed in [5] has the following system parameters: = 0:5, Q = :0 and M =. The generalized EZW algorithm has shown excellent performance in compressing natural images at high compression ratios..5 Conclusions In this section a set of perfect reconstruction lters has been proposed. It takes into account the special statistics of natural images. The lowpass analysis lter is long while its highpass counterpart is very short, thus exploiting the non-stationarity and the asymmetrical power spectrum of natural images. A noise reduction device for subband image coding schemes is proposed. A uniform quantizer is used for the lowpass subband. For the other subbands, it is shown that the optimal reconstruction levels can be computed accurately from a generalized Gaussian pds. Articial contours which are visible at low bit rates can be removed by applying Roberts pseudo-random noise technique to only the ll subband. A noise reduction technique is proposed and applied to this subband. Finally, the EZW algorithm is described and a generalization of this algorithm is proposed. It is shown that the proposed noise reduction device can t into the generalized EZW algorithm. 5

26 3 Morphological Subband Decompositions 3.1 Introduction In Sec. it is shown that linear lters are able to decompose the original image into subbands. It is even possible to design lters allowing for critical sub-sampling of the ltered images [9]. In such cases, the decomposed image has the same number of pixels as the original image. It has been shown that the major drawback of such a decomposition scheme is the inherent Gibbs phenomenon of linear lters [6]. This produces an annoying ringing artifact when compressing an image with a high compression ratio. The Gibbs phenomenon aects linear lters but not nonlinear lters such as morphological lters. Is a decomposition using a critical sub-sampling possible using morphological lters? So far, no morphological decomposition use a critical subsampling of the subbands [8, 36]. Clearly, this is suboptimal for coding purposes. Major diculties in designing such a lter bank occur because the digital morphological sampling theorem [9] does not provide a tool for designing morphological lters that allow critical sampling and perfect reconstruction. Indeed, even after a perfect morphological lowpass lter such as an open-closing lter, there is no way of down-sampling the ltered image and reconstructing it perfectly. The problem is that morphological ltering removes small objects without blurring the contours of larger objects. This means that high-frequency information is always contained in the morphologically ltered image. Hence, aliasing always occurs and there is no way of reconstructing the original image perfectly. Notice that down-sampling is a linear process. It is most easily described through a frequency analysis using the sampling theorem [14]. However, the concept of frequency is a strictly linear concept and cannot be used in the context of morphological signal processing. In this section, a technique for morphological subband decomposition of an image with perfect reconstruction is proposed. It is shown that critical subsampling can be achieved. Images through this decomposition do not suer from the ringing eect. The drawback of using morphological lters instead of linear lters is that they have a poor texture representation at very low bit rates. In order to overcome this problem an adaptive subband decomposition is introduced. This decomposition allows for good visual quality at very low bitrates. It chooses linear lters on textured regions and morphological lters otherwise. A simple and ecient texture detection criterion is proposed and applied to the adaptive decomposition. One important dierence between linear lters and morphological lters is that morphological lters preserve the specicity of the input set. If the input consists of only integer valued coecients the output will also consist of integer valued coecients. Linear lters do not have this property. Based on this property, a lossless coding scheme based on the morphological subband decomposition is dened. The outline of this section is as follows. In Sec. 3. a classical morphological decomposition is discussed. Section 3.3 details the pyramidal decomposition. In Sec 3.4 the proposed morphological subband decomposition is described. The design issues are discussed. How this decomposition can be rendered adaptive is discussed in Sec A lossless coding scheme based on the proposed morphological subband decomposition is proposed in Sec Conclusions are 6

27 X X X OCB1 OCB OCB3 1 3 NO DETAIL IMAGE R1 R R3 RESIDUES Figure 15: Classical multiresolution morphological decomposition. The operation OC denotes the morphological open-closing ltering. The decomposition yields a no detail image and several residual images. drawn in Sec Multiresolution Decomposition Using mathematical morphology, the analysis of an image is based on object shapes and object sizes. This is in contrast with the frequency concept using linear operations. In terms of mathematical morphology, a multiresolution analysis intends to decompose an image into dierent sub-images where each sub-image contains objects of a specic size. Figure 15 shows the standard decomposition for multiresolution analysis [4, 36]. This decomposition is a cascade of open-closings which are intended to lter out objects of a certain size at each stage. The rst stage of the decomposition is computed by X 1 = (X B 1 ) B 1 (67) R 1 = X? X 1 (68) where (X B 1 ) B 1 = OC B1 (X) denotes open-closing of X by B 1. R 1 is the rst residual image containing only objects of size smaller or equal than the structuring element B 1. Similarly, the second stage is obtained by where X = (X 1 B ) B (69) R = X 1? X (70) B = B 1 B 1 (71) Each stage produces a residual image R i. Together with the last open-closed image X N the original image can be reconstructed just by adding all the residuals R i together as follows X = X N + NX k=1 R k (7) Attempts to use this decomposition for coding purposes are reported in literature [4, 36]. However, the basic problem is the coding of the residual images which are of the same size as the 7

28 X OC B X 1 Y 1 X x OC B x Y + + REC z + 1 x + z REC x R 1 Figure 16: Pyramidal decomposition using mathematical morphology. The operation OC denotes the morphological open-closing ltering. The decomposition yields a down-sampled no detail image and several residual images. R original image. Also, the no-detail image X N has still a large entropy and is also of the same size as the original image. The only advantage of this decomposition is the absence of ringing eect even under strong quantization. It is clear that a good decomposition for coding purposes must be compact, that is the number of pixels used to represent the original image has to be as close as possible to the number of pixels contained in the original image. In the case we discussed, the representation needs N times more pixels for representing the original and, consequently is not appropriate at all for image coding purposes. 3.3 Pyramidal Decomposition Another approach is the pyramidal decomposition [8] and is shown in Fig. 16. The idea is similar to the previous decomposition. The original image is rst open-closed by a structuring element B to give the ltered image X 1 X 1 = (X B) B = OC B (X) (73) then X 1 is down-sampled by a factor of in the two directions. Denote the image X 1 by x 1 (k 1 ; k ) in the spatial domain. The down-sampled image y 1 (k 1 ; k ) is then y 1 (k 1 ; k ) = x 1 (k 1 ; k ) with ( k 1 = 0; 1; :::; N 1? 1 k = 0; 1; :::; N? 1 (74) The image Y 1 is then further decomposed at each stage. In order to preserve perfect reconstruction the computation of the residual images has to be performed the following way. The image Y i is rst up-sampled by a factor of two in the two directions which gives z i (k 1 ; k ) = ( k yi 1 ; k if k 1 = 0; ; 4; :::; k = 0; ; 4; :::?1 otherwise The up-sampled image is then ltered by a morphological reconstruction lter which should produce an image as close to X i as possible. The residual image is nally computed as the dierence between the image Y i?1 of the previous stage and the reconstructed image, that is (75) R 1 = X? REC(Z 1 ) (76) 8

29 for the rst stage, and R i = Y i?1? REC(Z i ) i = ; 3; ::: (77) for the next stages. It is clear that the performance of this decomposition depends directly on the good performance of the reconstruction lter because the residual images do not only contain the residual objects which are contained in the dierence Y i?1? X i but also contain the reconstruction error " i = X i? REC(Z i ) (78) In that sense the variances of the residual images are higher than with the decomposition given in Sec. 3. since we can assume that " i and X i are uncorrelated. Indeed, Ri = "i + (79) where is the variance of the dierence Y i?1?x i. The smaller "i the better the decomposition. The advantage of this decomposition is that the residual images are getting smaller by a factor 4 at each stage. Although, such a decomposition is much more appropriate than the multiresolution decomposition, it seems that there is still a waste of space since the decomposed image needs more space than the original image. Optimal would be a compact decomposition which would require the same number of pixels as the original image. 3.4 Morphological Subband Decomposition Introduction In order to have a tool which can be generalized to the nonlinear case, we will use the concept of a half-band lter. Let us recall the denition of a linear half-band lter from Sec.. Denition 4 A linear lter is called a half-band lter if every odd sample of its impulse response f(k) is zero, except one sample f(), that is f(n + 1) = ( 1 if n + 1 = 0 otherwise where L is the length of the half-band lter f(k). n = 0; 1; ; :::; L? 1 (80) Suppose a signal x(k) is rst down-sampled and up-sampled by a factor of, call this signal y(k). Then, it is ltered by a half-band lter f(k). This procedure is shown in Fig. 17. Let us dene z(k) as the convolution of y(k) with f(k): Then, every alternate sample of z(k) is given by z(k) = y(k) f(k) (81) z(k) = 1 x(k); (8) 9

30 x(k) y(k) f(k) z(k) Figure 17: The signal x(k) is down- and up-sampled by a factor of to obtain y(k), which is then ltered by a half-band lter with impulse response f(k) leading to the output z(k). X(z) G 1 (z) ^X(z) H (z) Figure 18: Filter bank with specic lters yielding perfect reconstruction if G 1 (z) is a half-band lter and H (z) = G 1 (?z). that is, the input is not processed (except a factor of 1 ) for these samples. Note that Eq. (8) implies Eq. (80). Moreover, Eq. (8) allows us to generalize the concept of a half-band lter to the nonlinear case. Denition 5 Consider a signal x(k) which is down-sampled and then up-sampled by a factor of, to give y(k). A nonlinear lter is then called a half-band lter if every alternate sample of the ltered signal y(k) is equal to the corresponding samples in the original signal x(k). Dene F y (k) the output of the nonlinear half-band lter. Then where c is an arbitrary constant. F y (k) = c x(k); (83) Note that such nonlinear lters do exist such as erosion, dilation or median lter if the region of support is chosen appropriately. Let us recall the following fact. Perfect reconstruction of the linear two-band lter bank of Fig. 1 is achieved if the convolution of h 1 (k) and g 1 (k) is a half-band lter and if the lters respect the bi-orthogonality conditions given by H (z) = G 1 (?z) and G (z) =?H 1 (?z) [7]. Now, suppose H 1 (z) = 1. Then perfect reconstruction is achieved if G (z) =?1, G 1 (z) is a half-band lter and H (z) = G 1 (?z). This lter bank is shown in Fig. 18. Let us now replace the lter G 1 (z) by a generalized half-band lter M(). It is clear that this lter can not be described its z-transform since it is a nonlinear lter. However, from the bi-orthogonality condition in the case of linear lter banks, it is known that H (z) = G 1 (?z) yields perfect reconstruction. The solution to the problem of nding the corresponding highpass analysis lter in the nonlinear case, can be obtained by understanding exactly what it means to negate the variable z of a linear half-band lter. Again, consider the impulse response of a half-band lter. Every other sample is zero, except the median sample. Now, negating the variable z has the following eect on the impulse response: Z?1 ff (?z)g = (?1) k f(k): (84) 30

31 X(z) M() ^X(z) A() Figure 19: Filter bank with morphological lters yielding perfect reconstruction if M() is a generalized half-band lter and A() = I? M(). Since every odd sample is zero, the negation of the z-variable will only have an eect on the middle sample. Hence the following equation holds for a zero-phase half-band lter F (?z) = 1? F (z): (85) This situation can now be generalized to the nonlinear case. Denote A() the nonlinear highpass analysis lter. Then A = I? M; (86) where I is the identity operation. This nonlinear lter bank yields perfect reconstruction and is shown in Fig. 19. Although this lter bank is a particular case of a general subband decomposition, it invalidates the assertion that morphological lter banks with perfect reconstruction are impossible in general. The eect of taking the lowpass analysis lter as the identity operation will be to introduce aliasing in the down-sampled lowpass subband. However, it is shown that good subband lters, such as the asymmetrical lter banks described in Sec., have a similar property since the lowpass lter has a quite poor frequency response. What matters most for image compression is the behavior of the lowpass synthesis lter because this lter is mainly responsible for ltering the introduced quantization noise and for the ringing eect. With the lter bank proposed in this thesis we have a complete control on the lowpass synthesis lter. The use of a morphological lter at this stage will completely eliminate the ringing eect. Furthermore, the proposed morphological subband decomposition (MSD) has the desired property of representing the original image by the same number of pixels as the original image Coding Gain Let us compute the unied coding gain of the MSD. First, recall the general formula for SBC derived by Katto [13]: G SBC = M?1 Y i=0 in! i M?1 X i ; (87) g i (k) k=0 where in is the variance of the input image, i the variance of the i-th subband, g i (k) the impulse response of the i-th synthesis lter, i the inverse of the i-th down-sampling factor and 31

32 M the number of subbands. In our case, we have M = and 1 = = 1. Since we are dealing with morphological lters, it is not possible to directly evaluate the coding gain just by weighting the subband variances by the term P L i?1 k=0 g i (k). This term represents the variance of the subband after upsampling and synthesis, we will just use this variance to compute the coding gain in the following way. Denote Ri the variance of the subband directly before the summation in Fig. 19. Then, the coding gain becomes G M SD = in q : (88) R1 R This coding gain is the basic tool for the design of the optimal morphological lters used in the MSD Design of the Morphological Subband Filters The most important issue in subband decomposition is the design of the lters. In the case of morphological lters, this design is completely dierent from the problem of optimizing linear lter coecients. Indeed, morphological lters can not be optimized because there is only a limited number of possible solutions. The most general morphological lter is the rank-order lter. This lter has two degrees of freedom: the rank to be used, and the region of support of the structuring element. The design procedure is fairly simple. We just have to choose the best possible solution by comparing their performance on natural images. Let us consider a tree-structured decomposition for which each lowpass subband is decomposed further, whereas the highpass subband is not processed anymore. For each decomposition of the lowpass subband, the coding gain is computed. This gain decreases with each further decomposition. This means that after a certain point it is not worth decomposing anymore. A good lter will have high coding gains at each stage. Figures 0 and 1 show the comparison of the three morphological lters, namely dilation, closing and the median lter. Dilation and closing are used by the digital morphological sampling theorem [9] while the median lter has a region of support of 6 samples. The curves in Fig. 0 show the coding gains obtained by decomposing the \Lena" image and those in Fig. 1 show the results for the \Building" picture. Clearly, the median lter outperforms the other two. Also, it is veried that the coding gain is decreasing with the stage order. Next, median lters with dierent region of supports will be compared. Figure shows their regions of support and their labels. MED 6 is the lter used in the previous experiment. It is shown in Fig. 3 that this lter is the most appropriate one for this subband structure. This decomposition outperforms other morphological decompositions [8, 36] in terms of coding eciency. This is due to the fact that no critical down-sampling is performed for the other decompositions. In Fig. 4 an example is given using the \Lena" picture as a test image. It can be seen that the pyramidal decomposition gives a reconstruction quality which is about 8 db below the proposed decomposition at the same bit rate! From a visual point of view the reconstructed images decomposed by the MSD are very agreeable to the human eye. There is no ringing eect, as expected. The only disadvantage is a poor representation of the textures in 3

33 9 8 Median 7 Dilation 6 Closing GSBC Stage Figure 0: Comparison of the three morphological lters dilation, closing and median lter on the \Lena" picture. 8 GSBC Median Dilation Closing Stage Figure 1: Comparison of the three morphological lters dilation, closing and median lter on the \Building" picture. MED 4 MED 6 MED 8 MED 1 Figure : Region of support of dierent median lters and their labels. Black pixels belong to the lower subband. The unknown down- and up-sampled pixels are hatched. The pixel to be reconstructed is in the center and dotted. 33

34 9 GSBC MED 4 MED 6 MED 8 MED Stage Figure 3: Comparison of the coding gain of dierent lters. 8 PSNR Proposed decomposition Pyramidal decomposition bitrate Figure 4: Comparison of the coding performance of the proposed decomposition with the pyramidal decomposition. Test image: \Lena". an image at very low bitrates. The next section will describe this problem in more detail and a solution to this problem will be proposed. 3.5 Context-Adaptive Subband Decomposition Decomposition using Linear Filters As stated in the previous section, linear lters are more appropriate for coding textures than morphological lters. In order to benet from advantages of the two lter classes a new decomposition is introduced. This decomposition chooses adaptively the lter class that is most appropriate for a given image pixel. Also, a new texture detection criterion is introduced. Recall the starting point of the proposed MSD described in the previous section and based on the lter bank shown in Fig. 5. It was stated that perfect reconstruction is achieved if the lter G 1 (z) is a half-band lter. How does the coding performance of this linear lter bank compare to the MSD? 34

35 X(z) G 1 (z) ^X(z) H (z) Figure 5: Filter bank with specic lters yielding perfect reconstruction if G 1 (z) is a half-band lter and H (z) = G 1 (?z) G7-A db -60 G7-B -80 G7-C normalized frequency Figure 6: Frequency responses of three 7-tap linear half-band lters. G7-A: = 0:5. G7-B: = 0:3. G7-C: = 0:1. In order to compare the two types of lter banks, appropriate half-band lters have to be designed. The number of coecients, L, of a linear phase half-band lter must satisfy L = 4 n? 1 n = 1; ; ::: (89) and hence, only lters with 3, 7 or 11 coecients are eligible. From a reconstruction point of view, the desired linear lter must possess a good frequency response. Thus, the design of the lters will be performed using the Parks-McClellan [0] algorithm, also known as Remez exchange algorithm. This algorithm produces optimal linear phase lters. Let f c represent the cuto frequency and f s, the stopband frequency. The width of the transition band is then = f c? f s. On the normalized frequency scale, the half-band property can easily be imposed by choosing f s = 1? f c: (90) In the goal of nding of nding the optimal transition width for our application, three 7-tap lters with dierent transition bands were designed. Let G7-A be the optimal seven-tap halfband lter with transition band width = 0:5, G7-B with = 0:3 and G7-C with = 0:1. The frequency responses of these lters are shown in Fig. 6. Note that the ringing eect will be the strongest for the lter having the smallest transition width. 35

36 G7-A GSBC 10 8 G7-B G7-C Stage Figure 7: Coding gain of the three dierent 7-tap linear half-band lters on the picture \Lena". G7-A: = 0:5. G7-B: = 0:3. G7-C: = 0: G7-A GSBC 6 5 G7-B G7-C Stage Figure 8: Coding gain of the three dierent 7-tap linear half-band lters on the picture \Building". G7-A: = 0:5. G7-B: = 0:3. G7-C: = 0:1. Using Eq. (88), the coding gain of each stage in the decomposition can be computed. In this way the dierent lters can be compared in terms of coding performance. The results of these computations are shown in Figs. 7 and 8 for two dierent test images. Note that on the rst picture the coding performance of the lter G7-B is nearly identical to the one obtained with the lter G7-A (the dierence is invisible in the gure), whereas the lter G7-C has the poorest performance. On the other test image the lter G7-A performs slightly better than its counterpart G7-B. Moreover, G7-A produces less ringing eect since its transition width is larger. This is why the lter G7-A is the most appropriate for coding purposes among the 7-tap lters. In order to nd out the optimal lter length, lters with 3 and 11 coecients have also been designed. The best results are obtained, as before, by requiring the transition width to be as large as possible. Let G3 be the lter with 3 taps, and G11 the lter with 11 taps. In the Figs. 9 and 30, these lters are compared to the lter G7-A. It is clearly visible that the 11-tap lter has the poorest coding performance. Filters G3 and G7-A perform almost identically. However, by 36

37 16 14 G3 1 G7 GSBC 10 8 G Stage Figure 9: Comparison of the coding gain of three dierent linear lters on the picture \Lena". The lters have a length of 3, 7 and 11 respectively. All three lters have a transition width of = 0:5. 8 GSBC G3 G7 G Stage Figure 30: Comparison of the coding gain of dierent linear lters on the picture \Building". The lters have a length of 3, 7 and 11 respectively. All three lters have a transition width of = 0:5. comparing the coding results for other images, it can be observed that the lter G7-A performs slightly better, on average, than the lter G3. Therefore, the lter G7-A will be used henceforth for the linear decompositions. This linear subband decomposition (LSD) must now be compared to the morphological one (MSD). As mentioned before, objective criteria such as the coding gain or the peak signal-tonoise ratio (PSNR) are not well suited for comparing two dierent coding techniques because distortions are of dierent nature. Therefore, we have to compare the two dierent approaches by visually analyzing dierent reconstructed images. This is done in Fig. 31. Three dierent test pictures with dierent statistical properties have been chosen. On the \Pepper" picture, it is clearly visible that the picture coded with LSD suers from strong ringing eect whereas MSD does not produce any ringing eect at all. Visually, the picture coded with MSD is more pleasant. The adaptive subband decomposition (ASD) technique will be described in the next section. On the \Lena" image the same phenomenon can be observed. However, the textures of the image 37

38 (specically the feather) are better represented by the linear decomposition. Moreover, on the \Baboon" image, the picture coded with LSD is visually better, because it consists essentially of textures which are better reproduced using LSD than by using MSD. We can conclude that the MSD is very powerful when coding images without textures (such as \Pepper") because the complete absence of ringing eect is very pleasant to the human eyes. However, when textures become predominant, linear lters give a visually better result. Therefore, the idea is to nd an adaptive scheme where the coder decides, for each pixel, whether to use a linear lter or a morphological lter. How such a scheme can yield perfect reconstruction and, how textures can be detected, is discussed in the next section Context-Adaptive Subband Decomposition In general it is impossible to switch between two kinds of synthesis lters in subband coding. The reason is that if two sets of lters are used, the perfect reconstruction constraint is no longer satised. However, in the case of Fig. 5 it is possible to switch between two synthesis lters because the lowpass analysis lter is just an identity operation. Actually, what all the MSD and LSD do is to interpolate between two pixels. The interpolation error is the highpass subband whereas the down-sampled image is the lowpass subband. Suppose it is known in the synthesis part which lters were used to interpolate a given pixel. Then, it would be possible to switch between two lters if this was done the same way in the analysis and in the synthesis part. In practice however, one major problem occurs. Indeed, information telling which pixels were decomposed in which way, will be an overhead to transmit which will decrease the compression ratio. Therefore, even if the same criterion is used in the analysis and synthesis part, the lter will not in general, be chosen correctly because quantization is performed between the analysis and synthesis part as shown in Fig. 3. This will modify the signal, and therefore the criterion will not produce exactly the same result in the analysis and in the synthesis part New Subband Decomposition The objective is to nd a decomposition for which it is not necessary to send the information about which lter to use for a given pixel. As a rst step, Fig. 3 is transformed into the identical system shown in Fig. 33, where A() = I? M(). The goal now is to have the same signal to which the lter M(x) is applied. If the same quantization operation is performed before the highpass analysis part, then every other sample of its input will be equal to the signal fed into the lowpass synthesis lter. This scheme is shown in Fig. 34. Let us call this new subband decomposition NSD. From an intuitive point of view the NSD interpolates the quantized picture and then computes the interpolation error. Note that quantizing twice on the highpass part of the decomposition does not imply a worse reconstruction performance. Indeed, what is done in this part is to quantize the interpolation error due to the quantized down-sampled image (remember that the interpolation lter is half-band and that is why sampling factors are not necessary before the rst quantization). In this way the interpolation error is kept smaller than half of the quantization step. This was not the case with the previous decomposition (MSD). It is shown in Fig

$(a)-(c) \Pepper" 0.10 b/p.$ $(d)-(f) \Lena" 0.16 b/p.$

39 (a) (b) (c) (d) (e) (f) (g) (h) (i) Figure 31: Coded images. (a)-(c) \Pepper" 0.10 b/p. (a) LSD (b) MSD (c) ASD. (d)-(f) \Lena" 0.16 b/p. (d) LSD (e) MSD (f) ASD. (g)-(i) \Baboon" 0.49 b/p. (g) LSD (h) MSD (i) ASD. 39

40 X(z) Q() M() ^X(z) A() Q() Figure 3: MSD with quantization between the analysis and synthesis part. X(z) Q() M() ^X(z) Q() M() Figure 33: MSD with quantization between the analysis and synthesis part. X(z) Q() M() ^X(z) Q() Q() M() Figure 34: New proposed decomposition. A further quantization is performed before the highpass analysis lter in order to have the same detection as before the lowpass synthesis lter. 40

41 NSD MSD PSNR bit rate Figure 35: Coding performance comparison on image \Lena" between the decomposition of the previous section and the new decomposition. that the NSD has a better coding performance than the MSD using the same lter MED 6 (see Sec ). Moreover, it allows an adaptive choice of the lter. As mentioned before, every alternate sample of the input signal to the two half-band lters of Fig. 34 is the same. Hence, a texture detection criterion based on these pixels will give the same results in both analysis and synthesis stages. This adaptive subband decomposition (ASD) is shown in Fig. 36. The basic idea of the adaptive ltering is to use the linear lter G7-A for textured regions and the morphological lter MED 6 for all other pixels. Note that when using the proposed decomposition it is not necessary to transmit the type of lter used for a given pixel since the same detection is done in both analysis and synthesis stages. The following remark can be made concerning the tree-structure. The two-band adaptive lter bank of Fig. 36 can not be decomposed using a tree-structure in a straightforward way. Indeed, the quantization operations before the highpass analysis lter renders the structure asymmetrical. However, a solution can be found by approaching the problem in a more intuitive way. Recall that the ASD computes the interpolation error (with respect to the original image) of the quantized lowpass subband. In fact, it is possible to compute directly the lowest subband by just decimating the original image. The interpolation errors of each stage can then be computed successively going from the deepest decomposition stage to the rst one. In this way each subband can be computed without any ambiguity between the analysis and the synthesis part. Note that this is in contrast with a tree-structured decomposition where the subbands are ranged from the highest numbered into lower ones (see Fig. 4). The ASD, on the other hand, decomposes from the lowest numbered to the highest numbered subband Texture Detection One important issue in the adaptive decomposition is the texture detection criterion. Indeed, it is of major importance that the right lter is used for each pixel. A wrong decision at an edge for example will produce a ringing eect and the whole eort of using morphological lters would 41

42 M() X(z) Q() G(z) ^X(z) Q() Q() M() G(z) Figure 36: Adaptive morphological subband decomposition. A texture detection is performed before the highpass analysis lter and before the lowpass synthesis lter. be wasted. The known information is the lower subband, based on which a texture detection is performed. Natural images are composed of homogeneous regions, edges, and texture regions. An analysis of the texture properties will naturally produce the detection criterion we need. Indeed, homogeneous regions are usually of large spatial support and their local grey level variance is very small. Edges, on the other hand, are of short spatial support and have a high local variance across the edge and low local variance along the edge. Texture regions have a high local variance in all directions and have a spatial support which can be assumed as large as their edge counterparts. Accordingly, two parameters need to be computed. The rst parameter, ^, is the estimation of the local variance. The unbiased estimator of the variance is given by X ^ = 1 N?1 (x N? n? x) ; (91) 1 where N is the number of pixels considered and x, the estimated mean is given by x = 1 N n=0 N?1 X n=0 x n : (9) Extensive simulations have shown that a region of support with N = 1 performs best. This region of support is shown in Fig. 37. To be sure that a given pixel does not belong to a homogeneous region, the local variance must be larger than a threshold c. If this is the case then it belongs either to an edge or to a textured region. Next, we identify those pixels which belong to edges. For that purpose the local variances in the four directions are computed. With the region of support shown in Fig. 37 the following directions have been chosen 1 : (1; 1) (; 3) (; 5) (3; 7) : (3; 1) (; 3) (; 5) (1; 7) 3 : (; 1) (; 3) (; 5) (; 7) 4 : (1; 5) (; 5) (3; 5) 4

43 Figure 37: The pixels for the application of the edge detection criterion. Black pixels belong to the lower subband. The unknown down- and up-sampled pixels are hatched. The pixel to be reconstructed is in the center and dashed. where (; 4) is the dotted center coordinate of Fig. 37. These four directions are chosen to represent the directions 0,, and 3 as accurately as possible. Each local variance 4 ^ i is then computed using Eq. (91). The parameter p is then dened by p = max i ^ i min i ^ i : (93) We know that a textured region has a high local variance in all directions. Therefore, the higher the value of p the higher the probability that the pixel in question is an edge pixel. Therefore, the pixel is considered as an edge pixel if p is larger than a threshold p c. Hence, the texture detection algorithm can be stated as follows. Algorithm 1 Texture detection 1 Compute the parameter ^ by Eq. (91) representing the estimation of the isotropic local variance. Compute the parameter p by Eq. (93). 3 if (^ > c) AND (p < p c ) then 4 The pixel is a texture pixel 5 else 6 The pixel belongs either to an edge (^ > c ) or to a homogeneous region (^ < c ). 7 end This algorithm is fairly robust. Simulations show that the desired discrimination is obtained for any image type with the following parameters c = 100 and p c = 10: (94) Figure 38 shows an example of application of this texture detection algorithm on each subband of two dierent test images. It can be observed that most of the textured regions have been detected. However, the eective textured regions are full of isolated pixels and the detected pixels are not always connected. Also, thin lines appear on some edges which should be avoided since they will produce ringing. Therefore, a post-processing of these detection masks is necessary to improve the performance of the detection. The result of texture detection is quantized with the following convention: texture points are set to 0 (black), edge points to 18 (grey) and homogeneous region points to 55 (white). To eliminate the pulse type noise it is proposed to lter the masks with a 3 3 median lter. This forces each center of a 3 3 block to belong to a texture if there is no pixel 43

$Texture detection on each subband of the \Baboon" image (left) and the \Pepper" image (right). belonging to a homogeneous region in the block.$ Indeed, a grey pixel with many black neighbors will turn black. Such pixels are mostly in textured regions. Moreover, a black pixel with a minority of black neighbors will turn either grey or white.

Indeed, a grey pixel with many black neighbors will turn black. Such pixels are mostly in textured regions. Moreover, a black pixel with a minority of black neighbors will turn either grey or white.

44 Figure 38: Edge detection criterion: black pixels are textures, grey pixels are edges and white pixels belong to homogeneous reions. Texture detection on each subband of the \Baboon" image (left) and the \Pepper" image (right). belonging to a homogeneous region in the block. Indeed, a grey pixel with many black neighbors will turn black. Such pixels are mostly in textured regions. Moreover, a black pixel with a minority of black neighbors will turn either grey or white. This is the case for these thin black lines occuring at edges of the image. Hence, this very simple post-processing of the detection masks improves both weak points of the Algorithm 1. The complete classication algorithm is as follows. Algorithm Complete texture detection 1 Classify the pixel using Algorithm 1. for each pixel do 3 if the pixel is an edge AND all 8 neighboring pixels are either textures or edges then pixel texture 45 else 6 pixel median value of the 3 3 block with the processed pixel in the middle. 7 end 8 end The masks of Fig. 38 after post-processing are shown in Fig. 39. It is clearly veried that the textured regions have been lled out and that the thin lines on edge locations have mainly disappeared. The right column of Fig. 31 shows the compression of the three test images using the adaptive scheme. One can observe, that the reconstructed images using the adaptive scheme and the nal criterion described above, are free of ringing eect and create a good representation of textures. 44

45 Figure 39: Edge detection criterion: black pixels are textures, grey pixels are edges and white pixels belong to homogeneous regions. Texture detection with post-processing of the masks on each subband of the \Baboon" image (left) and the \Pepper" image (right). 3.6 Embedded Zerotree Lossless Algorithm Lossless Decomposition In the previous sections the MSD has been proposed as a tool for the compression of images. It has been shown that it does not suer from ringing eect at low bitrates. The MSD has another interesting property. All the ltering in the decomposition is performed with morphological lters and these have the property of preserving the specity of the input set. As an example, this means that if the inputs are integer valued coecients the outputs of the morphological lter will also be integer valued coecients. Based on this property of the MSD, a lossless compression scheme can be dened. In order to achieve lossless compression of the image two conditions must be satised by the decomposition lters: 1. The lter bank must allow perfect reconstruction. This means that if no quantization is applied to the decomposed coecients, the reconstructed image must be identical to the original image.. The coecients of the subbands must be representable by a nite number of bits. As shown in Sec. 3.4, the MSD enjoys the property of perfect reconstruction. However, it has also been shown that the median lter on a region of support of size 6 is a very ecient half-band lter. Therefore, the output of such a lter will no longer yield integer valued coecients. This is due to the fact that the median value of an even number of samples is dened by the average of the two middle samples. One possibility of satisfying the second condition is to modify the MSD in the following way. The highpass synthesis lter is dened by the generalized half-band lter M(). Let us dene a new half-band lter N by N = Q(M) (95) 45

46 X(z) N () ^X(z) N () Figure 40: Filter bank with morphological lters allowing for perfect reconstruction and having only integer valued coecients. The operator N () denotes Q(M) where Q() is a quantization function mapping the oating point valued coecient to an integer valued coecient. where Q() is a quantization function mapping the oating point valued coecient to an integer valued coecient. Notice that this new lter bank still allows perfect reconstruction. The ow diagram of the new decomposition is shown in Fig Lossless Successive-Approximation Quantization The quantization procedure of the EZW algorithm has been described in detail in Sec..4. It is clear that even though the rst two conditions on the lter might be satised by the lter they are not sucient to render the entire scheme lossless. In order to render this compression algorithm lossless, the following four conditions on the quantization are sucient: 1. The initial threshold should be set to T 0 = n with n being an integer value.. The threshold is divided by two at each iteration, T i+1 = T i. 3. The iterative quantization procedure is iterated n times for the primary stream and n? 1 times for the secondary stream. 4. The reconstruction levels have to be set to the minimum absolute value of the uncertainty interval. Conditions (1) and () ensure that after n iterations of the primary stream, the threshold will be exactly 1. At the n-th primary stream, for each quantized coecient c, the following inequality will be satised c min c < c max ; (96) where c min and c max dene the uncertainty interval. After n iterations of the primary stream c max? c min = 1. Since c is a signed integer value the correct value of c must be c = c min : (97) If conditions (1) and () are not satised, lossless compression is still obtained when the threshold of the successive-approximation quantization becomes smaller than 1, allowing for a perfect mapping of the quantized coecients to their integer values. 46

$Figure 41: Example of a decomposition on the picture \Baboon". 1 3 5 1 3 4 5 7 4 6 Figure 4: Transform partition of an image. (Left) Five subbands. (Right) Seven subbands.$

47 Figure 41: Example of a decomposition on the picture \Baboon" Figure 4: Transform partition of an image. (Left) Five subbands. (Right) Seven subbands. Due to the embedded nature of the scheme, it is possible to decode an image at any bitrate. By increasing the bitrate one can obtain a reconstructed picture with increasing quality. In progressive image coding, the computational load of the decoding process is much more critical than the one of the encoding process. In order to reduce the computational complexity at the synthesis stage, each level of the decomposition splits the input image into two subbands. Then, only the lowpass subband is decomposed further into two subbands. The coding gain is used to adaptively choose the direction of the splitting. Examples of such a decomposition is shown for the image \Baboon" in Fig. 41, and examples of partitioning the transform space are shown in Fig. 4. The resulting parent-children relationship for the zerotree quantization is illustrated in Fig Conclusions In this section a nonlinear image decomposition scheme is proposed. It is based on morphological lters which do not suer from the Gibbs phenomenon. In contrast to the usual morphological decompositions, critical sampling of the subbands is performed. This results in a decomposition that needs the same number of pixels as the original image. Structurally, the proposed scheme is a special case of subband coding based on morphological lters. It is shown that such a morphological lter bank does not suer from the ringing eect. At very low bit rates however, the coding performance on textured regions is poor in comparison to linear subband coding. In order to overcome this disadvantage, an adaptive decomposition scheme is proposed. This scheme uses linear lters for textured regions and morphological lters for other regions. For this purpose 47

High-Performance Compression of Visual Information A Tutorial Review Part I: Still Pictures

High-Performance Compression of Visual Information A Tutorial Review Part I: Still Pictures OLIVIER EGGER, PASCAL FLEURY, TOURADJ EBRAHIMI, AND MURAT KUNT, FELLOW, IEEE Digital images have become an important