Motion Estimation Using Low-Band-Shift Method for Wavelet-Based Moving-Picture Coding


IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 9, NO. 4, APRIL 2000, p. 577

Hyun-Wook Park, Senior Member, IEEE, and Hyung-Sun Kim

Abstract: The discrete wavelet transform (DWT) offers the advantages of multiresolution analysis and subband decomposition, and it has been used successfully in image processing. However, the decimation step of the wavelet transform makes it intrinsically shift-variant, which makes wavelet-domain motion estimation and compensation inefficient. To overcome the shift-variant property, a low-band-shift method is proposed, and a motion estimation and compensation method in the wavelet domain is presented. The proposed method outperforms the conventional motion estimation methods in terms of the mean absolute difference (MAD) as well as subjective quality. The proposed method can serve as a model method for motion estimation in the wavelet domain, just as full-search block matching does in the spatial domain.

Index Terms: Block matching, discrete wavelet transform (DWT), low-band-shift, motion estimation and compensation, wavelet block.

I. INTRODUCTION

The discrete wavelet transform (DWT) has received considerable attention in the field of image processing due to its flexibility in representing nonstationary image signals and its ability to adapt to human visual characteristics [1]. It is closely related to multiresolution analysis and subband decomposition, which have been used successfully in image processing for a decade [2]–[4]. The wavelet transform decomposes a nonstationary signal into a set of multiresolution wavelet coefficients in which each component becomes relatively more stationary and hence easier to code. Also, coding schemes and parameters can be adapted to the statistical properties of each wavelet coefficient, thereby improving the coding efficiency.
In subband coding, the frequency band of an image signal is decomposed into a number of subbands by a bank of bandpass filters. Each subband is then decimated and encoded separately. For reconstruction, the subband signals are decoded and expanded back to the original frequency band by interpolation. The subband-coding approach provides a signal-to-noise ratio comparable to that of the transform-coding approach and yields superior subjective quality because it is free of the blocking effect.

In video coding, several types of interframe prediction [5], [6] have been used to reduce the interframe redundancy. Motion-compensated prediction has been used as an efficient scheme for temporal prediction. In order to perform motion compensation in the wavelet domain, block matching can be applied to the wavelet coefficients. There have been many attempts to predict the wavelet coefficients by motion compensation in the wavelet domain [7]–[11]. However, motion compensation in the wavelet domain is highly dependent on the alignment of the signal and the discrete grid chosen for the analysis. There are very large differences between the wavelet coefficients of the original image and those of the one-pixel-shifted image [12], [13]. This shift-variant property appears frequently around image edges, so motion compensation of the wavelet coefficients can be difficult. To overcome the shift-variant property of the wavelet transform, a new method is presented for motion estimation and compensation in the wavelet domain.

Manuscript received March 20, 1998; revised August 19, 1999. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Steven D. Blostein. The authors are with the Department of Electrical Engineering, Korea Advanced Institute of Science and Technology, Taejon 305-701, Korea (e-mail: hwpark@athena.kaist.ac.kr). Publisher Item Identifier S 1057-7149(00)02683-X.
For motion estimation and compensation in the wavelet domain, the reference frame is shifted by one pixel along the x, the y, and the diagonal directions, respectively, in the spatial domain. The shifted frames are transformed to the wavelet domain for motion estimation. These shift and wavelet-transform processes are named the low-band-shift method. The low-band-shift operations are repeated iteratively on the low-low band at each subsequent level. This low-band-shift method avoids the shift-variant property of the wavelet transform and performs the motion compensation more precisely and efficiently.

This paper is organized as follows: Section II briefly reviews motion estimation and compensation in the spatial domain and the wavelet domain. Section III describes the proposed motion estimation and compensation scheme for overcoming the shift-variant property. Comparative studies of the various motion compensation techniques for wavelet-based coding are presented in Section IV. Finally, conclusions are given in Section V.

1057-7149/00$10.00 © 2000 IEEE

II. MOTION ESTIMATION AND COMPENSATION IN SPATIAL AND WAVELET DOMAINS

A. Motion Estimation and Compensation in Spatial Domain

Video compression standards, such as MPEG-1 and MPEG-2, use a two-dimensional (2-D) discrete cosine transform (DCT) to reduce the spatial redundancy and a block-matching algorithm to reduce the temporal redundancy [5]. However, block-based coding suffers from blocking effects, particularly in low-bit-rate applications.

In wavelet-based video coding [14], motion estimation can be performed in the spatial domain, which is just the block-based motion estimation introduced in the MPEG standards. However, block-based motion estimation often produces discontinuities between the motion-compensated blocks because the neighboring motion vectors are not coherent. These discontinuities lead to high-frequency components in the residual signals. When the wavelet transform is applied to the residual signals, the block discontinuities generate large DWT coefficients in the high bands, so the coding efficiency can be degraded. Therefore, effective reduction of the block discontinuities in the prediction error signals is required to realize higher coding efficiency [15].

In addition, there is another blocking effect that comes from the various motion-compensation modes. In conventional block-based video coding, various types of blocks, such as intrablocks, interblocks, and bidirectional prediction blocks, can exist in a frame [5]. This combined use of the various prediction modes enlarges the block discontinuity, so the coding performance of wavelet-based compression can be degraded.

Fig. 1. Examples of the DWT coefficients from the Haar and the D(9, 7) filters for a 1-D signal s(n) and the 1-pixel-shifted 1-D signal s(n − 1): (a) original 1-D signal, s(n), (b) 1-pixel-shifted 1-D signal, s(n − 1), (c) Haar DWT low-band signal of s(n), (d) Haar DWT high-band signal of s(n), (e) Haar DWT low-band signal of s(n − 1), (f) Haar DWT high-band signal of s(n − 1), (g) D(9, 7) DWT low-band signal of s(n), (h) D(9, 7) DWT high-band signal of s(n), (i) D(9, 7) DWT low-band signal of s(n − 1), and (j) D(9, 7) DWT high-band signal of s(n − 1).

B. Motion Estimation and Compensation in Wavelet Domain

The wavelet representation provides a multiresolution/multifrequency expression of a signal with localization in both the time and frequency domains. This property is very desirable in image- and video-coding applications. Wavelet expansions are highly dependent on the alignment of the signal and the discrete grid chosen for the analysis [16].
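The dependence on grid alignment can be seen in a small numerical sketch. The example below is illustrative only: it uses a one-level Haar transform on a hypothetical step-edge signal (both chosen here for simplicity, not taken from the paper's Fig. 1) and compares the high-band energy of s(n) with that of s(n − 1).

```python
import numpy as np

def haar_dwt(s):
    """One-level Haar DWT with two-fold decimation (even-length input)."""
    s = np.asarray(s, dtype=float)
    low = (s[0::2] + s[1::2]) / np.sqrt(2.0)
    high = (s[0::2] - s[1::2]) / np.sqrt(2.0)
    return low, high

# Hypothetical test signal: a step edge that starts on an even sample index.
s = np.zeros(16)
s[8:] = 1.0

# One-pixel-shifted signal s(n - 1); the wrapped sample is reset so that
# the shift is not circular.
s_shift = np.roll(s, 1)
s_shift[0] = 0.0

low, high = haar_dwt(s)
low_s, high_s = haar_dwt(s_shift)

high_energy = float(np.sum(high ** 2))
high_energy_shifted = float(np.sum(high_s ** 2))
print(high_energy, high_energy_shifted)
```

In this sketch the edge falls between two decimation pairs before the shift, so the high band is identically zero; after the one-pixel shift the edge falls inside a pair and a nonzero high-band coefficient appears. A block-matching search over decimated high-band coefficients alone cannot recover that change, which is the difficulty the paper addresses.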
In order to perform motion compensation in the wavelet domain, the coefficients of the transformed signal need to be predicted [12], [13]. The main difficulty of multiscale video coding lies in the fact that the decimation and expansion operations in the wavelet transform are shift-variant. A simple example of the shift-variant property of a wavelet transform is illustrated in Fig. 1. In the example, the original and the one-pixel-shifted one-dimensional (1-D) signals are shown in Fig. 1(a) and (b), respectively. The transformed signals from the Haar filter and the Daubechies nine- and seven-tap filters [D(9, 7)] are shown in Fig. 1(c)–(j). The low-band signal is smooth, and the difference in the low-band coefficients between the original and the shifted signal is small. Thus, it is possible to estimate the low-band coefficients of the shifted signal from those of the original signal with small error. However, there is a big difference between the high-band coefficients of the shifted signal and those of the original signal. Such differences occur frequently around the edges in the image. The high-band signal difference around the image edges generates large errors in predicting the motion vector in the wavelet domain when the conventional block-matching method is used.

Several researchers have performed motion estimation and compensation by direct prediction of the low-band and the high-band wavelet coefficients [7]–[11]. However, direct band-to-band motion estimation of the wavelet coefficients is not very effective because of the shift variance induced by decimation. Instead of directly estimating the high-band coefficients at a given resolution, another approach is to perform the motion estimation of every low band from the corresponding low band of the reference, where the reference is not decimated. In this approach, the motion compensation of the high bands is performed with the motion vectors found in the corresponding low band [17].

III. PROPOSED MOTION ESTIMATION AND COMPENSATION IN WAVELET DOMAIN

Fig. 2 shows the block diagram of video coding using the proposed motion estimation and compensation in the wavelet domain. In the proposed coding scheme, an input video frame is decomposed by the wavelet transform. The motion estimation and compensation are performed in the wavelet domain, not in the spatial domain. The low-band-shift block in the block diagram generates three shifted low-low bands, which are shifted from the low-low band of the reference frame by one pixel along the x, the y, and the diagonal directions, respectively. Each of the original low-low band and the shifted low-low bands is then further decomposed. For a multilevel DWT, these shift and wavelet-transform operations on the low-low band are performed iteratively. The motion estimation finds a reference wavelet block that matches the current wavelet block. The wavelet block consists of the wavelet coefficients that are generated from the low-band-shift block in Fig. 2. The residual signal from the compensation is the difference between the current wavelet block and the corresponding reference wavelet block. The residual signal can be quantized and encoded by an embedded zerotree wavelet (EZW) coder [3] or by a set partitioning in hierarchical trees (SPIHT) coder [4].

Fig. 2. Block diagram of the proposed motion estimation and compensation scheme for wavelet-based video coding.

A. Wavelet Analysis and Synthesis Filter Banks

A 1-D signal can be decomposed into subband signals by the analysis filter bank and the decimator. Any or all of the resulting subbands can be decomposed again by the analysis filter bank and decimated for as many levels as desired. Fig. 3 shows the three-level 1-D wavelet analysis and synthesis filter banks. The four filters in Fig. 3 are the 1-D analysis and synthesis filters. For example, the Daubechies D(9, 7) filters [16], which are linear-phase biorthogonal filters, are given as follows: (1)

Fig. 3. Three-level 1-D wavelet analysis and synthesis filter banks.

Subband filtering of a 2-D image can be implemented by 1-D filtering in the x- and y-directions, since most 2-D wavelet filters are separable. The 2-D wavelet transform decomposes an image into four bands, LL, HL, LH, and HH, which are the low-low, the high-low, the low-high, and the high-high bands along the horizontal and the vertical directions, respectively. The reconstruction operation consists of upsampling followed by filtering with the synthesis filter bank. Subjecting all subbands to another stage of filtering and decimation leads to a uniform multilevel decomposition. If only the low band is further decomposed, the result is referred to as an octave-band decomposition or pyramid decomposition [18]. The proposed scheme is similar to the octave-band decomposition, but the low-low band is shifted and decomposed to overcome the shift-variant property of the wavelet transform.

Fig. 4. Interpretation of the low-band-shift method in 1-D. A total of 22 subband signals are generated by the 1-D three-level low-band-shift method.

Fig. 5. Example of the noble identities for the subband signal $H_3(6; n_3)$ in Fig. 4.

Fig. 6. Reorganization of the three-level wavelet coefficients into wavelet blocks.

B. Low-Band-Shift Method

The proposed motion estimation and compensation uses a low-band-shift method to overcome the shift-variant property. Fig. 4 shows the low-band-shift method in a three-level 1-D wavelet decomposition. The $l$th-level high-band signal of the input $s(n)$ is denoted by $H_l(k; n_l)$, where $k$ indicates the number of shifts in the spatial domain and $n_l$ is the location in the subband signal. At the first level, the original and the shifted signals are decomposed into their low-band and high-band signals.
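The 1-D low-band-shift decomposition of Fig. 4 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the Haar filter instead of D(9, 7), treats shifts as circular so that boundaries need no special handling, and the function names `haar_analysis` and `low_band_shift_1d` are hypothetical.

```python
import numpy as np

def haar_analysis(x):
    """One Haar analysis stage followed by two-fold decimation."""
    low = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    high = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return low, high

def low_band_shift_1d(s, levels=3):
    """Return (highs, lows) for a 1-D low-band-shift decomposition.

    highs[l - 1][k] is the level-l high band for a spatial shift of k samples,
    and lows[k] is the final-level low band for that shift.
    """
    lows = {0: np.asarray(s, dtype=float)}
    highs = []
    for level in range(1, levels + 1):
        # One sample of the current low band equals `step` spatial samples.
        step = 2 ** (level - 1)
        next_lows, level_highs = {}, {}
        for k, band in lows.items():
            for extra in (0, 1):
                # Assumption: circular shift of the band by zero or one sample.
                low, high = haar_analysis(np.roll(band, extra))
                next_lows[k + extra * step] = low
                level_highs[k + extra * step] = high
        highs.append(level_highs)
        lows = next_lows
    return highs, lows

signal = np.arange(32, dtype=float)
highs, lows = low_band_shift_1d(signal, levels=3)
num_high = sum(len(level) for level in highs)
num_low = len(lows)
print(num_high, num_low, num_high + num_low)  # 14 8 22
```

Each level doubles the number of stored bands because every low band is analyzed both unshifted and shifted by one sample, which gives 2 + 4 + 8 = 14 high bands and 8 low bands at the third level, matching the 22 subband signals quoted in the caption of Fig. 4.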
The first-level high-band signal generated from the original signal is denoted by $H_1(0; n_1)$, and the high-band signal from the one-pixel-shifted signal by $H_1(1; n_1)$. For the hierarchical decomposition of the proposed scheme, the low-band signal is further decomposed in the same way as at the first level. If the decomposition is performed up to the third level, a total of eight low-band and 14 high-band signals are generated, as shown in Fig. 4. For example, let $s_k(n)$ be the $k$-pixel-shifted signal of $s(n)$:

$$s_k(n) = s(n - k). \qquad (2)$$
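The practical consequence of the low-band-shift decomposition is that the subbands of any shifted signal s(n − k) can be read from the stored subbands of the original. The sketch below checks this numerically under the same assumptions as a simplified model (Haar filter, circular shifts, hypothetical function names): the level-l subband of the shifted signal should equal the stored subband for shift k mod 2^l, displaced by floor(k / 2^l) samples.

```python
import numpy as np

def haar_analysis(x):
    """One Haar analysis stage followed by two-fold decimation."""
    low = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    high = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return low, high

def octave_dwt(s, levels):
    """Ordinary octave-band DWT: high bands per level and the final low band."""
    highs, low = [], np.asarray(s, dtype=float)
    for _ in range(levels):
        low, high = haar_analysis(low)
        highs.append(high)
    return highs, low

def low_band_shift_1d(s, levels):
    """Low-band-shift subbands, keyed by spatial shift (circular shifts assumed)."""
    lows, highs = {0: np.asarray(s, dtype=float)}, []
    for level in range(1, levels + 1):
        step = 2 ** (level - 1)
        next_lows, level_highs = {}, {}
        for k, band in lows.items():
            for extra in (0, 1):
                low, high = haar_analysis(np.roll(band, extra))
                next_lows[k + extra * step] = low
                level_highs[k + extra * step] = high
        highs.append(level_highs)
        lows = next_lows
    return highs, lows

rng = np.random.default_rng(0)
s = rng.standard_normal(64)
levels = 3
lbs_highs, lbs_lows = low_band_shift_1d(s, levels)

all_match = True
for k in range(20):
    # Subbands of the k-pixel-shifted signal s(n - k), computed directly.
    highs_k, low_k = octave_dwt(np.roll(s, k), levels)
    for level in range(1, levels + 1):
        period = 2 ** level
        predicted = np.roll(lbs_highs[level - 1][k % period], k // period)
        all_match = all_match and bool(np.allclose(highs_k[level - 1], predicted))
    predicted_low = np.roll(lbs_lows[k % 2 ** levels], k // 2 ** levels)
    all_match = all_match and bool(np.allclose(low_k, predicted_low))

print(all_match)
```

Because the lookup needs only a modulo and an integer division of the shift, a motion search can evaluate every candidate displacement without recomputing a transform, at the cost of storing the extra shifted subbands.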

Each subband of the shifted signal $s_k(n)$ can be represented by that of $s(n)$ as follows:

$$
\begin{aligned}
H_1(k; n_1) &= H_1\big(k \bmod 2;\; n_1 - \lfloor k/2 \rfloor\big) \\
H_2(k; n_2) &= H_2\big(k \bmod 4;\; n_2 - \lfloor k/4 \rfloor\big) \\
H_3(k; n_3) &= H_3\big(k \bmod 8;\; n_3 - \lfloor k/8 \rfloor\big) \\
L_3(k; n_3) &= L_3\big(k \bmod 8;\; n_3 - \lfloor k/8 \rfloor\big)
\end{aligned}
\qquad (3)
$$

where $H_1(k; n_1)$, $H_2(k; n_2)$, $H_3(k; n_3)$, and $L_3(k; n_3)$ are the first-level high-band, the second-level high-band, the third-level high-band, and the third-level low-band signals of $s_k(n)$, respectively. In (3), $a \bmod b$ is the modulo operation, and the operator $\lfloor x \rfloor$ denotes the largest integer value less than or equal to $x$. The variables $n_1$, $n_2$, and $n_3$ in (3) are the two-fold, four-fold, and eight-fold decimated coordinates, respectively, of the spatial coordinate $n$. As shown in (3) and Fig. 4, the wavelet coefficients of any shifted signal can be obtained from the low-band-shift of the original signal. Fig. 5 shows an example of noble identities [19] for $H_3(6; n_3)$, where the value of six indicates a six-pixel-shifted signal in the original spatial domain and $n_3$ is the eight-fold-decimated coordinate of the original spatial coordinate. The other signals can be simplified by applying the noble identities. The above 1-D formulation can be easily extended to a wavelet transform of more than three levels and also to a 2-D image signal, as will be described in Section III-D.

C. Generation of Wavelet Block

In wavelet decomposition, every coefficient at a given scale, with the exception of those in the highest frequency subbands, can be related to a set of coefficients of the same orientation at the next finer scale. The coefficient at the coarse scale is called the parent, and the four coefficients representing the same spatial location and the same orientation at the next finer scale are called children. This relationship is exploited by representing the coefficients as a data structure called a wavelet tree. For the lowest frequency subband, the parent-child relationship is defined such that each parent has three children, each of which is in a subband at the same scale and the same spatial location, but at a different orientation.

The coefficients of each wavelet tree rooted in the lowest band are rearranged to form a wavelet block, as shown in Fig. 6. The purpose of the wavelet block is to provide a direct association between the wavelet coefficients and what they represent spatially in the image. Related coefficients at all scales and orientations are included in each block. For example, if a three-level wavelet transform is applied to an image as shown in Fig. 6, the wavelet block consists of three blocks from the first-level HL, LH, and HH subbands, three blocks from the second-level HL, LH, and HH subbands, and four blocks from the third-level LL, HL, LH, and HH subbands. Up to a four-level wavelet transform can be applied to construct the wavelet blocks. The wavelet block size should increase for a wavelet transform of more than four levels.

D. Motion Estimation and Compensation Using Low-Band-Shift

In the spatial domain, block-based motion estimation usually divides an image into small blocks of pixels and then finds an optimum block in the reference frame for each block of the current frame within a given search area. The block-matching motion-estimation algorithms find the motion vector that minimizes a given cost function. There are various cost functions, such as the mean absolute difference (MAD) and the mean squared difference (MSD). The implementation of MAD is much simpler than that of MSD.

The motion estimation of the proposed scheme finds the motion vector that generates the minimum MAD between the current wavelet block and the reference wavelet block. The coefficients of each subband are rearranged to form a wavelet block, as shown in Fig. 6. The wavelet blocks in the search window in the reference frame are compared to the current wavelet block, and the reference wavelet block that leads to the best match is selected.
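The wavelet-block reorganization of Section III-C can be sketched as follows. This is an illustrative version under stated assumptions rather than the paper's implementation: it uses a 2-D Haar transform, gathers one wavelet tree (a single root in the lowest band) into each block so that a three-level block is 8 × 8, and arranges the block in the usual pyramid layout. The function names are hypothetical, and the block size used in the paper is not assumed.

```python
import numpy as np

def haar2d_level(x):
    """One 2-D Haar level; returns LL, HL, LH, HH, each half the input size."""
    a = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2.0)   # horizontal low pass
    d = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2.0)   # horizontal high pass
    ll = (a[0::2, :] + a[1::2, :]) / np.sqrt(2.0)
    lh = (a[0::2, :] - a[1::2, :]) / np.sqrt(2.0)  # low horizontal, high vertical
    hl = (d[0::2, :] + d[1::2, :]) / np.sqrt(2.0)  # high horizontal, low vertical
    hh = (d[0::2, :] - d[1::2, :]) / np.sqrt(2.0)
    return ll, hl, lh, hh

def haar2d(x, levels=3):
    """Octave-band DWT: (HL, LH, HH) per level, finest first, plus the final LL."""
    details, ll = [], np.asarray(x, dtype=float)
    for _ in range(levels):
        ll, hl, lh, hh = haar2d_level(ll)
        details.append((hl, lh, hh))
    return details, ll

def wavelet_block(details, ll, i, j):
    """Gather the wavelet tree rooted at LL[i, j] into one square block."""
    levels = len(details)
    size = 2 ** levels
    block = np.zeros((size, size))
    block[0, 0] = ll[i, j]
    for level in range(levels, 0, -1):
        n = 2 ** (levels - level)   # side length of this level's patch
        hl, lh, hh = details[level - 1]
        rows, cols = slice(i * n, (i + 1) * n), slice(j * n, (j + 1) * n)
        block[0:n, n:2 * n] = hl[rows, cols]
        block[n:2 * n, 0:n] = lh[rows, cols]
        block[n:2 * n, n:2 * n] = hh[rows, cols]
    return block

rng = np.random.default_rng(1)
image = rng.standard_normal((32, 32))
details, ll = haar2d(image, levels=3)
block = wavelet_block(details, ll, i=1, j=2)

# With Haar filters the tree rooted at (1, 2) depends only on the 8 x 8 pixel
# tile at rows 8..15 and columns 16..23, so transforming that tile alone
# should give the same block.
tile = image[8:16, 16:24]
tile_details, tile_ll = haar2d(tile, levels=3)
tile_block = wavelet_block(tile_details, tile_ll, 0, 0)
```

The comparison with the transformed pixel tile makes the "direct association" between a wavelet block and an image region concrete. It is exact only for the Haar filter, whose basis functions do not overlap neighboring tiles; with longer filters such as D(9, 7) the association is approximate.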
As an example, an input image is decomposed up to the third level, giving a total of ten subbands: three subbands each at the first and the second levels, and four subbands at the third level. For a given displacement vector, the MAD of each wavelet block in Fig. 6 is computed as follows: (4)

where the initial points of the subbands at each level within the wavelet block are defined as (5)

The subband terms in (4) are the high-low, low-high, and high-high subbands of the reference frame at each level, together with the third-level low-low subband, as shown in Fig. 6. The wavelet block size in (4) is fixed throughout this paper. The subband signals of the 2-D reference frame in (4) result from a 2-D expansion of the 1-D subband signals in Fig. 4. The initial position used in (5) is that of the wavelet block in the spatial domain, as shown in Fig. 6. From (4), the optimum motion vector of each wavelet block, which has the minimum displacement error, is given by (6)

where the minimization is taken over the search window for motion estimation. The motion compensation for each wavelet block of the current frame is performed by fetching the corresponding wavelet block of the low-band-shifted reference frame at the optimum motion vector. The motion-compensated wavelet block is given by (7)

where the terms are the motion-compensated high-low, low-high, and high-high subbands at each level and the motion-compensated third-level low-low subband.

Fig. 7. Rate-distortion curves obtained by using the proposed motion estimation method and the conventional spatial-domain and wavelet-domain motion-estimation methods for the second frame of the football sequence, where the first frame of the football sequence is used as the reference frame.

Fig. 8. Performance comparison of the proposed motion-estimation method with the conventional spatial-domain and wavelet-domain motion-estimation methods in terms of the MAD.

TABLE I. AVERAGE MAD OF 100 FRAMES FOR SEVERAL VIDEO SEQUENCES WITH VARIOUS RESOLUTIONS

IV. SIMULATION

In order to verify the proposed method, comparison studies were performed for the proposed method and two conventional methods. Several video sequences were used for this simulation. The motion estimation was performed by block matching, and the discrete wavelet transform was performed by the D(9, 7) filter bank with a three-level decomposition. In this simulation, the search window was [−16, 16].
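The full-search block matching with a MAD cost that underlies these comparisons can be sketched generically. The code below is a simplified spatial-domain illustration with hypothetical names and a hypothetical 16 × 16 block size (the block size used in the paper is not assumed). In the proposed scheme the compared blocks would instead be wavelet blocks taken from the low-band-shifted reference.

```python
import numpy as np

def mad(a, b):
    """Mean absolute difference between two equally sized blocks."""
    return float(np.mean(np.abs(a - b)))

def full_search(current, reference, top, left, block=16, search=16):
    """Return ((dy, dx), cost) minimizing MAD over a [-search, search] window.

    (dy, dx) points from the block position in the current frame to the
    best-matching block position in the reference frame.
    """
    target = current[top:top + block, left:left + block]
    best_mv, best_cost = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if r < 0 or c < 0:
                continue
            if r + block > reference.shape[0] or c + block > reference.shape[1]:
                continue
            cost = mad(target, reference[r:r + block, c:c + block])
            if cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost

rng = np.random.default_rng(2)
reference = rng.standard_normal((64, 64))
# Hypothetical motion: the current frame is the reference moved down by two
# rows and left by three columns (circularly, for simplicity).
current = np.roll(reference, shift=(2, -3), axis=(0, 1))

motion_vector, cost = full_search(current, reference, top=24, left=24)
print(motion_vector, cost)
```

For this synthetic pair the search recovers a displacement of two rows up and three columns right into the reference with zero residual. Candidates outside the frame are skipped here; padding the reference is another common choice.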

In the comparison study, the first method was motion estimation using block matching in the spatial domain, with the residual signal from the prediction decomposed by a three-level DWT. The second method was motion estimation in the wavelet domain without the low-band shift, which is the direct band-to-band wavelet-domain motion estimation. The proposed motion estimation, which uses the low-band-shift method, was compared with these two motion estimation methods with respect to PSNR and MAD.

Fig. 9. Second frame of the football sequence and its reconstructed images from various motion estimation and compensation methods: (a) original second frame image, (b) the reconstructed image from the spatial-domain ME/MC, (c) error image between (a) and (b), (d) the reconstructed image from the direct wavelet-domain ME/MC, (e) error image between (a) and (d), (f) the reconstructed image from the proposed ME/MC, and (g) error image between (a) and (f).

TABLE II. COMPUTATIONAL COMPLEXITY AND MEMORY REQUIREMENTS FOR MOTION ESTIMATION AND COMPENSATION (FS: FULL SEARCH, DWT: DISCRETE WAVELET TRANSFORM, IDWT: INVERSE DISCRETE WAVELET TRANSFORM, LBS: LOW-BAND-SHIFT, GOP: GIGA OPERATIONS)

In the first experiment, we compared the rate-distortion results of the proposed method and the two conventional methods. The first frame of the football sequence was used as the reference frame, and the second frame of the football sequence was estimated from the reference frame. The rate-distortion curves of the proposed method and the conventional methods are presented in Fig. 7, where a uniform quantizer was used. In Fig. 7, the entropy on the horizontal axis includes only the quantized residual signal and does not include the motion vectors. The average improvement in PSNR over the spatial-domain motion estimation is 0.7 dB.

In the second experiment, the MAD of the residual signal from the motion estimation and compensation was compared for the first 100 frames of the football sequence. In this simulation, the $n$th frame was the reference frame and the $(n+1)$th frame was the current frame. The comparison results from the experiment are shown in Fig. 8. The average MADs for several video sequences with various resolutions are shown in Table I. These comparison results show that the proposed motion-estimation method outperforms the conventional motion estimation methods.

In order to show the subjective quality of the motion estimation methods, the second frame of the football sequence is presented in Fig. 9. In Fig. 9, the reconstructed images were obtained from the compressed displaced frame difference at 0.1 bpp. As shown in Fig. 9, the reconstructed image from the proposed method has better image quality and does not have any blocking effects.

In this paper, all experimental results were obtained with the D(9, 7) filters for the wavelet transform. Since the shift-variant property is a characteristic of the wavelet transform with any filter bank, our experimental results with other filter banks are similar to those with the D(9, 7) filter bank.

The analysis of the computational complexity and memory requirement of each method is given in Table II. The major disadvantages of the proposed scheme are the large memory requirement and the computational complexity. For a three-level hierarchical decomposition with the low-band-shift method, a total of nine frames of memory space are required, which corresponds to three frames for each wavelet level. However, the large memory requirement can be easily overcome through recent memory technology. The computational complexity of the proposed method is higher than that of the spatial-domain motion estimation because the low-band-shift method requires a DWT operation for each low-band-shifted frame. The computational complexity was calculated assuming full-search motion estimation with a search range of 16. The computational complexity of the proposed motion estimation is 10.3% higher than that of the spatial-domain motion estimation. However, the proposed algorithm has a better PSNR and a lower MAD in all cases. These simulation results show that the proposed method overcomes the shift-variant property of the wavelet transform and performs more precise and efficient motion compensation.

V. CONCLUSION

In this paper, a new motion-estimation method in the wavelet domain was proposed for wavelet-based moving-picture coding. The proposed motion estimation in the wavelet domain was accomplished by the low-band-shift method.
The low-band-shift method overcame the shift-variant property of the wavelet transform, so it could perform motion estimation and compensation more precisely and efficiently. The simulation results showed that the proposed scheme with the low-band-shift method outperformed both the full-search scheme in the spatial domain and the direct band-to-band motion estimation in the wavelet domain with respect to PSNR and MAD. The proposed scheme was essentially free from blocking effects because the wavelet decomposition is a global transform, and hence the distortion was distributed across the whole picture. The low-band-shift method can serve as a model method for motion estimation in the wavelet domain.

REFERENCES

[1] M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies, "Image coding using wavelet transform," IEEE Trans. Image Processing, vol. 1, pp. 205-220, Apr. 1992.
[2] S. Mallat, "A theory for multiresolution signal decomposition: The wavelet representation," IEEE Trans. Pattern Anal. Machine Intell., vol. 11, pp. 674-693, July 1989.
[3] J. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients," IEEE Trans. Signal Processing, vol. 41, pp. 3445-3462, Dec. 1993.
[4] A. Said and W. Pearlman, "A new, fast, and efficient image codec based on set partitioning in hierarchical trees," IEEE Trans. Circuits Syst. Video Technol., vol. 6, pp. 243-250, June 1996.
[5] Generic Coding of Moving Pictures and Associated Audio, ISO/IEC Standard JTC1 IS 13818, 1994.
[6] D. Taubman and A. Zakhor, "Multirate 3-D subband coding of video," IEEE Trans. Image Processing, vol. 3, pp. 572-588, Sept. 1994.

[7] Y. Zhang and S. Zafar, "Motion-compensated wavelet transform coding for color video compression," IEEE Trans. Circuits Syst. Video Technol., vol. 2, pp. 285-296, Sept. 1992.
[8] K. Uz, M. Vetterli, and D. LeGall, "Interpolative multiresolution coding of advanced television and compatible subchannels," IEEE Trans. Circuits Syst. Video Technol., vol. 1, pp. 86-99, Mar. 1991.
[9] C. Caffario, C. Guaragnella, F. Bellifemine, A. Chimienti, and R. Picco, "Motion compensation and multiresolution coding," Signal Process.: Image Commun., vol. 6, pp. 123-142, May 1994.
[10] S. Kim, T. Aboulnasr, and S. Panchanathan, "Adaptive multiresolution motion estimation techniques for wavelet-based video coding," Proc. SPIE Visual Communications Image Processing, vol. 3309, pp. 965-974, Jan. 1998.
[11] S. Kim, S. Ree, J. G. Jeon, and K. T. Park, "Interframe coding using two-stage variable block-size multiresolution motion estimation and wavelet decomposition," IEEE Trans. Circuits Syst. Video Technol., vol. 8, pp. 399-410, Aug. 1998.
[12] F. Meyer, A. Averbuch, and R. Coifman, "Motion compensation of wavelet coefficients for very low bit rate video coding," in Proc. IEEE Int. Conf. Image Processing, Santa Barbara, CA, Oct. 1997, pp. 638-641.
[13] P. Cheng, J. Li, and C. J. Kuo, "Multiscale video compression using wavelet transform and motion compensation," in Proc. IEEE Int. Conf. Image Processing, Washington, DC, Oct. 1995, pp. 606-609.
[14] S. Martucci, I. Sodagar, T. Chiang, and Y. Zhang, "A zerotree wavelet video coder," IEEE Trans. Circuits Syst. Video Technol., vol. 7, pp. 109-118, Feb. 1997.
[15] M. Ohta and S. Nogaki, "Hybrid picture coding with wavelet transform and overlapped motion-compensated interframe prediction coding," IEEE Trans. Signal Processing, vol. 41, pp. 3416-3424, Dec. 1993.
[16] J. Villasenor, B. Belzer, and J. Liao, "Wavelet filter evaluation for image compression," IEEE Trans. Image Processing, vol. 4, pp. 1053-1060, Aug. 1995.
[17] A. Nosratinia and M. T. Orchard, "Multi-resolution backward video coding," in Proc. IEEE Int. Conf. Image Processing, Washington, DC, Oct. 1995, pp. 563-566.
[18] P. Cosman, R. Gray, and M. Vetterli, "Vector quantization of image subbands: A survey," IEEE Trans. Image Processing, vol. 5, pp. 202-225, Feb. 1996.
[19] P. P. Vaidyanathan, Multirate Systems and Filter Banks. Englewood Cliffs, NJ: Prentice-Hall, 1993.

Hyun-Wook Park (A'93-SM'99) received the B.S. degree in electrical engineering from Seoul National University, Seoul, Korea, in 1981, and the M.S. and Ph.D. degrees in electrical engineering from Korea Advanced Institute of Science and Technology (KAIST), Taejon, in 1983 and 1988, respectively. He has been an Associate Professor with the Electrical Engineering Department, KAIST, since 1993. His current research interests include image computing systems, image processing, medical imaging, and multimedia systems.

Hyung-Sun Kim received the B.S. and M.S. degrees in control and instrumentation engineering from Seoul National University, Seoul, Korea, in 1992 and 1994, respectively. He is currently pursuing the Ph.D. degree in electrical engineering at Korea Advanced Institute of Science and Technology (KAIST). He has been with LG Electronics, Inc., Seoul, since 1999. His current research interests include image computing systems, image compression, and multimedia systems, especially DVD.