
MOTION ESTIMATION WITH THE REDUNDANT WAVELET TRANSFORM*

R. DeVore, A. Petukhov, R. Sharpley
Department of Mathematics, University of South Carolina, Columbia, SC 29208

Abstract

We present a fast method for computation of the motion flow based on the redundant Haar transform of reference frames in a video sequence. The algorithm gives a multiscale representation of the motion flow which allows progressive encoding. Its high efficiency is achieved by a multiscale correlation, on each of the Haar blocks, between the coefficients of the current and reference frames. The algorithm is designed to adapt dynamically so as to make optimal use of channel capacity.

The weakness of state-of-the-art encoders lies at two levels. The first is their inability to capture motion occurring within their decomposition blocks. The second is their fixed allocation of bits to describe the motion field. Preferable algorithms should dynamically allocate bits to the motion and to the residual in a progressive streaming environment. Addressing these deficiencies would have two benefits. First, it would make full and optimal use of channel capacity. Second, it would allow users with access to more bandwidth to select higher bit rates and so receive higher quality video.

Most video encoders are based on motion compensation. In this case the bitstream consists of two substreams. One contains information about the flow field, which is used for predicting the current frame. The second contains information on the residual image, i.e., on the difference between the actual and predicted frames. We propose new and fast algorithms (called WaveFlow) to compute flow fields using redundant wavelet decompositions. These flow fields can then be progressively encoded and coupled with progressive encoding of the residual in video compression. WaveFlow detects motion at all scales and can be used for pixel-wise flow field estimation.

A video is composed of a sequence of frames which we label F_0, F_1, F_2, .... Each frame is a digitized image and is therefore initially represented as an array of pixel values. The goal of compression is to reduce the number of bits so as to meet channel capacity. The sender (sometimes referred to as the server) has the original video at its disposal.

* This research was supported by ONR contract N00014-91-5-1076, DARPA/NSF Grant DMS-0221642, and ARO contract DAAD19-02-1-0028.

The receiver receives information with which it reconstructs the compressed frames F̄_0, F̄_1, .... At the point of sending (a compressed version of) frame F_k, the receiver already has the previous reconstructed frames F̄_0, F̄_1, ..., F̄_{k-1}. This can be used to advantage by the sender in reducing the number of bits to transmit: namely, the sender transmits only the bits that describe how to compute F̄_k given that F̄_0, F̄_1, ..., F̄_{k-1} are known. Our main question, then, is how to capture the detail in F_1, viewing it as a rearrangement of detail in F_0, which is already known to us via F̄_0.

Consider a pixel P in F_1 which corresponds to a region I(P) in F_0 that has been moved to its new position in F_1. If this model is correct, we can describe the pixel value at P by simply identifying I(P). With this assumption, one can associate to each P in F_1 a vector v_P which identifies I(P). This collection of vectors v_P, P in F_1, is called the flow field for the pair (F_0, F_1). The vectors v_P need to be computed for each new frame, which is computationally intensive. Our goal is to design a fast algorithm for computing the flow field. The two known ideas for computing flow fields are block motion compensation and optic flow, which we now describe.

Block motion compensation is the most common method for computing and encoding the flow field. In this method the image F_1 is decomposed into blocks of 8x8 or 16x16 pixels. In its coarsest implementation, motion compensation assigns to each such block one flow vector which describes the motion of every pixel in the block; in other words, the flow vectors are constant on the block. This approach gives high compression, since one has to encode very few flow vectors, but it suffers from several disadvantages:

1. Visible block artifacts are introduced.
2. Only linear motion of the whole block, as defined by the motion vector, is captured. That is, if an object is moving within a given block, or if a linear deformation of the entire block takes place, this will not be observed in the block motion compensation encoding of the flow vectors.
3. There is no adaptivity in the bit budget assigned to the motion compensation, and therefore none for the residual either.

A method to diminish blocking artifacts was used in [2]; it is based on piecewise bilinear approximation of the flow field instead of the piecewise constant approximation used in block motion compensation. For each vertex (corner) of a block one determines a flow vector. The flow vector associated with each pixel in a given block is then given by bilinear interpolation of the corner flow vectors. This method provides the desired continuity across blocks and allows one to partially overcome disadvantages 2 and 3. However, the method works with a fixed bit budget and cannot be used for progressive transmission. At the same time, the motion estimation part of the algorithm works on block motion compensation principles and has a very high computational cost.
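To make the bilinear idea concrete, here is a minimal Python/NumPy sketch (our own illustration, not code from [2]; the function name and the (dy, dx) vector layout are assumptions) that fills a BxB block with per-pixel flow vectors interpolated from its four corner vectors.

    import numpy as np

    def bilinear_block_flow(v00, v01, v10, v11, B):
        """Interpolate per-pixel flow vectors inside a BxB block from the four
        corner flow vectors (each a length-2 array: (dy, dx)).
        v00 = top-left, v01 = top-right, v10 = bottom-left, v11 = bottom-right."""
        v00, v01, v10, v11 = map(np.asarray, (v00, v01, v10, v11))
        t = np.linspace(0.0, 1.0, B)               # normalized position inside the block
        wy, wx = np.meshgrid(t, t, indexing="ij")  # wy: vertical weight, wx: horizontal weight
        # Standard bilinear blend of the four corner vectors, per component.
        flow = ((1 - wy)[..., None] * (1 - wx)[..., None] * v00
                + (1 - wy)[..., None] * wx[..., None] * v01
                + wy[..., None] * (1 - wx)[..., None] * v10
                + wy[..., None] * wx[..., None] * v11)
        return flow  # shape (B, B, 2)

For example, bilinear_block_flow([0, 0], [2, 0], [0, 2], [2, 2], 8) returns an 8x8x2 field that matches the corner vectors exactly and varies linearly in between; since adjacent blocks share corner vectors, this is what gives the continuity across block boundaries mentioned above.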

Point-wise estimation of the flow field is an important problem in video encoding. One method for computing the flow field, the so-called optic flow, is based on differential equations [1]. The sender determines a (nonlinear) transport equation which models the flow from F_0 to F_1 and uses this equation to encode the flow. It appears that this method for computing the optic flow field may be numerically intensive and unstable. Moreover, it will most likely encounter difficulties from lack of uniqueness and non-smoothness in the flow field. There also remains the question of whether the formulation of video by partial differential equations with prescribed forms is faithful to the underlying visual process. Nevertheless, this method deserves another serious look.

Redundant Haar Transform

We give a definition of a one-dimensional redundant Haar transform which has two main features: 1) invariance with respect to integer shifts; 2) high computational efficiency. Let (x_i), i in N, be a signal (a sequence of real numbers). We define the recursive procedure for the L-level redundant Haar transform as follows:

a) x_i^0 = x_i;
b) x_i^{k+1} = x_i^k + x_{i+2^k}^k,   x̃_i^{k+1} = x_i^k - x_{i+2^k}^k,   k = 0, 1, ..., L-1.

Given a finite signal (x_i = 0 for i < 0 and i >= M), the implementation of the transform requires extending the signal beyond its support. While most wavelet applications use symmetric extensions, these are less attractive for the computation of flow fields. Our numerical experiments have shown that the simple extension x_i = x_0 for i < 0 and x_i = x_{M-1} for i >= M leads to much better quality in flow field computations. We note that the Haar coefficients of this simple signal extension can be computed by slightly different formulae; for example, for the left extension the coefficients with index i < -2^{k-1} coincide with a single boundary coefficient and can be filled in directly.

To compute the two-dimensional redundant Haar transform for the (k+1)-st level of the decomposition, we simply apply the one-dimensional transform to the rows of the coefficients at the k-th level and then repeat the operation on the columns of the result (a small sketch of this construction is given below). For a more precise estimation of the flow field (up to one quarter of a pixel), we use bilinear interpolation of the whole-pixel values.
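As a concrete reference for the recursion above, the following Python/NumPy sketch (our own, with function names and return layout chosen for illustration; it follows the unnormalized recursion and the simple constant extension described in this section) implements the 1-D redundant Haar step and the row-then-column 2-D construction.

    import numpy as np

    def redundant_haar_1d(x, L):
        """L-level redundant (undecimated, unnormalized) Haar transform of a 1-D signal.
        Returns the final scaling sequence and one detail sequence per level."""
        s = np.asarray(x, dtype=float)
        details = []
        for k in range(L):
            step = 2 ** k
            # Constant extension on the right: x_i = x_{M-1} for i >= M.
            shifted = np.concatenate([s[step:], np.full(step, s[-1])])
            s_next = s + shifted   # x_i^{k+1} = x_i^k + x_{i+2^k}^k
            d_next = s - shifted   # detail:    x_i^k - x_{i+2^k}^k
            details.append(d_next)
            s = s_next
        return s, details

    def redundant_haar_2d_level(a, k):
        """One level (k -> k+1) of the 2-D redundant Haar transform: apply the 1-D
        step to the rows, then to the columns of the result. Returns the four
        coefficient planes in (row filter, column filter) order: LL, LH, HL, HH."""
        step = 2 ** k

        def one_step(s, axis):
            shifted = np.roll(s, -step, axis=axis)
            # Constant extension at the trailing border along this axis.
            idx = [slice(None)] * s.ndim
            idx[axis] = slice(-step, None)
            edge = [slice(None)] * s.ndim
            edge[axis] = slice(-1, None)
            shifted[tuple(idx)] = s[tuple(edge)]
            return s + shifted, s - shifted

        row_lo, row_hi = one_step(np.asarray(a, dtype=float), axis=1)
        ll, lh = one_step(row_lo, axis=0)
        hl, hh = one_step(row_hi, axis=0)
        return ll, lh, hl, hh

Here redundant_haar_1d returns the level-L scaling sequence plus the detail sequences, and redundant_haar_2d_level produces, for one level, full-resolution coefficient planes of the kind the matching step in the next section works with.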

The Computation of the Flow Field Utilizing the Redundant Haar Transform

Our method for the computation of the flow field, based on the multilevel redundant Haar transform, is able to estimate inter-frame motion. The main feature of this method is that it works with wavelet coefficients rather than with pixels. This results in a significant increase in computational efficiency when compared with other methods. In combination with a certain smoothing technique incorporated in the algorithm, it leads to a smooth, well-compressible flow field.

Our present implementations separate the problem of computing the flow field from that of compressing (encoding) it. Therefore, our method does not compete in efficiency with algorithms that exploit a joint computation-compression technique as in [2]. Those techniques maximize bit allocations by computing an approximation of the motion which can be maximally compressed. The advantage of our method is that it allows fast adaptation to different bit rates and to different channel conditions: having a good estimate of the flow field, the transmitter can decide how to encode the video sequence depending on the required quality and the current channel conditions.

The four Haar coefficients corresponding to the same square have a very simple geometric meaning (here L denotes the scaling filter and H the wavelet filter). The LL coefficient is just a mean value of the image intensity inside the square. The LH and HL coefficients give an average value of the gradient. The HH coefficient characterizes a twist of the surface corresponding to the given square. This reflects the well-known fact that Haar coefficients are a very simple but efficient tool for feature detection, in particular for edge detection. One could imagine other characteristics (different from Haar coefficients) associated with a square that could extract the essential features of the image on that square needed to describe the flow field. However, the computational cost of the Haar transform is merely one arithmetic operation per coefficient, which would certainly be hard to match for alternative methods.

To estimate the flow field between frames F_0 and F_1, we compute the regular Haar transform of the frame F_1 and the redundant Haar transform of the frame F_0. Each dyadic block Q generated by the wavelet decomposition of F_1 has 4 associated coefficients (1 scaling and 3 wavelet coefficients) at the given level. We start computing the flow field at the coarsest level (the lowest frequency level of the decomposition). For each Q, our goal is to find the block in F_0 whose associated Haar coefficients have the least deviation from the corresponding coefficients of Q (a small sketch of this matching cost is given below). Then we process the next level of the decomposition, and so on, down to pixel-wise estimation, using the motion vectors obtained during the previous step as a prediction for the vectors of the current step. For the lowest frequency level the prediction vectors are taken to be zero.

Generally speaking, the computation of the flow field is an ill-posed problem. In most real video sequences there exist smooth, almost flat areas of image intensity for which we have little chance of assigning a reliable motion vector. In this case many vectors can be good candidates, and some regularization method must be used. We follow the rule that, whenever there is a choice of which vector to assign to a block, we choose the vector corresponding to a smooth solution. We use two levels of regularization: the first provides a smooth transfer between levels (from a coarse level of the decomposition to a finer one), while the second provides smoothing within the current level.
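The following Python sketch (our own illustration; it assumes coefficient planes such as those produced by the hypothetical redundant_haar_2d_level helper sketched earlier, and the names and argument layout are ours) scores one candidate shift v for a dyadic square by the Euclidean distance between its four Haar coefficients and the reference frame's redundant-transform coefficients at the shifted location.

    import numpy as np

    def coeff_match_cost(cur_coeffs, ref_coeffs, block_pos, v):
        """Distance between the 4 Haar coefficients (LL, LH, HL, HH) of a dyadic
        square of the current frame F_1 and the corresponding redundant-transform
        coefficients of the reference frame F_0 shifted by the candidate vector v.

        cur_coeffs: tuple of 4 scalars for the square at this level.
        ref_coeffs: tuple of 4 full-resolution arrays (redundant transform of F_0).
        block_pos:  (row, col) of the square's anchor pixel.
        v:          candidate motion vector (dy, dx), integer pixels."""
        r = block_pos[0] + v[0]
        c = block_pos[1] + v[1]
        H, W = ref_coeffs[0].shape
        r = int(np.clip(r, 0, H - 1))   # clamp: constant extension outside the frame
        c = int(np.clip(c, 0, W - 1))
        diff = np.array([cur - ref[r, c] for cur, ref in zip(cur_coeffs, ref_coeffs)])
        return float(np.linalg.norm(diff))   # Euclidean norm of the 4-component difference

This distance is only the data term; the procedure below adds the penalty terms that regularize the choice of vector.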

Now we discuss our procedure for estimating the flow field in more detail. Let us assume that we have already estimated the motion vectors for the (k+1)-st (coarser) level of the decomposition. We halve the range of the search relative to that used for the previous level. The motion vector v^{k+1} from the previous level defines the starting value of the search. Given a square Q at level k, we determine a vector v_Q^k (associated with this square) which minimizes the functional

    ||W(v_Q^k)|| + λ_L ||v_Q^k - v^{k+1}||.    (1)

The first term of the functional is the Euclidean norm of the 4-component vector composed of the differences between the Haar coefficients of the current dyadic square and those of the redundant transform of the reference frame corresponding to the shift of the dyadic square by the vector v_Q^k. The second term is a penalty for too large a deviation from the predicted vector v^{k+1}.

We note that the above procedure provides a smooth transition from the coarse scale to the fine one; however, it does not give a smooth field within the current level of the decomposition. To increase the smoothness inside the current level we use the following additional processing step, iterated 2 times. We start with the vector v_Q^k and the vectors v_{Q'}^k of the eight neighbors Q' of Q, and introduce an additional term in the functional (1) which penalizes the deviation of the desired vector from the weighted average of the neighboring vectors v_{Q'}^k. For squares having a common edge with Q we use weights √2 times larger than for squares at the corners. Thus, for the current iteration we look for the vector which minimizes the functional

    ||W(v_Q^k)|| + λ_L ||v_Q^k - v^{k+1}|| + λ_H ||v_Q^k - v̄_Q^k||,

where v̄_Q^k denotes the weighted average of the neighboring vectors.
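A schematic Python version of the level-k search is sketched below (our own reconstruction, not the paper's code; coeff_match_cost is the hypothetical helper sketched earlier, and search_range, lam_L, lam_H are placeholders for the experimentally chosen values).

    import numpy as np

    def estimate_vector(cur_coeffs, ref_coeffs, block_pos, v_pred,
                        search_range, lam_L, lam_H=0.0, v_avg=None):
        """Pick the motion vector for one square at the current level by minimizing
            ||W(v)|| + lam_L * ||v - v_pred|| + lam_H * ||v - v_avg||,
        where W(v) is the 4-coefficient difference scored by coeff_match_cost.
        v_pred is the prediction inherited from the coarser level; v_avg (optional)
        is the weighted average of the eight neighbors used in the smoothing pass."""
        v_pred = np.asarray(v_pred, dtype=float)
        best_v, best_cost = None, np.inf
        for dy in range(-search_range, search_range + 1):
            for dx in range(-search_range, search_range + 1):
                v = v_pred + np.array([dy, dx])
                cost = coeff_match_cost(cur_coeffs, ref_coeffs, block_pos,
                                        np.round(v).astype(int))
                cost += lam_L * np.linalg.norm(v - v_pred)
                if v_avg is not None:
                    cost += lam_H * np.linalg.norm(v - np.asarray(v_avg))
                if cost < best_cost:
                    best_v, best_cost = v, cost
        return best_v

    def neighbor_average(vectors):
        """Weighted average of the 8 neighboring vectors listed in row-major order
        around the square: edge neighbors (indices 1, 3, 4, 6) are weighted sqrt(2)
        times more than corner neighbors, mirroring the weighting described above."""
        w = np.array([1.0, np.sqrt(2), 1.0, np.sqrt(2), np.sqrt(2), 1.0, np.sqrt(2), 1.0])
        vectors = np.asarray(vectors, dtype=float)   # shape (8, 2)
        return (w[:, None] * vectors).sum(axis=0) / w.sum()

In the procedure described above, the caller halves search_range at each finer level and runs the smoothing pass (supplying v_avg from neighbor_average) twice per level.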

The weight λ_L, which penalizes deviation from the previous (lower frequency) level of the decomposition, and the weight λ_H, which penalizes violation of smoothness at the current (higher frequency) level, were chosen experimentally. They decrease as the scales become finer.

Numerical Modeling for Video Coding

While the development of an efficient video codec is not the main goal of this paper, we describe some results on the performance of our method. We incorporated our motion estimation algorithm (WaveFlow) into the codec described in [2]. This wavelet-based codec consists of 5 highly optimized modules: 1) a discrete wavelet transform with optimized bases generated by rational symbols; 2) an encoder of intra-frames (still image encoder); 3) an encoder of inter-frames (residual image encoder); 4) a motion estimator; 5) a motion encoder. In our experiments we used modules 1)-3) and replaced the flow field estimator in module 4) by our estimator. For encoding the flow field (module 5), we used the encoder of module 2).

Our scheme outperforms popular formats such as MPEG-4 or Media Player; however, it gives 20-30% less compression than the codec from [2]. We speculate that the reason for this loss lies in the non-optimality of the flow field encoder in module 5): the intra-frame encoder used in module 2) is adapted to still images, whose nature differs from that of the flow field, since the flow field consists of much smoother regions than a regular still image. We believe that customizing the encoder in module 5) will improve this component of the compression.

References

[1] C. P. Bernard, Discrete wavelet analysis for fast optic flow computation, Applied and Computational Harmonic Analysis, vol. 11, pp. 32-63, 2001.
[2] G. Heising, D. Marpe, H. L. Cycon, and A. P. Petukhov, Wavelet-based very low bit-rate video coding using image warping and overlapped block motion compensation, IEE Proceedings - Vision, Image and Signal Processing, vol. 148, pp. 93-101, 2001.