FRAME-RATE UP-CONVERSION USING TRANSMITTED TRUE MOTION VECTORS


Yen-Kuang Chen (Intel Corp.), Anthony Vetro (Mitsubishi Electric ITA), Huifang Sun (Mitsubishi Electric ITA), and S. Y. Kung (Princeton University)

Abstract - In this paper, we present a video frame-rate up-conversion scheme that uses transmitted true motion vectors for motion-compensated interpolation. In past work, we demonstrated that a neighborhood-relaxation motion tracker provides more accurate true-motion information than a conventional minimal-residue block-matching algorithm. Although the technique for estimating true motion vectors is a novelty in its own right, its strength can be further demonstrated through various spatio-temporal interpolation applications. In this work, we focus on the particular problem of frame-rate up-conversion. In the proposed scheme, the true motion field is derived by the encoder and transmitted by normal means (e.g., MPEG or H.263 encoding). It is then recovered by the decoder and used not only for motion-compensated prediction but also to reconstruct missing data. We show that our neighborhood-relaxation motion estimation provides a practical method for constructing high-quality image sequences.

INTRODUCTION

Low bit-rate video compression plays a significant role in many multimedia applications, such as video-conferencing, video-phone, and video games. The proposed frame-rate up-conversion scheme provides better picture quality when playing back highly compressed video. To achieve acceptable coding results at low bit-rates, most encoders reduce the temporal resolution. In other words, instead of targeting the full frame rate of 30 frames/sec (fps), the frame rate may be reduced to 10 fps, which means that 2 out of every 3 frames are never even considered by the encoder. However, to display the full frame rate at the decoder, a recovery mechanism is needed.
The simplest recovery mechanism is to repeat each frame until a new frame is received. The problem is that the image sequence appears jerky, especially in areas of large or complex motion. Another simple mechanism is to linearly interpolate between coded frames. The problem here is that the image sequence appears blurry in areas of motion; this type of artifact is commonly referred to as a ghost artifact. From the above, it is clear that motion is the major cause of problems for image recovery of this kind. This has been observed by a number of researchers [1, 4, 5, 6], and it has been shown that motion-compensated interpolation can provide better results. In [6], up-sampling results are presented using decoded frames at low bitrates; however, the receiver needs to perform a separate motion estimation just for the interpolation. In [5], the proposed algorithm considers multiple motion vectors for a single block so as to provide better picture quality; however, this scheme requires extra motion information to be sent. In [4], motion-compensated interpolation is performed based on an object-based interpretation of the video. The main advantage of that scheme is that the decoded motion and segmentation information is used without refinement; however, a proprietary codec is used.

[Figure 1: The proposed frame-rate up-conversion scheme. The standard decoder path (VLD, inverse quantization, IDCT, motion compensation, frame memory) is extended with a frame-rate up-conversion stage that reuses the decoded motion vectors for motion-compensated interpolation.]

The method that we propose is applicable to most video coding standards and does not require an extra motion estimation. Our motion-compensated interpolation scheme is based on the decoded motion vectors that are used for inter-coding, and it does not require any proprietary information to be sent. One major advantage of this scheme over other motion-compensated interpolation schemes is that computation is saved on the decoder side. Beyond the cost saving at the decoder, another advantage is high quality: our simulation results show that the proposed scheme performs better than two conventional low-cost up-conversion methods. It eliminates the motion jerkiness of the frame-repetition scheme and reduces the motion blurriness that appears in non-motion-compensated linear interpolation. In addition, our motion estimation scheme is based on a neighborhood-relaxation formulation [3]. This true motion estimation process provides a more accurate representation of the motion within a scene, and hence it becomes easier to reconstruct information that needs to be recovered before display. Using the neighborhood-relaxation motion estimation algorithm results in better picture quality (0.15-0.3 dB SNR) than using the conventional full-search motion estimation algorithm.
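The two conventional low-cost recovery mechanisms discussed above, frame repetition and non-motion-compensated linear interpolation, can be sketched as follows. This is our own minimal illustration of the baselines, not the authors' code; it assumes frames are 2-D numpy intensity arrays, and the function names are hypothetical:

```python
import numpy as np

def repeat_frame(prev: np.ndarray) -> np.ndarray:
    """Frame repetition: hold the last decoded frame until a new one
    arrives (produces jerky motion in areas of large displacement)."""
    return prev.copy()

def linear_blend(prev: np.ndarray, nxt: np.ndarray) -> np.ndarray:
    """Non-motion-compensated linear interpolation: average the two
    surrounding decoded frames (produces ghosting/blur where there is
    motion, since moving pixels are averaged across positions)."""
    avg = (prev.astype(np.float32) + nxt.astype(np.float32)) / 2.0
    return avg.astype(prev.dtype)
```

Both baselines cost almost nothing at the decoder, which is why the proposed scheme is compared against them rather than against a decoder-side motion search.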
MOTION COMPENSATED INTERPOLATION

The proposed motion-compensated frame-rate up-conversion scheme uses the decoded motion vectors (cf. Figure 1). Here, we discuss the interpolation of frame F_t using information from frames F_{t-1} and F_{t+1}; it is easy to generalize the method to interpolating frame F_t from frames F_{t-m} and F_{t+n}. The proposed interpolation scheme is based on the following premise: as shown

in Figure 2(a), if block B_i moves by v_i from frame F_{t-1} to frame F_{t+1}, then it is likely that block B_i moves by v_i/2 from frame F_{t-1} to frame F_t, i.e.,

    I(p - v_i, t-1) = I(p - v_i/2, t) = I(p, t+1)   for all p in B_i,

where p = [x, y]^T indicates the pixel location, I(p, t) is the intensity of pixel [x, y] at time t, and v_i is the motion vector of block B_i. From the above, the basic technique to interpolate frame F_t from frames F_{t-1} and F_{t+1} can be stated as follows:

    Î(p - v_i/2, t) = (1/2) [ I(p - v_i, t-1) + I(p, t+1) ]   for all p in B_i,   (1)

where Î(p, t) is the reconstructed intensity of pixel [x, y] at time t. Because the pixel intensity at frame F_t is reconstructed as the average of the intensities of the corresponding pixels in frames F_{t-1} and F_{t+1}, this process can be referred to as motion-compensated linear interpolation.

Thus far, we have discussed how motion-compensated interpolation can assist in the recovery of missing frames. However, this discussion has assumed that a pixel is visible in all frames, while in reality a pixel may be occluded or uncovered. This is a major source of failure for conventional linear interpolation and is a potentially larger problem for motion-compensated interpolation: the motion in these areas cannot be tracked, and hence any motion information provides misleading or incorrect data. Therefore, additional heuristics are needed to interpolate the uncovered and occluded regions:

1. Uncovered region: We identify a block B_i as an uncovered region when it can be seen in F_t and F_{t+1} but not in F_{t-1}. When a block B_i in F_{t+1} is coded as an INTRA block, it usually implies that there is no matching displaced block in F_{t-1}; that is, B_i is in the uncovered region (from F_{t-1} to F_{t+1}).
Since we have no information about the uncovered region from F_t to F_{t+1}, we use a heuristic: a pixel is in the uncovered region if it (1) belongs to the corresponding location of an INTRA block B_i and (2) has not been motion compensated by other blocks. Hence, we have the following formulation:

    Î(p, t) = I(p, t+1)   for all p in B_i.   (2)

Note that since the block is INTRA coded, there is no motion information about the block. In this heuristic, we assume that the occluded and uncovered regions are stationary (v = 0). The reason is that object occlusion and reappearance often happen in the background, and it is most likely that the background has no motion. Hence, zero motion vectors are used for the occluded and uncovered regions.

2. Occluded region: Similar to the uncovered region, we identify a block B_i as an occluded region when it can be seen in F_{t-1} and F_t but not in F_{t+1}. All the blocks that are not in the occluded region can be motion compensated by

Eq. (1) and Eq. (2). When Î(p, t) has not been assigned any value by Eq. (1) or Eq. (2), it is usually in the occluded region. As a result,

    Î(p, t) = I(p, t-1).   (3)

In summary, we have shown how decoded motion vectors can be used to provide a motion-compensated interpolation. Given this dependency on the motion estimation algorithm at the encoder, it can be argued that a more accurate estimate of the true motion will yield better reconstruction of interpolated frames; that is, if v_i/2 is an accurate estimate of the true motion, then ||Î(p, t) - I(p, t)|| will be small. A suitable candidate for such a motion estimation algorithm is the true motion estimation algorithm presented in [2, 3]:

    motion of B_i = arg min_v { DFD(B_i, v) + Σ_{B_j ∈ N(B_i)} λ_j · DFD(B_j, v + δ_j) },   (4)

where DFD stands for the displaced frame difference, N(B_i) denotes the neighboring blocks of B_i, λ_j is the weighting factor for the different neighboring blocks, and a small perturbation δ_j is incorporated to allow local variations of the motion vectors among neighboring blocks due to non-translational motion. If a motion vector can reduce the DFD of the center block as well as the DFDs of its neighbors, it is selected as the motion vector by the encoder. In other words, when two motion vectors produce similar DFDs, the one closer to its neighbors' motion is selected. In [3], we show that the neighborhood-relaxation motion tracker captures the true movement of a block more accurately than the widely adopted minimal-DFD criterion:

    motion vector = arg min_v { DFD(v) }.   (5)

In terms of coding efficiency, our proposed motion estimation algorithm performs as well as the original minimal-residue motion estimation algorithm [2].

SIMULATION RESULTS

To evaluate the performance of the proposed method of frame-rate up-conversion using true motion vectors, a variety of 30 fps test sequences are encoded at 15 fps.
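The interpolation rules of Eqs. (1)-(3) can be sketched roughly as follows. This is our own illustrative reconstruction, not the authors' implementation; it assumes integer-pel motion vectors on a fixed block grid, with each block of F_{t+1} carrying one decoded vector pointing back to F_{t-1}, and all function and parameter names are hypothetical:

```python
import numpy as np

def mc_interpolate(prev, nxt, mvs, intra, bs=16):
    """Interpolate the skipped frame F_t from F_{t-1} (prev) and F_{t+1}
    (nxt) using decoded per-block motion vectors, following Eqs. (1)-(3).

    mvs   -- (H//bs, W//bs, 2) integer vectors (dy, dx) of each block in
             F_{t+1}, pointing back to F_{t-1}
    intra -- (H//bs, W//bs) boolean map of INTRA-coded blocks (no motion)
    """
    h, w = nxt.shape
    out = np.zeros((h, w), np.float32)
    filled = np.zeros((h, w), bool)

    # Eq. (1): each inter-coded block is projected halfway along its vector
    # and filled with the average of its two motion-aligned endpoints.
    for by in range(h // bs):
        for bx in range(w // bs):
            if intra[by, bx]:
                continue
            dy, dx = mvs[by, bx]
            for y in range(by * bs, (by + 1) * bs):
                for x in range(bx * bs, (bx + 1) * bs):
                    ty, tx = y - dy // 2, x - dx // 2   # target p - v_i/2 in F_t
                    sy, sx = y - dy, x - dx             # source p - v_i in F_{t-1}
                    if 0 <= ty < h and 0 <= tx < w and 0 <= sy < h and 0 <= sx < w:
                        out[ty, tx] = (float(prev[sy, sx]) + float(nxt[y, x])) / 2.0
                        filled[ty, tx] = True

    # Eq. (2): uncovered pixels -- inside an INTRA block and not yet
    # compensated by any other block -- are copied from F_{t+1} (zero motion).
    for by in range(h // bs):
        for bx in range(w // bs):
            if intra[by, bx]:
                blk = np.s_[by * bs:(by + 1) * bs, bx * bs:(bx + 1) * bs]
                sel = ~filled[blk]
                out[blk][sel] = nxt[blk][sel]
                filled[blk][sel] = True

    # Eq. (3): anything still unassigned is treated as occluded and is
    # copied from F_{t-1}.
    out[~filled] = prev[~filled]
    return out.astype(prev.dtype)
```

Note the order of the three rules matters: the motion-compensated fill runs first, the uncovered-region heuristic only touches pixels it left empty, and the occluded-region rule is the final fallback.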
The missing frames are interpolated and compared to the original frames. This process is illustrated in Figure 2(b). In Table 1, a summary of the average PSNR over all skipped frames is given. It is evident from this table that for sequences in which there is hardly any motion (e.g., akiyo and container) the linear interpolation does a better job of recovering the missing data. However, for the rest of the test sequences which contain moderate to high motion, the proposed method of frame-rate up-conversion performs much better. Next, we examine the performance of our motion-based frame-rate up-conversion using the neighborhood relaxation true motion tracker (see Eq. (4)) compared with the performance of the motion-based frame-rate up-conversion using the original

minimal-residue motion estimation (see Eq. (5)). These simulation results show that our true motion estimation algorithm always performs better than the minimal-residue motion estimation method; the margin of gain is about 0.15-0.3 dB. To get a better sense of the improvement achieved by the proposed method, Figure 3 provides a visual comparison for the foreman sequence. The proposed method is clearly much better than linear interpolation; in fact, there is almost a 3 dB difference. For the two motion-assisted methods, we observe that when the foreman turns his head, the brightness of his face changes; that is, the assumption of intensity conservation along the motion trajectory is not valid in this scene.

[Figure 2: (a) Motion-compensated interpolation: block B_i moves by v_i from frame t-1 to frame t+1, so v_i/2 is used to interpolate frame t; uncovered and occluded regions are marked. (b) Our performance comparison scheme for frame-rate up-conversion using transmitted true motion: we drop one of every two frames in the original 30 fps sequence, encode the remaining frames at 15 fps with a standard video codec, motion-interpolate the missing frames while decoding, and compare each skipped original frame with its interpolated reconstruction.]

    Sequence          Non-motion-comp.   Old ME     Our ME     SNR improved
    akiyo             43.26 dB           42.71 dB   42.87 dB   0.16 dB
    coastguard        26.39 dB           30.47 dB   30.62 dB   0.15 dB
    container         41.89 dB           40.47 dB   40.77 dB   0.30 dB
    foreman           26.41 dB           28.30 dB   28.63 dB   0.33 dB
    hall monitor      35.69 dB           36.02 dB   36.15 dB   0.13 dB
    mother daughter   39.07 dB           39.68 dB   39.93 dB   0.25 dB
    news              33.01 dB           32.78 dB   33.08 dB   0.30 dB
    stefan            18.76 dB           21.06 dB   21.20 dB   0.14 dB

Table 1: Comparison of different motion-based frame-rate up-conversion methods. Our neighborhood-relaxation motion tracker performs about 0.15-0.3 dB (SNR) better than the minimal-residue motion estimation method.
Under this condition, it is extraordinarily difficult for a motion estimation algorithm to track the true motion. Hence, the motion vectors estimated by the minimal-residue criterion contain many errors, and the interpolated frame is visually disturbing. In contrast, our more reliable neighborhood-relaxation motion tracker significantly reduces these errors. The SNR improvement in this example is 1.5 dB.
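The neighborhood-relaxation criterion of Eq. (4) that produces this robustness can be sketched as follows. This is a hedged illustration with hypothetical names, not the authors' code: it uses SAD as the DFD measure, evaluates each neighbor at the candidate vector itself (omitting the small per-neighbor perturbation δ_j of Eq. (4)), and assumes candidate vectors keep every block inside the frame:

```python
import numpy as np

def dfd(cur, ref, y, x, v, bs=16):
    """Displaced frame difference (here SAD) of the bs x bs block whose
    top-left corner is (y, x) in cur, displaced by v = (dy, dx) into ref."""
    dy, dx = v
    blk = cur[y:y + bs, x:x + bs].astype(np.int32)
    cand = ref[y + dy:y + dy + bs, x + dx:x + dx + bs].astype(np.int32)
    return np.abs(blk - cand).sum()

def relaxed_cost(cur, ref, y, x, v, neighbors, lam=0.5, bs=16):
    """Eq. (4) score of candidate vector v for the block at (y, x): its own
    DFD plus the weighted DFDs of the neighboring blocks under the same
    vector. A vector that also fits the neighbors scores lower, biasing the
    search toward the true, locally smooth motion field instead of the bare
    minimal-residue criterion of Eq. (5)."""
    cost = dfd(cur, ref, y, x, v, bs)
    for (ny, nx) in neighbors:  # top-left corners of neighboring blocks
        cost += lam * dfd(cur, ref, ny, nx, v, bs)
    return cost
```

A full estimator would evaluate relaxed_cost over the whole candidate search window and keep the arg-min vector; the minimal-residue baseline is the same loop with the neighbor term dropped.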

Figure 3: Simulation results for visual comparison. (a) The 138th frame of the foreman sequence. (b) The 139th frame, which is not coded in the bit-stream. (c) The 140th frame. (d) The reconstructed 139th frame (26.86 dB) using non-motion-compensated linear interpolation of the 138th and 140th frames. (e) The reconstructed 139th frame (28.31 dB) using transmitted motion vectors generated by full-search motion estimation. (f) The reconstructed 139th frame (29.81 dB) using transmitted motion vectors generated by the proposed true motion estimation.

SUMMARY

We have described a novel frame-rate up-conversion algorithm that uses decoded true motion vectors for motion-compensated interpolation, i.e., it uses only the information contained within the bit-stream. Because no motion estimation has to be repeated on the decoder side, this technique provides a low-cost solution for playing back highly compressed video with better picture quality.

REFERENCES

[1] R. Castagno, P. Haavisto, and G. Ramponi, "A Method for Motion Adaptive Frame Rate Up-conversion," IEEE Trans. on Circuits and Systems for Video Technology, 1996.
[2] Y.-K. Chen and S. Y. Kung, "Rate Optimization by True Motion Estimation," in Proc. of IEEE Workshop on Multimedia Signal Processing, 1997.
[3] Y.-K. Chen, Y.-T. Lin, and S. Y. Kung, "A Feature Tracking Algorithm Using Neighborhood Relaxation with Multi-Candidate Pre-Screening," in Proc. of ICIP '96.
[4] S.-C. Han and J. W. Woods, "Frame-rate Up-conversion Using Transmitted Motion and Segmentation Fields for Very Low Bit-rate Video Coding," in Proc. of ICIP '97.
[5] K. Kawaguchi and S. K. Mitra, "Frame Rate Up-Conversion Considering Multiple Motion," in Proc. of ICIP '97.
[6] R. Thoma and M. Bierling, "Motion Compensating Interpolation Considering Covered and Uncovered Background," Signal Processing: Image Communication, 1989.