Jointly Optimized Transform Domain Temporal Prediction (TDTP) and Sub-pixel Interpolation

Similar documents
ADAPTIVE INTERPOLATED MOTION COMPENSATED PREDICTION. Wei-Ting Lin, Tejaswi Nanjundaswamy, Kenneth Rose

Adaptive Interpolated Motion-Compensated Prediction with Variable Block Partitioning

Optimal Estimation for Error Concealment in Scalable Video Coding

JOINT INTER-INTRA PREDICTION BASED ON MODE-VARIANT AND EDGE-DIRECTED WEIGHTING APPROACHES IN VIDEO CODING

QUANTIZER DESIGN FOR EXPLOITING COMMON INFORMATION IN LAYERED CODING. Mehdi Salehifar, Tejaswi Nanjundaswamy, and Kenneth Rose

Frequency Band Coding Mode Selection for Key Frames of Wyner-Ziv Video Coding

An Optimized Template Matching Approach to Intra Coding in Video/Image Compression

Coding for the Network: Scalable and Multiple description coding Marco Cagnazzo

Low complexity H.264 list decoder for enhanced quality real-time video over IP

Scalable video coding with robust mode selection

Overview: motion-compensated coding

Intra-Mode Indexed Nonuniform Quantization Parameter Matrices in AVC/H.264

Digital Image Representation Image Compression

Fast Mode Decision for H.264/AVC Using Mode Prediction

Lecture 5: Compression I. This Week s Schedule

Lecture 5: Error Resilience & Scalability

High Efficiency Video Coding. Li Li 2016/10/18

Prediction Mode Based Reference Line Synthesis for Intra Prediction of Video Coding

Lec 04 Variable Length Coding in JPEG

LECTURE VIII: BASIC VIDEO COMPRESSION TECHNIQUE DR. OUIEM BCHIR

Coding of Coefficients of two-dimensional non-separable Adaptive Wiener Interpolation Filter

Complexity Reduced Mode Selection of H.264/AVC Intra Coding

A Quantized Transform-Domain Motion Estimation Technique for H.264 Secondary SP-frames

Review and Implementation of DWT based Scalable Video Coding with Scalable Motion Coding.

Fractal Compression. Related Topic Report. Henry Xiao. Queen s University. Kingston, Ontario, Canada. April 2004

Multimedia Communications. Audio coding

Using animation to motivate motion

Low-cost Multi-hypothesis Motion Compensation for Video Coding

Error Protection of Wavelet Coded Images Using Residual Source Redundancy

Xiaoqing Zhu, Sangeun Han and Bernd Girod Information Systems Laboratory Department of Electrical Engineering Stanford University

MPEG-4: Simple Profile (SP)

Optimized Progressive Coding of Stereo Images Using Discrete Wavelet Transform

University of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): /ICIP.1996.

FRAME-RATE UP-CONVERSION USING TRANSMITTED TRUE MOTION VECTORS

Topic 5 Image Compression

CODING METHOD FOR EMBEDDING AUDIO IN VIDEO STREAM. Harri Sorokin, Jari Koivusaari, Moncef Gabbouj, and Jarmo Takala

OVERVIEW OF IEEE 1857 VIDEO CODING STANDARD

DIGITAL IMAGE PROCESSING WRITTEN REPORT ADAPTIVE IMAGE COMPRESSION TECHNIQUES FOR WIRELESS MULTIMEDIA APPLICATIONS

Pyramid Coding and Subband Coding

Tutorial on Image Compression

Towards Performance and Scalability Analysis of Distributed Memory Programs on Large-Scale Clusters

A HYBRID DPCM-DCT AND RLE CODING FOR SATELLITE IMAGE COMPRESSION

Video encoders have always been one of the resource

Improving Energy Efficiency of Block-Matching Motion Estimation Using Dynamic Partial Reconfiguration

Lec 10 Video Coding Standard and System - HEVC

Pre- and Post-Processing for Video Compression

Daala: One year later

CMPT 365 Multimedia Systems. Media Compression - Video

WZS: WYNER-ZIV SCALABLE PREDICTIVE VIDEO CODING. Huisheng Wang, Ngai-Man Cheung and Antonio Ortega

Network Image Coding for Multicast

Image Segmentation Techniques for Object-Based Coding

Spectral Compression of Mesh Geometry

Modified SPIHT Image Coder For Wireless Communication

In the name of Allah. the compassionate, the merciful

Optimized Analog Mappings for Distributed Source-Channel Coding

Pyramid Coding and Subband Coding

First-order distortion estimation for efficient video streaming at moderate to high packet loss rates

Affine SKIP and MERGE Modes for Video Coding

Scalable Extension of HEVC 한종기

Decoding-Assisted Inter Prediction for HEVC

CS 335 Graphics and Multimedia. Image Compression

Pseudo sequence based 2-D hierarchical coding structure for light-field image compression

2014 Summer School on MPEG/VCEG Video. Video Coding Concept

Octree Generating Networks: Efficient Convolutional Architectures for High-resolution 3D Outputs Supplementary Material

EFFICIENT PU MODE DECISION AND MOTION ESTIMATION FOR H.264/AVC TO HEVC TRANSCODER

IMAGE COMPRESSION. Image Compression. Why? Reducing transportation times Reducing file size. A two way event - compression and decompression

DCT Based, Lossy Still Image Compression

Image Coding and Data Compression

Fast HEVC Intra Mode Decision Based on Edge Detection and SATD Costs Classification

Multiframe Blocking-Artifact Reduction for Transform-Coded Video

Digital Video Processing

Yui-Lam CHAN and Wan-Chi SIU

Region-based Fusion Strategy for Side Information Generation in DMVC

Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV

10.2 Video Compression with Motion Compensation 10.4 H H.263

Sample Adaptive Offset Optimization in HEVC

SINGLE PASS DEPENDENT BIT ALLOCATION FOR SPATIAL SCALABILITY CODING OF H.264/SVC

International Journal of Emerging Technology and Advanced Engineering Website: (ISSN , Volume 2, Issue 4, April 2012)

Mali GPU acceleration of HEVC and VP9 Decoder

SCALABLE HYBRID VIDEO CODERS WITH DOUBLE MOTION COMPENSATION

Reduced Frame Quantization in Video Coding

Fundamentals of Video Compression. Video Compression

VIDEO COMPRESSION SYSTEM BASED ON JOINT PREDICTIVE CODING

No-reference perceptual quality metric for H.264/AVC encoded video. Maria Paula Queluz

A New Psychovisual Threshold Technique in Image Processing Applications

FAST: A Framework to Accelerate Super- Resolution Processing on Compressed Videos

VIDEO COMPRESSION STANDARDS

Module 7 VIDEO CODING AND MOTION ESTIMATION

View Synthesis for Multiview Video Compression

JPEG Joint Photographic Experts Group ISO/IEC JTC1/SC29/WG1 Still image compression standard Features

CROSS-PLANE CHROMA ENHANCEMENT FOR SHVC INTER-LAYER PREDICTION

Performance Comparison between DWT-based and DCT-based Encoders

Multiple Description Coding for Video Using Motion Compensated Prediction *

Lecture 26: Missing data

Wireless Communication

Georgios Tziritas Computer Science Department

Module 6 STILL IMAGE COMPRESSION STANDARDS

Data Hiding in Video

Video Quality Analysis for H.264 Based on Human Visual System

A Low Bit-Rate Video Codec Based on Two-Dimensional Mesh Motion Compensation with Adaptive Interpolation

Transcription:

Jointly Optimized Transform Domain Temporal Prediction (TDTP) and Sub-pixel Interpolation Shunyao Li, Tejaswi Nanjundaswamy, Kenneth Rose University of California, Santa Barbara

BACKGROUND: TDTP MOTIVATION reference prediction Conventional temporal prediction: pixel-to-pixel

BACKGROUND: TDTP MOTIVATION reference prediction Conventional temporal prediction: pixel-to-pixel which ignores the spatial correlation -> suboptimal

BACKGROUND: TDTP MOTIVATION reference prediction Conventional temporal prediction: pixel-to-pixel which ignores the spatial correlation -> suboptimal Usually, people account for this in very complex ways: Multi-tap filtering, 3D subband coding, etc.

BACKGROUND: TDTP TDTP A different perspective: Spatial correlation is de-correlated in DCT domain Optimal one-to-one prediction! Transform Domain Temporal Prediction (TDTP) 1 1 J. Han et al. 2010, "Transform-domain temporal prediction in video coding: exploiting correlation variation across coefficients"

BACKGROUND: TDTP TEMPORAL CORRELATION Reference block Original block Pixel domain 1

BACKGROUND: TDTP TEMPORAL CORRELATION Reference block Original block Pixel domain 1 DCT domain 1497-2 -33-4 -21 81 14 0 229-10 64 52 1-70 -26 2 8 47-70 -146 39-15 1 5-136 -38 18 130-35 69 20-4 78-2 39-17 10-54 -30 8 43 17-46 -82-6 -20 19 4-25 1 15 37-10 35-12 -5-6 2 4 6 2-17 5 1 1505 1-44 -10-47 41 29-15 230-11 62 50 51-40 -34 19-41 38-53 -136-9 -8 14-15 -110-39 24 143-32 44 19 5 80 1 26-3 46-33 -50 8 0 23-44 -82-30 4 42-10 1-8 21 29 4 10-10 7-1 -2-3 3 8-12 -7-2

BACKGROUND: TDTP TEMPORAL CORRELATION Reference block Original block Pixel domain 1 DCT domain At low frequency, 1 1497-2 -33-4 -21 81 14 0 229-10 64 52 1-70 -26 2 8 47-70 -146 39-15 1 5-136 -38 18 130-35 69 20-4 78-2 39-17 10-54 -30 8 43 17-46 -82-6 -20 19 4-25 1 15 37-10 35-12 -5-6 2 4 6 2-17 5 1 1505 1-44 -10-47 41 29-15 230-11 62 50 51-40 -34 19-41 38-53 -136-9 -8 14-15 -110-39 24 143-32 44 19 5 80 1 26-3 46-33 -50 8 0 23-44 -82-30 4 42-10 1-8 21 29 4 10-10 7-1 -2-3 3 8-12 -7-2

BACKGROUND: TDTP TEMPORAL CORRELATION Reference block Original block Pixel domain 1 DCT domain At low frequency, 1 1497-2 -33-4 -21 81 14 0 229-10 64 52 1-70 -26 2 8 47-70 -146 39-15 1 5-136 -38 18 130-35 69 20-4 78-2 39-17 10-54 -30 8 43 17-46 -82-6 -20 19 4-25 1 15 37-10 35-12 -5-6 2 4 6 2-17 5 1 1505 1-44 -10-47 41 29-15 230-11 62 50 51-40 -34 19-41 38-53 -136-9 -8 14-15 -110-39 24 143-32 44 19 5 80 1 26-3 46-33 -50 8 0 23-44 -82-30 4 42-10 1-8 21 29 4 10-10 7-1 -2-3 3 8-12 -7-2

BACKGROUND: TDTP TEMPORAL CORRELATION Reference block Original block Pixel domain 1 DCT domain At low frequency, 1 1497-2 -33-4 -21 81 14 0 229-10 64 52 1-70 -26 2 8 47-70 -146 39-15 1 5-136 -38 18 130-35 69 20-4 78-2 39-17 10-54 -30 8 43 17-46 -82-6 -20 19 4-25 1 15 37-10 35-12 -5-6 2 4 6 2-17 5 1 1505 1-44 -10-47 41 29-15 230-11 62 50 51-40 -34 19-41 38-53 -136-9 -8 14-15 -110-39 24 143-32 44 19 5 80 1 26-3 46-33 -50 8 0 23-44 -82-30 4 42-10 1-8 21 29 4 10-10 7-1 -2-3 3 8-12 -7-2 At high frequency, < 1

BACKGROUND: TDTP TEMPORAL CORRELATION Reference block Dominated by low frequency part Original block Pixel domain 1 DCT domain At low frequency, 1 1497-2 -33-4 -21 81 14 0 229-10 64 52 1-70 -26 2 8 47-70 -146 39-15 1 5-136 -38 18 130-35 69 20-4 78-2 39-17 10-54 -30 8 43 17-46 -82-6 -20 19 4-25 1 15 37-10 35-12 -5-6 2 4 6 2-17 5 1 1505 1-44 -10-47 41 29-15 230-11 62 50 51-40 -34 19-41 38-53 -136-9 -8 14-15 -110-39 24 143-32 44 19 5 80 1 26-3 46-33 -50 8 0 23-44 -82-30 4 42-10 1-8 21 29 4 10-10 7-1 -2-3 3 8-12 -7-2 At high frequency, < 1

BACKGROUND: TDTP TEMPORAL CORRELATION Reference block Dominated by low frequency part Original block Pixel domain 1 DCT domain At low frequency, 1 1497-2 -33-4 -21 81 14 0 229-10 64 52 1-70 -26 2 8 47-70 -146 39-15 1 5-136 -38 18 130-35 69 20-4 78-2 39-17 10-54 -30 8 43 17-46 -82-6 -20 19 4-25 1 15 37-10 35-12 -5-6 2 4 6 2-17 5 1 1505 1-44 -10-47 41 29-15 230-11 62 50 51-40 -34 19-41 38-53 -136-9 -8 14-15 -110-39 24 143-32 44 19 5 80 1 26-3 46-33 -50 8 0 23-44 -82-30 4 42-10 1-8 21 29 4 10-10 7-1 -2-3 3 8-12 -7-2 At high frequency, < 1 TDTP: Better exploit the temporal correlation

BACKGROUND: TDTP TDTP For each DCT coefficient, its prediction is: x n = ˆx n 1 = E(x nˆx n 1 ) E(ˆx 2 n 1 ) Correlation between source and reference TDTP: scale reference with temporal correlation for each DCT coefficient

CHALLENGE: SUB-PIXEL INTERPOLATION CHALLENGE: SUB-PIXEL INTERPOLATION x n = ˆx n 1 0.999 0.998 0.997 0.996 0.978 0.983 0.748 0.700 0.512 0.640 0.470 0.339 Example values in 8x8 blocks High-freq are scaled down more than low-freq Similar to the interpolation filters low-pass frequency response The gain drops significantly!

EB-TDTP INTERPOLATION FILTER VS TDTP Interpolation TDTP

EB-TDTP INTERPOLATION FILTER VS TDTP Interpolation TDTP Interpolation filter maps the pixels as well as its neighbor pixels into a subspace TDTP de-correlates spatial correlation in the subspace

EB-TDTP EXTENDED BLOCK TDTP (EB-TDTP) EB-TDTP Interpolation

EB-TDTP EXTENDED BLOCK TDTP (EB-TDTP) X EB-TDTP Interpolation B 1 B 2 Ỹ = F 1 D 0 B 2 (D B2 XD 0 B 2 ) P B2 )D B2 F 2 DCT EB-TDTP Back to pixel domain interpolation

EB-TDTP EXTENDED BLOCK TDTP (EB-TDTP) X EB-TDTP Interpolation B 1 B 2 min Y Ỹ 2 Ỹ = F 1 D 0 B 2 (D B2 XD 0 B 2 ) P B2 )D B2 F 2 DCT EB-TDTP Back to pixel domain interpolation

JOINT OPTIMIZATION WITH FILTERS JOINT OPTIMIZATION Design {P B2, F 1, F 2 } to minimize the MSE Use an iterative approach to optimize one of them while fixing the others Fixing {F 1, F 2 }, optimize P B2 optimize EB-TDTP Fixing {P B2, F 2 }, optimize F 1 optimize interpolation filter Fixing {P B2, F 1 }, optimize F 2 min Y Ỹ 2 Ỹ = F 1 D 0 B 2 (D B2 XD 0 B 2 ) P B2 )D B2 F 2

JOINT OPTIMIZATION WITH FILTERS JOINT OPTIMIZATION Design {P B2, F 1, F 2 } to minimize the MSE J = Ax b 2 Use an iterative approach to optimize one of them while x opt =(A T A) 1 A T b fixing the others Fixing {F 1, F 2 }, optimize P B2 optimize EB-TDTP Fixing Fixing {P B2, F 2 }, optimize {P B2, F 1 }, optimize F 1 optimize interpolation filter F 2 min Y Ỹ 2 Ỹ = F 1 D 0 B 2 (D B2 XD 0 B 2 ) P B2 )D B2 F 2

RECAP RE-CAP TDTP de-correlates spatial correlation and exploits real temporal correlation across frequencies TDTP interferes with interpolation filter Joint design by an iterative approach

JOINT OPTIMIZATION WITH FILTERS NON-SEPARABLE FILTERS Separable filters cannot perfectly capture the spatial correlation

JOINT OPTIMIZATION WITH FILTERS NON-SEPARABLE FILTERS Separable filters cannot perfectly capture the spatial correlation Alternative: non-separable filters (at the same complexity) 2D 4x4 non-separable filters = two 1D 8-tap separable filters A similar iterative optimization approach to design {P B2, F} min Y Ỹ 2 Ỹ =(D 0 B 2 ((D B2 XD 0 B 2 ) DCT EB-TDTP P B2 )D B2 ) F non-separable wiener filter Back to pixel domain interpolation

INSTABILITY PROBLEM INSTABILITY PROBLEM IN TRAINING

INSTABILITY PROBLEM INSTABILITY PROBLEM IN TRAINING Whatever statistics we designed for will be changed when we apply the new predictor on it Because in a closed-loop system each frame is referencing from a different reconstruction now.

INSTABILITY PROBLEM INSTABILITY PROBLEM IN TRAINING frame 1 frame 2 frame 3

INSTABILITY PROBLEM INSTABILITY PROBLEM IN TRAINING 12 13 15 18 9 16 20 24 8 14 16 21 5 12 14 22 frame 1 12 13 15 18 9 16 20 24 8 14 16 21 5 12 14 22 frame 2 frame 3 Get some from the reference blocks and original blocks

INSTABILITY PROBLEM INSTABILITY PROBLEM IN TRAINING 12 13 15 18 9 16 20 24 8 14 16 21 5 12 14 22 frame 1 13 14 16 17 10 15 19 21 9 13 15 20 6 11 13 21 frame 2 frame 3

INSTABILITY PROBLEM INSTABILITY PROBLEM IN TRAINING 12 13 15 18 9 16 20 24 8 14 16 21 5 12 14 22 frame 1 0 13 14 16 17 10 15 19 21 9 13 15 20 6 11 13 21 frame 2 frame 3

INSTABILITY PROBLEM INSTABILITY PROBLEM IN TRAINING 12 13 15 18 9 16 20 24 8 14 16 21 5 12 14 22 frame 1 0 13 14 16 17 10 15 19 21 9 13 15 20 6 11 13 21 frame 2 frame 3 The change in reconstruction will keep propagating to the following frames and change the statistics completely in the end!

INSTABILITY PROBLEM SOLUTION ASYMPTOTIC CLOSED-LOOP (ACL) DESIGN [1] H. Khalil, K. Rose, and S. L. Regunathan, The asymptotic closed-loop approach to predictive vector quantizer design with application in video coding, TIP 2001 [2] S. Li, T. Nanjundaswamy, Y. Chen, and K. Rose, "Asymptotic Closedloop Design for Transform Domain Temporal Prediction, ICIP 2015

EXPERIMENTAL RESULTS RESULTS HEVC baseline: lowdelay P; using previous frame as reference frame; fixing CU/TU size to be 8x8; SAO disabled Experiment 1: design the EB-TDTP and interpolation for each sequence, aiming for offline encoding application

EXPERIMENTAL RESULTS RESULTS HEVC baseline: lowdelay P; using previous frame as reference frame; fixing CU/TU size to be 8x8; SAO disabled Experiment 1: design the EB-TDTP and interpolation for each sequence, aiming for offline encoding application RD curve for sequence BQSquare

EXPERIMENTAL RESULTS RESULTS Experiment 2: provide 8 modes of the trained parameters for encoder to choose for each sequence (with an overhead of 3 bits/sequence)

EXPERIMENTAL RESULTS RESULTS Experiment 2: provide 8 modes of the trained parameters for encoder to choose for each sequence (with an overhead of 3 bits/sequence) For simplicity, we use the 8 most distinct sets of predictors from the training set -> huge potential for proper mode design and adaptivity exploration

SUMMARY SUMMARY Paper #2324: Jointly Optimized Transform Domain Temporal Prediction (TDTP) and Sub-pixel Interpolation Transform domain temporal prediction (TDTP) disentangles the spatial and temporal correlation, and exploits the true temporal correlation at each frequency TDTP interferes with interpolation filter Extended blocks TDTP (EB-TDTP) accounts for the spatial correlations outside the block We jointly design the EB-TDTP and (separable and non-separable) sub-pixel interpolation filters in an iterative approach (main contribution of this paper) We use the asymptotic closed-loop (ACL) approach to avoid the instability problem due to quantization error propagation Future research includes proper mode design and adaptivity exploration for real-time encoding applications