Daala Codebase 17 Aug 2014


Contents
- include: Public API
- src: Main library code
- examples: Front-end tools
- tools: Ancillary tools (metrics, training, etc.)
- doc: What documentation there is
  - doc/coding_style.html: coding style guidelines

Build Targets
- Two build systems:
  - configure.ac, Makefile.am: autotools-based
  - unix/makefile: basic GNU makefile
- Three libraries:
  - libdaalabase (common code between encoder and decoder)
  - libdaaladec (decoder-specific)
  - libdaalaenc (encoder-specific)
- Examples:
  - encoder_example: Encodes video (encapsulated in Ogg)
  - dump_video: Decodes to YUV4MPEG raw video
  - player_example: Simple SDL player

Front End
- Three example programs:
  - encoder_example
    - The only one that currently does anything
    - Reads .y4m, writes .ogg
    - Also reconstructs frames and writes a separate .y4m file for each one
  - dump_video
    - Meant to decode to .y4m
    - Can copy the example from libtheora and strip out unneeded things
  - player_example
    - Meant to be a simple SDL-based player

Metrics (requires: --enable-dump-recons)
- tools/dump_{fastssim,psnr,psnrhvs,ssim}
- DAALA_ROOT=<build_dir> ./tools/rd_collect.sh <codec> *.y4m
- OUTPUT=<label> ./tools/rd_average.sh *.out
- IMAGE=prefix ./tools/rd_plot.sh *.out
- EC2 instances available: https://github.com/tdaede/rd_tool.git (talk to Thomas)

Debugging
- --enable-assertions: turn on assertions
- --enable-logging: turn on logging
  - The OD_LOG_MODULES env variable controls what gets printed; see the top of logging.c for a list
  - Ex: OD_LOG_MODULES=motion-estimation:4
- --enable-encoder-check: decode after encoding and check that the reconstructed frame matches the encoder's
- --enable-accounting: collect/dump statistics on bit usage
- make check: run unit tests
- make clean; make debug: produces an unoptimized debug build with assertions and logging enabled

Image Debugging
- od_state_dump_img(od_state *, od_img *, const char *tag)
  - Dumps a %08i%s%s.png using frame #, tag, and suffix
  - Suffix set via OD_DUMP_IMAGES_SUFFIX env variable (for parallel jobs)
- od_state_dump_yuv(od_state *, od_img *, const char *tag)
  - Like above, but dumps a single-frame YUV4MPEG file %08i%s-%s.y4m
- od_img_draw_point(od_img *img, int x, int y, const unsigned char ycbcr[3])
- od_img_draw_line(od_img *img, int x0, int y0, int x1, int y1, const unsigned char ycbcr[3])
- Configure with --enable-dump-images to enable
- See also --enable-dump-recons to dump reconstructed frames only
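The naming pattern for dumped images can be sketched as follows. This is only an illustration of the %08i%s%s.png format string from the slide; dump_name() is a hypothetical helper, not a function from the Daala tree, and the real code may assemble the name differently.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of Daala): builds
   "<8-digit frame number><tag><suffix>.png".
   If suffix is NULL, fall back to the OD_DUMP_IMAGES_SUFFIX environment
   variable, and to the empty string if that is unset as well. */
static int dump_name(char *buf, size_t len, int frame, const char *tag,
 const char *suffix) {
  assert(buf != NULL && tag != NULL);
  if (suffix == NULL) suffix = getenv("OD_DUMP_IMAGES_SUFFIX");
  if (suffix == NULL) suffix = "";
  /* Returns the number of characters the full name needs. */
  return snprintf(buf, len, "%08i%s%s.png", frame, tag, suffix);
}
```

Giving each parallel job its own suffix keeps jobs that dump the same frame number and tag from overwriting each other's files.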

Coding Tools
- Some coding tools can be enabled/disabled at compile time for testing purposes:
  - Block size min/max
  - Prefilter
  - Intra prediction
  - Haar DC
  - PVQ (vs. scalar quantization)
  - Chroma from Luma
  - Activity masking, quantization matrices
- Flags in internal.h
- Requires recompile: bitstream not compatible
- Be careful! We are already finding cases where some combinations are broken and/or subtly wrong (e.g., encoding information twice)

Video Data
- All video data in 8-bit Y'CbCr (possibly plus alpha)

    struct od_img_plane {
      unsigned char *data;
      unsigned char xdec;
      unsigned char ydec;
      int xstride;
      int ystride;
    };

    struct od_img {
      od_img_plane planes[OD_NPLANES_MAX];
      int nplanes;
      ogg_int32_t width;
      ogg_int32_t height;
    };
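A minimal sketch of how a sample might be addressed through these fields. It assumes that xdec/ydec are log2 decimation factors relative to luma and that xstride/ystride are byte steps between horizontally and vertically adjacent samples; the slide does not spell this out. sketch_plane, plane_dim() and plane_sample() are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>

/* Local mirror of the slide's od_img_plane, renamed so it cannot clash
   with the real header. */
typedef struct {
  unsigned char *data;
  unsigned char xdec;  /* assumed: log2 horizontal decimation */
  unsigned char ydec;  /* assumed: log2 vertical decimation */
  int xstride;         /* assumed: bytes between horizontal neighbors */
  int ystride;         /* assumed: bytes between vertical neighbors */
} sketch_plane;

/* Size of a plane along one axis for a given luma size, rounding up. */
static int plane_dim(int luma_dim, int dec) {
  return (luma_dim + (1 << dec) - 1) >> dec;
}

/* Address of the sample that covers luma position (x, y). */
static unsigned char *plane_sample(const sketch_plane *p, int x, int y) {
  assert(x >= 0 && y >= 0);
  return p->data + (ptrdiff_t)(y >> p->ydec)*p->ystride
   + (ptrdiff_t)(x >> p->xdec)*p->xstride;
}
```

Under these assumptions a 4:2:0 chroma plane has xdec == ydec == 1, so luma position (5, 3) maps to chroma sample (2, 1).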

Video Data
- Full flexibility only on encoder input
- Encoder copies data to internal buffer
  - Width/Height padded to a multiple of 32
    - Crop rectangle in state.info.pic_{x, y, width, height}
  - Start of rows aligned to 16-byte boundary
    - Probably needs to be 32
  - 32 pixels of padding on all sides: ystride > height
  - xstride == 1
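The padding rule can be illustrated with a short sketch. It assumes dimensions are rounded up to the next multiple of 32; SKETCH_SB_SIZE, pad_to_sb() and sb_count() are hypothetical names, not Daala identifiers.

```c
#include <assert.h>

#define SKETCH_SB_SIZE 32 /* hypothetical name for the 32-pixel unit */

/* Round a picture dimension up to the next multiple of 32. */
static int pad_to_sb(int dim) {
  assert(dim > 0);
  return (dim + SKETCH_SB_SIZE - 1) & ~(SKETCH_SB_SIZE - 1);
}

/* Number of 32-pixel units along one axis after padding. */
static int sb_count(int dim) {
  return pad_to_sb(dim)/SKETCH_SB_SIZE;
}
```

For a 1920x1080 input this gives a 1920x1088 internal frame; the crop rectangle would still record the original 1080 rows.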

Objects
- od_state (state.h):

    struct od_state {
      daala_info info;
      int ref_imgi[4];
      od_img ref_imgs[4];
      od_img io_imgs[2];
      ogg_int64_t cur_time;
      od_mv_grid_pt **mv_grid;
      int nhsb;
      int nvsb;
      unsigned char *bsize;
    };

- od_enc (encint.h):

    struct od_enc {
      od_state state;
      od_adapt_ctx adapt;
      oggbyte_buffer obb;
      od_ec_enc ec;
      int packet_state;
      int quantizer[OD_NPLANES_MAX];
      od_mv_est_ctx *mvest;
    };

- od_dec (decint.h):

    struct od_dec {
      od_state state;
      od_adapt_ctx adapt;
      oggbyte_buffer obb;
      od_ec_dec ec;
      int quantizer[OD_NPLANES_MAX];
      int packet_state;
    };

Entropy Coder
- Low-level encoding API (entenc.h):

    void od_ec_enc_bits(od_ec_enc *enc, ogg_uint32_t fl, unsigned ftb);
    void od_ec_encode_bool_q15(od_ec_enc *enc, int val, unsigned fz_q15);
    void od_ec_encode_bool(od_ec_enc *enc, int val, unsigned fz, unsigned ft);
    void od_ec_encode_cdf_q15(od_ec_enc *enc, int s, const ogg_uint16_t *cdf, int nsyms);
    void od_ec_encode_cdf_unscaled_dyadic(od_ec_enc *enc, int s, const ogg_uint16_t *cdf, int nsyms, unsigned ftb);
    void od_ec_encode_cdf(od_ec_enc *enc, int s, const ogg_uint16_t *cdf, int nsyms);
    void od_ec_encode_cdf_unscaled(od_ec_enc *enc, int s, const ogg_uint16_t *cdf, int nsyms);
    void od_ec_enc_uint(od_ec_enc *enc, ogg_uint32_t fl, ogg_uint32_t ft);

- Other encoder functions:

    int od_ec_enc_tell(od_ec_enc *enc);
    ogg_uint32_t od_ec_enc_tell_frac(od_ec_enc *enc);
    void od_ec_enc_checkpoint(od_ec_enc *dst, const od_ec_enc *src);
    void od_ec_enc_rollback(od_ec_enc *dst, const od_ec_enc *src);
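To show what a CDF argument to the *_cdf_q15 calls might look like, here is a small sketch. It assumes the convention that cdf[i] holds the cumulative frequency of symbols 0..i and that the last entry equals 32768 (Q15); the slide does not state this, so treat it as an assumption. symbol_range_q15() is a hypothetical helper and does not touch the real coder.

```c
#include <assert.h>

/* Assumed convention: cdf[i] = total frequency of symbols 0..i, with
   cdf[nsyms - 1] == 32768. Symbol s then owns the interval [fl, fh). */
static void symbol_range_q15(const unsigned short *cdf, int nsyms, int s,
 unsigned *fl, unsigned *fh) {
  assert(s >= 0 && s < nsyms);
  assert(cdf[nsyms - 1] == 32768u);
  *fl = s > 0 ? cdf[s - 1] : 0;
  *fh = cdf[s];
}
```

With cdf = {16384, 24576, 32768}, symbol 1 owns [16384, 24576), a probability of 8192/32768 = 1/4, so an ideal coder would spend about 2 bits on it.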

Entropy Decoder
- Low-level decoding API (entdec.h):

    ogg_uint32_t od_ec_dec_bits(od_ec_dec *dec, unsigned ftb);
    int od_ec_decode_bool_q15(od_ec_dec *dec, unsigned fz);
    int od_ec_decode_bool(od_ec_dec *dec, unsigned fz, unsigned ft);
    int od_ec_decode_cdf_q15(od_ec_dec *dec, const ogg_uint16_t *cdf, int nsyms);
    int od_ec_decode_cdf_unscaled_dyadic(od_ec_dec *dec, const ogg_uint16_t *cdf, int nsyms, unsigned _ftb);
    int od_ec_decode_cdf(od_ec_dec *dec, const ogg_uint16_t *cdf, int nsyms);
    int od_ec_decode_cdf_unscaled(od_ec_dec *dec, const ogg_uint16_t *cdf, int nsyms);
    ogg_uint32_t od_ec_dec_uint(od_ec_dec *dec, ogg_uint32_t ft);

- Other decoder functions:

    int od_ec_dec_tell(od_ec_dec *dec);
    ogg_uint32_t od_ec_dec_tell_frac(od_ec_dec *dec);
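On the decoding side, the core step of any CDF-based decoder is mapping a value back to the symbol whose interval contains it. The sketch below shows only that lookup, under the assumed convention that cdf[i] is the cumulative frequency of symbols 0..i with a total of 32768. A real range decoder works on scaled internal state, which is omitted here, and find_symbol_q15() is a hypothetical name.

```c
#include <assert.h>

/* Returns the symbol s such that fl <= v < fh, where fl is cdf[s - 1]
   (or 0 for s == 0) and fh is cdf[s]. Linear search, which is fine for
   small alphabets. */
static int find_symbol_q15(const unsigned short *cdf, int nsyms,
 unsigned v) {
  int s;
  assert(nsyms > 0);
  assert(v < 32768u);
  for (s = 0; s < nsyms - 1 && cdf[s] <= v; s++) {}
  return s;
}
```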

Higher-level Entropy Coding
- Basic adaptive CDF: generic_code.h

    void od_encode_cdf_adapt(od_ec_enc *ec, int val, ogg_uint16_t *cdf, int n, int increment);
    int od_decode_cdf_adapt(od_ec_dec *ec, ogg_uint16_t *cdf, int n, int increment);

- Generic coder: generic_code.h
  - Estimates model for you
  - Shape of distribution modeled via lookup tables, decaying tail, can be shared by many contexts
  - Particular context modeled via one parameter: expected value (updated per coded symbol)
- Laplace coder: laplace_code.h
  - Versions for one-sided (exponential) distribution, known max, and vector with known L1 norm
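A minimal sketch of what the adaptation step behind an adaptive CDF could look like: after a symbol is coded, its frequency grows by increment, and the table is halved when the total would leave 15 bits. The rescale threshold and the "+ i + 1" floor are assumptions made for this sketch, not details taken from the slides, and cdf_adapt_update() is a hypothetical name.

```c
#include <assert.h>

/* Update a cumulative table after coding symbol val. cdf[i] is the
   cumulative frequency of symbols 0..i (assumed convention). */
static void cdf_adapt_update(unsigned short *cdf, int n, int val,
 int increment) {
  int i;
  assert(val >= 0 && val < n);
  /* Assumed rescale rule: halve everything when the total would exceed
     15 bits; adding i + 1 keeps every symbol's frequency nonzero. */
  if (cdf[n - 1] + increment > 32767) {
    for (i = 0; i < n; i++) {
      cdf[i] = (unsigned short)((cdf[i] >> 1) + i + 1);
    }
  }
  /* Raising cdf[val..n-1] adds increment to val's own frequency only. */
  for (i = val; i < n; i++) cdf[i] = (unsigned short)(cdf[i] + increment);
}
```

Encoder and decoder would have to apply the same update after every symbol so that their tables stay in sync.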

Motion Estimation
- OBMC with adaptive partition sizes
  - https://people.xiph.org/~tterribe/notes/mc.pdf (doc/mc.tex)... ignore the stuff about CGI
- Staged subpel (currently)
  - Upsampled to hpel via od_state_upsample8()
  - qpel and 1/8th pel done via bilinear interpolation
- Decoder: mc.c/mc.h
  - OBMC blending (incl. multiresolution blending)
  - MV prediction
- Encoder: mcenc.c
  - SAD used for all decisions (no SATD yet)
  - (Non-overlapped) block matching for all block sizes (fullpel)
  - RDO for block-size decisions (Balmelli 2001)
    - Real OBMC for costing, (badly) faked rate estimates
  - Refine MVs via iterated dynamic programming
    - Subpel via diamond search during DP
  - MV resolution chosen on per-frame basis
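The fullpel block-matching stage can be sketched as a plain SAD search. This is a generic exhaustive search written for illustration; it is not the search strategy in mcenc.c, and block_sad() and fullpel_search() are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Sum of absolute differences between two w x h blocks. */
static unsigned block_sad(const unsigned char *cur, int cur_stride,
 const unsigned char *ref, int ref_stride, int w, int h) {
  unsigned sad = 0;
  int x;
  int y;
  for (y = 0; y < h; y++) {
    for (x = 0; x < w; x++) {
      sad += (unsigned)abs(cur[y*cur_stride + x] - ref[y*ref_stride + x]);
    }
  }
  return sad;
}

/* Exhaustive full-pel search over [-range, range] in both directions.
   ref points at the co-located block; the caller must guarantee at
   least range pixels of valid data (e.g. padding) around it. */
static unsigned fullpel_search(const unsigned char *cur, int cur_stride,
 const unsigned char *ref, int ref_stride, int w, int h, int range,
 int *best_dx, int *best_dy) {
  unsigned best = UINT_MAX;
  int dx;
  int dy;
  assert(range >= 0);
  *best_dx = 0;
  *best_dy = 0;
  for (dy = -range; dy <= range; dy++) {
    for (dx = -range; dx <= range; dx++) {
      unsigned sad = block_sad(cur, cur_stride,
       ref + dy*ref_stride + dx, ref_stride, w, h);
      if (sad < best) {
        best = sad;
        *best_dx = dx;
        *best_dy = dy;
      }
    }
  }
  return best;
}
```

The 32 pixels of padding around the internal frame buffers are what make a search window like this safe near picture edges.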

Intra Prediction
- Existing prediction done after the transform (freq. domain)
  - Currently disabled by default (OD_DISABLE_INTRA)
  - Replaced by Haar DC over whole superblock (OD_DISABLE_HAAR_DC)
  - Code in intra.c, trained tables in intradata*
- New hotness: Intra Paint
  - Perform prediction prior to transform, like MC
  - Can predict clean edges
  - Decouples prediction block sizes from transform block sizes
  - Easier to integrate with encoder MC decisions
  - Status/integration plans? (Jean-Marc)

Block Sizes
- Transform block sizes supported: 4x4, 8x8, 16x16
- Blocks organized into 32x32 superblocks
- Planned for higher block sizes via TF
  - Last attempt did not show improvements: https://review.xiph.org/65/
  - Enough has changed that it's time to try again
- Psychovisual block size decisions
  - Encoder estimates visibility of ringing artifacts
  - No RDO (but bias towards larger blocks at low rates)
  - Code in block_size*

Transforms
- OD_COEFF_SHIFT (4)
  - Amount to shift up 8-bit coefficients before transform (non-lossless only)
- Lapping: filter.h/filter.c
  - 4-point, 8-point, 16-point filters
  - od_apply_filter_rows()/od_apply_filter_cols() decide which filters to apply based on block sizes
  - Currently 4:4:4 or 4:2:0 only... need help to support 4:2:2
- DCTs: dct.h/dct.c
  - 4x4, 8x8, 16x16
  - Orthonormal scaling (e.g., DC scale == sqrt(1/n))
  - Reversible (bit-exact, both directions)
- TF: Trade off time-frequency resolution, tf.h/tf.c
  - OD_HAAR_KERNEL: if you need a Haar transform for something, use this
  - od_tf_up_hv_lp(): Increase frequency resolution, horizontal and vertical directions, then low-pass (used by CfL)
  - od_tf_up_hv: Increase horizontal and vertical frequency resolution
  - od_tf_down_hv: Decrease horizontal and vertical frequency resolution (increase time resolution)
  - od_tf_filter_2d()/od_tf_filter_inv_2d(): Second stage TF correction from Demo 3
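As an illustration of how an integer transform can be bit-exact in both directions, here is a 2-point Haar step built from lifting (the classic S-transform). This is a generic textbook construction; the slides do not show the definition of OD_HAAR_KERNEL or of the DCTs, so nothing here should be read as Daala's actual kernel. haar2_fwd() and haar2_inv() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Forward step: on return *a holds the low-pass value and *b holds the
   high-pass (difference) value. */
static void haar2_fwd(int *a, int *b) {
  int h;
  int l;
  assert(a != NULL && b != NULL);
  h = *a - *b;
  /* With an arithmetic right shift this is floor((a + b)/2). */
  l = *b + (h >> 1);
  *a = l;
  *b = h;
}

/* Inverse step: undoes haar2_fwd exactly by running the lifting steps
   backwards with the same rounding. */
static void haar2_inv(int *a, int *b) {
  int h;
  int y;
  assert(a != NULL && b != NULL);
  h = *b;
  y = *a - (h >> 1);
  *a = y + h;
  *b = y;
}
```

Because the inverse subtracts exactly the rounded term the forward step added, the round trip is exact whatever rounding the shift performs.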

PVQ
- Documentation: doc/video_pvq.lyx, doc/theoretical_results.lyx
  - https://people.xiph.org/~tterribe/daala/pvq201404.pdf
- Code: pvq.h/pvq.c, pvq_encoder.c, pvq_decoder.c
- Scan order, band partitioning in partition.h/partition.c
- Bits actually coded with Laplace coder
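For readers new to PVQ: a codeword is an integer vector of dimension N whose absolute values sum to K. The sketch below computes that L1 norm and counts how many such codewords exist, using the standard recurrence V(N, K) = V(N-1, K) + V(N, K-1) + V(N-1, K-1). It is only meant to illustrate the codebook; as the slide notes, Daala codes the values with the Laplace coder rather than by enumerating codewords. All names here are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_MAX_N 16 /* hypothetical limits for this sketch */
#define SKETCH_MAX_K 16

/* L1 norm (sum of absolute values) of an integer vector. */
static int l1_norm(const int *y, int n) {
  int sum = 0;
  int i;
  for (i = 0; i < n; i++) sum += abs(y[i]);
  return sum;
}

/* Number of integer vectors of dimension n with L1 norm exactly k. */
static unsigned long long pvq_codebook_size(int n, int k) {
  unsigned long long v[SKETCH_MAX_N + 1][SKETCH_MAX_K + 1];
  int i;
  int j;
  assert(n >= 0 && n <= SKETCH_MAX_N);
  assert(k >= 0 && k <= SKETCH_MAX_K);
  for (i = 0; i <= n; i++) {
    for (j = 0; j <= k; j++) {
      if (j == 0) v[i][j] = 1;      /* only the zero vector */
      else if (i == 0) v[i][j] = 0; /* no dimensions to hold pulses */
      else v[i][j] = v[i - 1][j] + v[i][j - 1] + v[i - 1][j - 1];
    }
  }
  return v[n][k];
}
```

For example, with N = 3 and K = 2 there are 18 codewords: 6 with a single ±2 entry and 12 with two ±1 entries.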

Basic Encoding Process
- Copy/pad image (Y'CbCr pixels)
- Prefilter across block boundaries
- Transform blocks
- Construct Intra predictors and pick an Intra mode
- Quantize + Encode coefficients (PVQ)
- Inverse transform
- Postfilter across block boundaries
- Dump Images/PSNR
- Will use whiteboard