Introduction to Video Encoding Preben N. Olsen University of Oslo and Simula Research Laboratory preben@simula.no August 26, 2013 1 / 37
Agenda
1 Introduction
- Repetition
- History
- Quality Assessment
- Containers
2 Video Encoding Fundamentals
- Macroblocks
- Frames
- Prediction Modes
- Motion Compensation
- Parallel Encoding
Repetition
From the first lecture...
Media Compression
- Raw data is inconvenient: file sizes are very large
- Compression reduces bandwidth and storage costs
Image Representation
- Number of pixels, e.g., 1920 × 1080
- Color representation per pixel
YUV (YCbCr) Color Space
- Y is the luma component (light intensity)
- U is a chroma component (color)
- V is a chroma component (color)
- Reduce file size by chroma sub-sampling
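Chroma sub-sampling can be illustrated with a minimal sketch. It assumes a chroma plane stored as a list of rows and a 4:2:0 scheme that averages each 2 × 2 block; the function name `subsample_420` and the averaging filter are illustrative choices, not taken from the lecture.

```python
def subsample_420(plane):
    """Halve a chroma plane in both dimensions by averaging each 2x2 block.

    `plane` is a list of rows of integer samples. A trailing odd row or
    column is dropped in this sketch; real encoders pad instead.
    """
    height, width = len(plane), len(plane[0])
    out = []
    for y in range(0, height - 1, 2):
        row = []
        for x in range(0, width - 1, 2):
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append((total + 2) // 4)  # average, rounded to nearest
        out.append(row)
    return out


u_plane = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [4, 4, 1, 2],
    [4, 4, 3, 4],
]
print(subsample_420(u_plane))
```

Applied to both U and V while Y stays at full resolution, this keeps one quarter of the chroma samples, so an 8-bit frame shrinks from 24 to 12 bits per pixel on average.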
Repetition
Figure: RGB and CMYK [1]
Repetition
Figure: YUV Dissected, original [2]
Repetition
Full HD YUV frame size...
1920 × 1080 × 24 bits ≈ 5.9 MB
Repetition
Why not (buy and) download The Hobbit in full YUV format?
5.9 MB × 48 FPS × (162 × 60) seconds ≈ 2.6 TB
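The arithmetic for raw frame and film size can be checked with a short sketch. The helper names are hypothetical, and the units are binary (MiB, TiB), which is an assumption about how the slide figures were rounded.

```python
def raw_frame_bytes(width, height, bits_per_pixel=24):
    """Size of one uncompressed frame in bytes."""
    return width * height * bits_per_pixel // 8


def raw_video_bytes(width, height, fps, seconds, bits_per_pixel=24):
    """Size of an uncompressed video: frame size x frame rate x duration."""
    return raw_frame_bytes(width, height, bits_per_pixel) * fps * seconds


frame = raw_frame_bytes(1920, 1080)
movie = raw_video_bytes(1920, 1080, fps=48, seconds=162 * 60)
print(f"{frame / 2**20:.1f} MiB per frame")
print(f"{movie / 2**40:.1f} TiB for the film")
```

Passing `bits_per_pixel=12` gives the corresponding figures for 4:2:0 sub-sampled video, which halves both numbers.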
Repetition
Figure: YUV Data Layout, original [3]
Repetition
Figure: JPG Block Diagram [1]
Repetition
- JPEG is short for Joint Photographic Experts Group
- There's a trade-off between size and quality in JPEG images
- A compression rate of 1:10 gives a reasonable result
- Lossless JPEG encoding yields a compression rate of approx. 1:1.6
History
- MPEG is short for Moving Picture Experts Group
- Industry, together with ISO and ITU, develops the standards
- MPEG-1 started in 1988, released in 1993
- MPEG-2 started in 1990, released in 1996
History
- MPEG-3 was to include support for HDTV (1080p)
- MPEG-4 started in 1998, released between 1999-...
- Part 2 of MPEG-4 describes H.263 Advanced Simple Profile
- Sometimes referred to as DivX or Xvid
History
- Part 10 of MPEG-4 defines H.264, introduced in 2003
- Twice the compression of H.263 (MPEG-4 ASP)
- Used by Blu-ray, RiksTV, YouTube, and many others
- Sometimes referred to as x264
- This codec has 17 different profiles
History
- Work on High Efficiency Video Coding (HEVC), or H.265, started in 2004
- HEVC has better compression at the same level of quality
- Released to the public on June 7th, 2013 [4]
- Supports Ultra High Definition TV (UHDTV), 7680 × 4320
Money and Politics
- Patent pool created by MPEG-LA
- About 1,500 patents related to H.264
- Incentive for large, global companies
History
- Google bought On2, which initially developed VP8
- VP8 spec. released with an open-source implementation in 2010
- Supported by many browsers and mobile platforms
- Ongoing development on VP9, an HEVC competitor
Quality Assessment
- Assessing video quality is difficult
- Subjective tests: a group of people rate which version is best
- People have different opinions on quality
- Objective measurements can give an estimate, e.g., Peak Signal-to-Noise Ratio (PSNR)
- A shell script for PSNR is found in the mplayer source tree
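As a rough illustration of an objective measurement, PSNR can be computed from the mean squared error between a reference and a distorted signal. This sketch assumes 8-bit samples given as flat lists; it is not the mplayer script mentioned above.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak Signal-to-Noise Ratio in dB between two equally long sample lists."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no noise at all
    return 10 * math.log10(max_value ** 2 / mse)


original = [100, 100, 100, 100]
compressed = [110, 90, 110, 90]
print(f"{psnr(original, compressed):.2f} dB")
```

A higher value means the distorted signal is closer to the reference, but PSNR is only an estimate and does not always agree with what viewers perceive.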
Containers
- File containers are not codecs
- Video codecs are used for encoding and decoding bitstreams
- Containers are used for packaging bitstreams
- Examples include Audio Video Interleave (AVI), Matroska (MKV), Video Objects (VOB), and OGG
Video Encoding Fundamentals
Figure: Overview of H.264/VP8
Macroblocks
Figure: Missing macroblocks [5]
Macroblocks
- Different macroblock types and sizes
- 16 × 16 pixels, subdivided into 4 × 4
- Intra-predicted, predicted, and bi-directionally predicted macroblocks
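The partitioning into macroblocks and sub-blocks can be sketched as follows. It assumes frames and blocks are lists of rows and that partial macroblocks at the frame edge are padded to full size; the function names are illustrative.

```python
def macroblock_grid(width, height, mb_size=16):
    """Return (columns, rows) of macroblocks needed to cover a frame.

    Partial blocks at the right and bottom edges count as full blocks,
    which corresponds to padding the frame.
    """
    cols = (width + mb_size - 1) // mb_size
    rows = (height + mb_size - 1) // mb_size
    return cols, rows


def split_into_subblocks(block, sub_size=4):
    """Split a square block into sub_size x sub_size sub-blocks, in raster order."""
    size = len(block)
    subs = []
    for y in range(0, size, sub_size):
        for x in range(0, size, sub_size):
            subs.append([row[x:x + sub_size] for row in block[y:y + sub_size]])
    return subs


print(macroblock_grid(1920, 1080))
macroblock = [[y * 16 + x for x in range(16)] for y in range(16)]
print(len(split_into_subblocks(macroblock)))
```

For a Full HD frame the height is not a multiple of 16, so the last macroblock row extends past the picture.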
Macroblocks
Figure: Foreman and macroblocks
Frames
- Also different frame types
- Usually intra-predicted frames, predicted frames, and bi-directional predicted frames
- VP8 does not have bi-directional frames, but has alt-ref and golden frames
Figure: Different frames [6]
Frames
Prediction types: 1 Intra-prediction, 2 Inter-prediction, 3 Bi-directional
Intra-prediction:
- Predict the pixels of a macroblock using information available within a single frame.
- Typically predicts from the left, top, and top-left macroblocks by inter- or extrapolating the border pixels' values.
- Different prediction modes are available, e.g., horizontal, vertical, and average.
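The horizontal, vertical, and average intra modes can be sketched for a 4 × 4 block as below. It assumes `top` holds the row of pixels above the block and `left` the column to its left, and it picks a mode by sum of absolute differences; the mode set and selection rule are simplifications, not the exact H.264 or VP8 definitions.

```python
def predict_vertical(top, size=4):
    """Copy the row above the block downwards."""
    return [list(top[:size]) for _ in range(size)]


def predict_horizontal(left, size=4):
    """Copy the column left of the block to the right."""
    return [[left[y]] * size for y in range(size)]


def predict_average(top, left, size=4):
    """Fill the block with the rounded mean of the border pixels."""
    total = sum(top[:size]) + sum(left[:size])
    mean = (total + size) // (2 * size)
    return [[mean] * size for _ in range(size)]


def sad(a, b):
    """Sum of absolute differences between two blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def best_intra_mode(block, top, left):
    """Try every mode and return the name of the cheapest one."""
    candidates = {
        "vertical": predict_vertical(top),
        "horizontal": predict_horizontal(left),
        "average": predict_average(top, left),
    }
    return min(candidates, key=lambda name: sad(block, candidates[name]))


top = [10, 20, 30, 40]
left = [50, 50, 50, 50]
block = [[10, 20, 30, 40] for _ in range(4)]
print(best_intra_mode(block, top, left))
```

Only the chosen mode and the remaining difference between block and prediction then need to be coded.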
Frames
Prediction types: 1 Intra-prediction, 2 Inter-prediction, 3 Bi-directional
Inter-prediction:
- Predict a macroblock by reusing pixels from another frame.
- Objects tend to move around in a video, and motion vectors are used to compensate for this.
- H.264 allows up to 16 reference frames, while VP8 only supports 3 frames.
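Finding a motion vector can be sketched as an exhaustive block-matching search. This assumes frames are lists of rows, a small search window, integer-pixel vectors, and SAD as the cost; real encoders use faster search patterns and sub-pixel refinement.

```python
def get_block(frame, x, y, size):
    """Extract a size x size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def sad(a, b):
    """Sum of absolute differences between two blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def full_search(current, reference, bx, by, size=4, search_range=2):
    """Return ((dx, dy), cost) of the best match for the block at (bx, by)."""
    target = get_block(current, bx, by, size)
    height, width = len(reference), len(reference[0])
    best_vector, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(target, get_block(reference, rx, ry, size))
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dx, dy), cost
    return best_vector, best_cost


# The current frame is the reference shifted one pixel to the right.
reference = [[y * 8 + x for x in range(8)] for y in range(8)]
current = [[reference[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]
print(full_search(current, reference, bx=2, by=2))
```

The vector points from the block's position in the current frame to the matching pixels in the reference frame, so content that moved right yields a negative `dx`.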
Frames
Prediction types: 1 Intra-prediction, 2 Inter-prediction, 3 Bi-directional
Bi-directional prediction:
- Predict the pixels of a macroblock using information available in other frames, both previous and upcoming frames; that is, going back and forward in time.
- Can reference every type of frame, including other bi-directional predicted frames.
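One simple way to combine a past and a future reference, assumed here for illustration, is to average the two motion-compensated blocks pixel by pixel. Codecs may also weight the two references differently; the function names below are hypothetical.

```python
def bidirectional_prediction(past_block, future_block):
    """Average two equally sized prediction blocks, rounding to nearest."""
    return [
        [(p + f + 1) // 2 for p, f in zip(past_row, future_row)]
        for past_row, future_row in zip(past_block, future_block)
    ]


def residual(block, prediction):
    """Difference that remains to be coded after prediction."""
    return [
        [b - p for b, p in zip(block_row, pred_row)]
        for block_row, pred_row in zip(block, prediction)
    ]


past = [[10, 20], [30, 40]]
future = [[20, 21], [30, 44]]
actual = [[15, 20], [30, 43]]
prediction = bidirectional_prediction(past, future)
print(prediction)
print(residual(actual, prediction))
```

When an object fades or moves steadily between the two references, the average is often closer to the current block than either reference alone, which leaves a smaller residual.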
Determining Prediction Modes
- The motion estimator tries many modes
- Different blocks are evaluated
- Two-step process, initial and refinement
Some Cost Functions
- Mean Square Error (MSE)
- Sum of Absolute Differences (SAD)
- Sum of Absolute Transformed Differences (SATD)
- SATD is more accurate than SAD because it measures the difference after a transform, but it is more expensive to compute
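The three cost functions can be compared with a small sketch for 4 × 4 blocks. SATD is computed here with an unnormalized 4 × 4 Hadamard transform, which is a common choice but an assumption; implementations differ in transform size and scaling.

```python
# 4x4 Hadamard matrix; it is symmetric, so it equals its own transpose.
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def difference(a, b):
    """Element-wise difference of two equally sized blocks."""
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def mse(a, b):
    """Mean square error."""
    values = [v for row in difference(a, b) for v in row]
    return sum(v * v for v in values) / len(values)


def sad(a, b):
    """Sum of absolute differences."""
    return sum(abs(v) for row in difference(a, b) for v in row)


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def satd4x4(a, b):
    """Sum of absolute values of the Hadamard-transformed difference."""
    transformed = matmul(matmul(H4, difference(a, b)), H4)
    return sum(abs(v) for row in transformed for v in row)


zeros = [[0] * 4 for _ in range(4)]
flat = [[2] * 4 for _ in range(4)]    # uniform error
spike = [[0] * 4 for _ in range(4)]
spike[0][0] = 1                       # single-pixel error
print(sad(flat, zeros), satd4x4(flat, zeros))
print(sad(spike, zeros), satd4x4(spike, zeros))
```

A uniform error lands in a single transform coefficient, while a single-pixel error spreads over all sixteen, so SATD ranks the two cases differently from SAD.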
Motion Compensation
- With the best motion vector, a predicted block is generated
- The original reference frame cannot be used directly as input to the motion compensator, as the decoder never sees the original image
- The decoder sees a reconstructed image, i.e., an image with loss
- A reconstructed reference image must be used as input
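Why the encoder must predict from reconstructed frames can be shown with a toy one-sample-per-frame codec. The sketch assumes zero-motion prediction from the previous frame and a uniform quantizer as the only lossy step; all names are illustrative.

```python
def quantize(residual, step):
    """Lossy step: map a residual to the nearest integer multiple of step."""
    sign = -1 if residual < 0 else 1
    return sign * ((abs(residual) + step // 2) // step)


def dequantize(level, step):
    return level * step


def encode(frames, step, use_reconstructed=True):
    """Return the quantized residual levels for each frame."""
    reference = [0] * len(frames[0])
    encoded = []
    for frame in frames:
        levels = [quantize(s - r, step) for s, r in zip(frame, reference)]
        encoded.append(levels)
        if use_reconstructed:
            # Rebuild exactly what the decoder will have.
            reference = [r + dequantize(l, step)
                         for r, l in zip(reference, levels)]
        else:
            # Original pixels: the decoder never sees these.
            reference = list(frame)
    return encoded


def decode(encoded, step):
    """Rebuild frames by adding dequantized residuals to the previous frame."""
    reference = [0] * len(encoded[0])
    decoded = []
    for levels in encoded:
        reference = [r + dequantize(l, step) for r, l in zip(reference, levels)]
        decoded.append(reference)
    return decoded


frames = [[3], [6], [9], [12]]
print(decode(encode(frames, step=8, use_reconstructed=True), step=8))
print(decode(encode(frames, step=8, use_reconstructed=False), step=8))
```

With the reconstructed reference the decoded error stays within half a quantizer step, while predicting from the originals lets encoder and decoder drift apart frame by frame.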
Parallel Encoding
- Approaches available for both intra- and inter-prediction
- Some give up compression efficiency for increased parallelism
- The pipeline approach shouldn't be combined with real-time requirements
Parallel Encoding
What should be optimized?
Figure: VP8 profiling
Parallel Encoding
Figure: Group of Pictures [6]
Parallel Encoding
Figure: Slice-based approach
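A slice-based approach can be sketched by splitting a frame into horizontal slices and handing each to a worker. `encode_slice` is a hypothetical stand-in for a real slice encoder, and slices are cut by pixel rows rather than macroblock rows to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_slices(frame, slice_count):
    """Split the rows of a frame into contiguous slices of near-equal height."""
    base, extra = divmod(len(frame), slice_count)
    slices, start = [], 0
    for i in range(slice_count):
        stop = start + base + (1 if i < extra else 0)
        slices.append(frame[start:stop])
        start = stop
    return slices


def encode_slice(rows):
    """Stand-in encoder: code each pixel as the difference to its left neighbor.

    It only looks at data inside its own slice, which is what makes
    slices independent of each other.
    """
    return [[row[0]] + [b - a for a, b in zip(row, row[1:])] for row in rows]


def encode_frame_parallel(frame, slice_count=4):
    """Encode all slices concurrently and return the results in slice order."""
    slices = split_into_slices(frame, slice_count)
    with ThreadPoolExecutor(max_workers=slice_count) as pool:
        return list(pool.map(encode_slice, slices))


frame = [[1, 2, 4] for _ in range(5)]
encoded = encode_frame_parallel(frame, slice_count=2)
print([len(s) for s in encoded])
```

Because prediction cannot cross a slice boundary, more slices mean more parallelism but less compression efficiency. In CPython a process pool rather than a thread pool would be needed for an actual speed-up on CPU-bound work.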
Conclusion
- Video encoding is mainly about trying (and failing) different prediction modes, limited by user-defined restrictions (resource usage)
- The actual encoding of the video, once the parameters are known, usually accounts for a small percentage of the running time
- Any (reasonable) codec can produce the desired video quality; what differs between them is the size of the output bitstream they produce
The End
References
[1] Video & Image Compression Techniques: Image Coding Fundamentals, http://goo.gl/6fck7n
[2] Wikipedia: YUV, http://en.wikipedia.org/wiki/yuv
[3] Any To YUV: Documentation, http://any2yuv.sourceforge.net/docs
[4] H.265: High efficiency video coding, http://www.itu.int/rec/t-rec-h.265
[5] BitBlit.Org, http://www.bitblit.org/gsoc/g3dvl/
[6] GOP (Group of Pictures), http://goo.gl/83d7hz