Digital video coding systems MPEG-1/2 Video
Introduction: What is MPEG? The Moving Picture Experts Group is a standards body for the delivery of video and audio. Part of ISO/IEC JTC1/SC29/WG11. 150 companies & research institutes. USA: AT&T, Motorola, Microsoft, IBM, DEC, G1. Asia: Sony, JVC, Mitsubishi, Matsushita, Daewoo, Samsung. Europe: Philips, Thomson, Siemens, Bosch, Deutsche Telekom, BT, CNET, Ericsson. Universities: EPFL, NTNU. MPEG meets 3-5 times per year.
MPEG Standards. MPEG-1 (IS 1992): video on CD-ROM (1.5 Mbit/s). MPEG-2 (IS 1994): multimedia applications (5-10 Mbit/s). MPEG-4 (IS 1999 v.1, 2000 v.2): multimedia applications (10 kbit/s-10 Mbit/s). MPEG-7 (IS 2001): multimedia content description interface.
MPEG-1
MPEG-2
MPEG-1 Project history: start 1988, IS 1992. Main goal: storage of interactive movies on CD (~1.4 Mbit/s) with CD-quality stereo audio. The MPEG-1 standard consists of: ISO/IEC 11172-1: MPEG-1 systems; ISO/IEC 11172-2: MPEG-1 video; ISO/IEC 11172-3: MPEG-1 audio; ISO/IEC 11172-4: MPEG-1 conformance; ISO/IEC 11172-5: MPEG-1 software. Systems: describes how the various streams (video, audio, or generic data) are multiplexed and synchronized. Video: defines the video compression decoder. Conformance: defines a set of tests designed to aid in establishing that particular implementations conform to the design.
MPEG-1: Applications. Video CD. Audio and video format for PCs (Windows MPEG-1 decoder). MPEG-1 audio Layer III (MP3) for Web music. Video cameras. Digital Audio Broadcasting (DAB).
MPEG-2 (ISO/IEC 13818). Title: Generic coding of moving pictures and audio. Project history: started in July 1990; IS status achieved in November 1994. Main goal: migration of television from analogue to digital, with composite quality at 6 Mbit/s, component quality at 9 Mbit/s, multichannel audio coding, and multiprogram transport.
MPEG-2: an assessment. Hundreds of millions of set-top boxes for satellite and cable have been sold. Digital television VHF/UHF broadcasting. Millions of DVD players sold. The MPEG-2 4:2:2 profile is being adopted in the television production industry. MPEG-2 has created an entirely new digital television industry worth ~XX billion USD.
MPEG-1 Fundamentals: Exploit both INTRA-frame redundancy and INTER-frame redundancy. Intraframe coding is based on the DCT. Interframe coding is based on motion compensation. Color subsampling format: (Y:Cb:Cr) = (4:2:0).
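To make the intraframe step concrete, here is a minimal sketch of a two-dimensional DCT on one block. It assumes the 8x8 block size used by MPEG-1 and an orthonormal DCT-II; the function name `dct_2d` is illustrative, and a real codec would use a fast integer-friendly transform rather than this direct summation.

```python
import math

N = 8  # assumed block size (8x8 DCT blocks)

def dct_2d(block):
    """Direct (slow) orthonormal 2-D DCT-II of an N x N list-of-lists block."""
    def scale(k):
        # Normalisation factor: 1/sqrt(2) for the DC index, 1 otherwise
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            acc = 0.0
            for x in range(N):
                for y in range(N):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * scale(u) * scale(v) * acc
    return out

# A flat block has all of its energy in the DC coefficient.
flat = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))
```

The flat block shows why the DCT helps with intra-frame redundancy: smooth image regions collapse into very few significant coefficients.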
Picture Format and Data Structure: Y, Cb, Cr as noninterlaced 4:2:0; size as big as 4K x 4K. SIF/525: 352x240. Picture rates of 23.976, 24, 25, 29.97, 30, 50, 59.94, 60 Hz. Layers: Group of Pictures, Picture, Slice, Macroblock and Block.
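As a rough illustration of the macroblock and block layers, the sketch below counts the macroblocks in a picture and the 8x8 blocks in each macroblock. It assumes 16x16 luma macroblocks, 4:2:0 chroma, and dimensions rounded up to a whole number of macroblocks; the function name is illustrative.

```python
def macroblock_layout(width, height, mb_size=16):
    """Return (mb_cols, mb_rows, total_mbs, blocks_per_mb) for a 4:2:0 picture."""
    mb_cols = -(-width // mb_size)   # ceiling division
    mb_rows = -(-height // mb_size)
    # 4:2:0: four 8x8 Y blocks plus one Cb and one Cr block per macroblock
    blocks_per_mb = 4 + 1 + 1
    return mb_cols, mb_rows, mb_cols * mb_rows, blocks_per_mb

print(macroblock_layout(352, 240))  # SIF/525 picture
```

For the SIF/525 size this gives 22 x 15 = 330 macroblocks of 6 blocks each.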
I, P, B frames. INTRA I-frames: random access, error robustness. Predicted P-frames: forward predicted from the previous anchor picture (I or P). Bidirectionally predicted B-frames: forward/backward predicted from the previous and next anchor pictures (I or P).
The Problem: Some macroblocks need information that is not present in the previous reference frame. Such information may be available in a succeeding frame. Solution: add a third frame type (B-frame). To form a B-frame, search for matching macroblocks in both past and future frames. A typical pattern is IBBPBBPBB IBBPBBPBB IBBPBBPBB. The actual pattern is up to the encoder and need not be regular.
Bitstream order vs. display order: A GOP may contain I, P and B pictures. A GOP of size N contains N pictures. The number of B-pictures between consecutive anchor pictures is M-1, where M is the prediction distance. Bitstream (transmit) order: 1(I), 4(P), 2(B), 3(B), 7(P), 5(B), 6(B), 10(I), 8(B), 9(B).
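The reordering can be sketched in a few lines: a B-picture cannot be decoded until the anchor that follows it in display order has arrived, so each anchor is sent ahead of the B-pictures that precede it. The function name and the string representation of frame types are assumptions made for illustration; trailing B-pictures with no following anchor are simply appended, which is a simplification.

```python
def display_to_bitstream_order(frame_types):
    """Reorder pictures from display order to bitstream (transmit) order.

    frame_types: string such as "IBBPBBPBBI", in display order.
    Returns a list of (display_index, type) tuples with 1-based indices.
    """
    out = []
    pending_b = []
    for idx, ftype in enumerate(frame_types, start=1):
        if ftype == "B":
            pending_b.append((idx, ftype))  # must wait for the next anchor
        else:
            out.append((idx, ftype))        # anchor (I or P) is sent first
            out.extend(pending_b)           # then the B-pictures it closes
            pending_b = []
    out.extend(pending_b)  # simplification: leftover B-pictures go last
    return out

order = display_to_bitstream_order("IBBPBBPBBI")
print(", ".join(f"{i}({t})" for i, t in order))
```

With M = 3 (two B-pictures between anchors) this reproduces the transmit order listed on the slide.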
Motion Estimation and Compensation: For P-pictures, prediction as in H.261 except with half-pel accuracy. For B-pictures, bidirectional motion compensation.
Differences from H.261: Large gaps between I and P frames, so the motion vector search range must be expanded. Uniform quantization for P and nonuniform quantization for I. To get better encoding, motion vectors may be specified to a fraction of a pixel (1/2 pixel). Bitstream syntax allows random access, forward/backward play, etc. Added the notion of a slice for synchronization after lost or corrupted data (see figure at right: 7 slices in frame).
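A minimal sketch of how a half-pel prediction sample and a bidirectional prediction might be formed. It assumes bilinear averaging of the neighbouring integer samples with rounding, non-negative coordinates, and a reference frame stored as a list of rows; both function names are illustrative, and boundary handling is omitted.

```python
def sample_half_pel(ref, x2, y2):
    """Prediction sample at position (x2/2, y2/2), coordinates in half-pel units."""
    x0, y0 = x2 // 2, y2 // 2
    x1 = x0 + (x2 % 2)  # right neighbour only when x is at a half position
    y1 = y0 + (y2 % 2)  # lower neighbour only when y is at a half position
    total = ref[y0][x0] + ref[y0][x1] + ref[y1][x0] + ref[y1][x1]
    return (total + 2) // 4  # rounded average of the contributing samples

def bidirectional_average(forward_pred, backward_pred):
    """B-picture interpolated mode: rounded mean of the two predictions."""
    return (forward_pred + backward_pred + 1) // 2

ref = [[10, 20],
       [30, 40]]
print(sample_half_pel(ref, 0, 0), sample_half_pel(ref, 1, 0),
      sample_half_pel(ref, 0, 1), sample_half_pel(ref, 1, 1))
```

At integer positions the function returns the reference sample unchanged; at half positions it averages two or four neighbours.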
Coding of Macroblock (MB): MBs of I-frames use INTRA mode (applying the DCT to the MB). MBs of B and P frames are coded in one of several modes, chosen by a macroblock MSE mode decision. Macroblock MSE < tsh1: transmit motion only. tsh1 < Macroblock MSE < tsh2: transmit motion + DCT on the DFD (Displaced Frame Difference: the motion-compensated error image, predicted-original). Macroblock MSE > tsh2: INTRA mode.
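The threshold rule above can be written as a small decision function. This is a sketch only: the threshold values are hypothetical, the mode labels are invented for illustration, and the slide does not say which mode applies when the MSE equals a threshold, so equality is assigned arbitrarily to the higher mode here.

```python
def macroblock_mse(original, predicted):
    """Mean squared error between two equally sized blocks (lists of rows)."""
    total = 0
    count = 0
    for row_o, row_p in zip(original, predicted):
        for o, p in zip(row_o, row_p):
            total += (o - p) ** 2
            count += 1
    return total / count

def choose_mode(mb_mse, tsh1, tsh2):
    """Pick a coding mode for a P- or B-frame macroblock from its MSE."""
    if mb_mse < tsh1:
        return "motion_only"             # prediction is good enough
    if mb_mse < tsh2:
        return "motion_plus_dct_on_dfd"  # send motion and code the residual
    return "intra"                       # prediction failed; code as INTRA

# Hypothetical thresholds, chosen only for the demonstration
print(choose_mode(0.5, tsh1=1.0, tsh2=10.0))
print(choose_mode(4.0, tsh1=1.0, tsh2=10.0))
print(choose_mode(25.0, tsh1=1.0, tsh2=10.0))
```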
MPEG-1 Highlights: In the MPEG-1 video compression technique only the decoder is standardized, together with the bitstream syntax. The video sequence is split into intra, predicted and interpolated frames. The video sequence is divided into groups of pictures starting with intra frames. Motion vectors are obtained at half-pel accuracy and sent to the decoder using DPCM on a macroblock basis. Handles CCIR 601 and formats. Covers bitrates up to about 1.5 Mbit/s.
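DPCM coding of motion vectors means sending each vector as its difference from the previous macroblock's vector. The sketch below shows only that differencing idea; the rules for when the predictor is reset and the entropy coding of the differences are left out, and the function names are illustrative.

```python
def dpcm_encode(vectors):
    """Turn a list of (dx, dy) motion vectors into successive differences."""
    prev = (0, 0)  # assumed initial predictor
    diffs = []
    for dx, dy in vectors:
        diffs.append((dx - prev[0], dy - prev[1]))
        prev = (dx, dy)
    return diffs

def dpcm_decode(diffs):
    """Rebuild the motion vectors by accumulating the differences."""
    prev = (0, 0)
    vectors = []
    for ddx, ddy in diffs:
        prev = (prev[0] + ddx, prev[1] + ddy)
        vectors.append(prev)
    return vectors

mvs = [(2, 0), (3, 1), (3, 1), (-1, 4)]
diffs = dpcm_encode(mvs)
print(diffs)
```

Neighbouring macroblocks often move together, so the differences tend to be small and cheap to code.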
MPEG-1 Block diagram
MPEG-1 Decoding: Coded data -> Variable length decoding -> Inverse scan and inverse quantization -> Inverse DCT -> Decoded pixels. Motion vectors -> Motion compensation, using frame store memories.
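The inverse scan and inverse quantization stage can be sketched as follows. This assumes the conventional zigzag scan of an 8x8 block and a single uniform quantizer step, which is a simplification of the real quantizer; the inverse DCT and the addition of the motion-compensated prediction are not shown. Function names are illustrative.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk anti-diagonals; alternate direction on odd and even diagonals
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def inverse_scan_and_dequantize(levels, quant_step, n=8):
    """Place decoded levels back into a 2-D block and rescale them."""
    block = [[0] * n for _ in range(n)]
    for level, (r, c) in zip(levels, zigzag_order(n)):
        block[r][c] = level * quant_step  # simplified uniform reconstruction
    return block

# Three decoded levels; the remaining coefficients stay zero
block = inverse_scan_and_dequantize([5, -2, 3], quant_step=4)
print(block[0][:3], block[1][:3])
```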
MPEG-2: Unlike MPEG-1, which is basically a standard for storing and playing video on a single computer at low bit rates, MPEG-2 is a standard for digital TV. It meets the requirements for HDTV (High Definition TV) and DVD (Digital Video/Versatile Disc).
MPEG-2 Highlights: In the MPEG-2 video compression technique only the decoder is standardized, along with the bitstream syntax. Several modes are provided to take interlaced frames into account (field-based modes). Generic structure to cope with several bitrates and picture formats. Spatial, frequency and temporal scalability.
Interlaced/progressive coding in MPEG-2: DCT of MB (16x16 pixels). Exploiting large correlation between the two fields; good for large movement in video.
Scalable coding in MPEG-2
Profiles and levels: MPEG-2 supports multiple profiles based on scalability. Profiles are sets of pre-defined tools and their configurations. Profiles are divided into levels, each defining upper limits for coding parameters.
Profiles in MPEG-2. Simple: simplest profile, similar to Main profile except for the lack of B frames. Main: non-scalable coding providing interlaced coding tools, random access, B mode. SNR Scalable: similar to Main plus 2-layer SNR scalability. Spatial Scalable: similar to SNR Scalable profile plus 2-layer spatial scalability. High: similar to Spatial Scalable profile, with provisions for 3 layers in spatial and SNR scalability and 4:2:2 coding. 4:2:2: similar to Main profile with 4:2:2 coding.
Levels in MPEG-2: MPEG-2 supports the following levels:
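One way to picture profiles as sets of tools is a small lookup table. The tool names below are hypothetical labels derived only from the descriptions on this slide, not identifiers from the standard, and the table ignores levels entirely.

```python
# Hypothetical tool labels, taken from the slide's wording
MAIN_TOOLS = {"interlaced_tools", "random_access", "b_frames"}

PROFILES = {
    "Simple": MAIN_TOOLS - {"b_frames"},
    "Main": MAIN_TOOLS,
    "SNR Scalable": MAIN_TOOLS | {"snr_2_layers"},
    "Spatial Scalable": MAIN_TOOLS | {"snr_2_layers", "spatial_2_layers"},
    "High": MAIN_TOOLS | {"snr_3_layers", "spatial_3_layers", "chroma_4_2_2"},
    "4:2:2": MAIN_TOOLS | {"chroma_4_2_2"},
}

def missing_tools(profile, required):
    """Return the required tools that the given profile does not provide."""
    return set(required) - PROFILES[profile]

print(missing_tools("Simple", {"b_frames", "random_access"}))
print(missing_tools("Main", {"b_frames", "random_access"}))
```

A decoder that claims a profile must handle every tool in that profile's set, which is why a stream using B frames cannot be labelled Simple.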
Scalability: All information is available in the bit stream; it is possible to extract or receive only parts of the bit stream and still decode the video. The penalty lies in quality. Scalability by: temporal resolution (time), spatial resolution (pixels), SNR (quality).
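The idea of decoding only part of the stream at a quality penalty can be shown with a two-layer SNR sketch for a single coefficient: a coarsely quantized base layer plus an enhancement layer that refines the base-layer error. The step sizes and function names are assumptions for illustration, and this is not the layered syntax of the standard.

```python
def encode_two_layers(coeff, coarse_step, fine_step):
    """Split one coefficient into a base level and an enhancement level."""
    base_level = round(coeff / coarse_step)
    base_rec = base_level * coarse_step
    # The enhancement layer codes what the base layer got wrong
    enh_level = round((coeff - base_rec) / fine_step)
    return base_level, enh_level

def decode_layers(base_level, enh_level, coarse_step, fine_step, use_enhancement):
    """Reconstruct from the base layer alone or from both layers."""
    value = base_level * coarse_step
    if use_enhancement:
        value += enh_level * fine_step
    return value

base, enh = encode_two_layers(37, coarse_step=16, fine_step=4)
base_only = decode_layers(base, enh, 16, 4, use_enhancement=False)
both = decode_layers(base, enh, 16, 4, use_enhancement=True)
print(base_only, both)
```

A receiver that gets only the base layer still decodes a usable value; adding the enhancement layer reduces the reconstruction error.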
Temporal
Spatial
SNR
Multi-layer (example)
Sequencing (example (1,2))
JPEG2000: Scalability Different modes are realized depending on the way information is written into the codestream
Scalability Progressive By Resolution
Scalability Progressive By Accuracy
HDTV: 2x horizontal and vertical resolution. SDTV: 480 lines, 720 pixels per line, 29.97 frames per second x 16 bits/pixel = 166 Mbit/s uncompressed. MPEG-1 brings this to 1.5 Mbit/s at VHS quality. HDTV: expanded to 1080 lines, 1920 pixels per line, 60 fps x 16 bits/pixel = 1990 Mbit/s uncompressed. MPEG-2-like video encoding, different audio encoding. HDTV audio compression is based on the Dolby AC-3 system, with a 48 kHz sampling rate and perceptual coding.
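The uncompressed rates follow directly from lines x pixels per line x frames per second x bits per pixel. A short sketch of that arithmetic, using the figures quoted above (the function name is illustrative):

```python
def raw_bitrate_mbps(lines, pixels_per_line, fps, bits_per_pixel=16):
    """Uncompressed video bit rate in Mbit/s (1 Mbit = 10**6 bits)."""
    return lines * pixels_per_line * fps * bits_per_pixel / 1e6

sdtv = raw_bitrate_mbps(480, 720, 29.97)
hdtv = raw_bitrate_mbps(1080, 1920, 60)

print(f"SDTV: {sdtv:.1f} Mbit/s")
print(f"HDTV: {hdtv:.1f} Mbit/s")
print(f"HDTV/SDTV ratio: {hdtv / sdtv:.1f}")
```

The SDTV product comes to roughly 166 Mbit/s and the HDTV product to roughly 1990 Mbit/s, about twelve times higher.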
Why HDTV Higher-resolution picture Wider picture Digital surround sound. Additional data Easy to interface with computers
Current TV Standards
NTSC: National Television System Committee
PAL: Phase Alternating Line
SECAM: Séquentiel Couleur Avec Mémoire
TV Standards        NTSC    PAL                           SECAM
Regions             U.S.    Asia, Europe, South America   France
Channel bandwidth   6 MHz   8 MHz                         8 MHz
Aspect ratio        4:3     4:3                           4:3
HDTV and NTSC Specifications
                        HDTV            USA NTSC
Aspect ratio            16:9            4:3
Largest frame rate      60 frames/sec   30 frames/sec
Vertical refresh rate   60 Hz           60 Hz
Highest resolution      1080 lines      525 lines
Hardware Requirements. Digital decoder: converts digital signals to analog, allowing a current TV set to work. Digital-ready TV set: wide-screen format, progressive scanning. HDTV set: wide-screen format, can receive 18 digital input formats.
Comparison Current TV HDTV
Comparison (current TV)
Comparison (HDTV)