Lecture 5: Error Resilience & Scalability
Dr Reji Mathew, A/Prof. Jian Zhang
NICTA & CSE UNSW, COMP9519 Multimedia Systems, S2 2010
jzhang@cse.unsw.edu.au

Outline
- Error Resilience
- Scalability (including slides from Lecture 3)
- Tutorial Question Sheet

Introduction
- Most digital storage media and communication channels are not error free. Channel coding is beyond the scope of the MPEG specification, so the compression scheme itself needs to be robust to errors or loss of data.
- Due to temporal prediction there is dependency between frames, hence errors can propagate from frame to frame. For example, an error in the first I-frame can propagate to all of the predicted frames that follow it (only the next I-frame is unaffected), and an error in the first P-frame can propagate to later P- and B-frames.
- Error resilience schemes can be applied at both the encoder and the decoder.
- Due to spatial prediction there is also dependency within a frame. For example, spatial prediction of motion vectors (MVs) with the median predictor covered in the previous lecture: Pred = MEDIAN(A, B, C). An error in a neighbour's MV can therefore cause an error in the MV of the current block.
- Techniques are required to limit the propagation of errors in coded video streams. Errors can arise, for example, from lost packets when streaming video. How can we be resilient to packet losses or loss of data?

Error resilience techniques
At the encoder: temporal/spatial localization
- Prevents error propagation by inserting regular intra-coded pictures, slices or macroblocks.
- Two types of error propagation affect the decoded picture:
  - temporal error propagation: errors are propagated into subsequent frames
  - spatial error propagation: errors are propagated within a video picture
At the decoder: concealment
- The impact of any errors can be concealed using correctly decoded information from the current or previous pictures.
(Figures: original picture vs. error-damaged picture)
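The median MV prediction mentioned above can be sketched as follows. This is a minimal illustration of component-wise median prediction, not the syntax of any particular standard; the neighbour positions and MV values are hypothetical.

```python
def median_mv(mv_a, mv_b, mv_c):
    """Predict a motion vector component-wise as the median of
    three neighbouring MVs (A, B, C), as in Pred = MEDIAN(A, B, C)."""
    def median3(a, b, c):
        return sorted([a, b, c])[1]
    return (median3(mv_a[0], mv_b[0], mv_c[0]),
            median3(mv_a[1], mv_b[1], mv_c[1]))

# Example: left, above and above-right neighbour MVs (units illustrative)
pred = median_mv((2, -1), (4, 0), (3, 3))
print(pred)  # (3, 0)
```

Because the predictor depends on neighbouring MVs, a single corrupted MV feeds directly into the prediction of the current block, which is exactly the spatial dependency the text describes.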
Temporal Localization: cyclic intra-coded pictures
- Extra intra-coded I-pictures can be inserted.
- Error propagation is reduced at the cost of extra overhead in the bit stream, so this scheme reduces coding efficiency.

Temporal Localization: cyclic intra-coded slices
- Extra intra-coded I-slices can be used to periodically refresh the frame from the top to the bottom over a fixed number of frames.
- The disadvantage is that the partial updating of the screen in each frame period produces a noticeable "windscreen wiper" effect.
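The rolling top-to-bottom slice refresh can be sketched as below. The function and parameters are hypothetical, intended only to show how the intra-coded slice rows cycle through the picture over a fixed number of frames.

```python
def intra_rows(frame_idx, total_rows, rows_per_frame):
    """Return the MB-row indices coded INTRA in this frame, rolling
    from top to bottom so the whole picture refreshes cyclically."""
    start = (frame_idx * rows_per_frame) % total_rows
    return [(start + i) % total_rows for i in range(rows_per_frame)]

# 6 MB rows, 2 refreshed per frame -> full refresh every 3 frames
print([intra_rows(f, 6, 2) for f in range(4)])
# [[0, 1], [2, 3], [4, 5], [0, 1]]
```

The "windscreen wiper" effect follows directly from this pattern: at any instant, only the rows refreshed recently are guaranteed clean.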
Temporal Localization: cyclic intra-coded macroblock (MB) refreshment
- Extra intra-coded MBs can be inserted to refresh the frame periodically over a fixed number of frames.
- This scheme has been widely used in current MPEG-4 codecs.
- Example: choose N random blocks in each frame to be coded as INTRA.

Spatial Localization: slice mode (used in MPEG-4 & H.263)
- Grouping a fixed number of MBs per slice (e.g. 11 or 22 MBs) can reduce the damage to the decoded picture after a corrupted MB is detected.
- Each slice is coded independently of the other slices in the picture, so there is no dependency from one slice to another. For example, there is no MV prediction across slice boundaries.
- Slice mode is useful for IP streaming. Example: with one slice per packet, the loss of one packet is contained within one slice, so only a small part of the frame is lost.
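The "N random INTRA blocks per frame" example can be sketched as follows. The function name and the choice of QCIF dimensions are assumptions for illustration only.

```python
import random

def choose_intra_mbs(num_mbs, n, rng=None):
    """Pick n distinct macroblock indices to force INTRA in this frame."""
    rng = rng or random.Random()
    return set(rng.sample(range(num_mbs), n))

# QCIF luma (176x144) has 11 x 9 = 99 macroblocks of 16x16 pixels;
# forcing 5 random MBs per frame refreshes the picture gradually.
forced = choose_intra_mbs(99, 5)
print(sorted(forced))
```

Unlike the slice-based scheme, random MB refresh spreads the intra overhead evenly and avoids the visible wiper pattern, at the cost of a less predictable worst-case refresh time.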
Concealment Techniques
- Use available information to conceal errors; performed at the decoder to limit the loss of perceptual quality.
- Spatial concealment: use nearby blocks in the same frame.
- Temporal concealment: use blocks in the previous frame.

Spatial concealment at the decoder
- Simple interpolation using the macroblocks above and below the lost area.
- This technique is best suited to areas with little spatial activity and is far less successful where there is significant spatial detail.
(Figures: error-damaged picture vs. spatially concealed picture)

Simple temporal concealment
- Replace the lost block with the co-located block of the previous frame.
- This technique is effective in relatively stationary areas but much less effective in fast-moving regions.
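The simple interpolation scheme can be sketched as below: a lost band of rows is filled by linearly blending the last correct row above it with the first correct row below it. The function and the tiny 4-pixel-wide "frame" are illustrative assumptions.

```python
def conceal_rows(frame, top, height):
    """Fill 'height' lost rows starting at row 'top' by linear
    interpolation between the rows immediately above and below."""
    above = frame[top - 1]
    below = frame[top + height]
    for i in range(height):
        w = (i + 1) / (height + 1)  # weight grows toward the 'below' row
        frame[top + i] = [round((1 - w) * a + w * b)
                          for a, b in zip(above, below)]

frame = [[10] * 4, None, None, [40] * 4]  # rows 1-2 lost
conceal_rows(frame, 1, 2)
print(frame[1], frame[2])  # [20, 20, 20, 20] [30, 30, 30, 30]
```

The result is a smooth vertical gradient, which is exactly why the method works well in flat regions but smears away any real spatial detail that was in the lost area.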
Temporal concealment at the decoder: motion-compensated concealment
- Combines temporal replacement with motion compensation (MC).
- Exploits the high correlation among nearby MVs in a picture; assumes the MVs vary linearly from the MB below to the MB above.
- This scheme can significantly improve error concealment in moving areas of the picture. Refer to the example on the next slide.
(Figures: error-damaged picture vs. temporally concealed picture)

Motion-compensated (MC) concealment
- Calculate an MV for the lost MB using the MVs of the MBs above and below, assuming linear variation.
- Obtain the concealment block by performing MC with this estimated MV.
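The two steps above (estimate an MV, then motion-compensate from the previous frame) can be sketched as follows. This is a simplified, hypothetical implementation: integer-pel MVs, a 4x4 "block", and a midpoint estimate as the simplest case of the linear-variation assumption.

```python
def conceal_mv(mv_above, mv_below):
    """Estimate the lost MB's MV as the midpoint of the MVs of the
    MBs above and below it (linear-variation assumption)."""
    return ((mv_above[0] + mv_below[0]) // 2,
            (mv_above[1] + mv_below[1]) // 2)

def mc_conceal(prev_frame, x, y, mv, size=4):
    """Copy the motion-compensated block from the previous frame."""
    dx, dy = mv
    return [row[x + dx : x + dx + size]
            for row in prev_frame[y + dy : y + dy + size]]

prev = [[r * 10 + c for c in range(8)] for r in range(8)]
mv = conceal_mv((2, 0), (0, 2))        # -> (1, 1)
block = mc_conceal(prev, 2, 2, mv)     # concealment block for the lost MB
print(mv, block[0])  # (1, 1) [33, 34, 35, 36]
```

Because the copy follows the estimated motion rather than the co-located position, the concealed block tracks moving content far better than simple temporal replacement.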
Layered coding for error concealment
- Base layer: important data/information, sent as high-priority data over a low-loss channel.
- Higher layer(s): less important data, sent as low-priority data over a high-loss channel.
- This enables unequal error protection (protect the base layer more than the enhancement layer).

5. Scalability
- Scalable video coding means the ability to achieve more than one video resolution or quality simultaneously.
- A scalable encoder produces a base layer and an enhancement layer. A 2-layer scalable decoder reconstructs the full decoded sequence; a single-layer decoder reconstructs the base-line decoded sequence from the base layer alone.
- Useful for inter-working between different clients and networks.
- Three types, explored in the next section: Temporal Scalability, Spatial Scalability and Quality (SNR) Scalability.
5. Scalability: Temporal Scalability
- Base layer: low temporal resolution (frame rate); enhancement layer: higher temporal resolution.
- Naturally achieved with I, P and B frames: the B frames can be dropped to reduce the frame rate.
- Example GOP: I B B P B B P B B P. The I and P frames form the base layer; the B frames form the enhancement layer.

5. Scalability: Spatial Scalability
- A spatially scalable coder operates by filtering and decimating the video sequence to a smaller size prior to coding.
- An up-sampled version of this coded base-layer representation is then available as a predictor for the enhancement layer, making use of information already contained in the base layer.
- A general architecture of a spatially scalable encoder and decoder is shown on the next slide: the low-resolution frame from the base layer is available to the high-resolution encoder as a predictor option.
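The B-frame-dropping idea can be shown in a few lines. Since no other frame is predicted from a B frame, discarding them leaves the remaining I/P stream decodable at a lower frame rate.

```python
# Temporal scalability sketch: the base layer keeps the I and P frames;
# the enhancement layer carries the B frames and can be dropped freely.
gop = list("IBBPBBPBBP")
base_layer = [f for f in gop if f != "B"]
print("".join(base_layer))  # IPPP
```

For this GOP pattern the base layer runs at one third of the full frame rate; the decoder never stalls because every retained frame predicts only from other retained frames.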
5. Scalability: Spatial Scalability (continued)
- Example: full-resolution HDTV. Base layer = standard TV resolution; base layer + enhancement layer = HDTV service.
- Spatial scalability as defined by MPEG-2 is shown on the next slide: the base-layer frame is up-sampled and provided to the enhancement-layer coder.
(Figure: 2-layer spatially scalable coder, with 16x16 and 8x8 block paths)

Sub-band coding (slides from Lecture 3)
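The decimate/up-sample prediction loop described above can be sketched as follows. Pixel replication is used for up-sampling purely for simplicity; real coders use proper interpolation filters, and the tiny 4x4 image is illustrative.

```python
def downsample2(img):
    """Decimate by 2 in both directions (base-layer input)."""
    return [row[::2] for row in img[::2]]

def upsample2(img):
    """Up-sample by 2 via pixel replication (base-layer predictor)."""
    out = []
    for row in img:
        wide = [p for p in row for _ in (0, 1)]  # repeat each pixel
        out.append(wide)
        out.append(list(wide))                   # repeat each line
    return out

hi_res = [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
base = downsample2(hi_res)            # [[1, 2], [3, 4]] -> base layer
pred = upsample2(base)                # predictor for the enhancement layer
residual = [[h - p for h, p in zip(hr, pr)]
            for hr, pr in zip(hi_res, pred)]
print(sum(map(sum, residual)))        # 0: prediction is exact here
```

The enhancement layer only needs to code the residual between the original and the up-sampled base layer, which is small wherever the base layer already captures the content (here it is exactly zero by construction).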
3.1 Subband Coding
- The input signal can be decomposed into frequency subbands.
- Two-channel filter bank example: sub-bands representing low- and high-frequency components.
- Recursive application is possible, giving a tree-structured transform. Refer to the diagram on the next slide.
(Figure: analysis filters F_L, F_H split x[k] into subbands LL, LH, HL, HH; synthesis filters G_L, G_H recombine them to reconstruct x[k])
COMP9519 Multimedia Systems, Lecture 3, Slide 37, J Zhang

3.1 Subband Coding
- Sub-sampling is applied after filtering to produce each subband: sub-sampling by a factor of 2 in the previous example.
- The combined sampling rate of all sub-bands is the same as that of the input signal.
- By choosing appropriate analysis filters (F_H, F_L) and synthesis filters (G_H, G_L) it is possible to achieve perfect reconstruction.

3.1 Subband Coding
- Consider recursive application of the sub-band transform to the low-pass channel only.
- This provides logarithmically spaced pass-bands, unlike the uniform pass-bands seen in the earlier example.
- It allows the signal information to be decoded incrementally, starting from the lowest-frequency subband: useful for scalable representations of the original signal. Refer to the diagram on the next slide.
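A two-channel filter bank with perfect reconstruction can be demonstrated with the orthonormal Haar pair, a minimal sketch of the analysis/synthesis structure above (the specific filters in the lecture are not named, so Haar is an assumption chosen for simplicity).

```python
import math

S = 1 / math.sqrt(2)  # Haar filter coefficient

def analysis(x):
    """Split x into low/high subbands, down-sampled by 2."""
    low = [(x[i] + x[i + 1]) * S for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * S for i in range(0, len(x), 2)]
    return low, high

def synthesis(low, high):
    """Up-sample, filter and add the subbands to reconstruct x."""
    x = []
    for l, h in zip(low, high):
        x += [(l + h) * S, (l - h) * S]
    return x

x = [4.0, 2.0, 5.0, 7.0]
low, high = analysis(x)
xr = synthesis(low, high)
print(all(abs(a - b) < 1e-9 for a, b in zip(x, xr)))  # True
```

Note that the two subbands together hold 4 samples for a 4-sample input: the combined sampling rate matches the input, and reconstruction is exact, illustrating both bullet points above.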
3.1 Subband Coding
(Figure: tree-structured filter bank applied recursively to the low-pass channel, producing subbands LL, LH and H, with matching synthesis stages reconstructing x[k])
- Discrete Wavelet Transforms (DWT) are tree-structured subband transforms with non-uniform (logarithmically) spaced pass-bands.
- They decompose the signal/image into a base resolution and accompanying detail components.
- For image coding, the DWT is applied in both the horizontal and vertical directions (2D DWT). A diagram on a later slide provides an example of the filter application.

3.1 Analysis/Synthesis Stages
Analysis:
- The signal is first filtered to create a set of signals, each of which contains a limited range of frequencies. These signals are called subbands.
- Since each subband has a reduced bandwidth compared to the original full-band signal, it may be downsampled; that is, a reduced number of samples may be taken of the signal without causing aliasing.
(Diagram: 2-dimensional decomposition structure)
Synthesis: reconstruction is achieved by
- upsampling the decoded subbands,
- applying appropriate filters to reverse the subbanding process, and
- adding the reconstructed subbands together.
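The recursive low-pass-only splitting can be sketched by reusing a Haar split, a hypothetical minimal dyadic (wavelet-style) decomposition showing the logarithmic band structure and coarsest-first ordering that enables incremental decoding.

```python
import math

S = 1 / math.sqrt(2)

def haar_split(x):
    """One two-channel Haar analysis stage with down-sampling by 2."""
    low = [(x[i] + x[i + 1]) * S for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * S for i in range(0, len(x), 2)]
    return low, high

def dwt(x, levels):
    """Recursively split only the low-pass channel; return the bands
    ordered coarsest-first for incremental (scalable) decoding."""
    bands = []
    for _ in range(levels):
        x, high = haar_split(x)
        bands.insert(0, high)   # finer detail bands go later
    bands.insert(0, x)          # coarsest approximation first
    return bands

bands = dwt([1.0] * 8, 3)
print([len(b) for b in bands])  # [1, 1, 2, 4]: logarithmic band sizes
```

For a constant input all detail bands are zero, so the single coarsest coefficient alone already reconstructs the signal, the extreme case of decoding incrementally from the lowest-frequency subband.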
3.1 Analysis/Synthesis Stage (Ref: H. Wu)
- Recursive application of the DWT: first-level and second-level decompositions shown.
(Figure: the first level splits the image into subbands LL, HL, LH, HH; the second level further splits LL into LL1, HL1, LH1, HH1)

3.1 Subband Coding
- The formation of subbands does not create any compression in itself: the same total number of samples is required to represent the subbands as is required to represent the original signal.
- However, the subbands can be encoded efficiently: image -> 2D-DWT image decomposition -> quantization of transform coefficients -> coding of quantized coefficients -> bit stream.
- The significance of the different spatial frequencies is not uniform, and this fact may be exploited through different bit allocations to the various subbands.
- The subbands are then encoded using one or more coders; different bit rates or even different coding techniques may be used for each subband. Refer to the recommended text for various sub-band coding techniques.
- The DWT is employed in the JPEG-2000 standard for image compression.
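The bit-allocation idea can be sketched with per-band quantization step sizes. The step values and bands are illustrative assumptions, showing why subbanding enables compression even though the transform itself adds none.

```python
def quantize(band, step):
    """Uniform scalar quantization of one subband's coefficients."""
    return [round(v / step) for v in band]

def dequantize(indices, step):
    """Reconstruct coefficient values from quantizer indices."""
    return [i * step for i in indices]

low = [10.2, 9.8, 10.1]    # perceptually important band: fine step
high = [0.4, -0.3, 1.2]    # less important band: coarse step
q_low = quantize(low, 0.5)
q_high = quantize(high, 2.0)
print(q_low, q_high)  # [20, 20, 20] [0, 0, 1]
```

The coarse step drives most high-band coefficients to zero (cheap to entropy-code) while the important low band keeps fine precision, which is precisely the unequal bit allocation the slide describes.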
3.1 Subband Coding: Summary
- The fundamental concept behind subband coding is to split up the frequency band of a signal and then to code each subband using a coder that accurately matches the statistics of that band.
- At the receiver, the sub-bands are up-sampled, fed through interpolation filters and added together to reconstruct the image.
- With an appropriate choice of filters, perfect reconstruction can be achieved.