UNIVERSITY OF DUBLIN TRINITY COLLEGE


UNIVERSITY OF DUBLIN TRINITY COLLEGE
FACULTY OF ENGINEERING, MATHEMATICS & SCIENCE
SCHOOL OF ENGINEERING
Electronic and Electrical Engineering
Senior Sophister, Trinity Term, 2010
Engineering Annual Examinations

DIGITAL MEDIA PROCESSING (4C8)

Date: XXth June 2010    Venue: XXower Luce    Time: 14:00-17:00

Prof. A. C. Kokaram

Answer FOUR (4) questions, including question 1. Please answer questions from each section in separate answer books.

Permitted Materials: Calculator, Drawing Instruments, Mathematical Tables, Graph Paper

Page 1 of 14

SECTION A (COMPULSORY)

1. The image f[h,k], shown in figure 1 (top left), is to be processed with an enhancement operation designed to improve the sharpness of the high frequency components of the image.

(a) The Matlab code below shows the design of a 1-D low pass filter using a Gaussian window.

x = -50:50; var = 4.5*4.5;
g = exp(-x.*x/(2*var));
g = g/sum(g);

Why is it necessary to make the d.c. gain of this filter equal to unity, and in which line is this achieved in the code above?

(b) Figure 1 shows, on the top right, the frequency response of two 1-D low-pass filters, both having 101 taps. The Gaussian filter g[k] has the frequency response shown by the solid line, while the near-ideal low pass filter w[k] has the frequency response shown by the dashed line. Both of these filters are used to create the respective 2D separable filters g[h,k] and w[h,k]. For instance, g[h,k] is created using g[k]. The image f[h,k] is then processed with these two separable filters to yield two output images f_g[h,k] and f_w[h,k] respectively. The output images are shown in Figure 1 in no particular order. Match the output images with the correct filter that produced them, giving your reasons for your selection.

(c) Figure 2 shows an image edge enhancement system that boosts the high frequency content of an image by a factor of 4. The input to the system is f[h,k], shown previously. The system diagram (Fig 2) shows a number of output signals labelled A and B: A is an intermediate output while B is the final image output. The key component of the system is the low pass filter indicated by LPF. Figure 3 shows a number of possible output images when g[h,k] or w[h,k] is used as the LPF stage. Match A and B to the correct image from this set when i) g[h,k] is used as the LPF and ii) w[h,k] is used as the LPF. Explain your choices, ensuring to relate what you see to the frequency response of the respective filters.

Page 2 of 14
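For reference, the filter design in part (a) can be reproduced outside Matlab. The NumPy sketch below (an illustration, not part of the paper) builds the same 101-tap Gaussian window and confirms that dividing by sum(g) sets the d.c. gain, i.e. the sum of the taps, to unity, so the filter does not scale the mean image intensity.

```python
import numpy as np

# 101-tap Gaussian low pass filter, mirroring the Matlab fragment of Q1(a).
x = np.arange(-50, 51)
var = 4.5 * 4.5
g = np.exp(-x * x / (2 * var))

# Normalise so the d.c. gain H(e^j0) = sum of the taps = 1.
g = g / g.sum()

print(g.sum())                        # d.c. gain via the tap sum
print(np.abs(np.fft.fft(g, 1024))[0])  # same quantity via the frequency response
```

Both printed values are 1.0, confirming that the last line of the Matlab code is where the unity d.c. gain is imposed.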

[Figure 1: Top Left: Original Zoneplate Image (mid-gray = 0 intensity). Top Right: Frequency response of the 1-D filters, magnitude of H(e^jw) plotted against frequency (rad/sample). Bottom Row: Low pass filter outputs i) and ii) (mid-gray = 0 intensity).]

[Figure 2: Enhancement System]

Page 3 of 14

[Figure 3: Intermediate and final outputs (iii), (iv), (v), (vi) from the enhancement system (mid-gray = 0 intensity).]

Page 4 of 14

SECTION B

2. (a) In no more than about one page, describe the process of Full Search Block Matching for motion estimation. Mention concepts such as block size and search width, and write down the image sequence model that is assumed implicitly in this motion estimation process. [8 marks]

(b) A pixel based motion detector is used to detect motion in a real video signal, G_n(x). It thresholds the absolute value of the inter-frame pixel difference D_n(x) = G_n(x) - G_{n-1}(x) as follows:

    b_n(x) = 1 if |D_n(x)| > T, 0 otherwise    (1)

where b_n(x) is the motion detector output and is set to a value of 1 (one) at each pixel at which motion is detected. In a real video signal, it is possible to model the p.d.f. of D_n(x) as a mixture of a Gaussian p.d.f., where the picture is not moving, and a Laplacian p.d.f., where it is moving. Assume that the probability distribution of the pixel difference D_n(x) in moving areas is Laplacian with x_0 = 6.5, as follows:

    p(D_n(x)) = (1/(2 x_0)) exp(-|D_n(x)|/x_0)    (2)

Therefore the probability that a pixel is incorrectly classified as stationary when it is moving (i.e. b_n(x) = 0) in these areas is p_m(T) = 1 - exp(-T/x_0).

i. Assume that the probability distribution of the pixel difference D_n(x) in stationary areas is Gaussian with variance sigma_v^2 = 100 and zero mean. Show that the probability that a pixel is misclassified as moving (i.e. b_n(x) = 1) in stationary areas is p_n(T) = erfc(T/sqrt(2 sigma_v^2)). See Table 1 for the required integral. [4 marks]

ii. Over the range T = [3 : 9], plot on the same graph curves of p_n(T) and p_m(T), and hence choose an optimum threshold for correct classification of moving regions. Explain your choice. [8 marks]

iii. A 1D [1, 1, 1]/3 filter is used in a separable implementation to create a 2D low pass filter H. This low pass filter is used to pre-process the pixel difference D_n(x). Calculate

Page 5 of 14

the reduction in the variance of D_n in stationary areas and, assuming there is no significant effect in moving areas, estimate the new optimal threshold that should be used. [5 marks]

Page 6 of 14
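Parts (b)(ii) and (b)(iii) can be checked numerically. The sketch below is illustrative only (the exam expects a plot read against Table 1); it evaluates the two error probabilities and, as one simple optimality criterion, picks the integer threshold where they balance. The variance-reduction factor in the comment follows from the fact that filtering white noise scales its variance by the sum of the squared filter taps.

```python
import math

x0 = 6.5          # Laplacian parameter for moving areas
sigma2 = 100.0    # Gaussian variance for stationary areas

def p_m(T):
    """P(miss): a moving pixel classified as stationary, 1 - exp(-T/x0)."""
    return 1.0 - math.exp(-T / x0)

def p_n(T, var=sigma2):
    """P(false alarm): a stationary pixel classified as moving, erfc(T/sqrt(2 var))."""
    return math.erfc(T / math.sqrt(2.0 * var))

# Scan T = 3..9 and pick the threshold where the two error rates balance.
best = min(range(3, 10), key=lambda T: abs(p_m(T) - p_n(T)))
print(best, p_m(best), p_n(best))

# Part iii: each pass of the [1,1,1]/3 filter over white noise scales the
# variance by sum(h^2) = 3*(1/3)^2 = 1/3; applied separably in 2D the new
# stationary variance is sigma2/9, which pulls the balance point down.
new_var = sigma2 / 9.0
best2 = min(range(1, 10), key=lambda T: abs(p_m(T) - p_n(T, new_var)))
print(new_var, best2)
```

With these numbers the balance point moves from T = 6 to T = 3 after pre-filtering, consistent with the smaller stationary-area variance.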

3. Figure 4 (below) shows the filter structure used in the analysis and reconstruction units of a 1D Perfect Reconstruction filterbank.

[Figure 4: Two-band filter banks for analysis (a) and reconstruction (b). Analysis: the input x(n) (X(z)) is filtered by H_0(z) and H_1(z); each output is decimated by 2 (keeping samples at even n) to give y_0(n) (Y_0(z)) and y_1(n) (Y_1(z)). Reconstruction: y_0(n) and y_1(n) are upsampled by 2 to give Y^_0(z) and Y^_1(z), filtered by G_0(z) and G_1(z) respectively, and summed to give x^(n) (X^(z)).]

(a) Describe the difference between the signals Y^_0(z) and Y_0(z). [4 marks]

(b) Using the Haar Wavelet, the analysis filters in the system above are as follows:

    H_0(z) = (1/sqrt(2)) (z^-1 + 1)
    H_1(z) = (1/sqrt(2)) (z^-1 - 1)

A signal x[k] = 1, 1, 0, 0, 1, 1, 0, 0 (for k = 0 : 7, 0 otherwise) is input to this Haar filter bank. Calculate the output signals y_0[n] and y_1[n]. [4 marks]

(c) The Haar filterbank in the configuration above creates a fully decimated transform and is not shift invariant. Using your signal output from above, explain the meaning of the terms fully decimated and shift invariant. Also explain why shift invariance is important in image manipulation. [6 marks]

(d) Draw a block diagram showing how the analysis stage above can be applied to create a 2D analysis filter bank resulting in 2 levels of decomposition. [6 marks]

(e) Figure 5 shows a single grey scale image. Figure 6 shows a series of bandpass images which correspond to the 2D DWT of the image on the left of Figure 5, using the Haar wavelet at level 1 (the low pass sub-band is omitted). The brightness of the images has

Page 7 of 14

been adjusted for easier viewing; the mid-gray scale level corresponds to a value of 0 (zero). Identify the corresponding Hi-Lo, Lo-Hi and Hi-Hi sub-bands at Level 1 of the 2D DWT from the bandpass image set shown in Figure 6. Use the typical convention for the arrangement of 2D DWT sub-bands as indicated in Figure 5. Explain your selections. [5 marks]

Page 8 of 14
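The analysis stage of part (b) can be simulated directly as a check. The NumPy sketch below (illustrative, not the model answer) convolves x[k] with the two Haar impulse responses and keeps the even-indexed samples, matching the "n even" decimation in Figure 4; it also verifies that this orthonormal transform preserves signal energy.

```python
import numpy as np

# Haar analysis filters from Q3(b): H0(z) = (z^-1 + 1)/sqrt(2),
# H1(z) = (z^-1 - 1)/sqrt(2); impulse responses indexed from the z^0 tap.
s = np.sqrt(2.0)
h0 = np.array([1.0, 1.0]) / s    # taps for z^0 and z^-1
h1 = np.array([-1.0, 1.0]) / s   # -1 at z^0, +1 at z^-1

x = np.array([1.0, 1, 0, 0, 1, 1, 0, 0])

# Filter, then keep the samples at even n (decimation by 2).
y0 = np.convolve(x, h0)[::2]
y1 = np.convolve(x, h1)[::2]
print(y0 * s)   # scaled by sqrt(2) for readability: [1, 1, 1, 1, 0]
print(y1 * s)   # [-1, 1, -1, 1, 0]

# Orthonormality check: subband energies sum to the input energy.
print(np.sum(y0**2) + np.sum(y1**2), np.sum(x**2))
```

Shifting x by one sample and repeating the run gives quite different subband values, which is a quick way to see the lack of shift invariance asked about in part (c).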

[Figure 5: Left: A Grey Scale Image. Right: The convention used for ordering the Level 1 sub-bands of the 2D DWT (Lo-Lo, Hi-Lo, Lo-Hi, Hi-Hi).]

[Figure 6: Level 1 Subbands 1), 2), 3), 4) of the 2D DWT using the Haar Transform.]

Page 9 of 14

4. (a) The 2-point, one-dimensional Haar transform of [x(1), x(2)] is given by the following relationship:

    [y(1); y(2)] = T [x(1); x(2)],  where  T = (1/sqrt(2)) [ 1   1 ]
                                                           [ 1  -1 ]    (3)

Derive an expression for the 4-point transformation matrix that is equivalent to 2 levels of the Haar Transform. [6 marks]

(b) The DCT is used for almost all the core compression techniques in media transmission today. The DCT coefficient matrix for transforming a single 8 pixel row into the 8 coefficients of the DCT is shown below. Since it is such an important transformation, it is typically required to be implemented as efficiently as possible in hardware or software. Show how the number of multiply/add operations required to perform the 8 point DCT can be reduced by exploiting the structure in the matrix below. Estimate the reduction in operation count that you achieve. [8 marks]

     0.3536  0.3536  0.3536  0.3536  0.3536  0.3536  0.3536  0.3536
     0.4904  0.4157  0.2778  0.0975 -0.0975 -0.2778 -0.4157 -0.4904
     0.4619  0.1913 -0.1913 -0.4619 -0.4619 -0.1913  0.1913  0.4619
     0.4157 -0.0975 -0.4904 -0.2778  0.2778  0.4904  0.0975 -0.4157
     0.3536 -0.3536 -0.3536  0.3536  0.3536 -0.3536 -0.3536  0.3536
     0.2778 -0.4904  0.0975  0.4157 -0.4157 -0.0975  0.4904 -0.2778
     0.1913 -0.4619  0.4619 -0.1913 -0.1913  0.4619 -0.4619  0.1913
     0.0975 -0.2778  0.4157 -0.4904  0.4904 -0.4157  0.2778 -0.0975

(c) A 2D block X of 8 x 8 pixels is to be transformed by a 2D Haar Transform and a 2D DCT. State how this can be achieved using a generic 1D, 8 point transform matrix T_8. [2 marks]

(d) Using your expression above, calculate, for both the Haar Transform and the DCT, the operations per pixel required to transform an 8 x 8 block of pixels. [4 marks]

(e) A mobile phone manufacturer wants to use either the Haar Transform or the DCT in its proprietary codec to compress images of size 480 x 640 streamed at 20 frames per second, i.e. as individual compressed frames. Using your answer above, calculate the number of

Page 10 of 14

operations per second needed in each case to perform the transform alone, and hence recommend the best filter to use in terms of least computational load. State one other issue that would be important in the choice of transform. [5 marks]

Page 11 of 14
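One standard route into part (b), sketched below in NumPy as an illustration rather than the model answer: the even-numbered rows of the DCT matrix are symmetric about their centre and the odd-numbered rows are antisymmetric. Forming the 4 butterfly sums x[n] + x[7-n] and 4 differences x[n] - x[7-n] first (additions only) lets two 4 x 4 products replace one 8 x 8 product, halving the multiply count from 64 to 32 per 8-point transform.

```python
import numpy as np

# 8-point orthonormal DCT-II matrix, rows k, columns n (the matrix in Q4(b)).
k = np.arange(8).reshape(-1, 1)
n = np.arange(8).reshape(1, -1)
C = np.cos((2 * n + 1) * k * np.pi / 16) / 2.0
C[0, :] = 1.0 / np.sqrt(8.0)   # DC row uses the smaller scale factor

x = np.arange(8.0)             # any test row of 8 pixels

# Direct transform: one 8x8 matrix-vector product, 64 multiplies.
y_direct = C @ x

# Folded transform: even rows see only x[n]+x[7-n], odd rows only x[n]-x[7-n].
s = x[:4] + x[7:3:-1]          # 4 butterfly sums (adds only)
d = x[:4] - x[7:3:-1]          # 4 butterfly differences
y = np.empty(8)
y[0::2] = C[0::2, :4] @ s      # 4x4 product: 16 multiplies
y[1::2] = C[1::2, :4] @ d      # 4x4 product: 16 multiplies

print(np.allclose(y, y_direct))  # same coefficients, half the multiplies

# Part (e) scale (illustrative arithmetic, assuming the direct 64-multiply
# 1D transform): a separable 8x8 block needs 16 row/column transforms, i.e.
# 1024 multiplies per 64 pixels = 16 multiplies/pixel, so 480*640 pixels at
# 20 fps costs about 480*640*20*16 ~ 9.8e7 multiplies per second.
print(480 * 640 * 20 * 16)
```

The 4 x 4 sub-matrices have further structure (the even half is itself a scaled 4-point DCT), which fast-DCT factorisations exploit to go well below 32 multiplies.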

5. The MPEG compression standard has been in place for over ten years and is the backbone of the Digital Television industry. A key component in the compression process is the step of predicting one frame from another, allowing for motion.

(a) In creating frame n from frame n-1, MPEG schemes employ motion compensated prediction. There are two ways of achieving this: i) frame n can be built by copying pixels from the respective motion compensated locations in frame n-1, or ii) pixels from frame n-1 can be copied into their respective motion compensated locations in frame n. In i), a pixel at site [h,k] in frame n is given the value at site [h+d_1, k+d_2] from frame n-1 (where d_1, d_2 are the two components of motion between the frames). In ii), pixels at site [h,k] in frame n-1 are copied into location [h-d_1, k-d_2] in frame n. Although these two prediction equations are mathematically equivalent, MPEG schemes choose to PULL, i.e. use (i), for motion compensation rather than PUSH, i.e. (ii). Explain why. [5 marks]

(b) Explain the meaning of the terms I-frames, B-frames, P-frames and GOP, and explain their purpose in the MPEG codec. [5 marks]

(c) Use a simple example to show and explain why the encode order of the frames in a GOP is different from the decode order. [4 marks]

(d) The bit rate of compressed movies tends to spike dramatically at shot cuts. Why is this, and how can an MPEG codec be modified to reduce this effect? [3 marks]

(e) The design of real time codecs requires that attention be paid to buffer sizes for motion compensated prediction, and different broadcasters select different encoding strategies depending on their real time requirements. YouTube uses H264 encoding with 90 frame GOPs having no B frames. Digital TV broadcasters typically use 12 frame GOPs with all frame types (typically only two consecutive B frames). State the likely frame sequence that is used to create GOPs in these two cases, and explain why these particular GOP choices are appropriate for the needs of the respective broadcaster. [8 marks]

Page 12 of 14
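The reordering asked about in part (c) can be made concrete with a small sketch (the frame labels and GOP pattern below are assumed for illustration, not taken from the paper). In a display order such as I B B P B B P, each B frame is bi-directionally predicted from the anchor (I or P) frames on either side of it, so the encoder must transmit the following anchor before the B frames that depend on it.

```python
# Display order of a short GOP: B frames are bi-directionally predicted from
# the nearest anchor (I or P) on each side, so anchors must be decoded first.
display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]

def encode_order(frames):
    """Reorder display-order frames into transmission/decode order:
    each anchor (I/P) is sent before the B frames that precede it."""
    out, pending_b = [], []
    for f in frames:
        if f[0] == "B":
            pending_b.append(f)      # hold B frames until their anchor is sent
        else:
            out.append(f)            # send the anchor first...
            out.extend(pending_b)    # ...then the B frames it enables
            pending_b = []
    return out + pending_b

print(encode_order(display))
# -> ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

The decoder buffers each anchor, decodes the intervening B frames, and re-sorts everything back into display order, which is why B frames add both latency and buffer cost.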

    x     erfc(x)     x     erfc(x)
    0.10  0.8875      1.30  0.0660
    0.20  0.7773      1.40  0.0477
    0.30  0.6714      1.50  0.0339
    0.40  0.5716      1.60  0.0237
    0.50  0.4795      1.70  0.0162
    0.60  0.3961      1.80  0.0109
    0.70  0.3222      1.90  0.0072
    0.80  0.2579      2.00  0.0047
    0.90  0.2031      2.10  0.0030
    1.00  0.1573      2.20  0.0019
    1.10  0.1198      2.30  0.0011
    1.20  0.0897      2.40  0.0007

Table 1: Values of the erfc(x) function, where erfc(x) = (2/sqrt(pi)) * integral from x to infinity of exp(-u^2) du.

(c) UNIVERSITY OF DUBLIN 2010

Page 13 of 14