Comparative study of coding efficiency in HEVC and VP9. Dr.K.R.Rao

Comparative study of coding efficiency in HEVC and VP9
EE5359 Multimedia Processing Final Report
Under the guidance of Dr. K. R. Rao
University of Texas at Arlington, Dept. of Electrical Engineering
Shwetha Chandrakant Kodpadi
1001051972
Shwetha.chandrakantkodpadi@mavs.uta.edu
Spring 2014

List of Acronyms and Abbreviations

ADST - Asymmetric Discrete Sine Transform
AVC - Advanced Video Coding
BD-BR - Bjøntegaard-Delta Bit-Rate Measurements
BD-PSNR - Bjøntegaard-Delta Peak Signal to Noise Ratio
CU - Coding Unit
CTU - Coding Tree Unit
DBF - Deblocking Filter
DFT - Discrete Fourier Transform
DCT - Discrete Cosine Transform
DST - Discrete Sine Transform
DPB - Decoded Picture Buffer
DC - Direct Current
HD - High Definition
HEVC - High Efficiency Video Coding
ITU-T - International Telecommunication Union (Telecommunication Standardization Sector)
JPEG - Joint Photographic Experts Group
JCT-VC - Joint Collaborative Team on Video Coding
MSE - Mean Square Error
MPEG - Moving Picture Experts Group
NGOV - Next Generation Open Video
PU - Prediction Unit
PSNR - Peak Signal to Noise Ratio
RD - Rate Distortion
SAO - Sample Adaptive Offset
SSIM - Structural Similarity Index
TM - True Motion
TU - Transform Unit
VCEG - Video Coding Experts Group

1. Objective

The objective of this project is to study, implement and compare the video coding standards HEVC and VP9 [1][3]. The analysis will be carried out on the intra-frame and inter-frame coding efficiency using performance metrics such as computational time, PSNR and BD-BR [14], and video quality will be evaluated for high-resolution videos. The HM Test Model 13.0 [12] will be used for HEVC and the VPX encoder from The WebM Project [13] will be used for VP9.

2. General compression dataflow

Both the HEVC and VP9 video compression standards are hybrid block-based codecs relying on spatial transformations [9]. The general compression dataflow of hybrid block-based encoders is illustrated in Figure 1. The input video frame is first partitioned into blocks of the same size called macroblocks. The compression and decoding process works within each macroblock. A macroblock is subdivided into smaller blocks to perform prediction. There are two basic types of prediction: intra and inter. Intra-prediction works within the current video frame and is based on the compressed and decoded data available for the block being predicted. Inter-prediction is used for motion compensation: a similar region in previously coded frames, close to the current block, is used for prediction. The aim of the prediction process is to reduce data redundancy so that redundant information is not stored in the coded bitstream.

Figure 1: Hybrid block-based codec dataflow [9]

Once the prediction is done, it is subtracted from the original data to obtain the residuals that are to be compressed. The residuals are subject to a forward Discrete Fourier Transform (DFT), which translates the spatial residual information into the frequency domain. Quantization is applied to the transformed matrix to discard insignificant information. The significance threshold is predetermined by the encoder configuration. The remaining data and the coding steps applied are then entropy coded, which produces the compressed bitstream.
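As a rough illustration of the residual, transform and quantization steps of Section 2, the following Python sketch pushes one 4x4 block through them and back. It is only a sketch under stated assumptions: it swaps in a floating-point orthonormal DCT-II (the transform family named later for both codecs) for the transform stage, uses a single uniform quantizer step `qstep`, and all function names are hypothetical. Neither HM nor the VPX encoder is implemented this way.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dct_2d(block):
    """Forward 2-D transform: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct_2d(coeffs):
    """Inverse 2-D transform: C^T * Y * C (C is orthonormal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

def encode_block(original, prediction, qstep):
    """Residual -> transform -> uniform quantization (assumed scalar step)."""
    residual = [[o - p for o, p in zip(ro, rp)]
                for ro, rp in zip(original, prediction)]
    coeffs = dct_2d(residual)
    return [[round(v / qstep) for v in row] for row in coeffs]

def reconstruct_block(levels, prediction, qstep):
    """Dequantize -> inverse transform -> add the prediction back."""
    coeffs = [[level * qstep for level in row] for row in levels]
    residual = idct_2d(coeffs)
    return [[p + r for p, r in zip(rp, rr)]
            for rp, rr in zip(prediction, residual)]
```

With a coarse step the reconstruction differs from the original by a bounded quantization error; with a very fine step it matches the original almost exactly, which is the trade-off the encoder configuration controls.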
For inter-prediction and intra-prediction purposes, the compressed data must also be restored in the encoder. Dequantization and the inverse DFT are performed to restore the residuals. The restored residuals and the predicted values are then summed to obtain restored pixel values identical to those obtained in the decoder. These restored values are used for intra-prediction within the current video frame. An additional frame post-processing stage is optionally applied to reduce the image blocking introduced by the DFT and quantization. The final restored and post-processed video frame is stored in the Decoded Picture Buffer (DPB) for inter-prediction of subsequent frames. HEVC and VP9 both use the described general compression dataflow but differ in the details [9].

3. High Efficiency Video Coding

3.1 Introduction

High Efficiency Video Coding (HEVC) is the latest video coding format [4]. It challenges the state-of-the-art H.264/AVC [20] video coding standard, which is in current use in the industry, by reducing the bit rate by 50% while retaining the same video quality. It came into existence in early 2012, although the Joint Collaborative Team on Video Coding (JCT-VC) was formed in January 2010 to carry out its development, and extensive development has been going on ever since. On 13 April 2013 [5], the HEVC standard, also called H.265, was approved by ITU-T. The JCT-VC is a group of video coding experts from ITU-T Study Group 16 (VCEG) and ISO/IEC JTC 1/SC 29/WG 11 (MPEG).

3.2 Encoder and Decoder

The HEVC standard is designed to achieve multiple goals, including coding efficiency, ease of transport system integration and data loss resilience, as well as implementability using parallel processing architectures [4]. Figures 2 and 3 show block diagrams of the HEVC encoder and decoder, respectively.

Figure 2: Encoder block diagram for HEVC [4]

Figure 3: Decoder block diagram for HEVC [17]

3.3 Coding Tools

3.3.1 Macroblock concept and Prediction block sizes

The concept of the macroblock in HEVC [9] is represented by the Coding Tree Unit (CTU). The CTU size can be 16x16, 32x32 or 64x64, while the AVC macroblock size is 16x16. A larger CTU size aims to improve the efficiency of block partitioning on high-resolution video sequences. Larger blocks motivate the introduction of quad-tree partitioning (Figure 4) of a CTU into smaller coding units (CUs). A coding unit is a bottom-level quad-tree syntax element of CTU splitting. The CU contains prediction units (PUs) and transform units (TUs).

Figure 4: CTU splitting example with solid lines for CU split: a) with PU splitting depicted as dotted lines; b) with TU splitting depicted as dotted lines [9]

The TU is a syntax element responsible for storing transform data. Allowed TU sizes are 32x32, 16x16, 8x8 and 4x4. The PU is a syntax element that stores prediction data such as the intra-prediction angle or the inter-prediction motion vector. A CU can contain up to four prediction units. CU splitting into PUs can be 2Nx2N, 2NxN, Nx2N, NxN, 2NxnU, 2NxnD, nLx2N and nRx2N (Figure 5), where 2N is the size of the CU being split. In intra-prediction mode only 2Nx2N PU splitting is allowed. An NxN PU split is also possible for a bottom-level CU that cannot be further split into sub-CUs.

Figure 5: PU splitting [9]

3.3.2 Prediction Modes

3.3.2.1 Intra Prediction Modes

There are a total of 35 intra-prediction modes in HEVC: planar (mode 0), DC (mode 1) and 33 angular modes (modes 2-34 in Figure 6). DC intra-prediction is the simplest mode in HEVC: all PU pixels are set equal to the mean value of all available neighboring pixels. Planar intra-prediction is the most computationally expensive; it is a two-dimensional linear interpolation. Angular intra-prediction modes 2-34 are linear interpolations of pixel values in the corresponding directions. Vertical intra-prediction (modes 18-34) is an up-down interpolation of neighboring pixel values. Intra prediction can also be done at different block sizes, ranging from 4x4 to 64x64 (whatever size the PU has) (Figure 7).

Figure 6: Modes and directional orientations for intra picture prediction for HEVC [1]

Figure 7: Luma intra prediction modes for different PU sizes in HEVC [8]
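The DC mode described in Section 3.3.2.1 is simple enough to sketch directly. The following Python function is a minimal illustration, assuming the reconstructed neighbors are passed in as two lists (`top` and `left`, hypothetical names) and that a mid-gray value of 128 is used when no neighbor is available. The normative HEVC process uses integer arithmetic and additional edge smoothing, neither of which is reproduced here.

```python
def dc_predict(top, left, size, default=128):
    """Fill a size x size block with the mean of the available neighbors.

    top / left: reconstructed samples above and to the left of the block
    (either may be empty). default is an assumed fallback for 8-bit video.
    """
    neighbors = list(top) + list(left)
    if neighbors:
        dc = round(sum(neighbors) / len(neighbors))
    else:
        dc = default
    return [[dc] * size for _ in range(size)]

# Example: a 4x4 PU with four neighbors above and four to the left.
block = dc_predict([100, 102, 104, 106], [98, 100, 102, 104], 4)
```

Every sample of `block` carries the same value, which is why DC prediction suits flat regions and leaves any texture to the residual.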

3.3.2.2 Inter Prediction

Each PU is predicted from image data in one or two reference pictures (before or after the current picture in display order), using motion compensated prediction.

3.3.3 Transform and Quantization

Any residual data remaining after prediction is transformed using a block transform based on the integer Discrete Cosine Transform (DCT) [22]. Only for 4x4 intra luma blocks, a transform based on the Discrete Sine Transform (DST) is used. One or more block transforms of size 32x32, 16x16, 8x8 and 4x4 are applied to the residual data in each CU. The transformed data is then quantized.

Figure 8: CTU showing range of transform (TU) sizes [18]

3.3.4 Entropy Coding

Context adaptive binary arithmetic coding (CABAC) is used for entropy coding. This is similar to the CABAC scheme in H.264/MPEG-4 AVC [20], but has undergone several changes to improve its throughput speed (especially for parallel-processing architectures) and its compression performance, and to reduce its context memory requirements.

3.3.5 Post Processing

One or two filtering stages can optionally be applied (within the inter-picture prediction loop) before the reconstructed picture is written into the decoded picture buffer. A deblocking filter (DBF) is used that is similar to the one in AVC; however, the DBF design has been simplified with regard to its decision-making and filtering processes and has been made friendlier to parallel processing. The second stage, called the sample adaptive offset (SAO) filter, is a non-linear amplitude mapping. The goal of SAO is to improve the reconstruction of the signal amplitude by adding an offset based on a look-up table mapping that is controlled by the encoder. Two types of SAO operation can be selected for each CTB: the band offset and edge offset modes, where, depending on additional criteria (amplitude or local directional amplitude constellation), an offset value is added to the reconstructed sample amplitude.

4. VP9

4.1 Introduction

VP9 is an open and royalty-free video compression standard being developed by Google [2][3]. VP9 had the earlier development names Next Generation Open Video (NGOV) and VP-Next. VP9 is the successor to VP8. Development of VP9 started in Q3 2011. One of the goals of VP9 is to reduce the bit rate by 50% compared to VP8 at the same video quality [7]. VP9 also aims to improve to the point where it would have better compression efficiency than High Efficiency Video Coding. VP9 expands on techniques used in H.264/AVC and VP8 and is very likely to replace AVC, at least in the YouTube video service [9].

4.2 Encoder and Decoder

A large part of the advances made by VP9 over its predecessors is a natural progression from current-generation video codecs to the next. Figures 9 and 10 show block diagrams of the VP9 encoder and decoder, respectively.

Figure 9: Encoder block diagram for VP9 [19] (blocks: DCT, scan ordering, uniform quantization, entropy encoding, inverse quantization, scan reordering, inverse DCT, motion estimation, motion compensation, prediction, loop filter, previous frame buffer, golden frame buffer)

Figure 10: Decoder block diagram for VP9 [19] (blocks: entropy decoding, inverse quantization, scan reordering, IDCT, motion compensation, prediction, loop filter, previous frame buffer, golden frame buffer)

4.3 Coding Tools

4.3.1 Prediction Block Sizes

A large part of the coding efficiency improvements achieved in VP9 can be attributed to the incorporation of larger prediction block sizes [9][3]. VP9 introduces super-blocks (SB) of size up to 64x64 and allows breakdown using recursive decomposition all the way down to 4x4. Unlike HEVC, any sub-block can be split into prediction blocks in intra mode. Furthermore, rectangular intra-prediction blocks are possible, as demonstrated in Figure 11. Each sub-block may be further split into prediction blocks and transform blocks, as shown in Figure 12.a and Figure 12.b respectively. Intra-prediction in VP9 is still performed on square regions, so a rectangular prediction block represents two square prediction blocks with the same prediction mode. By analogy with HEVC, prediction splitting 2Nx2N, NxN, 2NxN or Nx2N is available (Figure 12.a), where 2Nx2N is the size of the block being split. It is worth mentioning that 4x4 prediction blocks are determined within the corresponding 8x8 block as a group, unlike other prediction sizes, for which prediction data is stored per prediction block. As in HEVC, a sub-block can be split into transform blocks in a quad-tree structure down to the smallest 4x4 block. The allowed transform sizes are 32x32, 16x16, 8x8 and 4x4 (Figure 12.b).
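The recursive decomposition of a 64x64 super-block described in Section 4.3.1 can be illustrated with a short Python sketch. It assumes a caller-supplied predicate `should_split(x, y, size)`, a hypothetical stand-in for the encoder's rate-distortion decision, and it models only square quad-tree splits, not the rectangular partitions of Figure 11.

```python
def partition(x, y, size, should_split, min_size=4):
    """Return the leaf blocks (x, y, size) of a recursive quad-tree split."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        # Visit the four quadrants in raster order.
        for dy in (0, half):
            for dx in (0, half):
                leaves += partition(x + dx, y + dy, half,
                                    should_split, min_size)
        return leaves
    return [(x, y, size)]

# Example rule (an assumption for illustration): keep splitting only the
# block in the top-left corner until it reaches 8x8.
leaves = partition(0, 0, 64, lambda x, y, size: x == 0 and y == 0 and size > 8)
```

Whatever rule is supplied, the leaves tile the super-block exactly, so their areas always sum to 64 x 64 = 4096 samples.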

Figure 11: Example partitioning of a 64x64 super-block

Figure 12: Superblock splitting example with solid lines for block split: a) with prediction splitting depicted as dotted lines; b) with transform splitting depicted as dotted lines [9]

4.3.2 Prediction Modes

4.3.2.1 Intra-prediction Modes

VP9 supports a set of 10 intra-prediction modes [9] for block sizes ranging from 4x4 up to 32x32: DC_PRED (DC prediction), TM_PRED (True-motion prediction), H_PRED (horizontal prediction), V_PRED (vertical prediction), and 6 oblique directional prediction modes: D27, D153, D135, D117, D63 and D45, corresponding approximately to angles of 27, 153, 135, 117, 63 and 45 degrees (measured counter-clockwise against the horizontal axis). The horizontal, vertical and oblique directional prediction modes involve copying (or estimating) pixel values from surrounding blocks into the current block along the angle specified by the prediction mode. Figure 13 shows the angular intra-prediction modes in VP9.

Figure 13: VP9 angular intra-prediction modes [9]

4.3.2.2 Inter Prediction Modes

VP9 supports a set of 4 inter-prediction modes for block sizes ranging from 4x4 up to 64x64 pixels: NEARESTMV, NEARMV, ZEROMV and NEWMV [3].

4.3.3 Transform and quantization

The residuals obtained after subtraction of the predicted pixel values are subjected to transformation and quantization [9]. Transform blocks can be 32x32, 16x16, 8x8 or 4x4 pixels. As in most other coding standards, these transforms are an integer approximation of the DCT. For intra-coded blocks, either or both of the vertical and horizontal transform passes can be a DST (discrete sine transform) instead. This choice reflects the specific characteristics of the residual signal of intra blocks.

In addition, VP9 introduces support for a new transform type, the Asymmetric Discrete Sine Transform (ADST), which can be used in combination with specific intra-prediction modes. Intra-prediction modes that predict from the left edge can use the 1-D ADST in the horizontal direction, combined with a 1-D DCT in the vertical direction. Similarly, the residual signal resulting from intra-prediction modes that predict from the top edge can employ a vertical 1-D ADST combined with a horizontal 1-D DCT. Intra-prediction modes that predict from both edges, such as the True Motion mode and some diagonal intra-prediction modes, use the 1-D ADST in both the horizontal and vertical directions.
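The pairing of intra-prediction modes with 1-D transforms described in Section 4.3.3 can be written as a small lookup. The Python sketch below covers only the cases the text spells out (prediction from the left edge, from the top edge, and from both edges). The function name, the string labels and the DCT/DCT fallback for every other mode are assumptions made for illustration; this is not the mapping used in the VPX encoder source.

```python
def select_transforms(intra_mode):
    """Return (horizontal_1d, vertical_1d) transform names for an intra mode."""
    if intra_mode == "H_PRED":
        # Predicts from the left edge: ADST horizontally, DCT vertically.
        return ("ADST", "DCT")
    if intra_mode == "V_PRED":
        # Predicts from the top edge: DCT horizontally, ADST vertically.
        return ("DCT", "ADST")
    if intra_mode == "TM_PRED":
        # Predicts from both edges: ADST in both directions.
        return ("ADST", "ADST")
    # Assumed fallback for modes the text does not cover explicitly.
    return ("DCT", "DCT")
```

The diagonal modes that the text says also use the ADST in both directions are deliberately left on the fallback path, because the text does not say which ones they are.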

4.3.4 Entropy coding

VP9 uses the 8-bit arithmetic coding engine from VP8 known as the bool-coder [9]. Unlike in AVC or HEVC, the probabilities of the bool-coder do not change adaptively within a frame. VP9 makes use of forward context updates through flags in the frame header that signal modifications of the coding contexts at the start of each frame. These probabilities are stored in what is known as a frame context. The decoder maintains four of these contexts, and each frame specifies in the bitstream which one to use.

4.3.5 Post-processing

There is only one possible post-processing stage in VP9: the deblocking filter [9]. It aims to reduce blocking artifacts on superblocks, filtering vertical edges first and horizontal edges second. VP9 has 16-, 8-, 4- and 2-pixel-wide filters, with half of the filter size on each side of a boundary. VP9 also incorporates a flatness detector in the loop filter that detects flat regions and varies the filter strength and size accordingly.

5. Performance comparison metrics

5.1 MSE and PSNR

MSE and PSNR [17] for an NxM pixel image are defined in equations 1 and 2, where O is the original image and R is the reconstructed image, M and N are the width and height of the image, and L is the maximum pixel value in the NxM pixel image.

$$\mathrm{MSE} = \frac{1}{NM} \sum_{i=1}^{N} \sum_{j=1}^{M} \left[ O(i,j) - R(i,j) \right]^2 \tag{1}$$

$$\mathrm{PSNR} = 10 \log_{10} \frac{L^2}{\mathrm{MSE}} \tag{2}$$

5.2 Structural Similarity Index

The structural similarity (SSIM) [10] index is a method for measuring the similarity between two images. SSIM builds on the observation that the human visual system is highly adapted to extract structural information from visual scenes. Therefore, a structural similarity measurement should provide a good approximation of perceptual image quality. SSIM is designed to improve on methods like peak signal-to-noise ratio (PSNR) and mean squared error (MSE), which have proved to be inconsistent with human visual perception. SSIM considers image degradation as a perceived change in structural information. Structural information is the idea that pixels have strong inter-dependencies, especially when they are spatially close.
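The MSE and PSNR definitions of Section 5.1 translate directly into code. The following Python sketch is a minimal illustration, assuming images are given as lists of rows of integer samples and that the peak value L defaults to 255 (8-bit video); the function names are illustrative and are not taken from HM or the VPX encoder.

```python
import math

def mse(original, reconstructed):
    """Mean squared error between two equally sized images (lists of rows)."""
    n = len(original)        # number of rows (height N)
    m = len(original[0])     # number of columns (width M)
    total = sum((o - r) ** 2
                for row_o, row_r in zip(original, reconstructed)
                for o, r in zip(row_o, row_r))
    return total / (n * m)

def psnr(original, reconstructed, max_value=255):
    """PSNR in dB; returns infinity when the images are identical."""
    err = mse(original, reconstructed)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)
```

Because PSNR depends only on the per-sample error energy, two reconstructions with the same MSE get the same PSNR even if one looks clearly worse, which is the limitation the SSIM discussion in Section 5.2 addresses.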
SSIM is defined in equation 3.

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)} \tag{3}$$

Here x and y correspond to the two signals that are compared for similarity, i.e. two blocks in two separate images; \mu_x and \mu_y are their means, \sigma_x^2 and \sigma_y^2 their variances, \sigma_{xy} their covariance, and c_1 and c_2 are small constants that stabilize the division [10].

5.3 Bjøntegaard-Delta Bit-Rate Measurements

For rate-distortion (R-D) performance assessment [14], the Bjøntegaard-Delta bit-rate (BD-BR) measurement method is used to calculate average bit-rate differences between R-D curves for the same objective quality (e.g., for the same PSNR_YUV values), where negative BD-BR values indicate actual bit-rate savings. As part of this project, the BD-BR performance metric will be used to determine bit-rate savings. The average PSNR is calculated for the 4:2:0 sub-sampling pattern; equation 4 gives the formula for the average PSNR value.

$$\mathrm{PSNR}_{YUV} = \frac{6 \cdot \mathrm{PSNR}_Y + \mathrm{PSNR}_U + \mathrm{PSNR}_V}{8} \tag{4}$$

6. Implementation

For comparison purposes, open-source implementations of the reviewed codecs will be used. HEVC compression efficiency will be measured with the HM Test Model [12]. Evaluation of VP9 compression performance will be carried out with the VPX encoder from The WebM Project [13]. Since HEVC has more intra-prediction modes and a few other features than VP9, both codecs are configured to establish a fair comparison. Encoding time is used to compare the implementation complexity.

7. Test Sequences

The implementation will be carried out on the .yuv video sequences listed in Table 1. They have different resolutions and frame rates, covering as many use cases as possible.

Table 1: Test sequences [9]
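Before the results, the BD-BR measure of Section 5.3 can be sketched in Python. This is an illustrative sketch, not the tool used to produce the BD-BR figures in this report. It assumes exactly four (bitrate, PSNR) points per codec with distinct PSNR values, fits a cubic to the logarithm of the bitrate as a function of PSNR (here by exact Lagrange interpolation), and averages the difference between the two fits over the overlapping PSNR range. The function names are hypothetical.

```python
import math

def _cubic_through(points):
    """Return f(x), the Lagrange polynomial through four (x, y) points."""
    def f(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return f

def bd_rate(anchor, test):
    """Average bitrate difference of `test` relative to `anchor`, in percent.

    anchor, test: lists of four (bitrate, psnr) pairs. A negative result
    means `test` needs less bitrate than `anchor` for the same PSNR.
    """
    fa = _cubic_through([(p, math.log(r)) for r, p in anchor])
    ft = _cubic_through([(p, math.log(r)) for r, p in test])
    lo = max(min(p for _, p in anchor), min(p for _, p in test))
    hi = min(max(p for _, p in anchor), max(p for _, p in test))
    if hi <= lo:
        raise ValueError("the two curves have no overlapping PSNR range")
    mid = 0.5 * (lo + hi)

    def diff(p):
        return ft(p) - fa(p)

    # Simpson's rule integrates a cubic exactly, so one panel over [lo, hi]
    # gives the exact mean of the log-rate difference.
    avg = (diff(lo) + 4.0 * diff(mid) + diff(hi)) / 6.0
    return (math.exp(avg) - 1.0) * 100.0
```

As a sanity check of the sign convention, a test curve whose bitrate is 10% lower than the anchor at every PSNR point yields a BD-BR of -10%, matching the statement that negative values indicate bit-rate savings.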

Figures 14, 15, 16 and 17 are frames of the test sequences RaceHorses, BasketballDrill, Kimono1 and PeopleOnStreet, respectively.

Figure 14: RaceHorses (416x240)

Figure 15: BasketballDrill (832x480)

Figure 16: Kimono1 (1920x1080)

Figure 17: PeopleOnStreet (2560x1600)

8. Implementation results

8.1 All Intra Mode (AI)

To compare intra compression efficiency between HEVC and VP9, the All Intra Main configuration is used in HEVC. In VP9, the key frame parameter is adjusted to make the VPX encoder behave in All Intra (AI) mode. PSNR and bitrate values are recorded for different quantization parameters (22, 27, 32 and 37) for each test sequence. Tables 2, 3, 4 and 5 give the implementation results for the test sequences RaceHorses, BasketballDrill, Kimono1 and PeopleOnStreet, respectively. The implementation was carried out for 20 frames of each test sequence.

RaceHorses_416x240_30.yuv (AI mode)
QP | HEVC PSNR (dB) | HEVC Bitrate (Kbit/s) | HEVC Encoding time (s) | VP9 PSNR (dB) | VP9 Bitrate (Kbit/s) | VP9 Encoding time (s)
22 | 42.5720 | 5057.16 | 63.726 | 41.143 | 6507.95 | 52.426
27 | 38.6146 | 3152.1000 | 57.084 | 40.191 | 4683.97 | 50185
32 | 34.9992 | 1814.8920 | 48.975 | 36.230 | 2380.11 | 42.470
37 | 31.9718 | 979.5840 | 41.869 | 35.913 | 2262.14 | 40.828

Table 2: Implementation results for RaceHorses_416x240_30.yuv sequence (AI mode)

BasketballDrill_832x480_50.yuv (AI mode)
QP | HEVC PSNR (dB) | HEVC Bitrate (Kbit/s) | HEVC Encoding time (s) | VP9 PSNR (dB) | VP9 Bitrate (Kbit/s) | VP9 Encoding time (s)
22 | 42.3716 | 20407.88 | 278.849 | 42.3070 | 22101.34 | 197.060
27 | 39.0915 | 11014.04 | 220.022 | 39.553 | 13454.87 | 229.792
32 | 36.2986 | 5847.02 | 196.536 | 38.224 | 10621.66 | 178.128
37 | 33.9144 | 3200.72 | 162.733 | 35.831 | 7410.842 | 163.153

Table 3: Implementation results for BasketballDrill_832x480_50.yuv sequence (AI mode)

Kimono1_1920x1080_24.yuv (AI mode)
QP | HEVC PSNR (dB) | HEVC Bitrate (Kbit/s) | HEVC Encoding time (s) | VP9 PSNR (dB) | VP9 Bitrate (Kbit/s) | VP9 Encoding time (s)
22 | 43.3716 | 18738.4512 | 1054.518 | 43.164 | 20764.76 | 940.820
27 | 42.0109 | 10746.1920 | 886.952 | 42.053 | 13404.4746 | 749.586
32 | 39.7045 | 6408.2112 | 830.751 | 40.955 | 9977.10 | 634.479
37 | 38.0951 | 3776.95 | 785.519 | 39.868 | 7508.94 | 659.983

Table 4: Implementation results for Kimono1_1920x1080_24.yuv sequence (AI mode)

PeopleOnStreet_2560x1600_30_crop.yuv (AI mode)
QP | HEVC PSNR (dB) | HEVC Bitrate (Kbit/s) | HEVC Encoding time (s) | VP9 PSNR (dB) | VP9 Bitrate (Kbit/s) | VP9 Encoding time (s)
22 | 43.8084 | 104202.5640 | 2109.036 | 43.219 | 101898.87 | 1714.655
27 | 40.6998 | 60435.6720 | 1773.904 | 40.387 | 62989.20 | 1577.737
32 | 37.8947 | 34338.7800 | 1661.502 | 38.794 | 47220.45 | 1481.304
37 | 35.4149 | 19983.8040 | 1571.038 | 36.611 | 35706.77 | 1347.487

Table 5: Implementation results for PeopleOnStreet_2560x1600_30_crop.yuv sequence (AI mode)

Figures 18, 19, 20 and 21 illustrate the bitrate-PSNR plots for the test sequences RaceHorses, BasketballDrill, Kimono1 and PeopleOnStreet, respectively.

Figure 18: R-D plot for RaceHorses_416x240_30.yuv (AI mode) (PSNR versus bitrate in kbps)

Figure 19: R-D plot for BasketballDrill_832x480_50.yuv (AI mode) (PSNR versus bitrate in kbps)

Figure 20: R-D plot for Kimono1_1920x1080_24.yuv (AI mode) (PSNR versus bitrate in kbps)

Figure 21: R-D plot for PeopleOnStreet_2560x1600_30_crop.yuv (AI mode) (PSNR versus bitrate in kbps)

Figures 22 and 23 illustrate the average encoding time taken by HEVC and VP9, and the BD-BR for HEVC and VP9, respectively.

Sequence | HEVC average encoding time (s) | VP9 average encoding time (s)
RaceHorses_416x240_30 | 84.66 | 74.36
BasketballDrill_832x480_50 | 214.53 | 192.03
Kimono1_1920x1080_24 | 889.43 | 746.21
PeopleOnStreet_2560x1600_30_crop | 1778.87 | 1530.29

Figure 22: Average encoding time of HEVC and VP9 (AI mode)

Sequence | BD-Bitrate (%)
RaceHorses_416x240_30 | -10.38
BasketballDrill_832x480_50 | -14.07
Kimono1_1920x1080_24 | -17.17
PeopleOnStreet_2560x1600_30_crop | -12.6

Figure 23: Bitrate savings of HEVC over VP9 (AI mode)

8.2 Random Access Mode (RA)

To compare inter compression efficiency between HEVC and VP9, the Random Access Main configuration is used in HEVC.

PSNR and bitrate values are recorded for different quantization parameters (22, 27, 32 and 37) for each test sequence. Tables 6, 7, 8 and 9 give the implementation results for the test sequences RaceHorses, BasketballDrill, Kimono1 and PeopleOnStreet, respectively. The implementation was carried out for 20 frames of each test sequence.

RaceHorses_416x240_30.yuv (RA mode)
QP | HEVC PSNR (dB) | HEVC Bitrate (Kbit/s) | HEVC Encoding time (s) | VP9 PSNR (dB) | VP9 Bitrate (Kbit/s) | VP9 Encoding time (s)
22 | 39.74 | 1505.052 | 260.594 | 39.879 | 1746.91 | 185.702
27 | 36.15 | 769.092 | 222.183 | 38.090 | 1281.92 | 178.248
32 | 33.025 | 392.196 | 174.221 | 36.240 | 926.20 | 160.200
37 | 30.49 | 203.688 | 159.843 | 34.467 | 655.84 | 152.387

Table 6: Implementation results for RaceHorses_416x240_30.yuv sequence (RA mode)

BasketballDrill_832x480_50.yuv (RA mode)
QP | HEVC PSNR (dB) | HEVC Bitrate (Kbit/s) | HEVC Encoding time (s) | VP9 PSNR (dB) | VP9 Bitrate (Kbit/s) | VP9 Encoding time (s)
22 | 41.42 | 3479.880 | 790.675 | 40.458 | 2494.11 | 685.365
27 | 38.49 | 1699.040 | 736.582 | 38.998 | 1783.65 | 652.131
32 | 35.78 | 842.060 | 610.635 | 37.489 | 1259.46 | 618.282
37 | 33.50 | 447.220 | 530.509 | 36.087 | 894.03 | 591.511

Table 7: Implementation results for BasketballDrill_832x480_50.yuv sequence (RA mode)

Kimono1_1920x1080_24.yuv (RA mode)
QP | HEVC PSNR (dB) | HEVC Bitrate (Kbit/s) | HEVC Encoding time (s) | VP9 PSNR (dB) | VP9 Bitrate (Kbit/s) | VP9 Encoding time (s)
22 | 42.35 | 6381.0816 | 4775.592 | 42.639 | 9167.01 | 3351.029
27 | 40.43 | 3083.6832 | 4014.146 | 41.868 | 6592.11 | 3187.684
32 | 38.22 | 1527.2736 | 3630.341 | 40.933 | 4712.71 | 3020.938
37 | 36.14 | 782.1216 | 2812.486 | 39.895 | 3365.43 | 2630.570

Table 8: Implementation results for Kimono1_1920x1080_24.yuv sequence (RA mode)

PeopleOnStreet_2560x1600_30_crop.yuv (RA mode)
QP | HEVC PSNR (dB) | HEVC Bitrate (Kbit/s) | HEVC Encoding time (s) | VP9 PSNR (dB) | VP9 Bitrate (Kbit/s) | VP9 Encoding time (s)
22 | 41.3641 | 34199.1720 | 9794.637 | 41.718 | 41467.65 | 7338.623
27 | 38.68 | 16691.3760 | 7652.505 | 39.582 | 29001.44 | 6984.065
32 | 36.075 | 8834.2080 | 6707.458 | 38.824 | 20737.66 | 6290.122
37 | 33.727 | 4987.2000 | 6322.114 | 37.446 | 15223.66 | 6097.715

Table 9: Implementation results for PeopleOnStreet_2560x1600_30_crop.yuv sequence (RA mode)

Figures 24, 25, 26 and 27 illustrate the bitrate-PSNR plots for the test sequences RaceHorses, BasketballDrill, Kimono1 and PeopleOnStreet, respectively.

Figure 24: R-D plot for RaceHorses_416x240_30.yuv (RA mode) (PSNR versus bitrate in kbps)

Figure 25: R-D plot for BasketballDrill_832x480_50.yuv (RA mode) (PSNR versus bitrate in kbps)

Figure 26: R-D plot for Kimono1_1920x1080_24.yuv (RA mode) (PSNR versus bitrate in kbps)

Figure 27: R-D plot for PeopleOnStreet_2560x1600_30_crop.yuv (RA mode) (PSNR versus bitrate in kbps)

Figures 28 and 29 illustrate the encoding time taken by HEVC and VP9, and the BD-BR for HEVC and VP9 in Random Access mode, respectively.

Sequence | HEVC average encoding time (s) | VP9 average encoding time (s)
RaceHorses_416x240_30 | 204.21 | 169.13
BasketballDrill_832x480_50 | 667.1 | 636.82
Kimono1_1920x1080_24 | 3808.14 | 3047.55
PeopleOnStreet_2560x1600_30_crop | 7619.18 | 6677.63

Figure 28: Average encoding time for HEVC and VP9 (RA mode)

Sequence | BD-Bitrate (%)
RaceHorses_416x240_30 | -14.39
BasketballDrill_832x480_50 | 6.42
Kimono1_1920x1080_24 | -21.79
PeopleOnStreet_2560x1600_30_crop | -23.82

Figure 29: Bitrate savings of HEVC over VP9 (RA mode)

9. Conclusions

HEVC provides better compression rates than VP9, but VP9 is patent-free and can be used without licensing expenses. In both intra-frame and inter-frame coding, HEVC gives on average about 13% more bitrate savings than VP9. The encoding time taken by VP9 is marginally less than that of HEVC.

10. References

[1] G.J. Sullivan et al, "Overview of the high efficiency video coding (HEVC) standard", IEEE Trans. Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649-1668, Dec. 2012.
[2] D. Grois et al, "Performance Comparison of H.265/MPEG-HEVC, VP9, and H.264/MPEG-AVC Encoders", IEEE PCS 2013, pp. 394-397, San José, CA, USA, Dec. 8-11, 2013.
[3] D. Mukherjee et al, "The latest open-source video codec VP9: An overview and preliminary results", Google Inc., United States.
[4] G.J. Sullivan et al, "Standardized Extensions of High Efficiency Video Coding (HEVC)", IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 6, pp. 1001-1016, Dec. 2013.
[5] Article on HEVC: http://en.wikipedia.org/wiki/high_efficiency_video_coding
[6] Q. Cai et al, "Lossy and lossless intra coding performance evaluation: HEVC, H.264/AVC, JPEG 2000 and JPEG LS", Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific, vol. 9, no. 12, pp. 1-9, Dec. 2012.
[7] "VP-Next Overview and Progress Update" (PDF), WebM Project (Google). Retrieved 2012-12-29. Available on: http://downloads.webmproject.org/ngov2012/pdf/04-ngovproject-update.pdf
[8] M.T. Pourazad et al, "HEVC: The new gold standard for video compression", IEEE Consumer Electronics Magazine, vol. 1, no. 7, pp. 36-46, July 2012.
[9] M.P. Sharabayko et al, "Intra Compression Efficiency in VP9 and HEVC", Applied Mathematical Sciences, vol. 7, no. 137, pp. 6803-6824, Hikari Ltd, 2013.
[10] Z. Wang et al, "Image quality assessment: From error visibility to structural similarity", IEEE Trans. on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004.
[11] H. Jain, "Comparative performance analysis of HEVC and H.264 Intra frame coding and JPEG2000", EE5359, UTA, Spring 2013. http://www-ee.uta.edu/dip/courses/ee5359/index.html
[12] HM Reference Software: https://hevc.hhi.fraunhofer.de/hm-doc/
[13] Chromium open-source browser project, VP9 source code, Online: http://git.chromium.org/gitweb/?p=webm/libvpx.git;a=tree;f=vp9;hb=aaf61dfbcab414bfacc3171501be17d191ff8506
[14] G. Bjøntegaard, "Calculation of average PSNR differences between RD-curves", ITU-T Q.6/SG16 VCEG 13th Meeting, Document VCEG-M33, Austin, USA, Apr. 2001.
[15] S. Jeong et al, "High efficiency video coding for entertainment quality", ETRI Journal, vol. 33, pp. 145-154, 2011.
[16] JVT Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264-ISO/IEC 14496-10 AVC), March 2003, JVT-G050. http://ip.hhi.de/imagecom_g1/assets/pdfs/jvt-g050.pdf
[17] White paper on PSNR, NI: http://www.ni.com/white-paper/13306/en/
[18] HEVC tutorial by I.E.G. Richardson: http://www.vcodex.com/h265.html
[19] J. Padia, "Complexity reduction for VP6 to H.264 transcoder using motion vector reuse", M.S. Thesis, EE Dept., UTA, Arlington, TX, 2010. Available on: http://wwwee.uta.edu/dip/courses/ee5359/index.html
[20] T. Wiegand et al, "Overview of the H.264/AVC Video Coding Standard", IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560-576, Jul. 2003.

[21] G. Bjøntegaard, "Improvements of the BD-PSNR model", ITU-T SG16 Q.6, Doc. VCEG-AI11, Berlin, Germany, July 16-18, 2008.
[22] N. Ahmed, T. Natarajan, K.R. Rao, "Discrete Cosine Transform", IEEE Transactions on Computers, vol. C-23, pp. 90-93, Jan. 1974.
[23] I.E.G. Richardson, The H.264 advanced video compression standard, 2nd Edition, Hoboken, NJ, Wiley, 2010.
[24] I.E.G. Richardson, Video Codec Design: Developing Image and Video Compression Systems, Wiley, 2002.
[25] K.R. Rao, D.N. Kim and J.J. Hwang, Video Coding Standards: AVS China, H.264/MPEG-4 Part 10, HEVC, VP6, DIRAC and VC-1, Springer, 2014.
[26] B. Bross et al, "High Efficiency Video Coding (HEVC) Text Specification Draft 10", Document JCTVC-L1003, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC), Mar. 2013. Available on: http://phenix.itsudparis.eu/jct/doc_end_user/current_document.php?id=7243
[27] Special issues on HEVC:
1. Special issue on emerging research and standards in next generation video coding, IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, pp. 1646-1909, Dec. 2012.
2. Special issue on emerging research and standards in next generation video coding, IEEE Transactions on Circuits and Systems for Video Technology, vol. 23, pp. 2009-2142, Dec. 2013.
3. IEEE Journal of Selected Topics in Signal Processing, vol. 7, pp. 931-933, Dec. 2013.
[28] H. Zhang and Z. Ma, "Fast intra mode decision for High Efficiency Video Coding (HEVC)", IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, pp. 660-668, April 2014.