BLOCK STRUCTURE REUSE FOR MULTI-RATE HIGH EFFICIENCY VIDEO CODING. Damien Schroeder, Patrick Rehm and Eckehard Steinbach

Similar documents
A HIGHLY PARALLEL CODING UNIT SIZE SELECTION FOR HEVC. Liron Anavi, Avi Giterman, Maya Fainshtein, Vladi Solomon, and Yair Moshe

High Efficiency Video Coding (HEVC) test model HM vs. HM- 16.6: objective and subjective performance analysis

Fast HEVC Intra Mode Decision Based on Edge Detection and SATD Costs Classification

COMPARISON OF HIGH EFFICIENCY VIDEO CODING (HEVC) PERFORMANCE WITH H.264 ADVANCED VIDEO CODING (AVC)

Low-Complexity No-Reference PSNR Estimation for H.264/AVC Encoded Video

A COMPARISON OF CABAC THROUGHPUT FOR HEVC/H.265 VS. AVC/H.264. Massachusetts Institute of Technology Texas Instruments

Edge Detector Based Fast Level Decision Algorithm for Intra Prediction of HEVC

DUE TO THE ever-increasing demand for bit rate to

Fast Intra Mode Decision in High Efficiency Video Coding

EFFICIENT PU MODE DECISION AND MOTION ESTIMATION FOR H.264/AVC TO HEVC TRANSCODER

Analysis of Motion Estimation Algorithm in HEVC

Sample Adaptive Offset Optimization in HEVC

Cluster Adaptated Signalling for Intra Prediction in HEVC

Affine SKIP and MERGE Modes for Video Coding

FAST HEVC INTRA CODING ALGORITHM BASED ON MACHINE LEARNING AND LAPLACIAN TRANSPARENT COMPOSITE MODEL. Yi Shan and En-hui Yang, Fellow, IEEE

A comparison of CABAC throughput for HEVC/H.265 VS. AVC/H.264

Sparse Coding based Frequency Adaptive Loop Filtering for Video Coding

Homogeneous Transcoding of HEVC for bit rate reduction

QoE-based network-centric resource allocation for on-demand uplink adaptive HTTP streaming over LTE network

Multistream Video Encoder for Generating Multiple Dynamic Range Bitstreams

Hierarchical Fast Selection of Intraframe Prediction Mode in HEVC

Effective Quadtree Plus Binary Tree Block Partition Decision for Future Video Coding

Decoding-Assisted Inter Prediction for HEVC

Mode-dependent transform competition for HEVC

Mode-Dependent Pixel-Based Weighted Intra Prediction for HEVC Scalable Extension

Fast Coding Unit Decision Algorithm for HEVC Intra Coding

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 1

Prediction Mode Based Reference Line Synthesis for Intra Prediction of Video Coding

Rotate Intra Block Copy for Still Image Coding

PERCEPTUALLY-FRIENDLY RATE DISTORTION OPTIMIZATION IN HIGH EFFICIENCY VIDEO CODING. Sima Valizadeh, Panos Nasiopoulos and Rabab Ward

Low-cost Multi-hypothesis Motion Compensation for Video Coding

CONTENT ADAPTIVE COMPLEXITY REDUCTION SCHEME FOR QUALITY/FIDELITY SCALABLE HEVC

High Efficiency Video Decoding on Multicore Processor

FAST MOTION ESTIMATION DISCARDING LOW-IMPACT FRACTIONAL BLOCKS. Saverio G. Blasi, Ivan Zupancic and Ebroul Izquierdo

A DYNAMIC MOTION VECTOR REFERENCING SCHEME FOR VIDEO CODING. Jingning Han, Yaowu Xu, and James Bankoski

Testing HEVC model HM on objective and subjective way

EFFICIENT INTRA PREDICTION SCHEME FOR LIGHT FIELD IMAGE COMPRESSION

Comparative and performance analysis of HEVC and H.264 Intra frame coding and JPEG2000

MOTION COMPENSATION WITH HIGHER ORDER MOTION MODELS FOR HEVC. Cordula Heithausen and Jan Hendrik Vorwerk

Hierarchical complexity control algorithm for HEVC based on coding unit depth decision

Performance Comparison of HEVC and H.264/AVC Standards in Broadcasting Environments

Transcoding from H.264/AVC to High Efficiency Video Coding (HEVC)

LOW BIT-RATE INTRA CODING SCHEME BASED ON CONSTRAINED QUANTIZATION AND MEDIAN-TYPE FILTER. Chen Chen and Bing Zeng

HEVC The Next Generation Video Coding. 1 ELEG5502 Video Coding Technology

We are IntechOpen, the world s leading publisher of Open Access books Built by scientists, for scientists. International authors and editors

Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE Gaurav Hansda

Complexity Reduced Mode Selection of H.264/AVC Intra Coding

FAST CODING UNIT DEPTH DECISION FOR HEVC. Shanghai, China. China {marcusmu, song_li,

An Adaptive Video Compression Technique for Resource Constraint Systems

Fast CU Encoding Schemes Based on Merge Mode and Motion Estimation for HEVC Inter Prediction

Motion Modeling for Motion Vector Coding in HEVC

Dynamically Reconfigurable Architecture System for Time-varying Image Constraints (DRASTIC) for HEVC Intra Encoding

Video compression with 1-D directional transforms in H.264/AVC

Professor, CSE Department, Nirma University, Ahmedabad, India

An Efficient Mode Selection Algorithm for H.264

FAST PARTITIONING ALGORITHM FOR HEVC INTRA FRAME CODING USING MACHINE LEARNING

A BACKGROUND PROPORTION ADAPTIVE LAGRANGE MULTIPLIER SELECTION METHOD FOR SURVEILLANCE VIDEO ON HEVC

Efficient Parallel Architecture for a Real-time UHD Scalable HEVC Encoder

Reduced Frame Quantization in Video Coding

Descrambling Privacy Protected Information for Authenticated users in H.264/AVC Compressed Video

STUDY AND PERFORMANCE COMPARISON OF HEVC AND H.264 VIDEO CODECS

Convolutional Neural Networks based Intra Prediction for HEVC

HEVC. Complexity Reduction Algorithm for Quality Scalability in Scalable. 1. Introduction. Abstract

Comparative and performance analysis of HEVC and H.264 Intra frame coding and JPEG2000

Fast Transcoding From H.264/AVC To High Efficiency Video Coding

Fast Mode Decision for H.264/AVC Using Mode Prediction

FAST SPATIAL LAYER MODE DECISION BASED ON TEMPORAL LEVELS IN H.264/AVC SCALABLE EXTENSION

Analysis of Information Hiding Techniques in HEVC.

Jun Zhang, Feng Dai, Yongdong Zhang, and Chenggang Yan

Inter Prediction Complexity Reduction for HEVC based on Residuals Characteristics

Improved H.264/AVC Requantization Transcoding using Low-Complexity Interpolation Filters for 1/4-Pixel Motion Compensation

Fast inter-prediction algorithm based on motion vector information for high efficiency video coding

HEVC OVERVIEW. March InterDigital, Inc. All rights reserved.

arxiv: v1 [cs.mm] 27 Apr 2017

STUDY AND PERFORMANCE COMPARISON OF HEVC AND H.264 VIDEO ENCODERS

JOINT RATE ALLOCATION WITH BOTH LOOK-AHEAD AND FEEDBACK MODEL FOR HIGH EFFICIENCY VIDEO CODING

Performance Comparison of AV1, JEM, VP9, and HEVC Encoders

EFFICIENT QUANTIZATION PARAMETER ESTIMATION IN HEVC BASED ON ρ-domain

Reducing/eliminating visual artifacts in HEVC by the deblocking filter.

A LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING

Performance Comparison between DWT-based and DCT-based Encoders

International Journal of Emerging Technology and Advanced Engineering Website: (ISSN , Volume 2, Issue 4, April 2012)

MediaTek High Efficiency Video Coding

Unit-level Optimization for SVC Extractor

Fast Mode Assignment for Quality Scalable Extension of the High Efficiency Video Coding (HEVC) Standard: A Bayesian Approach

PREPRINT OF A PAPER TO BE PUBLISHED at 32nd PICTURE CODING SYMPOSIUM (PCS 2016), Nuremberg, Germany, Dec 4-7, 2016.

Scalable Extension of HEVC 한종기

Video encoders have always been one of the resource

A Fast Depth Intra Mode Selection Algorithm

"Block Artifacts Reduction Using Two HEVC Encoder Methods" Dr.K.R.RAO

Performance Evaluation of Kvazaar HEVC Intra Encoder on Xeon Phi Many-core Processor

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 20, NO. 11, NOVEMBER

High Efficiency Video Coding: The Next Gen Codec. Matthew Goldman Senior Vice President TV Compression Technology Ericsson

ENCODER COMPLEXITY REDUCTION WITH SELECTIVE MOTION MERGE IN HEVC ABHISHEK HASSAN THUNGARAJ. Presented to the Faculty of the Graduate School of

Pseudo sequence based 2-D hierarchical coding structure for light-field image compression

Efficient MPEG-2 to H.264/AVC Intra Transcoding in Transform-domain

An Improved H.26L Coder Using Lagrangian Coder Control. Summary

RATE-DISTORTION EVALUATION FOR TWO-LAYER CODING SYSTEMS. Philippe Hanhart and Touradj Ebrahimi

Transcoding from H.264/AVC to High Efficiency Video Coding (HEVC)

An Optimized Template Matching Approach to Intra Coding in Video/Image Compression

Transcription:

BLOCK STRUCTURE REUSE FOR MULTI-RATE HIGH EFFICIENCY VIDEO CODING Damien Schroeder, Patrick Rehm and Eckehard Steinbach Technische Universität München Chair of Media Technology Munich, Germany ABSTRACT Adaptive HTTP streaming requires a video to be encoded at multiple independently decodable bitrates. Encoding of multiple bitrates is a complex and time consuming process, especially with the new video coding standard HEVC. In this paper, we analyze the relation of HEVC block structures across encodings of a video at different bitrates. We propose a multirate encoding method for HEVC to decrease the overall encoding complexity while keeping the rate-distortion performance as high as possible. Experimental results show an average encoding time decrease of 27% compared to the reference HEVC encoder without degrading the rate-distortion performance. Index Terms HEVC, DASH, block structure, multirate encoding 1. INTRODUCTION Nowadays, video streaming is mostly implementing the adaptive HTTP streaming paradigm, recently standardized as Dynamic Adaptive Streaming over HTTP (DASH) [1]. The video content is made available at the server at different bitrates called representations, and the streaming clients can request the representation best suiting their requirements. Scalable codecs (e.g. [2]) have been proposed to encode a video and offering multiple bitrates. The rate-distortion (RD) performance of scalable codecs is, however, lower than with single layer coding. Transcoding can also be used to generate multiple bitrates from an encoded video. Again, the RD performance is lower than for a single layer coding due to requantization of the encoded video [3]. This paper focuses on the simultaneous encoding of multiple independently decodable representations of a single video, as represented in Figure 1. The new video coding standard High Efficiency Video Coding (HEVC) [4] offers improved RD performance compared to previous standards at the cost of increased encoding complexity. Since DASH requires a video to be encoded in multiple representations, the encoding with HEVC can become a performance bottleneck, especially in the case of live raw video reference encoding dependent enc. dependent enc. side information (e.g. block structure) DASH representations high quality low quality Fig. 1. Proposed multi-rate HEVC coding system. DASH streaming, and is thus generally performed on powerful servers. Accordingly, research on adaptive HTTP streaming has almost exclusively considered the downlink channel of communication, especially in mobile networks (e.g., [5, 6]). In order to enable a mobile uplink DASH communication with resource-constrained terminals, the complexity of the encoding of multiple representations has to be significantly reduced. Encoding the same video multiple times, even at different qualities, implies a certain degree of redundancy. This redundancy can be used to reduce the complexity of encoding multiple representations. In 22, in the context of offering a video sequence at different fixed bitrates, [3] proposed to perform the motion estimation only for one reference representation but didn t assess the encoding time gains resulting from this method. Later, [7] reuses the same idea and performs motion compensation only once for a reference stream in a VP8 encoder. The resulting RD performance is, however, severely degraded compared to the reference encoder. The contribution of this paper is twofold: first, we analyze the specificities of the block structure of HEVC in the case of multiple encodings at different SNR qualities, that is, when the video signal is more or less quantized. Second, based on the findings, we propose a method to reuse the block structure from a high quality reference representation to speed up the dependent encoding of lower quality representations. We implement and assess the proposed method based on the ref-

Fig. 2. First frame of BasketballPass encoded at 22 with resulting block structure. Fig. 3. First frame of BasketballPass encoded at 26 with resulting block structure. erence HEVC software and show that we can significantly reduce the encoding time for multiple HEVC representations without notably degrading the RD performance. Reducing the complexity either allows the encoding of more representations for a given computing power or allows devices with limited computing resources, such as mobile devices, to provide DASH content, opening the way to mobile uplink DASH communication. The rest of the paper is organized as follows. Section 2 describes the HEVC block structure behavior across multiple representations at different qualities. The proposed multi-rate HEVC encoder is presented in Section 3. Section 4 shows the encoding results compared with the original HEVC encoder and Section 5 concludes the paper. area (%) 1 9 7 5 2 1 lower depth same depth greater depth 22 24 26 28 32 34 36 38 2. BLOCK STRUCTURE AT MULTIPLE RATES 2.1. HEVC block structure The block structure based on a quadtree representation is considered to be one of the most significant changes in HEVC compared to prior video coding standards [8]. A video slice is first divided in multiple coding tree units (CTU) which comprise a quadtree structure whose leafs are called coding units (CU). The decision of the prediction mode (e.g., inter/intra) is made at the CU level. The size of a CU is given by its depth in the quadtree with depth corresponding to the biggest block size and a greater depth corresponding to a smaller block size. Determining the block structure is a major part of the ratedistortion-optimization (RDO), which is the most time consuming part of HEVC [9]. The RDO could be shortened if information about the block structure were available beforehand. 2.2. Similarities across encodings In the case of multi-rate HEVC video encoding, we observe that there are strong similarities in the block structure across Fig. 4. Percentage of the area of the BasketballPass sequence with block depth greater, identical or lower than the reference encoding at 22. multiple encodings of a single video sequence at different SNR qualities. As an example, Figure 2 shows the first frame of the BasketballPass sequence encoded with the HEVC reference software, HM version 16.2 [1] at quantization parameter () 22 with the resulting quadtree block structure. Figure 3 shows the same frame encoded at 26 with the resulting block structure. Similarities in the block sizes can be observed between the two figures. In order to quantify the similarity between two encodings with different parameters, we calculate the percentage of the area of the frames where the block size is the same, that is, the block has the same depth. If the block depth is not identical, it can either have a greater depth (i.e. a smaller block size) or a lower depth (i.e. a larger block size). Figure 4 shows the percentage of the area of the frames where the block depth is greater, identical, or lower than the reference depth given by the encoding at 22. For that, one second (5 frames) of the BasketballPass sequence is en-

1 9 reference encoding depth dependent encoding area (%) 7 5 2 1 lower depth same depth greater depth 22 24 26 28 32 34 36 38 Fig. 5. Average percentage of the area of 8 sequences with block depth greater, identical or lower than the reference encoding at 22. coded with s ranging from 22 to. The encoding at 22 shows 1% similarity with the reference encoding 22, as expected. The more the differs from the reference, the more the block structure differs from the one at the reference. Interestingly, we observe that with growing, the video tends to have more blocks with a lower depth, that is, with a larger block size. On the other hand, less blocks will have a greater depth. This behavior can be intuitively explained by the fact that at a greater, the strong quantization leads to less details in the image and thus prediction can be more easily performed at a larger block size. This behavior is also observed in general for other encoded videos. In Figure 5, the percentage of the area with greater, identical or lower depth is presented as a mean over one second of 8 video sequences (BasketballPass, BlowingBubbles, BQmall, Kimono, ParkScene, PartyScene, PeopleOnStreet, RaceHorses) defined in [11]. 3. PROPOSED MULTI-RATE ENCODER Given the fact that there are similarities in the block structure across multiple encodings of a single video at different qualities, we propose to reuse the information of the block structure of a reference encoding to shorten the RDO process of other dependent encodings, and thus reduce the encoding time. The RDO process in HM 16.2 is implemented recursively starting from the largest CU size, that is, depth. Depending on the encoder settings, various predictions (inter/intra) are examined and best candidates in the RD sense are chosen. Then, the block is divided into 4 subblocks (depth + 1) and the RDO is recursively applied to these subblocks, and so on. The block structure combined with the predictions which depth 1 depth 2 depth 3 Fig. 6. Example CTU block structure and quadtree for the reference encoding and quadtree checked during the RDO process for the dependent encoding. gives the smallest RD cost is chosen in order to maximize the RD performance. This recursive block subdivision is equivalent to a depth-first traversing of the quadtree [9]. Knowing that, on one hand, the blocks of an encoding will mostly have a lower or equal depth compared to a high quality reference encoding, and that the amount of blocks having a greater depth than in the reference encoding is relatively small (cf. Figure 5), and on the other hand, the RDO process of the HEVC encoder is implemented recursively, we propose to stop the RDO process of the dependent encodings at the depth given by the reference encoding for each block in the video sequence. This is illustrated in Figure 6, where an example block structure of the reference encoding and the corresponding quadtree is shown on the left and on the right the quadtree that is checked during RDO of a dependent encoding. The depths depicted in dashed lines are not checked during the dependent RDO process, which leads to significant encoding time savings. In fact, the number of blocks to check is multiplied by 4 from one depth to the next greater depth. In the case of blocks that should have a greater depth as in the reference encoding, a suboptimal block size in the RD sense will be chosen. The overall RD loss should be small, however, as this concerns only a relatively small percentage of all blocks (cf. Figure 5). On the other hand, the relatively large number of blocks which have a lower depth in a dependent encoding will still have the optimal block size in the RD sense as the RDO process will pass through these low depths during the dependent encoding. The fact that the highest quality encoding tends to have the most small blocks combined with the recursive RDO process implementation is an argument to chose the highest quality encoding as the reference encoding, if we want to achieve best RD performance. Conceptually, a low quality reference could be used in addition to an RDO process where first the smallest blocks would be processed and then going up the quadtree to larger blocks. Due to lack of space, we do not consider this option in this paper.

4.1. Settings 4. RESULTS To evaluate the proposed method, we compare our implementation with the unmodified HM 16.2 encoder [1]. We consider both the RD performance as well as the encoding time as comparison metrics. The RD-performance difference is measured using the Bjontegaard delta rate (BD-rate) [12] and the Bjontegaard delta PSNR (BD-PSNR) [13]. The encoding time difference ( T) is measured as the difference of the total encoding time for 5 representations including the reference at different qualities (fixed 22, 26,, 34 and 38). All encodings are performed on an Ubuntu server with a Xeon X54 CPU at 3.16 GHz and 32 GB RAM. 8 videos defined in [11] with resolutions from 416 2 to 25 1 pixels and frame rates between 24 fps and fps are encoded. All sequences are encoded with the random access, main profile defined in [11], a CTU size fixed to 64 64 pixels and a maximum CU depth of 4. 4.2. Encoding results Table 1 shows the encoding performance of the proposed encoding method compared with the original HM 16.2 reference. On average, the encoding time for 5 representations is reduced by 27% compared to the original HM encoder. At the same time, the BD-rate increase is only.56% and the BD-PSNR decrease is only.2 db on average, which can be considered acceptable. We observed similar results when applying rate-control instead of a fixed encoding (overall encoding time decreased by 27.34%, BD-rate increased by.49% and BD-PSNR decreased by.2 db). PSNR (db) 42 38 36 34 32 HM 16.2 original HM reuse CTU structure 1 2 5 7 9 bitrate (kb/s) Fig. 7. RD curves of the BasketballPass sequence for the original HM 16.2 encoder and the proposed encoder. encoding time (s) 1 12 1 2 HM 16.2 original HM reuse CTU structure 22 26 34 38 Fig. 8. Encoding time of the BasketballPass sequence for the original HM 16.2 encoder and the proposed encoder. Table 1. Comparison of encoding results Sequence BD-rate BD-PSNR T PeopleStreet (25 1).85% -.4 db -19.2% Kimono (192 1).75% -.2 db -43.1% ParkScene (192 1).43% -.1 db -35.79% BQMall (832 4).57% -.2 db -28.% PartyScene (832 4).26% -.1 db -22.1% BasketballPass (416 2).62% -.3 db -35.12% BlowingBubbles (416 2).43% -.2 db -2.3% RaceHorses (416 2).55% -.3 db -12.67% Average.56% -.2 db -27.5% Figure 7 shows an example of PSNR-bitrate curves for the BasketballPass video encoded both with the original HM 16.2 encoder and the proposed encoder which reuses the CTU structure across representations. As indicated in Table 1, the difference between the curves is negligible. Figure 8 shows the encoding time for each of the 5 representations. The encoding time for the reference representation (here at 22) is approximately the same as for the original HM encoder and can be slightly longer due to the additional output of the block structure information for the other representations. The encoding time for the lower quality dependent representations ( 26,, 34 and 38) is then reduced compared to the original encoder by reusing the block structure information. The encoding time for these four representations is reduced by 46% in the case of the represented BasketballPass sequence. 5. CONCLUSION In this paper, we analyzed the similarities between the block structures of a single video encoded at different qualities with HEVC. Based on the observations, we proposed to reuse the block structure information from a high quality reference encoding in order to shorten the RDO process and reduce the dependent encoding time for lower quality representations. Our experimental results show that the total encoding time of 5 representations can be reduced by approximately % on average without any significant loss in RD performance compared to the reference HEVC encoder.

6. REFERENCES [1] ISO/IEC 29-1, Information technology - Dynamic adaptive streaming over HTTP (DASH) - Part 1: Media Presentation description and segment formats, Tech. Rep., Apr. 212. [2] Philipp Helle, Haricharan Lakshman, Mischa Siekmann, Jan Stegemann, Tobias Hinz, Heiko Schwarz, Detlev Marpe, and Thomas Wiegand, A Scalable Video Coding Extension of HEVC, in Data Compression Conference (DCC), Snowbird, UT, USA, Mar. 213. [11] Frank Bossen, Common test conditions and software reference configurations, Tech. Rep. I11, JCT-VC, 212. [12] Gisle Bjontegaard, Improvements of the BD-PSNR model, ITU-T SG16 Q, vol. 6, pp. 35, 28. [13] Gisle Bjontegaard, Calculation of average PSNR differences between RD-curves, Doc. VCEG-M33 ITU-T Q6/16, Austin, TX, USA, Apr. 21. [3] André Zaccarin and Boon-Lock Yeo, Multi-rate encoding of a video sequence in the DCT domain, in IEEE International Symposium on Circuits and Systems, IS- CAS 22, May 22. [4] Gary J Sullivan, Jens Ohm, Woo-Jin Han, and Thomas Wiegand, Overview of the high efficiency video coding (HEVC) standard, IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649 1668, Dec. 212. [5] Ozgur Oyman and Sarabjot Singh, Quality of experience for HTTP adaptive streaming services, IEEE Communications Magazine, vol. 5, no. 4, pp. 2 27, Apr. 212. [6] Ali El Essaili, Damien Schroeder, Dirk Staehle, Mohammed Shehada, Wolfgang Kellerer, and Eckehard Steinbach, Quality-of-experience driven adaptive HTTP media delivery, in IEEE International Conference on Communications (ICC), Budapest, Hungary, June 213. [7] Dag Haavi Finstad, Håkon Kvale Stensland, Håvard Espeland, and Pål Halvorsen, Improved multi-rate video encoding, in IEEE International Symposium on Multimedia (ISM), Dana Point, CA, USA, Dec. 211. [8] Il-Koo Kim, Junghye Min, Tammy Lee, Woo-Jin Han, and JeongHoon Park, Block partitioning structure in the HEVC standard, IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1697 176, Dec. 212. [9] Frank Bossen, Benjamin Bross, Karsten Suhring, and David Flynn, HEVC complexity and implementation analysis, IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1685 1696, Dec. 212. [1] HEVC reference software HM 16.2, https://hevc.hhi.fraunhofer.de/svn/svn hevcsoftware /tags/hm-16.2/, 215.