Effective Quadtree Plus Binary Tree Block Partition Decision for Future Video Coding


2017 Data Compression Conference

Zhao Wang*, Shiqi Wang+, Jian Zhang*, Shanshe Wang*, Siwei Ma*
* Institute of Digital Media, Peking University, Beijing, 100871, China
+ Rapid-Rich Object Search (ROSE) Lab, Nanyang Technological University, Singapore
{zhaowang, sqwang, jian.zhang, sswang, swma}@pku.edu.cn

Abstract: The block partition structure is a crucial module of a video coding scheme. Recently, a quadtree plus binary tree (QTBT) block partition structure was proposed during the Joint Video Exploration Team (JVET) development. Compared to the quadtree structure in HEVC, QTBT achieves better coding performance at the cost of greatly increased encoding complexity. Here, we propose an effective QTBT partition decision algorithm that achieves a good trade-off between computational complexity and coding performance. In particular, at the Coding Tree Unit level, the QTBT partition parameters are dynamically derived to adapt to the local characteristics without transmitting any overhead. Subsequently, at the Coding Unit level, a joint-classifier decision tree structure is designed to eliminate unnecessary iterations while controlling the risk of false prediction. Experimental results show that the proposed algorithm achieves a 64% encoding time reduction on average with only a 1.26% increase in bit rate. This greatly benefits practical implementations of QTBT in real application scenarios.

1. Introduction

The block-based coding structure is the core of state-of-the-art video coding standards because of its capability to achieve high compression performance. In HEVC, a Coding Tree Unit (CTU) represents an equal-sized image area, e.g., 64x64, which is divided into Coding Units (CUs) by quadtree splitting.
Each CU can be further split into Prediction Units (PUs) and Transform Units (TUs), which serve as the basic units of prediction and transformation, respectively. Moreover, the CU is the basic granularity at which intra- or inter-prediction is specified. A CU encoded in inter mode is further divided into one or more PUs according to the selected splitting mode, while the TU also follows quadtree splitting. After all iterations have been evaluated, the splitting mode that minimizes the rate-distortion cost is selected. Because luma and chroma components are jointly considered in the rate-distortion optimization (RDO) process, they usually share the same block partitions [1].

Though the block structure in HEVC has largely improved coding performance compared with previous video coding standards, potential improvements based on the following observations have also been recognized [2]:
- The CU, which serves as the root node of PU and TU splitting, is limited to a square shape. More flexible shapes may improve the coding performance.
- The fixed types of PU splitting may restrict the prediction ability in video coding.
- In most cases, luma and chroma components share the same splitting. If they are handled separately, the partitions can better adapt to the properties of each component.

2375-0359/17 $31.00 © 2017 IEEE. DOI 10.1109/DCC.2017.70

Figure 1: Illustration of the QTBT block partition structure (quadtree splitting and binary tree splitting, with MinQTSize and MaxBTSize marked).

Based on the above observations, a quadtree plus binary tree (QTBT) block partition structure was proposed during the recent Joint Video Exploration Team (JVET) development [2]. One example of the QTBT structure is illustrated in Fig. 1. For a CTU, QTBT first conducts quadtree partitioning to obtain smaller square blocks, similar to the quadtree partitioning in HEVC. After the quadtree partition, binary tree partitioning can be performed starting from each quadtree leaf node, which leads to smaller blocks with various rectangular shapes. The binary tree leaf node is termed the CU, and also serves as the basic unit of prediction and transformation. Though the binary tree partition only includes two types, horizontal splitting and vertical splitting, its recursive nature gives it higher flexibility than the fixed splitting modes in HEVC.

To control the complexity of QTBT, three parameters restrict the depth of the quadtree and the binary tree. As shown in Fig. 1, MinQTSize represents the minimal allowed quadtree leaf node size, which implies that the quadtree partition must be terminated once the block size reaches MinQTSize. Similarly, MaxBTSize restricts the maximal allowed binary tree root node size and MaxBTDepth limits the maximal allowed binary tree depth. Moreover, a luma-chroma-separated block partition structure is adopted for I slices, while the partitions for luma and chroma are still shared for P and B slices. Though the binary tree partition provides higher compression performance than the partitioning in HEVC, the encoding complexity also increases dramatically because of its recursive nature. For this reason, a low-complexity QTBT decision algorithm is desirable.
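The role of the three parameters can be made concrete with a small sketch that lists which further splits are still permitted for a block. This is an illustration only, not the JEM encoder logic: the function name, the `is_qt_node` flag, the `min_bt_size` lower bound and the convention that a horizontal split halves the height are all assumptions introduced here.

```python
from typing import List


def allowed_splits(width: int, height: int, bt_depth: int, is_qt_node: bool,
                   min_qt_size: int, max_bt_size: int, max_bt_depth: int,
                   min_bt_size: int = 4) -> List[str]:
    """Return the further splits a block may try under the QTBT constraints.

    is_qt_node is True while the block is still a quadtree node, i.e. no
    binary split has been applied on the path from the CTU to this block.
    min_bt_size is a hypothetical lower bound on a block side (assumption).
    """
    splits = []
    # Quadtree splitting stops once the node has reached MinQTSize.
    if is_qt_node and width == height and width > min_qt_size:
        splits.append("QT")
    # A binary tree may only be rooted at a node no larger than MaxBTSize,
    # and may not grow deeper than MaxBTDepth.
    if max(width, height) <= max_bt_size and bt_depth < max_bt_depth:
        if height > min_bt_size:
            splits.append("HBT")  # assumed convention: halves the height
        if width > min_bt_size:
            splits.append("VBT")  # assumed convention: halves the width
    return splits


if __name__ == "__main__":
    # A 128x128 quadtree node with MaxBTSize = 64 can only be quad-split.
    print(allowed_splits(128, 128, 0, True, 16, 64, 4))
    # A 16x16 quadtree leaf (MinQTSize = 16) can only start a binary tree.
    print(allowed_splits(16, 16, 0, True, 16, 64, 4))
```

Keeping the three limits in one predicate makes it easy to see that raising MinQTSize or lowering MaxBTSize and MaxBTDepth removes candidates before any rate-distortion check is run.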
Early termination is an effective way to avoid unnecessary iterations, and can be triggered by detecting the Coded Block Flag (CBF) [3], Skip mode [4], etc. Spatio-temporal correlations have also been utilized to speed up the encoding decisions, because similar partitions often appear in adjacent blocks [5-6]. In [7], an optimized scheme was proposed in which only part of the inter splitting modes are evaluated, according to the intermediate encoding parameters. For intra coding, Zhang et al. analyzed the relationship between a block's texture characteristics and its best coding mode [8]. In [9], Shen et al. proposed a fast CU decision algorithm based on Bayesian theory, where the motion vectors and rate-distortion costs were employed as features. In [10], a three-output joint classifier was designed to control the risk of false prediction. These methods were developed for the H.264 or HEVC codecs and achieve significant time savings. However, they cannot be used directly in the QTBT scheme because of the new design philosophy of the block partition structure and the elimination of inter splitting modes.

Motivated by this, this paper proposes an effective partition decision algorithm for the QTBT structure. On the one hand, a dynamic partition parameters derivation method is proposed to adapt to the local content properties at the CTU level. On the other hand, at the CU level, a joint-classifier decision tree structure is proposed to achieve a good trade-off between computational complexity and coding performance. The remainder of this paper is organized as follows. The proposed QTBT partition decision algorithm is presented in Section 2. Simulation results and comparisons are shown in Section 3, and Section 4 concludes the paper.

2. The Proposed QTBT Partition Decision Scheme

In this section, the splitting mode decision scheme for QTBT is detailed. Two complementary algorithms are proposed to achieve low-complexity QTBT: 1) dynamic partition parameters derivation (DPPD), in which the partition parameters are adaptively derived for each CTU at both the encoder and decoder sides to adapt to the local content properties; 2) the joint-classifier decision tree (JCDT), in which a four-output decision tree consisting of multiple classifiers is designed to further reduce the computational complexity while simultaneously controlling the risk of false prediction.

2.1. Dynamic Partition Parameters Derivation

In QTBT, the partition parameters, i.e., MinQTSize, MaxBTSize and MaxBTDepth, play a crucial role in balancing coding performance and encoding complexity. In particular, a smaller MinQTSize and a larger MaxBTSize and MaxBTDepth mean that the block size can vary over a larger range and that more splitting iterations are performed, which increases the computational complexity. Regarding RD performance, more variable block sizes can improve the prediction efficiency, but the side information needed to signal the splitting flags increases at the same time. For example, for a 32x32 quadtree node, a flag must be transmitted to signal whether the node is further divided by quadtree splitting when MinQTSize is smaller than 32. In contrast, these bits are saved if MinQTSize equals 32.
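How strongly the three parameters drive the number of splitting iterations can be illustrated by counting the blocks that a naive exhaustive search would evaluate for one CTU. The sketch below is a rough model under stated assumptions: every path through the partition tree is counted separately (no reuse of results for a block reached by different split orders), a hypothetical `min_bt_size` of 4 bounds the block sides, and the function name is invented for this example. It is not the search implemented in the reference software.

```python
from functools import lru_cache


def count_checked_blocks(ctu_size: int, min_qt_size: int, max_bt_size: int,
                         max_bt_depth: int, min_bt_size: int = 4) -> int:
    """Count block evaluations of a naive exhaustive QTBT search for one CTU."""

    @lru_cache(maxsize=None)
    def bt(width: int, height: int, depth: int) -> int:
        total = 1  # the block itself is evaluated once
        if depth < max_bt_depth:
            if height > min_bt_size:      # horizontal split: two half-height blocks
                total += 2 * bt(width, height // 2, depth + 1)
            if width > min_bt_size:       # vertical split: two half-width blocks
                total += 2 * bt(width // 2, height, depth + 1)
        return total

    @lru_cache(maxsize=None)
    def qt(size: int) -> int:
        # A quadtree node may root a binary tree only if it fits MaxBTSize;
        # otherwise only the unsplit block is evaluated at this node.
        total = bt(size, size, 0) if size <= max_bt_size else 1
        if size > min_qt_size:            # quadtree split into four quadrants
            total += 4 * qt(size // 2)
        return total

    return qt(ctu_size)


if __name__ == "__main__":
    for min_qt, max_bt, depth in [(16, 64, 4), (8, 64, 4), (16, 32, 4), (16, 64, 5)]:
        n = count_checked_blocks(64, min_qt, max_bt, depth)
        print(f"MinQTSize={min_qt} MaxBTSize={max_bt} MaxBTDepth={depth}: {n}")
```

In this model the count grows when MinQTSize shrinks or when MaxBTSize or MaxBTDepth grows, which matches the qualitative argument of the paragraph above; the absolute numbers are not comparable to measured encoding times.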
To explore in practice how the partition parameters influence coding performance and computational complexity, simulations with different partition parameters were conducted on the QTBT structure. The results under each configuration, relative to the HEVC reference software HM-13.0, are shown in Table 1.

Table 1: Performance and encoding computational complexity comparisons among different QTBT partition parameters with anchor HM-13.0.

| Configuration | CFG1 | CFG2 | CFG3 | CFG4 | CFG5 | CFG6 | CFG7 | CFG8 |
|---|---|---|---|---|---|---|---|---|
| MinQTSize | 64 | 32 | 16 | 8 | 16 | 16 | 16 | 16 |
| MaxBTSize | 64 | 64 | 64 | 64 | 32 | 16 | 64 | 64 |
| MaxBTDepth | 4 | 4 | 4 | 4 | 4 | 4 | 3 | 5 |
| BD-rate | 19% | -0.2% | -4.6% | -4.5% | -3.7% | -1.8% | -3.9% | -3.4% |
| Enc Time | 78% | 137% | 158% | 181% | 129% | 104% | 127% | 213% |

For CFG1-CFG4, the coding performance improves significantly as MinQTSize decreases from 64 to 16, while a slight performance loss arises when MinQTSize is reduced further to 8. The reason is that a smaller MinQTSize improves the partition flexibility at the cost of more bits required to signal the splitting types. Comparisons among CFG3, CFG5 and CFG6 reveal that a smaller MaxBTSize decreases the coding performance, because binary tree splitting can then only be employed for small blocks, while it also reduces the computational complexity. Moreover, comparing CFG3, CFG7 and CFG8 shows that raising MaxBTDepth from 4 to 5 not only increases the computational complexity dramatically (from 158% to 213%) but also loses coding performance, whereas lowering it to 3 saves time at the cost of some coding performance. All these statistics show that proper partition parameters should be selected to achieve a good trade-off between coding performance and computational complexity.

To reduce unnecessary partition iterations while maintaining partition flexibility, a dynamic partition parameters derivation algorithm is proposed. In this method, the partition parameters are derived for each CTU on both the encoder and decoder sides to adapt to the local characteristics, by utilizing the splitting information of neighboring blocks. Fig. 2 illustrates the block partitions of two successive frames coded with the QTBT structure, from which one can see that the optimal partition parameters should be locally adaptive. In particular, for areas with little texture, larger blocks are preferred. Therefore, a larger MinQTSize and MaxBTSize and a smaller MaxBTDepth should be used to remove redundant splitting iterations, which also saves unnecessary bits for signaling the splitting types. In contrast, for areas rich in texture, the block partition is fine-grained and binary tree splitting is used frequently. Hence, a smaller MinQTSize and a larger MaxBTSize and MaxBTDepth are needed to allow elaborate partitions. In this manner, flexible block partitions are maintained for areas with complex texture while the splitting iterations are reduced for homogeneous areas.

Fig. 2 also shows significant correlations in block splitting modes among spatio-temporally neighboring blocks. Therefore, the partition information of the current CTU can be predicted from the adjacent CTUs. Let $QT_{pre}$ and $BT_{pre}$ denote the predicted average sizes of quadtree leaf nodes (QTLNs) and binary tree leaf nodes (BTLNs), respectively. They are predicted as follows:

$$QT_{pre} = \frac{\sum_{i=1}^{N} \alpha_i \, QT_{ave\_i}}{\sum_{i=1}^{N} \alpha_i} \tag{1}$$

$$BT_{pre} = \frac{\sum_{i=1}^{N} \alpha_i \, BT_{ave\_i}}{\sum_{i=1}^{N} \alpha_i} \tag{2}$$

where $QT_{ave\_i}$ and $BT_{ave\_i}$ represent the average sizes of QTLNs and BTLNs in the $i$-th of the $N$ neighboring CTUs, respectively, and $\alpha_i$ refers to the corresponding weight.
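Equations (1) and (2) apply the same normalized weighted average to two different statistics, so one helper covers both. The sketch below is a minimal illustration: the function name is invented here, the default weights are the values {0.6, 0.2, 0.2} that the paper assigns to the co-located, top and left CTUs, and the handling of unavailable neighbors (dropping them and renormalizing) is an assumption, since the paper does not describe that case.

```python
from typing import Optional, Sequence


def predict_avg_size(neighbor_avgs: Sequence[Optional[float]],
                     weights: Sequence[float] = (0.6, 0.2, 0.2)) -> Optional[float]:
    """Weighted average of neighbor statistics, as in Eq. (1) and Eq. (2).

    neighbor_avgs holds the average leaf-node size of the (co-located, top,
    left) CTUs. An entry of None marks an unavailable neighbor; skipping it
    is an assumption of this sketch, not a rule stated in the paper.
    """
    pairs = [(w, s) for w, s in zip(weights, neighbor_avgs) if s is not None]
    if not pairs:
        return None  # no neighbor available: the caller must fall back to defaults
    weight_sum = sum(w for w, _ in pairs)
    return sum(w * s for w, s in pairs) / weight_sum


if __name__ == "__main__":
    qt_pre = predict_avg_size([32.0, 16.0, 16.0])   # QTLN averages of the neighbors
    bt_pre = predict_avg_size([16.0, 8.0, 8.0])     # BTLN averages of the neighbors
    print(qt_pre, bt_pre)
```

Because the sum is divided by the total weight, the prediction stays on the same scale as the inputs even when the weights do not add up to one.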
The weights {0.6, 0.2, 0.2} are assigned to the temporally co-located CTU, the top neighboring CTU and the left neighboring CTU, respectively. Based on the predicted block sizes of the current CTU, proper partition parameters can be derived. In particular, both MinQTSize and MaxBTSize should be determined by the block sizes of the QTLNs, because they limit the minimal and the maximal size of a QTLN, respectively. For example, if the sizes of all QTLNs are between 16x16 and 32x32, MinQTSize and MaxBTSize can be set to 16 and 32, respectively, which removes redundant splitting iterations and avoids unnecessary splitting flags. MaxBTDepth determines how small a block can become relative to the corresponding quadtree leaf node. Hence, MaxBTDepth can be set according to the size ratio between the QTLNs and the BTLNs. Thus, we define the following formulas to predict the partition parameters of the current CTU:

$$MinQTSize = 2^{\log_2 QT_{pre}}, \qquad MaxBTSize = 2^{\log_2 QT_{pre} + \beta}, \qquad MaxBTDepth = \frac{QT_{pre}}{BT_{pre}} + \beta \tag{3}$$

where $\beta$ serves as an offset equal to 0 or 1 according to whether [...] is larger than 30.

Figure 2: Block partitions of two successive frames coded by the QTBT structure. (a) 3rd frame (BasketballPass, LDP, QP 27). (b) 4th frame (BasketballPass, LDP, QP 27).

2.2. Joint-Classifier Decision Tree Structure

With the DPPD scheme, the partition parameters are obtained dynamically at both the encoder and the decoder. Flexible partitions can be adopted for texture-rich areas while complex partition iterations are reduced for homogeneous areas. However, DPPD is dedicated to CTU-level partitions, and splitting redundancy still exists at each CU depth. To further speed up the encoding process, a novel joint-classifier decision tree structure is proposed to reduce the redundant splitting iterations.

The splitting mode decision in QTBT is a recursive process, as illustrated in Fig. 3-(a). Here, Bn indicates the procedure of checking the CU at depth level n. Then, further quadtree partition (QT), horizontal binary tree partition (HBT) and vertical binary tree partition (VBT) are checked recursively, and the optimal one or their combination is finally selected. Though this strategy achieves the best coding performance, it is very time consuming because all the CU sizes (including asymmetric shapes) are checked.

A decision tree is an effective strategy to speed up the CU partition decision process. In [11], Shen et al. proposed a decision tree structure in which Bn is checked first and a classifier is then applied to predict whether the CU coding process should be terminated early (ET) or not. Embedding this decision tree structure into QTBT yields the structure illustrated in Fig. 3-(b). In this structure, Bn is checked no matter which splitting mode is optimal.
Checking Bn in this way is unnecessary when further splitting is the best choice, which leads to needless computational cost. Moreover, time is saved only when the classifier predicts Y, which may make this decision tree inefficient for QTBT because the three recursive iterations still need to be performed otherwise.

Another decision tree structure is shown in Fig. 3-(c) [10], where only one splitting mode is selected at each CU depth. When this decision structure is integrated into QTBT, the classifier can be viewed as performing multi-class classification. For the output Y, the coding process is terminated early and time is saved. For the output N, the most probable further splitting mode (among QT, HBT and VBT) is selected, and the other two modes are also terminated. This decision tree achieves the largest time savings, but its drawback is that the prediction accuracy may strongly influence the RD performance, because the best mode has to be selected from four choices. Unfortunately, owing to the many uncertainties in video coding, such as varying video content, encoding environments and limited available information, it is difficult to guarantee that the prediction accuracy stays above a safe level, e.g., 95%, so the RD performance may degrade.

Figure 3: Illustration of various splitting decision structures. (a) Original QTBT structure conducting all iterations. (b) Decision tree with early termination. (c) Decision tree with multi-classification. (d) Proposed joint-classifier decision tree.

To introduce more flexibility into the decision structure while maintaining the RD performance, we present a more advanced decision tree structure that combines the advantages of Structure (b) and Structure (c). As shown in Fig. 3-(d), two classifiers are used in this structure for joint decision making. Classifier A decides whether to conduct the quadtree partition, and Classifier B decides whether to conduct the binary tree partition. Combining Classifier A and Classifier B produces four decisions in total:
1) Both Classifier A and Classifier B output Y. Only the current block Bn is coded and the three further partitions are terminated. In this case, time is saved and the prediction accuracy can also be guaranteed.
2) Classifier A outputs N and Classifier B outputs Y. The block itself is coded and further quadtree partition is also conducted, while binary tree partition is terminated and time is saved.
3) Classifier A outputs Y and Classifier B outputs N. Similar to case 2), the block itself and the binary tree mode with the higher probability are processed, while the quadtree partition is terminated.
4) Both Classifier A and Classifier B output N. This means that the current block should not be terminated early, and hence both quadtree partition and binary tree partition are conducted. Time is still saved because only one binary tree mode is conducted.

In summary, the proposed joint-classifier decision tree is an improved integration of Structure (b) and Structure (c). On the one hand, the early termination decision is rigorous because it requires both classifiers to output Y. On the other hand, time is always saved, because no more than two of the three splitting modes are conducted. This property makes the proposed decision structure more flexible and capable of controlling the risk of false prediction. Hence, a better trade-off between computational complexity and coding performance can be achieved.
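The four joint decisions can be written as a small lookup from the two classifier outputs to the set of candidates that still go through rate-distortion checking. This is a sketch of the decision logic only: the function and argument names are hypothetical, the classifiers themselves are not modelled, and including Bn in case 4) is an assumption, because the text does not state explicitly whether the unsplit block is still coded there.

```python
from typing import List


def jcdt_candidates(a_output_y: bool, b_output_y: bool,
                    likelier_bt_mode: str = "HBT") -> List[str]:
    """Map the outputs of Classifier A and Classifier B to candidates to check.

    a_output_y: Classifier A outputs Y, i.e. quadtree partition is terminated.
    b_output_y: Classifier B outputs Y, i.e. binary tree partition is terminated.
    likelier_bt_mode: the more probable binary mode, "HBT" or "VBT"
                      (how it is chosen is outside this sketch).
    """
    if likelier_bt_mode not in ("HBT", "VBT"):
        raise ValueError("likelier_bt_mode must be 'HBT' or 'VBT'")
    candidates = ["Bn"]  # assumption: the unsplit block is coded in all four cases
    if not a_output_y:
        candidates.append("QT")               # cases 2) and 4)
    if not b_output_y:
        candidates.append(likelier_bt_mode)   # cases 3) and 4): one binary mode only
    return candidates


if __name__ == "__main__":
    for a in (True, False):
        for b in (True, False):
            print("A=%s B=%s ->" % ("Y" if a else "N", "Y" if b else "N"),
                  jcdt_candidates(a, b, "VBT"))
```

Written this way, it is visible that the less probable binary mode is never evaluated, which is where the guaranteed time saving comes from even when both classifiers output N.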

Figure 4: Information gain for Classifier A and Classifier B (attributes: Size, Lambda, Texture, T_h/T_v, Neigh_Depth_QT, Neigh_Depth_BT, Neigh_Partition).

To establish the classifiers, several methods are available for selecting relevant features, such as support vector machine (SVM), information gain attribute evaluation (IGAE), etc. Here, we adopt the IGAE method, which employs the information gain [12] as an indicator for data classification. In particular, it calculates the difference in entropy, i.e., in the number of bits each data item needs to convey its class identity, before and after the data set is classified. Fig. 4 lists the selected features for Classifier A and Classifier B. The same features are used for both classifiers, although their information gain values differ. The features are detailed as follows:
- Size: the block size has a clearly high information gain because larger blocks tend to be split and vice versa.
- Lagrange multiplier: it is employed as an important feature since it controls the trade-off between rate and distortion. In particular, a larger CU size is more likely to be selected when the Lagrange multiplier becomes large.
- Texture: the influence of content texture on the splitting decision has been detailed above; smooth areas often consist of larger blocks and vice versa.
- T_h/T_v: the ratio between the horizontal texture and the vertical texture. T_h/T_v has a higher information gain for Classifier B, which reveals that binary tree splitting is more closely related to the texture direction.
- Neigh_Depth_QT, Neigh_Depth_BT: the average depth of quadtree partition and binary tree partition among neighboring blocks, respectively. The high information gain of these features also reflects the spatio-temporal correlations.
- Neigh_Partition: the splitting types of the adjacent blocks.
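The information gain used for feature ranking is the entropy of the class labels minus the entropy that remains after grouping the samples by a feature. The following sketch computes it for a discrete feature; the toy samples are made up for illustration and are not data from the paper, and continuous features such as the Lagrange multiplier would first need to be binned, which is not shown.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """Entropy of the labels minus their entropy conditioned on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


if __name__ == "__main__":
    # Made-up samples: block size and whether the block was split further.
    sizes = [64, 64, 32, 32, 8, 8, 8, 8]
    split = ["Y", "Y", "Y", "N", "N", "N", "N", "N"]
    print("gain of Size:", information_gain(sizes, split))
```

A feature whose values separate the classes perfectly reaches the full label entropy, while a constant feature yields zero gain, which is the scale on which the attributes in Fig. 4 are compared.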

The Neigh_Partition feature also has a higher information gain for Classifier B than for Classifier A. The main reason is that the types of binary tree splitting among neighboring blocks are often identical, especially for the temporally co-located block.

3. Experimental Results

To evaluate the performance of the proposed scheme, we integrate it into QTBT on the reference software HM-13.0-QTBT released by JVET [13]. The testing configurations and encoding parameters of the anchor and the proposed scheme are shown in Table 2; they conform to the common test conditions [14]. Six sequences, namely Traffic, BasketballDrill, BasketballPass, BasketballDrive, Johnny and ParkScene, are used for training, and the others are used for testing. Table 2 also lists the default parameters of QTBT, where MinQTSize_C and MaxBTDepth_C represent the corresponding parameters for the chroma components.

Table 2: Testing configurations and encoding parameters.

| Parameter | Value |
|---|---|
| FrameNumber | 5*FrameRate |
| CTU Size | 128x128 |
| QP | 22, 27, 32, 37 |
| Resolution | 1080P, 720P, WVGA, WQVGA |
| MinQTSize_I | 16 |
| MinQTSize_PB | 16 |
| MaxBTSize_I | 32 |
| MaxBTSize_PB | 128 |
| MaxBTDepth_I | 4 |
| MaxBTDepth_PB | 4 |
| MinQTSize_C | 4 |
| MaxBTDepth_C | 0 |

Table 3: Performance of the proposed DPPD and JCDT algorithms (AI: All Intra, RA: Random Access).

| Sequence | DPPD AI BD-Rate | DPPD AI ET | DPPD RA BD-Rate | DPPD RA ET | JCDT AI BD-Rate | JCDT AI ET | JCDT RA BD-Rate | JCDT RA ET |
|---|---|---|---|---|---|---|---|---|
| PeopleOnStreet | +0.26% | -23.1% | -0.06% | -7.2% | +1.07% | -57.3% | +1.03% | -67.7% |
| Kimono | +0.31% | -43.2% | -0.08% | -4.0% | +0.91% | -53.2% | +1.33% | -61.2% |
| Cactus | +0.52% | -26.7% | -0.03% | -4.6% | +1.13% | -61.9% | +1.21% | -64.6% |
| BQTerrace | +0.21% | -16.5% | -0.29% | -2.7% | +1.09% | -72.1% | +1.62% | -58.8% |
| BQMall | +0.31% | -17.3% | -0.08% | -5.1% | +1.08% | -64.7% | +1.08% | -61.9% |
| PartyScene | -0.02% | -3.8% | +0.24% | -1.8% | +0.86% | -59.3% | +0.91% | -60.3% |
| RaceHorsesC | +0.21% | -16.5% | -0.21% | +6.7% | +1.11% | -60.4% | +1.21% | -64.7% |
| BasketballPass | +0.47% | -21.6% | -0.22% | +4.3% | +1.03% | -57.7% | +1.32% | -69.3% |
| BQSquare | +0.01% | -3.6% | -0.04% | -0.3% | +1.33% | -62.6% | +1.54% | -55.1% |
| RaceHorses | -0.01% | -8.2% | -0.20% | +2.7% | +1.18% | -62.8% | +1.19% | -68.4% |
| FourPeople | +0.72% | -28.8% | -0.28% | -5.8% | +1.08% | -47.2% | +1.32% | -61.5% |
| KristenAndSara | +0.69% | -31.6% | +0.18% | -8.9% | +0.99% | -50.2% | +1.27% | -59.9% |
| Overall | +0.31% | -20.1% | -0.11% | -2.2% | +1.07% | -59.1% | +1.25% | -62.8% |

The results of evaluating the proposed algorithms individually under the all intra (AI) and random access (RA) configurations are shown in Table 3. The coding performance is measured by BD-Rate, where positive and negative values represent coding loss and coding gain, respectively. ET represents the change in encoding time, with negative values indicating a reduction. From Table 3, DPPD reduces the encoding time by 20.1% on average with a negligible loss of coding efficiency (0.31% BD-Rate increase) under the AI configuration.
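As a quick arithmetic check, the "Overall" figures quoted for DPPD under the AI configuration can be reproduced by averaging the per-sequence entries of Table 3. The sketch assumes that "Overall" is the plain arithmetic mean over the twelve test sequences, which the paper does not state explicitly; the variable names are introduced here.

```python
# Per-sequence (BD-Rate %, ET %) for DPPD under All Intra, copied from Table 3.
dppd_ai = {
    "PeopleOnStreet": (0.26, -23.1),
    "Kimono": (0.31, -43.2),
    "Cactus": (0.52, -26.7),
    "BQTerrace": (0.21, -16.5),
    "BQMall": (0.31, -17.3),
    "PartyScene": (-0.02, -3.8),
    "RaceHorsesC": (0.21, -16.5),
    "BasketballPass": (0.47, -21.6),
    "BQSquare": (0.01, -3.6),
    "RaceHorses": (-0.01, -8.2),
    "FourPeople": (0.72, -28.8),
    "KristenAndSara": (0.69, -31.6),
}

# Assumption: "Overall" is the unweighted mean over all test sequences.
mean_bd_rate = sum(bd for bd, _ in dppd_ai.values()) / len(dppd_ai)
mean_et = sum(et for _, et in dppd_ai.values()) / len(dppd_ai)

if __name__ == "__main__":
    print(f"mean BD-Rate: {mean_bd_rate:+.2f}%")
    print(f"mean ET: {mean_et:+.1f}%")
```

Under that assumption the means agree with the tabulated +0.31% and -20.1% after rounding.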
Under the RA configuration, a 0.11% coding gain is achieved together with a 2.2% encoding time saving. The coding gain mainly comes from the bits saved on splitting flags when the partition parameters selected for a CTU are adequate for its splitting. With the proposed JCDT method, encoding time reductions of around 59% and 63% are achieved for the two configurations, and the maximum time reduction reaches 72% (BQTerrace under AI in Table 3). The time reduction is remarkable for sequences with high activity, e.g., RaceHorses, BasketballPass and PeopleOnStreet, and remains evident for low-activity sequences such as BQSquare and KristenAndSara. On the other hand, the RD performance degradation is negligible for all the test sequences. This indicates that the proposed JCDT method can efficiently skip unnecessary iterations in the partition decision.

In the following, we analyze the experimental results of the proposed scheme with DPPD and JCDT combined. In this case, the computational complexity reduction varies from 51.3% to 77.8%, with 64.7% on average, as shown in Table 4.

Table 4: Results of the proposed scheme combining DPPD and JCDT (AI: All Intra, RA: Random Access, LDB: Lowdelay-B), compared with related work [15].

| Sequence | Proposed AI BD-Rate | Proposed AI ET | Proposed RA BD-Rate | Proposed RA ET | Proposed LDB BD-Rate | Proposed LDB ET | [15] RA BD-Rate | [15] RA ET |
|---|---|---|---|---|---|---|---|---|
| PeopleOnStreet | +1.33% | -69.2% | +0.99% | -69.5% | +1.02% | -71.2% | +0.62% | -20.8% |
| Kimono | +1.25% | -72.1% | +1.27% | -63.0% | +1.27% | -62.8% | +0.47% | -16.8% |
| Cactus | +1.58% | -70.9% | +1.29% | -66.9% | +1.34% | -69.0% | +0.43% | -18.4% |
| BQTerrace | +1.17% | -77.8% | +1.45% | -60.2% | +1.23% | -58.8% | +0.40% | -17.6% |
| BQMall | +1.31% | -69.8% | +1.06% | -63.8% | +1.03% | -64.2% | +0.46% | -19.4% |
| PartyScene | +0.87% | -60.3% | +1.14% | -60.7% | +1.12% | -62.3% | +0.37% | -18.6% |
| RaceHorsesC | +1.29% | -67.8% | +1.15% | -59.1% | +1.20% | -52.7% | +0.72% | -21.5% |
| BasketballPass | +1.40% | -65.7% | +1.23% | -67.8% | +1.16% | -68.8% | +0.52% | -17.7% |
| BQSquare | +1.35% | -64.2% | +1.53% | -56.2% | +1.28% | -51.3% | +0.13% | -15.4% |
| RaceHorses | +1.18% | -63.8% | +1.12% | -67.8% | +1.18% | -64.6% | +0.55% | -18.2% |
| FourPeople | +1.71% | -61.6% | +1.25% | -64.5% | +1.32% | -61.7% | +0.52% | -17.3% |
| KristenAndSara | +1.64% | -66.2% | +1.38% | -65.7% | +1.27% | -63.5% | +0.56% | -18.7% |
| Overall | +1.34% | -67.6% | +1.24% | -63.8% | +1.20% | -62.6% | +0.49% | -18.4% |

It is also worth noting that the time reduction achieved under the AI configuration is higher than that under the RA configuration. This is because DPPD achieves larger time reductions in the AI configuration. To further verify the proposed scheme, we compare it with the method proposed in the latest JVET meeting [15]. The idea in [15] is to adaptively adjust MaxBTDepth for each frame according to its temporal level; about 18.4% time reduction is achieved with 0.5% coding loss. Compared to [15], the proposed scheme's average BD-Rate increase of 1.26% with an average computational complexity reduction of 64.7% shows superiority in terms of rate-distortion-complexity performance.

4. Conclusion

In this paper, we propose an effective QTBT partition decision algorithm that achieves a good compromise between encoding computational complexity and RD performance.
The novelty of this paper lies in two aspects: 1) we propose a dynamic partition parameter derivation method at the CTU level to adapt to local properties; 2) we design a four-output decision tree consisting of joint classifiers specifically for QTBT, which further removes redundant splitting iterations while controlling the risk of false prediction. Experimental results show that the proposed algorithm significantly reduces the encoding complexity with negligible coding performance degradation.

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China under Grant 61632001, Grant 61571017, and Grant 61322106 and in part by the National Basic Research Program of China (973 Program) under Grant 2015CB351800, which are gratefully acknowledged.

References

[1] I. K. Kim, J. Min, T. Lee, et al. Block partitioning structure in the HEVC standard, IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, pp. 1697-1706, 2012.
[2] J. An, H. Huang, K. Zhang. Quadtree plus binary tree structure integration with JEM tools, JVET-B0023, Joint Video Exploration Team (JVET), Feb. 2016.
[3] R. H. Gweon, Y.-L. Lee, J. Lim. Early termination of CU encoding to reduce HEVC complexity, JCTVC-F045, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC), Jul. 2011.
[4] J. Kim, S. Jeong, S. Cho, J. S. Choi. Adaptive coding unit early termination algorithm for HEVC. In: IEEE International Conference on Consumer Electronics (ICCE), 2012.
[5] L. Shen, Z. Zhang, Z. Liu. Adaptive inter-mode decision for HEVC jointly utilizing inter-level and spatiotemporal correlations, IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, pp. 1709-1722, 2014.
[6] Z. Wang, S. Wang, J. Zhang, S. Ma. Local-constrained quadtree plus binary tree block partition structure for enhanced video coding. In: IEEE Visual Communication and Image Processing Conference (VCIP), 2016.
[7] J. Vanne, M. Viitanen, T. D. Hämäläinen. Efficient mode decision schemes for HEVC inter prediction, IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, pp. 1579-1593, 2014.
[8] M. Zhang, C. Zhao, J. Xu. An adaptive fast intra mode decision in HEVC. In: IEEE International Conference on Image Processing (ICIP), 2012.
[9] X. Shen, L. Yu, J. Chen. Fast coding unit size selection for HEVC based on Bayesian decision rule. In: IEEE Picture Coding Symposium (PCS), 2012.
[10] Y. Zhang, S. Kwong, X. Wang, H. Yuan, Z. Pan, L. Xu. Machine learning-based coding unit depth decisions for flexible complexity allocation in High Efficiency Video Coding, IEEE Transactions on Image Processing, vol. 24, pp. 2225-2238, 2015.
[11] X. Shen, L. Yu. CU splitting early termination based on weighted SVM, EURASIP Journal on Image and Video Processing, vol. 1, pp. 1, 2013.
[12] T. M. Cover, J. A. Thomas. Elements of information theory, John Wiley & Sons, 2012.
[13] JVET software repository. Available online: https://jvet.hhi.fraunhofer.de/svn/svn_HMJEMSoftware/branches/HM-13.0-QTBT/
[14] F. Bossen. Common test conditions and software reference configurations, JCTVC-J1100, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC), Sweden, 2012.
[15] Y. Yamamoto. AHG5: Fast QTBT encoding configuration, JVET-D0095, Joint Video Exploration Team (JVET), Oct. 2016.