Video pre-processing with JND-based Gaussian filtering of superpixels

Similar documents
Low-cost Multi-hypothesis Motion Compensation for Video Coding

Intra-Mode Indexed Nonuniform Quantization Parameter Matrices in AVC/H.264

An Optimized Template Matching Approach to Intra Coding in Video/Image Compression

Compression of Stereo Images using a Huffman-Zip Scheme

ISSN: An Efficient Fully Exploiting Spatial Correlation of Compress Compound Images in Advanced Video Coding

FOR compressed video, due to motion prediction and

2014 Summer School on MPEG/VCEG Video. Video Coding Concept

No-reference perceptual quality metric for H.264/AVC encoded video. Maria Paula Queluz

A Comparison of Still-Image Compression Standards Using Different Image Quality Metrics and Proposed Methods for Improving Lossy Image Quality

A Novel Deblocking Filter Algorithm In H.264 for Real Time Implementation

Image Quality Assessment Techniques: An Overview

Reduction of Blocking artifacts in Compressed Medical Images

Image and Video Quality Assessment Using Neural Network and SVM

A New Psychovisual Threshold Technique in Image Processing Applications

Video Quality Analysis for H.264 Based on Human Visual System

PERCEPTUALLY-FRIENDLY RATE DISTORTION OPTIMIZATION IN HIGH EFFICIENCY VIDEO CODING. Sima Valizadeh, Panos Nasiopoulos and Rabab Ward

An Efficient Mode Selection Algorithm for H.264

CONTENT ADAPTIVE COMPLEXITY REDUCTION SCHEME FOR QUALITY/FIDELITY SCALABLE HEVC

A LOW-LIGHT IMAGE ENHANCEMENT METHOD FOR BOTH DENOISING AND CONTRAST ENLARGING. Lin Li, Ronggang Wang,Wenmin Wang, Wen Gao

Deblocking Filter Algorithm with Low Complexity for H.264 Video Coding

OVERVIEW OF IEEE 1857 VIDEO CODING STANDARD

Context based optimal shape coding

Hybrid Video Compression Using Selective Keyframe Identification and Patch-Based Super-Resolution

Performance Comparison between DWT-based and DCT-based Encoders

Adaptive Quantization for Video Compression in Frequency Domain

Fast Mode Decision for H.264/AVC Using Mode Prediction

Optimized Progressive Coding of Stereo Images Using Discrete Wavelet Transform

International Journal of Emerging Technology and Advanced Engineering Website: (ISSN , Volume 2, Issue 4, April 2012)

A deblocking filter with two separate modes in block-based video coding

Digital Image Stabilization and Its Integration with Video Encoder

Outline Introduction MPEG-2 MPEG-4. Video Compression. Introduction to MPEG. Prof. Pratikgiri Goswami

SJTU 4K Video Subjective Quality Dataset for Content Adaptive Bit Rate Estimation without Encoding

Network-based model for video packet importance considering both compression artifacts and packet losses

Advanced Video Coding: The new H.264 video compression standard

DEEP LEARNING OF COMPRESSED SENSING OPERATORS WITH STRUCTURAL SIMILARITY (SSIM) LOSS

EFFICIENT PU MODE DECISION AND MOTION ESTIMATION FOR H.264/AVC TO HEVC TRANSCODER

Comparative and performance analysis of HEVC and H.264 Intra frame coding and JPEG2000

ABSTRACT 1. INTRODUCTION 2. RELATED WORK

Image Quality Assessment based on Improved Structural SIMilarity

Graph based Image Segmentation using improved SLIC Superpixel algorithm

A NEW METHODOLOGY TO ESTIMATE THE IMPACT OF H.264 ARTEFACTS ON SUBJECTIVE VIDEO QUALITY

Quality versus Intelligibility: Evaluating the Coding Trade-offs for American Sign Language Video

FAST MOTION ESTIMATION DISCARDING LOW-IMPACT FRACTIONAL BLOCKS. Saverio G. Blasi, Ivan Zupancic and Ebroul Izquierdo

ANALYSIS AND COMPARISON OF SYMMETRY BASED LOSSLESS AND PERCEPTUALLY LOSSLESS ALGORITHMS FOR VOLUMETRIC COMPRESSION OF MEDICAL IMAGES

Further Reduced Resolution Depth Coding for Stereoscopic 3D Video

VIDEO streaming applications over the Internet are gaining. Brief Papers

Blind Measurement of Blocking Artifact in Images

Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE Gaurav Hansda

Compression of RADARSAT Data with Block Adaptive Wavelets Abstract: 1. Introduction

Modified SPIHT Image Coder For Wireless Communication

Reducing/eliminating visual artifacts in HEVC by the deblocking filter.

Affine SKIP and MERGE Modes for Video Coding

Image Segmentation Techniques for Object-Based Coding

An adaptive preprocessing algorithm for low bitrate video coding *

A Novel Image Super-resolution Reconstruction Algorithm based on Modified Sparse Representation

Video Compression Method for On-Board Systems of Construction Robots

Just Noticeable Difference for Images with Decomposition Model for Separating Edge and Textured Regions

JPEG IMAGE CODING WITH ADAPTIVE QUANTIZATION

Enhanced Hexagon with Early Termination Algorithm for Motion estimation

Structural Similarity Based Image Quality Assessment Using Full Reference Method

STUDY ON DISTORTION CONSPICUITY IN STEREOSCOPICALLY VIEWED 3D IMAGES

Mode-Dependent Pixel-Based Weighted Intra Prediction for HEVC Scalable Extension

Digital Image Processing

SCREEN CONTENT IMAGE QUALITY ASSESSMENT USING EDGE MODEL

MAXIMIZING BANDWIDTH EFFICIENCY

In the name of Allah. the compassionate, the merciful

SUBJECTIVE QUALITY EVALUATION OF H.264 AND H.265 ENCODED VIDEO SEQUENCES STREAMED OVER THE NETWORK

A Quantized Transform-Domain Motion Estimation Technique for H.264 Secondary SP-frames

JOINT RATE ALLOCATION WITH BOTH LOOK-AHEAD AND FEEDBACK MODEL FOR HIGH EFFICIENCY VIDEO CODING

PROBABILISTIC MEASURE OF COLOUR IMAGE PROCESSING FIDELITY

IBM Research Report. Inter Mode Selection for H.264/AVC Using Time-Efficient Learning-Theoretic Algorithms

Coding of 3D Videos based on Visual Discomfort

Automatic Video Caption Detection and Extraction in the DCT Compressed Domain

Fast HEVC Intra Mode Decision Based on Edge Detection and SATD Costs Classification

Efficient Color Image Quality Assessment Using Gradient Magnitude Similarity Deviation

DIGITAL TELEVISION 1. DIGITAL VIDEO FUNDAMENTALS

Fusion of colour and depth partitions for depth map coding

A LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING

How Many Humans Does it Take to Judge Video Quality?

An Information Hiding Algorithm for HEVC Based on Angle Differences of Intra Prediction Mode

Mesh Based Interpolative Coding (MBIC)

H.264/AVC Video Watermarking Algorithm Against Recoding

Fast Implementation of VC-1 with Modified Motion Estimation and Adaptive Block Transform

Efficient Image Compression of Medical Images Using the Wavelet Transform and Fuzzy c-means Clustering on Regions of Interest.

Robot localization method based on visual features and their geometric relationship

Primal Sketch Based Adaptive Perceptual JND Model for Digital Watermarking

Homogeneous Transcoding of HEVC for bit rate reduction

Video encoders have always been one of the resource

STUDY AND IMPLEMENTATION OF VIDEO COMPRESSION STANDARDS (H.264/AVC, DIRAC)

SSIM Image Quality Metric for Denoised Images

Multiframe Blocking-Artifact Reduction for Transform-Coded Video

Analysis of Image and Video Using Color, Texture and Shape Features for Object Identification

Compression of VQM Features for Low Bit-Rate Video Quality Monitoring

ADAPTIVE INTERPOLATED MOTION COMPENSATED PREDICTION. Wei-Ting Lin, Tejaswi Nanjundaswamy, Kenneth Rose

Face Hallucination Based on Eigentransformation Learning

SINGLE PASS DEPENDENT BIT ALLOCATION FOR SPATIAL SCALABILITY CODING OF H.264/SVC

Context-Adaptive Binary Arithmetic Coding with Precise Probability Estimation and Complexity Scalability for High- Efficiency Video Coding*

Novel Lossy Compression Algorithms with Stacked Autoencoders

Video Compression An Introduction

Non-Linear Masking based Contrast Enhancement via Illumination Estimation

Transcription:

Video pre-processing with JND-based Gaussian filtering of superpixels Lei Ding, Ge Li*, Ronggang Wang, Wenmin Wang School of Electronic and Computer Engineering, Shenzhen Graduate School, Peking University ABSTRACT In this paper, an innovative method of HEVC video pre-processing is proposed. The method applies a simple linear iterative clustering (SLIC), which adapts a k-means clustering to group pixels into perceptually meaningful atomic regions of superpixels. By calculating the average of weighted average of luminance differences around each pixel in the superpixel, a suitable parameter of Gaussian filter for the superpixel is determined. Experimental results show that bit rate can be reduced up to 29% without loss in visual quality. Keywords: HEVC, JND, superpixel, video pre-processing, Gaussian filtering 1. INTRODUCTION The recently developed HEVC [1], high efficiency video coding standard, is becoming more and more popular. Its improved compression performance relative to the existing standard is in the range of 50% bit rate reduction.the human eyes perceive images through the human visual system (HVS), which provides a possibility to get a higher video compression ratio. Extensive research has been conducted to improve the performance of encoder conformable with standards. Video pre-processing [2] can improve the subjective quality of a reconstructed video or reduce the bit rate in the generation of a compressed bit stream. The usual video pre-processing adopted video data spatial filtering, temporal filtering and image sharpening [3]. By applying a visual perception threshold (PTHD) with just noticeable distortion (JND), one can achieve a compression gain up to 10% to 15% by exploiting video data perceptual redundancy [4-6].The Gaussian filter has been widely used for de-noising in image processing. By applying Gaussian filtering, the new value of pixel (x, y) is the weighted average of the pixels around itself. Gaussian filtering makes the image smoother, which means the deviation of the pixels in a coding block will become smaller, and thus will reduce the bit rate in the process of motion estimation, transformation, scaling and quantization. Smooth areas can withstand strong filtering without being noticed, while edge areas or textured areas will be blurred, thus these areas should be filtered slightly or not at all. The superpixel is an area with similar texture, contour, colour, etc. Superpixel algorithms group pixels into perceptually meaningful atomic regions. They capture image redundancy and provide a convenient primitive from which to compute image features. Algorithms for generating superpixels can be broadly categorized as either graph-based or gradient ascent methods. Graph-based approaches to superpixel generation treat each pixel as a node in a graph. Edge weights between two nodes are proportional to the similarity between neighbouring pixels. The superpixels are created by minimizing a cost function defined over the graph. Whereas the gradient-ascent-based methods start from a rough initial clustering of pixels, iteratively refine the cluster until some convergence criterion is met to form superpixels. SLIC [7], a state-of-the-art method for generating superpixels based on K-means clustering, has been shown to outperform existing superpixel methods. Human eyes cannot perceive any changes below the JND threshold of around a pixel due to their underlying spatial/temporal masking properties [8]. The sensitivity of distortion by human eyes can vary significantly in different areas of a frame, upon which the JND model is set. The major factors that contribute to the JND model are spatial contrast sensitivity function, luminance adaption [9-10] etc. These factors reflect the texture, edge and boundary of the frame, and the frame can be filtered according to these factors. Based on the related works above, we present a new approach of incorporating SLIC and JND for video pre-processing in order to reduce the bit rate without loss of visual quality. Subjective and objective evaluation is carried out to verify the effectiveness of the proposed approach. Visual Information Processing and Communication VI, edited by Amir Said, Onur G. Guleryuz, Robert L. Stevenson, Proc. of SPIE-IS&T Electronic Imaging, SPIE Vol. 9410, 941004 2015 SPIE-IS&T CCC code: 0277-786X/15/$18 doi: 10.1117/12.2083818 Proc. of SPIE-IS&T Vol. 9410 941004-1

2. SLIC SUPERPIXEL SEGMENTATION Simple linear iterative clustering (SLIC) is an adaptation of k-means for superpixel generation. It has two major merits. The first is that the number of distances calculated in the optimization can be reduced by refining the search space to an area proportional to the superpixel size, and thus reduces the complexity function to be linear. The second is that the weighted distance measurement combines colour and spatial proximity and provides control over the size and compactness of the superpixels. The above merits make SLIC run faster, and more memory efficient than existing methods. The only parameter need to be determined is K, the desired number of approximately equally sized superpixels. In this paper, the size of superpixels I4Wi4ri is proposed to be equal to the size of a macroblock (16 x 16). The reasoning behind is as follows: If the size is too small, blocking artefacts arise after filtering; otherwise, too many pixels share the same filter, and as a result, major features may be filtered out, and the whole image will become obscure. ra irr,,*. fs l,as, 70 risprosiss+:r s! s-,,a!'!+ar_ii,0.+ae+ iplt0.4.-x0-_ 1,r r±.d,+%m/}.,s** 4,.r,- wrr* r1 *Al +*f`íis+ +*ẹ6ia,,fl11..1 addirnlili.átr,rr Figure 1. Racehorse image segmented into superpixels of size 256 pixels with SLIC As shown in Figure 1, image is segmented into superpixels that conform to a grid, of which the size is about 256 pixels. Each pixel in the superpixel shares similar luminance, contour, texture etc. 3. JND FACTOR EXTRACTION The visibility threshold of JND in normal images could be a very complicated function of the factors mentioned above. In the following equations, the factor that reflects the gradients and the texture around the pixel is shown. For the pixel at (x, y),, denotes the maximal weighted average of gradients around the pixel [11],, =,,,,!, = 1 16 3 +, 3 +, # " where, are four high-pass filters for texture detection as shown in Figure 2.! 1 2 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 o 0 1 3 8 3 1 0 8 3 0 0 0 0 3 8 0 0 3-3 0 0 0 0 0 0 1 3 0-3 -1-1 -3 0 3 I 0 8 0-8 0 -I -3-8 -3 -I 0 0-3 -8 0 0-8 -3 0 0 0 3-3 0 0 0 0 0 0 0-1 0 0 0 0 -I 0 0 0 I 0 0 91 92 93 94 Figure 2. Filters for calculating the weighted average of luminance changes in four directions Proc. of SPIE-IS&T Vol. 9410 941004-2

As shown in Figure 3, the reconstructed image (a) has almost the same boundaries with the original Racehorse image. Apparently, they share the same contour for the feature. In the reconstructed image, three typical areas A, B and C are selected to show the connection between the value of the maximal weighted average of gradients around the pixel and (a) 30 36 14 13 11 22 16 25 33 37 39 36 11 31 0 0 0 0 0 0 0 9 11 5 10 17 28 9 33 37 39 36 11 31 33 1 1 1 1 0 0 0 19 26 20 17 25 36 26 35 31 25 22 2 42 45 2 3 3 2 1 0 1 10 12 21 10 12 12 17 27 21 17 13 18 49 28 3 4 4 4 3 2 1 16 18 27 19 6 10 6 15 15 15 5 32 47 9 3 2 2 4 6 6 6 16 14 13 17 10 12 9 12 19 16 9 44 36 11 3 2 1 2 5 6 6 20 28 15 11 6 7 7 21 21 12 19 47 23 11 A Figure 3. (a) Reconstructed image of original Racehorse image with the maximum weighted average of luminance difference around each pixel,(b) pixel value of area A, B and C of the reconstructed image B C the smoothness and visibility threshold. Higher value of weighted average of luminance difference (, ) results in a brighter pixel, and vice versa. As shown in Figure 3 (b), area A, which is located in the grass field of the original image, has lower, value than that of Area B, located between the leg of the jockey and the horse, where there is complicated texture and edge. Area C, located at the horseback, has the lowest values less or equal to 10. It can also be deduced that lower, value represents less texture and more smoothness while the higher, value means more texture, boundary and surface crease. It can also be deduced that areas with lower, values may withstand greater distortion caused by filtering without been discernible, and vice versa. 4. JND-BASED GAUSSIAN FILTERING FOR SUPERPIXELS Traditionally, Gaussian filtering for the frame with the same parameter leads to blurring effects and significant loss of subjective quality at the same time. Under the circumstances, a novel approach is adopted, with which different areas of the frame can be filtered adaptively. Since each pixel within a superpixel shares the similar texture, smoothness and surface crease, which are correlated with JND factors, therefore different superpixel may require different filters as determined by,. Proc. of SPIE-IS&T Vol. 9410 941004-3

As a result, superpixels within the frame are filtered individually with different parameters. A simple formula is given to implement a filter for a superpixel. Let % denote the average of, for pixels in the superpixel. A standard deviation for a & & Gaussian filter on the superpixel is defined as follows,, %, % <. ( = ) * % = + /,. % 1 0 Here b = 8.5, k = 3,. = 15, 1 = 30, c = 1 and d = 0.5 based on our work. ( is linearly negatively correlated with % when % is smaller than., whereas ( will be set at value c when % is smaller than 1 and greater than.. Once % is above the threshold 1, the superpixel will not be filtered. Also, certain pixels, whose values of maximal weighted average of luminance around themselves are bigger than the threshold (set at 15 here), will not be filtered as well. As to the boundary between superpixels, standard deviation will be determined by, of the superpixels it belongs to. If the gap of, is smaller than a set value, then % will be average of, for superpixels sharing the same boundary. Otherwise, the boundary will remain the same. As shown in Figure 4, the pre-processed image obtains a better quality than that of direct Gaussian filtering. 3 Figure 4. Pre-processed image by means of proposed method and Gaussian filtering 5. EXPERIMENTAL RESULTS The proposed scheme is implemented based on the HEVC HM 13.0 reference encoder with RDOQ enabled. Sequences of 832x480p, 1280x720p and 1920x1080p with all the frames are used in the experiment. Under the same QP and different bit rates, the subjective quality of the encoded video is compared between the proposed method and the original reference encoder. The double stimulus impairment scale (DSIS) test method described in ITU-R BT.500 subjective evaluation standard [12] is applied for this subjective evaluation. The reference sequence is shown in Fig.5, followed by the sequence encoded with the pre-processing. A 65 Skyworth LCD display (65E810U) is used. The viewing distance is about 3 times the image height. Five observers with expertise in image / video processing are asked to give a score from 1 (very annoying) to 5 (imperceptible). The mean opinion score (MOS) is computed as the mean score of all observers. As shown in Table 1, all of the sequences have MOS of 5 or close to 5, and an overall average as 4.97, implying that the perceptual quality is the same as the reference sequences. Also, the structural similarity (SSIM) index is applied to evaluate the quality. SSIM value is calculated for each encoded sequence by original encoder and the proposed method with the initial uncompressed sequence as reference. As a result, the curves with bitrate for different SSIM in John and Kimo by original encoded sequence and proposed method are drawn in Fig. 5. The result shows a bit rate gain. Proc. of SPIE-IS&T Vol. 9410 941004-4

Bitrate for Different SSIM in 'John' Bitrate for Different SSIM in 'Wino' 35W 8000 30W 7000 25W g 20W Y C 15W 10W -original -Preprocessed 6000 5000 a 4000 é 3000 m 2000 -original - Preprocessed 500 1000 0 0.982 0.984 0.986 0.988 0.99 551M 0.992 0.994 0 0.97 0.975 0.98 0.985 0.99 0.995 1 551M Figure 5. Bitrate for different SSIM in sequence John and Kimo encoded by original reference encoder and preprocessed method Table 1 shows the bit rate reduction and MOS when the proposed scheme is used. On the average, the bit rate is reduced by 9.3% for all sequences. Fig. 5 shows the average bit rate change at different QPs. In general, a higher resolution and a smaller QP can result in a higher reduction of bit rate, up to 29% without a visual loss. To some extent, Gaussian filtering for superpixel misses the relevance of temporal domain, which causes the bit reduction effect of proposed video pre-processing to diminish. This also occurs as QP increases, since more coefficients are quantized to zero. Bitrate Change at Different QP 0% -2% 20 24 28 32-4% $ -6% -A II frame QP Figure 6. Bit rate changes at different QPs for all frames 6. CONCLUSIONS In this paper, we proposed a method that exploits the SLIC superpixel and JND in video pre-processing to improve the compression efficiency of HEVC encoder without causing visual loss. The whole frame is handled by means of JNDbased Gaussian filtering for superpixels generated by SLIC. This approach reduces the average bit rate by 9.3% without a loss of visual quality. The saving of bit rate is more significant when a video has higher resolution or is encoded at a lower QP. 7. ACKNOWLEDGMENTS This work was partly supported by a grant of National Science Foundation of China 61370115, Shenzhen Peacock Plan and Shenzhen fundamental research project JCYJ20120614150301623. Proc. of SPIE-IS&T Vol. 9410 941004-5

Table 1. Bit rate reduction of pre-processed sequences and their MOS. 832ti480p sequences 1280x720p sequences 1920x1080p sequences Sequence QP Bit rate change MOS Johnny 20-8.70% 5 24-7.90?0 5 BasketballDtill 28-6_50% 5 32-5.70" 0 5 20 1.70% 4.6 24 0.00% 5 28-2.40% 5 32-3.40% 5 20-9.50% 5 24-6.80% 5 RaceHorses 28-5.40% 5 32-4.40% 4.8 24-17.90% 5 28-12.40% 5 20-29% 5-32 -9.80% 5 20-1330% 5 24-6.00% 5 FourPcople 28-5.10% 5 32-4.70% 4.8 20-19.80% 5 24-11.10% 5 KristenAndSara 28-8.50% 5 32-6.40% 4.8 20-20.20% 5 24-11% 5 Kimono 28-9% 5 32-7% 5 20-13.2 5 22-12% 5 ParkScene 21-12% 5 3- -il% 5 REFERENCES [1] G. J. Sullivan, J. Ohm, W. J. Han, and T. Wiegand, "Overview of the high efficiency video coding (HEVC) standard," IEEE Circuits and Systems for Video Technology. Papers 22, 1649-1668 (2012). [2] K. J. Shen, "Video preprocessing techniques for real-time video compression-noise level estimation techniques," IEEE Circuits and Systems. Papers 2, 778-781 (1999). [3] S. J. Huang, "Adaptive noise reduction and image sharpening for digital video compression," IEEE International Conference on Computational Cybernetics and Simulation. Papers 4, 3142 3147 (1997). [4] X. Yang, W. Lin, Z. Lu, E. Ong, and S. Yao "Motion-compensated residue preprocessing in video coding based on justice-noticeable-distortion profile," IEEE transactions on Circuits and Systems for Video Technology. Papers 15, 742-752 (2005). [5] J. Geol, D. Chan, and P. Mandi, "Pre-processing for MPEG compression using adaptive spatial filtering," IEEE transactions on Consumer Electronics. Papers 41, 687-691 (1995). [6] I. Hontish and L. J. Karam "Adaptive image coding with perceptual distortion control," IEEE transactions on Image Processing. Papers 11, 213-222 (2002). [7] R. Achanta, A. Shaji, K. Smith, A. Lucchi, and P. Fua, and S. Susstrunk, "SLIC superpixels compared to state-ofthe-art superpixel methods," IEEE Pattern Analysis and Machine Intelligence. Papers 34, 2274-2282 (2012). [8] X. K. Yang, W. S. Ling, Z. K. Lu, E. P. Ong, and S. S. Yao, "Just noticeable distortion model and its applications in video coding," Signal Processing: Image Communication. Papers 20, 662-680 (2005). [9] Y. Jia, W. Lin, and A. A. Kassim, "Estimating just-noticeable distortion for video," IEEE Circuits and Systems for Video Technology. Papers 16, 820-829 (2006). [10] X. Yang, W. Lin, Z. Lu, E. Ong, and S. Yao, "Motion-compensated residue preprocessing in video coding based on just-noticeable-distortion profile," IEEE Circuits and Systems for Video Technology. Papers 15, 742-752 (2005). [11] C. H. Chou and Y. C. Li, "A perceptually tuned subband image coder based on the measure of just-noticeabledistortion profile," IEEE Circuits and Systems for Video Technology. Papers 5, 467-476 (1996). [12] Methodology for the subjective assessment of the quality of television pictures, ITU-R Recommendation BT.500-11 (2002). Proc. of SPIE-IS&T Vol. 9410 941004-6