DCT-BASED IMAGE QUALITY ASSESSMENT FOR MOBILE SYSTEM

Jeoong Sung Park and Tokunbo Ogunfunmi
Department of Electrical Engineering
Santa Clara University
Santa Clara, CA 95053, USA
Email: jeoongsung@gmail.com and togunfunmi@scu.edu

ABSTRACT

In this paper, we extend the DCT-based image quality approach proposed in our previous paper [7]. Our objective is to find a new image quality metric that can run on the fly at the video encoder and decoder in mobile systems. Most dominant image quality metrics, such as SSIM, use intensity, mean, variance, and covariance in the pixel domain, which costs too much hardware and complexity. Instead, we propose to measure only the frequency difference between the original image and the distorted image. Its complexity is low enough to implement in hardware. By reusing the DCT block built into image and video coding standards such as H.264 and HEVC, much of the hardware for computing frequency components can be saved. We propose a metric with better performance than the FSM (Frequency Similarity Method) proposed in [7]. In simulation, the proposed metric achieves 95 percent of the performance of SSIM. Even though 95 percent is not better than SSIM, it is enough to make a video system more adaptive and rate-controllable based on error measurement.

Index Terms: Objective image quality assessment, FSM.

1. INTRODUCTION

Image quality is a characteristic of an image that measures the perceived image degradation. It plays an important role in various image processing applications. The goal of image quality assessment is to supply quality metrics that can predict perceived image quality automatically. There are two types of image quality assessment: subjective quality assessment and objective quality assessment ([1], [2], [3]). Subjective image quality is concerned with how an image is perceived by a viewer, who gives his or her opinion on a particular image. The mean opinion score (MOS) has been used for subjective quality assessment.
Objective image quality assessment uses a mathematical model that approximates the results of subjective quality assessment. The goal of objective evaluation is to develop a quantitative measure that can predict perceived image quality. MSE (Mean Square Error) and PSNR (Peak Signal to Noise Ratio) ([2], [3]) are the most widely used objective image quality metrics. They measure the pixel-to-pixel error between a reference image and a distorted image. Alternatively, Wang et al. [4] proposed the Structural Similarity (SSIM) index. This method extracts the structural information of the image and has become a representative metric that reflects the human visual system (HVS). However, there are situations in which the information provided by this index does not match a subjective quality judgement. This is due to the bias each method has towards the image statistic it measures. Other quality assessment methods based on different features may give more accurate information about global quality. [6] reported a drawback of SSIM and presented the Quality Index based on Local Variance (QILV), a method based on the distribution of the local variance in the images that aims to better handle the non-stationarity of the images being compared. [5] proposes adding a frequency structural comparison to SSIM, but the frequency information is used redundantly and the calculation is very complicated. [11] develops a general-purpose no-reference approach to image quality assessment based on DCT statistics.

Our objective is to find a new image quality metric that can run on the fly at the video encoder and decoder in mobile systems. If it is possible to measure BER in the receiver without consuming CPU time or requiring much hardware, mobile systems can become more intelligent and adaptive. Most dominant image quality metrics, such as SSIM, use intensity, mean, variance, and covariance in the pixel domain, which costs too much hardware and complexity.

In our previous paper [7], we proposed a new image quality assessment method named FSM (Frequency Similarity Method). FSM measures only the frequency difference between a distorted image and a reference image by using the Discrete Cosine Transform (DCT). Its complexity is low enough to implement in hardware. By reusing the DCT block built into image and video coding standards such as H.264 and HEVC, much of the hardware for computing frequency components can be saved. Experimental results showed that FSM achieved about 90 percent
of the performance of SSIM in [7], which was higher than the 86 percent achieved by PSNR [5]. In this paper, we propose FMSE (Frequency Mean Square Error) as a new definition of FSM to obtain a more precise image quality metric. Based on experimental results with a standard image database [9], FMSE achieves 95 percent of the performance of SSIM. Moreover, FMSE performs better than SSIM on images with white noise. We also explore transform sizes of 16×16, 8×8, and 4×4 to get better performance.

This paper is organized as follows. Section 2 presents related background on image quality metrics. Section 3 describes our proposed method. Section 4 provides experimental results, and we conclude in Section 5.

2. BACKGROUND

In this section, we present a brief overview of image quality metrics. MSE (Mean Square Error) and PSNR (Peak Signal to Noise Ratio) ([2], [3]) measure the pixel-to-pixel error between a reference image and a distorted image, as given in Equations (1) and (2).

$$ MSE = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} (x_{ij} - y_{ij})^2 \tag{1} $$

$$ PSNR = 10 \log_{10} \frac{L^2}{MSE} \tag{2} $$

where $L$ is the maximum intensity level. As presented in [4], the equations of SSIM are as follows.

$$ l(x, y) = \frac{2\mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1} \tag{3} $$

$$ c(x, y) = \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2} \tag{4} $$

$$ s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3} \tag{5} $$

Equations (3), (4), and (5) represent the luminance comparison, the contrast comparison, and the structure comparison, respectively. In Equation (5), the structure comparison is computed from the covariance of x and y and the standard deviations of x and y. Finally, Equation (6) combines all of these comparisons.

$$ SSIM(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)} \tag{6} $$

[6] reported some drawbacks of SSIM. Figure 1, which is taken from [6], shows four different Lena images, Figure 1-(b) to Figure 1-(e), that have the same SSIM = .5 with respect to the reference image in Figure 1-(a). The human visual system clearly does not perceive these four images as having the same quality. For example, Figure 1-(b) looks definitely better than Figure 1-(c).
Figure 1-(e) cannot be identified as the Lena image without additional information. As shown in Figure 1, an important drawback of SSIM is its bias towards some features of the image. SSIM is too sensitive to white noise and speckle noise and too generous to blurring.

In our previous paper [7], the proposed method, FSM (Frequency Similarity Method), estimates the similarity between the frequency map of a reference image and that of a distorted image. To transform pixel data to the frequency domain, it uses an 8×8 block-based DCT for simple calculation.

$$ FSM_i = \frac{\min(X_i, Y_i) + C}{\max(X_i, Y_i) + C} \tag{7} $$

where $X$ and $Y$ are the transformed results of the original image and the distorted image, respectively, $i$ is a position index in the transformed image, which has the same size as the original image, and $C$ is a small constant used to avoid instability when the denominator approaches zero. Equation (7) indicates the relative difference between the frequency components of the original image and the distorted image regardless of which one is greater. The mean value of $FSM_i$ over all positions is the final metric for image quality assessment, as shown in Equation (8).

$$ FSM = \frac{1}{N} \sum_{i=1}^{N} \frac{\min(X_i, Y_i) + C}{\max(X_i, Y_i) + C} \tag{8} $$

where $0 \le FSM \le 1$ and $N$ is the number of pixels. Compared with Equation (6), Equation (8) is much simpler.

3. IMAGE QUALITY ASSESSMENT BASED ON FREQUENCY SIMILARITY

In [7], FSM achieves about 90 percent of the performance of SSIM. That performance is sufficient to detect a trend of low or high errors with minimal hardware. Our new target, however, is to raise the performance above that level. To obtain higher performance, the difference of each frequency component between the original image and the distorted image needs to be measured more precisely. One popular and precise method is MSE. We therefore obtain the frequency components of the original image and the distorted image by using the DCT and apply MSE to them as follows.

$$ FMSE = MSE(X, Y) = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} (X_{ij} - Y_{ij})^2 \tag{9} $$

where $X$ and $Y$ have the same meaning as in Equation (8), and the indices run over an $M \times N$ image as in Equation (1). FMSE is simply the MSE in the frequency domain. Compared with Equation (8), Equation (9) measures the frequency difference in a more statistical way and does not depend on $C$. The complexity of Equation (9) is also low enough for a simple hardware implementation compared with Equation (6).
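
Equations (8) and (9) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `dct_matrix`, `block_dct`, `fsm`, and `fmse` are hypothetical, the transform is an orthonormal floating-point DCT-II rather than a codec's integer transform, images are cropped to a multiple of the block size, FSM is computed on coefficient magnitudes so that the min/max ratio is well defined, and the default value of the constant C is arbitrary.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    m = np.arange(n).reshape(1, -1)  # sample index (columns)
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    basis[0, :] = np.sqrt(1.0 / n)   # DC row scaling makes the matrix orthonormal
    return basis


def block_dct(image: np.ndarray, block: int = 8) -> np.ndarray:
    """Apply a 2-D DCT to every non-overlapping block x block tile.

    Assumption: the image is cropped so both sides are multiples of `block`.
    The output has the same size as the cropped image.
    """
    h = image.shape[0] - image.shape[0] % block
    w = image.shape[1] - image.shape[1] % block
    img = image[:h, :w].astype(np.float64)
    basis = dct_matrix(block)
    # Rearrange into a grid of tiles: (rows of tiles, cols of tiles, block, block).
    tiles = img.reshape(h // block, block, w // block, block).transpose(0, 2, 1, 3)
    coeffs = basis @ tiles @ basis.T  # matmul broadcasts over the tile grid
    return coeffs.transpose(0, 2, 1, 3).reshape(h, w)


def fsm(reference: np.ndarray, distorted: np.ndarray,
        block: int = 8, c: float = 1e-3) -> float:
    """Equation (8): mean min/max ratio of DCT coefficient magnitudes.

    Assumption: magnitudes are used so the ratio stays between 0 and 1.
    """
    x = np.abs(block_dct(reference, block))
    y = np.abs(block_dct(distorted, block))
    return float(np.mean((np.minimum(x, y) + c) / (np.maximum(x, y) + c)))


def fmse(reference: np.ndarray, distorted: np.ndarray, block: int = 8) -> float:
    """Equation (9): mean square error between block-DCT coefficients."""
    x = block_dct(reference, block)
    y = block_dct(distorted, block)
    return float(np.mean((x - y) ** 2))
```

For identical inputs `fsm` returns 1 and `fmse` returns 0. With the orthonormal scaling assumed here, the DCT preserves energy, so `fmse` gives the same number as the pixel-domain MSE of Equation (1) on the cropped images; a codec transform with different per-coefficient scaling would weight the frequency components differently.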
Like FSM, FMSE can use the DCT block built in for data compression. If the DCT block already present in the video system is reused, no additional hardware is needed to obtain the frequency components. Only a few additional operation units for multiplication, addition, and division are required to calculate Equation (9).

Fig. 1. (a) Original image. The other images have the same SSIM = .582: (b) white noise added, (c) blur distortion, (d) high-boosted, (e) singular value decomposition (most significant eigenimage). These figures are taken from [6].

Figure 2 panels: (a) Blurring, (b) Speckling noise.

4. EXPERIMENTAL RESULTS

4.1. Similarity and difference between FMSE and SSIM

Figure 2 shows the relation between FMSE and SSIM. Images of each quality level were generated in two ways: by adding speckle noise and by blurring. The two plots correspond to these two ways. As shown in both plots, FMSE has a very high correlation with SSIM even though its complexity is lower than that of SSIM. However, the curve in Figure 2-(a) is sharper than that of Figure 2-(b). For example, at the points corresponding to SSIM = .6 on both plots, the FMSE value of the blurred Lena image is 58, which indicates that SSIM is too generous to blurring, whereas the FMSE value of the speckle-noised Lena image is 2, which indicates that SSIM is too sensitive to white noise and speckle noise. On the other hand, FMSE shows more balanced and less biased results with respect to these drawbacks of SSIM. As noted in [7], frequency-based image quality metrics are robust against the drawbacks of SSIM. In Figure 1, the FMSE values of the four distorted images are 25, 95, 89, and 24, respectively.

4.2. LIVE database

The LIVE Image Quality Assessment Database [9], together with the subjective score for each image, was used to validate the performance of the proposed algorithm.
In order to provide quantitative measures of the performance of the objective quality assessment models, we follow the performance evaluation procedures of the Video Quality Experts Group (VQEG) Phase II FR-TV test [10]. Following [10], logistic functions are applied in the fitting procedure to provide a nonlinear mapping between the objective and subjective scores, as shown in Figure 3. Then, Metric 1 (the Pearson linear correlation coefficient) and Metric 2 (the Spearman rank order correlation coefficient) are used for comparison. FSM in Figures 3 to 4 and Tables 1 to 4 indicates FMSE. Tables 1, 2, and 3 show the quantitative results.

Fig. 2. Correlation between FSM and SSIM

There are 5 groups of images in the LIVE database: JPEG, JPEG2000, Fast Fading, White Noise, and Gaussian Blur. We divide these 5 groups into two sets: the first three groups and the last two groups. For the first set, logistic regression curves between the subjective scores and FMSE/SSIM could be obtained easily, as shown in Figure 3. However, we failed to obtain those of the second set, because there were many outliers among the subjective score values of the LIVE database images. We therefore changed the comparison target of the second set (White Noise and Gaussian Blur) from the subjective scores to the standard deviation values, which are also included with the LIVE database images. The resulting logistic regression gave a better curve than with the subjective scores, as shown in Figure 4. When we remove the outliers from all images in the second set, the logistic regression curve of the subjective scores is very similar to that of the standard deviation. To compare more sample images, we select the standard deviation values rather than the subjective score values for the second set. Tables 2 and 3 indicate that SSIM performs higher by 5 percent than FMSE in all groups other than the White Noise group. Exceptionally,
FMSE performs higher by 4 percent than SSIM in the White Noise group. According to the experimental results on all LIVE database images in Table 3, FMSE achieves 95 percent of the performance of SSIM. Even though FMSE does not outperform SSIM, it has more practical advantages: lower complexity, fewer hardware resources, and easier adaptation to existing video systems, as mentioned in Section 3.

Table 4 compares the performance of FMSE with transform sizes of 16×16, 8×8, and 4×4. As the DCT block size gets larger, the performance of FMSE gets slightly higher. This is because a larger DCT block includes more frequency components than a smaller one. However, the performance difference is very small.

Fig. 3. Logistic regression curves with JPEG, JPEG2000, and Fast Fading images

Table 1. Performance comparison of FMSE and SSIM on JPEG, JPEG2000, and Fast Fading images

Fig. 4. Logistic regression curves with white noise and Gaussian blur images

Table 2. Performance comparison of FMSE and SSIM on white noise and Gaussian blur images

Table 3. Average values of SROCC, CC, MAE, RMS, OR

Table 4. Performance of FMSE with 4x4, 8x8, and 16x16 (all images)

MODEL    SROCC    CC       MAE      RMS      OR
4x4      .8746    .8395    5.77     7.2868   .579
8x8      .8748    .8398    5.7655   7.289    .589
16x16    .8756    .846     5.748    7.2585   .59

5. CONCLUSIONS

We presented an improved DCT-based metric for image quality assessment named FMSE. FMSE estimates the similarity of frequency content between a distorted image and a reference image using the DCT. Unlike SSIM and PSNR, it does not use any data in the pixel domain. Instead, it uses only frequency components. Experimental results show that FMSE achieves 95 percent of the performance of SSIM. That is, it still has a very high correlation with the subjective scores. In addition, the computational complexity of FMSE is lower than that of SSIM. Since it uses a fixed number of coefficients for matrix multiplication, it can be easily implemented in hardware. FMSE can use the DCT block that is already built into the video system. If that DCT block is reused for FMSE, only simple additional hardware is needed to measure image quality. A mobile system (video encoder or decoder) can become more intelligent and adaptive by using FMSE for image quality assessment on the fly.

6. REFERENCES

[1] K. R. Rao and H. R. Wu, Digital Video Image Quality and Perceptual Coding, CRC Press, 2006.

[2] S. Winkler and P. Mohandas, The Evolution of Video Quality Measurement: From PSNR to Hybrid Metrics, IEEE Transactions on Broadcasting, vol. 54, no. 3, pp. 660-668, Sept. 2008.

[3] Z. Wang and A. C. Bovik, Modern Image Quality Assessment, New York: Morgan and Claypool Publishing Company, 2006.

[4] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, Image quality assessment: from error visibility to structural similarity, IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004.

[5] D. Lv, D. Bi, and Y. Wang, Image Quality Assessment Based on DCT and Structural Similarity, 2010 6th International Conference on Wireless Communications Networking and Mobile Computing (WiCOM), pp. 1-4, 23-25 Sept. 2010.

[6] S. Aja-Fernandez, R. San Jose Estepar, C. Alberola-Lopez, and C. F. Westin, Image quality assessment based on local variance, 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS '06), pp. 4815-4818, Aug. 30-Sept. 3, 2006.

[7] J. S. Park and T. Ogunfunmi, A New Approach for Image Quality Assessment: Frequency Similarity Method (FSM), Proceedings of the IEEE International Conference on Industrial Electronics (ICIEA), Singapore, July 2012.

[8] J. S. Park and T. Ogunfunmi, Image quality assessment using frequency similarity, Provisional Patent Filing, USA, June 2012.

[9] H. R. Sheikh, Z. Wang, L. Cormack, and A. C. Bovik, LIVE Image Quality Assessment Database Release 2 [Online]. Available: http://live.ece.utexas.edu/research/quality, 2006.

[10] VQEG, Final Report From the Video Quality Experts Group on the Validation of Objective Models of Video Quality Assessment, Phase II (FR-TV2) (2003, 9). Available: http://www.vqeg.org/

[11] M. A. Saad, A. C. Bovik, and C. Charrier, A DCT Statistics-Based Blind Image Quality Index, IEEE Signal Processing Letters, vol. 17, no. 6, pp. 583-586, June 2010.

[12] H. Tang and L. Cahill, A new criterion for the evaluation of image restoration quality, TENCON '92: Technology Enabling Tomorrow: Computers, Communications and Automation towards the 21st Century, 1992 IEEE Region 10 International Conference, vol. 2, pp. 573-577, 11-13 Nov. 1992.

[13] A. Eskicioglu and P. Fisher, Image quality measures and their performance, IEEE Transactions on Communications, vol. 43, no. 12, pp. 2959-2965, Dec. 1995.