Perceptual Quality Improvement of Stereoscopic Images

Jong In Gil and Manbae Kim
Dept. of Computer and Communications Engineering
Kangwon National University
Chunchon, Republic of Korea, 200-701
E-mail: manbae@kangwon.ac.kr

Abstract - In general, 3D scene depth is affected by intensity and saturation differences according to human factors. Utilizing this information, we propose a novel method to improve the perceived depth of stereoscopic images. By varying the RGB colors, especially in areas adjacent to edge pixels, depth perception is expected to be enhanced. For this, two methods for intensity transformation and one for saturation processing are presented. Experiments performed on various stereoscopic images validate, based on a subjective test, that the proposed methods deliver enhanced 3D depth.

Keywords: Stereoscopic image, perceptual quality, 3D perception

1 Introduction

In general, stereoscopic images are acquired from two camera sensors. When the images are displayed on a 3D monitor, humans can view and perceive 3D. Usually, stereoscopic images are delivered to viewers without any modification or enhancement. Human factors [1-4] indicate that perceptual quality is related to the amount of intensity and saturation. For instance, objects with greater saturation are perceived as closer to the viewer than those with relatively lower saturation, even when they are located at the identical distance. The same phenomenon applies to intensity. Based on this 3D human factor, this paper presents a novel method to enhance stereoscopic images by varying the intensity and saturation values of the left and right images. Fig. 1 shows an example of this human illusion. We made a synthetic picture with varying intensity and saturation levels. We observe the left object as closer to our eyes; as the saturation decreases, objects are perceived as more distant.
Utilizing this human factor, a novel scheme for enhancing 3D perception is presented by extending this phenomenon to stereoscopic images. While watching stereoscopic images, two objects at the same distance can be perceived at different depths according to either intensity or saturation levels.

Fig. 1. Intensity and saturation levels decrease from left to right over the range [1.0, 0.25].

This paper is organized as follows: the overall approach is introduced in Section 2. Image preprocessing is presented in Section 3. Section 4 presents the depth map filtering and the derivation of high-frequency components. The intensity and saturation processing schemes are presented in Section 5. Experimental results are discussed in Section 6, followed by the conclusion in Section 7.

2 Overall Method

A stereoscopic image is composed of left and right images. As shown in Fig. 2, the preprocessing estimates the magnitude and orientation of edges for the left and right images, I_L and I_R. Given the edge pixels, we extract high-frequency components from a depth map that can be obtained from stereo matching or a pre-made depth map. The depth map is low-pass filtered, and the high-frequency components that exist at edge boundaries are derived. These high-frequency components play an important role in the intensity/saturation transformation. Finally, an output stereoscopic image I'_L and I'_R is generated.

Fig. 2. The block diagram of the proposed method
3 Preprocessing

The purpose of the preprocessing is twofold: edge detection and depth map generation. To detect edge pixels, the Sobel edge operator is used [5]. From the edge operator, edge magnitude and orientation are computed. Fig. 3 shows the edge strength maps and the binary edge maps obtained after thresholding the edge strength, for the left and right images respectively.

Fig. 3. Edge map and binary edge map of left image and right image

Since edge noise might be present in the edge map, a min-max filter is applied. Fig. 4 shows the output, in which noisy edges are removed.

Fig. 4. Noise-removed edge map after the min-max filter

The next procedure is to derive an edge boundary with a maximum thickness of 1 or 2 pixels. Most thinning procedures repeatedly remove boundary pixels until a pixel set of width 1 or 2 remains. Among them, the Zhang-Suen thinning algorithm [6], widely used due to its fast and simple implementation, is adopted; the result is shown in Fig. 5.

Fig. 5. Edge map obtained from the Zhang-Suen thinning algorithm

The edge map still contains small edge regions. In order to remove such components, CCL (connected component labeling) is used. Labeled components smaller than P pixels are then eliminated. Fig. 6 shows the edge map after the CCL.

Fig. 6. Edge map after removing small edge components by the CCL algorithm

The edge orientation is important in the derivation of the high-frequency components of a depth map. Since some edges might have inaccurate orientations, adjacent edges might not share an identical orientation even along line edges. To solve this problem, a median filter is applied. Fig. 7 shows the median-filtered edge maps of the left and right images.

Fig. 7. Median-filtered edge maps

Various disparity estimation methods for stereoscopic images have been proposed in the literature. Since stereo matching is not our concern, we use depth maps provided by the Middlebury site [7].
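The edge-detection stage above can be illustrated in pure NumPy. This is a minimal sketch of the Sobel magnitude/orientation computation with a simple relative threshold (the 0.25 factor is an illustrative assumption, not taken from the paper); the min-max filtering, Zhang-Suen thinning, CCL pruning, and median filtering described above are omitted.

```python
import numpy as np

def sobel_edges(img, thresh=0.25):
    """Compute Sobel edge magnitude and orientation, then threshold.

    img: 2D float array. Returns (magnitude, orientation in radians,
    binary edge map). The relative threshold factor is an assumption.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    pad = np.pad(img, 1, mode='edge')      # replicate border pixels
    gx = np.zeros_like(img, dtype=float)
    gy = np.zeros_like(img, dtype=float)
    for i in range(3):                      # correlate with 3x3 kernels
        for j in range(3):
            patch = pad[i:i + img.shape[0], j:j + img.shape[1]]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    mag = np.hypot(gx, gy)                  # edge strength
    theta = np.arctan2(gy, gx)              # edge orientation
    edges = mag > thresh * mag.max()        # binary edge map
    return mag, theta, edges
```

In practice a library routine (e.g. an OpenCV Sobel operator) would replace the explicit loops; the sketch only makes the magnitude/orientation computation concrete.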
One of the depth maps of a stereoscopic image is shown in Fig. 8.
Fig. 8. Depth maps from the Middlebury site [7]

4 Depth Map Preprocessing

The depth, in [0, 255], encodes the distance between the scene and a camera. We compute the difference between the original and the low-pass filtered depth map, thereby obtaining the high-frequency components. Given the input depth data D, we compute ΔD by

ΔD = D - G_σ * D (1)

where G_σ * D is the convolution of D with a Gaussian filter kernel with variance σ. ΔD contains the local depth contrast, which can be interpreted as follows: ΔD ≈ 0 represents smooth areas of D, while |ΔD| > 0 represents areas of highly varying depth. In more detail, ΔD < 0 represents areas of background objects that are close to occluding areas, while ΔD > 0 represents boundary areas of foreground objects. Fig. 9 illustrates this property of ΔD.

Fig. 9. The sign of ΔD. The blue and red pixels have positive and negative signs, respectively.

ΔD is computed only at the edges of Fig. 7, and the implementation of the Gaussian filtering depends on the edge orientation θ. For instance, as shown in Fig. 10, given an edge pixel (in red), vertical or horizontal filtering is performed based on the edge orientation. Fig. 11 shows the Gaussian-filtered depth map.

Fig. 10. Horizontal and vertical filtering applied according to the edge orientation

Fig. 11. Gaussian-filtered depth maps

5 Intensity and Saturation Processing

Utilizing ΔD, the following methods are presented: contrast variation and depth darkening for intensity transformation, and saturation variation for saturation transformation.

A) Contrast Variation

As already mentioned, ΔD captures the high-frequency property of a depth map. This information can be integrated by directly modulating the input image I. Consequently, we add ΔD to each color channel by computing

I' = I + λΔD (2)

with λ being a user-defined parameter and I ∈ {R, G, B}.
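Equations (1) and (2) can be sketched as follows. This is a minimal NumPy sketch that applies an isotropic Gaussian over the whole depth map, whereas the paper filters only at edge pixels and along the edge orientation; the σ and λ values are illustrative assumptions.

```python
import numpy as np

def gaussian_blur(d, sigma=1.5):
    """Separable Gaussian low-pass filter (edge-padded), pure NumPy."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    k /= k.sum()
    pad = np.pad(d, r, mode='edge')
    h = np.zeros_like(d, dtype=float)           # horizontal pass
    for j, w in enumerate(k):
        h += w * pad[r:r + d.shape[0], j:j + d.shape[1]]
    pad2 = np.pad(h, ((r, r), (0, 0)), mode='edge')
    out = np.zeros_like(h)                      # vertical pass
    for i, w in enumerate(k):
        out += w * pad2[i:i + d.shape[0], :]
    return out

def depth_contrast(depth, sigma=1.5):
    """Eq. (1): Delta-D = D - G_sigma * D, the high-frequency depth contrast."""
    d = depth.astype(float)
    return d - gaussian_blur(d, sigma)

def contrast_variation(img, delta_d, lam=2.0):
    """Eq. (2): I' = I + lambda * Delta-D, added to each RGB channel."""
    out = img.astype(float) + lam * delta_d[..., None]
    return np.clip(out, 0, 255)
```

On a depth step, ΔD is positive on the foreground side of the boundary and negative on the background side, matching the interpretation given above.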
In this method, when the color of foreground objects is brighter than the background, the visual contrast can be reduced. The sign of λ determines whether the foreground pixels become darker and the background pixels brighter, or vice versa.

B) Depth Darkening

When two regions of foreground and background have similar colors but different depths, a possible remedy for the missing color contrast is to introduce a kind of artificial contrast. In this case, we only darken or brighten the background along spatially important edges by computing

I' = I + λΔD^- (3)

where ΔD^- denotes the negative component of ΔD. Depending on the sign of λ, the background pixels become darker or brighter.
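A minimal sketch of the depth-darkening transform of Eq. (3), reusing the ΔD map from Eq. (1); the λ value is an illustrative assumption:

```python
import numpy as np

def depth_darkening(img, delta_d, lam=2.0):
    """Eq. (3): I' = I + lambda * Delta-D^-.

    Only the negative part of Delta-D (background pixels adjacent to an
    occluding foreground edge) is kept; positive values are zeroed, so
    foreground colors are untouched. With lam > 0 the background near
    edges is darkened; with lam < 0 it is brightened.
    """
    dneg = np.minimum(delta_d.astype(float), 0.0)   # Delta-D^-
    out = img.astype(float) + lam * dneg[..., None]
    return np.clip(out, 0, 255)
```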
C) Saturation Variation

As mentioned, the amount of saturation can affect the 3D perception of stereoscopic images. To exploit this, we present a method of adjusting the saturation values of regions neighboring edge pixels. The following equation transforms the input saturation S using ΔD:

S' = S + λΔD^+ (4)

where ΔD^+ is the positive component of ΔD. If we add ΔD^+ to S, the foreground saturation increases relative to the background. The HSI color is then transformed back into RGB.

6 Experimental Results

The proposed methods were performed on various 2D images and depth maps, and we illustrate the results for each test image. The first image is the MSR breakdance image and its depth sequence, as shown in Fig. 4. The proposed methods were also performed on the set of stereoscopic images shown in Fig. 12. The resulting images are shown in Figs. 13 through 15. Fig. 13 shows the output of the contrast variation and depth darkening. The intensity and saturation processing was performed on the regions adjacent to edge pixels. Close-ups of some particular edges are shown in Figs. 14 and 15, together with a graph showing the variation of RGB values with respect to λ. The transformed RGB images from the saturation processing are shown in Fig. 16.

Fig. 12. Test stereoscopic images [7]

Fig. 13. Output stereoscopic images: contrast variation and depth darkening

We observed the stereoscopic images on a 3D monitor. Our experiments on stereoscopic 3D displays show that the perceived depth improves in proportion to the value of λ. To verify this, a DSCQS (Double Stimulus Continuous Quality Scale) subjective test was performed. In the first stage, the original views were displayed to five participants.
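Returning to the saturation transform of Eq. (4) in Section 5C, it can be sketched as follows. This minimal sketch operates directly on the HSI saturation channel; the RGB-to-HSI conversion and back, and the scaling of ΔD into the saturation range, are omitted as assumptions.

```python
import numpy as np

def saturation_variation(s, delta_d, lam=2.0):
    """Eq. (4): S' = S + lambda * Delta-D^+, on the saturation channel.

    s: saturation channel in [0, 1]; delta_d: depth contrast from
    Eq. (1), assumed pre-scaled to a comparable range. Only the
    positive part of Delta-D (foreground boundaries) is used, so the
    foreground saturation rises relative to the background.
    """
    dpos = np.maximum(delta_d.astype(float), 0.0)   # Delta-D^+
    return np.clip(s + lam * dpos, 0.0, 1.0)
```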
Each participant watched the original views for 10 seconds and the processed views for the same period, and evaluated the 3D depth effect. The scaling factor λ was varied between [1.5, 3.0]. Depth perception was then subjectively judged on a scale of 1 (not good) to 5 (very good). On average, the score was higher than 3.5. The results are shown in Fig. 14.
Fig. 14. Subjective tests for stereoscopic improvement

7 Conclusions

In this paper, we presented novel schemes that improve the perceived quality of 3D stereoscopic images. Following the human factors regarding intensity as well as saturation, the aim of this paper was to extend this phenomenon to a 3D human factor and to provide 3D enhancement schemes. To this end, two methods for intensity processing and one for saturation processing were proposed to achieve better 3D depth, which was verified by subjective evaluation.

8 Acknowledgement

This work was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2011-(C1090-1111-0003)), and by the IT R&D program of MKE/KCC/KEIT [KI002058, Signal Processing Elements and their SoC Developments to Realize the Integrated Service System for Interactive Digital Holograms].

9 References

[1] L. M. J. Meesters, W. A. IJsselsteijn, and P. J. H. Seuntiens, "A survey of perceptual evaluations and requirements of three-dimensional TV," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 14, No. 3, pp. 381-391, 2004.
[2] L. Blonde, D. Doyen, and T. Borel, "3D stereo rendering challenges and techniques," 44th Annual Conference on Information Sciences and Systems (CISS), IEEE, pp. 1-6, 2010.
[3] E. Lee, H. Heo, and K. Park, "The comparative measurements of eyestrain caused by 2D and 3D displays," IEEE Transactions on Consumer Electronics, Vol. 56, Issue 3, pp. 1677-1683, 2010.
[4] L. Stelmach, W. Tam, F. Speranza, R. Renaud, and T. Martin, "Improving the Visual Comfort of Stereoscopic Images," Stereoscopic Displays and Virtual Reality Systems X, Proceedings of SPIE-IS&T Electronic Imaging, SPIE Vol. 5006, pp. 269-282, 2003.
[5] M. Sonka, V. Hlavac, and R. Boyle, Image Processing, Analysis and Machine Vision, 3rd Ed., Thomson, 2008.
[6] T. Y. Zhang and C. Y. Suen, "A fast parallel algorithm for thinning digital patterns," Communications of the ACM, 27(3), pp. 236-239, 1984.
[7] http://vision.middlebury.edu/stereo/