Image Retargeting for Small Display Devices

Chanho Jung and Changick Kim
Department of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea

ABSTRACT

In this paper, we propose a novel image importance model for image retargeting. The most widely used image importance model in existing image retargeting methods is the L1-norm or L2-norm of the gradient magnitude. It works well for simple scenes. However, the gradient magnitude based image importance model often leads to severe visual distortions when the scene is cluttered or the background is complex. In contrast to most previous approaches, we rely on gradient domain statistics (GDS), rather than the gradient magnitude itself, for more effective image retargeting. Our image retargeting is developed from the standpoint of human visual perception. We assume that human visual perception is highly adaptive and more sensitive to structural information in an image than to non-structural information. We do not model the image structure explicitly, since image structure has diverse aspects. Instead, our method obtains the structural information in an image implicitly by exploiting the gradient domain statistics. Experimental results show that the proposed method is more effective than previous image retargeting methods.

Keywords: Human Visual System (HVS), Image Importance Model, Image Retargeting.

1. INTRODUCTION

With the increasing diversity of display devices, such as cellular phones and PDAs, effective and intelligent schemes for adapting a given image to the target display size or aspect ratio are becoming important [1, 2]. A classical approach to changing the display size or aspect ratio is to uniformly rescale the image in the spatial domain. Uniform scaling has some limitations.
When the target display is smaller than the original image, the visually important parts (e.g., objects) rescaled by uniform scaling may become unrecognizable due to the reduced cutoff frequency. Moreover, it is difficult to avoid unwanted visual distortions when the aspect ratios of target and source differ. Image cropping, which focuses only on a region-of-interest (ROI) in an image, is another commonly used method. Naive cropping cannot be applied when the image contains multiple ROIs. Above all, the main problem of cropping is the complete loss of the contextual information outside the ROI, which plays an important role in understanding the image. Uniform scaling and image cropping cope only with the display space constraint (i.e., the geometric constraint) and ignore the image content. Figure 1 shows the limitations of the classical image resizing methods.

In order to overcome the limitations of traditional image resizing schemes, more sophisticated approaches have been introduced for automatic image retargeting by reorganizing the visual data. Suh et al. [3] proposed an automatic thumbnail creation method. Given an image, the thumbnail image is obtained by applying saliency and face detection [4, 5]. Chen et al. [6] introduced an image adaptation algorithm for mobile devices. Seam Carving [1], proposed by Avidan and Shamir, is a recent and highly popular image retargeting technique. They define a seam (i.e., a connected path of pixels running vertically or horizontally through the image) on the image energy function. In order to reduce the width or height of an image, the optimal seam with the minimum energy is found using dynamic programming. The L1-norm or L2-norm of the gradient magnitude is the most widely used image energy function in Seam Carving. Rubinstein et al. [7] extended the Seam Carving scheme to video retargeting. The dynamic programming is replaced by graph cuts, which are suitable for 3D volumes.
Moreover, the dynamic programming is improved by introducing forward energy. The forward energy takes into account the inserted gradient magnitude rather than the deleted gradient magnitude. Video retargeting has also been performed based on saliency, face, and motion attention models with a warping-based optimization [8]. Specifically, the motion information plays a crucial role in that method, since the moving objects in a video draw most of the viewers' attention.

Figure 1. Image resizing results by classical approaches: (a) original image, (b) uniform scaling, (c) letter boxing, and (d) image cropping. Note that the aspect ratios of the ROIs are seriously distorted by uniform scaling. The ROIs resampled by letter boxing become too small. Image cropping loses important contextual information outside the selected ROI. The original image is of width 300, whereas the resizing results are of width 100.

Further author information: (Send correspondence to Changick Kim) Chanho Jung: e-mail: peterjung@kaist.ac.kr. Changick Kim: e-mail: cikim@ee.kaist.ac.kr, telephone: +8 4 350 741. Applications of Digital Image Processing XXXIII, edited by Andrew G. Tescher, Proc. of SPIE Vol. 7798, 77981N, © 2010 SPIE, CCC code: 0277-786X/10/$18, doi: 10.1117/12.863133.

In the methods mentioned above, image resizing is performed by taking into account not only the geometric constraint but also the image content. That is, content awareness is the most important feature of image retargeting methods. The key to content awareness is assigning a visual importance value to each pixel, which we call the image importance model (IIM) in this paper. In the literature, a number of image energy functions have been taken as an approximation of visual importance for image retargeting. The L1-norm of the gradient magnitude is one of the most widely used IIMs. Entropy can also approximate visual importance [1]: in the entropy-based IIM, the entropy over a local neighborhood is estimated and then added to the L1-norm of the gradient magnitude. The histogram of gradients [9] is also a useful IIM. The visual saliency map [4, 10, 11] has proven to be an effective IIM in several image retargeting methods. In this paper, we propose an image importance model for image retargeting that is inspired by human visual perception.
Our model originates from the idea that human visual perception is highly adaptive and more sensitive to structural information in an image than to non-structural information. Therefore, the structural parts of an image can be regarded as the visually important ones, which should be preserved in image retargeting. The proposed method does not restrict image structures to a few structural elements (e.g., lines or edges) in a scene. Instead, we obtain the structural information in an image implicitly by exploiting the gradient domain statistics (GDS).

2. SEAM CARVING REVIEW

Seam Carving [1] is one of the most effective content-aware image resizing techniques in the literature. Its goal is to remove unnoticeable pixels that blend with their surroundings. To this end, an image energy function (i.e., image importance model) is defined based on the L1-norm of the gradient as follows.

\[ E_1(x, y) = \left| \frac{\partial}{\partial x} I(x, y) \right| + \left| \frac{\partial}{\partial y} I(x, y) \right|, \tag{1} \]

where $E_1(x, y)$ and $I(x, y)$ represent the visual importance value and the gray-scale value of the source image at pixel $(x, y)$, respectively. Given the image energy function, a vertical or horizontal seam is found to reduce the image width or height. Let $I$ be an $n \times m$ image. A vertical seam is defined as follows.

\[ s^x = \{s_i\}_{i=1}^{n} = \{(x(i), i)\}_{i=1}^{n}, \quad \text{subject to } |x(i) - x(i-1)| \le 1. \tag{2} \]

Thus, a vertical seam is an 8-connected path of pixels in the image from top to bottom, containing one, and only one, pixel in each row of the image. Seam Carving repeatedly finds the least important (i.e., lowest energy) vertical seam under the image energy function to reduce the image width (see Fig. 2(a)). A horizontal seam is defined in a similar manner. Let $C(s^x)$ denote the cost of a seam $s^x$. The optimal seam $\hat{s}^x$ (i.e., the least important seam) is obtained by minimizing the seam cost as follows.

\[ \hat{s}^x = \min_{s^x} C(s^x) = \min_{s^x} \sum_{i=1}^{n} E_1(s_i). \tag{3} \]

In order to find the optimal seam satisfying (3), dynamic programming is employed as follows.

\[ M(i, j) = E_1(i, j) + \min \begin{cases} M(i-1, j-1) \\ M(i-1, j) \\ M(i-1, j+1) \end{cases} \tag{4} \]

where $M(i, j)$ denotes the cumulative minimum energy over all possible connected seams ending at $(i, j)$. The first step is to traverse the image from the second row to the last row using (4). The optimal seam is then found by backtracking from the minimum value of the last row of $M$ to the first row. The definitions for a horizontal seam are similar. Figure 2 shows the image retargeting results obtained by Seam Carving. Compared to the traditional image resizing methods, Seam Carving not only preserves the visually important parts well but also retains the surrounding context without unwanted visual distortions (see Fig. 1 and Fig. 2).

Figure 2. (a) Image seams, superimposed on the source image, obtained from Seam Carving, (b) gradient magnitude based IIM, and (c) image retargeting results by Seam Carving.

3. PROPOSED ALGORITHM

As described in the previous sections, image retargeting techniques usually consist of two sequential stages: 1) IIM construction by quantifying the visual importance of pixels and 2) a resizing operation that applies a pre-defined optimization rule based on the IIM.
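To make the two stages concrete for Seam Carving, the following is a minimal pure-Python sketch of the gradient energy of Eq. (1) and the dynamic programming recurrence of Eq. (4) with backtracking. It is an illustration rather than the implementation used in the experiments: the function names, the list-of-lists image representation, the central-difference gradient, and the border replication are all assumptions made here.

```python
def gradient_energy(img):
    """Stage 1, Eq. (1): |dI/dx| + |dI/dy| per pixel.

    Central differences with replicated borders are an assumption;
    the text does not fix the derivative filter for Seam Carving.
    """
    h, w = len(img), len(img[0])
    energy = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            dx = img[i][min(j + 1, w - 1)] - img[i][max(j - 1, 0)]
            dy = img[min(i + 1, h - 1)][j] - img[max(i - 1, 0)][j]
            energy[i][j] = abs(dx) + abs(dy)
    return energy


def find_vertical_seam(energy):
    """Stage 2, Eq. (4): cumulative map M, then backtracking.

    Returns one column index per row, from the top row to the bottom row.
    """
    h, w = len(energy), len(energy[0])
    M = [row[:] for row in energy]
    for i in range(1, h):
        for j in range(w):
            lo, hi = max(j - 1, 0), min(j + 1, w - 1)
            M[i][j] += min(M[i - 1][lo:hi + 1])
    # Start from the minimum of the last row and walk upwards.
    j = min(range(w), key=lambda c: M[h - 1][c])
    seam = [j]
    for i in range(h - 1, 0, -1):
        lo, hi = max(j - 1, 0), min(j + 1, w - 1)
        j = min(range(lo, hi + 1), key=lambda c: M[i - 1][c])
        seam.append(j)
    seam.reverse()
    return seam


def remove_vertical_seam(img, seam):
    """Delete one pixel per row, reducing the image width by one."""
    return [row[:j] + row[j + 1:] for row, j in zip(img, seam)]
```

Calling `find_vertical_seam(gradient_energy(img))` and then `remove_vertical_seam(img, seam)` repeatedly reduces the width one column at a time. Any other importance map can be passed to `find_vertical_seam` in place of the gradient energy, which is how different IIMs can be compared under the same optimization.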
For example, in Seam Carving, the IIM is constructed based on the gradient magnitude in the first stage and dynamic programming is employed in the second stage. The IIM construction is fundamental to image retargeting. Moreover, the visual quality of the target image obtained from the second stage strongly depends on the IIM constructed in the first stage. Therefore, the given image content should be understood effectively for the image retargeting to succeed.

In this paper, we assume that an image may consist of structured and unstructured image components in a local neighborhood about every pixel. We define the image importance measure on a discrete gradient vector space. The vector space consists of two random variables: 1) the horizontal and 2) the vertical partial derivatives. There are several ways to calculate the derivatives. In this paper, the x and y components of the gradient $G = (g_x, g_y)^T$ are estimated by means of the Sobel kernel. We observe that highly structured image components share the following attributes in this feature space:

1) The $(g_x, g_y)$ elements are spread over a structured distribution. That is, the cluster has a dominant gradient orientation.
2) There is a set of $(g_x, g_y)$ elements whose gradient magnitude is large along the dominant gradient orientation.

Based on these observations, we introduce a statistical model for estimating the likelihood that the cluster arose from an underlying significant structured component. We employ the theory of angular signal statistics to measure the consistency of the gradient orientation (i.e., the first criterion). The directional data is modeled to build an angular dispersion of the gradient orientation in a local neighborhood $N$, because a higher angular dispersion implies a lower likelihood that the cluster has a dominant gradient orientation. Angular data are inherently periodic [12]. Due to this periodicity, the statistical theory used for signals whose domain is a straight line cannot be used to deal with circular data. Let $\theta$ be an angular random variable.
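The effect of periodicity can be seen in a small example. The sketch below compares the ordinary arithmetic mean of two angles with the standard unweighted circular mean (the direction of the sum of unit vectors). It is only an illustration of why linear statistics fail on angles; the function names are chosen here, and this unweighted mean is not the magnitude-weighted statistic developed in this paper.

```python
import math


def arithmetic_mean(angles):
    """Mean computed as if the angles lived on a straight line."""
    return sum(angles) / len(angles)


def circular_mean(angles):
    """Direction of the sum of the unit vectors (cos a, sin a)."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)


# Two directions that are only 20 degrees apart, on either side of 0.
angles = [math.radians(350.0), math.radians(10.0)]
linear = math.degrees(arithmetic_mean(angles))    # points the opposite way
circular = math.degrees(circular_mean(angles))    # stays close to 0
```

The arithmetic mean of 350 and 10 degrees is 180 degrees, which points away from both samples, whereas the circular mean lies between them.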
In the statistical literature, a number of circular measures of dispersion have been introduced. Unfortunately, these measures perform poorly in our feature space, because traditional angular statistics cope only with the angular component of the vector. To address this problem within our framework, the angular dispersion is estimated on a polar coordinate system so that the radial component of the vector is taken into account. The circular mean $\mu_\theta$ is defined as the direction of the vector sum of the gradient vectors in $N$ as follows.

\[ \mu_\theta = \tan^{-1}\!\left( \sum_{N} g_y \Big/ \sum_{N} g_x \right). \tag{5} \]

Then, the probability density function of $\gamma = \theta - \mu_\theta$ is defined to estimate the angular deviation $\sigma_\gamma$ as follows [13].

\[ f_\Gamma(\gamma) = \frac{1}{2\pi} e^{-\mu_r^2 / 2\sigma^2} + \frac{\mu_r \cos\gamma}{2\sigma\sqrt{2\pi}} \, e^{-\mu_r^2 \sin^2\gamma / 2\sigma^2} \left[ 1 + \operatorname{erf}\!\left( \frac{\mu_r \cos\gamma}{\sigma\sqrt{2}} \right) \right], \tag{6} \]

where $\mu_r$ and $\sigma^2$ represent the radial mean and the variance of the gradient components, respectively. The standard deviation of $\gamma$ is a function of $\zeta = \mu_r / \sigma$ and is given by

\[ \sigma_\gamma = \left[ \int_{-\pi}^{\pi} \gamma^2 f_\Gamma(\gamma) \, d\gamma \right]^{1/2} = \left[ \frac{\pi^2}{3} e^{-\zeta^2/2} + \frac{\zeta}{2\sqrt{2\pi}} \int_{-\pi}^{\pi} \gamma^2 \cos\gamma \left[ 1 + \operatorname{erf}\!\left( \frac{\zeta \cos\gamma}{\sqrt{2}} \right) \right] e^{-\zeta^2 \sin^2\gamma / 2} \, d\gamma \right]^{1/2}. \tag{7} \]
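The dependence of the angular deviation on $\zeta$ in Eq. (7) can be checked numerically. The sketch below evaluates the density of Eq. (6), rewritten in terms of $\zeta = \mu_r / \sigma$, and integrates $\gamma^2 f_\Gamma(\gamma)$ with a midpoint rule. The function names, the number of integration steps, and the choice of the midpoint rule are assumptions made for illustration.

```python
import math


def phase_pdf(gamma, zeta):
    """Density of Eq. (6) expressed through zeta = mu_r / sigma."""
    c, s = math.cos(gamma), math.sin(gamma)
    uniform_part = math.exp(-zeta ** 2 / 2.0) / (2.0 * math.pi)
    directed_part = (
        zeta * c / (2.0 * math.sqrt(2.0 * math.pi))
        * math.exp(-(zeta * s) ** 2 / 2.0)
        * (1.0 + math.erf(zeta * c / math.sqrt(2.0)))
    )
    return uniform_part + directed_part


def angular_deviation(zeta, steps=20000):
    """Eq. (7): square root of the second moment of gamma on [-pi, pi]."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        g = -math.pi + (k + 0.5) * h  # midpoint of the k-th sub-interval
        total += g * g * phase_pdf(g, zeta)
    return math.sqrt(total * h)
```

For $\zeta = 0$ the density is uniform and the deviation equals $\pi / \sqrt{3}$; as $\zeta$ grows (a strong mean gradient relative to the spread), the deviation shrinks, which matches the intuition that a dominant orientation yields a low angular dispersion.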

The summation over all gradients in the neighborhood $N$ centered on any given pixel for which the gradient direction is $\gamma$ is denoted by $\Sigma(\gamma)$, and $\Sigma$ denotes the summation over $N$ without regard for angle. With these definitions, the discrete form of (6) can be written as follows.

\[ f_\Gamma(\gamma) = E\!\left[ \frac{\Sigma(\gamma)}{\Sigma} \right], \tag{8} \]

where $E[\cdot]$ denotes the expectation operator. Moreover, based on (8), the discrete form of the angular deviation $\sigma_\gamma$ in (7) is given by

\[ \sigma_\gamma = \left[ \sum_{\gamma \in N} \gamma^2 \, E\!\left[ \frac{\Sigma(\gamma)}{\Sigma} \right] \right]^{1/2}. \tag{9} \]

Using the power series approximation $\cos\gamma \approx 1 - \gamma^2/2$, the angular deviation can be rearranged as follows.

\[ \sigma_\gamma = \left[ 2 \sum_{\gamma \in N} \left( (1 - \cos\gamma) \, E\!\left[ \frac{\Sigma(\gamma)}{\Sigma} \right] \right) \right]^{1/2}. \tag{10} \]

Due to the linearity of the expectation and the summation, their order may be interchanged. Finally, we obtain the angular dispersion $D_N$ as

\[ D_N = \left[ 2 \left( 1 - E\!\left[ \frac{\sum_{\gamma \in N} \Sigma(\gamma) \cos\gamma}{\Sigma} \right] \right) \right]^{1/2}, \tag{11} \]

since the angular deviation $\sigma_\gamma$ in $N$ can be regarded as the angular dispersion.

The second criterion of structured image components is handled by tensor analysis. A structure tensor of a local neighborhood $N$ is formed as follows.

\[ S = \begin{bmatrix} \sum g_x^2 & \sum g_x g_y \\ \sum g_x g_y & \sum g_y^2 \end{bmatrix}. \tag{12} \]

An eigen-decomposition is applied to the structure tensor $S$ to obtain the eigenvalues $(\lambda_1, \lambda_2)$. The structure tensor is well suited to the set of gradient elements whose gradient magnitude is large along the dominant gradient orientation, since the eigenvalues indicate the certainty of the gradient structure along their associated eigenvector directions. In particular, the maximum eigenvalue of $S$ gives the momentum of the distribution explained by the dominant gradient orientation. Based on this observation, we define the momentum of the gradient structure along the dominant gradient orientation as follows.

\[ M_N = \max(\lambda_1, \lambda_2). \tag{13} \]

We combine the two statistics to yield an overall importance measure as follows.

\[ I_N = f(D_N, M_N), \tag{14} \]

where $f(\cdot)$ is a combination function. The statistics are estimated within a local $n \times n$ square window about every pixel. In our work, we set $n = 11$.
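The two window statistics can be sketched for a single neighborhood as follows. This is an illustrative reading of Eqs. (5), (11), (12), and (13), not the authors' code. Two assumptions are made and should be treated as such: $\Sigma(\gamma)$ is interpreted as a sum of gradient magnitudes, and the expectation is replaced by the sample value inside the window. The function name and the treatment of a completely flat window are also hypothetical.

```python
import math


def patch_statistics(gx, gy):
    """Return (D_N, M_N) for one window.

    gx, gy: flat lists holding the gradient components of the pixels in N.
    """
    mags = [math.hypot(a, b) for a, b in zip(gx, gy)]
    total_mag = sum(mags)
    if total_mag == 0.0:
        # Assumption: a flat window has no orientation at all, so it is
        # treated as maximally dispersed with zero momentum.
        return math.sqrt(2.0), 0.0

    # Eq. (5): direction of the vector sum of the gradients.
    mu_theta = math.atan2(sum(gy), sum(gx))

    # Eq. (11) with magnitude weights (assumption) and no expectation.
    weighted_cos = sum(
        m * math.cos(math.atan2(b, a) - mu_theta)
        for a, b, m in zip(gx, gy, mags)
    ) / total_mag
    dispersion = math.sqrt(max(0.0, 2.0 * (1.0 - weighted_cos)))

    # Eq. (12): entries of the 2x2 structure tensor.
    sxx = sum(a * a for a in gx)
    syy = sum(b * b for b in gy)
    sxy = sum(a * b for a, b in zip(gx, gy))

    # Eq. (13): larger eigenvalue of a symmetric 2x2 matrix in closed form.
    half_trace = (sxx + syy) / 2.0
    root = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    momentum = half_trace + root
    return dispersion, momentum
```

With the 11 x 11 window used in this paper, `gx` and `gy` would each hold the 121 Sobel responses around a pixel. Under the magnitude-weighted reading, the dispersion lies between 0 (all gradients aligned) and $\sqrt{2}$ (gradients cancel out).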
Let $D(x, y)$ and $M(x, y)$ denote the angular dispersion function and the momentum function at $(x, y)$. We require the importance measure to satisfy the boundedness condition (i.e., $0 \le I(x, y) \le 1$). We would also like to adjust the relative importance of the two statistics. To this end, each of the two components is normalized to lie between 0 and 1. Finally, we define the image importance measure as follows and name the resulting metric the GDS model, that is, the importance measure based on gradient domain statistics.

\[ \mathrm{GDS}(x, y) = [D(x, y)]^{\alpha} \, [M(x, y)]^{\beta}, \tag{15} \]

where $\alpha > 0$ and $\beta > 0$ are parameters used to adjust the relative importance. In this paper, we set $\alpha = \beta = 1$.
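The normalization and the combination of Eq. (15) can be sketched as below. The text states only that both components are normalized between 0 and 1; the min-max scaling, the flat-list representation of the two maps, and the function names are assumptions made here, and the product follows Eq. (15) literally as written.

```python
def normalize(values):
    """Min-max scaling to [0, 1] (the scaling rule is an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def gds(dispersion, momentum, alpha=1.0, beta=1.0):
    """Eq. (15): product of the normalized statistics raised to alpha, beta.

    dispersion, momentum: per-pixel values of D and M, flattened to lists.
    """
    d = normalize(dispersion)
    m = normalize(momentum)
    return [dv ** alpha * mv ** beta for dv, mv in zip(d, m)]
```

Because both factors lie in [0, 1] and the exponents are positive, the product automatically satisfies the boundedness condition. Increasing `alpha` or `beta` makes the result more sensitive to the corresponding statistic.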

Figure 3. Several comparisons between Seam Carving [1] (left image of each pair) and the proposed scheme (right image of each pair) when the horizontal resolution is reduced. Note that the visual artifacts are greatly reduced by the proposed GDS model.

4. RESULTS AND CONCLUSION

In order to evaluate the performance of the proposed scheme, a database [14] provided by Microsoft Research Asia was used. The experiments were conducted on a number of images containing humans, animals, architecture, and so on. Fig. 3 and Fig. 4 show some of the retargeting results obtained by the proposed method and by the state-of-the-art image retargeting scheme (i.e., Seam Carving [1]). To make the comparison fair, the constructed IIMs are handled identically by the same dynamic programming optimization scheme. As shown in Fig. 3 and Fig. 4, the previous approach may not effectively preserve the visually important content, to which human perception is very sensitive, when cluttered scenes with complex backgrounds are dealt with. This is because the gradient magnitude assigns quite high image importance not only to the visually important content but also to its surrounding context. On the other hand, the proposed scheme yields more visually acceptable retargeting results over various images, as shown in Fig. 3 and Fig. 4. That is, our model effectively represents the underlying image structure, whereas Seam Carving does not.

In this paper, we have proposed a novel image importance model for image retargeting. Unlike the previous approaches, we rely on gradient domain statistics (GDS) for more effective image retargeting. Our image retargeting is developed from the standpoint of human visual perception. The experimental results presented in this section indicate that the proposed scheme outperforms the state-of-the-art method.
ACKNOWLEDGMENTS

This research was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2010-(C1090-1011-0003)).

REFERENCES

[1] Avidan, S. and Shamir, A., "Seam carving for content-aware image resizing," ACM Transactions on Graphics 26(3), 267-276 (2007).
[2] Guo, Y., Liu, F., Shi, J., Zhou, Z.-H., and Gleicher, M., "Image retargeting using mesh parametrization," IEEE Transactions on Multimedia 11(5), 856-867 (2009).

Figure 4. Several comparisons between Seam Carving [1] (top image of each pair) and the proposed scheme (bottom image of each pair) when the vertical resolution is reduced. Note that the visual artifacts are greatly reduced by the proposed GDS model.

[3] Suh, B., Ling, H., Bederson, B., and Jacobs, D., "Automatic thumbnail cropping and its effectiveness," in [Proc. of ACM Annual Symposium on User Interface Software and Technology], 95-104 (2003).
[4] Itti, L., Koch, C., and Niebur, E., "A model of saliency-based visual attention for rapid scene analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence 20(11), 1254-1259 (1998).
[5] Yang, M. H., Kriegman, D. J., and Ahuja, N., "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence 24(1), 34-58 (2002).
[6] Chen, L. Q., Xie, X., Fan, X., Ma, W. Y., Zhang, H. J., and Zhou, H. Q., "A visual attention model for adapting images on small displays," ACM Multimedia Systems 9(4), 353-364 (2003).
[7] Rubinstein, M., Shamir, A., and Avidan, S., "Improved seam carving for video retargeting," ACM Transactions on Graphics 27(3), 1-8 (2008).
[8] Wolf, L., Guttmann, M., and Cohen-Or, D., "Non-homogeneous content-driven video-retargeting," in [Proc. of IEEE International Conference on Computer Vision], 1-6 (2007).
[9] Dalal, N. and Triggs, B., "Histograms of oriented gradients for human detection," in [Proc. of IEEE International Conference on Computer Vision and Pattern Recognition], 886-893 (2005).
[10] Ma, Y.-F. and Zhang, H.-J., "Contrast-based image attention analysis by using fuzzy growing," in [Proc. of ACM International Conference on Multimedia], 374-381 (2003).
[11] Hou, X. and Zhang, L., "Saliency detection: A spectral residual approach," in [Proc. of IEEE International Conference on Computer Vision and Pattern Recognition], 1-8 (2007).
[12] Nikolaidis, N. and Pitas, I., "Nonlinear processing and analysis of angular signals," IEEE Transactions on Signal Processing 46(12), 3181-3194 (1998).
[13] Gregson, P. H., "Using angular dispersion of gradient direction for detecting edge ribbons," IEEE Transactions on Pattern Analysis and Machine Intelligence 15(7), 682-696 (1993).
[14] Liu, T., Sun, J., Zheng, N.-N., Tang, X., and Shum, H.-Y., "Learning to detect a salient object," in [Proc. of IEEE International Conference on Computer Vision and Pattern Recognition], 1-8 (2007).