Salient Object Segmentation based on Linearly Combined Affinity Graphs

Çağlar Aytekin, Alexandros Iosifidis, Serkan Kiranyaz* and Moncef Gabbouj
Department of Signal Processing, Tampere University of Technology, Tampere, Finland
* Department of Electrical Engineering, Qatar University, Doha, Qatar
{caglar.aytekin, alexandros.iosifidis, moncef.gabbouj}@tut.fi, mkiranyaz@qu.edu.qa

Abstract
In this paper, we propose a graph affinity learning method for a recently proposed graph-based salient object detection method, namely Extended Quantum Cuts (EQCut). We exploit the fact that the output of EQCut is differentiable with respect to the graph affinities in order to optimize the linear combination coefficients of several differentiable affinity functions by error backpropagation. We show that the learnt linear combination of affinities improves the performance over the baseline method and achieves comparable (or even better) performance when compared to state-of-the-art salient object segmentation methods.

Keywords: graph affinity learning, salient object segmentation, spectral graph theory

I. INTRODUCTION

Visual saliency estimation is an important research topic in computer vision where the task is to highlight the objects that visually stand out from their surroundings. Earlier research efforts focused mostly on models inspired by the human visual system [2], since visually important region selection is related to human attention mechanisms [1]. Accordingly, some works have based their evaluation on the accuracy of predicting human eye fixations [3]-[7]. However, the output of such methods corresponds to a (sparse) set of points (image pixels) and is therefore not very useful in terms of providing an initial region of interest for higher-level computer vision tasks, such as object recognition. A closed object segment is a more suitable input for these tasks. Hence, recent work on visual saliency estimation has concentrated mostly on salient object segmentation [8]-[18], a task which has proved useful for higher-level computer vision and pattern recognition problems such as tracking [26], object region proposals [28] and object recognition [27].

Graph-based salient object detection methods rank among the top performing methods in the literature. According to a recent benchmark [42], 4 out of the 6 top performing methods are based on graphs. All these methods exploit different graph-based approaches. For instance, in [12] the shortest path problem in graph theory [47] was exploited in order to find a robust backgroundness measure for the image regions; this measure was then combined with a region contrast measure in order to evaluate saliency. In [13], a Markov chain was constructed on an image graph model: virtual boundary nodes of this graph are selected as absorbing nodes and the saliency is calculated as the absorption time of the transient nodes. The method in [14] exploits the minimum barrier distance on graphs in order to efficiently evaluate the saliency of image regions based on their deviation from the appearance of image boundary regions. The methods in [17] and [18] solve a spectral approximation of a graph-based cut problem in order to find salient regions that differ from the image boundaries in appearance, are in high contrast with their surroundings, and have a large area.
In [38], the authors give a general formulation for several salient object detection methods [11], [13] following a graph-based diffusion interpretation, and propose a new diffusion-based model based on an in-depth spectral analysis of the diffusion matrix.

Although all of the aforementioned methods exploit graphs, they differ in their graph construction in many ways. For example, [11], [12], [13], [18] and [38] use superpixels as nodes of the graph representation, whereas [14] and [17] define pixels as graph nodes. A 4-connected neighborhood rule for nodes was adopted in [14] and [17]; in [12] every node was connected to adjacent nodes only, whereas in [11], [13] and [38] the nodes are also connected to the nodes sharing common boundaries with their neighboring nodes. In [18], a very dense graph was constructed, connecting nodes up to a 16th degree of neighborhood. In [11], [12], [13] and [38], all nodes corresponding to regions on the image boundary are connected to each other, whereas in [17] and [18] an artificial background node was introduced and all image boundary regions were connected to this artificial node with high weights. When constructing the graph weights between a pair of nodes, [12], [13] and [38] use an exponentially decaying function of the Euclidean distance between the appearance features of the corresponding superpixels, whereas [17] and [18] use the inverse of this distance.

The above-described differences in graph construction have a high impact on the performance of salient object detection methods. For instance, the difference between the methods in [18] and [17] lies in the graph construction approach: even if all remaining steps are the same, different graphs lead to a significant difference in performance. As will be shown later in this paper, graph construction can be even more effective than the method itself. In addition, there might be cases where a method can only work well with a specific graph type.

Based on the observations above, in this paper we focus on the graph construction problem; in particular, we aim to improve the computation of the graph weights (affinities) for better salient object segmentation. Conventionally, graph-based methods use a pre-defined function with fixed parameters that maps the Euclidean distance between the appearance features of a pair of nodes to an affinity value. In this paper, we propose a method that learns the parameters of three popular affinity functions along with the weights of their linear combination. We employ Extended Quantum Cuts (EQCut) [18] as the salient object segmentation method. We exploit the fact that EQCut produces saliency maps that are differentiable with respect to the graph affinities, which in turn allows us to backpropagate the salient object detection error in order to optimize the aforementioned parameters of the overall system.

The rest of the paper is organized as follows. In Section II, EQCut is briefly introduced and the proposed graph affinity learning strategy is described. In Section III we show that the learnt affinities lead to a significant performance improvement over the baseline EQCut. Finally, Section IV concludes the paper and suggests future research topics.

II. METHOD

In this section we first briefly describe the conventional EQCut algorithm that will be exploited. Next, we describe the proposed affinity learning approach.

A. Extended Quantum Cuts

Extended Quantum Cuts (EQCut) [18] is based on a set of improvements over Quantum Cuts (QCut) [17], a spectral foreground detection method exploiting the link between graph theory and quantum mechanics. In a graph representation of an image, QCut approximates the following optimization problem:

    \hat{A} = \arg\min_{A} \frac{cut(A, \bar{A})}{area(A)},    (1)

where A is the foreground segment, \bar{A} is the background segment (the rest of the image), area(A) is the number of nodes in segment A, and the cut cost is defined as cut(A, \bar{A}) = \sum_{i \in A, j \in \bar{A}} w_{i,j}. Here, w_{i,j} is the affinity (similarity) between nodes i and j. Let y_b be a binary indicator vector representing A, taking the value 1 on foreground nodes and 0 elsewhere. It has been proven in [17] that, if y_b is relaxed to take real values, the relaxed solution to (1) is given as follows:

    y = z \odot z,  \quad  Hz = \lambda z,    (2)

where \odot denotes the element-wise product and \lambda is the minimum eigenvalue of the Hamiltonian matrix H = D - W + V. Here, W is constructed from the affinities w_{i,j}, D is a diagonal degree matrix whose elements are obtained by row-wise summation of W, and V is a non-negative diagonal matrix of background priors for each node. The above representation exhibits a great similarity with the solutions for the quantum states of a sub-atomic particle, which were previously investigated for salient object segmentation in [19].

EQCut investigates the design of the graph that represents the image and of its affinity matrix W, both of which highly affect the salient object detection performance. The extensions proposed in [18] include a superpixel representation of the image, a novel graph affinity normalization strategy on a graph with higher connectivity, and a multiresolution approach. These extensions lead to a significant improvement over QCut [17]. It should be noted that the graph affinity normalization in EQCut is based on the neighbourhood degrees between pairs of superpixels and results in an asymmetric affinity matrix. In this paper we exploit a symmetric affinity matrix, obtained by simply averaging the asymmetric one with its transpose.
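
For concreteness, the relaxed problem (1)-(2) can be sketched in a few lines of Python. This is a minimal illustration assuming a precomputed symmetric affinity matrix W and a background-prior vector v; the function and variable names are ours, not the authors' implementation:

```python
import numpy as np

def qcut_saliency(W, v):
    """Relaxed QCut solution of Eqs. (1)-(2): y = z * z with H z = lambda_min z."""
    D = np.diag(W.sum(axis=1))            # degree matrix (row-wise sums of W)
    H = D - W + np.diag(v)                # Hamiltonian H = D - W + V
    eigvals, eigvecs = np.linalg.eigh(H)  # eigh assumes W (and hence H) is symmetric
    z = eigvecs[:, 0]                     # eigenvector of the minimum eigenvalue
    y = z * z                             # element-wise square, Eq. (2)
    return y / y.sum()                    # l1-normalised saliency per node
```

Since H is symmetric once W is symmetrized, the smallest eigenpair can be obtained with a standard symmetric eigensolver, which is one practical benefit of the symmetric affinity matrix used in this paper.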
B. Proposed Method

In this study, we use the same graph construction approach with the symmetric version of the graph affinity normalization in EQCut, and we investigate the effect of optimally combining affinity functions on the salient object detection performance of EQCut. In [17] and [18], the following affinity function was exploited:

    w^{(1)}_{i,j} = \frac{1}{\varepsilon + \| f_i - f_j \|^2},    (3)

where f_i is the feature vector of superpixel i and \varepsilon is a small constant that prevents division by zero. A property of this affinity function is that it strongly penalizes the Euclidean distance between two feature vectors. A relatively smoother exponential function was used in some salient object segmentation methods [12], [13], [38]:

    w^{(2)}_{i,j} = e^{- \| f_i - f_j \| / \sigma^2},    (4)

where \sigma is a smoothing factor controlling the rate at which the affinity decreases with increasing feature distance. Finally, a linear affinity function can also be used:

    w^{(3)}_{i,j} = 1 - \| f_i - f_j \|.    (5)

Note that in all affinity functions the Euclidean distances are unit-normalized, such that the maximum over all possible feature distances in a given image is set to 1. Hence, it is guaranteed that (5) does not produce negative affinities. We also note that, for a given image, the affinities obtained using (3) are normalized such that the maximum affinity has a value of 1. The affinities obtained by (4) and (5) do not need to be normalized, since they already produce outputs between 0 and 1.

The proposed method, instead of using a single affinity function with fixed parameters, makes use of a linear combination of them, as follows:

    w_{i,j} = c_1 w^{(1)}_{i,j} + c_2 w^{(2)}_{i,j} + c_3 w^{(3)}_{i,j}.    (6)
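
As an illustration, the three affinity functions and their combination in (6) could be computed as follows. The feature matrix F (one row per superpixel), the function name and the default parameter values are assumptions of this sketch, not part of the original method description:

```python
import numpy as np

def combined_affinities(F, c=(1.0, 0.0, 0.0), eps=1e-6, sigma=0.1):
    """Linear combination of the affinity functions (3)-(5) as in Eq. (6)."""
    diff = F[:, None, :] - F[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    dist = dist / dist.max()            # unit-normalise: maximum distance becomes 1

    w1 = 1.0 / (eps + dist ** 2)        # Eq. (3), inverse-distance affinity
    w1 = w1 / w1.max()                  # normalised so the maximum affinity is 1
    w2 = np.exp(-dist / sigma ** 2)     # Eq. (4), exponentially decaying affinity
    w3 = 1.0 - dist                     # Eq. (5), linear affinity

    return c[0] * w1 + c[1] * w2 + c[2] * w3   # Eq. (6)
```

The learning procedure described next treats c_1, c_2, c_3, \varepsilon and \sigma as the free parameters of this combined affinity.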

Furthermore, we consider \varepsilon in (3) and \sigma in (4) as parameters to be learned. Hence, we aim to learn the parameter set \theta = \{c_1, c_2, c_3, \varepsilon, \sigma\} in order to achieve a high salient object detection performance.

In order to learn affinities that perform well for salient object detection, we exploit the observation that the saliency maps obtained via (2) are differentiable with respect to the affinities of the underlying graph. This property gives us the opportunity to backpropagate the error at the output of EQCut to the parameters we wish to optimize. Exploiting the eigenvector derivatives for symmetric matrices presented in [48], one can write:

    \frac{\partial z}{\partial h} = \left[ z^T \otimes (\lambda I - H)^{+} \right] Dpl,    (7)

where \lambda is the minimum eigenvalue of H, z is the corresponding eigenvector, h is the vectorized version of H obtained by a row-wise raster scan, (\cdot)^{+} denotes the Moore-Penrose pseudoinverse, Dpl is the duplication matrix [48] which converts half-vectorizations of symmetric matrices to full vectorizations, I is the identity matrix and \otimes is the Kronecker product. In order to backpropagate the error e from the EQCut output y to the affinities, the partial derivative of the error corresponding to the i-th superpixel with respect to the entire matrix H can be written as:

    \frac{\partial e_i}{\partial H} = \left( \frac{\partial e_i}{\partial y_i} \right) \left( \frac{\partial y_i}{\partial z_i} \right) \left( \frac{\partial z_i}{\partial H} \right).    (8)

The component \partial z_i / \partial H of (8) can be extracted from (7), and \partial y_i / \partial z_i = 2 z_i follows from (2). The only remaining component, \partial e_i / \partial y_i, depends on the error function chosen for evaluating the EQCut output. We have used the following error function and the corresponding partial derivative with respect to y:

    e_i = | y_i - g_i |,  \quad  \frac{\partial e_i}{\partial y_i} = sgn( y_i - g_i ),    (9)

where g_i is the ground-truth label for superpixel i. The ground-truth map takes the value 1 for superpixels corresponding to salient regions and 0 otherwise. It should be noted that, since y is l1-normalized due to (2), the elements of y never reach the value 1; moreover, due to numerical precision they never become exactly equal to 0 either. Hence, (9) always produces an error derivative of +1 for background regions and -1 for salient object regions. We have observed that this kind of enforcement yields much better performance, mainly because it always forces the system to assign higher values to salient regions and to suppress background regions. Other error functions, such as the Euclidean distance between y and the l1-normalized ground-truth map, produce small gradients once y is close to the l1-normalized ground truth, and hence converge to a suboptimal solution.

Based on the above discussion, (8) can be calculated. Accordingly, the partial derivatives with respect to W can be calculated as follows:

    \frac{\partial e_i}{\partial W} = d_{e_i,H} \, \mathbf{1}^T - \frac{\partial e_i}{\partial H},    (10)

where d_{e_i,H} is the vector formed by the diagonal elements of \partial e_i / \partial H. Finally, the derivative of the average error e per node can be calculated as follows:

    \frac{\partial e}{\partial W} = \frac{1}{N} \sum_i \frac{\partial e_i}{\partial W}.    (11)

Next, the partial derivatives of the affinities with respect to the parameter set \theta need to be calculated. The partial derivative of the affinities with respect to the linear combination coefficient c_k, k = 1, 2, 3, is

    \frac{\partial w_{i,j}}{\partial c_k} = w^{(k)}_{i,j},    (12)

and the partial derivatives of the affinities with respect to the parameters \varepsilon and \sigma are given as follows:

    \frac{\partial w_{i,j}}{\partial \varepsilon} = - \frac{c_1}{(\varepsilon + \| f_i - f_j \|^2)^2},    (13)

    \frac{\partial w_{i,j}}{\partial \sigma} = \frac{2 \, c_2 \, w^{(2)}_{i,j} \, \| f_i - f_j \|}{\sigma^3}.    (14)

Combining (7)-(14), one can obtain the partial derivatives of the error e at the EQCut output y with respect to the affinity calculation parameter set \theta.
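
A compact sketch of this backpropagation is given below, under the simplifying assumption that W is directly the symmetric combined affinity of (6), i.e., ignoring EQCut's additional graph affinity normalization and the per-image normalization of (3). All function and variable names are illustrative, not the authors' code:

```python
import numpy as np

def affinity_param_gradients(W, v, dist, w1, w2, w3, c, eps, sigma, g):
    """Return d(error)/d(theta) for theta = (c1, c2, c3, eps, sigma).

    W: combined affinity (Eq. 6), v: background priors, dist: unit-normalised
    pairwise feature distances, w1/w2/w3: the affinities of Eqs. (3)-(5),
    c/eps/sigma: current parameters, g: ground-truth labels per superpixel.
    """
    n = W.shape[0]
    H = np.diag(W.sum(axis=1)) - W + np.diag(v)
    eigvals, eigvecs = np.linalg.eigh(H)
    lam, z = eigvals[0], eigvecs[:, 0]
    y = z * z
    y = y / y.sum()

    P = np.linalg.pinv(lam * np.eye(n) - H)   # Moore-Penrose pseudoinverse of Eq. (7)
    s = np.sign(y - g)                        # d e_i / d y_i, Eq. (9)

    dE_dW = np.zeros((n, n))
    for i in range(n):
        # Eq. (8): de_i/dH = (de_i/dy_i)(dy_i/dz_i)(dz_i/dH), with dy_i/dz_i = 2 z_i
        # and dz_i/dH[k, l] = P[i, k] * z[l] obtained from Eq. (7).
        G = s[i] * 2.0 * z[i] * np.outer(P[i, :], z)
        # Eq. (10): map the gradient w.r.t. H back to W (recall H = D - W + V).
        dE_dW += np.diag(G)[:, None] - G
    dE_dW /= n                                # Eq. (11): average error per node

    # Eqs. (12)-(14): chain rule from the affinities to the parameters.
    grad_c = [np.sum(dE_dW * wk) for wk in (w1, w2, w3)]
    grad_eps = np.sum(dE_dW * (-c[0] / (eps + dist ** 2) ** 2))
    grad_sigma = np.sum(dE_dW * (2.0 * c[1] * w2 * dist / sigma ** 3))
    return np.array(grad_c + [grad_eps, grad_sigma])
```

The per-superpixel loop is written explicitly to mirror Eqs. (8)-(11); in practice it can be vectorized.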
Hence, given a training dataset, the parameter set \theta can be learned by iteratively backpropagating the error to update \theta and recomputing the output until convergence, based on a stopping rule that is explained in the next section.

III. EXPERIMENTAL RESULTS

In this section we present extensive experiments conducted to evaluate the performance of the proposed method, which exploits learned affinities for EQCut-based salient object detection. Comparisons are made with EQCut as well as with state-of-the-art salient object detection methods.

A. Datasets

The SOD [33] dataset contains 300 images with many salient objects and high background clutter. The JUDD [41] dataset is the most challenging dataset, containing 900 images in total with multiple salient objects and high background clutter. MSRA10k [9] contains 10k images with relatively easier examples, i.e., large objects with high contrast. Dut-OMRON [11] contains 5168 images with one or more salient objects per image and high background clutter. Pascal1500 [49] contains 1500 images with more than one salient object at a variety of locations and scales, and challenging backgrounds.

B. Evaluation Metrics

In our experiments, we use precision-recall curves and the maximum and mean F1 scores as performance metrics. Precision is the ratio of the number of true positives to the total number of detected samples, and recall is the ratio of the number of true positives to the number of true samples. For salient object detection, the true samples are the pixels of the ground-truth segmentation mask G, and the detected samples at a threshold \tau are the nonzero elements of the saliency map thresholded at \tau, denoted S_\tau:

    pre(\tau) = \frac{|G \cap S_\tau|}{|S_\tau|},  \quad  rec(\tau) = \frac{|G \cap S_\tau|}{|G|}.    (15)

The F-measure is defined from the precision and recall values as follows:

    F_1(\tau) = \frac{2 \, pre(\tau) \, rec(\tau)}{pre(\tau) + rec(\tau)}.    (16)

The maximum F1 measure is then simply obtained by taking the maximum of the F1 measure over all applied thresholds.

C. Training

In all our experiments, Dut-OMRON was used during the training phase to learn the parameters \theta, as it is one of the largest datasets with the highest complexity. A portion of Dut-OMRON could also have been used as a validation set; however, the images within each dataset are correlated, e.g., Dut-OMRON contains several images from some classes that exhibit great similarity. Therefore, in order to obtain a system that generalizes well, we used an entirely different dataset (SOD) for validation purposes.

We initialized the parameters with \theta_0 = \{1, 0, 0, 10^{-6}, 0.1\}. This initialization corresponds to the affinity calculation in EQCut, and \sigma = 0.1 was set as in [12], [13] and [38]. At each iteration, the images in the Dut-OMRON dataset are randomly shuffled. Next, in an epoch, 50 training samples are chosen from the shuffled indices and the affinity function parameters are updated via error backpropagation with a minibatch size of 1. The step size is scaled by 0.99 at each consecutive iteration. At each epoch, the maximum F1 measure on the validation dataset is measured. Training is stopped when the error converges or when a pre-defined maximum number of iterations is reached. The parameter values corresponding to the epoch that produced the best performance on the validation set were selected as the final parameters.
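
The training procedure above can be summarized by the following skeleton. It assumes hypothetical helpers saliency_map(image, theta) (the EQCut forward pass with the combined affinities) and theta_gradient(image, gt, theta) (the backpropagation of Section II.B); the helper names, the data containers and the initial step size are our assumptions, not part of the paper:

```python
import numpy as np

def max_f1(sal, gt, thresholds=np.linspace(0.0, 1.0, 256)):
    """Maximum F1 over thresholds, following Eqs. (15)-(16)."""
    best = 0.0
    for t in thresholds:
        det = sal >= t
        tp = np.logical_and(det, gt).sum()
        if det.sum() == 0 or tp == 0:
            continue
        pre = tp / det.sum()
        rec = tp / gt.sum()
        best = max(best, 2 * pre * rec / (pre + rec))
    return best

def train(train_set, val_set, theta0, step=1e-3, max_iter=100):
    """SGD over theta with validation-based model selection (Section III.C)."""
    theta, best_theta, best_score = theta0.copy(), theta0.copy(), 0.0
    for it in range(max_iter):
        np.random.shuffle(train_set)                 # reshuffle the training images
        for image, gt in train_set[:50]:             # 50 samples per epoch, minibatch of 1
            theta -= step * theta_gradient(image, gt, theta)
        step *= 0.99                                 # decay the step size each iteration
        score = np.mean([max_f1(saliency_map(im, theta), g) for im, g in val_set])
        if score > best_score:                       # keep the best-validating parameters
            best_score, best_theta = score, theta.copy()
    return best_theta
```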
D. Comparison to EQCut

Here we compare the performance of the proposed method (EQCut-A) with EQCut. First, three separate parameter sets were learned for the three superpixel granularities of EQCut (150, 300 and 600 superpixels), following the details explained in the previous section. The log affinities of the learnt functions and of the functions w^{(1)} in (3), w^{(2)} in (4) and w^{(3)} in (5) are illustrated in Fig. 1. The learnt affinity functions lead to very high affinities at very small distances, as in the original EQCut affinity calculation (3). As the distances increase, the affinities decrease more smoothly than with (3), due to the contribution of (4) and (5).

Fig. 1 Log affinities for w^{(1)}, w^{(2)}, w^{(3)} and the learnt affinities for 150, 300 and 600 superpixel (SP) granularities.

In Fig. 2, the maximum and mean F1 measures for the test datasets (i.e., Pascal1500, MSRA10k and JUDD) are illustrated. For each superpixel granularity and for each test set, we observe a clear improvement in the maximum and mean F1 measures. The improvement becomes clearer as the dataset gets more difficult. For instance, MSRA10k is the simplest dataset and, although there is a performance improvement with the learnt affinities, it is rather incremental. However, for the challenging Pascal1500 and Judd datasets, the performance improvement is significant, reaching up to a 7% relative improvement over EQCut in the maximum F1 measure.

Fig. 2 Maximum F1 measure of the baseline EQCut and of EQCut with learned affinities (EQCut-A). Performances are given for the Pascal (P), Judd (J) and MSRA10k (M) datasets and for 150, 300 and 600 superpixel granularities.

Fig. 3 Performance comparisons of EQCut-A, EQCut, GP, RBD, ST, DSR, MC and DRFI: precision-recall curves (column 1 for the JUDD and column 2 for the MSRA10k dataset) and maximum F1 measures (column 3 for the JUDD and column 4 for the MSRA10k dataset).

E. Comparison with the State of the Art

In this section we compare the performance of EQCut-A with state-of-the-art salient object detection methods according to a recent benchmark study [42], namely EQCut [18], MC [13], RBD [12], ST [15], DSR [8] and DRFI [16]. Furthermore, we include one more method, GP [38], which became state-of-the-art and was published after the benchmark. For the comparisons in this section, we employed the multi-resolution approach of [18] and combined the saliency maps of all superpixel granularities by simply averaging them. This was done for both EQCut and EQCut-A with superpixel granularities 150, 300 and 600.

The performances are given as precision-recall curves and maximum F1 measures for the Judd and MSRA10k datasets in Fig. 3. The Pascal1500 dataset was excluded from these comparisons since we could not obtain the saliency maps or code of all compared methods for this dataset. Nevertheless, the Judd and MSRA10k datasets cover a large variety of cases: the former includes the most difficult salient objects, whereas the latter includes the simplest cases of the salient object detection task. On both datasets we observe a consistent improvement over EQCut; moreover, on the Judd dataset EQCut-A performs better than all state-of-the-art methods, and on the MSRA10k dataset EQCut-A is better than all methods except DRFI, with which it is comparable.

F. Computational Complexity

In terms of computational complexity, the proposed affinity calculations do not introduce any significant burden over EQCut. For the saliency calculations on the SOD dataset, the increase in average runtime of EQCut-A over EQCut, and hence the total increase in computational complexity, is only around 11%.

IV. CONCLUSION

In this paper, we have proposed a graph affinity and parameter learning method for salient object detection. Specifically, the linear combination coefficients of several affinity functions and the parameters of these functions are learned jointly, such that the resulting affinity matrix leads to better saliency results with the exploited salient object detection method, EQCut. The learnt affinities lead to a consistent improvement over EQCut in precision-recall curves and maximum F1 measures, without any significant increase in computational complexity. Moreover, EQCut with the learnt affinities performs best or comparably to the state of the art in salient object segmentation.

REFERENCES

[1] S. K. Ungerleider and G. Leslie, "Mechanisms of visual attention in the human cortex," Annual Review of Neuroscience, vol. 23, no. 1.
[2] L. Itti, C. Koch, and E. Niebur, "A model of saliency-based visual attention for rapid scene analysis," IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 11.
[3] J. Harel, C. Koch, and P. Perona, "Graph-based visual saliency," Conference on Neural Information Processing Systems.
[4] I. Rigas, G. Economou and S. Fotopoulos, "Efficient modelling of visual saliency based on local sparse representation and the use of Hamming distance," Computer Vision and Image Understanding, vol. 134, pp. 33-45.
[5] D. Gao, V. Mahadevan, and N. Vasconcelos, "The discriminant center-surround hypothesis for bottom-up saliency," Conference on Neural Information Processing Systems.
[6] T. Judd, K. Ehinger, F. Durand, and A. Torralba, "Learning to predict where humans look," in Proc. IEEE International Conference on Computer Vision.
[7] J. Han, L. Sun, X. Hu, J. Han and L. Shao, "Spatial and temporal visual attention prediction in videos using eye movement data," Neurocomputing, vol. 145.
[8] X. Li, H. Lu, L. Zhang, X. Ruan and M.-H. Yang, "Saliency detection via dense and sparse reconstruction," in Proc. IEEE International Conference on Computer Vision.
[9] M.-M. Cheng, G. Zhang, N. Mitra, X. Huang, and S. Hu, "Global contrast based salient region detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 3.
[10] Q. Yan, L. Xu, J. Shi, and J. Jia, "Hierarchical saliency detection," in Proc. IEEE Conference on Computer Vision and Pattern Recognition.
[11] C. Yang, L. Zhang, H. Lu, X. Ruan, and M.-H. Yang, "Saliency detection via graph-based manifold ranking," IEEE Conference on Computer Vision and Pattern Recognition.
[12] W. Zhu, S. Liang, Y. Wei and J. Sun, "Saliency optimization from robust background detection," IEEE Conference on Computer Vision and Pattern Recognition.
[13] B. Jiang, L. Zhang, H. Lu, C. Yang and M.-H. Yang, "Saliency detection via absorbing Markov chain," IEEE International Conference on Computer Vision.
[14] J. Zhang, S. Sclaroff, Z. Lin, X. Shen, B. Price, and R. Mech, "Minimum barrier salient object detection at 80 FPS," IEEE International Conference on Computer Vision.
[15] Z. Liu, W. Zou, and O. Le Meur, "Saliency tree: A novel saliency detection framework," IEEE Transactions on Image Processing, vol. 23, no. 5.
[16] H. Jiang, J. Wang, Z. Yuan, Y. Wu, N. Zheng, and S. Li, "Salient object detection: A discriminative regional feature integration approach," IEEE Conference on Computer Vision and Pattern Recognition.
[17] C. Aytekin, S. Kiranyaz and M. Gabbouj, "Automatic object segmentation by quantum cuts," International Conference on Pattern Recognition.
[18] C. Aytekin, E. C. Ozan, S. Kiranyaz, and M. Gabbouj, "Visual saliency by extended quantum cuts," IEEE International Conference on Image Processing.
[19] C. Aytekin, S. Kiranyaz and M. Gabbouj, "Quantum mechanics in computer vision: Automatic object extraction," IEEE International Conference on Image Processing.
[20] D. Rudoy, D. B. Goldman, E. Shechtman and L. Zelnik-Manor, "Learning video saliency from human gaze using candidate selection," IEEE Conference on Computer Vision and Pattern Recognition.
[21] H. Hadizadeh and I. V. Bajic, "Saliency-aware video compression," IEEE Transactions on Image Processing, vol. 23, no. 1.
[22] R. Margolin, L. Zelnik-Manor and A. Tal, "Saliency for image manipulation," The Visual Computer, vol. 29, no. 5.
[23] X. Hou and L. Zhang, "Thumbnail generation based on global saliency," Advances in Cognitive Neurodynamics.
[24] M. Casares, S. Velipasalar and A. Pinto, "Light-weight salient foreground detection for embedded smart cameras," Computer Vision and Image Understanding, vol. 114, no. 11.
[25] Z. Li, S. Qin and L. Itti, "Visual attention guided bit allocation in video compression," Image and Vision Computing, vol. 29, no. 1, pp. 1-14.
[26] C. Aytekin, E. Tunali, and S. Oz, "Fast semi-automatic target initialization based on visual saliency for airborne thermal imagery," International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications.
[27] E. Horbert, G. M. García, S. Frintrop and B. Leibe, "Sequence-level object candidates based on saliency for generic object recognition on mobile systems," International Conference on Robotics and Automation.
[28] C. Aytekin, S. Kiranyaz, and M. Gabbouj, "Learning to rank salient segments extracted by multispectral quantum cuts," Pattern Recognition Letters, vol. 72.
[29]
[30] R. Achanta, S. Hemami, F. Estrada and S. Susstrunk, "Frequency-tuned salient region detection," IEEE Conference on Computer Vision and Pattern Recognition.
[31] R. Achanta and S. Susstrunk, "Saliency detection using maximum symmetric surround," International Conference on Image Processing.
[32] W. Zou, K. Kpalma, Z. Liu, and J. Ronsin, "Segmentation driven low-rank matrix recovery for saliency detection," in Proc. British Machine Vision Conference, pp. 1-13.
[33] V. Mohavedi and J. H. Elder, "Design and perceptual validation of performance measures for salient object segmentation," IEEE Computer Vision and Pattern Recognition Workshops.
[34] S. Alpert, M. Galun, R. Basri, and A. Brandt, "Image segmentation by probabilistic bottom-up aggregation and cue integration," IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-8.
[35] G. Li and Y. Yu, "Visual saliency based on multiscale deep features," arXiv preprint.
[36] A. Borji, M.-M. Cheng, H. Jiang and J. Li, "Salient object detection: A survey," arXiv preprint.
[37] R. R. Coifman and S. Lafon, "Diffusion maps," Applied and Computational Harmonic Analysis, vol. 21, no. 1, pp. 5-30.
[38] P. Jiang, N. Vasconcelos, and J. Peng, "Generic promotion of diffusion-based salient object detection," IEEE International Conference on Computer Vision.
[39] D. Zhou, J. Weston, A. Gretton, O. Bousquet, and B. Schölkopf, "Ranking on data manifolds," Advances in Neural Information Processing Systems.
[40] D. Zhou, O. Bousquet, T. N. Lal, J. Weston, and B. Schölkopf, "Learning with local and global consistency," Advances in Neural Information Processing Systems.
[41] A. Borji, "What is a salient object? A dataset and a baseline model for salient object detection," IEEE Transactions on Image Processing, vol. 24, no. 2.
[42] A. Borji, M.-M. Cheng, H. Jiang and J. Li, "Salient object detection: A benchmark," IEEE Transactions on Image Processing, vol. 24, no. 12.
[43] L. M. Manevitz and M. Yousef, "One-class SVMs for document classification," The Journal of Machine Learning Research, vol. 2.
[44] B. Schölkopf, R. Herbrich, and A. J. Smola, "A generalized representer theorem," Computational Learning Theory.
[45] G. D. Poole, "Generalized M-matrices and applications," Mathematics of Computation, vol. 29, no. 131.
[46] B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, Cambridge, MA: MIT Press.
[47] D. B. West, Introduction to Graph Theory (2nd ed.), Upper Saddle River: Prentice Hall.
[48] J. R. Magnus, "On differentiating eigenvalues and eigenvectors," Econometric Theory, vol. 1, no. 2.
[49] W. Zou, K. Kpalma, Z. Liu and J. Ronsin, "Segmentation driven low-rank matrix recovery for saliency detection," British Machine Vision Conference, 2013.


More information

Bilevel Sparse Coding

Bilevel Sparse Coding Adobe Research 345 Park Ave, San Jose, CA Mar 15, 2013 Outline 1 2 The learning model The learning algorithm 3 4 Sparse Modeling Many types of sensory data, e.g., images and audio, are in high-dimensional

More information

Image Quality Assessment Techniques: An Overview

Image Quality Assessment Techniques: An Overview Image Quality Assessment Techniques: An Overview Shruti Sonawane A. M. Deshpande Department of E&TC Department of E&TC TSSM s BSCOER, Pune, TSSM s BSCOER, Pune, Pune University, Maharashtra, India Pune

More information

THE HUMAN visual system interprets the world in a

THE HUMAN visual system interprets the world in a 1150 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 23, NO. 7, JULY 2013 Visual Saliency by Selective Contrast Qi Wang, Yuan Yuan, Senior Member, IEEE, and Pingkun Yan, Senior Member,

More information

STUDYING THE FEASIBILITY AND IMPORTANCE OF GRAPH-BASED IMAGE SEGMENTATION TECHNIQUES

STUDYING THE FEASIBILITY AND IMPORTANCE OF GRAPH-BASED IMAGE SEGMENTATION TECHNIQUES 25-29 JATIT. All rights reserved. STUDYING THE FEASIBILITY AND IMPORTANCE OF GRAPH-BASED IMAGE SEGMENTATION TECHNIQUES DR.S.V.KASMIR RAJA, 2 A.SHAIK ABDUL KHADIR, 3 DR.S.S.RIAZ AHAMED. Dean (Research),

More information

Supplementary material: Strengthening the Effectiveness of Pedestrian Detection with Spatially Pooled Features

Supplementary material: Strengthening the Effectiveness of Pedestrian Detection with Spatially Pooled Features Supplementary material: Strengthening the Effectiveness of Pedestrian Detection with Spatially Pooled Features Sakrapee Paisitkriangkrai, Chunhua Shen, Anton van den Hengel The University of Adelaide,

More information

An Efficient Saliency Based Lossless Video Compression Based On Block-By-Block Basis Method

An Efficient Saliency Based Lossless Video Compression Based On Block-By-Block Basis Method An Efficient Saliency Based Lossless Video Compression Based On Block-By-Block Basis Method Ms. P.MUTHUSELVI, M.E(CSE), V.P.M.M Engineering College for Women, Krishnankoil, Virudhungar(dt),Tamil Nadu Sukirthanagarajan@gmail.com

More information

Detect Globally, Refine Locally: A Novel Approach to Saliency Detection

Detect Globally, Refine Locally: A Novel Approach to Saliency Detection Detect Globally, Refine Locally: A Novel Approach to Saliency Detection Tiantian Wang, Lihe Zhang, Shuo Wang, Huchuan Lu, Gang Yang 2, Xiang Ruan 3, Ali Borji 4 Dalian University of Technology, 2 Northeastern

More information

An Efficient Salient Feature Extraction by Using Saliency Map Detection with Modified K-Means Clustering Technique

An Efficient Salient Feature Extraction by Using Saliency Map Detection with Modified K-Means Clustering Technique International Journal of Computational Engineering & Management, Vol. 15 Issue 5, September 2012 www..org 63 An Efficient Salient Feature Extraction by Using Saliency Map Detection with Modified K-Means

More information

CS 534: Computer Vision Segmentation and Perceptual Grouping

CS 534: Computer Vision Segmentation and Perceptual Grouping CS 534: Computer Vision Segmentation and Perceptual Grouping Ahmed Elgammal Dept of Computer Science CS 534 Segmentation - 1 Outlines Mid-level vision What is segmentation Perceptual Grouping Segmentation

More information

2 Proposed Methodology

2 Proposed Methodology 3rd International Conference on Multimedia Technology(ICMT 2013) Object Detection in Image with Complex Background Dong Li, Yali Li, Fei He, Shengjin Wang 1 State Key Laboratory of Intelligent Technology

More information

Limitations of Matrix Completion via Trace Norm Minimization

Limitations of Matrix Completion via Trace Norm Minimization Limitations of Matrix Completion via Trace Norm Minimization ABSTRACT Xiaoxiao Shi Computer Science Department University of Illinois at Chicago xiaoxiao@cs.uic.edu In recent years, compressive sensing

More information

Diffusion Wavelets for Natural Image Analysis

Diffusion Wavelets for Natural Image Analysis Diffusion Wavelets for Natural Image Analysis Tyrus Berry December 16, 2011 Contents 1 Project Description 2 2 Introduction to Diffusion Wavelets 2 2.1 Diffusion Multiresolution............................

More information

Design of Orthogonal Graph Wavelet Filter Banks

Design of Orthogonal Graph Wavelet Filter Banks Design of Orthogonal Graph Wavelet Filter Banks Xi ZHANG Department of Communication Engineering and Informatics The University of Electro-Communications Chofu-shi, Tokyo, 182-8585 JAPAN E-mail: zhangxi@uec.ac.jp

More information

Image Enhancement Techniques for Fingerprint Identification

Image Enhancement Techniques for Fingerprint Identification March 2013 1 Image Enhancement Techniques for Fingerprint Identification Pankaj Deshmukh, Siraj Pathan, Riyaz Pathan Abstract The aim of this paper is to propose a new method in fingerprint enhancement

More information