Distributed Low-rank Subspace Segmentation

2013 IEEE International Conference on Computer Vision

Ameet Talwalkar (a), Lester Mackey (b), Yadong Mu (c), Shih-Fu Chang (c), Michael I. Jordan (a)
(a) University of California, Berkeley; (b) Stanford University; (c) Columbia University
{ameet, jordan}@cs.berkeley.edu, lmackey@stanford.edu, {muyadong, sfchang}@ee.columbia.edu

Abstract

Vision problems ranging from image clustering to motion segmentation to semi-supervised learning can naturally be framed as subspace segmentation problems, in which one aims to recover multiple low-dimensional subspaces from noisy and corrupted input data. Low-Rank Representation (LRR), a convex formulation of the subspace segmentation problem, is provably and empirically accurate on small problems but does not scale to the massive sizes of modern vision datasets. Moreover, past work aimed at scaling up low-rank matrix factorization is not applicable to LRR given its non-decomposable constraints. In this work, we propose a novel divide-and-conquer algorithm for large-scale subspace segmentation that can cope with LRR's non-decomposable constraints and maintains LRR's strong recovery guarantees. This has immediate implications for the scalability of subspace segmentation, which we demonstrate on a benchmark face recognition dataset and in simulations. We then introduce novel applications of LRR-based subspace segmentation to large-scale semi-supervised learning for multimedia event detection, concept detection, and image tagging. In each case, we obtain state-of-the-art results and order-of-magnitude speed-ups.

1. Introduction

Visual data, though innately high dimensional, often reside in or lie close to a union of low-dimensional subspaces. These subspaces might reflect physical constraints on the objects comprising images and video (e.g., faces under varying illumination [2] or trajectories of rigid objects [24]) or naturally occurring variations in production (e.g., digits hand-written by different individuals [12]). Subspace segmentation techniques model these classes of data by recovering bases for the multiple underlying subspaces [10, 7]. Applications include image clustering [7], segmentation of images, video, and motion [30, 6, 26], and affinity graph construction for semi-supervised learning [32].

One promising, convex formulation of the subspace segmentation problem is the low-rank representation (LRR) program of Liu et al. [17, 18]:

    (Ẑ, Ŝ) = argmin_{Z,S} ‖Z‖_* + λ‖S‖_{2,1}   subject to   M = MZ + S.   (1)

Here, M is an input matrix of datapoints drawn from multiple subspaces, ‖·‖_* is the nuclear norm, ‖·‖_{2,1} is the sum of the column ℓ2 norms, and λ is a parameter that trades off between these penalties. LRR segments the columns of M into subspaces using the solution Ẑ, and, along with its extensions (e.g., LatLRR [19] and NNLRS [32]), admits strong guarantees of correctness and strong empirical performance in clustering and graph construction applications. However, the standard algorithms for solving Eq. (1) are unsuitable for large-scale problems, due to their sequential nature and their reliance on the repeated computation of costly truncated SVDs.

Much of the computational burden in solving LRR stems from the nuclear norm penalty, which is known to encourage low-rank solutions, so one might hope to leverage the large body of past work on parallel and distributed matrix factorization [11, 23, 8, 31, 21] to improve the scalability of LRR.
Unfortunately, these techniques are tailored to optimization problems with losses and constraints that decouple across the entries of the input matrix. This decoupling requirement is violated in the LRR problem due to the M = MZ + S constraint of Eq. (1), and this non-decomposable constraint introduces new algorithmic and analytic challenges that do not arise in decomposable matrix factorization problems. To address these challenges, we develop, analyze, and evaluate a provably accurate divide-and-conquer approach to large-scale subspace segmentation that specifically accounts for the non-decomposable structure of the LRR problem. Our contributions are three-fold:

Algorithm: We introduce a parallel, divide-and-conquer approximation algorithm for LRR that is suitable for large-scale subspace segmentation problems. Scalability is achieved by dividing the original LRR problem into computationally tractable and communication-free subproblems, solving the subproblems in parallel, and combining the results using a technique from randomized matrix approximation. Our algorithm, which we call DFC-LRR, is based on the principles of the Divide-Factor-Combine (DFC) framework [21] for decomposable matrix factorization but can cope with the non-decomposable constraints of LRR.

Analysis: We characterize the segmentation behavior of our new algorithm, showing that DFC-LRR maintains the segmentation guarantees of the original LRR algorithm with high probability, even while enjoying substantial speed-ups over its namesake. Our new analysis features a significant broadening of the original LRR theory to treat the richer class of LRR-type subproblems that arise in DFC-LRR. Moreover, since our ultimate goal is subspace segmentation and not matrix recovery, our theory guarantees correctness under a more substantial reduction of problem complexity than the work of [21] (see Sec. 3.2 for more details).

Applications: We first present results on face clustering and synthetic subspace segmentation to demonstrate that DFC-LRR achieves accuracy comparable to LRR in a fraction of the time. We then propose and validate a novel application of the LRR methodology to large-scale graph-based semi-supervised learning. While LRR has been used to construct affinity graphs for semi-supervised learning in the past [4, 32], prior attempts have failed to scale to the sizes of real-world datasets. Leveraging the favorable computational properties of DFC-LRR, we propose a scalable strategy for constructing such subspace affinity graphs. We apply our methodology to a variety of computer vision tasks (multimedia event detection, concept detection, and image tagging), demonstrating an order-of-magnitude improvement in speed and accuracy that exceeds the state of the art.

The remainder of the paper is organized as follows. In Section 2 we first review the low-rank representation approach to subspace segmentation and then introduce our novel DFC-LRR algorithm. Next, we present our theoretical analysis of DFC-LRR in Section 3. Section 4 highlights the accuracy and efficiency of DFC-LRR on a variety of computer vision tasks. We present subspace segmentation results on simulated and real-world data in Section 4.1. In Section 4.2 we present our novel application of DFC-LRR to graph-based semi-supervised learning problems, and we conclude in Section 5.

Notation: Given a matrix M ∈ R^{m×n} with rank(M) = r, we define U_M Σ_M V_M^⊤ as the compact singular value decomposition (SVD) of M, where Σ_M is a diagonal matrix of the r non-zero singular values and U_M ∈ R^{m×r} and V_M ∈ R^{n×r} are the associated left and right singular vectors of M. We denote the orthogonal projection onto the column space of M as P_M.

2. Divide-and-Conquer Segmentation

In this section, we review the LRR approach to subspace segmentation and present our novel algorithm, DFC-LRR.

2.1. Subspace Segmentation via LRR

In the robust subspace segmentation problem, we observe a matrix M = L_0 + S_0 ∈ R^{m×n}, where the columns of L_0 are datapoints drawn from multiple independent subspaces,¹ and S_0 is a column-sparse outlier matrix. Our goal is to identify the subspace associated with each column of L_0, despite the potentially gross corruption introduced by S_0. An important observation for this task is that the projection matrix V_{L_0} V_{L_0}^⊤ onto the row space of L_0, sometimes termed the shape interaction matrix, is block diagonal whenever the columns of L_0 lie in multiple independent subspaces [10]. Hence, we can achieve accurate segmentation by first recovering the row space of L_0.

¹Subspaces are independent if the dimension of their direct sum is the sum of their dimensions.
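To make this observation concrete, the following sketch (Python/NumPy; our illustration, since the paper's own experiments were implemented in Matlab, and all names here are ours) draws clean data from two independent subspaces and checks that the shape interaction matrix is block diagonal:

```python
import numpy as np

rng = np.random.default_rng(0)
m, r, ns = 50, 3, 20

# Two independent r-dimensional subspaces of R^m (two random r-dimensional
# subspaces are independent with probability one here, since 2r < m).
U1, _ = np.linalg.qr(rng.standard_normal((m, r)))
U2, _ = np.linalg.qr(rng.standard_normal((m, r)))
L0 = np.hstack([U1 @ rng.random((r, ns)), U2 @ rng.random((r, ns))])

# Shape interaction matrix V_{L0} V_{L0}^T from the compact SVD of L0.
_, _, Vt = np.linalg.svd(L0, full_matrices=False)
V = Vt[:2 * r].T            # right singular vectors; rank(L0) = 2r here
P = V @ V.T

# Cross-subspace block vanishes: zero affinity between different subspaces.
print(np.abs(P[:ns, ns:]).max())   # ~1e-15
```

Reading off the nonzero pattern of P then segments the columns exactly, which is why recovering the row space of L_0 suffices.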
The LRR approach of [17] seeks to recover the row space of L_0 by solving the convex optimization problem presented in Eq. (1). Importantly, the LRR solution comes with a guarantee of correctness: the column space of Ẑ is exactly equal to the row space of L_0 whenever certain technical conditions are met [18] (see Sec. 3 for more details).

Moreover, as we will show in this work, LRR is also well-suited to the construction of affinity graphs for semi-supervised learning. In this setting, the goal is to define an affinity graph in which nodes correspond to data points and edge weights exist between nodes drawn from the same subspace. LRR can thus be used to recover the block-sparse structure of the graph's affinity matrix, and these affinities can be used for semi-supervised label propagation.

2.2. Divide-Factor-Combine LRR (DFC-LRR)

We now present our scalable divide-and-conquer algorithm, called DFC-LRR, for LRR-based subspace segmentation. DFC-LRR extends the principles of the DFC framework of [21] to a new non-decomposable problem. The DFC-LRR algorithm is summarized in Algorithm 1, and we next describe each step in further detail.

D step - Divide input matrix into submatrices: DFC-LRR randomly partitions the columns of M into t l-column submatrices, {C_1, ..., C_t}. For simplicity, we assume that t divides n evenly.

F step - Factor submatrices in parallel: DFC-LRR solves t subproblems in parallel. The ith LRR subproblem is of the form

    min_{Z_i, S_i} ‖Z_i‖_* + λ‖S_i‖_{2,1}   subject to   C_i = MZ_i + S_i,   (2)

where the input matrix M is used as a dictionary but only a subset of columns is used as the observations.² A typical LRR algorithm can be easily modified to solve Eq. (2) and will return a low-rank estimate Ẑ_i in factored form.

²An alternative formulation involves replacing both instances of M with C_i in Eq. (1). The resulting low-rank estimate Ẑ_i would have dimensions l × l, and the C step of DFC-LRR would compute a low-rank approximation of the block-diagonal matrix diag(Ẑ_1, Ẑ_2, ..., Ẑ_t).

C step - Combine submatrix estimates: DFC-LRR generates a final approximation Ẑ_proj to the low-rank LRR solution Ẑ by projecting [Ẑ_1, ..., Ẑ_t] onto the column space of Ẑ_1. This column projection technique is commonly used to produce randomized low-rank matrix factorizations [15] and was also employed by the DFC-PROJ algorithm of [21].

Runtime: As noted in [21], many state-of-the-art solvers for nuclear-norm regularized problems like Eq. (1) have Ω(m n k_M) per-iteration time complexity due to the rank-k_M truncated SVD required on each iteration. DFC-LRR reduces this per-iteration complexity significantly and requires just O(m l k_{C_i}) time for the ith subproblem. Performing the subsequent column projection step is relatively cheap computationally, since an LRR solver can return its solution in factored form. Indeed, if we define k ≜ max_i k_{C_i}, then the column projection step of DFC-LRR requires only O(mk² + lk²) time.

Algorithm 1 DFC-LRR
  Input: M, t
  {C_i}_{1≤i≤t} = SAMPLECOLS(M, t)
  do in parallel
    Ẑ_1 = LRR(C_1, M)
    ...
    Ẑ_t = LRR(C_t, M)
  end do
  Ẑ_proj = COLPROJ([Ẑ_1, ..., Ẑ_t], Ẑ_1)
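For concreteness, a minimal sketch of the three steps, assuming a black-box solver lrr(C, M) for Eq. (2) that returns Ẑ_i as a dense matrix (a production implementation would keep the factored form and dispatch the F step to separate workers; the function and variable names are ours):

```python
import numpy as np

def dfc_lrr(M, t, lrr):
    """Divide-Factor-Combine LRR; `lrr(C, M)` solves Eq. (2) for columns C."""
    n = M.shape[1]
    blocks = np.array_split(np.random.permutation(n), t)  # D step

    # F step: the t subproblems are independent, so in practice each call
    # runs on its own core or machine.
    Z_hats = [lrr(M[:, idx], M) for idx in blocks]

    # C step: project [Z_1, ..., Z_t] onto the column space of Z_1,
    # here via a thin QR factorization of Z_1.
    Q, _ = np.linalg.qr(Z_hats[0])
    Z_proj = Q @ (Q.T @ np.hstack(Z_hats))

    # Restore the original column order undone by the random partition.
    Z_out = np.empty_like(Z_proj)
    Z_out[:, np.concatenate(blocks)] = Z_proj
    return Z_out
```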
3. Theoretical Analysis

Despite the significant reduction in computational complexity, DFC-LRR provably maintains the strong theoretical guarantees of the LRR algorithm. To make this statement precise, we first review the technical conditions for accurate row space recovery required by LRR.

3.1. Conditions for LRR Correctness

The LRR analysis of Liu et al. [18] relies on two key quantities, the rank of the clean data matrix L_0 and the coherence [22] of the singular vectors V_{L_0}. We combine these properties into a single definition:

Definition 1 ((μ, r)-coherence). A matrix L ∈ R^{m×n} is (μ, r)-coherent if rank(L) = r and (n/r) ‖V_L^⊤‖²_{2,∞} ≤ μ, where ‖·‖_{2,∞} is the maximum column ℓ2 norm.³

Intuitively, when the coherence μ is small, information is well-distributed across the rows of a matrix, and the row space is easier to recover from outlier corruption. Using these properties, Liu et al. [18] established the following recovery guarantee for LRR.

Theorem 2 ([18]). Suppose that M = L_0 + S_0 ∈ R^{m×n} where S_0 is supported on γn columns, L_0 is (μ/(1−γ), r)-coherent, and L_0 and S_0 have independent column support with range(L_0) ∩ range(S_0) = {0}. Let Ẑ be a solution returned by LRR. Then there exists a constant γ* (depending on μ and r) for which the column space of Ẑ exactly equals the row space of L_0 whenever λ = 3/(7‖M‖√(γ* n)) and γ ≤ γ*.

In other words, LRR can exactly recover the row space of L_0 even when a constant fraction γ of the columns has been corrupted by outliers. As the rank r and coherence μ shrink, γ* grows, allowing greater outlier tolerance.

3.2. High Probability Subspace Segmentation

Our main theoretical result shows that, with high probability and under the same conditions that guarantee the accuracy of LRR, DFC-LRR also exactly recovers the row space of L_0. Recall that in our independent subspace setting accurate row space recovery is tantamount to correct segmentation of the columns of L_0. The proof of our result, which generalizes the LRR analysis of [18] to a broader class of optimization problems and adapts the DFC analysis of [21], can be found in the appendix.

Theorem 3. Fix any failure probability δ > 0. Under the conditions of Thm. 2, let Ẑ_proj be a solution returned by DFC-LRR. Then there exists a constant γ* (depending on μ and r) for which the column space of Ẑ_proj exactly equals the row space of L_0 whenever λ = 3/(7‖M‖√(γ* l)) for each DFC-LRR subproblem, γ ≤ γ*, and t = n/l for l ≥ c r μ log(4n/δ)/(γ* − γ)² and c a fixed constant larger than 1.

³Although [18] uses the notion of column coherence to analyze LRR, we work with the closely related notion of (μ, r)-coherence for ease of notation in our proofs. Moreover, we note that if a rank-r matrix L ∈ R^{m×n} is supported on (1−γ)n columns, then the column coherence of V_L is μ if and only if V_L is (μ/(1−γ), r)-coherent.
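Definition 1 is simple to evaluate for an explicit matrix; a small sketch (our code, with names ours) returning the (μ, r) pair of Definition 1:

```python
import numpy as np

def coherence(L, tol=1e-10):
    """Return (mu, r) such that L is (mu, r)-coherent per Definition 1."""
    n = L.shape[1]
    _, s, Vt = np.linalg.svd(L, full_matrices=False)
    r = int((s > tol * s[0]).sum())                 # numerical rank
    V = Vt[:r].T                                    # V_L, an n x r matrix
    # (n/r) * max_i ||i-th row of V_L||_2^2, i.e., (n/r)||V_L^T||^2_{2,inf}.
    mu = (n / r) * (np.linalg.norm(V, axis=1) ** 2).max()
    return mu, r
```

On incoherent matrices, such as the random subspace data above, μ stays close to 1; it approaches its maximum n/r when the row space concentrates on a few columns.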

[Figure 1: Results on synthetic data (reported results are averages over 10 trials). (a) Phase transition of LRR and DFC-LRR: success rate vs. outlier fraction γ, for LRR and for DFC-LRR sampling 10% and 25% of the columns. (b, c) Timing results (in seconds) of LRR and DFC-LRR as functions of γ and of the number of data points, respectively.]

Thm. 3 establishes that, like LRR, DFC-LRR can tolerate a constant fraction of its data points being corrupted and still recover the correct subspace segmentation of the clean data points with high probability. When the number of datapoints n is large, solving LRR directly may be prohibitive, but DFC-LRR need only solve a collection of small, tractable subproblems. Indeed, Thm. 3 guarantees high probability recovery for DFC-LRR even when the subproblem size l is logarithmic in n. The corresponding reduction in computational complexity allows DFC-LRR to scale to large problems with little sacrifice in accuracy. Notably, this column sampling complexity is better than that established by [21] in the matrix factorization setting: we require O(r log n) columns sampled, while [21] requires in the worst case Ω(n) columns for matrix completion and Ω((r log n)²) for robust matrix factorization.

4. Experiments

We now explore the empirical performance of DFC-LRR on a variety of simulated and real-world datasets, first for the traditional task of robust subspace segmentation and next for the more complex task of graph-based semi-supervised learning. Our experiments are designed to show the effectiveness of DFC-LRR both when the theory of Section 3 holds and when it is violated. Our synthetic datasets satisfy the theoretical assumptions of low rank, incoherence, and a small fraction of corrupted columns, while our real-world datasets violate these criteria. For all of our experiments we use the inexact Augmented Lagrange Multiplier (ALM) algorithm of [17] as our base LRR algorithm. For the subspace segmentation experiments, we set the regularization parameter to the values suggested in previous works [18, 17], while in our semi-supervised learning experiments we set it to 1/√(max(m, n)) as suggested in prior work. In all experiments we report parallel running times for DFC-LRR.⁴ All experiments were implemented in Matlab. The simulation studies were run on an x86-64 architecture using a single 2.60 GHz core and 30GB of main memory, while the real data experiments were performed on an x86-64 architecture equipped with a 2.67 GHz 12-core CPU and 64GB of main memory.

⁴I.e., the time of the longest running subproblem plus the time required to combine submatrix estimates via column projection.

4.1. Subspace Segmentation: LRR vs. DFC-LRR

We first aim to verify that DFC-LRR produces accuracy comparable to LRR in significantly less time, both in synthetic and real-world settings. We focus on the standard robust subspace segmentation task of identifying the subspace associated with each input datapoint.

4.1.1 Simulations

To construct our synthetic robust subspace segmentation datasets, we first generate n_s datapoints from each of k independent r-dimensional subspaces of R^m, in a manner similar to [18]. For each subspace i, we independently select a basis U_i uniformly from all matrices in R^{m×r} with orthonormal columns and a matrix T_i ∈ R^{r×n_s} of independent entries each distributed uniformly in [0, 1].
We form the matrix X_i ∈ R^{m×n_s} of samples from subspace i via X_i = U_i T_i and let X_0 = [X_1 ... X_k] ∈ R^{m×k n_s}. For a given outlier fraction γ we next generate an additional n_o = (γ/(1−γ)) k n_s independent outlier samples, denoted by S ∈ R^{m×n_o}. Each outlier sample has independent N(0, σ²) entries, where σ is the average absolute value of the entries of the k n_s original samples. We create the input matrix M ∈ R^{m×n}, where n = k n_s + n_o, as a random permutation of the columns of [X_0 S]. In our first experiments we fix k = 3, m = 1500, r = 5, and n_s = 200, set the regularizer to λ = 0.2, and vary the fraction of outliers. We measure with what frequency LRR and DFC-LRR are able to recover the row space of X_0 and identify the outlier columns in S, using the same criterion as defined in [18].⁵

⁵Success is determined by whether the oracle constraints of Eq. (8) in the Appendix are satisfied within a fixed tolerance.
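A sketch of this generator in NumPy (our code, following the description above; names ours):

```python
import numpy as np

def synth_data(k=3, m=1500, r=5, ns=200, gamma=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # k independent r-dimensional subspaces of R^m: X_i = U_i T_i.
    Xs = []
    for _ in range(k):
        U, _ = np.linalg.qr(rng.standard_normal((m, r)))  # random orthonormal basis
        T = rng.random((r, ns))                           # entries uniform on [0, 1]
        Xs.append(U @ T)
    X0 = np.hstack(Xs)

    # n_o = (gamma / (1 - gamma)) * k * ns outlier columns with N(0, sigma^2)
    # entries, where sigma is the mean absolute value of the clean entries.
    no = int(round(gamma / (1 - gamma) * k * ns))
    sigma = np.abs(X0).mean()
    S = sigma * rng.standard_normal((m, no))

    perm = rng.permutation(k * ns + no)                   # random column order
    M = np.hstack([X0, S])[:, perm]
    is_outlier = perm >= k * ns                           # True for outlier columns
    return M, is_outlier
```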

[Figure 3: Trade-off between computation and segmentation accuracy on face recognition experiments. All results are obtained by averaging across 100 independent runs. (a) Run time of LRR and DFC-LRR with varying numbers of subproblems. (b) Segmentation accuracy for these same experiments.]

Figure 1(a) shows average performance over 10 trials. We see that DFC-LRR performs quite well, as the gaps in the phase transitions between LRR and DFC-LRR are small when sampling 10% of the columns (i.e., t = 10) and are virtually non-existent when sampling 25% of the columns (i.e., t = 4). Figure 1(b) shows corresponding timing results for the accuracy results presented in Figure 1(a). These timing results show substantial speedups in DFC-LRR relative to LRR, with the modest tradeoff in accuracy denoted in Figure 1(a). Note that we only report timing results for values of γ for which DFC-LRR was successful in all 10 trials, i.e., for which the success rate equaled 1.0 in Figure 1(a). Moreover, Figure 1(c) shows timing results using the same parameter values, except with a fixed fraction of outliers (γ = 0.1) and a variable number of samples in each subspace, with n_s ranging upward from 75. These timing results also show speedups with minimal loss of accuracy: in all of these timing experiments, LRR and DFC-LRR were successful in all trials using the same criterion defined in [18] and used in our phase transition experiments of Figure 1(a).

4.1.2 Face Clustering

[Figure 2: Exemplar face images from Extended Yale Database B. Each row shows randomly selected images for a human subject.]

We next demonstrate the comparable quality and increased performance of DFC-LRR relative to LRR on real data, namely, a subset of Extended Yale Database B,⁶ a standard face benchmarking dataset. Following the experimental setup in [17], 640 frontal face images of 10 human subjects are chosen, each of which is resized to 48 × 42 pixels and forms a 2016-dimensional feature vector. As noted in previous work [3], a low-dimensional subspace can be effectively used to model face images from one person, and hence face clustering is a natural application of subspace segmentation. Moreover, as illustrated in Figure 2, a significant portion of the faces in this dataset are corrupted by shadows, and hence this collection of images is an ideal benchmark for robust subspace segmentation.

As in [17], we use the feature vector representation of these images to create a dictionary matrix, M, and run both LRR and DFC-LRR with the parameter λ set to the value suggested in [17]. Next, we use the resulting low-rank coefficient matrix Ẑ to compute an affinity matrix U_Ẑ U_Ẑ^⊤, where U_Ẑ contains the top left singular vectors of Ẑ. The affinity matrix is used to cluster the data into k = 10 clusters (corresponding to the 10 human subjects) via spectral embedding (to obtain a 10D feature representation) followed by k-means.
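A compact sketch of this clustering pipeline (our code; we take absolute values of the affinity before the spectral step, and scikit-learn's k-means stands in for whatever implementation the paper used):

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_from_Z(Z_hat, k=10, rank=None):
    """Affinity U_Z U_Z^T from the top left singular vectors of Z_hat,
    then spectral embedding into k dimensions followed by k-means."""
    U, s, _ = np.linalg.svd(Z_hat, full_matrices=False)
    if rank is None:
        rank = int((s > 1e-6 * s[0]).sum())      # keep numerically nonzero part
    Uz = U[:, :rank]
    W = np.abs(Uz @ Uz.T)                        # nonnegative affinity (abs is our choice)

    # Normalized-Laplacian spectral embedding into k dimensions.
    d = np.maximum(W.sum(axis=1), 1e-12)
    Dinv = 1.0 / np.sqrt(d)
    L = np.eye(len(W)) - Dinv[:, None] * W * Dinv[None, :]
    _, evecs = np.linalg.eigh(L)                 # ascending eigenvalues
    return KMeans(n_clusters=k, n_init=10).fit_predict(evecs[:, :k])
```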
Following [17], the comparison of different clustering methods relies on segmentation accuracy: each of the 10 clusters is assigned a label based on majority vote of the ground truth labels of the points assigned to the cluster. The segmentation accuracy is then computed by averaging the percentage of correctly classified data over all classes. Figures 3(a) and 3(b) show the computation time and the segmentation accuracy, respectively, for LRR and for DFC-LRR with varying numbers of subproblems (i.e., values of t).

⁶leekc/extyaledatabase

On this relatively small data set (n = 640 faces), LRR requires over 10 minutes to converge. DFC-LRR demonstrates a roughly linear computational speedup as a function of t, comparable accuracy to LRR for smaller values of t, and a quite gradual decrease in accuracy for larger t.

4.2. Graph-based Semi-Supervised Learning

Graph representations, in which samples are vertices and weighted edges express affinity relationships between samples, are crucial in various computer vision tasks. Classical graph construction methods separately calculate the outgoing edges for each sample. This local strategy makes the graph vulnerable to contaminated data or outliers. Recent work in computer vision has illustrated the utility of global graph construction strategies using graph Laplacian [9] or matrix low-rank [32] based regularizers. L1 regularization has also been effectively used to encourage sparse graph construction [5, 13]. Building upon the success of global construction methods and noting the connection between subspace segmentation and graph construction as described in Section 2.1, we present a novel application of the low-rank representation methodology, relying on our DFC-LRR algorithm to scalably yield a sparse, low-rank graph (SLR-graph). We present a variety of results on large-scale semi-supervised learning visual classification tasks and provide a detailed comparison with leading baseline algorithms.

4.2.1 Benchmarking Data

We adopt the following three large-scale benchmarks:

Columbia Consumer Video (CCV) Content Detection: Compiled to stimulate research on recognizing highly-diverse visual content in unconstrained videos, this dataset consists of 9317 YouTube videos over 20 semantic categories (e.g., baseball, beach, music performance). Three popular audio/visual features (5000-D SIFT, 5000-D STIP, and 4000-D MFCC) are extracted.

MED12 Multimedia Event Detection: The MED12 video corpus consists of 150K multimedia videos, with an average duration of 2 minutes, and is used for detecting 20 specific semantic events. For each event, 130 to 367 videos are provided as positive examples, and the remainder of the videos are null videos that do not correspond to any event. In this work, we keep all positive examples and sample 10K null videos, resulting in a dataset of 13,876 videos. We extract six features from each video, first at sampled frames and then accumulated to obtain video-level representations. The features are either visual (1000-D sparse-SIFT, 1000-D dense-SIFT, 1500-D color-SIFT, STIP), audio (2000-D MFCC), or semantic features (2659-D CLASSEME [25]).

NUS-WIDE-Lite Image Tagging: NUS-WIDE is among the largest available image tagging benchmarks, consisting of over 269K crawled images from Flickr that are associated with over 5K user-provided tags. Ground-truth labels are manually provided for 81 selected concept tags. We generate a lite version by sampling 20K images. For each image, 128-D wavelet texture, 225-D block-wise LAB-based color moments, and 500-D bag of visual words are extracted, normalized, and finally concatenated to form a single feature representation for the image.

4.2.2 Graph Construction Algorithms

The three graph construction schemes we evaluate are described below.
Note that we exclude other baselines (e.g., NNLRS [32], LLE graph [28], L1-graph [5]) due to either scalability concerns or because prior work has already demonstrated inferior performance relative to the SPG algorithm defined below [32].

knn-graph: We construct a nearest neighbor graph by connecting (via undirected edges) each vertex to its k nearest neighbors in terms of ℓ2 distance in the specified feature space. Exponential weights are associated with edges, i.e., w_ij = exp(−d²_ij/σ²), where d_ij is the distance between x_i and x_j and σ is an empirically-tuned parameter [27].

SPG: Cheng et al. [5] proposed a noise-resistant L1-graph, which encourages sparse vertex connectedness, motivated by the work of sparse representation [29]. Subsequent work, entitled sparse probability graph (SPG) [13], enforced positive graph weights. Following the approach of [32], we implemented a variant of SPG by solving the following optimization problem for each sample:

    min_{w_x} ‖x − D_x w_x‖² + α‖w_x‖_1,   s.t.   w_x ≥ 0,   (3)

where x is a feature representation of a sample and D_x is the basis matrix for x constructed from its n_k nearest neighbors. We use an open-source tool to solve this non-negative Lasso problem.

SLR-graph: Our novel graph construction method contains two steps: first, LRR or DFC-LRR is performed on the entire data set to recover the intrinsic low-rank clustering structure. We then treat the resulting low-rank coefficient matrix Z as an affinity matrix, and for sample x_i, the n_k samples with largest affinities to x_i are selected to form a basis matrix and used to solve the SPG optimization described by Problem (3). The resulting non-negative coefficients (typically sparse owing to the ℓ1 regularization term on w_x in (3)) are used to define the graph.
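As a self-contained illustration, Problem (3) can be solved by projected ISTA; the following sketch is our implementation, not the open-source tool used in the paper:

```python
import numpy as np

def nonneg_lasso(x, D, alpha=0.05, iters=500):
    """min_w ||x - D w||^2 + alpha * ||w||_1  s.t. w >= 0  (Problem (3)),
    via projected ISTA with step size 1/L, L = 2 ||D||_2^2."""
    w = np.zeros(D.shape[1])
    step = 1.0 / (2 * np.linalg.norm(D, 2) ** 2 + 1e-12)
    for _ in range(iters):
        grad = 2 * D.T @ (D @ w - x)            # gradient of the quadratic term
        # On w >= 0 the l1 penalty is linear, so the prox step reduces to a
        # shift by step*alpha followed by projection onto the orthant.
        w = np.maximum(0.0, w - step * (grad + alpha))
    return w
```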

[Figure 4: Trade-off between computation and accuracy for the SLR-graph on the CCV dataset. (a) Wall time of LRR and DFC-LRR with varying numbers of subproblems. (b) mAP scores for these same experiments.]

[Table 1: Mean average precision (mAP) (0-1) scores for the various graph construction methods (knn-graph, SPG, SLR-graph). DFC-LRR-10 is performed for SLR-graph. The best mAP score for each feature is highlighted in bold. (a) CCV: SIFT, STIP, MFCC. (b) MED12: color-SIFT, dense-SIFT, sparse-SIFT, MFCC, CLASSEME, STIP. (c) NUS-WIDE-Lite.]

4.2.3 Experimental Design

For each benchmarking dataset, we first construct graphs by treating sample images/videos as vertices and using the three algorithms outlined in Section 4.2.2 to create (sparse) weighted edges between vertices. For fair comparison, we use the same parameter settings, namely α = 0.05 and n_k = 500, for both SPG and SLR-graph. Moreover, we set k = 40 for knn-graph after tuning over the range k = 10 through k = 60.

We then use a given graph structure to perform semi-supervised label propagation using an efficient label propagation algorithm [27] that enjoys a closed-form solution and often achieves state-of-the-art performance. We perform a separate label propagation for each category in our benchmark, i.e., we run a series of 20 binary classification label propagation experiments for CCV/MED12 and 81 experiments for NUS-WIDE-Lite. For each category, we randomly select half of the samples as training points (and use their ground truth labels for label propagation) and use the remaining half as a test set. We repeat this process 20 times for each category with different random splits. Finally, we compute Mean Average Precision (mAP) based on the results on the test sets across all runs of label propagation.

4.2.4 Experimental Results

We first performed experiments using the CCV benchmark, the smallest of our datasets, to explore the tradeoff between computation and accuracy when using DFC-LRR as part of our proposed SLR-graph. Figure 4(a) presents the time required to run SLR-graph with LRR versus DFC-LRR with three different numbers of subproblems (t = 5, 10, 15), while Figure 4(b) presents the corresponding accuracy results. The figures show that DFC-LRR performs comparably to LRR for smaller values of t, and performance gradually degrades for larger t. Moreover, DFC-LRR is up to two orders of magnitude faster and achieves superlinear speedups relative to LRR.⁹ Given the scalability issues of LRR on this modest-sized dataset, along with the comparable accuracy of DFC-LRR, we ran SLR-graph exclusively with DFC-LRR (t = 10) for our two larger datasets.

Table 1 summarizes the results of our semi-supervised learning experiments using the three graph construction techniques defined in Section 4.2.2. The results show that our proposed SLR-graph approach leads to significant performance gains in terms of mAP across all benchmarking datasets for the vast majority of features. These results demonstrate the benefit of enforcing both low-rankedness and sparsity during graph construction. Moreover, conventional low-rank oriented algorithms, e.g., [32, 16], would be computationally infeasible on our benchmarking datasets, thus highlighting the utility of employing DFC's divide-and-conquer approach to generate a scalable algorithm.

⁹We restricted the maximum number of internal LRR iterations to 500 to ensure that LRR ran to completion in less than two days.
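As a concrete illustration of the label propagation step of Sec. 4.2.3, the following sketch implements a standard closed-form propagation of this family (our code; a generic normalized-Laplacian variant used as a stand-in for the particular algorithm of [27]):

```python
import numpy as np

def propagate(W, train_idx, y_train, alpha=0.99):
    """Closed-form propagation F = (I - alpha * S)^{-1} Y with the
    normalized affinity S = D^{-1/2} W D^{-1/2}."""
    n = W.shape[0]
    d = np.maximum(W.sum(axis=1), 1e-12)
    Dinv = 1.0 / np.sqrt(d)
    S = Dinv[:, None] * W * Dinv[None, :]
    Y = np.zeros(n)
    Y[train_idx] = y_train              # seed labels (+1/-1) on the training half
    return np.linalg.solve(np.eye(n) - alpha * S, Y)
```

For each binary category, the propagated scores on the held-out half are ranked to compute average precision.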

5. Conclusion

Our primary goal in this work was to introduce a provably accurate algorithm suitable for large-scale low-rank subspace segmentation. While some contemporaneous work [1] also aims at scalable subspace segmentation, this method offers no guarantee of correctness. In contrast, DFC-LRR provably preserves the theoretical recovery guarantees of the LRR program. Moreover, our divide-and-conquer approach achieves empirical accuracy comparable to state-of-the-art methods while obtaining linear to superlinear computational gains, both on standard subspace segmentation tasks and on novel applications to semi-supervised learning. DFC-LRR also lays the groundwork for scaling up LRR derivatives known to offer improved performance, e.g., LatLRR in the setting of standard subspace segmentation and NNLRS in the graph-based semi-supervised learning setting. The same techniques may prove useful in developing scalable approximations to other convex formulations for subspace segmentation, e.g., [20].

References

[1] A. Adler, M. Elad, and Y. Hel-Or. Probabilistic subspace clustering via sparse representations. IEEE Signal Process. Lett., 20(1):63-66, 2013.
[2] R. Basri and D. Jacobs. Lambertian reflectance and linear subspaces. IEEE Trans. Pattern Anal. Mach. Intell., 25(2):218-233, 2003.
[3] E. J. Candès, X. Li, Y. Ma, and J. Wright. Robust principal component analysis? Journal of the ACM, 58(3):1-37, 2011.
[4] B. Cheng, G. Liu, J. Wang, Z. Huang, and S. Yan. Multi-task low-rank affinity pursuit for image segmentation. In ICCV, 2011.
[5] B. Cheng, J. Yang, S. Yan, Y. Fu, and T. S. Huang. Learning with l1-graph for image analysis. IEEE Transactions on Image Processing, 19(4), 2010.
[6] J. Costeira and T. Kanade. A multibody factorization method for independently moving objects. International Journal of Computer Vision, 29(3), 1998.
[7] E. Elhamifar and R. Vidal. Sparse subspace clustering. In CVPR, 2009.
[8] F. Niu, B. Recht, C. Ré, and S. J. Wright. Hogwild!: A lock-free approach to parallelizing stochastic gradient descent. In NIPS, 2011.
[9] S. Gao, I. W. H. Tsang, L. T. Chia, and P. Zhao. Local features are not lonely - Laplacian sparse coding for image classification. In CVPR, 2010.
[10] C. W. Gear. Multibody grouping from motion images. Int. J. Comput. Vision, 29, August 1998.
[11] R. Gemulla, E. Nijkamp, P. J. Haas, and Y. Sismanis. Large-scale matrix factorization with distributed stochastic gradient descent. In KDD, 2011.
[12] T. Hastie and P. Simard. Metrics and models for handwritten character recognition. Statistical Science, 13(1):54-65, 1998.
[13] R. He, W.-S. Zheng, B.-G. Hu, and X.-W. Kong. Nonnegative sparse coding for discriminative semi-supervised learning. In CVPR, 2011.
[14] W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13-30, 1963.
[15] S. Kumar, M. Mohri, and A. Talwalkar. On sampling-based approximate spectral decomposition. In ICML, 2009.
[16] Z. Lin, A. Ganesh, J. Wright, L. Wu, M. Chen, and Y. Ma. Fast convex optimization algorithms for exact recovery of a corrupted low-rank matrix. UIUC Technical Report UILU-ENG, 2009.
[17] G. Liu, Z. Lin, and Y. Yu. Robust subspace segmentation by low-rank representation. In ICML, 2010.
[18] G. Liu, H. Xu, and S. Yan. Exact subspace segmentation and outlier detection by low-rank representation. arXiv preprint, v2 [cs.IT].
[19] G. Liu and S. Yan. Latent low-rank representation for subspace segmentation and feature extraction. In ICCV, 2011.
[20] G. Liu and S. Yan. Active subspace: Toward scalable low-rank learning. Neural Computation, 24, 2012.
[21] L. Mackey, A. Talwalkar, and M. I. Jordan. Divide-and-conquer matrix factorization. In NIPS, 2011.
[22] B. Recht. A simpler approach to matrix completion. arXiv preprint, v2 [cs.IT].
[23] B. Recht and C. Ré. Parallel stochastic gradient algorithms for large-scale matrix completion. In Optimization Online, 2011.
[24] C. Tomasi and T. Kanade. Shape and motion from image streams under orthography. International Journal of Computer Vision, 9(2):137-154, 1992.
[25] L. Torresani, M. Szummer, and A. W. Fitzgibbon. Efficient object category recognition using classemes. In ECCV, 2010.
[26] R. Vidal, Y. Ma, and S. Sastry. Generalized principal component analysis (GPCA). IEEE Trans. Pattern Anal. Mach. Intell., 27(12), 2005.
[27] F. Wang and C. Zhang. Label propagation through linear neighborhoods. In ICML, 2006.
[28] J. Wang, F. Wang, C. Zhang, H. C. Shen, and L. Quan. Linear neighborhood propagation and its applications. IEEE Trans. Pattern Anal. Mach. Intell., 31(9), 2009.
[29] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma. Robust face recognition via sparse representation. IEEE Trans. Pattern Anal. Mach. Intell., 31(2):210-227, 2009.
[30] A. Yang, J. Wright, Y. Ma, and S. Sastry. Unsupervised segmentation of natural images via lossy data compression. Computer Vision and Image Understanding, 110(2), 2008.
[31] H.-F. Yu, C.-J. Hsieh, S. Si, and I. Dhillon. Scalable coordinate descent approaches to parallel matrix factorization for recommender systems. In ICDM, 2012.
[32] L. Zhuang, H. Gao, Z. Lin, Y. Ma, X. Zhang, and N. Yu. Non-negative low rank and sparse graph for semi-supervised learning. In CVPR, 2012.


ONLINE ROBUST PRINCIPAL COMPONENT ANALYSIS FOR BACKGROUND SUBTRACTION: A SYSTEM EVALUATION ON TOYOTA CAR DATA XINGQIAN XU THESIS

ONLINE ROBUST PRINCIPAL COMPONENT ANALYSIS FOR BACKGROUND SUBTRACTION: A SYSTEM EVALUATION ON TOYOTA CAR DATA XINGQIAN XU THESIS ONLINE ROBUST PRINCIPAL COMPONENT ANALYSIS FOR BACKGROUND SUBTRACTION: A SYSTEM EVALUATION ON TOYOTA CAR DATA BY XINGQIAN XU THESIS Submitted in partial fulfillment of the requirements for the degree of

More information

Hyperspectral Data Classification via Sparse Representation in Homotopy

Hyperspectral Data Classification via Sparse Representation in Homotopy Hyperspectral Data Classification via Sparse Representation in Homotopy Qazi Sami ul Haq,Lixin Shi,Linmi Tao,Shiqiang Yang Key Laboratory of Pervasive Computing, Ministry of Education Department of Computer

More information

Cluster Analysis (b) Lijun Zhang

Cluster Analysis (b) Lijun Zhang Cluster Analysis (b) Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Grid-Based and Density-Based Algorithms Graph-Based Algorithms Non-negative Matrix Factorization Cluster Validation Summary

More information

Blockwise Matrix Completion for Image Colorization

Blockwise Matrix Completion for Image Colorization Blockwise Matrix Completion for Image Colorization Paper #731 Abstract Image colorization is a process of recovering the whole color from a monochrome image given that a portion of labeled color pixels.

More information

FMA901F: Machine Learning Lecture 3: Linear Models for Regression. Cristian Sminchisescu

FMA901F: Machine Learning Lecture 3: Linear Models for Regression. Cristian Sminchisescu FMA901F: Machine Learning Lecture 3: Linear Models for Regression Cristian Sminchisescu Machine Learning: Frequentist vs. Bayesian In the frequentist setting, we seek a fixed parameter (vector), with value(s)

More information

Big-data Clustering: K-means vs K-indicators

Big-data Clustering: K-means vs K-indicators Big-data Clustering: K-means vs K-indicators Yin Zhang Dept. of Computational & Applied Math. Rice University, Houston, Texas, U.S.A. Joint work with Feiyu Chen & Taiping Zhang (CQU), Liwei Xu (UESTC)

More information

A Modular k-nearest Neighbor Classification Method for Massively Parallel Text Categorization

A Modular k-nearest Neighbor Classification Method for Massively Parallel Text Categorization A Modular k-nearest Neighbor Classification Method for Massively Parallel Text Categorization Hai Zhao and Bao-Liang Lu Department of Computer Science and Engineering, Shanghai Jiao Tong University, 1954

More information

Fast and Robust Earth Mover s Distances

Fast and Robust Earth Mover s Distances Fast and Robust Earth Mover s Distances Ofir Pele and Michael Werman School of Computer Science and Engineering The Hebrew University of Jerusalem {ofirpele,werman}@cs.huji.ac.il Abstract We present a

More information

Facial Expression Recognition Using Non-negative Matrix Factorization

Facial Expression Recognition Using Non-negative Matrix Factorization Facial Expression Recognition Using Non-negative Matrix Factorization Symeon Nikitidis, Anastasios Tefas and Ioannis Pitas Artificial Intelligence & Information Analysis Lab Department of Informatics Aristotle,

More information

Locality Preserving Projections (LPP) Abstract

Locality Preserving Projections (LPP) Abstract Locality Preserving Projections (LPP) Xiaofei He Partha Niyogi Computer Science Department Computer Science Department The University of Chicago The University of Chicago Chicago, IL 60615 Chicago, IL

More information

Diffusion Wavelets for Natural Image Analysis

Diffusion Wavelets for Natural Image Analysis Diffusion Wavelets for Natural Image Analysis Tyrus Berry December 16, 2011 Contents 1 Project Description 2 2 Introduction to Diffusion Wavelets 2 2.1 Diffusion Multiresolution............................

More information

Convex Optimization / Homework 2, due Oct 3

Convex Optimization / Homework 2, due Oct 3 Convex Optimization 0-725/36-725 Homework 2, due Oct 3 Instructions: You must complete Problems 3 and either Problem 4 or Problem 5 (your choice between the two) When you submit the homework, upload a

More information

arxiv: v1 [cs.mm] 12 Jan 2016

arxiv: v1 [cs.mm] 12 Jan 2016 Learning Subclass Representations for Visually-varied Image Classification Xinchao Li, Peng Xu, Yue Shi, Martha Larson, Alan Hanjalic Multimedia Information Retrieval Lab, Delft University of Technology

More information

Clustering. SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic

Clustering. SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic Clustering SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic Clustering is one of the fundamental and ubiquitous tasks in exploratory data analysis a first intuition about the

More information

Outlier Ensembles. Charu C. Aggarwal IBM T J Watson Research Center Yorktown, NY Keynote, Outlier Detection and Description Workshop, 2013

Outlier Ensembles. Charu C. Aggarwal IBM T J Watson Research Center Yorktown, NY Keynote, Outlier Detection and Description Workshop, 2013 Charu C. Aggarwal IBM T J Watson Research Center Yorktown, NY 10598 Outlier Ensembles Keynote, Outlier Detection and Description Workshop, 2013 Based on the ACM SIGKDD Explorations Position Paper: Outlier

More information

Improving Image Segmentation Quality Via Graph Theory

Improving Image Segmentation Quality Via Graph Theory International Symposium on Computers & Informatics (ISCI 05) Improving Image Segmentation Quality Via Graph Theory Xiangxiang Li, Songhao Zhu School of Automatic, Nanjing University of Post and Telecommunications,

More information

Clustering: Classic Methods and Modern Views

Clustering: Classic Methods and Modern Views Clustering: Classic Methods and Modern Views Marina Meilă University of Washington mmp@stat.washington.edu June 22, 2015 Lorentz Center Workshop on Clusters, Games and Axioms Outline Paradigms for clustering

More information

ELEG Compressive Sensing and Sparse Signal Representations

ELEG Compressive Sensing and Sparse Signal Representations ELEG 867 - Compressive Sensing and Sparse Signal Representations Gonzalo R. Arce Depart. of Electrical and Computer Engineering University of Delaware Fall 211 Compressive Sensing G. Arce Fall, 211 1 /

More information

Advanced phase retrieval: maximum likelihood technique with sparse regularization of phase and amplitude

Advanced phase retrieval: maximum likelihood technique with sparse regularization of phase and amplitude Advanced phase retrieval: maximum likelihood technique with sparse regularization of phase and amplitude A. Migukin *, V. atkovnik and J. Astola Department of Signal Processing, Tampere University of Technology,

More information

Locality Preserving Projections (LPP) Abstract

Locality Preserving Projections (LPP) Abstract Locality Preserving Projections (LPP) Xiaofei He Partha Niyogi Computer Science Department Computer Science Department The University of Chicago The University of Chicago Chicago, IL 60615 Chicago, IL

More information

Fast Fuzzy Clustering of Infrared Images. 2. brfcm

Fast Fuzzy Clustering of Infrared Images. 2. brfcm Fast Fuzzy Clustering of Infrared Images Steven Eschrich, Jingwei Ke, Lawrence O. Hall and Dmitry B. Goldgof Department of Computer Science and Engineering, ENB 118 University of South Florida 4202 E.

More information

On Order-Constrained Transitive Distance

On Order-Constrained Transitive Distance On Order-Constrained Transitive Distance Author: Zhiding Yu and B. V. K. Vijaya Kumar, et al. Dept of Electrical and Computer Engineering Carnegie Mellon University 1 Clustering Problem Important Issues:

More information

The exam is closed book, closed notes except your one-page (two-sided) cheat sheet.

The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. CS 189 Spring 2015 Introduction to Machine Learning Final You have 2 hours 50 minutes for the exam. The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. No calculators or

More information

Visual Representations for Machine Learning

Visual Representations for Machine Learning Visual Representations for Machine Learning Spectral Clustering and Channel Representations Lecture 1 Spectral Clustering: introduction and confusion Michael Felsberg Klas Nordberg The Spectral Clustering

More information

Subspace Clustering with Global Dimension Minimization And Application to Motion Segmentation

Subspace Clustering with Global Dimension Minimization And Application to Motion Segmentation Subspace Clustering with Global Dimension Minimization And Application to Motion Segmentation Bryan Poling University of Minnesota Joint work with Gilad Lerman University of Minnesota The Problem of Subspace

More information

Robust Visual Domain Adaptation with Low-Rank Reconstruction

Robust Visual Domain Adaptation with Low-Rank Reconstruction Robust Visual Domain Adaptation with Low-Rank Reconstruction I-Hong Jhuo, Dong Liu, D.T.Lee, Shih-Fu Chang Dept. of CSIE, National Taiwan University, Taipei, Taiwan Institute of Information Science, Academia

More information

Using PageRank in Feature Selection

Using PageRank in Feature Selection Using PageRank in Feature Selection Dino Ienco, Rosa Meo, and Marco Botta Dipartimento di Informatica, Università di Torino, Italy fienco,meo,bottag@di.unito.it Abstract. Feature selection is an important

More information

LRLW-LSI: An Improved Latent Semantic Indexing (LSI) Text Classifier

LRLW-LSI: An Improved Latent Semantic Indexing (LSI) Text Classifier LRLW-LSI: An Improved Latent Semantic Indexing (LSI) Text Classifier Wang Ding, Songnian Yu, Shanqing Yu, Wei Wei, and Qianfeng Wang School of Computer Engineering and Science, Shanghai University, 200072

More information

Nonparametric estimation of multiple structures with outliers

Nonparametric estimation of multiple structures with outliers Nonparametric estimation of multiple structures with outliers Wei Zhang and Jana Kosecka George Mason University, 44 University Dr. Fairfax, VA 223 USA Abstract. Common problem encountered in the analysis

More information