www.nicta.com.au From imagination to impact Graph Connectivity in Sparse Subspace Clustering Behrooz Nasihatkon 24 March 2011
Outline 1 Subspace Clustering 2 Sparse Subspace Clustering 3 Graph Connectivity 4 Conclusion 2/35
Outline 1 Subspace Clustering Example Subspace Clustering Applications Solutions 2 Sparse Subspace Clustering 3 Graph Connectivity 4 Conclusion 3/35
Example: Rigid Body Frame 1: x i = [ A 1 [ ] Xi ] 2 4 1 4 1 4/35
Example: Rigid Body Frame 1: x i = [ A 1 [ ] Xi ] 2 4 1 4 1 ] [x 1 x n] = [ A 1 ] 2 4 [ X1 X n 1 1 4 n 4/35
Example: Rigid Body Frame 1: Frame 2: x i = [ A 1 [ ] Xi ] 2 4 1 4 1 ] [x 1 x n] = [ A 1 ] 2 4 [ X1 X n 1 1 4 n 4/35
Example: Rigid Body Frame 1: Frame 2: x i = [ A 1 [ ] Xi ] 2 4 1 4 1 ] [x 1 x n] = [ A 1 ] 2 4 [ X1 X n 1 1 4 n y i = [ A 2 [ ] Xi ] 2 4 1 4 1 4/35
Example: Rigid Body Frame 1: Frame 2: x i = [ A 1 [ ] Xi ] 2 4 1 4 1 ] [x 1 x n] = [ A 1 ] 2 4 [ X1 X n 1 1 4 n y i = [ A 2 [ ] Xi ] 2 4 1 4 1 ] [y 1 y n] = [ A 2 ] 2 4 [ X1 X n 1 1 4 n 4/35
Example: Rigid Body Frame 1: Frame 2: x i = [ A 1 [ ] Xi ] 2 4 1 4 1 ] [x 1 x n] = [ A 1 ] 2 4 [ X1 X n 1 1 4 n y i = [ A 2 [ ] Xi ] 2 4 1 4 1 [ ] [ ] [ x1 x n A1 X1 X = n y 1 y n A 2 1 1 4 4 ] 4 n 4/35
Example: Rigid Body Frame 1: Frame 2: x i = [ A 1 [ ] Xi ] 2 4 1 4 1 ] [x 1 x n] = [ A 1 ] 2 4 [ X1 X n 1 1 4 n y i = [ A 2 x 1 x n y 1 y n z 1 z n = [ ] Xi ] 2 4 1 A 1 A 2 A 3 6 4 4 1 [ X1 X n 1 1 ] 4 n 4/35
Example: Rigid Body Frame 1: Frame 2: x i = [ A 1 [ ] Xi ] 2 4 1 4 1 ] [x 1 x n] = [ A 1 ] 2 4 [ X1 X n 1 1 4 n y i = [ A 2 [ ] Xi ] 2 4 1 4 1 x 1 x n A 1 y 1 y n A.. = 2 [ ] X1 X n. 1 1 4 n z 1 z n A F 2F 4 4/35
Example: Rigid Body x 11 x 1n x 21 x 2n.. x F 1 x Fn 2F n = A 1 A 2. A F 2F 4 [ ] X1 X n 1 1 4 n 5/35
Example: Rigid Body x 11 x 1n x 21 x 2n.. x F 1 x Fn 2F n =? 5/35
Subspace Clustering Data lying on a mixture of subspaces. 6/35
Subspace Clustering Data lying on a mixture of subspaces. Number of subspaces 6/35
Subspace Clustering Data lying on a mixture of subspaces. No. of subspaces + their dimensions 6/35
Subspace Clustering Data lying on a mixture of subspaces. No. of subspaces + dimensions + A basis for each subspace 6/35
Subspace Clustering Data lying on a mixture of subspaces. No. of subspaces + dimensions + bases + data segmentation 6/35
Subspace Clustering Data lying on a mixture of subspaces. No. of subspaces + dimensions + bases + segmentation 6/35
Applications Motion Segmentation [Rene Vidal et al. 2008] Video Shot Segmentation [Le Lu and R. Vidal 2006] Illumination Invariant Clustering [J. Ho et al. 2003] Image Segmentation [Alen Yang et al. 2008] Image Representation and Compression [Wei Hong et al. 2005] Linear Hybrid Systems Identification [Rene Vidal et al. 2003] 7/35
Solutions Random Sample Consensus (RANSAC) [Martin Fischler and R. Bolles 1981] Mixture of Probabilistic PCA [Michael Tipping and C. Bishop 1999] Generalized PCA (GPCA) [Rene Vidal et al. 2005] Locally Linear Manifold Clustering (LLMC) [Alvina Goh and R. Vidal 2007] Agglomerative Lossy Compression (ALC) [Yi Ma et al. 2007] Sparse Subspace Clustering (SSC) [Ehsan Elhamifar and R. Vidal 2009] Low-Rank Subspace Clustering (LLR) [Guangcan Li et al. 2010] 8/35
Outline 1 Subspace Clustering 2 Sparse Subspace Clustering Sparse Representation Main Theorem Noise and Outliers Open Problems 3 Graph Connectivity 4 Conclusion 9/35
Sparse Representation Represent y as a linear combination of the smallest possible subset of the vectors {x 1, x 2,..., x n }. 10/35
Sparse Representation Represent y as a linear combination of the smallest possible subset of the vectors {x 1, x 2,..., x n }. min a 0 s.t. y = Xa where X = [x 1 x 2 x n ]. 10/35
Sparse Representation Represent y as a linear combination of the smallest possible subset of the vectors {x 1, x 2,..., x n }. min a 0 s.t. y = Xa where X = [x 1 x 2 x n ]. NP-hard 10/35
Sparse Representation Represent y as a linear combination of the smallest possible subset of the vectors {x 1, x 2,..., x n }. min a 0 s.t. y = Xa where X = [x 1 x 2 x n ]. NP-hard Use L 1 -minimization: min a 1 s.t. y = Xa 10/35
Sparse Representation Represent y as a linear combination of the smallest possible subset of the vectors {x 1, x 2,..., x n }. min a 0 s.t. y = Xa where X = [x 1 x 2 x n ]. NP-hard Use L 1 -minimization: min a 1 s.t. y = Xa L 1 /L 0 equivalence 10/35
Sparse Subspace Clustering If we had the basis [b 1 b 2... b m ] 11/35
Sparse Subspace Clustering If we had the basis [b 1 b 2... b m ] Represent each x i as a sparse combination of {b 1 b 2... b m } 11/35
Sparse Subspace Clustering We do not have the basis 11/35
Sparse Subspace Clustering We do not have the basis Represent each x i as a sparse combination of X {x i } 11/35
Main Theorem For each x i solve: a i = arg min a 1 s.t. x i = X a, a i = 0 12/35
Main Theorem For each x i solve: a i = arg min a 1 s.t. x i = X a, a i = 0 Construct a graph whose nodes are x i and each node x i is connected to the node x j if the j-th element of a i is nonzero. 12/35
Main Theorem For each x i solve: a i = arg min a 1 s.t. x i = X a, a i = 0 Construct a graph whose nodes are x i and each node x i is connected to the node x j if the j-th element of a i is nonzero. Theorem If the subspaces are independent, the graphs of different subspaces are disconnected. 12/35
Main Theorem For each x i solve: a i = arg min a 1 s.t. x i = X a, a i = 0 Construct a graph whose nodes are x i and each node x i is connected to the node x j if the j-th element of a i is nonzero. Theorem If the subspaces are independent, the graphs of different subspaces are disconnected. Find the connected components 12/35
Main Theorem For each x i solve: a i = arg min a 1 s.t. x i = X a, a i = 0 Construct a graph whose nodes are x i and each node x i is connected to the node x j if the j-th element of a i is nonzero. Theorem If the subspaces are independent, the graphs of different subspaces are disconnected. Find the connected components In practice spectral clustering using [a 1 a 2... a n ] 12/35
Example Figure: An example of an SSC graph 13/35
Noise and Outliers Noise min a 1 +λ e 2 s.t. x i = X i a + e a i = 0 14/35
Noise and Outliers Noise min a 1 +λ e 2 s.t. x i = X i a + e a i = 0 Outliers min a 1 +λ e 1 s.t. x i = X i a + e a i = 0 14/35
Open Problems Noise and Outliers Extension to manifolds Graph Connectivity in each subspace 15/35
Outline 1 Subspace Clustering 2 Sparse Subspace Clustering 3 Graph Connectivity Example Preparation 2D Case 3D Case N-D Case 4 Conclusion 16/35
A Simple Example 17/35
A Simple Example 17/35
A Simple Example 17/35
A Simple Example 17/35
A Simple Example 17/35
Preparation Adding Negative Points x i = j i a jx j X 18/35
Preparation Adding Negative Points x i = j i a jx j X ± 18/35
Preparation Adding Negative Points x i = j i a jx j a j x j = a j ( x j ) X ± 18/35
Preparation Adding Negative Points x i = j i a jx j a j x j = a j ( x j ) a j 0 X ± 18/35
Preparation a i = arg min a 1 s.t. x i = X a, a i = 0 a i = arg min 1 T a s.t. x i = X ± a, a 0, a i = 0 19/35
Geometry of Polytopes L 1 minimization and geometry of polytopes [David Donoho 2005] 20/35
Geometry of Polytopes L 1 minimization and geometry of polytopes [David Donoho 2005] b x i = X i b = X i b 1 b 1 20/35
Geometry of Polytopes L 1 minimization and geometry of polytopes [David Donoho 2005] b x i = X i b = X i b 1 b 1 x i = X i p α, where p = 1, p 0 α : to be minimized 20/35
Geometry of Polytopes L 1 minimization and geometry of polytopes [David Donoho 2005] b x i = X i b = X i b 1 b 1 x i = X i p α, where p = 1, p 0 α : to be minimized X i p : convex hull of X i 20/35
Geometry of Polytopes maximize β s.t. βx i hull(x i ) 21/35
Assumptions Indivisible subspaces? 22/35
Assumptions Indivisible subspaces? Degenerate Cases 22/35
Assumptions Indivisible subspaces? Degenerate Cases Assumption No d points lie in a (d 1)-dimensional subspace. 22/35
Assumptions Indivisible subspaces? Degenerate Cases Assumption No d points lie in a (d 1)-dimensional subspace. Assumption The facet of each polytope(x i ) on which y i lies has exactly d points on it. 22/35
Neighbourhood Cones 23/35
Neighbourhood Cones Theorem Two points are neighbours iff their neighbourhood cones strictly intersect. 23/35
Projection on Unit Hypersphere 24/35
2D Case 25/35
2D Case 25/35
2D Case 25/35
2D Case 25/35
2D Case 25/35
2D Case 25/35
2D Case 25/35
2D Case 25/35
3D Case 26/35
3D Case 26/35
3D Case 26/35
3D Case 26/35
3D Case residual holes for one connected component Topologically open disks. Area: < half-sphere. ( Gauss-Bonnet Theorem) 26/35
What about d 4? No residual holes Search for counterexamples 27/35
Towards a Counterexample Observations from 3D 28/35
Towards a Counterexample 28/35
Towards a Counterexample 28/35
4D case Counterexample Data around two great circles: [cos θ, sin θ, 0, 0] T [ 0, 0, cos θ, sin θ] T X ± = X C1 X C2 X C1 : [cos kπ m, sin kπ m, ± δ, ± δ]t X C2 : [ ± δ, ± δ, cos kπ m, sin kπ m ]T 29/35
4D case Counterexample (a) δ = + ɛ (b) δ = + ɛ Disconnected for: δ (, 2 2 ) = f(m) (c) δ = 2 2 ɛ (d) δ = 2 2 + ɛ Figure: Orthographic projection to 3D 30/35
Is Connectivity Generic? In the counterexample ray(x i ) hits the interior of the facet. 31/35
Is Connectivity Generic? In the counterexample ray(x i ) hits the interior of the facet. 31/35
Conclusion The importance of Subspace Clustering 32/35
Conclusion The importance of Subspace Clustering Advantages of Sparse Subspace Clustering 32/35
Conclusion The importance of Subspace Clustering Advantages of Sparse Subspace Clustering For d = 2 and d = 3 it will not fail, 32/35
Conclusion The importance of Subspace Clustering Advantages of Sparse Subspace Clustering For d = 2 and d = 3 it will not fail, Caution must be taken for d 4, A post processing stage, etc. 32/35
Thanks 33/35
Questions????? 34/35
35/35