On Clustering and Embedding Manifolds using a Low Rank Neighborhood Approach


Arun M. Saranathan, Student Member, IEEE, and Mario Parente, Member, IEEE

Abstract

In the manifold learning community there has been an emphasis on the simultaneous clustering and embedding of multiple manifolds. Manifold clustering and embedding algorithms perform especially poorly when embedding highly nonlinear manifolds. In this paper we propose a novel algorithm for improved manifold clustering and embedding. Since a majority of these algorithms are graph based, they use different strategies to ensure that only data-points belonging to the same manifold are chosen as neighbors. The new algorithm adds a low-rank criterion on the neighborhood of each data-point to ensure that only data-points belonging to the same manifold are prioritized for neighbor selection. Following this, a reconstruction matrix is calculated to express each data-point as an affine combination of its neighbors. If the low-rank neighborhood criterion succeeds in prioritizing data-points belonging to the same manifold as neighbors, the reconstruction matrix is (near) block-diagonal. This reconstruction matrix can then be used for clustering and embedding. Over a variety of simulated and real data-sets the algorithm improves on state-of-the-art manifold clustering and embedding algorithms in terms of both clustering and embedding performance.

Index Terms: Manifold Clustering, Manifold Embedding, low rank neighborhood selection.

I. INTRODUCTION

In the era of big data we live under a constant deluge of high-dimensional data in the form of image/video streams, hyperspectral images and audio recordings, to cite a few. Generally, these data-streams are not intrinsically high dimensional; rather, they lie on or near manifolds that exhibit lower intrinsic dimensionality. Identifying and unraveling these few degrees of freedom leads to significant gains in data processing and storage. Points lying on/near smooth manifolds can be modeled as data that are originally drawn from a low-dimensional parameter space and then mapped by smooth (i.e. diffeomorphic and invertible) linear or nonlinear functions into a high-dimensional ambient space. Manifold learning algorithms attempt to learn a low-dimensional representation of the data, or better, to embed the data into a lower-dimensional coordinate system such that some of the local geometric structure of the high-dimensional data is preserved. In simpler terms, such techniques attempt to eliminate the (nonlinear) effects of the mapping and learn the lower-dimensional parameter space representation. A large number of algorithms have been proposed for manifold learning [1], [2]. Such approaches attempt to find a mapping that preserves appropriate global properties (e.g. ISOMAP [3]) or local properties (e.g. Locally Linear Embedding (LLE) [4], Laplacian Eigenmaps [5] and Local Tangent Space Alignment (LTSA) [6]). A variety of such techniques have been created and used for a wide range of applications (see [7], [8] and references therein for more examples).

The assumption that the data are drawn from a single manifold is seldom satisfied. In practice we are better served by modeling the data as lying on or near a mixture of manifolds.

A. M. Saranathan and M. Parente are with the Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA, USA (asaranat@umass.edu).
If these manifolds are well separated, simply adding a clustering technique such as N-Cuts [9] prior to manifold learning is sufficient to identify the different manifolds, so that accurate low-dimensional parameter space representations (embeddings) for each of the manifolds can be computed. On the other hand, if the different manifolds overlap, identifying the different manifolds and generating quality embeddings cannot be easily accomplished. Data-sets that can be modeled as overlapping manifolds have been observed in hyperspectral imaging (whenever there are multiple mixtures with shared endmembers [10]), in images of hand-written numbers [11], natural images for object recognition [12], human face images [11], [13] and human motion capture tasks [14].

A variety of approaches have been designed for clustering data belonging to multiple manifolds. The simplest approaches model the data as lying on or near linear manifolds (affine spaces) and leverage this expectation of linearity to aid manifold identification (see [15] and references therein). Other algorithms, such as Local Structural Consistency (LSC) [12] and Spectral Multi-Manifold Clustering (SMMC), create a structure-dependent similarity metric to generate suitable affinity matrices for spectral clustering. Another approach, Robust Multiple Manifolds Structure Learning (RMMSL) [16], generates affinity matrices based on tangent space alignment. Unfortunately, these algorithms do not generate low-dimensional embeddings for the data. Other algorithms like k-Manifolds [14] are based on the assumption that there is no embedding in a lower-dimensional space that preserves all the properties captured in the high-dimensional data, but this has been shown to be false in the case of nonlinear manifolds that only share a boundary, such as multiple mixtures with shared endmembers in hyperspectral data [10].

Other popular algorithms draw on the notion of reconstruction coefficients/matrices first introduced in Locally Linear Embedding (LLE) [4]. These algorithms follow a general scheme wherein each data-point is expressed as an affine (or sometimes linear) combination of other points in the data-set. In addition, the algorithms place some penalty on the reconstruction coefficients to make sure that data-points on the same manifold are chosen (assigned non-zero reconstruction coefficients) to reconstruct the given data-point. This yields a reconstruction matrix that is approximately block-diagonal, and the application of a spectral clustering algorithm is then sufficient to identify the different manifolds present in the data.

Reconstruction-based methods differ mainly in the type of constraints imposed on the reconstruction coefficients to generate a block-diagonal reconstruction matrix. These techniques then generate embeddings from the reconstruction matrix using schemes similar to the one used by LLE.

One such algorithm is Low Rank Embedding (LRE) [17], which adds a rank-based penalty on the reconstruction matrix. The algorithm is quite similar to the Low Rank Representation (LRR) [18] approach for subspace clustering. In effect, LRE assumes that the reconstruction coefficients of data-points on the same manifold have a similar underlying structure, i.e. they can be reconstructed accurately by using the same set of points. The LRE algorithm generates an embedding of the data into a low-dimensional space using the reconstruction matrix in the same fashion as LLE, and then performs k-means [19] on the embedding to learn manifold memberships. While this is a reasonable assumption for data which can be modeled as linear subspaces with some distortions, for highly nonlinear manifolds the neighborhood structures of data-points in different parts of a manifold are significantly different; thus different sets of points should be used to reconstruct target points in different linear patches on the manifold. Furthermore, the LLE embedding scheme, which LRE uses, only captures the geometric properties of a neighborhood when the reconstruction coefficients are unaffected by translation, rotation and scaling [4]. The LLE algorithm ensures such invariance by enforcing a sum-to-one (affineness) constraint on the reconstruction coefficients in the neighborhood. Since LRE does not add this constraint, the reconstruction coefficients are no longer invariant to rigid linear transformations, which leads to distortions in the embedding.

Another example of reconstruction-matrix-based algorithms that perform both manifold clustering and embedding is Sparse Manifold Clustering and Embedding (SMCE) [11]. SMCE attempts to find a reconstruction matrix where each data-point is expressed as an affine combination of its k nearest neighbors, and adds a penalty on the distance-based sparsity of the reconstruction coefficient vector of the point. The authors show that the effect of minimizing both the reconstruction error and the sparsity penalty as much as possible is that only data-points on the same manifold are assigned non-zero weights. The reconstruction matrix created in this fashion also fulfills the conditions mentioned in LLE for accurate embeddings, i.e. the reconstruction coefficients are invariant to translations, rotations and scaling. While creating sparse neighborhoods may aid the clustering objective, it raises some issues in the embedding. In particular, the spectral embedding technique introduced in LLE only preserves local relationships, and there is no penalty if the global geometric information is distorted. Namely, if different neighborhoods do not share points, there is no penalty if they are embedded with different scalings or rotations: the global shape is only preserved if there is significant overlap between adjacent neighborhoods. Since SMCE creates very sparse neighborhoods with little or no overlap, there may be significant distortions in the global shape.
A more recent technique, Joint Manifold Clustering and Embedding (JMCE) [20], expresses each data-point as a convex combination of its k nearest neighbors and at the same time adds a penalty on the magnitude of the non-zero weights assigned to neighbors on other manifolds. While the technique has shown some promise in clustering hyperspectral data, due to the restriction to convex reconstructions the embedding suffers from distortions at the boundary of the manifolds.

In this paper we propose a novel approach, the Low Rank Neighborhood Embedding (LRNE), which expresses every data-point as an affine combination of its k nearest neighbors. The novelty with respect to the other reconstruction-based approaches is that, in order to ensure that only neighbors on the same manifold are prioritized in reconstructing a target point, we add a penalty on the dimension of the neighborhood of the point, rather than relying on sparsity or penalizing the rank of the whole reconstruction matrix. More specifically, the penalty encourages selecting from the neighborhood a set of points for reconstruction that belongs to an affine patch of dimension as low as possible. Consider a point at/near an intersection: the nearest neighbors are drawn from different linear patches, each belonging to a different manifold overlapping at that intersection. In this scenario, choosing the set of points which minimizes the reconstruction error and also lies on a patch of the lowest possible dimension will make sure that points from the same manifold as the original point are chosen for the reconstruction. Since the reconstruction scheme is local and affine, the LRNE reconstruction matrix can be embedded by using a spectral embedding stage similar to the one described in LLE. The reconstruction coefficient vectors generated by the LRNE are invariant to translations, rotations and scaling. Also, unlike SMCE, which due to its requirement of learning a sparse neighborhood assigns large (non-zero) reconstruction coefficients to as few neighbors as possible, the LRNE assigns large (non-zero) reconstruction coefficients to all points in the neighborhood that lie on the same manifold. This ensures sufficient overlap between adjacent neighborhoods, which is necessary to help preserve the global geometry as much as possible in the embedding. A preliminary version of this algorithm with limited results for hyperspectral data alone can be found in [21].

The paper is arranged as follows: in Section II we describe the different multi-manifold structures our algorithm targets. In Section III we describe the new Low Rank Neighborhood Embedding (LRNE) and provide some intuition to show that choosing a low-dimensional neighborhood ensures that only data-points from the same manifold are chosen. Following this we describe an optimization scheme that ensures the choice of such a low-dimensional neighborhood, and the steps required to generate the clustering and embedding from the reconstruction coefficients. In Section IV we describe the experiments used to compare the various manifold clustering and embedding algorithms and analyze the results. We offer concluding remarks and avenues for further research in Section V.

Fig. 1. Types of manifold mixtures: (a) Adjoining manifolds. (b) Intersecting manifolds. Inset figures show the neighborhood for a target point at/near the intersection.

II. MANIFOLD MIXTURE TYPES

We will apply our method to two different types of manifold mixtures. In one case the different manifolds have only a boundary in common; we will refer to such manifolds as adjoining manifolds. An example of such manifolds is shown in Fig. 1 (a). Another case is when the manifolds appear to pass through each other; we will refer to such manifolds as intersecting manifolds. An example of such manifolds is shown in Fig. 1 (b). For each set of manifolds, we will refer to the sub-manifold on each side of the intersection as an arm of the manifold.

III. THE LOW-RANK NEIGHBORHOOD EMBEDDING ALGORITHM

Given a set of points in $\mathbb{R}^D$, drawn from $p$ different smooth and sufficiently well sampled manifolds, the Low Rank Neighborhood Embedding (LRNE) algorithm attempts to find a representation such that each point is reconstructed as an affine combination of the spatially nearest neighbors that lie on the same manifold as the target point. The resulting reconstruction matrix is therefore approximately block-diagonal, where each block should contain reconstruction coefficients for data-points drawn from a single manifold arm. A symmetrized version of the reconstruction matrix is then used as a similarity matrix in a spectral clustering algorithm to identify the different clusters (manifold arms) in the data. For intersecting manifolds, an additional procedure pairs the different arms of each manifold. The reconstruction matrix is also used for embedding, in a procedure similar to the LLE embedding.

A. Generating the reconstruction matrix

The Low Rank Neighborhood Embedding (LRNE) follows a scheme similar to Locally Linear Embedding (LLE), wherein it attempts to express every data-point as an affine combination of its nearest neighbors, with additional penalties to ensure that only neighbors on the same manifold as a given data-point are used in the reconstruction. The LLE objective function to find the appropriate reconstruction coefficients, as defined in [4], is:

$$\min_{\alpha} \Big\| x - \sum_{i=1}^{k} \alpha_i n_i \Big\|^2 \quad \text{subject to} \quad \mathbf{1}^T \alpha = 1, \tag{1}$$

where $x$ is the target point for which we are finding the reconstruction coefficients, $\|\cdot\|$ is the $\ell_2$ norm, $N(x) = \{n_1, n_2, \dots, n_k\}$ is the set of the $k$ nearest neighbors of the data-point $x$, and $\alpha_i \in \mathbb{R}$ is the reconstruction coefficient assigned to the neighbor $n_i$.

In scenarios where the data lie on multiple manifolds with some overlap, the neighborhood $N(x)$ of a target point at/near an intersection will contain points from all the different manifolds that overlap at the intersection. Since each manifold is smooth and well sampled, the neighbors drawn from each manifold will appear to lie on a linear patch. The inset pictures in Fig. 1 show the neighborhoods for a data-point at/near the intersection for the example manifolds. The target point in each case is marked in black, neighbors from the same manifold as the target point are marked in blue, and neighbors from the other manifold are marked in red. The dimensionality of $N(x)$ is the dimensionality of the union of all the linear patches, each of which belongs to a lower-dimensional affine space. The LRNE attempts to select from the neighborhood $N(x)$ only those neighbors that lie on the same manifold as the target point as the ones to be used in the reconstruction.
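For reference, the constrained problem in Eqn. (1) has the standard LLE closed-form solution via the local Gram matrix. The following Python sketch (a minimal illustration, not the authors' code; the small regularization term is our addition for numerical stability) computes the affine weights for a single target point:

```python
import numpy as np

def lle_affine_weights(x, neighbors, reg=1e-3):
    """Solve Eqn. (1): min_a ||x - sum_i a_i n_i||^2  s.t.  1^T a = 1."""
    k = neighbors.shape[0]
    diff = neighbors - x                    # rows are n_i - x
    G = diff @ diff.T                       # Gram matrix G_ij = <n_i - x, n_j - x>
    G += reg * np.trace(G) / k * np.eye(k)  # stabilizer (our assumption, not in the paper)
    alpha = np.linalg.solve(G, np.ones(k))  # Lagrange condition G a = c 1
    return alpha / alpha.sum()              # rescale to satisfy 1^T a = 1
```

Under the sum-to-one constraint the residual can be rewritten as $\sum_i \alpha_i (x - n_i)$, which is what reduces the problem to the small Gram system above.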
This problem can be cast as the task of finding a subset of the neighborhood that both successfully reconstructs the target point (according to Eqn. (1)) and spans an affine space of the lowest possible dimension. Consider a weighted neighborhood matrix $M = [\alpha_1 n_1 \ \alpha_2 n_2 \ \dots \ \alpha_k n_k]$. If we add a penalty to the objective function defined in Eqn. (1) based on the dimension of the space spanned by the columns of $M$ (which is given by $\operatorname{rank}(M)$), the only way to lower the dimensionality of that space (i.e. $\operatorname{rank}(M)$) is by zeroing out the coefficients $\alpha_i$ corresponding to all the points that lie on some of the manifolds. We informally refer to the neighbors with non-zero coefficients as the points chosen for the reconstruction. The penalty on the dimensionality of the chosen set can be increased so that, possibly, neighbors lying on only one manifold are selected. Additionally, if the chosen points do not lie on the same manifold as the target, the reconstruction error will not be small. Thus the need to simultaneously minimize the reconstruction error and the dimension of the neighborhood ensures that only points on the same manifold as the target point are chosen for the reconstruction (in the absence of noise). The parameter $\lambda$ regulates the trade-off between the two terms, allowing reconstruction using some points from the wrong manifold in order to obtain a better linear fit to the data when noise or local density changes are present.

As a result of the above discussion, we propose the addition of a dimension-based penalty to the LLE reconstruction objective:

$$\min_{\alpha} \frac{1}{2} \Big\| x - \sum_{i=1}^{k} \alpha_i n_i \Big\|^2 + \lambda \operatorname{rank}(M) \quad \text{subject to} \quad \mathbf{1}^T \alpha = 1. \tag{2}$$

It is straightforward to note that for target points not close to the intersection, Eqn. (2) can be applied as well, with the effect that no points from $N(x)$ are excluded. The one hurdle in the solution of the problem defined in Eqn. (2) is that the rank function, and hence the objective function defined above, is not convex.
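The effect of the rank penalty can be seen on a toy neighborhood drawn from two 2-D patches in $\mathbb{R}^3$ (a hypothetical example of ours, with linear patches through the origin so that the span/rank bookkeeping stays simple): zeroing the coefficients of one patch lowers $\operatorname{rank}(M)$.

```python
import numpy as np

rng = np.random.default_rng(0)
# Five neighbors on each of two 2-D linear patches in R^3
patch_a = np.vstack([rng.uniform(-1, 1, (2, 5)), np.zeros((1, 5))])  # spans the x-y plane
patch_b = np.vstack([rng.uniform(-1, 1, (1, 5)),
                     np.zeros((1, 5)),
                     rng.uniform(-1, 1, (1, 5))])                    # spans the x-z plane
N = np.hstack([patch_a, patch_b])              # neighborhood, one point per column

alpha = rng.uniform(0.1, 1.0, 10)              # generic weights: both patches active
print(np.linalg.matrix_rank(N * alpha))        # 3: the union of the two patches

alpha[5:] = 0.0                                # choose only the first patch
print(np.linalg.matrix_rank(N * alpha))        # 2: a single low-dimensional patch
```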

Fig. 2. Reconstruction matrix for (a) adjoining manifolds and (b) intersecting manifolds. The points are arranged according to arm-membership to highlight the block structure.

To mitigate this problem the rank function is replaced by the nuclear norm ($\|\cdot\|_*$), which has been shown to be a convex approximation of the rank function [22]. The modified objective function can be written as:

$$\min_{\alpha} \frac{1}{2} \Big\| x - \sum_{i=1}^{k} \alpha_i n_i \Big\|^2 + \lambda \| \hat{M} \|_* \quad \text{subject to} \quad \mathbf{1}^T \alpha = 1, \tag{3}$$

where $\hat{M} = [\alpha_1 \hat{n}_1 \ \alpha_2 \hat{n}_2 \ \dots \ \alpha_k \hat{n}_k]$ and $\hat{n}_i = n_i / \|n_i\|$. The normalization is necessary because, unlike the rank, the nuclear norm is affected by the scaling of the columns of $M$. Since the objective function described in Eqn. (3) is convex, the optimization problem can be solved using the CVX toolbox for MATLAB [23], [24]. In this paper we also present an ADMM-based [25] first-order solver for the minimization of the problem described in Eqn. (3); we show the details of the technique in Appendix A.

The effect of the approximation in Eqn. (3) is that, rather than forcing to zero the reconstruction coefficients of neighbors on manifolds other than the one containing the target point, neighbors on the same manifold as the target point are assigned much larger weights as compared to points on the other manifolds. The result of applying Eqn. (3) to the $i$-th target point in the data-set is a reconstruction vector $\alpha$ that fills the $i$-th row of a reconstruction matrix $R$, so that $R_{ij} = 0$ if $x_j \notin N(x_i)$, and otherwise $R_{ij} = \alpha_j$.

If the manifolds are adjoining manifolds, the reconstruction matrix will exhibit an approximately block-diagonal structure with the number of blocks equal to the number of manifolds. Consider the example of adjoining manifolds shown in Fig. 1 (a): in this data-set each linear patch at the intersection belongs to a different manifold, as shown in the inset figure. Since Eqn. (3) penalizes assigning large reconstruction weights to data-points on a different linear patch (or manifold), only data-points on the same manifold as the target point are assigned large reconstruction coefficients. In such a scenario the reconstruction matrix will appear approximately block-diagonal with two blocks. It is important to notice that the sparsity structure within each block is only due to the neighborhood sparsity, and no further subdivision into sub-blocks is observed. Such an occurrence would imply either that the neighborhoods of some points do not lie on linear patches or that there is not enough overlap between the different neighborhoods, which is contrary to the assumptions of local linearity and of a sufficiently well sampled manifold. The reconstruction matrix for the data-set shown in Fig. 1 (a) is shown in Fig. 2 (a).

If the data-points lie on intersecting manifolds, the different blocks will identify with the different arms of the overlapping manifolds. Again, we can illustrate the issue with the example in Fig. 1 (b), in which there are two manifolds, each with an arm on each side of the intersection. The algorithm will assign much larger reconstruction weights to points on the same manifold as the target point. Consider a point that is on the horizontal manifold and is a small distance from the intersection (marked in black), as shown in Fig. 3 (a).
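To make the per-point problem in Eqn. (3) concrete, the sketch below solves it with CVXPY (a Python stand-in we use for illustration in place of the MATLAB CVX setup cited above; the value of λ and the data layout are placeholders):

```python
import numpy as np
import cvxpy as cp

def lrne_weights(x, N, lam=0.1):
    """Solve Eqn. (3) for one target point x with neighbors in the columns of N."""
    k = N.shape[1]
    N_hat = N / np.linalg.norm(N, axis=0)       # column-normalized neighbors
    alpha = cp.Variable(k)
    M_hat = N_hat @ cp.diag(alpha)              # [a_1 n^_1 ... a_k n^_k]
    objective = 0.5 * cp.sum_squares(x - N @ alpha) + lam * cp.normNuc(M_hat)
    cp.Problem(cp.Minimize(objective), [cp.sum(alpha) == 1]).solve()
    return alpha.value
```

The returned vector fills the corresponding row of $R$ at the positions of the k nearest neighbors, exactly as described above.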
The neighborhood of the target point is composed of points approximately lying on the four patches shown in orange (neighbors on the same arm as the target point), purple (neighbors on the opposite arm of the same manifold), and blue and green (neighbors on the other manifold). The number of neighbors in each patch is proportional to the area of the patch if the manifolds are uniformly sampled. Clearly the purple region is much smaller than the orange region, and consequently a point has more neighbors on the same arm and far fewer neighbors on the other arm of the same manifold. Thus, when we consider neighborhoods made up of the nearest neighbors (either k-nearest or ε) at/near the intersection of intersecting manifolds, there is a significantly smaller number of neighbors on the opposing arm of the same manifold as compared to neighbors on the same arm. This effect, which is an intrinsic property of neighborhood graphs, together with the fact that the LRNE prioritizes neighbors on the same manifold as each target point, results in a reconstruction matrix wherein the block corresponding to each arm appears disconnected from the blocks corresponding to the other arms. Therefore the reconstruction matrix exhibits twice as many blocks as the number of manifolds (in the case of the specific example in Fig. 1 (b) the reconstruction matrix has four blocks). The reconstruction matrix for the data-set shown in Fig. 1 (b) is shown in Fig. 2 (b).

B. Clustering from the Reconstruction Matrix

Based on the discussion in the previous section, the reconstruction matrices will be approximately block-diagonal, with the blocks corresponding to points on each manifold arm. In order to perform manifold clustering, we model the set of data-points as the nodes of a graph such that the similarity between the nodes $x_i$ and $x_j$ is based on the reconstruction coefficient $r_{ij}$ in $R$ and the distance between $x_i$ and $x_j$, and we define the similarity between the two points as:

$$w_{ij} = \frac{ r_{ij} / \| x_j - x_i \|_2 }{ \sum_{t \neq i} r_{it} / \| x_t - x_i \|_2 },$$

a scheme similar to the one defined in section 2.2 of [11]. The final step is to make this similarity matrix symmetric, which can be achieved by setting $W_{sym} = \max(W, W^T)$. This matrix, which exhibits the same block structure as the reconstruction matrix, is then provided as an input to a spectral clustering algorithm [9], [26].
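A sketch of the similarity construction described above (assuming the normalization runs over each point's neighborhood, and taking magnitudes of the coefficients, which Eqn. (3) does not constrain to be non-negative):

```python
import numpy as np

def similarity_from_reconstruction(R, X):
    """Build W from the reconstruction matrix R (n x n) and data X (n x D)."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.flatnonzero(R[i])                              # neighbors of x_i
        w = np.abs(R[i, idx]) / np.linalg.norm(X[idx] - X[i], axis=1)
        W[i, idx] = w / w.sum()                                 # w_ij as defined above
    return np.maximum(W, W.T)                                   # W_sym = max(W, W^T)
```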

Fig. 3. (a) Neighborhood structure for a point at the intersection. (b) Reconstruction coefficients for a point at the intersection based on arm membership [blue: same arm, red: other arm of the same manifold, green & black: different arms on the other manifold].

In this case we use the unnormalized spectral clustering algorithm described in section 4 of [27]; additionally, we also use the recursive two-way cut scheme described in section 3.2 of [9]. In practice, however, any spectral clustering technique will provide reasonable results: if the weight matrix is (nearly) block-diagonal with k blocks, so is the Laplacian, and the spectral clustering algorithm will identify the different blocks as separate connected components in the corresponding graph (see Proposition 2 of [27]). In the case of adjoining manifolds, such clustering directly identifies the different manifolds present in the data, as the number of blocks in the reconstruction matrix is equal to the number of manifolds in the data. For intersecting manifolds the identified clusters will correspond to the arms of every manifold.

C. Pairing the opposing arms of intersecting manifolds

It is important to identify two opposing arms as part of a single manifold, as these are seen as members of the same perceptual class. We pair the two arms of a manifold by leveraging the fact that, for each target point, the neighborhood-rank-based penalty of the LRNE forces the neighbors on the arms of the same manifold as the target point to have, on average, higher reconstruction coefficients than neighbors on other manifolds. This effect of the optimization in Eqn. (3) is routinely observed for points at the intersection. After this merging step we are able to achieve accurate clustering performance in the case of intersecting manifolds. As an example, Fig. 3 (b) displays, for a target point at the intersection of the manifolds in Fig. 1 (b), the reconstruction coefficients grouped according to arm membership. Reconstruction coefficients for neighbors on the same arm are shown in blue, the reconstruction coefficients for neighbors on the opposite arm of the same manifold are shown in red, and the reconstruction coefficients for neighbors on the other manifold are shown in green and black. The average reconstruction coefficients for points on the other manifold (represented by the green and black lines in Fig. 3 (b)) exhibit lower values than the ones for neighbors on the same manifold as the target point (represented by the red and blue lines in Fig. 3 (b)).

The algorithm for arm pairing is as follows (a sketch of this voting scheme appears after the list):

1) Target points near the intersection are identified as the ones which have neighbors in each of the arms identified by the spectral clustering algorithm.
2) For every target point at the intersection we compute the difference between the average reconstruction coefficient of neighbors lying on each of the other arms and the average reconstruction coefficient of neighbors on the arm with the largest weights.
3) The point then votes to merge the arm with the largest weights with the arm with the smallest difference in average reconstruction coefficient.

Each arm is merged with the arm that the majority of boundary points vote to merge with.
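The following sketch implements the voting scheme in steps 1)-3) under the assumption that `labels` holds the arm indices returned by the spectral clustering step:

```python
import numpy as np

def pair_arms(R, labels):
    """Majority-vote arm pairing from the LRNE reconstruction matrix R."""
    arms = np.unique(labels)
    votes = {a: {} for a in arms}
    for i in range(R.shape[0]):
        idx = np.flatnonzero(R[i])
        if len(np.unique(labels[idx])) < len(arms):   # step 1: intersection points only
            continue
        means = {a: R[i, idx[labels[idx] == a]].mean() for a in arms}
        best = max(means, key=means.get)              # arm with the largest avg weight
        diffs = {a: means[best] - means[a] for a in arms if a != best}
        vote = min(diffs, key=diffs.get)              # steps 2-3: smallest difference
        votes[best][vote] = votes[best].get(vote, 0) + 1
    # each arm is merged with the arm receiving the majority of its votes
    return {a: max(v, key=v.get) for a, v in votes.items() if v}
```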
In its present version, the algorithm requires user input on the presence of intersecting vs. adjoining manifolds. On the other hand, a simple observation suggests a way to automatically detect whether a merging step is required. In the case of adjoining manifolds there is only one arm associated with each manifold at the intersection, so that for a data-point at/near the intersection the low-rank criterion of the LRNE ensures that only coefficients on the same arm as the target point are, on average, assigned large reconstruction coefficients; in the case of intersecting manifolds the large coefficients are assigned on average to neighbors on the two arms of the same manifold as the target point. A simple analysis of the distribution of the metrics in step 2) of the pairing algorithm could therefore help identify the presence of intersecting manifolds.

D. Embedding from the Reconstruction Matrix

The LRNE reconstruction coefficients are generated with the same affine constraints as in LLE. As a result, they are unaffected by translations, rotations or scalings of the data-points in each neighborhood, which ensures that these coefficients capture the intrinsic geometric structure of the neighborhood [4]. We can use the reconstruction coefficients to compute a low-dimensional embedding in the same fashion as LLE. Namely, we find a low-dimensional representation $Y$ by solving the following problem:

$$\min_{Y} \sum_{i} \Big\| y_i - \sum_{j} R_{ij} y_j \Big\|^2 \quad \text{subject to} \quad Y Y^T = I. \tag{4}$$

Solving the above optimization problem generates the coordinates of the low-dimensional embedding $Y$ in an orthogonal space which is centered at the origin [4]. Following this, the embeddings corresponding to each of the different classes are separated based on the classification, to generate an embedding of each class.

E. A note on Time-Complexity

The CVX solver uses interior-point techniques which, while very accurate, are quite complicated and slow, with a worst-case time complexity of the order of $O(n^6)$ [28]. Each iteration of the ADMM algorithm has a time complexity of $O(mn^2)$, since it requires the Singular Value Decomposition (SVD) of an $m \times n$ matrix at every step [29]. In comparison, SMCE has a time complexity of $O(k)$ (where $k$ is the number of iterations of the optimization). The LRE has a time complexity of $O(N^3)$, where $N$ is the number of points, but is in general faster than the LRNE as it solves for the reconstruction coefficients of all the data-points at once. In the future we will look to adapt techniques such as the one described in [28], which have previously been used to further improve the time complexity of rank-based techniques.
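For completeness, the embedding step of Eqn. (4) reduces, as in LLE [4], to the bottom eigenvectors of $(I - R)^T (I - R)$; a minimal sketch:

```python
import numpy as np
from scipy.linalg import eigh

def lrne_embedding(R, d):
    """Embed from the reconstruction matrix R (n x n) into d dimensions (Eqn. (4))."""
    n = R.shape[0]
    M = (np.eye(n) - R).T @ (np.eye(n) - R)
    # skip eigenvector 0 (the constant vector); keep the next d
    _, vecs = eigh(M, subset_by_index=[1, d])
    return vecs                                  # (n, d) embedding coordinates
```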

TABLE I: EFFECT OF PARAMETERS ON CLUSTERING PERFORMANCE (misclassification of SMCE and LRNE for k = 20, 30, 40, 50, 60 across a range of λ values, with LRE for comparison).

Fig. 4. Simulated data-set containing adjoining manifolds: (a) original labels (b) SMCE class labels (c) LRE class labels (d) LRNE class labels (e) LLE embedding (f) SMCE embedding (g) LRE embedding (h) LRNE embedding.

Fig. 5. Intersecting manifolds simulated data-set: (a) original labels (b) SMCE class labels (c) LRE class labels (d) LRNE class labels (e) LLE embedding (f) SMCE embedding (g) LRE embedding (h) LRNE embedding.

IV. EXPERIMENTS AND RESULTS

We tested the new algorithm on simulated data-sets as well as real (benchmark) data-sets for manifold clustering and embedding. We compared the performance of the LRNE with that exhibited by the LRE and SMCE algorithms. The LRNE outperformed its competitors in terms of both classification and embedding.

A. Simulated Data

The performance of the LRNE was first assessed on simulated manifolds. The simulated data-sets comprise a pair of rectangular sheets warped by smooth nonlinear mappings (using sinc and other trigonometric functions) and corrupted by Gaussian white noise ($\mathcal{N}(0, 0.01)$). In the first data-set the pair of sheets share a boundary, as shown in Fig. 1 (a) (simulating adjoining manifolds), while in the other they pass through each other, as shown in Fig. 1 (b) (modeling intersecting manifolds). The parameter space representation of each manifold in the adjoining manifolds data-set is a 3 × 3 patch sampled uniformly; each manifold contains 630 points. On the other hand, each of the intersecting manifolds is made up of 1000 data-points uniformly sampled on a 4 × 4 patch. For both these experiments we assume that we have knowledge of the number and type (whether adjoining or intersecting) of manifolds.

The three algorithms under test feature a parameter (λ in Eqn. (3) for the LRNE) that trades off the reconstruction objective with a penalty on using points on the wrong manifold. Additionally, LRNE and SMCE share with LLE a parameter k, the number of neighbors in the k-NN graph. The various algorithms were tested on the different data-sets at different parameter values; the results for the adjoining manifolds are shown in Table I. The parameters were incrementally varied until the performance began to deteriorate. The LRE shows near constant performance and does not change significantly with the change in the parameters. The LRNE and the SMCE, on the other hand, are more sensitive to the parameter λ. Both algorithms are quite stable with respect to classification performance, but in general, over a wide range of parameters, the LRNE shows improved performance over the SMCE. Due to the unsupervised nature of the approaches under test, the notion of the set of parameter values that yields the best overall performance is dependent on the structure of the data-set. One can decide, for example, to put more emphasis on lowering the misclassification rate at the expense of the embedding performance, and vice versa.
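For concreteness, one way to generate a warped sheet of the kind described above (the exact trigonometric warpings used in the paper are not specified, so the mapping below is only a stand-in with the same flavor; $\mathcal{N}(0, 0.01)$ is read as a variance, i.e. a standard deviation of 0.1):

```python
import numpy as np

rng = np.random.default_rng(0)
s = rng.uniform(0, 3, 630)                          # uniform 3 x 3 parameter patch
t = rng.uniform(0, 3, 630)
sheet = np.column_stack([s, t, np.sinc(s - 1.5)])   # smooth sinc-type lift into R^3
sheet += rng.normal(0.0, 0.1, sheet.shape)          # additive Gaussian white noise
```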
Table II records the best performance in terms of misclassification rate of the different algorithms, over the above parameter ranges, on the simulated data-sets described above and on the simulated hyperspectral mixture data-set described in Section IV-B. In order to have a pictorial representation of the results, Figs. 4 and 5 illustrate the corresponding clustering and embedding results on the two warped-sheet data-sets. In particular, Fig. 4 concerns the simulated data-set with adjoining manifolds, while Fig. 5 refers to the simulated data-set with intersecting manifolds. In both figures the classification results are featured in the first row against the true manifold labels, and the embedding results are shown in the second row. The optimal embedding performance to compare the three competitors against can be considered the one achieved by applying LLE separately on each manifold.

As expected, the LRE does not fare well when the manifolds exhibit non-trivial curvature. Since nonlinear manifolds are only locally linear, reconstruction coefficients for data-points from different parts of the manifold have significantly differing structures. This does not conform to the low-rank penalty on the reconstruction matrix, which leads to highly flawed classification performances, as shown in Figs. 4 (c) and 5 (c).

TABLE II: MISCLASSIFICATION RATES OF THE DIFFERENT ALGORITHMS (LRE, SMCE and LRNE, with the corresponding k, on the Simulated Intersecting, Simulated Adjoining and Hapke with Shared EM data-sets).

The embeddings generated by the LRE are also invalid, mainly due to the poor clustering performance. The SMCE, on the other hand, performs well in terms of the classification of the adjoining manifolds, as shown in Fig. 4 (b), but the embeddings are significantly distorted, as shown in Fig. 4 (f). The poor embedding performance is caused by the fact that the penalty on the sparsity of the neighborhood of each target point results in adjacent neighborhoods that do not share significant overlap (i.e. neighboring points have vastly differing neighborhood structures). Since the embedding technique is local, i.e. it only penalizes distortions in the shape of local neighborhoods, the different neighborhoods may be embedded with slightly different orientations or scalings, leading to distortions in the global shape of the parameter space representation. The SMCE clustering performance in the case of intersecting manifolds is flawed, as shown in Fig. 5 (b): the clustering identifies one of the arms as a cluster, as opposed to a manifold. This is because the reconstruction matrix in the case of intersecting manifolds with a nearest neighbor structure has twice as many blocks as manifolds, as described in Section III-A. The SMCE uses distance-based sparsity constraints which ensure that points on the other manifold and on the opposing arm are assigned zero/very low reconstruction coefficients. Since the SMCE assigns zero/very low reconstruction coefficients to points that are far away from the target point, there is no viable way of identifying the clusters/arms that belong to the same manifold.

The LRNE performs well in terms of classification for both adjoining and intersecting manifolds, as shown in Figs. 4 (d) and 5 (d). The embedding from the LRNE compares quite favorably to the LLE embedding (Figs. 4 (e) and 5 (e)) and is quite successful in identifying the general shape of the parameter space (it identifies approximately rectangular shapes), as shown in Figs. 4 (h) and 5 (h).

B. Hyperspectral Mixture Data

Hyperspectral imagers (HSIs) or imaging spectrometers measure electromagnetic energy scattered in their field of view in the Visible to Near InfraRed (VNIR) wavelength range ( nm). HSI data-sets are organized into planes that form a data cube: each plane corresponds to solar electromagnetic energy reflected off the surface of materials, acquired over a narrow wavelength range (a spectral channel) for all pixels, and each pixel represents a vector of measurements acquired at a given location for all spectral channels (a reflectance spectrum) [30]. A spectrum can also be interpreted as a point in a high-dimensional space of dimension equal to the number of spectral channels. This experiment simulates the scenario in which a hyperspectral imager observes several pixels of a terrain composed of mechanical (or intimate) mixtures of different materials, called endmembers, as in a sand beach made up of grains of different minerals. The spectra of such intimate mixtures are described by physical models, such as the one introduced by Hapke [31].

Fig. 6. Simulated Hapke data-set: (a) original labels (b) SMCE class labels (c) LRE class labels (d) LRNE class labels (e) LLE embedding (f) SMCE embedding (g) LRNE embedding.
In [10] it has been shown that the point cloud representing intimately mixed spectra of known materials (endmembers), if modeled using Hapke's model, can be considered as lying near a manifold obtained by sampling an abundance simplex. Each mixed pixel can be modeled by linearly combining the endmember spectra, weighted according to the sampled abundance coefficients, in a D-dimensional space (where D is the number of spectral bands), and then applying a nonlinear mapping in that space. Even if the abundance simplex is uniformly sampled, the nonlinear map produces a point cloud that exhibits a density gradient, with higher density (of samples) near the dark endmembers and lower density around the brighter endmembers (as explained in [10]). The exact nature and amount of the density gradient depends upon the endmembers chosen in the mixture. The data-set was chosen because of the high dimensionality of the ambient space and the non-uniform sampling of the manifolds. The density gradient affects neighborhood structures at points with low density, as even the nearest neighbors are quite far away, making this a harder data-set for manifold learning algorithms.

We modeled a data-set which contains two ternary mixtures (mixtures with 3 endmembers) with two common endmembers. The resulting data cloud exhibits points lying near 2 manifolds adjoining at the boundary occupied by mixed spectra that are combinations of the shared endmembers. We chose four mineral endmember spectra from the RELAB spectral database¹: olivine, ripidolite, illite and nontronite samples. We generated the data by sampling each 2-D abundance simplex uniformly, and the mixed spectra were generated according to the Hapke model. (¹ RELAB Spectral Database: Copyright 2008, Brown University, Providence, RI; All Rights Reserved.)

In Fig. 6 we show the clustering and embedding performance of the three competing algorithms on this data-set. The set-up is the same as in Figs. 4 and 5, except that the high-dimensional data are projected onto the first three principal components for visualization.
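A sketch of the mixture-generation recipe described above, with hypothetical endmember spectra in place of the RELAB minerals and a generic nonlinearity standing in for the Hapke mapping (which is not reproduced here); a Dirichlet distribution with unit concentration samples the abundance simplex uniformly:

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, n_bands = 1000, 200
endmembers = rng.uniform(0.05, 0.9, (3, n_bands))       # placeholder spectra
abundances = rng.dirichlet(np.ones(3), size=n_points)   # uniform on the 2-D simplex
linear_mix = abundances @ endmembers                    # abundance-weighted combination
nonlinear_map = lambda r: r / (1.0 + r)                 # stand-in for the Hapke nonlinearity
spectra = nonlinear_map(linear_mix)                     # points near a curved 2-D manifold
```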

TABLE IV: AVG. ERROR IN EMBEDDING PERFORMANCE (LLE vs. LRNE for the two manifolds in each of the adjoining manifolds, intersecting manifolds and ternary mixtures data-sets).

It is important to notice that the particular choice of endmembers creates an interesting density pattern. For the blue mixture, the point-density decreases from the intersection between the two mixtures towards the corner represented by the non-shared endmember. For the red mixture, a (more intense) decrease is observed towards one of the shared endmembers as well.

The LRE clustering performance for this data-set shows that this algorithm suffers significantly in the presence of the density gradient, as depicted in Fig. 6 (c). The result affected the embedding so significantly that we opted not to show the LRE embeddings. The SMCE shows reasonable clustering results, as shown in Fig. 6 (b); the performance is, however, significantly degraded as compared to the one on the uniformly-sampled adjoining manifolds. In particular, the performance is affected significantly near the corners of the red mixture, as seen in Fig. 6 (b). In the region of the intersection towards one of the shared endmembers, the neighborhoods of points on the red manifold might contain mostly points from the blue manifold. Since the SMCE prioritizes the closest neighbors and selects very few neighbors, it creates scenarios where the data-points in this region are completely disconnected from the correct manifold, so that many of those points are classified as blue. The SMCE embeddings show significant distortions and missing pieces, due to the clustering mistakes, as shown in Fig. 6 (f).

The LRNE performs slightly better than the SMCE in terms of clustering in the presence of such density effects, as shown in Fig. 6 (d). This is due to the fact that the algorithm does not have a sparsity penalty and can rely on more points for reconstruction in low-density neighborhoods. The LRNE embeddings (in Fig. 6 (g)) are significantly closer to the optimal LLE embeddings (Fig. 6 (e)) as compared to the SMCE embeddings. The missing parts correspond to the incorrectly classified parts of the simplex. [Note: embeddings of incorrectly classified points are not shown for clarity.] We have observed a similar trend of the LRNE outperforming the LRE and SMCE in terms of both clustering and embedding for several other endmember configurations. A similar example was presented in [21], in addition to results on a real hyperspectral data-set acquired by the authors.

C. Analyzing the Embedding Performance

In this section we attempt a quantitative comparison of the embedding performance of the LRNE to the optimal embeddings. In the best scenario, the embeddings generated by the optimal LLE are an approximate representation of the parameter space up to some affine transformation [32]. In the examples analyzed so far, the manifolds are nonlinear mappings of convex sets in some low-dimensional space, i.e. the intersecting and adjoining manifolds are nonlinear mappings of rectangular sheets, while the mixture manifolds are nonlinear mappings of the 2-D abundance simplex (a triangle). Since the data-sets are convex in the parameter space, each point in the parameter space can be expressed as a convex combination of the cloud vertices. If we ignore the performance loss due to the linear global transformation between the embedded cloud and the original parametrization, we can consider the optimal LLE embedding performance as the best achievable by a local affine reconstruction.
The embeddings will also be convex sets, and the points in the embedded spaces can also be expressed as convex combinations of the vertices. It is then interesting to compare the coefficients of the convex combination of vertices in the original parameter space with those of the convex combination of vertices in the embedding, as a way to quantitatively evaluate the embedding performance. This is important in some applications in which the coefficients carry semantic value. For example, in hyperspectral unmixing the coefficients represent the fractional contributions (abundances) of the different endmembers to the mixed pixels.

We devised a simple method to perform the comparison. We express each point $x_i$ as a convex combination of some vertices $V$ with weights $W_i$. From the embeddings we also find the estimated weights $W_i^{est}$ for the embedded points $y_i$ with respect to the embedded versions $V_y$ of the same vertices. We define the average error in the embedding of a manifold as $\frac{1}{n} \sum_{i=1}^{n} \| W_i^{est} - W_i \|$, where $n$ is the number of correctly classified points in the manifold. [Note: we do not consider incorrectly classified points, as the embedding error in those cases is affected by the classification.]

The effect of the parameter λ and of the number of neighbors k on the embedding error of the LRNE is shown for one of the manifolds in the adjoining manifolds data-set in Table III. In general we note that the embedding error is slightly higher than the corresponding LLE error. For very small values of λ, incorrect points are given high priority in the reconstruction, leading to distortions. For high values of λ, the algorithm prioritizes low-rank neighborhoods over reconstruction error, leading to large embedding errors. The best embedding error for the different simulated manifolds is shown in Table IV. In general the best embedding error from the LRNE is very close to the LLE embedding error.
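A sketch of this comparison (the paper does not specify a solver for the convex weights; here they are obtained by non-negative least squares with a heavily weighted sum-to-one row, a standard trick):

```python
import numpy as np
from scipy.optimize import nnls

def convex_vertex_weights(points, vertices, penalty=1e3):
    """Express each row of `points` as a convex combination of the rows of `vertices`."""
    A = np.vstack([vertices.T, penalty * np.ones((1, vertices.shape[0]))])
    W = np.empty((points.shape[0], vertices.shape[0]))
    for i, p in enumerate(points):
        W[i], _ = nnls(A, np.concatenate([p, [penalty]]))   # soft sum-to-one constraint
    return W

def avg_embedding_error(W_true, W_est):
    """(1/n) sum_i || W_i^est - W_i || over the correctly classified points."""
    return np.mean(np.linalg.norm(W_est - W_true, axis=1))
```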

D. Experiments with Real Data

In this section we evaluate the LRNE algorithm on some well-known real data-sets used as benchmarks for manifold clustering. Real data-sets offer specific challenges, which vary according to the nature and type of noise in the data and the number of manifold intersections. Additionally, we seldom have information on the sample distribution on the manifold or on whether the data are best modeled as adjoining or intersecting manifolds. To analyze the robustness of the various algorithms to issues encountered in real data-sets, we consider their performance on three data-sets: (i) a subset of the digits from the MNIST database [33], (ii) the Extended Yale Face Database B [34], [35], and (iii) images of objects from the well-known COIL-20 database [36].

TABLE III: EFFECT OF PARAMETERS ON LRNE EMBEDDING OF MANIFOLD-1 (embedding error over a grid of k and λ values, with the LLE error as reference).

TABLE VII: CLASSIFICATION PERFORMANCE ON THE YALE FACE-B DATA-SET (misclassification rate for SMCE, LRE and LRNE).

TABLE VIII: CLASSIFICATION PERFORMANCE FOR THE SUBSET OF THE COIL-20 DATA-SET (% misclassification for SMCE, LRE and LRNE).

Fig. 7. Top row: embeddings generated for the MNIST data-set by the LRNE for clusters corresponding to (a) digit 3, (b) digit 4, (c) digit 6. Bottom row: embeddings generated for the MNIST data-set by the SMCE for clusters corresponding to (d) digit 3, (e) digit 4, (f) digit 6.

1) MNIST Digit data-set: Each image in the MNIST digit data-set is a 28 × 28 grayscale image of a handwritten number. Following the SMCE paper [11], we only consider 5 of the classes, namely the ones corresponding to the digits {0, 3, 4, 6, 7}. In particular, we draw at random 200 samples from each of these classes to form the test data-set. To reduce the variance of the estimation we perform multiple (four) trials with these settings and report the average performance across these trials. Similarly to the experiments on the simulated data-sets, we report the classification error across a range of values of the penalty parameters for the three competing algorithms in Table VI. The results show that the LRNE outperforms the SMCE and the LRE. The best clustering performance of the various algorithms over the different classes is shown in Table V. The SMCE particularly struggles with the classes 4 and 6. The embeddings generated by the LRNE and the SMCE for specific classes are shown in Fig. 7. The embeddings of the digits show that the stylized digits occur as outliers, separated from the other data. Note that images with similar shapes are spatial neighbors in the LRNE embeddings, while spatially nearest neighbors have different shapes in the SMCE embeddings (especially in the case of the digits 3 and 6). [Note: to facilitate better discrimination, incorrectly classified points are not shown in the embeddings for the real data-sets.]

2) Extended Yale Face Database: In this experiment we consider the problem of clustering and embedding face images of two subjects (specifically the images of the second and the fifth people in the data-set) from the Extended Yale B database, as proposed in [11]. In the original database there are 64 images corresponding to each subject, captured under fixed pose and with changes in illumination [34]. For this experiment we use a resized version of the images; these data are described in [37]². The three competitors were tested with the parameter λ in the range [0.01, 100] (in multiples of 10), whereas the parameter k was varied between 5 and 10 in increments of 1. The best classification performance is shown in Table VII. In terms of clustering results, the LRNE matches the performance of the SMCE. The embeddings generated by the LRNE and the SMCE for the different classes are shown in Fig. 8. The LRNE embeddings resolve both light-direction and image brightness and appear smooth and not disconnected. (² This data-set is available for download at dengcai/data/facedata.html)
3) The Columbia Object Image Library (COIL-20) data-set: The COIL-20 image database consists of images of 20 different objects; each object was placed on a turntable and 72 images were taken of each object, with a 5° rotation between successive images. In this scenario it is expected that the nearest neighbors of each image are the pictures with the smallest change in angle (i.e. the images with a 5° rotation either way). This ensures that, even when we choose the sparsest (smallest) neighborhoods, there is overlap between the neighborhoods, as the nearest neighbors on each side are approximately the same distance away. Along with the one-dimensional nature of the manifold, the overlap between very sparse neighborhoods makes this an ideal test case for the SMCE. We will compare the different algorithms in terms of both classification and embedding performance. As with the Yale data-set, in this experiment we use a reduced-size version of this database, described in [38]³. (³ This data-set is available for download at dengcai/data/mldata.html)

TABLE V: AVERAGE CLASSIFICATION PERFORMANCE ACROSS THE DIFFERENT CLASSES IN THE MNIST DIGITS DATA-SET WITH LRNE (λ = 0.01 & k = 50), SMCE (λ = 1 & kmax = 50) AND LRE (λ = 10) OVER MULTIPLE TRIALS (confusion matrix of assigned vs. true labels for digits 0, 3, 4, 6, 7, with per-class misclassification).

TABLE VI: AVERAGE MISCLASSIFICATION RATES ON THE MNIST DATA-SET FOR THE DIFFERENT ALGORITHMS OVER MULTIPLE TRIALS (SMCE with kmax = 50 for λ = 0.1, 1, 5; LRNE with k = 50 for λ = 1e-5, 0.01, 0.05, 1, 10; LRE).

Fig. 8. The embeddings generated on the Extended Yale Face data-set for the clusters corresponding to (a) the first person by LRNE, (b) the second person by LRNE, (c) the first person by SMCE, (d) the second person by SMCE.

Fig. 9. (a) Classes chosen from the COIL-20 object database (high resolution images used for display). (b) and (c) Embeddings of the classes corresponding to the cups when classification accuracy is prioritized: (b) LRNE and (c) SMCE.

In this experiment we concentrate on a subset of the COIL-20 database made up of 6 different objects, shown in Fig. 9 (a). The parameter λ was varied in the range [0.01, 100] in multiples of 10, while the number of neighbors was varied between [5, 25] with a step-size of 5. The clustering performance of the different algorithms is shown in Table VIII. We note that the LRNE slightly improves on the performance of the SMCE.

We evaluate the embedding performance in terms of the two cup-like objects (the first two objects on the bottom row) shown in Fig. 9 (a). First, let us look at the embeddings corresponding to the best classification performance. In terms of these embeddings, while both algorithms are successful in learning the directions of variation, the SMCE generates embeddings that are essentially 1-D, whereas the 1-D manifold is obtained only partially by the LRNE. As discussed previously, for a 1-D manifold, preserving fewer of the high-dimensional distances serves the SMCE well, whereas trying to preserve larger neighborhoods costs the LRNE, as those neighborhoods include points that are farther away, whose distances should not be preserved.

If we look at the best embedding performance achievable by the algorithms: for the LRNE we can change the parameters to select small neighborhoods. For example, if we set the parameters to k = 2 and λ = 0.5, the LRNE generates near perfect embeddings, as shown in Fig. 10 (a), in that it clearly identifies that the manifold is 1-D. This is intuitive, as there is only one degree of freedom in the various images (the angle of rotation of the turntable), and progressing along the manifold clearly shows the rotation (which can be appreciated by following the symbols in the figure on the left and the cup handle in the figures on the right). Further, the points appear approximately equidistant from each other, which is also expected, as the change in angle is the same between each pair of pictures. The improved embeddings, though, come at the cost of the classification performance, and in this case the misclassification rate jumps to 7.41%. For the SMCE the best embedding performance occurs if we set kmax = 3 and λ = 0.5, especially in the case of the cup with the handle; for the other manifold the embedding does not change much. The improvements in the embedding also carry a penalty on the classification performance, and the misclassification rate is 8.10%. While the embedding is 1-D, the circular pattern is not as concise as the one obtained by the LRNE, as the circular patterns in Fig. 10 (b) can be obtained as a nonlinear mapping of the patterns in Fig. 10 (a).

Fig. 10. Embeddings generated by the different algorithms when embeddings are prioritized over the classification: (a) LRNE and (b) SMCE.

V. CONCLUSION & FUTURE WORK

The Low Rank Neighborhood Embedding algorithm successfully generates a reconstruction matrix that can be used for both manifold clustering and embedding. The LRNE outperforms existing state-of-the-art algorithms in terms of both clustering and embedding over a variety of simulated and real data-sets. The LRNE shows improved clustering especially in scenarios where there are local variations in density. Additionally, since the LRNE allows the user to choose the size of the neighborhood k, we can ensure that there is enough overlap between different neighborhood patches, which in turn ensures that the LRNE is better able to retrieve the global shape of the parameter space. The embeddings generated by the LRNE compare favorably to the ones generated by dedicated embedding algorithms run on each manifold separately.

Future work will focus on automating the decision on whether the manifolds are best modeled as adjoining or intersecting manifolds, i.e. whether any of the classes generated by the spectral clustering should be further merged. Another avenue of research is the use of techniques such as the one described in [39] to make the algorithm parameterless. Attempts are also ongoing to evaluate the embedding quality in a more principled manner.

APPENDIX A
ADMM BASED OPTIMIZATION SCHEME FOR THE LRNE

Recall the optimization problem for the LRNE described in Eqn. (3):

$$\min_{\alpha} \frac{1}{2} \Big\| x - \sum_{i=1}^{k} \alpha_i n_i \Big\|^2 + \lambda \|\hat{M}\|_* \quad \text{subject to} \quad \mathbf{1}^T \alpha = 1. \tag{5}$$

This problem can be simplified and rewritten as:

$$\min_{\alpha} \frac{1}{2} \| x - N\alpha \|^2 + \lambda \Big\| \hat{N} \sum_{i=1}^{k} \alpha_i e_i e_i^T \Big\|_* \quad \text{subject to} \quad \mathbf{1}^T \alpha = 1, \tag{6}$$

where the vector $e_i$ is such that its $i$-th element is 1 and all other elements are 0, and $\hat{N}$ is the normalized neighborhood matrix described in Section III. All the terms in the objective function described above are convex, and this optimization problem can be solved using the Alternating Direction Method of Multipliers (ADMM) [25] to find an optimal solution.
A dummy variable is introduced, and the variable-augmented optimization problem can be written as:

$$\min_{\alpha, V} \frac{1}{2} \| x - N\alpha \|^2 + \lambda \|V\|_* \quad \text{subject to} \quad \mathbf{1}^T \alpha = 1, \quad V = \hat{N} \sum_{i=1}^{k} \alpha_i e_i e_i^T.$$

The augmented Lagrangian for the ADMM can be written as:

$$L(V, \alpha, \Lambda_1, \Lambda_2) = \frac{1}{2} \| x - N\alpha \|^2 + \lambda \|V\|_* + \frac{\beta}{2} \Big\| \mathbf{1}^T \alpha - 1 + \frac{1}{\beta} \Lambda_1 \Big\|_F^2 + \frac{\beta}{2} \Big\| V - \hat{N} \sum_{i=1}^{k} \alpha_i e_i e_i^T + \frac{1}{\beta} \Lambda_2 \Big\|_F^2. \tag{7}$$

The ADMM decomposes the minimization into two separate optimization problems.

Update equation for V: the update iterations for the variable V with respect to the augmented Lagrangian are:

$$V^{k+1} = \operatorname*{argmin}_{V} L(V, \alpha^k, \Lambda_1^k, \Lambda_2^k) = \operatorname*{argmin}_{V} \; \lambda \|V\|_* + \frac{\beta}{2} \Big\| V - T + \frac{1}{\beta} \Lambda_2^k \Big\|_F^2, \tag{8}$$

where $T = \hat{N} \sum_{i=1}^{k} \alpha_i^k e_i e_i^T$. This equation is quite similar to Eqn. (6) in Lin et al. [40], and can similarly be solved by a proximal update using singular value thresholding [41].

Update equation for α: the next step is the optimization with respect to α, which can be written as:

$$\alpha^{k+1} = \operatorname*{argmin}_{\alpha} L(V^{k+1}, \alpha, \Lambda_1^k, \Lambda_2^k) = \operatorname*{argmin}_{\alpha} \; \underbrace{\frac{1}{2} \| x - N\alpha \|^2}_{J_1} + \underbrace{\frac{\beta}{2} \Big\| \mathbf{1}^T \alpha - 1 + \frac{1}{\beta} \Lambda_1^k \Big\|_F^2}_{J_2} + \underbrace{\frac{\beta}{2} \Big\| V^{k+1} - \hat{N} \sum_{i=1}^{k} \alpha_i e_i e_i^T + \frac{1}{\beta} \Lambda_2^k \Big\|_F^2}_{J_3}. \tag{9}$$
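Putting the two updates together, here is a compact numpy sketch of the whole loop for one target point (β fixed and no stopping test, both simplifications of ours; the closed-form α-update anticipates the stationary point derived below):

```python
import numpy as np

def svt(T, tau):
    """Singular value thresholding: the proximal operator of tau * ||.||_* [41]."""
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def lrne_admm(x, N, lam=0.1, beta=1.0, n_iter=100):
    """ADMM for Eqn. (6): V-update by SVT, closed-form alpha-update, dual ascent."""
    D, k = N.shape
    N_hat = N / np.linalg.norm(N, axis=0)
    alpha = np.full(k, 1.0 / k)
    L1, L2 = 0.0, np.zeros((D, k))
    ones = np.ones(k)
    for _ in range(n_iter):
        V = svt(N_hat * alpha - L2 / beta, lam / beta)      # Eqn. (8)
        F1 = 1.0 - L1 / beta
        F2 = V + L2 / beta
        P = np.einsum('di,di->i', N_hat, F2)                # P_i = Tr(e_i e_i^T N^T F2)
        Q = np.sum(N_hat ** 2, axis=0)                      # diagonal of Q (all ones here)
        A = N.T @ N + beta * np.outer(ones, ones) + beta * np.diag(Q)
        b = N.T @ x + beta * ones * F1 + beta * P
        alpha = np.linalg.solve(A, b)                       # stationary point, Eqn. (13)
        L1 += beta * (ones @ alpha - 1.0)                   # multiplier updates
        L2 += beta * (V - N_hat * alpha)
    return alpha
```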

Fig. 10. Embeddings generated by the different algorithms when embedding quality is prioritized over classification: (a) LRNE and (b) SMCE.

First we find the gradients of these terms with respect to $\alpha$. The gradients of the first two terms are straightforward:

$$\frac{\partial J_1}{\partial \alpha} = -N^T x + N^T N\alpha \tag{10}$$

$$\frac{\partial J_2}{\partial \alpha} = \beta\,\mathbf{1}\mathbf{1}^T\alpha - \beta\,\mathbf{1}F_1, \tag{11}$$

where $F_1 = 1 - \frac{1}{\beta}\Lambda_1^k$. The differentiation of the third term is slightly more complex. We begin by defining $F_2 = V^{k+1} + \frac{1}{\beta}\Lambda_2^k$, so that $J_3 = \frac{\beta}{2}\big\|F_2 - \hat{N}\sum_i \alpha_i e_i e_i^T\big\|_F^2$. Expanding the squared Frobenius norm,

$$\Big\|F_2 - \hat{N}\sum_i \alpha_i e_i e_i^T\Big\|_F^2 = \operatorname{Tr}(F_2^T F_2) - 2\operatorname{Tr}\Big(\sum_i \alpha_i e_i e_i^T \hat{N}^T F_2\Big) + \operatorname{Tr}\Big(\sum_i \alpha_i e_i e_i^T \hat{N}^T \hat{N} \sum_j \alpha_j e_j e_j^T\Big).$$

Using the linearity and cyclic-permutation properties of the trace ($\operatorname{Tr}$), this can be rewritten as

$$\operatorname{Tr}(F_2^T F_2) - 2\sum_i \alpha_i \operatorname{Tr}(e_i e_i^T \hat{N}^T F_2) + \sum_i \sum_j \alpha_i \alpha_j \operatorname{Tr}(e_j e_j^T e_i e_i^T \hat{N}^T \hat{N}).$$

By definition $e_j^T e_i = 0$ if $j \neq i$ and $e_i^T e_i = 1$, so the cross terms vanish, and differentiating with respect to $\alpha_i$ gives

$$\frac{\partial J_3}{\partial \alpha_i} = \beta\Big(-\operatorname{Tr}(e_i e_i^T \hat{N}^T F_2) + \alpha_i \operatorname{Tr}(e_i e_i^T \hat{N}^T \hat{N})\Big).$$

The gradient with respect to all the entries of $\alpha$ can therefore be written as

$$\frac{\partial J_3}{\partial \alpha} = -\beta P + \beta Q\alpha, \tag{12}$$

where the $i$-th entry of the vector $P$ is given by $P_i = \operatorname{Tr}(e_i e_i^T \hat{N}^T F_2)$ and $Q$ is the diagonal matrix with entries $Q_{ii} = \operatorname{Tr}(e_i e_i^T \hat{N}^T \hat{N})$. Using the results in Eqns. (10), (11) and (12), setting the total gradient to zero gives

$$-N^T x + N^T N\alpha - \beta\,\mathbf{1}F_1 + \beta\,\mathbf{1}\mathbf{1}^T\alpha - \beta P + \beta Q\alpha = 0. \tag{13}$$

Solving this equation, the optimal value is given by the stationary point

$$\hat{\alpha} = \left(N^T N + \beta\,\mathbf{1}\mathbf{1}^T + \beta Q\right)^{-1}\left(N^T x + \beta\,\mathbf{1}F_1 + \beta P\right).$$

The second derivative with respect to $\alpha$ at this stationary point is

$$\frac{\partial^2 L}{\partial \alpha^2} = N^T N + \beta\,\mathbf{1}\mathbf{1}^T + \beta Q.$$

The matrices $N^T N$, $\mathbf{1}\mathbf{1}^T$ and $Q$ are all positive semi-definite, so the Hessian is positive semi-definite (and generically positive definite), making the stationary point a minimum.

Update for the Lagrangian multipliers: The Lagrangian multipliers can then be updated as

$$\Lambda_1^{k+1} = \Lambda_1^k + \beta\left(\mathbf{1}^T\alpha - 1\right), \qquad \Lambda_2^{k+1} = \Lambda_2^k + \beta\Big(V - \hat{N}\sum_i \alpha_i e_i e_i^T\Big).$$
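Putting the pieces together, one ADMM pass alternates the SVT step of Eqn. (8), the closed-form solve of Eqn. (13) and the multiplier updates above. The following sketch (reusing the `svt` helper from earlier) is a minimal illustration under assumed names, a fixed penalty `beta` and a simple stopping rule; it is not the authors' reference implementation.

```python
import numpy as np

def lrne_admm(x, N, Nhat, lam, beta=1.0, n_iter=200, tol=1e-6):
    """Sketch of the ADMM iteration for one data point x with neighbors N."""
    k = N.shape[1]
    alpha = np.full(k, 1.0 / k)               # feasible start: 1^T alpha = 1
    L1 = 0.0                                   # scalar multiplier for 1^T alpha = 1
    L2 = np.zeros_like(Nhat)                   # matrix multiplier for the V-constraint
    ones = np.ones(k)
    Q = np.diag(np.sum(Nhat * Nhat, axis=0))   # Q_ii = squared norm of i-th column of Nhat
    # system matrix of Eqn. (13); constant across iterations
    A = N.T @ N + beta * np.outer(ones, ones) + beta * Q
    for _ in range(n_iter):
        # V-update, Eqn. (8): proximal step on the nuclear norm
        T = Nhat * alpha                       # Nhat @ diag(alpha), columnwise scaling
        V = svt(T - L2 / beta, lam / beta)
        # alpha-update, Eqn. (13): closed-form linear solve
        F1 = 1.0 - L1 / beta
        F2 = V + L2 / beta
        P = np.sum(Nhat * F2, axis=0)          # P_i = tr(e_i e_i^T Nhat^T F2)
        alpha_new = np.linalg.solve(A, N.T @ x + beta * F1 * ones + beta * P)
        # multiplier updates
        L1 += beta * (ones @ alpha_new - 1.0)
        L2 += beta * (V - Nhat * alpha_new)
        if np.linalg.norm(alpha_new - alpha) < tol:
            alpha = alpha_new
            break
        alpha = alpha_new
    return alpha
```

Running such a routine for every data point against its selected neighborhood yields the per-point coefficient vectors that are assembled into the (near block diagonal) reconstruction matrix used for spectral clustering and embedding.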

REFERENCES

[1] X. Huo, X. S. Ni, and A. K. Smith, "A survey of manifold-based learning methods," Recent Advances in Data Mining of Enterprise Data.
[2] J. Zhang, H. Huang, and J. Wang, "Manifold learning for visualizing and analyzing high-dimensional data," IEEE Intelligent Systems, vol. 25, no. 4.
[3] J. B. Tenenbaum, V. de Silva, and J. C. Langford, "A global geometric framework for nonlinear dimensionality reduction," Science, vol. 290, no. 5500, p. 2319.
[4] S. T. Roweis and L. K. Saul, "Nonlinear dimensionality reduction by locally linear embedding," Science, vol. 290.
[5] M. Belkin and P. Niyogi, "Laplacian eigenmaps and spectral techniques for embedding and clustering," in Advances in Neural Information Processing Systems 14. MIT Press, 2001.
[6] Z.-Y. Zhang and H.-Y. Zha, "Principal manifolds and nonlinear dimensionality reduction via tangent space alignment," Journal of Shanghai University (English Edition), vol. 8, no. 4.
[7] L. van der Maaten, E. Postma, and J. van den Herik, "Dimensionality reduction: A comparative review," Journal of Machine Learning Research, vol. 10.
[8] Y. Ma and Y. Fu, Manifold Learning Theory and Applications. CRC Press, 2012.
[9] J. Shi and J. Malik, "Normalized cuts and image segmentation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8.
[10] A. Saranathan and M. Parente, "Manifold clustering based unmixing for the multiple intimate mixture scenario," in Proc. 6th IEEE Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS).
[11] E. Elhamifar and R. Vidal, "Sparse manifold clustering and embedding," in Advances in Neural Information Processing Systems 24, J. Shawe-Taylor, R. Zemel, P. Bartlett, F. Pereira, and K. Weinberger, Eds. Curran Associates, Inc., 2011.
[12] Y. Wang, Y. Jiang, Y. Wu, and Z.-H. Zhou, "Local and structural consistency for multi-manifold clustering," in Proc. International Joint Conference on Artificial Intelligence (IJCAI), vol. 22, no. 1, 2011.
[13] ——, "Multi-manifold clustering," in PRICAI 2010: Trends in Artificial Intelligence, 11th Pacific Rim International Conference on Artificial Intelligence, Daegu, Korea, August 30–September 2, 2010. Berlin, Heidelberg: Springer, 2010.
[14] R. Souvenir and R. Pless, "Manifold clustering," in Proc. IEEE International Conference on Computer Vision (ICCV), 2005.
[15] R. Vidal, "Subspace clustering," IEEE Signal Processing Magazine, vol. 28, no. 2.
[16] D. Gong, X. Zhao, and G. Medioni, "Robust multiple manifolds structure learning," arXiv preprint.
[17] R. Liu, R. Hao, and Z. Su, "Mixture of manifolds clustering via low rank embedding," Journal of Information and Computational Science, vol. 8, no. 5.
[18] G. Liu, Z. Lin, and Y. Yu, "Robust subspace segmentation by low-rank representation," in Proc. 27th International Conference on Machine Learning (ICML), 2010.
[19] J. MacQueen et al., "Some methods for classification and analysis of multivariate observations," in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, no. 14. Oakland, CA, USA, 1967.
[20] A. M. Saranathan and M. Parente, "Simultaneous clustering and embedding for multiple intimate mixtures," in Proc. IEEE International Geoscience and Remote Sensing Symposium (IGARSS), July 2015.
[21] A. Saranathan and M. Parente, "Unmixing multiple intimate mixtures via a locally low-rank representation," in Proc. 8th IEEE GRSS Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS).
[22] E. J. Candès and B. Recht, "Exact matrix completion via convex optimization," Foundations of Computational Mathematics, vol. 9, no. 6.
[23] M. Grant and S. Boyd, "CVX: Matlab software for disciplined convex programming, version 2.1."
[24] ——, "Graph implementations for nonsmooth convex programs," in Recent Advances in Learning and Control, ser. Lecture Notes in Control and Information Sciences, V. Blondel, S. Boyd, and H. Kimura, Eds. Springer-Verlag Limited, 2008.
[25] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed optimization and statistical learning via the alternating direction method of multipliers," Foundations and Trends in Machine Learning, vol. 3, no. 1.
[26] Z. Yang, T. Hao, O. Dikmen, X. Chen, and E. Oja, "Clustering by nonnegative matrix factorization using graph random walk," in Advances in Neural Information Processing Systems, 2012.
[27] U. von Luxburg, "A tutorial on spectral clustering," Statistics and Computing, vol. 17, no. 4.
[28] Z. Lin, R. Liu, and Z. Su, "Linearized alternating direction method with adaptive penalty for low-rank representation," in Advances in Neural Information Processing Systems, 2011.
[29] G. H. Golub and C. F. Van Loan, Matrix Computations. JHU Press, 2012, vol. 3.
[30] M. T. Eismann, Hyperspectral Remote Sensing. Bellingham, WA: SPIE.
[31] B. Hapke, Theory of Reflectance and Emittance Spectroscopy, 2nd ed. Cambridge University Press, 2012.
[32] Y. Goldberg, A. Zakai, D. Kushnir, and Y. Ritov, "Manifold learning: The price of normalization," Journal of Machine Learning Research, vol. 9.
[33] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, 1998.
[34] A. Georghiades, P. Belhumeur, and D. Kriegman, "From few to many: Illumination cone models for face recognition under variable lighting and pose," IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 6.
[35] K.-C. Lee, J. Ho, and D. J. Kriegman, "Acquiring linear subspaces for face recognition under variable lighting," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 5.
[36] S. A. Nene, S. K. Nayar, H. Murase et al., "Columbia Object Image Library (COIL-20)," Tech. Rep.
[37] D. Cai, X. He, Y. Hu, J. Han, and T. Huang, "Learning a spatially smooth subspace for face recognition," in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2007.
[38] D. Cai, X. He, J. Han, and T. S. Huang, "Graph regularized nonnegative matrix factorization for data representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 8.
[39] L. Zelnik-Manor and P. Perona, "Self-tuning spectral clustering," in Advances in Neural Information Processing Systems 17, 2004.
[40] Z. Lin, M. Chen, and Y. Ma, "The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices," arXiv preprint.
[41] J.-F. Cai, E. J. Candès, and Z. Shen, "A singular value thresholding algorithm for matrix completion," SIAM Journal on Optimization, vol. 20, no. 4.

Arun M. Saranathan received the B.E. degree from Visvesvaraya Technological University, Belgaum, India, and the M.S. degree in Electrical Engineering from the University of Massachusetts, Amherst. He is currently a Ph.D. student in the Department of Electrical & Computer Engineering at the University of Massachusetts, Amherst. His interests include the use and extension of image segmentation techniques for hyperspectral (HSI) images and the use of manifold techniques to model the mixing seen in HSI. He is a Student Member of the IEEE.

Mario Parente (M'05–SM'13) received the B.S. and M.S. (summa cum laude) degrees in telecommunication engineering from the University of Naples Federico II, Italy, and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA. He was a Post-Doctoral Associate in the Department of Geosciences at Brown University. He is currently an Assistant Professor in the Department of Electrical and Computer Engineering at the University of Massachusetts Amherst. His research involves combining physical models and statistical techniques to address issues in remote sensing of Earth and planetary surfaces. Prof. Parente's professional interests include identification of ground composition, geomorphological feature detection, and imaging spectrometer data modeling, reduction and calibration for NASA missions. He has developed machine learning algorithms for the representation and processing of hyperspectral data based on statistical, geometrical and topological models. Dr. Parente's research also involves the study of physical models of light scattering in particulate media.
Furthermore, he has developed solutions for the integration of color and hyperspectral imaging and robotics to identify scientifically significant targets for rover- and orbiter-based reconnaissance. Dr. Parente has supported several scientific teams in NASA missions, including the Compact Reconnaissance Imaging Spectrometer for Mars (CRISM), the Moon Mineralogy Mapper (M3) and the Mars Science Laboratory ChemCam science teams. Dr. Parente is a principal investigator at the SETI Institute, Carl Sagan Center for the Search for Life in the Universe, and a member of the NASA Astrobiology Institute. Prof. Parente serves as an Associate Editor for the IEEE Geoscience and Remote Sensing Letters.
