Stratified Structure of Laplacian Eigenmaps Embedding

Abstract

We construct a locality-preserving weight matrix for the Laplacian eigenmaps algorithm used in dimension reduction. Our point cloud data is sampled from a low-dimensional stratified space embedded in a higher-dimensional space. Specifically, we use tools from local homology and from persistent homology for kernels and cokernels to infer a weight matrix that captures neighborhood relations among points in the same or different strata.

Introduction

Motivation. In machine learning and pattern recognition, one often searches for structure in data sampled from an intrinsically low-dimensional manifold embedded in a higher-dimensional space. We are motivated by the problem of dimension reduction, namely, computing a low-dimensional representation of a high-dimensional data set that preserves local structure to a certain extent. Spectral methods such as Laplacian eigenmaps are powerful tools for this problem [, ]. Spectral methods generally reveal low-dimensional structure from the eigenvectors of specially constructed weight matrices; see the survey [2]. In the case of Laplacian eigenmaps, the weight matrix captures proximity relations, so that nearby input patterns are mapped to nearby outputs. The method also has a natural connection to clustering [3].

We are interested in dimension reduction that preserves the stratified structure of point cloud data. Our data is sampled from a low-dimensional stratified space embedded in a higher-dimensional space. We construct a weight matrix for Laplacian eigenmaps that captures not only proximity information but also the stratified structure. In other words, the weight assigned to a pair of points reflects their closeness as well as the likelihood that they lie in the same stratum. With this new weight matrix, the Laplacian eigenmaps algorithm can potentially reveal the clustering of strata components of different dimensions.
Preliminaries

In this section, we introduce the background needed to understand our algorithm for constructing the weight matrix for Laplacian eigenmaps. We begin with a review of the Laplacian eigenmaps algorithm. We then give a brief introduction to persistent homology, including some algebra on local homology and on persistent homology for kernels and cokernels.

Laplacian eigenmaps

The Laplacian eigenmaps algorithm is a graph-based spectral method for dimension reduction []. Graph-based methods construct a sparse graph in which the nodes represent input patterns and the edges represent neighborhood relations []. One then constructs matrices whose spectral decompositions reveal the low-dimensional structure of the data set []. Other graph-based methods include Isomap [4] and maximum variance embedding [5]. Here we review the basic algorithmic steps of Laplacian eigenmaps and omit the justification; for details, see [3]. An example of the algorithm applied to the alpha complex of point cloud data sampled from a cross is shown in Appendix A.

Given $k$ points $\{x_1, x_2, \dots, x_k\}$ in $\mathbb{R}^n$, we construct a weighted graph $G$ with $k$ nodes as follows:

1. (Construct the graph) Put an edge between nodes $i$ and $j$ according to one of the following variations:
   (a) (parameter $\epsilon \in \mathbb{R}$) if $\|x_i - x_j\| < \epsilon$;
   (b) (parameter $l \in \mathbb{N}$) if node $i$ is among the $l$ nearest neighbors of $j$, or vice versa.
2. (Construct the weight matrix) Construct the weight matrix $W$. If nodes $i$ and $j$ are connected, there are two variations:
   (a) (parameter $t \in \mathbb{R}$) $W_{ij} = e^{-d(x_i, x_j)/t}$, where $d(x_i, x_j) = \|x_i - x_j\|$. Commonly $t$ is chosen to be the median of all pairwise distances.
   (b) $W_{ij} = 1$.
3. (Eigenmaps) Assume $G$ is connected (otherwise, apply this step to each connected component of $G$). Compute the eigenvalues and eigenvectors of the generalized eigenvector problem $Ly = \lambda D y$, where $D$ is the diagonal matrix with $D_{ii} = \sum_j W_{ji}$ and $L = D - W$.
4. (Embedding) Let $y_0, y_1, \dots, y_{k-1}$ be the eigenvectors sorted by increasing eigenvalue. The image of $x_i$ under the embedding into $\mathbb{R}^m$ is $(y_1(i), y_2(i), \dots, y_m(i))$; the eigenvector $y_0$, which is constant and has eigenvalue 0, is left out.

Note. In step 3 above, we can also use the normalized Laplacian $L = I - D^{-1/2} W D^{-1/2}$ and solve $Ly = \lambda y$. The embedding then comes from the eigenvectors with the smallest nonzero eigenvalues of $L$, which are the eigenvectors with the largest eigenvalues of the normalized weight matrix $D^{-1/2} W D^{-1/2}$.

Persistent homology background

In this section, we describe the sampled data, its representation by a simplicial complex, local homology, and persistent homology of kernels and cokernels. For a general introduction to persistent homology, see [8, 12].

Stratification and data. A stratification of a topological space $X$ is a filtration by closed subsets,

$$\emptyset = X_{-1} \subseteq X_0 \subseteq \dots \subseteq X_{m-1} \subseteq X_m = X,$$

where $X_i - X_{i-1}$ is the $i$-stratum, which is an $i$-manifold (or empty). Its components are defined as the dimension-$i$ pieces of $X$ [6]. The data we consider is a finite set of points $U$ in $\mathbb{R}^n$. We assume that $U$ is sampled, with noise, from a compact space $X \subseteq \mathbb{R}^n$. We construct a nested family of simplicial complexes from $U$ (Rips complexes, Čech complexes or witness complexes). For high-dimensional data, and throughout this work, we use witness complexes $W_\alpha$ indexed by a scale parameter $\alpha$ [7].

Local homology. Bendich et al. introduce a multi-scale computation of local homology for reconstructing a stratified space from a point sample [6]. We briefly describe how to compute the local homology of a point $z \in \mathbb{R}^n$; for technical details, see [6]. Let $d_z : \mathbb{R}^n \to \mathbb{R}$ be the distance function defined by $d_z(x) = \|x - z\|$. Let $B_r = d_z^{-1}[0, r]$ be its sub-level set and $\partial B_r$ the boundary of $B_r$. To compute the local homology of $z$, we first fix $r > 0$ and compute the persistent homology of the following two filtrations:

$$H(W_\alpha \cap B_r) \to \dots \to H(B_r),$$
$$H(W_\alpha \cap B_r, W_\alpha \cap \partial B_r) \to \dots \to H(B_r, \partial B_r).$$
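Since it is difficult to know a priori which value of $r$ is appropriate, we examine the multi-scale persistence behavior by varying $r$ across all radii and studying the corresponding vineyard [6]. To study the local homology of $z$, we focus on small values of $r$, which correspond to locally dominant features. Given a simplex $\sigma = [a_0, a_1, \dots, a_p] \in W_\alpha$, we say that $\sigma$ is inside the ball $B_r$ if some or all of its vertices are in $B_r$, and that $\sigma$ is outside $B_r$ if all of its vertices are outside $B_r$. A simplex $\sigma$ is considered to be on the boundary of $B_r$ if it has a coface that is in $B_r$.

Persistence kernels and cokernels. Consider two functions on topological spaces, $f : X \to \mathbb{R}$ and $g : Y \to \mathbb{R}$, where $Y \subseteq X$ and $g$ is the restriction of $f$ to $Y$. The corresponding sequences of sub-level sets give the following maps between homology groups:

$$
\begin{array}{ccccccc}
H(X_1) & \to & H(X_2) & \to & \cdots & \to & H(X_m) \\
\uparrow j_1 & & \uparrow j_2 & & & & \uparrow j_m \\
H(Y_1) & \to & H(Y_2) & \to & \cdots & \to & H(Y_m)
\end{array}
$$

We obtain the following sequences of kernels, images and cokernels, and compute the corresponding kernel, image and cokernel persistence:

$$\ker(g \to f): \ker j_1 \to \ker j_2 \to \dots \to \ker j_m,$$
$$\mathrm{im}(g \to f): \mathrm{im}\, j_1 \to \mathrm{im}\, j_2 \to \dots \to \mathrm{im}\, j_m,$$
$$\mathrm{cok}(g \to f): \mathrm{cok}\, j_1 \to \mathrm{cok}\, j_2 \to \dots \to \mathrm{cok}\, j_m.$$

Algorithm

We would like to construct a weight matrix, based on local homology information, that captures neighborhood relations among points within the same or different strata. We first consider the topological space $X \subseteq \mathbb{R}^n$. Two points $x_1, x_2 \in X$ have the same local structure at a fixed radius $r$ if the following maps induced by intersections are isomorphisms, that is, if these maps have zero kernel and cokernel:

$$H(X \cap B_r(x_1)) \leftarrow H(X \cap B_r(x_1) \cap B_r(x_2)) \to H(X \cap B_r(x_2)),$$
$$H(X \cap B_r(x_1), X \cap \partial B_r(x_1)) \to H(X \cap B_r(x_1) \cap B_r(x_2), \partial(X \cap B_r(x_1) \cap B_r(x_2))) \leftarrow H(X \cap B_r(x_2), X \cap \partial B_r(x_2)).$$

In the setting of a point cloud $U$ sampled from $X$, we consider two nearby points $z_1, z_2 \in U$. The points $z_1$ and $z_2$ have similar local structure if the maps induced by intersection have small kernel and cokernel persistence. We now describe this precisely. We define $B_r(z_i)$ as the $r$-ball around $z_i$. For a fixed radius $r$, we obtain the following nested sequences as we vary $\alpha$:

(a) $X^1 = W_\alpha \cap B_r(z_1)$, $X^2 = W_\alpha \cap B_r(z_2)$ and $Y = W_\alpha \cap B_r(z_1) \cap B_r(z_2)$.
(b) $Y^1 = (W_\alpha \cap B_r(z_1), W_\alpha \cap \partial B_r(z_1))$, $Y^2 = (W_\alpha \cap B_r(z_2), W_\alpha \cap \partial B_r(z_2))$ and $X = (W_\alpha \cap B_r(z_1) \cap B_r(z_2), \partial(W_\alpha \cap B_r(z_1) \cap B_r(z_2)))$.

Writing a subscript $s = 1, \dots, m$ for the $s$-th value of $\alpha$, we have the following relations for filtration (a):

$$
\begin{array}{ccccccc}
H(X^1_1) & \to & H(X^1_2) & \to & \cdots & \to & H(X^1_m) \\
\uparrow j_1 & & \uparrow j_2 & & & & \uparrow j_m \\
H(Y_1) & \to & H(Y_2) & \to & \cdots & \to & H(Y_m) \\
\downarrow i_1 & & \downarrow i_2 & & & & \downarrow i_m \\
H(X^2_1) & \to & H(X^2_2) & \to & \cdots & \to & H(X^2_m)
\end{array}
$$

We have the following relations for filtration (b):

$$
\begin{array}{ccccccc}
H(Y^1_1) & \to & H(Y^1_2) & \to & \cdots & \to & H(Y^1_m) \\
\downarrow k_1 & & \downarrow k_2 & & & & \downarrow k_m \\
H(X_1) & \to & H(X_2) & \to & \cdots & \to & H(X_m) \\
\uparrow l_1 & & \uparrow l_2 & & & & \uparrow l_m \\
H(Y^2_1) & \to & H(Y^2_2) & \to & \cdots & \to & H(Y^2_m)
\end{array}
$$

We compute the persistence of the following sequences of kernels and cokernels:

$$\ker j_1 \to \ker j_2 \to \dots \to \ker j_m, \qquad \mathrm{cok}\, j_1 \to \mathrm{cok}\, j_2 \to \dots \to \mathrm{cok}\, j_m,$$
$$\ker i_1 \to \ker i_2 \to \dots \to \ker i_m, \qquad \mathrm{cok}\, i_1 \to \mathrm{cok}\, i_2 \to \dots \to \mathrm{cok}\, i_m,$$
$$\ker k_1 \to \ker k_2 \to \dots \to \ker k_m, \qquad \mathrm{cok}\, k_1 \to \mathrm{cok}\, k_2 \to \dots \to \mathrm{cok}\, k_m,$$
$$\ker l_1 \to \ker l_2 \to \dots \to \ker l_m, \qquad \mathrm{cok}\, l_1 \to \mathrm{cok}\, l_2 \to \dots \to \mathrm{cok}\, l_m.$$

We find the largest persistence $p$ (or the average persistence) over the above 8 sequences and define the weight between $z_1$ and $z_2$ as $e^{-p/t}$. Notice that we do not compute the vineyard here by varying $r$ in $B_r$; instead, we choose $r$ proportional to $\|z_1 - z_2\|$.

Implementation (on-going)

The current implementation of local homology is based on the Java version of JPlex [1]. The local parametrization based on local cohomology is shown in Appendix B [13] (for illustration purposes; local cohomology is the vector-space dual of local homology, and details are omitted here). We will also implement it using the C++ persistence computation library by Dmitriy Morozov.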
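As a rough illustration of how the weight $e^{-p/t}$ could feed into the Laplacian eigenmaps steps reviewed earlier, here is a minimal Python sketch using NumPy only. It is not the JPlex-based implementation described above, and it does not compute any persistent homology: the matrix `P` of kernel/cokernel persistence values is assumed to be given, and the toy values in it are invented for illustration. The names `persistence_weights`, `laplacian_eigenmaps` and `P` are hypothetical.

```python
import numpy as np


def persistence_weights(P, t):
    """Turn a symmetric matrix of persistence values p into weights e^{-p/t}.

    Assumption: P[i, j] holds the largest (or average) kernel/cokernel
    persistence for the pair (z_i, z_j), and np.inf marks pairs that are
    not connected by an edge.
    """
    P = np.asarray(P, dtype=float)
    W = np.exp(-P / t)          # exp(-inf) == 0, so missing edges get weight 0
    np.fill_diagonal(W, 0.0)    # no self-loops
    return W


def laplacian_eigenmaps(W, m):
    """Embed the nodes of a connected weighted graph into R^m.

    Solves L y = lambda D y with L = D - W by passing to the symmetric
    matrix D^{-1/2} L D^{-1/2}, then skips the constant eigenvector y_0.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=0)                           # D_ii = sum_j W_ji
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.diag(d) - W
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    eigvals, V = np.linalg.eigh(L_sym)          # eigenvalues in ascending order
    Y = d_inv_sqrt[:, None] * V                 # generalized eigenvectors y = D^{-1/2} v
    return eigvals[1:m + 1], Y[:, 1:m + 1]


# Toy input (invented values): points 0,1 and points 2,3 have similar local
# structure (small p), while the pair (1, 2) differs strongly (large p).
inf = np.inf
P = np.array([
    [0.0, 0.1, inf, inf],
    [0.1, 0.0, 3.0, inf],
    [inf, 3.0, 0.0, 0.1],
    [inf, inf, 0.1, 0.0],
])
W = persistence_weights(P, t=1.0)
lam, Y = laplacian_eigenmaps(W, m=1)
print(lam, Y[:, 0])
```

In this toy input, the one-dimensional embedding gives points 0 and 1 coordinates of one sign and points 2 and 3 coordinates of the opposite sign, which is the kind of separation between pieces with different local structure that the weight matrix is meant to encourage. Symmetrizing with $D^{-1/2}$ is one convenient way to solve the generalized problem with a standard symmetric eigensolver; other solvers would serve equally well.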
Acknowledgment

This is joint work among Bei Wang, Sayan Mukherjee, Paul Bendich, John Harer, Dmitriy Morozov and Herbert Edelsbrunner. The local cohomology parametrization is joint work among Bei Wang, Mikael Vejdemo-Johansson and Sayan Mukherjee.

References

[1] Plex: Persistent Homology Computations. comptop.stanford.edu/programs/jplex/.
[2] L. K. SAUL, K. Q. WEINBERGER, J. H. HAM, F. SHA AND D. D. LEE. Spectral methods for dimensionality reduction. In O. Chapelle, B. Schoelkopf and A. Zien (eds.), Semisupervised Learning. MIT Press, Cambridge, MA, 2006.
[3] M. BELKIN AND P. NIYOGI. Laplacian eigenmaps and spectral techniques for embedding and clustering. NIPS.
[4] J. B. TENENBAUM, V. DE SILVA AND J. C. LANGFORD. A global geometric framework for nonlinear dimensionality reduction. Science 290 (2000), 2319-2323.
[5] K. Q. WEINBERGER AND L. K. SAUL. Unsupervised learning of image manifolds by semidefinite programming. Int. J. Comput. Vision 70 (2006), 77-90.

[6] P. BENDICH, D. COHEN-STEINER, H. EDELSBRUNNER, J. HARER AND D. MOROZOV. Inferring local homology from sampled stratified spaces. Proc. 48th Ann. Sympos. Found. Comput. Sci. (2007), 536-546.
[7] V. DE SILVA AND G. CARLSSON. Topological estimation using witness complexes. Symposium on Point-Based Graphics (2004), ETH, Zürich, Switzerland, June 2004.
[8] H. EDELSBRUNNER, D. LETSCHER AND A. ZOMORODIAN. Topological persistence and simplification. Discrete Comput. Geom. 28 (2002), 511-533.
[9] J. R. MUNKRES. Elements of Algebraic Topology. Addison-Wesley, Redwood City, California, 1984.
[10] D. COHEN-STEINER, H. EDELSBRUNNER AND J. HARER. Stability of persistence diagrams. Discrete Comput. Geom. 37 (2007), 103-120.
[11] D. COHEN-STEINER, H. EDELSBRUNNER AND J. HARER. Extending persistence using Poincaré and Lefschetz duality. Found. Comput. Math., to appear.
[12] H. EDELSBRUNNER AND J. HARER. Persistent homology - a survey. Manuscript, Dept. Comput. Sci., Duke Univ., Durham, North Carolina, 2007.
[13] V. DE SILVA AND M. VEJDEMO-JOHANSSON. Persistence cohomology and circular coordinates. SoCG (2009), to appear.

Appendix A

Examples of the Laplacian eigenmaps algorithm applied to an alpha complex are shown in Figure 1 and Figure 2.

Figure 1: Top: alpha complex of point cloud data sampled without noise. Middle: alpha complex colored by connected component. Bottom: corresponding Laplacian eigenmaps embedding, with color corresponding to each component in the alpha complex.

Figure 2: Top: alpha complex of point cloud data sampled with noise. Middle: alpha complex colored by connected component. Bottom: corresponding Laplacian eigenmaps embedding, with color corresponding to each component in the alpha complex.

Appendix B

An example of the local cohomology parametrization is shown in Figure 3.

Figure 3: Three local cohomology classes at the crossing point.