A Geometric Perspective on Data Analysis

Size: px
Start display at page:

Download "A Geometric Perspective on Data Analysis"

Transcription

1 A Geometric Perspective on Data Analysis Partha Niyogi The University of Chicago Collaborators: M. Belkin, X. He, A. Jansen, H. Narayanan, V. Sindhwani, S. Smale, S. Weinberger A Geometric Perspectiveon Data Analysis p.1

2 Manifold Learning Learning when data M R N Clustering: M {1,...,k} connected components, min cut Classification: M { 1, +1} P on M { 1, +1} Dimensionality Reduction: f : M R n n << N M unknown: what can you learn about M from data? e.g. dimensionality, connected components holes, handles, homology curvature, geodesics A Geometric Perspectiveon Data Analysis p.2

3 An Acoustic Example u(t) l s(t) A Geometric Perspectiveon Data Analysis p.3

4 An Acoustic Example u(t) l s(t) One Dimensional Air Flow (i) V x = A P ρc 2 t (ii) P x = ρ A V t V (x, t) = volume velocity P(x, t) = pressure A Geometric Perspectiveon Data Analysis p.3

5 Solutions beta u(t) = P n=1 α n sin(nω 0 t) l beta beta s(t) = P n=1 β n sin(nω 0 t) l 2 A Geometric Perspectiveon Data Analysis p.4

6 Acoustic Phonetics A 1 A 2 l 1 Vocal Tract modeled as a sequence of tubes. (e.g. Stevens, 1998) l 2 Jansen and Niyogi (2005,2006) A Geometric Perspectiveon Data Analysis p.5

7 Pattern Recognition P on X Y X = R N ;Y = {0, 1}, R (x i,y i ) labeled examples find f : X Y Ill Posed A Geometric Perspectiveon Data Analysis p.6

8 Simplicity A Geometric Perspectiveon Data Analysis p.7

9 Simplicity A Geometric Perspectiveon Data Analysis p.7

10 Regularization Principle 1 f = arg min f H K n n (y i f(x i )) 2 + γ f 2 K i=1 Splines Ridge Regression SVM K : X X R is a p.d. kernel e.g. e x y 2 σ 2, (1 + x y) d, etc. H K is a corresponding RKHS e.g., certain Sobolev spaces, polynomial families, etc. A Geometric Perspectiveon Data Analysis p.8

11 Simplicity is Relative A Geometric Perspectiveon Data Analysis p.9

12 Simplicity is Relative A Geometric Perspectiveon Data Analysis p.9

13 Intuitions supp P X has manifold structure geodesic distance v/s ambient distance geometric structure of data should be incorporated f versus f M A Geometric Perspectiveon Data Analysis p.10

14 Manifold Regularization 1 min f H K n n (y i f(x i )) 2 + γ A f 2 K + γ I f 2 I i=1 f 2 I = Laplacian Iterated Laplacian Heat kernel Differential Operator R gradm f, grad M f = R f M f R f M i f e Mt R f(df) Representer Theorem: f = P n i=1 α ik(x, x i ) + R M α(y)k(x, y) Belkin, Niyogi, Sindhwani (2004) A Geometric Perspectiveon Data Analysis p.11

15 Approximating f 2 I M is unknown but x 1...x M M f 2 I = M M f, M f i j W ij (f(x i ) f(x j )) 2 A Geometric Perspectiveon Data Analysis p.12

16 Manifolds and Graphs M G = (V,E) e ij E if x i x j < ɛ W ij = e x i x j 2 t M L = D W gradf, gradf i,j W ij(f(x i ) f(x j )) 2 f( f) f T Lf A Geometric Perspectiveon Data Analysis p.13

17 Manifold Regularization 1 n n i=1 V (f(x i ),y i ) + γ A f 2 K + γ I i j W ij (f(x i ) f(x j )) 2 Representer Theorem: f opt = n+m i=1 α ik(x,x i ) V (f(x), y) = (f(x) y) 2 : Least squares V (f(x), y) = (1 yf(x)) + : Hinge loss (Support Vector Machines) A Geometric Perspectiveon Data Analysis p.14

18 Ambient and Intrinsic Regularization SVM Laplacian SVM Laplacian SVM γ A = γ I = γ A = γ I = γ A = γ I = 1 A Geometric Perspectiveon Data Analysis p.15

19 Experimental Results: USPS Error Rates RLS vs LapRLS RLS LapRLS Error Rates SVM vs LapSVM SVM LapSVM Error Rates TSVM vs LapSVM TSVM LapSVM Classification Problems Classification Problems Classification Problems 15 Out of Sample Extension 15 Out of Sample Extension Std Deviation of Error Rates 15 LapRLS (Test) 10 5 LapSVM (Test) 10 5 SVM (o), TSVM (x) Std Dev LapRLS (Unlabeled) LapSVM (Unlabeled) LapSVM Std Dev A Geometric Perspectiveon Data Analysis p.16

20 Experimental Results: Isolet Error Rate (unlabeled set) Error Rates (test set) RLS vs LapRLS RLS LapRLS Labeled Speaker # RLS vs LapRLS RLS LapRLS Labeled Speaker # Error Rates (unlabeled set) Error Rates (test set) SVM vs TSVM vs LapSVM SVM TSVM LapSVM Labeled Speaker # SVM vs TSVM vs LapSVM SVM TSVM LapSVM Labeled Speaker # A Geometric Perspectiveon Data Analysis p.17

21 Experimental comparisons Dataset g50c Coil20 Uspst mac-win WebKB WebKB WebKB Algorithm (link) (page) (page+link) SVM (full labels) RLS (full labels) SVM (l labels) RLS (l labels) Graph-Reg TSVM Graph-density TSVM LDS LapSVM LapRLS LapSVM joint LapRLS joint A Geometric Perspectiveon Data Analysis p.18

22 Graph and Manifold Laplacian Fix f : X R. Fix x M (L n f) = j (f(x) f(x j))e x x j 2 4tn Put t n = n k 2 α, where α > 0 with prob. 1, lim n (4πt n ) k+1 2 n (L n f) x = M f x Belkin (2003), Belkin and Niyogi (2004,2005) also Lafon (2004), Coifman et al A Geometric Perspectiveon Data Analysis p.19

23 Learning Homology x 1,...,x n M R N Can you learn qualitative features of M? Can you tell a torus from a sphere? Can you tell how many connected components? Can you tell the dimension of M? (e.g. Carlsson, Zamorodian, Edelsbrunner, de Silva et al) A Geometric Perspectiveon Data Analysis p.20

24 Homology x 1,...,x n M R N U = n i=1b ɛ (x i ) If ɛ well chosen, then U deformation retracts to M. Homology of U is constructed using the nerve of U and agrees with the homology of M. A Geometric Perspectiveon Data Analysis p.21

25 Theorem M R d with cond. no. τ x = {x 1,...,x n } uniformly sampled i.i.d. 0 < ɛ < τ 2 β = Let U = x x B ɛ (x) vol(m) (sin 1 (ɛ/2τ)) k vol(b ɛ/8 ) n > β(log(β) + log( 1 δ )) with prob. > 1 δ, homology of U equals the homology of M (Niyogi, Smale, Weinberger, 2004) A Geometric Perspectiveon Data Analysis p.22

26 Spectral Clustering Isoperimetric inequalities. Cheeger constant. M δm 1 M1 M 1 e 1 e 1 = λ 1 e 1 cut to cluster h = inf vol n 1 (δm 1 ) min (vol n (M 1 ), vol n (M M 1 )) h λ1 2 [Cheeger] A Geometric Perspectiveon Data Analysis p.23

27 Clustering Digits A Geometric Perspectiveon Data Analysis p.24

28 Clustering Speech A Geometric Perspectiveon Data Analysis p.25

29 Volume Computation Volume of a convex body is deterministically hard to compute. (Báráany and Füredi, 1987). Polynomial by randomized algorithm. (Dyer, Kannan, Frieze, 1991). Current best O (d 4 ) by Lovász and Vempala. A Geometric Perspectiveon Data Analysis p.26

30 Volume Computation Volume of a convex body is deterministically hard to compute. (Báráany and Füredi, 1987). Polynomial by randomized algorithm. (Dyer, Kannan, Frieze, 1991). Current best O (d 4 ) by Lovász and Vempala. Surface volume at least as hard as volume. Grötschel, Lovász and Schrijver (1987) list as open problem. Dyer, Gritzmann and Hufnagel (1998) compute surface volume in randomized polynomial time. Complexity seems high. A Geometric Perspectiveon Data Analysis p.26

31 Surface Volume δμ Μ A Geometric Perspectiveon Data Analysis p.27

32 Surface Volume δμ Μ A Geometric Perspectiveon Data Analysis p.27

33 Surface Volume δμ Μ Heat Equation u = u t with u(x, 0) = f(x) Solution u(x, t) = f(x) K t (x, y) r π Z F t (M) = t R d \M u(x, t)dx A Geometric Perspectiveon Data Analysis p.27

34 Theorems [Theorem 1] lim t 0 F t (M) = M [Theorem 2] Let M be a convex body in R d such that (i) B r M B R (ii) M has condition 1/τ Then, it is possible to find the surface area of M in time O ( d 4 with error of ɛ with prob. > 3/4. ɛ 2 + d3.5 R 3 r 2 τɛ 3 ), Belkin, Narayanan, Niyogi (2005) A Geometric Perspectiveon Data Analysis p.28

35 Important Issues How to handle noise theoretically and practically? How to choose the graph neighborhood correctly? How often do manifolds arise in natural data? What is the right metric on these manifolds? What are other ways in which one might utilize the geometry of natural distributions? Identify real problems where this approach can make a difference. Complexity estimates and provably correct algorithms rather than heuristics. A Geometric Perspectiveon Data Analysis p.29

A Geometric Perspective on Machine Learning

A Geometric Perspective on Machine Learning A Geometric Perspective on Machine Learning Partha Niyogi The University of Chicago Collaborators: M. Belkin, V. Sindhwani, X. He, S. Smale, S. Weinberger A Geometric Perspectiveon Machine Learning p.1

More information

A Geometric Perspective on Machine Learning

A Geometric Perspective on Machine Learning A Geometric Perspective on Machine Learning Partha Niyogi The University of Chicago Thanks: M. Belkin, A. Caponnetto, X. He, I. Matveeva, H. Narayanan, V. Sindhwani, S. Smale, S. Weinberger A Geometric

More information

Approximating the surface volume of convex bodies

Approximating the surface volume of convex bodies Approximating the surface volume of convex bodies Hariharan Narayanan Advisor: Partha Niyogi Department of Computer Science, University of Chicago Approximating the surface volume of convex bodies p. Surface

More information

A Taxonomy of Semi-Supervised Learning Algorithms

A Taxonomy of Semi-Supervised Learning Algorithms A Taxonomy of Semi-Supervised Learning Algorithms Olivier Chapelle Max Planck Institute for Biological Cybernetics December 2005 Outline 1 Introduction 2 Generative models 3 Low density separation 4 Graph

More information

Data fusion and multi-cue data matching using diffusion maps

Data fusion and multi-cue data matching using diffusion maps Data fusion and multi-cue data matching using diffusion maps Stéphane Lafon Collaborators: Raphy Coifman, Andreas Glaser, Yosi Keller, Steven Zucker (Yale University) Part of this work was supported by

More information

Large Scale Manifold Transduction

Large Scale Manifold Transduction Large Scale Manifold Transduction Michael Karlen, Jason Weston, Ayse Erkan & Ronan Collobert NEC Labs America, Princeton, USA Ećole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland New York University,

More information

MODEL SELECTION AND REGULARIZATION PARAMETER CHOICE

MODEL SELECTION AND REGULARIZATION PARAMETER CHOICE MODEL SELECTION AND REGULARIZATION PARAMETER CHOICE REGULARIZATION METHODS FOR HIGH DIMENSIONAL LEARNING Francesca Odone and Lorenzo Rosasco odone@disi.unige.it - lrosasco@mit.edu June 6, 2011 ABOUT THIS

More information

Large-Scale Face Manifold Learning

Large-Scale Face Manifold Learning Large-Scale Face Manifold Learning Sanjiv Kumar Google Research New York, NY * Joint work with A. Talwalkar, H. Rowley and M. Mohri 1 Face Manifold Learning 50 x 50 pixel faces R 2500 50 x 50 pixel random

More information

MODEL SELECTION AND REGULARIZATION PARAMETER CHOICE

MODEL SELECTION AND REGULARIZATION PARAMETER CHOICE MODEL SELECTION AND REGULARIZATION PARAMETER CHOICE REGULARIZATION METHODS FOR HIGH DIMENSIONAL LEARNING Francesca Odone and Lorenzo Rosasco odone@disi.unige.it - lrosasco@mit.edu June 3, 2013 ABOUT THIS

More information

A Geometric Analysis of Subspace Clustering with Outliers

A Geometric Analysis of Subspace Clustering with Outliers A Geometric Analysis of Subspace Clustering with Outliers Mahdi Soltanolkotabi and Emmanuel Candés Stanford University Fundamental Tool in Data Mining : PCA Fundamental Tool in Data Mining : PCA Subspace

More information

Thorsten Joachims Then: Universität Dortmund, Germany Now: Cornell University, USA

Thorsten Joachims Then: Universität Dortmund, Germany Now: Cornell University, USA Retrospective ICML99 Transductive Inference for Text Classification using Support Vector Machines Thorsten Joachims Then: Universität Dortmund, Germany Now: Cornell University, USA Outline The paper in

More information

Large-Scale Semi-Supervised Learning

Large-Scale Semi-Supervised Learning Large-Scale Semi-Supervised Learning Jason WESTON a a NEC LABS America, Inc., 4 Independence Way, Princeton, NJ, USA 08540. Abstract. Labeling data is expensive, whilst unlabeled data is often abundant

More information

Stable and Multiscale Topological Signatures

Stable and Multiscale Topological Signatures Stable and Multiscale Topological Signatures Mathieu Carrière, Steve Oudot, Maks Ovsjanikov Inria Saclay Geometrica April 21, 2015 1 / 31 Shape = point cloud in R d (d = 3) 2 / 31 Signature = mathematical

More information

Semi-supervised Learning with Density Based Distances

Semi-supervised Learning with Density Based Distances Semi-supervised Learning with Density Based Distances Avleen S. Bijral Toyota Technological Institute Chicago, IL 6637 Nathan Ratliff Google Inc. Pittsburg, PA 526 Nathan Srebro Toyota Technological Institute

More information

Jointly Learning Data-Dependent Label and Locality-Preserving Projections

Jointly Learning Data-Dependent Label and Locality-Preserving Projections Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence Jointly Learning Data-Dependent Label and Locality-Preserving Projections Chang Wang IBM T. J. Watson Research

More information

Divide and Conquer Kernel Ridge Regression

Divide and Conquer Kernel Ridge Regression Divide and Conquer Kernel Ridge Regression Yuchen Zhang John Duchi Martin Wainwright University of California, Berkeley COLT 2013 Yuchen Zhang (UC Berkeley) Divide and Conquer KRR COLT 2013 1 / 15 Problem

More information

Persistence stability for geometric complexes

Persistence stability for geometric complexes Lake Tahoe - December, 8 2012 NIPS workshop on Algebraic Topology and Machine Learning Persistence stability for geometric complexes Frédéric Chazal Joint work with Vin de Silva and Steve Oudot (+ on-going

More information

Clustering algorithms and introduction to persistent homology

Clustering algorithms and introduction to persistent homology Foundations of Geometric Methods in Data Analysis 2017-18 Clustering algorithms and introduction to persistent homology Frédéric Chazal INRIA Saclay - Ile-de-France frederic.chazal@inria.fr Introduction

More information

Deep Learning via Semi-Supervised Embedding. Jason Weston, Frederic Ratle and Ronan Collobert Presented by: Janani Kalyanam

Deep Learning via Semi-Supervised Embedding. Jason Weston, Frederic Ratle and Ronan Collobert Presented by: Janani Kalyanam Deep Learning via Semi-Supervised Embedding Jason Weston, Frederic Ratle and Ronan Collobert Presented by: Janani Kalyanam Review Deep Learning Extract low-level features first. Extract more complicated

More information

Efficient Iterative Semi-supervised Classification on Manifold

Efficient Iterative Semi-supervised Classification on Manifold . Efficient Iterative Semi-supervised Classification on Manifold... M. Farajtabar, H. R. Rabiee, A. Shaban, A. Soltani-Farani Sharif University of Technology, Tehran, Iran. Presented by Pooria Joulani

More information

Semi-supervised Methods

Semi-supervised Methods Semi-supervised Methods Irwin King Joint work with Zenglin Xu Department of Computer Science & Engineering The Chinese University of Hong Kong Academia Sinica 2009 Irwin King (CUHK) Semi-supervised Methods

More information

Shape analysis through geometric distributions

Shape analysis through geometric distributions Shape analysis through geometric distributions Nicolas Charon (CIS, Johns Hopkins University) joint work with B. Charlier (Université de Montpellier), I. Kaltenmark (CMLA), A. Trouvé, Hsi-Wei Hsieh (JHU)...

More information

Interpolation by Spline Functions

Interpolation by Spline Functions Interpolation by Spline Functions Com S 477/577 Sep 0 007 High-degree polynomials tend to have large oscillations which are not the characteristics of the original data. To yield smooth interpolating curves

More information

Iterative Non-linear Dimensionality Reduction by Manifold Sculpting

Iterative Non-linear Dimensionality Reduction by Manifold Sculpting Iterative Non-linear Dimensionality Reduction by Manifold Sculpting Mike Gashler, Dan Ventura, and Tony Martinez Brigham Young University Provo, UT 84604 Abstract Many algorithms have been recently developed

More information

A Course in Machine Learning

A Course in Machine Learning A Course in Machine Learning Hal Daumé III 13 UNSUPERVISED LEARNING If you have access to labeled training data, you know what to do. This is the supervised setting, in which you have a teacher telling

More information

Convex hulls of spheres and convex hulls of convex polytopes lying on parallel hyperplanes

Convex hulls of spheres and convex hulls of convex polytopes lying on parallel hyperplanes Convex hulls of spheres and convex hulls of convex polytopes lying on parallel hyperplanes Menelaos I. Karavelas joint work with Eleni Tzanaki University of Crete & FO.R.T.H. OrbiCG/ Workshop on Computational

More information

21 Analysis of Benchmarks

21 Analysis of Benchmarks 21 Analysis of Benchmarks In order to assess strengths and weaknesses of different semi-supervised learning (SSL) algorithms, we invited the chapter authors to apply their algorithms to eight benchmark

More information

Graph Transduction via Alternating Minimization

Graph Transduction via Alternating Minimization Jun Wang Department of Electrical Engineering, Columbia University Tony Jebara Department of Computer Science, Columbia University Shih-Fu Chang Department of Electrical Engineering, Columbia University

More information

Data Analysis, Persistent homology and Computational Morse-Novikov theory

Data Analysis, Persistent homology and Computational Morse-Novikov theory Data Analysis, Persistent homology and Computational Morse-Novikov theory Dan Burghelea Department of mathematics Ohio State University, Columbuus, OH Bowling Green, November 2013 AMN theory (computational

More information

Isograph: Neighbourhood Graph Construction Based On Geodesic Distance For Semi-Supervised Learning

Isograph: Neighbourhood Graph Construction Based On Geodesic Distance For Semi-Supervised Learning Isograph: Neighbourhood Graph Construction Based On Geodesic Distance For Semi-Supervised Learning Marjan Ghazvininejad, Mostafa Mahdieh, Hamid R. Rabiee, Parisa hanipour Roshan and Mohammad Hossein Rohban

More information

One-class Problems and Outlier Detection. 陶卿 中国科学院自动化研究所

One-class Problems and Outlier Detection. 陶卿 中国科学院自动化研究所 One-class Problems and Outlier Detection 陶卿 Qing.tao@mail.ia.ac.cn 中国科学院自动化研究所 Application-driven Various kinds of detection problems: unexpected conditions in engineering; abnormalities in medical data,

More information

The K-modes and Laplacian K-modes algorithms for clustering

The K-modes and Laplacian K-modes algorithms for clustering The K-modes and Laplacian K-modes algorithms for clustering Miguel Á. Carreira-Perpiñán Electrical Engineering and Computer Science University of California, Merced http://faculty.ucmerced.edu/mcarreira-perpinan

More information

Diffusion Maps and Topological Data Analysis

Diffusion Maps and Topological Data Analysis Diffusion Maps and Topological Data Analysis Melissa R. McGuirl McGuirl (Brown University) Diffusion Maps and Topological Data Analysis 1 / 19 Introduction OVERVIEW Topological Data Analysis The use of

More information

10-701/15-781, Fall 2006, Final

10-701/15-781, Fall 2006, Final -7/-78, Fall 6, Final Dec, :pm-8:pm There are 9 questions in this exam ( pages including this cover sheet). If you need more room to work out your answer to a question, use the back of the page and clearly

More information

Ranking on Graph Data

Ranking on Graph Data Shivani Agarwal SHIVANI@CSAIL.MIT.EDU Computer Science and AI Laboratory, Massachusetts Institute of Technology, Cambridge, MA 0139, USA Abstract In ranking, one is given examples of order relationships

More information

Western TDA Learning Seminar. June 7, 2018

Western TDA Learning Seminar. June 7, 2018 The The Western TDA Learning Seminar Department of Mathematics Western University June 7, 2018 The The is an important tool used in TDA for data visualization. Input point cloud; filter function; covering

More information

arxiv: v2 [cs.cg] 8 Feb 2010 Abstract

arxiv: v2 [cs.cg] 8 Feb 2010 Abstract Topological De-Noising: Strengthening the Topological Signal Jennifer Kloke and Gunnar Carlsson arxiv:0910.5947v2 [cs.cg] 8 Feb 2010 Abstract February 8, 2010 Topological methods, including persistent

More information

Local Limit Theorem in negative curvature. François Ledrappier. IM-URFJ, 26th May, 2014

Local Limit Theorem in negative curvature. François Ledrappier. IM-URFJ, 26th May, 2014 Local Limit Theorem in negative curvature François Ledrappier University of Notre Dame/ Université Paris 6 Joint work with Seonhee Lim, Seoul Nat. Univ. IM-URFJ, 26th May, 2014 1 (M, g) is a closed Riemannian

More information

A Multicover Nerve for Geometric Inference. Don Sheehy INRIA Saclay, France

A Multicover Nerve for Geometric Inference. Don Sheehy INRIA Saclay, France A Multicover Nerve for Geometric Inference Don Sheehy INRIA Saclay, France Computational geometers use topology to certify geometric constructions. Computational geometers use topology to certify geometric

More information

An introduction to Topological Data Analysis through persistent homology: Intro and geometric inference

An introduction to Topological Data Analysis through persistent homology: Intro and geometric inference Sophia-Antipolis, January 2016 Winter School An introduction to Topological Data Analysis through persistent homology: Intro and geometric inference Frédéric Chazal INRIA Saclay - Ile-de-France frederic.chazal@inria.fr

More information

Computing the Betti Numbers of Arrangements. Saugata Basu School of Mathematics & College of Computing Georgia Institute of Technology.

Computing the Betti Numbers of Arrangements. Saugata Basu School of Mathematics & College of Computing Georgia Institute of Technology. 1 Computing the Betti Numbers of Arrangements Saugata Basu School of Mathematics & College of Computing Georgia Institute of Technology. 2 Arrangements in Computational Geometry An arrangement in R k is

More information

Compactness Theorems for Saddle Surfaces in Metric Spaces of Bounded Curvature. Dimitrios E. Kalikakis

Compactness Theorems for Saddle Surfaces in Metric Spaces of Bounded Curvature. Dimitrios E. Kalikakis BULLETIN OF THE GREEK MATHEMATICAL SOCIETY Volume 51, 2005 (45 52) Compactness Theorems for Saddle Surfaces in Metric Spaces of Bounded Curvature Dimitrios E. Kalikakis Abstract The notion of a non-regular

More information

Lecture 19 Subgradient Methods. November 5, 2008

Lecture 19 Subgradient Methods. November 5, 2008 Subgradient Methods November 5, 2008 Outline Lecture 19 Subgradients and Level Sets Subgradient Method Convergence and Convergence Rate Convex Optimization 1 Subgradients and Level Sets A vector s is a

More information

Optimal transport and redistricting: numerical experiments and a few questions. Nestor Guillen University of Massachusetts

Optimal transport and redistricting: numerical experiments and a few questions. Nestor Guillen University of Massachusetts Optimal transport and redistricting: numerical experiments and a few questions Nestor Guillen University of Massachusetts From Rebecca Solnit s Hope in the dark (apropos of nothing in particular) To hope

More information

Tripod Configurations

Tripod Configurations Tripod Configurations Eric Chen, Nick Lourie, Nakul Luthra Summer@ICERM 2013 August 8, 2013 Eric Chen, Nick Lourie, Nakul Luthra (S@I) Tripod Configurations August 8, 2013 1 / 33 Overview 1 Introduction

More information

Robust Kernel Methods in Clustering and Dimensionality Reduction Problems

Robust Kernel Methods in Clustering and Dimensionality Reduction Problems Robust Kernel Methods in Clustering and Dimensionality Reduction Problems Jian Guo, Debadyuti Roy, Jing Wang University of Michigan, Department of Statistics Introduction In this report we propose robust

More information

Aarti Singh. Machine Learning / Slides Courtesy: Eric Xing, M. Hein & U.V. Luxburg

Aarti Singh. Machine Learning / Slides Courtesy: Eric Xing, M. Hein & U.V. Luxburg Spectral Clustering Aarti Singh Machine Learning 10-701/15-781 Apr 7, 2010 Slides Courtesy: Eric Xing, M. Hein & U.V. Luxburg 1 Data Clustering Graph Clustering Goal: Given data points X1,, Xn and similarities

More information

Towards Multi-scale Heat Kernel Signatures for Point Cloud Models of Engineering Artifacts

Towards Multi-scale Heat Kernel Signatures for Point Cloud Models of Engineering Artifacts Towards Multi-scale Heat Kernel Signatures for Point Cloud Models of Engineering Artifacts Reed M. Williams and Horea T. Ilieş Department of Mechanical Engineering University of Connecticut Storrs, CT

More information

Graph based machine learning with applications to media analytics

Graph based machine learning with applications to media analytics Graph based machine learning with applications to media analytics Lei Ding, PhD 9-1-2011 with collaborators at Outline Graph based machine learning Basic structures Algorithms Examples Applications in

More information

Topology and the Analysis of High-Dimensional Data

Topology and the Analysis of High-Dimensional Data Topology and the Analysis of High-Dimensional Data Workshop on Algorithms for Modern Massive Data Sets June 23, 2006 Stanford Gunnar Carlsson Department of Mathematics Stanford University Stanford, California

More information

Bagging for One-Class Learning

Bagging for One-Class Learning Bagging for One-Class Learning David Kamm December 13, 2008 1 Introduction Consider the following outlier detection problem: suppose you are given an unlabeled data set and make the assumptions that one

More information

A Dendrogram. Bioinformatics (Lec 17)

A Dendrogram. Bioinformatics (Lec 17) A Dendrogram 3/15/05 1 Hierarchical Clustering [Johnson, SC, 1967] Given n points in R d, compute the distance between every pair of points While (not done) Pick closest pair of points s i and s j and

More information

No more questions will be added

No more questions will be added CSC 2545, Spring 2017 Kernel Methods and Support Vector Machines Assignment 2 Due at the start of class, at 2:10pm, Thurs March 23. No late assignments will be accepted. The material you hand in should

More information

Minicourse II Symplectic Monte Carlo Methods for Random Equilateral Polygons

Minicourse II Symplectic Monte Carlo Methods for Random Equilateral Polygons Minicourse II Symplectic Monte Carlo Methods for Random Equilateral Polygons Jason Cantarella and Clayton Shonkwiler University of Georgia Georgia Topology Conference July 9, 2013 Basic Definitions Symplectic

More information

Graph Transduction via Alternating Minimization

Graph Transduction via Alternating Minimization Jun Wang Department of Electrical Engineering, Columbia University Tony Jebara Department of Computer Science, Columbia University Shih-Fu Chang Department of Electrical Engineering, Columbia University

More information

Learning with minimal supervision

Learning with minimal supervision Learning with minimal supervision Sanjoy Dasgupta University of California, San Diego Learning with minimal supervision There are many sources of almost unlimited unlabeled data: Images from the web Speech

More information

Stratified Structure of Laplacian Eigenmaps Embedding

Stratified Structure of Laplacian Eigenmaps Embedding Stratified Structure of Laplacian Eigenmaps Embedding Abstract We construct a locality preserving weight matrix for Laplacian eigenmaps algorithm used in dimension reduction. Our point cloud data is sampled

More information

Estimating Geometry and Topology from Voronoi Diagrams

Estimating Geometry and Topology from Voronoi Diagrams Estimating Geometry and Topology from Voronoi Diagrams Tamal K. Dey The Ohio State University Tamal Dey OSU Chicago 07 p.1/44 Voronoi diagrams Tamal Dey OSU Chicago 07 p.2/44 Voronoi diagrams Tamal Dey

More information

PS Geometric Modeling Homework Assignment Sheet I (Due 20-Oct-2017)

PS Geometric Modeling Homework Assignment Sheet I (Due 20-Oct-2017) Homework Assignment Sheet I (Due 20-Oct-2017) Assignment 1 Let n N and A be a finite set of cardinality n = A. By definition, a permutation of A is a bijective function from A to A. Prove that there exist

More information

Linear and Non-linear Dimentionality Reduction Applied to Gene Expression Data of Cancer Tissue Samples

Linear and Non-linear Dimentionality Reduction Applied to Gene Expression Data of Cancer Tissue Samples Linear and Non-linear Dimentionality Reduction Applied to Gene Expression Data of Cancer Tissue Samples Franck Olivier Ndjakou Njeunje Applied Mathematics, Statistics, and Scientific Computation University

More information

Computational Statistics and Mathematics for Cyber Security

Computational Statistics and Mathematics for Cyber Security and Mathematics for Cyber Security David J. Marchette Sept, 0 Acknowledgment: This work funded in part by the NSWC In-House Laboratory Independent Research (ILIR) program. NSWCDD-PN--00 Topics NSWCDD-PN--00

More information

The Pre-Image Problem in Kernel Methods

The Pre-Image Problem in Kernel Methods The Pre-Image Problem in Kernel Methods James Kwok Ivor Tsang Department of Computer Science Hong Kong University of Science and Technology Hong Kong The Pre-Image Problem in Kernel Methods ICML-2003 1

More information

On the Topology of Finite Metric Spaces

On the Topology of Finite Metric Spaces On the Topology of Finite Metric Spaces Meeting in Honor of Tom Goodwillie Dubrovnik Gunnar Carlsson, Stanford University June 27, 2014 Data has shape The shape matters Shape of Data Regression Shape of

More information

SOLVING PARTIAL DIFFERENTIAL EQUATIONS ON POINT CLOUDS

SOLVING PARTIAL DIFFERENTIAL EQUATIONS ON POINT CLOUDS SOLVING PARTIAL DIFFERENTIAL EQUATIONS ON POINT CLOUDS JIAN LIANG AND HONGKAI ZHAO Abstract. In this paper we present a general framework for solving partial differential equations on manifolds represented

More information

Alternating Minimization. Jun Wang, Tony Jebara, and Shih-fu Chang

Alternating Minimization. Jun Wang, Tony Jebara, and Shih-fu Chang Graph Transduction via Alternating Minimization Jun Wang, Tony Jebara, and Shih-fu Chang 1 Outline of the presentation Brief introduction and related work Problems with Graph Labeling Imbalanced labels

More information

Basis Functions. Volker Tresp Summer 2017

Basis Functions. Volker Tresp Summer 2017 Basis Functions Volker Tresp Summer 2017 1 Nonlinear Mappings and Nonlinear Classifiers Regression: Linearity is often a good assumption when many inputs influence the output Some natural laws are (approximately)

More information

On Manifold Regularization

On Manifold Regularization On Manifold Regularization Mikhail Belkin, artha Niyogi, Vikas Sindhwani misha,niyogi,vikass cs.uchicago.edu Department of Computer Science University of Chicago Abstract We propose a family of learning

More information

CPSC 340: Machine Learning and Data Mining. Kernel Trick Fall 2017

CPSC 340: Machine Learning and Data Mining. Kernel Trick Fall 2017 CPSC 340: Machine Learning and Data Mining Kernel Trick Fall 2017 Admin Assignment 3: Due Friday. Midterm: Can view your exam during instructor office hours or after class this week. Digression: the other

More information

Topological Classification of Data Sets without an Explicit Metric

Topological Classification of Data Sets without an Explicit Metric Topological Classification of Data Sets without an Explicit Metric Tim Harrington, Andrew Tausz and Guillaume Troianowski December 10, 2008 A contemporary problem in data analysis is understanding the

More information

Image Similarities for Learning Video Manifolds. Selen Atasoy MICCAI 2011 Tutorial

Image Similarities for Learning Video Manifolds. Selen Atasoy MICCAI 2011 Tutorial Image Similarities for Learning Video Manifolds Selen Atasoy MICCAI 2011 Tutorial Image Spaces Image Manifolds Tenenbaum2000 Roweis2000 Tenenbaum2000 [Tenenbaum2000: J. B. Tenenbaum, V. Silva, J. C. Langford:

More information

Bias Selection Using Task-Targeted Random Subspaces for Robust Application of Graph-Based Semi-Supervised Learning

Bias Selection Using Task-Targeted Random Subspaces for Robust Application of Graph-Based Semi-Supervised Learning Bias Selection Using Task-Targeted Random Subspaces for Robust Application of Graph-Based Semi-Supervised Learning Christopher T. Symons, Ranga Raju Vatsavai Computational Sciences and Engineering Oak

More information

Feature Transformation by Feedforward Neural Network for Semisupervised Classification

Feature Transformation by Feedforward Neural Network for Semisupervised Classification Feature Transformation by Feedforward Neural Network for Semisupervised Classification Gouchol Pok Division of Computer and Information Technology Education, Pai Chai University, Daejeon, Republic of Korea.

More information

Geometric Data. Goal: describe the structure of the geometry underlying the data, for interpretation or summary

Geometric Data. Goal: describe the structure of the geometry underlying the data, for interpretation or summary Geometric Data - ma recherche s inscrit dans le contexte de l analyse exploratoire des donnees, dont l obj Input: point cloud equipped with a metric or (dis-)similarity measure data point image/patch,

More information

Research in Computational Differential Geomet

Research in Computational Differential Geomet Research in Computational Differential Geometry November 5, 2014 Approximations Often we have a series of approximations which we think are getting close to looking like some shape. Approximations Often

More information

Dimension Reduction of Image Manifolds

Dimension Reduction of Image Manifolds Dimension Reduction of Image Manifolds Arian Maleki Department of Electrical Engineering Stanford University Stanford, CA, 9435, USA E-mail: arianm@stanford.edu I. INTRODUCTION Dimension reduction of datasets

More information

Week 5. Convex Optimization

Week 5. Convex Optimization Week 5. Convex Optimization Lecturer: Prof. Santosh Vempala Scribe: Xin Wang, Zihao Li Feb. 9 and, 206 Week 5. Convex Optimization. The convex optimization formulation A general optimization problem is

More information

arxiv: v1 [cs.cv] 12 Apr 2017

arxiv: v1 [cs.cv] 12 Apr 2017 Provable Self-Representation Based Outlier Detection in a Union of Subspaces Chong You, Daniel P. Robinson, René Vidal Johns Hopkins University, Baltimore, MD, 21218, USA arxiv:1704.03925v1 [cs.cv] 12

More information

Minimum norm points on polytope boundaries

Minimum norm points on polytope boundaries Minimum norm points on polytope boundaries David Bremner UNB December 20, 2013 Joint work with Yan Cui, Zhan Gao David Bremner (UNB) Minimum norm facet December 20, 2013 1 / 27 Outline 1 Notation 2 Maximum

More information

Image Smoothing and Segmentation by Graph Regularization

Image Smoothing and Segmentation by Graph Regularization Image Smoothing and Segmentation by Graph Regularization Sébastien Bougleux 1 and Abderrahim Elmoataz 1 GREYC CNRS UMR 6072, Université de Caen Basse-Normandie ENSICAEN 6 BD du Maréchal Juin, 14050 Caen

More information

Learning a Manifold as an Atlas Supplementary Material

Learning a Manifold as an Atlas Supplementary Material Learning a Manifold as an Atlas Supplementary Material Nikolaos Pitelis Chris Russell School of EECS, Queen Mary, University of London [nikolaos.pitelis,chrisr,lourdes]@eecs.qmul.ac.uk Lourdes Agapito

More information

Globally and Locally Consistent Unsupervised Projection

Globally and Locally Consistent Unsupervised Projection Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence Globally and Locally Consistent Unsupervised Projection Hua Wang, Feiping Nie, Heng Huang Department of Electrical Engineering

More information

Integer Programming Theory

Integer Programming Theory Integer Programming Theory Laura Galli October 24, 2016 In the following we assume all functions are linear, hence we often drop the term linear. In discrete optimization, we seek to find a solution x

More information

Face Recognition Based on LDA and Improved Pairwise-Constrained Multiple Metric Learning Method

Face Recognition Based on LDA and Improved Pairwise-Constrained Multiple Metric Learning Method Journal of Information Hiding and Multimedia Signal Processing c 2016 ISSN 2073-4212 Ubiquitous International Volume 7, Number 5, September 2016 Face Recognition ased on LDA and Improved Pairwise-Constrained

More information

Inference Driven Metric Learning (IDML) for Graph Construction

Inference Driven Metric Learning (IDML) for Graph Construction University of Pennsylvania ScholarlyCommons Technical Reports (CIS) Department of Computer & Information Science 1-1-2010 Inference Driven Metric Learning (IDML) for Graph Construction Paramveer S. Dhillon

More information

Lecture notes for Topology MMA100

Lecture notes for Topology MMA100 Lecture notes for Topology MMA100 J A S, S-11 1 Simplicial Complexes 1.1 Affine independence A collection of points v 0, v 1,..., v n in some Euclidean space R N are affinely independent if the (affine

More information

x 1 SpaceNet: Multivariate brain decoding and segmentation Elvis DOHMATOB (Joint work with: M. EICKENBERG, B. THIRION, & G. VAROQUAUX) 75 x y=20

x 1 SpaceNet: Multivariate brain decoding and segmentation Elvis DOHMATOB (Joint work with: M. EICKENBERG, B. THIRION, & G. VAROQUAUX) 75 x y=20 SpaceNet: Multivariate brain decoding and segmentation Elvis DOHMATOB (Joint work with: M. EICKENBERG, B. THIRION, & G. VAROQUAUX) L R 75 x 2 38 0 y=20-38 -75 x 1 1 Introducing the model 2 1 Brain decoding

More information

Scattered Data Problems on (Sub)Manifolds

Scattered Data Problems on (Sub)Manifolds Scattered Data Problems on (Sub)Manifolds Lars-B. Maier Technische Universität Darmstadt 04.03.2016 Lars-B. Maier (Darmstadt) Scattered Data Problems on (Sub)Manifolds 04.03.2016 1 / 60 Sparse Scattered

More information

Analysis of high dimensional data via Topology. Louis Xiang. Oak Ridge National Laboratory. Oak Ridge, Tennessee

Analysis of high dimensional data via Topology. Louis Xiang. Oak Ridge National Laboratory. Oak Ridge, Tennessee Analysis of high dimensional data via Topology Louis Xiang Oak Ridge National Laboratory Oak Ridge, Tennessee Contents Abstract iii 1 Overview 1 2 Data Set 1 3 Simplicial Complex 5 4 Computation of homology

More information

Kernel Methods in Machine Learning

Kernel Methods in Machine Learning Outline Department of Computer Science and Engineering Hong Kong University of Science and Technology Hong Kong Joint work with Ivor Tsang, Pakming Cheung, Andras Kocsor, Jacek Zurada, Kimo Lai November

More information

Numerische Mathematik

Numerische Mathematik Numer. Math. (1997) 77: 423 451 Numerische Mathematik c Springer-Verlag 1997 Electronic Edition Minimal surfaces: a geometric three dimensional segmentation approach Vicent Caselles 1, Ron Kimmel 2, Guillermo

More information

Exact discrete Morse functions on surfaces. To the memory of Professor Mircea-Eugen Craioveanu ( )

Exact discrete Morse functions on surfaces. To the memory of Professor Mircea-Eugen Craioveanu ( ) Stud. Univ. Babeş-Bolyai Math. 58(2013), No. 4, 469 476 Exact discrete Morse functions on surfaces Vasile Revnic To the memory of Professor Mircea-Eugen Craioveanu (1942-2012) Abstract. In this paper,

More information

CS839: Probabilistic Graphical Models. Lecture 10: Learning with Partially Observed Data. Theo Rekatsinas

CS839: Probabilistic Graphical Models. Lecture 10: Learning with Partially Observed Data. Theo Rekatsinas CS839: Probabilistic Graphical Models Lecture 10: Learning with Partially Observed Data Theo Rekatsinas 1 Partially Observed GMs Speech recognition 2 Partially Observed GMs Evolution 3 Partially Observed

More information

Mesh segmentation. Florent Lafarge Inria Sophia Antipolis - Mediterranee

Mesh segmentation. Florent Lafarge Inria Sophia Antipolis - Mediterranee Mesh segmentation Florent Lafarge Inria Sophia Antipolis - Mediterranee Outline What is mesh segmentation? M = {V,E,F} is a mesh S is either V, E or F (usually F) A Segmentation is a set of sub-meshes

More information

LEARNING ON STATISTICAL MANIFOLDS FOR CLUSTERING AND VISUALIZATION

LEARNING ON STATISTICAL MANIFOLDS FOR CLUSTERING AND VISUALIZATION LEARNING ON STATISTICAL MANIFOLDS FOR CLUSTERING AND VISUALIZATION Kevin M. Carter 1, Raviv Raich, and Alfred O. Hero III 1 1 Department of EECS, University of Michigan, Ann Arbor, MI 4819 School of EECS,

More information

Combine the PA Algorithm with a Proximal Classifier

Combine the PA Algorithm with a Proximal Classifier Combine the Passive and Aggressive Algorithm with a Proximal Classifier Yuh-Jye Lee Joint work with Y.-C. Tseng Dept. of Computer Science & Information Engineering TaiwanTech. Dept. of Statistics@NCKU

More information

Projective Shape Analysis in Face Recognition from Digital Camera ImagesApril 19 th, / 32

Projective Shape Analysis in Face Recognition from Digital Camera ImagesApril 19 th, / 32 3D Projective Shape Analysis in Face Recognition from Digital Camera Images Seunghee Choi Florida State University, Department of Statistics April 19 th, 2018 Projective Shape Analysis in Face Recognition

More information

STATS306B STATS306B. Clustering. Jonathan Taylor Department of Statistics Stanford University. June 3, 2010

STATS306B STATS306B. Clustering. Jonathan Taylor Department of Statistics Stanford University. June 3, 2010 STATS306B Jonathan Taylor Department of Statistics Stanford University June 3, 2010 Spring 2010 Outline K-means, K-medoids, EM algorithm choosing number of clusters: Gap test hierarchical clustering spectral

More information

On the Local Behavior of Spaces of Natural Images

On the Local Behavior of Spaces of Natural Images On the Local Behavior of Spaces of Natural Images Gunnar Carlsson Tigran Ishkhanov Vin de Silva Afra Zomorodian Abstract In this study we concentrate on qualitative topological analysis of the local behavior

More information

Diffusion Map Kernel Analysis for Target Classification

Diffusion Map Kernel Analysis for Target Classification Diffusion Map Analysis for Target Classification Jason C. Isaacs, Member, IEEE Abstract Given a high dimensional dataset, one would like to be able to represent this data using fewer parameters while preserving

More information

An efficient face recognition algorithm based on multi-kernel regularization learning

An efficient face recognition algorithm based on multi-kernel regularization learning Acta Technica 61, No. 4A/2016, 75 84 c 2017 Institute of Thermomechanics CAS, v.v.i. An efficient face recognition algorithm based on multi-kernel regularization learning Bi Rongrong 1 Abstract. A novel

More information