Efficient Algorithm for Distance Metric Learning


Efficient Algorithm for Distance Metric Learning

Yipei Wang
Language Technologies Institute, Carnegie Mellon University
Pittsburgh, PA
yipeiw@cmu.edu

Abstract

Distance metric learning provides an approach to transferring knowledge from sparse labeled data to unlabeled data. The learned metric measures the semantic similarity among instances better than a default metric does. The main idea of these algorithms is to build an objective function from the equivalence and in-equivalence constraints and to pose metric learning as an optimization problem. In this paper, we propose to unify different metric learning algorithms in a semidefinite programming (SDP) framework. Classical semidefinite programming algorithms are extremely expensive on larger problems, so we also discuss efficient algorithms for large-scale metric learning. We investigate a recently proposed algorithm derived from the Frank-Wolfe algorithm and propose novel acceleration strategies based on the special structure of the problem. We compare the different algorithms on three UCI datasets in a clustering task.

1 Introduction

A proper distance metric has a crucial effect on the performance of distance-based supervised and unsupervised learning. For instance, the performance of the K-means clustering algorithm, KNN classifiers, and SVM classifiers is critically influenced by a good metric. Metric learning provides approaches to transfer knowledge learned from sparse labeled data to unlabeled data. Recently this problem has been actively studied [1][2][3][4][5][16], and these methods have been applied to many real-world problems such as image retrieval [7], face verification [6], and bioinformatics [8].

Previous works often use different formulations and provide specific optimization techniques to solve them. For example, Xing [1] poses the problem as a convex optimization problem and designs an iterative gradient descent algorithm to solve it. In [2], the idea of a margin is incorporated into the cost function, and the cost is minimized with an alternating projection algorithm. In [4], the Mahalanobis distance metric is learned by directly maximizing a stochastic variant of the leave-one-out KNN score on the training set; the objective is not convex, and gradient search is used to find a maximum.

Though the distance metric can be a general function, the prevalent form of the distance function is $d_A(x, y) = \sqrt{(x - y)^T A (x - y)}$ with $A \succeq 0$. This corresponds to a linear transformation of the input space, and nonlinear transformations can be implemented by using kernels. Inspired by this form of the distance function, can we derive a unified semidefinite programming framework? In the following, we discuss the reformulation of two previous algorithms into standard semidefinite programming (SDP) form, and we implement the reformulations with the open-source SDP solver SeDuMi [9].
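To make the linear-transformation view concrete, the following minimal sketch (using NumPy; not part of the original paper) computes the Mahalanobis distance $d_A(x, y)$ and checks that, for $A = L^T L$, it equals the Euclidean distance between the transformed points $Lx$ and $Ly$.

import numpy as np

def mahalanobis(x, y, A):
    """Distance under a symmetric PSD matrix A: sqrt((x-y)^T A (x-y))."""
    d = x - y
    return np.sqrt(d @ A @ d)

rng = np.random.default_rng(0)
L = rng.normal(size=(3, 3))          # any linear map
A = L.T @ L                          # guarantees A is positive semidefinite
x, y = rng.normal(size=3), rng.normal(size=3)

# The two views agree up to numerical precision.
assert np.isclose(mahalanobis(x, y, A), np.linalg.norm(L @ x - L @ y))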

Another problem is the efficiency of the algorithms on large-scale problems, with either high feature dimension or large data size. Recent works have studied multiple techniques for large-scale semidefinite programming [11]. The work mainly falls into two research directions. One direction is to develop first-order methods designed for solving generic optimization problems []; they provide approximation approaches that reduce the per-iteration cost. The second direction is to design algorithms that exploit the special structure of the problem [11][14]. These algorithms include the Frank-Wolfe algorithm, block coordinate descent methods, cutting-plane methods, etc. Recently, the development of subsampling techniques has also led to efficient algorithms [15] for large-scale problems. Here we follow a recently proposed approach [5] that combines the two main directions: the problem is first reformulated as an eigenvalue optimization problem, and an efficient algorithm is designed by combining a smoothing technique with the Frank-Wolfe algorithm. By investigating the sparse structure of the problem, we further propose several acceleration strategies to improve efficiency.

2 Related Work

We mainly focus on two metric learning methods in the rest of the paper. Both learn the metric through convex optimization, and one of them (LMNN) achieves state-of-the-art performance on multiple datasets.

2.1 Review of the method by Xing

Problem formulation. In Xing's method, we are given a set of equivalence constraints

$S$: $(x_i, x_j) \in S$ if $x_i$ and $x_j$ are similar.

The criterion for the desired metric is that pairs of points in $S$ should have small distance. This is cast into the convex optimization problem

\min_{A} \sum_{(x_i,x_j)\in S} \|x_i - x_j\|_A^2 \quad \text{s.t.} \quad \sum_{(x_i,x_j)\in D} \|x_i - x_j\|_A \ge 1, \; A \succeq 0.   (1)

It uses the in-equivalence constraints as the side condition. Here, $D$ can be a set of pairs of points known to be dissimilar if such information is explicitly available; otherwise, we simply take all pairs not in $S$. Without this condition, the problem is solved trivially by $A = 0$, which is not useful. As mentioned in the paper, the constraint is not formulated as $\sum_{(x_i,x_j)\in D} \|x_i - x_j\|_A^2 \ge 1$ because that would always result in $A$ being rank 1.

Optimization. We derive the optimization steps for a computational cost analysis; the details are given in the appendix.

1. Newton method for diagonal $A$. Xing proved that the original optimization problem is equivalent to minimizing the function

g(A) = \sum_{(x_i,x_j)\in S} \|x_i - x_j\|_A^2 - c \log\Big(\sum_{(x_i,x_j)\in D} \|x_i - x_j\|_A\Big).

For diagonal $A$ the computational cost of a Newton step is around $O(n)$. But when $A$ is full rank there are $n^2$ parameters, and inverting the Hessian requires $O(n^6)$ time, which is too expensive to be acceptable.
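As an illustration of the diagonal case, the following hedged sketch (NumPy; the function and variable names are ours, not Xing's released implementation) evaluates $g$ for a diagonal metric $A = \mathrm{diag}(a)$, using index-pair lists S and D over a data matrix X.

import numpy as np

def g_diag(a, X, S, D, c=1.0):
    """g(a) = sum_S ||x_i - x_j||_a^2 - c * log(sum_D ||x_i - x_j||_a) for A = diag(a)."""
    diffs_S = np.array([X[i] - X[j] for i, j in S])
    diffs_D = np.array([X[i] - X[j] for i, j in D])
    sim_term = np.sum((diffs_S ** 2) @ a)            # sum of squared distances over S
    dis_term = np.sum(np.sqrt((diffs_D ** 2) @ a))   # sum of distances over D
    return sim_term - c * np.log(dis_term)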

2. Projected gradient search. For a full-rank matrix, Xing proposed an iterative projected gradient search to solve the optimization problem efficiently. The problem is posed in the equivalent form

\max_A \; g(A) = \sum_{(x_i,x_j)\in D} \|x_i - x_j\|_A \quad \text{s.t.} \quad f(A) = \sum_{(x_i,x_j)\in S} \|x_i - x_j\|_A^2 \le 1, \; A \succeq 0.   (2)

The algorithm takes a gradient step on $g(A)$ and then repeatedly projects onto the sets $C_1 = \{A : \sum_{(x_i,x_j)\in S} \|x_i - x_j\|_A^2 \le 1\}$ and $C_2 = \{A : A \succeq 0\}$, as shown below:

Iterate
    Iterate
        A := P_{C_1}(A)
        A := P_{C_2}(A)
    until A converges
    A := A + t (\nabla_A g(A))_{\perp \nabla_A f(A)}
until convergence

The projection onto the set $C_1$ can be solved analytically:

A := A - \frac{\langle X_S, A\rangle - 1}{\|X_S\|_F^2} X_S.

The projection onto the set $C_2$ is completed through an eigendecomposition of the matrix:

A = U^T \Sigma U, \quad \Sigma_+ = \max(0, \Sigma), \quad A := U^T \Sigma_+ U.

2.2 Review of the LMNN method

This work aims to learn a Mahalanobis matrix for kNN classification. Compared to Xing's method, we are given more information (the class label of each point in the training set) than just the equivalence constraints. Here, we use $y_{ij} \in \{0, 1\}$ to indicate whether or not the class labels $y_i$ and $y_j$ match, and $\eta_{ij} \in \{0, 1\}$ to indicate whether input $x_j$ is a k-nearest neighbor of input $x_i$. The criterion is that the k-nearest neighbors should always belong to the same class while examples from different classes are separated by a large margin. The cost function is

\mathrm{cost}(A) = \sum_{i,j} \eta_{ij} \|x_i - x_j\|_A^2 + c \sum_{i,j,l} \eta_{ij} (1 - y_{il}) \max\big(1 + \|x_i - x_j\|_A^2 - \|x_i - x_l\|_A^2,\; 0\big).

The cost function is minimized by solving

\min_A \; \sum_{i,j} \eta_{ij} \|x_i - x_j\|_A^2 + c \sum_{i,j,l} \eta_{ij} (1 - y_{il})\, \epsilon_{ijl} \quad \text{s.t.} \quad \|x_i - x_l\|_A^2 - \|x_i - x_j\|_A^2 \ge 1 - \epsilon_{ijl}, \;\; \epsilon_{ijl} \ge 0, \;\; A \succeq 0.   (3)

Most of the slack variables $\epsilon_{ijl}$ never attain positive values. The method is based on a combination of sub-gradient descent in the matrices $L$ and $M$ (with $M = L^T L$), the latter mainly to verify that the global minimum has been reached. The alternating projection algorithm is proven to converge [17].
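To make the LMNN objective concrete, here is a hedged sketch (NumPy; names such as targets and lmnn_cost are ours, not from the LMNN implementation) that evaluates the cost above for a given PSD matrix A. targets lists the pairs (i, j) with eta_ij = 1, labels holds the class labels, and c is the trade-off constant.

import numpy as np

def lmnn_cost(A, X, labels, targets, c=1.0):
    def d2(i, j):                          # squared Mahalanobis distance under A
        diff = X[i] - X[j]
        return diff @ A @ diff

    pull = sum(d2(i, j) for i, j in targets)          # first term: pull target neighbors close
    push = 0.0
    for i, j in targets:                              # pairs with eta_ij = 1
        for l in range(len(X)):
            if labels[l] != labels[i]:                # (1 - y_il) = 1 only for differently labeled points
                push += max(1.0 + d2(i, j) - d2(i, l), 0.0)   # hinge on the margin violation
    return pull + c * push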

3 The Unified Semidefinite Programming Framework

We now discuss how to transform the two metric learning algorithms into standard semidefinite programming problems.

3.1 Xing's method

We use the notation $X_S = \sum_{(x_i,x_j)\in S} (x_i - x_j)(x_i - x_j)^T$ and $X_\alpha = (x_i - x_j)(x_i - x_j)^T$ for the $\alpha$-th pair $(x_i, x_j) \in D$, so that $\sum_{(x_i,x_j)\in S} \|x_i - x_j\|_A^2$ can be written as $\langle X_S, A\rangle$. We modify the original problem as below so that it is easy to reformulate as an SDP:

\max_A \; \min_{(x_i,x_j)\in D} \|x_i - x_j\|_A^2 \quad \text{s.t.} \quad \sum_{(x_i,x_j)\in S} \|x_i - x_j\|_A^2 \le 1, \; A \succeq 0.   (4)

The problem can be rewritten as

\max_{A,\, t} \; t \quad \text{s.t.} \quad \langle X_\alpha, A\rangle \ge t, \;\; \alpha = 1, \dots, |D|, \quad \langle X_S, A\rangle \le 1, \quad A \succeq 0.   (5)

The problem can then be transformed into the standard form

\max_X \; \langle C, X\rangle \quad \text{s.t.} \quad \langle MD_\alpha, X\rangle = 0, \;\; \alpha = 1, \dots, |D|, \quad \langle MS, X\rangle = 1, \quad X \succeq 0,   (6)

where

X = \mathrm{blockdiag}(A, \; t, \; d), \qquad C = \mathrm{blockdiag}(0_{n\times n}, \; 1, \; 0_{(|D|+1)\times(|D|+1)}).

Here $d$ is a $(|D|+1)\times(|D|+1)$ diagonal matrix whose diagonal elements are the slack variables that transform the inequalities into equalities, and

MD_\alpha = \mathrm{blockdiag}(X_\alpha, \; -1, \; -E_\alpha), \qquad MS = \mathrm{blockdiag}(X_S, \; 0, \; E_{|D|+1}),

where $E_\alpha \in \mathbb{R}^{(|D|+1)\times(|D|+1)}$ has $(E_\alpha)_{\alpha\alpha} = 1$ and all other elements equal to 0.

3.2 LMNN method

We use the notation $C_0 = \sum_{i,j} \eta_{ij}(x_i - x_j)(x_i - x_j)^T$. The problem definition in Section 2.2 can be transformed into the SDP form

\min_X \; \langle C, X\rangle \quad \text{s.t.} \quad \langle B_{ijl}, X\rangle \ge 1 \;\; \text{for all } (i,j,l), \quad X \succeq 0,   (7)

where

X = \mathrm{blockdiag}(A, \; \mathrm{diag}(\epsilon)), \qquad C = \mathrm{blockdiag}(C_0, \; c\, Y_N).

Here $\epsilon$ collects all the slack variables $\epsilon_{ijl}$ and $Y_N = \mathrm{diag}([\eta_{ij}(1 - y_{il}), \dots])$; the index order must be consistent with the $(i,j,l)$ order in $\epsilon$. Furthermore,

B_{ijl} = \mathrm{blockdiag}\big((x_i - x_l)(x_i - x_l)^T - (x_i - x_j)(x_i - x_j)^T, \; E_{ijl}\big),

where $E_{ijl}$ is a diagonal matrix whose diagonal element corresponding to $(i,j,l)$ equals 1 and all others equal 0.
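For reference, problem (5) can be handed to a generic conic solver directly. The following hedged sketch uses CVXPY (a different front end than the SeDuMi solver used in the paper; the function name and arguments are ours) with a precomputed matrix X_S and a list Xd of the per-pair matrices X_alpha.

import cvxpy as cp

def learn_metric_sdp(X_S, Xd):
    """Solve max t s.t. <X_alpha, A> >= t for all alpha, <X_S, A> <= 1, A PSD."""
    d = X_S.shape[0]
    A = cp.Variable((d, d), PSD=True)
    t = cp.Variable()
    constraints = [cp.trace(X_S @ A) <= 1]
    constraints += [cp.trace(Xa @ A) >= t for Xa in Xd]
    cp.Problem(cp.Maximize(t), constraints).solve()
    return A.value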

The problem can easily be transformed into standard form by adding slack variables to the inequality constraints.

4 Large-Scale Problems

4.1 Algorithms for large-scale problems and the DML-eig method

Most semidefinite programming solvers are based on interior point methods, and computing the Hessian becomes very hard on larger problems. A series of algorithms have been proposed to address this issue. A lot of effort has focused on exploiting structural properties of the problem, and the proper algorithm depends on the type of problem. A more general approach is first-order methods, which seek to significantly reduce the per-iteration complexity of optimization algorithms rather than the total computational cost. Another recent trend is to use subsampling to reduce the computational cost of each iteration.

Recently, Ying [5] proposed a method that arises from a structure-based method, the Frank-Wolfe algorithm. They modify the algorithm with a smoothing technique, which allows gradient search on the smoothed function instead of a subgradient method on the initial problem. Here we briefly review their method. Ying proved the following theorem: assume that $X_S$ is invertible and, for any $\tau \in D$, let $\widetilde{X}_\tau = X_S^{-1/2} X_\tau X_S^{-1/2}$. Then problem (4) is equivalent to

\max_{S \in P} \; \min_{u \in \triangle} \; \sum_{\tau \in D} u_\tau \langle \widetilde{X}_\tau, S\rangle,

where $\triangle = \{u \in \mathbb{R}^{|D|} : u_\tau \ge 0, \sum_{\tau\in D} u_\tau = 1\}$ and $P = \{M \in \mathbb{S}^d_+ : \mathrm{Tr}(M) = 1\}$.

Ying further proposes an efficient algorithm for DML-eig, a new first-order method that combines the Frank-Wolfe algorithm with a smoothing technique. Let

f_\mu(S) = \min_{u \in \triangle} \Big\{ \sum_{\tau\in D} u_\tau \langle \widetilde{X}_\tau, S\rangle + \mu \sum_{\tau\in D} u_\tau \log u_\tau \Big\},

where $\mu > 0$ is the smoothing parameter.

Algorithm: Approximate Frank-Wolfe algorithm for DML-eig
Parameters: smoothing parameter $\mu > 0$, tolerance value tol, step sizes $\alpha_t \in (0, 1)$, $t \in \mathbb{N}$
Initialization: set $S^\mu_1 \in \mathbb{S}^d_+$ with $\mathrm{Tr}(S^\mu_1) = 1$
for t = 1, 2, 3, ...
    $Z^\mu_t = \arg\max\{ f_\mu(S^\mu_t) + \langle Z, \nabla f_\mu(S^\mu_t)\rangle : Z \in \mathbb{S}^d_+, \mathrm{Tr}(Z) = 1\}$, that is, $Z^\mu_t = v v^T$ with $v$ the maximal eigenvector of $\nabla f_\mu(S^\mu_t)$
    $S^\mu_{t+1} = (1 - \alpha_t) S^\mu_t + \alpha_t Z^\mu_t$
    if $|f_\mu(S^\mu_{t+1}) - f_\mu(S^\mu_t)| < \mathrm{tol}$ then break

The step sizes need to satisfy $\sum_{t\in\mathbb{N}} \alpha_t = \infty$ and $\lim_{t\to\infty} \alpha_t = 0$.
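A hedged implementation sketch of this loop is given below (NumPy; our own variable names). It uses the closed form of the entropy-smoothed minimum, which equals $-\mu \log \sum_\tau \exp(-\langle \widetilde{X}_\tau, S\rangle/\mu)$ up to an additive constant, and whose gradient is the soft-min weighted sum of the $\widetilde{X}_\tau$; details such as the normalization constant may differ slightly from the paper's exact smoothed objective.

import numpy as np

def dml_eig_fw(Xt, mu=1e-2, tol=1e-5, max_iter=500):
    """Approximate Frank-Wolfe loop: Xt is the list of transformed matrices X_tau."""
    d = Xt[0].shape[0]
    S = np.eye(d) / d                                  # feasible start: PSD with trace 1
    f_prev = -np.inf
    for t in range(max_iter):
        vals = np.array([np.sum(X * S) for X in Xt])   # <X_tau, S> for every tau
        shifted = np.exp(-(vals - vals.min()) / mu)    # numerically stabilized soft-min
        w = shifted / shifted.sum()                    # weights u_tau
        f = vals.min() - mu * np.log(np.mean(shifted)) # smoothed objective value
        grad = sum(wi * X for wi, X in zip(w, Xt))     # gradient of the smoothed objective
        _, eigvec = np.linalg.eigh(grad)
        v = eigvec[:, -1]                              # leading eigenvector
        Z = np.outer(v, v)                             # rank-1 Frank-Wolfe vertex
        alpha = 2.0 / (t + 2)                          # one common step-size choice
        S = (1 - alpha) * S + alpha * Z
        if abs(f - f_prev) < tol:
            break
        f_prev = f
    return S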

4.2 Acceleration strategies

We can observe that the dense part of the matrices involved is the Gram matrix of the samples, so the complexity mainly depends on the feature dimension. The DML-eig algorithm has reduced the computational cost to $O(d^2)$ per iteration, because it computes only the leading eigenvector instead of a full decomposition of the matrix. The constraints all come from the in-equivalence constraints, and the computational cost is also proportional to the number of in-equivalence constraints. However, only a few of them should be active under our formulation. Therefore, we can use the Euclidean distance to prefilter out less informative in-equivalence constraints before applying the optimization algorithms, so that the optimization is accelerated.

Another idea for accelerating the DML-eig algorithm is to use a better initialization with low computational cost. Relevant Component Analysis (RCA) [16] is a metric learning method that only considers the equivalence constraints and has low computational cost, so we explored using the result of RCA as the initialization.

5 Experiments

5.1 Datasets and evaluation criteria

We experiment with three UCI datasets: iris, wine, and protein. The number of classes and the feature dimension of each dataset are: iris: 3 classes, d = 4; wine: 3 classes, d = 12; protein: 6 classes, d = 20. We follow the criterion used by Xing to evaluate the quality of the learned metrics in a clustering application (we use K-means with the learned metric here). Let $c_i$ be the true cluster label and $\hat{c}_i$ be the label assigned by the automatic clustering algorithm. Then

\mathrm{Accuracy} = \frac{\sum_{i>j} \mathbf{1}\{\mathbf{1}\{c_i = c_j\} = \mathbf{1}\{\hat{c}_i = \hat{c}_j\}\}}{0.5\, m(m-1)},

where $\mathbf{1}\{\cdot\}$ is the indicator function. All the experiment code has been released.
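The following small sketch (NumPy; not from the released experiment code) computes this pairwise accuracy: the fraction of point pairs on which the true labels and the predicted cluster labels agree about being in the same cluster or not.

import numpy as np

def pair_accuracy(c, c_hat):
    c, c_hat = np.asarray(c), np.asarray(c_hat)
    same_true = c[:, None] == c[None, :]
    same_pred = c_hat[:, None] == c_hat[None, :]
    agree = (same_true == same_pred)
    iu = np.triu_indices(len(c), k=1)      # each pair i > j counted once
    return agree[iu].mean()                # denominator equals 0.5 * m * (m - 1)

# A perfect clustering up to a relabeling of the clusters scores 1.0.
print(pair_accuracy([0, 0, 1, 1, 2], [2, 2, 0, 0, 1]))   # -> 1.0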

5.2 Comparison of different optimization techniques

Here we compare the performance of the different optimization algorithms for learning the metric. Xing's method uses his released implementation [10]. We implemented the SDP formulation using the open-source SeDuMi solver [9], and we implemented the DML-eig algorithm in MATLAB. The baseline uses the Euclidean distance. From the results, we can see that both SDP and DML-eig achieve better performance than Xing's optimization algorithms.

Figure 1: Accuracy of the different optimization techniques (baseline Euclidean, Newton, iterative projected gradient, SDP, and DML-eig) on iris, wine, and protein, with training ratio 0.9.

To better visualize the results, we also show the distance matrix on the protein data, which actually contains 6 clusters. This is not clear in the Euclidean distance matrix, but a much clearer pattern appears with the learned metric.

Figure 2: Distance matrices under different distance functions (Euclidean, Newton, IPG, and DML-eig).

5.3 Results for the acceleration strategies

1. We explored using RCA as the initialization for the DML-eig algorithm. Unfortunately, this did not reduce the number of iterations.

2. We explored filtering the negative constraints by keeping those with smaller Euclidean distance. The results are shown in the figure below. Both the semidefinite programming algorithm and the DML-eig algorithm converge in fewer iterations, while the performance is only slightly affected.

Figure 3: Iteration numbers with the sampling strategy. Left: iteration number of SDP (tolerance 1e-6) on iris, wine, and protein, original vs. sampled constraints. Right: iteration number of DML-eig on the wine data as a function of the tolerance (1e-6 to 1e-12), original vs. sampled constraints.

References

[1] Xing, Eric P., et al. Distance metric learning with application to clustering with side-information. Advances in Neural Information Processing Systems, 2002.
[2] Kilian Q. Weinberger, Lawrence K. Saul. Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research, 10, 2009.
[3] J. Davis, B. Kulis, P. Jain, S. Sra, and I. Dhillon. Information-theoretic metric learning. In Proceedings of the Twenty-Fourth International Conference on Machine Learning, pages 209-216, 2007.
[4] J. Goldberger, S. Roweis, G. Hinton, and R. Salakhutdinov. Neighbourhood components analysis. In Advances in Neural Information Processing Systems 17, 2004.
[5] Yiming Ying, Peng Li. Distance metric learning with eigenvalue optimization. Journal of Machine Learning Research, 2012.
[6] S. Chopra, R. Hadsell, and Y. LeCun. Learning a similarity metric discriminatively, with application to face verification. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005.
[7] S. C. H. Hoi, W. Liu, M. R. Lyu, and W.-Y. Ma. Learning distance metrics with contextual constraints for image retrieval. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2006.
[8] T. Kato and N. Nagano. Metric learning for enzyme active-site search. Bioinformatics, 26, 2010.
[9] Sturm, J. F. (1999). Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optimization Methods and Software, 11-12. Special issue on interior point methods.
[10] epxing/papers/old_papers/code_metric_online.tar.gz
[11] Alexandre d'Aspremont (2012). Tutorial: Algorithms for large-scale semidefinite programming.
[12] Olivier Devolder, Francois Glineur, Yurii Nesterov. First-order methods of smooth convex optimization with inexact oracle. Math. Program., Ser. A, DOI 10.1007/s.
[13] d'Aspremont, A., Banerjee, O., and El Ghaoui, L. (2008). First-order methods for sparse covariance selection. SIAM Journal on Matrix Analysis and its Applications, 30(1):56-66.
[14] Steven J. Benson, Yinyu Ye, and Xiong Zhang. Solving large-scale sparse semidefinite programs for combinatorial optimization. SIAM J. Optim., 10(2) (19 pages).
[15] Alexandre d'Aspremont (2011). Subsampling algorithms for semidefinite programming. Technical report, arXiv:0803.1990v6.
[16] Aharon Bar-Hillel, Tomer Hertz, Noam Shental, Daphna Weinshall. Learning a Mahalanobis metric from equivalence constraints. Journal of Machine Learning Research, 6, 2005.
[17] Lieven Vandenberghe, Stephen P. Boyd. Semidefinite programming. SIAM Review, 38(1):49-95, March 1996.

Appendix

Newton method by Xing

Xing proved that the original optimization problem is equivalent to minimizing the function

g(A) = \sum_{(x_i,x_j)\in S} \|x_i - x_j\|_A^2 - c \log\Big(\sum_{(x_i,x_j)\in D} \|x_i - x_j\|_A\Big).

When $A$ is a diagonal matrix, Xing points out that it can be minimized cheaply by Newton's method. We derived the gradient and Hessian matrix to analyze the computational complexity. Define $a = [A_{11}, A_{22}, \dots, A_{nn}]^T$ and

\mathrm{dist}(x_i, x_j) = \sqrt{(x_i - x_j)^T \mathrm{diag}(a) (x_i - x_j)}, \qquad \nabla g = \frac{\partial g(A_{11}, \dots, A_{nn})}{\partial a}, \qquad H = \frac{\partial^2 g(A_{11}, \dots, A_{nn})}{\partial a^2}.

With element-wise powers of the difference vectors,

\mathrm{distderive1}(x_i, x_j) = \frac{0.5\, (x_i - x_j)^2}{\mathrm{dist}(x_i, x_j)}, \qquad \mathrm{distderive2}(x_i, x_j) = -\frac{0.25\, (x_i - x_j)^2 \big((x_i - x_j)^2\big)^T}{\mathrm{dist}(x_i, x_j)^3},

\mathrm{sumddist} = \sum_{(x_i,x_j)\in D} \mathrm{dist}(x_i, x_j), \quad \mathrm{sumdderive1} = \sum_{(x_i,x_j)\in D} \mathrm{distderive1}(x_i, x_j), \quad \mathrm{sumdderive2} = \sum_{(x_i,x_j)\in D} \mathrm{distderive2}(x_i, x_j),

\nabla g = \sum_{(x_i,x_j)\in S} (x_i - x_j)^2 - c\, \frac{\mathrm{sumdderive1}}{\mathrm{sumddist}}, \qquad H = -c \left[ \frac{\mathrm{sumdderive2}}{\mathrm{sumddist}} - \frac{\mathrm{sumdderive1}\; \mathrm{sumdderive1}^T}{\mathrm{sumddist}^2} \right].   (8)

The update step is $a := a - t\, H^{-1} \nabla g$, where $t$ is the step size. There are $n$ parameters in $a$ and they are separable, so the computational cost is around $O(n)$. But when $A$ is full rank, the $n^2$ parameters require $O(n^6)$ time to invert the Hessian matrix, which is too expensive to be acceptable.
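A hedged sketch of one such damped Newton step for the diagonal case is given below (NumPy; the helper names are ours and this is not the author's released implementation). X is the data matrix, S and D the similar/dissimilar index-pair lists, c the constant in g, and t the step size.

import numpy as np

def newton_step_diag(a, X, S, D, c=1.0, t=1.0, eps=1e-12):
    dS = np.array([X[i] - X[j] for i, j in S])
    dD = np.array([X[i] - X[j] for i, j in D])
    dist = np.sqrt((dD ** 2) @ a + eps)                 # per-pair distances over D
    sum_dist = dist.sum()
    d1 = 0.5 * (dD ** 2) / dist[:, None]                # first derivatives of each distance w.r.t. a
    sum_d1 = d1.sum(axis=0)
    # second-derivative term of the summed distances (an n x n matrix)
    sum_d2 = -0.25 * np.einsum('pk,pl,p->kl', dD ** 2, dD ** 2, dist ** -3)
    grad = (dS ** 2).sum(axis=0) - c * sum_d1 / sum_dist
    hess = -c * (sum_d2 / sum_dist - np.outer(sum_d1, sum_d1) / sum_dist ** 2)
    step = np.linalg.solve(hess + eps * np.eye(len(a)), grad)
    return np.maximum(a - t * step, 0.0)                # keep the diagonal metric nonnegative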

Projected gradient search by Xing

For a full-rank matrix, Xing proposed an iterative projected gradient search to solve the optimization problem efficiently. The problem is posed in the equivalent form

\max_A \; g(A) = \sum_{(x_i,x_j)\in D} \|x_i - x_j\|_A \quad \text{s.t.} \quad f(A) = \sum_{(x_i,x_j)\in S} \|x_i - x_j\|_A^2 \le 1, \; A \succeq 0.   (9)

The algorithm takes a gradient step on $g(A)$ and then repeatedly projects onto the sets $C_1 = \{A : \sum_{(x_i,x_j)\in S} \|x_i - x_j\|_A^2 \le 1\}$ and $C_2 = \{A : A \succeq 0\}$:

Iterate
    Iterate
        A := P_{C_1}(A)
        A := P_{C_2}(A)
    until A converges
    A := A + t (\nabla_A g(A))_{\perp \nabla_A f(A)}
until convergence

The projection onto the set $C_1$ solves the optimization problem

\min_{A'} \; \|A' - A\|_F^2 \quad \text{s.t.} \quad \sum_{(x_i,x_j)\in S} \|x_i - x_j\|_{A'}^2 \le 1.   (10)

Consider the dual problem. We denote the inner product $\langle A, B\rangle = \mathrm{Tr}(A^T B)$ and define the matrix $X_S$ so that $\sum_{(x_i,x_j)\in S} \|x_i - x_j\|_A^2 = \langle X_S, A\rangle$. The Lagrangian is $L(A', u) = \|A' - A\|_F^2 + u(\langle X_S, A'\rangle - 1)$, and setting

\frac{\partial L(A', u)}{\partial A'} = 2(A' - A) + u X_S = 0 \quad \Longrightarrow \quad A' = A - 0.5\, u X_S.   (11)

Then

g(u) = \min_{A'} L(A', u) = \|{-0.5\, u X_S}\|_F^2 + u(\langle X_S, A - 0.5\, u X_S\rangle - 1) = -0.25\, u^2\, \mathrm{Tr}(X_S^T X_S) + u(\mathrm{Tr}(X_S^T A) - 1).   (12)

The dual problem is $\max_u g(u)$, $u \ge 0$. Setting $g'(u) = 0$, we get

u^* = \frac{2(\mathrm{Tr}(X_S^T A) - 1)}{\mathrm{Tr}(X_S^T X_S)}.   (13)

Using the KKT conditions and substituting (13) into (11), we get

A' = A - \frac{\langle X_S, A\rangle - 1}{\|X_S\|_F^2} X_S.

The projection onto the set $C_2$ is computed through an eigendecomposition:

A = U^T \Sigma U, \quad \Sigma_+ = \max(0, \Sigma), \quad A' = U^T \Sigma_+ U.

Summary

From the derivation of the projection steps, we can see that the projection onto the set $C_1$ has an analytical solution: $X_S$ can be pre-computed and stored, the only cost is matrix multiplication, and the projection step is cheap. The main cost of the projection onto $C_2$ is the matrix decomposition, which usually has $O(n^3)$ time complexity.
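For completeness, a hedged NumPy sketch of the two projection operators described above (our own function names; not the released MATLAB code):

import numpy as np

def project_C1(A, X_S):
    """Project A onto {A : <X_S, A> <= 1} in Frobenius norm, using the analytical solution."""
    inner = np.sum(X_S * A)
    if inner <= 1:
        return A                            # already feasible, projection is the identity
    return A - ((inner - 1) / np.sum(X_S * X_S)) * X_S

def project_C2(A):
    """Project A onto the PSD cone by clipping negative eigenvalues."""
    A = (A + A.T) / 2                       # symmetrize for numerical safety
    w, U = np.linalg.eigh(A)
    return (U * np.maximum(w, 0.0)) @ U.T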
