Efficient Iterative Semi-supervised Classification on Manifold
1. Efficient Iterative Semi-supervised Classification on Manifold
M. Farajtabar, H. R. Rabiee, A. Shaban, A. Soltani-Farani
Sharif University of Technology, Tehran, Iran
Presented by Pooria Joulani, University of Alberta
December 11, 2011
M. Farajtabar et al., Efficient Iterative Semi-supervised Classification on Manifold, December 11, 2011
2. Outline
- 1 Introduction: Graph Transduction
- 2 The Algorithm, Analysis
- 3 The Algorithm, Analysis
- 4 Setup, Scenarios, Summary and Future Works
3. Semi-supervised Learning
- Semi-supervised learning (SSL): utilize unlabeled data to enhance classification.
- Manifold assumption: the labeling function varies smoothly with respect to the underlying manifold.
- The manifold structure is modeled by the neighborhood graph of the data points.
- Applications include image segmentation, handwritten digit recognition, and text classification.
- SSL is advantageous when there is a large amount of unlabeled data, which leads to better utilization of the underlying geometry.
- Large-scale setting: time and memory limitations call for an efficient implementation.
4. Graph Transduction Algorithms
- Graph transduction: a simple form of manifold regularization algorithms.
- It can be formulated as
  $$\arg\min_{x} \; \tfrac{1}{2} x^T A x - b^T x, \quad (1)$$
  where $A \in \mathbb{R}^{n \times n}$ and $b, x \in \mathbb{R}^{n}$.
- This is equivalent to solving the system of linear equations $Ax = b$.
- Fortunately, $A$ is a sparse symmetric positive definite matrix.
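The equivalence between (1) and $Ax = b$ can be spelled out in a short worked step. This is a sketch that assumes only what the slide states, namely that $A$ is symmetric positive definite; the name $\phi$ for the objective is introduced here for illustration.

```latex
\begin{aligned}
\phi(x) &= \tfrac{1}{2} x^T A x - b^T x, \\
\nabla \phi(x) &= \tfrac{1}{2}\,(A + A^T)\,x - b = A x - b
  \quad \text{(because } A = A^T\text{)}, \\
\nabla^2 \phi(x) &= A \succ 0 .
\end{aligned}
```

Because the Hessian is positive definite, $\phi$ is strictly convex, so its unique minimizer is the point where $\nabla \phi(x) = 0$, that is, where $Ax = b$.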
5. Naive Solutions
- Naive solutions require $O(n^3)$ operations. Methods that take the sparse structure of $A$ into account can cost much less.
- Taking the inverse of $A$ directly is an obviously bad choice for several reasons:
  - It requires $O(n^3)$ operations regardless of the sparsity.
  - $A$ may be nearly singular, in which case the inversion is numerically unstable.
  - The inverse of $A$ is usually not sparse, in which case a large amount of memory is needed to store and process $A^{-1}$.
6. Two Approaches
- Reformulate the manifold regularization problem
  - Linear kernel
  - Sparsified regularizer
- Solve the original formulation via
  - Factorization methods: LQ, LU, Cholesky
  - Optimization algorithms: gradient descent, conjugate gradient, quasi-Newton
  - Iterative methods: LP, LGC, LNP
7. Problem Statement
- Let $X_u = \{x_1, \dots, x_u\}$ and $X_l = \{x_{u+1}, \dots, x_{u+l}\}$ be the sets of unlabeled and labeled data points, respectively, where $n = u + l$.
- Let $y$ be a vector of length $n$ with $y_i = 0$ for unlabeled $x_i$ and $y_i$ equal to $-1$ or $1$, according to the class label, for labeled $x_i$.
- Our goal is to predict the labels of $X = X_u \cup X_l$ as $f$.
- Let $W$ be the weight matrix of the k-NN graph of $X$,
  $$W(i, j) = \exp\left(-\|x_i - x_j\|^2 / 2\sigma\right), \quad (2)$$
  where $\sigma$ is the bandwidth parameter.
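The graph construction in (2) can be sketched in a few lines of plain Python. The following is a minimal illustration, not the authors' implementation: the function name `knn_weights`, the dict-of-dicts sparse layout, the brute-force neighbor search, and the rule that an edge is kept when either endpoint selects the other are all assumptions made here.

```python
import math

def knn_weights(points, k, sigma):
    """Return a symmetric sparse weight matrix as a dict of dicts.

    W[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma)) when j is among the k
    nearest neighbors of i, or i is among those of j. Missing entries are zero.
    """
    n = len(points)
    W = {i: {} for i in range(n)}
    for i in range(n):
        # Squared Euclidean distance from point i to every other point.
        dists = []
        for j in range(n):
            if j != i:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                dists.append((d2, j))
        dists.sort()
        for d2, j in dists[:k]:
            w = math.exp(-d2 / (2.0 * sigma))
            # Symmetrize: keep the edge if either endpoint selects the other.
            W[i][j] = w
            W[j][i] = w
    return W

# Two small clusters in the plane (toy data, for illustration only).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
W = knn_weights(points, k=2, sigma=1.0)
```

The brute-force search costs $O(n^2)$ distance evaluations, so it only suits small examples; each row of `W` stores just the selected neighbors, which is what keeps later matrix-vector products cheap.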
8. Problem Statement (cont.)
- The family of graph transduction algorithms can be formulated as the following optimization problem:
  $$\arg\min_{f} \; f^T Q f + (f - y)^T C (f - y), \quad (3)$$
  where $Q$ is a regularization matrix and $C$ is a diagonal matrix with $C_{ii}$ equal to the importance of the $i$-th node sticking to $y_i$.
- The first term represents the smoothness of the predicted labels with respect to the underlying manifold.
- The second term is the squared error of the predicted labels compared with the initial ones, weighted by $C$.
- Choosing different $Q$ and $C$ leads to various manifold classification methods:
  - Tikhonov Regularization
  - Label Propagation and Harmonic Solution
  - Local and Global Consistency
9. Problem Statement (cont.)
- Define the diagonal matrix $D$ with $D(i, i) = \sum_{j=1}^{n} W(i, j)$ and symmetrically normalize $W$ by $S = D^{-1/2} W D^{-1/2}$.
- The Laplacian matrix is $L = I - S$.
- In Local and Global Consistency (LGC), $Q = L$ and $C = \mu I$, i.e., we want to minimize
  $$R(f) = f^T L f + (f - y)^T C (f - y). \quad (4)$$
- It is easily shown that the solution is
  $$f = (L + C)^{-1} C y = (1 - \alpha)(I - \alpha S)^{-1} y, \quad (5)$$
  where $\alpha = \frac{1}{\mu + 1}$.
- An iterative algorithm to compute this solution:
  $$f^{(t+1)} = \alpha S f^{(t)} + (1 - \alpha) y. \quad (6)$$
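The normalization and the iteration (6) can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `normalize` and `lgc`, the dict-of-dicts matrix layout, the toy chain graph, and the stopping rule based on the change in $f$ are choices made here, not details from the slides. The fixed point of (6) is $(1-\alpha)(I - \alpha S)^{-1} y$, a positive multiple of $(I - \alpha S)^{-1} y$, so the predicted signs are the same either way.

```python
import math

def normalize(W):
    """Symmetric normalization S = D^{-1/2} W D^{-1/2} for a dict-of-dicts W."""
    deg = {i: sum(row.values()) for i, row in W.items()}
    return {i: {j: w / math.sqrt(deg[i] * deg[j]) for j, w in row.items()}
            for i, row in W.items()}

def lgc(S, y, mu, tol=1e-10, max_iter=10_000):
    """Iterate f <- alpha * S f + (1 - alpha) * y with alpha = 1 / (mu + 1)."""
    alpha = 1.0 / (mu + 1.0)
    f = list(y)
    for t in range(max_iter):
        new_f = [alpha * sum(w * f[j] for j, w in S[i].items())
                 + (1.0 - alpha) * y[i]
                 for i in range(len(y))]
        change = max(abs(a - b) for a, b in zip(new_f, f))
        f = new_f
        if change < tol:  # assumed stopping rule: largest entry change
            break
    return f, t + 1

# Toy chain graph 0 - 1 - 2 - 3 with unit weights; nodes 0 and 3 are labeled.
W = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0, 3: 1.0}, 3: {2: 1.0}}
y = [1.0, 0.0, 0.0, -1.0]
f, iters = lgc(normalize(W), y, mu=0.5)
labels = [1 if v > 0 else -1 for v in f]
```

Each pass touches every stored edge once, so the cost per iteration is proportional to the number of nonzeros in $S$. The sketch assumes every node has at least one neighbor, otherwise the degree normalization divides by zero.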
10. Gradient Descent
- The gradient of the objective function is $\nabla R = 2(Lf + C(f - y))$.
- Gradient descent update rule:
  $$f^{(t+1)} = f^{(t)} - 2\alpha \left(L f^{(t)} + C(f^{(t)} - y)\right). \quad (7)$$
- The stopping criterion is $\|\nabla R\| \le \eta$.
- Choosing $\alpha$ appropriately is essential for convergence.
- Applying exact line search at iteration $t$:
  $$t \le \frac{\log\left(\frac{R^{(0)} - R^{*}}{R^{(t)} - R^{*}}\right)}{\log(1/z)}, \quad (8)$$
  where $z$ is a constant equal to $1 - \frac{\lambda_{\min}(L + C)}{\lambda_{\max}(L + C)}$.
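For the quadratic objective with $C = \mu I$, the exact line search has a closed form: minimizing $R(f - \alpha g)$ over $\alpha$ with $g = \nabla R$ gives $\alpha = g^T g / (g^T \nabla^2 R\, g)$, where $\nabla^2 R = 2(L + C)$. The sketch below applies that step on a toy graph. The function names, the starting point $f^{(0)} = 0$, the tolerance, and the hand-written matrix $S$ are assumptions made for illustration.

```python
import math

def apply_A(S, v, mu):
    """Compute (L + C) v = (1 + mu) v - S v for L = I - S and C = mu * I."""
    return [(1.0 + mu) * v[i] - sum(w * v[j] for j, w in S[i].items())
            for i in range(len(v))]

def gradient_descent(S, y, mu, eta=1e-8, max_iter=10_000):
    """Minimize R(f) = f^T L f + mu * ||f - y||^2 by steepest descent."""
    n = len(y)
    f = [0.0] * n
    for t in range(max_iter):
        Af = apply_A(S, f, mu)
        g = [2.0 * (Af[i] - mu * y[i]) for i in range(n)]  # gradient of R
        gg = sum(x * x for x in g)
        if math.sqrt(gg) <= eta:  # stopping criterion ||grad R|| <= eta
            return f, t
        Ag = apply_A(S, g, mu)
        # Exact line search for a quadratic: alpha = g^T g / (g^T (2A) g).
        alpha = gg / (2.0 * sum(a * b for a, b in zip(g, Ag)))
        f = [f[i] - alpha * g[i] for i in range(n)]
    return f, max_iter

# Normalized chain graph 0 - 1 - 2 - 3 (S = D^{-1/2} W D^{-1/2}, unit weights).
r = 1.0 / math.sqrt(2.0)
S = {0: {1: r}, 1: {0: r, 2: 0.5}, 2: {1: 0.5, 3: r}, 3: {2: r}}
y = [1.0, 0.0, 0.0, -1.0]
f, iters = gradient_descent(S, y, mu=0.5)
```

Each iteration uses two sparse products with $L + C$, one for the gradient and one for the step size, plus a few vector sums.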
11. Gradient Descent (cont.)
Theorem 1. The maximum number of iterations for gradient descent with exact line search and fixed $(\eta, \mu)$ is $O(\log n)$.
- To be exact:
  $$t \le \frac{(2 + \mu)\log\left(\frac{2n}{\eta}\right)}{2\log\left(1 + \frac{\mu}{2}\right)}. \quad (9)$$
- Each iteration costs a sparse matrix-vector multiplication plus vector sums.
- This is $O(n)$ per iteration, given that the neighborhood size $k$ is constant and small.
- The result is an $O(n \log n)$ rate of growth with respect to the number of data points $n$.
- The bound is valid for other graph transduction algorithms.
12. Newton's Algorithm
- Newton's update rule for our problem is
  $$f^{(t+1)} = f^{(t)} - \alpha (\nabla^2 R)^{-1} \nabla R. \quad (10)$$
- Approximating the inverse Hessian:
  $$(\nabla^2 R)^{-1} = \tfrac{1}{2}(L + C)^{-1} = \tfrac{1}{2}(I - S + C)^{-1} = \tfrac{1}{2}\left(I - (I + C)^{-1} S\right)^{-1}(I + C)^{-1} = \tfrac{1}{2}\left(\sum_{i=0}^{\infty}\left((I + C)^{-1} S\right)^{i}\right)(I + C)^{-1}. \quad (11)$$
- Using the first $m$ terms of the above summation leads to an approximation of the inverse Hessian:
  $$(\nabla^2 R)^{-1} \approx \tfrac{1}{2}\left(\sum_{i=0}^{m-1}\left((I + C)^{-1} S\right)^{i}\right)(I + C)^{-1}. \quad (12)$$
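Two small facts make the series expansion and its truncation work. The sketch below assumes the LGC choice $C = \mu I$ with $\mu > 0$ and uses the shorthand $H = (I + C)^{-1} S$; it relies on the standard property that the eigenvalues of the symmetrically normalized matrix $S$ lie in $[-1, 1]$.

```latex
\begin{aligned}
H &= (I + C)^{-1} S = \tfrac{1}{1 + \mu}\, S
  \;\;\Longrightarrow\;\; \rho(H) \le \tfrac{1}{1 + \mu} < 1, \\
(I - H)^{-1} &= \sum_{i=0}^{\infty} H^{i}
  \quad \text{(the Neumann series converges because } \rho(H) < 1\text{)}, \\
\Bigl(\sum_{i=0}^{m-1} H^{i}\Bigr)(I - H) &= I - H^{m}
  \quad \text{(telescoping sum)} .
\end{aligned}
```

The last identity says that the truncated sum inverts $I - H$ up to the factor $I - H^{m}$, whose distance from the identity shrinks at least as fast as $(1 + \mu)^{-m}$.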
13. Approximate Newton's Algorithm
- Rewriting Newton's method with the approximated inverse Hessian and doing some math:
  $$f^{(t+1)} = H^{m} f^{(t)} + g_m, \quad (13)$$
  where
  $$H = (I + C)^{-1} S, \quad (14)$$
  $$g_m = \left(\sum_{i=0}^{m-1} H^{i}\right)(I + C)^{-1} C y. \quad (15)$$
- This update rule is performed iteratively from an initial $f^{(0)}$ until the stopping criterion $\|\nabla R\| \le \eta$ is reached.
- LGC's default iterative procedure is a special case of the proposed method with $m = 1$.
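The update (13) to (15) can be sketched in plain Python for the LGC case $C = \mu I$, where $H = S / (1 + \mu)$. This is an illustration under stated assumptions, not the authors' code: the helper names, the dict-of-dicts sparse layout, the starting point $f^{(0)} = 0$, and the choice to form $H^m$ once before iterating are all made here.

```python
import math

def matvec(M, v):
    """Sparse matrix-vector product for a dict-of-dicts matrix M."""
    return [sum(w * v[j] for j, w in M[i].items()) for i in range(len(v))]

def matmul(A, B):
    """Sparse matrix-matrix product for dict-of-dicts matrices."""
    out = {i: {} for i in A}
    for i, row in A.items():
        for k, a in row.items():
            for j, b in B[k].items():
                out[i][j] = out[i].get(j, 0.0) + a * b
    return out

def approx_newton(S, y, mu, m, eta=1e-8, max_iter=10_000):
    """Iterate f <- H^m f + g_m with H = S / (1 + mu), for C = mu * I."""
    n = len(y)
    H = {i: {j: w / (1.0 + mu) for j, w in row.items()} for i, row in S.items()}
    # g_m = (sum_{i<m} H^i) (I + C)^{-1} C y, accumulated term by term.
    term = [mu / (1.0 + mu) * v for v in y]
    g_m = [0.0] * n
    for _ in range(m):
        g_m = [a + b for a, b in zip(g_m, term)]
        term = matvec(H, term)
    # H^m is formed once; it has more nonzeros than H when m > 1.
    Hm = H
    for _ in range(m - 1):
        Hm = matmul(Hm, H)
    f = [0.0] * n
    for t in range(max_iter):
        Sf = matvec(S, f)
        # grad R = 2 ((1 + mu) f - S f - mu y)
        grad = [2.0 * ((1.0 + mu) * f[i] - Sf[i] - mu * y[i]) for i in range(n)]
        if math.sqrt(sum(x * x for x in grad)) <= eta:
            return f, t
        f = [a + b for a, b in zip(matvec(Hm, f), g_m)]
    return f, max_iter

# Normalized chain graph 0 - 1 - 2 - 3 (S = D^{-1/2} W D^{-1/2}, unit weights).
r = 1.0 / math.sqrt(2.0)
S = {0: {1: r}, 1: {0: r, 2: 0.5}, 2: {1: 0.5, 3: r}, 3: {2: r}}
y = [1.0, 0.0, 0.0, -1.0]
f1, it1 = approx_newton(S, y, mu=0.5, m=1)  # same update as LGC
f2, it2 = approx_newton(S, y, mu=0.5, m=2)
```

With $m = 2$ each step advances as far as two LGC steps, so fewer iterations are needed, at the price of a denser $H^m$. In this sketch the gradient check adds one extra product with $S$ per iteration; a cheaper stopping test could be substituted.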
14. Analysis
Theorem 2. The approximate Newton's method converges to the solution of LGC.
Theorem 3. For the approximate Newton's method, the stopping criterion $\|\nabla R\| \le \eta$ is reached in $O(\log n)$ iterations.
- To be exact:
  $$t \le \frac{\log\left(\frac{(2 + \mu)\, n}{\eta}\right)}{m \log(1 + \mu)}. \quad (16)$$
- $m$ is empirically set to 1, 2, or 3. A larger $m$ disturbs sparsity.
- Given that the neighborhood size $k$ is constant and small, the cost of each iteration equals a sparse matrix-vector multiplication, i.e., $O(n)$.
- Given that $\eta$ and $\mu$ are constant, the time complexity is $O(n \log n)$.
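To get a feel for how slowly the bound (16) grows with $n$, it can be evaluated directly. In the sketch below the function name and the tolerance `eta = 1e-3` are hypothetical choices made only for illustration; $\mu = 0.5$ matches the experimental setup reported later in the slides.

```python
import math

def newton_iteration_bound(n, mu, eta, m):
    """Right-hand side of (16): log((2 + mu) n / eta) / (m log(1 + mu))."""
    return math.log((2.0 + mu) * n / eta) / (m * math.log(1.0 + mu))

mu = 0.5    # value used in the experiments
eta = 1e-3  # hypothetical tolerance, chosen only for illustration
for n in (1_000, 10_000, 100_000):
    bounds = [math.ceil(newton_iteration_bound(n, mu, eta, m)) for m in (1, 2, 3)]
    print(n, bounds)
```

Multiplying $n$ by ten adds only $\log(10) / (m \log(1 + \mu))$ iterations to the bound, and doubling $m$ halves it.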
15. Illustration
- Optimization for two data points from MNIST.
- [Figure: optimization paths of gradient descent, LGC (approximate Newton, m = 1), and the approximate method with m = 2.]
- Consider the directions that the methods find.
16. Setup
- Data from two classes of MNIST: handwritten digit recognition
- Data from two classes of Covertype: forest cover prediction
- 7000 data points from the Classic dataset: text categorization
- Comparison with CHOLMOD and LGC's default implementation
- 5-NN for neighborhood construction
- Bandwidth set to the mean of the standard deviation of the data
- 2% of the data points are labeled
- $\mu$ is set to 0.5
- $\eta$ is set to a value that empirically ensures convergence to the optimal solutions
- The number of iterations, accuracy, and distance to optimum are reported as averages over 10 runs with different random labelings
17. Number of Iterations
[Figure: number of iterations versus number of data for LGC, approximate Newton (m = 2), and gradient descent. Panels: (a) MNIST, (b) Covertype, (c) Classic.]
18. Accuracy
[Figure: accuracy versus number of data for LGC, approximate Newton (m = 2), gradient descent, and CHOLMOD. Panels: (d) MNIST, (e) Covertype, (f) Classic.]
19. Distance from Optimum
[Figure: $\|f^{(t)} - f^{*}\|$ versus number of iterations for LGC, approximate Newton (m = 2), and gradient descent. Panels: (g) MNIST, (h) Covertype, (i) Classic.]
20. Time
[Figure: duration (sec) versus number of data on MNIST. Panel (j): approximate Newton (m = 2) and CHOLMOD. Panel (k): LGC, approximate Newton, and gradient descent.]
21. Summary and Future Works
Summary:
- A novel approximation to Newton's method is proposed for solving graph transduction problems.
- A theoretical analysis of the number of iterations is given for the proposed method and for gradient descent.
- The number of iterations has a logarithmic dependence on the number of data points.
- This is a reasonable approach when a large amount of data is being classified.
Future works:
- Analysis of robustness against noise
- Incorporating a low-cost line search into the proposed method
22. Thanks for your attention.
Neural Networks CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Biological and artificial neural networks Feed-forward neural networks Single layer
More informationA Sparse and Locally Shift Invariant Feature Extractor Applied to Document Images
A Sparse and Locally Shift Invariant Feature Extractor Applied to Document Images Marc Aurelio Ranzato Yann LeCun Courant Institute of Mathematical Sciences New York University - New York, NY 10003 Abstract
More informationMULTICORE LEARNING ALGORITHM
MULTICORE LEARNING ALGORITHM CHENG-TAO CHU, YI-AN LIN, YUANYUAN YU 1. Summary The focus of our term project is to apply the map-reduce principle to a variety of machine learning algorithms that are computationally
More informationUnsupervised Outlier Detection and Semi-Supervised Learning
Unsupervised Outlier Detection and Semi-Supervised Learning Adam Vinueza Department of Computer Science University of Colorado Boulder, Colorado 832 vinueza@colorado.edu Gregory Z. Grudic Department of
More informationParallel and Distributed Sparse Optimization Algorithms
Parallel and Distributed Sparse Optimization Algorithms Part I Ruoyu Li 1 1 Department of Computer Science and Engineering University of Texas at Arlington March 19, 2015 Ruoyu Li (UTA) Parallel and Distributed
More informationSemiBoost: Boosting for Semi-supervised Learning
To appear in the IEEE Transactions on Pattern Analysis and Machine Intelligence. SemiBoost: Boosting for Semi-supervised Learning Pavan Kumar Mallapragada, Student Member, IEEE, Rong Jin, Member, IEEE,
More informationData fusion and multi-cue data matching using diffusion maps
Data fusion and multi-cue data matching using diffusion maps Stéphane Lafon Collaborators: Raphy Coifman, Andreas Glaser, Yosi Keller, Steven Zucker (Yale University) Part of this work was supported by
More informationAsynchronous Multi-Task Learning
Asynchronous Multi-Task Learning Inci M. Baytas, Ming Yan, Anil K. Jain and Jiayu Zhou December 14th, 2016 ICDM 2016 Inci M. Baytas, Ming Yan, Anil K. Jain and Jiayu Zhou 1 Outline 1 Introduction 2 Solving
More informationUnsupervised and Semi-Supervised Learning vial 1 -Norm Graph
Unsupervised and Semi-Supervised Learning vial -Norm Graph Feiping Nie, Hua Wang, Heng Huang, Chris Ding Department of Computer Science and Engineering University of Texas, Arlington, TX 769, USA {feipingnie,huawangcs}@gmail.com,
More informationTraining of Neural Networks. Q.J. Zhang, Carleton University
Training of Neural Networks Notation: x: input of the original modeling problem or the neural network y: output of the original modeling problem or the neural network w: internal weights/parameters of
More informationGraph-based Semi- Supervised Learning as Optimization
Graph-based Semi- Supervised Learning as Optimization Partha Pratim Talukdar CMU Machine Learning with Large Datasets (10-605) April 3, 2012 Graph-based Semi-Supervised Learning 0.2 0.1 0.2 0.3 0.3 0.2
More informationUnlabeled Data Classification by Support Vector Machines
Unlabeled Data Classification by Support Vector Machines Glenn Fung & Olvi L. Mangasarian University of Wisconsin Madison www.cs.wisc.edu/ olvi www.cs.wisc.edu/ gfung The General Problem Given: Points
More informationCS 340 Lec. 4: K-Nearest Neighbors
CS 340 Lec. 4: K-Nearest Neighbors AD January 2011 AD () CS 340 Lec. 4: K-Nearest Neighbors January 2011 1 / 23 K-Nearest Neighbors Introduction Choice of Metric Overfitting and Underfitting Selection
More informationLocality Preserving Projections (LPP) Abstract
Locality Preserving Projections (LPP) Xiaofei He Partha Niyogi Computer Science Department Computer Science Department The University of Chicago The University of Chicago Chicago, IL 60615 Chicago, IL
More informationKernels and representation
Kernels and representation Corso di AA, anno 2017/18, Padova Fabio Aiolli 20 Dicembre 2017 Fabio Aiolli Kernels and representation 20 Dicembre 2017 1 / 19 (Hierarchical) Representation Learning Hierarchical
More informationVisual Understanding via Multi-Feature Shared Learning with Global Consistency
Visual Understanding via Multi-Feature Shared Learning with Global Consistency Lei Zhang, Member, IEEE, and David Zhang, Fellow, IEEE Abstract Image/video data is usually represented with multiple visual
More informationNeural Networks: Optimization Part 1. Intro to Deep Learning, Fall 2018
Neural Networks: Optimization Part 1 Intro to Deep Learning, Fall 2018 1 Story so far Neural networks are universal approximators Can model any odd thing Provided they have the right architecture We must
More informationCS281 Section 3: Practical Optimization
CS281 Section 3: Practical Optimization David Duvenaud and Dougal Maclaurin Most parameter estimation problems in machine learning cannot be solved in closed form, so we often have to resort to numerical
More information