Incremental Gradient, Subgradient, and Proximal Methods for Convex Optimization: A Survey. Chapter 4 : Optimization for Machine Learning
- Randolf Harris
- 5 years ago
Transcription
1 Incremental Gradient, Subgradient, and Proximal Methods for Convex Optimization: A Survey Chapter 4 : Optimization for Machine Learning
2 Summary of Chapter 2 Chapter 2: Convex Optimization with Sparsity-Inducing Norms. This chapter covers convex optimization problems of the form $\min_{w} f(w) + \lambda \Omega(w)$, where f is a convex differentiable function and Ω is a sparsity-inducing, non-smooth norm. Examples of Ω include the l1 norm, the l1 + l1/lq norm, and hierarchical l1/lq norms. Algorithms covered include subgradient methods, block coordinate descent, and reweighted l2 algorithms.
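3 Summary of Chapter 3 This chapter covers cone linear and quadratic programming. A cone linear program has the form: minimize $c^T x$ subject to $Gx \preceq_C h$, $Ax = b$, where $\preceq_C$ is a generalized inequality, $u \preceq_C v \iff v - u \in C$, and C is a closed, pointed convex cone. Examples of cones: 1) the non-negative orthant $\{u : u_k \ge 0 \text{ for all } k\}$; 2) the second-order cone $\{(u_0, u_1) : \|u_1\|_2 \le u_0\}$. The Python package CVXOPT solves such conic problems.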
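4 Introduction This chapter considers optimization problems with cost functions of the form: minimize $f(x) = \sum_{i=1}^m f_i(x)$ subject to $x \in X$, where the number of components m is very large. It is therefore attractive to use incremental methods, which operate on a single component $f_i$ at each iteration rather than on the entire cost function.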
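5 Least Squares and Related Inference Problems Classical regression: minimize $\sum_{i=1}^m (c_i^T x - d_i)^2$ subject to $x \in \mathbb{R}^n$. L1-regularization problem: minimize $\gamma \|x\|_1 + \sum_{i=1}^m (c_i^T x - d_i)^2$ subject to $x \in \mathbb{R}^n$. Other possibilities include using non-quadratic convex loss functions.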
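6 Dual Optimization in Separable Problems Problems of the form: maximize $\sum_{i=1}^m c_i(y_i)$ subject to $\sum_{i=1}^m g_i(y_i) \ge 0$, $y_i \in Y_i$, where the sets $Y_i$ may be non-convex, have a dual of the form: minimize $\sum_{i=1}^m f_i(x)$ subject to $x \ge 0$, where $f_i(x) = \sup_{y_i \in Y_i} \{ c_i(y_i) + x^T g_i(y_i) \}$.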
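7 Weber Problem in Location Theory Find a point x that minimizes the sum of weighted distances from a given set of points $y_1, y_2, \ldots, y_m$: minimize $\sum_{i=1}^m w_i \|x - y_i\|$ subject to $x \in \mathbb{R}^n$, where $w_i$ are given positive weights.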
8 Incremental Gradient Methods: Differentiable Problems When the component functions are differentiable, we may use incremental gradient methods of the form $x_{k+1} = P_X(x_k - \alpha_k \nabla f_{i_k}(x_k))$, where $i_k$ is the index of the cost component iterated on, $\alpha_k$ is a positive step size, and $P_X$ denotes projection onto X. Such methods make fast progress when far from convergence but are slow when close to convergence. Remedies: use a constant step size, or reduce the step size only down to a small positive value.
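As an illustration, the following is a minimal sketch of a cyclic incremental gradient method with a constant step size, applied to components $f_i(x) = \frac{1}{2}(c_i^T x - d_i)^2$ with $X = \mathbb{R}^n$ (so no projection is needed). The function name, the toy data, and the step size are assumptions chosen for the example, not taken from the slides. The toy system is consistent, which is why a constant step reaches the exact solution here; in general a constant step only reaches a neighborhood of the solution.

```python
def incremental_gradient(C, d, alpha=0.5, cycles=100):
    """Cyclic incremental gradient for sum_i 0.5 * (c_i'x - d_i)^2 (illustrative sketch)."""
    m, n = len(C), len(C[0])
    x = [0.0] * n
    for k in range(cycles * m):
        i = k % m  # index i_k of the component iterated on (cyclic order)
        residual = sum(c * xj for c, xj in zip(C[i], x)) - d[i]
        # Gradient of the single component f_i at x is residual * c_i.
        x = [xj - alpha * residual * c for xj, c in zip(x, C[i])]
    return x


# Hypothetical consistent system whose solution is x = (1, 2).
C_demo = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
d_demo = [1.0, 2.0, 3.0]
x_ig = incremental_gradient(C_demo, d_demo)
print(x_ig)
```

Each update touches only one row of the data, which is the point of the incremental approach when m is very large.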
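9 Variants of the Incremental Gradient Method Gradient method with momentum: $x_{k+1} = x_k - \alpha_k \nabla f_{i_k}(x_k) + \beta (x_k - x_{k-1})$, with $\beta \in [0, 1)$. Aggregated component gradients: $x_{k+1} = x_k - \alpha_k \sum_{l=0}^{m-1} \nabla f_{i_{k-l}}(x_{k-l})$. Incremental gradient methods are also related to the stochastic gradient method.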
10 Incremental Subgradient Methods These methods apply when the component functions are convex and nondifferentiable. In place of the gradient, an arbitrary subgradient $\tilde{\nabla} f_{i_k}(x_k)$ is used: $x_{k+1} = P_X(x_k - \alpha_k \tilde{\nabla} f_{i_k}(x_k))$. Convexity of $f_i(x)$ is essential. Even non-incremental subgradient methods have only a sublinear rate of convergence, so incremental methods are favored.
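A minimal sketch of an incremental subgradient method, applied to the Weber problem with components $f_i(x) = w_i \|x - y_i\|$ and $X = \mathbb{R}^2$, might look as follows. The function name, the diminishing step size $\alpha_k = 1/(k+1)$ per cycle, and the four sample points are assumptions made for illustration. The points are the corners of a square with equal weights, so by symmetry the minimizer is the origin.

```python
import math


def incremental_subgradient_weber(points, weights, x0, cycles=2000):
    """Cyclic incremental subgradient for sum_i w_i * ||x - y_i|| (illustrative sketch)."""
    x = list(x0)
    for k in range(cycles):
        alpha = 1.0 / (k + 1)  # diminishing step size, held fixed within a cycle
        for y, w in zip(points, weights):
            diff = [xj - yj for xj, yj in zip(x, y)]
            dist = math.sqrt(sum(v * v for v in diff))
            if dist == 0.0:
                continue  # at x = y_i the zero vector is a valid subgradient
            # A subgradient of w * ||x - y|| at x != y is w * (x - y) / ||x - y||.
            x = [xj - alpha * w * v / dist for xj, v in zip(x, diff)]
    return x


# Hypothetical data: corners of a square, equal weights.
square = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
x_weber = incremental_subgradient_weber(square, [1.0] * 4, [3.0, 2.0])
print(x_weber)
```

The iterate ends close to the origin but not exactly on it, since each cycle leaves an error on the order of the current step size.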
11 Incremental Proximal Methods These methods use iterations of the form $x_{k+1} = \arg\min_{x \in X} \{ f_{i_k}(x) + \frac{1}{2\alpha_k} \|x - x_k\|^2 \}$. This form is desirable because, for some components, the proximal iteration can be obtained in closed form. Proximal iterations are also considered more stable than gradient or subgradient iterations.
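To make the closed-form remark concrete: for a quadratic component $f_i(x) = \frac{1}{2}(c_i^T x - d_i)^2$ with $X = \mathbb{R}^n$, setting the gradient of the proximal objective to zero gives $x_{k+1} = x_k - \frac{\alpha_k}{1 + \alpha_k \|c_i\|^2} c_i (c_i^T x_k - d_i)$. The sketch below uses this formula; the function name, the toy data, and the deliberately large step size are assumptions for illustration. The large step is meant to show the stability noted above, since a plain gradient step of that size would diverge on this data.

```python
def incremental_proximal(C, d, alpha=100.0, cycles=50):
    """Cyclic incremental proximal method for sum_i 0.5 * (c_i'x - d_i)^2 (illustrative sketch)."""
    m, n = len(C), len(C[0])
    x = [0.0] * n
    for _ in range(cycles):
        for i in range(m):
            residual = sum(c * xj for c, xj in zip(C[i], x)) - d[i]
            norm_sq = sum(c * c for c in C[i])
            # Closed-form minimizer of f_i(x) + ||x - x_k||^2 / (2 * alpha).
            scale = alpha / (1.0 + alpha * norm_sq)
            x = [xj - scale * residual * c for xj, c in zip(x, C[i])]
    return x


# Hypothetical consistent system whose solution is x = (1, 2).
C_prox = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
d_prox = [1.0, 2.0, 3.0]
x_prox = incremental_proximal(C_prox, d_prox)
print(x_prox)
```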
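12 Incremental Subgradient-Proximal Methods These methods combine proximal and subgradient iterations. For problems of the form: minimize $\sum_{i=1}^m (f_i(x) + h_i(x))$ subject to $x \in X$, a proximal iteration on $f_{i_k}$ is followed by a subgradient iteration on $h_{i_k}$: $z_k = \arg\min_{x \in X} \{ f_{i_k}(x) + \frac{1}{2\alpha_k} \|x - x_k\|^2 \}$, $x_{k+1} = P_X(z_k - \alpha_k \tilde{\nabla} h_{i_k}(z_k))$.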
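13 Both $z_k$ and $x_k$ lie within the constraint set X, but the constraint can be relaxed for either the proximal or the subgradient iteration, which leads to easier computation. The iterations on the previous slide can thus be rewritten as $z_k = \arg\min_{x \in \mathbb{R}^n} \{ f_{i_k}(x) + \frac{1}{2\alpha_k} \|x - x_k\|^2 \}$, $x_{k+1} = P_X(z_k - \alpha_k \tilde{\nabla} h_{i_k}(z_k))$, or as $z_k = x_k - \alpha_k \tilde{\nabla} h_{i_k}(x_k)$, $x_{k+1} = \arg\min_{x \in X} \{ f_{i_k}(x) + \frac{1}{2\alpha_k} \|x - z_k\|^2 \}$, where $f_{i_k}$ is the component handled by the proximal step and $h_{i_k}$ the component handled by the subgradient step. Incremental proximal iterations are closely related to subgradient iterations: the unconstrained proximal step satisfies $z_k = x_k - \alpha_k \tilde{\nabla} f_{i_k}(z_k)$ for some subgradient of $f_{i_k}$ at $z_k$. The two steps of the first form can therefore be written as one step: $x_{k+1} = P_X(x_k - \alpha_k \tilde{\nabla} F_{i_k}(z_k))$, where $F_i = f_i + h_i$.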
14 Order of Components The effectiveness of the incremental subgradient-proximal method depends on the order in which the component pairs {fi, hi} are chosen. 1) Cyclic order: the pairs {fi, hi} are taken in a fixed deterministic order. 2) Randomized order based on uniform sampling: at each iteration a pair {fi, hi} is chosen at random. Both orders converge; however, the randomized order has a better convergence-rate guarantee than the cyclic order.
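15 Applications: Regularized Least Squares Consider problems of the form: minimize $R(x) + \frac{1}{2} \sum_{i=1}^m (c_i^T x - d_i)^2$ subject to $x \in \mathbb{R}^n$, where $R(x) = \gamma \|x\|_1$ is the l1 norm scaled by $\gamma > 0$. The proximal iteration on R then becomes $z_k = \arg\min_{x \in \mathbb{R}^n} \{ \gamma \|x\|_1 + \frac{1}{2\alpha_k} \|x - x_k\|^2 \}$.
16 Applications: Regularized Least Squares The proximal iteration decomposes into n scalar minimizations, $z_k^j = \arg\min_{x^j} \{ \gamma |x^j| + \frac{1}{2\alpha_k} (x^j - x_k^j)^2 \}$, $j = 1, \ldots, n$, each with the closed-form (shrinkage) solution $z_k^j = x_k^j - \gamma\alpha_k$ if $x_k^j \ge \gamma\alpha_k$; $z_k^j = 0$ if $|x_k^j| < \gamma\alpha_k$; $z_k^j = x_k^j + \gamma\alpha_k$ if $x_k^j \le -\gamma\alpha_k$. Incremental algorithms are well suited to such problems because the proximal update can be done in closed form. It is followed by a gradient iteration on a least-squares component: $x_{k+1} = z_k - \alpha_k c_{i_k} (c_{i_k}^T z_k - d_{i_k})$.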
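The proximal-then-gradient scheme for l1-regularized least squares can be sketched as follows. This is an illustration under stated assumptions rather than the exact algorithm of the slides: the function names, the step size $\alpha_k = 1/(k+1)$ per cycle, and the toy data are hypothetical, and the regularizer is split evenly across the m components (each proximal step shrinks by $\alpha_k \gamma / m$) so that one full cycle accounts for $R$ once.

```python
def soft_threshold(v, t):
    """Closed-form minimizer of t * |u| + 0.5 * (u - v)^2 over u (shrinkage)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def incremental_prox_gradient(C, d, gamma, cycles=2000):
    """Proximal step on the l1 term, then a gradient step on one least-squares component."""
    m, n = len(C), len(C[0])
    x = [0.0] * n
    for k in range(cycles):
        alpha = 1.0 / (k + 1)  # diminishing step size, held fixed within a cycle
        for i in range(m):
            # Proximal iteration on (gamma / m) * ||x||_1, done coordinate by coordinate.
            z = [soft_threshold(xj, alpha * gamma / m) for xj in x]
            # Gradient iteration on 0.5 * (c_i'z - d_i)^2.
            residual = sum(c * zj for c, zj in zip(C[i], z)) - d[i]
            x = [zj - alpha * residual * c for zj, c in zip(z, C[i])]
    return x


# Hypothetical data with C = identity, so the exact solution is
# soft_threshold(d_i, gamma) in each coordinate, i.e. (2, 0).
x_l1 = incremental_prox_gradient([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], gamma=1.0)
print(x_l1)
```

With an identity design matrix the problem separates by coordinate, which makes the expected solution easy to check by hand; the second coordinate is driven to (nearly) zero, showing the sparsity-inducing effect of the l1 term.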
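17 Iterated Projection Algorithm for the Feasibility Problem The feasibility problem has the form: minimize $f(x)$ subject to $x \in \bigcap_{i=1}^m X_i$. For Lipschitz continuous f and sufficiently large γ, it can be rewritten as: minimize $f(x) + \gamma \sum_{i=1}^m \mathrm{dist}(x, X_i)$, where $\mathrm{dist}(x, X_i)$ is the Euclidean distance from x to $X_i$. Incremental algorithms apply to this form.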
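A minimal sketch of the iterated projection idea, for the special case $f \equiv 0$ where the goal is only to find a point in the intersection, is given below. A proximal step on $\gamma\,\mathrm{dist}(x, X_i)$ moves x toward its projection on $X_i$ and reaches the projection when the step is large enough; this sketch simply takes the full projection each time. The two sets (a unit disk and a half-plane), the function names, and the starting point are assumptions chosen for illustration.

```python
import math


def project_ball(x, radius=1.0):
    """Project a 2-D point onto the disk {x : ||x|| <= radius}."""
    norm = math.hypot(x[0], x[1])
    if norm <= radius:
        return list(x)
    return [radius * v / norm for v in x]


def project_halfspace(x, a=(1.0, 1.0), b=1.0):
    """Project a 2-D point onto the half-plane {x : a'x >= b}."""
    slack = b - (a[0] * x[0] + a[1] * x[1])
    if slack <= 0.0:
        return list(x)
    scale = slack / (a[0] ** 2 + a[1] ** 2)
    return [x[0] + scale * a[0], x[1] + scale * a[1]]


def iterated_projection(x0, projections, cycles=200):
    """Cyclically project onto each set in turn (illustrative sketch)."""
    x = list(x0)
    for _ in range(cycles):
        for project in projections:
            x = project(x)
    return x


x_feas = iterated_projection([3.0, -2.0], [project_ball, project_halfspace])
print(x_feas)
```

Each step needs only the projection onto a single set, which is typically much cheaper than projecting onto the intersection directly.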