Iteratively Re-weighted Least Squares for Sums of Convex Functions


1 Iteratively Re-weighted Least Squares for Sums of Convex Functions
James Burke, University of Washington
Jiashan Wang, LinkedIn
Frank Curtis, Lehigh University
Hao Wang, ShanghaiTech University
Daiwei He, University of Washington

2 Outline
1. Classical Iterative Re-Weighting
2. Exact Penalization
3. The Iterative Re-Weighting Algorithm (IRWA)
4. Convergence
5. A Numerical Experiment: ℓ1-SVM
6. Least-Squares and Sums of Convex Functions
7. General Least-Squares Based Algorithms
8. General Methods Applied to J(x, ε) = Σ_{i=1}^l √(dist_2²(A_i x + b_i | C_i) + ε_i²)
9. Comparison with IRWA

3 Classical Iterative Re-Weighting: A mainstay of statistical computing
ℓ1-Regression:
    min_{x∈R^n} ||Ax + b||_1 := Σ_{i=1}^m |A_i x + b_i|,
where A_i is the i-th row of A.

4 Classical Iterative Re-Weighting: A mainstay of statistical computing
ℓ1-Regression:
    min_{x∈R^n} ||Ax + b||_1 := Σ_{i=1}^m |A_i x + b_i|,
where A_i is the i-th row of A.
Iterative Least Squares Approach: Having x^k, approximate ||Ax + b||_1 by
    ||Ax + b||_1 = Σ_{i=1}^m |A_i x + b_i| = Σ_{i=1}^m |A_i x + b_i|² / |A_i x + b_i| ≈ Σ_{i=1}^m |A_i x + b_i|² / |A_i x^k + b_i|.
Solve the linear least squares problem in the approximation for x^{k+1} and iterate.

5 Classical Iterative Re-Weighting: A mainstay of statistical computing
ℓ1-Regression:
    min_{x∈R^n} ||Ax + b||_1 := Σ_{i=1}^m |A_i x + b_i|,
where A_i is the i-th row of A.
Iterative Least Squares Approach: Having x^k, approximate ||Ax + b||_1 by
    ||Ax + b||_1 = Σ_{i=1}^m |A_i x + b_i| = Σ_{i=1}^m |A_i x + b_i|² / |A_i x + b_i| ≈ Σ_{i=1}^m |A_i x + b_i|² / |A_i x^k + b_i|.
Solve the linear least squares problem in the approximation for x^{k+1} and iterate.
Modified ε-approximate Weighted Least Squares:
    ||Ax + b||_1 ≈ Σ_{i=1}^m w(x^k, ε_i^k) |A_i x + b_i|²,  where  w(x^k, ε_i^k) := 1 / √(|A_i x^k + b_i|² + (ε_i^k)²).
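The ε-approximate scheme above fits in a few lines of code. The following is a minimal NumPy sketch; the geometric schedule for ε, the stopping test, and the dense lstsq solve are illustrative assumptions rather than part of the classical method.

```python
import numpy as np

def irls_l1(A, b, eps0=1.0, eta=0.5, max_iter=50, tol=1e-8):
    """Minimal sketch of eps-approximate IRLS for  min_x ||Ax + b||_1.

    Each iteration replaces the l1 norm by the weighted least-squares surrogate
    sum_i w_i |A_i x + b_i|^2 with w_i = 1 / sqrt(|A_i x^k + b_i|^2 + eps^2),
    then drives eps toward zero geometrically.
    """
    m, n = A.shape
    x = np.zeros(n)
    eps = eps0
    for _ in range(max_iter):
        r = A @ x + b                          # current residuals A_i x^k + b_i
        w = 1.0 / np.sqrt(r**2 + eps**2)       # re-weighting
        # weighted least-squares step: min_x sum_i w_i (A_i x + b_i)^2
        Aw = A * np.sqrt(w)[:, None]
        bw = -b * np.sqrt(w)
        x_new, *_ = np.linalg.lstsq(Aw, bw, rcond=None)
        if np.linalg.norm(x_new - x) <= tol:
            break
        x = x_new
        eps *= eta                             # relax the smoothing parameter
    return x
```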

6 Classical Iterative Re-Weighting
Lawson 1961
Rice and Usow 1968
Karlovitz 1970
Kahng 1971
Fletcher, Grant and Hebden 1971
Schlossmacher 1973
Beaton and Tukey 1974
Wolke and Schwetlick 1988
O'Leary 1990
Burrus and Barreto 1991
Vargas and Burrus 1999

7 The Exact Penalty Subproblem
    min_{x∈X} J_0(x) := g^T x + (1/2) x^T H x + Σ_{i=1}^l dist_2(A_i x + b_i | C_i),
where g ∈ R^n, H ∈ S^n_+, A_i ∈ R^{m_i×n}, b_i ∈ R^{m_i}, and C_i ⊂ R^{m_i} are non-empty, closed, and convex, and
    dist_2(y_i | C_i) := inf_{z_i∈C_i} ||y_i − z_i||_2 = ||y_i − P_{C_i}(y_i)||_2,
where P_{C_i} is the projection onto C_i.
Examples:
- Equality: C_i := {0}
- Inequality: C_i := (−∞, 0]
- Interval inequality: C_i := [l_i, u_i]
- ℓ_p-ball: ||y_i||_p ≤ τ_i, p = 1, 2, ∞
- Trust region: X = {x : ||x||_p ≤ τ}, p = 1, 2, ∞
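To make the dist(· | C_i) terms concrete, here is a minimal NumPy sketch of the projections P_{C_i} for the simplest example sets above; the function names are illustrative.

```python
import numpy as np

# Closest-point maps P_C for some of the example sets, in the Euclidean norm.

def proj_equality(y):                 # C = {0}
    return np.zeros_like(y)

def proj_nonpositive(y):              # C = (-inf, 0]  (componentwise)
    return np.minimum(y, 0.0)

def proj_box(y, lo, hi):              # C = [lo, hi]   (componentwise)
    return np.clip(y, lo, hi)

def proj_l2_ball(y, tau):             # C = {z : ||z||_2 <= tau}
    nrm = np.linalg.norm(y)
    return y if nrm <= tau else (tau / nrm) * y

def dist(y, proj, *args):
    """dist_2(y | C) = ||y - P_C(y)||_2 for any of the projections above."""
    return np.linalg.norm(y - proj(y, *args))
```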

8 NLP and Exact Penalties
Nonlinear Programming (NLP):
    minimize f(x) subject to F(x) ∈ C and x ∈ X
- f: R^n → R and F: R^n → R^m smooth
- X ⊂ R^n and C ⊂ R^m non-empty, closed, and convex
- C := C_1 × C_2 × ⋯ × C_l,  F(x) := (F_1(x), F_2(x), ..., F_l(x)),  F_i(x) ∈ C_i,  F_i: R^n → R^{m_i},  C_i ⊂ R^{m_i}
Exact penalty formulation: Given α > 0,
    min_{x∈X} f(x) + α Σ_{i=1}^l dist_2(F_i(x) | C_i)
Local direction-finding approximation:
    min_{x∈X} J_0(x) := g^T x + (1/2) x^T H x + Σ_{i=1}^l dist_2(A_i x + b_i | C_i)

9 ε-approximate Re-Weighted Least Squares
Let P_{C_i}(y) ∈ C_i be the projection of y onto C_i, i.e. ||y − P_{C_i}(y)||_2 = dist_2(y | C_i).
At (x^k, ε^k), approximate
    dist_2(A_i x + b_i | C_i) = ||A_i x + b_i − P_{C_i}(A_i x + b_i)||_2
by
    dist_2(A_i x + b_i | C_i) ≈ dist_2²(A_i x + b_i | C_i) / √(dist_2²(A_i x^k + b_i | C_i) + (ε_i^k)²)

10 ε-approximate Re-Weighted Least Squares
Let P_{C_i}(y) ∈ C_i be the projection of y onto C_i, i.e. ||y − P_{C_i}(y)||_2 = dist_2(y | C_i).
At (x^k, ε^k), approximate
    dist_2(A_i x + b_i | C_i) = ||A_i x + b_i − P_{C_i}(A_i x + b_i)||_2
by
    dist_2(A_i x + b_i | C_i) ≈ dist_2²(A_i x + b_i | C_i) / √(dist_2²(A_i x^k + b_i | C_i) + (ε_i^k)²)
    = ||A_i x + b_i − P_{C_i}(A_i x + b_i)||_2² / √(dist_2²(A_i x^k + b_i | C_i) + (ε_i^k)²)

11 ε-approximate Re-Weighted Least Squares
Let P_{C_i}(y) ∈ C_i be the projection of y onto C_i, i.e. ||y − P_{C_i}(y)||_2 = dist_2(y | C_i).
At (x^k, ε^k), approximate
    dist_2(A_i x + b_i | C_i) = ||A_i x + b_i − P_{C_i}(A_i x + b_i)||_2
by
    dist_2(A_i x + b_i | C_i) ≈ dist_2²(A_i x + b_i | C_i) / √(dist_2²(A_i x^k + b_i | C_i) + (ε_i^k)²)
    = ||A_i x + b_i − P_{C_i}(A_i x + b_i)||_2² / √(dist_2²(A_i x^k + b_i | C_i) + (ε_i^k)²)
    ≈ ||A_i x + b_i − P_{C_i}(A_i x^k + b_i)||_2² / √(dist_2²(A_i x^k + b_i | C_i) + (ε_i^k)²)

12 The full approximation
    J_0(x) = g^T x + (1/2) x^T H x + Σ_{i=1}^l dist_2(A_i x + b_i | C_i)
           ≈ g^T x + (1/2) x^T H x + Σ_{i=1}^l w_i(x^k, ε^k) ||A_i x + b_i − P_{C_i}(A_i x^k + b_i)||_2²,
where w_i(x^k, ε^k) = 1 / √(dist_2²(A_i x^k + b_i | C_i) + (ε_i^k)²).

13 The full approximation
    J_0(x) = g^T x + (1/2) x^T H x + Σ_{i=1}^l dist_2(A_i x + b_i | C_i)
           ≈ g^T x + (1/2) x^T H x + Σ_{i=1}^l w_i(x^k, ε^k) ||A_i x + b_i − P_{C_i}(A_i x^k + b_i)||_2²,
where w_i(x^k, ε^k) = 1 / √(dist_2²(A_i x^k + b_i | C_i) + (ε_i^k)²).
For simplicity, we drop the term g^T x + (1/2) x^T H x from future consideration. This can be done with (almost) no loss in generality.

14 The Iterative Re-Weighting Algorithm (IRWA)
IRWA:
    x^{k+1} ∈ argmin_{x∈X} Σ_{i=1}^l w_i(x^k, ε^k) ||A_i x + b_i − P_{C_i}(A_i x^k + b_i)||_2²,  with ε^k ↓ 0.

15 The Iterative Re-Weighting Algorithm (IRWA)
IRWA:
    x^{k+1} ∈ argmin_{x∈X} Σ_{i=1}^l w_i(x^k, ε^k) ||A_i x + b_i − P_{C_i}(A_i x^k + b_i)||_2²,  with ε^k ↓ 0.
1. Initialize x^0, ε^0, M, ν > 0, and η, σ, σ' ∈ (0, 1).
2. Solve the re-weighted subproblem for x^{k+1}.
3. Set q_i^k := A_i(x^{k+1} − x^k). If
       ||q_i^k||_2 ≤ M [dist_2²(A_i x^k + b_i | C_i) + (ε_i^k)²]^{1/2+ν},  i = 1, ..., l,
   choose ε^{k+1} ∈ (0, ηε^k]; otherwise, set ε^{k+1} := ε^k.
4. If ||x^{k+1} − x^k||_2 ≤ σ and ||ε^k|| ≤ σ', stop; else set k := k + 1 and go to Step 2.
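A minimal NumPy sketch of this loop for the case treated on the previous slide (g = 0, H = 0, X = R^n). The per-block data A_i, b_i, P_{C_i} are passed as lists, all parameter values are illustrative, and the weighted subproblem is solved densely here rather than by CG as in the experiments.

```python
import numpy as np

def irwa(A_blocks, b_blocks, projs, eps0, M=1e2, nu=0.5, eta=0.7,
         sigma=1e-8, sigma_p=1e-8, max_iter=500):
    """Sketch of IRWA for  min_x sum_i dist_2(A_i x + b_i | C_i)  with X = R^n."""
    n = A_blocks[0].shape[1]
    x = np.zeros(n)
    eps = np.array(eps0, dtype=float)
    for _ in range(max_iter):
        # Step 2: weights and shifted targets, then the weighted LS solve.
        d = np.array([np.linalg.norm(Ai @ x + bi - Pi(Ai @ x + bi))
                      for Ai, bi, Pi in zip(A_blocks, b_blocks, projs)])
        w = 1.0 / np.sqrt(d**2 + eps**2)
        rows = [np.sqrt(wi) * Ai for wi, Ai in zip(w, A_blocks)]
        rhs = [np.sqrt(wi) * (Pi(Ai @ x + bi) - bi)
               for wi, Ai, bi, Pi in zip(w, A_blocks, b_blocks, projs)]
        x_new, *_ = np.linalg.lstsq(np.vstack(rows), np.concatenate(rhs),
                                    rcond=None)
        # Step 3: relax eps only when every block step q_i is small enough.
        q = np.array([np.linalg.norm(Ai @ (x_new - x)) for Ai in A_blocks])
        if np.all(q <= M * (d**2 + eps**2) ** (0.5 + nu)):
            eps = eta * eps
        # Step 4: stopping test.
        if np.linalg.norm(x_new - x) <= sigma and np.linalg.norm(eps) <= sigma_p:
            return x_new
        x = x_new
    return x
```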

16 Convergence
Let (x^0, ε^0) ∈ X × R^l_{++}. Suppose that the sequence {(x^k, ε^k)}_{k=0}^∞ is generated by IRWA with σ = σ' = 0, and let S := {k : ε^{k+1} ≤ ηε^k}.
- If either ker(A) = {0} or X = R^n, then any cluster point x̄ of the subsequence {x^k}_{k∈S} satisfies 0 ∈ ∂J_0(x̄) + N(x̄ | X).
- If it is assumed that ker(A) = {0}, then x^k → x̄, the unique global solution, and in at most O(1/ε²) iterations x^k is an ε-optimal solution, i.e. J_0(x^k) ≤ inf_{x∈X} J_0(x) + ε.

17 Support Vector Machine Experiment
Consider the exact penalty form of the ℓ1-SVM problem:
    min_{β∈R^n} Σ_{i=1}^m [1 − y_i Σ_{j=1}^n x_{ij} β_j]_+ + λ ||β||_1,
where {(x_i, y_i)}_{i=1}^m are the training data points with x_i ∈ R^n and y_i ∈ {−1, 1} for each i = 1, ..., m, and λ is the penalty parameter.
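This objective fits the dist(A_i x + b_i | C_i) template directly: each hinge term is the distance of 1 − y_i x_i^T β to (−∞, 0], and each term of λ||β||_1 is the distance of λβ_j to {0}. A minimal NumPy sketch of that construction follows; the block ordering and names are illustrative.

```python
import numpy as np

def l1_svm_penalty_blocks(X, y, lam):
    """Cast  sum_i [1 - y_i x_i^T beta]_+ + lam * ||beta||_1
       into the form  sum_i dist(A_i beta + b_i | C_i):
       hinge blocks use C_i = (-inf, 0], the l1 blocks use C_i = {0}."""
    m, n = X.shape
    A = np.vstack([-(y[:, None] * X),      # hinge blocks: rows -y_i x_i^T
                   lam * np.eye(n)])        # l1 blocks: rows lam * e_j^T
    b = np.concatenate([np.ones(m), np.zeros(n)])

    def J0(beta):
        r = A @ beta + b
        hinge = np.maximum(r[:m], 0.0)      # dist(. | (-inf, 0])
        l1 = np.abs(r[m:])                  # dist(. | {0})
        return hinge.sum() + l1.sum()

    return A, b, J0
```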

18 Support Vector Machine Experiment
Consider the exact penalty form of the ℓ1-SVM problem:
    min_{β∈R^n} Σ_{i=1}^m [1 − y_i Σ_{j=1}^n x_{ij} β_j]_+ + λ ||β||_1,
where {(x_i, y_i)}_{i=1}^m are the training data points with x_i ∈ R^n and y_i ∈ {−1, 1} for each i = 1, ..., m, and λ is the penalty parameter.
For purposes of numerical comparison, we randomly generate 40 problems over a range of sizes of the data matrix X, where the sparsity of the true solution is always 10% of the number of columns. We compare the performance of IRWA with an ADMM implementation whose parameters are pre-optimized for performance on this data set. The least-squares subproblems for both methods are solved using the same CG solver, and effort is measured in terms of the number of CG solves.

19 ADAL - IRWA Numerical Comparison: Number of CGs
With Nesterov acceleration.
[Figure: objective value and CG count per problem, ADAL vs. IRWA.]
Figure: In all 40 problems, IRWA obtains smaller objective function values with fewer CG steps.

20 ADAL - IRWA Comparison: Number of False Positives
[Figure: false-positive counts for ADAL and IRWA at thresholds 1e-04 and 1e-05.]
Figure: For both thresholds 10^-4 and 10^-5, IRWA yields fewer false positives in terms of the numbers of zero values computed. The number of false positives is similar for the threshold 10^-3. At the threshold 10^-5, the difference in recovery is dramatic, with IRWA always having fewer than 4 false positives while ADAL has a median of about 1000 false positives.

21 Least-Squares and Sums of Convex Functions
Goals:
    min_x φ(x) := f(Ax + b) = Σ_{i=1}^l f_i(A_i x + b_i)
- Minimize φ based on properties of the individual f_i only, with minimal linkage between the f_i's (splitting). For example, apply Prox to the individual f_i's, but not to the function φ as a whole.
- The A_i's only enter through weighted least-squares:
      min_x Σ_{i=1}^l w_i(x, p) ||A_i x + b_i − s_i(p)||_2²,
  where p ∈ E_0 is a parameter vector and s: E_0 → W.

22 Least-Squares and Sums of Convex Functions
Goals:
    min_x φ(x) := f(Ax + b) = Σ_{i=1}^l f_i(A_i x + b_i)
- Minimize φ based on properties of the individual f_i only, with minimal linkage between the f_i's (splitting). For example, apply Prox to the individual f_i's, but not to the function φ as a whole.
- The A_i's only enter through weighted least-squares:
      min_x Σ_{i=1}^l w_i(x, p) ||A_i x + b_i − s_i(p)||_2²,
  where p ∈ E_0 is a parameter vector and s: E_0 → W.
- The procedures we consider are iterative, with x^c and y^c := Ax^c + b representing the current best approximate solutions, and N(c) the number of iterations to obtain x^c.
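Every method below therefore reduces, per iteration, to a weighted least-squares solve of exactly this shape. A minimal NumPy sketch of that common subproblem (dense lstsq here, CG in the experiments; names are illustrative):

```python
import numpy as np

def weighted_ls_step(A_blocks, b_blocks, w, s):
    """Solve  min_x sum_i w_i ||A_i x + b_i - s_i||_2^2  by stacking the
       blocks into one ordinary least-squares problem.
       A_blocks, b_blocks, s are lists of per-block data; w is a list of
       positive weights."""
    rows = [np.sqrt(wi) * Ai for wi, Ai in zip(w, A_blocks)]
    rhs = [np.sqrt(wi) * (si - bi) for wi, bi, si in zip(w, b_blocks, s)]
    x, *_ = np.linalg.lstsq(np.vstack(rows), np.concatenate(rhs), rcond=None)
    return x
```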

23 Separate the Roles of the f_i's and the A_i's
    min_x φ(x) := f(Ax + b) = Σ_{i=1}^l f_i(A_i x + b_i)
is equivalent to
    min_y Σ_{i=1}^l f_i(y_i)  s.t.  y ∈ b + Ran A

24 Separate the Roles of the f_i's and the A_i's
    min_x φ(x) := f(Ax + b) = Σ_{i=1}^l f_i(A_i x + b_i)
is equivalent to
    min_y Σ_{i=1}^l f_i(y_i)  s.t.  y ∈ b + Ran A
    Prox_{γ_i, f_i}(y) := argmin_z [ f_i(z) + (1/(2γ_i)) ||z − y||² ]
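Two minimal examples of this proximal map, sketched in NumPy under the definition above (γ > 0; names illustrative): for f(z) = |z| the prox is soft-thresholding, and for the indicator of a closed convex set it is the projection.

```python
import numpy as np

def prox_abs(y, gamma):
    """Prox of f(z) = |z| (componentwise): soft-thresholding at gamma."""
    return np.sign(y) * np.maximum(np.abs(y) - gamma, 0.0)

def prox_indicator(y, gamma, proj):
    """Prox of the indicator of a closed convex set C is the projection P_C,
       independently of gamma."""
    return proj(y)
```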

25 Least-Squares Based Algorithms
Projected Subgradient:
    min_x Σ_{i=1}^l ||A_i(x − x^c) + t_i^c g_i^c||_2²,  g_i^c ∈ ∂f_i(y_i^c),  t_i^c := R / (L √N(c)),
    O(1/ε²), no acceleration

26 Least-Squares Based Algorithms
Projected Subgradient:
    min_x Σ_{i=1}^l ||A_i(x − x^c) + t_i^c g_i^c||_2²,  g_i^c ∈ ∂f_i(y_i^c),  t_i^c := R / (L √N(c)),
    O(1/ε²), no acceleration
Projected Prox-Gradient:
    min_x Σ_{i=1}^l ||A_i(x − x^c) + t_i^c (I − Prox_{γ_i, f_i})(A_i x^c + b_i)||_2²,  t_j^c := γ_j / Σ_{i=1}^l γ_i,
    O(1/ε), with acceleration

27 Least-Squares Based Algorithms
Projected Subgradient:
    min_x Σ_{i=1}^l ||A_i(x − x^c) + t_i^c g_i^c||_2²,  g_i^c ∈ ∂f_i(y_i^c),  t_i^c := R / (L √N(c)),
    O(1/ε²), no acceleration
Projected Prox-Gradient:
    min_x Σ_{i=1}^l ||A_i(x − x^c) + t_i^c (I − Prox_{γ_i, f_i})(A_i x^c + b_i)||_2²,  t_j^c := γ_j / Σ_{i=1}^l γ_i,
    O(1/ε), with acceleration
ADAL:
    min_x Σ_{i=1}^l ||A_i(x − x^c) + (I − Prox_{γ_i, f_i})(A_i x^c + b_i + γ_i u_i^c)||_2²,  O(1/ε), with acceleration
    u_i^+ := u_i^c + γ_i^{-1} (A_i x^+ + b_i − Prox_{γ_i, f_i}(A_i x^c + b_i + γ_i u_i^c)).
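As an illustration of how these subproblems are assembled, here is a minimal NumPy sketch of one prox-gradient step in the least-squares form above, taking X = R^n so the projection onto X is omitted. The step sizes t_i = γ_i / Σ_j γ_j follow the display above and should be read as a reconstruction from the slide, not a verified constant.

```python
import numpy as np

def prox_gradient_ls_step(A_blocks, b_blocks, proxes, gammas, x_c):
    """One step of  x^+ = argmin_x sum_i ||A_i (x - x_c)
                                        + t_i (I - Prox_{gamma_i, f_i})(A_i x_c + b_i)||_2^2,
       with t_i = gamma_i / sum_j gamma_j (assumed step sizes)."""
    t = np.asarray(gammas, dtype=float) / np.sum(gammas)
    rows, rhs = [], []
    for Ai, bi, prox_i, gi, ti in zip(A_blocks, b_blocks, proxes, gammas, t):
        yi = Ai @ x_c + bi
        ri = ti * (yi - prox_i(yi, gi))     # t_i (I - Prox_{gamma_i, f_i})(y_i)
        rows.append(Ai)
        rhs.append(Ai @ x_c - ri)           # so that A_i x^+ targets A_i x_c - r_i
    x_plus, *_ = np.linalg.lstsq(np.vstack(rows), np.concatenate(rhs), rcond=None)
    return x_plus
```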

28 Application to J(x, ε) = Σ_{i=1}^l √(dist_2²(A_i x + b_i | C_i) + ε_i²)
All of the methods use least-squares terms of the form
    ||A_i(x − x^c) + t_i^c (I − P_{C_i})(A_i x^c + b_i)||_2².
Projected Subgradient for J(x, 0):
    t_i^c := dist(x^0 | Σ) / (l · dist_2(A_i x^c + b_i | C_i) √N(c)),  if A_i x^c + b_i ∉ C_i;  0, otherwise
    (Σ is the solution set).
Projected Gradient for J(x, ε):
    t_j^c := ε_min / √(dist_2²(A_j x^c + b_j | C_j) + ε_j²)
Projected Prox-Gradient for J(x, 0):
    t_j^c := γ_j / Σ_{i=1}^l γ_i,  if dist_2(A_j x^c + b_j | C_j) ≤ γ_j;
    t_j^c := γ_j² / (dist_2(A_j x^c + b_j | C_j) Σ_{i=1}^l γ_i),  if dist_2(A_j x^c + b_j | C_j) > γ_j
ADAL for J(x, 0):
    Same as projected prox-gradient, but with t_j^c := 1 and shifts γ_i u_i^c included.

29 Comparison with IRWA
General Methods:
    Σ_{i=1}^l ||A_i(x − x^k) + t_i^c (I − P_{C_i})(A_i x^k + b_i)||_2²,   O(1/ε), with acceleration
IRWA:
    Σ_{i=1}^l ||A_i(x − x^k) + (I − P_{C_i})(A_i x^k + b_i)||_2² / √(dist_2²(A_i x^k + b_i | C_i) + ε_i²),   O(1/ε²), no acceleration

30 Comparison with IRWA
General Methods:
    Σ_{i=1}^l ||A_i(x − x^k) + t_i^c (I − P_{C_i})(A_i x^k + b_i)||_2²,   O(1/ε), with acceleration
IRWA:
    Σ_{i=1}^l ||A_i(x − x^k) + (I − P_{C_i})(A_i x^k + b_i)||_2² / √(dist_2²(A_i x^k + b_i | C_i) + ε_i²),   O(1/ε²), no acceleration
New Improved IRWA:
    Σ_{i=1}^l (1/ε_i) ||A_i(x − x^k) + (ε_i / √(dist_2²(A_i x^k + b_i | C_i) + ε_i²)) (I − P_{C_i})(A_i x^k + b_i)||_2²,   O(1/ε) (!!), with acceleration

31 Comparison of Projected Gradient and New IRWA
Projected Gradient for J(x, ε):
    Σ_{i=1}^l ||A_i(x − x^k) + (ε_min / √(dist_2²(A_i x^k + b_i | C_i) + ε_i²)) (I − P_{C_i})(A_i x^k + b_i)||_2²
New Improved IRWA:
    Σ_{i=1}^l (1/ε_i) ||A_i(x − x^k) + (ε_i / √(dist_2²(A_i x^k + b_i | C_i) + ε_i²)) (I − P_{C_i})(A_i x^k + b_i)||_2²

32 Comparison of Projected Gradient and New IRWA
Projected Gradient for J(x, ε):
    Σ_{i=1}^l ||A_i(x − x^k) + (ε_min / √(dist_2²(A_i x^k + b_i | C_i) + ε_i²)) (I − P_{C_i})(A_i x^k + b_i)||_2²
New Improved IRWA:
    Σ_{i=1}^l (1/ε_i) ||A_i(x − x^k) + (ε_i / √(dist_2²(A_i x^k + b_i | C_i) + ε_i²)) (I − P_{C_i})(A_i x^k + b_i)||_2²
When all of the ε_i have the same value ε, they are the same algorithm!

33 Unaccelerated versions with ε = 0.01
Figure: In all 10 problems, the old IRWA with iterative re-weighting obtains similar objective function values with fewer CG steps.

34 Accelerated versions with ε = 0.01
Figure: In all 10 problems, the old IRWA with iterative re-weighting obtains similar objective function values with many fewer CG steps.

35 Reference
Thank you!
"Iteratively Reweighted Linear Least Squares for Exact Penalty Subproblems on Product Sets," with F. Curtis, H. Wang, and J. Wang. SIAM J. Optim. 25 (2015).

36 The Euclidean Huber Distance to a Convex Set
Let C ⊂ E be non-empty, closed, and convex, and let h := dist_2(· | C) be the Euclidean distance to C.
The Euclidean Huber distance to C is just the Moreau-Yosida envelope of the distance to C:
    e_γ h(y) = (1/(2γ)) dist_2²(y | C),   if dist_2(y | C) ≤ γ,
    e_γ h(y) = dist_2(y | C) − γ/2,       if dist_2(y | C) > γ,
    Prox_{γ,h}(y) = P_C(y),                                    if dist_2(y | C) ≤ γ,
    Prox_{γ,h}(y) = y − (γ / dist_2(y | C)) (y − P_C(y)),      if dist_2(y | C) > γ.
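A minimal NumPy sketch of these two formulas, for a set C supplied through its projection P_C (function names are illustrative):

```python
import numpy as np

def huber_dist(y, gamma, proj):
    """Euclidean Huber distance e_gamma h(y), the Moreau-Yosida envelope
       of h = dist_2(. | C)."""
    d = np.linalg.norm(y - proj(y))
    if d <= gamma:
        return d**2 / (2.0 * gamma)
    return d - gamma / 2.0

def prox_dist(y, gamma, proj):
    """Prox_{gamma, h}(y) for h = dist_2(. | C), as displayed above."""
    p = proj(y)
    d = np.linalg.norm(y - p)
    if d <= gamma:
        return p
    return y - (gamma / d) * (y - p)
```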
