Numerical Optimization: Introduction and gradient-based methods


1 Numerical Optimization: Introduction and gradient-based methods. Master 2 Recherche LRI, Apprentissage Statistique et Optimisation. Anne Auger, Inria Saclay-Ile-de-France. November.

2 Outline
1. Numerical optimization: a few examples (data fitting, regression, machine learning, black-box optimization); different notions of optimum; typical difficulties in optimization; deterministic vs stochastic, local versus global methods.
2. Mathematical tools and optimality conditions: first order differentiability and gradients; second order differentiability and Hessian; optimality conditions for unconstrained optimization.
3. Gradient-based optimization algorithms: root finding methods (1-D optimization); relaxation algorithm; descent methods (gradient descent, Newton descent, BFGS); trust region methods.

3 Numerical optimization. Numerical or continuous optimization: unconstrained optimization, i.e. optimize a function whose parameters are continuous (live in $\mathbb{R}^n$): $\min_{x \in \mathbb{R}^n} f(x)$, where $n$, the dimension of the problem, is the dimension of the Euclidean vector space $\mathbb{R}^n$. Maximization vs minimization: maximizing $f$ is the same as minimizing $-f$.

4 Numerical optimization / A few examples. Analytical functions. Convex quadratic function:
$$f(x) = \tfrac{1}{2} x^T A x + b^T x + c \quad (1) \qquad = \tfrac{1}{2}(x - x_0)^T A (x - x_0) + c_0 \quad (2)$$
where $A \in \mathbb{R}^{n \times n}$ is symmetric positive definite, $b \in \mathbb{R}^n$, $c \in \mathbb{R}$ (and $c_0$ is the constant of the completed-square form).
Exercise: Express $x_0$ in (2) as a function of $A$ and $b$. Express the minimum of $f$. For $n = 2$, plot the level sets of a convex-quadratic function, where level sets are defined as the sets $L_c = \{x \in \mathbb{R}^n \mid f(x) = c\}$.
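
A minimal sketch of the plotting part of the exercise; the matrix $A$, vector $b$, constant and grid below are arbitrary illustrative choices, not values from the slides.

```python
import numpy as np
import matplotlib.pyplot as plt

# Arbitrary symmetric positive definite A, and b, c (illustrative values only)
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([-1.0, 0.5])
c = 0.0

def f(x1, x2):
    x = np.stack([x1, x2], axis=-1)
    return 0.5 * np.einsum('...i,ij,...j->...', x, A, x) + x @ b + c

g1, g2 = np.meshgrid(np.linspace(-3, 3, 200), np.linspace(-3, 3, 200))
plt.contour(g1, g2, f(g1, g2), levels=20)   # each contour line is a level set L_c
plt.gca().set_aspect('equal')
plt.title('Level sets of a convex-quadratic function (n = 2)')
plt.show()
```

The ellipsoidal shape of the contours, and how squeezed they are, already illustrates the conditioning issues discussed later in the lecture.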

5 Numerical optimization / A few examples. Data fitting - data calibration. Objective: given a sequence of data points $(x_i, y_i) \in \mathbb{R}^p \times \mathbb{R}$, $i = 1, \ldots, N$, find a model $y = f(x)$ that explains the data (experimental measurements in biology, chemistry, ...). In general one chooses a parametric model, i.e. a family of functions $(f_\theta)_{\theta \in \mathbb{R}^n}$: expertise is used to choose the model, or only simple models are affordable (linear, quadratic). One then tries to find the parameter $\theta \in \mathbb{R}^n$ that fits the data best. Fitting best to the data: minimize the quadratic error $\min_{\theta \in \mathbb{R}^n} \sum_{i=1}^N (f_\theta(x_i) - y_i)^2$.
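
As an illustration, a small sketch of such a fit; the exponential model family and the synthetic data are assumptions made for the example, not from the slides.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 2.0 * np.exp(-3.0 * x) + 0.05 * rng.standard_normal(x.size)  # synthetic "measurements"

def f_theta(theta, x):
    # parametric model f_theta(x) = theta_0 * exp(theta_1 * x), chosen only for illustration
    return theta[0] * np.exp(theta[1] * x)

def residuals(theta):
    return f_theta(theta, x) - y   # least_squares minimizes the sum of squared residuals

res = least_squares(residuals, x0=[1.0, -1.0])
print(res.x)                       # estimated theta, close to (2, -3)
```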

6 Numerical optimization / A few examples. Optimization and machine learning: (simple) linear regression. Given a set of data (examples) $\{(y_i, x_i^1, \ldots, x_i^p)\}_{i=1,\ldots,N}$, write $X_i^T = (x_i^1, \ldots, x_i^p)$ and solve
$$\min_{w \in \mathbb{R}^p,\, \beta \in \mathbb{R}} \sum_{i=1}^N (w^T X_i + \beta - y_i)^2 = \|Xw - y\|^2$$
(in matrix notation, with the intercept absorbed). This is the same as data fitting with a linear model, i.e. $f_{(w,\beta)}(x) = w^T x + \beta$, $\theta = (w, \beta)$.

7 Numerical optimization / A few examples. Optimization and machine learning (cont.): clustering, k-means. Given a set of observations $(x_1, \ldots, x_p)$ where each observation is an $n$-dimensional real vector, k-means clustering aims to partition the $p$ observations into $k$ sets ($k \le p$), $S = \{S_1, S_2, \ldots, S_k\}$, so as to minimize the within-cluster sum of squares: $\min_{\mu_1, \ldots, \mu_k} \sum_{i=1}^k \sum_{x_j \in S_i} \|x_j - \mu_i\|^2$.
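
A compact sketch of the standard Lloyd iteration that (locally) minimizes this within-cluster sum of squares; the toy data, $k = 3$ and the fixed iteration count are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 2)) + rng.integers(0, 3, size=(300, 1)) * 4  # toy data, 3 blobs
k = 3
mu = X[rng.choice(len(X), k, replace=False)]   # initial centroids picked among the data points

for _ in range(20):
    # assignment step: each point joins the cluster of its nearest centroid
    labels = np.argmin(((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1), axis=1)
    # update step: each centroid becomes the mean of its assigned points
    # (the sketch assumes no cluster becomes empty)
    mu = np.array([X[labels == i].mean(axis=0) for i in range(k)])

wcss = sum(((X[labels == i] - mu[i]) ** 2).sum() for i in range(k))
print(wcss)   # within-cluster sum of squares of the final partition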

8 Numerical optimization / A few examples. Black-box optimization: optimization of well placement (illustrative figure).

9 Numerical optimization / Different notions of optimum. Local versus global minimum: a local minimum $x^*$ satisfies $f(x) \ge f(x^*)$ for all $x$ in a neighborhood of $x^*$; a global minimum satisfies $f(x) \ge f(x^*)$ for all $x$. Essential infimum of a function: given a measure $\mu$ on $\mathbb{R}^n$, the essential infimum of $f$ is defined as $\operatorname{ess\,inf} f = \sup\{b \in \mathbb{R} : \mu(\{x : f(x) < b\}) = 0\}$. This notion is important to keep in mind in the context of stochastic optimization algorithms.

10 Numerical optimization / Typical difficulties. What makes a function difficult to solve?
- ruggedness: non-smooth, discontinuous, multimodal, and/or noisy function
- dimensionality: (considerably) larger than three
- non-separability: dependencies between the objective variables
- ill-conditioning: a narrow ridge
(figure: cut from a 3-D example, solvable with an evolution strategy)

11 Numerical optimization / Typical difficulties. What makes a function difficult to solve? (figure only: cut from a 3-D example, solvable with an evolution strategy)

12-14 Numerical optimization / Typical difficulties. Curse of dimensionality. The term "curse of dimensionality" (Richard Bellman) refers to problems caused by the rapid increase in volume associated with adding extra dimensions to a (mathematical) space.
Example: consider placing 100 points onto a real interval, say $[0, 1]$. To get similar coverage, in terms of distance between adjacent points, of the 10-dimensional space $[0, 1]^{10}$ would require $100^{10} = 10^{20}$ points. The 100 points now appear as isolated points in a vast empty space.
Consequently, a search policy (e.g. exhaustive search) that is valuable in small dimensions might be useless in moderate or large dimensional search spaces.

15 Numerical optimization / Typical difficulties. Separable problems. Definition (separable problem): a function $f$ is separable if
$$\arg\min_{(x_1,\ldots,x_n)} f(x_1,\ldots,x_n) = \left(\arg\min_{x_1} f(x_1,\ldots), \ldots, \arg\min_{x_n} f(\ldots,x_n)\right)$$
It follows that $f$ can be optimized in a sequence of $n$ independent 1-D optimization processes, as sketched below. Example: additively decomposable functions $f(x_1,\ldots,x_n) = \sum_{i=1}^n f_i(x_i)$, e.g. the Rastrigin function.
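
A sketch of this coordinate-wise strategy in the additively decomposable case; the 1-D Rastrigin term and the crude grid-plus-refinement 1-D minimizer are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def f_i(xi):
    # one additive term of the (separable) Rastrigin function
    return 10.0 + xi**2 - 10.0 * np.cos(2.0 * np.pi * xi)

def min1d(g, lo=-5.12, hi=5.12):
    # crude 1-D minimization: coarse grid search, then local refinement around the best point
    grid = np.linspace(lo, hi, 1001)
    x0 = grid[np.argmin(g(grid))]
    return minimize_scalar(g, bounds=(max(lo, x0 - 0.01), min(hi, x0 + 0.01)),
                           method='bounded').x

n = 5
x_star = [min1d(f_i) for _ in range(n)]    # n independent 1-D optimizations
print(x_star)                              # each coordinate close to 0
print(sum(f_i(xi) for xi in x_star))       # objective of the separable f, close to 0
```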

16 Numerical optimization / Typical difficulties. Non-separable problems. Building a non-separable problem from a separable one [1,2]: rotate the coordinate system. $f : x \mapsto f(x)$ separable; $f : x \mapsto f(Rx)$ non-separable, with $R$ a rotation matrix.
[1] Hansen, Ostermeier, Gawelczyk (1995). On the adaptation of arbitrary normal mutation distributions in evolution strategies: The generating set adaptation. Sixth ICGA, Morgan Kaufmann.
[2] Salomon (1996). Reevaluating Genetic Algorithm Performance under Coordinate Rotation of Benchmark Functions; A survey of some theoretical and practical aspects of genetic algorithms. BioSystems, 39(3).

17 Numerical optimization / Typical difficulties. Ill-conditioned problems: curvature of level sets. Consider the convex-quadratic function
$$f(x) = \tfrac{1}{2}(x - x^*)^T H (x - x^*) = \tfrac{1}{2}\sum_i h_{i,i}\, x_i^2 + \tfrac{1}{2}\sum_{i \ne j} h_{i,j}\, x_i x_j \quad (\text{taking } x^* = 0)$$
where $H$ is the Hessian matrix of $f$, symmetric positive definite. Gradient direction: $-\nabla f(x)^T$; Newton direction: $-H^{-1}\nabla f(x)^T$. Ill-conditioning means squeezed level sets (high curvature); in the accompanying figure the condition number equals nine. Condition numbers up to $10^{10}$ are not unusual in real-world problems. If $H \approx I$ (small condition number of $H$), first order information (e.g. the gradient) is sufficient; otherwise second order information (an estimation of $H^{-1}$) is necessary.
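
A small numerical illustration (the ill-conditioned $H$, the optimum and the starting point are arbitrary choices for the example): on a convex quadratic the Newton direction points exactly at the optimum, while the gradient direction is dominated by the large-curvature axis.

```python
import numpy as np

H = np.diag([1.0, 100.0])              # condition number 100 (illustrative choice)
x_opt = np.array([1.0, 2.0])
x = np.array([3.0, 3.0])

grad = H @ (x - x_opt)                 # gradient of f(x) = 1/2 (x - x*)^T H (x - x*)
d_grad = -grad                         # steepest descent direction
d_newton = -np.linalg.solve(H, grad)   # Newton direction: -(H^-1) grad

print(np.linalg.cond(H))               # 100.0
print(x + d_newton)                    # lands exactly on x_opt
print(d_grad / np.linalg.norm(d_grad)) # points mostly along the high-curvature coordinate
```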

18-20 Numerical optimization / Typical difficulties. What makes a function difficult to solve? ... and what can be done. The problem, and the approach in ESs and continuous EDAs:
- Ruggedness: non-local policy, large sampling width (step-size), as large as possible while preserving a reasonable convergence speed; stochastic, non-elitistic, population-based method; the recombination operator serves as a repair mechanism.
- Dimensionality, non-separability: exploiting the problem structure; locality, neighborhood, encoding.
- Ill-conditioning: second order approach; changes the neighborhood metric.

21 Numerical optimization / Deterministic vs stochastic - local versus global methods. Different optimization methods. Deterministic methods (local methods): convex optimization methods and gradient-based methods most often require gradients of the function; they converge to local optima, fast if the function satisfies the right assumptions (smooth enough). Deterministic methods that do not require convexity: simplex method, pattern search methods.

22 Numerical optimization / Deterministic vs stochastic - local versus global methods. Different optimization methods (cont.). Stochastic methods (global methods) use randomness to be able to escape local optima: pure random search and simulated annealing (SA) are not reasonable methods in practice (too slow); genetic algorithms are not powerful for numerical optimization, having originally been introduced for binary search spaces $\{0, 1\}^n$; evolution strategies are powerful for numerical optimization. Remarks: it is impossible to be exhaustive, and the classification is not that binary; methods combining deterministic and stochastic elements do of course exist.

23 (Outline recalled.) Next part: 2. Mathematical tools and optimality conditions - first order differentiability and gradients; second order differentiability and Hessian; optimality conditions for unconstrained optimization.

24 Mathematical tools and optimality conditions / First order differentiability and gradients. Differentiability (generalization of derivability in 1-D). Given a normed vector space $(E, \|\cdot\|)$ that is complete (a Banach space), consider $f : U \subset E \to \mathbb{R}$ with $U$ an open set of $E$. $f$ is differentiable at $x \in U$ if there exists a continuous linear form $Df_x$ such that $f(x + h) = f(x) + Df_x(h) + o(\|h\|)$ (3); $Df_x$ is the differential of $f$ at $x$. Exercise: consider $E = \mathbb{R}^n$ with the scalar product $\langle x, y \rangle = x^T y$. Let $a \in \mathbb{R}^n$; show that $f(x) = \langle a, x \rangle$ is differentiable and compute its differential.

25 Mathematical tools and optimality conditions / First order differentiability and gradients. Gradient. If the norm $\|\cdot\|$ comes from a scalar product, i.e. $\|x\| = \sqrt{\langle x, x \rangle}$ (the Banach space $E$ is then called a Hilbert space), the gradient of $f$ at $x$, denoted $\nabla f(x)$, is defined as the element of $E$ such that $Df_x(h) = \langle \nabla f(x), h \rangle$ (4) (Riesz representation theorem). Taylor formula, order one: replacing the differential by (4) in (3) we obtain $f(x + h) = f(x) + \langle \nabla f(x), h \rangle + o(\|h\|)$.

26 Mathematical tools and optimality conditions / First order differentiability and gradients. Gradient (cont.). Exercise: compute the gradient of the function $f(x) = \langle a, x \rangle$. Exercise: level sets are the sets $L_c = \{x \in \mathbb{R}^n \mid f(x) = c\}$. Let $x_0 \in L_c$; show that $\nabla f(x_0)$ is orthogonal to the level set at $x_0$.

27 Mathematical tools and optimality conditions / First order differentiability and gradients. Gradient: examples.
- if $f(x) = \langle a, x \rangle$, then $\nabla f(x) = a$ in $\mathbb{R}^n$;
- if $f(x) = x^T A x$, then $\nabla f(x) = (A + A^T)x$; particular case: if $f(x) = \|x\|^2$, then $\nabla f(x) = 2x$;
- in $\mathbb{R}$, $\nabla f(x) = f'(x)$;
- in $(\mathbb{R}^n, \|\cdot\|_2)$, where $\|x\|_2 = \sqrt{\langle x, x \rangle}$ is the Euclidean norm, the gradient (the metric-dependent object, cf. the natural gradient of Amari, 2000) coincides with the vector of partial derivatives $\left(\frac{\partial f}{\partial x_1}(x), \ldots, \frac{\partial f}{\partial x_n}(x)\right)^T$ (the "vanilla" gradient). Attention: this equality will not hold for all norms. This has some importance in machine learning; see the natural gradient topic introduced by Amari.
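
A quick numerical check of the second example (the test matrix and point are arbitrary): the gradient of $f(x) = x^T A x$ is $(A + A^T)x$, which can be verified against central finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))        # arbitrary (not necessarily symmetric) matrix
x = rng.standard_normal(4)

f = lambda z: z @ A @ z
grad_analytic = (A + A.T) @ x

eps = 1e-6
grad_fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                    for e in np.eye(4)])   # central finite differences, coordinate by coordinate
print(np.max(np.abs(grad_analytic - grad_fd)))   # tiny: the two gradients agree
```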

28 Mathematical tools and optimality conditions / Second order differentiability and Hessian. Second order differentiability. The (first order) differential gives a linear local approximation; the second order differential gives a quadratic local approximation. Definition: $f : U \subset E \to \mathbb{R}$ is differentiable at the second order at $x \in U$ if it is differentiable in a neighborhood of $x$ and if $u \mapsto Df_u$ is differentiable at $x$.

29 Mathematical tools and optimality conditions / Second order differentiability and Hessian. Second order differentiability (cont.). Other definition: $f : U \subset E \to \mathbb{R}$ is differentiable at the second order at $x \in U$ iff there exist a continuous linear application $Df_x$ and a bilinear symmetric continuous application $D^2 f_x$ such that $f(x + h) = f(x) + Df_x(h) + \tfrac{1}{2} D^2 f_x(h, h) + o(\|h\|^2)$. In a Hilbert space $(E, \langle \cdot, \cdot \rangle)$, $D^2 f_x(h, h) = \langle \nabla^2 f(x)(h), h \rangle$, where $\nabla^2 f(x) : E \to E$ is a symmetric continuous operator.

30 Mathematical tools and optimality conditions / Second order differentiability and Hessian. Hessian matrix. In $(\mathbb{R}^n, \langle x, y \rangle = x^T y)$, $\nabla^2 f(x)$ is represented by a symmetric matrix called the Hessian matrix. It can be computed as
$$\nabla^2 f = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \vdots \\ \vdots & & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{pmatrix}$$
Exercise: in $\mathbb{R}^n$, compute the Hessian matrix of $f(x) = \frac{1}{2} x^T A x$.

31 Mathematical tools and optimality conditions / Optimality conditions for unconstrained optimization. Optimality condition: first order. Necessary condition for 1-D optimization problems $f : \mathbb{R} \to \mathbb{R}$: assume $f$ is differentiable; if $x^*$ is a local optimum, then $f'(x^*) = 0$. This is not a sufficient condition: consider $f(x) = x^3$. Proof via the Taylor formula: $f(x^* + h) = f(x^*) + f'(x^*)h + o(|h|)$. Points $y$ such that $f'(y) = 0$ are called critical or stationary points. Generalization: if $f : E \to \mathbb{R}$, with $(E, \langle \cdot, \cdot \rangle)$ a Hilbert space, is differentiable and $x^*$ is a local minimum of $f$, then $\nabla f(x^*) = 0$ (proof via the Taylor formula).

32 Mathematical tools and optimality conditions / Optimality conditions for unconstrained optimization. Optimality conditions: second order necessary and sufficient conditions. If $f$ is twice continuously differentiable:
- if $x^*$ is a local optimum, then $\nabla f(x^*) = 0$ and $\nabla^2 f(x^*)$ is positive semi-definite;
- if $\nabla f(x^*) = 0$ and $\nabla^2 f(x^*)$ is positive definite, then there is an $\alpha > 0$ such that $f(x) \ge f(x^*) + \alpha \|x - x^*\|^2$ for all $x$ in a neighborhood of $x^*$.

33 (Outline recalled.) Next part: 3. Gradient-based optimization algorithms - root finding methods (1-D optimization); relaxation algorithm; descent methods (gradient descent, Newton descent, BFGS); trust region methods.

34 Gradient based optimization algorithms. Optimization algorithms are iterative procedures that generate a sequence $(x_n)_{n \in \mathbb{N}}$ where, at each time step $n$, $x_n$ is the estimate of the optimum. Convergence: no exact result in general, however the algorithms aim at converging to optima (local or global), i.e. $\|x_n - x^*\| \to 0$; the convergence speed of the algorithm is how fast $\|x_n - x^*\|$ goes to zero. Equivalently, given a fixed precision $\epsilon$, the algorithms aim at approaching the optimum $x^*$ with precision $\epsilon$ in finite time; the running time is how many iterations are needed to reach the precision $\epsilon$.

35 Gradient based optimization algorithms / Root finding methods (1-D optimization). Newton-Raphson method: a method to find the root of a function $f : \mathbb{R} \to \mathbb{R}$. Linear approximation of $f$: $f(x + h) = f(x) + h f'(x) + o(|h|)$; at the first order, $f(x + h) = 0$ leads to $h = -f(x)/f'(x)$. Algorithm: $x_{n+1} = x_n - f(x_n)/f'(x_n)$. Quadratic convergence: $|x_{n+1} - x^*| \le \mu |x_n - x^*|^2$, provided the initial point is close enough to the root $x^*$ and there is no other critical point in a neighborhood of $x^*$. Secant method: replaces the derivative computation by its finite difference approximation. Application to the minimization of $f$: try to solve the equation $f'(x) = 0$; algorithm: $x_{n+1} = x_n - f'(x_n)/f''(x_n)$.
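
A minimal sketch of both uses of the iteration (root finding, and minimization by applying it to $f'$); the test functions and starting points are illustrative assumptions.

```python
def newton_root(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton-Raphson for f(x) = 0: x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Root finding: sqrt(2) as the positive root of x^2 - 2
print(newton_root(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0))

# Minimization of g(x) = x^4 - 3x^2 + x: solve g'(x) = 0 with the same iteration applied to g'
gp  = lambda x: 4 * x**3 - 6 * x + 1    # g'
gpp = lambda x: 12 * x**2 - 6           # g''
print(newton_root(gp, gpp, x0=-2.0))    # converges to a local minimum of g near x = -1.3
```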

36 Gradient based optimization algorithms / Root finding methods (1-D optimization). Bisection method: another method to find the root of a 1-D function. It bisects an interval and then selects the subinterval in which a root must lie. The method is guaranteed to converge if $f$ is continuous and the initial points bracket the solution, with $|x_n - x^*| \le \frac{b_1 - a_1}{2^n}$. (figure: Wikipedia image)
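
A sketch of the bisection step (the bracketing interval and the test function are illustrative):

```python
def bisection(f, a, b, tol=1e-10):
    """Find a root of f in [a, b]; assumes f continuous and f(a), f(b) of opposite signs."""
    fa = f(a)
    assert fa * f(b) < 0, "initial points must bracket a root"
    while b - a > tol:
        m = 0.5 * (a + b)
        fm = f(m)
        if fa * fm <= 0:      # the root lies in [a, m]
            b = m
        else:                 # the root lies in [m, b]
            a, fa = m, fm
    return 0.5 * (a + b)

# The bracket is halved at every step: |x_n - x*| <= (b_1 - a_1) / 2^n
print(bisection(lambda x: x**3 - 2, 0.0, 2.0))   # cube root of 2, about 1.2599
```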

37 Gradient based optimization algorithms / Relaxation algorithm. Relaxation algorithm: a naive approach for the optimization of $f : \mathbb{R}^n \to \mathbb{R}$ by successive optimization w.r.t. each variable.
1. choose an initial point $x^0$, $k = 1$
2. while not happy: for $i = 1, \ldots, n$: $x_i^{k+1} = \arg\min_{x \in \mathbb{R}} f(x_1^{k+1}, \ldots, x_{i-1}^{k+1}, x, x_{i+1}^k, \ldots, x_n^k)$ (use a 1-D minimization procedure); end for; $k = k + 1$.
Critics: very bad for non-separable problems; not invariant under a change of coordinate system.
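
A sketch of the relaxation loop, with SciPy's scalar minimizer as the 1-D procedure; the convex-quadratic test function and the fixed sweep count are arbitrary illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize_scalar

A = np.array([[4.0, 1.0], [1.0, 3.0]])           # illustrative convex quadratic
b = np.array([1.0, 2.0])
f = lambda x: 0.5 * x @ A @ x - b @ x

x = np.zeros(2)                                  # initial point
for k in range(20):                              # "while not happy", crudely bounded here
    for i in range(len(x)):
        # 1-D minimization over coordinate i, all other coordinates frozen
        def f_coord(t, i=i):
            y = x.copy()
            y[i] = t
            return f(y)
        x[i] = minimize_scalar(f_coord).x
print(x, np.linalg.solve(A, b))                  # relaxation iterate vs exact minimizer
```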

38 Gradient based optimization algorithms / Descent methods. Descent methods: general principle.
1. choose an initial point $x_0$, $k = 1$
2. while not happy:
   2.1 choose a descent direction $d_k$
   2.2 choose a step-size $\sigma_k > 0$
   2.3 set $x_{k+1} = x_k + \sigma_k d_k$, $k = k + 1$
Remaining questions: how to choose $d_k$? How to choose $\sigma_k$?

39 Gradient based optimization algorithms / Descent methods. Gradient descent. Rationale: $d_k = -\nabla f(x_k)$ is a descent direction. Indeed, for $f$ differentiable, $f(x - \sigma \nabla f(x)) = f(x) - \sigma \|\nabla f(x)\|^2 + o(\sigma) < f(x)$ for $\sigma$ small enough. Step-size: the optimal step-size is $\sigma_k = \arg\min_\sigma f(x_k - \sigma \nabla f(x_k))$ (5); in general a precise minimization of (5) is expensive and unnecessary, and instead a line search algorithm executes a limited number of trial steps until a loose approximation of the minimum is found.
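
A sketch of gradient descent with a simple backtracking (Armijo-type) line search standing in for the exact minimization (5); the constants and the quadratic test function are standard illustrative choices, not from the slides.

```python
import numpy as np

def gradient_descent(f, grad, x0, iters=200, sigma0=1.0, beta=0.5, c=1e-4):
    x = x0.astype(float)
    for _ in range(iters):
        g = grad(x)
        sigma = sigma0
        # backtracking: shrink sigma until a sufficient decrease (Armijo condition) is achieved
        while f(x - sigma * g) > f(x) - c * sigma * (g @ g):
            sigma *= beta
        x = x - sigma * g
    return x

A = np.diag([1.0, 10.0])                          # mildly ill-conditioned quadratic
b = np.array([1.0, -2.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b

print(gradient_descent(f, grad, np.array([5.0, 5.0])))   # close to A^{-1} b = (1, -0.2)
```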

40 Gradient based optimization algorithms / Descent methods. Gradient descent: convergence rate on a convex-quadratic function. On $f(x) = \frac{1}{2} x^T A x - b^T x + c$ with $A$ a symmetric positive definite matrix and $Q = \mathrm{cond}(A)$, the gradient descent algorithm with optimal step-size satisfies
$$f(x_k) - \min f \le \left(\frac{Q - 1}{Q + 1}\right)^{2k} \left[f(x_0) - \min f\right].$$
The convergence rate depends on the starting point; however $\left(\frac{Q - 1}{Q + 1}\right)^{2k}$ gives the correct description of the convergence rate for almost all starting points. Very slow convergence for ill-conditioned problems.

41 Gradient based optimization algorithms / Descent methods. Newton algorithm, BFGS. Newton method: descent direction $-[\nabla^2 f(x_k)]^{-1}\nabla f(x_k)$; the Newton direction points towards the optimum on $f(x) = x^T A x + b^T x + c$. However, the Hessian matrix is expensive to compute in general and its inversion is also not immediate. Broyden-Fletcher-Goldfarb-Shanno (BFGS) method: constructs in an iterative manner, using solely the gradient, an approximation of the Newton direction $-[\nabla^2 f(x_k)]^{-1}\nabla f(x_k)$.
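
In practice BFGS is rarely re-implemented; a usage sketch (not from the slides) with SciPy's implementation on the Rosenbrock function, a standard test function chosen here for illustration:

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

res = minimize(rosen, x0=np.zeros(10), method="BFGS", jac=rosen_der)
print(res.x)        # close to the optimum (1, ..., 1)
print(res.nit)      # number of iterations; only gradients were used, no Hessian
```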

42 Gradient based optimization algorithms / Trust region methods. General idea: instead of choosing a descent direction and minimizing along this direction, trust region algorithms construct a model $m_k$ of the objective function (usually a quadratic model). Because the model may not be a good approximation of $f$ far away from the current iterate $x_k$, the search for a minimizer of the model is restricted to a region (trust region) around the current iterate. The trust region is usually a ball whose radius is adapted (shrunk if no better solution is found).
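
For comparison with the BFGS sketch above, a usage sketch (again an assumption, not from the slides) of a trust-region Newton-CG method from SciPy on the same Rosenbrock test function, this time also supplying the Hessian:

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess

res = minimize(rosen, x0=np.zeros(10), method="trust-ncg",
               jac=rosen_der, hess=rosen_hess)
print(res.x)        # close to (1, ..., 1); the trust-region radius is adapted internally
```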

43 Constrained optimization (was not handled in this class, but you might want to have a look at it): $\min_{x \in \mathbb{R}^n} f(x)$ subject to $c_i(x) = 0$, $i = 1, \ldots, p$ (equality constraints) and $b_j(x) \ge 0$, $j = 1, \ldots, k$ (inequality constraints).
