Introduction to Optimization


Introduction to Optimization. Amy Langville. SAMSI Undergraduate Workshop, N.C. State University, 6/1/05.

GOAL: minimize f(x1, x2, x3, x4, x5) = x1^2 - 1.5 x2 x3 + x4/x5. PRIZE: $1 million. Number of independent variables = ? z = f(x1, x2, x3, x4, x5) lives in R^? Suppose you know little to nothing about calculus or optimization. Could you win the prize? How? Trial and error: repeated function evaluations.
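
A minimal Matlab sketch of that trial-and-error idea, using the objective as reconstructed above; the search box [-10, 10]^5, the number of trials, and the guard against dividing by (nearly) zero are illustrative choices, not part of the problem statement.

f = @(x) x(1)^2 - 1.5*x(2)*x(3) + x(4)/x(5);   % prize objective (as reconstructed)
best_val = Inf;  best_x = [];
rng(0);                                        % make the random trials repeatable
for trial = 1:100000
    x = -10 + 20*rand(1, 5);                   % random point in the box [-10, 10]^5
    if abs(x(5)) < 1e-3, continue, end         % skip points that divide by (almost) zero
    if f(x) < best_val
        best_val = f(x);  best_x = x;          % keep the best point seen so far
    end
end
fprintf('best value found so far: %g\n', best_val);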

Calculus III Review: local min vs. global min vs. saddle point; critical points (CPs) and horizontal tangent planes; local mins and the 2nd derivative test; global mins, CPs, and boundary points (BPs); gradient = direction of ?

Constrained vs. Unconstrained Optimization.
Unconstrained: min f(x, y) = x^2 + y^2.
Constrained: min f(x, y) = x^2 + y^2 s.t. x >= 0, y >= 0;
min f(x, y) = x^2 + y^2 s.t. x > 0, y > 0;
min f(x, y) = x^2 + y^2 s.t. 1 <= x <= 2, 0 <= y <= 3 (EVT: Extreme Value Theorem applies on this closed, bounded set);
min f(x, y) = x^2 + y^2 s.t. y = x + 2.

Gradient Descent Methods. Hillclimbers on a cloudy day: max f(x, y) = -min -f(x, y). Initializations. 1st-order and 2nd-order information from the partials: gradient + Hessian. Matlab function: gd(α, x0).
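
The slide's gd.m is not reproduced here; the following is only a sketch of what a fixed-step routine of the form gd(α, x0) might look like, using f(x, y) = x^2 + y^2 (gradient [2x; 2y]) as a stand-in test function.

function x = gd(alpha, x0)
% Fixed-step gradient descent sketch for f(x,y) = x^2 + y^2.
grad = @(x) [2*x(1); 2*x(2)];        % gradient of the test function
x = x0(:);                           % work with a column vector
for k = 1:1000
    g = grad(x);
    if norm(g) < 1e-6                % one possible convergence test
        break
    end
    x = x - alpha*g;                 % step downhill with step size alpha
end
end

Called, for example, as gd(0.1, [4; -3]); the questions on the next slide are about how the choices of α and x0 affect whether and how fast a routine like this converges.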

Iterative Methods: Issues. Convergence test: what is it for gd.m? Convergence proof: is gd.m guaranteed to converge to a local min? For α > 0? For α < 0? Rate of convergence: how many iterations? How do starting points x0 affect the number of iterations? Worst starting point for α = 4? Best?

Convergence of Optimization Methods: global min vs. local min vs. stationary point vs. none. Most optimization algorithms cannot guarantee convergence to a global min, and often not even to a local min. However, some classes of optimization problems are particularly nice. Convex objective, EX: z = 0.5(α x^2 + y^2), α > 0. Every local min is a global min! Even for particularly tough optimization problems, the most popular and successful algorithms often perform well on many problems despite the lack of convergence theory. Must qualify statements: "I found the best global min to date."
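
To see why the convex example behaves so nicely: the gradient of z = 0.5(α x^2 + y^2) is (α x, y), which vanishes only at (0, 0), and the Hessian is the constant diagonal matrix with entries α and 1, which is positive definite whenever α > 0; so (0, 0) is the only stationary point and it is the global min.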

Your Least Squares Problem: how many variables/unknowns, n = ? z = f(x1, x2, ..., xn) lives in R^? Can we graph z?

Nonsmooth, Nondifferentiable Surfaces: can't compute the gradient ∇f, so can't use GD methods. Line Search Methods. Method of Alternating Variables (Coordinate Descent): solve a series of 1-D problems. What would these steps look like on a contour map?
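
A Matlab sketch of the alternating-variables idea, assuming each 1-D subproblem is solved with fminbnd on a fixed bracket [-5, 5]; the test function and the bracket are illustrative choices.

f = @(x, y) (x - 1).^2 + 3*(y + 2).^2 + x.*y;   % example objective
x = 0;  y = 0;                                  % starting point
for k = 1:20
    x = fminbnd(@(t) f(t, y), -5, 5);           % 1-D problem in x, with y held fixed
    y = fminbnd(@(t) f(x, t), -5, 5);           % 1-D problem in y, with x held fixed
end
fprintf('x = %.4f, y = %.4f, f = %.4f\n', x, y, f(x, y));

On a contour map, each update is an axis-parallel move to the best point along that coordinate direction, which is the picture the question above is asking for.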

fminsearch and Nelder-Mead. Maintain a basis of n + 1 points, where n = # of variables; form a simplex (convex hull) of these points. Idea: move in a direction away from the worst of these points. EX: n = 2, so maintain a basis of 3 points living in the xy-plane; the simplex is a triangle. Create a new simplex by moving away from the worst point: reflect, expand, contract, shrink steps.
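
As a concrete sketch of the reflection step for n = 2 (the simplex, the objective, and the acceptance rule below are simplified illustrations; the reflection coefficient 1 is the standard choice):

f = @(p) p(1)^2 + 2*p(2)^2;                 % example objective
S = [1 1; 2 0.5; 1.5 2];                    % current simplex: 3 vertices as rows
fv = [f(S(1,:)); f(S(2,:)); f(S(3,:))];
[fv, order] = sort(fv);  S = S(order, :);   % order vertices: best first, worst last
xbar = mean(S(1:2, :), 1);                  % centroid of the two best vertices
xr = xbar + (xbar - S(3, :));               % reflect the worst vertex through it
if f(xr) < fv(3)
    S(3, :) = xr;                           % accept the reflected point if it beats the worst vertex
end

The expand, contract, and shrink steps modify this basic move when the reflected point turns out to be especially good or especially bad.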

PROPERTIES OF NELDER-MEAD
Fig. 1. Nelder-Mead simplices after a reflection and an expansion step. The original simplex is shown with a dashed line.
Fig. 2. Nelder-Mead simplices after an outside contraction, an inside contraction, and a shrink. The original simplex is shown with a dashed line.
... then x_1^(k+1) = x_1^(k). Beyond this, whatever rule is used to define the original ordering may be applied after a shrink. We define the change index k* of iteration k as the smallest index of a vertex that differs between iterations k and k + 1: (2.8) k* = min{ i : x_i^(k) ≠ x_i^(k+1) }. (Tie-breaking rules are needed to define a unique value of k*.) When Algorithm NM terminates in step 2, 1 < k* ≤ n; with termination in step 3, k* = 1; with termination in step 4, 1 ≤ k* ≤ n + 1; and with termination in step 5, k* = 1 or 2. A statement that x_j changes means that j is the change index at the relevant iteration. The rules and definitions given so far imply that, for a nonshrink iteration, ...

N-M Algorithm

[Figures: Nelder-Mead demo on a data-fitting problem. One panel shows SN1939A luminosity (log scale) vs. days; the other shows the residual norm as a function of λ1 and λ2, with iterates moving from the starting guess λ0 to the optimizer λ*.]

N-M Algorithm: not proven to converge in general, but widely used; easy to implement; inexpensive (usually only 1-2 function evaluations per iteration); no derivatives needed; makes good progress at the beginning of the iteration history. Assignments: (1) Display the N-M steps using options.Display = 'iter'; fminsearch(fun, [x0], options); (2) Write nested for loops in Matlab to generate a grid of starting points (and later random starting points) for fminsearch, to find the best global min. A sketch of both follows.
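
A sketch of what the two assignments could look like; the objective fun, the grid range, and the spacing are placeholder choices.

fun = @(x) (x(1)^2 + x(2) - 11)^2 + (x(1) + x(2)^2 - 7)^2;   % placeholder objective with multiple minima
options = optimset('Display', 'iter');                       % assignment 1: print each N-M step
best_f = Inf;
for a = -5:5                                                 % assignment 2: grid of starting points
    for b = -5:5
        [xk, fk] = fminsearch(fun, [a, b], options);
        if fk < best_f
            best_f = fk;  best_x = xk;                       % remember the best min found so far
        end
    end
end
fprintf('best min found: f = %.4f at (%.3f, %.3f)\n', best_f, best_x(1), best_x(2));

With Display set to 'iter', every one of the grid runs prints its full step history, so for the grid search it may be more convenient to drop the options argument from the fminsearch call.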

Genetic/Evolutionary Algorithms: at each iteration, either mate or mutate possible solution vectors, based on the fitness of the possible solution vectors as measured by the objective function.
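
A toy Matlab sketch of that loop; the population size, the selection rule, the crossover, and the mutation scale are all illustrative assumptions rather than anything prescribed on the slide.

f = @(X) X(:,1).^2 + X(:,2).^2;           % fitness = objective value (lower is better)
P = randn(20, 2);                         % initial population of 20 candidate vectors
for gen = 1:100
    [~, idx] = sort(f(P));                % rank the population by fitness
    parents = P(idx(1:10), :);            % keep the fitter half
    kids = zeros(10, 2);
    for i = 1:10
        p = parents(randi(10), :);  q = parents(randi(10), :);
        kids(i, :) = 0.5*(p + q) + 0.1*randn(1, 2);   % mate two parents, then mutate
    end
    P = [parents; kids];                  % next generation
end
[~, idx] = sort(f(P));
best = P(idx(1), :)                       % best solution vector found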