Convex Optimization. Lijun Zhang Modification of

Similar documents
Constrained optimization

Shiqian Ma, MAT-258A: Numerical Optimization 1. Chapter 2. Convex Optimization

Introduction to Constrained Optimization

Lecture 7: Support Vector Machine

Convex Optimization. Erick Delage, and Ashutosh Saxena. October 20, (a) (b) (c)

Nonlinear Programming

Linear methods for supervised learning

Unconstrained Optimization Principles of Unconstrained Optimization Search Methods

Mathematical Programming and Research Methods (Part II)

Contents. I Basics 1. Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

Lecture 4 Duality and Decomposition Techniques

Introduction to Machine Learning

Optimization under uncertainty: modeling and solution methods

Case Study 1: Estimating Click Probabilities

Convex Programs. COMPSCI 371D Machine Learning. COMPSCI 371D Machine Learning Convex Programs 1 / 21

1. Introduction. performance of numerical methods. complexity bounds. structural convex optimization. course goals and topics

COMS 4771 Support Vector Machines. Nakul Verma

CME307/MS&E311 Theory Summary

Convex optimization algorithms for sparse and low-rank representations

Programming, numerics and optimization

Lecture 18: March 23

Constrained Optimization COS 323

Characterizing Improving Directions Unconstrained Optimization

California Institute of Technology Crash-Course on Convex Optimization Fall Ec 133 Guilherme Freitas

Optimization III: Constrained Optimization

Introduction to Optimization

Convex Optimization MLSS 2015

Convex Optimization M2

Support Vector Machines. James McInerney Adapted from slides by Nakul Verma

9. Support Vector Machines. The linearly separable case: hard-margin SVMs. The linearly separable case: hard-margin SVMs. Learning objectives

Proximal operator and methods

IE521 Convex Optimization Introduction

David G. Luenberger Yinyu Ye. Linear and Nonlinear. Programming. Fourth Edition. ö Springer

Introduction to Optimization

CME307/MS&E311 Optimization Theory Summary

Convex Functions & Optimization

A Brief Overview of Optimization Problems. Steven G. Johnson MIT course , Fall 2008

Tutorial on Convex Optimization for Engineers

Aspects of Convex, Nonconvex, and Geometric Optimization (Lecture 1) Suvrit Sra Massachusetts Institute of Technology

Convex Analysis and Minimization Algorithms I

Department of Electrical and Computer Engineering

OPTIMIZATION METHODS

Introduction to Convex Optimization. Prof. Daniel P. Palomar

Convex Optimization and Machine Learning

IE598 Big Data Optimization Summary Nonconvex Optimization

A Short SVM (Support Vector Machine) Tutorial

Kernel Methods & Support Vector Machines

Incremental Gradient, Subgradient, and Proximal Methods for Convex Optimization: A Survey. Chapter 4 : Optimization for Machine Learning

Lecture 12: convergence. Derivative (one variable)

A primal-dual framework for mixtures of regularizers

Conic Optimization via Operator Splitting and Homogeneous Self-Dual Embedding

A Brief Overview of Optimization Problems. Steven G. Johnson MIT course , Fall 2008

Sparse Optimization Lecture: Proximal Operator/Algorithm and Lagrange Dual

Lec13p1, ORF363/COS323

Data Mining: Concepts and Techniques. Chapter 9 Classification: Support Vector Machines. Support Vector Machines (SVMs)

CONVEX OPTIMIZATION: A SELECTIVE OVERVIEW

Convex Optimization Algorithms

Index. affine dependency, 133 minimal, 133 affine hull, 392 affinely independent, 393 α-bb, 230, 258, 297 approximate solutions, 9 approximation, 86

Optimization for Machine Learning

Composite Self-concordant Minimization

Introduction to Modern Control Systems

Optimization. Industrial AI Lab.

A Truncated Newton Method in an Augmented Lagrangian Framework for Nonlinear Programming

Department of Mathematics Oleg Burdakov of 30 October Consider the following linear programming problem (LP):

Lecture 10: SVM Lecture Overview Support Vector Machines The binary classification problem

Applied Lagrange Duality for Constrained Optimization

Theoretical Concepts of Machine Learning

Linear programming and duality theory

Fast-Lipschitz Optimization

Convex Optimization CMU-10725

Chapter 4 Convex Optimization Problems

Introduction to Mathematical Programming IE496. Final Review. Dr. Ted Ralphs

DISCRETE CONVEX ANALYSIS

CPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2016

STRUCTURAL & MULTIDISCIPLINARY OPTIMIZATION

INTRODUCTION TO LINEAR AND NONLINEAR PROGRAMMING

Lec 11 Rate-Distortion Optimization (RDO) in Video Coding-I

Programs. Introduction

Lecture 19: Convex Non-Smooth Optimization. April 2, 2007

Efficient Optimization for L -problems using Pseudoconvexity

Linear Optimization and Extensions: Theory and Algorithms

Parallel and Distributed Sparse Optimization Algorithms

Interior Point I. Lab 21. Introduction

Exploiting Low-Rank Structure in Semidenite Programming by Approximate Operator Splitting

Solution Methods Numerical Algorithms

Nonsmooth Optimization and Related Topics

DM6 Support Vector Machines

Support Vector Machines

A Tutorial on Decomposition Methods for Network Utility Maximization Daniel P. Palomar, Member, IEEE, and Mung Chiang, Member, IEEE.

Optimization Plugin for RapidMiner. Venkatesh Umaashankar Sangkyun Lee. Technical Report 04/2012. technische universität dortmund

15. Cutting plane and ellipsoid methods

LECTURE 18 LECTURE OUTLINE

Convexity Theory and Gradient Methods

Training Data Selection for Support Vector Machines

Convexity: an introduction

ME 555: Distributed Optimization

Demo 1: KKT conditions with inequality constraints

Gate Sizing by Lagrangian Relaxation Revisited

Kernels and Constrained Optimization

One-class Problems and Outlier Detection. 陶卿 中国科学院自动化研究所

Transcription:

Convex Optimization Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Modification of http://stanford.edu/~boyd/cvxbook/bv_cvxslides.pdf

Outline Introduction Convex Sets & Functions Convex Optimization Problems Duality Convex Optimization Methods Summary

Mathematical Optimization Optimization Problem

Applications Dimensionality Reduction (PCA) max Clustering (NMF) s. t. 1 min, s. t. 0, 0 Classification (SVM) min,

Least-squares The Problem Given, predict by Properties

Linear Programming The Problem Properties

Convex Optimization Problem The Problem Conditions

Convex Optimization Problem The Problem Properties

Nonlinear Optimization Definition The objective or constraint functions are not linear Could be convex or nonconvex

Outline Introduction Convex Sets & Functions Convex Optimization Problems Duality Convex Optimization Methods Summary

Affine Set

Convex Set

Convex Cone

Some Examples (1)

Some Examples (2)

Some Examples (3)

Operations that Preserve Convexity

Convex Functions

Examples on

Examples on and

Restriction of a Convex Function to a Line

First-order Conditions

Second-order Conditions

Examples

Operations that Preserve Convexity

Positive Weighted Sum & Composition with Affine Function

Pointwise Maximum Hinge loss: l max 0,1

The Conjugate Function

Examples

Outline Introduction Convex Sets & Functions Convex Optimization Problems Duality Convex Optimization Methods Summary

Optimization Problem in Standard Form

Optimal and Locally Optimal Points

Implicit Constraints

Convex Optimization Problem

Example

Local and Global Optima

Optimality Criterion for Differentiable

Examples

Popular Convex Problems Linear Program (LP) Linear-fractional Program Quadratic Program (QP) Quadratically Constrained Quadratic program (QCQP) Second-order Cone Programming (SOCP) Geometric Programming (GP) Semidefinite Program (SDP)

Outline Introduction Convex Sets & Functions Convex Optimization Problems Duality Convex Optimization Methods Summary

Lagrangian

Lagrangian

Lagrange Dual Function

Lagrange Dual Function

Least-norm Solution of Linear Equations

Lagrange Dual and Conjugate Function

The Dual Problem

Weak and Strong Duality

Weak and Strong Duality

Slater s Constraint Qualification

Complementary Slackness

Karush-Kuhn-Tucker (KKT) Conditions

KKT Conditions for Convex Problem

An Example SVM (1) The Optimization Problem Define the hinge loss as Its Conjugate Function is

An Example SVM (2) The Optimization Problem becomes It is Equivalent to The Lagrangian is

An Example SVM (3) The Lagrange Dual Function is Minimize one by one

An Example SVM (4) Finally, We Obtain The Dual Problem is

An Example SVM (5) Karush-Kuhn-Tucker (KKT) Conditions

An Example SVM (5) Karush-Kuhn-Tucker (KKT) Conditions Can be used to recover from

An Example SVM (5) Karush-Kuhn-Tucker (KKT) Conditions Can be used to recover from

Outline Introduction Convex Sets & Functions Convex Optimization Problems Duality Convex Optimization Methods Summary

More Assumptions Lipschitz continuous Strong Convexity, 1 1 1 2 Smooth,

Performance Measure The Problem Convergence Rate Iteration Complexity

Gradient-based Methods The Convergence Rate GD Gradient Descent AGD Nesterov s Accelerated Gradient Descent [Nesterov, 2005, Nesterov, 2007, Tseng, 2008] EGD Epoch Gradient Descent [Hazan and Kale, 2011] SGD SGD with -suffix Averaging [Rakhlin et al., 2012]

Gradient Descent (1) Move along the opposite direction of gradients

Gradient Descent (2) Gradient Descent with Projection Projection Operator

Analysis (1)

Analysis (2)

Analysis (3)

A Key Step (1) Evaluate the Gradient or Subgradient Logit loss

A Key Step (1) Evaluate the Gradient or Subgradient Logit loss Hinge loss

A Key Step (2) Evaluate the Gradient or Subgradient Logit loss Hinge loss

A Key Step (3) Evaluate the Gradient or Subgradient Logit loss Hinge loss

Outline Introduction Convex Sets & Functions Convex Optimization Problems Duality Convex Optimization Methods Summary

Summary Convex Sets & Functions Definitions, Operations that Preserve Convexity Convex Optimization Problems Definitions, Optimality Criterion Duality Lagrange, Dual Problem, KKT Conditions Convex Optimization Methods Gradient-based Methods

Reference (1) Hazan, E. and Kale, S. (2011) Beyond the regret minimization barrier: an optimal algorithm for stochastic strongly-convex optimization. In Proceedings of the 24th Annual Conference on Learning Theory, pages 421 436. Nesterov, Y. (2005) Smooth minimization of non-smooth functions. Mathematical Programming, 103(1):127 152. Nesterov, Y. (2007). Gradient methods for minimizing composite objective function. Core discussion papers.

Reference (2) Tseng, P. (2008). On acclerated proximal gradient methods for convexconcave optimization. Technical report, University of Washington. Boyd, S. and Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press. Rakhlin, A., Shamir, O., and Sridharan, K. (2012) Making gradient descent optimal for strongly convex stochastic optimization. In Proceedings of the 29th International Conference on Machine Learning, pages 449 456.