ECE 592 Topics in Data Science

1 ECE 592 Topics in Data Science Dror Baron Associate Professor Dept. of Electrical and Computer Engr. North Carolina State University, NC, USA

2 Optimization Keywords: linear programming, dynamic programming, convex optimization, non-convex optimization

3 What is Optimization?

4 What is optimization? Wikipedia: "In mathematics, computer science and operations research, mathematical optimization (alternatively, mathematical programming or simply, optimization) is the selection of a best element (with regard to some criterion) from some set of available alternatives."

5 Application #1: Classroom scheduling. Real story: NCSU has classes on multiple campuses, dozens of buildings, etc. We want a good schedule. What's good? Availability of rooms; proximity of classroom to department; instructors' day/time preferences; matching room sizes to anticipated class enrollment; avoiding conflicts between course pairs of interest to students.

6 Application #2: $\ell_1$ recovery. Among infinitely many solutions, seek one with smallest $\ell_1$ norm (sum of absolute values). Relation to compressed sensing recovery (later in course). Can express $x = x_p - x_n$, so that $\|x\|_1 = \sum_{i=1}^{N} (x_{p,i} + x_{n,i})$. Solve $\min \sum_{i=1}^{N} (x_{p,i} + x_{n,i})$ subject to (s.t.) $y = \Phi x_p - \Phi x_n$. Also need $x_p$, $x_n$ to be non-negative.

7 Application #3: reducing fuel consumption. Suppose gas prices increase a lot. Truck fleet company wants to save $ by reducing fuel consumption. Things are simple on flat highways. Challenges: 1) You see a hill; can push engine up hill and coast down, or accelerate before hill, then reduce speed while climbing. 2) You see a red light; should you coast, accelerate, or slam the brakes? Main point: dynamic behavior links past, present, future.

8 Application #4: process design in factories. Consider factory with complicated process. Want to buy fewer inputs (chemicals). Want to use less energy. Want product to be produced quickly (time). Want robustness to surprises (e.g., power shortages). Goal: tune production process to minimize costs. Costs involve inputs, energy, time, robustness. Known as multi-objective optimization.

9 Dynamic Programming (DP) Keywords: Bellman equations, dynamic programming

10 What is dynamic programming (DP)? Wikipedia: "In mathematics, management science, economics, computer science, and bioinformatics, dynamic programming (also known as dynamic optimization) is a method for solving a complex problem by breaking it down into a collection of simpler subproblems, solving each of those subproblems just once, and storing their solutions, ideally using a memory-based data structure."

11 Resembles divide and conquer. Have a large problem. Partition into parts. Dynamic nature of problem links past, present, future. Want decision whose combined costs (current plus future) are best. Whereas brute force optimization is computationally intense, DP is fast.

12 Problem setting. $t$: time. $T$: time horizon (maximal time). $x_t$: state at time $t$. Possible actions $a_t \in \Gamma(x_t)$. $T(x,a)$: next state upon choosing action $a$. $F(x,a)$: payoff from action $a$. Want to maximize our payoff up to horizon $T$.

13 Solution approach. Basis case: $t=T-1$, have one time step left for an action. Maximize payoff by maximizing $F(x_t,a)$: $a^* = \arg\max_{a \in \Gamma(x_t)} F(x_t,a)$. At time $T$ (end of problem) arrive at state $x_T = T(x_t,a^*)$. Don't care about final state, only about payoff.

14 Solution continued. Recursive case: $t<T-1$, have multiple decisions left. Let's keep it simple with $t=T-2$. Based on basis case, for each $x_{T-1}$ can calculate $a^*$ for last decision (in next time step, $t=T-1$). Want optimal action to account for current payoff and payoff in next step: $a^* = \arg\max_{a \in \Gamma(x_t)} \{F(x_t,a) + \text{next\_payoff}(T(x_t,a))\}$.

15 Recursive solution. Let's simplify recursive case for $t=T-2$ using notation for optimal actions / payoffs at time $t$. $a^*(x_t)$: optimal action at time $t$ given state $x_t$. $\Psi(x_t)$: optimal payoff starting from time $t$. Basis case provides $a^*(x_{T-1})$, $\Psi(x_{T-1})$ for all $x_{T-1}$. Recursive case for $t=T-2$: $a^* = \arg\max_{a \in \Gamma(x_t)} \{F(x_t,a) + \text{next\_payoff}(T(x_t,a))\} = \arg\max_{a \in \Gamma(x_t)} \{F(x_t,a) + \Psi(x_{T-1} = T(x_t,a))\}$. Repeat recursively for smaller $t$.

16 Computationally efficient DP solution. Instead of processing from $t$ up to $T$, reverse order. $t=T-1$: compute $a^*(x_t)$, $\Psi(x_t)$ for all possible $x_t$. $t=T-2$: $a^* = \arg\max_{a \in \Gamma(x_t)} \{F(x_t,a) + \Psi(x_{T-1} = T(x_{T-2},a))\}$. $t=T-3$: $a^* = \arg\max_{a \in \Gamma(x_t)} \{F(x_t,a) + \Psi(x_{T-2} = T(x_{T-3},a))\}$. General case: Bellman's optimality equations. Each time step, store optimal actions and payoffs. Lookup table (LUT) for $\Psi$ instead of recomputing. Can construct sequence of optimal actions with LUT.
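
The backward pass above can be sketched in Python roughly as follows. This is a minimal sketch assuming a small finite state set; the function names and the toy problem (positions 0 to 3 with hypothetical rewards, where the action is "stay" or "move right") are illustrative assumptions, not part of the lecture.

```python
def backward_induction(states, actions, transition, payoff, horizon):
    """Fill lookup tables for Psi and a*, processing t = T-1 down to 0."""
    psi = [{} for _ in range(horizon + 1)]   # psi[t][x]: optimal payoff from time t
    best = [{} for _ in range(horizon)]      # best[t][x]: optimal action a*(x_t)
    for x in states:
        psi[horizon][x] = 0                  # the final state carries no payoff
    for t in range(horizon - 1, -1, -1):
        for x in states:
            # Bellman step: current payoff plus stored payoff of the next state
            psi[t][x], best[t][x] = max(
                (payoff(x, a) + psi[t + 1][transition(x, a)], a)
                for a in actions(x)
            )
    return psi, best


def optimal_actions(best, transition, x0):
    """Read the sequence of optimal actions out of the lookup table."""
    x, plan = x0, []
    for table in best:
        a = table[x]
        plan.append(a)
        x = transition(x, a)
    return plan


# Hypothetical toy problem: reward is collected for the state we land on.
REWARDS = [1, 0, 0, 5]
states = range(4)
actions = lambda x: [0, 1] if x < 3 else [0]   # 0 = stay, 1 = move right
transition = lambda x, a: x + a
payoff = lambda x, a: REWARDS[x + a]

psi, best = backward_induction(states, actions, transition, payoff, horizon=4)
plan = optimal_actions(best, transition, x0=0)
print(psi[0][0], plan)  # 10 [1, 1, 1, 0]
```

A greedy rule would stay at position 0 (payoff 1 now instead of 0), whereas the stored table accepts two zero-payoff steps to reach the high-payoff state.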

17 Why computationally efficient? Let's contrast computational complexities. Brute force optimization: $|\Gamma|$ actions per time step and $T$ time steps; must evaluate $|\Gamma|^T$ trajectories of actions, $\Theta(|\Gamma|^T)$. DP: compute $F(x_t,a) + \Psi(x_{t+1} = T(x_t,a))$ for $|\Gamma|$ actions, $T$ times, $\Theta(|\Gamma| T)$ (for each possible state). Whereas brute force optimization is computationally intense, DP is fast.
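
To make the contrast concrete, the sketch below counts evaluations for both approaches on a small made-up problem (4 states, 2 actions, horizon 10; the rewards and the clamping rule in `step` are assumptions for illustration). Both return the same optimal payoff, but brute force enumerates every action sequence.

```python
from itertools import product

REWARDS = [1, 0, 0, 5]   # hypothetical payoff for landing on each state
ACTIONS = (0, 1)         # 0 = stay, 1 = move one state to the right
HORIZON = 10


def step(x, a):
    """Next state; moving past the last state keeps us there."""
    return min(x + a, len(REWARDS) - 1)


def brute_force(x0):
    """Evaluate every one of the |ACTIONS|**HORIZON action sequences."""
    best, count = float("-inf"), 0
    for plan in product(ACTIONS, repeat=HORIZON):
        x, total = x0, 0
        for a in plan:
            x = step(x, a)
            total += REWARDS[x]
        count += 1
        best = max(best, total)
    return best, count


def dynamic_programming(x0):
    """Backward pass: one evaluation per (time, state, action) triple."""
    psi = [0] * len(REWARDS)          # optimal payoff at the horizon
    count = 0
    for _ in range(HORIZON):
        new_psi = []
        for x in range(len(REWARDS)):
            values = []
            for a in ACTIONS:
                nxt = step(x, a)
                values.append(REWARDS[nxt] + psi[nxt])
                count += 1
            new_psi.append(max(values))
        psi = new_psi
    return psi[x0], count


bf_value, bf_count = brute_force(0)
dp_value, dp_count = dynamic_programming(0)
print(bf_value, bf_count)  # 40 1024
print(dp_value, dp_count)  # 40 80
```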

18 Variations. Deterministic / random: next state and payoff could be random. Example: there could be more users than expected; adjust server (action) to account for future trajectory of software. Finite / infinite horizon: infinite horizon decision problems require discount factor $\beta$ to give future payoffs at time $t$ weight $\beta^t$; payoffs in far future matter less, $\beta<1$. Discrete / continuous time.

19 Example [Cormen et al.]: rod cutting problem. Have rod of integer length $n$. Have table of prices $p_i$ charged for length-$i$ cuts. Cutting is free. Want to cut rod into parts (or not cut at all) to maximize profit.

20 Example continued. Length $n=4$. Can charge prices $p_1=1$, $p_2=5$, $p_3=8$, $p_4=9$. Could look at all possible sets of cuts (see below).

21 Example using DP. Unrealistic to consider all cutting configurations for large $n$; use DP instead. Basis: $n=1$, $\Psi(1)=p_1=1$. Recursion: $n=2$, $\Psi(2)=\max\{2\Psi(1),p_2\}=5$; $n=3$, $\Psi(3)=\max\{\Psi(1)+\Psi(2),p_3\}=\max\{1+5,8\}=8$. At each stage, maximize over $\Psi(k)+\Psi(n-k)$ for $k=1,2,\ldots,n-1$; and for $k=n$ use $p_n$. Ex: $\Psi(7)=\max\{\Psi(1)+\Psi(6),\Psi(2)+\Psi(5),\Psi(3)+\Psi(4),\ldots,p_7\}$.
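
The recursion above translates almost line for line into code. A minimal Python sketch, using the prices from the example (the function name `rod_cutting` is an illustrative choice):

```python
def rod_cutting(prices):
    """prices[i-1] is the price p_i of a length-i piece; returns [Psi(0), ..., Psi(n)]."""
    n = len(prices)
    psi = [0] * (n + 1)
    for length in range(1, n + 1):
        best = prices[length - 1]              # k = n: sell the rod uncut
        for k in range(1, length):             # split into two optimally cut parts
            best = max(best, psi[k] + psi[length - k])
        psi[length] = best                     # store in the lookup table
    return psi


psi = rod_cutting([1, 5, 8, 9])
print(psi)  # [0, 1, 5, 8, 10]
```

For $n=4$ the table gives $\Psi(4)=\Psi(2)+\Psi(2)=10$, which beats selling the rod uncut at $p_4=9$.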

22 Real-world application: Viterbi algorithm. Decodes convolutional codes in CDMA. Also used in speech recognition: text is hidden, and (noisy) speech observations help estimate text. Relies on DP. Finds shortest path.
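
As a rough illustration of the shortest-path view, here is a small Viterbi sketch in Python in which each edge cost is a negative log probability. The two-state model, its probabilities, and the observation strings are made up for the example and are not from the lecture.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely hidden state sequence, found as a shortest path (cost = -log probability)."""
    # cost[s]: smallest cost of any path that ends in state s at the current time
    cost = {s: -math.log(start_p[s] * emit_p[s][observations[0]]) for s in states}
    back = []  # back[t][s]: best predecessor of state s at time t+1
    for obs in observations[1:]:
        new_cost, pointers = {}, {}
        for s in states:
            prev = min(states, key=lambda r: cost[r] - math.log(trans_p[r][s]))
            new_cost[s] = cost[prev] - math.log(trans_p[prev][s]) - math.log(emit_p[s][obs])
            pointers[s] = prev
        cost = new_cost
        back.append(pointers)
    # Trace back from the cheapest final state using the stored pointers
    state = min(states, key=lambda s: cost[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical model: states tend to persist, and each state favors one symbol.
states = (0, 1)
start_p = {0: 0.5, 1: 0.5}
trans_p = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}
emit_p = {0: {"a": 0.8, "b": 0.2}, 1: {"a": 0.2, "b": 0.8}}

clean = viterbi("aabbb", states, start_p, trans_p, emit_p)
noisy = viterbi("aabaa", states, start_p, trans_p, emit_p)
print(clean)  # [0, 0, 1, 1, 1]
print(noisy)  # [0, 0, 0, 0, 0]
```

In the second call the isolated "b" is treated as noise: switching state twice costs more than one unlikely emission.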

23 Linear Programming Keywords: linear programming, simplex method

24 Formulation. Canonical form: $\max_x c^T x$ s.t. $Ax \le b$, $x \ge 0$. Note: s.t. = subject to. Matrix manipulations/tricks create variations: $Ax=b$ by enforcing $Ax \le b$ and $Ax \ge b$. We've minimized $\|x\|_1$ (instead of $c^T x$).

25 What's it good for? Transportation: pose airline costs and revenue as linear model, maximize profit (revenue minus costs) w/ LP. Manufacturing: minimize costs by posing them as linear model. Common theme: many real-world problems are approximately linear, or can be linearized around working point (Taylor series).

26 History. Early formulations date back to early 20th century (rudimentary forms even earlier). Dantzig invented simplex method (solver) in the 1940s. Polynomial average runtime; slow worst case. Interior point methods much faster worst case.

27 Simplex algorithm. Linear constraints, $Ax \le b$, $x \ge 0$, correspond to convex polytope. Linear function being optimized, $c^T x$, is optimal on corner point of polytope. Simplex = outer shell of convex polytope. Start at some corner point (vertex). Examine neighboring vertices. Either $c^T x$ already optimal, or it's better at neighbor. Move to best neighboring vertex; iterate until done. Specific steps correspond to linear algebra.
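
The "optimal on corner point" property can be checked on a tiny two-variable LP. The Python sketch below is not the simplex method: it simply enumerates all vertices (intersections of pairs of constraint boundaries) and keeps the best feasible one, which is only practical for very small problems. The LP itself is a made-up example.

```python
from itertools import combinations


def lp_by_vertex_enumeration(c, constraints, tol=1e-9):
    """Maximize c[0]*x + c[1]*y subject to a1*x + a2*y <= b for each (a1, a2, b).

    Returns (best_value, (x, y)), or None if no feasible vertex exists.
    """
    best = None
    for (a1, a2, b1), (a3, a4, b2) in combinations(constraints, 2):
        det = a1 * a4 - a2 * a3
        if abs(det) < tol:
            continue  # parallel boundaries never meet in a vertex
        # Cramer's rule for the point where both constraints are tight
        x = (b1 * a4 - a2 * b2) / det
        y = (a1 * b2 - b1 * a3) / det
        if all(p * x + q * y <= r + tol for p, q, r in constraints):
            value = c[0] * x + c[1] * y
            if best is None or value > best[0]:
                best = (value, (x, y))
    return best


# max 2x + 3y  s.t.  x + 2y <= 8,  3x + y <= 9,  x >= 0,  y >= 0
constraints = [(1, 2, 8), (3, 1, 9), (-1, 0, 0), (0, -1, 0)]
value, vertex = lp_by_vertex_enumeration((2, 3), constraints)
print(value, vertex)
```

The feasible vertices are (0,0), (3,0), (0,4), and (2,3), with objective values 0, 6, 12, and 13, so the optimum sits at the corner (2,3).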

28 Convex Optimization Keywords: convex optimization

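29 What are convex/concave functions? Consider real-valued function defined on space $\mathcal{X}$, $f: \mathcal{X} \to \mathbb{R}$. Convex: $f(\lambda x+(1-\lambda)y) \le \lambda f(x)+(1-\lambda)f(y)$, $\forall x,y \in \mathcal{X}$, $\lambda \in (0,1)$. Concave: $f(\lambda x+(1-\lambda)y) \ge \lambda f(x)+(1-\lambda)f(y)$. Note: $f$ convex if and only if $-f$ concave; convex/concave imply non-negative/non-positive second derivatives (where they exist). For a convex function, any local minimum is a global minimum.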

30 What is convex optimization? Basic convex problem: $x^* = \arg\min_{x \in \mathcal{X}} f(x)$. Set $\mathcal{X}$ and function $f(x)$ must both be convex. Alternate form: $\min f(x)$ s.t. $g_i(x) \le 0$, $\forall i$. Functions $f, g_1, \ldots, g_m$ all convex.

31 Applications (Why is this interesting?) Many problems can be posed as convex: least squares, entropy maximization, linear programming.

32 Newton's method. Newton's method finds roots of equations, $f(x)=0$. Instead, derivative $f'=0$ or gradient $=0$. Taylor expansion with $\Delta x = x - x_t$: $f_T(x) = f(x_t) + f'(x_t)\Delta x + \frac{1}{2} f''(x_t)\Delta x^2 + \ldots$ Root of derivative: $f'(x_t) + f''(x_t)\Delta x = 0$. Iterate with $x_{t+1} = x_t + \Delta x$. Newton's method is simple but O(1/t) convergence.
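
The update $x_{t+1} = x_t + \Delta x$ with $\Delta x = -f'(x_t)/f''(x_t)$ can be sketched in a few lines of Python. The test function $f(x) = x^4/4 - x$ (minimum at $x=1$) and the fixed iteration count are assumptions chosen for illustration.

```python
def newton_minimize(df, d2f, x0, iterations=20):
    """Find a stationary point of f by applying Newton's method to f'(x) = 0."""
    x = x0
    for _ in range(iterations):
        dx = -df(x) / d2f(x)   # solves f'(x_t) + f''(x_t) * dx = 0
        x = x + dx
    return x


# f(x) = x**4 / 4 - x, so f'(x) = x**3 - 1 and f''(x) = 3 * x**2
x_star = newton_minimize(lambda x: x ** 3 - 1, lambda x: 3 * x ** 2, x0=2.0)
print(x_star)
```

The sketch assumes $f''(x_t) \ne 0$ along the way; a practical version would guard against a vanishing second derivative and stop once $|\Delta x|$ is small.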

33 Second order methods. Challenge: first order approximation to derivative slows down Newton's method. Solution: use higher order approximation. Instead of $f'(x_t) + f''(x_t)\Delta x$, use third derivative too. Multi-dimensional function? Use gradient, Hessians. Second order methods more complicated but faster.

34 Gradient descent Keywords: gradient descent, line search, golden section search

35 Gradient descent. In each iteration, select direction to pursue. Coordinate descent: move along one of the coordinates. Gradient descent: direction that decreases cost function fastest. How far should we move along that direction? Undershooting or overshooting is bad for convergence.
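
The effect of the step size can be seen on a simple quadratic. The Python sketch below uses a fixed step and a made-up cost $f(x) = x_1^2 + 10 x_2^2$; the three step sizes are chosen to show a reasonable step, undershooting, and overshooting.

```python
def gradient_descent(grad, x0, step, iterations):
    """Fixed-step gradient descent: move against the gradient each iteration."""
    x = list(x0)
    for _ in range(iterations):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def f(x):
    return x[0] ** 2 + 10 * x[1] ** 2


def grad_f(x):
    return [2 * x[0], 20 * x[1]]


start = [1.0, 1.0]
good = gradient_descent(grad_f, start, step=0.09, iterations=100)
small = gradient_descent(grad_f, start, step=0.001, iterations=100)   # undershoots
large = gradient_descent(grad_f, start, step=0.11, iterations=100)    # overshoots

print(f(good), f(small), f(large))
```

For this cost the iteration along $x_2$ multiplies the coordinate by $1 - 20\,\text{step}$ each time, so any fixed step above 0.1 makes it grow instead of shrink, while a very small step barely moves in 100 iterations.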

36 Line search. Key sub-method is to move along direction just enough to minimize the function along that line. Line search = optimization along line. Many variations: binary search, golden section search. Let's make up an example for this and code it! Check course webpage for Matlab script.
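
The course's Matlab script is not reproduced here. As a stand-in, the following is a minimal Python sketch of golden section search, applied to a line search along the negative gradient of the made-up cost $f(x,y) = x^2 + 10y^2$ at the point $(1,1)$; the search interval $[0,1]$ and tolerance are assumptions.

```python
import math


def golden_section_search(phi, lo, hi, tol=1e-8):
    """Minimize a unimodal one-dimensional function phi on [lo, hi]."""
    ratio = (math.sqrt(5) - 1) / 2            # inverse golden ratio, about 0.618
    c = hi - ratio * (hi - lo)
    d = lo + ratio * (hi - lo)
    fc, fd = phi(c), phi(d)
    while hi - lo > tol:
        if fc < fd:                           # minimum lies in [lo, d]
            hi, d, fd = d, c, fc              # old c becomes the new right probe
            c = hi - ratio * (hi - lo)
            fc = phi(c)
        else:                                 # minimum lies in [c, hi]
            lo, c, fc = c, d, fd              # old d becomes the new left probe
            d = lo + ratio * (hi - lo)
            fd = phi(d)
    return (lo + hi) / 2


def f(x, y):
    return x ** 2 + 10 * y ** 2


x0, y0 = 1.0, 1.0
gx, gy = 2 * x0, 20 * y0                      # gradient of f at (x0, y0)
# phi(a) is the cost after moving a distance a along the negative gradient
alpha = golden_section_search(lambda a: f(x0 - a * gx, y0 - a * gy), 0.0, 1.0)
print(alpha)
```

Setting the derivative of $\phi(\alpha) = (1-2\alpha)^2 + 10(1-20\alpha)^2$ to zero gives $\alpha = 404/8008 \approx 0.0504$, which the search should approach. Each iteration reuses one of the two previous function evaluations, which is the main appeal of the golden ratio spacing.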

37 Integer Programming Keywords: integer programming, integer linear programs, relaxation

38 What is integer programming? Integer program = optimization problem where some/all variables must be integers. Integer linear programs (ILP): $x^* = \arg\max c^T x$ s.t. $Ax + s = b$, $s \ge 0$, $x \in \mathbb{Z}^n$. Slack variables $s$.

39 Example: support set detection. $y=Ax+z$, sparse $x$. Want to identify support set where $x \ne 0$. Can we do perfect support set detection? Are there tiny non-zeros? (yes: difficult) What's the SNR? (low: difficult)

40 Example continued. Support set detection, $y=Ax+z$, want support set. Algorithm: consider candidate support set, $s \in \{0,1\}^N$. Create matrix $A_s$ that contains column $i$ iff $s_i=1$. Run least squares using $A_s$ (find low-energy solution to $y=A_s x$). Iterate over all $s$, select solution with smallest residual. Algorithm is optimal & slow.

41 More about ILP. Integer linear programs can be shown to be NP-hard; this means no efficient (polynomial-time) algorithm is known, so they are slow in the worst case. Another algorithmic approach: relaxation. First ignore integer constraints, solve standard LP. Next, round (sort of!) to nearby (not necessarily nearest) integer solution. Various applications require integer solutions; we're just skimming the surface.

42 Non-Convex Optimization Keywords: non-convex optimization

43 What's the challenge? Many functions are non-convex. Convex: one local min (it's the global min). Non-convex: local min need not be global min. Various algorithms could get stuck in local min.

44 Is it hopeless? Maybe initialize an algo many different ways; could get stuck in different local mins, choose best. But could be tons of local mins (especially in higher dimensions).
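
A small Python sketch of the multi-start idea: run plain gradient descent from several initial points on a made-up non-convex function, $f(x) = x^4 - 3x^2 + x$, and keep the best result. The function, step size, and grid of starting points are assumptions for illustration.

```python
def f(x):
    return x ** 4 - 3 * x ** 2 + x


def df(x):
    return 4 * x ** 3 - 6 * x + 1


def descend(x, step=0.01, iterations=2000):
    """Fixed-step gradient descent from a single starting point."""
    for _ in range(iterations):
        x -= step * df(x)
    return x


starts = [-2.0, -1.0, 0.0, 1.0, 2.0]
local_minima = [descend(x0) for x0 in starts]
best_x = min(local_minima, key=f)

for x0, x in zip(starts, local_minima):
    print(f"start {x0:+.1f} -> x = {x:+.4f}, f(x) = {f(x):+.4f}")
print("best:", best_x)
```

Starts at 1 and 2 get stuck in the local minimum near $x \approx 1.13$, while the others reach the global minimum near $x \approx -1.30$; choosing the best over all runs recovers the latter.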

45 Markov chain Monte Carlo. Markov chain Monte Carlo (MCMC) can solve some non-convex problems. Form expression $E(x)$ for energy (analogous to statistical physics). Distribution for signal: $\Pr(x) = Z \exp\{-sE(x)\}$; $s$ analogous to inverse temperature; normalization term $Z$. Sample next version of $x$ from Gibbs distribution. High temperature: small $s$, weak pull toward low energy. Low temp: large $s$, strong pull to low energy. Gradual cooling.

46 MCMC continued. 1) Do we sample entire sequence $x$? Not necessarily. Can consider re-generating one $x_i$ at a time; only need marginal distribution for $x_i$. 2) Is MCMC guaranteed to converge to global min? Maybe. If you cool down very slowly. 3) So is it any good? Depends. MCMC is very slow but can converge to global min; some techniques to accelerate it.
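
The ideas on the last two slides (Gibbs distribution, re-generating one $x_i$ at a time, gradual cooling) can be sketched in Python as below. The energy function is a made-up subset-sum style example with many local minima; the weights, target, cooling schedule, and seed are all assumptions, and this is a sketch rather than a tuned sampler.

```python
import math
import random

WEIGHTS = [3, 5, 7, 11, 13]   # hypothetical data for the illustration
TARGET = 21


def energy(x):
    """E(x): squared gap between the sum of selected weights and the target."""
    return (sum(w for w, bit in zip(WEIGHTS, x) if bit) - TARGET) ** 2


def anneal(sweeps=200, s_start=0.01, s_end=2.0, seed=0):
    rng = random.Random(seed)
    x = [0] * len(WEIGHTS)
    best, best_energy = list(x), energy(x)
    for k in range(sweeps):
        # Gradual cooling: inverse temperature s grows geometrically
        s = s_start * (s_end / s_start) ** (k / (sweeps - 1))
        for i in range(len(x)):
            # Re-generate x_i given the other entries: Pr(x_i = 1) is
            # exp(-s*E1) / (exp(-s*E0) + exp(-s*E1)) = 1 / (1 + exp(s*(E1 - E0)))
            x[i] = 0
            e0 = energy(x)
            x[i] = 1
            e1 = energy(x)
            d = s * (e1 - e0)
            if d > 50:
                p1 = 0.0          # avoid overflow in exp
            elif d < -50:
                p1 = 1.0
            else:
                p1 = 1.0 / (1.0 + math.exp(d))
            x[i] = 1 if rng.random() < p1 else 0
            e = energy(x)
            if e < best_energy:
                best, best_energy = list(x), e
    return best, best_energy


best, best_energy = anneal()
print(best, best_energy)
```

Note that the normalization term never has to be computed: it cancels in the ratio used to re-generate each $x_i$.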

47 EM Algorithm Keywords: expectation maximization algorithm, Gaussian mixture models, latent variables

48 Main ideas. Iterate over expectation (E) & maximization (M). Expectation: create function for computing expected log likelihood (based on current parameters). Maximization: update parameters to maximize expected log likelihood from E step. Details coming up.

49 Statistical model & motivation. Model generates data $X$. $Z$: latent / missing values. $\theta$: parameter. Likelihood function: $L(\theta;X,Z) = \Pr(X,Z|\theta)$. Marginal likelihood: $L(\theta;X) = \Pr(X|\theta) = \int L(\theta;X,Z)\,dZ$; might be intractable (e.g., due to many possible $Z$ sequences). Want to compute $L(\theta;X)$, then optimize parameter. Computationally intractable. Motivates EM.

50 Statistical model & motivation. Expectation: compute expected value of log likelihood for parameter $\theta^{(t)}$ in current iteration $t$: $Q(\theta|\theta^{(t)}) = E_{Z|X,\theta^{(t)}}[\log L(\theta;X,Z)]$. $Z$ typically discrete latent variables. Given parameter $\theta^{(t)}$, sequence $Z$ can be found; typically via fast algorithm, e.g., dynamic programming. Maximization: $\theta^{(t+1)} = \arg\max_{\theta} Q(\theta|\theta^{(t)})$.

51 Example: Gaussian mixture models. What's a Gaussian mixture model (GMM)? $X \sim \sum_i \alpha_i N(\mu_i, \sigma_i)$. Component $i$ has probability $\alpha_i$, mean $\mu_i$, standard dev $\sigma_i$. Could be multi-dimensional data: covariance matrix $\Sigma_i$. Useful? Many distributions well-approximated by GMM. In principle can model almost everything as GMM. Trade-off between # components and model accuracy.

52 Example continued. Challenge: parameters $(\alpha_i, \mu_i, \sigma_i)$ often unavailable. Must estimate from data $X$. To keep simple: $N$ scalar samples, $X \in \mathbb{R}^N$. Latent variable $Z \in \mathbb{Z}^N$; $z_n$ corresponds to the Gaussian component that $x_n$ belongs to. E step: compute sequence $Z$ given parameters $\theta = (\alpha_i, \mu_i, \sigma_i)$. M step: optimize $\theta$ given $Z$.
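
A minimal Python sketch of EM for a scalar GMM follows. It uses the common soft-assignment variant, where the E step computes for each sample the probability of belonging to each component rather than a single hard label $z_n$; that choice, the data, the initial parameters, and the floor on $\sigma_i$ are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def em_gmm(data, alphas, mus, sigmas, iterations=50, min_sigma=1e-3):
    """Soft-assignment EM for a scalar Gaussian mixture; returns (alphas, mus, sigmas)."""
    alphas, mus, sigmas = list(alphas), list(mus), list(sigmas)
    n, k = len(data), len(mus)
    for _ in range(iterations):
        # E step: probability that sample x came from component i, given current parameters
        resp = []
        for x in data:
            w = [alphas[i] * normal_pdf(x, mus[i], sigmas[i]) for i in range(k)]
            total = sum(w)
            resp.append([wi / total for wi in w])
        # M step: re-estimate (alpha_i, mu_i, sigma_i) from the weighted samples
        for i in range(k):
            weight = sum(r[i] for r in resp)
            alphas[i] = weight / n
            mus[i] = sum(r[i] * x for r, x in zip(resp, data)) / weight
            var = sum(r[i] * (x - mus[i]) ** 2 for r, x in zip(resp, data)) / weight
            sigmas[i] = max(math.sqrt(var), min_sigma)   # floor avoids a collapsed component
    return alphas, mus, sigmas


# Hypothetical data: two well-separated groups around 0 and 10
data = [-1.0, -0.5, 0.0, 0.5, 1.0, 9.0, 9.5, 10.0, 10.5, 11.0]
alphas, mus, sigmas = em_gmm(data, alphas=[0.5, 0.5], mus=[-1.0, 11.0], sigmas=[1.0, 1.0])
print(alphas, mus, sigmas)
```

On this data the estimates settle near $\alpha = (0.5, 0.5)$, $\mu = (0, 10)$, and $\sigma \approx 0.71$ for both components. With overlapping groups or a poor initialization EM may converge to a local optimum instead.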

Modern Methods of Data Analysis - WS 07/08

Modern Methods of Data Analysis - WS 07/08 Modern Methods of Data Analysis Lecture XV (04.02.08) Contents: Function Minimization (see E. Lohrmann & V. Blobel) Optimization Problem Set of n independent variables Sometimes in addition some constraints

More information

Convexization in Markov Chain Monte Carlo

Convexization in Markov Chain Monte Carlo in Markov Chain Monte Carlo 1 IBM T. J. Watson Yorktown Heights, NY 2 Department of Aerospace Engineering Technion, Israel August 23, 2011 Problem Statement MCMC processes in general are governed by non

More information

Gaussian Mixture Models For Clustering Data. Soft Clustering and the EM Algorithm

Gaussian Mixture Models For Clustering Data. Soft Clustering and the EM Algorithm Gaussian Mixture Models For Clustering Data Soft Clustering and the EM Algorithm K-Means Clustering Input: Observations: xx ii R dd ii {1,., NN} Number of Clusters: kk Output: Cluster Assignments. Cluster

More information

Regularization and Markov Random Fields (MRF) CS 664 Spring 2008

Regularization and Markov Random Fields (MRF) CS 664 Spring 2008 Regularization and Markov Random Fields (MRF) CS 664 Spring 2008 Regularization in Low Level Vision Low level vision problems concerned with estimating some quantity at each pixel Visual motion (u(x,y),v(x,y))

More information

EE/AA 578: Convex Optimization

EE/AA 578: Convex Optimization EE/AA 578: Convex Optimization Instructor: Maryam Fazel University of Washington Fall 2016 1. Introduction EE/AA 578, Univ of Washington, Fall 2016 course logistics mathematical optimization least-squares;

More information

An Efficient Model Selection for Gaussian Mixture Model in a Bayesian Framework

An Efficient Model Selection for Gaussian Mixture Model in a Bayesian Framework IEEE SIGNAL PROCESSING LETTERS, VOL. XX, NO. XX, XXX 23 An Efficient Model Selection for Gaussian Mixture Model in a Bayesian Framework Ji Won Yoon arxiv:37.99v [cs.lg] 3 Jul 23 Abstract In order to cluster

More information

The exam is closed book, closed notes except your one-page (two-sided) cheat sheet.

The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. CS 189 Spring 2015 Introduction to Machine Learning Final You have 2 hours 50 minutes for the exam. The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. No calculators or

More information

Mathematical Programming and Research Methods (Part II)

Mathematical Programming and Research Methods (Part II) Mathematical Programming and Research Methods (Part II) 4. Convexity and Optimization Massimiliano Pontil (based on previous lecture by Andreas Argyriou) 1 Today s Plan Convex sets and functions Types

More information

Linear Programming. Readings: Read text section 11.6, and sections 1 and 2 of Tom Ferguson s notes (see course homepage).

Linear Programming. Readings: Read text section 11.6, and sections 1 and 2 of Tom Ferguson s notes (see course homepage). Linear Programming Learning Goals. Introduce Linear Programming Problems. Widget Example, Graphical Solution. Basic Theory: Feasible Set, Vertices, Existence of Solutions. Equivalent formulations. Outline

More information

Linear Programming. Widget Factory Example. Linear Programming: Standard Form. Widget Factory Example: Continued.

Linear Programming. Widget Factory Example. Linear Programming: Standard Form. Widget Factory Example: Continued. Linear Programming Widget Factory Example Learning Goals. Introduce Linear Programming Problems. Widget Example, Graphical Solution. Basic Theory:, Vertices, Existence of Solutions. Equivalent formulations.

More information

CMU-Q Lecture 9: Optimization II: Constrained,Unconstrained Optimization Convex optimization. Teacher: Gianni A. Di Caro

CMU-Q Lecture 9: Optimization II: Constrained,Unconstrained Optimization Convex optimization. Teacher: Gianni A. Di Caro CMU-Q 15-381 Lecture 9: Optimization II: Constrained,Unconstrained Optimization Convex optimization Teacher: Gianni A. Di Caro GLOBAL FUNCTION OPTIMIZATION Find the global maximum of the function f x (and

More information

K-Means and Gaussian Mixture Models

K-Means and Gaussian Mixture Models K-Means and Gaussian Mixture Models David Rosenberg New York University June 15, 2015 David Rosenberg (New York University) DS-GA 1003 June 15, 2015 1 / 43 K-Means Clustering Example: Old Faithful Geyser

More information

A Brief Look at Optimization

A Brief Look at Optimization A Brief Look at Optimization CSC 412/2506 Tutorial David Madras January 18, 2018 Slides adapted from last year s version Overview Introduction Classes of optimization problems Linear programming Steepest

More information

Clustering web search results

Clustering web search results Clustering K-means Machine Learning CSE546 Emily Fox University of Washington November 4, 2013 1 Clustering images Set of Images [Goldberger et al.] 2 1 Clustering web search results 3 Some Data 4 2 K-means

More information

Constrained and Unconstrained Optimization

Constrained and Unconstrained Optimization Constrained and Unconstrained Optimization Carlos Hurtado Department of Economics University of Illinois at Urbana-Champaign hrtdmrt2@illinois.edu Oct 10th, 2017 C. Hurtado (UIUC - Economics) Numerical

More information

CPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2017

CPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2017 CPSC 340: Machine Learning and Data Mining Principal Component Analysis Fall 2017 Assignment 3: 2 late days to hand in tonight. Admin Assignment 4: Due Friday of next week. Last Time: MAP Estimation MAP

More information

Object Recognition Using Pictorial Structures. Daniel Huttenlocher Computer Science Department. In This Talk. Object recognition in computer vision

Object Recognition Using Pictorial Structures. Daniel Huttenlocher Computer Science Department. In This Talk. Object recognition in computer vision Object Recognition Using Pictorial Structures Daniel Huttenlocher Computer Science Department Joint work with Pedro Felzenszwalb, MIT AI Lab In This Talk Object recognition in computer vision Brief definition

More information

Segmentation: Clustering, Graph Cut and EM

Segmentation: Clustering, Graph Cut and EM Segmentation: Clustering, Graph Cut and EM Ying Wu Electrical Engineering and Computer Science Northwestern University, Evanston, IL 60208 yingwu@northwestern.edu http://www.eecs.northwestern.edu/~yingwu

More information

Machine Learning and Data Mining. Clustering (1): Basics. Kalev Kask

Machine Learning and Data Mining. Clustering (1): Basics. Kalev Kask Machine Learning and Data Mining Clustering (1): Basics Kalev Kask Unsupervised learning Supervised learning Predict target value ( y ) given features ( x ) Unsupervised learning Understand patterns of

More information

OPTIMIZATION METHODS

OPTIMIZATION METHODS D. Nagesh Kumar Associate Professor Department of Civil Engineering, Indian Institute of Science, Bangalore - 50 0 Email : nagesh@civil.iisc.ernet.in URL: http://www.civil.iisc.ernet.in/~nagesh Brief Contents

More information

3 INTEGER LINEAR PROGRAMMING

3 INTEGER LINEAR PROGRAMMING 3 INTEGER LINEAR PROGRAMMING PROBLEM DEFINITION Integer linear programming problem (ILP) of the decision variables x 1,..,x n : (ILP) subject to minimize c x j j n j= 1 a ij x j x j 0 x j integer n j=

More information

Reinforcement Learning: A brief introduction. Mihaela van der Schaar

Reinforcement Learning: A brief introduction. Mihaela van der Schaar Reinforcement Learning: A brief introduction Mihaela van der Schaar Outline Optimal Decisions & Optimal Forecasts Markov Decision Processes (MDPs) States, actions, rewards and value functions Dynamic Programming

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Clustering and EM Barnabás Póczos & Aarti Singh Contents Clustering K-means Mixture of Gaussians Expectation Maximization Variational Methods 2 Clustering 3 K-

More information

Mathematical and Algorithmic Foundations Linear Programming and Matchings

Mathematical and Algorithmic Foundations Linear Programming and Matchings Adavnced Algorithms Lectures Mathematical and Algorithmic Foundations Linear Programming and Matchings Paul G. Spirakis Department of Computer Science University of Patras and Liverpool Paul G. Spirakis

More information

COT 6936: Topics in Algorithms! Giri Narasimhan. ECS 254A / EC 2443; Phone: x3748

COT 6936: Topics in Algorithms! Giri Narasimhan. ECS 254A / EC 2443; Phone: x3748 COT 6936: Topics in Algorithms! Giri Narasimhan ECS 254A / EC 2443; Phone: x3748 giri@cs.fiu.edu http://www.cs.fiu.edu/~giri/teach/cot6936_s12.html https://moodle.cis.fiu.edu/v2.1/course/view.php?id=174

More information

Computer vision: models, learning and inference. Chapter 10 Graphical Models

Computer vision: models, learning and inference. Chapter 10 Graphical Models Computer vision: models, learning and inference Chapter 10 Graphical Models Independence Two variables x 1 and x 2 are independent if their joint probability distribution factorizes as Pr(x 1, x 2 )=Pr(x

More information

Pattern Recognition. Kjell Elenius. Speech, Music and Hearing KTH. March 29, 2007 Speech recognition

Pattern Recognition. Kjell Elenius. Speech, Music and Hearing KTH. March 29, 2007 Speech recognition Pattern Recognition Kjell Elenius Speech, Music and Hearing KTH March 29, 2007 Speech recognition 2007 1 Ch 4. Pattern Recognition 1(3) Bayes Decision Theory Minimum-Error-Rate Decision Rules Discriminant

More information

16.410/413 Principles of Autonomy and Decision Making

16.410/413 Principles of Autonomy and Decision Making 16.410/413 Principles of Autonomy and Decision Making Lecture 17: The Simplex Method Emilio Frazzoli Aeronautics and Astronautics Massachusetts Institute of Technology November 10, 2010 Frazzoli (MIT)

More information

5 Machine Learning Abstractions and Numerical Optimization

5 Machine Learning Abstractions and Numerical Optimization Machine Learning Abstractions and Numerical Optimization 25 5 Machine Learning Abstractions and Numerical Optimization ML ABSTRACTIONS [some meta comments on machine learning] [When you write a large computer

More information

Week 5. Convex Optimization

Week 5. Convex Optimization Week 5. Convex Optimization Lecturer: Prof. Santosh Vempala Scribe: Xin Wang, Zihao Li Feb. 9 and, 206 Week 5. Convex Optimization. The convex optimization formulation A general optimization problem is

More information

Computer Vision I - Filtering and Feature detection

Computer Vision I - Filtering and Feature detection Computer Vision I - Filtering and Feature detection Carsten Rother 30/10/2015 Computer Vision I: Basics of Image Processing Roadmap: Basics of Digital Image Processing Computer Vision I: Basics of Image

More information

Discrete Optimization. Lecture Notes 2

Discrete Optimization. Lecture Notes 2 Discrete Optimization. Lecture Notes 2 Disjunctive Constraints Defining variables and formulating linear constraints can be straightforward or more sophisticated, depending on the problem structure. The

More information

Machine Learning. Topic 5: Linear Discriminants. Bryan Pardo, EECS 349 Machine Learning, 2013

Machine Learning. Topic 5: Linear Discriminants. Bryan Pardo, EECS 349 Machine Learning, 2013 Machine Learning Topic 5: Linear Discriminants Bryan Pardo, EECS 349 Machine Learning, 2013 Thanks to Mark Cartwright for his extensive contributions to these slides Thanks to Alpaydin, Bishop, and Duda/Hart/Stork

More information

1. Introduction. performance of numerical methods. complexity bounds. structural convex optimization. course goals and topics

1. Introduction. performance of numerical methods. complexity bounds. structural convex optimization. course goals and topics 1. Introduction EE 546, Univ of Washington, Spring 2016 performance of numerical methods complexity bounds structural convex optimization course goals and topics 1 1 Some course info Welcome to EE 546!

More information

IE598 Big Data Optimization Summary Nonconvex Optimization

IE598 Big Data Optimization Summary Nonconvex Optimization IE598 Big Data Optimization Summary Nonconvex Optimization Instructor: Niao He April 16, 2018 1 This Course Big Data Optimization Explore modern optimization theories, algorithms, and big data applications

More information

10-701/15-781, Fall 2006, Final

10-701/15-781, Fall 2006, Final -7/-78, Fall 6, Final Dec, :pm-8:pm There are 9 questions in this exam ( pages including this cover sheet). If you need more room to work out your answer to a question, use the back of the page and clearly

More information

Natural Language Processing

Natural Language Processing Natural Language Processing Classification III Dan Klein UC Berkeley 1 Classification 2 Linear Models: Perceptron The perceptron algorithm Iteratively processes the training set, reacting to training errors

More information

Chapter 15 Introduction to Linear Programming

Chapter 15 Introduction to Linear Programming Chapter 15 Introduction to Linear Programming An Introduction to Optimization Spring, 2015 Wei-Ta Chu 1 Brief History of Linear Programming The goal of linear programming is to determine the values of

More information

Clustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin

Clustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin Clustering K-means Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, 2014 Carlos Guestrin 2005-2014 1 Clustering images Set of Images [Goldberger et al.] Carlos Guestrin 2005-2014

More information

ELEG Compressive Sensing and Sparse Signal Representations

ELEG Compressive Sensing and Sparse Signal Representations ELEG 867 - Compressive Sensing and Sparse Signal Representations Gonzalo R. Arce Depart. of Electrical and Computer Engineering University of Delaware Fall 211 Compressive Sensing G. Arce Fall, 211 1 /

More information

Perceptron: This is convolution!

Perceptron: This is convolution! Perceptron: This is convolution! v v v Shared weights v Filter = local perceptron. Also called kernel. By pooling responses at different locations, we gain robustness to the exact spatial location of image

More information

Data Mining Chapter 8: Search and Optimization Methods Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

Data Mining Chapter 8: Search and Optimization Methods Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Data Mining Chapter 8: Search and Optimization Methods Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Search & Optimization Search and Optimization method deals with

More information

CS 229 Midterm Review

CS 229 Midterm Review CS 229 Midterm Review Course Staff Fall 2018 11/2/2018 Outline Today: SVMs Kernels Tree Ensembles EM Algorithm / Mixture Models [ Focus on building intuition, less so on solving specific problems. Ask

More information

Constrained Optimization COS 323

Constrained Optimization COS 323 Constrained Optimization COS 323 Last time Introduction to optimization objective function, variables, [constraints] 1-dimensional methods Golden section, discussion of error Newton s method Multi-dimensional

More information

Lecture 2 - Introduction to Polytopes

Lecture 2 - Introduction to Polytopes Lecture 2 - Introduction to Polytopes Optimization and Approximation - ENS M1 Nicolas Bousquet 1 Reminder of Linear Algebra definitions Let x 1,..., x m be points in R n and λ 1,..., λ m be real numbers.

More information

Theoretical Concepts of Machine Learning

Theoretical Concepts of Machine Learning Theoretical Concepts of Machine Learning Part 2 Institute of Bioinformatics Johannes Kepler University, Linz, Austria Outline 1 Introduction 2 Generalization Error 3 Maximum Likelihood 4 Noise Models 5

More information

Image Registration Lecture 4: First Examples
