ECE 592 Topics in Data Science
1 ECE 592 Topics in Data Science Dror Baron Associate Professor Dept. of Electrical and Computer Engr. North Carolina State University, NC, USA
2 Optimization Keywords: linear programming, dynamic programming, convex optimization, non-convex optimization
3 What is Optimization?
4 What is optimization? Wikipedia: In mathematics, computer science and operations research, mathematical optimization (alternatively, mathematical programming or simply optimization) is the selection of a best element (with regard to some criterion) from some set of available alternatives.
5 Application #1: Classroom scheduling
- Real story: NCSU has classes on multiple campuses, dozens of buildings, etc.
- We want a good schedule. What's good?
- Availability of rooms
- Proximity of classroom to department
- Instructors have day/time preferences
- Match sizes of rooms and anticipated class enrollment
- Avoid conflicts between course pairs of interest to students
6 Application #2: l1 recovery
- Among infinitely many solutions, seek the one with smallest l1 norm (sum of absolute values)
- Relation to compressed sensing recovery (later in course)
- Can express x = x_p - x_n, so that ||x||_1 = Σ_{i=1}^N (x_{p,i} + x_{n,i})
- min Σ_{i=1}^N (x_{p,i} + x_{n,i}) subject to (s.t.) y = Φx_p - Φx_n
- Also need x_p, x_n to be non-negative
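The split into non-negative parts turns l1 minimization into a linear program. A minimal sketch using SciPy's `linprog` on made-up data (the matrix, sparsity pattern, and sizes below are illustrative assumptions, not from the course):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
M, N = 10, 30
Phi = rng.standard_normal((M, N))      # made-up measurement matrix
x_true = np.zeros(N)
x_true[[3, 17]] = [2.0, -1.5]          # made-up sparse signal
y = Phi @ x_true

# Variables are [x_p; x_n], both non-negative.
# Minimize 1^T x_p + 1^T x_n  s.t.  Phi x_p - Phi x_n = y.
c = np.ones(2 * N)
A_eq = np.hstack([Phi, -Phi])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
x_hat = res.x[:N] - res.x[N:]          # recombine into the signal estimate
```

Since splitting `x_true` into its positive and negative parts is itself feasible, the LP optimum can never exceed the l1 norm of the true signal.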
7 Application #3: Reducing fuel consumption
- Suppose gas prices increase a lot
- Truck fleet company wants to save money by reducing fuel consumption
- Things are simple on flat highways
- Challenges:
- 1) You see a hill: push the engine up the hill and coast down, or accelerate before the hill, then reduce speed while climbing
- 2) You see a red light: should you coast, accelerate, slam the brakes?
- Main point: dynamic behavior links past, present, future
8 Application #4: Process design in factories
- Consider a factory with a complicated process
- Want to buy fewer inputs (chemicals)
- Want to use less energy
- Want the product to be produced quickly (time)
- Want robustness to surprises (e.g., power shortages)
- Goal: tune the production process to minimize costs
- Costs involve inputs, energy, time, robustness, ...
- Known as multi-objective optimization
9 Dynamic Programming (DP) Keywords: Bellman equations, dynamic programming
10 What is dynamic programming (DP)? Wikipedia: In mathematics, management science, economics, computer science, and bioinformatics, dynamic programming (also known as dynamic optimization) is a method for solving a complex problem by breaking it down into a collection of simpler subproblems, solving each of those subproblems just once, and storing their solutions, ideally using a memory-based data structure.
11 Resembles divide and conquer
- Have a large problem
- Partition into parts
- Dynamic nature of the problem links past, present, future
- Want the decision whose combined costs (current plus future) are best
- Whereas brute force optimization is computationally intense, DP is fast
12 Problem setting
- t: time; T: time horizon (maximal time)
- x_t: state at time t
- a_t ∈ Γ(x_t): possible actions at state x_t
- T(x, a): next state upon choosing action a
- F(x, a): payoff from action a
- Want to maximize our payoff up to horizon T
13 Solution approach
- Basis case: t = T-1, one time step left for an action
- Maximize payoff by maximizing F(x_t, a): a* = arg max_{a ∈ Γ(x_t)} F(x_t, a)
- At time T (end of problem) arrive at state x_T = T(x_{T-1}, a*)
- Don't care about the final state, only about the payoff
14 Solution continued
- Recursive case: t < T-1, multiple decisions left
- Let's keep it simple with t = T-2
- Based on the basis case, for each state x_{T-1} we can calculate a* for the last decision (in the next time step, t = T-1)
- Want the optimal decision to account for the current payoff and the payoff in the next step:
- a* = arg max_{a ∈ Γ(x_t)} {F(x_t, a) + next_payoff(T(x_t, a))}
15 Recursive solution
- Let's simplify the recursive case for t = T-2 using notation for optimal actions / payoffs at time t
- a*(x_t): optimal action at time t given state x_t
- Ψ(x_t): optimal payoff starting from time t
- Basis case provides a*(x_{T-1}), Ψ(x_{T-1})
- Recursive case for t = T-2:
- a* = arg max_{a ∈ Γ(x_t)} {F(x_t, a) + next_payoff(T(x_t, a))} = arg max_{a ∈ Γ(x_t)} {F(x_t, a) + Ψ(x_{T-1} = T(x_t, a))}
- Repeat recursively for smaller t
16 Computationally efficient DP solution
- Instead of processing from t up to T, reverse the order:
- t = T-1: compute a*(x_t), Ψ(x_t) for all possible x_t
- t = T-2: a* = arg max_{a ∈ Γ(x_t)} {F(x_t, a) + Ψ(x_{T-1} = T(x_{T-2}, a))}
- t = T-3: a* = arg max_{a ∈ Γ(x_t)} {F(x_t, a) + Ψ(x_{T-2} = T(x_{T-3}, a))}
- General case: Bellman's optimality equations
- Each time step, store optimal actions and payoffs
- Lookup table (LUT) for Ψ instead of recomputing
- Can construct the sequence of optimal actions with the LUT
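The backward recursion above can be sketched in a few lines. This is a toy instance under stated assumptions: a made-up 5-state chain with left/right moves, and a per-step payoff taken as the reward of the next state, F(x, a) = reward[T(x, a)]; none of these numbers come from the course.

```python
import numpy as np

n_states, horizon = 5, 4
reward = np.array([0.0, 1.0, 0.0, 2.0, 0.0])   # made-up state rewards
actions = (-1, +1)                              # step left or right, clipped

def T(x, a):
    # Deterministic next state T(x, a), clipped to the state space.
    return min(max(x + a, 0), n_states - 1)

# Psi[t, x] = best total payoff from time t onward, starting in state x.
Psi = np.zeros((horizon + 1, n_states))         # Psi[horizon] = 0: no time left
best = np.zeros((horizon, n_states), dtype=int)
for t in range(horizon - 1, -1, -1):            # t = T-1 down to 0
    for x in range(n_states):
        values = [reward[T(x, a)] + Psi[t + 1][T(x, a)] for a in actions]
        best[t, x] = int(np.argmax(values))     # LUT of optimal actions
        Psi[t, x] = max(values)
```

From state 2 with 4 steps, oscillating between states 2 and 3 collects payoff 2 + 0 + 2 + 0 = 4, which the table `Psi[0][2]` reproduces.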
17 Why computationally efficient?
- Let's contrast computational complexities
- Brute force optimization: |Γ| actions per time step and T time steps
- Must evaluate |Γ|^T trajectories of actions: Θ(|Γ|^T)
- DP: compute F(x_t, a) + Ψ(x_{t+1} = T(x_t, a)) for |Γ| actions, T times: Θ(|Γ| T)
- Whereas brute force optimization is computationally intense, DP is fast
18 Variations
- Deterministic / random: the next state and payoff could be random
- Example: there could be more users than expected; adjust the server (action) to account for the future trajectory of the software
- Finite / infinite horizon: infinite horizon decision problems require a discount factor β < 1 to give future payoffs at time t weight β^t
- Payoffs in the far future matter less
- Discrete / continuous time
19 Example [Cormen et al.]: Rod cutting problem
- Have a rod of integer length n
- Have a table of prices p_i charged for length-i cuts
- Cutting is free
- Want to cut the rod into parts (or not cut at all) to maximize profit
20 Example continued
- Length n = 4
- Can charge prices p_1 = 1, p_2 = 5, p_3 = 8, p_4 = 9
- Could look at all possible sets of cuts
21 Example using DP
- Unrealistic to consider all cutting configurations for large n; use DP instead
- Basis: n = 1, Ψ(1) = p_1 = 1
- Recursion: n = 2, Ψ(2) = max{2Ψ(1), p_2} = 5
- n = 3, Ψ(3) = max{Ψ(1)+Ψ(2), p_3} = max{1+5, 8} = 8
- At each stage, maximize over Ψ(k)+Ψ(n-k) for k = 1, 2, ..., n-1; and for k = n use p_n
- Ex: Ψ(7) = max{Ψ(1)+Ψ(6), Ψ(2)+Ψ(5), Ψ(3)+Ψ(4), ..., p_7}
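The rod-cutting recursion is a few lines of bottom-up DP. A sketch using the slide's prices p_1..p_4 (extending to longer rods only needs a longer price table):

```python
prices = [0, 1, 5, 8, 9]                       # p_1..p_4 from the slides; index 0 unused
n = 4
Psi = [0] * (n + 1)                            # Psi[length] = best revenue for that length
for length in range(1, n + 1):
    best = prices[length]                      # option k = n: sell the piece whole, charge p_n
    for k in range(1, length):                 # option: split into lengths k and length-k
        best = max(best, Psi[k] + Psi[length - k])
    Psi[length] = best
```

For n = 4 the best revenue is 10 (two length-2 pieces at price 5 each), beating the uncut price p_4 = 9.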
22 Real-world application: Viterbi algorithm
- Decodes convolutional codes in CDMA
- Also used in speech recognition: the text is hidden, and (noisy) speech observations help estimate the text
- Relies on DP
- Finds the shortest path
23 Linear Programming Keywords: linear programming, simplex method
24 Formulation
- Canonical form: max_x c^T x s.t. Ax ≤ b, x ≥ 0
- Note: s.t. = subject to
- Matrix manipulations/tricks create variations: e.g., enforce Ax = b via Ax ≤ b and Ax ≥ b
- We've minimized ||x||_1 (instead of c^T x) in Application #2
25 What's it good for?
- Transportation: pose airline costs and revenue as a linear model, maximize profit (revenue minus costs) with LP
- Manufacturing: minimize costs by posing them as a linear model
- Common theme: many real-world problems are approximately linear, or can be linearized around a working point (Taylor series)
26 History
- Early formulations date back to the early 20th century (rudimentary forms even earlier)
- Dantzig invented the simplex method (solver) in the 1940s
- Polynomial average runtime; slow worst case
- Interior point methods are much faster in the worst case
27 Simplex algorithm
- Linear constraints Ax ≤ b, x ≥ 0 correspond to a convex polytope
- The linear function being optimized, c^T x, is optimal at a corner point of the polytope
- Simplex = outer shell of the convex polytope
- Start at some corner point (vertex)
- Examine neighboring vertices: either c^T x is already optimal, or it's better at a neighbor
- Move to the best neighboring vertex; iterate until done
- Specific steps correspond to linear algebra
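A small worked instance of the canonical form, solved with SciPy's `linprog` (the numbers are made up; `linprog` minimizes, so we negate c to maximize):

```python
import numpy as np
from scipy.optimize import linprog

# maximize 3 x1 + 2 x2  s.t.  x1 + x2 <= 4,  2 x1 + x2 <= 5,  x >= 0
c = np.array([3.0, 2.0])
A = np.array([[1.0, 1.0],
              [2.0, 1.0]])
b = np.array([4.0, 5.0])

res = linprog(-c, A_ub=A, b_ub=b, bounds=(0, None))  # negate c to maximize
x_opt = res.x
```

The optimum lands on the vertex where the two constraints intersect, x = (1, 3) with objective value 9, illustrating the slide's point that the optimum sits at a corner of the polytope.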
28 Convex Optimization Keywords: convex optimization
29 What are convex/concave functions?
- Consider a real-valued function defined on a convex set X, f: X → R
- Convex: f(λx + (1-λ)y) ≤ λf(x) + (1-λ)f(y), ∀ x, y ∈ X, λ ∈ (0,1)
- Concave: same condition with the reverse inequality (≥)
- Note: f is convex if and only if -f is concave; convex/concave imply non-negative/non-positive second derivatives
- Any local minimum of a convex function is a global minimum
30 What is convex optimization?
- Basic convex problem: x* = arg min_{x ∈ X} f(x)
- Set X and function f(x) must both be convex
- Alternate form: min f(x) s.t. g_i(x) ≤ 0, ∀i
- Functions f, g_1, ..., g_m all convex
31 Applications (Why is this interesting?)
- Many problems can be posed as convex:
- Least squares
- Entropy maximization
- Linear programming
32 Newton's method
- Newton's method finds roots of equations, f(x) = 0
- Here, apply it to the derivative, f'(x) = 0, or gradient = 0
- Taylor expansion: f(x_t + Δx) ≈ f(x_t) + f'(x_t)Δx + ½ f''(x_t)Δx²
- Root of the derivative: f'(x_t) + f''(x_t)Δx = 0
- Iterate with x_{t+1} = x_t + Δx
- Newton's method is simple; near a well-behaved optimum it converges quadratically
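The iteration x_{t+1} = x_t - f'(x_t)/f''(x_t) is a one-liner. A sketch on an assumed example function f(x) = x - log x, whose minimizer is x = 1 (f'(x) = 1 - 1/x, f''(x) = 1/x²):

```python
def newton_min(fp, fpp, x, iters=20):
    # Newton's method for minimization: find a root of f'(x) = 0
    # by iterating x <- x - f'(x)/f''(x).
    for _ in range(iters):
        x = x - fp(x) / fpp(x)
    return x

# f(x) = x - log(x): fp = 1 - 1/x, fpp = 1/x^2, minimizer at x = 1
x_star = newton_min(lambda x: 1 - 1 / x, lambda x: 1 / x ** 2, x=0.5)
```

On this example the update simplifies to x ← 2x - x², so the error roughly squares each iteration, illustrating the quadratic convergence near the optimum.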
33 Second order methods
- Challenge: the first-order approximation to the derivative slows down Newton's method
- Solution: use a higher-order approximation
- Instead of f'(x_t) + f''(x_t)Δx, use the third derivative too
- Multi-dimensional function? Use gradients and Hessians
- Second-order methods are more complicated but faster
34 Gradient descent Keywords: gradient descent, line search, golden section search
35 Gradient descent
- In each iteration, select a direction to pursue
- Coordinate descent: move along one of the coordinates
- Gradient descent: move in the direction that decreases the cost function fastest
- How far should we move along that direction?
- Undershooting or overshooting is bad for convergence
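A minimal sketch of gradient descent with a fixed step size, on a made-up least-squares cost f(x) = ||Ax - b||² with gradient 2 Aᵀ(Ax - b) (the matrix, target, and step size below are illustrative assumptions):

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 1.0]])
b = np.array([2.0, 3.0])

x = np.zeros(2)
step = 0.1                                  # fixed step; line search would tune this
for _ in range(500):
    grad = 2 * A.T @ (A @ x - b)            # gradient of ||Ax - b||^2
    x = x - step * grad                     # move against the gradient
```

The iterate converges to the minimizer x = (1, 3) where Ax = b; a step size that is too large would overshoot and diverge, while one that is too small would crawl, which is the slide's point about under/overshooting.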
36 Line search
- Key sub-method: move along the chosen direction just enough to minimize the function along that line
- Line search = optimization along a line
- Many variations: binary search, golden section search
- Let's make up an example for this and code it!
- Check the course webpage for a Matlab script
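This is not the course's Matlab script; it is a sketch of golden section search on an assumed unimodal example, shrinking a bracket [a, b] around the minimizer by the inverse golden ratio each iteration:

```python
import math

def golden_section(f, a, b, tol=1e-8):
    # Golden section search for the minimizer of a unimodal f on [a, b].
    inv_phi = (math.sqrt(5) - 1) / 2          # 1/phi ≈ 0.618
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):                        # minimizer lies in [a, d]
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:                                  # minimizer lies in [c, b]
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2

x_min = golden_section(lambda x: (x - 2.0) ** 2 + 1.0, 0.0, 5.0)
```

The golden ratio spacing lets each iteration reuse one of the two interior evaluations, so only one new function value is needed per step.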
37 Integer Programming Keywords: integer programming, integer linear programs, relaxation
38 What is integer programming?
- Integer program = optimization problem where some/all variables must be integers
- Integer linear programs (ILP): x* = arg max_{x ∈ Z^n} c^T x s.t. Ax + s = b, s ≥ 0
- Slack variables s convert the inequalities to equalities
39 Example: Support set detection
- y = Ax + z, sparse x
- Want to identify the support set where x ≠ 0
- Can we do perfect support set detection?
- Are there tiny non-zeros? (yes → difficult)
- What's the SNR? (low → difficult)
40 Example continued
- Support set detection, y = Ax + z, want the support set
- Algorithm:
- Consider a candidate support set, s ∈ {0,1}^N
- Create matrix A_s that contains column i iff s_i = 1
- Run least squares using A_s (find a low-energy solution to y = A_s x)
- Iterate over all s, select the solution with the smallest residual
- Algorithm is optimal & slow
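The exhaustive algorithm above can be sketched directly; the dimensions, seed, and noiseless setting below are illustrative assumptions chosen so the search is feasible:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
M, N, K = 8, 10, 2
A = rng.standard_normal((M, N))
x = np.zeros(N)
x[[2, 7]] = [1.0, -2.0]                      # made-up 2-sparse signal
y = A @ x                                    # noiseless for illustration

best_support, best_resid = None, np.inf
for support in itertools.combinations(range(N), K):
    As = A[:, list(support)]                 # columns of the candidate support
    coef, _, _, _ = np.linalg.lstsq(As, y, rcond=None)
    resid = np.linalg.norm(y - As @ coef)    # least-squares residual
    if resid < best_resid:
        best_support, best_resid = support, resid
```

Even this tiny instance checks C(10, 2) = 45 supports; the combinatorial growth of C(N, K) is exactly why the optimal algorithm is slow.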
41 More about ILP
- Integer linear programs can be shown to be NP-hard
- This means they are slow
- Another algorithmic approach: relaxation
- First ignore the integer constraints, solve a standard LP
- Next, round (sort of!) to a nearby (not necessarily nearest) integer solution
- Various applications require integer solutions; we're just skimming the surface
42 Non-Convex Optimization Keywords: non-convex optimization
43 What's the challenge?
- Many functions are non-convex
- Convex: one local min (it's the global min)
- Non-convex: a local min need not be a global min
- Various algorithms could get stuck in a local min
44 Is it hopeless?
- Maybe initialize an algorithm many different ways; runs could get stuck in different local mins, choose the best
- But there could be tons of local mins (especially in higher dimensions)
45 Markov chain Monte Carlo
- Markov chain Monte Carlo (MCMC) can solve some non-convex problems
- Form an expression E(x) for energy (analogous to statistical physics)
- Distribution for the signal: Pr(x) = (1/Z) exp{-sE(x)}
- s is analogous to inverse temperature; Z is a normalization term
- Sample the next version of x from this Gibbs distribution
- High temperature → small s → weak pull toward low energy
- Low temperature → large s → strong pull toward low energy
- Gradual cooling
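A minimal sketch of this cooling scheme, using Metropolis accept/reject steps to sample from exp{-sE(x)} while s slowly grows. The double-well energy, schedule, and proposal scale are made-up illustrations; the chain starts in the wrong (local) well and the cooling lets it escape to the global one:

```python
import math
import random

def energy(x):
    # Made-up double-well energy: local min near x = +1,
    # global min (the only region with E < 0) near x = -1.
    return (x * x - 1.0) ** 2 + 0.3 * x

random.seed(0)
x = 1.0                                    # start in the local (wrong) well
best_x = x
n_steps = 20000
for t in range(n_steps):
    s = 0.1 + 5.0 * t / n_steps            # inverse temperature: gradual cooling
    proposal = x + random.gauss(0.0, 1.0)
    # Metropolis rule for the Gibbs distribution exp(-s * E(x))
    accept_prob = math.exp(min(0.0, -s * (energy(proposal) - energy(x))))
    if random.random() < accept_prob:
        x = proposal
    if energy(x) < energy(best_x):
        best_x = x
```

Early on (small s) uphill moves are often accepted, so the chain crosses the barrier between wells; late in the run (large s) it is pulled tightly toward low energy.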
46 MCMC continued
- 1) Do we sample the entire sequence x? Not necessarily. Can consider re-generating one x_i at a time; only need the marginal distribution for x_i
- 2) Is MCMC guaranteed to converge to the global min? Maybe, if you cool down very slowly
- 3) So is it any good? Depends. MCMC is very slow but can converge to the global min; some techniques accelerate it
47 EM Algorithm Keywords: expectation maximization algorithm, Gaussian mixture models, latent variables
48 Main ideas
- Iterate over expectation (E) & maximization (M)
- Expectation: create a function for computing the expected log likelihood (based on current parameters)
- Maximization: update parameters to maximize the expected log likelihood from the E step
- Details coming up
49 Statistical model & motivation
- Model generates data X
- Z: latent / missing values; θ: parameter
- Likelihood function: L(θ; X, Z) = Pr(X, Z | θ)
- Marginal likelihood: L(θ; X) = Pr(X | θ) = ∫ L(θ; X, Z) dZ
- Might be intractable (e.g., due to many possible Z sequences)
- Want to compute L(θ; X), then optimize the parameter: computationally intractable
- Motivates EM
50 Statistical model & motivation
- Expectation: compute the expected value of the log likelihood for parameter θ^(t) in current iteration t:
- Q(θ | θ^(t)) = E_{Z | X, θ^(t)} [log L(θ; X, Z)]
- Z is typically a discrete latent variable
- Given parameter θ^(t), the sequence Z can be found, typically via a fast algorithm, e.g., dynamic programming
- Maximization: θ^(t+1) = arg max_θ Q(θ | θ^(t))
51 Example: Gaussian mixture models
- What's a Gaussian mixture model (GMM)? X ~ Σ_i α_i N(μ_i, σ_i)
- Component i has probability α_i, mean μ_i, standard deviation σ_i
- Could be multi-dimensional data → covariance matrix Σ_i
- Useful? Many distributions are well-approximated by a GMM
- In principle can model almost everything as a GMM
- Trade-off between # of components and model accuracy
52 Example continued
- Challenge: parameters (α_i, μ_i, σ_i) are often unavailable
- Must estimate them from the data X
- To keep it simple: N scalar samples, X ∈ R^N
- Latent variable Z ∈ Z^N; z_n corresponds to the Gaussian component that x_n belongs to
- E step: compute the distribution of Z given parameters θ = (α_i, μ_i, σ_i)
- M step: optimize θ given Z
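A minimal EM sketch for a two-component scalar GMM, assuming made-up synthetic data and initial parameters. The E step computes responsibilities Pr(z_n = i | x_n, θ); the M step re-estimates (α_i, μ_i, σ_i) as weighted sample statistics:

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up data: 30% from N(-2, 0.5^2), 70% from N(3, 1^2)
x = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(3.0, 1.0, 700)])

alpha = np.array([0.5, 0.5])                 # initial guesses (assumptions)
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])
for _ in range(100):
    # E step: responsibilities r[n, i] ∝ alpha_i * N(x_n; mu_i, sigma_i)
    dens = (alpha / (sigma * np.sqrt(2 * np.pi))
            * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2))
    r = dens / dens.sum(axis=1, keepdims=True)
    # M step: weighted sample statistics
    Nk = r.sum(axis=0)
    alpha = Nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / Nk
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)
```

With well-separated components like these, the estimates settle close to the generating means, weights, and spreads within a few dozen iterations.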
More information11.1 Facility Location
CS787: Advanced Algorithms Scribe: Amanda Burton, Leah Kluegel Lecturer: Shuchi Chawla Topic: Facility Location ctd., Linear Programming Date: October 8, 2007 Today we conclude the discussion of local
More informationChapter II. Linear Programming
1 Chapter II Linear Programming 1. Introduction 2. Simplex Method 3. Duality Theory 4. Optimality Conditions 5. Applications (QP & SLP) 6. Sensitivity Analysis 7. Interior Point Methods 1 INTRODUCTION
More informationCS 125 Section #10 Midterm 2 Review 11/5/14
CS 125 Section #10 Midterm 2 Review 11/5/14 1 Topics Covered This midterm covers up through NP-completeness; countability, decidability, and recognizability will not appear on this midterm. Disclaimer:
More informationMassachusetts Institute of Technology Department of Electrical Engineering and Computer Science Algorithms for Inference Fall 2014
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms for Inference Fall 2014 1 Course Overview This course is about performing inference in complex
More informationECE521: Week 11, Lecture March 2017: HMM learning/inference. With thanks to Russ Salakhutdinov
ECE521: Week 11, Lecture 20 27 March 2017: HMM learning/inference With thanks to Russ Salakhutdinov Examples of other perspectives Murphy 17.4 End of Russell & Norvig 15.2 (Artificial Intelligence: A Modern
More informationSupport Vector Machines.
Support Vector Machines srihari@buffalo.edu SVM Discussion Overview 1. Overview of SVMs 2. Margin Geometry 3. SVM Optimization 4. Overlapping Distributions 5. Relationship to Logistic Regression 6. Dealing
More informationGeneral Purpose Methods for Combinatorial Optimization
General Purpose Methods for Combinatorial Optimization 0/7/00 Maximum Contiguous Sum 3-4 9 6-3 8 97-93 -3 84 Σ = 87 Given:... N Z, at least one i > 0 ind i, j such that j k k = i is maximal 0/7/00 0/7/00
More informationUnit.9 Integer Programming
Unit.9 Integer Programming Xiaoxi Li EMS & IAS, Wuhan University Dec. 22-29, 2016 (revised) Operations Research (Li, X.) Unit.9 Integer Programming Dec. 22-29, 2016 (revised) 1 / 58 Organization of this
More informationLecture 2 The k-means clustering problem
CSE 29: Unsupervised learning Spring 2008 Lecture 2 The -means clustering problem 2. The -means cost function Last time we saw the -center problem, in which the input is a set S of data points and the
More informationPlanning and Control: Markov Decision Processes
CSE-571 AI-based Mobile Robotics Planning and Control: Markov Decision Processes Planning Static vs. Dynamic Predictable vs. Unpredictable Fully vs. Partially Observable Perfect vs. Noisy Environment What
More informationEllipsoid Algorithm :Algorithms in the Real World. Ellipsoid Algorithm. Reduction from general case
Ellipsoid Algorithm 15-853:Algorithms in the Real World Linear and Integer Programming II Ellipsoid algorithm Interior point methods First polynomial-time algorithm for linear programming (Khachian 79)
More informationInteger Programming Theory
Integer Programming Theory Laura Galli October 24, 2016 In the following we assume all functions are linear, hence we often drop the term linear. In discrete optimization, we seek to find a solution x
More informationLecture 19: November 5
0-725/36-725: Convex Optimization Fall 205 Lecturer: Ryan Tibshirani Lecture 9: November 5 Scribes: Hyun Ah Song Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These notes have not
More informationIntroduction to Mobile Robotics
Introduction to Mobile Robotics Clustering Wolfram Burgard Cyrill Stachniss Giorgio Grisetti Maren Bennewitz Christian Plagemann Clustering (1) Common technique for statistical data analysis (machine learning,
More information/ Approximation Algorithms Lecturer: Michael Dinitz Topic: Linear Programming Date: 2/24/15 Scribe: Runze Tang
600.469 / 600.669 Approximation Algorithms Lecturer: Michael Dinitz Topic: Linear Programming Date: 2/24/15 Scribe: Runze Tang 9.1 Linear Programming Suppose we are trying to approximate a minimization
More informationOutline. Column Generation: Cutting Stock A very applied method. Introduction to Column Generation. Given an LP problem
Column Generation: Cutting Stock A very applied method thst@man.dtu.dk Outline History The Simplex algorithm (re-visited) Column Generation as an extension of the Simplex algorithm A simple example! DTU-Management
More informationColumn Generation: Cutting Stock
Column Generation: Cutting Stock A very applied method thst@man.dtu.dk DTU-Management Technical University of Denmark 1 Outline History The Simplex algorithm (re-visited) Column Generation as an extension
More informationAlgorithms. Ch.15 Dynamic Programming
Algorithms Ch.15 Dynamic Programming Dynamic Programming Not a specific algorithm, but a technique (like divide-and-conquer). Developed back in the day when programming meant tabular method (like linear
More informationComputer Vision Group Prof. Daniel Cremers. 4a. Inference in Graphical Models
Group Prof. Daniel Cremers 4a. Inference in Graphical Models Inference on a Chain (Rep.) The first values of µ α and µ β are: The partition function can be computed at any node: Overall, we have O(NK 2
More informationOptimization and least squares. Prof. Noah Snavely CS1114
Optimization and least squares Prof. Noah Snavely CS1114 http://cs1114.cs.cornell.edu Administrivia A5 Part 1 due tomorrow by 5pm (please sign up for a demo slot) Part 2 will be due in two weeks (4/17)
More informationCPSC 340: Machine Learning and Data Mining. More Linear Classifiers Fall 2017
CPSC 340: Machine Learning and Data Mining More Linear Classifiers Fall 2017 Admin Assignment 3: Due Friday of next week. Midterm: Can view your exam during instructor office hours next week, or after
More informationContents. I Basics 1. Copyright by SIAM. Unauthorized reproduction of this article is prohibited.
page v Preface xiii I Basics 1 1 Optimization Models 3 1.1 Introduction... 3 1.2 Optimization: An Informal Introduction... 4 1.3 Linear Equations... 7 1.4 Linear Optimization... 10 Exercises... 12 1.5
More informationNotes for Lecture 18
U.C. Berkeley CS17: Intro to CS Theory Handout N18 Professor Luca Trevisan November 6, 21 Notes for Lecture 18 1 Algorithms for Linear Programming Linear programming was first solved by the simplex method
More informationMVE165/MMG630, Applied Optimization Lecture 8 Integer linear programming algorithms. Ann-Brith Strömberg
MVE165/MMG630, Integer linear programming algorithms Ann-Brith Strömberg 2009 04 15 Methods for ILP: Overview (Ch. 14.1) Enumeration Implicit enumeration: Branch and bound Relaxations Decomposition methods:
More informationD-Separation. b) the arrows meet head-to-head at the node, and neither the node, nor any of its descendants, are in the set C.
D-Separation Say: A, B, and C are non-intersecting subsets of nodes in a directed graph. A path from A to B is blocked by C if it contains a node such that either a) the arrows on the path meet either
More informationPrimal Dual Schema Approach to the Labeling Problem with Applications to TSP
1 Primal Dual Schema Approach to the Labeling Problem with Applications to TSP Colin Brown, Simon Fraser University Instructor: Ramesh Krishnamurti The Metric Labeling Problem has many applications, especially
More information1. Lecture notes on bipartite matching
Massachusetts Institute of Technology 18.453: Combinatorial Optimization Michel X. Goemans February 5, 2017 1. Lecture notes on bipartite matching Matching problems are among the fundamental problems in
More informationNOTATION AND TERMINOLOGY
15.053x, Optimization Methods in Business Analytics Fall, 2016 October 4, 2016 A glossary of notation and terms used in 15.053x Weeks 1, 2, 3, 4 and 5. (The most recent week's terms are in blue). NOTATION
More information