An Introduction to LP Relaxations for MAP Inference


1 An Introduction to LP Relaxations for MAP Inference Adrian Weller MLSALT4 Lecture Feb 27, 2017 With thanks to David Sontag (NYU) for use of some of his slides and illustrations For more information, see 1 / 39

2 High level overview of our 3 lectures 1. Directed and undirected graphical models (last Friday) 2. LP relaxations for MAP inference (today) 3. Junction tree algorithm for exact inference, belief propagation, variational methods for approximate inference (this Friday) Further reading / viewing: Murphy, Machine Learning: a Probabilistic Perspective Barber, Bayesian Reasoning and Machine Learning Bishop, Pattern Recognition and Machine Learning Koller and Friedman, Probabilistic Graphical Models Wainwright and Jordan, Graphical Models, Exponential Families, and Variational Inference 2 / 39

3 Example of MAP inference: image denoising. Inference combines prior beliefs with observed evidence to form a prediction. 3 / 39

4 Example of MAP inference: protein side-chain placement. Given a fixed 3D carbon backbone structure, find the minimum energy configuration of amino acid side-chains, i.e. choose the side-chain orientations giving the most stable folding (Yanover, Meltzer, Weiss 06). Orientations of the side-chains are represented by discretized angles called rotamers, so each side-chain corresponds to a discrete variable X_i. Rotamer choices for nearby amino acids are energetically coupled (attractive and repulsive forces), giving a potential function, a table θ_ij(x_i, x_j), for each edge, e.g. θ_12(x_1, x_2), θ_13(x_1, x_3), θ_34(x_3, x_4). The joint distribution over the variables is given by these potentials, normalized by the partition function. Key problems: find marginals; find the most likely assignment (MAP), the focus of this talk. 4 / 39

5 Outline of talk. Background on undirected graphical models; basic LP relaxation; tighter relaxations; message passing and dual decomposition. Throughout, we'll comment on when an LP relaxation is tight. 5 / 39

6 Background: undirected graphical models Powerful way to represent relationships across variables Many applications including: computer vision, social network analysis, deep belief networks, protein folding... In this talk, focus on pairwise models with discrete variables (sometimes binary) Example: Grid for computer vision 6 / 39

7 Background: undirected graphical models. Discrete variables X_1, ..., X_n with X_i ∈ {0, ..., k_i − 1}. Potential functions, which we will (somehow) write as a vector θ. Write x = (..., x_1, ..., x_n, ...) for one overcomplete configuration of all variables, and θ·x for its total score. The probability distribution is given by p(x) = (1/Z) exp(θ·x). To ensure probabilities sum to 1, we need the normalizing constant or partition function Z = Σ_x exp(θ·x). We are interested in maximum a posteriori (MAP) inference, i.e., finding a global configuration with highest probability: x* ∈ argmax_x p(x) = argmax_x θ·x. 7 / 39
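To make this concrete, here is a minimal Python sketch (the tiny chain model, its potential tables, and all variable names are illustrative assumptions, not from the lecture) that scores every configuration by brute force, computes the partition function Z, and reads off a MAP assignment:

```python
import itertools
import numpy as np

# Illustrative toy model (an assumption): 3 binary variables on a chain 0 - 1 - 2.
V = [0, 1, 2]
E = [(0, 1), (1, 2)]
theta_i = {0: np.array([0.0, 0.5]), 1: np.array([0.0, 0.0]), 2: np.array([0.2, 0.0])}
theta_ij = {e: np.array([[1.0, 0.0], [0.0, 1.0]]) for e in E}   # each edge rewards agreement

def score(x):
    """Total score theta . x: sum of unary and pairwise potentials for configuration x."""
    return (sum(theta_i[i][x[i]] for i in V)
            + sum(theta_ij[(i, j)][x[i], x[j]] for (i, j) in E))

configs = list(itertools.product([0, 1], repeat=len(V)))
scores = np.array([score(x) for x in configs])

Z = np.exp(scores).sum()                    # partition function Z = sum_x exp(theta . x)
probs = np.exp(scores) / Z                  # p(x) = exp(theta . x) / Z
x_map = configs[int(np.argmax(scores))]     # MAP: argmax_x p(x) = argmax_x theta . x
print("Z =", Z, "MAP assignment:", x_map, "MAP score:", scores.max())
```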

8 Background: how do we write potentials as a vector θ? θ·x means the total score of a configuration x, where we sum over all potential functions. If we have potential functions θ_c over some subsets c ∈ C of variables, then we want Σ_{c ∈ C} θ_c(x_c), where x_c means a configuration of just the variables in the subset c. θ_c(x_c) provides a measure of local compatibility, a table of values. If we only have some unary/singleton potentials θ_i and edge/pairwise potentials θ_ij, then we can write the total score as Σ_i θ_i(x_i) + Σ_{(i,j)} θ_ij(x_i, x_j). Indices? We usually assume either no unary potentials (absorb them into edges) or one for every variable, leading to a graph topology (V, E) with total score Σ_{i ∈ V={1,...,n}} θ_i(x_i) + Σ_{(i,j) ∈ E} θ_ij(x_i, x_j). 8 / 39

10 Background: overcomplete representation. The overcomplete representation conveniently allows us to write θ·x = Σ_{i ∈ V} θ_i(x_i) + Σ_{(i,j) ∈ E} θ_ij(x_i, x_j). Concatenate singleton and edge terms into big vectors: for binary variables, θ = ( ..., θ_i(0), θ_i(1), ..., θ_ij(0,0), θ_ij(0,1), θ_ij(1,0), θ_ij(1,1), ... ) and x = ( ..., 1[X_i = 0], 1[X_i = 1], ..., 1[X_i = 0, X_j = 0], 1[X_i = 0, X_j = 1], 1[X_i = 1, X_j = 0], 1[X_i = 1, X_j = 1], ... ). There are many possible values of x; how many? Π_{i ∈ V} k_i. 9 / 39
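Continuing the same illustrative toy model (again an assumption, not the lecture's example), a short sketch that concatenates the potential tables into a single vector θ, builds the overcomplete indicator vector for one configuration, and checks that the dot product θ·x recovers the total score:

```python
import numpy as np

# Same assumed chain model as before: n = 3 binary variables, m = 2 edges.
V = [0, 1, 2]
E = [(0, 1), (1, 2)]
theta_i = {0: np.array([0.0, 0.5]), 1: np.array([0.0, 0.0]), 2: np.array([0.2, 0.0])}
theta_ij = {e: np.array([[1.0, 0.0], [0.0, 1.0]]) for e in E}

# Concatenate singleton and edge tables into one big parameter vector theta (length 2n + 4m).
theta = np.concatenate([theta_i[i] for i in V] + [theta_ij[e].ravel() for e in E])

def overcomplete(x):
    """Indicator vector with entries 1[X_i = x_i] and 1[X_i = x_i, X_j = x_j]."""
    node_part = [np.eye(2)[x[i]] for i in V]
    edge_part = [np.outer(np.eye(2)[x[i]], np.eye(2)[x[j]]).ravel() for (i, j) in E]
    return np.concatenate(node_part + edge_part)

x = (1, 1, 0)
direct = (sum(theta_i[i][x[i]] for i in V)
          + sum(theta_ij[(i, j)][x[i], x[j]] for (i, j) in E))
assert np.isclose(theta @ overcomplete(x), direct)   # theta . x equals the summed potentials
print("dimension of theta and x:", theta.size)       # 2n + 4m = 6 + 8 = 14
```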

12 Background: Binary pairwise models. θ·x is the score of a configuration x. The probability distribution is given by p(x) = (1/Z) exp(θ·x). For MAP inference, we want x* ∈ argmax_x p(x) = argmax_x θ·x. We want to optimize over the {0, 1} coordinates of the overcomplete configuration space corresponding to all 2^n possible settings. The convex hull of these defines the marginal polytope M. Each point µ ∈ M corresponds to a probability distribution over the 2^n configurations, giving a vector of marginals. 10 / 39

13 Background: the marginal polytope (all valid marginals). [Figure 2-1, from Sontag's PhD thesis: illustration of the marginal polytope (Wainwright & Jordan, 03) for a Markov random field with three binary nodes X_1, X_2, X_3. Each vertex is the overcomplete indicator vector of a complete assignment, stacking the assignments for X_1, X_2, X_3 and the edge assignments for X_1X_2, X_1X_3, X_2X_3; any convex combination of vertices, e.g. µ = ½ µ′ + ½ µ″, gives valid marginal probabilities.] 11 / 39

14 Background: overcomplete and minimal representations. The overcomplete representation is highly redundant, e.g. µ_i(0) + µ_i(1) = 1 for all i. How many dimensions if we have n binary variables with m edges? 2n + 4m. Instead, we sometimes pick a minimal representation. What's the minimum number of dimensions we need? n + m. For example, we could use q = (q_1, ..., q_n, ..., q_ij, ...) where q_i = µ_i(1) for all i and q_ij = µ_ij(1, 1) for all (i, j); then µ_i = (1 − q_i, q_i), µ_j = (1 − q_j, q_j), and µ_ij = [ 1 + q_ij − q_i − q_j, q_j − q_ij ; q_i − q_ij, q_ij ]. Note there are many other possible minimal representations. 12 / 39
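A quick sketch of this conversion for a single edge (the numbers and the helper name tables_from_minimal are arbitrary illustrations): given q_i, q_j, q_ij it rebuilds the marginal tables and checks they are normalized and locally consistent:

```python
import numpy as np

def tables_from_minimal(q_i, q_j, q_ij):
    """Rebuild overcomplete marginals from the minimal representation (binary case)."""
    mu_i = np.array([1 - q_i, q_i])
    mu_j = np.array([1 - q_j, q_j])
    mu_ij = np.array([[1 + q_ij - q_i - q_j, q_j - q_ij],
                      [q_i - q_ij,           q_ij]])
    return mu_i, mu_j, mu_ij

mu_i, mu_j, mu_ij = tables_from_minimal(q_i=0.7, q_j=0.4, q_ij=0.3)
assert np.isclose(mu_ij.sum(), 1.0)            # edge table is normalized
assert np.allclose(mu_ij.sum(axis=1), mu_i)    # marginalizing out x_j recovers mu_i
assert np.allclose(mu_ij.sum(axis=0), mu_j)    # marginalizing out x_i recovers mu_j
print(mu_ij)
```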

17 LP relaxation: MAP as an integer linear program (ILP). MAP inference as a discrete optimization problem is to identify a configuration with maximum total score:
x* ∈ argmax_x Σ_{i ∈ V} θ_i(x_i) + Σ_{ij ∈ E} θ_ij(x_i, x_j) = argmax_x θ·x.
Equivalently, in terms of marginals µ:
µ* ∈ argmax_µ Σ_{i ∈ V} Σ_{x_i} θ_i(x_i) µ_i(x_i) + Σ_{ij ∈ E} Σ_{x_i, x_j} θ_ij(x_i, x_j) µ_ij(x_i, x_j) = argmax_µ θ·µ, s.t. µ is integral.
Any other constraints? 13 / 39

18 What are the constraints? Force every cluster of variables to choose a local assignment:
µ_i(x_i) ∈ {0, 1} ∀ i ∈ V, x_i, and Σ_{x_i} µ_i(x_i) = 1 ∀ i ∈ V;
µ_ij(x_i, x_j) ∈ {0, 1} ∀ ij ∈ E, x_i, x_j, and Σ_{x_i, x_j} µ_ij(x_i, x_j) = 1 ∀ ij ∈ E.
Enforce that these assignments are consistent:
µ_i(x_i) = Σ_{x_j} µ_ij(x_i, x_j) ∀ ij ∈ E, x_i;
µ_j(x_j) = Σ_{x_i} µ_ij(x_i, x_j) ∀ ij ∈ E, x_j. 14 / 39

19 MAP as an integer linear program (ILP).
MAP(θ) = max_µ Σ_{i ∈ V} Σ_{x_i} θ_i(x_i) µ_i(x_i) + Σ_{ij ∈ E} Σ_{x_i, x_j} θ_ij(x_i, x_j) µ_ij(x_i, x_j) = max_µ θ·µ
subject to:
µ_i(x_i) ∈ {0, 1} ∀ i ∈ V, x_i (edge terms?);
Σ_{x_i} µ_i(x_i) = 1 ∀ i ∈ V;
µ_i(x_i) = Σ_{x_j} µ_ij(x_i, x_j) ∀ ij ∈ E, x_i;
µ_j(x_j) = Σ_{x_i} µ_ij(x_i, x_j) ∀ ij ∈ E, x_j.
There are many good off-the-shelf solvers, such as CPLEX and Gurobi. 15 / 39

20 Linear programming (LP) relaxation for MAP. The integer linear program was: MAP(θ) = max_µ θ·µ subject to
µ_i(x_i) ∈ {0, 1} ∀ i ∈ V, x_i;
Σ_{x_i} µ_i(x_i) = 1 ∀ i ∈ V;
µ_i(x_i) = Σ_{x_j} µ_ij(x_i, x_j) ∀ ij ∈ E, x_i;
µ_j(x_j) = Σ_{x_i} µ_ij(x_i, x_j) ∀ ij ∈ E, x_j.
Now relax the integrality constraints, allowing variables to lie between 0 and 1: µ_i(x_i) ∈ [0, 1] ∀ i ∈ V, x_i. 16 / 39

21 Basic LP relaxation for MAP.
LP(θ) = max_µ θ·µ s.t.
µ_i(x_i) ∈ [0, 1] ∀ i ∈ V, x_i;
Σ_{x_i} µ_i(x_i) = 1 ∀ i ∈ V;
µ_i(x_i) = Σ_{x_j} µ_ij(x_i, x_j) ∀ ij ∈ E, x_i;
µ_j(x_j) = Σ_{x_i} µ_ij(x_i, x_j) ∀ ij ∈ E, x_j.
Linear programs can be solved efficiently: simplex, interior point, ellipsoid algorithm. Since the LP relaxation maximizes over a larger set, its value can only be higher: MAP(θ) ≤ LP(θ). 17 / 39
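As a rough illustration of what solving this LP looks like in practice (the frustrated three-cycle model, the indexing helpers, and the use of scipy.optimize.linprog are all assumptions for this sketch, not the lecture's setup), the following builds the local-polytope constraints explicitly and solves the relaxation; on this model the LP value exceeds the best integer value and the optimal node marginals are fractional:

```python
import numpy as np
from scipy.optimize import linprog

# Assumed toy model: a frustrated 3-cycle of binary variables, where every edge prefers
# its endpoints to disagree.  On this model the pairwise (local polytope) LP is fractional.
n = 3
edges = [(0, 1), (1, 2), (0, 2)]
theta_i = {i: np.zeros(2) for i in range(n)}                         # no unary potentials
theta_ij = {e: np.array([[0.0, 1.0], [1.0, 0.0]]) for e in edges}    # reward disagreement

def node_idx(i, xi):
    return 2 * i + xi                       # position of mu_i(x_i) in the mu vector

def edge_idx(k, xi, xj):
    return 2 * n + 4 * k + 2 * xi + xj      # position of mu_ij(x_i, x_j) for edge number k

num_vars = 2 * n + 4 * len(edges)
c = np.zeros(num_vars)                      # linprog minimizes, so use the negated scores
for i in range(n):
    for xi in range(2):
        c[node_idx(i, xi)] = -theta_i[i][xi]
for k, (i, j) in enumerate(edges):
    for xi in range(2):
        for xj in range(2):
            c[edge_idx(k, xi, xj)] = -theta_ij[(i, j)][xi, xj]

A_eq, b_eq = [], []
for i in range(n):                          # normalization: sum_{x_i} mu_i(x_i) = 1
    row = np.zeros(num_vars)
    row[[node_idx(i, 0), node_idx(i, 1)]] = 1.0
    A_eq.append(row); b_eq.append(1.0)
for k, (i, j) in enumerate(edges):          # local consistency with both endpoints
    for xi in range(2):                     # mu_i(x_i) = sum_{x_j} mu_ij(x_i, x_j)
        row = np.zeros(num_vars)
        row[node_idx(i, xi)] = 1.0
        row[[edge_idx(k, xi, 0), edge_idx(k, xi, 1)]] = -1.0
        A_eq.append(row); b_eq.append(0.0)
    for xj in range(2):                     # mu_j(x_j) = sum_{x_i} mu_ij(x_i, x_j)
        row = np.zeros(num_vars)
        row[node_idx(j, xj)] = 1.0
        row[[edge_idx(k, 0, xj), edge_idx(k, 1, xj)]] = -1.0
        A_eq.append(row); b_eq.append(0.0)

res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq), bounds=(0, 1), method="highs")
print("LP value:", -res.fun)                # 3.0, while the best integer assignment scores 2.0
print("node marginals:", res.x[:2 * n])     # all 0.5: a fractional vertex of the local polytope
```

Feeding the same matrices to an ILP solver with integrality constraints (e.g. CPLEX or Gurobi, as on the previous slide) would instead recover the integer optimum of 2.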

22 The local polytope.
LP(θ) = max_µ θ·µ s.t.
µ_i(x_i) ∈ [0, 1] ∀ i ∈ V, x_i;
Σ_{x_i} µ_i(x_i) = 1 ∀ i ∈ V;
µ_i(x_i) = Σ_{x_j} µ_ij(x_i, x_j) ∀ ij ∈ E, x_i;
µ_j(x_j) = Σ_{x_i} µ_ij(x_i, x_j) ∀ ij ∈ E, x_j.
All these constraints are linear, hence they define a polytope in the space of marginals. Here we enforced only local (pairwise) consistency, which defines the local polytope. If instead we had optimized over the marginal polytope, which enforces global consistency, then we would have MAP(θ) = LP(θ), i.e. the LP is tight. Why? And why don't we do this? 18 / 39

23 Tighter relaxations of the marginal polytope. Enforcing consistency of pairs of variables leads to the local polytope L_2. The marginal polytope enforces consistency over all variables: M = L_n. It is natural to consider the Sherali-Adams hierarchy of successively tighter relaxations L_r, 2 ≤ r ≤ n, which enforce consistency over clusters of r variables. Just up from the local polytope is the triplet polytope TRI = L_3. 19 / 39

24 Stylized illustration of polytopes. [Figure: nested polytopes, from the marginal polytope M = L_n (global consistency), through the triplet polytope L_3 (triplet consistency), down to the local polytope L_2 (pair consistency); tighter polytopes are more accurate but more computationally intensive, looser ones less accurate but less computationally intensive.] It can be shown that for binary variables, TRI = CYC, the cycle polytope, which enforces consistency over all cycles. In general, TRI ⊆ CYC, and it is an open problem whether TRI = CYC [SonPhD 3]. 20 / 39

26 When is the LP tight? For a model without cycles, the local polytope L_2 equals the marginal polytope M, hence the basic ("first order") LP is always tight. More generally, if a model has treewidth r then LP+L_{r+1} is tight [WaiJor04]. These are STRUCTURE conditions. Separately, if we allow any structure but restrict the class of potential functions, interesting results are known (POTENTIALS conditions). For example, the basic LP is tight if all potentials are supermodular. Fascinating recent work [KolThaZiv15]: if we do not restrict structure, then for any given family of potentials, either the basic LP relaxation is tight or the problem class is NP-hard! Identifying HYBRID conditions is an exciting current research area. 21 / 39
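For binary pairwise potentials, supermodularity reduces to a single inequality per edge, θ_ij(0,0) + θ_ij(1,1) ≥ θ_ij(0,1) + θ_ij(1,0); a tiny sketch of such a check (the function name and example tables are assumptions):

```python
import numpy as np

def is_supermodular(theta_edge, tol=1e-9):
    """Binary pairwise check: theta(0,0) + theta(1,1) >= theta(0,1) + theta(1,0)."""
    return theta_edge[0, 0] + theta_edge[1, 1] >= theta_edge[0, 1] + theta_edge[1, 0] - tol

attractive = np.array([[1.0, 0.0], [0.0, 1.0]])   # rewards agreement: supermodular
repulsive  = np.array([[0.0, 1.0], [1.0, 0.0]])   # rewards disagreement: not supermodular
print(is_supermodular(attractive), is_supermodular(repulsive))   # True False
```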

27 When is MAP inference (relatively) easy? STRUCTURE: a tree. POTENTIALS: an attractive (binary) model. Both can be handled efficiently by the basic LP relaxation, LP+L_2. 22 / 39

28 Cutting planes. [Figure 2-6 from Sontag's PhD thesis: illustration of the cutting-plane algorithm. (a) Solve the LP relaxation. (b) Find a violated constraint, add it to the relaxation, and repeat. (c) Result of solving the tighter LP relaxation. (d) Finally, we find the MAP assignment.] Even optimizing over TRI(G), the first lifting of the Sherali-Adams hierarchy, is impractical for all but the smallest of graphical models. The cutting-plane approach is motivated by the observation that it may not be necessary to add all of the constraints that make up a higher-order relaxation such as TRI(G). In particular, it is possible that the pairwise LP relaxation alone is close to being tight, and that only a few carefully chosen constraints are needed. [SonPhD] 23 / 39

29 Cutting planes. [Figure: an invalid constraint cuts off integer points, while a useless constraint fails to separate the current fractional solution µ from them.] We want to add constraints that are both valid and useful. Valid: does not cut off any of the integer points. Useful: leads us to update to a better solution, e.g. by separating the current fractional solution from the rest of the integer points; one way to guarantee that constraints are useful is to use them within the cutting-plane algorithm. The tightest valid constraints to add are the facets of the marginal polytope. [SonPhD] 24 / 39

30 Methods for solving general integer linear programs. Local search: start from an arbitrary assignment (e.g., random), then repeatedly choose a variable and set it to its best value given the rest, as sketched below. Branch-and-bound: exhaustive search over the space of assignments, pruning branches that can be provably shown not to contain a MAP assignment; the LP relaxation or its dual can be used to obtain upper bounds, with a lower bound obtained from the value of any assignment found. Branch-and-cut (the most powerful method; used by CPLEX & Gurobi): same as branch-and-bound, but spends more time getting tighter bounds, adding cutting planes to cut off fractional solutions of the LP relaxation and make the upper bound tighter. 25 / 39
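Here is the local-search sketch referred to above (an iterated-conditional-modes style loop on an assumed toy chain model; it illustrates the idea only and is not any particular solver's method):

```python
import numpy as np

rng = np.random.default_rng(0)
V, E = [0, 1, 2], [(0, 1), (1, 2)]
theta_i = {0: np.array([0.0, 0.5]), 1: np.array([0.0, 0.0]), 2: np.array([0.2, 0.0])}
theta_ij = {e: np.array([[1.0, 0.0], [0.0, 1.0]]) for e in E}

def score(x):
    return (sum(theta_i[i][x[i]] for i in V)
            + sum(theta_ij[(i, j)][x[i], x[j]] for (i, j) in E))

x = list(rng.integers(0, 2, size=len(V)))     # start from a random assignment
improved = True
while improved:                               # greedy single-variable moves until no gain
    improved = False
    for i in V:
        for v in (0, 1):
            candidate = x.copy()
            candidate[i] = v
            if score(candidate) > score(x):
                x, improved = candidate, True
print("local optimum:", x, "score:", score(x))
```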

31 Message passing. Message passing can be a computationally efficient way to obtain or approximate a MAP solution, and it takes advantage of the graph structure. The classic example is max-product belief propagation (BP). Sufficient conditions are known under which this will always converge to the solution of the basic LP; these include the requirement that the basic LP is tight [ParkShin-UAI15]. In general, however, it may not converge to the LP solution (even for supermodular potentials). Other methods have been developed, many of which relate to dual decomposition. 26 / 39

32 Dual decomposition and reparameterizations. Consider the MAP problem for pairwise Markov random fields: MAP(θ) = max_x [ Σ_{i ∈ V} θ_i(x_i) + Σ_{ij ∈ E} θ_ij(x_i, x_j) ]. If we push the maximizations inside the sums, the value can only increase: MAP(θ) ≤ Σ_{i ∈ V} max_{x_i} θ_i(x_i) + Σ_{ij ∈ E} max_{x_i, x_j} θ_ij(x_i, x_j). Note that the right-hand side can be easily evaluated. One can always reparameterize a distribution by operations like θ_i^new(x_i) = θ_i^old(x_i) + f(x_i), θ_ij^new(x_i, x_j) = θ_ij^old(x_i, x_j) − f(x_i), for any function f(x_i), without changing the distribution/energy. 27 / 39
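A tiny sketch (with arbitrary assumed numbers) verifying the reparameterization claim: moving a function f(x_i) from an incident edge onto the node leaves every configuration's total score unchanged:

```python
import numpy as np

theta_i  = np.array([0.3, -0.1])            # theta_i(x_i) for a binary variable
theta_ij = np.array([[1.0, 0.0],
                     [0.0, 1.0]])           # theta_ij(x_i, x_j)
f = np.array([0.7, -0.2])                   # an arbitrary function f(x_i)

new_i  = theta_i + f                        # theta_i^new(x_i)        = theta_i(x_i) + f(x_i)
new_ij = theta_ij - f[:, None]              # theta_ij^new(x_i, x_j)  = theta_ij(x_i, x_j) - f(x_i)

for xi in (0, 1):
    for xj in (0, 1):
        old = theta_i[xi] + theta_ij[xi, xj]
        new = new_i[xi] + new_ij[xi, xj]
        assert np.isclose(old, new)         # total score unchanged for every (x_i, x_j)
print("reparameterization leaves all scores unchanged")
```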

33 Dual decomposition. [Figure 1.2 from "Introduction to Dual Decomposition for Inference": the original pairwise model over x_1, ..., x_4 with factors f(x_1, x_2), g(x_1, x_3), h(x_2, x_4), k(x_3, x_4) (left) is split into one maximization subproblem per factor, each over its own copies of the relevant variables together with per-variable reparameterization terms, labelled f_1(x_1), f_2(x_2), g_1(x_1), g_3(x_3), h_2(x_2), h_4(x_4), k_3(x_3), k_4(x_4) in the figure (right).] 28 / 39

34 Dual decomposition. Define the reparameterized potentials
θ^δ_i(x_i) = θ_i(x_i) + Σ_{j: ij ∈ E} δ_{j→i}(x_i),
θ^δ_ij(x_i, x_j) = θ_ij(x_i, x_j) − δ_{j→i}(x_i) − δ_{i→j}(x_j).
It is easy to verify that Σ_i θ^δ_i(x_i) + Σ_{ij ∈ E} θ^δ_ij(x_i, x_j) = Σ_i θ_i(x_i) + Σ_{ij ∈ E} θ_ij(x_i, x_j) for all x. Thus, we have that
MAP(θ) = MAP(θ^δ) ≤ Σ_{i ∈ V} max_{x_i} θ^δ_i(x_i) + Σ_{ij ∈ E} max_{x_i, x_j} θ^δ_ij(x_i, x_j).
Every value of δ gives a different upper bound on the value of the MAP. The tightest upper bound can be obtained by minimizing the right-hand side with respect to δ. 29 / 39

35 Dual decomposition. We obtain the following dual objective:
L(δ) = Σ_{i ∈ V} max_{x_i} ( θ_i(x_i) + Σ_{j: ij ∈ E} δ_{j→i}(x_i) ) + Σ_{ij ∈ E} max_{x_i, x_j} ( θ_ij(x_i, x_j) − δ_{j→i}(x_i) − δ_{i→j}(x_j) ),
DUAL-LP(θ) = min_δ L(δ).
This provides an upper bound on the value of the MAP assignment: MAP(θ) ≤ DUAL-LP(θ) ≤ L(δ). How can we find a δ which gives tight bounds? 30 / 39
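The dual objective is cheap to evaluate for any δ. A minimal single-edge sketch (the model and all names are assumptions): at δ = 0 it reduces to the naive bound from pushing the maximizations inside the sums, while a better choice of δ closes the gap to the MAP value:

```python
import itertools
import numpy as np

# Assumed single-edge toy model where the naive bound L(0) is loose.
theta_i  = {0: np.array([0.5, 0.0]), 1: np.array([0.5, 0.0])}
theta_ij = np.array([[0.0, 0.0],
                     [0.0, 1.0]])                       # edge (0, 1) rewards (1, 1)

def L(delta_10, delta_01):
    """Dual objective for one edge: delta_10 is delta_{1->0}(x_0), delta_01 is delta_{0->1}(x_1)."""
    node = np.max(theta_i[0] + delta_10) + np.max(theta_i[1] + delta_01)
    edge = np.max(theta_ij - delta_10[:, None] - delta_01[None, :])
    return node + edge

map_value = max(theta_i[0][x0] + theta_i[1][x1] + theta_ij[x0, x1]
                for x0, x1 in itertools.product((0, 1), repeat=2))
print("MAP value:", map_value)                                                 # 1.0
print("L(0) (naive bound):", L(np.zeros(2), np.zeros(2)))                      # 2.0, an upper bound
print("L at a better delta:", L(np.array([0.0, 0.5]), np.array([0.0, 0.5])))   # 1.0, tight
```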

36 Solving the dual efficiently. There are many ways to solve the dual linear program, i.e. to minimize with respect to δ:
L(δ) = Σ_{i ∈ V} max_{x_i} ( θ_i(x_i) + Σ_{j: ij ∈ E} δ_{j→i}(x_i) ) + Σ_{ij ∈ E} max_{x_i, x_j} ( θ_ij(x_i, x_j) − δ_{j→i}(x_i) − δ_{i→j}(x_j) ).
One option is to use the subgradient method. We can also solve it using block coordinate descent, which gives algorithms that look very much like max-sum belief propagation. 31 / 39

39 Max-product linear programming (MPLP) algorithm.
Input: a set of potentials θ_i(x_i), θ_ij(x_i, x_j). Output: an assignment x_1, ..., x_n that approximates a MAP solution.
Algorithm: Initialize δ_{i→j}(x_j) = 0 and δ_{j→i}(x_i) = 0 for all ij ∈ E, x_i, x_j. Iterate until the change in L(δ) is small enough: for each edge ij ∈ E (sequentially), perform the updates
δ_{j→i}(x_i) = −(1/2) δ_i^{−j}(x_i) + (1/2) max_{x_j} [ θ_ij(x_i, x_j) + δ_j^{−i}(x_j) ],
δ_{i→j}(x_j) = −(1/2) δ_j^{−i}(x_j) + (1/2) max_{x_i} [ θ_ij(x_i, x_j) + δ_i^{−j}(x_i) ],
where δ_i^{−j}(x_i) = θ_i(x_i) + Σ_{ik ∈ E, k ≠ j} δ_{k→i}(x_i).
Return x_i ∈ argmax_{x̂_i} θ^δ_i(x̂_i). 32 / 39
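A compact sketch of these updates on an assumed toy chain (the model, the message bookkeeping, and the decoding step are illustrative choices, not a faithful reimplementation of any particular MPLP code):

```python
import numpy as np

# Assumed toy chain 0 - 1 - 2 with agreement-rewarding edges and a small unary bias on node 0.
V, E, k = [0, 1, 2], [(0, 1), (1, 2)], 2
theta_i = {0: np.array([0.0, 0.2]), 1: np.zeros(k), 2: np.zeros(k)}
theta_ij = {e: np.array([[1.0, 0.0], [0.0, 1.0]]) for e in E}

delta = {}                                   # delta[(a, b)][x_b] holds delta_{a -> b}(x_b)
for (i, j) in E:
    delta[(i, j)] = np.zeros(k)
    delta[(j, i)] = np.zeros(k)

def delta_minus(i, exclude=None):
    """theta_i(x_i) plus all incoming messages delta_{k -> i}, except the one from `exclude`."""
    out = theta_i[i].copy()
    for (a, b), msg in delta.items():
        if b == i and a != exclude:
            out = out + msg
    return out

def dual_value():
    """L(delta): sum of node and edge maxima of the reparameterized potentials."""
    val = sum(np.max(delta_minus(i)) for i in V)
    for (i, j) in E:
        val += np.max(theta_ij[(i, j)] - delta[(j, i)][:, None] - delta[(i, j)][None, :])
    return val

for it in range(20):
    for (i, j) in E:                         # block coordinate descent, one edge at a time
        d_i = delta_minus(i, exclude=j)      # delta_i^{-j}(x_i)
        d_j = delta_minus(j, exclude=i)      # delta_j^{-i}(x_j)
        delta[(j, i)] = -0.5 * d_i + 0.5 * np.max(theta_ij[(i, j)] + d_j[None, :], axis=1)
        delta[(i, j)] = -0.5 * d_j + 0.5 * np.max(theta_ij[(i, j)] + d_i[:, None], axis=0)

x = [int(np.argmax(delta_minus(i))) for i in V]   # decode from reparameterized singletons
print("dual value L(delta):", dual_value())       # converges to 2.2 on this tree
print("decoded assignment:", x)                   # [1, 1, 1], the exact MAP here
```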

40 Generalization to arbitrary factor graphs [SonGloJaa11]. [Figure 1.4 from "Introduction to Dual Decomposition for Inference": the MPLP block coordinate descent algorithm for minimizing the dual L(δ). Inputs: a set of factors θ_i(x_i), θ_f(x_f). Output: an assignment x_1, ..., x_n that approximates the MAP. Initialize δ_{fi}(x_i) = 0 for all f ∈ F, i ∈ f, x_i. Iterate until the change in L(δ) is small enough: for each f ∈ F, simultaneously for all i ∈ f and x_i, set δ_{fi}(x_i) = −δ_i^{−f}(x_i) + (1/|f|) max_{x_{f\i}} [ θ_f(x_f) + Σ_{î ∈ f} δ_î^{−f}(x_î) ], where δ_i^{−f}(x_i) = θ_i(x_i) + Σ_{f̂ ≠ f} δ_{f̂i}(x_i). Return x_i ∈ argmax_{x̂_i} θ^δ_i(x̂_i) (see Eq. 1.6). Similar algorithms can be devised for different choices of coordinate blocks.] 33 / 39

41 Experimental results. [Figure: performance on a stereo vision inference task, plotting the objective against iteration; the dual objective decreases while the value of the decoded assignment increases until the duality gap closes and the instance is solved optimally.] 34 / 39

42 Dual decomposition = basic LP relaxation. Recall we obtained the following dual linear program:
L(δ) = Σ_{i ∈ V} max_{x_i} ( θ_i(x_i) + Σ_{j: ij ∈ E} δ_{j→i}(x_i) ) + Σ_{ij ∈ E} max_{x_i, x_j} ( θ_ij(x_i, x_j) − δ_{j→i}(x_i) − δ_{i→j}(x_j) ),
DUAL-LP(θ) = min_δ L(δ).
We showed two ways of upper bounding the value of the MAP assignment: (1) MAP(θ) ≤ Basic LP(θ); (2) MAP(θ) ≤ DUAL-LP(θ) ≤ L(δ). Although we derived these linear programs in seemingly very different ways, it turns out that Basic LP(θ) = DUAL-LP(θ) [SonGloJaa11]. The dual LP allows us to upper bound the value of the MAP assignment without solving an LP to optimality. 35 / 39

43 Linear programming duality. [Figure: the MAP assignment x*, the (primal) LP relaxation over µ, and the (dual) LP relaxation, nested inside the integer linear program.] MAP(θ) ≤ Basic LP(θ) = DUAL-LP(θ) ≤ L(δ). 36 / 39

44 Cutting-plane algorithm. [Figure 2-6 from Sontag's PhD thesis, repeated: (a) solve the LP relaxation; (b) find a violated constraint, add it to the relaxation, and repeat; (c) result of solving the tighter LP relaxation; (d) finally, we find the MAP assignment.] The approach is motivated by the observation that it may not be necessary to add all of the constraints that make up a higher-order relaxation such as TRI(G). In particular, it is possible that the pairwise LP relaxation alone is close to being tight, and that only a few carefully chosen constraints would suffice to obtain an integer solution. These algorithms tighten the relaxation in a problem-specific way, using additional computation just for the hard parts of each instance. 37 / 39

45 Conclusion. LP relaxations yield a powerful approach for MAP inference. They naturally lead to considerations of the relevant polytopes and cutting planes, and of dual decomposition and message passing. There is a close relationship to methods for marginal inference. They help build understanding as well as develop new algorithmic tools. This is an area of exciting current research. Thank you. 38 / 39

46 References
V. Kolmogorov, J. Thapper, and S. Živný. The power of linear programming for general-valued CSPs. SIAM Journal on Computing, 44(1):1-36, 2015.
S. Park and J. Shin. Max-product belief propagation for linear programming: applications to combinatorial optimization. In UAI, 2015.
D. Sontag. Approximate inference in graphical models using LP relaxations. PhD thesis, MIT, 2010.
D. Sontag, A. Globerson, and T. Jaakkola. Introduction to dual decomposition for inference. In Optimization for Machine Learning, MIT Press, 2011.
J. Thapper and S. Živný. The complexity of finite-valued CSPs. arXiv technical report.
M. Wainwright and M. Jordan. Treewidth-based conditions for exactness of the Sherali-Adams and Lasserre relaxations. Univ. California, Berkeley, Technical Report 671, 2004.
M. Wainwright and M. Jordan. Graphical models, exponential families and variational inference. Foundations and Trends in Machine Learning, 1(1-2):1-305, 2008. 39 / 39
