Decomposition of log-linear models

Graphical Models, Lecture 5, Michaelmas Term 2009 October 27, 2009

Generating class Dependence graph of log-linear model Conformal graphical models Factor graphs

A density $f$ factorizes w.r.t. $\mathcal{A}$ if there exist functions $\psi_a(x)$ which depend on $x$ through $x_a$ only so that

$$f(x) = \prod_{a \in \mathcal{A}} \psi_a(x).$$

The set of distributions $\mathcal{P}_{\mathcal{A}}$ which factorize w.r.t. $\mathcal{A}$ is the hierarchical log-linear model generated by $\mathcal{A}$. $\mathcal{A}$ is the generating class of the log-linear model.

For any generating class $\mathcal{A}$ we construct the dependence graph $G(\mathcal{A}) = G(\mathcal{P}_{\mathcal{A}})$ of the log-linear model $\mathcal{P}_{\mathcal{A}}$. The dependence graph is determined by the relation

$$\alpha \sim \beta \iff \exists a \in \mathcal{A} : \alpha, \beta \in a.$$

Sets in $\mathcal{A}$ are clearly complete in $G(\mathcal{A})$ and therefore distributions in $\mathcal{P}_{\mathcal{A}}$ factorize according to $G(\mathcal{A})$. They are thus also global, local, and pairwise Markov w.r.t. $G(\mathcal{A})$.

As a generating class defines a dependence graph $G(\mathcal{A})$, the reverse is also true. The set $\mathcal{C}(G)$ of cliques (maximal complete subsets) of $G$ is a generating class for the log-linear model of distributions which factorize w.r.t. $G$. If the dependence graph completely summarizes the restrictions imposed by $\mathcal{A}$, i.e. if $\mathcal{A} = \mathcal{C}(G(\mathcal{A}))$, $\mathcal{A}$ is conformal.
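The construction of the dependence graph and the conformality condition can be checked mechanically on small examples. The following is a minimal sketch, not part of the lecture: the function names are hypothetical, generators are given as Python sets, and the cliques are found by brute force over all vertex subsets, which is only reasonable for a handful of vertices.

```python
from itertools import combinations

def dependence_graph(generating_class):
    """Vertices and edges of G(A): alpha ~ beta iff both lie in some generator a."""
    vertices = set().union(*generating_class)
    edges = {frozenset(pair)
             for a in generating_class
             for pair in combinations(a, 2)}
    return vertices, edges

def cliques(vertices, edges):
    """Maximal complete subsets, by brute force (small graphs only)."""
    complete = [frozenset(s)
                for r in range(1, len(vertices) + 1)
                for s in combinations(vertices, r)
                if all(frozenset(pair) in edges for pair in combinations(s, 2))]
    return {c for c in complete if not any(c < d for d in complete)}

def is_conformal(generating_class):
    """A is conformal iff A equals the clique set of its dependence graph."""
    vertices, edges = dependence_graph(generating_class)
    return {frozenset(a) for a in generating_class} == cliques(vertices, edges)

# All two-factor interactions on three variables: the dependence graph is a
# triangle whose only clique is {1, 2, 3}, so the class is not conformal.
print(is_conformal([{1, 2}, {2, 3}, {1, 3}]))
# A chain 1 - 2 - 3: the generators are exactly the cliques.
print(is_conformal([{1, 2}, {2, 3}]))
```

The sketch assumes the generating class consists of pairwise incomparable sets, as is usual for a generating class.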

The factor graph of $\mathcal{A}$ is the bipartite graph with vertices $V \cup \mathcal{A}$ and edges defined by

$$\alpha \sim a \iff \alpha \in a.$$

Using this graph even non-conformal log-linear models admit a simple visual representation.

[Figure: factor graph with variable nodes $I$, $J$, $K$ and factor nodes $\phi_{IJ}$, $\phi_{IK}$, $\phi_{JK}$.]

Likelihood equations Iterative Proportional Scaling Closed form maximum likelihood

The maximum likelihood estimate $\hat p$ of $p$ is the unique element of $\mathcal{P}_{\mathcal{A}}$ which satisfies the system of equations

$$n\hat p(x_a) = n(x_a), \quad \forall a \in \mathcal{A},\ x_a \in \mathcal{X}_a. \tag{1}$$

Here $g(x_a) = \sum_{y : y_a = x_a} g(y)$ is the $a$-marginal of the function $g$. The system of equations (1) expresses the fitting of the marginals in $\mathcal{A}$.

There is a convergent algorithm which solves the likelihood equations. It cycles (repeatedly) through all the $a$-marginals in $\mathcal{A}$ and fits them one by one. For $a \in \mathcal{A}$ define the following scaling operation on $p$:

$$(T_a p)(x) \leftarrow p(x) \frac{n(x_a)}{n p(x_a)}, \quad x \in \mathcal{X},$$

where $0/0 = 0$ and $b/0$ is undefined if $b \neq 0$.

Make an ordering of the generators $\mathcal{A} = \{a_1, \dots, a_k\}$. Define $S$ by a full cycle of scalings

$$Sp = T_{a_k} \cdots T_{a_2} T_{a_1} p.$$

Define the iteration

$$p_0(x) \leftarrow 1/|\mathcal{X}|, \quad p_n = S p_{n-1}, \quad n = 1, 2, \dots.$$

It then holds that

$$\lim_{n \to \infty} p_n = \hat p,$$

where $\hat p$ is the unique maximum likelihood estimate of $p \in \mathcal{P}_{\mathcal{A}}$, i.e. the solution of the likelihood equations (2).
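The iteration can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than the lecture's implementation: the contingency table is a dense array with one axis per variable, each generator is a tuple of axis indices, and the names `marginal` and `ips` as well as the example counts are hypothetical.

```python
import numpy as np

def marginal(table, a):
    """a-marginal: sum over every axis not in a, keeping dims for broadcasting."""
    other = tuple(i for i in range(table.ndim) if i not in a)
    return table.sum(axis=other, keepdims=True)

def ips(counts, generators, cycles=100):
    """Run `cycles` full cycles S = T_{a_k} ... T_{a_1}, starting from uniform p_0."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    p = np.full(counts.shape, 1.0 / counts.size)
    for _ in range(cycles):
        for a in generators:
            num = marginal(counts, a)   # n(x_a)
            den = n * marginal(p, a)    # n p(x_a)
            # Convention 0/0 = 0: a zero denominator yields ratio 0. Starting
            # from the uniform p_0, a zero denominator only occurs where
            # n(x_a) is zero as well.
            ratio = np.divide(num, den, out=np.zeros_like(den), where=den > 0)
            p = p * ratio
    return p

# Made-up 2 x 2 x 2 table; the generators are all pairs of variables.
counts = np.array([[[10, 5], [3, 8]], [[4, 9], [7, 6]]])
generators = [(0, 1), (1, 2), (0, 2)]
p_hat = ips(counts, generators)
```

A fixed number of cycles is used here for simplicity; a practical version would more likely stop once the fitted marginals agree with the observed ones to within a tolerance.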

In some cases the IPS algorithm converges after a finite number of cycles. An explicit formula is then available for the MLE of $p \in \mathcal{P}_{\mathcal{A}}$. Consider first the case of a generating class with only two elements: $\mathcal{A} = \{a, b\}$ and thus $V = a \cup b$. Let $c = a \cap b$. Recall that the MLE is the unique solution to

$$n\hat p(x_a) = n(x_a), \quad \forall a \in \mathcal{A},\ x_a \in \mathcal{X}_a. \tag{2}$$

Let

$$p^*(x) = \frac{n(x_a)\, n(x_b)}{n(x_c)\, n}.$$

The function

$$p^*(x) = \frac{n(x_a)\, n(x_b)}{n(x_c)\, n}$$

satisfies (2) since e.g.

$$n p^*(x_a) = \sum_{y : y_a = x_a} \frac{n(y_a)\, n(y_b)}{n(y_c)} = \sum_{y : y_a = x_a} \frac{n(x_a)\, n(y_b)}{n(x_c)} = \frac{n(x_a)}{n(x_c)} \sum_{y : y_a = x_a} n(y_b) = \frac{n(x_a)}{n(x_c)}\, n(x_c) = n(x_a),$$

and similarly with the other marginal. Hence we have $\hat p = p^*$.
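The closed form is easy to verify numerically. The sketch below is only an illustration with made-up counts: it assumes three variables stored along the axes of a NumPy array, with $a$ the first two axes, $b$ the last two, and $c$ the middle axis, and it assumes every $n(x_c)$ is positive so that no $0/0$ convention is needed. The function name is hypothetical.

```python
import numpy as np

def two_generator_mle(counts):
    """p*(x) = n(x_a) n(x_b) / (n(x_c) n) for a = axes (0, 1), b = axes (1, 2)."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    n_a = counts.sum(axis=2, keepdims=True)        # n(x_a), shape (I, J, 1)
    n_b = counts.sum(axis=0, keepdims=True)        # n(x_b), shape (1, J, K)
    n_c = counts.sum(axis=(0, 2), keepdims=True)   # n(x_c), shape (1, J, 1)
    return n_a * n_b / (n_c * n)

counts = np.array([[[10, 5], [3, 8]], [[4, 9], [7, 6]]])
p_star = two_generator_mle(counts)
n = counts.sum()

# Both fitted marginals reproduce the observed ones, as in the derivation.
print(np.allclose(n * p_star.sum(axis=2), counts.sum(axis=2)))
print(np.allclose(n * p_star.sum(axis=0), counts.sum(axis=0)))
```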

Chordal graphs

The generating class $\mathcal{A} = \{a, b\}$ is conformal. Its dependence graph $G$ has exactly two cliques $a$ and $b$. The graph is chordal, meaning that any cycle of length $\geq 4$ has a chord. $\mathcal{A}$ is called decomposable if $\mathcal{A}$ is conformal, i.e. $\mathcal{A} = \mathcal{C}(G)$, and $G$ is chordal. The IPS algorithm converges after a finite number of cycles (at most two) if and only if $\mathcal{A}$ is decomposable.

Chordal graphs Decomposition of Markov properties Factorization of Markov distributions Explicit formula for MLE Properties of decomposability

Non-decomposable generating classes

A generating class can be non-decomposable in different ways. The generating class $\mathcal{A} = \{\{1, 2\}, \{2, 3\}, \{1, 3\}\}$ is the smallest non-decomposable generating class. This is non-conformal. The smallest non-chordal graph is the 4-cycle, and its generating class is non-decomposable.

[Figure: the 4-cycle.]

Consider an undirected graph $G = (V, E)$. A partitioning of $V$ into a triple $(A, B, S)$ of subsets of $V$ forms a decomposition of $G$ if $A \perp_G B \mid S$ and $S$ is complete. The decomposition is proper if $A \neq \emptyset$ and $B \neq \emptyset$. The components of $G$ are the induced subgraphs $G_{A \cup S}$ and $G_{B \cup S}$. A graph is prime if no proper decomposition exists.
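The two conditions in this definition, separation and completeness of $S$, translate directly into a small check. The following sketch is an illustration only: the function name and the two example graphs are assumptions, and edges are given as pairs of vertex labels.

```python
from itertools import combinations

def is_decomposition(vertices, edges, A, B, S):
    """True iff (A, B, S) partitions V, S is complete and S separates A from B."""
    A, B, S = set(A), set(B), set(S)
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # (A, B, S) must partition V.
    if A | B | S != set(vertices) or len(A) + len(B) + len(S) != len(adj):
        return False
    # S must be complete.
    if any(v not in adj[u] for u, v in combinations(S, 2)):
        return False
    # S must separate A from B: search from A without ever entering S.
    seen, stack = set(A), list(A)
    while stack:
        u = stack.pop()
        for w in adj[u] - S - seen:
            seen.add(w)
            stack.append(w)
    return not (seen & B)

# Two triangles sharing the edge {2, 3}: a proper decomposition exists.
diamond = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
print(is_decomposition([1, 2, 3, 4], diamond, {1}, {4}, {2, 3}))

# The 4-cycle: {2, 4} separates 1 from 3 but is not complete.
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(is_decomposition([1, 2, 3, 4], cycle, {1}, {3}, {2, 4}))
```

The decomposition is proper when, in addition, both `A` and `B` are non-empty.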

Examples

[Figure: a graph on vertices 1 to 7.] The graph to the left is prime.

[Figure: decomposition of a graph on vertices 1 to 7 with $A = \{1, 3\}$, $B = \{4, 6, 7\}$ and $S = \{2, 5\}$ into its components on $\{1, 2, 3, 5\}$ and $\{2, 4, 5, 6, 7\}$.]

Suppose $P$ satisfies (F) w.r.t. $G$ and $(A, B, S)$ is a decomposition. Then

(i) $P_{A \cup S}$ and $P_{B \cup S}$ satisfy (F) w.r.t. $G_{A \cup S}$ and $G_{B \cup S}$ respectively;

(ii) $f(x)\, f_S(x_S) = f_{A \cup S}(x_{A \cup S})\, f_{B \cup S}(x_{B \cup S})$.

Decomposability

Any graph can be recursively decomposed into its maximal prime subgraphs.

[Figure: a graph on vertices 1 to 7 and its maximal prime subgraphs.]

A graph is decomposable (or rather fully decomposable) if it is complete or admits a proper decomposition into decomposable subgraphs. The definition is recursive. Alternatively this means that all maximal prime subgraphs are cliques.

Recursive decomposition of a decomposable graph into cliques yields the formula:

$$f(x) \prod_{S \in \mathcal{S}} f_S(x_S)^{\nu(S)} = \prod_{C \in \mathcal{C}} f_C(x_C).$$

Here $\mathcal{S}$ is the set of minimal complete separators occurring in the decomposition process and $\nu(S)$ the number of times such a separator appears in this process.

As we have a particularly simple factorization of the density, we have a similar factorization of the maximum likelihood estimate for a decomposable log-linear model. The MLE for $p$ under the log-linear model with generating class $\mathcal{A} = \mathcal{C}(G)$ for a chordal graph $G$ is

$$\hat p(x) = \frac{\prod_{C \in \mathcal{C}} n(x_C)}{n \prod_{S \in \mathcal{S}} n(x_S)^{\nu(S)}},$$

where $\nu(S)$ is the number of times $S$ appears as a separator in the total decomposition of its dependence graph.
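As a small worked instance (the path graph is an assumed example, not taken from the lecture), consider the chordal graph $1 - 2 - 3 - 4$. Its cliques are $\{1, 2\}$, $\{2, 3\}$ and $\{3, 4\}$. Decomposing first with separator $\{2\}$ and then the remaining component with separator $\{3\}$ uses each separator once, so the formula specializes to:

```latex
\[
  \hat p(x)
  = \frac{n(x_{\{1,2\}})\, n(x_{\{2,3\}})\, n(x_{\{3,4\}})}
         {n \cdot n(x_{\{2\}})\, n(x_{\{3\}})},
  \qquad \nu(\{2\}) = \nu(\{3\}) = 1 .
\]
```

With only two cliques $a$ and $b$ and a single separator $c = a \cap b$, the same formula reduces to the two-generator estimate $n(x_a)\, n(x_b) / (n(x_c)\, n)$.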

Perfect numbering

A numbering $V = \{1, \dots, |V|\}$ of the vertices of an undirected graph is perfect if

$$\forall j = 2, \dots, |V| : \mathrm{bd}(j) \cap \{1, \dots, j - 1\} \text{ is complete in } G.$$

An undirected graph $G$ is chordal if it has no chordless $n$-cycles with $n \geq 4$.

A set $S$ is an $(\alpha, \beta)$-separator if $\alpha \perp_G \beta \mid S$.

Characterizing chordal graphs

The following are equivalent for any undirected graph $G$.

(i) $G$ is chordal;
(ii) $G$ is decomposable;
(iii) All maximal prime subgraphs of $G$ are cliques;
(iv) $G$ admits a perfect numbering;
(v) Every minimal $(\alpha, \beta)$-separator is complete.

Trees are chordal graphs and thus decomposable.

Here is a (greedy) algorithm for checking chordality:

1. Look for a vertex $v^*$ with $\mathrm{bd}(v^*)$ complete. If no such vertex exists, the graph is not chordal.
2. Form the subgraph $G_{V \setminus \{v^*\}}$ and let $v^* = |V|$ (that is, $v^*$ receives the highest number not yet assigned);
3. Repeat the process under 1;
4. If the algorithm continues until only one vertex is left, the graph is chordal and the numbering is perfect.

The complexity of this algorithm is $O(|V|^2)$.
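The steps above can be sketched as follows. This is a straightforward illustration, not an optimized implementation: the function name is hypothetical, vertex labels are assumed to be sortable, and the naive completeness test makes no attempt to reach the complexity quoted above.

```python
def greedy_perfect_numbering(vertices, edges):
    """Return {vertex: number} for a perfect numbering, or None if not chordal."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    remaining = set(adj)
    number = {}
    next_label = len(remaining)
    while remaining:
        candidate = None
        for v in sorted(remaining):
            boundary = adj[v] & remaining
            # Step 1: is bd(v) complete in the current subgraph?
            if all(w in adj[u] for u in boundary for w in boundary if u != w):
                candidate = v
                break
        if candidate is None:
            return None                     # no vertex with complete boundary
        number[candidate] = next_label      # step 2: highest remaining number
        next_label -= 1
        remaining.remove(candidate)         # continue on the induced subgraph
    return number

# Two triangles sharing an edge are chordal; the 4-cycle is not.
diamond = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(greedy_perfect_numbering([1, 2, 3, 4], diamond))
print(greedy_perfect_numbering([1, 2, 3, 4], cycle))
```

Unlike the description above, the loop also numbers the last remaining vertex, whose boundary in the one-vertex subgraph is empty and therefore trivially complete.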

Is this graph chordal?

[Figure sequence: the greedy algorithm assigns the numbers 7, 6 and 5 to vertices of an example graph.]

This graph is not chordal, as there is no candidate for number 4.

Is this graph chordal?

[Figure sequence: the greedy algorithm assigns the numbers 7, 6, 5, 4, 3, 2 and 1 to the vertices of a second example graph.]

This graph is chordal!

Maximum Cardinality Search

This simple algorithm has complexity $O(|V| + |E|)$:

1. Choose $v_0 \in V$ arbitrary and let $v_0 = 1$;
2. When vertices $\{1, 2, \dots, j\}$ have been identified, choose $v = j + 1$ among $V \setminus \{1, 2, \dots, j\}$ with highest cardinality of its numbered neighbours;
3. If $\mathrm{bd}(j + 1) \cap \{1, 2, \dots, j\}$ is not complete, $G$ is not chordal;
4. Repeat from 2;
5. If the algorithm continues until all vertices have been numbered, the graph is chordal and the numbering is perfect.
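A compact sketch of these steps is given below. It is meant as an illustration under simplifying assumptions: the function name is hypothetical, ties are broken by the smallest vertex label, and the next vertex is found by a linear scan, so this version does not achieve the linear complexity quoted above (that would need a bucket structure for the weights).

```python
def maximum_cardinality_search(vertices, edges):
    """Return (order, is_chordal); order lists the vertices as they are numbered."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    order, numbered = [], set()
    weight = {v: 0 for v in adj}            # number of already-numbered neighbours
    while len(order) < len(adj):
        unnumbered = [u for u in sorted(adj) if u not in numbered]
        v = max(unnumbered, key=lambda u: weight[u])     # step 2
        earlier = adj[v] & numbered
        # Step 3: the numbered neighbours of the new vertex must be complete.
        if any(w not in adj[u] for u in earlier for w in earlier if u != w):
            return order + [v], False
        order.append(v)
        numbered.add(v)
        for w in adj[v] - numbered:
            weight[w] += 1
    return order, True

diamond = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(maximum_cardinality_search([1, 2, 3, 4], diamond))
print(maximum_cardinality_search([1, 2, 3, 4], cycle))
```

When the check fails, the returned order ends with the vertex whose numbered neighbours are not complete.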

Is this graph chordal?

[Figure sequence: Maximum Cardinality Search numbers the vertices of an example graph 1, 2, ..., 7 in turn.]

The graph is not chordal, because 7 does not have a complete boundary.

[Figure: MCS numbering 1 to 7 for the chordal graph.] The algorithm runs essentially as before.

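Finding the cliques of a chordal graph

From an MCS numbering $V = \{1, \dots, |V|\}$, let $B_\lambda = \mathrm{bd}(\lambda) \cap \{1, \dots, \lambda - 1\}$ and $\pi_\lambda = |B_\lambda|$. Call $\lambda$ a ladder vertex if $\lambda = |V|$ or if $\pi_{\lambda+1} < \pi_\lambda + 1$. Let $\Lambda$ be the set of ladder vertices.

[Figure: example graph with MCS numbering 1 to 7.] $\pi_\lambda$: 0, 1, 2, 2, 2, 1, 1.

The cliques are $C_\lambda = \{\lambda\} \cup B_\lambda$, $\lambda \in \Lambda$.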
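The ladder-vertex rule can be written out directly. The sketch below is illustrative only: the function names are hypothetical, the input order is assumed to be an MCS order of a chordal graph, and the small example graph is made up because the edges of the lecture's seven-vertex figure are not recoverable from the text.

```python
def ladder_vertices(pi):
    """1-based ladder vertices: lambda = |V| or pi[lambda + 1] < pi[lambda] + 1."""
    k = len(pi)
    return [i + 1 for i in range(k) if i == k - 1 or pi[i + 1] < pi[i] + 1]

def cliques_from_mcs(order, edges):
    """Cliques C = {lambda} | B_lambda over ladder vertices of an MCS order."""
    adj = {v: set() for v in order}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    position = {v: i for i, v in enumerate(order)}
    # B_lambda: neighbours numbered earlier than lambda.
    B = [{w for w in adj[v] if position[w] < i} for i, v in enumerate(order)]
    pi = [len(b) for b in B]
    return [{order[lam - 1]} | B[lam - 1] for lam in ladder_vertices(pi)]

# Two triangles sharing the edge {2, 3}, numbered 1, 2, 3, 4 by MCS.
diamond = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
print(cliques_from_mcs([1, 2, 3, 4], diamond))

# The pi sequence quoted in the text gives ladder vertices 3, 4, 5, 6, 7.
print(ladder_vertices([0, 1, 2, 2, 2, 1, 1]))
```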

A chordal graph

[Figure: example graph.] This graph is chordal, but it might not be that easy to see... Maximum Cardinality Search is handy!