Loopy Belief Propagation

Loopy Belief Propagation. Research Exam. Kristin Branson. September 29, 2003.

Problem Formalization. Reasoning about any real-world problem requires assumptions about the structure of the problem: the relevant variables and the interrelationships of these variables. A graphical model is a formal representation of these assumptions.

Probabilistic Model. These assumptions are a simplification of the problem's true structure, so the world appears stochastic in terms of the model. Graphical models are interpreted as describing the probability distribution of random variables.

Probabilistic Inference. Reasoning about real-world problems can be modeled as probabilistic inference on a distribution described by a graph. Probabilistic inference involves computing desired properties of a distribution: What is the most probable state of the variables, given some evidence? What is the marginal distribution of a subset of the variables, given some evidence?

Inference Example. Estimate the intensity value of each pixel of an image given a corrupted version of the image.

Inference Example. Estimate the intensity value of each pixel of an image given a corrupted version of the image. Assume that the relationship between neighboring uncorrupted pixel variables can be described by a local smoothness constraint.

Inference is Intractable. Assuming a sparse graphical model greatly simplifies the problem. Still, probabilistic inference is in general intractable: exact inference algorithms are exponential in the graph size. Pearl's Belief Propagation (BP) algorithm performs approximate inference on an arbitrary graphical model.

Loopy BP. The assumptions made by BP only hold for acyclic graphs. For graphs containing cycles, loopy BP is not guaranteed to converge or be correct. However, it has been applied with experimental success.

Experimental Results. Impressive experimental results were first observed in graphical code schemes. The Turbo code error-correcting scheme was described as "the most exciting and potentially important development in coding theory in many years" (McEliece et al., 1995). Murphy et al. (1999) experimented with loopy BP on graphical models that appear in machine learning. They concluded that loopy BP did converge to good approximations in many cases. Since then, loopy BP has been applied successfully to many applications in machine learning and computer vision.

Theoretical Analysis. When considering applying loopy BP, one would like to know whether it will converge to good approximations. In this exam, I present recent analyses of loopy BP.

Outline. Background: undirected graphical models; the Belief Propagation algorithm. Three techniques for analyzing loopy BP: algebraic analysis; the unwrapped tree; reparameterization. Future work.

Markov Random Fields. An undirected graphical model represents a distribution by an undirected graph. (Figure: an example graph on six nodes.) Nodes represent random variables. Edges represent dependencies. Each clique is associated with a potential function.

Markov Properties. Paths in the graph represent dependencies: if two nodes are not connected by any path, the corresponding variables are independent. If the nodes in a set C separate the nodes in a set A from the nodes in a set B, then A and B are conditionally independent given C. The Hammersley-Clifford theorem: these conditional independence assumptions hold if and only if the distribution factorizes as the product of potential functions over cliques, p(x) = (1/Z) ∏_C ψ_C(x_C).

Pairwise MRFs. To simplify notation, we focus on pairwise MRFs. The largest clique size in a pairwise MRF is two. The distribution can therefore be represented as p(x) = (1/Z) ∏_i ψ_i(x_i) ∏_(i,j) ψ_ij(x_i, x_j), where the second product runs over the edges of the graph. An MRF with larger cliques can be converted into a pairwise MRF.
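
To make the pairwise factorization concrete, here is a minimal sketch of such a model held as explicit potential tables (the variable names, graph, and numbers are illustrative assumptions, not the talk's example); later sketches reuse this representation.

```python
import numpy as np
from itertools import product

# A tiny pairwise MRF over three binary variables on a chain 0 - 1 - 2.
# node_potentials[i] is psi_i(x_i); edge_potentials[(i, j)][x_i, x_j] is psi_ij.
node_potentials = {
    0: np.array([1.0, 2.0]),
    1: np.array([1.5, 0.5]),
    2: np.array([1.0, 1.0]),
}
edge_potentials = {
    (0, 1): np.array([[2.0, 1.0], [1.0, 2.0]]),  # favors agreement
    (1, 2): np.array([[2.0, 1.0], [1.0, 2.0]]),
}

def unnormalized_prob(x):
    """prod_i psi_i(x_i) * prod_(i,j) psi_ij(x_i, x_j) for an assignment x."""
    p = 1.0
    for i, psi in node_potentials.items():
        p *= psi[x[i]]
    for (i, j), psi in edge_potentials.items():
        p *= psi[x[i], x[j]]
    return p

# The normalizing constant Z, by brute force (only feasible for tiny models).
Z = sum(unnormalized_prob(x) for x in product([0, 1], repeat=3))
```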

Probabilistic Inference. We discuss two inference problems. Marginalization: for each node i, compute the marginal p(x_i), the sum of p(x) over all of the other variables, given some evidence. MAP assignment: find the maximum-probability assignment given the evidence, x* = argmax_x p(x).

Max-Marginals. To find the MAP assignment, we will compute the max-marginals P_i(x_i) = max over all assignments x' with x'_i = x_i of p(x'). The MAP assignment maximizes every max-marginal simultaneously.

Notation. To simplify notation, we assume that the effect of the observed data is encapsulated in the single-node potentials.

Variable Elimination. Exact inference can be performed by repeatedly eliminating variables: to eliminate a variable, multiply together all potentials that involve it and then sum (or maximize) that variable out, producing a new potential over its neighbors. A single elimination step is sketched below.
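
A sketch of one elimination step on a toy chain (the values and names are assumptions for illustration): multiply every factor that mentions the variable, sum the variable out, and keep the resulting factor over its neighbors.

```python
import numpy as np

# A 3-node chain 0 - 1 - 2 of binary variables (illustrative potentials).
psi_0 = np.array([1.0, 2.0]); psi_1 = np.array([1.5, 0.5]); psi_2 = np.array([1.0, 1.0])
psi_01 = np.array([[2.0, 1.0], [1.0, 2.0]])   # psi_01[x0, x1]
psi_12 = np.array([[2.0, 1.0], [1.0, 2.0]])   # psi_12[x1, x2]

# Eliminate x1: new_factor(x0, x2) = sum_{x1} psi_1(x1) psi_01(x0, x1) psi_12(x1, x2).
new_factor = np.einsum('b,ab,bc->ac', psi_1, psi_01, psi_12)

# With x1 gone, the marginal of x0 needs only the remaining factors.
marg_0 = psi_0 * (new_factor @ psi_2)
marg_0 /= marg_0.sum()
```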

Outline. Background: undirected graphical models; the Belief Propagation algorithm. Three techniques for analyzing loopy BP: algebraic analysis; the unwrapped tree; reparameterization. Future work.

BP for Trees. BP breaks the [max-]marginalization for a node into independent subproblems, one for each subtree rooted at a neighbor of that node. In each subproblem, BP eliminates all of the subtree's variables except the one at the target node. The result is a message from the neighbor to the node.

BP for Trees. BP is a dynamic programming form of variable elimination. The creation of a message is equivalent to repeatedly removing leaf nodes of the subtree.

BP for Trees. In terms of these messages, the [max-]marginal at a node is proportional to the product of its single-node potential and all of its incoming messages: b_i(x_i) ∝ ψ_i(x_i) ∏_{j in N(i)} m_ji(x_i).

Parallel Message Passing. Instead of waiting for smaller problems to be solved before solving larger problems, we can iteratively pass messages in parallel: initialize all messages (e.g., uniformly); then, until convergence, update every message using m_ij(x_j) ∝ Σ_{x_i} ψ_i(x_i) ψ_ij(x_i, x_j) ∏_{k in N(i), k ≠ j} m_ki(x_i), computed from the messages of the previous iteration. A sketch of this loop follows.
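
A minimal sketch of the parallel update loop (assuming the dictionary-of-tables representation from the earlier sketch; the function and variable names are mine). The same code runs unchanged on a graph with cycles, which is exactly the loopy BP of the next slide, so convergence is only checked, never guaranteed.

```python
import numpy as np

def loopy_sum_product(node_pot, edge_pot, n_iters=50, tol=1e-6):
    """Parallel sum-product BP on a pairwise MRF.

    node_pot: {i: psi_i}; edge_pot: {(i, j): table with table[x_i, x_j]}.
    Returns approximate marginals ("beliefs") for every node.
    """
    neighbors = {i: [] for i in node_pot}
    msgs = {}
    for (i, j) in edge_pot:
        neighbors[i].append(j)
        neighbors[j].append(i)
        msgs[(i, j)] = np.ones_like(node_pot[j])   # message from i to j, over x_j
        msgs[(j, i)] = np.ones_like(node_pot[i])

    for _ in range(n_iters):
        new_msgs = {}
        for (i, j) in msgs:
            psi = edge_pot[(i, j)] if (i, j) in edge_pot else edge_pot[(j, i)].T
            # Product of psi_i and every message into i except the one from j.
            incoming = node_pot[i].copy()
            for k in neighbors[i]:
                if k != j:
                    incoming *= msgs[(k, i)]
            m = psi.T @ incoming                    # sum over x_i
            new_msgs[(i, j)] = m / m.sum()
        done = max(np.abs(new_msgs[k] - msgs[k]).max() for k in msgs) < tol
        msgs = new_msgs
        if done:
            break

    beliefs = {}
    for i in node_pot:
        b = node_pot[i].copy()
        for k in neighbors[i]:
            b *= msgs[(k, i)]
        beliefs[i] = b / b.sum()
    return beliefs

# Example call with the tables from the pairwise-MRF sketch above:
# beliefs = loopy_sum_product(node_potentials, edge_potentials)
```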

Loopy BP. The parallel BP algorithm can be applied to arbitrary graphs. However, the assumptions made by BP do not hold for a loopy graph. Loopy BP is not guaranteed to converge, and if it does converge, it is not guaranteed to converge to the correct [max-]marginals. We will call the approximate [max-]marginals beliefs.

Theoretical Analysis. When will loopy BP converge? How good are the approximate [max-]marginals and the max-product assignment? I present three techniques for analyzing BP. The first two analyze the message-passing dynamics, while the third analyzes the steady-state beliefs directly.

Outline. Background: undirected graphical models; the Belief Propagation algorithm. Three techniques for analyzing loopy BP: algebraic analysis; the unwrapped tree; reparameterization. Future work.

Algebraic Analysis Overview. We first discuss an algebraic analysis of the sum-product algorithm for a single-cycle graph. We represent each message update of the sum-product algorithm as a matrix multiplication. We use linear algebra results to show the relationship between the steady-state beliefs and the true marginals, as well as convergence properties. The sum-product algorithm converges for a single-cycle graph. The convergence rate and the accuracy of the beliefs are related.

Matrix Representation. We represent each message and belief function as a vector, and the single-node and pairwise potentials as matrices.

Matrix Message Updates. For a graph consisting of a single cycle, a message update is a matrix multiplication: the new message vector is obtained by multiplying the old message vector by a matrix whose entries combine the pairwise potential with the sending node's single-node potential.

Matrix Belief Updates. For a graph consisting of a single cycle, a belief update is an elementwise (diagonal) product: the belief vector at a node is proportional to the componentwise product of its single-node potential with the two messages arriving from either direction around the cycle. A sketch of both updates appears below.
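
A sketch of the matrix view for a single cycle of binary variables (the potentials and names are illustrative assumptions): each update matrix folds the pairwise potential together with the sending node's single-node potential, so passing a message is one matrix-vector product, and a belief is an elementwise product of the node's potential with the messages arriving from the two directions around the cycle.

```python
import numpy as np

# A 4-node cycle 0 - 1 - 2 - 3 - 0 of binary variables (illustrative potentials).
node_pot = [np.array([1.0, 2.0]), np.array([2.0, 1.0]),
            np.array([1.0, 1.0]), np.array([1.5, 0.5])]
edge_pot = np.array([[2.0, 1.0], [1.0, 2.0]])   # same symmetric psi on every edge

def update_matrix(i):
    """Matrix sending a message onward from node i:
    new_m(x_next) = sum_{x_i} psi(x_i, x_next) * psi_i(x_i) * old_m(x_i)."""
    return edge_pot.T @ np.diag(node_pot[i])

def circulate(order, n_sweeps=50):
    """Pass messages repeatedly around the cycle in the given node order."""
    out = {}
    msg = np.ones(2)
    for _ in range(n_sweeps):
        for i in order:
            msg = update_matrix(i) @ msg
            msg /= msg.sum()
            out[i] = msg            # steady-state message leaving node i
    return out

cw = circulate([0, 1, 2, 3])        # clockwise messages
ccw = circulate([0, 3, 2, 1])       # counter-clockwise messages

# Belief at node 1: its potential times the messages arriving from both sides.
belief_1 = node_pot[1] * cw[0] * ccw[2]
belief_1 /= belief_1.sum()
```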

Matrix Message Updates. A series of message updates around the cycle is a series of matrix multiplications. (Figure: an example four-node cycle.)

Power Method Lemma. Repeatedly multiplying a vector by a matrix (and normalizing) converges to the principal eigenvector of that matrix. The convergence rate is governed by the ratio of the second-largest to the largest eigenvalue magnitude. This applies if the largest eigenvalue is strictly dominant (e.g., if the distribution is positive) and the initial vector is not orthogonal to the principal eigenvector.
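
A quick numerical illustration of the lemma (plain numpy, not tied to any particular MRF; the matrix here is just a random positive matrix): repeated multiplication converges to the principal eigenvector, and the error shrinks roughly like the eigenvalue ratio raised to the number of iterations.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.uniform(0.5, 2.0, size=(4, 4))   # positive entries, so a dominant eigenvalue exists

eigvals, eigvecs = np.linalg.eig(A)
order = np.argsort(-np.abs(eigvals))
v1 = np.abs(np.real(eigvecs[:, order[0]]))      # principal eigenvector (entrywise positive)
v1 /= v1.sum()
ratio = np.abs(eigvals[order[1]]) / np.abs(eigvals[order[0]])

x = np.ones(4)
for n in range(1, 31):
    x = A @ x
    x /= x.sum()
    err = np.abs(x - v1).max()
    # err decays roughly like ratio ** n
```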

True Marginals. The sums and multiplications performed when computing the true marginals are a distributed version of the sums and multiplications performed when computing the diagonal elements of the product of the message-update matrices around the cycle: the true marginal at a node is proportional to the diagonal of that matrix product.

Beliefs. The steady-state message circulating in one direction around the cycle is the principal (right) eigenvector of this matrix product, and the message circulating in the other direction is its principal left eigenvector. The steady-state beliefs are therefore the diagonal elements of the outer product of the right and left principal eigenvectors of the matrix.

Beliefs vs. True Marginals. The diagonal of the outer product of the right and left principal eigenvectors is an approximation of the diagonal of the full matrix product. The goodness of the approximation depends on how small the remaining eigenvalues are relative to the principal eigenvalue. Recall that the convergence rate depends on a similar ratio. The faster the convergence, the better the approximation.
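
A numerical sanity check of this claim (a sketch; the matrix below simply stands in for the product of message-update matrices once around the cycle): the normalized diagonal of the outer product of its principal right and left eigenvectors approximates the normalized diagonal of the matrix itself, and the gap is governed by how small the remaining eigenvalues are.

```python
import numpy as np

rng = np.random.default_rng(1)
C = rng.uniform(0.5, 2.0, size=(3, 3))   # stand-in for the around-the-cycle product

vals_r, vecs_r = np.linalg.eig(C)        # right eigenvectors of C
vals_l, vecs_l = np.linalg.eig(C.T)      # left eigenvectors of C
right = np.abs(np.real(vecs_r[:, np.argmax(np.abs(vals_r))]))
left = np.abs(np.real(vecs_l[:, np.argmax(np.abs(vals_l))]))

true_marginal = np.diag(C) / np.diag(C).sum()    # diagonal of the full product
belief = (right * left) / (right * left).sum()   # diagonal of the outer product
# 'belief' approximates 'true_marginal'; the error shrinks as the subdominant
# eigenvalues of C shrink relative to the principal one.
```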

Algebraic Analysis Recap. By representing the sum-product algorithm on a single-cycle graph as a series of matrix multiplications, we showed the following results: the sum-product algorithm converges for positive distributions; and both the convergence rate and the accuracy of the steady-state beliefs depend on how dominant the principal eigenvalue of the same matrix is.

Outline. Background: undirected graphical models; the Belief Propagation algorithm. Three techniques for analyzing loopy BP: algebraic analysis; the unwrapped tree; reparameterization. Future work.

Unwrapped Tree. To analyze the BP algorithm, we construct the unwrapped tree, an acyclic graph that is locally equivalent to the original graph. (Figure: an example four-node cycle.)

Unwrapped Tree Analysis. Proof strategy: we want to show that the max-product assignment is optimal for the original graph; we know that max-product BP computes exact max-marginals on the unwrapped tree; so we must show how assignments and likelihoods on the unwrapped tree relate to those on the original graph.

Unwrapped Tree Overview. The unwrapped tree was used to prove: the max-product assignment is exact for a graph containing a single cycle; and the max-product assignment has a higher probability than any other assignment in a large neighborhood.

Unwrapped Tree Construction. The unwrapped tree is constructed from the original graph as follows: choose an arbitrary root node and initialize the tree to contain only that node. Repeat: for each leaf of the tree, find the neighbors of the corresponding node in the original graph, other than the original node corresponding to the leaf's parent, and add copies of these nodes to the tree.

Unwrapped Tree Construction. Copy the potentials from the corresponding nodes and edges of the original graph. Modify the leaf single-node potentials to simulate the steady-state messages in the original graph. Because the two graphs are locally the same, the steady-state beliefs in the unwrapped tree will be replicas of the steady-state beliefs in the original graph.
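
A sketch of the construction loop itself (breadth-first unrolling of the graph to a fixed depth; the names are mine, and the potential-copying and leaf-potential adjustments described above are omitted):

```python
from collections import deque

def unwrapped_tree(adj, root, depth):
    """Unroll an undirected graph into a tree of the given depth.

    adj: dict mapping each node to a list of its neighbors.
    Returns labels (tree node -> original node) and children (tree node -> list).
    """
    labels = {0: root}
    children = {0: []}
    frontier = deque([(0, None)])        # (tree node, original node of its parent)
    next_id = 1
    for _ in range(depth):
        new_frontier = deque()
        for t, parent_label in frontier:
            # Add a copy of every neighbor of the corresponding original node,
            # except the original node copied by this leaf's parent.
            for nbr in adj[labels[t]]:
                if nbr == parent_label:
                    continue
                labels[next_id] = nbr
                children[next_id] = []
                children[t].append(next_id)
                new_frontier.append((next_id, labels[t]))
                next_id += 1
        frontier = new_frontier
    return labels, children

# Example: a 4-cycle unrolls into a chain in which each node repeats.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
labels, children = unwrapped_tree(adj, root=0, depth=4)
```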

Graphs Containing a Single Cycle. For a graph containing a single cycle, the unwrapped tree is an infinite chain. We can construct the unwrapped tree so that each node of the original graph is replicated the same number of times in the interior of the chain.

Graphs Containing a Single Cycle. Consider the log-likelihood of an assignment for the unwrapped tree. Since the interior of the unwrapped tree consists of replicas of the original graph, this log-likelihood equals the number of replicas times the log-likelihood of the corresponding assignment for the original graph, plus a boundary term coming from the two leaf nodes. Because the boundary term does not depend on the number of replicas, in the limit as the chain grows, maximizing the unwrapped-tree likelihood is equivalent to maximizing the original likelihood.

Optimality for Arbitrary Graphs. Consider a set of nodes whose induced subgraph contains at most one cycle per connected component. We can show that the max-product assignment has a higher probability than any assignment obtained from it by changing only the values of the variables in that set.

Outline. Background: undirected graphical models; the Belief Propagation algorithm. Three techniques for analyzing loopy BP: algebraic analysis; the unwrapped tree; reparameterization. Future work.

Reparameterization Analysis. The previous two analysis techniques analyzed the message-passing dynamics of BP. The reparameterization technique analyzes the steady-state beliefs.

Reparameterization Overview. We show that the beliefs define another parameterization of the distribution. In this parameterization, we show that the steady-state beliefs are consistent with respect to every subtree, and that the max-product assignment satisfies an optimality condition with respect to every subgraph with at most one cycle per connected component.

Steady-State Beliefs. We analyze the steady-state single-node and pairwise beliefs: the single-node belief is proportional to the product of the node's potential and all of its incoming messages, and the pairwise belief is proportional to the product of the edge potential, the two node potentials, and all messages into the two nodes except those passed along the shared edge.
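
Written out in the usual notation (which the transcription dropped; the symbols here are my reconstruction, with m_{ki} denoting the message from node k to node i), these steady-state beliefs are:

```latex
b_i(x_i) \propto \psi_i(x_i) \prod_{k \in N(i)} m_{ki}(x_i),
\qquad
b_{ij}(x_i, x_j) \propto \psi_{ij}(x_i, x_j)\, \psi_i(x_i)\, \psi_j(x_j)
  \prod_{k \in N(i) \setminus \{j\}} m_{ki}(x_i)
  \prod_{k \in N(j) \setminus \{i\}} m_{kj}(x_j).
```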

Belief Parameterization. The beliefs define another parameterization of the distribution: p(x) ∝ ∏_i b_i(x_i) ∏_(i,j) b_ij(x_i, x_j) / (b_i(x_i) b_j(x_j)), where the second product runs over the edges, as can be shown by substituting in the definitions of the beliefs in terms of the messages.

Consistency. Definition: consider a subgraph together with the distribution obtained by restricting the belief parameterization to that subgraph. The beliefs are consistent with respect to the subgraph if the corresponding beliefs are the true max-marginals of that distribution.

Edge Consistency. The edge beliefs are consistent: maximizing a pairwise belief over either of its variables yields the other variable's single-node belief (up to normalization), as can be seen by substituting in the message definitions of the single-node and pairwise beliefs. The condition is written out below.
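
In the same notation as above, the (max-product) edge-consistency condition is:

```latex
\max_{x_i} b_{ij}(x_i, x_j) \propto b_j(x_j),
\qquad
\max_{x_j} b_{ij}(x_i, x_j) \propto b_i(x_i).
```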

Tree Consistency. The steady-state beliefs are consistent with respect to every subtree of the graph. This is shown by exploiting the edge consistency described above to remove leaf nodes one at a time. In the end, we are left with only two nodes, a trivial base case.

Tree-Plus-Cycle Optimality. Consider a subgraph of the original graph with at most one cycle per connected component, together with the distribution obtained by restricting the belief parameterization to it. The max-product assignment maximizes this restricted distribution.

Tree-Plus-Cycle Optimality. Using the edge consistency described above, we can show that the restricted distribution takes a value at the max-product assignment at least as large as its value at any other assignment.

Tree-Plus-Cycle Optimality. If the subgraph is connected and contains one cycle, then its edges can be directed so that each node has exactly one parent; the restricted distribution then factors as a product, over nodes, of the node's pairwise belief with its parent divided by the parent's single-node belief.

Corollaries of Tree-Plus-Cycle Optimality. The two results proved using the unwrapped tree are corollaries of the Tree-Plus-Cycle optimality. The Tree-Plus-Cycle optimality can also be used to show an error bound on the max-product assignment for an arbitrary graph.

Future Work. I have presented three techniques for analyzing loopy BP. Experimental results are stronger than the results proved so far. Future work includes extending each technique to be more general and to prove stronger results: prove convergence properties of the max-product algorithm on a single-cycle graph using the algebraic analysis; prove the optimality of the max-product algorithm for specific multiple-loop graphs using the unwrapped tree technique; and show more powerful optimality results for arbitrary graph structures with specific potential properties.

References

Aji, S., Horn, G., and McEliece, R. (1998). On the convergence of iterative decoding on graphs with a single cycle. In IEEE International Symposium on Information Theory.

McEliece, R., Rodemich, E., and Cheng, J. (1995). The turbo decision algorithm. In 33rd Allerton Conference on Communications, Control and Computing, Monticello, IL.

Murphy, K., Weiss, Y., and Jordan, M. (1999). Loopy belief propagation for approximate inference: An empirical study. In Uncertainty in Artificial Intelligence, pages 467-475.

Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers, Inc., San Mateo, CA.

Wainwright, M. (January 2002). Stochastic Processes on Graphs with Cycles: Geometric and Variational Approaches. PhD thesis, MIT, Cambridge, MA.

Wainwright, M., Jaakkola, T., and Willsky, A. (2003). Tree-based reparameterization framework for analysis of sum-product and related algorithms. IEEE Transactions on Information Theory, 49(5).

Wainwright, M., Jaakkola, T., and Willsky, A. (October 28, 2002). Tree consistency and bounds on the performance of the max-product algorithm and its generalizations. Technical Report P-2554, Laboratory for Information and Decision Systems, MIT.

Weiss, Y. (November 1997). Belief propagation and revision in networks with loops. Technical Report AI Memo No. 1616, C.B.C.L. Paper No. 155, AI Lab, MIT.

Weiss, Y. and Freeman, W. (2001a). Correctness of belief propagation in Gaussian graphical models of arbitrary topology. Neural Computation, 13:2173-2200.

Weiss, Y. and Freeman, W. (2001b). On the optimality of solutions of the max-product belief propagation algorithm in arbitrary graphs. IEEE Transactions on Information Theory, 47(2):723-735.