
Graphical Models Dmitrij Lagutin, dlagutin@cc.hut.fi T-61.6020 - Machine Learning: Basic Principles 12.3.2007

Contents: Introduction to graphical models; Bayesian networks; Conditional independence; Markov random fields; Inference in graphical models

Introduction to graphical models. This presentation concentrates on probabilistic graphical models, which have several benefits: they provide a simple, clear way to visualize the probabilistic model, which helps e.g. in designing new models; different properties of the probabilistic model, such as conditional independence, can be read directly from the graphical model; and complex computations in sophisticated models can be expressed in terms of graphical manipulations.

Introduction to graphical models. A graph consists of nodes (vertices) that are connected by edges (links, arcs). A graph can be directed (edges have arrows to indicate a direction) or undirected (edges do not have arrows). In probabilistic graphical models each node of the graph represents a random variable, and the edges of the graph represent probabilistic relationships between these variables.

Bayesian networks. Also known as directed graphical models. Each node is associated with a random variable or with a group of variables. Edges describe conditional relationships between variables. For example, if the distribution contains a factor p(b | a), then the corresponding graphical model will have a directed edge from a to b.

Bayesian networks: example. Let the distribution be p(x1, ..., x7) = p(x1) p(x2) p(x3) p(x4 | x1, x2, x3) p(x5 | x1, x3) p(x6 | x4) p(x7 | x4, x5). Then the corresponding graphical model is the directed graph with one node per variable and a directed edge from each conditioning variable to the variable it conditions.
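As an illustrative sketch (not part of the original slides), the factorization above can be evaluated numerically as a product of conditional probability tables. The choice of binary variables and the randomly generated tables below are assumptions made only for the example; the graph structure follows the factorization shown.

```python
# Illustrative sketch: evaluate the joint p(x1,...,x7) as the product of
# conditional probability tables implied by the factorization above.
import numpy as np

rng = np.random.default_rng(0)

def random_cpt(n_parents):
    """Random conditional probability table p(x | parents) for binary variables."""
    t = rng.random((2,) * (n_parents + 1))
    return t / t.sum(axis=-1, keepdims=True)   # normalize over the child variable

# One table per factor; the parent sets follow the example factorization.
parents = {1: (), 2: (), 3: (), 4: (1, 2, 3), 5: (1, 3), 6: (4,), 7: (4, 5)}
cpt = {i: random_cpt(len(parents[i])) for i in parents}

def joint(x):
    """x maps variable index -> value in {0, 1}; returns p(x1, ..., x7)."""
    p = 1.0
    for i in range(1, 8):
        idx = tuple(x[j] for j in parents[i]) + (x[i],)
        p *= cpt[i][idx]
    return p

print(joint({i: 0 for i in range(1, 8)}))
```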

Bayesian networks: polynomial regression. Let's build a graphical model for polynomial regression. The polynomial regression model contains: a vector of polynomial coefficients w; observed data t = (t_1, ..., t_N)^T; input data x = (x_1, ..., x_N)^T; the noise variance σ^2; and a hyperparameter α.

Bayesian networks: polynomial regression. Let's concentrate first on the random variables; their joint distribution can be expressed as p(t, w) = p(w) ∏_{n=1}^{N} p(t_n | w). In the corresponding graphical model (the box, or plate, denotes that there are N values inside) there is a directed edge from w to each t_n.

Bayesian networks: polynomial regression. Machine learning problems usually have a training set; in graphical models these observed variables are denoted by shading. Sometimes it is better to show the parameters of a model explicitly. In graphical models, deterministic parameters are denoted by small, solid circles (random variables are denoted by larger, open circles).

Bayesian networks: polynomial regression. With the parameters made explicit, the distribution becomes p(t, w | x, α, σ^2) = p(w | α) ∏_{n=1}^{N} p(t_n | w, x_n, σ^2), and the graphical model is extended accordingly.

Bayesian networks: polynomial regression. The aim of the polynomial regression model is: given a new input value x̂, find the corresponding probability distribution of t̂ conditioned on the observed data. Thus, the final form of the model is p(t̂, t, w | x̂, x, α, σ^2) = [∏_{n=1}^{N} p(t_n | x_n, w, σ^2)] p(w | α) p(t̂ | x̂, w, σ^2).

Bayesian networks: polynomial regression. The corresponding final graphical model follows from this factorization: the plate over the observed t_n, the deterministic parameters x_n, α and σ^2, the coefficients w, and the new pair x̂, t̂.
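A minimal generative sketch of this model, assuming (as is standard but not stated explicitly on the slide) a zero-mean Gaussian prior p(w | α) = N(w | 0, α⁻¹I) and Gaussian observation noise; all numeric values below are illustrative.

```python
# Sketch of the generative process behind the graphical model above:
# w ~ p(w | alpha), then t_n ~ p(t_n | w, x_n, sigma^2) independently for each n.
import numpy as np

rng = np.random.default_rng(0)
alpha, sigma2, M, N = 2.0, 0.04, 4, 10        # hyperparameter, noise variance, degree+1, data size

x = np.linspace(0.0, 1.0, N)                  # input data x = (x_1, ..., x_N)
w = rng.normal(0.0, np.sqrt(1.0 / alpha), M)  # draw coefficients from p(w | alpha)
Phi = np.vander(x, M, increasing=True)        # polynomial features 1, x, x^2, ...
t = Phi @ w + rng.normal(0.0, np.sqrt(sigma2), N)   # t_n ~ N(y(x_n, w), sigma^2)

# Prediction at a new input x_hat: the mean of p(t_hat | x_hat, w, sigma^2)
x_hat = 0.5
t_hat_mean = np.vander([x_hat], M, increasing=True) @ w
print(t, t_hat_mean)
```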

Conditional independence. If p(a | b, c) = p(a | c), then a is conditionally independent of b given c, written a ⊥ b | c. We can write the above as p(a, b | c) = p(a | b, c) p(b | c) = p(a | c) p(b | c). Now the joint distribution factorises into the product of the marginal distributions of a and b, both of these marginal distributions conditioned on c.
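The factorization can be checked numerically on a toy distribution that is constructed to satisfy it; the tables below are made up purely for illustration.

```python
# Toy numerical check of p(a, b | c) = p(a | c) p(b | c) for binary a, b, c.
import numpy as np

rng = np.random.default_rng(1)
p_c = np.array([0.3, 0.7])                       # p(c)
p_a_given_c = rng.dirichlet(np.ones(2), size=2)  # rows indexed by c: p(a | c)
p_b_given_c = rng.dirichlet(np.ones(2), size=2)  # rows indexed by c: p(b | c)

# joint p(a, b, c) = p(a | c) p(b | c) p(c), indexed [a, b, c]
joint = np.einsum('ca,cb,c->abc', p_a_given_c, p_b_given_c, p_c)

p_ab_given_c = joint / joint.sum(axis=(0, 1), keepdims=True)   # p(a, b | c)
p_a_c = joint.sum(axis=1) / joint.sum(axis=(0, 1))             # p(a | c), indexed [a, c]
p_b_c = joint.sum(axis=0) / joint.sum(axis=(0, 1))             # p(b | c), indexed [b, c]

# p(a, b | c) equals p(a | c) p(b | c) for every value of c
assert np.allclose(p_ab_given_c, np.einsum('ac,bc->abc', p_a_c, p_b_c))
```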

Conditional independence: d-separation. D-separation can be used to determine whether or not a conditional independence holds in the underlying model. Some concepts: node c is a tail-to-tail node on a path if both arrows on the path point away from c; node c is a head-to-tail node if one arrow points into c and the other points away from it.

Conditional independence: d-separation. Node c is a head-to-head node on a path if both arrows on the path point into c. Node y is a descendant of node x if there is a path from x to y in which each step of the path follows the direction of the arrows.

Conditional independence: d-separation. Let A, B and C be non-intersecting sets of nodes. Consider all possible paths from any node in A to any node in B. A path is blocked if it includes a node such that either (a) the arrows on the path meet head-to-tail or tail-to-tail at the node and the node is in the set C, or (b) the arrows meet head-to-head at the node, and neither the node nor any of its descendants is in the set C. If all paths are blocked, then A is d-separated from B by C, and the conditional independence A ⊥ B | C holds.

Conditional independence: example. Node f does not block the path between a and b because it is a tail-to-tail node that is not observed. Node e does not block the path either: it is a head-to-head node, but it has a descendant c that is in the conditioning set. Thus, a and b are not conditionally independent given c.

Conditional independence: second example. Node f blocks the path between a and b because it is a tail-to-tail node that is observed; thus the conditional independence a ⊥ b | f holds. In this case node e also blocks the path, because it is a head-to-head node and neither e nor any of its descendants is in the conditioning set.
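A small sketch of both examples. The edge list below reconstructs a graph consistent with the description (a → e ← f → b, with e → c); it is an assumption, since the original figure is not reproduced here. It also assumes a networkx version (≥ 2.8) where nx.d_separated is available; newer releases rename it to nx.is_d_separator.

```python
# d-separation check for the two examples above, using networkx.
import networkx as nx

# Graph reconstructed from the slide's description (assumption):
# a -> e <- f -> b, plus e -> c, so c is a descendant of the head-to-head node e.
G = nx.DiGraph([("a", "e"), ("f", "e"), ("f", "b"), ("e", "c")])

print(nx.d_separated(G, {"a"}, {"b"}, {"c"}))   # False: observing c unblocks the path
print(nx.d_separated(G, {"a"}, {"b"}, {"f"}))   # True:  a is independent of b given f
```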

Markov random fields. Also known as Markov networks or undirected graphical models. In Markov random fields each node corresponds to a variable or to a group of variables, just as in Bayesian networks. However, the edges between nodes are undirected in Markov random fields.

Markov random fields: conditional independence. Let A, B and C be non-intersecting sets of nodes. Consider all possible paths from any node in A to any node in B; a path is blocked if it includes a node from C. If all such paths are blocked, then A and B are conditionally independent given C: A ⊥ B | C.

Markov random fields: conditional independence. In this example A and B are conditionally independent given C because all paths between A and B go through C.
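Because separation in an undirected graph is ordinary graph reachability, it can be checked by deleting C and testing connectivity. The graph and node sets below are illustrative assumptions, not the figure from the slides.

```python
# Sketch of the undirected separation test: A and B are conditionally independent
# given C if every path from A to B passes through C, i.e. removing C disconnects them.
import networkx as nx

G = nx.Graph([("a1", "c1"), ("a2", "c1"), ("c1", "c2"), ("c2", "b1"), ("c2", "b2")])
A, B, C = {"a1", "a2"}, {"b1", "b2"}, {"c1", "c2"}

H = G.copy()
H.remove_nodes_from(C)                      # delete the conditioning set
blocked = not any(nx.has_path(H, a, b) for a in A for b in B if a in H and b in H)
print(blocked)                              # True: A is separated from B by C in this graph
```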

Inference in graphical models. Let's look at Bayes' theorem. The joint distribution p(x, y) = p(x) p(y | x) can be represented by graph (a) on the next slide. If y is observed (graph (b)), then it is possible to infer the corresponding posterior distribution of x: p(x | y) = p(y | x) p(x) / p(y), where p(y) = Σ_{x'} p(y | x') p(x').

Inference in graphical models. Now the joint distribution p(x, y) can be expressed in terms of p(y) and p(x | y), as graph (c) shows: p(x, y) = p(y) p(x | y). This is a simple example of an inference problem in graphical models.
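A numeric sketch of this inference step, with made-up tables for p(x) and p(y | x):

```python
# Given p(x) and p(y | x), compute the posterior p(x | y) by Bayes' rule
# and check that p(x, y) = p(y) p(x | y). The tables are illustrative.
import numpy as np

p_x = np.array([0.6, 0.4])                      # prior p(x)
p_y_given_x = np.array([[0.9, 0.1],             # rows: p(y | x), indexed [x, y]
                        [0.3, 0.7]])

y_obs = 1
p_y = np.sum(p_y_given_x[:, y_obs] * p_x)               # p(y) = sum_x' p(y | x') p(x')
p_x_given_y = p_y_given_x[:, y_obs] * p_x / p_y         # posterior p(x | y)

print(p_x_given_y)          # posterior over x after observing y
print(p_y * p_x_given_y)    # p(x, y) = p(y) p(x | y), same as p(x) p(y | x)
```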

Inference in graphical models: factor graphs. Graphical models allow a function of several variables to be expressed as a product of factors over subsets of those variables. Factor graphs make this decomposition explicit by introducing additional nodes for the factors. Let's write the joint distribution in the form of a product of factors (x_s denotes a subset of the variables): p(x) = ∏_s f_s(x_s).

Inference in graphical models: factor graphs, example. Let the distribution be p(x) = f_a(x_1, x_2) f_b(x_1, x_2) f_c(x_2, x_3) f_d(x_3). The corresponding factor graph has a variable node for each x_i, a factor node for each f, and an edge between a factor and each variable it depends on.
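A sketch of evaluating this product of factors with made-up binary factor tables; only the structure follows the example.

```python
# Evaluate the (unnormalized) distribution above as a product of factors.
import numpy as np

rng = np.random.default_rng(2)
fa = rng.random((2, 2))   # f_a(x1, x2)
fb = rng.random((2, 2))   # f_b(x1, x2)
fc = rng.random((2, 2))   # f_c(x2, x3)
fd = rng.random(2)        # f_d(x3)

def p_tilde(x1, x2, x3):
    """Unnormalized p(x) = f_a(x1,x2) f_b(x1,x2) f_c(x2,x3) f_d(x3)."""
    return fa[x1, x2] * fb[x1, x2] * fc[x2, x3] * fd[x3]

# Normalizing constant Z: sum over all binary assignments
Z = sum(p_tilde(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1))
print(p_tilde(1, 0, 1) / Z)
```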

Inference in graphical models: factor graphs. Factor graphs are bipartite graphs: they consist of two different kinds of nodes (variable nodes and factor nodes), and all edges go between nodes of opposite type. Factor graphs are useful for building more complex models that are not covered here.
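As a final sketch, the bipartite structure of the example factor graph can be built explicitly; the node names follow the example factorization, and using networkx here is an illustrative choice.

```python
# Build the factor graph for p(x) = f_a(x1,x2) f_b(x1,x2) f_c(x2,x3) f_d(x3)
# as a bipartite graph: variable nodes on one side, factor nodes on the other.
import networkx as nx

FG = nx.Graph()
variables = ["x1", "x2", "x3"]
factors = {"fa": ["x1", "x2"], "fb": ["x1", "x2"], "fc": ["x2", "x3"], "fd": ["x3"]}

FG.add_nodes_from(variables, bipartite=0)      # variable nodes
FG.add_nodes_from(factors, bipartite=1)        # factor nodes
for f, scope in factors.items():
    FG.add_edges_from((f, v) for v in scope)   # edges only between factors and variables

print(nx.is_bipartite(FG))                     # True: all edges cross between node types
```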