Cheng Soon Ong & Christian Walder. Canberra, February to June 2018

Cheng Soon Ong & Christian Walder, Research Group and College of Engineering and Computer Science, Canberra, February to June 2018.

Outline: Overview; Introduction; Linear Algebra; Probability; Linear Regression 1; Linear Regression 2; Linear Classification 1; Linear Classification 2; Kernel Methods; Sparse Kernel Methods; Mixture Models and EM 1; Mixture Models and EM 2; Neural Networks 1; Neural Networks 2; Principal Component Analysis; Autoencoders; Graphical Models 1; Graphical Models 2; Graphical Models 3; Sampling; Sequential Data 1; Sequential Data 2.

(Many figures from C. M. Bishop, "Pattern Recognition and Machine Learning".)

Part XIII: Probabilistic Graphical Models 1

Motivation via a picture: Image Segmentation. Clustering the gray-value representation of an image loses the neighbourhood information; we need to use the structure of the image.

Motivation via an example (1/4): Why is the grass wet? Introduce four Boolean variables, C(loudy), S(prinkler), R(ain), W(etGrass), each taking values in {F(alse), T(rue)}. (Graph with nodes Cloudy, Sprinkler, Rain, WetGrass.)

Motivation via an example (2/4): Model the conditional probabilities with tables p(C), p(S | C), p(R | C) and p(W | S, R), each over {F, T} (table entries omitted here). The corresponding graph has edges Cloudy → Sprinkler, Cloudy → Rain, Sprinkler → WetGrass, Rain → WetGrass.

Motivation via an example (3/4): If everything depends on everything, we need the full joint table p(C, S, R, W), one entry per configuration (2^4 = 16 rows for four Boolean variables). By the product rule,
p(W, S, R, C) = p(W | S, R, C) p(S, R, C) = p(W | S, R, C) p(S | R, C) p(R, C) = p(W | S, R, C) p(S | R, C) p(R | C) p(C).

Motivation via an example (4/4): In the general model,
p(W) = Σ_{S,R,C} p(W | S, R, C) p(S | R, C) p(R | C) p(C).
Using the structure of the graph (Cloudy → Sprinkler, Cloudy → Rain, Sprinkler → WetGrass, Rain → WetGrass),
p(W) = Σ_{S,R} p(W | S, R) Σ_C p(S | C) p(R | C) p(C).
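
The marginal above can be checked numerically. The sketch below uses the graph structure from these slides, but the CPT entries are illustrative assumptions (the table values did not survive the transcription); it computes p(W = T) once by brute force over the full joint and once using the factored form. Both give the same number; only the amount of work differs.

```python
# Wet-grass example: C -> S, C -> R, S -> W, R -> W (structure from the slides,
# numbers below are illustrative assumptions).
from itertools import product

p_C = {True: 0.5, False: 0.5}                      # p(C = value)
p_S_given_C = {True: 0.1, False: 0.5}              # p(S = T | C)
p_R_given_C = {True: 0.8, False: 0.2}              # p(R = T | C)
p_W_given_SR = {(True, True): 0.99, (True, False): 0.9,
                (False, True): 0.9, (False, False): 0.0}  # p(W = T | S, R)

def bern(p_true, value):
    return p_true if value else 1.0 - p_true

# Brute force: sum the full joint p(W = T, S, R, C) over S, R, C.
p_W_brute = sum(bern(p_W_given_SR[s, r], True) * bern(p_S_given_C[c], s)
                * bern(p_R_given_C[c], r) * p_C[c]
                for s, r, c in product([False, True], repeat=3))

# Using the structure: p(W) = sum_{S,R} p(W | S, R) sum_C p(S | C) p(R | C) p(C).
p_SR = {(s, r): sum(bern(p_S_given_C[c], s) * bern(p_R_given_C[c], r) * p_C[c]
                    for c in [False, True])
        for s, r in product([False, True], repeat=2)}
p_W_struct = sum(bern(p_W_given_SR[s, r], True) * p_SR[s, r] for s, r in p_SR)

print(p_W_brute, p_W_struct)   # both give p(W = T), about 0.6471 with these numbers
```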

Motivation via the Distributive Law. Two key observations when dealing with probabilities:
1. The distributive law can save operations: a(b + c) takes 2 operations, while the equal expression ab + ac takes 3 operations.
2. If some probabilities do not depend on all random variables, we may be able to factor them out. For example, assume p(x_1, x_3 | x_2) = p(x_1 | x_2) p(x_3 | x_2). Then, using Σ_{x_3} p(x_3 | x_2) = 1,
p(x_1) = Σ_{x_2, x_3} p(x_1, x_2, x_3) = Σ_{x_2, x_3} p(x_1, x_3 | x_2) p(x_2) = Σ_{x_2, x_3} p(x_1 | x_2) p(x_3 | x_2) p(x_2) = Σ_{x_2} p(x_1 | x_2) p(x_2),
reducing the cost from O(|X_1| |X_2| |X_3|) to O(|X_1| |X_2|).
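
A small numerical check of the second observation: the sketch below builds p(x_1, x_2, x_3) = p(x_1 | x_2) p(x_3 | x_2) p(x_2) from random tables (the number of states K is an arbitrary choice) and compares the naive marginal against the one that exploits Σ_{x_3} p(x_3 | x_2) = 1.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 4  # number of states for each of x1, x2, x3 (arbitrary choice)

# Random conditional tables consistent with the assumed factorisation
# p(x1, x3 | x2) = p(x1 | x2) p(x3 | x2).
p_x2 = rng.dirichlet(np.ones(K))                   # p(x2)
p_x1_given_x2 = rng.dirichlet(np.ones(K), size=K)  # rows: x2, cols: x1
p_x3_given_x2 = rng.dirichlet(np.ones(K), size=K)  # rows: x2, cols: x3

# Naive marginal: build the full joint p(x1, x2, x3), then sum over x2 and x3.
joint = np.einsum('ba,bc,b->abc', p_x1_given_x2, p_x3_given_x2, p_x2)
p_x1_naive = joint.sum(axis=(1, 2))                # O(|X1| |X2| |X3|) work

# Exploiting sum_{x3} p(x3 | x2) = 1:  p(x1) = sum_{x2} p(x1 | x2) p(x2).
p_x1_fast = p_x1_given_x2.T @ p_x2                 # O(|X1| |X2|) work

print(np.allclose(p_x1_naive, p_x1_fast))          # True
```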

Motivation via Complexity Reduction: How do we deal with a more complex expression such as
p(x_1) p(x_2) p(x_3) p(x_4 | x_1, x_2, x_3) p(x_5 | x_1, x_3) p(x_6 | x_4) p(x_7 | x_4, x_5)?

Motivation via Complexity Reduction: How do we deal with a more complex expression such as
p(x_1) p(x_2) p(x_3) p(x_4 | x_1, x_2, x_3) p(x_5 | x_1, x_3) p(x_6 | x_4) p(x_7 | x_4, x_5)?
Graphical models. (Directed graph over x_1, ..., x_7 encoding exactly this factorisation.)

Probabilistic Graphical Models: Aims. Graphical models allow us to: visualise the structure of a probabilistic model; turn complex computations with formulas into manipulations of graphs; obtain insights into model properties by inspection; develop and motivate new models.

Probabilistic Graphical Models: Main Flavours. A graph consists of:
1. Nodes (vertices): each represents a random variable.
2. Edges (links, arcs; directed or undirected): each represents a probabilistic relationship.
Directed Graph: Bayesian Network (also called Directed Graphical Model), expressing causal relationships between variables.
Undirected Graph: Markov Random Field, expressing soft constraints between variables.
Factor Graph: convenient for solving inference problems (derived from Bayesian Networks or Markov Random Fields).
(Figures: a directed graph over x_1, ..., x_7; a factor graph with factors f_a, f_b, f_c, f_d; a Markov random field with nodes x_i, y_i.)

Bayesian Networks: Arbitrary Joint Distribution.
p(a, b, c) = p(c | a, b) p(a, b) = p(c | a, b) p(b | a) p(a)

Bayesian Networks: Arbitrary Joint Distribution.
p(a, b, c) = p(c | a, b) p(a, b) = p(c | a, b) p(b | a) p(a)
1. Draw a node for each conditional distribution associated with a random variable.
2. Draw an edge from each conditional distribution associated with a random variable to all other conditional distributions which are conditioned on this variable.
(Graph: a → b, a → c, b → c.) We have chosen a particular ordering of the variables!

Bayesian Networks: Arbitrary Joint Distribution. General case for K variables:
p(x_1, ..., x_K) = p(x_K | x_1, ..., x_{K-1}) ... p(x_2 | x_1) p(x_1).
The graph of this distribution is fully connected. (Prove it.) What happens if we deal with a distribution represented by a graph which is not fully connected? It can no longer be the most general distribution. The absence of edges carries important information.

Joint Factorisation → Graph.
p(x_1) p(x_2) p(x_3) p(x_4 | x_1, x_2, x_3) p(x_5 | x_1, x_3) p(x_6 | x_4) p(x_7 | x_4, x_5)

Joint Factorisation → Graph.
p(x_1) p(x_2) p(x_3) p(x_4 | x_1, x_2, x_3) p(x_5 | x_1, x_3) p(x_6 | x_4) p(x_7 | x_4, x_5)
1. Draw a node for each conditional distribution associated with a random variable.
2. Draw an edge from each conditional distribution associated with a random variable to all other conditional distributions which are conditioned on this variable.

Joint Factorisation → Graph.
p(x_1) p(x_2) p(x_3) p(x_4 | x_1, x_2, x_3) p(x_5 | x_1, x_3) p(x_6 | x_4) p(x_7 | x_4, x_5)
1. Draw a node for each conditional distribution associated with a random variable.
2. Draw an edge from each conditional distribution associated with a random variable to all other conditional distributions which are conditioned on this variable.
(Graph: x_1, x_2, x_3 → x_4; x_1, x_3 → x_5; x_4 → x_6; x_4, x_5 → x_7.)

Graph → Joint Factorisation. (Graph over x_1, ..., x_7 as before.) Can we get the expression from the graph?

Graph → Joint Factorisation. Can we get the expression from the graph?
1. Write a product of probability distributions, one for each random variable (one node per conditional distribution).
2. Add all random variables associated with parent nodes to the list of conditioning variables (an edge from one node to another means the latter's distribution is conditioned on the former).
This gives
p(x_1) p(x_2) p(x_3) p(x_4 | x_1, x_2, x_3) p(x_5 | x_1, x_3) p(x_6 | x_4) p(x_7 | x_4, x_5).
A small sketch of this read-off rule in code follows.
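
The read-off rule can be mechanised. A minimal sketch, with the parent sets of the running example hard-coded (the helper name `factorisation` is mine, not from the slides):

```python
# Reading off the joint factorisation from a directed graph given as parent lists.
parents = {
    'x1': [], 'x2': [], 'x3': [],
    'x4': ['x1', 'x2', 'x3'],
    'x5': ['x1', 'x3'],
    'x6': ['x4'],
    'x7': ['x4', 'x5'],
}

def factorisation(parents):
    """Return the product of p(x_k | pa(x_k)) as a string."""
    factors = []
    for node, pa in parents.items():
        factors.append(f"p({node} | {', '.join(pa)})" if pa else f"p({node})")
    return ' '.join(factors)

print(factorisation(parents))
# p(x1) p(x2) p(x3) p(x4 | x1, x2, x3) p(x5 | x1, x3) p(x6 | x4) p(x7 | x4, x5)
```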

Joint Factorisation ← Graph. The joint distribution defined by a graph is given by the product, over all of the nodes of the graph, of a conditional distribution for each node conditioned on the variables corresponding to the parents of the node in the graph:
p(x) = ∏_{k=1}^{K} p(x_k | pa(x_k)),
where pa(x_k) denotes the set of parents of x_k and x = (x_1, ..., x_K). The absence of edges (compared to the fully connected case) is the point.
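
A minimal sketch of evaluating this product for binary variables; the three-node graph and the CPT numbers are illustrative assumptions, not taken from the slides:

```python
# Evaluate p(x) = prod_k p(x_k | pa(x_k)) for binary variables.
parents = {'a': [], 'b': ['a'], 'c': ['a', 'b']}

# Each CPT maps a tuple of parent values to p(node = 1 | parents).
cpt = {
    'a': {(): 0.3},
    'b': {(0,): 0.8, (1,): 0.4},
    'c': {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.9},
}

def joint(assignment):
    """p(x) as the product over nodes of p(x_k | pa(x_k))."""
    prob = 1.0
    for node, pa in parents.items():
        p1 = cpt[node][tuple(assignment[p] for p in pa)]
        prob *= p1 if assignment[node] == 1 else 1.0 - p1
    return prob

print(joint({'a': 1, 'b': 0, 'c': 1}))  # 0.3 * 0.6 * 0.6 = 0.108
```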

Joint Factorisation ← Graph. Restriction: the graph must be a directed acyclic graph (DAG), i.e. there are no closed paths in the graph when moving along the directed edges. Equivalently: there exists an ordering of the nodes such that there are no edges going from any node to any lower-numbered node. (Graph over x_1, ..., x_7 as before.) Extension: a node can also hold a set of variables, or a vector.
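
The DAG restriction can be checked mechanically: a directed graph admits such a node ordering exactly when a topological sort succeeds. A sketch using Kahn's algorithm on the running example's edges:

```python
from collections import deque

edges = [('x1', 'x4'), ('x2', 'x4'), ('x3', 'x4'), ('x1', 'x5'),
         ('x3', 'x5'), ('x4', 'x6'), ('x4', 'x7'), ('x5', 'x7')]
nodes = {n for e in edges for n in e}

def topological_order(nodes, edges):
    """Return a topological ordering, or None if the graph has a directed cycle."""
    indeg = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for u, v in edges:
        children[u].append(v)
        indeg[v] += 1
    queue = deque(n for n in nodes if indeg[n] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in children[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return order if len(order) == len(nodes) else None

print(topological_order(nodes, edges))                           # a valid ordering
print(topological_order({'a', 'b'}, [('a', 'b'), ('b', 'a')]))   # None (cycle)
```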

Bayesian Networks: Normalisation. Given p(x) = ∏_{k=1}^{K} p(x_k | pa(x_k)), is p(x) normalised, i.e. Σ_x p(x) = 1?

Bayesian Networks: Normalisation. Given p(x) = ∏_{k=1}^{K} p(x_k | pa(x_k)), is p(x) normalised, i.e. Σ_x p(x) = 1?
As the graph is a DAG, there always exists a node with no outgoing edges, say x_i. Then
Σ_x p(x) = Σ_{x_1, ..., x_{i-1}, x_{i+1}, ..., x_K} [ ∏_{k=1, k≠i}^{K} p(x_k | pa(x_k)) ] Σ_{x_i} p(x_i | pa(x_i)),
and the last sum equals 1 because
Σ_{x_i} p(x_i | pa(x_i)) = Σ_{x_i} p(x_i, pa(x_i)) / p(pa(x_i)) = p(pa(x_i)) / p(pa(x_i)) = 1.
Repeat until no node is left.
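
This argument can be verified numerically on a small binary network (same toy network and illustrative CPT values as in the earlier sketch): summing the factored joint over all configurations gives 1, and summing out the childless node first leaves the smaller network's joint, exactly as in the proof.

```python
import itertools

parents = {'a': [], 'b': ['a'], 'c': ['a', 'b']}
cpt = {'a': {(): 0.3},
       'b': {(0,): 0.8, (1,): 0.4},
       'c': {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.9}}

def cond(node, value, assignment):
    """p(node = value | parents), read from the CPT."""
    p1 = cpt[node][tuple(assignment[p] for p in parents[node])]
    return p1 if value == 1 else 1.0 - p1

# Brute-force sum of prod_k p(x_k | pa(x_k)) over all 2^3 assignments.
total = 0.0
for vals in itertools.product([0, 1], repeat=3):
    assignment = dict(zip(parents, vals))
    prob = 1.0
    for node in parents:
        prob *= cond(node, assignment[node], assignment)
    total += prob
print(total)   # 1.0

# Summing out the childless node c first (for fixed a, b) leaves p(a) p(b | a),
# because sum_c p(c | a, b) = 1 for every (a, b).
a, b = 1, 0
summed = sum(cond('a', a, {'a': a}) * cond('b', b, {'a': a, 'b': b})
             * cond('c', c, {'a': a, 'b': b, 'c': c}) for c in (0, 1))
print(summed, cond('a', a, {'a': a}) * cond('b', b, {'a': a, 'b': b}))  # equal
```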

Bayesian Networks - Bayesian polynomial regression: observed inputs x, observed targets t = (t_1, ..., t_N), noise variance σ², hyperparameter α controlling the prior for w. Focusing on t and w only:
p(t, w) = p(w) ∏_{n=1}^{N} p(t_n | w).
(Graph: w → t_1, ..., w → t_N.)

Bayesian Networks - Bayesian polynomial regression (continued).
p(t, w) = p(w) ∏_{n=1}^{N} p(t_n | w).
Instead of drawing w → t_1, ..., w → t_N explicitly, the repeated node t_n can be drawn once inside a box labelled N: a plate.

Bayesian Networks - Include the parameters in the graphical model:
p(t, w | x, α, σ²) = p(w | α) ∏_{n=1}^{N} p(t_n | w, x_n, σ²).
Random variables are drawn as open circles; deterministic variables as smaller solid circles. (Graph: α → w; x_n, w, σ² → t_n, with x_n and t_n inside a plate over N.)

Bayesian Networks - Random variables. Observed random variables, e.g. t. Unobserved random variables, e.g. w (also called latent or hidden random variables). Shade the observed random variables in the graphical model. (Graph as before, with the t_n nodes shaded.)

Bayesian Networks - Prediction: new data point x̂; we want to predict t̂.
p(t̂, t, w | x̂, x, α, σ²) = [ ∏_{n=1}^{N} p(t_n | x_n, w, σ²) ] p(w | α) p(t̂ | x̂, w, σ²).
(Graph: the polynomial regression model including the prediction nodes x̂ and t̂.)
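
For concreteness, a sketch of this model assuming the standard Gaussian choices p(w | α) = N(w | 0, α⁻¹ I) and Gaussian noise with variance σ² (the slides only say that α controls the prior); integrating w out gives the predictive distribution at x̂. The polynomial degree, α, σ² and the toy data below are my choices, not values from the slides.

```python
import numpy as np

def phi(x, degree=3):
    """Polynomial feature map phi(x) = (1, x, x^2, ..., x^degree)."""
    return np.vander(np.atleast_1d(x), degree + 1, increasing=True)

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
t = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.shape)   # toy targets
alpha, sigma2 = 2.0, 0.1 ** 2

Phi = phi(x)                                        # N x M design matrix
# Posterior p(w | t) = N(m_N, S_N) for the conjugate Gaussian prior/likelihood.
S_N = np.linalg.inv(alpha * np.eye(Phi.shape[1]) + Phi.T @ Phi / sigma2)
m_N = S_N @ Phi.T @ t / sigma2

# Predictive distribution p(t_hat | x_hat, x, t, alpha, sigma2): integrate w out.
x_hat = 0.25
phi_hat = phi(x_hat)[0]
pred_mean = phi_hat @ m_N
pred_var = sigma2 + phi_hat @ S_N @ phi_hat
print(pred_mean, pred_var)
```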

Conditional Independence. Any p(x) may be written as
p(x) = ∏_{k=1}^{K} p(x_k | x_1, x_2, ..., x_{k-1}) = ∏_{k=1}^{K} p(x_k | pa(x_k)).
What does it mean if pa(x_k) ⊊ {x_1, x_2, ..., x_{k-1}} in the above equation? A sparse graphical model; p(x) is no longer the most general distribution. The assumption, e.g. p(W | C, S, R) = p(W | S, R), is a conditional independence.

Conditional Independence - Definition.
Definition (Conditional Independence): If for three random variables a, b, and c the following holds
p(a | b, c) = p(a | c),
then a is conditionally independent of b given c. Notation: a ⊥⊥ b | c. The above equation must hold for all possible values of c.
Consequence: p(a, b | c) = p(a | b, c) p(b | c) = p(a | c) p(b | c).
Conditional independence simplifies both the structure of the model and the computations needed to perform inference/learning. NB: conditional dependence is a related but different concept that we are not concerned with in this course.
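
The definition can be checked numerically from a joint table. The sketch below builds a joint with an assumed tail-to-tail structure p(a, b, c) = p(a | c) p(b | c) p(c) (anticipating the examples on the next slides, with illustrative numbers) and tests both a ⊥⊥ b | c and a ⊥⊥ b; the helper names are mine.

```python
import numpy as np

p_c = np.array([0.4, 0.6])                 # p(c)
p_a_given_c = np.array([[0.9, 0.1],        # rows: c, cols: a
                        [0.3, 0.7]])
p_b_given_c = np.array([[0.2, 0.8],        # rows: c, cols: b
                        [0.5, 0.5]])
joint = np.einsum('ca,cb,c->abc', p_a_given_c, p_b_given_c, p_c)  # p(a, b, c)

def cond_indep(joint, tol=1e-12):
    """Check a ⊥⊥ b | c, i.e. p(a, b | c) = p(a | c) p(b | c) for all c."""
    p_ab_given_c = joint / joint.sum(axis=(0, 1), keepdims=True)
    p_a_c = joint.sum(axis=1) / joint.sum(axis=(0, 1))            # p(a | c)
    p_b_c = joint.sum(axis=0) / joint.sum(axis=(0, 1))            # p(b | c)
    return np.allclose(p_ab_given_c, p_a_c[:, None, :] * p_b_c[None, :, :], atol=tol)

def marg_indep(joint, tol=1e-12):
    """Check a ⊥⊥ b, i.e. p(a, b) = p(a) p(b)."""
    p_ab = joint.sum(axis=2)
    return np.allclose(p_ab, np.outer(p_ab.sum(axis=1), p_ab.sum(axis=0)), atol=tol)

print(cond_indep(joint))   # True:  a ⊥⊥ b | c holds by construction
print(marg_indep(joint))   # False: a ⊥⊥ b does not hold in general
```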

Conditional Independence - Rules.
Symmetry: X ⊥⊥ Y | Z ⇒ Y ⊥⊥ X | Z
Decomposition: Y, W ⊥⊥ X | Z ⇒ Y ⊥⊥ X | Z and W ⊥⊥ X | Z
Weak Union: X ⊥⊥ Y, W | Z ⇒ X ⊥⊥ Y | Z, W
Contraction: X ⊥⊥ W | Z, Y and X ⊥⊥ Y | Z ⇒ X ⊥⊥ W, Y | Z
Intersection: X ⊥⊥ Y | Z, W and X ⊥⊥ W | Z, Y ⇒ X ⊥⊥ Y, W | Z
Note: Intersection is only valid for p(x), p(y), p(z), p(w) > 0.

Conditional Independence: TT (1/3). Can we work with the graphical model directly? Check the simplest examples containing only three nodes. The first example has joint distribution
p(a, b, c) = p(a | c) p(b | c) p(c).
Marginalise both sides over c:
p(a, b) = Σ_c p(a | c) p(b | c) p(c) ≠ p(a) p(b) in general.
Hence a ⊥⊥ b | ∅ does not hold (where ∅ is the empty set). (Graph: c → a, c → b.)

Conditional Independence: TT (2/3). Now condition on c:
p(a, b | c) = p(a, b, c) / p(c) = p(a | c) p(b | c).
Therefore a ⊥⊥ b | c. (Graph: c → a, c → b, with c observed.)

Conditional Independence: TT (3/3). Graphical interpretation. In both graphical models there is a path from a to b. The node c is called tail-to-tail (TT) with respect to this path because c is connected to the tails of the arrows in the path. The presence of the unobserved TT-node c (left figure) renders a dependent on b (and b dependent on a). Conditioning on c blocks the path from a to b and causes a and b to become conditionally independent given c. (Left: c unobserved: not a ⊥⊥ b. Right: c observed: a ⊥⊥ b | c.)

Conditional Independence: HT (1/3). Second example:
p(a, b, c) = p(a) p(c | a) p(b | c).
Marginalise over c to test for independence:
p(a, b) = p(a) Σ_c p(c | a) p(b | c) = p(a) p(b | a) ≠ p(a) p(b) in general.
Hence a ⊥⊥ b does not hold. (Graph: a → c → b.)

Conditional Independence: HT (2/3). Now condition on c:
p(a, b | c) = p(a, b, c) / p(c) = p(a) p(c | a) p(b | c) / p(c) = p(a | c) p(b | c),
where we used Bayes' theorem, p(c | a) = p(a | c) p(c) / p(a). Therefore a ⊥⊥ b | c. (Graph: a → c → b, with c observed.)

Conditional Independence: HT (3/3). Graphical interpretation. The node c is head-to-tail (HT) with respect to the path from a to b; conditioning on c blocks the path. (Left: a → c → b with c unobserved: not a ⊥⊥ b. Right: c observed: a ⊥⊥ b | c.)

Conditional Independence: HH (1/4). Third example (a little bit more subtle):
p(a, b, c) = p(a) p(b) p(c | a, b).
Marginalise over c to test for independence:
p(a, b) = Σ_c p(a) p(b) p(c | a, b) = p(a) p(b) Σ_c p(c | a, b) = p(a) p(b).
So a and b are independent if NO variable is observed: a ⊥⊥ b. (Graph: a → c ← b.)

Conditional Independence: HH (2/4). Now condition on c:
p(a, b | c) = p(a, b, c) / p(c) = p(a) p(b) p(c | a, b) / p(c) ≠ p(a | c) p(b | c) in general.
Hence a ⊥⊥ b | c does not hold. (Graph: a → c ← b, with c observed.)

Conditional Independence: HH (3/4). Graphical interpretation. In both graphical models there is a path from a to b. The node c is called head-to-head (HH) with respect to this path because c is connected to the heads of the arrows in the path. The presence of the unobserved HH-node c (left figure) makes a independent of b (and b independent of a): the unobserved c blocks the path from a to b. (Left: a → c ← b with c unobserved: a ⊥⊥ b. Right: c observed: not a ⊥⊥ b | c.)
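
The head-to-head behaviour ("explaining away") is easy to demonstrate numerically. In the sketch below the prior tables and p(c | a, b) are illustrative numbers: marginally p(a, b) = p(a) p(b), but conditioning on c breaks the factorisation.

```python
import numpy as np

p_a = np.array([0.7, 0.3])                       # p(a)
p_b = np.array([0.6, 0.4])                       # p(b)
p_c_given_ab = np.array([[[0.99, 0.01],          # p(c | a, b), indexed [a, b, c]
                          [0.30, 0.70]],
                         [[0.20, 0.80],
                          [0.05, 0.95]]])
joint = p_a[:, None, None] * p_b[None, :, None] * p_c_given_ab   # p(a, b, c)

# Marginally (no variable observed): p(a, b) = p(a) p(b), so a ⊥⊥ b.
p_ab = joint.sum(axis=2)
print(np.allclose(p_ab, np.outer(p_a, p_b)))     # True

# Conditioned on c = 1: p(a, b | c) no longer factorises, so a ⊥⊥ b | c fails.
p_ab_given_c1 = joint[:, :, 1] / joint[:, :, 1].sum()
p_a_given_c1 = p_ab_given_c1.sum(axis=1)
p_b_given_c1 = p_ab_given_c1.sum(axis=0)
print(np.allclose(p_ab_given_c1, np.outer(p_a_given_c1, p_b_given_c1)))  # False
```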

Conditional Independence: HH (4/4). Graphical interpretation. Conditioning on c unblocks the path from a to b and renders a conditionally dependent on b given c. Some more terminology: node y is a descendant of node x if there is a path from x to y in which each step follows the directions of the arrows. An HH-node on a path will become unblocked if either the node, or any of its descendants, is observed. (Figures: a → c ← b with c observed: not a ⊥⊥ b | c; and a larger graph where observing a descendant of the HH-node likewise unblocks the path.)

Conditional Independence - General Graphs. Conditional independence (and dependence) has been established for all configurations of three variables: (HH, HT, TT) × (observed, unobserved), plus the subtle case of an HH junction with an observed descendant. We can generalise these results to arbitrary Bayesian Networks. Roughly: the graph connectivity implies conditional independence for those sets of nodes which are directionally separated.

Conditional Independence - D-Separation.
Definition (Blocked Path): A blocked path is a path which contains an observed TT- or HT-node, or an unobserved HH-node whose descendants are all unobserved.
(Figures: the three canonical three-node graphs c → a, c → b; a → c → b; a → c ← b.)

Conditional Independence - D-Separation. Consider a general directed graph in which A, B, and C are arbitrary non-intersecting sets of nodes. (There may be other nodes in the graph which are not contained in the union of A, B, and C.) Consider all possible paths from any node in A to any node in B. Any such path is blocked if it includes a node such that either the node is HT or TT and the node is in the set C, or the node is HH and neither the node nor any of its descendants is in the set C. If all paths are blocked, then A is d-separated from B by C, and the joint distribution over all the variables in the graph will satisfy A ⊥⊥ B | C. (Note: d-separation stands for directional separation.)

Conditional Independence - D-separation: Example. The path from a to b is not blocked by f because f is a TT-node and unobserved. The path from a to b is not blocked by e because e is an HH-node and, although unobserved itself, one of its descendants (node c) is observed. (Graph: f → a, f → e, b → e, e → c, with c observed.)

Conditional Independence - D-separation: Another example. The path from a to b is blocked by f because f is a TT-node and observed. Therefore a ⊥⊥ b | f. Furthermore, the path from a to b is also blocked by e because e is an HH-node and neither it nor its descendants are observed, which again gives a ⊥⊥ b | f. (Graph as before, now with f observed and c unobserved.)
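
The blocking rule can be implemented directly: enumerate all paths and check each for a blocking node. The sketch below is deliberately naive (exponential in the number of paths); the example graph is my reconstruction of the one used in these two slides, assuming edges f → a, f → e, b → e and e → c.

```python
from itertools import product

edges = {('f', 'a'), ('f', 'e'), ('b', 'e'), ('e', 'c')}

def descendants(node):
    """All nodes reachable from `node` following the arrows."""
    out, stack = set(), [node]
    while stack:
        u = stack.pop()
        for (p, ch) in edges:
            if p == u and ch not in out:
                out.add(ch)
                stack.append(ch)
    return out

def paths(start, goal, visited=()):
    """All simple paths between start and goal, ignoring edge direction."""
    if start == goal:
        yield visited + (start,)
        return
    for (u, v) in edges:
        for nxt in ((v,) if u == start else ()) + ((u,) if v == start else ()):
            if nxt not in visited + (start,):
                yield from paths(nxt, goal, visited + (start,))

def blocked(path, C):
    """Is this path blocked by the observed set C?"""
    for i in range(1, len(path) - 1):
        prev, v, nxt = path[i - 1], path[i], path[i + 1]
        head_to_head = (prev, v) in edges and (nxt, v) in edges
        if head_to_head:
            if v not in C and not (descendants(v) & C):
                return True          # unobserved HH-node with unobserved descendants
        elif v in C:
            return True              # observed TT- or HT-node
    return False

def d_separated(A, B, C):
    return all(blocked(p, C) for a, b in product(A, B) for p in paths(a, b))

print(d_separated({'a'}, {'b'}, {'c'}))   # False: the path a-f-e-b is unblocked
print(d_separated({'a'}, {'b'}, {'f'}))   # True:  blocked at f (and at e)
```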

Conditional Independence ⇔ Factorisation (1/4).
Theorem (Factorisation ⇒ Conditional Independence): If a probability distribution factorises according to a directed acyclic graph, and if A, B and C are disjoint subsets of nodes such that A is d-separated from B by C in the graph, then the distribution satisfies A ⊥⊥ B | C.
Theorem (Conditional Independence ⇒ Factorisation): If a probability distribution satisfies the conditional independence statements implied by d-separation over a particular directed graph, then it also factorises according to the graph.

Conditional Independence ⇔ Factorisation (2/4). Why is Conditional Independence ⇒ Factorisation relevant? Conditional independence statements are usually what a domain expert knows about the problem at hand. What is needed is a model p(x) for computation. The Conditional Independence ⇒ Factorisation theorem provides p(x) from conditional independence statements: one can build a global model for computation from local conditional independence statements.

Conditional Independence ⇔ Factorisation (3/4). Given a set of conditional independence statements, adding another statement will in general produce further statements. Can all such sets of statements be read as d-separation in a DAG? No: there are sets of conditional independence statements which cannot be satisfied by any Bayesian Network.

Conditional Independence ⇔ Factorisation (4/4). The broader picture: a directed graphical model can be viewed as a filter that accepts probability distributions p(x) and only lets through those which satisfy the factorisation property. The set of all distributions p(x) which pass through the filter is denoted DF. Alternatively, only those distributions p(x) pass through the filter (graph) which respect the conditional independencies implied by the d-separation properties of the graph. The d-separation theorem says that the resulting set DF is the same in both cases. (Figure: p(x) passing through a filter into the set DF.)
