Reasoning About Uncertainty

Graphical representation of causal relations (examples)
Graphical models
Inference in graphical models (introduction)

Example 1: Icy Roads (Jensen, 1996)
Example 2: Wet Grass (Jensen, 1996)

Example 3: Burglary/Earthquake (Jensen, 1996)
Graph Connections

Serial connection
Diverging connection
Converging connection

Serial Connection
Diverging Connection

Converging Connection
Conditional Dependence (Jensen, 1996)

Graphical Models

Directed graphs
Undirected graphs
Chain graphs
Directed Graphs

Directed graphs represent causality; they encode independence and conditional independence.
An ordered pair (V, D): V is a set of vertices; D is a set of directed edges X → Y between vertices in V.
A path is a sequence of edges, each sharing an endpoint with the next; paths may be acyclic, cyclic or directed.
A directed cycle (loop) is a directed path that starts and ends at the same vertex; graphs with no directed cycles are acyclic, otherwise cyclic.

Examples: Directed Graphs (cont.)
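The acyclic/cyclic distinction above can be checked mechanically: a directed graph is acyclic (a DAG) exactly when depth-first search finds no back edge. A minimal sketch, where the adjacency-dict representation is an assumption for illustration rather than anything from the lecture:

```python
# Sketch: test whether a directed graph is acyclic via DFS.
# The dict-of-children representation is an illustrative assumption.
def is_acyclic(succ):
    """succ maps each vertex to an iterable of its children."""
    GREY, BLACK = 1, 2          # GREY: on the current DFS path; BLACK: finished
    colour = {}

    def visit(v):
        colour[v] = GREY
        for w in succ.get(v, ()):
            if colour.get(w) == GREY:     # back edge => directed cycle
                return False
            if colour.get(w) is None and not visit(w):
                return False
        colour[v] = BLACK
        return True

    return all(colour.get(v) is not None or visit(v) for v in succ)

print(is_acyclic({"A": ["B"], "B": ["C"]}))              # True: A -> B -> C
print(is_acyclic({"A": ["B"], "B": ["C"], "C": ["A"]}))  # False: directed cycle
```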
D-Separation

Two variables A and B in a DAG are d-separated if, for every path between A and B, there is an intermediate variable V such that either:
- the connection is serial or diverging and the state of V is known, or
- the connection is converging and neither V nor any of V's descendants have received evidence.

Examples of d-separation (Jensen, 1996)
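The path-based criterion above can also be tested through Lauritzen's equivalent formulation: A and B are d-separated by Z iff Z separates them in the moralised ancestral graph of {A, B} ∪ Z. A minimal sketch, with the parent-dict representation assumed for illustration:

```python
# Sketch of a d-separation test via the moralised ancestral graph.
# `dag` maps each node to the set of its parents (an assumed representation).
from collections import deque

def ancestors(dag, nodes):
    """`nodes` together with every node having a directed path into them."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in dag.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def d_separated(dag, x, y, z):
    keep = ancestors(dag, {x, y} | set(z))
    # Moralise: undirect the retained edges and marry parents of a common child.
    adj = {v: set() for v in keep}
    for child in keep:
        parents = [p for p in dag.get(child, ()) if p in keep]
        for p in parents:
            adj[p].add(child); adj[child].add(p)
        for i, p in enumerate(parents):            # marry the parents
            for q in parents[i + 1:]:
                adj[p].add(q); adj[q].add(p)
    # BFS from x with the nodes in z blocked; d-separated iff y is unreachable.
    frontier, seen = deque([x]), {x} | set(z)
    while frontier:
        for n in adj[frontier.popleft()]:
            if n == y:
                return False
            if n not in seen:
                seen.add(n); frontier.append(n)
    return True

# Converging connection A -> C <- B: A and B are d-separated with no
# evidence, but become dependent once the common child C is observed.
dag = {"C": {"A", "B"}}
print(d_separated(dag, "A", "B", set()))   # True
print(d_separated(dag, "A", "B", {"C"}))   # False
```

Note how the converging case falls out naturally: C is only kept (and its parents married) when C enters the ancestral set, i.e. when it or a descendant is conditioned on.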
Directed Global Markov Property

The directed global Markov property, i.e. d-separation, is defined as: X ⊥ Y | Z [DG] if X and Y are d-separated by Z in the DAG DG.

Example
DG1: A ⊥ C | B; A ⊥ C | {B,D}; B ⊥ D | C; B ⊥ D | A
DG2: A ⊥ C | D; A ⊥ C | {B,D}; B ⊥ D | C; B ⊥ D | A
DG3: A ⊥ D | C; B ⊥ A | C; B ⊥ D | {C,A}
Undirected Graphs

An undirected graph (UG) is also known as a Markov random field.
An ordered pair (V, U): V is a set of vertices; U is a set of undirected edges X - Y between vertices.
For disjoint sets of vertices X, Y and Z (Z may be empty): if there is no path from a variable X ∈ X to a variable Y ∈ Y that avoids every variable in Z, then X and Y are said to be separated by Z.

Undirected Graphs (cont.)

Examples:
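Separation in an undirected graph is plain reachability: X and Y are separated by Z exactly when no member of Y can be reached from X once the vertices in Z are blocked. A small sketch, with the edge-list representation assumed for illustration:

```python
# Sketch: separation in an undirected graph as blocked reachability.
# The edge-list representation is an illustrative assumption.
from collections import deque

def separated(edges, xs, ys, zs):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    frontier = deque(set(xs) - set(zs))
    seen = set(frontier) | set(zs)      # vertices in Z block traversal
    while frontier:
        for n in adj.get(frontier.popleft(), ()):
            if n in ys:
                return False
            if n not in seen:
                seen.add(n); frontier.append(n)
    return True

# A - B - C: A and C are separated by {B}, but not by the empty set.
edges = [("A", "B"), ("B", "C")]
print(separated(edges, {"A"}, {"C"}, {"B"}))   # True
print(separated(edges, {"A"}, {"C"}, set()))   # False
```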
Undirected Global Markov Property

The undirected global Markov property, i.e. separation, is defined as: X ⊥ Y | Z [UG] if X and Y are separated by Z in UG.

Example
UG1: A ⊥ D | C; A ⊥ D | {B,C}; B ⊥ D | C; A ⊥ B | C; A ⊥ B | {C,D}
UG2: A ⊥ C | B; A ⊥ D | {C,B}; A ⊥ D | B; A ⊥ D | C; D ⊥ B | C

Chain Graphs

A chain graph (CG), admitting both directed and undirected edges, generalises graphical models based on purely directed or purely undirected graphs.
A chain graph has no partially directed cycles, where a partially directed cycle is a sequence of n distinct edges E_1, ..., E_n (n ≥ 3) with endpoints X_i, X_{i+1} respectively, such that:
i. X_1 = X_{n+1},
ii. for all i (1 ≤ i ≤ n), either X_i - X_{i+1} or X_i → X_{i+1}, and
iii. there exists j (1 ≤ j ≤ n) such that X_j → X_{j+1}.
Examples: Chain Graphs (cont.)

(a) CG1 (b) CG2
(a) graphs containing partially directed cycles and (b) chain graphs

Lauritzen-Wermuth-Frydenberg (LWF) Global Markov Property

The LWF global Markov property for chain graphs is defined as: X ⊥ Y | Z [CG] if X is separated from Y by Z in the undirected moral graph based on CG.

Example
CG1: A ⊥ D | C; A ⊥ D | {B,C}; B ⊥ D | C; A ⊥ B | C; A ⊥ B | {C,D}
CG2: A ⊥ D | C; A ⊥ D | {C,B}; A ⊥ C | B; A ⊥ D | B; D ⊥ B | C
Chain Graphs (cont.)

A complex in CG is a subgraph of the form X → V_1 - ... - V_n ← Y (n ≥ 1).
Moralisation is needed to derive the Markov property for chain graphs. Moralisation is achieved by adding the undirected edge X - Y to each complex. A moral graph is the undirected graph created by moralising all complexes in CG and then replacing all directed edges with undirected edges.

Exact Inference in Graphical Models

Directed graphs: DAG, Bayesian network, belief network.
Representation of the joint probability using prior and conditional probabilities.
Edges (arcs) and vertices (nodes); causality; inference is represented graphically.
Exact Inference in DAGs: Example 1 (Bayes Theorem)

P(X) P(Y|X) = P(X,Y) = P(Y) P(X|Y)

Example 1: Bayes Theorem (cont.)

(Revision) Bayes theorem combines known (and observed) probabilities to compute an unknown probability of interest:

P(A|B) = P(B|A) P(A) / P(B)

Bayes theorem can be formulated as:

posterior = (likelihood × prior) / evidence
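As a quick numeric sketch of Bayes theorem, in the spirit of the Icy Roads example; all numbers below are illustrative assumptions, not values from the lecture. The evidence is obtained by marginalising the likelihood over the prior:

```python
# Bayes theorem: posterior = likelihood * prior / evidence.
# All numbers are illustrative assumptions, not from the lecture.
p_x = {"icy": 0.7, "not_icy": 0.3}           # prior P(X), assumed
p_y_given_x = {"icy": 0.8, "not_icy": 0.1}   # likelihood P(Y=crash | X), assumed

# Evidence (marginal): P(Y=crash) = sum_x P(Y=crash | X=x) P(X=x)
p_y = sum(p_y_given_x[x] * p_x[x] for x in p_x)

# Posterior: P(X=icy | Y=crash)
posterior = p_y_given_x["icy"] * p_x["icy"] / p_y
print(round(p_y, 3), round(posterior, 3))    # 0.59 0.949
```

Observing a crash raises the probability of icy roads from the prior 0.7 to roughly 0.95, exactly the posterior = likelihood × prior / evidence pattern above.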
Example 1: Bayes Theorem (cont.)

Consider the first (left) model and assume we are given P(X), P(Y|X) and the observation Y = y. We are interested in the posterior probability P(X | Y = y). Employing the product rule of probability we derive the evidence (marginal distribution) P(Y) = Σ_x P(Y | X = x) P(X = x), and then use Bayes theorem to calculate:

P(X | Y = y) = P(Y = y | X) P(X) / P(Y = y)

Exact Inference in DAGs: Example 2 (Conditional Independence)

We assume the structure Z - X - Y with arbitrary edge directions and are interested in computing P(Y | Z = z) given P(Z,X), P(X) and P(Y,X). We first derive the joint probability:

P(Z,X) = P(Z|X) P(X)
P(Y,X) = P(Y|X) P(X)
P(X,Y,Z) = P(Z,X) P(Y,X) / P(X)
Example 2: Conditional Independence (cont.)

and then extract:

P(Y | Z,X) = P(X,Y,Z) / P(Z,X) = P(Y|X) P(X) P(Z|X) / P(Z,X) = P(Y|X)

Example 2: Conditional Independence (cont.)

The joint probability of different possible models can be factorised according to similar guidelines. For example, Z - X - Y can take different forms, each factorised differently:

Z ← X → Y: P(X,Y,Z) = P(X) P(Y|X) P(Z|X)
Z → X → Y: P(X,Y,Z) = P(Y|X) P(X|Z) P(Z)
Z ← X ← Y: P(X,Y,Z) = P(X|Y) P(Z|X) P(Y)
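The conclusion P(Y | Z,X) = P(Y|X) can be checked numerically for the diverging model Z ← X → Y; the probability tables below are made-up values for the sketch, not from the lecture:

```python
# Numeric check of the derivation for Z <- X -> Y:
# P(X,Y,Z) = P(Z|X) P(Y|X) P(X), hence P(Y | Z,X) = P(Y|X).
# All table values are illustrative assumptions.
p_x = {0: 0.4, 1: 0.6}
p_y_x = {(0, 0): 0.3, (1, 0): 0.7, (0, 1): 0.9, (1, 1): 0.1}  # P(Y=y|X=x), keyed (y, x)
p_z_x = {(0, 0): 0.5, (1, 0): 0.5, (0, 1): 0.2, (1, 1): 0.8}  # P(Z=z|X=x), keyed (z, x)

def joint(x, y, z):
    """P(X=x, Y=y, Z=z) for the diverging factorisation."""
    return p_z_x[(z, x)] * p_y_x[(y, x)] * p_x[x]

def p_zx(z, x):
    """P(Z=z, X=x) by marginalising Y out of the joint."""
    return sum(joint(x, y, z) for y in (0, 1))

for x in (0, 1):
    for y in (0, 1):
        for z in (0, 1):
            lhs = joint(x, y, z) / p_zx(z, x)   # P(Y=y | Z=z, X=x)
            assert abs(lhs - p_y_x[(y, x)]) < 1e-12
print("P(Y | Z,X) == P(Y|X) for all states")
```

The same tables also confirm that the joint sums to 1, so the factorisation is a valid distribution; the two serial factorisations listed above yield the same conditional independence of Y and Z given X.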