ALGORITHMS FOR DECISION SUPPORT
Bayesian networks
Guest Lecturer: Silja Renooij
Thanks to Thor Whalen for kindly contributing to these slides
Probabilistic Independence
Conditional Independence
Conditional Independence
Chain rule & independence
Independence & space/time complexity
Efficient representation of independence
Bayesian network (BN)
Bayesian network queries
  Probability of a hypothesis given evidence: $\Pr(H = h \mid E = e)$
  Most probable hypothesis given evidence: $\arg\max_{h} \Pr(H = h \mid E = e)$
Complexity of queries (decision versions): all NP-hard
Inference algorithms
  Exact inference:
    Variable elimination
    Message passing (Pearl)
    Junction tree propagation (aka join tree/Hugin prop.)
  Approximate inference:
    Loopy belief propagation
    Stochastic sampling (various Monte Carlo methods!)
  In general, approximation within a guaranteed margin of error does not reduce the complexity of inference.
Idea behind the Junction tree algorithm
  Some clever algorithm.
  Many problems that are hard on arbitrary graphs are easy on tree-like structures.
Or more specifically
  Bayesian Network: one-dim. stochastic variables, conditional probabilities
  Secondary Structure (Junction Tree): multi-dim. stochastic variables, cluster potentials
  [Figure: a Bayesian network and its junction tree, with labels "clique or bag" and "separator"]
Let's take a couple of steps back
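VE: example in the Asia network
  We are interested in $\Pr(d)$.
  Need to sum out (eliminate): $v, s, x, t, l, a, b$
  Initial factors: $\Pr(v)\,\Pr(s)\,\Pr(t \mid v)\,\Pr(l \mid s)\,\Pr(b \mid s)\,\Pr(a \mid t, l)\,\Pr(x \mid a)\,\Pr(d \mid a, b)$
  Brute force: $\Pr(d) = \sum_v \sum_s \sum_x \sum_t \sum_l \sum_a \sum_b \Pr(v)\,\Pr(s)\,\Pr(t \mid v)\,\Pr(l \mid s)\,\Pr(b \mid s)\,\Pr(a \mid t, l)\,\Pr(x \mid a)\,\Pr(d \mid a, b)$
  But let's try something more elegant.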
VE example continued
  Eliminate variables in order: $v, s, x, t, l, a, b$
  Combine all initial factors using $v$: $f_v(t) = \sum_v \Pr(v)\,\Pr(t \mid v)$ (more or less joins V and T)
  [Note: although $f_v(t) = \Pr(t)$, in general the result of elimination is not necessarily a probability term]
VE example continued
  Eliminate variables in order: $v, s, x, t, l, a, b$
  Combine initial factors for this iteration: $f_s(b, l) = \sum_s \Pr(s)\,\Pr(b \mid s)\,\Pr(l \mid s)$
  [Note: the result of elimination may be a function of several variables; B and L thus become connected]
VE example continued
  Eliminate variables in order: $v, s, x, t, l, a, b$
  Combine factors for this iteration: $f_x(a) = \sum_x \Pr(x \mid a)$
  [Note: $f_x(a) = 1$ for all values $a$ of A]
VE example continued
  Eliminate variables in order: $v, s, x, t, l, a, b$
  Combine factors for this iteration: $f_t(a, l) = \sum_t f_v(t)\,\Pr(a \mid t, l)$
  [Note: factors can include other f's; this factor joins A and L]
VE example continued
  Eliminate variables in order: $v, s, x, t, l, a, b$
  Combine factors for this iteration: $f_l(a, b) = \sum_l f_s(b, l)\,f_t(a, l)$
  [Note: joins A and B]
VE example continued
  Eliminate variables in order: $v, s, x, t, l, a, b$
  Combine factors for this iteration: $f_a(b, d) = \sum_a f_l(a, b)\,f_x(a)\,\Pr(d \mid a, b)$
VE example continued
  Eliminate variables in order: $v, s, x, t, l, a, b$
  Combine factors for this iteration: $f_b(d) = \sum_b f_a(b, d)$
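The elimination steps of this example can be sketched in code. The following is a minimal sketch, not the lecturer's implementation: it assumes binary variables, uses randomly generated (hypothetical) conditional probability tables because the slides give no numbers, and relies on `numpy.einsum` to multiply factors and sum out a variable. The helper names `random_cpt` and `eliminate` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Parents in the Asia network, read off the initial factors; all variables binary.
parents = {'v': '', 's': '', 't': 'v', 'l': 's', 'b': 's',
           'a': 'tl', 'x': 'a', 'd': 'ab'}

def random_cpt(child, pars):
    """Hypothetical CPT: random table normalised over the child axis (axis 0)."""
    table = rng.random((2,) * (1 + len(pars)))
    return child + pars, table / table.sum(axis=0, keepdims=True)

def eliminate(factors, var):
    """Multiply every factor that mentions `var`, then sum `var` out."""
    used = [f for f in factors if var in f[0]]
    rest = [f for f in factors if var not in f[0]]
    scope = ''.join(sorted({x for names, _ in used for x in names} - {var}))
    spec = ','.join(names for names, _ in used) + '->' + scope
    table = np.einsum(spec, *[t for _, t in used])
    return rest + [(scope, table)], scope

factors = [random_cpt(c, ps) for c, ps in parents.items()]
scopes = []                      # scope of each intermediate factor
current = factors
for var in 'vsxtlab':            # elimination order from the example
    current, scope = eliminate(current, var)
    scopes.append(scope)

pr_d = current[0][1]             # the single remaining factor, over d
# Brute force: multiply all initial factors and sum out everything except d.
brute = np.einsum(','.join(n for n, _ in factors) + '->d',
                  *[t for _, t in factors])
print(scopes)
print(pr_d, brute)
```

Under these assumptions the intermediate scopes mirror the factors of the example (functions of t; b,l; a; a,l; a,b; b,d; d), and the result agrees with the brute-force sum.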
Intermediate factors
  In our previous example: [intermediate factors f]
  With a different ordering: [intermediate factors g]
  Complexity is exponential in the size of these factors!
Notes about VE
  Actual computation is done in the elimination steps.
  Computation depends on the order of elimination.
  For each query we need to compute everything again!
  Many redundant calculations.
Junction Trees
  Redundant calculations can be avoided by generalising to the junction tree (JT) algorithm, introduced by Lauritzen & Spiegelhalter (1988).
  The JT algorithm compiles a class of elimination orders into a data structure that supports the computation of all possible queries.
Building a Junction Tree
  DAG -> Moral Graph -> Triangulated Graph -> Identifying Cliques -> Junction Tree
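How strongly the ordering affects factor size can be checked without any numbers, by tracking only the scopes of the factors. The sketch below assumes the Asia factor scopes from the example; the second ordering is an arbitrarily chosen alternative, not necessarily the one shown on the slide, and `max_factor_scope` is a hypothetical helper name.

```python
def max_factor_scope(scopes, order):
    """Largest number of variables in any intermediate factor for this order."""
    scopes = [set(s) for s in scopes]
    widest = 0
    for var in order:
        used = [s for s in scopes if var in s]
        merged = set().union(*used) - {var}      # scope after summing var out
        widest = max(widest, len(merged))
        scopes = [s for s in scopes if var not in s] + [merged]
    return widest

# Scopes of the initial factors Pr(v), Pr(s), Pr(t|v), Pr(l|s), Pr(b|s),
# Pr(a|t,l), Pr(x|a), Pr(d|a,b).
asia = ['v', 's', 'tv', 'ls', 'bs', 'atl', 'xa', 'dab']

good = max_factor_scope(asia, 'vsxtlab')   # order used in the example
bad = max_factor_scope(asia, 'atlbsxv')    # assumed alternative: a first
print(good, bad)
```

With the example's order no intermediate factor has more than two variables; eliminating a first creates a factor over five variables, so its table is eight times larger for binary variables.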
Step 1: Moralization (G -> $G^M$)
  1. For all nodes Z: for all X, Y in par(Z), add an edge X - Y.
  2. Undirect all edges.
  [Figure: the DAG G and its moral graph $G^M$]
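The two moralization steps translate almost directly into code. This is a small sketch that assumes the DAG is given as a mapping from each node to its list of parents; it uses the Asia network structure from the variable elimination example rather than the graph in the figure, and `moralize` is an illustrative name.

```python
from itertools import combinations

def moralize(parents):
    """Return the moral graph as a dict: node -> set of neighbours."""
    adj = {v: set() for v in parents}
    for child, pars in parents.items():
        for p in pars:                        # step 2: drop edge directions
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(pars, 2):    # step 1: "marry" the parents
            adj[p].add(q)
            adj[q].add(p)
    return adj

asia_parents = {'v': [], 's': [], 't': ['v'], 'l': ['s'], 'b': ['s'],
                'a': ['t', 'l'], 'x': ['a'], 'd': ['a', 'b']}
moral = moralize(asia_parents)
n_edges = sum(len(ns) for ns in moral.values()) // 2
print(n_edges, sorted(moral['a']))
```

For this structure, moralization adds the edges t - l (parents of a) and a - b (parents of d) to the eight original edges.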
Step 2: Triangulation ($G^M$ -> $G^T$)
  Add edges to $G^M$ such that there is no cycle of length $\ge 4$ that does not contain a chord.
  [Figure: two example cycles, labelled NO and YES]
Step 3: Identifying Cliques
  All maximal cliques (complete subgraphs) of $G^T$.
  [Figure: the triangulated graph $G^T$ and its cliques]
Step 4-I: Junction Graph
  A junction graph for an undirected graph G is an undirected, labeled graph.
  The nodes are the cliques in G.
  If two cliques intersect, they are joined in the junction graph by an edge labeled with their intersection.
  [Figure: cliques from $G^T$ and the (incomplete) junction graph $G^J$, with separators on the edges]
Step 4-II: Junction Tree
  A junction tree is a sub-graph of the junction graph that
    is a tree,
    contains all the cliques (spanning tree),
    satisfies the running intersection property: for each pair of nodes X, Y, all nodes on the path between X and Y contain $X \cap Y$.
  [Figure: the (incomplete) junction graph $G^J$ and a junction tree $G^{JT}$]
Running intersection?
  All cliques Z and separators S along the path between any two nodes X and Y contain the intersection $X \cap Y$.
  [Figure: worked example with two cliques X and Y and the cliques and separators on the path between them]
Using a Junction Tree for inference
  DAG -> Junction Tree -> (Initialization) -> Inconsistent Junction Tree -> (Propagation: message passing) -> Consistent Junction Tree -> (Marginalization: summing out) -> $\Pr(V = v \mid E = e)$
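The running intersection property can be tested mechanically. The sketch below uses an equivalent formulation: for every variable, the cliques that contain it must form a connected part of the tree. It assumes the given edges already form a spanning tree over the cliques; the cliques come from a triangulated Asia moral graph (an assumption, the slides use a different graph), and `has_running_intersection` is a hypothetical name.

```python
def has_running_intersection(cliques, edges):
    """cliques: list of frozensets; edges: (i, j) index pairs forming a tree."""
    neighbours = {i: set() for i in range(len(cliques))}
    for i, j in edges:
        neighbours[i].add(j)
        neighbours[j].add(i)
    for var in set().union(*cliques):
        holders = {i for i, c in enumerate(cliques) if var in c}
        start = min(holders)
        seen, stack = {start}, [start]
        while stack:                       # search restricted to the holders
            i = stack.pop()
            for j in (neighbours[i] & holders) - seen:
                seen.add(j)
                stack.append(j)
        if seen != holders:                # some holder is cut off: property fails
            return False
    return True

cliques = [frozenset(c) for c in ('abd', 'tv', 'alt', 'ax', 'abl', 'bls')]
good_tree = [(1, 2), (2, 4), (4, 0), (4, 5), (3, 4)]
bad_tree = [(1, 2), (2, 4), (4, 0), (4, 5), (3, 1)]   # {a,x} hangs off {t,v}
print(has_running_intersection(cliques, good_tree),
      has_running_intersection(cliques, bad_tree))
```

In the second tree the clique {a, x} is attached to {t, v}, which does not contain a, so the path from {a, x} to any other clique with a breaks the property.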
Step 1: Initialization
  For each conditional distribution from the BN, create a node potential.
  Assign each node potential to a single clique that contains the variable and all of its parents.
  The clique potential for a clique is the product of its assigned node potentials.
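The assignment step can be sketched as follows. The code assumes the Asia network structure and the cliques of one possible triangulation of its moral graph (both are illustrative assumptions), and it simply picks the first clique that covers a variable together with its parents; `assign_families` is a made-up helper name.

```python
def assign_families(parents, cliques):
    """Map each variable to one clique that contains it and all its parents."""
    assignment = {}
    for child, pars in parents.items():
        family = set(child + pars)
        # A covering clique exists when the cliques come from a triangulated
        # moral graph; next() raises StopIteration otherwise.
        assignment[child] = next(c for c in cliques if family <= c)
    return assignment

asia_parents = {'v': '', 's': '', 't': 'v', 'l': 's', 'b': 's',
                'a': 'tl', 'x': 'a', 'd': 'ab'}
cliques = [frozenset(c) for c in ('abd', 'tv', 'alt', 'ax', 'abl', 'bls')]
assignment = assign_families(asia_parents, cliques)
for var, clique in assignment.items():
    print(var, ''.join(sorted(clique)))
```

A clique that receives no node potential (here {a, b, l}) starts with a potential of all ones.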
Marginalisation and Inconsistency
  Potentials in the junction tree can be inconsistent, i.e. computing a marginal from different cliques can give different results:
    from one clique:     $\sum \phi$ = 0.12 0.33 0.11 0.03
    from another clique: $\sum \phi$ = 0.02 0.43 0.31 0.12
Propagating potentials: idea
  Message passing from one clique to a neighbouring clique:
  1. Projection: project the potential of the sending clique into the separator.
  2. Absorption: absorb the potential of the separator into the receiving clique.
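Projection and absorption can be illustrated on the smallest possible junction tree. The sketch below assumes a hypothetical chain network A -> B -> C with made-up probabilities, cliques {A, B} and {B, C}, and separator {B}. It uses the usual update in which the receiving potential is multiplied by the new separator potential and divided by the old one, and it assumes strictly positive separator potentials so the division is safe.

```python
import numpy as np

# Hypothetical CPTs (illustrative numbers only).
p_a = np.array([0.6, 0.4])                          # Pr(a)
p_b_given_a = np.array([[0.7, 0.3], [0.2, 0.8]])    # rows: a, columns: b
p_c_given_b = np.array([[0.9, 0.1], [0.5, 0.5]])    # rows: b, columns: c

phi_ab = p_a[:, None] * p_b_given_a   # clique {A,B}: Pr(a) * Pr(b|a)
phi_bc = p_c_given_b.copy()           # clique {B,C}: Pr(c|b)
phi_b = np.ones(2)                    # separator {B} starts at 1

def pass_message(phi_from, sum_axis, phi_sep_old, phi_to, sep_axis):
    """Project phi_from onto the separator, then absorb into phi_to."""
    phi_sep_new = phi_from.sum(axis=sum_axis)            # projection
    shape = [1] * phi_to.ndim
    shape[sep_axis] = -1                                 # align with B's axis
    phi_to_new = phi_to * (phi_sep_new / phi_sep_old).reshape(shape)
    return phi_sep_new, phi_to_new

inconsistent_before = not np.allclose(phi_ab.sum(axis=0), phi_bc.sum(axis=1))
phi_b, phi_bc = pass_message(phi_ab, 0, phi_b, phi_bc, 0)   # {A,B} -> {B,C}
phi_b, phi_ab = pass_message(phi_bc, 1, phi_b, phi_ab, 1)   # {B,C} -> {A,B}
print(phi_ab.sum(axis=0), phi_bc.sum(axis=1))
```

Before the two messages the cliques disagree on the marginal of B; afterwards both give the same marginal, and each clique potential equals the joint distribution over its variables.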
Global propagation: idea
  1. Choose a root.
  2. COLLECT-EVIDENCE (messages 1-5): leaves to root. NB: corresponds with a perfect elimination order!
  3. DISTRIBUTE-EVIDENCE (messages 6-10): root to leaves.
  After global propagation, potentials are consistent and marginalisation gives correct results.
  [Figure: junction tree with root and numbered messages 1-10]
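The order of the messages follows from a depth-first traversal of the tree from the chosen root. This is a sketch under the assumption that the junction tree is given as an adjacency mapping between clique names; the tree below is a hypothetical junction tree for the Asia network, not the one in the figure, and `message_schedule` is an illustrative name.

```python
def message_schedule(neighbours, root):
    """Return (collect, distribute): ordered lists of (sender, receiver)."""
    collect, distribute = [], []

    def visit(node, parent):
        for child in sorted(neighbours[node]):
            if child != parent:
                distribute.append((node, child))   # parent sends before child does
                visit(child, node)
                collect.append((child, node))      # child sends after its subtree

    visit(root, None)
    return collect, distribute

tree = {'abl': {'abd', 'alt', 'ax', 'bls'},
        'abd': {'abl'}, 'ax': {'abl'}, 'bls': {'abl'},
        'alt': {'abl', 'tv'}, 'tv': {'alt'}}
collect, distribute = message_schedule(tree, 'abl')
print(collect)
print(distribute)
```

A tree with six cliques has five edges, so this yields five collect messages followed by five distribute messages, ten in total.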
Message passing
  Message passing in the junction tree resembles Pearl's λ-π message passing algorithm for singly connected graphs.
  Do you want to know how and why that works? Ask those doing the Probabilistic Reasoning course!
Back to complexity
  Computing probabilities from a BN with graph G with n nodes and tree-width w requires $O(n \cdot \exp(w))$ time.
  tree-width of G = minimum width over all possible junction trees of G
  width of a junction tree = size of the largest clique minus 1
  Inference and MPE can be solved in polynomial time on networks of bounded tree-width!
  Only MAP remains NP-complete, even on graphs with $w \le 2$.
Summary & More
  We've seen that:
    Bayesian networks efficiently represent a joint probability distribution.
    The junction tree propagation algorithm elegantly combines elimination orders from VE and message passing à la Pearl.
  We haven't discussed how to:
    triangulate a graph,
    construct a Junction Tree from a junction graph,
    exactly compute probabilities from it.
  Curious? A bit more can be found in the bonus slides.
  Finally:
    Junction tree algorithms are also useful for other purposes!
    There's so much more to BNs!
Bonus slides
Triangulation (G -> $G^M$ -> $G^T$)
  Each elimination ordering triangulates the graph, not necessarily in the same way.
  [Figure: the same graph triangulated by different elimination orderings]
Triangulation with Min-Fill
  Intuitively, triangulations with as few fill-ins as possible are preferred.
  This leaves us with small cliques (small potentials).
  A common heuristic (Min-fill): repeat until no nodes remain:
    1. Find the node whose elimination would require the least number of fill-ins (may be zero).
    2. Eliminate that node, and note the need for a fill-in edge between any two non-adjacent neighbors.
  Add the fill-in edges to the original graph.
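The heuristic can be written down compactly. The following is a minimal sketch, assuming the moral graph is given as a dict of neighbour sets and that ties are broken alphabetically (the slides do not specify tie-breaking). It also records the clique induced by each elimination, keeping only the maximal ones; the input is the moral graph of the Asia network, and `min_fill_triangulate` is a made-up name.

```python
from itertools import combinations

def min_fill_triangulate(adj):
    """Return (fill_ins, maximal_cliques, elimination_order)."""
    work = {v: set(ns) for v, ns in adj.items()}     # working copy
    fill_ins, cliques, order = [], [], []

    def missing(v):
        """Pairs of neighbours of v that are not yet adjacent."""
        return [(a, b) for a, b in combinations(sorted(work[v]), 2)
                if b not in work[a]]

    while work:
        v = min(sorted(work), key=lambda u: len(missing(u)))
        new_edges = missing(v)
        for a, b in new_edges:                       # add the fill-in edges
            work[a].add(b)
            work[b].add(a)
        fill_ins.extend(new_edges)
        clique = frozenset(work[v] | {v})            # clique induced by v
        if not any(clique <= c for c in cliques):
            cliques.append(clique)
        for n in work[v]:                            # eliminate v
            work[n].discard(v)
        del work[v]
        order.append(v)
    return fill_ins, cliques, order

moral = {'a': {'t', 'l', 'x', 'd', 'b'}, 'b': {'s', 'd', 'a'}, 'd': {'a', 'b'},
         'l': {'s', 'a', 't'}, 's': {'l', 'b'}, 't': {'v', 'a', 'l'},
         'v': {'t'}, 'x': {'a'}}
fill_ins, cliques, order = min_fill_triangulate(moral)
print(fill_ins, order)
print([''.join(sorted(c)) for c in cliques])
```

Under these assumptions a single fill-in edge (b, l) is needed, which breaks the chordless cycle s - l - a - b.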
Triangulation example
  Eliminate the vertex that requires the least number of edges to be added.
  [Table: for each of the eight elimination steps, the removed vertex, the induced clique and the added edges. Steps 1-3 remove H, G and F without adding edges.]
A few useful theorems
  An undirected graph is triangulated if and only if its junction graph has a junction tree.
  A sub-tree of the junction graph of a triangulated graph is a junction tree if and only if it is a spanning tree of maximal weight (MST).
Finding a Maximal-Weight Spanning Tree
  Kruskal's algorithm: choose successively a link of maximal weight, unless it creates a cycle.
  [Figure: the (incomplete) junction graph $G^J$ and the resulting junction tree $G^{JT}$]
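Kruskal's algorithm on a junction graph can be sketched as below. The code assumes that the weight of a link is the number of variables in the separator, builds the junction graph implicitly from all pairs of intersecting cliques, and uses a small union-find structure to detect cycles. The cliques are those of a triangulated Asia moral graph (an assumption, not the graph in the figure), and `junction_tree_edges` is an illustrative name.

```python
from itertools import combinations

def junction_tree_edges(cliques):
    """Maximal-weight spanning tree: list of (i, j, separator)."""
    links = [(len(cliques[i] & cliques[j]), i, j)
             for i, j in combinations(range(len(cliques)), 2)
             if cliques[i] & cliques[j]]
    links.sort(key=lambda link: -link[0])      # heaviest separators first

    root = list(range(len(cliques)))           # union-find forest

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]            # path halving
            i = root[i]
        return i

    tree = []
    for _, i, j in links:
        ri, rj = find(i), find(j)
        if ri != rj:                           # skip links that close a cycle
            root[ri] = rj
            tree.append((i, j, cliques[i] & cliques[j]))
    return tree

cliques = [frozenset(c) for c in ('abd', 'tv', 'alt', 'ax', 'abl', 'bls')]
tree = junction_tree_edges(cliques)
for i, j, sep in tree:
    print(''.join(sorted(cliques[i])), '-', ''.join(sorted(cliques[j])),
          'separator', ''.join(sorted(sep)))
```

For these six cliques the tree has five links with a total separator weight of 8: three two-variable separators around {a, b, l} and two single-variable ones.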
Small propagation example
  Example BN: [figure]
  Phase 1: create a Junction Tree: [figure]
Small propagation example
  Phase 2, step 1: initialization
  [Table: for each variable, its associated cluster; followed by the resulting clique potentials φ]
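Small propagation example
  Phase 2: Collect evidence
  Choose an arbitrary clique where all potential functions will be collected.
  Recursively call neighbouring cliques for messages:
  1. Call the first neighbouring clique:
     1. Projection onto the separator: sum the clique potential down to the separator variables.
     2. Absorption into the collecting clique: multiply its potential by the new separator potential divided by the old separator potential. There is no old value in the first pass, so the old separator potential is 1.
Small propagation example
  Phase 2: Collect evidence (cntd)
  2. Call the second neighbouring clique:
     1. Projection onto the separator.
     2. Absorption into the collecting clique, starting from the result of the absorption in the first call.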
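Small propagation example
  Phase 2: Distribute evidence
  Pass messages recursively to neighboring nodes.
  Pass a message from the collecting clique to its first neighbour:
     1. Projection onto the separator.
     2. Absorption into the neighbour: multiply its potential by the new separator potential divided by the old separator potential (from phase 1).
Small propagation example
  Phase 2: Distribute evidence (cntd)
  Pass a message from the collecting clique to its second neighbour:
     1. Projection onto the separator.
     2. Absorption into the neighbour: multiply its potential by the new separator potential divided by the old separator potential (from phase 1).
  Now the junction tree is consistent and marginalisation in any clique is okay.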