Graphical Models & HMMs


1 Graphical Models & HMMs
Henrik I. Christensen
Robotics & Intelligent Machines @ GT, Georgia Institute of Technology, Atlanta, GA
hic@cc.gatech.edu

2 Outline
1. Introduction
2. Bayesian Networks
3. Conditional Independence
4. Inference
5. Factor Graphs
6. Sum-Product Algorithm
7. HMM Introduction
8. Markov Model
9. Hidden Markov Model
10. ML solution for the HMM
11. Forward-Backward
12. Viterbi
13. Example
14. Summary

3 Introduction
Basically we can describe Bayesian inference through repeated use of the sum and product rules. Using a graphical / diagrammatical representation is often useful:
1. A way to visualize structure and consider relations
2. Provides insights into a model and possible independence
3. Allows us to leverage the many graphical algorithms available
We will consider both directed and undirected models. This is a very rich area with numerous good references.


5 Bayesian Networks
Consider the joint probability p(a, b, c). Using the product rule we can rewrite it as
p(a, b, c) = p(c | a, b) p(a, b)
which again can be changed to
p(a, b, c) = p(c | a, b) p(b | a) p(a)
[Graph: nodes a, b, c with arcs a -> b, a -> c, b -> c]

6 Bayesian Networks
- Nodes represent variables
- Arcs/links represent conditional dependence
- We can use the decomposition for any joint distribution; the direct / brute-force application generates fully connected graphs
- We can represent much more general relations

7 Example with sparse connections
We can represent relations such as
p(x_1) p(x_2) p(x_3) p(x_4 | x_1, x_2, x_3) p(x_5 | x_1, x_3) p(x_6 | x_4) p(x_7 | x_4, x_5)
[Graph over x_1, ..., x_7 with arcs matching the factors above]

8 The general case
We can think of this as coding the factors p(x_k | pa_k), where pa_k is the set of parents of the variable x_k. The joint distribution is then
p(x) = ∏_k p(x_k | pa_k)
We will refer to this as factorization. There can be no directed cycles in the graph; the general form is termed a directed acyclic graph (DAG).
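To make the factorization concrete, here is a minimal Python sketch (ours, not from the slides) that evaluates the three-variable joint p(a, b, c) = p(c | a, b) p(b | a) p(a) from explicit conditional probability tables; all numeric CPT values are invented for illustration.

```python
import itertools

# Hypothetical CPTs for binary a, b, c; the numbers are for illustration only.
p_a = {0: 0.6, 1: 0.4}                          # p(a)
p_b_a = {(0, 0): 0.7, (0, 1): 0.3,              # p(b | a), keyed (a, b)
         (1, 0): 0.2, (1, 1): 0.8}
p_c_ab = {k: 0.5 for k in itertools.product((0, 1), repeat=3)}  # p(c | a, b), keyed (a, b, c)

def joint(a, b, c):
    """p(a, b, c) = p(c | a, b) p(b | a) p(a), i.e. prod_k p(x_k | pa_k)."""
    return p_a[a] * p_b_a[(a, b)] * p_c_ab[(a, b, c)]

# Sanity check: the factorization defines a normalized distribution.
total = sum(joint(*x) for x in itertools.product((0, 1), repeat=3))
assert abs(total - 1.0) < 1e-12
```

The same pattern extends to any DAG, e.g. the seven-variable example above: visit each variable, look up its parents, and multiply the corresponding CPT entries.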

9 Basic example
We have seen polynomial regression before:
p(t, w) = p(w) ∏_{n=1}^N p(t_n | w)
[Graph: w with arcs to t_1, ..., t_N]

10 Bayesian Regression
We can make the parameters and variables explicit:
p(t, w | x, α, σ²) = p(w | α) ∏_{n=1}^N p(t_n | w, x_n, σ²)
[Plate-notation graph: x_n and σ² feeding t_n inside a plate over n = 1..N, with α -> w -> t_n]

11 Bayesian Regression - Learning
When entering data we can condition inference on it:
p(w | t) ∝ p(w) ∏_{n=1}^N p(t_n | w)
[Same plate-notation graph as before, with the t_n nodes marked as observed]

12 Generative Models - Example
Image synthesis: [Graph: Object, Position, Orientation -> Image]

13 Discrete Variables - 1
A general joint distribution has K² - 1 parameters (for K possible outcomes per variable):
p(x_1, x_2 | µ) = ∏_{i=1}^K ∏_{j=1}^K µ_{ij}^{x_{1i} x_{2j}}
Independent joint distributions have 2(K - 1) parameters:
p(x_1, x_2 | µ) = ∏_{i=1}^K µ_{1i}^{x_{1i}} ∏_{j=1}^K µ_{2j}^{x_{2j}}
[Figures: the two-node graph x_1 -> x_2, and the unconnected pair x_1, x_2]

14 Discrete Variables - 2
A general joint distribution over M variables will have K^M - 1 parameters. A Markov chain with M nodes will have K - 1 + (M - 1)K(K - 1) parameters (e.g. for K = 2 and M = 10: 1023 vs. 19).
[Graph: chain x_1 -> x_2 -> ... -> x_M]

15 Discrete Variables - Bayesian Parameters
[Graph: µ_m -> x_m for each m, over the chain x_1 -> x_2 -> ... -> x_M]
The parameters can be modelled explicitly:
p({x_m, µ_m}) = p(x_1 | µ_1) p(µ_1) ∏_{m=2}^M p(x_m | x_{m-1}, µ_m) p(µ_m)
It is assumed that p(µ_m) is a Dirichlet.

16 Discrete Variables - Bayesian Parameters (2)
[Graph: µ_1 -> x_1, and a single shared µ -> x_2, ..., x_M]
For shared parameters the situation is simpler:
p({x_m}, µ_1, µ) = p(x_1 | µ_1) p(µ_1) p(µ) ∏_{m=2}^M p(x_m | x_{m-1}, µ)

17 Extension to Linear Gaussian Models
The model can be extended to have each node be a Gaussian variable that is a linear function of its parents:
p(x_i | pa_i) = N(x_i | Σ_{j ∈ pa_i} w_ij x_j + b_i, v_i)
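As an operational sketch of such a model, the snippet below draws joint samples by ancestral sampling from a hypothetical three-node linear-Gaussian chain; the weights w_ij, biases b_i and variances v_i are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear-Gaussian chain x_1 -> x_2 -> x_3 (all parameters made up).
parents = {1: [], 2: [1], 3: [2]}
w = {2: {1: 0.5}, 3: {2: -1.0}}   # w[i][j]: weight on parent x_j of node x_i
b = {1: 0.0, 2: 1.0, 3: 0.5}      # biases b_i
v = {1: 1.0, 2: 0.25, 3: 0.1}     # conditional variances v_i

def ancestral_sample():
    """Draw x_i ~ N(sum_{j in pa_i} w_ij x_j + b_i, v_i) in topological order."""
    x = {}
    for i in (1, 2, 3):           # a topological ordering of the DAG
        mean = b[i] + sum(w.get(i, {}).get(j, 0.0) * x[j] for j in parents[i])
        x[i] = rng.normal(mean, np.sqrt(v[i]))
    return x

print(ancestral_sample())
```

Sampling parents before children is what makes each conditional N(., v_i) well defined at the moment it is used.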


19 Conditional Independence
Consideration of independence is important as part of the analysis and setup of a system. As an example, a is independent of b given c:
p(a | b, c) = p(a | c)
or equivalently
p(a, b | c) = p(a | b, c) p(b | c) = p(a | c) p(b | c)
Frequent notation in statistics: a ⊥⊥ b | c

20 Conditional Independence - Case 1
[Graph: tail-to-tail node c with arcs c -> a and c -> b]
p(a, b, c) = p(a | c) p(b | c) p(c)
p(a, b) = Σ_c p(a | c) p(b | c) p(c)
so in general a and b are not independent.

21 Conditional Independence - Case 1
[Same graph, with c observed]
p(a, b | c) = p(a, b, c) / p(c) = p(a | c) p(b | c)
i.e. a ⊥⊥ b | c

22 Conditional Independence - Case 2
[Graph: head-to-tail chain a -> c -> b]
p(a, b, c) = p(a) p(c | a) p(b | c)
p(a, b) = p(a) Σ_c p(c | a) p(b | c) = p(a) p(b | a)
so in general a and b are not independent.

23 Conditional Independence - Case 2
[Same chain, with c observed]
p(a, b | c) = p(a, b, c) / p(c) = p(a) p(c | a) p(b | c) / p(c) = p(a | c) p(b | c)
i.e. a ⊥⊥ b | c

24 Conditional Independence - Case 3
[Graph: head-to-head node c with arcs a -> c and b -> c]
p(a, b, c) = p(a) p(b) p(c | a, b)
p(a, b) = p(a) p(b)
i.e. a ⊥⊥ b. This is the opposite of Case 1 - when c is unobserved.

25 Conditional Independence - Case 3
[Same graph, with c observed]
p(a, b | c) = p(a, b, c) / p(c) = p(a) p(b) p(c | a, b) / p(c)
so in general a and b are not independent given c. This is the opposite of Case 1 - when c is observed.

26 Diagnostics - Out of fuel?
[Graph: B -> G <- F]
B = Battery, F = Fuel Tank, G = Fuel Gauge
p(G = 1 | B = 1, F = 1) = 0.8
p(G = 1 | B = 1, F = 0) = 0.2
p(G = 1 | B = 0, F = 1) = 0.2
p(G = 1 | B = 0, F = 0) = 0.1
p(B = 1) = 0.9
p(F = 1) = 0.9, p(F = 0) = 0.1

27 Diagnostics - Out of fuel?
p(F = 0 | G = 0) = p(G = 0 | F = 0) p(F = 0) / p(G = 0)
Observing G = 0 increases the probability of an empty tank.

28 Diagnostics - Out of fuel?
p(F = 0 | G = 0, B = 0) = p(G = 0 | B = 0, F = 0) p(F = 0) / Σ_F p(G = 0 | B = 0, F) p(F)
Observing B = 0 makes an empty tank less likely.
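The explaining-away effect is easy to verify numerically. The short script below (ours, not from the slides) plugs in the CPT values from the previous slides; it reports p(F = 0 | G = 0) ≈ 0.257, up from the prior 0.1, and p(F = 0 | G = 0, B = 0) ≈ 0.111, since the flat battery explains the empty reading.

```python
# Numbers straight from the slides: p(B=1)=0.9, p(F=1)=0.9, and p(G=1 | B, F).
p_B = {1: 0.9, 0: 0.1}
p_F = {1: 0.9, 0: 0.1}
p_G1 = {(1, 1): 0.8, (1, 0): 0.2, (0, 1): 0.2, (0, 0): 0.1}  # p(G=1 | B, F)

def p_G(g, b, f):
    return p_G1[(b, f)] if g == 1 else 1.0 - p_G1[(b, f)]

# p(F=0 | G=0): marginalize the battery state B.
num = sum(p_G(0, b, 0) * p_B[b] for b in (0, 1)) * p_F[0]
den = sum(p_G(0, b, f) * p_B[b] * p_F[f] for b in (0, 1) for f in (0, 1))
print("p(F=0 | G=0)      =", num / den)        # ~0.257, up from the prior 0.1

# p(F=0 | G=0, B=0): the flat battery "explains away" the empty reading.
num2 = p_G(0, 0, 0) * p_F[0]
den2 = sum(p_G(0, 0, f) * p_F[f] for f in (0, 1))
print("p(F=0 | G=0, B=0) =", num2 / den2)      # ~0.111
```

B and F are independent a priori but become dependent once their common child G is observed, which is exactly the Case 3 behaviour above.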

29 D-separation
Consider non-intersecting subsets A, B, and C in a directed graph. A path between subsets A and B is considered blocked if it contains a node such that:
1. the arcs are head-to-tail or tail-to-tail and the node is in the set C, or
2. the arcs meet head-to-head and neither the node nor any of its descendants are in the set C.
If all paths between A and B are blocked, then A is d-separated from B by C. If there is d-separation, then the joint distribution over all the variables in the graph satisfies A ⊥⊥ B | C.

30 D-Separation - Discussion
[Two example graphs over nodes a, b, c, e, f illustrating blocked and unblocked paths]


32 Inference in graphical models
Consider inference of p(x, y); we can formulate this as
p(x, y) = p(x | y) p(y) = p(y | x) p(x)
We can further marginalize:
p(y) = Σ_{x'} p(y | x') p(x')
Using Bayes' rule we can reverse the inference:
p(x | y) = p(y | x) p(x) / p(y)
These are helpful mechanisms for inference.

33 Inference in graphical models
[Figures (a)-(c): two-node graphs over x and y illustrating the joint and its two conditional decompositions]

34 Inference on a chain
[Chain: x_1 - x_2 - ... - x_{N-1} - x_N]
p(x) = (1/Z) ψ_{1,2}(x_1, x_2) ψ_{2,3}(x_2, x_3) ⋯ ψ_{N-1,N}(x_{N-1}, x_N)
p(x_n) = Σ_{x_1} ⋯ Σ_{x_{n-1}} Σ_{x_{n+1}} ⋯ Σ_{x_N} p(x)

35 Inference on a chain
[Chain figure: forward messages µ_α and backward messages µ_β passed along x_1, ..., x_N]
p(x_n) = (1/Z) [ Σ_{x_{n-1}} ψ_{n-1,n}(x_{n-1}, x_n) ⋯ [ Σ_{x_1} ψ_{1,2}(x_1, x_2) ] ] · [ Σ_{x_{n+1}} ψ_{n,n+1}(x_n, x_{n+1}) ⋯ [ Σ_{x_N} ψ_{N-1,N}(x_{N-1}, x_N) ] ]
= (1/Z) µ_α(x_n) µ_β(x_n)

36 Inference on a chain
µ_α(x_n) = Σ_{x_{n-1}} ψ_{n-1,n}(x_{n-1}, x_n) [ ⋯ Σ_{x_1} ψ_{1,2}(x_1, x_2) ] = Σ_{x_{n-1}} ψ_{n-1,n}(x_{n-1}, x_n) µ_α(x_{n-1})
µ_β(x_n) = Σ_{x_{n+1}} ψ_{n,n+1}(x_n, x_{n+1}) [ ⋯ Σ_{x_N} ψ_{N-1,N}(x_{N-1}, x_N) ] = Σ_{x_{n+1}} ψ_{n,n+1}(x_n, x_{n+1}) µ_β(x_{n+1})

37 Inference on a chain
The recursions start with
µ_α(x_2) = Σ_{x_1} ψ_{1,2}(x_1, x_2)
µ_β(x_{N-1}) = Σ_{x_N} ψ_{N-1,N}(x_{N-1}, x_N)
and the partition function is
Z = Σ_{x_n} µ_α(x_n) µ_β(x_n)

38 Inference on a chain
To compute local marginals:
- Compute and store the forward messages µ_α(x_n)
- Compute and store the backward messages µ_β(x_n)
- Compute Z (this can be done at any node)
- Compute, for all variables, p(x_n) = (1/Z) µ_α(x_n) µ_β(x_n)
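A compact numpy sketch of this procedure on a short chain with random potentials (initializing µ_α(x_1) = µ_β(x_N) = 1, which reproduces the messages above), including a brute-force cross-check; the chain length, state count and potentials are invented for the example.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
N, K = 5, 3                       # chain length, states per node
psi = rng.random((N - 1, K, K))   # hypothetical pairwise potentials psi_{n,n+1}

alpha = np.zeros((N, K)); beta = np.zeros((N, K))
alpha[0] = 1.0                    # leaf initialization: mu_alpha(x_1) = 1
beta[N - 1] = 1.0                 # mu_beta(x_N) = 1
for n in range(1, N):             # forward pass: sum out x_{n-1}
    alpha[n] = psi[n - 1].T @ alpha[n - 1]
for n in range(N - 2, -1, -1):    # backward pass: sum out x_{n+1}
    beta[n] = psi[n] @ beta[n + 1]

Z = float(alpha[0] @ beta[0])     # partition function (same at every node)
marginals = alpha * beta / Z      # p(x_n) = mu_alpha(x_n) mu_beta(x_n) / Z

# Cross-check the marginal of x_2 by brute-force enumeration.
brute = np.zeros(K)
for x in itertools.product(range(K), repeat=N):
    w = 1.0
    for n in range(N - 1):
        w *= psi[n][x[n], x[n + 1]]
    brute[x[1]] += w
assert np.allclose(marginals[1], brute / brute.sum())
```

The message passes cost O(N K²), versus O(K^N) for the brute-force sum.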

39 Inference in trees
[Figures: an undirected tree, a directed tree, and a directed polytree]


41 Factor Graphs
[Factor graph: variables x_1, x_2, x_3 with factor nodes f_a, f_b, f_c, f_d]
p(x) = f_a(x_1, x_2) f_b(x_1, x_2) f_c(x_2, x_3) f_d(x_3)
In general: p(x) = ∏_s f_s(x_s)

42 Factor Graphs from Directed Graphs
The directed graph p(x) = p(x_1) p(x_2) p(x_3 | x_1, x_2) can be expressed with a single factor
f(x_1, x_2, x_3) = p(x_1) p(x_2) p(x_3 | x_1, x_2)
or with one factor per term:
f_a(x_1) = p(x_1), f_b(x_2) = p(x_2), f_c(x_1, x_2, x_3) = p(x_3 | x_1, x_2)
[Figures: the directed graph and its two factor-graph versions]

43 Factor Graphs from Undirected Graphs
A clique potential ψ(x_1, x_2, x_3) can be expressed with a single factor
f(x_1, x_2, x_3) = ψ(x_1, x_2, x_3)
or split into several, e.g. f_a(x_1, x_2, x_3) f_b(x_2, x_3) = ψ(x_1, x_2, x_3)
[Figures: the undirected graph and its two factor-graph versions]


45 The Sum-Product Algorithm
Objective: an exact, efficient algorithm for computing marginals, which allows multiple marginals to be computed efficiently.
Key idea: the distributive law, ab + ac = a(b + c)

46 The Sum-Product Algorithm
[Figure: variable x with a neighbouring factor f_s, its subtree F_s(x, X_s), and the message µ_{f_s→x}(x)]
p(x) = Σ_{x \ x} p(x) (the sum runs over all variables except x)
p(x) = ∏_{s ∈ Ne(x)} F_s(x, X_s)

47 The Sum-Product Algorithm
p(x) = ∏_{s ∈ Ne(x)} [ Σ_{X_s} F_s(x, X_s) ] = ∏_{s ∈ Ne(x)} µ_{f_s→x}(x)
with µ_{f_s→x}(x) = Σ_{X_s} F_s(x, X_s)

48 The Sum-Product Algorithm
[Figure: factor f_s with incoming messages µ_{x_m→f_s}(x_m) from variables x_1, ..., x_M and subtrees G_m(x_m, X_sm)]
F_s(x, X_s) = f_s(x, x_1, ..., x_M) G_1(x_1, X_s1) ⋯ G_M(x_M, X_sM)

49 The Sum-Product Algorithm
µ_{f_s→x}(x) = Σ_{x_1} ⋯ Σ_{x_M} f_s(x, x_1, ..., x_M) ∏_{m ∈ Ne(f_s) \ x} [ Σ_{X_sm} G_m(x_m, X_sm) ]
= Σ_{x_1} ⋯ Σ_{x_M} f_s(x, x_1, ..., x_M) ∏_{m ∈ Ne(f_s) \ x} µ_{x_m→f_s}(x_m)

50 The Sum-Product Algorithm
µ_{x_m→f_s}(x_m) = Σ_{X_sm} G_m(x_m, X_sm) = Σ_{X_sm} ∏_{l ∈ Ne(x_m) \ f_s} F_l(x_m, X_lm)
= ∏_{l ∈ Ne(x_m) \ f_s} µ_{f_l→x_m}(x_m)

51 The Sum-Product Algorithm
Initialization:
- For variable leaf nodes: µ_{x→f}(x) = 1
- For factor leaf nodes: µ_{f→x}(x) = f(x)

52 The Sum-Product Algorithm
To compute local marginals:
- Pick an arbitrary node as root
- Compute and propagate messages from the leaves to the root (store messages)
- Compute and propagate messages from the root to the leaf nodes (store messages)
- Compute products of received messages and normalize as required
- Propagate up the tree and down again to compute all marginals (as needed)
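A small illustrative implementation of these ideas for a discrete tree factor graph, assuming a toy two-factor chain with invented tables; it uses plain recursion instead of the stored two-sweep schedule, so messages are recomputed rather than cached.

```python
import numpy as np
from itertools import product

K = 2  # states per variable

# Toy tree (a chain): x1 -- f12 -- x2 -- f23 -- x3, with invented factor tables.
factors = {
    "f12": (("x1", "x2"), np.array([[0.9, 0.1], [0.2, 0.8]])),
    "f23": (("x2", "x3"), np.array([[0.5, 0.5], [0.3, 0.7]])),
}
var_nbrs = {"x1": ["f12"], "x2": ["f12", "f23"], "x3": ["f23"]}

def msg_v2f(v, f):
    """mu_{x->f}: product of messages from the variable's other factors."""
    m = np.ones(K)
    for g in var_nbrs[v]:
        if g != f:
            m = m * msg_f2v(g, v)
    return m

def msg_f2v(f, v):
    """mu_{f->x}: sum the factor times incoming messages over its other variables."""
    scope, table = factors[f]
    m = np.zeros(K)
    for idx in product(range(K), repeat=len(scope)):
        w = table[idx]
        for u, i in zip(scope, idx):
            if u != v:
                w = w * msg_v2f(u, f)[i]
        m[idx[scope.index(v)]] += w
    return m

def marginal(v):
    """p(x_v) as the normalized product of all incoming factor messages."""
    m = np.ones(K)
    for f in var_nbrs[v]:
        m = m * msg_f2v(f, v)
    return m / m.sum()

# Brute-force cross-check of the x2 marginal.
bf = np.zeros(K)
for x1v, x2v, x3v in product(range(K), repeat=3):
    bf[x2v] += factors["f12"][1][x1v, x2v] * factors["f23"][1][x2v, x3v]
assert np.allclose(marginal("x2"), bf / bf.sum())
```

A practical implementation would follow the root/leaf schedule above and store each message once, giving two passes over the tree in total.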


54 Introduction
There are many examples where we are processing sequential data. Everyday examples include:
- Processing of stock data
- Speech analysis
- Traffic patterns
- Gesture analysis
- Manufacturing flow
We cannot assume i.i.d. data as a basis for analysis. The Hidden Markov Model is frequently used.

55 Apple Stock Quote Example
[Figure: chart of Apple stock quotes over time]


57 Markov model
We have already talked about the Markov model. Estimation of the joint probability is a sequential challenge:
p(x_1, ..., x_N) = ∏_{n=1}^N p(x_n | x_{n-1}, ..., x_1)
An mth-order model depends on the m most recent terms. A 1st-order model is then
p(x_1, ..., x_N) = p(x_1) ∏_{n=2}^N p(x_n | x_{n-1})
i.e.: p(x_n | x_{n-1}, ..., x_1) = p(x_n | x_{n-1})
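As a concrete sketch, the following samples from, and scores sequences under, a hypothetical two-state first-order Markov chain; π and A are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1st-order Markov chain over K = 2 states.
pi = np.array([0.5, 0.5])              # p(x_1)
A = np.array([[0.9, 0.1],              # A[j, k] = p(x_n = k | x_{n-1} = j)
              [0.3, 0.7]])

def sample_chain(N):
    """Ancestral sampling: x_1 ~ pi, then x_n ~ A[x_{n-1}]."""
    x = [rng.choice(2, p=pi)]
    for _ in range(N - 1):
        x.append(rng.choice(2, p=A[x[-1]]))
    return x

def log_prob(x):
    """log p(x_1, ..., x_N) = log pi[x_1] + sum_n log A[x_{n-1}, x_n]."""
    return np.log(pi[x[0]]) + sum(np.log(A[a, b]) for a, b in zip(x, x[1:]))

seq = sample_chain(10)
print(seq, log_prob(seq))
```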

58 Markov chains
[Figures: a first-order Markov chain x_1 -> x_2 -> x_3 -> x_4, and a second-order chain over the same variables]


60 Hidden Markov Model
[Graph: latent chain z_1 -> z_2 -> ... -> z_{n-1} -> z_n -> z_{n+1}, each z_n emitting an observation x_n]

61 Modelling of HMMs
We can model the transition probabilities as a table:
A_jk = p(z_nk = 1 | z_{n-1,j} = 1)
The conditionals are then (with a 1-of-K coding):
p(z_n | z_{n-1}, A) = ∏_{k=1}^K ∏_{j=1}^K A_jk^{z_{n-1,j} z_nk}
The initial per-element probability is expressed by π_k = p(z_1k = 1) with Σ_k π_k = 1:
p(z_1 | π) = ∏_{k=1}^K π_k^{z_1k}
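A minimal sketch of this parameterization: the code below fixes a hypothetical π and a row-stochastic A for K = 3, adds made-up Gaussian emissions (anticipating the following slides), and draws a state/observation sequence by ancestral sampling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-state HMM with Gaussian emissions (all parameters made up).
K = 3
pi = np.array([1.0, 0.0, 0.0])            # pi_k = p(z_{1k} = 1)
A = np.array([[0.8, 0.2, 0.0],            # A[j, k] = p(z_{nk}=1 | z_{n-1,j}=1)
              [0.0, 0.8, 0.2],
              [0.2, 0.0, 0.8]])
mu, sigma = np.array([-2.0, 0.0, 2.0]), 0.5

def sample_hmm(N):
    """Draw latent states z_n from the chain, then x_n ~ N(mu[z_n], sigma^2)."""
    z = [rng.choice(K, p=pi)]
    for _ in range(N - 1):
        z.append(rng.choice(K, p=A[z[-1]]))
    x = rng.normal(mu[np.array(z)], sigma)
    return np.array(z), x

z, x = sample_hmm(8)
print(z); print(x.round(2))
```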

62 Illustration of HMM
[Trellis/lattice diagram: states k = 1, 2, 3 unrolled over time steps n - 2, ..., n + 1, with transition probabilities A_11, ..., A_33 along the arcs]


64 Maximum likelihood for the HMM
If we observe a set of data X = {x_1, ..., x_N} we can estimate the parameters using ML:
p(X | θ) = Σ_Z p(X, Z | θ)
i.e. a summation over paths through the lattice. We can use EM as a strategy to find a solution:
- E-step: estimation of p(Z | X, θ^old)
- M-step: maximize the expected log likelihood over θ

65 ML solution to HMM
Define
Q(θ, θ^old) = Σ_Z p(Z | X, θ^old) ln p(X, Z | θ)
γ(z_n) = p(z_n | X, θ^old), with γ(z_nk) = E[z_nk]
ξ(z_{n-1}, z_n) = p(z_{n-1}, z_n | X, θ^old), with ξ(z_{n-1,j}, z_nk) = E[z_{n-1,j} z_nk]

66 ML solution to HMM
The quantities can be computed as
π_k = γ(z_1k) / Σ_{j=1}^K γ(z_1j)
A_jk = Σ_{n=2}^N ξ(z_{n-1,j}, z_nk) / Σ_{l=1}^K Σ_{n=2}^N ξ(z_{n-1,j}, z_nl)
Assume p(x | φ_k) = N(x | µ_k, Σ_k), so that
µ_k = Σ_n γ(z_nk) x_n / Σ_n γ(z_nk)
Σ_k = Σ_n γ(z_nk)(x_n - µ_k)(x_n - µ_k)^T / Σ_n γ(z_nk)
How do we efficiently compute γ(z_nk)?
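These updates translate directly into code. The sketch below (function name and test data are ours, not from the slides) re-estimates π, A and the Gaussian emission parameters from given γ (N x K) and ξ (N-1 x K x K) arrays.

```python
import numpy as np

def m_step(gamma, xi, X):
    """Re-estimate (pi, A, mu, Sigma) from E-step quantities.

    gamma: (N, K) posteriors gamma(z_nk); xi: (N-1, K, K) posteriors
    xi(z_{n-1,j}, z_nk); X: (N, D) observations.
    """
    pi = gamma[0] / gamma[0].sum()
    A = xi.sum(axis=0)
    A /= A.sum(axis=1, keepdims=True)      # normalize each row j over k
    Nk = gamma.sum(axis=0)                 # effective counts per state
    mu = (gamma.T @ X) / Nk[:, None]
    K, D = mu.shape
    Sigma = np.zeros((K, D, D))
    for k in range(K):
        d = X - mu[k]
        Sigma[k] = (gamma[:, k, None] * d).T @ d / Nk[k]
    return pi, A, mu, Sigma

# Quick consistency demo with random placeholder inputs.
rng = np.random.default_rng(0)
N, K, D = 20, 3, 2
gamma = rng.random((N, K)); gamma /= gamma.sum(axis=1, keepdims=True)
xi = rng.random((N - 1, K, K))
pi, A, mu, Sigma = m_step(gamma, xi, rng.normal(size=(N, D)))
assert np.allclose(pi.sum(), 1.0) and np.allclose(A.sum(axis=1), 1.0)
```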


68 Forward-Backward / Baum-Welch
How can we efficiently compute γ(·) and ξ(·, ·)? Remember the HMM is a tree model, so using message passing we can compute the quantities efficiently (remember the earlier discussion?). The message passing has two parts, forward and backward, for any component. We have
γ(z_n) = p(z_n | X) = p(X | z_n) p(z_n) / p(X)
and from earlier
γ(z_n) = α(z_n) β(z_n) / p(X)

69 Forward-Backward
We can then compute
α(z_n) = p(x_n | z_n) Σ_{z_{n-1}} α(z_{n-1}) p(z_n | z_{n-1})
β(z_n) = Σ_{z_{n+1}} β(z_{n+1}) p(x_{n+1} | z_{n+1}) p(z_{n+1} | z_n)
p(X) = Σ_{z_n} α(z_n) β(z_n)
ξ(z_{n-1}, z_n) = α(z_{n-1}) p(x_n | z_n) p(z_n | z_{n-1}) β(z_n) / p(X)
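In code the recursions look as follows. This is an unscaled sketch for clarity; a practical implementation rescales α and β at every step to avoid underflow. The emission likelihoods B[n, k] = p(x_n | z_n = k) are assumed precomputed, and the test inputs are made up.

```python
import numpy as np

def forward_backward(pi, A, B):
    """Unscaled alpha-beta recursions for an HMM.

    pi: (K,) initial probabilities; A: (K, K) transitions A[j, k];
    B: (N, K) emission likelihoods B[n, k] = p(x_n | z_n = k).
    Returns gamma (N, K), xi (N-1, K, K) and the likelihood p(X).
    """
    N, K = B.shape
    alpha = np.zeros((N, K)); beta = np.zeros((N, K))
    alpha[0] = pi * B[0]
    for n in range(1, N):                       # alpha(z_n) recursion
        alpha[n] = B[n] * (A.T @ alpha[n - 1])
    beta[N - 1] = 1.0
    for n in range(N - 2, -1, -1):              # beta(z_n) recursion
        beta[n] = A @ (B[n + 1] * beta[n + 1])
    pX = alpha[N - 1].sum()                     # p(X) = sum_{z_N} alpha(z_N)
    gamma = alpha * beta / pX
    xi = alpha[:-1, :, None] * A[None] * (B[1:] * beta[1:])[:, None, :] / pX
    return gamma, xi, pX

# Demo: the posteriors are properly normalized.
rng = np.random.default_rng(0)
pi = np.array([0.6, 0.4]); A = np.array([[0.7, 0.3], [0.4, 0.6]])
gamma, xi, pX = forward_backward(pi, A, rng.random((5, 2)))
assert np.allclose(gamma.sum(axis=1), 1.0) and np.allclose(xi.sum(axis=(1, 2)), 1.0)
```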

70 Sum-product algorithm for the HMM
Given that the HMM is a tree structure, we can use the sum-product rule to compute marginals. We can derive a simple factor graph for the tree:
[Factor graph: a factor on z_1, pairwise factors ψ_n between z_{n-1} and z_n, and emission factors g_n linking z_n to x_n]

71 Sum-product algorithm for the HMM
We can then compute the factors
h(z_1) = p(z_1) p(x_1 | z_1)
f_n(z_{n-1}, z_n) = p(z_n | z_{n-1}) p(x_n | z_n)
The update factors µ_{f_n→z_n}(z_n) can be used to derive message passing with α(·) and β(·).


73 Viterbi Algorithm
Using the message-passing framework it is possible to determine the most likely solution (i.e. the best recognition). Intuitively:
- Keep track only of the most likely / probable path through the graph
- At any time there are only K possible paths to maintain
- Basically a greedy evaluation of the best solution
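A log-domain sketch of the algorithm (our notation, with B[n, k] = p(x_n | z_n = k) as before); it maintains exactly K partial paths and backtracks at the end. The demo parameters are made up.

```python
import numpy as np

def viterbi(pi, A, B):
    """Most probable state path for an HMM, computed in the log domain."""
    N, K = B.shape
    logd = np.log(pi) + np.log(B[0])            # omega(z_1)
    back = np.zeros((N, K), dtype=int)          # best predecessor per state
    for n in range(1, N):
        scores = logd[:, None] + np.log(A)      # scores[j, k]: from j to k
        back[n] = scores.argmax(axis=0)
        logd = scores.max(axis=0) + np.log(B[n])
    path = [int(logd.argmax())]                 # best final state
    for n in range(N - 1, 0, -1):               # backtrack the K kept paths
        path.append(int(back[n, path[-1]]))
    return path[::-1], float(logd.max())

# Tiny demo with made-up likelihoods B[n, k] = p(x_n | z_n = k).
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.9, 0.2], [0.8, 0.3], [0.1, 0.95]])
print(viterbi(pi, A, B))   # prints the best path and its log-probability
```

Replacing the max/argmax with a sum recovers the forward pass, which is the sum-product versus max-sum distinction.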


75 Small example of gesture tracking
Tracking of hands using an HMM to interpret tracks:
- Pre-process images to generate tracks
- Color segmentation
- Track regions using a Kalman filter
- Interpret tracks using an HMM

76 Pre-process architecture
[Block diagram of the pre-processing pipeline: camera, RGB image, color segmentation via an adaptive lookup table (initial HSV thresholds, color model, histogram generation), binary image, downscaling to a density map, thresholding and connected components, selection of the N largest blobs, and a tracker producing head, left-hand and right-hand blobs]

77 Basic idea
[Figure]

78 Tracking
[Plot: motion measure vs. frame number, with a detection threshold and three segments labelled A, B, C]

79 Motion Patterns
Attention, Idle, Forward, Back, Left, Right, Turn left, Turn right, Faster, Slower, Stop

80 Evaluation
- Acquired 2230 image sequences covering 5 people in a normal living room
- 1115 sequences were used for training, 1115 for evaluation
- Capture of position and velocity data
[Table: recognition rate results [%] for position, velocity, and combined features]

81 Example Timing
Phase            | Time/frame [ms]
Image transfer   | 4.3
Segmentation     | 0.6
Density est.     | 2.1
Connected comp.  | 2.1
Kalman filter    | 0.3
HMM              | 21.0
Total            | 30.4


83 Summary
- The Hidden Markov Model (HMM)
- There are many uses for sequential data models; the HMM is one possible formulation
- Autoregressive models are common in data processing
