Machine Learning Lecture 16
Course Outline
- Fundamentals (2 weeks): Bayes Decision Theory; Probability Density Estimation
- Discriminative Approaches (5 weeks): Linear Discriminant Functions; Statistical Learning Theory & SVMs; Ensemble Methods & Boosting; Decision Trees & Randomized Trees
- Generative Models (4 weeks): Bayesian Networks; Markov Random Fields; Exact Inference; Undirected Graphical Models & Inference
Bastian Leibe, RWTH Aachen. Many slides adapted from B. Schiele, S. Roth, Z. Ghahramani.

Topics of This Lecture
- Recap: Directed Graphical Models (Bayesian Networks): factorization properties; conditional independence; Bayes ball algorithm
- Undirected Graphical Models (Markov Random Fields): conditional independence; factorization; example application: image restoration; converting directed into undirected graphs
- Exact Inference in Graphical Models: marginalization for undirected graphs; inference on a chain; inference on a tree; message passing formalism

Recap: Graphical Models
Two basic kinds of graphical models: directed graphical models (Bayesian Networks) and undirected graphical models (Markov Random Fields).
Key components: nodes represent random variables; edges are directed or undirected. The value of a random variable may be known (observed) or unknown.
Slide credit: Bernt Schiele

Recap: Directed Graphical Models
Chains of nodes: knowledge about a is expressed by the prior probability p(a); dependencies are expressed through conditional probabilities p(b|a), p(c|b). Joint distribution of all three variables:
p(a, b, c) = p(c|a, b) p(a, b) = p(c|b) p(b|a) p(a)
Convergent connections: here the value of c depends on both variables a and b. This is modeled with the conditional probability p(c|a, b). Therefore, the joint probability of all three variables is given as:
p(a, b, c) = p(c|a, b) p(a, b) = p(c|a, b) p(a) p(b)
Slide credit: Bernt Schiele, Stefan Roth
Recap: Factorization of the Joint Probability
Computing the joint probability:
p(x_1, ..., x_7) = p(x_1) p(x_2) p(x_3) p(x_4|x_1, x_2, x_3) p(x_5|x_1, x_3) p(x_6|x_4) p(x_7|x_4, x_5)
General factorization: we can directly read off the factorization of the joint from the network structure! It is the edges that are missing in the graph that are important: they encode the simplifying assumptions we make.
Image source: C. M. Bishop, 2006

Recap: Factorized Representation
Reduction of complexity: the joint probability of n binary variables, represented by brute force, requires O(2^n) terms. The factorized form obtained from the graphical model only requires O(n 2^k) terms, where k is the maximum number of parents of a node.
Slide credit: Bernt Schiele, Stefan Roth

Recap: Conditional Independence
X is conditionally independent of Y given V. Often, we are interested in conditional independence between sets of variables.
Three cases:
- Divergent ("Tail-to-Tail"): conditional independence when c is observed. Special case: marginal independence.
- Chain ("Head-to-Tail"): conditional independence when c is observed.
- Convergent ("Head-to-Head"): conditional independence when neither c, nor any of its descendants, are observed.
Image source: C. M. Bishop, 2006

Recap: D-Separation
Definition: Let A, B, and C be non-intersecting subsets of nodes in a directed graph. A path from A to B is blocked if it contains a node such that either
- the arrows on the path meet either head-to-tail or tail-to-tail at the node, and the node is in the set C, or
- the arrows meet head-to-head at the node, and neither the node, nor any of its descendants, are in the set C.
If all paths from A to B are blocked, A is said to be d-separated from B by C. If A is d-separated from B by C, the joint distribution over all variables in the graph satisfies A ⊥ B | C. Read: A is conditionally independent of B given C.
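The complexity reduction can be made concrete with a short sketch (not from the lecture; the parent sets follow the seven-variable example above, all variables binary):

```python
# Parent sets of the example network
# p(x1)p(x2)p(x3)p(x4|x1,x2,x3)p(x5|x1,x3)p(x6|x4)p(x7|x4,x5).
# A brute-force joint table needs 2^n entries; the factorized form needs,
# per node, one entry per configuration of its parents (binary child).
parents = {
    "x1": [], "x2": [], "x3": [],
    "x4": ["x1", "x2", "x3"],
    "x5": ["x1", "x3"],
    "x6": ["x4"],
    "x7": ["x4", "x5"],
}

n = len(parents)
brute_force_terms = 2 ** n                             # full joint table
factorized_terms = sum(2 ** len(p) for p in parents.values())

print(brute_force_terms)   # 128
print(factorized_terms)    # 3 + 8 + 4 + 2 + 4 = 21
```

With k = 3 (the maximum number of parents), the O(n 2^k) bound from the slide gives 56; the actual count here is 21, versus 128 by brute force.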
Slide adapted from Chris Bishop

Intuitive View: The Bayes Ball Algorithm
Game: can you get a ball from X to Y without being blocked by V? Depending on its direction and the previous node, the ball can
- pass through (from parent to all children, from child to all parents),
- bounce back (from any parent/child to all parents/children),
- be blocked.
R. D. Shachter, Bayes-Ball: The Rational Pastime (for Determining Irrelevance and Requisite Information in Belief Networks and Influence Diagrams), UAI'98, 1998.
Slide adapted from Zoubin Ghahramani
The Bayes Ball Algorithm
Game rules:
- An unobserved node (W ∉ V) passes through balls from parents, but also bounces back balls from children.
- An observed node (W ∈ V) bounces back balls from parents, but blocks balls from children.
The Bayes ball algorithm determines those nodes that are d-separated from the query node.
Image source: R. Shachter, 1998

Example: Bayes Ball
(A sequence of slides applies these rules one at a time, asking in each step: which nodes are d-separated from the query node, given the observed nodes?)
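As an illustration (my own sketch, not Shachter's original pseudocode), the two game rules can be turned into a breadth-first search over (node, direction) states; `parents` maps each node to its parent list, and the returned set contains the nodes d-separated from the query given the observed set:

```python
from collections import deque

def d_separated(parents, query, observed):
    """Nodes d-separated from `query` given `observed`, via Bayes ball."""
    children = {n: [] for n in parents}
    for n, ps in parents.items():
        for p in ps:
            children[p].append(n)

    # 'up' = ball arrived from a child, 'down' = ball arrived from a parent.
    visited, reached = set(), set()
    queue = deque([(query, "up")])        # start with a virtual ball from a child
    while queue:
        node, direction = queue.popleft()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node not in observed:
            reached.add(node)
        if direction == "up" and node not in observed:
            for p in parents[node]:       # pass through to parents
                queue.append((p, "up"))
            for c in children[node]:      # bounce back to children
                queue.append((c, "down"))
        elif direction == "down":
            if node in observed:
                for p in parents[node]:   # observed node bounces to parents
                    queue.append((p, "up"))
            else:
                for c in children[node]:  # unobserved node passes to children
                    queue.append((c, "down"))
    return set(parents) - reached - set(observed) - {query}
```

For the v-structure a → c ← b this gives `d_separated(..., 'a', set()) == {'b'}` (marginal independence) but an empty set once c is observed (explaining away).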
Example: Bayes Ball (conclusion)
Applying the rules, the query node is d-separated from the remaining unreached nodes given the observed nodes.

The Markov Blanket
Markov blanket of a node x_i: the minimal set of nodes that isolates x_i from the rest of the graph. This comprises the set of parents, children, and co-parents of x_i. This is what we have to watch out for!
Image source: C. M. Bishop, 2006

Undirected Graphical Models
Undirected graphical models ("Markov Random Fields") are given by an undirected graph. Conditional independence is easier to read off for MRFs: without arrows, there is only one type of neighbor, and the Markov blanket is simpler (just the direct neighbors).
Conditional independence for undirected graphs: if every path from any node in set A to any node in set B passes through at least one node in set C, then A ⊥ B | C.
Image source: C. M. Bishop, 2006

Factorization in MRFs
Factorization is more complicated in MRFs than in BNs. Important concepts:
- Clique: a subset of the nodes such that there exists a link between all pairs of nodes in the subset.
- Maximal clique: the biggest possible such clique in a given graph.
Image source: C. M. Bishop, 2006
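The undirected separation criterion is just graph reachability with the conditioning set removed; a minimal sketch (names are illustrative):

```python
from collections import deque

def separated(neighbors, A, B, C):
    """A ⊥ B | C in an undirected graph: every path from A to B must pass
    through C. `neighbors` maps each node to its adjacent nodes."""
    blocked = set(C)
    frontier = deque(set(A) - blocked)
    visited = set(frontier)
    while frontier:
        node = frontier.popleft()
        if node in B:
            return False                  # found a path that avoids C
        for nb in neighbors[node]:
            if nb not in visited and nb not in blocked:
                visited.add(nb)
                frontier.append(nb)
    return True
```

On the path graph a - b - c, observing {b} separates {a} from {c}; with nothing observed, it does not.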
Factorization in MRFs (cont'd)
Joint distribution: written as a product of potential functions ψ_C over the maximal cliques in the graph:
p(x) = (1/Z) Π_C ψ_C(x_C)
The normalization constant Z is called the partition function:
Z = Σ_x Π_C ψ_C(x_C)
Remarks: BNs are automatically normalized, but for MRFs we have to explicitly perform the normalization. The presence of the normalization constant is a major limitation: evaluation of Z involves summing over O(K^M) terms for M nodes (with K states each).

Role of the potential functions
General interpretation: there is no restriction to potential functions that have a specific probabilistic interpretation as marginals or conditional distributions. It is convenient to express them as exponential functions ("Boltzmann distribution"):
ψ_C(x_C) = exp{-E(x_C)}
with an energy function E. Why is this convenient? The joint distribution is the product of the potentials, so the joint energy is the sum of the clique energies: we can take the log and simply work with sums.

Comparison: Directed vs. Undirected Graphs
Directed graphs (Bayesian networks): better at expressing causal relationships. Interpretation of a link a → b: conditional probability p(b|a). Factorization is simple (and the result is automatically normalized), but conditional independence is more complicated.
Undirected graphs (Markov Random Fields): better at representing soft constraints between variables. Interpretation of an undirected link between a and b: there is some relationship between a and b. Factorization is complicated (and the result needs normalization), but conditional independence is simple.
Slide adapted from Chris Bishop

Converting Directed to Undirected Graphs
Simple case: chain. We can directly replace the directed links by undirected ones.
More difficult case: multiple parents. The parents must become fully connected, since there is no conditional independence between them! We need to introduce additional links ("marry the parents"). This process is called moralization; it results in the moral graph.
General procedure to convert directed → undirected:
1. Add undirected links to marry the parents of each node.
2. Drop the arrows on the original links → moral graph.
3. Find the maximal cliques for each node and initialize all clique potentials to 1.
4. Take each conditional distribution factor of the original directed graph and multiply it into one clique potential. (E.g., we need a clique over x_1, ..., x_4 to represent the factor p(x_4|x_1, x_2, x_3).)
Restriction: conditional independence properties are often lost! Moralization results in additional connections and larger cliques.
Slide adapted from Chris Bishop; Image source: C. M. Bishop, 2006
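Steps 1 and 2 of the procedure (marrying the parents, dropping the arrows) can be sketched as follows; `parents` maps each node to its parent list, and the result is the undirected adjacency of the moral graph:

```python
from itertools import combinations

def moralize(parents):
    """Marry the parents of each node, then drop the arrows.
    Returns the undirected adjacency (node -> set of neighbors)."""
    und = {n: set() for n in parents}
    for child, ps in parents.items():
        for p in ps:                      # drop arrows on the original links
            und[child].add(p)
            und[p].add(child)
        for u, v in combinations(ps, 2):  # marry the parents pairwise
            und[u].add(v)
            und[v].add(u)
    return und
```

For the seven-variable example, this adds links x1-x2, x1-x3, x2-x3 (parents of x4) and x4-x5 (parents of x7), exactly the extra connections that enlarge the cliques.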
Example: Graph Conversion
Step 1) Marrying the parents.
Step 2) Dropping the arrows.
p(x_1, ..., x_7) = p(x_1) p(x_2) p(x_3) p(x_4|x_1, x_2, x_3) p(x_5|x_1, x_3) p(x_6|x_4) p(x_7|x_4, x_5)
Step 3) Finding the maximal cliques for each node: ψ_1(x_1, x_2, x_3, x_4), ψ_2(x_1, x_3, x_4, x_5), ψ_3(x_4, x_5, x_7), ψ_4(x_4, x_6).
Step 4) Assigning the probabilities to clique potentials:
ψ_1(x_1, x_2, x_3, x_4) = p(x_4|x_1, x_2, x_3) p(x_2) p(x_3)
ψ_2(x_1, x_3, x_4, x_5) = p(x_5|x_1, x_3) p(x_1)
ψ_3(x_4, x_5, x_7) = p(x_7|x_4, x_5)
ψ_4(x_4, x_6) = p(x_6|x_4)
Since each conditional factor is multiplied into exactly one clique potential, the product of the potentials equals the original joint and Z = 1.

Comparison of Expressive Power
Both types of graphs have unique configurations: there are independence structures that no directed graph can represent exactly (these and only these independencies), and others that no undirected graph can represent exactly.
Slide adapted from Chris Bishop; Image source: C. M. Bishop, 2006
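The Step 4 assignment can be checked numerically: with each conditional factor multiplied into one clique potential, the product of the potentials reproduces the directed joint with Z = 1. A sketch with hypothetical random CPTs for binary variables:

```python
import itertools
import random

random.seed(0)

# Hypothetical random CPTs: cpt[node][parent_values] = p(node=1 | parents)
def make_cpt(n_parents):
    return {pv: random.random()
            for pv in itertools.product((0, 1), repeat=n_parents)}

cpt = {"x1": make_cpt(0), "x2": make_cpt(0), "x3": make_cpt(0),
       "x4": make_cpt(3), "x5": make_cpt(2), "x6": make_cpt(1), "x7": make_cpt(2)}
parents = {"x1": (), "x2": (), "x3": (), "x4": ("x1", "x2", "x3"),
           "x5": ("x1", "x3"), "x6": ("x4",), "x7": ("x4", "x5")}

def p_cond(node, val, assign):
    p1 = cpt[node][tuple(assign[p] for p in parents[node])]
    return p1 if val == 1 else 1.0 - p1

names = ["x1", "x2", "x3", "x4", "x5", "x6", "x7"]
for values in itertools.product((0, 1), repeat=7):
    a = dict(zip(names, values))
    directed = 1.0
    for n in names:
        directed *= p_cond(n, a[n], a)
    # Clique potentials exactly as assigned on the slide (Z = 1):
    psi1 = p_cond("x4", a["x4"], a) * p_cond("x2", a["x2"], a) * p_cond("x3", a["x3"], a)
    psi2 = p_cond("x5", a["x5"], a) * p_cond("x1", a["x1"], a)
    psi3 = p_cond("x7", a["x7"], a)
    psi4 = p_cond("x6", a["x6"], a)
    assert abs(directed - psi1 * psi2 * psi3 * psi4) < 1e-12
```

The assertion holds for all 128 assignments because the potentials are just a regrouping of the same seven factors.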
Inference in Graphical Models
Inference, general definition: evaluate the probability distribution over some set of variables, given the values of another set of variables (= observations).
Example 1 (chain a → b → c): goal: compute the marginals
p(a) = Σ_{b,c} p(a) p(b|a) p(c|b)
p(b) = Σ_{a,c} p(a) p(b|a) p(c|b)
Example 2 (b is observed to be b_0):
p(a|b = b_0) ∝ Σ_c p(a) p(b = b_0|a) p(c|b = b_0) = p(a) p(b = b_0|a)
p(c|b = b_0) ∝ Σ_a p(a) p(b = b_0|a) p(c|b = b_0) ∝ p(c|b = b_0)
Slide adapted from Stefan Roth, Bernt Schiele

Example: given the joint
p(A, B, C, D, E) = p(A) p(B) p(C|A, B) p(D|B, C) p(E|C, D),
how can we compute p(A|C = c)? Idea:
p(A|C = c) = p(A, C = c) / p(C = c)
Slide credit: Zoubin Ghahramani

Computing p(A|C = c): assume each variable is binary.
Naive approach:
p(A, C = c) = Σ_{B,D,E} p(A, B, C = c, D, E)   (two values for each of B, D, E, evaluated for both values of A: 2^4 = 16 operations)
p(C = c) = Σ_A p(A, C = c)   (2 operations)
p(A|C = c) = p(A, C = c) / p(C = c)   (2 operations)
Total: 16 + 2 + 2 = 20 operations.
More efficient method for p(A|C = c):
p(A, C = c) = Σ_{B,D,E} p(A) p(B) p(C = c|A, B) p(D|B, C = c) p(E|C = c, D)
            = Σ_B p(A) p(B) p(C = c|A, B) Σ_D p(D|B, C = c) Σ_E p(E|C = c, D)
            = Σ_B p(A) p(B) p(C = c|A, B)   (the inner sums over E and D each equal 1; 4 operations)
The rest stays the same. Total: 4 + 2 + 2 = 8 operations.
Strategy: use the conditional independence in the graph to perform efficient inference. For singly connected graphs, this yields exponential gains in efficiency!
Slide credit: Zoubin Ghahramani

Computing Marginals
How do we apply graphical models? Given some observed variables, we want to compute distributions of the unobserved variables; in particular, we want to compute marginal distributions, for example p(x_4). How can we compute marginals? Classical technique: the sum-product algorithm by Judea Pearl. In the context of (loopy) undirected models, this is also called (loopy) belief propagation [Weiss, 1997]. Basic idea: message passing.
Slide credit: Bernt Schiele, Stefan Roth

Inference on a Chain
Chain graph with joint probability
p(x) = (1/Z) ψ_{1,2}(x_1, x_2) ψ_{2,3}(x_2, x_3) ... ψ_{N-1,N}(x_{N-1}, x_N)
Marginalization:
p(x_n) = Σ_{x_1} ... Σ_{x_{n-1}} Σ_{x_{n+1}} ... Σ_{x_N} p(x)
Slide adapted from Chris Bishop; Image source: C. M. Bishop, 2006
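The 20-versus-8-operation argument can be verified numerically; a sketch with hypothetical random binary CPTs (the names pA, pB, pC, pD, pE are illustrative):

```python
import itertools
import random

random.seed(1)

pA = {1: 0.3, 0: 0.7}
pB = {1: 0.6, 0: 0.4}
def rand_cpt(n):  # table of p(child=1 | parent configuration)
    return {k: random.random() for k in itertools.product((0, 1), repeat=n)}
pC, pD, pE = rand_cpt(2), rand_cpt(2), rand_cpt(2)
def bern(p1, v):
    return p1 if v == 1 else 1.0 - p1

c = 1
# Naive: sum the full joint over B, D, E for each value of A
naive = {}
for A in (0, 1):
    naive[A] = sum(pA[A] * pB[B] * bern(pC[(A, B)], c)
                   * bern(pD[(B, c)], D) * bern(pE[(c, D)], E)
                   for B, D, E in itertools.product((0, 1), repeat=3))

# Efficient: the inner sums over D and E are each 1, leaving only the sum over B
eff = {A: sum(pA[A] * pB[B] * bern(pC[(A, B)], c) for B in (0, 1))
       for A in (0, 1)}

assert all(abs(naive[A] - eff[A]) < 1e-12 for A in (0, 1))
Z = eff[0] + eff[1]                            # p(C = c)
posterior = {A: eff[A] / Z for A in (0, 1)}    # p(A | C = c)
```

Both orderings give identical values for p(A, C = c); only the number of multiply-adds differs.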
Inference on a Chain (cont'd)
Idea: split the computation into two parts ("messages"):
p(x_n) = (1/Z) μ_α(x_n) μ_β(x_n)
We can define the messages recursively:
μ_α(x_n) = Σ_{x_{n-1}} ψ_{n-1,n}(x_{n-1}, x_n) μ_α(x_{n-1}),   with μ_α(x_1) = 1
μ_β(x_n) = Σ_{x_{n+1}} ψ_{n,n+1}(x_n, x_{n+1}) μ_β(x_{n+1}),   with μ_β(x_N) = 1
until we reach the leaf nodes. Interpretation: we pass messages from the two ends towards the query node x_n.
Slide adapted from Chris Bishop; Image source: C. M. Bishop, 2006

Summary: Inference on a Chain
To compute local marginals:
- Compute and store all forward messages μ_α(x_n).
- Compute and store all backward messages μ_β(x_n).
- Compute p(x_m) = (1/Z) μ_α(x_m) μ_β(x_m) at any node x_m, for all variables required.
- We still need the normalization constant Z. This can be easily obtained from the marginals: Z = Σ_{x_m} μ_α(x_m) μ_β(x_m).
Inference through message passing: we have thus seen a first message passing algorithm. How can we generalize this?
Slide adapted from Chris Bishop

Inference on Trees
Let's next assume a tree graph. Example: we are given the following joint distribution:
p(A, B, C, D, E) = (1/Z) f_1(A, B) f_2(B, D) f_3(C, D) f_4(D, E)
Strategy: assume we want to know the marginal p(E). Marginalize out all other variables by summing over them, then rearrange terms:
p(E) = Σ_D Σ_C Σ_B Σ_A p(A, B, C, D, E)
     = (1/Z) Σ_D f_4(D, E) (Σ_C f_3(C, D)) (Σ_B f_2(B, D) (Σ_A f_1(A, B)))
Slide credit: Bernt Schiele, Stefan Roth
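A minimal sketch of the forward/backward recursions on a chain, checked against brute-force marginalization (the pairwise potentials are random; NumPy is assumed):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
K, N = 3, 5                           # K states per node, N nodes in the chain
psi = [rng.random((K, K)) for _ in range(N - 1)]   # potentials psi_{n,n+1}

mu_a = [np.ones(K)]                   # mu_alpha(x_1) = 1
for n in range(N - 1):                # mu_alpha(x_{n+1}) = sum_{x_n} psi mu_alpha
    mu_a.append(psi[n].T @ mu_a[-1])
mu_b = [np.ones(K)]                   # mu_beta(x_N) = 1
for n in reversed(range(N - 1)):      # mu_beta(x_n) = sum_{x_{n+1}} psi mu_beta
    mu_b.append(psi[n] @ mu_b[-1])
mu_b = mu_b[::-1]

m = 2                                 # query node
unnorm = mu_a[m] * mu_b[m]
Z = unnorm.sum()                      # partition function from any node's marginal
p_m = unnorm / Z

# Brute-force check over all K^N joint states
brute = np.zeros(K)
for states in itertools.product(range(K), repeat=N):
    w = 1.0
    for n in range(N - 1):
        w *= psi[n][states[n], states[n + 1]]
    brute[states[m]] += w
brute /= brute.sum()
assert np.allclose(p_m, brute)
```

Message passing costs O(N K^2) instead of the O(K^N) of the brute-force sum.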
Marginalization with Messages
Use messages to express the marginalization:
m_{A→B}(B) = Σ_A f_1(A, B)
m_{C→D}(D) = Σ_C f_3(C, D)
m_{B→D}(D) = Σ_B f_2(B, D) m_{A→B}(B)
m_{D→E}(E) = Σ_D f_4(D, E) m_{B→D}(D) m_{C→D}(D)
Substituting step by step:
p(E) = (1/Z) Σ_D f_4(D, E) (Σ_C f_3(C, D)) (Σ_B f_2(B, D) (Σ_A f_1(A, B)))
     = (1/Z) Σ_D f_4(D, E) (Σ_C f_3(C, D)) (Σ_B f_2(B, D) m_{A→B}(B))
     = (1/Z) Σ_D f_4(D, E) (Σ_C f_3(C, D)) m_{B→D}(D)
     = (1/Z) Σ_D f_4(D, E) m_{B→D}(D) m_{C→D}(D)
     = (1/Z) m_{D→E}(E)
Slide credit: Bernt Schiele, Stefan Roth

Inference on Trees: How Can We Generalize?
We can generalize this to all tree graphs (undirected trees, directed trees, polytrees):
- Root the tree at the variable that we want to compute the marginal of.
- Start computing messages at the leaves.
- Compute the messages for all nodes for which all incoming messages have already been computed.
- Repeat until we reach the root.
If we want to compute the marginals for all possible nodes (roots), we can reuse some of the messages. The computational expense is linear in the number of nodes.
Next lecture: formalize the message-passing idea (sum-product algorithm); a common representation of the above (factor graphs); dealing with loopy graph structures (junction tree algorithm).
Slide credit: Bernt Schiele, Stefan Roth; Image source: C. M. Bishop, 2006
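Assuming the tree structure f_1(A, B), f_2(B, D), f_3(C, D), f_4(D, E) used in the example (an assumption of this reconstruction), the four messages can be computed with arrays and checked against brute force:

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
K = 2
# Random binary factors for the tree A-B, B-D, C-D, D-E
f1, f2, f3, f4 = (rng.random((K, K)) for _ in range(4))

# Messages toward the root E:
m_AB = f1.sum(axis=0)                            # m_{A->B}(B) = sum_A f1(A,B)
m_CD = f3.sum(axis=0)                            # m_{C->D}(D) = sum_C f3(C,D)
m_BD = (f2 * m_AB[:, None]).sum(axis=0)          # sum_B f2(B,D) m_{A->B}(B)
m_DE = (f4 * (m_BD * m_CD)[:, None]).sum(axis=0) # sum_D f4(D,E) m_BD(D) m_CD(D)
p_E = m_DE / m_DE.sum()                          # normalizing recovers 1/Z

# Brute-force check over all 2^5 joint states
brute = np.zeros(K)
for A, B, C, D, E in itertools.product(range(K), repeat=5):
    brute[E] += f1[A, B] * f2[B, D] * f3[C, D] * f4[D, E]
brute /= brute.sum()
assert np.allclose(p_E, brute)
```

Each message is computed once per edge, which is what makes the overall cost linear in the number of nodes.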
References and Further Reading
A thorough introduction to Graphical Models in general and Bayesian Networks in particular can be found in Chapter 8 of Bishop's book: Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
CS242: Probabilistic Graphical Models Lecture 2B: Loopy Belief Propagation & Junction Trees Professor Erik Sudderth Brown University Computer Science September 22, 2016 Some figures and materials courtesy
More informationECE521 Lecture 21 HMM cont. Message Passing Algorithms
ECE521 Lecture 21 HMM cont Message Passing Algorithms Outline Hidden Markov models Numerical example of figuring out marginal of the observed sequence Numerical example of figuring out the most probable
More informationRecall from last time. Lecture 4: Wrap-up of Bayes net representation. Markov networks. Markov blanket. Isolating a node
Recall from last time Lecture 4: Wrap-up of Bayes net representation. Markov networks Markov blanket, moral graph Independence maps and perfect maps Undirected graphical models (Markov networks) A Bayes
More information2. Graphical Models. Undirected graphical models. Factor graphs. Bayesian networks. Conversion between graphical models. Graphical Models 2-1
Graphical Models 2-1 2. Graphical Models Undirected graphical models Factor graphs Bayesian networks Conversion between graphical models Graphical Models 2-2 Graphical models There are three families of
More informationGraphical models are a lot like a circuit diagram they are written down to visualize and better understand a problem.
Machine Learning (ML, F16) Lecture#15 (Tuesday, Nov. 1st) Lecturer: Byron Boots Graphical Models 1 Graphical Models Often, one is interested in representing a joint distribution P over a set of n random
More information1 : Introduction to GM and Directed GMs: Bayesian Networks. 3 Multivariate Distributions and Graphical Models
10-708: Probabilistic Graphical Models, Spring 2015 1 : Introduction to GM and Directed GMs: Bayesian Networks Lecturer: Eric P. Xing Scribes: Wenbo Liu, Venkata Krishna Pillutla 1 Overview This lecture
More informationECE521: Week 11, Lecture March 2017: HMM learning/inference. With thanks to Russ Salakhutdinov
ECE521: Week 11, Lecture 20 27 March 2017: HMM learning/inference With thanks to Russ Salakhutdinov Examples of other perspectives Murphy 17.4 End of Russell & Norvig 15.2 (Artificial Intelligence: A Modern
More informationLecture 5: Exact inference
Lecture 5: Exact inference Queries Inference in chains Variable elimination Without evidence With evidence Complexity of variable elimination which has the highest probability: instantiation of all other
More informationGraph Theory. Probabilistic Graphical Models. L. Enrique Sucar, INAOE. Definitions. Types of Graphs. Trajectories and Circuits.
Theory Probabilistic ical Models L. Enrique Sucar, INAOE and (INAOE) 1 / 32 Outline and 1 2 3 4 5 6 7 8 and 9 (INAOE) 2 / 32 A graph provides a compact way to represent binary relations between a set of
More informationStatistical Techniques in Robotics (STR, S15) Lecture#06 (Wednesday, January 28)
Statistical Techniques in Robotics (STR, S15) Lecture#06 (Wednesday, January 28) Lecturer: Byron Boots Graphical Models 1 Graphical Models Often one is interested in representing a joint distribution P
More informationCOS 513: Foundations of Probabilistic Modeling. Lecture 5
COS 513: Foundations of Probabilistic Modeling Young-suk Lee 1 Administrative Midterm report is due Oct. 29 th. Recitation is at 4:26pm in Friend 108. Lecture 5 R is a computer language for statistical
More informationMotivation: Shortcomings of Hidden Markov Model. Ko, Youngjoong. Solution: Maximum Entropy Markov Model (MEMM)
Motivation: Shortcomings of Hidden Markov Model Maximum Entropy Markov Models and Conditional Random Fields Ko, Youngjoong Dept. of Computer Engineering, Dong-A University Intelligent System Laboratory,
More information4 Factor graphs and Comparing Graphical Model Types
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms for Inference Fall 2014 4 Factor graphs and Comparing Graphical Model Types We now introduce
More informationMachine Learning
Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University April 1, 2019 Today: Inference in graphical models Learning graphical models Readings: Bishop chapter 8 Bayesian
More informationDynamic Bayesian network (DBN)
Readings: K&F: 18.1, 18.2, 18.3, 18.4 ynamic Bayesian Networks Beyond 10708 Graphical Models 10708 Carlos Guestrin Carnegie Mellon University ecember 1 st, 2006 1 ynamic Bayesian network (BN) HMM defined
More informationMachine Learning
Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University February 25, 2015 Today: Graphical models Bayes Nets: Inference Learning EM Readings: Bishop chapter 8 Mitchell
More informationECE521 Lecture 18 Graphical Models Hidden Markov Models
ECE521 Lecture 18 Graphical Models Hidden Markov Models Outline Graphical models Conditional independence Conditional independence after marginalization Sequence models hidden Markov models 2 Graphical
More informationProbabilistic Graphical Models
Probabilistic Graphical Models Raquel Urtasun and Tamir Hazan TTI Chicago April 25, 2011 Raquel Urtasun and Tamir Hazan (TTI-C) Graphical Models April 25, 2011 1 / 17 Clique Trees Today we are going to
More informationGraphical Models Reconstruction
Graphical Models Reconstruction Graph Theory Course Project Firoozeh Sepehr April 27 th 2016 Firoozeh Sepehr Graphical Models Reconstruction 1/50 Outline 1 Overview 2 History and Background 3 Graphical
More informationGraphical Models. Dmitrij Lagutin, T Machine Learning: Basic Principles
Graphical Models Dmitrij Lagutin, dlagutin@cc.hut.fi T-61.6020 - Machine Learning: Basic Principles 12.3.2007 Contents Introduction to graphical models Bayesian networks Conditional independence Markov
More informationBayesian Networks Inference
Bayesian Networks Inference Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University November 5 th, 2007 2005-2007 Carlos Guestrin 1 General probabilistic inference Flu Allergy Query: Sinus
More informationReading. Graph Terminology. Graphs. What are graphs? Varieties. Motivation for Graphs. Reading. CSE 373 Data Structures. Graphs are composed of.
Reading Graph Reading Goodrich and Tamassia, hapter 1 S 7 ata Structures What are graphs? Graphs Yes, this is a graph. Graphs are composed of Nodes (vertices) dges (arcs) node ut we are interested in a
More informationCSCE 478/878 Lecture 6: Bayesian Learning and Graphical Models. Stephen Scott. Introduction. Outline. Bayes Theorem. Formulas
ian ian ian Might have reasons (domain information) to favor some hypotheses/predictions over others a priori ian methods work with probabilities, and have two main roles: Optimal Naïve Nets (Adapted from
More informationDirected Graphical Models
Copyright c 2008 2010 John Lafferty, Han Liu, and Larry Wasserman Do Not Distribute Chapter 18 Directed Graphical Models Graphs give a powerful way of representing independence relations and computing
More informationDATA MINING LECTURE 10B. Classification k-nearest neighbor classifier Naïve Bayes Logistic Regression Support Vector Machines
DATA MINING LECTURE 10B Classification k-nearest neighbor classifier Naïve Bayes Logistic Regression Support Vector Machines NEAREST NEIGHBOR CLASSIFICATION 10 10 Illustrating Classification Task Tid Attrib1
More informationFactor Graphs and Inference
Factor Graph and Inerence Sargur Srihari rihari@cedar.bualo.edu 1 Topic 1. Factor Graph 1. Factor in probability ditribution. Deriving them rom graphical model. Eact Inerence Algorithm or Tree graph 1.
More informationProbabilistic Graphical Models
Overview of Part Two Probabilistic Graphical Models Part Two: Inference and Learning Christopher M. Bishop Exact inference and the junction tree MCMC Variational methods and EM Example General variational
More informationBelief propagation in a bucket-tree. Handouts, 275B Fall Rina Dechter. November 1, 2000
Belief propagation in a bucket-tree Handouts, 275B Fall-2000 Rina Dechter November 1, 2000 1 From bucket-elimination to tree-propagation The bucket-elimination algorithm, elim-bel, for belief updating
More informationMassachusetts Institute of Technology Department of Electrical Engineering and Computer Science Algorithms For Inference Fall 2014
Suggested Reading: Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 Probabilistic Modelling and Reasoning: The Junction
More informationProbabilistic Graphical Models
School of Computer Science Probabilistic Graphical Models Theory of Variational Inference: Inner and Outer Approximation Eric Xing Lecture 14, February 29, 2016 Reading: W & J Book Chapters Eric Xing @
More informationDesign and Analysis of Algorithms
S 101, Winter 2018 esign and nalysis of lgorithms Lecture 3: onnected omponents lass URL: http://vlsicad.ucsd.edu/courses/cse101-w18/ nd of Lecture 2: s, Topological Order irected cyclic raph.g., Vertices
More informationNote. Out of town Thursday afternoon. Willing to meet before 1pm, me if you want to meet then so I know to be in my office
raphs and Trees Note Out of town Thursday afternoon Willing to meet before pm, email me if you want to meet then so I know to be in my office few extra remarks about recursion If you can write it recursively
More informationSequence Labeling: The Problem
Sequence Labeling: The Problem Given a sequence (in NLP, words), assign appropriate labels to each word. For example, POS tagging: DT NN VBD IN DT NN. The cat sat on the mat. 36 part-of-speech tags used
More informationRecitation 4: Elimination algorithm, reconstituted graph, triangulation
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 Recitation 4: Elimination algorithm, reconstituted graph, triangulation
More informationExpectation Propagation
Expectation Propagation Erik Sudderth 6.975 Week 11 Presentation November 20, 2002 Introduction Goal: Efficiently approximate intractable distributions Features of Expectation Propagation (EP): Deterministic,
More informationCS535 Big Data Fall 2017 Colorado State University 9/5/2017. Week 3 - A. FAQs. This material is built based on,
S535 ig ata Fall 217 olorado State University 9/5/217 Week 3-9/5/217 S535 ig ata - Fall 217 Week 3--1 S535 IG T FQs Programming ssignment 1 We will discuss link analysis in week3 Installation/configuration
More informationCS 188: Artificial Intelligence
CS 188: Artificial Intelligence Bayes Nets: Inference Instructors: Dan Klein and Pieter Abbeel --- University of California, Berkeley [These slides were created by Dan Klein and Pieter Abbeel for CS188
More information2. Graphical Models. Undirected pairwise graphical models. Factor graphs. Bayesian networks. Conversion between graphical models. Graphical Models 2-1
Graphical Models 2-1 2. Graphical Models Undirected pairwise graphical models Factor graphs Bayesian networks Conversion between graphical models Graphical Models 2-2 Graphical models Families of graphical
More information