
Workshop report
1. Daniel's report is on the website.
2. Don't expect to write it based on listening to one project (we had 6; only 2 were of sufficient quality).
3. I suggest writing it on one presentation.
4. Include figures (from a related paper or their presentation).
5. Include references.

May 8, CODY: Machine learning for finding oil, focusing on 1) robust seismic denoising/interpolation using structured matrix approximation, and 2) seismic image clustering and classification using t-SNE (t-distributed stochastic neighbor embedding) and CNNs. Weichang Li, Group Leader, Aramco, Houston.
May 10, Class HW: First distribution of final projects. Ocean acoustic source tracking. The final project is the main goal in the last month. Bishop Ch 9, Mixture models.
May 15, CODY: Seismology and Machine Learning, Daniel Trugman (half class). Ch 8, Graphical models.
May 17, Class HW: Ch 8.
May 22: Dictionary learning, Mike Bianco (half class). Bishop Ch 13.
May 24, Class HW: Bishop Ch 13.
May 30: CODY.
May 31: No class. Workshop: Big Data and The Earth Sciences: Grand Challenges.
June 5: Discuss workshop, Ch 13. Spiess Hall open for project discussion from 11am.
June 7: Workshop report. No class.
June 12: Spiess Hall open for project discussion 9-11:30am and 2-7pm.
June 16: Final report delivered. Beer time.
For final project discussion, Mark and I will be available every afternoon. Chapter 13: Sequential data.

Final projects:
- Ocean source tracking
- Re-implement "Source Localization in an Ocean Waveguide using Supervised Machine Learning"
- X-ray spectrum absorption interpretation using NN
- Neural decoding
- Plankton: transfer learning and deep feature extraction for planktonic image data sets
- Speaker tagger
- Coral
- Restaurant
- Amazon rainforest (Kaggle)
- MyShake seismic
- High-precision indoor positioning framework for most wifi-enabled devices

Please ask questions. Mark and I are available all afternoons; just come or email for time slots. Spiess Hall 330 is open Monday 5 and 12 June; if interested, I can book it at other times.

Final report: concise rather than long. A larger group can do more. Start with a very simple example, to show your idea and that it is working; end by showing the advanced abilities. Several figures; equations are nice.

Delivery: zip file (Friday 16) containing the main code (not all of it; it should be able to run) and the report (PDF preferred).

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 8: GRAPHICAL MODELS

10.1.4 Graph terminology. Before we continue, we must define a few basic terms, most of which are very intuitive. A graph G = (V, E) consists of a set of nodes or vertices, V = {1, ..., V}, and a set of edges, E = {(s, t) : s, t ∈ V}. We can represent the graph by its adjacency matrix, in which we write G(s, t) = 1 to denote (s, t) ∈ E, that is, if s → t is an edge in the graph. If G(s, t) = 1 iff G(t, s) = 1, we say the graph is undirected; otherwise it is directed. We usually assume G(s, s) = 0, which means there are no self loops. Here are some other terms we will commonly use:

Parent: For a directed graph, the parents of a node are the set of all nodes that feed into it: pa(s) = {t : G(t, s) = 1}.
Child: For a directed graph, the children of a node are the set of all nodes that feed out of it: ch(s) = {t : G(s, t) = 1}.
Family: For a directed graph, the family of a node is the node and its parents, fam(s) = {s} ∪ pa(s).
Root: For a directed graph, a root is a node with no parents.
Leaf: For a directed graph, a leaf is a node with no children.
Ancestors: For a directed graph, the ancestors are the parents, grand-parents, etc. of a node. That is, the ancestors of t are the set of nodes that connect to t via a trail: anc(t) = {s : s ⇝ t}.
Descendants: For a directed graph, the descendants are the children, grand-children, etc. of a node. That is, the descendants of s are the set of nodes that can be reached via trails from s: desc(s) = {t : s ⇝ t}.
Neighbors: For any graph, we define the neighbors of a node as the set of all immediately connected nodes, nbr(s) = {t : G(s, t) = 1 ∨ G(t, s) = 1}. For an undirected graph, we write s ~ t to indicate that s and t are neighbors (so (s, t) ∈ E is an edge in the graph).
Degree: The degree of a node is the number of neighbors. For directed graphs, we speak of the in-degree and out-degree, which count the number of parents and children.
Cycle or loop: For any graph, we define a cycle or loop to be a series of nodes such that we can get back to where we started by following edges, s1 → s2 → ... → sn → s1, n ≥ 2. If the graph is directed, we may speak of a directed cycle. For example, in Figure 10.1(a), there are no directed cycles, but 1 → 2 → 4 → 3 → 1 is an undirected cycle.
DAG: A directed acyclic graph or DAG is a directed graph with no directed cycles. See Figure 10.1(a) for an example.
Topological ordering: For a DAG, a topological ordering or total ordering is a numbering of the nodes such that parents have lower numbers than their children. For example, in Figure 10.1(a), we can use (1, 2, 3, 4, 5) or (1, 3, 2, 5, 4), etc.
Path or trail: A path or trail s ⇝ t is a series of directed edges leading from s to t.
Tree: An undirected tree is an undirected graph with no cycles. A directed tree is a DAG in which there are no directed cycles. If we allow a node to have multiple parents, we call it a polytree; otherwise we call it a moral directed tree.
Forest: A forest is a set of trees.
Subgraph: A (node-induced) subgraph G_A is the graph created by using the nodes in A and their corresponding edges, G_A = (V_A, E_A).
Clique: For an undirected graph, a clique is a set of nodes that are all neighbors of each other. A maximal clique is a clique which cannot be made any larger without losing the clique property. For example, in Figure 10.1(b), {1, 2} is a clique but it is not maximal, since we can add 3 and still maintain the clique property. In fact, the maximal cliques are {1, 2, 3}, {2, 3, 4}, and {3, 5}.
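A minimal Python sketch of these definitions on a small DAG. The edge set is an assumption consistent with the description of Figure 10.1(a) above (undirected cycle 1-2-4-3-1, topological orderings (1, 2, 3, 4, 5) and (1, 3, 2, 5, 4)):

```python
# Directed graph as a set of (parent, child) edges; the edge set is assumed,
# chosen to be consistent with the Figure 10.1(a) description in the text.
edges = {(1, 2), (1, 3), (2, 4), (3, 4), (3, 5)}
nodes = {1, 2, 3, 4, 5}

def parents(s):
    return {t for (t, u) in edges if u == s}

def children(s):
    return {u for (t, u) in edges if t == s}

def ancestors(s):
    # All nodes that reach s via a directed trail.
    anc, stack = set(), list(parents(s))
    while stack:
        t = stack.pop()
        if t not in anc:
            anc.add(t)
            stack.extend(parents(t))
    return anc

roots = {s for s in nodes if not parents(s)}     # {1}
leaves = {s for s in nodes if not children(s)}   # {4, 5}
print(parents(4), ancestors(4), roots, leaves)   # {2, 3} {1, 2, 3} {1} {4, 5}
```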

Three types of graphical model:
- Directed graphs: useful for designing models.
- Undirected graphs: good for some domains, e.g. computer vision.
- Factor graphs: useful for inference and learning.

Bayesian Networks (Bayes Nets) or Directed Graphical Models (DGMs): Decomposition. By the product rule, any joint distribution can be decomposed, e.g. p(a, b, c) = p(c | a, b) p(b | a) p(a), and each factor becomes a node with its parents in the graph.

Directed Graphs or Bayesian Networks: General Factorization. p(x) = ∏_{k=1}^{K} p(x_k | pa_k), where pa_k denotes the parents of x_k.
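A minimal numeric sketch of this factorization for three binary variables, with made-up (hypothetical) probability tables:

```python
import numpy as np

# Sketch of p(a, b, c) = p(a) p(b|a) p(c|a,b) for binary variables.
# All tables below are illustrative assumptions, not from the lecture.
p_a = np.array([0.6, 0.4])                                 # p(a)
p_b_given_a = np.array([[0.7, 0.3],                        # p(b|a=0)
                        [0.2, 0.8]])                       # p(b|a=1)
p_c_given_ab = np.random.dirichlet([1, 1], size=(2, 2))    # p(c|a,b)

# Build the full joint by multiplying the three factors.
joint = (p_a[:, None, None]
         * p_b_given_a[:, :, None]
         * p_c_given_ab)
assert np.isclose(joint.sum(), 1.0)   # the product of factors is a valid joint
```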

Bayesian Curve Fitting (1): polynomial regression model; plate notation.

Bayesian Curve Fitting (3): input variables and explicit hyperparameters. Condition on data.

Bayesian Curve Fitting: Prediction. Predictive distribution: p(t̂ | x̂, x, t) ∝ ∫ p(t̂, t, w | x̂, x) dw, where p(t̂, t, w | x̂, x) = p(t̂ | x̂, w) p(w) ∏_n p(t_n | x_n, w).

Generative Models Causal process for generating images

Discrete Variables (1). General joint distribution (two K-state variables): K² − 1 parameters. Independent joint distribution: 2(K − 1) parameters. General joint distribution over M variables: K^M − 1 parameters. M-node Markov chain: (K − 1) + (M − 1)K(K − 1) parameters.
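For example, with K = 2 states and M = 3 variables: the general joint needs 2³ − 1 = 7 parameters, the fully factorized (independent) model needs M(K − 1) = 3, and the Markov chain needs (K − 1) + (M − 1)K(K − 1) = 1 + 2·2·1 = 5. The chain keeps direct dependence between neighbors while avoiding the exponential blow-up.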

Discrete Variables: Bayesian Parameters (1)

Discrete Variables: Bayesian Parameters (2) Shared prior

Parameterized Conditional Distributions. If x₁, ..., x_M are discrete K-state variables, a general conditional distribution p(y | x₁, ..., x_M) has O(K^M) parameters. The parameterized form requires only M + 1 parameters (see below).
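The parameterized form here is a logistic sigmoid acting on a linear combination of the parent states (Bishop eq. 8.10):

  p(y = 1 | x₁, ..., x_M) = σ( w₀ + Σ_{i=1}^{M} wᵢ xᵢ ) = σ(wᵀx),   σ(a) = 1 / (1 + e^{−a}),

so only the M + 1 weights w₀, ..., w_M are needed.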

Conditional Independence. a is independent of b given c: p(a | b, c) = p(a | c). Equivalently, p(a, b | c) = p(a | c) p(b | c). Notation: a ⫫ b | c.

Conditional Independence: Example 1

Conditional Independence: Example 2

Conditional Independence: Example 3 Note: this is the opposite of Example 1, with c unobserved.

D-separation. A, B, and C are non-intersecting subsets of nodes in a directed graph. A path from A to B is blocked if it contains a node such that either
a) the arrows on the path meet either head-to-tail or tail-to-tail at the node, and the node is in the set C, or
b) the arrows meet head-to-head at the node, and neither the node, nor any of its descendants, are in the set C.
If all paths from A to B are blocked, A is said to be d-separated from B by C, and the joint distribution over all variables satisfies A ⫫ B | C.
D-separation: Example
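A small numeric sketch of the head-to-head case ("explaining away"), with hypothetical binary tables: a and b are marginally independent, but become dependent once their common child c is observed.

```python
import numpy as np

# Hypothetical head-to-head network a -> c <- b with binary variables.
p_a = np.array([0.5, 0.5])
p_b = np.array([0.5, 0.5])
# p(c=1 | a, b): an assumed, roughly noisy-OR table.
p_c1 = np.array([[0.1, 0.9],
                 [0.9, 0.99]])
p_c = np.stack([1 - p_c1, p_c1], axis=-1)               # shape (a, b, c)

joint = p_a[:, None, None] * p_b[None, :, None] * p_c   # p(a, b, c)

# Marginally (c unobserved, path blocked): p(a, b) factorizes.
p_ab = joint.sum(axis=2)
assert np.allclose(p_ab, np.outer(p_a, p_b))

# Conditioned on c = 1 the path is unblocked: p(a,b|c=1) != p(a|c=1) p(b|c=1).
p_ab_c1 = joint[:, :, 1] / joint[:, :, 1].sum()
print(np.allclose(p_ab_c1,
                  np.outer(p_ab_c1.sum(1), p_ab_c1.sum(0))))   # False
```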

Markov Random Fields or Undirected Graphs

Cliques and Maximal Cliques Clique Maximal Clique

Joint Distribution: p(x) = (1/Z) ∏_C ψ_C(x_C), where ψ_C(x_C) is the potential over clique C and Z = Σ_x ∏_C ψ_C(x_C) is the normalization coefficient. Note: for M K-state variables there are K^M terms in Z. Energies and the Boltzmann distribution: ψ_C(x_C) = exp(−E(x_C)).

Illustration: Image De-Noising. Noisy image; restored image (ICM); restored image (graph cuts).
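A minimal ICM sketch for the binary denoising model of Bishop §8.3.3, with energy E(x, y) = h Σᵢ xᵢ − β Σ_{neighbors} xᵢxⱼ − η Σᵢ xᵢyᵢ and pixels in {−1, +1}; the coefficient values below are assumptions:

```python
import numpy as np

h, beta, eta = 0.0, 1.0, 2.1   # assumed coefficients for the Ising-style energy

def icm_denoise(y, n_sweeps=10):
    x = y.copy()               # initialize the latent image with the noisy one
    H, W = y.shape
    for _ in range(n_sweeps):
        for i in range(H):
            for j in range(W):
                # Only terms involving pixel (i, j) affect the local energy.
                nbr = sum(x[a, b]
                          for a, b in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                          if 0 <= a < H and 0 <= b < W)
                # Energy contribution for state s: h*s - beta*s*nbr - eta*s*y_ij.
                e = lambda s: h * s - beta * s * nbr - eta * s * y[i, j]
                x[i, j] = -1 if e(-1) < e(+1) else +1   # greedy local update
    return x

# Usage sketch: y = 2 * (noisy_binary_image > 0.5).astype(int) - 1
#               x = icm_denoise(y)
```

ICM greedily lowers the energy one pixel at a time, so it converges to a local (not global) minimum; graph cuts, mentioned on the slide, can find the global optimum for this class of binary models.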

Converting Directed to Undirected Graphs (1)

Converting Directed to Undirected Graphs (2): additional links. When a node has more than one parent, the parents must be linked ("married") so the conditional p(x_k | pa_k) can be absorbed into a single clique potential; this step is called moralization. A sketch follows below.
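A short sketch of moralization, reusing the (parent, child) edge-tuple representation from earlier (names illustrative):

```python
from itertools import combinations

def moralize(nodes, edges):
    # Drop edge directions...
    undirected = {frozenset(e) for e in edges}
    # ...and marry every pair of parents of each node.
    for s in nodes:
        ps = {t for (t, u) in edges if u == s}
        undirected |= {frozenset(pair) for pair in combinations(ps, 2)}
    return undirected

# Example: in a -> c <- b, moralization adds the extra link a - b.
print(moralize({'a', 'b', 'c'}, {('a', 'c'), ('b', 'c')}))
```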

Directed vs. Undirected Graphs (2)

Inference in Graphical Models

Inference on a Chain. For a chain x₁ — x₂ — ... — x_N with joint p(x) = (1/Z) ψ_{1,2}(x₁, x₂) ψ_{2,3}(x₂, x₃) ⋯ ψ_{N−1,N}(x_{N−1}, x_N), the marginal at node n splits into a forward and a backward message:
p(x_n) = (1/Z) μ_α(x_n) μ_β(x_n),
with recursions μ_α(x_n) = Σ_{x_{n−1}} ψ_{n−1,n}(x_{n−1}, x_n) μ_α(x_{n−1}) and μ_β(x_n) = Σ_{x_{n+1}} ψ_{n,n+1}(x_n, x_{n+1}) μ_β(x_{n+1}).

Inference on a Chain. To compute local marginals:
- Compute and store all forward messages, μ_α(x_n).
- Compute and store all backward messages, μ_β(x_n).
- Compute Z at any node x_m: Z = Σ_{x_m} μ_α(x_m) μ_β(x_m).
- Compute p(x_n) = (1/Z) μ_α(x_n) μ_β(x_n) for all variables required.
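A compact numpy sketch of this procedure for a chain of K-state variables with pairwise potential tables (the potentials here are random, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 5, 3
# One K x K pairwise potential psi[n] between x_n and x_{n+1} (illustrative).
psi = [rng.random((K, K)) for _ in range(N - 1)]

# Forward messages: mu_alpha[n] has one entry per state of x_n.
mu_alpha = [np.ones(K)]
for n in range(N - 1):
    mu_alpha.append(psi[n].T @ mu_alpha[-1])

# Backward messages, filled in from the end of the chain.
mu_beta = [np.ones(K) for _ in range(N)]
for n in range(N - 2, -1, -1):
    mu_beta[n] = psi[n] @ mu_beta[n + 1]

# Z can be computed at any node; the local marginals then follow directly.
Z = np.sum(mu_alpha[0] * mu_beta[0])
marginals = [mu_alpha[n] * mu_beta[n] / Z for n in range(N)]
assert all(np.isclose(m.sum(), 1.0) for m in marginals)
```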

Trees Undirected Tree Directed Tree Polytree

Factorization.
Directed graphs: p(x) = ∏_k p(x_k | pa_k).
Undirected graphs: p(x) = (1/Z) ∏_C ψ_C(x_C).
Both have the form of products of factors: p(x) = ∏_s f_s(x_s).

Factor Graphs More verbose!

From Directed Graphs to Factor Graphs

Factor Graphs from Undirected Graphs

INFERENCE

The Sum-Product Algorithm (1). Objective: (i) to obtain an efficient, exact inference algorithm for finding marginals; (ii) in situations where several marginals are required, to allow computations to be shared efficiently. Key idea: the distributive law. 7 versus 3 operations.
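As an illustration of the distributive law at work in marginalization (a made-up mini-example; not necessarily the exact expression behind the slide's operation count): for a binary variable x₂,

  f₁(x₁) f₂(x₁, 0) + f₁(x₁) f₂(x₁, 1) = f₁(x₁) [ f₂(x₁, 0) + f₂(x₁, 1) ],

the left side costs two multiplications and one addition, the right side one of each. Pushing sums inside products in exactly this way is what makes sum-product efficient.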

The Sum-Product Algorithm
[Example factor graph: variables u, w, x, y, z with factors f₁(u, w), f₂(w, x), f₃(x, y), f₄(x, z).]

What if the messages are intractable?
- True distribution
- Monte Carlo
- Variational message passing
- Expectation propagation

Learning is just inference!

The Sum-Product Algorithm (2)

The Sum-Product Algorithm (3)

The Sum-Product Algorithm (4)
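For reference, the messages used in steps (2)-(4) are, in Bishop's notation (PRML §8.4.4):

  μ_{f→x}(x) = Σ_{x₁} ⋯ Σ_{x_M} f(x, x₁, ..., x_M) ∏_{m ∈ ne(f)\x} μ_{x_m→f}(x_m)

  μ_{x→f}(x) = ∏_{l ∈ ne(x)\f} μ_{f_l→x}(x)

with leaf initialization μ_{x→f}(x) = 1 for a leaf variable node and μ_{f→x}(x) = f(x) for a leaf factor node.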

The Sum-Product Algorithm (5): Initialization. To compute local marginals:
- Pick an arbitrary node as root.
- Compute and propagate messages from the leaf nodes to the root, storing received messages at every node.
- Compute and propagate messages from the root to the leaf nodes, storing received messages at every node.
- Compute the product of received messages at each node for which the marginal is required, and normalize if necessary.

Sum-Product: Example (1)

Sum-Product: Example (2) and (3)