Probabilistic programming. John Winn and Tom Minka. Machine Learning Summer School, September 2009


Goals of Probabilistic Programming. Make it easier to do probabilistic inference in custom models: if you can write the model as a program, you can do inference on it. Not limited by graphical notation. Libraries of models can be built up and shared. A big area of research!

Heights example. Suppose we take a woman at random and a man at random from the UK population, and the woman turns out to be taller than the man. What is the probability of this event? What is the posterior distribution over the woman's height? What is the posterior distribution over the man's height?

Heights example. [Graphical model: Gaussian → heightWoman, Gaussian → heightMan; heightWoman > heightMan → isTaller]

Machine learning software, on a spectrum from black box to probabilistic programming: Weka, SVM libraries, Bayes Net software (e.g. Hugin), BNT, VIBES, BUGS, Infer.NET, Church, Hierarchical Bayes Compiler.

CSOFT probabilistic language. A representation language for probabilistic models. Takes C# and adds support for: random variables, constraints on variables, and inference. Can be embedded in ordinary C# to allow integration of deterministic and stochastic code.

CSOFT random variables. Normal variables have a fixed, single value, e.g. int length=6; bool visible=true;. Random variables have an uncertain value specified by a probability distribution, e.g. int length = random(uniform(0,10)); bool visible = random(bernoulli(0.8));. This introduces the random operator, which means "is distributed as".

CSOFT constraints. We can define constraints on random variables, e.g. constrain(visible==true); constrain(length==4); constrain(length>0); constrain(i==j). The constrain(b) operator means we constrain b to be true.

CSOFT inference. The infer() operator gives the posterior distribution of one or more random variables. Example: int i = random(uniform(1,10)); bool b = (i*i>50); Dist bdist = infer(b); // bernoulli(0.3). The output of infer() is always deterministic, even when its input is random.
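The bernoulli(0.3) result in the slide's example can be reproduced by brute-force enumeration; a minimal Python sketch, assuming uniform(1,10) draws integers from 1 to 10 inclusive:

```python
from fractions import Fraction

# Enumerate all equally likely values of i and count those where i*i > 50.
values = range(1, 11)  # uniform(1,10) assumed to cover 1..10 inclusive
p_true = Fraction(sum(1 for i in values if i * i > 50), len(values))

print(p_true)  # 3/10: only i in {8, 9, 10} satisfy i*i > 50
```

So infer(b) returns bernoulli(3/10) = bernoulli(0.3), matching the comment in the program.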

Heights example in CSOFT. double heightMan = random(gaussian(177,64)); double heightWoman = random(gaussian(164,64)); Bernoulli dist = infer(heightWoman > heightMan); constrain(heightWoman > heightMan); Gaussian distWoman = infer(heightWoman); Gaussian distMan = infer(heightMan); The first infer is computed without the constraint; the later infers are computed with the constraint.
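For this model the first query has a closed form: with heightWoman ~ N(164, 64) and heightMan ~ N(177, 64) (mean, variance), the difference d = heightWoman - heightMan is N(-13, 128), so the probability the woman is taller is P(d > 0). A small Python check of that value:

```python
import math

def normal_cdf(x, mean, var):
    """Phi((x - mean)/sd), computed via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))

# heightWoman - heightMan ~ Normal(164 - 177, 64 + 64) = Normal(-13, 128)
p_taller = 1.0 - normal_cdf(0.0, -13.0, 128.0)
print(round(p_taller, 3))  # 0.125
```

So the event the slides condition on has probability of roughly 12.5%.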

Sampling interpretation. Imagine running the program many times, where: random(dist) is an ordinary function that draws a random number from dist; constrain(b) stops the run if b is not true; infer(x) collects the value of x into a persistent memory (one for each use of infer in the program) and, if enough x's have been stored, returns their distribution; otherwise it stops the run (i.e. waits until enough samples are collected). This defines the meaning of a Csoft program.
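This sampling semantics can be sketched directly as a rejection sampler. A hypothetical Python version for the heights example, where a normal draw plays the role of random(gaussian(...)), the if-test plays the role of constrain(), and averaging the kept samples approximates infer():

```python
import random

random.seed(0)
kept_woman, runs = [], 200_000

for _ in range(runs):
    # random(gaussian(mean, variance)): random.gauss takes a standard deviation,
    # so variance 64 becomes sd 8.
    height_woman = random.gauss(164, 8)
    height_man = random.gauss(177, 8)
    if height_woman > height_man:  # constrain(): reject runs where this is false
        kept_woman.append(height_woman)

p_taller = len(kept_woman) / runs                      # approximates infer(heightWoman > heightMan)
posterior_mean = sum(kept_woman) / len(kept_woman)     # approximates the mean of infer(heightWoman)
print(p_taller, posterior_mean)  # p near 0.125; posterior mean well above the 164 prior mean
```

Note how conditioning shifts the woman's posterior height upward, since only runs where she is the taller of the two survive.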

Probabilistic programs & graphical models

Random variables. Probabilistic program: double x = random(gaussian(0,1)); Graphical model: Gaussian(0,1) → x.

Bayesian networks. Probabilistic program: double x = random(gaussian(0,1)); double y = random(gamma(1,1)); double z = random(gaussian(x,y)); Graphical model: Gaussian(0,1) → x, Gamma(1,1) → y, Gaussian(x,y) → z.

Loops/plates. Probabilistic program: for(int i=0;i<10;i++) { double x = random(gaussian(0,1)); } Graphical model: plate i=0..9 containing Gaussian(0,1) → x.

Loops/plates II. Probabilistic program: double x = random(gaussian(0,1)); double y = random(gamma(1,1)); for(int i=0;i<10;i++) { double z = random(gaussian(x,y)); } Graphical model: Gaussian(0,1) → x and Gamma(1,1) → y outside a plate i=0..9 containing Gaussian(x,y) → z.

If statement/gates. Probabilistic program: bool b = random(bernoulli(0.5)); double x; if (b) { x = random(gaussian(0,1)); } else { x = random(gaussian(10,1)); } Graphical model: Bernoulli(0.5) → b, which gates between Gaussian(0,1) (T branch) and Gaussian(10,1) (F branch) → x. Gates (Minka and Winn, NIPS 2008).
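Under the sampling interpretation, this program draws x from an equal mixture of the two Gaussians, so its marginal mean should be near 0.5*0 + 0.5*10 = 5. A quick Python simulation of the gated program (a sketch, not Infer.NET's actual inference):

```python
import random

random.seed(1)
n = 100_000
total = 0.0

for _ in range(n):
    b = random.random() < 0.5                       # b ~ bernoulli(0.5)
    # The gate on b selects which Gaussian generates x (variance 1, so sd 1).
    x = random.gauss(0, 1) if b else random.gauss(10, 1)
    total += x

print(total / n)  # close to 5, the mixture mean
```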

Other language features. Probabilistic program: functions/recursion, indexing, jagged arrays, mutation (x=x+1), objects, ... Graphical model: no common equivalent.

Needs of Probabilistic Programming. Flexible and general inference algorithms. Modelling constructs that integrate nicely with inference, e.g. gates (Minka and Winn, NIPS 2008). Compiler technology for probabilistic constructs: automatic scheduling of fixed-point updates, automatic parallelization.

Probabilistic programming in Infer.NET

Infer.NET. Compiles probabilistic programs into inference code. No in-memory factor graphs = no overhead. Consists of a chain of code transformations: Csoft program → T1 → T2 → T3 → inference program. Calling infer invokes this chain automatically.

Infer.NET. The model is specified using C#, with operators overloaded to look like Csoft. The C# code is internally converted into Csoft; the inference compiler works only with Csoft. In a future version, it will be possible to program in Csoft directly. Free for academic use: http://research.microsoft.com/infernet

Random variables in Infer.NET. Probabilistic program: double x = random(gaussian(0,1)); C# code: Variable<double> x = Variable.Gaussian(0,1);

Bayesian networks. Probabilistic program: double x = random(gaussian(0,1)); double y = random(gamma(1,1)); double z = random(gaussian(x,y)); C# code: Variable<double> x = Variable.Gaussian(0,1); Variable<double> y = Variable.Gamma(1,1); Variable<double> z = Variable.Gaussian(x,y);

Inference in Infer.NET. Probabilistic program: double x = random(gaussian(0,1)); Dist xdist = infer(x); C# code: Variable<double> x = Variable.Gaussian(0,1); InferenceEngine engine = new InferenceEngine(); IDistribution<double> xdist = engine.Infer(x); // or Gaussian xdist = engine.Infer<Gaussian>(x);

Loops/plates. Probabilistic program: for(int i=0;i<10;i++) { double x = random(gaussian(0,1)); } C# code: Range i = new Range(10); using(Variable.ForEach(i)) { Variable<double> x = Variable.Gaussian(0,1); }

Loops/plates II. Probabilistic program: double[] x = new double[10]; for(int i=0;i<10;i++) { x[i] = random(gaussian(0,1)); } C# code: Range i = new Range(10); VariableArray<double> x = Variable.Array<double>(i); using(Variable.ForEach(i)) { x[i] = Variable.Gaussian(0,1); }

If statement/gates. Probabilistic program: bool b = random(bernoulli(0.5)); double x; if (b) { x = random(gaussian(0,1)); } else { x = random(gaussian(10,1)); } C# code: Variable<bool> b = Variable.Bernoulli(0.5); Variable<double> x = Variable.New<double>(); using(Variable.If(b)) { x.SetTo( Variable.Gaussian(0,1) ); } using(Variable.IfNot(b)) { x.SetTo( Variable.Gaussian(10,1) ); }

Indexing by random integer. Probabilistic program: bool[] b = new bool[2] { true, false }; int i = random(discrete(0.4,0.6)); bool c = b[i]; // Bernoulli(0.4) C# code: VariableArray<bool> b = Variable.Array<bool>(range); b.ObservedValue = new bool[2] { true, false }; Variable<int> i = Variable.Discrete(range,0.4,0.6); Variable<bool> c = Variable.New<bool>(); using(Variable.Switch(i)) { c.SetTo( b[i] ); }
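The Bernoulli(0.4) comment can be verified by summing over the two possible values of i; a minimal Python enumeration:

```python
# b[i] with i ~ discrete(0.4, 0.6): c = b[i] is true exactly when i = 0.
b = [True, False]
p_i = {0: 0.4, 1: 0.6}

p_c_true = sum(p for i, p in p_i.items() if b[i])
print(p_c_true)  # 0.4, matching the Bernoulli(0.4) comment
```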

On to the practical! http://mlg.eng.cam.ac.uk/mlss09/material.htm