Integrating Probabilistic Reasoning with Constraint Satisfaction


Integrating Probabilistic Reasoning with Constraint Satisfaction IJCAI Tutorial #7 Instructor: Eric I. Hsu July 17, 2011 http://www.cs.toronto.edu/~eihsu/tutorial7

Getting Started: Discursive Remarks. Organizational Details. Motivation. The Actual Tutorial. Scott Cummings. 2

Bottom Line Up Front (BLUF) Thesis: Constraint satisfaction and probabilistic inference are closely related at a formal level, and this relationship motivates new ways of applying techniques from one area to the other. Block A: Probabilistic inference and constraint satisfaction in terms of graphical models, and their formal correspondence. (about 2 hours.) Block B: Concrete applications of the formal correspondence. (1 hour.) Block C: Conclusions, Future Research Opportunities, Discussion. (0.5 hours.) 3

BLUF: Main Idea CSPs directly encode (uniform) probability distributions over their own solutions. P(v_1, ..., v_n) = (1/Z) ∏_c f_c(v_1, ..., v_n). P(v_1 = +) = (1/Z) Σ_{V: v_1 = +} ∏_c f_c(+, ..., v_n). f_c(v_1, ..., v_n) = 1 if v_1, ..., v_n satisfies constraint c; 0 otherwise. 4
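As a sketch of this encoding (a toy CSP invented for illustration, not taken from the tutorial), multiplying the 0/1 constraint factors and normalizing by Z, which here is simply the number of solutions, yields a uniform distribution over exactly the solutions:

```python
from itertools import product

# Hypothetical tiny CSP over three Boolean variables, as 0/1 factors.
constraints = [
    lambda v: 1 if v[0] != v[1] else 0,   # x1 != x2
    lambda v: 1 if v[1] == v[2] else 0,   # x2 == x3
]

def unnormalized(v):
    score = 1
    for f in constraints:
        score *= f(v)                      # product of constraint factors
    return score

configs = list(product([0, 1], repeat=3))
Z = sum(unnormalized(v) for v in configs)  # partition function = #solutions
P = {v: unnormalized(v) / Z for v in configs}

solutions = [v for v in configs if unnormalized(v) == 1]
# Marginal P(x1 = 1): sum the mass of configurations containing x1 = 1.
marginal_x1_is_1 = sum(P[v] for v in configs if v[0] == 1)
```

Every solution receives probability 1/Z and every non-solution receives 0, matching the formula on the slide.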

BLUF: Main Idea Quasigroup with Holes Problem Boolean Satisfiability Problem 5

BLUF: Concrete Applications. [Diagram: List of Solutions → Maximum Likelihood Estimator; Constraint Program → Survey.] Survey : the probability for each variable to be fixed a certain way when sampling a solution at random, P(Θ_v | SAT). 6

BLUF: Opportunities and Challenges Core formal principles are not new, but there is a terrific opportunity to operationalize them. The natural next step at the formal level is to emphasize pure optimization over specific prepackaged methodologies. The natural emphasis at the operational level is results-oriented, problem-driven, and highly empirical. The natural target for technical improvement is to find profitable places to sacrifice accuracy for speed. 7

Organization: Required Background Will build from the ground up. <Audience survey>. 8

My level of experience is: A. Student. B. Recent graduate. C. Established. 9

I would describe my main area of research as: A. Constraint Satisfaction / Constraint Programming. B. Probabilistic Models / Machine Learning. C. Some other related field. D. Some other mostly unrelated field. E. I do not have an area of research yet. 10

For those who answered A or B : A. I know almost nothing about the other field. B. I get the main idea and already know the basic techniques, and want to know the current state of affairs. C. I actually have a lot of experience with the other field and am more interested in applications. 11

Organization: Scope Will adopt a biased (but accurate) perspective. Will actually try to cover terminology comprehensively. Limited to discrete, finite domains. Examples will focus on Boolean Satisfiability (SAT). 12

Organization: Miscellaneous Questions encouraged at any time! Overstatements, citations: treating the two areas interchangeably. Coming and going and laptop use are fine; please just minimize noise. 13

Motivation Understanding Problem-Solving 14

Overview of Block A Part I. Basic principles of constraint satisfaction and probabilistic reasoning over graphical models. Part II. Advanced concepts in CSP and probabilistic reasoning. 15

I. Basic Probabilistic Reasoning and Constraint Satisfaction; Integration 16

BLUF: Factor Graphs and Equivalent Methods The Factor Graph representation allows us to link Probabilistic Reasoning and CSP through a common notation, clarifying the equivalence between basic methods from both areas. 17

Constraint Satisfaction [Diagram: variables X1 through X5 feed into a CSP, which outputs a value in {0,1}.] What is the sum of function outputs, over all configurations? 18

Constraint Satisfaction: Formal Definition A Constraint Satisfaction Problem (CSP) is a triple <X, D, C>, where: X is a set of variables. D is a set of domains for the variables. Here, all assumed to be identical, discrete, and finite. C is a set of functions ( constraints ) mapping subsets of X to {0,1}. 19
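The triple <X, D, C> can be sketched directly in code (a hypothetical toy instance; all domains identical, discrete, and finite, as on the slide):

```python
from itertools import product

# A CSP as the triple <X, D, C>: variables, shared domain, constraints.
X = ["x1", "x2", "x3"]
D = {x: [0, 1] for x in X}
C = [
    (("x1", "x2"), lambda a, b: a != b),   # each constraint: (scope, test)
    (("x2", "x3"), lambda a, b: a != b),
]

def evaluate(assignment):
    """The CSP evaluates to 1 iff every constraint maps its scope to 1."""
    return int(all(test(*(assignment[v] for v in scope))
                   for scope, test in C))

satisfiable = any(evaluate(dict(zip(X, vals)))
                  for vals in product(*(D[x] for x in X)))
```

Here each constraint maps an assignment to the variables in its scope to {0,1}, and the CSP as a whole is their conjunction.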

Constraint Satisfaction: Queries Decision Problem: Does there exist any configuration of X (an assignment to all the variables with values from their respective domains) such that CSP <X, D, C> evaluates to 1? Satisfiable versus Unsatisfiable instances give different flavors to the decision problem. Model Counting (#CSP, #SAT): For how many of the exponentially numerous possible configurations does the CSP evaluate to 1? Maximization (WCSP, MAXSAT): Assign weights to constraints. Instead of evaluating to 0/1, the CSP assigns each configuration a score consisting of the sum of the weights of its satisfied constraints. Find the configuration with maximum score. 20

Probabilistic Inference [Diagram: variables X1 through X5 feed into a Joint Probability Distribution, which outputs a value in [0,1].] 21

Probabilistic Inference: Queries (If you are more familiar with logical representations, you can view variables as propositions about the world.) Marginalization: what is the probability that variable x_i holds value v? (I.e., what is the sum of outputs for all configurations containing this assignment?) MAP (a.k.a. MPE): what is the most likely configuration of the variables? (I.e., which input gives the greatest output?) 22

Constraint Satisfaction: Brute Force [Diagram: variables X1 through X5 feed into a CSP with output in {0,1}.] Explicitly iterate over all O(2^n) configurations, checking for an output of 1. 23
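The brute-force approach answers both the decision query and the counting query in one enumeration (the CSP below is a hypothetical toy, invented for illustration):

```python
from itertools import product

# Brute force over all O(2^n) configurations of a toy Boolean CSP:
# the decision query asks whether any output is 1, and the model
# count (#CSP) is simply the sum of all outputs.
def csp(v):
    # toy constraints: (x1 OR x2) AND NOT (x4 AND x5)
    return int((v[0] or v[1]) and not (v[3] and v[4]))

n = 5
outputs = [csp(v) for v in product([0, 1], repeat=n)]
satisfiable = any(outputs)   # decision query
model_count = sum(outputs)   # model counting query
```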

Probabilistic Inference: Brute Force [Diagram: X1 fixed to value v; the variables feed into a Joint Probability Distribution with output in [0,1].] Fix the given variable to the given value; iterate over all O(2^(n-1)) configurations of the other variables, summing outputs. 24

Graphical Models: Localized Structure [Diagram: variables X1 through X5 connected to factors F1, F2, F3, with outputs in {0,1} or [0,1].] Factor Graph: global structure factors into local functions. 25

Factor Graph Representation Factor ( function ) nodes connect to nodes for the variables that appear in their scopes. Generalizes other graphical models (Constraint Graph, MRF, Bayes Net, Kalman Filter, HMM, ...). 26

Factor Graphs: Extensions ( Tuples ) Notation for f_2(x_2, x_3, x_4): [Diagram: factor F2 connected to variables X2, X3, X4.] Scope of f_2: {x_2, x_3, x_4}. Extension to f_2: an assignment to all the variables in its scope, e.g. <0, 2, 1>; also known as a tuple. Projection of a configuration onto f_2's scope yields the assignments relevant to f_2. 27

Example: Bayes Network [Diagram: two views of a network over nodes V, S, T, L, B, O, X, D.] Nodes with no parents are associated with prior probability distributions; the remaining nodes hold distributions that are conditioned on their parents. 28

Example: Markov Random Field Each clique of nodes is associated with a potential function. 29

Probabilistic Factor Graphs and Marginal Computation: The Partition Function The probability of a configuration: P(~x) = (1/Z) ∏_c f_c(~x). The marginal computation: the probability of a particular variable holding a particular value (a.k.a. the proportion of total mass contributed by vectors containing this assignment), which requires summing ∏_c f_c over configurations (a.k.a. computing the partition function Z). 30

Returning to the Main Idea [Table: Input Configuration | Function | Output Range.] An assignment ~x to the variables, fed to a Constraint Satisfaction Problem, yields an output in {0,1}: whether the problem is satisfied. The same assignment ~x, now to random variables, fed to a Probabilistic Model, yields an output in [0,1]: the joint probability of the assignment. 31

Main Idea: Product Decomposition [Table: Function Type | Internal Function Structure | Algebraic Semiring.] Constraint Program: ⊕ = ∨, ⊗ = ∧. Probabilistic Graphical Model: ⊕ = +, ⊗ = ×. In both cases the function decomposes into a ⊗-product of local factors, combined across configurations by ⊕. 32
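The semiring view means one generic evaluator can answer both queries, just by swapping the (⊕, ⊗) pair (a sketch over a hypothetical factor set; brute-force enumeration stands in for smarter algorithms):

```python
from itertools import product

# Hypothetical 0/1 factors over three Boolean variables.
factors = [
    lambda v: 1 if v[0] != v[1] else 0,
    lambda v: 1 if v[1] != v[2] else 0,
]

def eval_graph(plus, times, identity):
    """Combine factor outputs with ⊗ within a configuration,
    then fold configurations together with ⊕."""
    total = None
    for v in product([0, 1], repeat=3):
        score = identity
        for f in factors:
            score = times(score, f(v))
        total = score if total is None else plus(total, score)
    return total

# (OR, AND): the CSP decision query. (+, x): the partition function Z.
sat = eval_graph(lambda a, b: int(a or b), lambda a, b: int(a and b), 1)
Z = eval_graph(lambda a, b: a + b, lambda a, b: a * b, 1)
```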

Factor Graph Representation 33

Basic Operations: Search and Inference Search: try different combinations of values for the variables. [Diagram: a search tree assigning values and testing Yes/No.] Inference: propagate the consequences of a variable choice; reason over constraint extensions. [Diagram: constraints C1, C2, C3 pruning candidate values.] 34

Basic Operation: Distributing over Sum Note the structure of the parenthesized expressions: dynamic programming! 35
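The payoff of distributing ⊗ over ⊕ can be seen in a tiny numeric example (toy factors invented for illustration): when factors have disjoint scopes, the double sum factorizes, shrinking the work from O(|A|·|B|) multiplications to O(|A| + |B|).

```python
# Sum over a, b of f(a) * g(b) equals (sum of f) * (sum of g)
# when f and g share no variables.
f = {0: 2.0, 1: 3.0}   # factor over variable a
g = {0: 5.0, 1: 7.0}   # factor over variable b

naive = sum(f[a] * g[b] for a in f for b in g)   # 4 multiplications
factored = sum(f.values()) * sum(g.values())     # 1 multiplication
```

Variable elimination applies exactly this identity, one variable at a time.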

Basic Operation: Node Merging 36

Basic Operation: Dualization ( Extension : an assignment to all the variables in a function's scope.) Factors become variables defined over their possible extensions. Variables become 0/1 factors enforcing consistency between extensions. Scoring factors are introduced to simulate factors from the primal graph. 37

Algorithm: Variable Elimination on a Tree We will send messages between functions and variables, and between variables and functions. Messages are themselves functions defined over the possible values of the participating variable. 1. Initialize an arbitrary node as the root of the tree. Choose any ordering that begins with the leaves and ends at the root. 2. Initialize messages at the leaves. Variable leaf nodes: identity message. Function leaf nodes: send their own functions as messages. 38

Algorithm: Variable Elimination on a Tree Variable-to-function message represents totality of downstream influence on the variable by all other functions. Function-to-variable message represents totality of downstream influence on the variable by summarizing all downstream functions and variables as a single function. 39

Algorithm: Variable Elimination on a Tree 3. Apply the update rules according to the ordering from leaves to root. 4. Apply the update rules according to the reverse ordering, from root to leaves. 5. Calculate final marginals according to a formula very similar to the variable-to-function update rule; normalize. This calculates all marginals at once. 40
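A minimal instance of these updates on a two-variable chain (hypothetical numeric factors, chosen only to make the message rules concrete):

```python
# Sum-product messages on the tiny chain factor graph
#     g(x1) -- x1 -- f(x1, x2) -- x2        (rooted at x2)
g = {0: 0.4, 1: 0.6}                                   # unary factor on x1
f = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8}

# Leaf function node sends its own table; the variable forwards it.
msg_g_to_x1 = dict(g)
msg_x1_to_f = dict(msg_g_to_x1)

# Function-to-variable update: multiply the incoming message in,
# then sum out x1, summarizing everything downstream as one function.
msg_f_to_x2 = {x2: sum(f[(x1, x2)] * msg_x1_to_f[x1] for x1 in (0, 1))
               for x2 in (0, 1)}

# Final marginal at the root: normalize the product of incoming messages.
Z = sum(msg_f_to_x2.values())
marginal_x2 = {v: m / Z for v, m in msg_f_to_x2.items()}
```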

Example: Variable Elimination on a Tree Consider the marginal probability that x3 has value 0: 41

Example: Variable Elimination on a Tree 42

Basic Operation: Distributing over Sum Note the structure of the parenthesized expressions: dynamic programming! 43

What if the factor graph is not a tree? Make it into a tree. (Exact, Convergent). Full Variable Elimination (Belief Propagation) Cluster-Tree Elimination. 44


What if the factor graph is not a tree? Make it into a tree. (Exact, Convergent.) Full Variable Elimination (Belief Propagation). Cluster-Tree Elimination. Cycle-Cutset. Recursive Conditioning. Bucket Elimination, Node-Splitting. Or: pretend that the graph is a tree, even if it isn't. (Approximate, Non-Convergent.) Loopy Belief Propagation. (Variations.) 46

Alternative Method: Sampling [Diagram: the Bayes network over V, S, T, L, B, O, X, D.] Choose an arbitrary initial configuration. For some number of iterations: for each variable in the graph (according to some ordering), sample a new value for the variable according to the current values of its neighbors. On conclusion, the marginal probability of a variable assignment is the fraction of iterations under which the given variable holds the given value. Ergodicity ensures that we can explore the entire configuration space. 47
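A minimal Gibbs sampler in this spirit (a hypothetical toy model with soft pairwise factors; a hard-constraint version needs extra care to stay ergodic, and the iteration count here is an arbitrary choice):

```python
import random

random.seed(0)

# Toy chain model: neighboring variables prefer to agree.
factors = [
    lambda v: 0.9 if v[0] == v[1] else 0.1,
    lambda v: 0.9 if v[1] == v[2] else 0.1,
]

def score(v):
    s = 1.0
    for f in factors:
        s *= f(v)
    return s

v = [0, 0, 0]            # arbitrary initial configuration
counts = [0, 0, 0]
iters = 2000
for _ in range(iters):
    for i in range(3):   # resample each variable given the rest
        weights = []
        for val in (0, 1):
            v[i] = val
            weights.append(score(v))
        v[i] = 1 if random.random() < weights[1] / (weights[0] + weights[1]) else 0
    for i in range(3):
        counts[i] += v[i]

# Fraction of iterations in which each variable held value 1.
marginals = [c / iters for c in counts]
```

By the symmetry of this toy model, each marginal should come out near 0.5.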

Summary: Marginals over Graphical Models Dualization. Node-merging. Variable elimination as node-merging. Belief propagation as variable elimination. Loopy belief propagation as wrongful belief propagation. (The same goes for inference in the sense of CSP!) 48

Basic Constraint Satisfaction Search (Sum): Backtracking (depth-first) search to build up a satisfying configuration. Failure to build such a configuration upon exhausting the entire space indicates unsatisfiability. Inference (Product): During backtracking search, propagate the consequences of an individual variable assignment. (Can also add constraints during search.) Can also solve a problem entirely by inference. 49

Basic Backtracking Search 50

Quasigroup with Holes (QWH) Problem Latin Square (d = 5):

3 4 5 2 1
4 2 3 1 5
2 1 4 5 3
1 5 2 3 4
5 3 1 4 2

d rows by d columns. Cell variables take values in {1, ..., d}. No repeats in any row or column: alldiff constraint. 51

Quasigroup with Holes (QWH) Problem [Figure: the same 5 x 5 square with some cells blanked out, d = 5.] d rows by d columns. Cell variables take values in {1, ..., d}. No repeats in any row or column: alldiff constraint. QWH Problem: remove a percentage of the values from a valid Latin Square, leaving holes to fill back in. (Kautz et al., 01) 52
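The alldiff constraints of a Latin square are straightforward to check; here is a sketch using the completed 5 x 5 square shown above:

```python
# A completed 5 x 5 Latin square (the example from the slide).
square = [
    [3, 4, 5, 2, 1],
    [4, 2, 3, 1, 5],
    [2, 1, 4, 5, 3],
    [1, 5, 2, 3, 4],
    [5, 3, 1, 4, 2],
]

def alldiff(cells):
    """The alldiff constraint: no value repeats among the cells."""
    return len(set(cells)) == len(cells)

def is_latin(sq):
    d = len(sq)
    rows_ok = all(alldiff(row) for row in sq)
    cols_ok = all(alldiff([sq[r][c] for r in range(d)]) for c in range(d))
    return rows_ok and cols_ok
```

A QWH instance keeps these constraints but blanks out some cells, turning verification into search.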

Factor Graph for QWH (Unary constraints represent the fixed values.) 53

Boolean Satisfiability (SAT) in CNF Variables are Boolean (domain {0,1}). Literals represent the assertion or negation of a variable, i.e. X1 or ¬X1; the positive or negative literal for X1. Constraints consist of clauses: disjunctions of literals, e.g. (X1 ∨ X2 ∨ X3). At least one literal must be satisfied. CNF = Conjunctive Normal Form. 54

Factor Graph for SAT (Solid or dotted lines indicate that variable appears positively or negatively in clause.) 55

Basic Backtracking Search The two most important issues: ordering variables/values, and performing propagation (inference). 56
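The search loop can be sketched as follows (a minimal version with a static variable ordering and no propagation; all names and the toy instance are illustrative):

```python
# Minimal backtracking search for a CSP given as (scope, test) constraints.
def backtrack(variables, domains, constraints, assignment=None):
    if assignment is None:
        assignment = {}
    if len(assignment) == len(variables):
        return dict(assignment)                        # all variables set
    var = next(v for v in variables if v not in assignment)  # ordering hook
    for val in domains[var]:
        assignment[var] = val
        # Check only constraints whose scopes are fully assigned so far.
        consistent = all(
            test(*(assignment[v] for v in scope))
            for scope, test in constraints
            if all(v in assignment for v in scope)
        )
        if consistent:
            result = backtrack(variables, domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]                            # undo and try next value
    return None                                        # exhausted: no solution here

sol = backtrack(
    ["x1", "x2", "x3"],
    {v: [0, 1] for v in ["x1", "x2", "x3"]},
    [(("x1", "x2"), lambda a, b: a != b),
     (("x2", "x3"), lambda a, b: a != b)],
)
```

Better variable/value orderings and propagation plug into the marked lines; that is where most engineering effort goes.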

Basic Inference: Consistency Prune the domains of variables to include only those values that are supportable by some assignment to the other variables. Node consistency: check a variable assignment against zero other variables ( individual consistency ). (Generalized) Arc consistency: check a variable assignment against any one other variable ( pairwise consistency ). Special case for SAT: unit propagation. Many more. [Diagram: constraints C1, C2 pruning a variable's domain.] 57
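Unit propagation, the SAT special case, can be sketched as follows (assuming the common convention of encoding literals as signed integers, +i asserting variable i and -i negating it):

```python
# Unit propagation: repeatedly find a clause with exactly one
# unfalsified, unsatisfied literal, and assign that literal.
def unit_propagate(clauses):
    assignment = {}
    while True:
        unit = None
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue                       # clause already satisfied
            # literals not yet falsified by the current assignment
            live = [l for l in clause if assignment.get(abs(l)) != (l < 0)]
            if not live:
                return None                    # empty clause: conflict
            if len(live) == 1:
                unit = live[0]
                break
        if unit is None:
            return assignment                  # fixpoint: nothing to propagate
        assignment[abs(unit)] = unit > 0

result = unit_propagate([[1], [-1, 2], [-2, 3, -4]])
```

Here the unit clause [1] forces variable 1 true, which in turn makes [-1, 2] a unit clause forcing variable 2 true; the last clause still has two live literals, so propagation stops.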

II. Advanced Concepts from Probabilistic Reasoning and Constraint Satisfaction 58

Overview of CSP/Prob. Inference Contemporary research in constraint satisfaction: Random problems: backbones and backdoors. Constraint propagators. Data structures and engineering. Clause learning / no-good learning. Contemporary research in probabilistic graphical models: The marginal polytope. Emphasis on pure (numerical) optimization. Function propagators. Comparison. 59

Motivation: Discovering Structure Certain variables are much more important than the rest. Combinatorial cores can be small. Remainder is extraneous. 60

Motivation: Discovering Structure Certain variables are much more important than the rest. Combinatorial cores can be small. The remainder is extraneous. Can't just rely on graph structure in the general case. 61

Interesting Features of Random CSPs Combinatorial abstraction for real problems. Small, tightly constrained cores: backdoors/backbones (Williams et al., 03); restarts. Heavy-tailed behavior: high sensitivity to variable ordering (Gomes et al., 00; Hulubei et al., 04). [Plot: frequency histogram of runtimes over multiple trials.] 62

Survey Propagation and the Phase Transition in Satisfiability Random problems are of interest to statistical physics and other applications. Hard random problems can be viewed in terms of backbone and backdoor variables. 63

Contemporary Research in CSP Propagators. Modeling / Applications. Clause Learning / No-Good Learning. Restarts. Data Structures and Algorithms. 64

Prob. Inference as Pure Optimization Computing sets of marginals can be viewed as an optimization problem constrained by the marginal polytope. Survey : a set of marginal estimates. 65

Gibbs Free Energy We want the hypothesized q-distribution to match the true p-distribution. First term: mean free energy; make them match. Second term: entropy; don't be too extreme. (Third term: Helmholtz term, for normalization.) 66

Mean Field Approximation No correlation: probability of a given extension is the product of the individual marginal probabilities of its constituent variable assignments. (Node consistency!) 67

Bethe Approximation (BP) Pairwise correlation: hypothesize distributions over extensions, and distributions over individual variables. Extension distributions must marginalize to individual variable distributions, one variable at a time. (Generalized Arc Consistency!) 68

CSP and Marginalization as Optimization CSP: 69

CSP and Marginalization as Optimization Probabilistic Marginalization: 70

CSP and Marginalization as Optimization CSP: Marginalization: 71

Contemporary Research in Prob. Inference Optimization. Modeling / Applications. Algorithms. Propagators. (Learning? a.k.a. column generation, cutting planes) 72

Summary Marginalization is not just similar to constraint satisfaction; it is constraint satisfaction over a relaxed space. Duality, node-merging, and adding constraints during solving yield corresponding techniques. 73