Linear Time Unit Propagation, Horn-SAT and 2-SAT

Similar documents
PROPOSITIONAL LOGIC (2)

CS 512, Spring 2017: Take-Home End-of-Term Examination

Boolean Functions (Formulas) and Propositional Logic

8 NP-complete problem Hard problems: demo

The Satisfiability Problem [HMU06,Chp.10b] Satisfiability (SAT) Problem Cook s Theorem: An NP-Complete Problem Restricted SAT: CSAT, k-sat, 3SAT

Chapter 2 PRELIMINARIES

Lecture 14: Lower Bounds for Tree Resolution

Example of a Demonstration that a Problem is NP-Complete by reduction from CNF-SAT

SAT-CNF Is N P-complete

Resolution and Clause Learning for Multi-Valued CNF

Horn Formulae. CS124 Course Notes 8 Spring 2018

Mixed Integer Linear Programming

Where Can We Draw The Line?

EECS 219C: Computer-Aided Verification Boolean Satisfiability Solving. Sanjit A. Seshia EECS, UC Berkeley

2SAT Andreas Klappenecker

ALGORITHMS EXAMINATION Department of Computer Science New York University December 17, 2007

Polynomial SAT-Solver Algorithm Explanation

EECS 219C: Formal Methods Boolean Satisfiability Solving. Sanjit A. Seshia EECS, UC Berkeley

P Is Not Equal to NP. ScholarlyCommons. University of Pennsylvania. Jon Freeman University of Pennsylvania. October 1989

A Satisfiability Procedure for Quantified Boolean Formulae

Notes on CSP. Will Guaraldi, et al. version /13/2006

Normal Forms for Boolean Expressions

8.1 Polynomial-Time Reductions

NP and computational intractability. Kleinberg and Tardos, chapter 8

4.1 Review - the DPLL procedure

Chapter 10 Part 1: Reduction

Foundations of AI. 9. Predicate Logic. Syntax and Semantics, Normal Forms, Herbrand Expansion, Resolution

CS446: Machine Learning Fall Problem Set 4. Handed Out: October 17, 2013 Due: October 31 th, w T x i w

(QiuXin Hui) 7.2 Given the following, can you prove that the unicorn is mythical? How about magical? Horned? Decide what you think the right answer

Reductions and Satisfiability

Exercises Computational Complexity

The Complexity of Camping

The Resolution Algorithm

Reductions. Linear Time Reductions. Desiderata. Reduction. Desiderata. Classify problems according to their computational requirements.

NP-Completeness. Algorithms

Mathematical Logic Prof. Arindama Singh Department of Mathematics Indian Institute of Technology, Madras. Lecture - 9 Normal Forms

Propositional Logic Formal Syntax and Semantics. Computability and Logic

SAT Solvers. Ranjit Jhala, UC San Diego. April 9, 2013

1. Suppose you are given a magic black box that somehow answers the following decision problem in polynomial time:

DM841 DISCRETE OPTIMIZATION. Part 2 Heuristics. Satisfiability. Marco Chiarandini

CS 125 Section #4 RAMs and TMs 9/27/16

CSc Parallel Scientific Computing, 2017

Example: Map coloring

Some Hardness Proofs

Binary Decision Diagrams

! Greed. O(n log n) interval scheduling. ! Divide-and-conquer. O(n log n) FFT. ! Dynamic programming. O(n 2 ) edit distance.

Foundations of AI. 8. Satisfiability and Model Construction. Davis-Putnam, Phase Transitions, GSAT and GWSAT. Wolfram Burgard & Bernhard Nebel

Definition: A context-free grammar (CFG) is a 4- tuple. variables = nonterminals, terminals, rules = productions,,

Handout 9: Imperative Programs and State

Notes on CSP. Will Guaraldi, et al. version 1.7 4/18/2007

Circuit versus CNF Reasoning for Equivalence Checking

Fixed-Parameter Algorithm for 2-CNF Deletion Problem

COMP4418 Knowledge Representation and Reasoning

CMPSCI611: The SUBSET-SUM Problem Lecture 18

Answer Key #1 Phil 414 JL Shaheen Fall 2010

Chapter 8. NP and Computational Intractability. Slides by Kevin Wayne. Copyright 2005 Pearson-Addison Wesley. All rights reserved.

Notes on Non-Chronologic Backtracking, Implication Graphs, and Learning

Computability Theory

CS-E3200 Discrete Models and Search

arxiv: v2 [cs.cc] 29 Mar 2010

To prove something about all Boolean expressions, we will need the following induction principle: Axiom 7.1 (Induction over Boolean expressions):

NP-Completeness of 3SAT, 1-IN-3SAT and MAX 2SAT

! Greed. O(n log n) interval scheduling. ! Divide-and-conquer. O(n log n) FFT. ! Dynamic programming. O(n 2 ) edit distance.

Finite Model Generation for Isabelle/HOL Using a SAT Solver

Multi Domain Logic and its Applications to SAT

Core Membership Computation for Succinct Representations of Coalitional Games

NP-complete Reductions

Implementation of a Sudoku Solver Using Reduction to SAT

PCP and Hardness of Approximation

Boolean Representations and Combinatorial Equivalence

CS 395T Computational Learning Theory. Scribe: Wei Tang

Complexity Classes and Polynomial-time Reductions

1 Introduction. 1. Prove the problem lies in the class NP. 2. Find an NP-complete problem that reduces to it.

Unit 8: Coping with NP-Completeness. Complexity classes Reducibility and NP-completeness proofs Coping with NP-complete problems. Y.-W.

Practical SAT Solving

Steven Skiena. skiena

Learning a SAT Solver from Single-

NP-Complete Reductions 2

Decision Procedures. An Algorithmic Point of View. Decision Procedures for Propositional Logic. D. Kroening O. Strichman.

W4231: Analysis of Algorithms

1 Definition of Reduction

Best known solution time is Ω(V!) Check every permutation of vertices to see if there is a graph edge between adjacent vertices

Decision Procedures in First Order Logic

Parallel and Distributed Computing

3.7 Denotational Semantics

Combinatorial Optimization

Semantics. A. Demers Jan This material is primarily from Ch. 2 of the text. We present an imperative

P and NP (Millenium problem)

Propositional Calculus: Boolean Algebra and Simplification. CS 270: Mathematical Foundations of Computer Science Jeremy Johnson

Lecture 10 October 7, 2014

Satisfiability. Michail G. Lagoudakis. Department of Computer Science Duke University Durham, NC SATISFIABILITY

Homework 4 Solutions

New Worst-Case Upper Bound for #2-SAT and #3-SAT with the Number of Clauses as the Parameter

Incomplete Information: Null Values

Lecture 20: Satisfiability Steven Skiena. Department of Computer Science State University of New York Stony Brook, NY

Inapproximability of the Perimeter Defense Problem

Chapter 3. Set Theory. 3.1 What is a Set?

Integrity Constraints (Chapter 7.3) Overview. Bottom-Up. Top-Down. Integrity Constraint. Disjunctive & Negative Knowledge. Proof by Refutation

Exam in Algorithms & Data Structures 3 (1DL481)

Examples of P vs NP: More Problems

Transcription:

Notes on Satisfiability-Based Problem Solving Linear Time Unit Propagation, Horn-SAT and 2-SAT David Mitchell mitchell@cs.sfu.ca September 25, 2013 This is a preliminary draft of these notes. Please do not distribute without permission. Corrections and comments are welcome. In this section, we describe linear time algorithms for two syntactic restrictions of SAT. The algorithms are based on an efficient simplification process, called unit propagation, which also lies at the core of most high-performance SAT solvers. Terms and Conventions In this section Γ denotes a set of clauses, with n distinct atoms, m clauses, and length (number of literals) l, and α is a (possibly partial) truth assignment for the atoms of Γ. We will often write CNF formulas as sets of clauses, sometimes leaving out the braces, so that (P Q) (R S) becomes (P Q), (R S). 1 Unit Propagation Unit propagation is an important simplification operation for CNF formulas based on the observation that, if Γ contains a unit clause (P), then every satisfying assignment for Γ must make P true. If Γ also contains a clause ( P Q), we may now observe that it is effectively unit, because the only way to satisfy it is to set Q true. Unit Propagation (UP) is the name for the process of following this line of implications. Definition 1. The Unit Clause Rule (UCR) is the following simplification rule for CNF formulas. UCR: If Γ contains a unit clause (L), delete every occurrence of L from clauses of Γ, and delete every clause containing L from Γ. We denote by UP(Γ) the result of repeated application of UCR until there are no unit clauses in the resulting formula. That is, it is a fixpoint of UCR. 1

The fixpoint of UCR is unique, provided it does not contain (). If a fixpoint of UCR contains (), it need not be unique, but every fixpoint will contain (). Example 1. Let Γ be (P), (P Q R), ( P R U), ( P Q), ( Q P R), (R U V), ( P Q V) Applying UCR with (P), we obtain (R U), ( Q), ( Q R), (R U V), (Q V) Which produces a new unit clause ( Q). Now apply UCR with ( Q), and obtain (R U), (R U V), (V) Finally, we apply UCR with the new unit clause (V), obtaining Exercise 1. Construct UP(Γ), where Γ is (R U). (P)(P, U, V)(P, Q)(P, R)(R, Q, S)(S, Q, T, U, V)(S, Q, T, W, X)(P, Q, U, V)(P, Q, U, W). Proposition 1. For any CNF formula Γ: 1. Γ is satisfiable iff UP(Γ) is; 2. if () UP(Γ) then Γ is unsatisfiable. In the next two sections, we show how to solve two important special cases of SAT efficiently using UP. In later notes we will see how efficient UP is a central component in industrial-strength SAT solvers. Naive computation of UP(Γ) takes quadratic time, as it involves repeated scanning of the entire formula to find all occurrences of a literal. However, with suitable data structures it can be computed in linear time. Later in this section of notes we will describe a method for doing this that is important in practice. Proposition 2. UP(Γ) can be computed in time O(l). We show how to do so in Section 4. 2 Horn-SAT Definition 2. A clause is Horn if it contains at most one positive literal. Clause set Γ is Horn if each of its clauses is Horn. Horn-SAT is the problem of deciding satisfiability of sets of Horn clauses (a.k.a. Horn formulas). 2

We may think of the size of a truth assignment as being the number of atoms it makes true. Then every Horn formula has a unique minimal model, and we can find this in linear time. Proposition 3. If a Horn formula has no positive unit clauses, then it is satisfiable. Exercise 2. Prove Proposition 3. (Hint: This is a special case of the property that an arbitrary CNF formula with no positive clause is satisfiable.) In fact, the following process shows that we can compute the minimal model of a Horn formula using UP. First, partition Γ into two sets: Γ containing only the negative clauses of Γ (i.e., those clauses containing no positive literal); and Γ +, containing only those clauses which have a positive literal. Now compute UP(Γ + ), and construct the partial truth assignment τ of atoms set true during the computation. If τ makes a clause of Γ false, then Γ is unsatisfiable. Otherwise, Γ is satisfiable, and the total assignment obtained from τ by setting all remaining atoms to false is its minimal model. To see why this it the case, first observe that the only unit clauses which appear during computation of UP(Γ + ) are positive. We can show this by induction on the number of unit clauses generated. The base case is zero: the initial formula has only positive unit clauses, by construction. If re-write each non-unit clause ( P 1 P 2... P k P) of Γ + as the equivalent formula (P 1 P 2... P k P), then it is easy to see that, in each step where a new unit clause is obtained, it must be positive (otherwise there was a negative unit clause which made P false). Thus, we see that τ sets some atoms true, and (by Proposition 1) those atoms are true in every satisfying assignment for Γ. If clause of Γ is made false by the assignments in τ, then Γ is unsatisfiable. Otherwise, all remaining atoms can be set false, and this will be sufficient to satisfy all clauses of Γ. Exercise 3. Compute the minimal model of (P Q R), (Q S), (R), (W S), (W), (W X P) This leads us to the following conclusion. Proposition 4. Horn formula Γ is unsatisfiable if and only if UP(Γ) contains the empty clause. Corollary 1. Horn-SAT can be solved in time O(l). 3

3 2-SAT Definition 3. 2-SAT is the problem of deciding satisfiability of sets of clauses with at most two literals per clause (a.k.a. 2-CNF formulas). There are two well-known linear-time algorithms for 2-SAT, one based on unit propagation, and one based on finding cycles in a graph associated with the formula. We focus on the former, which is more practical and is closely related to other algorithms we use. In Section 3.1 we give a brief sketch of the graph based method. The UP-based algorithm is based upon the following observation. Suppose Γ is a 2-CNF and we apply UCR for some unit clause (L). Then for each clause C in Γ we have exactly one of: 1. C contains L, so is now satisfied; 2. C contains L, so becomes a unit clause; 3. C is untouched. It follows that UP(Γ) either contains the empty clause, or contains only clauses of size exactly two, none of which contain atoms which were given values during unit propagation. For any literal L which occurs in Γ, define UP(Γ,L) to be UP(Γ {L}). Proposition 5. If Γ is a 2-CNF in which L occurs, then exactly one of the following is true: 1. UP(Γ,L) contains the empty clause, in which case Γ = L; 2. UP(Γ,L) is a proper subset of Γ, and is satisfiable if and only if Γ is. A simple polytime algorithm for 2-SAT, based on this proposition, is given as Algorithm 1. Assuming linear time unit propagation, the algorithm runs in time O(nl). Example 2 demonstrates that this upper bound is tight. Example 2. Let Γ be the following conjunction of clauses: (P 1 P 2 ) (P 2 P 3 )... (P n 1 P n ) (P n Q) (P n Q) (P 1 R 1 ) (P 2 R 2 )... (P n R n ) (1) Γ has 2n atoms, and 2n + 1 clauses. Consider a run of Algorithm 1 with input Γ in which the first literal chosen in line 3 is P 1, the second is P 2, etc. When P 0 is chosen, unit propagation sets all atoms P i, for i > 1, true, and then constructs the empty clause from (P n Q) (P n Q). When P 1 is set in line 6, the clause (P 1 R) is satisfied, and nothing else happens. The pattern is 4

Algorithm 1 Unit propagation based algorithm for 2-SAT 1: procedure UP-2SAT( Γ ) // Γ is a set of binary clauses 2: while Γ is not empty do 3: L a literal from Γ 4: UP(Γ,L) 5: if () then // we know Γ = L 6: UP(Γ,L) 7: if () then // we know Γ = L and Γ = L 8: return UNSAT 9: end if 10: end if 11: Γ 12: end while 13: return SAT 14: end procedure similar after choosing P 2, P 3, etc. With linear time unit propagation, this algorithm has worst-case time Θ(nl), which can be quadratic in the input size. To obtain a linear time algorithm, we must ensure that, whenever unit propagation does more than a very small amount of work, that work is reflected in a reduction in the size of the formula. We can arrange for this by performing the propagation processes for L and L in parallel, as shown in Algorithm 2. For purposes of describing this algorithm, we define one step of unit propagation to be touching (eliminating either by identifying as satisfied or identifying as a new unit) one clause. This algorithm runs in time O(l). Exercise 4. It is well-known that 2-Col can be solved in polynomial time, and in fact linear time. Consider the representations of K-colouring as CNF given in the notes on CNF, for the special case when K = 2. What can we say about the CNF representations of 2-Col? 3.1 The Implication Graph Method Here, we give a brief sketch of the other, and perhaps most common, method of showing 2-SAT is in linear time. Definition 4. The implication graph IG Γ of a 2-CNF formula Γ is the directed graph IG Γ = V, A with vertices V = {P, P P is an atom occurring in Γ} and arc set A = 5

Algorithm 2 Linear-time Unit propagation based algorithm for 2-SAT 1: procedure Linear-2SAT( Γ ) // Γ is a set of binary clauses 2: while Γ is not empty do 3: L a literal from Γ 4: Perform computation of UP(Γ,L) and UP(Γ,L) step-for-step in parallel; 5: When the first of these processes terminates, halt the other also; 6: if UP(Γ,L) halted first then 7: let L denote L. 8: end if 9: if () UP(Γ,L) then 10: Γ UP(Γ,L) 11: else 12: Γ UP(Γ,L) 13: if () Γ then 14: return UNSAT 15: end if 16: end if 17: end while 18: return SAT 19: end procedure {(L, L ), (L, L) (L L ) Γ }. If Γ contains a unit clause (L), we treat it as the logically equivalent 2-clause (L L). The intuition is that the arcs of the graph capture the implications represented by each clause. A clause (L, L ) in Γ requires that, if L is false then L is true, and if L is false, then L is true. Proposition 6. 2-CNF Γ is unsatisfiable off, for some atom P, G Γ has a directed cycle containing both P and P. To check satisfiability of Γ, we generate IG Γ, find its strongly connected components, and then check if any strong component contains both an atom and its negation. Strongly connected components can be constructed in linear time. 6

4 Linear Time UP with the Two Watched Literals Method Performing unit propagation in linear time requires using an appropriate data structure. The method described here is sufficient for showing the 2-SAT and Horn-SAT algorithms given above are linear time, and for effective implementations of these algorithms. More importantly, fast UP lies at the heart of modern industrial SAT solvers (other related software), and this is the standard method used in such solvers. It is useful to be clear about our what we need to do. Typically, we do not need to explicitly construct the formula UP(Γ,L). For example, in deciding Horn-SAT, we first want to know if performing UP generates the empty clause. If so, we don t need to know anything else about UP(Γ,lit). To determine this, it is sufficient to be able to identify what new unit clauses arise (or if the empty clause arises), and ignore almost everything else. Rather than thinking of applying UCR literally, we imagine a process of constructing a partial truth assignment which assigns true to exactly the literals which appear in unit clauses. A clause is effectively a unit clause if all of its literals but one have been set false. Now, suppose we have just detected the unit clause (L). Our goal is to determine if setting L true will generate a new unit clause. We don t care about clauses that don t mention L or L. We also don t care about clauses containing L, as they will be satisfied. Now, consider a clause C containing L, but also many other literals. Setting L true effectively makes C shorter by one, but it C cannot become a unit clause if there are other literals which have not been made false. We want to avoid repeatedly visiting and scanning long clauses, except if they are in danger of having turned into unit clauses. To do this, we select two literals in each clause C to watch. As long as both watched literals of C remain non-false (i.e., they are either unassigned of assigned true), we can ignore C. The only time we need to visit C is when one of its watched literals has been set false, at which point it may have been made effectively a unit clause, or even empty. The data TWL data structure maintains, for each literal L, a list of watched occurrences of Lin clauses of Γ. This is illustrated in Figure 1. The algorithm for performing UP with the TWL data structure is given in Algorithm 3. To ensure the algorithm executes in linear time, we must be careful about how we scan clauses when visiting them. To see why, consider formulas containing a clause C = (L 1, L 2,... L k ), where k is some constant fraction of l (the total number of literals in the formula). Suppose that we begin with literals L 1 and L k watched, that the order of assignments made by the algorithm is L 1, L 2,..., and that each time we visit C we do a sequential scan from left to right looking for a new literal to watch. It is clear that we spend Ω(l) time just scanning the clause C. However, if we use some reasonable strategy 7

... (P, Q, R), (P, Q, R, S),... (P, Q,... S), (P, Q,... R),... (P, Q, R), (P, Q,....... P... P... Q... Q... R... R.... Figure 1: The TWL data structure. The clause list appears across the top; the list of watch lists down the left side. Each clause has two watched literals. to ensure we never inspect a literal of C more than once (or some small constant number of times) during an entire run of the algorithm, then we can be sure the algorithm runs in O(l) time. A simple way to do this is to maintain an index into C which we alway increase when doing a scan, so we inspect each literal only once. We also need to arrange O(1) access to the second watched literal, which can be done in several ways. 4.1 TWL UP and 2-SAT If the formula is 2-CNF, the TWL algorithm works as advertised, but is also overkill. Since each clause contains 2 literals, both of them are watched. Thus, the watch list for L is essentially a list of all occurrences of L in the formula. When we set L false, all clauses containing L become unit. Thus, it is sufficient to keep the list of these literals, and forget about the middle case of the if-elseif-else statement in the body of Algorithm 3. 8

Algorithm 3 Linear-Time Unit Propagation using Two Watched Literals 1: procedure UP-TWL( Γ ) // Γ is a set of clauses 2: Q empty queue of literals; 3: α an empty partial truth assignment 4: for each unit clause (L) in Γ do 5: set α(l) true, and enqueue L on Q 6: end for 7: Construct the TWL data structure for all non-unit clauses of Γ. 8: while Q is not empty do 9: L a literal dequeued from Q 10: for each clause C in the list of watched occurrences of L do 11: // We were watching L in C, but L has been made false 12: H the other watched literal of C 13: if there is a non-false literal J in C\{H, L} then 14: add C to the watch list for J 15: else if α(h) = f alse then // C is effectively an empty clause 16: return () UP(Γ) 17: else// C is effectively unit, 18: set α(h) = true and enqueue H to Q 19: end if 20: end for 21: end while 22: return () UP(Γ) 23: end procedure 9