A Structure-Based Variable Ordering Heuristic for SAT By Jinbo Huang and Adnan Darwiche Presented by Jack Pinette
Overview 1. Divide-and-conquer for SAT 2. DPLL & variable ordering 3. Using dtrees for SAT decomposition 4. Conflict-directed backtracking 5. Experiments 6. Questions
Basic DPLL: sat(cnf: c) 1. If there is an inconsistent clause, return false 2. If there is no uninstantiated variable, return true 3. Select an uninstantiated variable v ∈ vars(c) 4. Return sat(c | v=true) ∨ sat(c | v=false) (This version neglects unit propagation.)
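The four steps above can be sketched in Python as follows. This is a minimal illustration, not the solver used in the paper: the signed-integer literal encoding and the helper name `clause_is_falsified` are assumptions made for the example, and, like the slide, it omits unit propagation.

```python
from typing import Dict, FrozenSet, List, Optional

# Assumed encoding: literal 3 means x3, literal -3 means NOT x3.
Clause = FrozenSet[int]
Assignment = Dict[int, bool]


def clause_is_falsified(clause: Clause, assignment: Assignment) -> bool:
    """A clause is inconsistent when every one of its literals is assigned false."""
    return all(
        abs(lit) in assignment and assignment[abs(lit)] != (lit > 0)
        for lit in clause
    )


def sat(cnf: List[Clause], assignment: Optional[Assignment] = None) -> bool:
    assignment = {} if assignment is None else assignment
    # Step 1: an inconsistent clause means this branch fails.
    if any(clause_is_falsified(c, assignment) for c in cnf):
        return False
    # Step 2: no uninstantiated variable is left, so every clause is satisfied.
    unassigned = sorted({abs(l) for c in cnf for l in c} - set(assignment))
    if not unassigned:
        return True
    # Step 3: select an uninstantiated variable (here simply the smallest index).
    v = unassigned[0]
    # Step 4: branch on both values of v.
    return sat(cnf, {**assignment, v: True}) or sat(cnf, {**assignment, v: False})
```

Step 3 is where a variable ordering heuristic plugs in; the rest of the talk is about making that choice well.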
Divide and Conquer Split a large problem into disconnected components Each subcomponent can be solved independently Solution is the combination of sub-solutions
Dividing SAT problems: CNF C = {c_1, c_2, …, c_m}, a set of clauses which must all be satisfied. In general, any c_i, c_j ∈ C may share variables, so straightforward divide-and-conquer is not possible
Disconnecting: CNF C = {c_1, c_2, …, c_m}. Split C into C_L and C_R, with variable sets V_L and V_R, respectively. In general V_L ∩ V_R ≠ ∅, so C_L and C_R may still be connected
Workaround: In sat(c), insist that the variables in V_L ∩ V_R are instantiated first. Then C_L and C_R become fully independent sub-problems over V_L' = V_L - (V_L ∩ V_R) and V_R' = V_R - (V_L ∩ V_R). Recursively decompose down to individual clauses
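The effect of instantiating the shared variables can be checked on a small example. The sketch below is illustrative only: the clause sets, the signed-integer literal encoding, and the helper names `variables` and `condition` are assumptions, not taken from the paper.

```python
from typing import Dict, FrozenSet, List, Set

Clause = FrozenSet[int]  # assumed encoding: 3 means x3, -3 means NOT x3


def variables(cnf: List[Clause]) -> Set[int]:
    """All variables mentioned in a clause set."""
    return {abs(lit) for clause in cnf for lit in clause}


def condition(cnf: List[Clause], assignment: Dict[int, bool]) -> List[Clause]:
    """Simplify cnf under assignment: drop satisfied clauses and false literals."""
    simplified = []
    for clause in cnf:
        if any(abs(l) in assignment and assignment[abs(l)] == (l > 0) for l in clause):
            continue  # clause already satisfied
        simplified.append(frozenset(l for l in clause if abs(l) not in assignment))
    return simplified


# Hypothetical split: C_L uses variables {1, 2, 3}, C_R uses {3, 4, 5}.
c_left = [frozenset({1, 3}), frozenset({-1, 2})]
c_right = [frozenset({-3, 4}), frozenset({-4, 5})]
shared = variables(c_left) & variables(c_right)  # V_L ∩ V_R

# Whatever value the shared variable takes, the two halves no longer overlap.
for value in (True, False):
    fixed = {v: value for v in shared}
    left, right = condition(c_left, fixed), condition(c_right, fixed)
    assert variables(left).isdisjoint(variables(right))
```

Once the halves are disjoint, each can be handed to `sat` separately and the results combined.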
Using the decomposition Feed the decomposition to the DPLL solver as a variable ordering
Variable orderings in SAT: Recall line 3 of the DPLL algorithm: 3. Select an uninstantiated variable v ∈ vars(c). In general, different choices of v may lead to different search complexity
Static vs. Dynamic Static orderings are predetermined and fixed for the run of the SAT solver Dynamic orderings are computed before each split decision Chaff uses VSIDS, a dynamic ordering
Chaff's VSIDS: 1. Keep an occurrence count for each literal (that is, x and ¬x are counted separately) 2. Periodically divide all counts by some constant 3. Choose the literal with the highest count. Because counts grow as conflict clauses are added, in practice this favors literals involved in recent conflicts
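The three VSIDS steps can be sketched as a small score table. This is a simplified illustration rather than Chaff's actual implementation: the class name `LiteralScores`, the decay divisor, and the decay period are all assumed values chosen for the example.

```python
from collections import defaultdict
from typing import Iterable, Optional, Set


class LiteralScores:
    """Toy VSIDS-style counters; x and NOT x are scored as separate literals."""

    def __init__(self, decay_divisor: float = 2.0, decay_period: int = 256) -> None:
        self.score = defaultdict(float)      # literal -> count
        self.decay_divisor = decay_divisor   # assumed constant
        self.decay_period = decay_period     # assumed: decay every N added clauses
        self.clauses_seen = 0

    def on_clause_added(self, clause: Iterable[int]) -> None:
        # Step 1: bump the count of every literal in the new clause.
        for lit in clause:
            self.score[lit] += 1.0
        self.clauses_seen += 1
        # Step 2: periodically divide all counts by a constant.
        if self.clauses_seen % self.decay_period == 0:
            for lit in self.score:
                self.score[lit] /= self.decay_divisor

    def pick(self, assigned_vars: Set[int]) -> Optional[int]:
        # Step 3: highest-scoring literal whose variable is still unassigned.
        candidates = [l for l in self.score if abs(l) not in assigned_vars]
        if not candidates:
            return None
        # Ties are broken toward the smaller variable index (arbitrary choice).
        return max(candidates, key=lambda l: (self.score[l], -abs(l), l))
```

Because old counts are repeatedly divided down while new clauses keep adding to them, literals from recently added clauses dominate the ranking.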
Divide-and-conquer ordering: For CNF C = C_L ∪ C_R, order by groups (recursively): 1. V_L ∩ V_R 2. V_L' = V_L - (V_L ∩ V_R) 3. V_R' = V_R - (V_L ∩ V_R) (the choice of left before right is arbitrary)
Divide-and-conquer ordering: Within each group, use any other variable ordering: 1. V_L ∩ V_R 2. V_L' = V_L - (V_L ∩ V_R) 3. V_R' = V_R - (V_L ∩ V_R) (the choice of left before right is arbitrary)
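One way to picture "groups first, any ordering inside a group" is a decision function that only looks at the earliest group with an open variable. The function name `pick_variable` and the `score` callback are hypothetical; any tie-breaking heuristic could stand in for `score`.

```python
from typing import Callable, Dict, Optional, Sequence, Set


def pick_variable(
    groups: Sequence[Set[str]],
    assignment: Dict[str, bool],
    score: Callable[[str], float],
) -> Optional[str]:
    """Return the best-scoring unassigned variable of the first unfinished group."""
    for group in groups:
        open_vars = [v for v in group if v not in assignment]
        if open_vars:
            # Inside the group, defer to the secondary heuristic.
            # The variable name is only a deterministic tie-breaker.
            return max(open_vars, key=lambda v: (score(v), v))
    return None  # every variable is assigned


# Usage: "x" has the highest score overall, but its group comes second.
groups = [{"u", "z"}, {"x"}, {"y"}]
scores = {"u": 1.0, "z": 3.0, "x": 9.0, "y": 0.0}
first = pick_variable(groups, {}, scores.get)                          # "z"
later = pick_variable(groups, {"u": True, "z": False}, scores.get)     # "x"
```

The group order is static while the choice inside a group can stay dynamic, which is the combination described on the slide.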
Partition mechanism How do we choose the partition C L, C R? Use a dtree (decomposition tree)
dtree for CNF Full binary tree Nodes represent a subset of CNF C Leaves are individual clauses of C
dtree definitions: Variables of a dtree node: the union of its children's variables. For a leaf: all variables mentioned in the associated clause
dtree definitions: Cutset of a dtree node: the intersection of its children's variables, minus its ancestors' cutsets
Variables and cutsets explained: Each node of the dtree represents part of the CNF. A node's variables = all the variables in that part of the CNF
Variables and cutsets explained: A node's cutset = the variables that must be instantiated before the node's children become independent subproblems
dtree variable group ordering: The variable group ordering (v.g.o.) induced by a dtree: 1. The cutset of the root 2. The v.g.o. of the left subtree 3. The v.g.o. of the right subtree
dtree variable group ordering Our example dtree induces the v.g.o.: {u,z}, {x}, {y}, {w}, {v}
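The definitions of variables, cutsets, and the induced ordering can be sketched as a short recursion. The four clauses below are a hypothetical reconstruction chosen so that the result matches the ordering on the slide; the original example's clauses are not given in the text. The sketch also assumes that a leaf contributes whatever variables of its clause are not already in an ancestor's cutset, and that empty groups are skipped.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass
class DNode:
    clause_vars: FrozenSet[str] = frozenset()  # only meaningful on leaves
    left: Optional["DNode"] = None
    right: Optional["DNode"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None

    def variables(self) -> FrozenSet[str]:
        """Leaf: variables of its clause. Internal node: union over children."""
        if self.is_leaf():
            return self.clause_vars
        return self.left.variables() | self.right.variables()


def group_ordering(node: DNode, ancestors: FrozenSet[str] = frozenset()) -> List[FrozenSet[str]]:
    """Cutset of this node, then the left subtree's groups, then the right's."""
    if node.is_leaf():
        group = node.variables() - ancestors  # assumption: leftover leaf variables
        return [group] if group else []
    cutset = (node.left.variables() & node.right.variables()) - ancestors
    seen = ancestors | cutset
    groups = [cutset] if cutset else []
    return groups + group_ordering(node.left, seen) + group_ordering(node.right, seen)


# Hypothetical clauses over {u, v, w, x, y, z}; signs are irrelevant here.
c1, c2 = DNode(frozenset("uxy")), DNode(frozenset("xz"))
c3, c4 = DNode(frozenset("uwv")), DNode(frozenset("wz"))
root = DNode(left=DNode(left=c1, right=c2), right=DNode(left=c3, right=c4))
ordering = group_ordering(root)
```

For this tree `ordering` is the list of groups {u,z}, {x}, {y}, {w}, {v}, in that order.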
Choosing a dtree Any nontrivial CNF has multiple dtrees Q: how to pick a good dtree? A: use a hypergraph partitioning algorithm
Hypergraph partitioning A hypergraph is a generalized graph, where edges (hyperedges) may connect more than two vertices. Hypergraph partitioning: split the vertices into k approximately equal-sized parts, minimizing the connections between vertices in different parts
Hypergraph partitioning: Hypergraph partitioning is well studied. The authors use the hMETIS package from the University of Minnesota; the literature claims order-of-magnitude performance gains over competitors. hMETIS lets the user specify how balanced the partition should be
CNF hypergraph: Add a hypergraph node for each CNF clause. Add a hyperedge for each variable in the CNF, connecting the nodes (clauses) in which the variable appears
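The construction is a single pass over the clauses. In this sketch, clauses are represented only by the set of variable names they mention (signs do not matter for the hypergraph), and the example clauses are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def cnf_hypergraph(cnf: List[Iterable[str]]) -> Dict[str, Set[int]]:
    """Map each variable (hyperedge) to the indices of the clauses (nodes) it touches."""
    hyperedges = defaultdict(set)
    for node_index, clause in enumerate(cnf):
        for var in clause:
            hyperedges[var].add(node_index)
    return dict(hyperedges)


# Hypothetical 4-clause CNF, given as the variables of each clause.
example_cnf = [{"u", "x", "y"}, {"x", "z"}, {"u", "w", "v"}, {"w", "z"}]
example_edges = cnf_hypergraph(example_cnf)
```

Variables that occur in a single clause, such as `y` and `v` here, give hyperedges with one node; they can never be cut by a partition.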
Hypergraph example Showing hyperedges connecting two or more nodes
Hypergraph example: hMETIS tries to choose a balanced partition that minimizes the number of crossing hyperedges
Hypergraph example: This partition is balanced (2 and 2) and crosses two hyperedges, u and z. This corresponds to a cutset of {u,z} for the root node of the dtree
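To make the link between a partition and a cutset concrete, the sketch below searches all balanced two-way splits of a tiny hypergraph by brute force. This is only a stand-in for illustration: hMETIS uses far more scalable multilevel heuristics, and the hyperedges and function names here are assumptions.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def cut_variables(hyperedges: Dict[str, Set[int]], left: Set[int]) -> Set[str]:
    """Variables whose hyperedge has nodes on both sides of the split."""
    return {var for var, nodes in hyperedges.items() if nodes & left and nodes - left}


def best_balanced_bipartition(
    hyperedges: Dict[str, Set[int]], num_nodes: int
) -> Tuple[Set[int], Set[int], Set[str]]:
    """Exhaustively try every split with num_nodes // 2 nodes on the left side."""
    all_nodes = set(range(num_nodes))
    best = None
    for combo in combinations(range(num_nodes), num_nodes // 2):
        left = set(combo)
        cut = cut_variables(hyperedges, left)
        # Strict comparison keeps the first split found when several tie.
        if best is None or len(cut) < len(best[2]):
            best = (left, all_nodes - left, cut)
    return best


# Hypothetical hypergraph: variable -> indices of the clauses containing it.
edges = {"u": {0, 2}, "x": {0, 1}, "y": {0}, "z": {1, 3}, "w": {2, 3}, "v": {2}}
left, right, cut = best_balanced_bipartition(edges, 4)
```

On this input the split is clauses {0, 1} against {2, 3} with crossing hyperedges u and z, so the root cutset is {u, z}. The split {0, 2} against {1, 3} also crosses only two hyperedges (x and w); the sketch simply keeps the first minimum it encounters.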
How to use this? We can use hypergraph partitioning to generate a dtree We can use the dtree to generate a variable group ordering Does this guarantee that a DPLL solver will handle the problem decomposition?
Wasted effort: Example: C = C_L ∪ C_R; say V_L ∩ V_R is fully instantiated and C_L has been satisfied. If a conflict is found while exploring C_R, DPLL could backtrack to a decision on a variable that appears only in C_L (wasting effort, since that decision cannot affect C_R). It should backtrack directly to a decision in V_L ∩ V_R.
Conflict-directed backtracking: Tracks the decisions responsible for the assignments leading to a conflict. A DPLL solver with conflict-directed backtracking will know not to backtrack to a decision on a variable that appears only in C_L, but will skip back to V_L ∩ V_R as desired.
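The difference from chronological backtracking can be shown with a toy decision stack. This sketch assumes the solver has already worked out which decision variables were responsible for the conflict; how ZChaff derives that set (through conflict analysis) is not modeled here, and the function name is hypothetical.

```python
from typing import List, Optional, Set


def backjump_target(decision_stack: List[str], conflict_vars: Set[str]) -> Optional[int]:
    """Index of the most recent decision that contributed to the conflict."""
    for depth in range(len(decision_stack) - 1, -1, -1):
        if decision_stack[depth] in conflict_vars:
            return depth
    return None  # no decision is responsible: the formula is unsatisfiable


# Cutset {u, z} decided first, then x and y while solving C_L.
# A conflict then arises in C_R that depends only on the cutset decisions.
stack = ["u", "z", "x", "y"]
responsible = {"u", "z"}

target = backjump_target(stack, responsible)   # index of "z"
stack_after_jump = stack[: target + 1]         # decisions on x and y are skipped
```

Chronological backtracking would instead revisit `y` and then `x`, re-solving C_L for assignments that cannot repair the conflict in C_R.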
Implementation: Additions to ZChaff: a package to generate the dtree and a package to extract the v.g.o. from the dtree. Changes to ZChaff: it is forced to obey the v.g.o.; inside a group, it uses VSIDS to choose. Compared to stock ZChaff on selected benchmarks
Results Dtree-ZChaff wins for many instances
Results: Improves many instances. The most improved are instances that are hard for ZChaff. Harmful for a few instances
Complexity: For a CNF C whose connectivity graph has treewidth w: there exists a dtree for C with height log n (where C has n clauses) in which all cutsets have size w. A DPLL solver with conflict-directed backtracking can solve C in O(n exp(w log n)) time
Questions: How were the benchmarks selected? Are some problem domains known to generate problems with this type of structure? Is there a fast way to guess whether a dtree will be helpful or not for an arbitrary CNF?
Related materials S. Szeider at University of Toronto has recent papers on fixed-parameter tractable SAT problems http://www.cs.toronto.edu/~szeider/