Formally Certified Satisfiability Solving

Similar documents
versat: A Verified Modern SAT Solver

Formally certified satisfiability solving

Extended Abstract: Combining a Logical Framework with an RUP Checker for SMT Proofs

Andrew Reynolds Liana Hadarean

versat: A Verified Modern SAT Solver

Deductive Methods, Bounded Model Checking

Improving Coq Propositional Reasoning Using a Lazy CNF Conversion

Integrating a SAT Solver with Isabelle/HOL

SMT Proof Checking Using a Logical Framework

Integration of SMT Solvers with ITPs There and Back Again

Dependently Typed Programming with Mutable State

Generating Small Countermodels. Andrew Reynolds Intel August 30, 2012

SMTCoq: A plug-in for integrating SMT solvers into Coq

The design of a programming language for provably correct programs: success and failure

Adam Chlipala University of California, Berkeley ICFP 2006

SAT Solver. CS 680 Formal Methods Jeremy Johnson

DPLL(Γ+T): a new style of reasoning for program checking

SMT-LIB for HOL. Daniel Kroening Philipp Rümmer Georg Weissenbacher Oxford University Computing Laboratory. ITP Workshop MSR Cambridge 25 August 2009

Seminar decision procedures: Certification of SAT and unsat proofs

Satisfiability Modulo Theories. DPLL solves Satisfiability fine on some problems but not others

SAT Solver Heuristics

SAT and Termination. Nao Hirokawa. Japan Advanced Institute of Science and Technology. SAT and Termination 1/41

Finite Model Generation for Isabelle/HOL Using a SAT Solver

Yices 1.0: An Efficient SMT Solver

The SMT-LIB 2 Standard: Overview and Proposed New Theories

Mechanically-Verified Validation of Satisfiability Solvers

Satisfiability (SAT) Applications. Extensions/Related Problems. An Aside: Example Proof by Machine. Annual Competitions 12/3/2008

HySAT. what you can use it for how it works example from application domain final remarks. Christian Herde /12

CAV Verification Mentoring Workshop 2017 SMT Solving

Yices 1.0: An Efficient SMT Solver

Leonardo de Moura and Nikolaj Bjorner Microsoft Research

DPLL(T ):Fast Decision Procedures

Deductive Program Verification with Why3, Past and Future

Isabelle/HOL:Selected Features and Recent Improvements

Lost in translation. Leonardo de Moura Microsoft Research. how easy problems become hard due to bad encodings. Vampire Workshop 2015

Symbolic Methods. The finite-state case. Martin Fränzle. Carl von Ossietzky Universität FK II, Dpt. Informatik Abt.

SAT Solvers. Ranjit Jhala, UC San Diego. April 9, 2013

argo-lib: A Generic Platform for Decision Procedures

Complete Instantiation of Quantified Formulas in Satisfiability Modulo Theories. ACSys Seminar

Pooya Saadatpanah, Michalis Famelis, Jan Gorzny, Nathan Robinson, Marsha Chechik, Rick Salay. September 30th, University of Toronto.

Theorem proving. PVS theorem prover. Hoare style verification PVS. More on embeddings. What if. Abhik Roychoudhury CS 6214

PROPOSITIONAL LOGIC (2)

Towards Coq Formalisation of {log} Set Constraints Resolution

An Introduction to Satisfiability Modulo Theories

SAT/SMT Solvers and Applications

Symbolic and Concolic Execution of Programs

Automated Theorem Proving in a First-Order Logic with First Class Boolean Sort

Decision Procedures in the Theory of Bit-Vectors

Formalization of Incremental Simplex Algorithm by Stepwise Refinement

Midwest Verification Day 2011

Runtime Checking for Program Verification Systems

QuteSat. A Robust Circuit-Based SAT Solver for Complex Circuit Structure. Chung-Yang (Ric) Huang National Taiwan University

Pluggable SAT-Solvers for SMT-Solvers

EXTENDING SAT SOLVER WITH PARITY CONSTRAINTS

A Pearl on SAT Solving in Prolog (extended abstract)

Minimum Satisfying Assignments for SMT. Işıl Dillig, Tom Dillig Ken McMillan Alex Aiken College of William & Mary Microsoft Research Stanford U.

Foundations of AI. 9. Predicate Logic. Syntax and Semantics, Normal Forms, Herbrand Expansion, Resolution

PVS, SAL, and the ToolBus

Language Techniques for Provably Safe Mobile Code

K and Matching Logic

Planning as Search. Progression. Partial-Order causal link: UCPOP. Node. World State. Partial Plans World States. Regress Action.

Full CNF Encoding: The Counting Constraints Case

EECS 219C: Formal Methods Boolean Satisfiability Solving. Sanjit A. Seshia EECS, UC Berkeley

SAT Solver and its Application to Combinatorial Problems

A Decision Procedure for (Co)datatypes in SMT Solvers. Andrew Reynolds Jasmin Christian Blanchette IJCAI sister conference track, July 12, 2016

CS-E3200 Discrete Models and Search

Towards certification of TLA + proof obligations with SMT solvers

Practical SAT Solving

DM841 DISCRETE OPTIMIZATION. Part 2 Heuristics. Satisfiability. Marco Chiarandini

Chapter 2 PRELIMINARIES

Motivation. CS389L: Automated Logical Reasoning. Lecture 17: SMT Solvers and the DPPL(T ) Framework. SMT solvers. The Basic Idea.

VS 3 : SMT Solvers for Program Verification

SAT, SMT and QBF Solving in a Multi-Core Environment

COMP4418 Knowledge Representation and Reasoning

OpenMath and SMT-LIB

arxiv: v2 [cs.lo] 17 Dec 2009

The Prototype Verification System PVS

Learning Techniques for Pseudo-Boolean Solving and Optimization

Encoding First Order Proofs in SMT

2 Decision Procedures for Propositional Logic

Automated Theorem Proving and Proof Checking

Combinational Equivalence Checking

CODE ANALYSIS CARPENTRY

Matching Logic. Grigore Rosu University of Illinois at Urbana-Champaign

Combining Proofs and Programs in a Dependently Typed Language. Certification of High-level and Low-level Programs

System Description of a SAT-based CSP Solver Sugar

A Certified Implementation of a Distributed Security Logic

Multi Domain Logic and its Applications to SAT

Efficient Circuit to CNF Conversion

Extending ACL2 with SMT solvers

On Computing Minimum Size Prime Implicants

Polymorphic lambda calculus Princ. of Progr. Languages (and Extended ) The University of Birmingham. c Uday Reddy

A Small Interpreted Language

A Scalable Algorithm for Minimal Unsatisfiable Core Extraction

Why3 where programs meet provers

The SMT-LIB Standard Version 2.0

System Description: iprover An Instantiation-Based Theorem Prover for First-Order Logic

EECS 219C: Computer-Aided Verification Boolean Satisfiability Solving. Sanjit A. Seshia EECS, UC Berkeley

Three Applications of Strictness in the Efficient Implementaton of Higher-Order Terms

BOOGIE. Presentation by Itsik Hefez A MODULAR REUSABLE VERIFIER FOR OBJECT-ORIENTED PROGRAMS MICROSOFT RESEARCH

Transcription:

SAT/SMT Proof Checking Verifying SAT Solver Code Future Work Computer Science, The University of Iowa, USA April 23, 2012 Seoul National University

SAT/SMT Proof Checking Verifying SAT Solver Code Future Work Satisfiability Solving Satisfiability: Is there a M such that M Φ? Satisfiability (SAT) Input is a propositional formula Satisfiability Modulo Theories (SMT) First-order formulas with equality Also w/ theories: integer/real arithmetic, array, bit-vector,... Main target: decidable fragments (usually quantifier-free) Various SMT logics: combinations of those theories SAT/SMT solvers are High-performance automated theorem provers Used in formal verification and artificial intelligence

SAT/SMT Proof Checking Verifying SAT Solver Code Future Work SAT/SMT Solver Verification Motivation Solvers are highly optimized (and large, > 50k lines for SMT) To increase the trust level of all systems that use them Approaches to Verified SAT/SMT Solving Verify the certificate from solvers: Solvers are not trusted, a trusted checker is needed SAT instance: a model that is found by the solver (easy for SAT) UNSAT instance: a refutational proof Verify the code: Prove theorems below using formal methods: solve(φ) = SAT, then M.M Φ solve(φ) = UNSAT, then M.M Φ solve(φ) terminates

SAT/SMT Proof Checking Verifying SAT Solver Code Future Work LFSC Encoding SMT Results Outline SAT/SMT Proof Checking LFSC Encoding SMT Results Verifying SAT Solver Code GURU Specification Implementation Results

SAT/SMT Proof Checking Verifying SAT Solver Code Future Work LFSC Encoding SMT Results First Approach: SAT/SMT Proof Checking Proof φ Solver Checker Y/N Challenges High-performance checking: Proofs of 100s MB and even GBs Flexibility for extending (SMT): new theories are added all the time

SAT/SMT Proof Checking Verifying SAT Solver Code Future Work LFSC Encoding SMT Results The Proposal Using a meta-language called LFSC Based on Edinburgh Logical Framework (LF) LF is a meta-logic based on type theory Formulas can be encode as types (and proofs as terms) Team @ Iowa developed an efficient LF type checker Extended with a side condition language for computations Papers: Towards an SMT Proof Format. Aaron Stump and. SMT 08 Fast and Flexible Proof Checking for SMT., Andrew Reynolds, and Aaron Stump. SMT 09 Combining a Logical Framework with an RUP Checker for SMT Proofs., and Aaron Stump. SMT 11

SAT/SMT Proof Checking Verifying SAT Solver Code Future Work LFSC Encoding SMT Results Proof Encoding Examples in LFSC Syntax declare + : int int int declare = : int int form declare and : form form form Judgement declare : form type Rules declare AndE 1 : (φ : form) (ψ : form) (P : ( (and φ ψ))) ( φ) Proof λf : form. λg : form. λp : ( (and F G)). (AndE 1 P)

SAT/SMT Proof Checking Verifying SAT Solver Code Future Work LFSC Encoding SMT Results Power of LFSC LF s Higher-Order Abstract Syntax declare forall : (int form) form declare inst : (F : int form) ( (forall F)) (y : int) ( (F y)) declare Impl-I : (F : form) (G : form) (( F) ( G)) ( (imp F G)) Side Condition Language Extension C v D v C D VS C D E resolve(c, D, v) = E Simple (LISP-like) functional programming language Side conditions are like trusted tactics Built-in integer/rational operations (GNU MP library)

SAT/SMT Proof Checking Verifying SAT Solver Code Future Work LFSC Encoding SMT Results Encoding a SMT Logic - QF IDL QF IDL - Quantifier Free Integer Difference Logic Basic SMT reasoning SMT Solvers start with CNF conversion Elimination rules for logical connectives and let-bindings 32 CNF conversion rules, resolution rule IDL theory reasoning Atomic formulas of the form: x y c (w/ variation) 15 normalization rules, contradiction, transitivity idl < : (x : int) (y : int) ( x < y) ( x y 1) idl contra : ( x x c) { c < 0 } ( false) 897 lines in LFSC (54 lines of side condition code)

SAT/SMT Proof Checking Verifying SAT Solver Code Future Work LFSC Encoding SMT Results clsat - Proof Generating SAT/SMT Solver Implemented standard SAT/SMT features (written in C++) SMT-COMP 2008 Participant Proof generation overhead: 10% Proofs are constructed in memory and pruned before output Note: proofs can be dumped without storing in memory - larger proofs and larger overhead

SAT/SMT Proof Checking Verifying SAT Solver Code Future Work LFSC Encoding SMT Results Performance Results - QF IDL Configuration Completed Timeouts Failures Time(s) clsat (solve only) 542 30-29,507.7 clsat+lfsc 539 32 1 38,833.6 1000 LFSC proof checking (s) 100 10 1 1 10 100 1000 UNSAT Benchmarks from SMT-LIB* Timeout: 1,800 sec Overall overhead: 31.6% Worst overheads: close to solving times clsat solving only (s) *Except for 50 benchmarks with unsupported language features

SAT/SMT Proof Checking Verifying SAT Solver Code Future Work LFSC Encoding SMT Results Summary: Proof Checking in LFSC Flexibility Supports SMT logics and more Adding new rules is easy and modular Performance Optimizations implemented once in LFSC Much faster checking than interactive theorem provers Trustworthiness Trusted base: encoded logic + generic LFSC checker Encoded logics are intuitive More secure than ad-hoc proof checkers

Outline SAT/SMT Proof Checking LFSC Encoding SMT Results Verifying SAT Solver Code GURU Specification Implementation Results

Second Approach: Verifying the Code (Related Works) S. Lescuyer (2008) [Coq] Classical DPLL: primitively recursive implementation N. Shankar (2011) [PVS] Modern DPLL: conflict analysis, clause learning, backjumping F. Marić (2009) [Isabelle] Modern DPLL: conflict analysis, clause learning, backjumping Also implemented the two-literal watch lists Summary Used model theoretic specification: M.M Φ Proved sound and complete Inefficient implementation at low-level

versat: a Verified SAT Solver versat: A Verified Modern SAT Solver., Aaron Stump, Corey Oliver, and Kevin Clancy. VMCAI 12 Goal: A verified real-world SAT Solver implemented modern-style DPLL with software engineerings low-level optimized using efficient data structure Verified all the way down to machine words and bits Focus on performance & productivity Statically verified to produce sound UNSAT answers SAT certificates are checked at run-time (low overhead) No completeness(termination) proof Performance is more important than guarantee of termination

The Guru Programming Language Guru is a functional programming language with: Dependent type system (for verification) Resource type system (for efficient code generation)

The Guru Programming Language Guru is a functional programming language with: Dependent type system (for verification) inductive datatypes (induction on first order variables) general recursion (reasoning about partial functions) first order logic with equality predicate (formula types) provable equality over operational semantics (call-by-value) Resource type system (for efficient code generation)

The Guru Programming Language Guru is a functional programming language with: Dependent type system (for verification) inductive datatypes (induction on first order variables) general recursion (reasoning about partial functions) first order logic with equality predicate (formula types) provable equality over operational semantics (call-by-value) Resource type system (for efficient code generation) configurable resource management policies: reference counting (default, automatic) linear typing (mutable data structures, annotations) arrays with constant time access

Specification: Soundness of UNSAT answer Statement of Unsatisfiability Model Theoretically: M.M Φ Proof Theoretically: Φ They are all equivalent for propositional logic Solver returns UNSAT when the empty clause is deduced Verification Strategy Verify the deduction steps follow the proof rules Isolate conflict analysis, where deductions are performed

Specification: Inference System (the pf type) The pf type encodes res (refutation complete) Define lit := word Define clause := <list lit> Define formula := <list clause> Inductive pf : Fun(F : formula)(c:clause).type := pf_asm : Fun(F : formula)(c:clause) (u : { (member C F eq_clause) = tt }). <pf F C> pf_res : Fun(F : formula)(c1 C2 Cr : clause)(l:lit) (d1 : <pf F C1>) (d2 : <pf F C2>) (u : { (is_resolvent Cr C1 C2 l) = tt }). <pf F Cr> <pf F C> type represents the judgement F res C A value of <pf F C> represents a proof of F res C Term constructors represents the inference rules is resolvent tests if Cr is a resolvent of C1 and C2

Specification: The type of solve function Inductive answer : Fun(F:formula).type := sat : Fun(spec F:formula).<answer F> unsat : Fun(spec F:formula)(spec p:<pf F (nil lit)>).<answer F> Define solve : Fun(F:formula).<answer F> :=... (nil lit) is the empty list of literals (the empty clause) spec (specificational) arguments are only for type checking So, proofs are not generated at run-time GURU makes sure that spec arguments are terminating and only dependent on the invariants (always computable) solve function should contain implementation and proof (internal verification)

Specification: Summary Encodes the propositional logic The trusted core of versat The rest of versat is actual implementation and proof to be checked and certified by the GURU compiler Size: 259 lines of GURU code (small and straightforward) Includes a parser for DIMACS benchmark format a trusted interpretation of string as formula 145 lines (out of 259 lines)!

Implemented Features A core set of modern features Engineering: Two-literal Watch Lists Conflict Analysis + Fast Resolution Backjumping (Non-chronological Backtracking) Heuristics: Decision Heuristics (Scoring variable activities) Clause Learning Summary: 9884 lines of GURU code (20%) and proofs (80%) Proved 247 lemmas Also, added numerous lemmas in the standard library

Efficient Representation of Clauses aclause type: array-based clause and invariants Inductive aclause : Fun(nv:word)(F:formula).type := mk_aclause : Fun(spec nv:word)(spec F:formula) (spec n:word)(l:<array lit n>) (u1:{ (array_in_bounds nv l) = tt }) (spec c:clause)(spec pf_c:<pf F c>) (u2:{ c = (to_cl l) }).<aclause nv F> keep specification simple, implementation efficient aclause stores a clause in the array array in bounds: all variable numbers are within bounds and the array is null-terminated to cl interprets a null-terminated array as a list the interpretation of array is valid in F

Conflict Analysis with Fast Resolution (1/7) C l D l l var status C D. v Not/Pos/Neg. Problem: duplicate literals Solution: a look-up table

Conflict Analysis with Fast Resolution (2/7) D 1 l 1 D 2 l 2 D n l n C l 1 l 1... l 2 l n C l 1... l n are assigned after the last decision literal C will have only one literal assigned after the last decision

Conflict Analysis with Fast Resolution (3/7) An Example Conflict: Clauses 3 4 1 4 5 2 3 4 5 Assignment Sequence: 1 d, 2, 3 d, 4, 5 = conflicting with 2 3 4 5. Analysis: C D l 2 3 4 5

Conflict Analysis with Fast Resolution (3/7) An Example Conflict: Clauses 3 4 1 4 5 2 3 4 5 Assignment Sequence: 1 d, 2, 3 d, 4, 5 = conflicting with 2 3 4 5. Analysis: C D l 2 3 4 5 1 4 5 5 2 3 4 1

Conflict Analysis with Fast Resolution (3/7) An Example Conflict: Clauses 3 4 1 4 5 2 3 4 5 Assignment Sequence: 1 d, 2, 3 d, 4, 5 = conflicting with 2 3 4 5. Analysis: C D l 2 3 4 5 1 4 5 5 2 3 4 1 3 4 4 2 3 1 Time complexity of removal depends on the length of C Literals being resolved are assigned after the last decision

Conflict Analysis with Fast Resolution (4/7) An Example Conflict: Clauses 3 4 1 4 5 2 3 4 5 Assignment Sequence: 1 d, 2, 3 d, 4, 5 = conflicting with 2 3 4 5. Analysis (Old & Improved): C D l C 1 C 2 D l 2 3 4 5 1 4 5 5 2 3 4 1 3 4 4 2 3 1 2 3 4 5

Conflict Analysis with Fast Resolution (4/7) An Example Conflict: Clauses 3 4 1 4 5 2 3 4 5 Assignment Sequence: 1 d, 2, 3 d, 4, 5 = conflicting with 2 3 4 5. Analysis (Old & Improved): C D l C 1 C 2 D l 2 3 4 5 1 4 5 5 2 3 4 1 3 4 4 2 3 1 2 3 4 5 1 4 5 5 2 1 3 4

Conflict Analysis with Fast Resolution (4/7) An Example Conflict: Clauses 3 4 1 4 5 2 3 4 5 Assignment Sequence: 1 d, 2, 3 d, 4, 5 = conflicting with 2 3 4 5. Analysis (Old & Improved): C D l C 1 C 2 D l 2 3 4 5 1 4 5 5 2 3 4 1 3 4 4 2 3 1 2 3 4 5 1 4 5 5 2 1 3 4 3 4 4 2 1 3 Time complexity of removal depends on the length of C 2

Conflict Analysis with Fast Resolution (5/7) Data Structure: For duplication removal & faster remove operation D i l i C C C l i

Conflict Analysis with Fast Resolution (6/7) Data Structure (even better): For duplication removal & constant time remove operation C D i l i C C l i C2 is not calculated at run-time & C2L tracks the length Removal is a constant time operation! At the end, C2 has only one literal (C2L = 1) At the end, C2 can be deduced

Conflict Analysis with Fast Resolution (7/7) Invariants: C D i l i C C X l i (u1:{ (all_lits_are_assigned T (append C1 C2)) = tt }) (u2:{ (cl_has_all_vars (append C1 C2) T) = tt }) (u3:{ (cl_set_at_prev_levels dl dls C1) = tt }) (u4:{ C2L = (length C2) }) (u5:{ (cl_unique C2) = tt })

Example Theorem: Clearing the Look-up Table Define cl_has_all_vars_implies_clear_vars_like_new : Forall (nv:word) (T:<array assignment nv>) (C:clause) (u:{ (cl_valid nv C) = tt }) (r:{ (cl_has_all_vars C T) = tt }).{ (clear_vars T C) = (array_new nv UN) }

Results: versat vs. State-of-the-art Solvers SAT Race 2008 Test Set 1 50 benchmarks System: Intel Xeon X5650 2.67GHz w/ 12GB of memory 900 seconds timeout for solving Systems #Solved #Timeout #Error/Wrong versat 19 31 0 picosat-936 46 4 0 minisat-2.2.0 47 3 0 Note: versat solved velev-live-sat-1.0-03 (78MB size, 224,920 variables, 3,596,474 clauses)

Results: versat vs. proof checking The Certified Track benchmarks of SAT Competition 2007 16 benchmarks (believed to be UNSAT) System: Intel Core 2 Duo 2.40GHz w/ 3GB of memory One hour timeout for solving and checking, individually Systems #Solved #Certified versat 6 6 picosat + RUP 14 4 picosat + TraceCheck 14 12 Trusted Base: versat: GURU compiler + 259 lines of GURU code checker3 (RUP checker): 1,538 lines of C code tracecheck (TraceCheck checker): 2,989 lines of C code + boolforce library (minisat-2.2.0 is 2,500 lines of C++)

Summary: Verifying a SAT Solver versat: a modern SAT solver verified in GURU UNSAT-soundness is verified statically Can solve and certify realistic formulas Comparable with the current proof checking technology Source code is available at http://cs.uiowa.edu/~duoe/ Standalone certified C code is also available

SAT/SMT Proof Checking Verifying SAT Solver Code Future Work Future Work Reusing versat code base Add features: Restarting, Preprocessing, CC Minimization Implement other tools: verified/efficient RUP proof checker Real-world software engineering and verification Verified code library for software engineering techniques low-level optimizations design patterns Software engineering for verification Lemma database: sharing and exchanging proofs Enforcing high-level design onto low-level details and implementations