Rethinking Automated Theorem Provers?

Similar documents
An Annotated Language

Satisfiability Modulo Theories. DPLL solves Satisfiability fine on some problems but not others

Deductive Methods, Bounded Model Checking

Lecture Notes on Real-world SMT

Harvard School of Engineering and Applied Sciences CS 152: Programming Languages

Leonardo de Moura and Nikolaj Bjorner Microsoft Research

Part II. Hoare Logic and Program Verification. Why specify programs? Specification and Verification. Code Verification. Why verify programs?

Decision Procedures for Equality Logic. Daniel Kroening and Ofer Strichman 1

Decision Procedures in First Order Logic

Complete Instantiation of Quantified Formulas in Satisfiability Modulo Theories. ACSys Seminar

Plan of the lecture. Quick-Sort. Partition of lists (or using extra workspace) Quick-Sort ( 10.2) Quick-Sort Tree. Partitioning arrays

Combining Static and Dynamic Contract Checking for Curry

CSC313 High Integrity Systems/CSCM13 Critical Systems. CSC313/CSCM13 Chapter 2 1/ 221

Translation validation: from Simulink to C

Specifications. Prof. Clarkson Fall Today s music: Nice to know you by Incubus

Lecture 4. First order logic is a formal notation for mathematics which involves:

Softwaretechnik. Program verification. Albert-Ludwigs-Universität Freiburg. June 28, Softwaretechnik June 28, / 24

Cover Page. The handle holds various files of this Leiden University dissertation

Playing with Vampire: the dark art of theorem proving

Basic Verification Strategy

Yices 1.0: An Efficient SMT Solver

Integration of SMT Solvers with ITPs There and Back Again

Finite Model Generation for Isabelle/HOL Using a SAT Solver

Automated Theorem Proving and Proof Checking

Random Testing in PVS

Finding Deadlocks of Event-B Models by Constraint Solving

Static program checking and verification

Intro to semantics; Small-step semantics Lecture 1 Tuesday, January 29, 2013

8. Symbolic Trajectory Evaluation, Term Rewriting. Motivation for Symbolic Trajectory Evaluation

Symbolic and Concolic Execution of Programs

Static Contract Checking for Haskell

Runtime Checking for Program Verification Systems

Lectures 20, 21: Axiomatic Semantics

Push-button verification of Files Systems via Crash Refinement

Improving Coq Propositional Reasoning Using a Lazy CNF Conversion

Rewriting for Sound and Complete Union, Intersection and Negation Types

TWO main approaches used to increase software quality

DPLL(Γ+T): a new style of reasoning for program checking

6.0 ECTS/4.5h VU Programm- und Systemverifikation ( ) June 22, 2016

Chapter 1 Programming: A General Overview

Lecture 5 - Axiomatic semantics

The Boogie Intermediate Language

Lost in translation. Leonardo de Moura Microsoft Research. how easy problems become hard due to bad encodings. Vampire Workshop 2015

The Prototype Verification System PVS

Propositional Calculus: Boolean Algebra and Simplification. CS 270: Mathematical Foundations of Computer Science Jeremy Johnson

Computation Club: Gödel s theorem

(a) (4 pts) Prove that if a and b are rational, then ab is rational. Since a and b are rational they can be written as the ratio of integers a 1

Mutual Summaries: Unifying Program Comparison Techniques

Finding and Fixing Bugs in Liquid Haskell. Anish Tondwalkar

Lambda Calculus and Type Inference

Computer-Aided Program Design

Reasoning About Imperative Programs. COS 441 Slides 10

Theorem proving. PVS theorem prover. Hoare style verification PVS. More on embeddings. What if. Abhik Roychoudhury CS 6214

Contents. Program 1. Java s Integral Types in PVS (p.4 of 37)

argo-lib: A Generic Platform for Decision Procedures

Symbolic Evaluation/Execution

CSCI.6962/4962 Software Verification Fundamental Proof Methods in Computer Science (Arkoudas and Musser) Chapter 11 p. 1/38

CSC 501 Semantics of Programming Languages

Extended Static Checking for Haskell (ESC/Haskell)

Chapter 1 Programming: A General Overview

AXIOMS OF AN IMPERATIVE LANGUAGE PARTIAL CORRECTNESS WEAK AND STRONG CONDITIONS. THE AXIOM FOR nop

Hoare triples. Floyd-Hoare Logic, Separation Logic

Decision Procedures. An Algorithmic Point of View. Decision Procedures for Propositional Logic. D. Kroening O. Strichman.

SAT-based Model Checking for C programs

Challenging Problems for Yices

SMT Solvers for Verification and Synthesis. Andrew Reynolds VTSA Summer School August 1 and 3, 2017

Satisfiability Modulo Theories: ABsolver

Advances in Programming Languages

Specifying and Verifying Programs (Part 2)

Summary of Course Coverage

Programming Languages Third Edition

GNATprove a Spark2014 verifying compiler Florian Schanda, Altran UK

6. Hoare Logic and Weakest Preconditions

correlated to the Michigan High School Content Expectations Geometry

Programming Languages

Numerical Computations and Formal Methods

Encoding functions in FOL. Where we re at. Ideally, would like an IF stmt. Another example. Solution 1: write your own IF. No built-in IF in Simplify

Lecture Notes 15 Number systems and logic CSS Data Structures and Object-Oriented Programming Professor Clark F. Olson

Generating Small Countermodels. Andrew Reynolds Intel August 30, 2012

MATH 139 W12 Review 1 Checklist 1. Exam Checklist. 1. Introduction to Predicates and Quantified Statements (chapters ).

Advances in Programming Languages

Testing, Debugging, and Verification

Reading Assignment. Symbolic Evaluation/Execution. Move from Dynamic Analysis to Static Analysis. Move from Dynamic Analysis to Static Analysis

Program verification. Generalities about software Verification Model Checking. September 20, 2016

SMT-LIB for HOL. Daniel Kroening Philipp Rümmer Georg Weissenbacher Oxford University Computing Laboratory. ITP Workshop MSR Cambridge 25 August 2009

Lee Pike. June 3, 2005

Solving Quantified Verification Conditions using Satisfiability Modulo Theories

A short manual for the tool Accumulator

Reasoning about programs. Chapter 9 of Thompson

Chapter 1. Introduction

Geometry Note-Sheet Overview

Applications of Logic in Software Engineering. CS402, Spring 2016 Shin Yoo

Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of.

Lecture 1 Contracts : Principles of Imperative Computation (Fall 2018) Frank Pfenning

Automatic Software Verification

COMP80 Lambda Calculus Programming Languages Slides Courtesy of Prof. Sam Guyer Tufts University Computer Science History Big ideas Examples:

Outline Introduction The Spec# language Running Spec# Tutorials on Spec# Carl Leonardsson 2/

CSE 20 DISCRETE MATH. Fall

Alive: Provably Correct InstCombine Optimizations

What is a Computer Algorithm?

Transcription:

Rethinking Automated Theorem Provers? David J. Pearce School of Engineering and Computer Science Victoria University of Wellington @WhileyDave http://whiley.org http://github.com/whiley

Background

Verification: A Challenge for Computer Science A verifying compiler uses automated mathematical and logical reasoning methods to check the correctness of the programs that it compiles Hoare 03

Verification: A Long History In order that the man who checks may not have too difficult a task the programmer should make a number of definite assertions which can be checked individually, and from which the correctness of the whole program easily follows. Turing 49

Whiley

Overview: What is Whiley? function max(int x, int y) -> (int z) // result must be one of the arguments ensures x == z y == z // result must be greater-or-equal than arguments ensures x <= z && y <= z:... A language designed specifically to simplify verifying software Several trade offs e.g. performance for verifiability - Unbounded Arithmetic, value semantics, etc Goal: to statically verify functions meet their specifications

Example: max(int[]) // Returns index of largest item in array function max(int[] items) -> (int r)

Diagram!

Verification Condition Generation

Verification Condition Generation function abs(int x) -> (int r) // Either x or its negation returned ensures (r == x) (r == -x) // return value cannot be negative ensures r >= 0: // if x >= 0: return x else: return -x To verify above function, compiler generates verification conditions (roughly, first-order logic formulas)

Verification Condition Generation

Automated Theorem Proving

Automated Theorem Proving These [decision] procedures have to be highly efficient, since the problems they solve are inherently hard. Kroenig and Strichman Automatic theorem provers (ATPs) based on the resolution principle... have reached a high degree of sophistication. They can often find long proofs even for problems having thousands of axioms Benzmuller et al. Automated Theorem Provers are a dark art just use Z3!

Theorem Prover Design Decisions Programmer Compiler Theorem Prover Theorem prover typically viewed as hidden out the back But, theorem prover is inevitably part of user interface...... and should be given first-class status

Theorem Prover as a User Interface???? When verification succeeds, all is well! But, when verification inevitably fails...... programmer must figure out what happened

Implications What are the implications of this? Represent formulae in CNF? NO! Use DPLL as main loop? NO! Use SMT-LIB as input language? NO!... Am I Mad?

Whiley Automated Theorem Prover (WyTP)

Theorem Proving: Assertion Language Whiley compiler emits verification conditions in assertion language define abs_ensures_0(int x, int r) is: (r == x) (r == -x) assert "postcondition not satisfied": forall(int x): if: x >= 0 then: abs_ensures_0(x, x) Verification conditions from abs() example shown above In principle, can hook up different automatic theorem provers

Theorem Proving: More Assertion Language Assertions are paths through function... and should reflect that... int i = 0 while i < xs where all { k in 0.. i xs[k]!= x }:... Example assertion generated from above: define indexof_loopinvariant_50(int[] xs, int x, int i) is: forall(int k).(((0 <= k) && (k < i)) ==> (xs[k]!= x)) assert "loop invariant does not hold on entry": forall(int x, int i, int[] xs): if: i == 0 then: indexof_loopinvariant_50(xs, x, i)

Theorem Proving: Proofs (1) (int x).(x 0 x < 0) (2) x 1 < 0 x 1 0 ( -elimination, 1) (3) x 1 0 ( -elimination, 2) (4) x 1 < 0 ( -elimination, 2) (5) 0 < 0 ( -closure, 3 + 4) (6) (simplification, 5) Purpose-built Automated Theorem Prover developed Focus on simplicity rather than scale For example, not based on DPLL

Theorem Proving: Simplification Largest group of proof rules are for simplification: Logical. For example, f f = f, true f = f, etc. Arithmetic. For example, 1+1 = 2, (2*x)-x = x, etc. Also, x < x = 0 < 0, etc. Arrays. For example, [x,y,z][0] = x, [x,y,z] = 3, and [e;n] = n, etc. Records. For example, {x:1,y:2}.y = 2, {x:1,y:2}[x:=3] = {x:3,y:2}, etc.

Theorem Proving: -Elimination (1) (int x).((x = 0 x > 0) x < 0) (2) (x 1 = 0 x 1 > 0) x 1 < 0 ( -elimination, 1) (3) (x 1 = 0 x 1 > 0) ( -elimination, 2) (4) x 1 < 0 ( -elimination, 2) (5) x 1 = 0 ( -elimination, 2) (6) x 1 < x 1 (congruence, 4 + 5) (7) (simplification, 6) (8) x 1 > 0 ( -elimination, 2) (9) 0 < 0 ( -closure, 4 + 8) (10) (simplification, 5)

Theorem Proving: Axioms Should the following hold or not? assert: forall(int i, int[] xs): if: xs[i] > 0 then: (i >= 0) Arithmetic. If x/y is defined, then y 0. Arrays (a). If xs[i] is defined, then 0 i < xs Arrays (b). If xs = [e; n] is defined, then 0 n and xs = n. Arrays (b). If xs has array type, then xs > 0. Functions. If f(x) f(y) then x y.

Theorem Proving: Case Analysis How to prove the following? assert: forall(int i, int[] xs): if: xs == [1,2,3] then: xs[i] >= 0 Above reduces to showing [1,2,3][i] < 0 is contradiction Apply case analysis with (i==0) (i==1) (i==2)

Theorem Proving: Quantifier Instantiation How to prove the following? assert: forall(int[] xs): if: forall(int i).(xs[i] > 0) then: forall(int j).(xs[j] >= 0) What is meaning of (int i).(xs[i] > 0)? Equivalent to infinite conjunction! We just need to pick the right conjunct...

Theorem Proving: Proof Optimisation (1) (inti). ( (i 0) (i == 0) (i 0) ) (2) (i 1 < 0) (i 1 == 0) (i 1 > 0) ( -elimination, 1) (3) i 1 < 0 ( -elimination, 2) (4) i 1 == 0 ( -elimination, 2) (5) i 1 > 0 ( -elimination, 2) (6) 0 < 0 (congruence, 3+4) (7) (simplification, 6) Full Proof. Reflects work done searching proof space by automated theorem prover. Pruned Proof. For easier reading, should eliminate unused facts which were explored.

Q) how big are these proofs?

Theorem Proving: Data Set Whiley Compiler has (approx) 540 valid and 287 invalid test cases Each test case is single Whiley file (either correct or not) From this, generated 1998 valid assertions and 91 invalid assertions

Theorem Proving: Experimental Results I

Theorem Proving: Experimental Results II

Theorem Proving: Experimental Results III

Q) What Causes a Large Proof?

Theorem Proving: Experimental Results IV

Theorem Proving: Experimental Results V

Theorem Proving: Experimental Results VI

Theorem Proving: Counterexample Generation? Most bugs have small counter examples -Jackson 06

Theorem Proving: Counterexample Generation Approach. Use brute force generation with a small world (e.g. integers in range 5... 5, array lengths 0... 2, etc). forall(int i, int[] arr): (arr[i] >= 0) ==> (i == arr ) Example. For above, generate models i=0,arr=[], i=0,arr=[0], i=1,arr=[0], etc. Problems. E.g. uninterpreted functions and any,!int, etc. And, what about undefined behaviour? forall(int[] xs): xs[0] > 0

Theorem Proving: Counterexample Generation Test Counterexample test_11 i=1, x=[0], i 1 =1, i 2 =1 test_102 xs=[0], y=0, x=1 test_129 x 1 ={f:-1}, x={f:1} test_198 r 1 =[0], r=[0,0], i=0, i 1 =1, ls=[0,0] Generated counterexamples for 75 / 91 invalid assertions!

Conclusion Should we optimise theorem provers for BIG problems? OR Should we make them more user friendly?

http://whiley.org @WhileyDave http://github.com/whiley

Theorem Proving: (SWEN224 Dataset) Experimental Results I

Theorem Proving: (SWEN224 Dataset) Experimental Results II

Theorem Proving: (SWEN224 Dataset) Experimental Results III