Provably Correct Software

Max Schäfer
Institute of Information Science, Academia Sinica
September 17, 2007

The Need for Provably Correct Software
- BUT bugs are annoying, embarrassing, and cost gazillions of $/TWD/€ every year
- there are many approaches to developing less buggy software
- traditionally: code review, testing, etc.
- E. W. Dijkstra: "Program testing can be used to show the presence of bugs, but never to show their absence." (EWD 249)
- we are interested in provably correct software, i.e. software
  1. that has a precise (mathematical) specification
  2. that provably (machine-checkably) fulfills it

How to get there
- in order to specify program behavior and reason about programs, we need a mathematical model of the language
- thus we either need to
  - work in a language that is amenable to mathematical treatment (functional languages, interactive theorem provers), or
  - work harder to construct a model for a (simplification of a) real-world language, then integrate specifications and proofs with programs

Outline
We will present the following systems:
- Isabelle/HOL: SML-inspired programming language, about which propositions can be proved (based on LF)
- Agda: practical dependently typed programming language (based on UTT)
- Coq: integration of dependently typed programming language and theorem prover (based on PCIC)
- Proof Carrying Code
- Why/Caduceus: verification of imperative programs
Much of the material is based on lecture notes of the TYPES Summer School 2007 (see http://typessummerschool07.cs.unibo.it).

Outline
1. Interactive Theorem Provers
   - Verification of Functional Programs
   - Dependent Types and Inductive Families
2. Proof Carrying Code
3. Verification of Imperative Programs

Functional Programming in One Slide
- the interactive theorem provers we discuss all have an internal functional language
- functions defined in the language should behave like mathematical functions
  - no state, no assignable variables
  - details of program evaluation should matter as little as possible
- in strongly typed languages (which we exclusively consider here), datatypes are used to ensure that functions are only invoked with meaningful arguments
- new datatypes can be defined inductively
For the moment, we use the language of Isabelle/HOL.

Isabelle
- Isabelle is a generic proof assistant developed by L. C. Paulson (Cambridge) and T. Nipkow (München) since the late '80s
- it is based on the logical framework approach, with a lean metalogic in which object logics can be implemented
- inference trees of the object logic are represented as terms of the metalogic; correct application of the rules is ensured by type checking the terms of the metalogic
- the most used object logic is Isabelle/HOL, which implements higher-order logic
- from the Isabelle homepage: "The main application is the formalization of mathematical proofs and in particular formal verification, which includes proving the correctness of computer hardware or software and proving properties of computer languages and protocols."
Homepage: http://isabelle.in.tum.de/

Inductive Datatypes
- datatype of booleans:
      datatype bool = False | True
  this tells us:
  1. False and True have type bool
  2. everything that has type bool is either False or True
  3. False and True are different
- datatype of natural numbers:
      datatype nat = Zero | Suc nat
  this type is recursive; we have:
  1. Zero is of type nat, Suc is of type nat => nat
  2. elements of type nat are Zero, (Suc Zero), (Suc (Suc Zero)), etc.; but every one is either Zero or of the form (Suc x), where x itself is also of type nat
  3. Zero is not equal to any (Suc x); if (Suc x) equals (Suc y), then x equals y
- for convenience, we can use numerals (0 := Zero, 1 := Suc Zero, ...)
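
The three properties of nat listed above can be mimicked in an ordinary language. The following is a minimal Python sketch, not Isabelle code: the class names Zero and Suc follow the slide, while the field name pred and the helpers to_int and from_int are assumptions made for illustration. Python does not enforce that these are the only constructors; the sketch only shows that distinct constructors compare unequal and that Suc is injective.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Zero:
    """The constructor Zero of type nat."""


@dataclass(frozen=True)
class Suc:
    """The constructor Suc, taking a nat and yielding a nat."""
    pred: "Nat"  # hypothetical field name for the argument of Suc


# By convention, every Nat is either Zero() or Suc(n) for another Nat n.
Nat = Union[Zero, Suc]


def to_int(n: Nat) -> int:
    """Read a Peano numeral back as a Python int (the numeral shorthand)."""
    count = 0
    while isinstance(n, Suc):
        count += 1
        n = n.pred
    return count


def from_int(k: int) -> Nat:
    """Build the Peano numeral for a non-negative Python int."""
    if k < 0:
        raise ValueError("natural numbers are non-negative")
    n: Nat = Zero()
    for _ in range(k):
        n = Suc(n)
    return n


two = Suc(Suc(Zero()))
```

Because the dataclasses are compared structurally, Zero() never equals a Suc value, and two Suc values are equal exactly when their arguments are.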

Polymorphic Types
- datatype of polymorphic lists:
      datatype 'a list = Nil | Cons 'a 'a list
- list itself is not a type; it needs to be instantiated with a concrete 'a; for example, bool list and nat list are (different) types
- elements of bool list are Nil, Cons True Nil, Cons False (Cons False Nil), etc.
- elements of nat list are Nil, Cons (Suc (Suc Zero)) (Cons (Suc Zero) Nil), etc.
- we cannot form a list like Cons True (Cons Zero Nil)
- Nil is ambiguous, it could be either Nil::bool list or Nil::nat list; most of the time, the compiler can figure it out

Functions on Lists
- functions on inductive datatypes can be defined by pattern matching
- example: appending two lists
      consts app :: 'a list => 'a list => 'a list
      primrec
        app Nil ys = ys
        app (Cons x xs) ys = Cons x (app xs ys)
- reversing a list
      consts rev :: 'a list => 'a list
      primrec
        rev Nil = Nil
        rev (Cons x xs) = app (rev xs) (Cons x Nil)

How do we know that they are correct?
- we can now formulate and prove statements about the functions
- structural induction on lists: a property P about lists can be proved by showing that
  1. P holds on Nil
  2. if P holds on xs, then it holds on Cons x xs
- corresponding induction schemata are automatically derived for every inductive datatype
- for example: we can prove that reversing a list twice yields the original list
- in Isabelle:
      theorem rev_rev: rev (rev xs) = xs
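
To make the statement rev (rev xs) = xs concrete, here is a small Python sketch that transcribes app and rev and then checks the property on every boolean list up to a given length. The encoding of lists as nested tuples and the helper names cons, from_python and rev_rev_holds_up_to are assumptions of this sketch. Note that this is exhaustive testing on small inputs, not a proof: in Dijkstra's sense it can only reveal a counterexample, whereas the induction proof in Isabelle covers all lists.

```python
from itertools import product

# Cons-lists as nested tuples: NIL is the empty list, (x, xs) stands for Cons x xs.
NIL = None


def cons(x, xs):
    """Cons x xs."""
    return (x, xs)


def app(xs, ys):
    """Append two cons-lists, following the two primrec equations."""
    if xs is NIL:
        return ys
    x, rest = xs
    return cons(x, app(rest, ys))


def rev(xs):
    """Reverse a cons-list, following the two primrec equations."""
    if xs is NIL:
        return NIL
    x, rest = xs
    return app(rev(rest), cons(x, NIL))


def from_python(items):
    """Convert a Python sequence into a cons-list."""
    result = NIL
    for item in reversed(items):
        result = cons(item, result)
    return result


def rev_rev_holds_up_to(max_len):
    """Check rev (rev xs) = xs for all boolean lists of length <= max_len."""
    for length in range(max_len + 1):
        for items in product([False, True], repeat=length):
            xs = from_python(items)
            if rev(rev(xs)) != xs:
                return False
    return True
```

Calling rev_rev_holds_up_to(6) inspects 127 lists; it says nothing about longer lists or other element types.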

A Word about Termination
- it is very hard to reason about non-terminating functions
- e.g., if we could define f(x) = f(x) + 1, then
      0 = 0 + f(x) - f(x) = 0 + f(x) + 1 - f(x) = 1
- hence, in Isabelle (and most other proof assistants) only terminating functions can be defined
- two ways to achieve this:
  1. only use restricted recursion schemata (like primitive recursion)
  2. provide explicit termination proofs
- both are possible in Isabelle
- thus, Isabelle (like most theorem provers) is not Turing complete!
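
For contrast, a general-purpose language accepts the problematic definition without complaint and only fails when the function is called. The Python sketch below illustrates this; the helper name hits_recursion_limit is an assumption, and hitting the recursion limit is only a symptom, since a deep but terminating recursion would trigger it as well.

```python
def f(x: int) -> int:
    # Python accepts this definition; nothing checks that the recursion ends.
    return f(x) + 1


def hits_recursion_limit(func, arg) -> bool:
    """Return True if calling func(arg) exhausts Python's recursion limit."""
    try:
        func(arg)
    except RecursionError:
        return True
    return False
```

A proof assistant rejects the definition itself (or demands a termination proof), so the inconsistent equation f(x) = f(x) + 1 never enters the logic.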

Who are we trusting?
- the proofs are done directly on the source code, no need for translation to pseudo code
- we need not trust our code or our understanding of it
- the proofs are checked by Isabelle; we need to trust the Isabelle kernel
- we do not need to trust the people writing tactics!
Slogan: Make the amount of code that needs to be trusted as small as possible.

Integrating Proofs and Programs
- programs and proofs about them should not be separated
- look at the type of app in Isabelle:
      app :: 'a list => 'a list => 'a list
- it guarantees that, when given two lists, it will return a list
- this is not strong enough to convince us of its correctness
- we would like to know that if we have lists xs and ys of length m and n, then
  - app xs ys is a list of length m + n
  - for any 0 ≤ i < m, xs[i] = (app xs ys)[i]
  - for any 0 ≤ i < n, ys[i] = (app xs ys)[m+i]
- we need a stronger type system, in which types can depend on data
- one language that makes this possible is Agda
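
The three desired properties of app can be written down as an executable predicate. The Python sketch below does this for built-in lists; the name satisfies_app_spec and the stand-in implementation of app are assumptions of this sketch. It checks one result at a time at run time, which is exactly the weakness the slide points at: a dependently typed signature would establish the length property for all inputs at type-checking time.

```python
def app(xs: list, ys: list) -> list:
    """Stand-in implementation under test (uses Python's list concatenation)."""
    return xs + ys


def satisfies_app_spec(xs: list, ys: list, result: list) -> bool:
    """Check the three properties from the slide for one concrete result."""
    m, n = len(xs), len(ys)
    return (
        len(result) == m + n                                # length is m + n
        and all(xs[i] == result[i] for i in range(m))       # prefix is xs
        and all(ys[i] == result[m + i] for i in range(n))   # suffix is ys
    )
```

For example, satisfies_app_spec([1, 2], [3], app([1, 2], [3])) holds, while a result with the wrong length or a swapped prefix is rejected.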

Agda
- Agda is a theorem prover/programming language developed at Chalmers
- the first version was written by Catarina Coquand in the '90s
- Agda2 is a complete reimplementation, mostly by Ulf Norell
- it is based on Luo's UTT (Unifying Theory of dependent Types), an extension of the Calculus of Constructions
- the most important concepts are dependent types, universes, and inductive families
- syntax is similar to Haskell, with sophisticated pattern matching
Homepage: http://www.cs.chalmers.se/~ulfn/agda/

Inductive Families
- the type of sized lists in Agda:
      data list {A : Set} : nat -> Set where
        []   : list {A} 0
        _::_ : {n : nat} -> A -> list {A} n -> list {A} (S n)
- we now have true :: false :: [] : list {bool} 2 (observe the infix notation of ::)
- safe head function:
      head : {A : Set} {n : nat} -> list {A} (S n) -> A
      head (x :: _) = x
- note that Agda's pattern matching mechanism figures out that the list argument cannot be empty!

The Append Function
- we can define the append function _++_ so that its type immediately shows the effect on lengths:
      _++_ : {A : Set} {m n : nat} -> list {A} m -> list {A} n -> list {A} (m + n)
      [] ++ ys = ys
      (x :: xs) ++ ys = x :: (xs ++ ys)
- in order to prove that it does what it should, we need a function to index lists, preferably with a syntax like l[S zero]
- can you implement the following function?
      _[_] : {A : Set} {n : nat} -> list {A} n -> nat -> A

Safe Indexing
- we need to ensure that the index is smaller than n
- solution: define, for every n : nat, the type below n of natural numbers smaller than n
      data below : nat -> Set where
        bzero : {n : nat} -> below (S n)
        bsuc  : {n : nat} -> below n -> below (S n)
  (sadly we cannot reuse the usual notation for natural numbers)
- now we can define
      _[_] : {A : Set} {n : nat} -> list {A} n -> below n -> A
  and proceed to prove our implementation of _++_ correct
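
One way to see why below n contains exactly the valid indices is to enumerate its inhabitants. The Python sketch below builds them as nested tuples that mirror the constructors bzero and bsuc; the functions below, to_index and lookup are hypothetical names for this illustration. Unlike Agda, Python cannot tie the index to the list length in the type, so lookup still relies on the caller passing a term from below(len(xs)).

```python
def below(n: int) -> list:
    """Enumerate the inhabitants of `below n` as nested tuples."""
    if n == 0:
        return []  # no constructor produces a value of type below 0
    smaller = below(n - 1)
    # bzero : below (S n), and bsuc lifts each inhabitant of below n.
    return [("bzero",)] + [("bsuc", b) for b in smaller]


def to_index(b) -> int:
    """Count the bsuc constructors, giving the number the term denotes."""
    index = 0
    while b[0] == "bsuc":
        index += 1
        b = b[1]
    return index


def lookup(xs: list, b):
    """Index a list with a below-term instead of a raw integer."""
    return xs[to_index(b)]
```

The enumeration for n has exactly n elements, denoting 0 through n - 1, so an index of type below n can never point past the end of a list of length n.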

Function Definitions in Agda
- function definitions look similar to Haskell, always done by pattern matching
- functions have to be explicitly annotated with their type
- in a dependently typed setting it is not generally possible to infer types without annotations
- type checking is also quite hard; sometimes with unexpected results: for Agda, the types list {A} (x+y) and list {A} (y+x) are different, although they have the same inhabitants
- function definitions for which Agda cannot ensure termination are still accepted, but marked by the editor

Advanced Dependent Types
- dependent types in conjunction with universes are extremely powerful; very few primitive concepts are needed
- example: definition of the dependent sum type
      data Σ {A : Set} (P : A -> Set) : Set where
        exist : {x : A} -> P x -> Σ P
- an inhabitant of this type is a pair (t, M), where t : A and M : P t; for example, Σ list is a type for lists of any length
- seen from a logical perspective, this is an implementation of the (constructive) existential quantifier

Advanced Dependent Types (cont.)
- example: identity type
      data _==_ {A : Set} : A -> A -> Set where
        refl : (x : A) -> x == x
- the only way to obtain an element of this type is through refl, hence if we have an element of s == t, s and t must in fact be equal
- Agda's pattern matching can exploit this fact:
      subst : {A : Set} (C : A -> Set) (x y : A) -> x == y -> C x -> C y
      subst C .x .x (refl x) cx = cx

Comparison: Agda vs. Isabelle/HOL
- different goals: Isabelle/HOL is mainly a proof assistant, Agda is mainly a programming language
- Agda does not have lemmas, tactics, etc. (it can still be used as a proof assistant, however)
- Agda has type universes (like Set), which Isabelle/HOL lacks
- underlying concepts are quite similar

Throwing Everything Together: Coq
- Coq unifies the programming language approach and the theorem prover approach
- started as an implementation of a type checker for the pure Calculus of Constructions of Coquand and Huet in the early '80s
- recent versions are based on the Predicative Calculus of Inductive Constructions
- it can be used to formulate and prove mathematical results, similar to Isabelle
- inductive families and matching like in Agda are also available (not quite so sophisticated)
- tactic language similar to Isabelle/HOL, with user-definable tactics
- large library of predefined datatypes, functions, and results about them
- functions written in Coq can be extracted to OCaml, Haskell, or Scheme for efficient execution (this has been done for fairly large programs: the CompCert project)
Homepage: http://coq.inria.fr

Writing Certified Programs in Coq
- defining divisibility in Coq:
      Definition divides (d m:nat) := exists k, m = k*d.
- greatest common divisor:
      Definition is_gcd (m n d:nat) :=
        divides d m /\ divides d n /\
        (forall d', divides d' m -> divides d' n -> d' <= d).
- we want a function like the following:
      Definition gcd (m n:nat) : {d:nat | is_gcd m n d}.
- later, we can extract from it an OCaml function gcd : nat -> nat -> nat without the explicit proofs
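
To get a feel for what the specification demands, the following Python sketch transcribes divides and is_gcd as executable checks and pairs them with a Euclidean gcd. It is not extracted code: the Euclidean algorithm and the bounded search in is_gcd are choices of this sketch. The bound is sound because, when m and n are not both zero, every common divisor is at most max(m, n). For m = n = 0 every number is a common divisor, so the specification as written on the slide has no witness there and the check returns False.

```python
def divides(d: int, m: int) -> bool:
    """exists k, m = k * d, for natural numbers d and m."""
    return (m == 0) if d == 0 else (m % d == 0)


def is_gcd(m: int, n: int, d: int) -> bool:
    """d divides m and n, and every common divisor d' satisfies d' <= d."""
    if m == 0 and n == 0:
        return False  # every d' divides 0, so no d is an upper bound
    bound = max(m, n)  # common divisors cannot exceed this when (m, n) != (0, 0)
    return (
        divides(d, m)
        and divides(d, n)
        and all(
            d2 <= d
            for d2 in range(bound + 1)
            if divides(d2, m) and divides(d2, n)
        )
    )


def gcd(m: int, n: int) -> int:
    """Euclid's algorithm; plays the role of the proof-free extracted function."""
    while n != 0:
        m, n = n, m % n
    return m
```

Here is_gcd(12, 18, gcd(12, 18)) holds, while is_gcd(12, 18, 3) fails because 6 is a larger common divisor. In Coq the proof that the result satisfies is_gcd is part of the function's value and is erased on extraction; in the sketch it is merely re-checked per call.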

Comparison: Agda, Coq
- the technical underpinnings are very similar (at least from a user's perspective)
- Agda is missing Coq's theorem prover features, and it does not have as large a codebase as Coq
- but this also means it has less historical ballast...
- the real selling point for Coq is program extraction

Proof Carrying Code
See the slides by David Pichardie and Benjamin Grégoire on the summer school website.

Why/Caduceus
- Why is a tool for verifying imperative programs
- it is based on a simple ML-like imperative language (also called Why), which is annotated with conditions and invariants formulated in weakest-precondition (wp) style
- the Why compiler generates verification conditions, which can be solved by an automatic prover or in an interactive proof environment (like Coq or Isabelle/HOL)
- Caduceus translates annotated C programs into Why
- for Java, there is a similar tool called Krakatoa
The focus is on verification of real-world programs with as much automation as possible.

Further Explanation and Examples
For further explanations and examples see the slides by Jean-Christophe Filliâtre on the summer school website.