An experiment with variable binding, denotational semantics, and logical relations in Coq. Adam Chlipala University of California, Berkeley

Similar documents
Adam Chlipala University of California, Berkeley PLDI 2007

Adam Chlipala University of California, Berkeley ICFP 2006

Proof Pearl: Substitution Revisited, Again

Autosubst Manual. May 20, 2016

c constructor P, Q terms used as propositions G, H hypotheses scope identifier for a notation scope M, module identifiers t, u arbitrary terms

Coq quick reference. Category Example Description. Inductive type with instances defined by constructors, including y of type Y. Inductive X : Univ :=

Mutable References. Chapter 1

Equations: a tool for dependent pattern-matching

Introduction to dependent types in Coq

Introduction to Coq Proof Assistant

Inductive Definitions, continued

Urmas Tamm Tartu 2013

Formal Semantics. Prof. Clarkson Fall Today s music: Down to Earth by Peter Gabriel from the WALL-E soundtrack

Comp 411 Principles of Programming Languages Lecture 7 Meta-interpreters. Corky Cartwright January 26, 2018

Embedding logics in Dedukti

Beluga: A Framework for Programming and Reasoning with Deductive Systems (System Description)

Intrinsically Typed Reflection of a Gallina Subset Supporting Dependent Types for Non-structural Recursion of Coq

CIS 1.5 Course Objectives. a. Understand the concept of a program (i.e., a computer following a series of instructions)

Static and User-Extensible Proof Checking. Antonis Stampoulis Zhong Shao Yale University POPL 2012

A Simple Supercompiler Formally Verified in Coq

Analysis of dependent types in Coq through the deletion of the largest node of a binary search tree

the application rule M : x:a: B N : A M N : (x:a: B) N and the reduction rule (x: A: B) N! Bfx := Ng. Their algorithm is not fully satisfactory in the

Calculus of Inductive Constructions

Functional programming languages

Assistant for Language Theory. SASyLF: An Educational Proof. Corporation. Microsoft. Key Shin. Workshop on Mechanizing Metatheory

Foundations of AI. 9. Predicate Logic. Syntax and Semantics, Normal Forms, Herbrand Expansion, Resolution

Programming Languages Fall 2014

An Extensible Programming Language for Verified Systems Software. Adam Chlipala MIT CSAIL WG 2.16 meeting, 2012

Lambda Calculus. Type Systems, Lectures 3. Jevgeni Kabanov Tartu,

Œuf: Minimizing the Coq Extraction TCB

Chapter 1. Introduction

Lambda Calculus: Implementation Techniques and a Proof. COS 441 Slides 15

Coq. LASER 2011 Summerschool Elba Island, Italy. Christine Paulin-Mohring

Formalization of Array Combinators and their Fusion Rules in Coq

CS 456 (Fall 2018) Scribe Notes: 2

Compilers: The goal. Safe. e : ASM

Meta-programming with Names and Necessity p.1

Mathematics for Computer Scientists 2 (G52MC2)

Programming Languages, Summary CSC419; Odelia Schwartz

A Certified Implementation of a Distributed Security Logic

From Math to Machine A formal derivation of an executable Krivine Machine Wouter Swierstra Brouwer Seminar

3.4 Deduction and Evaluation: Tools Conditional-Equational Logic

Formal Systems and their Applications

The Substitution Model

CS 242. Fundamentals. Reading: See last slide

G Programming Languages - Fall 2012

Coq projects for type theory 2018

Work-in-progress: "Verifying" the Glasgow Haskell Compiler Core language

Basic Foundations of Isabelle/HOL

Theorem proving. PVS theorem prover. Hoare style verification PVS. More on embeddings. What if. Abhik Roychoudhury CS 6214

HOL DEFINING HIGHER ORDER LOGIC LAST TIME ON HOL CONTENT. Slide 3. Slide 1. Slide 4. Slide 2 WHAT IS HIGHER ORDER LOGIC? 2 LAST TIME ON HOL 1

HIGHER-ORDER ABSTRACT SYNTAX IN TYPE THEORY

A Pronominal Account of Binding and Computation

CS152: Programming Languages. Lecture 11 STLC Extensions and Related Topics. Dan Grossman Spring 2011

Logical Verification Course Notes. Femke van Raamsdonk Vrije Universiteit Amsterdam

Lecture 15 CIS 341: COMPILERS

Lecture Notes on Data Representation

Lecture 13: Subtyping

The design of a programming language for provably correct programs: success and failure

INF 102 CONCEPTS OF PROG. LANGS FUNCTIONAL COMPOSITION. Instructors: James Jones Copyright Instructors.

A formal derivation of an executable Krivine machine

Lecture 5: The Untyped λ-calculus

Foundations. Yu Zhang. Acknowledgement: modified from Stanford CS242

More Untyped Lambda Calculus & Simply Typed Lambda Calculus

Last class. CS Principles of Programming Languages. Introduction. Outline

Eta-equivalence in Core Dependent Haskell

From natural numbers to the lambda calculus

Ideas over terms generalization in Coq

Programming Language Features. CMSC 330: Organization of Programming Languages. Turing Completeness. Turing Machine.

CMSC 330: Organization of Programming Languages

From the λ-calculus to Functional Programming Drew McDermott Posted

1. true / false By a compiler we mean a program that translates to code that will run natively on some machine.

CHAPTER ONE OVERVIEW. 1.1 Continuation-passing style

Types and Programming Languages. Lecture 6. Normalization, references, and exceptions

Short Notes of CS201

Work-in-progress: Verifying the Glasgow Haskell Compiler Core language

CSE 505: Programming Languages. Lecture 10 Types. Zach Tatlock Winter 2015

CMSC 330: Organization of Programming Languages. Operational Semantics

CMSC 330: Organization of Programming Languages. Formal Semantics of a Prog. Lang. Specifying Syntax, Semantics

CS201 - Introduction to Programming Glossary By

LOGIC AND DISCRETE MATHEMATICS

Symmetry in Type Theory

Denotational Semantics. Domain Theory

J. Barkley Rosser, 81, a professor emeritus of mathematics and computer science at the University of Wisconsin who had served in government, died

INF 212 ANALYSIS OF PROG. LANGS FUNCTION COMPOSITION. Instructors: Crista Lopes Copyright Instructors.

CS 11 Haskell track: lecture 1

Programming Language Processor Theory

Exploiting Uniformity in Substitution: the Nuprl Term Model. Abhishek Anand

Normalization by hereditary substitutions

Chapter 2 The Language PCF

Certified Programming with Dependent Types

Programming type-safe transformations using higher-order abstract syntax

The Substitution Model. Nate Foster Spring 2018

A NEW PROOF-ASSISTANT THAT REVISITS HOMOTOPY TYPE THEORY THE THEORETICAL FOUNDATIONS OF COQ USING NICOLAS TABAREAU

Lectures 20, 21: Axiomatic Semantics

Programming Languages Third Edition

Types Summer School Gothenburg Sweden August Dogma oftype Theory. Everything has a type

CS 4110 Programming Languages & Logics. Lecture 28 Recursive Types

Induction in Coq. Nate Foster Spring 2018

COMP 1130 Lambda Calculus. based on slides by Jeff Foster, U Maryland

Transcription:

A Certified TypePreserving Compiler from Lambda Calculus to Assembly Language An experiment with variable binding, denotational semantics, and logical relations in Coq Adam Chlipala University of California, Berkeley 1

The Big Picture Source Program Idealized assembly language with abstract, typedirected garbage collector Simply-typed lambda calculus Transformations: CPS conversion, closure conversion, explicit heap allocation, register allocation,... Intermediate Program I... Intermediate Program n Compiler Target Program Implemented in Coq Certifying compilation: Certified compiler: Source target For any valid and input, theprograms compiler are produces anobservationally observationallyequivalent. equivalent output. Theorem proved in Coq 2

Type-Preserving Compilation Preserve static type information in some prefix of the compilation process. Taken all the way, you end up with typed assembly language, proof-carrying code, etc.. More modestly, implement nearly tagfree garbage collection. Replace tag bits, boxing, etc., with static tables mapping registers to types. Used in the MLton SML compiler. 3

What's tricky? Nested variable scopes Relational reasoning Proof management and automation This is what the POPLmark Challenge is all about! 4

Design Decision #1: Dependently-Typed ASTs Compiler Input Program Output Program Typing Derivation Typing Derivation Use dependent types to make the compiler typepreserving by construction! Type Preservation Theorem. If the input program has type T, then the output program has type C(T). Semantics Preservation Theorem. If the input program has meaning M, then the output program has meaning C(M). 5

Design Decision #2: Denotational Semantics Compiler Input Program Output Program Denotational Semantics Version: Operational Semantics Version: 1. Compile the input program to CIC. If the input program multi-steps to result v, then the output program 2. Compile the output program to CIC. multi-steps to result v. 3. The two results must be equal. Semantics Preservation Theorem. If the input program has meaning M, then the output program has meaning C(M). 6

Secret Weapons Programming with dependent types is hard! Object language description Generic programming system Syntactic support functions + generic proofs of their correctness Writing formal proofs is hard! The trickiest bits deal with administrative operations that adjust variable bindings... The combination of dependent types and but these are still routine and hardly denotational semantics enables some very language-specific! effective decision procedures to be coded in Coq's Ltac language. Put the rooster to work! 7

Rest of the Talk... Summary of compilation Dependently-typed ASTs Denotational semantics in Coq Writing compiler passes...including generic programming of helper functions Proving semantics preservation 8

Source and Target Languages Source language: simply-typed lambda calculus ::= N! e ::= n x e e x :, e Target language: idealized assembly language o ::= r n new(r, R) read(r, n) i ::= r := o; i jump r p ::= (I, i) 9

Compiler Stages x, f x CPS conversion ktop( x, k, f x k) Closure conversion let F = e, x, k, e.1 x k in ktop(hf, [f]i) Explicit heap allocation let F = e, x, k, e.1.1 e.1.2 x k in let r1 = [f] in let r2 = [F, r1] in ktop(r2) Flattening F: r4 := r1.1; r1 := r4.2; r4 := r4.1; jump r4 main: r3 := r1.1; r1 := r1.2; r2 := new [f]; r2 := new [F, r2]; jump r3 10

Correctness Proof Compiler and proof implemented entirely within Coq 8.0 Axioms: Functional extensionality: Uniqueness of equality proofs: 8f, g, (8x, f(x) = g(x)) ) f = g 8, 8x, y :, 8P1, P2 : x = y, P1 = P2 The compiler is almost runnable as part of a proof. 11

Denotational Semantics of the Source Language 12

For Types... Inductive ty : Set := Nat : ty Arrow : ty > ty > ty. Fixpoint tydenote (t : ty) : Set := match t with Nat => nat Arrow t1 t2 => tydenote t1 > tydenote t2 end. 13

Representing Terms Nominal syntax Inductive term : Set := Const : nat > term Var : name > term Lam : name > term > term App : term > term > term. 14

Representing Terms De Bruijn syntax Inductive term : Set := Const : nat > term Var : nat > term Lam : term > term App : term > term > term. 15

Representing Terms Dependent de Bruijn syntax Inductive term : nat > Set := Const : forall n, nat > term n Var : forall n x, x < n > term n Lam : forall n, term (S n) > term n App : forall n, term n > term n > term n. 16

Representing Terms Dependent de Bruijn syntax with typing Inductive term : list ty > ty > Set := Const : forall G, nat > term G Nat Var : forall G t, var G t > term G t Lam : forall G dom ran, term (dom :: G) ran > term G (Arrow dom ran) App : forall G dom ran, term G (Arrow dom ran) > term G dom > term G ran. 17

Term Denotations Fixpoint termdenote (G : list ty) (t : ty) (e : term G t) {struct e} : subst tydenote G > tydenote t := match e in (term G t) return (subst tydenote G > tydenote t) with Const _ n => fun _ => n Var x => fun s => vardenote x s Lam _ e' => fun s => fun x => termdenote e' (SCons x s) App _ e1 e2 => fun s => (termdenote e1 s) (termdenote e2 s) end. 18

Definition of Values for Free Operational Denotational n value x :, e value Syntactic characterization used throughout definitions and proofs Inherit any canonical forms properties of the underlying Coq types. A natural number is either zero or a successor of another natural number. Caveat: We don't get the same kind of property for functions! 19

No Substitution Function! Operational Denotational Customized syntactic substitution function written for each object language ( x :, e1) e2! e1[x := e2] Coq's operational semantics Reduction rules defined using substitution provides the substitution operation for us! 20

Free Metatheorems Operational For each object language, give customized, syntactic proofs of properties like: Denotational Object Language Object Language Type safety preservation Type safety progress Confluence Strong normalization... The majority of programming language theory mechanization experiments only look at proving these sorts of theorems! Meta-theorems proved once and for all about CIC 21

Free Theorems Proof. By reflexivity of equality. Coq's proof checker identifies as equivalent terms that reduce to each other! This means that both compilation of terms into CIC and evaluation of the results are zero cost operations. 22

But Wait! Doesn't that only work for languages that are: Strongly normalizing Purely functional Deterministic Single-threaded...etc... (In other words, a lot like Coq) 23

Monads to the Rescue! Summary rebuttal: Take a cue from Haskell. Use object language agnostic embedded languages to allow expression of effectful computations Keep using Coq's definitional equality to handle reasoning about pure sublanguages, and even some of the mechanics of impure pieces. 24

Non-Strongly-Normalizing Languages For closed, first-order programs with basic block structure (e.g., structured assembly) Potentially -infinite trace Basic block denotation function (pc0, mem0) (pc1, mem1) (pc1, mem1) A total denotation function that executes a basic block, determining the next program (pc1, mem1) (pc2, mem2) (pc2, mem2) A functioncounter runs basic and blocks memory state. repeatedly to build a lazy list describing an execution trace. (no non-computational definitions required) (pc2, mem2) (pc3, mem3) (pc3, mem3) 25

Co-inductive Traces T ::= n?, T Termination with a natural Take one more step of Run-time failure number answer computation. By keeping only these summaries of program executions, we enable effective equality reasoning. Example: Garbage collection safety Equality of traces is a good way to characterize the appropriate effect on programs from rearranging the heap and root pointers to a new, isomorphic configuration. 26

Example Compilation Phase: CPS Transform Type error! Translation works in some context... but used in context, u : 1! 2! Recall that terms are represented as typing derivations. We need a syntactic helper function equivalent to a weakening lemma. 27

Dependently-Typed Syntactic Helper Functions? Could just write this function from scratch for each new language. Probably using tactic-based proof search The brave (and patient) write the CIC term directly. My recipe for writing generic substitution functions involves three auxiliary recursive functions! Much nicer to automate these details using generic programming! Write each function once, not once per object language. 28

What Do We Need? 1. The helper function itself weaken : forall (G : list ty) (t : ty), term G t > forall (t' : ty), term (t' :: G) t 2. Lemmas about the function For any term e, properly-typed substitution ¾, and properly-typed value v: Can prove this generically for any compositional denotation function! For example, for simply-typed lambda calculus, there must exist fvar, fapp, and flam such that: 29

Reflection-Based Generic Programming Denotation Language Definition Coq plug-in (outside the logic) (Coq inductive type) Specific function (type-compatible with original language definition) Function Reflected Language Definition (term of CIC) Generic function Coq plug-in Reflected Denotation Function Generic proof Specific proof 30

What to Prove? Overall correctness theorem: The compilation of a program of type N runs to the same result as the original program does. What do we prove about individual phases? Prove that input/output pairs are in an appropriate logical relation. E.g., for the CPS transform: This function space contains many functions not representable in our object languages! 31

In the Trenches Easy first step: Use introduction rules for forall's and implications at 32 the start of the goal.

In the Trenches Key observation: The quantified variables have very specific dependent types. We can use greedy quantifier instantiation! Now we're blocked at the tricky point for automated provers: proving existential facts and applying universal facts. 33

In the Trenches 34

In the Trenches Existential hypotheses are easy to eliminate. 35

In the Trenches We can't make further progress with this hypothesis, since no term 36 of the type given for k exists in the proof state.

In the Trenches We can simplify the conclusion by applying rewrite rules (like those 37 we generated automatically) until no more apply.

In the Trenches Now the conclusion has a subterm with the right type to instantiate 38 a hypothesis!

In the Trenches We can use H1 to rewrite the goal. 39

In the Trenches 40

And That's That! This strategy does almost all of the proving for the CPS transformation correctness proof! About 20 lines of proof script total. Basic approach: Figure out the right syntactic rewrite lemmas, prove them, and add them as hints. State the induction principle to use. Call a generic tactic from a library. 41

A Recipe for Certified Compilers 1.Define object languages with dependently-typed ASTs. 2.Give object languages denotational semantics. 3.Use generic programming to build basic support functions and lemmas. 4.Write compiler phases as dependently-typed Coq functions. 5.Express phase correctness with logical relations. 6.Prove correctness theorems using a generic decision procedure relying heavily on greedy quantifier instantiation. 42

Design Decisions Why dependently-typed ASTs? Avoid well-formedness side conditions Easy to construct denotational semantics defined only over well-typed terms Makes greedy quantifier instantiation realistic Why denotational semantics? Concise to define Known to work well with code transformation Many reasoning steps come for free via Coq's definitional equality 43

Conclusion Yet another bag of suggestions on how to formalize programming languages and their metatheories and tools! Would be interesting to see other approaches to formalizing this kind of compilation. Acknowledgements Thanks to my advisor George Necula. This work was funded by a US National Defense fellowship and the US National Science Foundation. 44