A Verifying Core for a Cryptographic Language Compiler

Similar documents
A Verifying Core for a Cryptographic Language Compiler

ACL2 Challenge Problem: Formalizing BitCryptol April 20th, John Matthews Galois Connections

A Robust Machine Code Proof Framework for Highly Secure Applications

Embedding Cryptol in Higher Order Logic

Verification Condition Generation via Theorem Proving

Compositional Cutpoint Verification

Efficient, Formally Verifiable Data Structures using ACL2 Single-Threaded Objects for High-Assurance Systems

Parameterized Congruences in ACL2

Theorem proving. PVS theorem prover. Hoare style verification PVS. More on embeddings. What if. Abhik Roychoudhury CS 6214

Reasoning About LLVM Code Using Codewalker

A Tool for Simplifying ACL2 Definitions

Backtracking and Induction in ACL2

Rockwell Collins Evolving FM Methodology

Mechanized Operational Semantics

Reasoning About Programs Panagiotis Manolios

Development of a Translator from LLVM to ACL2

Improving Eliminate-Irrelevance for ACL2

Creating Formally Verified Components for Layered Assurance with an LLVM-to-ACL2 Translator

A Futures Library and Parallelism Abstractions for a Functional Subset of Lisp

Modeling Algorithms in SystemC and ACL2. John O Leary, David Russinoff Intel Corporation

Milawa an extensible proof checker

Reasoning About Programs Panagiotis Manolios

handled appropriately. The design of MILS/MSL systems guaranteed to perform correctly with respect to security considerations is a daunting challenge.

An ACL2 Tutorial. Matt Kaufmann and J Strother Moore

An Industrially Useful Prover

A CRASH COURSE IN SEMANTICS

Introduction to ACL2. CS 680 Formal Methods for Computer Verification. Jeremy Johnson Drexel University

Reasoning About Programs Panagiotis Manolios

Creating Formally Verified Components for Layered Assurance with an LLVM to ACL2 Translator

Turning proof assistants into programming assistants

A Mechanically Checked Proof of the Correctness of the Boyer-Moore Fast String Searching Algorithm

36 Modular Arithmetic

ACL2: A Program Verifier for Applicative Common Lisp. Matt Kaufmann - AMD, Austin, TX J Strother Moore UT CS Dept, Austin, TX

INF4820: Algorithms for Artificial Intelligence and Natural Language Processing. Common Lisp Fundamentals

Industrial Hardware and Software Verification with ACL2

TEITP User and Evaluator Expectations for Trusted Extensions. David Hardin Rockwell Collins Advanced Technology Center Cedar Rapids, Iowa USA

Contents. Program 1. Java s Integral Types in PVS (p.4 of 37)

Cryptography and Network Security

Lecture Notes on Ints

Main Goal. Language-independent program verification framework. Derive program properties from operational semantics

Defining the Ethereum Virtual Machine for Interactive Theorem Provers

A Library For Hardware Verification

Propositional Calculus. Math Foundations of Computer Science

Polymorphic Types in ACL2

Temporal Refinement Using SMT and Model Checking with an Application to Physical-Layer Protocols

Finite Set Theory. based on Fully Ordered Lists. Jared Davis UT Austin. ACL2 Workshop 2004

Induction Schemes. Math Foundations of Computer Science

CSCC24 Functional Programming Scheme Part 2

A verified runtime for a verified theorem prover

Runtime Checking for Program Verification Systems

Automatic Formal Verification of Block Cipher Implementations

The Common Criteria, Formal Methods and ACL2

Summary of Course Coverage

An Overview of the DE Hardware Description Language and Its Application in Formal Verification of the FM9001 Microprocessor

Reasoning about copydata

CSE 20 DISCRETE MATH. Fall

Algebraic Processors

Distributed Systems Programming (F21DS1) Formal Verification

Verifying Centaur s Floating Point Adder

Integration of SMT Solvers with ITPs There and Back Again

CHAPTER 1 INTRODUCTION

Formal Verification in Industry

Optimized Scientific Computing:

Challenges and Possibilities for Safe and Secure ASN.1 Encoders and Decoders

The Specification, Verification, and Implementation of a High-Assurance Data Structure: An ACL2 Approach

Mathematica for the symbolic. Combining ACL2 and. ACL2 Workshop 2003 Boulder,CO. TIMA Laboratory -VDS Group, Grenoble, France

CIS 194: Homework 6. Due Monday, February 25. Fibonacci numbers

Function Memoization and Unique Object Representation for ACL2 Functions

Introduction to Sets and Logic (MATH 1190)

CSE 20 DISCRETE MATH. Winter

Transforming Programs into Recursive Functions

HOL DEFINING HIGHER ORDER LOGIC LAST TIME ON HOL CONTENT. Slide 3. Slide 1. Slide 4. Slide 2 WHAT IS HIGHER ORDER LOGIC? 2 LAST TIME ON HOL 1

Reducing Fair Stuttering Refinement of Transaction Systems

Translation Validation for a Verified OS Kernel

Finite Fields can be represented in various ways. Generally, they are most

Scheme Tutorial. Introduction. The Structure of Scheme Programs. Syntax

Rethinking Automated Theorem Provers?

Recursion. Chapter 7

Function Memoization and Unique Object Representation for ACL2 Functions

Alive: Provably Correct InstCombine Optimizations

Divisibility Rules and Their Explanations

Control Flow. COMS W1007 Introduction to Computer Science. Christopher Conway 3 June 2003

Lecture 6: Arithmetic and Threshold Circuits

Functional Programming. Big Picture. Design of Programming Languages

About the Author. Dependency Chart. Chapter 1: Logic and Sets 1. Chapter 2: Relations and Functions, Boolean Algebra, and Circuit Design

CS152: Programming Languages. Lecture 11 STLC Extensions and Related Topics. Dan Grossman Spring 2011

Extending ACL2 with SMT solvers

Outline. Proof Carrying Code. Hardware Trojan Threat. Why Compromise HDL Code?

Refinements to techniques for verifying shape analysis invariants in Coq

COMP 122/L Lecture 2. Kyle Dewey

Chapter 1. Introduction

Practical Formal Verification of Domain-Specific Language Applications

Lecture Notes: Hoare Logic

At full speed with Python

c constructor P, Q terms used as propositions G, H hypotheses scope identifier for a notation scope M, module identifiers t, u arbitrary terms

Coq quick reference. Category Example Description. Inductive type with instances defined by constructors, including y of type Y. Inductive X : Univ :=

Efficient execution in an automated reasoning environment

Lecture Notes, CSE 232, Fall 2014 Semester

4.1 Review - the DPLL procedure

Interplay Between Language and Formal Verification

Transcription:

A Verifying Core for a Cryptographic Language Compiler Lee Pike 1 Mark Shields 2 John Matthews Galois Connections November 21, 2006 1 Presenting. 2 Presently at Microsoft.

Thanks Rockwell Collins Advanced Technology Center, especially David Hardin, Eric Smith, and Tom Johnson Konrad Slind, Bill Young, and our anonymous ACL2 Workshop reviewers Matt Kaufmann and the other folks on the ACL2-Help list And of course, Pete Manolios and Matt Wilding for a heckuva conference!

Compiler Assurance: The Landscape Compilers are complex! Risk of bugs, especially for specialized DSL compilers. Easy target for backdoors and Trojan horses. How do we get assurance for correctness? Long-term and widespread use (e.g., gcc). Certification (e.g., Common Criteria, DO-178B). Mathematical proof.

Proofs and Compilers: Two Approaches 1. A verified compiler is one associated with a mathematical proof. One monolithic proof of correctness for all time. Deep and difficult requiring parameterized proofs about the language semantics and the compiler transformations. 2. A verifying compiler 3 is one that emits both object code and a proof that the object code implements the source code. Requires a proof for each compilation (the proof process must be automated). But the proofs are only about concrete programs. If you have a highly-automated theorem-prover (hmmm... where can I find one of those?), a verifying compiler is easier. We take the verifying compiler approach. 3 Unrelated to Tony Hoare s concept by the same name.

Overall Infrastructure front-end source µcryptol compilation compiler shallow embedding verifier higher-order logic equivalence proof higher-order logic Isabelle core indexed µcryptol shallow embedding compilation Common Lisp automated equivalence proof ACL2 canonical µcryptol shallow embedding Common Lisp back-end compilation equivalence proof (cutpoint reasoning) ACL2 binary AAMP7 deep embedding binary AAMP7 on lisp simulator

What We ve Done: Snapshot A semi-decision procedure in ACL2 for proving correspondence between µcryptol programs in indexed form and in canonical form. A semi-decision procedure for proving termination in ACL2 of µcryptol programs (including mutually-recursive cliques of streams). A simple translator for shallowly embedding µcryptol into ACL2. An ACL2 book of executable primitive operations for specifying encryption protocols (including modular arithmetic, arithmetic in Galois Fields, bitvector operations, and vector operations). These results are germane to Verifying compilers for other functional languages The verification of cryptographic protocols in ACL2 Industrial-scale automated theorem-proving

Applications and Informal Metrics Framework for Automated translations, correspondence proofs, and termination proofs for, e.g., Fibonacci, factorial, etc. TEA, RC6, AES Caveat: mcc doesn t output the correspondence proof itself yet. ACL2 Condition of Nontriviality: for AES, ACL2 automatically generates About 350 definitions 200 proofs 47,000 lines of proof output

1. Language overview 2. Automated termination proofs 3. Verifier infrastructure 4. What s left 5. Dirty laundry The Details: Outline

µcryptol in One Slide fac : B^32 -> B^8; fac i = facs @@ i where { rec index : B^8^inf; index = [0] ## [ x + 1 x <- index]; and facs : B^8^inf; facs = [1] ## [ x * y x <- facs y <- drops{1} index]; }; index = 0, 1, 2, 3, 4,..., 255, 0, 1,... facs = 1, 1, 2, 6, 24, 120, 208, 176,... fac 3 = facs @@ 3 = 6

Well-Definedness The stream delay from stream x to occurrence of stream y is d means, for sufficiently large index k N, that the k th element of stream x depends on the value of the (k d) th element of stream y. Let S be the set of stream names defined by a mutually-recursive clique of stream definitions. Then we say the clique is well defined if there exists a measure function f : (N S) N such that for each occurrence of a stream y in the body of the definition of stream x with delay d, we have k N. k d f (k d, y) < f (k, x)

Decidable! (Thanks, Mark) The mcc compiler type system ensures well-definedness The compiler constructs a minimum delay graph for the clique of streams. N.B.: Only linearly-recursive programs can be written in µcryptol. This appears to be all you need for encryption protocols....but can we trust the compiler s type system?

Well-Definedness Example (Indexed Form) rec index : B^8^inf; index = [0] ## [ x + 1 x <- index]; and facs : B^8^inf; facs = [1] ## [ x * y x <- facs y <- drops{1} index]; (defun fac-measure (i s) (acl2-count (+ (* (+ i (cond ((eq s facs) 0) ((eq s index) 0))) 2) (cond ((eq s facs) 1) ((eq s index) 0))))) All termination proofs are automatic in ACL2.

Transformations: Source to Canonical Front-End Transformations 1. Introduce safety checks 2. Simplify vector comprehensions 3. Eliminate patterns 4. Convert to indexed form Indexed Form Generated Begin Core Transformations 5. Push stream applications 6. Collapse arms 7. Align arms 8. Takes/segments to indexes 9. Convert to iterator form 10. Eliminate simple primitives 11. Eliminate zero-sized values 12. Inline and simplify 13. Introduce temporaries 14. Eliminate nested definitions 15. Share top-level value definitions 16. Box top-level definitions 17. Eliminate shadowing Canonical Form Generated

Contributed ACL2 Book: Cryptographic Primitives Arithmetic in Z 2 n (arithmetic modulo 2 n ): addition, negation, subtraction, multiplication, division, remainder after division, greatest common divisor, exponentiation, base-two logarithm, minimum, maximum, and negation. Bitvector operations: shift left, shift right, rotate left, rotate right, append of arbitrary width bitvectors, extraction of n bitvectors from a bitvector, append of fixed-width bitvectors, split into fixed-width bitvectors, bitvector segment extraction, bitvector transposition, reversal, and parity. Arithmetic in GF 2 n (the Galois Field over 2 n ): polynomial addition, multiplication, division, remainder after division, greatest common divisor, irreducibility, and inverse with respect to an irreducible polynomial. Pointwise extension of logical operations to bitvectors: bitwise conjunction, bitwise disjunction, bitwise exclusive-or, and negation bitwise complementation. Vector operations: shift left, shift right, rotate left, rotate right, vector append for an arbitrary number of vectors, extraction of n subvectors extraction from a vector, flattening a vector vectors, building a vector of vectors from a vector, taking an arbitrary segment from a vector, vector transposition, and vector reverse.

Correspondence Proof We prove the following property for the core transformations: for source program S and compiled program C, If S has well-defined semantics (does not go wrong), then S and C are observationally equivalent. Xavier Leroy

Example: Factorial Proof (make-thm :name inv-facs-thm :thm-type invariant :ind-name idx_2_facs_2 :itr-name iter_idx_facs_3 :init-hist ((0) (0)) :hist-widths (0 0) :branches ( idx_2 facs_2 )) This top-level macro call, with the appropriate keys, generates the correspondence theorem.

Two Problems for Automated Proof Generation Two problems: The proof infrastructure must be general enough to automatically prove correspondence for arbitrary programs. The proof infrastructure must not fall over on real programs (factorial took about a day; AES took a couple of months).

Some Mitigations The two difficulties are mitigated by ACL2 (and its community): Generality: Use powerful ACL2 books, particularly Rockwell Collins super-ihs (slated for public release). For any other hard lemmas, have the macros instantiate them with concrete values (usually making their proofs trivial) and prove them at run-time these are usually bitvector theorems where we want to fix the width of the bitvectors. Scaling: Package up large conjunctions in recursive definitions to prevent gratuitous expensive rewrites. Cascading computed hints that iteratively enable definitions after successive occurrences of being stable under simplification.

Dirty (Clean?) Laundry How hard was all this? Regarding the first author, Experience: Some Common Lisp experience. Little compiler experience. Little ACL2 experience. No µcryptol experience. No AAMP7 experience. Effort: Approx. 3 months to complete the core verifier. About 2 months investigating back-end verification. DSL verifying compilers are feasible!

What the ACL2 Folks Got Right Or... How an ACL2 novice can quickly do something useful. Powerful and easy macros: Avoid (hard) general proofs by simple instantiation of parameters. Simplifies creating a proof framework that is essential for an automated verifying compiler. Industrial strength prover able to handle models as large as the AAMP7 model and easily generate proofs tens of thousands of lines long. First-order language forces the user to consider specifications that have more automated proofs from the get-go. Engaged user-community and active acl2-help listserv. Good documentation. Powerful user-defined books (e.g., ihs books). Work with the folks at Rockwell Collins :)

What could have helped even more? A better way to find/search books (e.g., priorities on hints). Better integration with decision procedures/smt? Heuristics for searching for inconsistent hypotheses.

What s Left? Front end: in Isabelle (because of higher-order language constructs); just a few transformations and pattern-matching. Back-end: more substantial: Galois helped do an initial cutpoint-proof of factorial on the AAMP7. Without the AAMP7 model, the back-end verification is infeasible: Stay tuned for the next talk!

Additional Resources Example µcryptol & ACL2 specs and cryptographic primitives http://www.galois.com/files/core verifier/ µcryptol design and compiler overview (solely authored by M. Shields) http://www.cartesianclosed.com/pub/mcryptol/ µcryptol Reference Manual (solely authored by M. Shields) http://galois.com/files/mcryptol refman-0.9.pdf

Shallow Embedding mcc contains a small (1.2klocs, excluding libraries) translator from µcryptol to Common Lisp (the translator is unverified). Some highlights: µcryptol types as ACL2 predicates: B^32^2, (defund $ind_0_typep (x) (and (true-listp x) (natp (nth 0 x)) (< (nth 0 x) 4294967296) (natp (nth 1 x)) (< (nth 1 x) 4294967296))) defunded because AES has types like B^8^4^4^11. µcryptol primitives:...

Proof Macros Correspondence proofs are generated from a few macros: Function correspondence theorems of non-recursive definitions. Type correspondence theorems of type declarations. Vector comprehension correspondence theorems. Stream-clique correspondence theorems of recursive cliques of stream comprehensions. Vector-splitting correspondence theorems of type correspondence for vectors that have been split into a vector of subvectors. Inlined segments/takes correspondence theorems for inlined segments and takes operators over streams.

Factorial Correspondence Theorem (defthm factorial-invariant (implies (and (natp i) (natp lim) (true-listp hist) (<= i (+ lim 1)) (equal {\color{blue}(nth (loghead 0 i) (nth 0 hist))} {\color{red}(ind-facs i idx)}) (equal {\color{blue}(nth (loghead 1 i) (nth 1 hist))} \red{(ind-facs i facs)})) (and (equal {\color{blue}(nth (loghead 0 lim) (itr-facs i lim hist))} {\color{red}(ind-facs lim idx))} (equal {\color{blue}(nth (loghead 1 lim) (itr-facs i lim hist))} {\color{red}(ind-facs lim facs)}))))