Verification of an ML compiler. Lecture 3: Closures, closure conversion and call optimisations

Similar documents
A New Verified Compiler Backend for CakeML. Yong Kiam Tan* Magnus O. Myreen Ramana Kumar Anthony Fox Scott Owens Michael Norrish

CAKEML. A Verified Implementation of ML. S-REPLS 6 UCL 25 May, Ramana Kumar. Magnus Myreen. Yong Kiam Tan

Verification of an ML compiler. Lecture 1: An introduction to compiler verification

The Verified CakeML Compiler Backend

Verified Characteristic Formulae for CakeML. Armaël Guéneau, Magnus O. Myreen, Ramana Kumar, Michael Norrish April 27, 2017

Background. From my PhD (2009): Verified Lisp interpreter in ARM, x86 and PowerPC machine code

Verified compilation of a first-order Lisp language

Verified compilers. Guest lecture for Compiler Construction, Spring Magnus Myréen. Chalmers University of Technology

Automatically Introducing Tail Recursion in CakeML. Master s thesis in Computer Science Algorithms, Languages, and Logic OSKAR ABRAHAMSSON

Handout 2 August 25, 2008

A Lazy to Strict Language Compiler Master s thesis in Computer Science and Engineering PHILIP THAM

CS153: Compilers Lecture 15: Local Optimization

CSCI-GA Scripting Languages

Flang typechecker Due: February 27, 2015

Variables and Bindings

Optimising Functional Programming Languages. Max Bolingbroke, Cambridge University CPRG Lectures 2010

Recap: Functions as first-class values

Functional Programming. Pure Functional Programming

G Programming Languages Spring 2010 Lecture 4. Robert Grimm, New York University

Principles of Programming Languages

Closures. Mooly Sagiv. Michael Clarkson, Cornell CS 3110 Data Structures and Functional Programming

CSE413: Programming Languages and Implementation Racket structs Implementing languages with interpreters Implementing closures

LECTURE 3. Compiler Phases

Type Checking. Outline. General properties of type systems. Types in programming languages. Notation for type rules.

# true;; - : bool = true. # false;; - : bool = false 9/10/ // = {s (5, "hi", 3.2), c 4, a 1, b 5} 9/10/2017 4

CSE341: Programming Languages Lecture 17 Implementing Languages Including Closures. Dan Grossman Autumn 2018

Outline. General properties of type systems. Types in programming languages. Notation for type rules. Common type rules. Logical rules of inference

Compiler Theory. (Semantic Analysis and Run-Time Environments)

Overview. A normal-order language. Strictness. Recursion. Infinite data structures. Direct denotational semantics. Transition semantics

CS558 Programming Languages

G Programming Languages - Fall 2012

CSE3322 Programming Languages and Implementation

Processadors de Llenguatge II. Functional Paradigm. Pratt A.7 Robert Harper s SML tutorial (Sec II)

CPS Conversion. Di Zhao.

Parsing Scheme (+ (* 2 3) 1) * 1

Scope, Functions, and Storage Management

Compilers. Type checking. Yannis Smaragdakis, U. Athens (original slides by Sam

Lecture 09: Data Abstraction ++ Parsing is the process of translating a sequence of characters (a string) into an abstract syntax tree.

Booleans (aka Truth Values) Programming Languages and Compilers (CS 421) Booleans and Short-Circuit Evaluation. Tuples as Values.

CS558 Programming Languages

CSE 413 Languages & Implementation. Hal Perkins Winter 2019 Structs, Implementing Languages (credits: Dan Grossman, CSE 341)

CPL 2016, week 10. Clojure functional core. Oleg Batrashev. April 11, Institute of Computer Science, Tartu, Estonia

CMSC 330: Organization of Programming Languages

A Functional Space Model. COS 326 David Walker Princeton University

6.184 Lecture 4. Interpretation. Tweaked by Ben Vandiver Compiled by Mike Phillips Original material by Eric Grimson

Comp 311: Sample Midterm Examination

Project Compiler. CS031 TA Help Session November 28, 2011

7. Introduction to Denotational Semantics. Oscar Nierstrasz

CMSC 330: Organization of Programming Languages

Code Generation & Parameter Passing

COP4020 Programming Languages. Functional Programming Prof. Robert van Engelen

Introduction to ML. Based on materials by Vitaly Shmatikov. General-purpose, non-c-like, non-oo language. Related languages: Haskell, Ocaml, F#,

Compilers and computer architecture: A realistic compiler to MIPS

CS558 Programming Languages

Begin at the beginning

CSE3322 Programming Languages and Implementation

Compiler Construction Lent Term 2013 Lectures 9 & 10 (of 16)

CSE341: Programming Languages Lecture 9 Function-Closure Idioms. Dan Grossman Winter 2013

OCaml. ML Flow. Complex types: Lists. Complex types: Lists. The PL for the discerning hacker. All elements must have same type.

Mini-ML. CS 502 Lecture 2 8/28/08

CS 242. Fundamentals. Reading: See last slide

Programming Languages Lecture 14: Sum, Product, Recursive Types

CS153: Compilers Lecture 16: Local Optimization II

A Functional Evaluation Model

Topic 5: Examples and higher-order functions

Intermediate Representations & Symbol Tables

Principles of Programming Languages

Closures. Mooly Sagiv. Michael Clarkson, Cornell CS 3110 Data Structures and Functional Programming

Computer Science 21b (Spring Term, 2015) Structure and Interpretation of Computer Programs. Lexical addressing

CakeML. A verified implementation of ML. LFCS seminar, Edinburgh, May Magnus Myreen Chalmers University of Technology & University of Cambridge

Compilers: The goal. Safe. e : ASM

A Func'onal Space Model

CS558 Programming Languages

The Substitution Model

CMSC 330: Organization of Programming Languages. Formal Semantics of a Prog. Lang. Specifying Syntax, Semantics

News. CSE 130: Programming Languages. Environments & Closures. Functions are first-class values. Recap: Functions as first-class values

Why OCaml? Introduction to Objective Caml. Features of OCaml. Applications written in OCaml. Stefan Wehr

Introduction to Objective Caml

Topics Covered Thus Far. CMSC 330: Organization of Programming Languages. Language Features Covered Thus Far. Programming Languages Revisited

Outline. Introduction Concepts and terminology The case for static typing. Implementing a static type system Basic typing relations Adding context

Racket. CSE341: Programming Languages Lecture 14 Introduction to Racket. Getting started. Racket vs. Scheme. Example.

Introduction to OCaml

CMSC330. Objects, Functional Programming, and lambda calculus

Agenda. CS301 Session 9. Abstract syntax: top level. Abstract syntax: expressions. The uscheme interpreter. Approaches to solving the homework problem

SML A F unctional Functional Language Language Lecture 19

Lecture 15 CIS 341: COMPILERS

Side note: Tail Recursion. Begin at the beginning. Side note: Tail Recursion. Base Types. Base Type: int. Base Type: int

Tail Recursion: Factorial. Begin at the beginning. How does it execute? Tail recursion. Tail recursive factorial. Tail recursive factorial

CS153: Compilers Lecture 14: Type Checking

CSE341, Spring 2013, Final Examination June 13, 2013

Advanced C Programming

CMSC 330: Organization of Programming Languages. OCaml Imperative Programming

Functional programming Primer I

Introduction to ML. Mooly Sagiv. Cornell CS 3110 Data Structures and Functional Programming

Typical Runtime Layout. Tiger Runtime Environments. Example: Nested Functions. Activation Trees. code. Memory Layout

A Trustworthy Monadic Formalization of the ARMv7 Instruction Set Architecture. Anthony Fox and Magnus O. Myreen University of Cambridge

Hofl, a Higher-order Functional Language. 1 An Overview of Hofl. 2 Abstractions and Function Applications

CS558 Programming Languages. Winter 2013 Lecture 3

CPSC 311, 2010W1 Midterm Exam #2

CSE 413 Midterm, May 6, 2011 Sample Solution Page 1 of 8

Transcription:

Verification of an ML compiler Lecture 3: Closures, closure conversion and call optimisations Marktoberdorf Summer School MOD 2017 Magnus O. Myreen, Chalmers University of Technology

Implementing the ML abstractions The two most interesting transitions function values are implemented (topic of this lecture) data abstraction is implemented (topic of previous lecture) Values Languages Compiler transformations source syntax Parse concrete syntax ML source source AST Infer types, exit if fail Eliminate modules no modules Replace constructor names with numbers no cons names Reduce declarations to Intermediate no declarations exps; introduce global vars Make patterns exhaustive exhaustive Move nullary constructor languages pat. matches patterns with upwards Compile pattern matches no pat. match to nested Ifs and Lets Rephrase language abstract values incl. closures and ref pointers abstract values incl. ref and code pointers machine words and code labels 32-bit words 64-bit words first-class functions. Fuse function calls/apps into multi-arg calls/apps ClosLang: Track where closure values last language flow; annotate program with closures Introduce C-style fast (has multi-arg calls wherever possible No size limits. closures) Remove deadcode Prepare for closure conv. BVL: functional language without closures DataLang: imperative language WordLang: imperative language with machine words, memory and a GC primitive Perform closure conv. Inline small functions Fold shrink constants Lets and No first-class functions. only 1 global, handle in call No size limits. Split over-sized functions into many small functions Compile global vars into a dynamically resized array Optimise Let-expressions Switch to imperative style Reduce caller-saved vars Combine adjacent memory allocations Remove data abstraction Simplify program Select target instructions Perform SSA-like renaming Force two-reg code (if req.) Remove deadcode Allocate register names Machine types. Concretise stack StackLang: Implement GC primitive Strict imperative Turn stack access into language size limits. memory acceses with array-like stack and Rename registers to match arch registers/conventions optional GC Flatten code LabLang: assembly lang. ARMv6 Delete no-ops (Tick, Skip) Encode program as concrete machine code ARMv8 x86-64 MIPS-64 RISC-V Machine code All languages communicate with the external world via a byte-array-based foreign-function interface.

Value type before: This is a minor simplification of CakeML s actual value type here. v = Number int Word64 word64 values contain code Block num (v list) RefPtr num Closure (v list) exp Recclosure (v list) (exp list) num Intermediate languages with first-class functions. No size limits. Value types after: v = Number int Word64 word64 Block num (v list) RefPtr num CodePtr num No first-class functions. No size limits.

DeBruijn indexing an index into the environment element-of operation evaluate ([Var n],env,s) = if n < length env then (Rval [el n env],s) else (Rerr(Rabort Rtype_error),s) evaluate ([Let xs x2],env,s) = case evaluate (xs,env,s) of (Rval vals,s1) => evaluate ([x2],vals ++ env,s1) res => res) value of new bound variables are prefixed to the current environment

Semantics of closures Closure creation in the concrete syntax: fn v => e Evaluation in the semantics: evaluate ([Fn e],env,s) = (Rval [Closure env e],s) no variable name given, since we are using db indexing the created closure captures the current env

Semantics of closures (cont.) Function application in SML concrete syntax, e.g. fac 50 Evaluation in the semantics: evaluate ([App e1 e2],env,s) = case evaluate env s [e1,e2] of (Rval [f,arg],s1) => (case app_env_exp f arg of Some (env,exp) => if s1.clock = 0 then (Rerr (Rabort Rtimeout_error), s1) else evaluate ([exp],env,dec_clock s1) _ => (Rerr(Rabort Rtype_error),s1)) res => res evaluate function and argument extract the closure s exp and env evaluate the exp app_env_exp (Closure env exp) arg = Some ([arg]++env, exp)

This lecture Function values (called closures) bring challenges: 1 Closures make stating the value relation harder 2 Closures need to be compiled 3 Vital optimisations If time allows: a walk-through of the compiler diagram

This lecture Function values (called closures) bring challenges: 1 Closures make stating the value relation harder 2 Closures need to be compiled 3 Vital optimisations If time allows: a walk-through of the compiler diagram

Closures cause complications Constant folding phase: compile (Add (Lit i) (Lit j)) = Lit (i+j) compile (Fn e) = Fn (compile e)... Evaluation in the semantics: evaluate ([Add (Lit 3) (Lit 5)],env,s) = (Rval (Number 8),s) evaluate ([compile (Add (Lit 3) (Lit 5))],env,s) = evaluate ([Lit 8],env,s) = (Rval (Number 8),s) optimised code produced the same result good!

Closures cause complications Constant folding phase: compile (Add (Lit i) (Lit j)) = Lit (i+j) compile (Fn e) = Fn (compile e)... Evaluation in the semantics: evaluate ([Fn (Add (Lit 3) (Lit 5))],env,s) =??? evaluate ([compile (Fn (Add (Lit 3) (Lit 5)))],env,s) = evaluate ([Fn (Lit 8)],env,s) =???

Closures cause complications Constant folding phase: compile (Add (Lit i) (Lit j)) = Lit (i+j) compile (Fn e) = Fn (compile e)... Evaluation in the semantics: evaluate ([Fn (Add (Lit 3) (Lit 5))],env,s) = (Rval (Closure env (Add (Lit 3) (Lit 5))),s) evaluate ([compile (Fn (Add (Lit 3) (Lit 5)))],env,s) = evaluate ([Fn (Lit 8)],env,s) = (Rval (Closure env (Lit 8)),s) Values can no longer be compared with equality

Value relation options How do we relate values in presence of closures? Closure env (Add (Lit 3) (Lit 5))) Closure env (Lit 8) Syntactic option: val_rel_list env1 env2 code in closure must be produced by the current compiler function e2 = compile e1 val_rel (Closure env1 e1) (Closure env2 e2) Semantic option: jargon: type-directed, step indexed, One can define a logical relation which relates closures, if related inputs produce related outputs.

Definition: val_rel x y val_rel_list xs ys val_rel_list (x::xs) (y::ys) val_rel_list [] [] Syntactic option: val_rel_list env1 env2 code in closure must be produced by the current compiler function e2 = compile e1 val_rel (Closure env1 e1) (Closure env2 e2) Semantic option: jargon: type-directed, step indexed, One can define a logical relation which relates closures, if related inputs produce related outputs.

Pros and Cons Syntactic option: val_rel_list env1 env2 code in closure must be produced by the current compiler function e2 = compile e1 val_rel (Closure env1 e1) (Closure env2 e2) Pro: easy to set up Con: compiler specific, boilerplate repeated Semantic option: jargon: type-directed, step indexed, One can define a logical relation which relates closures, if related inputs produce related outputs. Pro: can be expressive Con: can be very hard to set up

This lecture Function values (called closures) bring challenges: 1 Closures make stating the value relation harder 2 Closures need to be compiled 3 Vital optimisations If time allows: a walk-through of the compiler diagram

Value type before: v = Number int Word64 word64 Block num (v list) RefPtr num Closure (v list) exp Recclosure (v list) (exp list) num Closure conversion Intermediate languages with first-class functions. No size limits. Value types after: v = Number int Word64 word64 Block num (v list) RefPtr num CodePtr num No first-class functions. No size limits.

Value type before: Closure conversion v = Number int Word64 word64 Block num (v list) RefPtr num Closure (v list) exp Recclosure (v list) (exp list) num Value types after: v = Number int Word64 word64 Block num (v list) RefPtr num CodePtr num Closure values will be represented as tuples with a code pointer.

Value relation environment list must related to values stored in Block the compiled code for the body must be in the global code store val_rel_list code env vals lookup p code = compile body val_rel code (Closure env body) (Block clos_tag ([CodePtr p] ++ vals)) references or internal pointers are used the Block has a special marker so that equality can distinguish closures from other data Notes: mutually recursive closures are more complicated to represent because they need to have each other in the env

Minimal environments? env and vals are lists of same length val_rel_list code env vals lookup p code = compile body val_rel code (Closure env body) (Block clos_tag ([CodePtr p] ++ vals)) Reminder: Are we wasting space? evaluate ([Fn e],env,s) = (Rval [Closure env e],s) Yes! The env can contain values that are never used in e.

Minimal environments? env and vals are lists of same length val_rel_list code env vals lookup p code = compile body val_rel code (Closure env body) (Block clos_tag ([CodePtr p] ++ vals)) Are we wasting space? Note: any descent compiler will shrink the environments that are stored into the Blocks CakeML implements this as a compiler phase right before closure conversion

This lecture Function values (called closures) bring challenges: 1 Closures make stating the value relation harder 2 Closures need to be compiled 3 Vital optimisations If time allows: a walk-through of the compiler diagram

Optimisations with high impact 1.2 1 0.8 0.6 these optimisations combined reduce running time by 60 % or more 0.4 0.2 0 btree fib foldl nqueens qsortimp qsortimp qsort queue reverse No Optimisations + Multi + Known + Calls + Remove

Comparing ML compilers 4 and are crucial in making CakeML perform well 3.5 3 execution time relative to native-code compiled OCaml (red) 2.5 2 1.5 1 0.5 0 btree fib foldl nqueens qsortimp qsort queue reverse ocamlopt 4.02.3 mlton 20100608 smlnj v110.78 polyc v5.6 cakeml

What do the optimisations do? Answer: improve compilation of closures and calls in fact, they try to avoid closures if possible

Naive implementations are slow Example: fun foo x y z = x+y+z; val n = foo 0 89 21; The above is syntactic sugar for: we are looking up the value for foo, even though it is possible to known statically val foo = fn x => (fn y => (fn z => x+y+z)); val n = ((foo 0) 89) 21; each application only consumes one argument at a time between each application a new closure is created

Optimisation of function calls fun reverse xs = let fun append xs ys = case xs of [] => ys (x::xs) => x :: append xs ys; fun rev xs = case xs of [] => xs (x::xs) => append (rev xs) [x] in rev xs end; val example = reverse [1,2,3];

Optimisation of function calls set_global 0 (fn xs => let fun append xs = fn ys => if xs = [] then ys else el 0 xs :: (append (el 1 xs)) ys fun rev xs = if xs = [] then xs else (append (rev (el 1 xs))) [el 0 xs] in rev xs end); set_global 1 ((get_global 0) [1,2,3]); -level definition has been allocated to glo

Optimisation of function calls true multi-argument closure set_global 0 (fn 4 xs => let fun append 0 hxs,ysi = if xs = [] then ys else el 0 xs :: append 0 hel 1 xs, ysi fun rev 2 xs = if xs = [] then xs else append 0 hrev 2 (el 1 xs), [el 0 xs]i in rev 2 xs end); set_global 1 ((get_global 0) 4 [1,2,3]); subscripts give each closure body a unique number superscripts indicate that a known body is called

Optimisation of function calls set_global 0 (fn xs => Call 5 hxsi); set_global 1 (Call 5 [1,2,3]); Code Table: 1 hxs,ysi => if xs = [] then ys else el 0 xs :: Call 1 hel 1 xs, ysi 3 hxsi => if xs = [] then xs else Call 1 hcall 3 (el 1 xs), [el 0 xs]i 5 hxsi => let val append = 0 val rev = 0 in Call 3 hxsi end C-like function calls

This lecture If time allows: a walk-through of the compiler diagram

Values Languages Compiler transformations source syntax Parse concrete syntax source AST Infer types, exit if fail abstract values incl. closures and ref pointers abstract values incl. ref and code pointers no modules no cons names no declarations exhaustive pat. matches no pat. match ClosLang: last language with closures (has multi-arg closures) BVL: functional language without closures only 1 global, handle in call DataLang: imperative language Eliminate modules Replace constructor names with numbers Reduce declarations to exps; introduce global vars Make patterns exhaustive Move nullary constructor patterns upwards Compile pattern matches to nested Ifs and Lets Rephrase language Fuse function calls/apps into multi-arg calls/apps Track where closure values flow; annotate program Introduce C-style fast calls wherever possible Remove deadcode Prepare for closure conv. Perform closure conv. Inline small functions Fold constants and shrink Lets Split over-sized functions into many small functions Compile global vars into a dynamically resized array Optimise Let-expressions Switch to imperative style Reduce caller-saved vars Combine adjacent memory allocations Latest version: 12 intermediate languages (ILs) and many within-il optimisations each IL at the right level of abstraction Remove data abstraction Simplify program machine words and code labels WordLang: imperative language with machine words, memory and a GC primitive StackLang: imperative language with array-like stack and optional GC LabLang: assembly lang. Select target instructions Perform SSA-like renaming Force two-reg code (if req.) Remove deadcode Allocate register names Concretise stack Implement GC primitive Turn stack access into memory acceses Rename registers to match arch registers/conventions Flatten code Delete no-ops (Tick, Skip) Encode program as concrete machine code for the benefit of proofs and compiler implementation 32-bit words ARMv6 64-bit words ARMv8 x86-64 MIPS-64 RISC-V All languages communicate with the external world via a byte-array-based foreign-function interface. (next slide zooms in)

Values used by the semantics Both proved sound and complete. Values Languages Compiler transformations source syntax source AST Parse concrete syntax Infer types, exit if fail Parser and type inferencer as before act values incl. closures and ref pointers no modules no cons names no declarations exhaustive pat. matches no pat. match ClosLang: last language with closures (has multi-arg closures) Eliminate modules Replace constructor names with numbers Reduce declarations to exps; introduce global vars Make patterns exhaustive Move nullary constructor patterns upwards Compile pattern matches to nested Ifs and Lets Rephrase language Fuse function calls/apps into multi-arg calls/apps Track where closure values flow; annotate program Introduce C-style fast calls wherever possible Remove deadcode Early phases reduce the number of language features Language with multiargument closures

abstract values incl. closur ClosLang: last language with closures (has multi-arg closures) Rephrase language Fuse function calls/apps into multi-arg calls/apps Track where closure values flow; annotate program Introduce C-style fast calls wherever possible Remove deadcode Prepare for closure conv. Perform closure conv. Language with multiargument closures BVL: functional language without closures Inline small functions Fold constants and shrink Lets Split over-sized functions into many small functions Simple first-order functional language abstract values incl. ref and code pointers only 1 global, handle in call DataLang: imperative language Compile global vars into a dynamically resized array Optimise Let-expressions Switch to imperative style Reduce caller-saved vars Combine adjacent memory allocations Imperative language Remove data abstraction WordLang: imperative Simplify program Select target instructions Machine-like types

machine words and code labels 32-bit words 64-bit words WordLang: imperative language with machine words, memory and a GC primitive StackLang: imperative language with array-like stack and optional GC LabLang: assembly lang. ARMv6 Remove data abstraction Simplify program Select target instructions Perform SSA-like renaming Force two-reg code (if req.) Remove deadcode Allocate register names Concretise stack Implement GC primitive Turn stack access into memory acceses Rename registers to match arch registers/conventions Flatten code Delete no-ops (Tick, Skip) Encode program as concrete machine code ARMv8 x86-64 MIPS-64 RISC-V All languages communicate with the external world via a byte-array-based foreign-function interface. Machine-like types Imperative compiler with an FP twist: garbage collector, live-var annotations, fast exception mechanisms (for ML) Targets 5 architectures

What we learnt Function values (called closures) bring challenges: 1 2 3 Closures make stating the value relation harder because closure values contain code, which is modified by the compiler. comparing values with = does not work Closures need to be compiled to tuples that carry a code pointer and a snapshot of the relevant environment. Good compilation of closures and function applications is crucial for performance.

Extra slides

Mutually recursive closures Value: env list of function bodies v =... Recclosure (v list) (exp list) num index: which body is to be used Closure creation in the concrete syntax: let fun f1 x = and f2 = and f3 = in end Evaluation in the semantics: evaluate ([Letrec funs rest],env,s) = evaluate ([rest], build env funs ++ env, s) one binding per function build env fns = Genlist (Recclosure env fns) (length fns) Genlist f n = if 0 then [] else Genlist f (n- 1) ++ [f (n- 1)]

Application evaluate ([App e1 e2],env,s) = case evaluate env s [e1,e2] of (Rval [f,arg],s1) => (case app_env_exp f arg of Some (env,exp) => if s1.clock = 0 then (Rerr (Rabort Rtimeout_error), s1) else evaluate ([exp],env,dec_clock s1) _ => (Rerr(Rabort Rtype_error),s1)) res => res same as on earlier slide the new part app_env_exp (Closure env exp) arg = Some ([arg]++env, exp) app_env_exp (Recclosure env funs index) arg = if index < length funs then Some ([arg] ++ build env funs ++ env, el index funs) else None