Gradual Typing for Functional Languages. Jeremy Siek and Walid Taha (presented by Lindsey Kuper)

Similar documents
CS152: Programming Languages. Lecture 11 STLC Extensions and Related Topics. Dan Grossman Spring 2011

Gradual Typing with Inference

Gradual Typing for Functional Languages

Lecture 13: Subtyping

Review. CS152: Programming Languages. Lecture 11 STLC Extensions and Related Topics. Let bindings (CBV) Adding Stuff. Booleans and Conditionals

CS-XXX: Graduate Programming Languages. Lecture 9 Simply Typed Lambda Calculus. Dan Grossman 2012

Space-Efficient Blame Tracking for Gradual Types

Goal. CS152: Programming Languages. Lecture 15 Parametric Polymorphism. What the Library Likes. What The Client Likes. Start simpler.

Programming Languages

CMSC 336: Type Systems for Programming Languages Lecture 5: Simply Typed Lambda Calculus Acar & Ahmed January 24, 2008

Operational Semantics 1 / 13

Programming Languages Lecture 14: Sum, Product, Recursive Types

Harvard School of Engineering and Applied Sciences CS 152: Programming Languages

Tradeoffs. CSE 505: Programming Languages. Lecture 15 Subtyping. Where shall we add useful completeness? Where shall we add completeness?

Programming Languages Fall 2014

Mutable References. Chapter 1

Simply-Typed Lambda Calculus

2 Blending static and dynamic typing

Types for References, Exceptions and Continuations. Review of Subtyping. Γ e:τ τ <:σ Γ e:σ. Annoucements. How s the midterm going?

1 Introduction. 3 Syntax

The Polymorphic Blame Calculus and Parametricity

Costly software bugs that could have been averted with type checking

CS558 Programming Languages

CS4215 Programming Language Implementation. Martin Henz

Type Inference. Prof. Clarkson Fall Today s music: Cool, Calm, and Collected by The Rolling Stones

Typed Lambda Calculus and Exception Handling

COS 320. Compiling Techniques

Consistent Subtyping for All

CS 6110 S11 Lecture 25 Typed λ-calculus 6 April 2011

Lecture 9: Typed Lambda Calculus

CSE 505, Fall 2008, Midterm Examination 29 October Please do not turn the page until everyone is ready.

Chapter 13: Reference. Why reference Typing Evaluation Store Typings Safety Notes

We defined congruence rules that determine the order of evaluation, using the following evaluation

Pierce Ch. 3, 8, 11, 15. Type Systems

Part III. Chapter 15: Subtyping

CSCI-GA Scripting Languages

At build time, not application time. Types serve as statically checkable invariants.

Note that in this definition, n + m denotes the syntactic expression with three symbols n, +, and m, not to the number that is the sum of n and m.

Harvard School of Engineering and Applied Sciences Computer Science 152

CIS 500 Software Foundations Fall October 2

Types. Type checking. Why Do We Need Type Systems? Types and Operations. What is a type? Consensus

Type Systems. Pierce Ch. 3, 8, 11, 15 CSE

Subtyping. Lecture 13 CS 565 3/27/06

Subsumption. Principle of safe substitution

More Lambda Calculus and Intro to Type Systems

Type Checking and Type Inference

Concepts of programming languages

CSE 413 Languages & Implementation. Hal Perkins Winter 2019 Structs, Implementing Languages (credits: Dan Grossman, CSE 341)

The design of a programming language for provably correct programs: success and failure

Typing Multi-Staged Programs and Beyond

Operational Semantics. One-Slide Summary. Lecture Outline

CS558 Programming Languages

CSE-321 Programming Languages 2010 Final

The Substitution Model

Writing Evaluators MIF08. Laure Gonnord

Typed Lambda Calculus. Chapter 9 Benjamin Pierce Types and Programming Languages

Softwaretechnik. Lecture 03: Types and Type Soundness. Peter Thiemann. University of Freiburg, Germany SS 2008

The gradual typing approach to mixing static and dynamic typing

Typing Control. Chapter Conditionals

Programming Language Features. CMSC 330: Organization of Programming Languages. Turing Completeness. Turing Machine.

CMSC 330: Organization of Programming Languages

CSE 505 Graduate PL. Fall 2013

Programming Languages Lecture 15: Recursive Types & Subtyping

Part III Chapter 15: Subtyping

Chapter 13: Reference. Why reference Typing Evaluation Store Typings Safety Notes

CSE 505: Programming Languages. Lecture 10 Types. Zach Tatlock Winter 2015

Formal Systems and their Applications

Harvard School of Engineering and Applied Sciences CS 152: Programming Languages

Dependent types and program equivalence. Stephanie Weirich, University of Pennsylvania with Limin Jia, Jianzhou Zhao, and Vilhelm Sjöberg

Interlanguage Migration

COSC 2P91. Bringing it all together... Week 4b. Brock University. Brock University (Week 4b) Bringing it all together... 1 / 22

Closures. Mooly Sagiv. Michael Clarkson, Cornell CS 3110 Data Structures and Functional Programming

From IMP to Java. Andreas Lochbihler. parts based on work by Gerwin Klein and Tobias Nipkow ETH Zurich

Topics Covered Thus Far CMSC 330: Organization of Programming Languages

Compilation 2012 Static Type Checking

Collage of Static Analysis

Dependent types and program equivalence. Stephanie Weirich, University of Pennsylvania with Limin Jia, Jianzhou Zhao, and Vilhelm Sjöberg

Typing & Static Analysis of Multi-Staged Programs

Lecture 3: Recursion; Structural Induction

Let Arguments Go First

Simple Unification-based Type Inference for GADTs

Dynamic Types , Spring March 21, 2017

CIS 500 Software Foundations Fall September 25

CSE 505: Concepts of Programming Languages

Static Type Checking. Static Type Checking. The Type Checker. Type Annotations. Types Describe Possible Values

Operational Semantics of Cool

Types and Type Inference

Intermediate Code Generation Part II

` e : T. Gradual Typing. ` e X. Ronald Garcia University of British Columbia

Lecture #23: Conversion and Type Inference

Conversion vs. Subtyping. Lecture #23: Conversion and Type Inference. Integer Conversions. Conversions: Implicit vs. Explicit. Object x = "Hello";

CMSC 330: Organization of Programming Languages. Operational Semantics

Formal Semantics. Aspects to formalize. Lambda calculus. Approach

Formal Semantics. Prof. Clarkson Fall Today s music: Down to Earth by Peter Gabriel from the WALL-E soundtrack

Gradual Typing with Union and Intersection Types

Outline. Introduction Concepts and terminology The case for static typing. Implementing a static type system Basic typing relations Adding context

Part VI. Imperative Functional Programming

Exploring the Design Space of Higher-Order Casts ; CU-CS

Fundamental Concepts. Chapter 1

Types and Type Inference

Transcription:

Gradual Typing for Functional Languages Jeremy Siek and Walid Taha (presented by Lindsey Kuper) 1

Introduction 2

What we want Static and dynamic typing: both are useful! (If you re here, I assume you agree.) So, we want a type system that......lets us choose the degree to which we want to annotate programs with types....lets us write programs in a dynamically typed style (no explicit coercions to/from type Dynamic)....uses type annotations for static type checking, not just improving run-time performance....behaves just like a static type system on completely annotated programs. Siek and Taha s gradual type system fulfills all these desires. 3

Contributions Siek and Taha present lmd λ? ( lambda-dyn ) and show that: λ? lmd, with its gradual type system, is equivalent to the STLC for fully-annotated programs. (Theorem 1) They extend the language with references and assignment to show that it s suitable for imperative languages as well. lmd λ? is type safe: if evaluation terminates, the result is either a value of the expected type or a cast error, but not a type error. (Theorem 2) On the way to Theorem 2, they prove an interesting result λ? about lmd: the run-time cost of dynamism in the language is pay-as-you-go. The proofs are all mechanically verified with Isabelle. 4

Introduction to Gradual Typing 5

Syntax of lmd λ? λ? The syntax of lmd is simple. We have: variables x ground types ɣ constants c types τ ɣ τ τ expressions e c x λx : τ. e e e λx. e λx :. e We indicate the unknown portions of a type with, so a type number is a pair of a number and an element of unknown type. Programming in a dynamically-typed style in this language is easy. Just leave off the type annotations on parameters. (A λ with no parameter type annotation is sugar for one that has parameter type.) 6

What the type system does: Easy first-order example The job of the type system is to reject programs that have inconsistencies in the known parts of types. ((lambda (x : number) (succ x)) #t) This program is rejected because it s an application of a function of type number number to an argument of type boolean. But ((lambda (x) (succ x)) #t) is accepted by the static type system (and the type error is caught at run-time). 7

What the type system does: Fancy higher-order example map : (number number) number list number list (map (lambda (x) (succ x)) (list 1 2 3)) We d like (lambda (x) (succ x)) to be accepted by our type system, but it has type number and map expects type number number. How do we design the type system to not reject this program? Intuition: require known portions of the two types number and number number to be equal; ignore the unknown parts. In effect, we re delaying comparison of unknown parts until run-time. Analogy with partial functions: two partial functions are consistent when every element in the domain of both functions maps to the same result. 8

Type consistency rules Also known as compatibility rules. Just four simple rules: CREFL: Every type is consistent with itself. CFUN: If σ1 ~ τ1 and σ2 ~ τ2, then σ1 σ2 ~ τ1 τ2. CUNL: Every type is consistent with. CUNR: is consistent with every type. Reflexive and symmetric, but not transitive. 9

Typing rules Rules for variables, constants, λ expressions: exactly like STLC. Rules for application: (GAPP1) If the operator s type is, the type of the entire expression is. (GAPP2) If the operator s type is τ τ and the operand s type is consistent with τ, then the type of the entire expression is τ. (Theorem 1) For STLC terms (aka fully-annotated terms), G typing judgments are just like STLC typing judgments. Proof sketch: throw out any typing rules that mention. We re left with the STLC s typing rules. (Corollary 1) If an STLC term isn t well-typed under STLC typing rules, it isn t well-typed under G typing rules, either. 10

Run-time semantics 11

Adding explicit casts lmd λ? doesn t make programmers write explicit casts; instead, it inserts them itself, producing an intermediate language we call lmt ( lambda-cast ). lmt is the language we ll actually evaluate. λ τ The translation to λ lmt τ only requires casts to be inserted for certain kinds of application expressions: (CAPP1) For function applications where the function s type is, just insert a cast to τ2 where τ2 is the argument s type. (CAPP2) For function applications where the function s type is τ τ and the argument s type is τ2, which is ~ with τ, we just cast the argument to τ. λ lmt s τ typing rules are much like STLC s, but with a rule added for expressions containing an explicit cast. λ τ 12

Useful properties of lmt λ τ Lemma 1. Inversion lemmas for λ lmt s τ typing rules. (These lemmas, which invert the typing rules, come in handy for some of the other lemmas.) Lemma 2. Every λ lmt τ expression has a unique type. Lemma 3. Cast insertion produces well-typed lmt terms. λ τ Lemma 4. Cast insertion does nothing to STLC terms. 13

Run-time semantics of lmt λ τ The result of evaluating a λ lmt τ term can either be a value, a CastError, a TypeError, or a KillError. There are two kinds of run-time type errors: those that cause undefined behavior (like what happens when we have a buffer overflow in C) and those that are caught by the run-time system (like in Scheme). We say that the former are TypeErrors and the latter are CastErrors. We need KillError because of a technicality in the type safety proof. It could have been avoided if we d been using a small-step semantics rather than a big-step semantics for lmt. λ τ 14

That canonical forms lemma that we said would be interesting Canonical forms lemmas always say something like If v is of type τ, then it must be.... For instance, if v is of type boolean, then it must be either #t or #f. Handy for compiler optimizations: we can use an efficient unboxed representation for every value whose type is completely known at compile time. And we get these efficient representations proportionally to the amount that we use type annotations in our programs: we pay as we go for efficiency. 15

Evaluation of lmt λ τ λ lmt τ has an operational semantics defined in big-step style, where each rule completely evaluates the expression to a result. For instance... (ECSTG) If we evaluate e for n steps, producing a result v, and v (unboxed if necessary) has type ɣ, then e cast to ɣ can be evaluated for n+1 steps to produce v. (ECSTE) If e evaluates to v and v has type σ, which is inconsistent with τ, then a cast of e to τ will result in a runtime CastError. This big-step semantics is unusual and prevents us from using a more typical progress-and-preservation-style proof of type safety. 16

Examples Our original example ((lambda (x) (succ x)) #t) produces a CastError at run-time. Our higher-order example ((lambda (f : number) (f 1)) (lambda (x : number) (succ x))) evaluates to the result 2. 17

A few more lemmas on the way to type safety Lemma 6 (Environment Expansion and Contraction). If a term e has type τ under environment Γ......and we extend Γ with a binding for a fresh variable, e still has type τ. (Pierce calls this weakening.)...and we remove something we don t need from the environment, e still has type τ....and we swap in a new store typing for the old one, as long as they agree on the types of all locations, e still has type τ. Lemma 7 (Substitution preserves typing). If e has type τ under Γ and we substitute some subexpression x of e with another subexpression e of the same type as x, e still has type τ. 18

Finally, a proof of type safety Lemma 8 (Soundness of evaluation). If an λ lmt τ expression e is well-typed with type τ (which can include ), it will evaluate to a result r, which will be either a value, a CastError, or a KillError. Theorem 2 (Type safety). If a lmd λ? expression e with type τ can be converted to a λ lmt τ expression e with type τ, it will evaluate to a result r, which will be either a value, a CastError, or a KillError. Proof: Lemma 3 (cast insertion produces welltyped lmt terms) followed by Lemma 8. λ τ 19

Adding references to lmd λ? A couple of additions to the syntax: types τ... ref τ expressions e... ref e!e e e ref e creates;!e dereferences; e e assigns and returns the value of the expression on the left after the assignment has happened. Interesting typing rules: (GDEREF1) If e s type is then!e s type is. (GASSIGNI) If e1 s type is and e2 s type is τ, then e1 e2 has type ref τ. (GASSIGN2) If e1 s type is ref τ and e2 s type is σ, and σ ~ τ, then e1 e2 has type ref τ. Types of locations can t change, or type safety is compromised. 20

Related work We re probably reading these two papers within the next 1-2 weeks: Quasi-static typing (Section 3) Abadi et al. s language of explicit casts (Section 6) Languages with some degree of gradual typing, previously not formalized: Cecil, Boo, Bigloo, proposed extensions to VB.NET/C # and Java,... (and since the paper came out: Typed Racket, and maybe also JavaScript) Languages with optional type annotations for run-time performance improvement only: Common Lisp, Dylan,... Soft Typing: type inference for run-time performance improvement Lots of others! 21

Conclusion 22

Main points It s no fun to start writing code in a dynamic language only to have to translate to a static language midway through. Ideally, you could keep the same language, and the language would have a type system that supports gradual addition of static types. Gradual typing gives us that. In lmd, λ? all programs are type-safe in the sense that non-type-safe actions can t be completed, either because of static type checking or because of run-time exceptions. lmd λ? is pay-as-you-go: the degree to which one or the other mechanism enforces the type safety of a particular program corresponds to the degree to which that program has type annotations. (And we get as much efficiency as we pay for, too.) 23

Possible directions for future work Add support for lists, arrays, ADTs, implicit coercions (such as between numeric types in Scheme) to a gradual type system. Investigate relationship between gradual typing and......parametric polymorphism....hindley-milner type inference. Incorporate gradual typing into a mainstream dynamic language (Python?) and find out if it really benefits programmer productivity. Everything else we re going to talk about in this course... 24

(exit) 25