Type Inference. Prof. Clarkson Fall Today s music: Cool, Calm, and Collected by The Rolling Stones

Similar documents
The Environment Model

Outline. Introduction Concepts and terminology The case for static typing. Implementing a static type system Basic typing relations Adding context

The Environment Model. Nate Foster Spring 2018

CSCI-GA Scripting Languages

The Substitution Model

Closures. Mooly Sagiv. Michael Clarkson, Cornell CS 3110 Data Structures and Functional Programming

Closures. Mooly Sagiv. Michael Clarkson, Cornell CS 3110 Data Structures and Functional Programming

Topic 9: Type Checking

Topic 9: Type Checking

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

Topics Covered Thus Far CMSC 330: Organization of Programming Languages

Type Checking and Type Inference

CMSC 330: Organization of Programming Languages

Simply-Typed Lambda Calculus

Semantic Analysis. Lecture 9. February 7, 2018

The role of semantic analysis in a compiler

Programming Languages, Summary CSC419; Odelia Schwartz

CS558 Programming Languages

CSCE 314 Programming Languages. Type System

Scope and Introduction to Functional Languages. Review and Finish Scoping. Announcements. Assignment 3 due Thu at 11:55pm. Website has SML resources

Type Systems, Type Inference, and Polymorphism

Types and Type Inference

Interpreters. Prof. Clarkson Fall Today s music: Step by Step by New Kids on the Block

Topics Covered Thus Far. CMSC 330: Organization of Programming Languages. Language Features Covered Thus Far. Programming Languages Revisited

Types, Type Inference and Unification

TYPE INFERENCE. François Pottier. The Programming Languages Mentoring ICFP August 30, 2015

Formal Semantics. Prof. Clarkson Fall Today s music: Down to Earth by Peter Gabriel from the WALL-E soundtrack

Introduction to OCaml

CSC324 Principles of Programming Languages

CS Lecture 7: The dynamic environment. Prof. Clarkson Spring Today s music: Down to Earth by Peter Gabriel from the WALL-E soundtrack

Types and Type Inference

! Broaden your language horizons! Different programming languages! Different language features and tradeoffs. ! Study how languages are implemented

A Tour of Language Implementation

Type Systems COMP 311 Rice University Houston, Texas

The Substitution Model. Nate Foster Spring 2018

Lecture 13 CIS 341: COMPILERS

ML Type Inference and Unification. Arlen Cox

CS 242. Fundamentals. Reading: See last slide

Overview. Elements of Programming Languages. Another example. Consider the humble identity function

1. true / false By a compiler we mean a program that translates to code that will run natively on some machine.

Gradual Typing for Functional Languages. Jeremy Siek and Walid Taha (presented by Lindsey Kuper)

Lecture 5: The Untyped λ-calculus

Introduction to Prof. Clarkson Fall Today s music: Prelude from Final Fantasy VII by Nobuo Uematsu (remastered by Sean Schafianski)

Concepts of Programming Languages

CMSC 330: Organization of Programming Languages. Lambda Calculus

ITT8060 Advanced Programming

Concepts of Programming Languages

Administration CS 412/413. Why build a compiler? Compilers. Architectural independence. Source-to-source translator

Key components of a lang. Deconstructing OCaml. In OCaml. Units of computation. In Java/Python. In OCaml. Units of computation.

CS152 Programming Language Paradigms Prof. Tom Austin, Fall Syntax & Semantics, and Language Design Criteria

Data Types. Every program uses data, either explicitly or implicitly to arrive at a result.

CMSC 330: Organization of Programming Languages. Formal Semantics of a Prog. Lang. Specifying Syntax, Semantics

CMSC 330: Organization of Programming Languages. Lambda Calculus

Scoping and Type Checking

Compilers and computer architecture: Semantic analysis

Type Bindings. Static Type Binding

Tracing Ambiguity in GADT Type Inference

CMSC 330: Organization of Programming Languages. Administrivia

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler so far

9/7/17. Outline. Name, Scope and Binding. Names. Introduction. Names (continued) Names (continued) In Text: Chapter 5

COSE212: Programming Languages. Lecture 3 Functional Programming in OCaml

Overloading, Type Classes, and Algebraic Datatypes

The design of a programming language for provably correct programs: success and failure

CS Lecture 6: Map and Fold. Prof. Clarkson Fall Today s music: Selections from the soundtrack to 2001: A Space Odyssey

CS A331 Programming Language Concepts

Writing Evaluators MIF08. Laure Gonnord

Dynamically-typed Languages. David Miller

CSE 341: Programming Languages

CSE 341: Programming Languages

Functional Programming

Semantic Analysis. How to Ensure Type-Safety. What Are Types? Static vs. Dynamic Typing. Type Checking. Last time: CS412/CS413

Dynamic Types , Spring March 21, 2017

CS-XXX: Graduate Programming Languages. Lecture 9 Simply Typed Lambda Calculus. Dan Grossman 2012

INF 212 ANALYSIS OF PROG. LANGS Type Systems. Instructors: Crista Lopes Copyright Instructors.

Operational Semantics. One-Slide Summary. Lecture Outline

G Programming Languages Spring 2010 Lecture 6. Robert Grimm, New York University

cs242 Kathleen Fisher Reading: Concepts in Programming Languages, Chapter 6 Thanks to John Mitchell for some of these slides.

INF 212/CS 253 Type Systems. Instructors: Harry Xu Crista Lopes

CS 11 Haskell track: lecture 1

The Compiler So Far. Lexical analysis Detects inputs with illegal tokens. Overview of Semantic Analysis

Chapter 5. Names, Bindings, and Scopes

CS 565: Programming Languages. Spring 2008 Tu, Th: 16:30-17:45 Room LWSN 1106

Outline. Introduction to Programming (in C++) Introduction. First program in C++ Programming examples

Course outline. CSE 341: Programming Languages. Why study programming languages? Course motivation and objectives. 1 lecture: Concepts

Announcements. Working on requirements this week Work on design, implementation. Types. Lecture 17 CS 169. Outline. Java Types

CMPT 379 Compilers. Anoop Sarkar.

CS153: Compilers Lecture 14: Type Checking

Polymorphism and Type Inference

IA010: Principles of Programming Languages

COMP 1130 Lambda Calculus. based on slides by Jeff Foster, U Maryland

8/27/17. CS-3304 Introduction. What will you learn? Semester Outline. Websites INTRODUCTION TO PROGRAMMING LANGUAGES

Motivation for typed languages

To figure this out we need a more precise understanding of how ML works

COMP 181. Agenda. Midterm topics. Today: type checking. Purpose of types. Type errors. Type checking

CS 11 Ocaml track: lecture 2

Introduction to ML. Based on materials by Vitaly Shmatikov. General-purpose, non-c-like, non-oo language. Related languages: Haskell, Ocaml, F#,

CSE450. Translation of Programming Languages. Lecture 11: Semantic Analysis: Types & Type Checking

Fundamentals of Artificial Intelligence COMP221: Functional Programming in Scheme (and LISP)

Lecture 12: Data Types (and Some Leftover ML)

Adding GADTs to OCaml the direct approach

Transcription:

Type Inference Prof. Clarkson Fall 2016 Today s music: Cool, Calm, and Collected by The Rolling Stones

Review Previously in 3110: Interpreters: ASTs, evaluation, parsing Formal syntax Formal semantics Small-step Big-step Today: Type inference

Kinds of typing Static: type checking done by analysis of program Compiler/interpreter verifies that type errors cannot occur e.g., C, C++, F#, Haskell, Java, OCaml Dynamic: type checking done by run-time Run-time detects type errors and report them. Usually requires keeping extra tag information for each value in memory. e.g., JavaScript, LISP, Matlab, PHP, Python, Ruby Can be a spectrum, e.g., instanceof in Java: some checking done at compile time, rest of checking done at run time

Kinds of typing Strong: type of a value is independent of how it s used Can t pass a string where an int expected, etc. e.g., OCaml, Haskell, Python, Java, Ruby Weak: type of value is dependent on how it s used If a string is used where an int expected, it gets converted automatically or by type cast to an int e.g., C, C++, Perl Can be a spectrum e.g., Java + operator converts objects to strings Troll alert: strong vs. weak is debated a lot; probably not helpful to degenerate into such debates

Typing quadrant Weak Strong Sta,c C, C++ OCaml, Java, Haskell Dynamic Perl, Assembly Ruby, Python, Scheme

Kinds of typing Manifest: type information supplied in source code e.g., C, C++, Java Implicit: type information not supplied in source code Implementation 1: Dynamic typing e.g., LISP, Python, Ruby, PHP Implementation 2: Type inference e.g., Haskell, OCaml Tradeoff: ease of implementation vs. run-time performance Can be a spectrum e.g., no reasonable language requires you to write to provide the type of 5 in x:int = 5

Type inference Goal is to reconstruct types of expressions based on known types of some symbols that occur in expressions Type checkers have to do some of this anyway Difference between inference and checking is really a matter of degree Best known in functional languages Especially useful in managing the types of higher-order functions But starting to appear in mainstream languages, e.g., C++11: auto x = e; declares variable x, initialized with expression e, and type of x is automatically inferred decltype(e) is a type that means whatever type e has Invented by Robin Milner for SML (though other people also deserve credit; see the notes)

Robin Milner Awarded 1991 Turing Award for ML, the first language to include polymorphic type inference and a typesafe exception handling mechanism 1934-2010

Is type inference hard? The algorithm used in ML is quite clever yet relatively easy to implement Difficulty of doing type inference for any particular language is often hard to determine Designing type inference for a particular language can be quite hard; must balance expressivity of type system with possibility of inferring all types without requiring annotations

HM type inference Algorithm used in OCaml is called HM Hindley & Milner invented it independently Guarantees of HM: It never makes mistakes. HM will never infer types that cause a program to fail to type check. It never fails. HM will never reject a program that could have been type-checked if programmer had written down all the types. (true of nearly all the language; over time some features have been added for which it's not true; see RWO for examples)

HM type inference Determine types of definitions in order Use types of earlier definitions to infer later (which is one reason why you can t use later definitions in file) For each definition: collect constraints on types solve constraints to determine type

Example let g x = 5 + x Desugar: let g = fun x -> ((+) 5) x fun x apply apply x (+) 5

Example let g = fun x -> ((+) 5) x Step 1: Assign preliminary types to all subexpressions Subexpression fun x -> ((+) 5) x Preliminary type

Example let g = fun x -> ((+) 5) x Step 1: Assign preliminary types to all subexpressions Subexpression fun x -> ((+) 5) x x ((+) 5) x (+) 5 (+) 5 x Preliminary type

Example let g = fun x -> ((+) 5) x Step 1: Assign preliminary types to all subexpressions Subexpression Preliminary type fun x -> ((+) 5) x x ((+) 5) x (+) 5 (+) int -> int -> int 5 int x

Example let g = fun x -> ((+) 5) x Step 1: Assign preliminary types to all subexpressions Subexpression Preliminary type fun x -> ((+) 5) x R x U ((+) 5) x S (+) 5 T (+) int -> int -> int 5 int x V R,S,T,U,V are preliminary type variables used during inference

Example Subexpression Preliminary type fun x -> ((+) 5) x R x U ((+) 5) x S (+) 5 T (+) int -> int -> int 5 int x V fun : R x : U apply : S apply : T x : V (+) :int->int->int 5:int

Question Did we really need to give x two different preliminary type variables? x : U fun : R apply : S apply : T x : V A. Yes B. No (+) :int->int->int 5:int

Question Did we really need to give x two different preliminary type variables? x : U fun : R apply : S apply : T x : V A. Yes B. No (+) :int->int->int 5:int

Example let g = fun x -> ((+) 5) x Step 2: Collect constraints

Example let g = fun x -> ((+) 5) x Step 2: Collect constraints Subexpression Preliminary type fun x -> ((+) 5) x R x U ((+) 5) x S Constraint from function: R = U -> S

Example let g = fun x -> ((+) 5) x Step 2: Collect constraints Subexpression x x Preliminary type U V Constraint from variable usage: U = V

Example let g = fun x -> ((+) 5) x Step 2: Collect constraints Subexpression Preliminary type ((+) 5) x S x V (+) 5 T Constraint from application: T = V -> S

Example let g = fun x -> ((+) 5) x Step 2: Collect constraints Subexpression Preliminary type (+) 5 T (+) int -> int -> int 5 int Constraint from application: int -> int -> int = int -> T

Example let g = fun x -> ((+) 5) x Step 2: Collect constraints U = V R = U-> S T = V-> S int -> int -> int = int -> T

Example let g = fun x -> ((+) 5) x Step 3: Solve constraints U R = U-> V S T R = U-> S int -> int -> int T = int V-> S -> T int -> int -> int = int -> T

Example let g = fun x -> ((+) 5) x Step 3: Solve constraints U R = U-> V S T R = U-> S int -> int -> int T = int V-> S -> T int -> int -> int = int -> T

Example let g = fun x -> ((+) 5) x Step 3: Solve constraints R = U-> S T = U-> S int -> int -> int = int -> T

Example let g = fun x -> ((+) 5) x Step 3: Solve constraints R = U-> S T = U-> S int -> int -> int = int -> T

Example let g = fun x -> ((+) 5) x Step 3: Solve constraints R = U-> S int -> int -> int = int -> U -> S

Example let g = fun x -> ((+) 5) x Step 3: Solve constraints R = U-> S int -> int -> int = int -> U -> S

Example let g = fun x -> ((+) 5) x Step 3: Solve constraints R = int -> int

Example let g = fun x -> ((+) 5) x Step 3: Solve constraints R = int -> int Done: type of g is int -> int

Algorithm for constraint collection Input: an expression e Assume that every anonymous function in e has a different variable name as its argument Easy to ensure that holds, thanks to lexical scope: rename arguments as necessary Output: a set of constraints

Constraint collection Intuition: assign a unique type variable (e.g., R, S, T, ), one to each argument x of a function in e one to every subexpression e in e like how we decorated (aka annotated) AST in example Formally: define two functions that return type variables: D: definition of an argument U: use of a subexpression D(x) returns the type variable assigned to argument x U(e ) returns the type variable assigned to subexpression e

Constraint collection Example: Input: fun x -> (fun y -> x) Define two functions for type variables: D(x) = R D(y) = S U(fun x -> (fun y -> x)) = T U(fun y -> x) = X U(x) = Y

Constraint collection Constraints that are collected (intuition): For each kind of expression (application, anonymous function, let, etc.), collect a set of equations that must hold for that kind of expression e.g., the type of entire anonymous function must equal type of its argument arrow type of its body which is what we did in example earlier

Constraint collection Constraints that are collected (formally): At a variable usage x: U(x) = D(x) At a function application e1 e2: U(e1) = U(e2) -> U(e1 e2) At an anonymous function fun x -> e: U(fun x -> e) = D(x) -> U(e) At a let expression let x = e1 in e2: U(let x = e1 in e2) = U(e2) and D(x) = U(e1) etc. Unioned with constraints collected at each subexpression Note how these are essentially the static semantics! Return those constraints as output of algorithm

Constraint collection Example (continued): Input: fun x -> (fun y -> x) x occurs as subexpression, so generate constraint U(x) = D(x) Already have U(x) = Y and D(x) = R So constraint is Y = R fun y -> ux occurs as subexpression, so generate constraint U(fun y -> x) = D(y) -> U(x) Already have U(x) = Y, and U(fun y -> x) = X, and D(y) = S So constraint is X = S -> Y fun x -> (fun y -> x) occurs as subexpression, so generate constraint U(fun x -> (fun y -> x)) = D(x) -> U(fun y -> x) Resulting constraint is T = R -> X

Solving constraints After collection, have a set of constraints Really a set of equations Need to solve those equations for type of main expression of interest Unification algorithm [Robinson 1965] roughly like Gaussian elimination to solve system of matrix equations in linear algebra see notes for the algorithm

Upcoming events nothing this week This is cool, calm, and collected. THIS IS 3110