Types and Type Inference

Similar documents
Types and Type Inference

Types, Type Inference and Unification

cs242 Kathleen Fisher Reading: Concepts in Programming Languages, Chapter 6 Thanks to John Mitchell for some of these slides.

Principles of Programming Languages

301AA - Advanced Programming [AP-2017]

Polymorphism and Type Inference

Polymorphism and Type Inference

Type Systems, Type Inference, and Polymorphism

INF 212/CS 253 Type Systems. Instructors: Harry Xu Crista Lopes

INF 212 ANALYSIS OF PROG. LANGS Type Systems. Instructors: Crista Lopes Copyright Instructors.

G Programming Languages Spring 2010 Lecture 6. Robert Grimm, New York University

Type Checking and Type Inference

CS 330 Lecture 18. Symbol table. C scope rules. Declarations. Chapter 5 Louden Outline

CS558 Programming Languages

Data Types. Every program uses data, either explicitly or implicitly to arrive at a result.

Principles of Programming Languages

Design Issues. Subroutines and Control Abstraction. Subroutines and Control Abstraction. CSC 4101: Programming Languages 1. Textbook, Chapter 8

Static Semantics. Winter /3/ Hal Perkins & UW CSE I-1

CSCE 314 Programming Languages. Type System

Static Semantics. Lecture 15. (Notes by P. N. Hilfinger and R. Bodik) 2/29/08 Prof. Hilfinger, CS164 Lecture 15 1

CS558 Programming Languages

CS558 Programming Languages

Topics Covered Thus Far CMSC 330: Organization of Programming Languages

Types. What is a type?

Overloading, Type Classes, and Algebraic Datatypes

Weeks 6&7: Procedures and Parameter Passing

CS 430 Spring Mike Lam, Professor. Data Types and Type Checking

Programmiersprachen (Programming Languages)

Polymorphic lambda calculus Princ. of Progr. Languages (and Extended ) The University of Birmingham. c Uday Reddy

Chapter 5. Names, Bindings, and Scopes

Concepts Introduced in Chapter 7

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

Programming Languages Third Edition. Chapter 7 Basic Semantics

Chapter 5 Names, Binding, Type Checking and Scopes

Language Sequence. The Algol Family and ML. Algol 60 Sample. Algol 60. Some trouble spots in Algol 60. Algol Joke. John Mitchell

CS558 Programming Languages

CS321 Languages and Compiler Design I Winter 2012 Lecture 13

CSCI-GA Scripting Languages

Some instance messages and methods

Generic programming in OO Languages

Heap, Variables, References, and Garbage. CS152. Chris Pollett. Oct. 13, 2008.

Seminar in Programming Languages

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 17: Types and Type-Checking 25 Feb 08

Imperative Programming

Programming Languages, Summary CSC419; Odelia Schwartz

Type Bindings. Static Type Binding

Discussion. Type 08/12/2016. Language and Type. Type Checking Subtypes Type and Polymorphism Inheritance and Polymorphism

Outline. Java Models for variables Types and type checking, type safety Interpretation vs. compilation. Reasoning about code. CSCI 2600 Spring

Introduction to ML. Based on materials by Vitaly Shmatikov. General-purpose, non-c-like, non-oo language. Related languages: Haskell, Ocaml, F#,

CMSC 331 Final Exam Section 0201 December 18, 2000

Topic 9: Type Checking

Topic 9: Type Checking

Lecture #23: Conversion and Type Inference

TYPES, VALUES AND DECLARATIONS

Scope, Functions, and Storage Management

Lecture Overview. [Scott, chapter 7] [Sebesta, chapter 6]

Conversion vs. Subtyping. Lecture #23: Conversion and Type Inference. Integer Conversions. Conversions: Implicit vs. Explicit. Object x = "Hello";

CS 345. Functions. Vitaly Shmatikov. slide 1

Closures. Mooly Sagiv. Michael Clarkson, Cornell CS 3110 Data Structures and Functional Programming

Names, Scopes, and Bindings II. Hwansoo Han

CS 242. Fundamentals. Reading: See last slide

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

22c:111 Programming Language Concepts. Fall Types I

G Programming Languages - Fall 2012

Principles of Programming Languages COMP251: Functional Programming in Scheme (and LISP)

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres

Properties of an identifier (and the object it represents) may be set at

Question No: 1 ( Marks: 1 ) - Please choose one One difference LISP and PROLOG is. AI Puzzle Game All f the given

Programming Languages

COS 320. Compiling Techniques

02157 Functional Programming. Michael R. Ha. Lecture 2: Functions, Types and Lists. Michael R. Hansen

CSE 307: Principles of Programming Languages

Programming Language Concepts: Lecture 14

CS 320: Concepts of Programming Languages

Data Types The ML Type System

9/7/17. Outline. Name, Scope and Binding. Names. Introduction. Names (continued) Names (continued) In Text: Chapter 5

Functional Programming. Pure Functional Programming

Lambda Calculus and Type Inference

Lecture 16: Static Semantics Overview 1

G Programming Languages - Fall 2012

COMP 181. Agenda. Midterm topics. Today: type checking. Purpose of types. Type errors. Type checking

Principles of Programming Languages Topic: Scope and Memory Professor Louis Steinberg Fall 2004

Fundamentals of Artificial Intelligence COMP221: Functional Programming in Scheme (and LISP)

Lists. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca

Semantic Analysis. How to Ensure Type-Safety. What Are Types? Static vs. Dynamic Typing. Type Checking. Last time: CS412/CS413

Static Checking and Type Systems

Introduction to Programming Using Java (98-388)

NOTE: Answer ANY FOUR of the following 6 sections:

Type systems. Static typing

Lecture 7: Type Systems and Symbol Tables. CS 540 George Mason University

Lecture 19: Functions, Types and Data Structures in Haskell

Compiler Construction

Functional Languages and Higher-Order Functions

1. true / false By a compiler we mean a program that translates to code that will run natively on some machine.

Principles of Programming Languages. Lecture Outline

Run-time Environments - 3

Types. Type checking. Why Do We Need Type Systems? Types and Operations. What is a type? Consensus

CSCI 171 Chapter Outlines

CSC312 Principles of Programming Languages : Functional Programming Language. Copyright 2006 The McGraw-Hill Companies, Inc.

Transcription:

CS 242 2012 Types and Type Inference Notes modified from John Mitchell and Kathleen Fisher Reading: Concepts in Programming Languages, Revised Chapter 6 - handout on Web!!

Outline General discussion of types What is a type? Compile-time versus run-time checking Conservative program analysis Type inference Discuss algorithm and examples Illustrative example of static analysis algorithm Polymorphism Uniform versus non-uniform implementations

Language Goals and Trade-offs Thoughts to keep in mind What features are convenient for programmer? What other features do they prevent? What are design tradeoffs? Easy to write but harder to read? Easy to write but poorer error messages? What are the implementation costs? Architect Programmer Q/A Tester Programming Language Diagnostic Tools Compiler, Runtime environment

What is a type? A type is a collection of computable values that share some structural property. Examples Integer String Int Bool (Int Int) Bool Non-examples 3, True, \x->x Even integers f:int Int x>3 => f(x) > x *(x+1) Distinction between sets of values that are types and sets that are not types is language dependent.

Advantages of Types Program organization and documentation Separate types for separate concepts Represent concepts from problem domain Document intended use of declared identifiers Types can be checked, unlike program comments Identify and prevent errors Compile-time or run-time checking can prevent meaningless computations such as 3 + true Bill Support optimization Example: short integers require fewer bits Access components of structures by known offset

What is a type error? Whatever the compiler/interpreter says it is? Something to do with bad bit sequences? Floating point representation has specific form An integer may not be a valid float Something about programmer intent and use? A type error occurs when a value is used in a way that is inconsistent with its definition Example: declare as character, use as integer

Type errors are language dependent Array out of bounds access C/C++: runtime errors. Haskell/Java: dynamic type errors. Null pointer dereference C/C++: run-time errors Haskell/ML: pointers are hidden inside datatypes Null pointer dereferences would be incorrect use of these datatypes, therefore static type errors

Compile-time vs Run-time Checking JavaScript and Lisp use run-time type checking f(x) Make sure f is a function before calling f Haskell and Java use compile-time type checking f(x) Must have f :: A B and x :: A Basic tradeoff Both kinds of checking prevent type errors Run-time checking slows down execution Compile-time checking restricts program flexibility JavaScript array: elements can have different types Haskell list: all elements must have same type Which gives better programmer diagnostics?

Expressiveness In JavaScript, we can write a function like function f(x) { return x < 10? x : x(); } Some uses will produce type error, some will not. Static typing always conservative if (complicated-boolean-expression) then f(5); else f(15); Cannot decide at compile time if run-time error will occur!

Relative Type-Safety of Languages Not safe: BCPL family, including C and C++ Casts, pointer arithmetic Almost safe: Algol family, Pascal, Ada. Dangling pointers. Allocate a pointer p to an integer, deallocate the memory referenced by p, then later use the value pointed to by p. No language with explicit deallocation of memory is fully type-safe. Safe: Lisp, Smalltalk, ML, Haskell, Java, JavaScript Dynamically typed: Lisp, Smalltalk, JavaScript Statically typed: ML, Haskell, Java If code accesses data, it is handled with the type associated with the creation and previous manipulation of that data

Type Checking vs Type Inference Standard type checking: int f(int x) { return x+1; }; int g(int y) { return f(y+1)*2; }; Examine body of each function Use declared types to check agreement Type inference: int f(int x) { return x+1; }; int g(int y) { return f(y+1)*2; }; Examine code without type information. Infer the most general types that could have been declared. ML and Haskell are designed to make type inference feasible.

Why study type inference? Types and type checking Improved steadily since Algol 60 Eliminated sources of unsoundness. Become substantially more expressive. Important for modularity, reliability and compilation Type inference Reduces syntactic overhead of expressive types. Guaranteed to produce most general type. Widely regarded as important language innovation.

History Original type inference algorithm Invented by Haskell Curry and Robert Feys for the simply typed lambda calculus in 1958 In 1969, Hindley extended the algorithm to a richer language and proved it always produced the most general type In 1978, Milner independently developed equivalent algorithm, called algorithm W, during his work designing ML. In 1982, Damas proved the algorithm was complete. Currently used in many languages: ML, Ada, Haskell, C# 3.0, F#, Visual Basic.Net 9.0. Have been plans for Fortress, Perl 6, C++0x,

uhaskell Subset of Haskell to explain type inference. Haskell and ML both have overloading Will not cover type inference with overloading <decl> ::= [<name> <pat> = <exp>] <pat> ::= Id (<pat>, <pat>) <pat> : <pat> [] <exp> ::= Int Bool [] Id (<exp>) <exp> <op> <exp> <exp> <exp> (<exp>, <exp>) if <exp> then <exp> else <exp>

Type Inference: Basic Idea Example f x = 2 + x > f :: Int -> Int What is the type of f? + has type: Int Int Int 2 has type: Int Since we are applying + to x we need x :: Int Therefore f x = 2 + x has type Int Int

Step 1: Parse Program Parse program text to construct parse tree. f x = 2 + x Infix operators are converted to Curied function application during parsing: 2 + x (+) 2 x

Step 2: Assign type variables to nodes f x = 2 + x Variables are given same type as binding occurrence.

Step 3: Add Constraints f x = 2 + x t_0 = t_1 -> t_6 t_4 = t_1 -> t_6 t_2 = t_3 -> t_4 t_2 = Int -> Int -> Int t_3 = Int

Step 4: Solve Constraints t_0 = t_1 -> t_6 t_4 = t_1 -> t_6 t_2 = t_3 -> t_4 t_2 = Int -> Int -> Int t_3 = Int t_0 = t_1 -> t_6 t_4 = t_1 -> t_6 t_4 = Int -> Int t_2 = Int -> Int -> Int t_3 = Int t_0 = Int -> Int t_1 = Int t_6 = Int t_4 = Int -> Int t_2 = Int -> Int -> Int t_3 = Int t_3 -> t_4 = Int -> (Int -> Int) t_3 = Int t_4 = Int -> Int t_1 -> t_6 = Int -> Int t_1 = Int t_6 = Int

Step 5: Determine type of declaration t_0 = Int -> Int t_1 = Int t_6 = Int -> Int t_4 = Int -> Int t_2 = Int -> Int -> Int t_3 = Int f x = 2 + x > f :: Int -> Int

Type Inference Algorithm Parse program to build parse tree Assign type variables to nodes in tree Generate constraints: From environment: constants (2), built-in operators (+), known functions (tail). From form of parse tree: e.g., application and abstraction nodes. Solve constraints using unification Determine types of top-level declarations J. A. Robinson, A Machine-oriented logic based on the resolution principle,. J. Assoc. Comput. Mach. 12, 23 41 (1965).

Constraints from Application Nodes f x t_0 = t_1 -> t_2 Function application (apply f to x) Type of f (t_0 in figure) must be domain range. Domain of f must be type of argument x (t_1 in fig) Range of f must be result of application (t_2 in fig) Constraint: t_0 = t_1 -> t_2

Constraints from Abstractions f x = e t_0 = t_1 -> t_2 Function declaration: Type of f (t_0 in figure) must be domain range Domain is type of abstracted variable x (t_1 in fig) Range is type of function body e (t_2 in fig) Constraint: t_0 = t_1 -> t_2

Inferring Polymorphic Types Example: Step 1: Build Parse Tree f g = g 2 > f :: (Int -> t_4) -> t_4

Inferring Polymorphic Types Example: f g = g 2 Step 2: Assign type variables > f :: (Int -> t_4) -> t_4

Inferring Polymorphic Types Example: Step 3: Generate constraints t_0 = t_1 -> t_4 t_1 = t_3 -> t_4 t_3 = Int f g = g 2 > f :: (Int -> t_4) -> t_4

Inferring Polymorphic Types Example: Step 4: Solve constraints t_0 = t_1 -> t_4 t_1 = t_3 -> t_4 t_3 = Int f g = g 2 > f :: (Int -> t_4) -> t_4 t_0 = (Int -> t_4) -> t_4 t_1 = Int -> t_4 t_3 = Int

Inferring Polymorphic Types Example: Step 5: Determine type of top-level declaration Unconstrained type variables become polymorphic types. f g = g 2 > f :: (Int -> t_4) -> t_4 t_0 = (Int -> t_4) -> t_4 t_1 = Int -> t_4 t_3 = Int

Using Polymorphic Functions Function: f g = g 2 > f :: (Int -> t_4) -> t_4 Possible applications: add x = 2 + x > add :: Int -> Int iseven x = mod (x, 2) == 0 > iseven:: Int -> Bool f add > 4 :: Int f iseven > True :: Bool

Recognizing Type Errors Function: f g = g 2 > f :: (Int -> t_4) -> t_4 Incorrect use not x = if x then True else False > not :: Bool -> Bool f not > Error: operator and operand don t agree operator domain: Int -> a operand: Bool -> Bool Type error: cannot unify Bool Bool and Int t

Another Example Example: Step 1: Build Parse Tree f (g,x) = g (g x) > f :: (t_8 -> t_8, t_8) -> t_8

Another Example Example: Step 2: Assign type variables f (g,x) = g (g x) > f :: (t_8 -> t_8, t_8) -> t_8

Another Example Example: Step 3: Generate constraints f (g,x) = g (g x) > f :: (t_8 -> t_8, t_8) -> t_8 t_0 = t_3 -> t_8 t_3 = (t_1, t_2) t_1 = t_7 -> t_8 t_1 = t_2 -> t_7

Another Example Example: Step 4: Solve constraints f (g,x) = g (g x) > f :: (t_8 -> t_8, t_8) -> t_8 t_0 = t_3 -> t_8 t_3 = (t_1, t_2) t_1 = t_7 -> t_8 t_1 = t_2 -> t_7 t_0 = (t_8 -> t_8, t_8) -> t_8

Another Example Example: Step 5: Determine type of f f (g,x) = g (g x) > f :: (t_8 -> t_8, t_8) -> t_8 t_0 = t_3 -> t_8 t_3 = (t_1, t_2) t_1 = t_7 -> t_8 t_1 = t_2 -> t_7 t_0 = (t_8 -> t_8, t_8) -> t_8

Polymorphic Datatypes Functions may have multiple clauses length [] = 0 length (x:rest) = 1 + (length rest) Type inference Infer separate type for each clause Combine by adding constraint that all clauses must have the same type Recursive calls: function has same type as its definition

Type Inference with Datatypes Example: Step 1: Build Parse Tree length (x:rest) = 1 + (length rest)

Type Inference with Datatypes Example: length (x:rest) = 1 + (length rest) Step 2: Assign type variables

Type Inference with Datatypes Example: length (x:rest) = 1 + (length rest) Step 3: Generate constraints t_0 = t_3 -> t_10 t_3 = t_2 t_3 = [t_1] t_6 = t_9 -> t_10 t_4 = t_5 -> t_6 t_4 = Int -> Int -> Int t_5 = Int t_0 = t_2 -> t_9

Type Inference with Datatypes Example: Step 3: Solve Constraints length (x:rest) = 1 + (length rest) t_0 = t_3 -> t_10 t_3 = t_2 t_3 = [t_1] t_6 = t_9 -> t_10 t_4 = t_5 -> t_6 t_4 = Int -> Int -> Int t_5 = Int t_0 = t_2 -> t_9 t_0 = [t_1] -> Int

Multiple Clauses Function with multiple clauses Infer type of each clause First clause: append ([],r) = r append (x:xs, r) = x : append (xs, r) > append :: ([t_1], t_2) -> t_2 Second clause: > append :: ([t_3], t_4) -> [t_3] Combine by equating types of two clauses > append :: ([t_1], [t_1]) -> [t_1]

Most General Type Type inference produces the most general type map (f, [] ) = [] map (f, x:xs) = f x : map (f, xs) > map :: (t_1 -> t_2, [t_1]) -> [t_2] Functions may have many less general types > map :: (t_1 -> Int, [t_1]) -> [Int] > map :: (Bool -> t_2, [Bool]) -> [t_2] > map :: (Char -> Int, [Char]) -> [Int] Less general types are all instances of most general type, also called the principal type

Type Inference Algorithm When Hindley/Milner type inference algorithm was developed, its complexity was unknown In 1989, Kanellakis, Mairson, and Mitchell proved that the problem was exponentialtime complete Usually linear in practice though Running time is exponential in the depth of polymorphic declarations

Information from Type Inference Consider this function reverse [] = [] reverse (x:xs) = reverse xs and its most general type: > reverse :: [t_1] -> [t_2] What does this type mean? Reversing a list should not change its type, so there must be an error in the definition of reverse!

Type Inference: Key Points Type inference computes the types of expressions Does not require type declarations for variables Finds the most general type by solving constraints Leads to polymorphism Sometimes better error detection than type checking Type may indicate a programming error even if no type error. Some costs More difficult to identify program line that causes error. Natural implementation requires uniform representation sizes. Complications regarding assignment took years to work out. Idea can be applied to other program properties Discover properties of program using same kind of analysis

Haskell Type Inference Haskell uses type classes supports user-defined overloading, so the inference algorithm is more complicated. ML restricts the language to ensure that no annotations are required Haskell provides additional features like polymorphic recursion for which types cannot be inferred and so the user must provide annotations

Parametric Polymorphism: Haskell vs C++ Haskell polymorphic function Declarations (generally) require no type information Type inference uses type variables to type expressions Type inference substitutes for type variables as needed to instantiate polymorphic code C++ function template Programmer must declare the argument and result types of functions. Programmers must use explicit type parameters to express polymorphism Function application: type checker does instantiation

Example: Swap Two Values Haskell C++ swap :: (IORef a, IORef a) -> IO () swap (x,y) = do { val_x <- readioref x; val_y <- readioref y; writeioref y val_x; return () } template <typename T> void swap(t& x, T& y){ } T tmp = x; x=y; y=tmp; writeioref x val_y; Declarations both swap two values polymorphically, but they are compiled very differently.

Implementation Haskell swap is compiled into one function Typechecker determines how function can be used C++ swap is compiled differently for each instance (details beyond scope of this course ) Why the difference? Haskell ref cell is passed by pointer. The local x is a pointer to value on heap, so its size is constant. C++ arguments passed by reference (pointer), but local x is on the stack, so its size depends on the type.

Summary Types are important in modern languages Program organization and documentation Prevent program errors Provide important information to compiler Type inference Determine best type for an expression, based on known information about symbols in the expression Polymorphism Single algorithm (function) can have many types