Type Inference and Type Habitation (5/10/2004)

Similar documents
Outcome-Oriented Programming (5/12/2004)

Logic-Oriented Programming (5/11/2004)

C311 Lab #3 Representation Independence: Representation Independent Interpreters

Project 2: Scheme Interpreter

CS 342 Lecture 8 Data Abstraction By: Hridesh Rajan

Why do we need an interpreter? SICP Interpretation part 1. Role of each part of the interpreter. 1. Arithmetic calculator.

Building a system for symbolic differentiation

6.184 Lecture 4. Interpretation. Tweaked by Ben Vandiver Compiled by Mike Phillips Original material by Eric Grimson

Functional Programming. Pure Functional Programming

Building a system for symbolic differentiation

Principles of Programming Languages 2017W, Functional Programming

Data Abstraction. An Abstraction for Inductive Data Types. Philip W. L. Fong.

Lecture 09: Data Abstraction ++ Parsing is the process of translating a sequence of characters (a string) into an abstract syntax tree.

An Explicit-Continuation Metacircular Evaluator

Comp 311: Sample Midterm Examination

Streams, Delayed Evaluation and a Normal Order Interpreter. CS 550 Programming Languages Jeremy Johnson

1. For every evaluation of a variable var, the variable is bound.

6.037 Lecture 4. Interpretation. What is an interpreter? Why do we need an interpreter? Stages of an interpreter. Role of each part of the interpreter

Functional Programming. Pure Functional Languages

Scheme in Scheme: The Metacircular Evaluator Eval and Apply

;;; Determines if e is a primitive by looking it up in the primitive environment. ;;; Define indentation and output routines for the output for

A Brief Introduction to Scheme (II)

Fall Semester, The Metacircular Evaluator. Today we shift perspective from that of a user of computer langugaes to that of a designer of

Lecture 3: Expressions

Functional Programming - 2. Higher Order Functions

Intro. Scheme Basics. scm> 5 5. scm>

Using Symbols in Expressions (1) evaluate sub-expressions... Number. ( ) machine code to add

Functional Programming. Pure Functional Languages

Lecture08: Scope and Lexical Address

The Typed Racket Guide

Discussion 12 The MCE (solutions)

CS61A Midterm 2 Review (v1.1)

Typed Racket: Racket with Static Types

COP4020 Programming Languages. Functional Programming Prof. Robert van Engelen

A pattern-matcher for αkanren -or- How to get into trouble with CPS macros

6.001 Notes: Section 15.1

Typed Scheme: Scheme with Static Types

Scheme. Functional Programming. Lambda Calculus. CSC 4101: Programming Languages 1. Textbook, Sections , 13.7

LECTURE 16. Functional Programming

Lecture 15 CIS 341: COMPILERS

6.001 Notes: Section 8.1

Announcements. The current topic: Scheme. Review: BST functions. Review: Representing trees in Scheme. Reminder: Lab 2 is due on Monday at 10:30 am.

Summer 2017 Discussion 10: July 25, Introduction. 2 Primitives and Define

Fall 2018 Discussion 8: October 24, 2018 Solutions. 1 Introduction. 2 Primitives

Syntactic Sugar: Using the Metacircular Evaluator to Implement the Language You Want

CSE413: Programming Languages and Implementation Racket structs Implementing languages with interpreters Implementing closures

6.821 Programming Languages Handout Fall MASSACHVSETTS INSTITVTE OF TECHNOLOGY Department of Electrical Engineering and Compvter Science

Comp 411 Principles of Programming Languages Lecture 7 Meta-interpreters. Corky Cartwright January 26, 2018

Computer Science 21b (Spring Term, 2015) Structure and Interpretation of Computer Programs. Lexical addressing

(scheme-1) has lambda but NOT define

Introduction to Scheme

Documentation for LISP in BASIC

Functional Programming. Big Picture. Design of Programming Languages

4.2 Variations on a Scheme -- Lazy Evaluation

CSE341: Programming Languages Lecture 17 Implementing Languages Including Closures. Dan Grossman Autumn 2018

Parsing Scheme (+ (* 2 3) 1) * 1

CS61A Discussion Notes: Week 11: The Metacircular Evaluator By Greg Krimer, with slight modifications by Phoebus Chen (using notes from Todd Segal)

From Syntactic Sugar to the Syntactic Meth Lab:

CS 275 Name Final Exam Solutions December 16, 2016

CSE 413 Midterm, May 6, 2011 Sample Solution Page 1 of 8

CSE 413 Languages & Implementation. Hal Perkins Winter 2019 Structs, Implementing Languages (credits: Dan Grossman, CSE 341)

Scheme Quick Reference

Normal Order (Lazy) Evaluation SICP. Applicative Order vs. Normal (Lazy) Order. Applicative vs. Normal? How can we implement lazy evaluation?

UNIVERSITY of CALIFORNIA at Berkeley Department of Electrical Engineering and Computer Sciences Computer Sciences Division

CS 61A Interpreters, Tail Calls, Macros, Streams, Iterators. Spring 2019 Guerrilla Section 5: April 20, Interpreters.

Introduction to Typed Racket. The plan: Racket Crash Course Typed Racket and PL Racket Differences with the text Some PL Racket Examples

Note that in this definition, n + m denotes the syntactic expression with three symbols n, +, and m, not to the number that is the sum of n and m.

CSc 520 Principles of Programming Languages

Introduction to Logic Programming. Ambrose

Fall 2017 Discussion 7: October 25, 2017 Solutions. 1 Introduction. 2 Primitives

Functional programming with Common Lisp

CMSC 330: Organization of Programming Languages

Principles of Programming Languages

CMSC 330: Organization of Programming Languages. Operational Semantics

Scheme Quick Reference

Compilers. Compiler Construction Tutorial The Front-end

Parsing. Zhenjiang Hu. May 31, June 7, June 14, All Right Reserved. National Institute of Informatics

n n Try tutorial on front page to get started! n spring13/ n Stack Overflow!

6.037 Lecture 7B. Scheme Variants Normal Order Lazy Evaluation Streams

COP4020 Programming Assignment 1 - Spring 2011

Building up a language SICP Variations on a Scheme. Meval. The Core Evaluator. Eval. Apply. 2. syntax procedures. 1.

CS 360 Programming Languages Interpreters

A LISP Interpreter in ML

Problem Set 4: Streams and Lazy Evaluation

Polymorphic lambda calculus Princ. of Progr. Languages (and Extended ) The University of Birmingham. c Uday Reddy

Basic concepts. Chapter Toplevel loop

Lexical vs. Dynamic Scope

Type Checking and Type Inference

Lambda Calculus. CS 550 Programming Languages Jeremy Johnson

1.3. Conditional expressions To express case distinctions like

9 Objects and Classes

15 Unification and Embedded Languages in Lisp

COMP 181. Agenda. Midterm topics. Today: type checking. Purpose of types. Type errors. Type checking

Spring 2018 Discussion 7: March 21, Introduction. 2 Primitives

Typed Lambda Calculus and Exception Handling

CSE 341 Lecture 16. More Scheme: lists; helpers; let/let*; higher-order functions; lambdas

SCHEME 7. 1 Introduction. 2 Primitives COMPUTER SCIENCE 61A. October 29, 2015

Evaluating Scheme Expressions

3.7 Denotational Semantics

CS61A Summer 2010 George Wang, Jonathan Kotker, Seshadri Mahalingam, Eric Tzeng, Steven Tang

Transcription:

1 Type Inference and Type Habitation (5/10/2004) Daniel P. Friedman, William E. Byrd, David W. Mack Computer Science Department, Indiana University Bloomington, IN 47405, USA Oleg Kiselyov Fleet Numerical Meteorology and Oceanography Center, Monterey, CA 93943, USA An interesting application of logic programming is the type-inference problem. We start by considering integers and booleans. Next, we include some familiar primitives. When we are comfortable with those features we include conditionals, followed by lexical variables, then lambda, application, and fix expressions. Finally, we include polymorphic let. Type inference allows for the system to determine a unique type if the expression has one. Similarly, using the same program we can determine an expression that has a given type. This last application is called type habitation. The language for which we infer a type is basically a lambda-calculus variant of Scheme. We have chosen, however, to parse this variant into a language where every expression has a tag as in parse below. (define parse (lambda (e) (cond [(symbol? e) (var,e)] [(number? e) (intc,e)] [(boolean? e) (boolc,e)] [else (case (car e) [(zero?) (zero?,(parse (cadr e)))] [(sub1) (sub1,(parse (cadr e)))] [(+) (+,(parse (cadr e)),(parse (caddr e)))] [(*) (*,(parse (cadr e)),(parse (caddr e)))] [(if) (if,(parse (cadr e)),(parse (caddr e)),(parse (cadddr e)))] [(fix) (fix,(parse (cadr e)))] [(lambda) (lambda,(cadr e),(parse (caddr e)))] [(let) (let ([,(car (car (cadr e))),(parse (cadr (car (cadr e))))]),(parse (caddr e)))] [else (app,(parse (car e)),(parse (cadr e)))])]))) While you are reading the definitions for the type system,!-, it is important to keep in mind that although we are presenting a type inferencing algorithm, it is just a relatively simple logic program.

2 Friedman, Byrd, Mack, Kiselyov!- corresponds to the mathematical symbol (turnstile) and reads we can infer. That is, from looking at the relation intc-rel and the definition of!- below, we can read it as, From g we can infer that x is of type int provided that x is an intc. For now, we leave g unspecified. In the expressions below, we use int, bool, and --> for our type constructors. (define intc-rel (fact (g x) g (intc,x) int)) (define!- intc-rel) (!- g (parse 17)? :: int > (answer () (!- g (parse #t) int))) #f In the first example, we verify that17 is of typeint. In this case,?is instantiated to int. The second one does not display the trace-vars, so this one does not hold: The type of #t is not int. Even more interesting, is to try (!- g (parse #t) #f which also displays nothing, so #t does not have a type according to the current definition of!-. Why? Because we have not yet extended the definition of!- to include a rule for boolean constants. So, we include boolc-rel below in!-. (define boolc-rel (fact (g x) g (boolc,x) bool)) (define!- (extend-relation (a1 a2 a3)!- boolc-rel)) (!- g (parse #t) Before we include relations for the arithmetic primitives, we observe that we need to use all!!. We do this because our type inferencer is deterministic. There cannot be any backtracking. That is, once a decision is made, it cannot be reconsidered! Now we can extend!- below with the relations zero?-rel sub1-rel, +-rel, and *-rel, which correspond to the primitves zero?, sub1, +, and *, respectively.

(define zero?-rel (relation (g x) (to-show g (zero?,x) bool) (all!! (!- g x int)))) (define sub1-rel (relation (g x) (to-show g (sub1,x) int) (all!! (!- g x int)))) (define +-rel (relation (g x y) (to-show g (+,x,y) int) (all!! (!- g x int) (!- g y int)))) (define *-rel (relation (g x y) (to-show g (*,x,y) int) (all!! (!- g x int) (!- g y int)))) Tutorial 3 (define!- (extend-relation (a1 a2 a3)!- zero?-rel sub1-rel +-rel *-rel)) (!- g (parse (zero? 24)) (!- g (parse (zero? (+ 24 50))) The type system can infer that (zero? 24) is of type bool, because it can infer that 24 is of type int. It can infer that (+ 24 50) is of type int, so the answer in the second example must be of typebool. We can, of course, make more complicated examples using zero?, sub1, +, and *, but if they have a type, it is int or bool. For example, (!- g (parse (zero? (sub1 (+ 18 (+ 24 50))))) Although our parser expects a larger language, at each stage of defining!-, we are writing a type inferencer for a larger and larger language. When we have defined!- for let, then we have a type inferencer for the full language. To reiterate, the language starts out very small! It only contains integers. Then, as we progress, it gets larger. But, the beauty of type inferencing, is that these little relations grow

4 Friedman, Byrd, Mack, Kiselyov naturally. Of course, we cannot test the relation for zero? until we have a relation for numbers and booleans, so there is a natural ordering to some extent. In this type system, we must preserve the property that Every well-typed expression has one type. So, what do we do about conditionals? Easy. We require that not only must the test be of type bool, but the true branch and the false branch must have the same type. In a language without variables, lambda expressions, applications, fix expressions, and let expressions, that means that they must both be of type int or they must both be of type bool. By extending!-, we can now handle if expressions. (define if-rel (relation (g t test conseq alt) (to-show g (if,test,conseq,alt) t) (all!! (!- g test bool) (!- g conseq t) (!- g alt t)))) (define!- (extend-relation (a1 a2 a3)!- if-rel)) (!- g (parse (if (zero? 24) 3 4))? :: int And we discover the test has type bool and the entire expression has type int. Next, we include lexical variables, which are represented using symbols. What is the type of (zero? a)? If the type of a is int, then we know that the type of the entire expression is bool, but if the type of a is bool, then the expression does not have a type. How do we determine the type of a? We look in g, which is a type environment that associates lexical (both generic and non-generic) variables with types. So far we have ignoredg, but now we consider its content using non-generic (a tag) variables. We extend!- to include a relation for variables. (define var-rel (relation (g v t) (to-show g (var,v) t) (all!! (env g v t)))) (define!- (extend-relation (a1 a2 a3)!- var-rel)) Now we must define the env relation. (define non-generic-base-env (fact (g v t) (non-generic,v,t,g) v t)) (define non-generic-recursive-env (relation (g v t) (to-show (non-generic,_,_,g) v t) (all!! (instantiated g) (env g v t))))

Tutorial 5 (define env (extend-relation (a1 a2 a3) non-generic-base-env non-generic-recursive-env)) (env (non-generic b int (non-generic a bool,g)) a (!- (non-generic a int,g) (parse (zero? a)) (!- (non-generic b bool (non-generic a int,g)) (parse (zero? a)) The first example tests env. The environment starts out with int bound to b and bool bound to a. The non-generic-recursive-env relation succeeds, since we are looking up a, and then non-generic-base-env succeeds, since we find a. In the second answer, we have one item in the type environment and the non-generic-base-env relation is followed. In the third example, we have two items in the type environment, so we take non-generic-recursive-env, then non-generic-base-env succeeds, since we have stripped off b and bool, leaving a and its associated type. Now that we can deal with lexical (non-generic) variables, we can consider the relation for lambda expressions by extending!-. (define lambda-rel (relation (g v t body type-v) (to-show g (lambda (,v),body) (-->,type-v,t)) (all!! (!- (non-generic,v,type-v,g) body t)))) (define!- (extend-relation (a1 a2 a3)!- lambda-rel)) (!- (non-generic b bool (non-generic a int,g)) (parse (lambda (x) (+ x 5))) (!- (non-generic b bool (non-generic a int,g)) (parse (lambda (x) (+ x 5)))? :: (--> int int)

6 Friedman, Byrd, Mack, Kiselyov (!- (non-generic b bool (non-generic a int,g)) (parse (lambda (x) (+ x a)))? :: (--> int int) (!- g (parse (lambda (a) (lambda (x) (+ x a))))? :: (--> int (--> int int)) In the first answer, we see that we have an arrow (-->) type. The left argument of the arrow type is the type of argument coming into the function, and the right argument of the arrow type is the type of the result going out of the function. So, the inferred type is a function whose argument is an integer and whose result is an integer. The second answer states that the argument is an integer, but consults the type environment to make sure that the argument going out is an integer. In the third example, we forget about the first environment, because there are no free variables in the expression. We see that the argument coming in is an integer, but the result is an arrow type, which takes in an integer and returns an integer. Close inspection of the type (directly aligned below the item it is typing) shows that for each lambda there is an arrow and for each formal parameter there is a type. Also, there is a type for the body of each lambda. If we think about the type from the inside out, we see that (+ x a) is an integer only if x and a are integers. That determines the type of the inner lambda and then the type of the outer lambda. It is important that we can infer the type of lambda expressions, even though we do not yet have application in our language. This should be a bit of a surprise. We come next to application. (define app-rel (relation (g rator rand t) (to-show g (app,rator,rand) t) (exists (t-rand) (all!! (!- g rator (-->,t-rand,t)) (!- g rand t-rand))))) (define!- (extend-relation (a1 a2 a3)!- app-rel)) In determining the type of an application, we know that the operator in an application should be some arrow type. Furthermore, once we know that type, we know that the type going out of that type is the same as the type of the entire application and we know that the operand of the application must be the type going into that type.

Tutorial 7 (!- g (parse (lambda (f) (lambda (x) ((f x) x))))? :: (--> (--> t.0 (--> t.0 t.1)) (--> t.0 t.1)) Here, the type of f is (--> t.0 (--> t.0 t.1)), so the type of x must be t.0, and the type of (lambda (x) ((f x) x)) must be (--> t.0 t.1). As should be evident, once we add a relation for application, things start to get a bit tricky. We can no longer rely on aligning the lambdas with the arrows. Here we have two lambdas and four arrows. Yet another surprise. Our inferencer is starting to be clever. We may be tempted to use our language to write (and test) recursive functions. To test the expression, we use the call-by-value fix primitive: (define fix-rel (relation (g rand t) (to-show g (fix,rand) t) (all!! (!- g rand (-->,t,t))))) (define!- (extend-relation (a1 a2 a3)!- fix-rel)) (!- g (parse ((fix (lambda (sum) (lambda (n) (if (zero? n) 0 (+ n (sum (sub1 n))))))) 10))? :: int In Scheme, we definefix below. Althoughfix can be defined using simplelambda terms as in the Y combinator, our type system cannot determine Y s type. Thus, fix must be primitive and the associated primitive call s type can be inferred as above. (define fix (lambda (e) (e (lambda (z) ((fix e) z)))))

8 Friedman, Byrd, Mack, Kiselyov Let s consider the following expression > ((fix (lambda (sum) (lambda (n) (+ n (sum (sub1 n)))))) 10) It fails to terminate. But, can we infer its type? Yes, an expression may have a type, even if evaluating it would lead to nontermination. This is a confusing aspect of type inference. We know from the Halting Problem that we cannot tell in advance whether evaluating an arbitrary expression will terminate, but this type inferencing system is guaranteed to terminate. Thus, we can infer the type before we run it. As a result, information that the run-time system can learn from the type (or the process of inferring the type) can be put to good use. Here is its type. (!- g (parse ((fix (lambda (sum) (lambda (n) (+ n (sum (sub1 n)))))) 10))? :: int The let-expression is a bit more subtle. Let s take a look at an expression that should type check, but won t in the absence of let. (!- g (parse ((lambda (f) (if (f (zero? 5)) (+ (f 4) 8) (+ (f 3) 7))) (lambda (x) x))) #f Because f becomes the non-generic identity, once a type for f is determined, it must stay the same. Obviously, we would expect that the evaluation of ((lambda (f)...)...) to be 10, but it has no type. If, however, we change the expression to use let

Tutorial 9 (let ([f (lambda (x) x)]) (if (f (zero? 5)) (+ (f 4) 8) (+ (f 3) 7))) and think about β-substituting for f throughout, (if ((lambda (x) x) (zero? 5)) (+ ((lambda (x) x) 4) 8) (+ ((lambda (x) x) 3) 7)) then we can see that this expression should have a type, int. Instead of doing the substitution, we mark certain variables generic as they are placed in the environment. Below is the polymorphic let, since the variable is tagged with generic in the environment. If the variable were tagged with non-generic, then this would be the familiar let. We have only to determine what happens in environment lookup when a variable with a generic tag is an arrow type. (Those who wish to include a more general let expression, one whose binding variable is bound to a non-generic, feel free to do so, but for our purposes, we assume that all right-hand sides of let expressions are lambda expressions.) (define polylet-rel (relation (g v rand body t) (to-show g (let ([,v,rand]),body) t) (exists (t-rand) (all!! (!- g rand t-rand) (!- (generic,v,t-rand,g) body t))))) (define!- (extend-relation (a1 a2 a3)!- polylet-rel)) The relation logical-copy-term takes a term with variables within it and creates a new term with new, but corresponding variables in it. Thus if the same variable occurs in several positions in the input term, then a new variable will occur in the same positions in the output term. In logical-copy-term below, we pass input-term, and output-term, along with an empty table and a table varmap, with no bindings, which associates each input term variable with an output term variable. (define logical-copy-term (relation (input-term output-term) (to-show input-term output-term) (exists (varmap) (logical-copy-term-raw input-term output-term () varmap)))) In logical-copy-term-raw, if the ef/only test succeeds, then car is bound to the car of the input term and cdr is bound to the cdr of the input term. The

10 Friedman, Byrd, Mack, Kiselyov two recursive calls feed the variable map generated from the first call to the seocnd call. When the second call binds varmap, it contains an association list which binds each input variable to each output variable. of the first call as the input list of the second call. Then output-term is constructed from new-car and new-cdr. If the input is not instantiated, it is a free variable, and we lookup its value in the variable map. So, for the invocation of logical-assq, input-term is a logic variable and output-term is a logic variable. (define logical-copy-term-raw (relation (input-term output-term lst varmap) (to-show input-term output-term lst varmap) (exists (car cdr) (ef/only (all!! (instantiated input-term) (== input-term (,car.,cdr))) (exists (new-car new-cdr varmap-car) (all!! (logical-copy-term-raw car new-car lst varmap-car) (logical-copy-term-raw cdr new-cdr varmap-car varmap) (== output-term (,new-car.,new-cdr)))) (logical-assq input-term lst output-term varmap))))) In the first relation of logical-assq, we make no changes, since input-var is bound, and we are only replacing free variables. In the second relation, we bind var and output-var from the first element of varmap provided that input-var and var are the same variable. In the third relation, we do the natural recursion on the cdr of varmap looking for a match. In the final relation, which is a fact, if the varmap is empty, we add one association to it. (define logical-assq (extend-relation (a1 a2 a3 a4) (relation (input-var varmap) (to-show input-var varmap input-var varmap) (instantiated input-var)) (relation (input-var varmap output-var) (to-show input-var varmap output-var varmap) (exists (var) (all!! (== varmap ((,var.,output-var).,_)) (predicate (*equal? var input-var))))) (relation (input-var h varmap output-var varmap1) (to-show input-var (,h.,varmap) output-var (,h.,varmap1)) (logical-assq input-var varmap output-var varmap1)) (fact (input-var output-var) input-var () output-var ((,input-var.,output-var)))))

(define *equal? (lambda (x y) (cond [(and (var? x) (var? y)) (eq? x y)] [(var? x) #f] [(var? y) #f] [else (equal? x y)]))) Tutorial 11 We can now implement generic variables using the logical-copy-term relation. (define generic-base-env (relation (g v targ tresult t) (to-show (generic,v (-->,targ,tresult),g) v t) (logical-copy-term (-->,targ,tresult) t))) (define generic-recursive-env (relation (g v t) (to-show (generic,_,_,g) v t) (all! (env g v t)))) (define generic-env (extend-relation (a1 a2 a3) generic-base-env generic-recursive-env)) (define env (extend-relation (a1 a2 a3) env generic-env)) Now that we have extended our environments to handle generic as well as nongeneric variables, we can infer the right type. (!- g (parse (let ([f (lambda (x) x)]) (if (f (zero? 5)) (+ (f 4) 8) (+ (f 3) 7))))? :: int Finally we consider type habitation. We are going to do an experiment and in order to get the results we want, we need to respecify the order of the relations. Thus we redefine!-. (define!- (extend-relation (a1 a2 a3) var-rel intc-rel boolc-rel zero?-rel sub1-rel +-rel if-rel lambda-rel app-rel fix-rel polylet-rel)) Here are three, perhaps unexpected, examples.

12 Friedman, Byrd, Mack, Kiselyov > (answer (g?) (!- g? (--> int int))) g :: (non-generic v.0 (--> int int) g.0)? :: (var v.0) > (answer (la f b) (!- g (,la (,f),b) (--> int int)))) la :: lambda f :: v.0 b :: (var v.0) > (answer* 1 (h r q z y t) (!- g (,h,r (,q,z,y)) t))) h :: + r :: (intc x.0) q :: + z :: (intc x.1) y :: (intc x.2) t :: int h :: lambda r :: (v.0) q :: + z :: (var v.0) y :: (var v.0) t :: (--> int int) #f The first example attempts to find an expression whose type is (--> int int), but instead finds a type environment that binds that type to the variable v.0, and then the expression is trivially v.0. The second example produces an expression given the type. This is answering the question, What expression inhabits that type? In our case, the identity function inhabits that type. But, to make these first two examples work, we had to place var-rel first in the most recent definition of!-. The last example yields two answers. First, it infers that t must be of int type. Then since there is only one binary operation that returns an int (i.e., +), it determines q and h. Next, we can infer that r, z, and y must be of int type, and what is easier than making them all values of type integer. Second, it infers that t is of type (-> int int). That forces h to be the symbol lambda. Then we need an expression whose value is of type int. Thus, r becomes (v.0), and q becomes +, the first binary operator on integers. Finally, z and y become (var v.0). Exercise 1: Take the results of these four test programs and reconstruct what the terms are by substituting for each variable.