Semantic Processing (Part 2)

Similar documents
Semantic Processing (Part 2)

Semantic Processing. Semantic Errors. Semantics - Part 1. Semantics - Part 1

Lecture #23: Conversion and Type Inference

Conversion vs. Subtyping. Lecture #23: Conversion and Type Inference. Integer Conversions. Conversions: Implicit vs. Explicit. Object x = "Hello";

Static Checking and Type Systems

Type Checking and Type Inference

CS558 Programming Languages

Types-2. Polymorphism

Type Checking. Outline. General properties of type systems. Types in programming languages. Notation for type rules.

Outline. General properties of type systems. Types in programming languages. Notation for type rules. Common type rules. Logical rules of inference

Type Systems. Seman&cs. CMPT 379: Compilers Instructor: Anoop Sarkar. anoopsarkar.github.io/compilers-class

Lecture #13: Type Inference and Unification. Typing In the Language ML. Type Inference. Doing Type Inference

Type systems. Static typing

CS558 Programming Languages

CS558 Programming Languages

Polymorphic lambda calculus Princ. of Progr. Languages (and Extended ) The University of Birmingham. c Uday Reddy

Overloading, Type Classes, and Algebraic Datatypes

Programming Languages and Compilers (CS 421)

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres

COS 320. Compiling Techniques

The SPL Programming Language Reference Manual

FUNCTIONAL AND LOGIC PROGRAMS

Haskell 98 in short! CPSC 449 Principles of Programming Languages

Introduction to Programming Using Java (98-388)

CMSC 330: Organization of Programming Languages. Operational Semantics

COSE212: Programming Languages. Lecture 3 Functional Programming in OCaml

CMSC 330: Organization of Programming Languages. Formal Semantics of a Prog. Lang. Specifying Syntax, Semantics

Lecture Overview. [Scott, chapter 7] [Sebesta, chapter 6]

n Closed book n You are allowed 5 cheat pages n Practice problems available in Course Materials n Check grades in Rainbow grades

Lecture 15 CIS 341: COMPILERS

Type Checking. Error Checking

Intermediate Code Generation Part II

Compiler Theory. (Semantic Analysis and Run-Time Environments)

CSCI-GA Scripting Languages

CMSC 330: Organization of Programming Languages

Topics Covered Thus Far. CMSC 330: Organization of Programming Languages. Language Features Covered Thus Far. Programming Languages Revisited

CSE P 501 Compilers. Static Semantics Hal Perkins Winter /22/ Hal Perkins & UW CSE I-1

Types and Type Inference

Haske k ll An introduction to Functional functional programming using Haskell Purely Lazy Example: QuickSort in Java Example: QuickSort in Haskell

Chapter 6 Intermediate Code Generation

Lecture 7: Type Systems and Symbol Tables. CS 540 George Mason University

Programming Languages Third Edition

Types and Type Inference

Static Semantics. Winter /3/ Hal Perkins & UW CSE I-1

An introduction introduction to functional functional programming programming using usin Haskell

The type checker will complain that the two branches have different types, one is string and the other is int

COMP 181. Agenda. Midterm topics. Today: type checking. Purpose of types. Type errors. Type checking

CSE 431S Type Checking. Washington University Spring 2013

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 17: Types and Type-Checking 25 Feb 08

l e t print_name r = p r i n t _ e n d l i n e ( Name : ^ r. name)

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler so far

Pierce Ch. 3, 8, 11, 15. Type Systems

(Refer Slide Time: 4:00)

Inductive Data Types

CMSC 330: Organization of Programming Languages. OCaml Data Types

Summer 2017 Discussion 10: July 25, Introduction. 2 Primitives and Define

MIDTERM EXAMINATION - CS130 - Spring 2003

1 Terminology. 2 Environments and Static Scoping. P. N. Hilfinger. Fall Static Analysis: Scope and Types

Tail Calls. CMSC 330: Organization of Programming Languages. Tail Recursion. Tail Recursion (cont d) Names and Binding. Tail Recursion (cont d)

Data types for mcrl2

CMSC 330: Organization of Programming Languages

Chapter 2: Using Data

Type checking of statements We change the start rule from P D ; E to P D ; S and add the following rules for statements: S id := E

These notes are intended exclusively for the personal usage of the students of CS352 at Cal Poly Pomona. Any other usage is prohibited without

Project Compiler. CS031 TA Help Session November 28, 2011

Generic polymorphism on steroids

CA Compiler Construction

CS558 Programming Languages

Data Types. Every program uses data, either explicitly or implicitly to arrive at a result.

Functional Programming. Pure Functional Programming

Compilers and computer architecture: Semantic analysis

Symbol Tables. For compile-time efficiency, compilers often use a symbol table: associates lexical names (symbols) with their attributes

CS 430 Spring Mike Lam, Professor. Data Types and Type Checking

9/21/17. Outline. Expression Evaluation and Control Flow. Arithmetic Expressions. Operators. Operators. Notation & Placement

Functional Programming

Java Primer 1: Types, Classes and Operators

Lecture 16: Static Semantics Overview 1

CSCE 314 Programming Languages. Type System

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

A First Look at ML. Chapter Five Modern Programming Languages, 2nd ed. 1

The role of semantic analysis in a compiler

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

ML 4 Unification Algorithm

Compiler Construction I

Writing Evaluators MIF08. Laure Gonnord

Data Types The ML Type System

Begin at the beginning

Lecture 19: Functions, Types and Data Structures in Haskell

CSC324- TUTORIAL 5. Shems Saleh* *Some slides inspired by/based on Afsaneh Fazly s slides

The Substitution Model

Symbol Table Information. Symbol Tables. Symbol table organization. Hash Tables. What kind of information might the compiler need?

Lists. Michael P. Fourman. February 2, 2010

Lecture 12: Data Types (and Some Leftover ML)

Algorithms & Data Structures

CMSC 330: Organization of Programming Languages. OCaml Imperative Programming

Scheme: Data. CS F331 Programming Languages CSCE A331 Programming Language Concepts Lecture Slides Monday, April 3, Glenn G.

Expressions and Assignment

Test I Solutions MASSACHUSETTS INSTITUTE OF TECHNOLOGY Spring Department of Electrical Engineering and Computer Science

CSC 467 Lecture 13-14: Semantic Analysis

CSE 3302 Programming Languages Lecture 8: Functional Programming

Transcription:

Semantic Processing (Part 2) All Projects Due: Fray 12-2-05, Noon Final: Monday, December 5, 2005, 10:15-12:05 Comprehensive 1

Recursive Type Definitions type MyRec is record f1: integer; f2: array of MyRec; end; MyRec record............... array MyRec record............... array MyRec Option #1 Option #2 2

Our approach is a hybr... MyRec record............... array mydef NamedType MyRec 3

Our approach is a hybr... MyRec record............... array mydef NamedType MyRec concretetype TypeDecl MyRec 4

Testing Type Equivalence Name Equivalence Stop when you get to a defined name Are the definitions the same (==)? Structural Equivalence Test whether the type trees have the same shape. Graphs may contain cycles! The previous algorithm ( typeequiv ) will inifinite loop. Need an algorithm for testing Graph Isomorphism PCAT Recursion can occur in arrays and records. type R is record info: integer; next: R; end; type A is array of A; PCAT uses Name Equivalence 5

type R1 is record info: integer; next: R1; end; Representing Recursive Types in PCAT RecordType fielddecls next type FieldDecl info next type FieldDecl next null mydef NamedType integer mydef NamedType R1 6

type R1 is record info: integer; next: R1; end; Representing Recursive Types in PCAT type TypeDecl R1 RecordType fielddecls next type FieldDecl info next type FieldDecl next null mydef NamedType integer mydef NamedType R1 7

type R1 is record info: integer; next: R1; end; Representing Recursive Types in PCAT type TypeDecl R1 R1 fielddecls RecordType...... next type FieldDecl info next type FieldDecl next null mydef NamedType integer mydef NamedType R1 8

type R1 is record info: integer; next: R1; end; type R2 is R1; R1 R2...... type mydef Representing Recursive Types in PCAT TypeDecl R2 NamedType R2 fielddecls next type type FieldDecl info TypeDecl R1 RecordType next type FieldDecl next null mydef NamedType integer mydef NamedType R1 9

type R1 is record info: integer; next: R1; end; type R2 is R1; R1 R2 integer...... type type mydef TypeDecl integer Representing Recursive Types in PCAT TypeDecl R2 NamedType R2 BasicTypeInteger (no fields) fielddecls next type mydef type FieldDecl info TypeDecl R1 RecordType NamedType integer next type mydef FieldDecl next null NamedType R1 10

type R1 is record info: integer; next: R1; end; type R2 is R1; R1 R2 integer...... concretetype concretetype mydef TypeDecl integer Representing Recursive Types in PCAT TypeDecl R2 NamedType R2 BasicTypeInteger (no fields) concretetype fielddecls next type mydef FieldDecl info TypeDecl R1 RecordType NamedType integer next type mydef FieldDecl next null NamedType R1 11

type R1 is record info: integer; next: R1; end; type R2 is R1; The Symbol Table goes away and we are left with just the type structures! concretetype concretetype mydef TypeDecl integer Representing Recursive Types in PCAT TypeDecl R2 NamedType R2 BasicTypeInteger (no fields) concretetype fielddecls next type mydef FieldDecl info TypeDecl R1 RecordType NamedType integer next type mydef FieldDecl next null NamedType R1 12

var r: real; i: integer;... r + i... During Type-checking... Compiler discovers the problem Must insert conversion code Type Conversions Case 1: No extra code needed. i = p; // e.g., pointer to integer conversion. Case 2: One (or a few) machine instructions r = i; // e.g., integer to real conversion. Case 3: Will need to call an external routine System.out.print ( i= + i); // int to string Perhaps written in the source language (an upcall ) One compiler may use all 3 techniques. 13

Explicit Type Conversions Example (Java): Type Error i = r; Programmer must insert something to say This is okay : i = (int) r; Language Design Approaches: C casting notation i = (int) r; Function call notation i = realtoint (r); Keyword i = realtoint r; I like this: No additional syntax Fits easily with other user-coded data transformations Compiler may insert: nothing machine instructions an upcall 14

Implicit Type Conversions ( Coercions ) Example (Java, PCAT): r = i; Compiler determines when a coercion must be inserted. Rules can be complex... Ugh! Source of subtle errors. Java Philosophy: Implicit coercions are okay when no loss of numerical accuracy. byte short int long float double Compiler may insert: nothing machine instructions an upcall My preference: Minimize implicit coercions Require explicit conversions 15

Overloading Functions and Operators What does + mean? integer addition 16-bit? 32-bit? floating-point addition Single precision? Double precision? string concatenation user-defined meanings e.g., complex-number addition Compiler must resolve the meaning of the symbols Will determine the operator from types of arguments i+i integer addition d+i floating-point addition (and double-to-int coercion) s+i string concatenation (and int-to-string coercion) 16

AST Design Options expr1 expr2 IntegerAdd Option 1 op expr1 expr2 BinaryOp PLUS 17

AST Design Options expr1 expr2 IntegerAdd Option 1 op expr1 expr2 BinaryOp PLUS op expr1 expr2 BinaryOp INT-ADD Option 2 18

AST Design Options expr1 expr2 IntegerAdd Option 1 op expr1 expr2 BinaryOp PLUS op expr1 expr2 BinaryOp INT-ADD Option 2 op mode expr1 expr2 BinaryOp PLUS INTEGER Option 3 19

AST Design Options expr1 expr2 IntegerAdd Option 1 op expr1 expr2 BinaryOp PLUS op expr1 expr2 BinaryOp INT-ADD Option 2 General Principle: It is better to ADD new information, than to CHANGE your data structures op mode expr1 expr2 BinaryOp PLUS INTEGER Option 3 20

Parsing Issues? E E E Working with Functions Want to say: var f: int real :=... ;... x := f(i); The application Operators Syntax operator E E + E E * E E E... Sometimes adjacency is used for function application 3N 3 * N foo N foo N The programmer can always add parentheses: foo 3 foo (3) (foo) 3 If the language also has tuples... foo(4,5,6) (foo)(4,5,6) AST: foo,, 6 4 5 AST: apply foo tuple tuple 6 4 5 21

Syntax: E E E or: E E E or: E E ( E ) Type Checking for Function Application expr1 expr2 Apply Type-Checking Code (e.g., in checkapply )... t1 = type of expr1; t2 = type of expr2; if t1 has the form t DOMAIN t RANGE then if typeequals(t2,t DOMAIN ) then resulttype = t RANGE ; else error; endif else error endif 22

Traditional ADD operator: add: int int int... add(3,4)... Curried ADD operator: add: int int int... add 3 4... Curried Functions Each argument is supplied indivually, one at a time. add 3 4 (add 3) 4 Can also say: f: int int f = add 3;... f 4... Recall: function application is Right-Associative int (int int) 23

type is a synthesized attribute Type Checking apply apply add type=int (int int) 3 type=int 24

type is a synthesized attribute Type Checking apply apply These types are matched add type=int (int int) 3 type=int 25

type is a synthesized attribute Type Checking apply apply type=(int int) add type=int (int int) 3 This is the Result type type=int 26

type is a synthesized attribute Type Checking apply apply apply type=(int int) 4 type=int add type=int (int int) 3 type=int 27

type is a synthesized attribute Type Checking apply apply type=int apply type=(int int) 4 type=int add type=int (int int) 3 type=int 28

A Data Structure Example Goal: Write a function that finds the length of a list. type MyRec is record info: integer; next: MyRec; end; procedure length (p:myrec) : integer is var len: integer := 0; begin while (p <> nil) do len := len + 1; p := p.next; end; return len; end; Traditional Languages: Each parameter must have a single, unique type. 29

A Data Structure Example Goal: Write a function that finds the length of a list. type MyRec is record info: integer; next: MyRec; end; procedure length (p:myrec) : integer is var len: integer := 0; begin while (p <> nil) do len := len + 1; p := p.next; end; return len; end; Traditional Languages: Each parameter must have a single, unique type. Problem: Must write a new length function for every record type!!!... Even though we dn t access the fields particular to MyRec 30

Passed: Returns: Another Example: The find Function A list of T s A function test, which has type T boolean A list of all elements that passed the test i.e., a list of all elements x, for which test(x) is true procedure find (inlist: array of T; test: T boolean) : array of T is var result: array of T; i, j: integer := 1; begin result :=... new array...; while i < sizeof(inlist) do if test(inlist[i]) then result[j] := inlist[i]; j := j + 1; endif; i := i + 1; endwhile; return result; end; 31

This function should work for any type T. Goal: Write the function once and re-use. This problem is typical... Data Structure Manipulation Want to re-use code... Hash Table Lookup Algorithms Sorting Algorithms B-Tree Algorithms etc....regardless of the type of data being mainpulated. 32

The ML Version of Length Background: Data Types: Int Bool List(...) Lists: [1,3,5,7,9] [] [[1,2], [5,4,3], [], [6]] Operations on Lists: head head([5,4,3]) 5 head: List(T) T tail tail([5,4,3]) [4,3] tail: List(T) List(T) null null([5,4,3]) false null: List(T) Bool Type is: List(Int) Type is: List(List(Int)) Notation: x:t means: The type of x is T 33

The ML Version of Length Operations on Integers: + 5 + 7 +(5,7) 12 +: Int Int Int Constants: 0: Int 1: Int 2: Int... fun length (x) = if null(x) then 0 else length(tail(x))+1 New symbols introduced here: x: List(α) length: List(α) Int Constant Function: Int Int (A function of zero arguments) No types are specified explicitly! No Declarations! ML infers the types from the way the symbols are used!!! 34

Predicate Logic Refresher Logical Operators (AND, OR, NOT, IMPLIES) &,, ~, Predicate Symbols P, Q, R,... Function and Constant Symbols f, g, h,... a, b, c,... Variables x, y, z,... Quantifiers, WFF: Well-Formed Formulas x. ~P(f(x)) & Q(x) Q(x) Precedence and Associativity: (Quantifiers bind most loosely) x.(((~p(f(x))) & Q(x)) Q(x)) A grammar of Predicate Logic Expressions? Sure! 35

Basic Types Int, Bool, etc. Type Expressions Constructed Types,, List(), Array(), Pointer(), etc. Type Expressions List(Int Int) List(Int Bool) Type Variables α, β, γ, α 1, α 2, α 3,... Universal Quantification: α. List(α) List(α) (Won t use existential quantifier, ) Remember: binds loosely α.(list(α) List(α)) For any type α, a function that maps lists of α s to lists of α s. 36

Type Expressions Okay to change variables (as long as you do it consistently)... α. Pointer(α) Boolean β. Pointer(β) Boolean What do we mean by that? Same as for predicate logic... Can t change α to a variable name already in use elsewhere Must change all occurrences of α to the same variable We will use only universal quantification ( for all, ) Will not use Okay to just drop the quantifiers. α. β. (List(α) (α β)) List(β) (List(α) (α β)) List(β) (List(β) (β γ)) List(γ) 37

Given: x: Int y: Int Boolean What is the type of (x,y)? Practice 38

Given: x: Int y: Int Boolean What is the type of (x,y)? (x,y): Int (Int Boolean) Practice 39

Given: x: Int y: Int Boolean What is the type of (x,y)? (x,y): Int (Int Boolean) Given: f: List(α) List(α) z: List(Int) What is the type of f(z)? Practice 40

Given: x: Int y: Int Boolean What is the type of (x,y)? (x,y): Int (Int Boolean) Given: f: List(α) List(α) z: List(Int) What is the type of f(z)? f(z): List(Int) Practice 41

Given: x: Int y: Int Boolean What is the type of (x,y)? (x,y): Int (Int Boolean) Given: f: List(α) List(α) z: List(Int) What is the type of f(z)? f(z): List(Int) What is going on here? We matched α to Int We used a Substitution α = Int What do we mean by matched??? Practice 42

Given: x: Int y: Int Boolean Practice What is the type of (x,y)? (x,y): Int (Int Boolean) Given: f: List(α) List(α) z: List(Int) What is the type of f(z)? f(z): List(Int) What is going on here? We matched α to Int We used a Substitution α = Int What do we mean by matched??? UNIFICATION! 43

Unification Given: Two [type] expressions Goal: Try to make them equal Using: Consistent substitutions for any [type] variables in them Result: Success plus the variable substitution that was used Failure 44

P D ; E D D ; D : Q Q. Q T T T T T T List ( T ) Int Bool ( T ) E int E E ( E, E ) ( E ) A Language With Polymorphic Functions Quantified Type Expressions Unquantified Type Expressions Type Variables Grouping Function Apply Tuple Construction Grouping 45

A Language With Polymorphic Functions P D ; E D D ; D : Q Q. Q T T T T T T List ( T ) Int Bool ( T ) E int E E ( E, E ) ( E ) Examples of Expressions: 123 (x) foo(x) find(test,mylist) add(3,4) 46

A Language With Polymorphic Functions P D ; E D D ; D : Q Q. Q T T T T T T List ( T ) Int Bool ( T ) E int E E ( E, E ) ( E ) Examples of Types: Int Bool Bool (Int Bool) α A Type Variable () α (α Bool) (((β Bool) List(β)) List(β)) 47

A Language With Polymorphic Functions P D ; E D D ; D : Q Q. Q T T T T T T List ( T ) Int Bool ( T ) E int E E ( E, E ) ( E ) Examples of Quatified Types: Int Bool α. (α Bool) β.(((β Bool) List(β)) List(β)) 48

A Language With Polymorphic Functions P D ; E D D ; D : Q Q. Q T T T T T T List ( T ) Int Bool ( T ) E int E E ( E, E ) ( E ) Examples of Declarations: i: Int; mylist: List(Int); test: α. (α Bool); find: β.(((β Bool) List(β)) List(β)) 49

P D ; E D D ; D : Q Q. Q T T T T T T List ( T ) Int Bool ( T ) E int E E ( E, E ) ( E ) A Language With Polymorphic Functions An Example Program: mylist: List(Int); test: α. (α Bool); find: β.(((β Bool) List(β)) List(β)); find (test, mylist) GOAL: Type-check this expression given these typings! 50

Parse Tree (Annotated with Synthesized Types) apply Expression: find (test, mylist) find tuple test mylist 51

Parse Tree (Annotated with Synthesized Types) apply find tuple test mylist Add known typing info: mylist: List(Int); test: α. (α Bool); find: β.(((β Bool) List(β)) List(β)); 52

Parse Tree (Annotated with Synthesized Types) Nothing Known apply type=δ Nothing Known find type= ((β Bool) List(β )) List(β ) tuple type=γ test type=α Bool mylist type=list(int) Add known typing info: mylist: List(Int); test: α. (α Bool); find: β.(((β Bool) List(β)) List(β)); 53

Parse Tree (Annotated with Synthesized Types) apply type=δ find type= ((β Bool) List(β )) List(β ) tuple type=γ test type=α Bool mylist type=list(int) Tuple Node: Match γ to (α Bool) List(Int) 54

Parse Tree (Annotated with Synthesized Types) apply type=δ find type= ((β Bool) List(β )) List(β ) tuple type=(α Bool) List(Int) test type=α Bool mylist type=list(int) Tuple Node: Match γ to (α Bool) List(Int) Conclude: γ = (α Bool) List(Int) 55

Parse Tree (Annotated with Synthesized Types) apply type=δ find type= ((β Bool) List(β )) List(β ) tuple type=(α Bool) List(Int) test type=α Bool mylist type=list(int) Apply Node: Match (β Bool) List(β) (α Bool) List(Int) Conclude: β = Int α = β = Int 56

Parse Tree (Annotated with Synthesized Types) apply type=δ find type= ((Int Bool) List(Int)) List(Int) tuple type=(int Bool) List(Int) test type=int Bool mylist type=list(int) Apply Node: Match (β Bool) List(β) (α Bool) List(Int) Conclude: β = Int α = β = Int 57

Parse Tree (Annotated with Synthesized Types) apply type=δ find type= ((Int Bool) List(Int)) List(Int) tuple type=(int Bool) List(Int) test type=int Bool mylist type=list(int) Apply Node: Match List(Int) δ Conclude: δ = List(Int) 58

Parse Tree (Annotated with Synthesized Types) apply type=list(int) find type= ((Int Bool) List(Int)) List(Int) tuple type=(int Bool) List(Int) test type=int Bool mylist type=list(int) Apply Node: Match List(Int) δ Conclude: δ = List(Int) 59

Parse Tree (Annotated with Synthesized Types) apply type=list(int) find type= ((Int Bool) List(Int)) List(Int) tuple type=(int Bool) List(Int) test type=int Bool Results: α = Int β = Int δ = List(Int) γ = (Int Bool) List(Int) mylist type=list(int) 60

Example: t 1 = α Int t 2 = List(β) γ Unification of Two Expressions Is there a subsitution that makes t 1 = t 2? t 1 unifies with t 2 if and only if there is a substitution S such that S(t 1 ) = S(t 2 ) Here is a substitution that makes t 1 = t 2 : α List(β) γ Int Other notation for substitutions: {α/list(β), γ/int} 61

There may be several substitutions. Some are more general than others. Example: t 1 = α Int t 2 = List(β) γ Unifying Substitution #1: α List(List(List(Bool))) β List(List(Bool)) γ Int Unifying Substitution #2: α List(Bool δ) β Bool δ γ Int Unifying Substitution #3: α List(β) γ Int Most General Unifier This is the Most General Unifier 62

Unifying Two Terms / Types Unify these two terms: f(g(a,x),y) f(z,z) Unification makes the terms entical. f f Z Z g Y a X 63

Unifying Two Terms / Types Unify these two terms: f(g(a,x),y) f(z,z) Unification makes the terms entical. The substitution: Y Z Z g(a,x) f Z f Z g Y a X 64

Unifying Two Terms / Types Unify these two terms: f(g(a,x),y) f(z,z) Unification makes the terms entical. The substitution: Y Z Z g(a,x) Merge the trees into one! g f Y Z f Z a X 65

Unifying Two Terms / Types Unify these two terms: f(g(a,x),y) f(z,z) Unification makes the terms entical. f(g(a,x),g(a,x)) The substitution: Y Z Z g(a,x) Merge the trees into one! f Y=Z=g a X 66

Unifying Two Terms / Types Unify these two terms: f(g(a,x),y) f(z,z) Unification makes the terms entical. f(g(a,x),g(a,x)) The substitution: Y Z Z g(a,x) Merge the trees into one! f Y=Z=g Same with unifying types! (Int List(X)) Y Z Z a X 67

Representing Types With Trees domaintype rangetype Function type1 type2 CrossProd Same for other basic and constructed types Real, Bool, List(T), etc. Integer (no fields) TypeVar T1 or α 1 68

Merging Sets Approach: Will work with sets of nodes. Each set will have a representative node. Goal: Merge two sets of nodes into a single set. When two sets are merged (the union operation)... make one representative point to the other! A Set of Nodes The representative c A Set of Nodes The representative a b d g e f g 69

Merging Sets Approach: Will work with sets of nodes. Each set will have a representative node. Goal: Merge two sets of nodes into a single set. When two sets are merged (the union operation)... make one representative point to the other! The new representative c The combined Set b g e f The set pointers a d g 70

Representing Type Expressions domaintype rangetype set type1 type2 set Function CrossProd The set pointers will point toward the representative node. (Initialized to null.) set Integer set TypeVar T1 or α 1 71

Merging Sets Find(p) ptr Given a pointer to a node, return a pointer to the representative of the set containing p. Just chase the set pointers as far as possible. Union(p,q) Merge the set containing p with the set containing q. Do this by making the representative of one of the sets point to the representative of the other set. If one representative is a variable node and the other is not, always use the non-variable node as the representative of the combined, merged sets. In other words, make the variable node point to the other node. 72

The Unification Algorithm function Unify (s, t : Node) returns bool s = Find(s ) t = Find(t ) if s == t then return true elseif s and t both point to INTEGER nodes then return true elseif s or t points to a VARIABLE node then Union(s,t) elseif s points to a node FUNCTION(s 1,s 2 ) and t points to a node FUNCTION(t 1,t 2 ) then Union(s,t) return Unify(s 1,t 1 ) and Unify(s 2,t 2 ) elseif s points to a node CROSSPROD(s 1,s 2 ) and t points to a node CROSSPROD(t 1,t 2 ) then Union(s,t) return Unify(s 1,t 1 ) and Unify(s 2,t 2 ) elseif... else return false endif Etc., for other type constructors and basic type nodes 73

Example: Unify... t 1 = α Integer t 2 = List(β) γ set type1 type2 set TypeVar α CrossProd set Integer null t 1 t 2 type1 type2 set List type set null TypeVar β set null CrossProd null set TypeVar γ 74

Example: Unify... t 1 = α Integer t 2 = List(β) γ set type1 type2 set TypeVar α CrossProd set Integer null t 1 t 2 type1 type2 set List type set null TypeVar β set null CrossProd null set TypeVar γ 75

Example: Unify... t 1 = α Integer t 2 = List(β) γ set type1 type2 set TypeVar α CrossProd set Integer null t 1 t 2 type1 type2 set List type set null TypeVar β set null CrossProd null set TypeVar γ 76

Example: Unify... t 1 = α Integer t 2 = List(β) γ set type1 type2 set TypeVar α CrossProd set Integer null t 1 t 2 type1 type2 set List type set null TypeVar β set null CrossProd null set TypeVar γ 77

Example: Unify... t 1 = α Integer t 2 = List(β) γ set type1 type2 set TypeVar α CrossProd set Integer null Recovering the Substitution: α List(β) γ Integer t 1 t 2 type1 type2 set List type set null TypeVar β set null CrossProd null set TypeVar γ 78

Type-Checking with an Attribute Grammar Lookup(string) type Lookup a name in the symbol table and return its type. Fresh(type) type Make a copy of the type tree. Replace all variables (consistently) with new, never-seen-before variables. MakeIntNode() type Make a new leaf node to represent the Int type MakeVarNode() type Create a new variable node and return it. MakeFunctionNode(type 1,type 2 ) type Create a new Function node and return it. Fill in its domain and range types. MakeCrossNode(type 1,type 2 ) type Create a new Cross Product node and return it. Fill in the types of its components. Unify(type 1,type 2 ) bool Unify the two type trees and return true if success. Modify the type trees to perform the substitutions. 79

Type-Checking with an Attribute Grammar E E.type = Fresh(Lookup(.svalue)); E int E.type = MakeIntNode(); E 0 E 1 E 2 p = MakeVarNode(); f = MakeFunctionNode(E 2.type, p); Unify(E 1.type, f); E 0.type = p; E 0 ( E 1, E 2 ) E 0.type = MakeCrossNode(E 1.type, E 2.type); E 0 ( E 1 ) E 0.type = E 1.type ; 80

Conclusion Theoretical Approaches: Regular Expressions and Finite Automata Context-Free Grammars and Parsing Algorithms Attribute Grammars Type Theory Function Types Type Expressions Unification Algorithm Make it possible to parse and check complex, high-level programming lanaguages! Would not be possible without these theoretical underpinnings! The Next Step? Generate Target Code and Execute the Program! 81