Tiger Semantic Analysis. Connecting Definition and Use. Symbol Tables.


Tiger Semantic Analysis

Tiger source program
  -> lexical analyzer (reports all lexical errors)
  -> tokens, handed to the parser on each "get next token" request
  -> parser (reports all syntactic errors)
  -> absyn
  -> semantic analysis (reports all semantic errors)
  -> correct absyn; then intermediate trees

Semantic analysis
  connects variable definitions to their uses,
  checks that each expression has a correct type,
  translates the abstract syntax into a simpler intermediate representation suitable for generating machine code.

Copyright 1994-2010 Zhong Shao, Yale University. Semantic Analysis: Page 1 of 24

Connecting Definition and Use?

Make sure each variable is defined; check the type consistency!

  ...
  function f(v : int) =
    let var v := 6
        function g(x : int) = (print (x+v); print "\n")
        function h(v : int) = (print v; print "\n")
     in g v; let var v := 8 in print v end; h v
    end
  ...

Solution: use a symbol table. Traverse the abstract syntax tree in a certain order while maintaining a (variable -> type) symbol table.

Symbol Tables

Conceptually, a symbol table (also called an environment) is a set of (name, attribute) pairs.

Typical names: strings, e.g., foo, do_nothing1, ...

Typical attributes (also called bindings):
  type identifier       type (e.g., int, string)
  variable identifier   type; access info. or value
  function identifier   arg. & result type; access info. or ...

Main issues, for a symbol table T:
  Given an identifier name, how to look up its attribute in T?
  How to insert a new (id, attr) pair into the table T, or delete one?
  Efficiency is important!

Symbol Tables (cont'd)

How to deal with visibility (i.e., lexical scoping under nested block structure)?

  ...
  function f(v : int) =                          (* v1 *)
    let var v := 6                               (* v2 *)
        function g(x : int) = (print (x+v); ...)
        function h(v : int) = (print v; ...)     (* v3 *)
     in g v; h v; ...
        let var v := 8 in print v end;           (* v4 *)
        ...
    end

Operations on the table while traversing this code:
  Initial table T
  insert v1; insert v2;
  insert v3; lookup sees v3; MUST delete v3;
  insert v4; lookup sees v4; MUST delete v4;
  MUST delete v2;
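The insert / lookup / delete discipline above can be tried out with a minimal sketch in Standard ML. It assumes a mutable association list as a stand-in for a real table; the names `table`, `insert`, `delete` and `lookup`, and the string attributes, are hypothetical and chosen only for illustration.

```sml
(* Hypothetical imperative symbol table: a mutable association list,
   newest binding first. Attributes are plain strings for illustration. *)
val table : (string * string) list ref = ref []

fun insert (name, attr) = table := (name, attr) :: !table

(* Remove only the most recent binding of name. *)
fun delete name =
  let
    fun drop [] = []
      | drop ((n, a) :: rest) =
          if n = name then rest else (n, a) :: drop rest
  in
    table := drop (!table)
  end

(* The newest binding shadows older ones, which models lexical scoping. *)
fun lookup name =
  case List.find (fn (n, _) => n = name) (!table) of
      SOME (_, a) => SOME a
    | NONE => NONE

val () = insert ("v", "v1")    (* parameter v of f *)
val () = insert ("v", "v2")    (* let var v := 6 *)
val () = insert ("v", "v3")    (* parameter v of h *)
val insideH = lookup "v"       (* SOME "v3" *)
val () = delete "v"            (* leaving h: v3 MUST be deleted *)
val afterH = lookup "v"        (* SOME "v2" *)
```

Forgetting a single `delete` on scope exit leaves a stale binding visible, which is the drawback of the imperative approach.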

Symbol Table Impl.

Hash table: efficient, but needs explicit delete due to side effects!

  Initial table T
  insert v1; insert v2;
  insert v3; lookup sees v3; MUST delete v3;
  insert v4; lookup sees v4; MUST delete v4;
  MUST delete v2;

[Figure: contents of the hash table after each step: insert v2, insert v3, delete v3, insert v4, delete v4; v1 is present throughout.]

Symbol Table Impl. (cont'd)

Balanced binary tree: persistent, functional, yet efficient.

  Initial table T0
  insert v1;                      gives T1
  insert v2;                      gives T2
  insert v3;                      gives T3
  lookup sees v3
  (* delete v3; *)                use T2
  insert v4;                      gives T4
  lookup sees v4
  (* delete v4; *)                use T2
  (* delete v2; *)                use T1

[Figure: the balanced binary trees for T0 through T4.]

Insert / access time: O(log N)

Summary: Symbol Table Impl.

Using a hash table is OK, but explicit delete is a big headache!
We prefer the functional approach, using a persistent balanced binary tree: no explicit delete is needed, and access and insertion take O(log N) time.

The Symbol signature (the symbol table is an abstract datatype, used to hide the implementation details):

  signature SYMBOL =
  sig
    eqtype symbol
    val symbol : string -> symbol
    val name : symbol -> string
    type 'a table
    val empty : 'a table
    val enter : 'a table * symbol * 'a -> 'a table
    val look : 'a table * symbol -> 'a option
  end

No delete, because we use the functional approach!

String <==> Symbol

Using a string as the search key is slow: every comparison is a string comparison.
Associate each string with an integer, which is used as the key for all access to the symbol table (i.e., the binary tree).

  type symbol = string * int
  exception Symbol
  val nextsym = ref 0
  structure H = ... a HashTable from STRING to INTEGER ...

  fun symbol name =
    case H.find hashtable name
     of SOME i => (name, i)
      | NONE => let val i = !nextsym
                 in nextsym := i + 1;   (* the slide writes "inc nextsym" *)
                    H.insert hashtable (name,i);
                    (name,i)
                end

  fun name(s,n) = s
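The SYMBOL interface above can be sketched in a self-contained way using only the Basis Library. In this sketch the interning hash table is replaced by a mutable association list and the balanced tree by an unbalanced binary search tree; both substitutions are assumptions made for brevity, so lookups here are not guaranteed to be O(log N).

```sml
signature SYMBOL =
sig
  eqtype symbol
  val symbol : string -> symbol
  val name   : symbol -> string
  type 'a table
  val empty  : 'a table
  val enter  : 'a table * symbol * 'a -> 'a table
  val look   : 'a table * symbol -> 'a option
end

structure Symbol :> SYMBOL =
struct
  type symbol = string * int

  (* Interning: an association list stands in for the hash table. *)
  val nextsym = ref 0
  val interned : (string * int) list ref = ref []

  fun symbol s =
    case List.find (fn (s', _) => s' = s) (!interned) of
        SOME (_, i) => (s, i)
      | NONE =>
          let val i = !nextsym
          in
            nextsym := i + 1;
            interned := (s, i) :: !interned;
            (s, i)
          end

  fun name (s, _) = s

  (* Persistent table: an unbalanced search tree keyed by the integer.
     enter copies only the path it walks; the old tree stays valid. *)
  datatype 'a table = Leaf | Node of 'a table * int * 'a * 'a table

  val empty = Leaf

  fun enter (Leaf, (_, k), a) = Node (Leaf, k, a, Leaf)
    | enter (Node (l, k', a', r), sym as (_, k), a) =
        if k < k' then Node (enter (l, sym, a), k', a', r)
        else if k > k' then Node (l, k', a', enter (r, sym, a))
        else Node (l, k, a, r)

  fun look (Leaf, _) = NONE
    | look (Node (l, k', a', r), sym as (_, k)) =
        if k < k' then look (l, sym)
        else if k > k' then look (r, sym)
        else SOME a'
end

val v  = Symbol.symbol "v"
val t1 = Symbol.enter (Symbol.empty, v, "v1")
val t2 = Symbol.enter (t1, v, "v2")
val t3 = Symbol.enter (t2, v, "v3")     (* scope of h *)
val insideH = Symbol.look (t3, v)       (* SOME "v3" *)
val afterH  = Symbol.look (t2, v)       (* SOME "v2": no delete, just reuse t2 *)
```

Leaving a scope costs nothing here: the caller simply continues with the table it had before entering the scope.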

Summary: Symbol Table

A symbol is a pair (s,n) of a string and an integer, where the string s is the identifier name and the integer n is its associated search key.
The mapping from a string to its search key (an integer) is implemented using a hash table.
The symbol table, from a symbol to its attributes, is implemented using IntBinaryMap, a persistent balanced binary tree.

  structure Symbol :> SYMBOL =    (* see Appel page 110 *)
  struct
    type symbol = string * int
    ...
    type 'a table = 'a IntBinaryMap.intmap    (* in SML Library *)
    val empty = IntBinaryMap.empty
    fun enter(t,(s,n),a) = IntBinaryMap.insert(t,n,a)
    fun look(t,(s,n)) = IntBinaryMap.look(t,n)
  end

Environments

Bindings: interesting attributes associated with type, variable, or function identifiers during compilation.

Type bindings: internal representation of types

  structure Types =
  struct
    type unique = unit ref
    datatype ty = INT
                | STRING
                | RECORD of (Symbol.symbol * ty) list * unique
                | ARRAY of ty * unique
                | NIL
                | UNIT
                | NAME of Symbol.symbol * ty option ref
  end

Variable/function bindings: type + location & access information

Environments (cont'd)

The signature for Environment:

  signature Env =
  sig
    type access
    type level
    type label
    type ty    (* = Types.ty *)
    datatype enventry
      = VARentry of {access : access, ty : ty}
      | FUNentry of {level : level, label : label,
                     formals : ty list, result : ty}
    val base_tenv : ty Symbol.table
    val base_env : enventry Symbol.table
  end

  access: location of the variable
  level: static nesting level
  label: the function machine ...

Normally we build one environment for each name space!
  base_tenv is the initial type environment
  base_env is the initial variable+function environment

Tiger Absyn

  datatype 'a option = NONE | SOME of 'a

  datatype var = ...
  and exp = ...
          | OpExp of {left: exp, oper: oper, right: exp, ...
          | LetExp of {decs: dec list, body: exp, ...
  and dec = FunctionDec of fundec list
          | TypeDec of tydec list
          | VarDec of vardec
  withtype field = {name: symbol, typ: symbol, pos: pos}
  and fundec = {name: symbol, params: field list,
                result : (symbol * pos) option,
                body: exp, pos: pos}

FunctionDec and TypeDec carry lists: mutually recursive declarations (they must be consecutive in the source).
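The "one environment for each name space" point can be illustrated with a small sketch. It assumes a persistent association list in place of Symbol.table, drops the access, level and label fields from the entries, and guesses at the contents of the base environments (int, string and print); all of these are simplifications for illustration, not the course's actual definitions.

```sml
(* Hypothetical stand-in for Symbol.table: a persistent association list. *)
type 'a table = (string * 'a) list

val empty : 'a table = []

fun enter (t : 'a table, s : string, a : 'a) : 'a table = (s, a) :: t

fun look (t : 'a table, s : string) : 'a option =
  case List.find (fn (s', _) => s' = s) t of
      SOME (_, a) => SOME a
    | NONE => NONE

datatype ty = INT | STRING | UNIT

(* Entries reduced to their type information only. *)
datatype enventry =
    VARentry of {ty : ty}
  | FUNentry of {formals : ty list, result : ty}

(* Assumed initial contents of the two environments. *)
val base_tenv : ty table =
  enter (enter (empty, "int", INT), "string", STRING)

val base_env : enventry table =
  enter (empty, "print", FUNentry {formals = [STRING], result = UNIT})

(* A type named a and a variable named a do not clash,
   because each name space has its own table. *)
val tenv = enter (base_tenv, "a", INT)
val env  = enter (base_env, "a", VARentry {ty = STRING})

val asType = look (tenv, "a")    (* SOME INT *)
val asVar  = look (env, "a")     (* SOME (VARentry {ty = STRING}) *)
```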

Type-Checking Expressions

  type tenv = Types.ty Symbol.table
  type env = enventry Symbol.table

  (* transexp : env * tenv -> exp -> ty *)
  fun transexp (env,tenv) =
    let fun g(OpExp{left,oper=A.PlusOp,right,pos}) =
              (checkint(g left, pos);
               checkint(g right, pos);
               Types.INT)
          | g(LetExp{decs, body, pos}) =
              let val (env',tenv') = transdecs (env,tenv) decs
               in transexp (env',tenv') body
              end
          ...
     in g
    end

Type-Checking Declarations

  (* transdec : env * tenv -> dec -> env * tenv *)
  fun transdec (env,tenv) =
    let fun g(VarDec{var,typ=NONE,init}) =
              let val ty = transexp (env,tenv) init
                  val b = VARentry{access=(),ty=ty}
               in (enter(env,var,b), tenv)
              end
          | g(FunctionDec[{name,params,body,pos,result=_}]) =
              let val b = FUNentry{...}
                  val env' = enter(env,name,b)
                  val env'' = enterparams(params,env')
               in transexp (env'',tenv) body;
                  (env', tenv)
              end
          | g ...
     in g
    end

  (* transdecs : env * tenv -> dec list -> env * tenv *)
  fun transdecs (env,tenv) [] = (env,tenv)
    | transdecs (env,tenv) (a::r) =
        let val (env',tenv') = transdec (env,tenv) a
         in transdecs (env',tenv') r
        end

Type-Checking

The type of an expression tells us the values it can denote and the operations that can be applied to it.
Type system: a definition of well-formed types plus a set of typing rules that define what type consistency means.
Type-checking ensures that the operations in a program are applied properly. A program that executes without type errors is said to be type safe.

Static type-checking: types are checked at compile time (once and for all).
  parser -> absyn -> type checker -> correct absyn -> intermediate trees
Dynamic type-checking: types are checked at run time (inside the code).

Type Safety

Modern programming languages are usually equipped with a strong type system, meaning a program will either run successfully, or the compiler and the runtime system will report the type error.
  strongly-typed languages: Modula-3, Scheme, ML, Haskell
  weakly-typed languages: C, C++

Safety: a language feature is unsafe if its misuse can corrupt the runtime system so that further execution of the program is not faithful to the language semantics (e.g., no array bounds checking, ...).

A statically-typed language (e.g., ML, Haskell) does most of its type-checking at compile time (except array-bounds checking).
A dynamically-typed language (e.g., Scheme, Lisp) does most of its type-checking at run time.
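The shape of transexp can be seen in a runnable miniature. The sketch below assumes a tiny hypothetical expression language (`IntExp`, `StringExp`, `VarExp`, `PlusExp`, and a one-variable `LetExp`) and an association-list environment; it is not the Tiger abstract syntax, only an illustration of checking operands and extending the environment for a let body.

```sml
datatype ty = INT | STRING

(* Hypothetical mini-language, much smaller than Tiger's Absyn. *)
datatype exp =
    IntExp of int
  | StringExp of string
  | VarExp of string
  | PlusExp of exp * exp
  | LetExp of string * exp * exp     (* let var x := e1 in e2 end *)

exception TypeError of string

type env = (string * ty) list

fun lookup (env : env, x) =
  case List.find (fn (y, _) => y = x) env of
      SOME (_, t) => t
    | NONE => raise TypeError ("undefined variable " ^ x)

fun checkInt INT = ()
  | checkInt _ = raise TypeError "integer required"

(* transExp : env -> exp -> ty *)
fun transExp (env : env) (e : exp) : ty =
  case e of
      IntExp _ => INT
    | StringExp _ => STRING
    | VarExp x => lookup (env, x)
    | PlusExp (l, r) =>
        (checkInt (transExp env l); checkInt (transExp env r); INT)
    | LetExp (x, init, body) =>
        (* The body is checked in the extended environment;
           the caller's env is left untouched. *)
        let val env' = (x, transExp env init) :: env
        in transExp env' body
        end

val ok = transExp [] (LetExp ("v", IntExp 6, PlusExp (VarExp "v", IntExp 1)))
(* ok = INT *)

val bad = (transExp [] (PlusExp (IntExp 1, StringExp "x")); "accepted")
          handle TypeError msg => msg
(* bad = "integer required" *)
```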

Main Issues

What are valid type expressions?
  e.g., int, string, unit, nil, array of int, record {...}
How to define when two types are equivalent?
  name equivalence or structure equivalence
What are the typing rules?
How much type info should be specified in the source program?
  implicitly-typed languages, e.g., ML: use type inference
  explicitly-typed languages, e.g., Tiger, Modula-3: must specify the type of each newly introduced variable.

Types in Tiger

Tiger types are

  ty -> type-id
      | array of type-id
      | {}
      | {id : type-id {, id : type-id}}

type-id is defined by type declarations:

  tydec -> type type-id = ty

The typechecker must translate all source-level type specifications (in absyn) into the following internal type representation:

  structure Types =
  struct
    type unique = unit ref      (* implementing name equivalence *)
    datatype ty = RECORD of (Symbol.symbol * ty) list * unique
                | NIL
                | INT
                | STRING
                | ARRAY of ty * unique
                | NAME of Symbol.symbol * ty option ref
                                (* for recursive type declarations *)
                | UNIT
  end

Type Equivalence

When are two type expressions equivalent?
Name equivalence (NE): T1 and T2 are equivalent if T1 and T2 are identical type names defined by the exact same type declaration.
Structure equivalence (SE): T1 and T2 are equivalent if T1 and T2 are composed of the same constructors applied in the same order.

Here point and ptr are equivalent under SE but not equivalent under NE:

  type point = {x : int, y : int}
  type ptr = {x : int, y : int}
  function f(a : point) = a

Here the redeclaration of point defines a new type under NE; thus it is a type error when function f is applied to p:

  type point = {x : int, y : int}
  var p : point := point {x=3, y=5}
  var q : point := f(p)

Typing Rules in Tiger

Tiger uses name equivalence; type constraints must be a type-id (used on variable declarations, function parameters and results, array elements, and record fields).

The expression nil has the special type NIL. NIL belongs to every record type: it is equivalent to any record type. nil must be used in a context where its type can be determined.

  var p : point := nil        OK
  if p <> nil then ...        OK
  var a := nil                Illegal

For a variable declaration var id : type-id := exp, the type of expression exp must be equivalent to type type-id.
For an assignment expression id := exp, id and exp must have equivalent types.
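How a `unit ref` implements name equivalence can be shown in a few lines. The sketch below assumes a reduced `ty` (no ARRAY, NAME or UNIT) with plain strings as field names; `equiv` and `declareRecord` are hypothetical helper names. Each record declaration allocates a fresh reference cell, and two record types are equivalent only if they carry the same cell.

```sml
type unique = unit ref

(* Reduced version of the internal type representation. *)
datatype ty =
    INT
  | STRING
  | RECORD of (string * ty) list * unique
  | NIL

(* Name equivalence: records are compared by the identity of their
   ref cell, never by their field lists. NIL matches any record type. *)
fun equiv (INT, INT) = true
  | equiv (STRING, STRING) = true
  | equiv (RECORD (_, u1), RECORD (_, u2)) = (u1 = u2)
  | equiv (NIL, RECORD _) = true
  | equiv (RECORD _, NIL) = true
  | equiv _ = false

(* Every type declaration gets its own fresh cell. *)
fun declareRecord fields = RECORD (fields, ref ())

val point = declareRecord [("x", INT), ("y", INT)]
val ptr   = declareRecord [("x", INT), ("y", INT)]

val samePoint = equiv (point, point)    (* true *)
val pointPtr  = equiv (point, ptr)      (* false: same structure, different declarations *)
val nilPoint  = equiv (NIL, point)      (* true *)
```

Equality on `unit ref` is pointer identity in Standard ML, so the comparison is constant time and cannot confuse two structurally identical declarations.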

Typing Rules in Tiger (cont'd)

Function call: the types of the formal parameters must be equivalent to the types of the actual arguments.
Array subscript must have integer type.
Array creation type-id [exp1] of exp2: exp1 has type int; exp2 must have a type equivalent to that of the elements of type-id.
Record creation type-id {id = exp1, ...}: each field expression (expi) must have a type equivalent to that defined in type-id.
If-expression if exp1 then exp2 else exp3: the type of exp1 must be integer; the types of exp2 and exp3 should be equivalent.
For-expression for id := exp1 to exp2 do exp3: the types of exp1 and exp2 must be integer; exp3 should produce no value.
...
For more info, read the Appendix in Appel.

Recursive Type Declarations

How to convert the following declaration into the internal type representation?

  type list = {first : int, rest : list}

Problem: when we convert the r.h.s., list is not defined in the tenv yet.
Solution: use the special NAME type

  datatype ty = NAME of Symbol.symbol * ty option ref
              | ...

First, enter a header type for list:

  val tenv' = enter(tenv, name, NAME(name, ref NONE))

Then process the body (i.e., the r.h.s.) of the type declaration, and assign the result into the reference cell in the NAME type.

Recursive Function Declarations

Problem: when we process the right-hand side of function declarations, we may encounter symbols that are not defined in the env yet.

  function do_nothing1(a: int, b: string) = do_nothing2(a+1)
  function do_nothing2(d: int) = do_nothing1(d, "str")

Solution: first put all function names (on the l.h.s.) with their header information (e.g., parameter list, function name, type, all of which can be figured out easily) into the env; then process each function's body in this augmented env.
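Both recursive cases follow the same pattern: enter headers first, then process bodies. A minimal sketch of the type case, for the declaration `type list = {first : int, rest : list}`, is given below. It assumes string names and an association-list tenv instead of Symbol.table, and the helper `actualTy` is a hypothetical name for following NAME links.

```sml
datatype ty =
    INT
  | RECORD of (string * ty) list * unit ref
  | NAME of string * ty option ref

type tenv = (string * ty) list

fun look (tenv : tenv, s) =
  case List.find (fn (s', _) => s' = s) tenv of
      SOME (_, t) => SOME t
    | NONE => NONE

val base : tenv = [("int", INT)]

(* Step 1: enter a header for list whose cell is still empty. *)
val cell : ty option ref = ref NONE
val tenv : tenv = ("list", NAME ("list", cell)) :: base

(* Step 2: translate the right-hand side. The field type list now
   resolves to the header instead of being undefined. *)
val body =
  RECORD ([("first", valOf (look (tenv, "int"))),
           ("rest",  valOf (look (tenv, "list")))], ref ())

(* Step 3: assign the result into the reference cell of the header. *)
val () = cell := SOME body

(* Follow filled-in NAME links to reach the underlying type. *)
fun actualTy (NAME (_, ref (SOME t))) = actualTy t
  | actualTy t = t

(* The rest field leads back to the record itself. *)
val restIsRecord =
  case actualTy (valOf (look (tenv, "list"))) of
      RECORD ([_, (_, rest)], _) =>
        (case actualTy rest of RECORD _ => true | _ => false)
    | _ => false
(* restIsRecord = true *)
```

The mutable cell is what makes the cycle possible: the record can mention the header before the header knows what it stands for.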
Other Semantic Checks

Many other things can be done in the type-checking phase:
  resolve overloaded operators
  type inference
  check if all identifiers are defined
  check correct nesting of break statements

Coming soon: Assignment 5 is to write the type-checker.
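As one example from this list, the nesting of break statements can be checked by threading a flag through the traversal. The sketch assumes a hypothetical four-constructor expression type rather than the Tiger abstract syntax, and the function name `breaksOk` is made up for illustration.

```sml
(* Hypothetical mini-syntax, just enough to show the check. *)
datatype exp =
    Break
  | While of exp * exp      (* condition, body *)
  | Seq of exp list
  | Other                   (* any expression without sub-expressions *)

(* inLoop records whether the current expression sits inside a loop body.
   The result is true when every break is properly nested. *)
fun breaksOk inLoop e =
  case e of
      Break => inLoop
    | While (c, body) => breaksOk inLoop c andalso breaksOk true body
    | Seq es => List.all (breaksOk inLoop) es
    | Other => true

val good = breaksOk false (While (Other, Seq [Other, Break]))    (* true *)
val bad  = breaksOk false (Seq [While (Other, Other), Break])    (* false *)
```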