Formal Languages and Compilers Lecture IX Semantic Analysis: Type Checking & Symbol Table


Formal Languages and Compilers
Lecture IX Semantic Analysis: Type Checking & Symbol Table
Free University of Bozen-Bolzano, Faculty of Computer Science, POS Building, Room: 2.03
artale@inf.unibz.it
http://www.inf.unibz.it/ artale/
Formal Languages and Compilers, BSc course 2017/18, First Semester

Summary
Type Checking
Type System
Specifying a Type Checker
Type Conversion
Symbol Table
Scope Rules and Symbol Tables
Translation Schemes for building Symbol Tables

Type Checking: Intro
A compiler must check that the program follows the Type Rules of the language. Information about Data Types is maintained and computed by the compiler. The Type Checker is the compiler module devoted to type checking tasks.
Examples of tasks include:
1 The operator mod is defined only if both operands are integers;
2 Indexing is allowed only on an array, and the index must be an integer;
3 A function must be called with the correct number of arguments, and each argument must have the correct type.

Type Checking: Intro (Cont.)
Type Checking may be either static or dynamic: checking done at compile time is static, checking done at run time is dynamic. In languages like Java, C and Pascal type checking is primarily static and is used to check the correctness of a program before its execution. Static type checking is also useful to determine the amount of memory needed to store variables.
The design of a Type Checker depends on the syntactic structure of language constructs, the Type Expressions of the language, and the rules for assigning types to constructs.


Type Expressions
A Type Expression denotes the type of a language construct. A type expression is either a Basic Type or is built by applying Type Constructors to other types.
1 A Basic Type is a type expression (int, real, boolean, char). The basic type void represents the empty set and allows statements to be checked;
2 Type expressions can be associated with a name: Type Names are type expressions;
3 A Type Constructor applied to type expressions is a type expression. Type constructors include:
  1 Array. If T is a type expression, then array(I, T) is a type expression denoting an array with elements of type T and index range I, e.g., array[1..10] of int == array(1..10, int);
  2 Product. If T1 and T2 are type expressions, then their Cartesian Product T1 × T2 is a type expression;
  3 Record. Similar to Product but with names for the different fields (used to access the components of a record). Example of a C record type: struct { double r; int i; }

Type Expressions (Cont.)
  4 Pointer. If T is a type expression, then pointer(T) is the type expression "pointer to an object of type T";
  5 Function. If D is the domain and R the range of the function, then we denote its type by the type expression D : R. The mod operator has type int × int : int. The Pascal function
      function f(a, b: char): int
    has type char × char : int.
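As an illustration only, the type constructors above could be represented as immutable values, so that two structurally identical type expressions compare equal. The class names (Basic, Array, Product, Pointer, Function) are assumptions made for this sketch and are not taken from the lecture; records are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Basic:
    name: str        # "int", "real", "boolean", "char" or "void"


@dataclass(frozen=True)
class Array:
    index: range     # index range I
    elem: object     # element type T


@dataclass(frozen=True)
class Product:
    left: object     # T1
    right: object    # T2


@dataclass(frozen=True)
class Pointer:
    target: object   # type of the object pointed to


@dataclass(frozen=True)
class Function:
    domain: object   # D
    result: object   # R


INT, CHAR = Basic("int"), Basic("char")

# array[1..10] of int == array(1..10, int)
ten_ints = Array(range(1, 11), INT)
# mod has type int x int : int
mod_type = Function(Product(INT, INT), INT)
# function f(a, b: char): int has type char x char : int
f_type = Function(Product(CHAR, CHAR), INT)
```

Because the dataclasses are frozen and compared field by field, equality here is structural, which is one possible answer to the question of when two type expressions are the same.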

Type System
Type System: a collection of (semantic) rules for assigning type expressions to the various parts of a program. We will show how syntax-directed definitions can be used to specify Type Systems. A type checker implements a type system.
Definition. A language is strongly typed if its compiler can guarantee that the programs it accepts will execute without type errors.


Specification of a Type Checker
We specify a type checker for a simple language where identifiers have an associated type.
Grammar for Declarations and Expressions:
P → D ; E
D → D ; D | id : T
T → char | int | array[num] of T | ↑T
E → literal | num | id | E mod E | E[E] | E↑

Specification of a Type Checker (Cont.)
The syntax-directed definition for associating a type to an Identifier is:

Production                Semantic Rule
D → id : T                addtype(id.entry, T.type)
T → char                  T.type := char
T → int                   T.type := int
T → ↑T1                   T.type := pointer(T1.type)
T → array[num] of T1      T.type := array(1..num.val, T1.type)

All the attributes are synthesized. Since P → D ; E, all the identifiers will have their types saved in the symbol table before type checking an expression E.

Specification of a Type Checker (Cont.)
The syntax-directed definition for associating a type to an Expression is:

Production          Semantic Rule
E → literal         E.type := char
E → num             E.type := int
E → id              E.type := lookup(id.entry)
E → E1 mod E2       E.type := if E1.type = int and E2.type = int then int else type_error
E → E1[E2]          E.type := if E2.type = int and E1.type = array(i, t) then t else type_error
E → E1↑             E.type := if E1.type = pointer(t) then t else type_error
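The expression rules can be sketched as a recursive function over a small tuple-based syntax tree. The tree encoding, the dictionary used as symbol table, and the choice to return type_error for an undeclared identifier are all assumptions of this sketch, not part of the lecture.

```python
TYPE_ERROR = "type_error"


def check_expr(e, symtab):
    """Return the type of expression e, or TYPE_ERROR.

    Types are "int", "char", ("array", index_range, elem) or ("pointer", target).
    """
    kind = e[0]
    if kind == "literal":                      # E -> literal
        return "char"
    if kind == "num":                          # E -> num
        return "int"
    if kind == "id":                           # E -> id
        return symtab.get(e[1], TYPE_ERROR)    # assumption: undeclared name is an error
    if kind == "mod":                          # E -> E1 mod E2
        t1, t2 = check_expr(e[1], symtab), check_expr(e[2], symtab)
        return "int" if t1 == "int" and t2 == "int" else TYPE_ERROR
    if kind == "index":                        # E -> E1[E2]
        t1, t2 = check_expr(e[1], symtab), check_expr(e[2], symtab)
        if t2 == "int" and isinstance(t1, tuple) and t1[0] == "array":
            return t1[2]
        return TYPE_ERROR
    if kind == "deref":                        # E -> E1 followed by the pointer operator
        t1 = check_expr(e[1], symtab)
        if isinstance(t1, tuple) and t1[0] == "pointer":
            return t1[1]
        return TYPE_ERROR
    raise ValueError(f"unknown expression kind: {kind}")


symtab = {"a": ("array", range(1, 11), "int"), "p": ("pointer", "char"), "c": "char"}
```

For example, check_expr(("index", ("id", "a"), ("num",)), symtab) yields "int", while ("mod", ("id", "c"), ("num",)) yields type_error because c is a char.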

Specification of a Type Checker (Cont.)
We change the start rule from P → D ; E to P → D ; S. The syntax-directed definition for associating a type to a Statement is:

Production            Semantic Rule
S → id := E           S.type := if id.type = E.type then void else type_error
S → if E then S1      S.type := if E.type = boolean then S1.type else type_error
S → while E do S1     S.type := if E.type = boolean then S1.type else type_error
S → S1 ; S2           S.type := if S1.type = void and S2.type = void then void else type_error

The type expression for a statement is either void or type_error.
Final Remark. For languages with type names or (even worse) sub-typing, we need to define when two types are equivalent.
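A minimal sketch of the statement rules follows. To keep it self-contained, expressions are reduced to identifiers (strings) and Python int/bool constants; that tiny expression language and the tuple encoding of statements are assumptions made for the example.

```python
VOID, TYPE_ERROR = "void", "type_error"


def expr_type(e, symtab):
    # Deliberately tiny expression language, just enough to drive the statement rules.
    if isinstance(e, bool):       # checked before int, since bool is a subclass of int
        return "boolean"
    if isinstance(e, int):
        return "int"
    return symtab.get(e, TYPE_ERROR)


def check_stmt(s, symtab):
    """Return VOID if statement s is well typed, TYPE_ERROR otherwise."""
    kind = s[0]
    if kind == "assign":                       # S -> id := E
        _, name, e = s
        return VOID if symtab.get(name) == expr_type(e, symtab) else TYPE_ERROR
    if kind in ("if", "while"):                # S -> if E then S1 | while E do S1
        _, e, body = s
        if expr_type(e, symtab) == "boolean":
            return check_stmt(body, symtab)
        return TYPE_ERROR
    if kind == "seq":                          # S -> S1 ; S2
        _, s1, s2 = s
        both_ok = check_stmt(s1, symtab) == VOID and check_stmt(s2, symtab) == VOID
        return VOID if both_ok else TYPE_ERROR
    raise ValueError(f"unknown statement kind: {kind}")


symtab = {"x": "int", "b": "boolean"}
program = ("seq", ("assign", "x", 3), ("while", "b", ("assign", "x", "x")))
```

Here check_stmt(program, symtab) is void, whereas a condition of type int, as in ("if", "x", ...), gives type_error.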

Type Checker for Functions

Production                  Semantic Rule
D → id : T                  addtype(id.entry, T.type); D.type := T.type
D → D1 ; D2                 D.type := D1.type × D2.type
Fun → fun id(D) : T ; B     addtype(id.entry, D.type : T.type)
B → {S}
S → id(EList)               S.type := if lookup(id.entry) = t1 : t2 and EList.type = t1 then t2 else type_error
EList → E                   EList.type := E.type
EList → EList1 , E          EList.type := EList1.type × E.type
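The function rules could be sketched as below. One simplification is assumed: the Cartesian product of parameter types is stored as a flat tuple rather than as nested pairs, so that the grouping of D1 ; D2 and of EList1 , E does not affect the comparison. The helper names product, declare_fun and check_call are hypothetical.

```python
TYPE_ERROR = "type_error"


def product(*types):
    """Flattened product t1 x t2 x ...; a single type stands for itself.

    Assumes at least one type, as the grammar requires a non-empty D and EList.
    """
    return types[0] if len(types) == 1 else ("prod",) + types


def declare_fun(symtab, name, param_types, result):
    # Fun -> fun id(D) : T ; B   gives   addtype(id.entry, D.type : T.type)
    symtab[name] = ("fun", product(*param_types), result)


def check_call(symtab, name, arg_types):
    # S -> id(EList): the argument product must equal the declared domain t1.
    f = symtab.get(name)
    if isinstance(f, tuple) and f[0] == "fun" and f[1] == product(*arg_types):
        return f[2]
    return TYPE_ERROR


symtab = {}
declare_fun(symtab, "f", ["char", "char"], "int")
```

With this table, check_call(symtab, "f", ["char", "char"]) returns "int", and a call with a wrong number or type of arguments returns type_error.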


Type Conversion
Example. What's the type of x + y if:
1 x is of type real;
2 y is of type int;
3 Different machine instructions are used for operations on reals and integers.
Depending on the language, the compiler must apply specific conversion rules to convert the type of one of the operands of +. The type checker in a compiler can insert these conversion operators into the intermediate code. Such an implicit type conversion is called Coercion.

Type Coercion in Expressions
The syntax-directed definition for coercion from integer to real for a generic arithmetic operation op is:

Production        Semantic Rule
E → num           E.type := int
E → realnum       E.type := real
E → id            E.type := lookup(id.entry)
E → E1 op E2      E.type := if E1.type = int and E2.type = int then int
                            else if E1.type = int and E2.type = real then real
                            else if E1.type = real and E2.type = int then real
                            else if E1.type = real and E2.type = real then real
                            else type_error
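The coercion rule can be sketched as a checker that returns both the type and a rewritten tree in which an explicit conversion node wraps each int operand of a mixed operation. The node name inttoreal and the tuple encoding of the tree are assumptions of this sketch.

```python
TYPE_ERROR = "type_error"


def arith_type(t1, t2):
    """Result type of E1 op E2 under the int-to-real coercion rule."""
    numeric = ("int", "real")
    if t1 not in numeric or t2 not in numeric:
        return TYPE_ERROR
    return "int" if t1 == "int" and t2 == "int" else "real"


def check(e, symtab):
    """Return (type, tree), with ("inttoreal", operand) inserted where needed."""
    kind = e[0]
    if kind == "num":                          # E -> num
        return "int", e
    if kind == "realnum":                      # E -> realnum
        return "real", e
    if kind == "id":                           # E -> id
        return symtab.get(e[1], TYPE_ERROR), e
    if kind == "op":                           # E -> E1 op E2
        _, op, left, right = e
        t1, left = check(left, symtab)
        t2, right = check(right, symtab)
        t = arith_type(t1, t2)
        if t == "real":                        # coerce whichever operand is an int
            if t1 == "int":
                left = ("inttoreal", left)
            if t2 == "int":
                right = ("inttoreal", right)
        return t, ("op", op, left, right)
    raise ValueError(f"unknown expression kind: {kind}")


symtab = {"x": "real", "y": "int", "c": "char"}
typ, tree = check(("op", "+", ("id", "x"), ("id", "y")), symtab)
```

For x + y with x real and y int, typ is "real" and the right operand of tree becomes ("inttoreal", ("id", "y")).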


Symbol Table
The Symbol Table is the major inherited attribute and the major data structure as well. Symbol Tables store information about the name, type, scope and allocation size of identifiers. A Symbol Table must support efficient insertion and lookup. Dynamic data structures must be used to implement a symbol table: Linear Lists and Hash Tables are the most used. Each entry has the form of a record with a field for each piece of information.

Storing Names
Names of identifiers are stored in a symbol table following an indirect scheme: the Name field stores a pointer into an array of characters, the String Table, pointing to the first character of the lexeme. This scheme allows for a fixed size of the Name field.
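A minimal sketch of the indirect scheme: all lexemes live in one character array, and a symbol table entry keeps only the index of the first character. Terminating each lexeme with a NUL character is an assumption of this sketch; storing an explicit length would work as well.

```python
class StringTable:
    """One shared character array holding every lexeme, each ended by a marker."""

    END = "\0"  # assumed end-of-lexeme marker

    def __init__(self):
        self.chars = []

    def add(self, lexeme):
        """Append lexeme and return the index of its first character."""
        start = len(self.chars)
        self.chars.extend(lexeme)
        self.chars.append(self.END)
        return start

    def get(self, start):
        """Recover the lexeme that begins at index start."""
        end = self.chars.index(self.END, start)
        return "".join(self.chars[start:end])


strings = StringTable()
count_ptr = strings.add("count")   # fixed-size value stored in the Name field
i_ptr = strings.add("i")
```

The Name field then holds a small integer regardless of how long the identifier is, which is what keeps symbol table entries at a fixed size.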


Symbol Tables and Scope Rules
A Block in a programming language is any set of language constructs that can contain declarations. A language is Block Structured if
1 Blocks can be nested inside other blocks, and
2 The Scope of declarations in a block is limited to that block and the blocks contained in that block.
Most Closely Nested Rule. Given several different declarations for the same identifier, the declaration that applies is the one in the most closely nested block.

Symbol Tables and Scope Rules (Cont.)
To implement symbol tables complying with nested scopes:
1 The insert operation into the symbol table must not overwrite previous declarations;
2 The lookup operation into the symbol table must always follow the most closely nested rule;
3 The delete operation must delete only the most recent declarations.
The symbol table behaves in a stack-like manner.

Symbol Tables and Scope Rules (Cont.)
One possible solution to implement a symbol table under nested scopes is to maintain a separate symbol table for each scope. Tables must be linked both from inner to outer scope, and from outer to inner scope.
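One way to picture a table per scope is a small class whose instances link to the enclosing scope (for lookup) and to the nested scopes (for later traversal). The class name Scope and the decision to reject a redeclaration inside the same block are assumptions of this sketch.

```python
class Scope:
    """Symbol table for one block, linked to its enclosing and nested blocks."""

    def __init__(self, parent=None):
        self.parent = parent          # inner-to-outer link
        self.children = []            # outer-to-inner links
        self.entries = {}
        if parent is not None:
            parent.children.append(self)

    def insert(self, name, info):
        # A declaration never overwrites one from an enclosing block, because
        # each block has its own dictionary. Redeclaring a name within the same
        # block is treated as an error here (an assumption of this sketch).
        if name in self.entries:
            raise KeyError(f"{name} already declared in this block")
        self.entries[name] = info

    def lookup(self, name):
        """Most closely nested rule: search this block, then the enclosing ones."""
        scope = self
        while scope is not None:
            if name in scope.entries:
                return scope.entries[name]
            scope = scope.parent
        return None


outer = Scope()
outer.insert("x", "int")
inner = Scope(outer)
inner.insert("x", "real")
inner.insert("y", "char")
```

Leaving a block simply means going back to the parent scope, which gives the stack-like behaviour: inner.lookup("x") sees the real declaration, outer.lookup("x") still sees the int one.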


Lexical- vs. Syntactic-Time Construction
1 Information is first entered into a symbol table by the lexical analyzer only if the programming language does not allow for different declarations for the same identifier (scope).
2 If scoping is allowed, the lexical analyzer will only return the name of the identifier together with the token: the identifier is inserted into the symbol table when the syntactic role played by the identifier is discovered.

Relative Address
Relative Address. Storage allocation information consisting of an offset from a base (usually zero) address: the Loader will be responsible for the run-time storage. The following translation scheme computes such an address using a global variable called offset.

P → {offset := 0} D
D → D ; D
D → id : T                {enter(id.name, T.type, offset); offset := offset + T.width}
T → int                   {T.type := int; T.width := 4}
T → real                  {T.type := real; T.width := 8}
T → array[num] of T1      {T.type := array(num.val, T1.type); T.width := num.val * T1.width}
T → ↑T1                   {T.type := pointer(T1.type); T.width := 4}

Relative Address (Cont.)
The global variable offset keeps track of the next available address. Before the first declaration, offset is set to 0. As each new identifier is seen, it is entered in the symbol table and offset is incremented by the width of its type. type and width are synthesized attributes for non-terminal T.
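The offset computation can be sketched on a plain list of declarations. The widths (4 for int and pointers, 8 for real) are the ones in the translation scheme; the tuple encoding of types and the function names width and layout are assumptions of the example.

```python
def width(t):
    """Synthesized attribute T.width for a type expression t."""
    if t == "int":
        return 4
    if t == "real":
        return 8
    if t[0] == "array":               # ("array", num, elem): num.val * T1.width
        return t[1] * width(t[2])
    if t[0] == "pointer":             # ("pointer", target)
        return 4
    raise ValueError(f"unknown type: {t!r}")


def layout(decls):
    """Enter each (name, type) pair at the current offset, then advance offset."""
    table, offset = {}, 0
    for name, t in decls:
        table[name] = (t, offset)     # enter(id.name, T.type, offset)
        offset += width(t)            # offset := offset + T.width
    return table, offset


decls = [
    ("i", "int"),
    ("a", ("array", 10, "real")),
    ("p", ("pointer", "int")),
    ("j", "int"),
]
table, total = layout(decls)
```

Working through the arithmetic: i is placed at 0, a at 4, p at 4 + 10 * 8 = 84, j at 88, and the final value of offset is 92.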

Keeping Track of Scope Information
Let's consider the case of Nested Procedures: when a nested procedure is seen, processing of declarations in the enclosing procedure is suspended. To keep track of nesting, a stack is maintained. We associate a new symbol table with each procedure: when we need to enter a new identifier into a symbol table, we need to specify which symbol table to use.

Keeping Track of Scope Information (Cont.)
We are now able to provide the translation scheme for processing declarations in nested procedures. We use marker non-terminals, M and N, to rewrite productions with embedded rules.

P → M D                   {addwidth(top(tblptr), top(offset)); pop(tblptr); pop(offset)}
M → ϵ                     {t := mktable(nil); push(t, tblptr); push(0, offset)}
D → Ds ; D | Ds
Ds → proc id ; N D ; S    {t := top(tblptr); addwidth(t, top(offset)); pop(tblptr); pop(offset); enterproc(top(tblptr), id.name, t)}
Ds → id : T               {enter(top(tblptr), id.name, T.type, top(offset)); top(offset) := top(offset) + T.width}
N → ϵ                     {t := mktable(top(tblptr)); push(t, tblptr); push(0, offset)}

Keeping Track of Scope Information (Cont.)
The semantic rules make use of the following procedures and stack variables:
1 mktable(previous) creates a new symbol table and returns its pointer. The argument previous is the pointer to the symbol table of the enclosing procedure.
2 The stack tblptr holds pointers to symbol tables of the enclosing procedures.
3 The stack offset keeps track of the relative address w.r.t. a given nesting level.
4 enter(table, name, type, offset) creates a new entry for the identifier name in the symbol table pointed to by table, specifying its type and offset.
5 addwidth(table, width) records the cumulative width of all the entries in table in the header of the symbol table.
6 enterproc(table, name, newtable) creates a new entry for procedure name in the symbol table pointed to by table. The argument newtable points to the symbol table for this procedure name.
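The procedures and the two stacks can be simulated directly. In this sketch the parser is replaced by hand-written calls in the order the semantic actions would fire; the Table class and the driver functions begin_scope, declare, end_proc and end_program are assumptions, while mktable, enter, addwidth and enterproc follow the descriptions above.

```python
class Table:
    def __init__(self, previous):
        self.previous = previous      # table of the enclosing procedure, or None
        self.entries = {}
        self.width = 0                # header field filled in by addwidth


def mktable(previous):
    return Table(previous)


def enter(table, name, type_, offset):
    table.entries[name] = (type_, offset)


def addwidth(table, width):
    table.width = width


def enterproc(table, name, newtable):
    table.entries[name] = ("proc", newtable)


tblptr, offset = [], []               # the two stacks


def begin_scope():                    # actions of M -> eps and N -> eps
    tblptr.append(mktable(tblptr[-1] if tblptr else None))
    offset.append(0)


def declare(name, type_, width):      # action of Ds -> id : T
    enter(tblptr[-1], name, type_, offset[-1])
    offset[-1] += width


def end_proc(name):                   # action of Ds -> proc id ; N D ; S
    t = tblptr.pop()
    addwidth(t, offset.pop())
    enterproc(tblptr[-1], name, t)


def end_program():                    # action of P -> M D
    t = tblptr.pop()
    addwidth(t, offset.pop())
    return t


# Declarations: x : int; proc p; (y : real; z : int); w : int
begin_scope()
declare("x", "int", 4)
begin_scope()
declare("y", "real", 8)
declare("z", "int", 4)
end_proc("p")
declare("w", "int", 4)
root = end_program()
```

After the run, the outer table records x at offset 0 and w at offset 4 with total width 8, and the entry for p points to a separate table of width 12 whose offsets restart from 0.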

Summary of Lecture IX
Type Checking
Type System
Specifying a Type Checker
Type Conversion
Symbol Table
Scope Rules and Symbol Tables
Translation Schemes for building Symbol Tables