Semantic Analysis and Type Checking

Similar documents
The compilation process is driven by the syntactic structure of the program as discovered by the parser

Acknowledgement. CS Compiler Design. Semantic Processing. Alternatives for semantic processing. Intro to Semantic Analysis. V.

5. Semantic Analysis!

5. Semantic Analysis. Mircea Lungu Oscar Nierstrasz

Symbol Tables. For compile-time efficiency, compilers often use a symbol table: associates lexical names (symbols) with their attributes

5. Semantic Analysis. Mircea Lungu Oscar Nierstrasz

Symbol Table Information. Symbol Tables. Symbol table organization. Hash Tables. What kind of information might the compiler need?

CA Compiler Construction

The role of semantic analysis in a compiler

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler so far

1 Lexical Considerations

Lecture 7: Type Systems and Symbol Tables. CS 540 George Mason University

Lexical Considerations

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

Short Notes of CS201

Lexical Considerations

CS201 - Introduction to Programming Glossary By

The Compiler So Far. Lexical analysis Detects inputs with illegal tokens. Overview of Semantic Analysis

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 17: Types and Type-Checking 25 Feb 08

Static Semantics. Lecture 15. (Notes by P. N. Hilfinger and R. Bodik) 2/29/08 Prof. Hilfinger, CS164 Lecture 15 1

CS558 Programming Languages

A Short Summary of Javali

Type Checking. Outline. General properties of type systems. Types in programming languages. Notation for type rules.

G Programming Languages - Fall 2012

Outline. General properties of type systems. Types in programming languages. Notation for type rules. Common type rules. Logical rules of inference

Semantic Analysis. Lecture 9. February 7, 2018

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres

Anatomy of a Compiler. Overview of Semantic Analysis. The Compiler So Far. Why a Separate Semantic Analysis?

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

Simply-Typed Lambda Calculus

G Programming Languages Spring 2010 Lecture 4. Robert Grimm, New York University

Static Semantics. Winter /3/ Hal Perkins & UW CSE I-1

Compilers. Type checking. Yannis Smaragdakis, U. Athens (original slides by Sam

The Decaf Language. 1 Lexical considerations

Intermediate Code Generation

CS558 Programming Languages

CS 330 Lecture 18. Symbol table. C scope rules. Declarations. Chapter 5 Louden Outline

Type systems. Static typing

Lecture 16: Static Semantics Overview 1

Programming Languages Third Edition. Chapter 7 Basic Semantics

Compiler Theory. (Semantic Analysis and Run-Time Environments)

COMP 181. Agenda. Midterm topics. Today: type checking. Purpose of types. Type errors. Type checking

Operational Semantics of Cool

Types. Type checking. Why Do We Need Type Systems? Types and Operations. What is a type? Consensus

Formal Languages and Compilers Lecture IX Semantic Analysis: Type Chec. Type Checking & Symbol Table

Informatica 3 Syntax and Semantics

CST-402(T): Language Processors

COS 320. Compiling Techniques

Lecture Outline. COOL operational semantics. Operational Semantics of Cool. Motivation. Lecture 13. Notation. The rules. Evaluation Rules So Far

Semantic Analysis. How to Ensure Type-Safety. What Are Types? Static vs. Dynamic Typing. Type Checking. Last time: CS412/CS413

G Programming Languages - Fall 2012

Type Checking. Error Checking

LECTURE 3. Compiler Phases

The Structure of a Syntax-Directed Compiler

Lecture08: Scope and Lexical Address

UNIT-4 (COMPILER DESIGN)

System Software Assignment 1 Runtime Support for Procedures

Semantic Analysis. CSE 307 Principles of Programming Languages Stony Brook University

CSE 504. Expression evaluation. Expression Evaluation, Runtime Environments. One possible semantics: Problem:

Topics Covered Thus Far. CMSC 330: Organization of Programming Languages. Language Features Covered Thus Far. Programming Languages Revisited

Compiling and Interpreting Programming. Overview of Compilers and Interpreters

Operational Semantics. One-Slide Summary. Lecture Outline

CSCE 314 Programming Languages. Type System

Inheritance. Transitivity

CMSC 330: Organization of Programming Languages

More Lambda Calculus and Intro to Type Systems

Intermediate Formats. for object oriented languages

Intermediate Code Generation

Final CSE 131B Spring 2004

Informal Semantics of Data. semantic specification names (identifiers) attributes binding declarations scope rules visibility

Procedure and Object- Oriented Abstraction

Introduction to Programming Using Java (98-388)

Weeks 6&7: Procedures and Parameter Passing

The Decaf language 1

CS 415 Midterm Exam Spring 2002

CSC 533: Organization of Programming Languages. Spring 2005

Lecture Outline. COOL operational semantics. Operational Semantics of Cool. Motivation. Notation. The rules. Evaluation Rules So Far.

CS 314 Principles of Programming Languages. Lecture 11

Intermediate Representations & Symbol Tables

CS143 Handout 03 Summer 2012 June 27, 2012 Decaf Specification

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square)

Compiler construction

Design issues for objectoriented. languages. Objects-only "pure" language vs mixed. Are subclasses subtypes of the superclass?

Implementing Higher-Level Languages. Quick tour of programming language implementation techniques. From the Java level to the C level.

The PCAT Programming Language Reference Manual

11. a b c d e. 12. a b c d e. 13. a b c d e. 14. a b c d e. 15. a b c d e

Tail Calls. CMSC 330: Organization of Programming Languages. Tail Recursion. Tail Recursion (cont d) Names and Binding. Tail Recursion (cont d)

Chapter 5. Names, Bindings, and Scopes

Module 27 Switch-case statements and Run-time storage management

Implications of Substitution עזאם מרעי המחלקה למדעי המחשב אוניברסיטת בן-גוריון

Every language has its own scoping rules. For example, what is the scope of variable j in this Java program?

Intermediate Formats. for object oriented languages

Today's Topics. CISC 458 Winter J.R. Cordy

Run-time Environments

The SPL Programming Language Reference Manual

CS164: Programming Assignment 5 Decaf Semantic Analysis and Code Generation

Run-time Environments

Chapter 5 Names, Bindings, Type Checking, and Scopes

Transcription:

Semantic Analysis and Type Checking The compilation process is driven by the syntactic structure of the program as discovered by the parser Semantic routines: interpret meaning of the program based on its syntactic structure two purposes: finish analysis by deriving data and control-dependent information begin synthesis by generating the IR or target code associated with individual productions of a context-free grammar or subtrees of a syntax tree 1

Data and Control-dependent analyses What dependency questions might the compiler ask? Answers require understanding of contexts. 1. Is x scalar, an array, or a function? 2. Is x declared before it is used? 3. Are any names declared but not used? 4. Which declaration of x does this reference? 5. Is an expression type-consistent? 6. Does the dimension of a reference match the declaration? 7. Where can x be stored? (heap, stack,...) 8. Does *p reference the result of a malloc()? 9. Is x defined before it is used? 10. Is an array reference in bounds? 11. Does function foo produce a constant value? These cannot be answered with a context-free grammar 2

Context-sensitive analysis Why are answers to these questions hard? answers often depend on values, not syntax answers rely on tracking data and control-flow questions and answers involve non-local information Several alternatives: abstract syntax tree (attribute grammars) symbol tables language design specify non-local computations automatic evaluators central store for facts express checking code simplify language avoid problems 3

Symbol tables For compile-time efficiency, compilers often use a symbol table: associates lexical names (symbols) with their attributes What items should be entered? variable names defined constants procedure and function names literal constants and strings source text labels compiler-generated temporaries Separate table for structure layouts (types) (we ll get there) (field offsets and lengths) A symbol table is a compile-time structure 4

Symbol table information What kind of information might the compiler need? textual name data type dimension information declaring procedure lexical level of declaration storage class offset in storage if record, pointer to structure table if parameter, by-reference or by-value? can it be aliased? to what other names? number and type of arguments to functions (for aggregates) (base address) 5

Nested scopes: block-structured symbol tables What information is needed? when we ask about a name, we want the most recent declaration the declaration may be from the current scope or some enclosing scope innermost scope overrides declarations from outer scopes Key point: new declarations (usually) occur only in current scope What operations do we need? void put (Symbol key, Object value) binds key to value Object get(symbol key) returns value bound to key void beginscope() remembers current state of table void endscope() restores table to state at most recent scope that has not been ended May need to preserve list of locals for the debugger 6

Attribute information Attributes are internal representation of declarations Symbol table associates names with attributes Names may have different attributes depending on their meaning: variables: type, procedure level, frame offset types: type descriptor, data size/alignment constants: type, value procedures: formals (names/types), result type, block information (local decls.), frame size 7

Type expressions Type expressions are a textual representation for types: 1. basic types: boolean, char, integer, real, etc. 2. type names 3. constructed types (constructors applied to type expressions): (a) array(i,t) denotes array of elements type T, index type I e.g., array(1...10,integer) (b) T 1 T 2 denotes Cartesian product of type expressions T 1 and T 2 (c) records: fields have names e.g., record((a integer),(b real)) (d) pointer(t) denotes the type pointer to object of type T (e) D R denotes type of function mapping domain D to range R e.g., integer integer integer 8

!! Type descriptors Type descriptors are compile-time structures representing type expressions e.g., char char pointer(integer) pointer or pointer char char integer char integer 9

Type compatibility Type checking needs to determine type equivalence Two approaches: Name equivalence: each type name is a distinct type Structural equivalence: two types are equivalent iff. they have the same structure (after substituting type expressions for type names) s t iff. s and t are the same basic types array(s 1,s 2 ) array(t 1,t 2 ) iff. s 1 t 1 and s 2 t 2 s 1 s 2 t 1 t 2 iff. s 1 t 1 and s 2 t 2 pointer(s) pointer(t) iff. s t s 1 s 2 t 1 t 2 iff. s 1 t 1 and s 2 t 2 10

Type compatibility: example Consider: type link = cell; var next : link; last : link; p : cell; q, r : cell; Under name equivalence: next and last have the same type p, q and r have the same type p and next have different type Under structural equivalence all variables have the same type Ada/Pascal/Modula-2/Tiger are somewhat confusing: they treat distinct type definitions as distinct types, so p has different type from q and r 11

Type compatibility: Pascal-style name equivalence Build compile-time structure called a type graph: each constructor or basic type creates a node each name creates a leaf (associated with the type s descriptor) last p q r next = pointer pointer pointer link cell Type expressions are equivalent if they are represented by the same node in the graph 12

Type compatibility: recursive types Consider: type link = cell; cell = record info : integer; next : link; end; We may want to eliminate the names from the type graph Eliminating name link from type graph for record: cell =record info integer next pointer 13 cell

Type compatibility: recursive types Allowing cycles in the type graph eliminates cell: cell =record info integer next pointer 14

Java inheritance: field overloading Fields declared in a subclass can overload fields declared in superclasses Overloading is same name used in different contexts to refer to different things, such as different fields Consider: class A { int j; } class B extends A { int j; } A a = new A();// let s call this object X // X has one field, named j, declared in A a.j = 1; // assigns 1 to the field j of X declared in A a = new B(); // let s call this object Y // Y has two fields, both named j, // one declared in A, the other in B a.j = 2; // assigns 2 to the field j of Y declared in A B b = a; b.j = 3; // assigns 3 to the field j of Y declared in B 15

Java inheritance: method overriding Methods declared in subclasses can override methods declared in superclasses Overriding is same name used to name a different thing, regardless of context, such as methods in subclasses with the same name Consider: class A { int j; void set_j(int i) { this.j = i; } class B extends A { int j; void set_j(int i) { this.j = i; } A a = new A();// let s call this object X a.set_j(1); // assigns 1 to the field j of X declared in A // i.e., invokes A set_j method a = new B(); // let s call this object Y a.set_j(2); // assigns 2 to the field j of Y declared in B // i.e., invokes B set_j method B b = a; b.set_j(3); // assigns 3 to the field j of Y declared in B // i.e. invokes B set_j method 16

Java method overloading Java also supports method overloading, which has nothing to do with inheritance Consider: class A { int j; boolean b; void set(int i) { this.j = i; } void set(boolean b) { this.j = b; } } Don t confuse method overloading with method overriding 17

Why Types? 1. Low-level languages such as assembler are as expressive as Turing machines. 2. Can encode complex data structures (e.g., lists), recursion, and other interesting objects. 3. They impose no structure on how these objects are used. For example, registers hold both pointers and integers that can be used interchangeably. 4. It is useful to have a language with types for both improved implementation as well as expressivity. 18

Properties 1. Should be enforceable. 2. Should be checkable algorithmically. 3. Should be transparent: Why is a program not well-typed? 19

Types as sets 1. A variable can assume a range of values during execution. 2. An upper-bound of such a range is called a type: (a) A variable of type bool only holds Boolean values. (b) If x has type bool, then not(x) has a sensible meaning in any context in which it is used. 20

Errors and Soundness 1. Static Type Systems: (a) Want to prevent execution errors when the program is running. (b) The scope of execution errors prevented is determined by the expressivity of the type system. 2. Trapped Execution Errors: being well-typed does not mean a program cannot raise an error. (a) division by zero (b) infinite loops leading to a segmentation fault (c) dereferencing an invalid address 3. A program is safe if it does not cause untrapped errors. Scheme, ML, and Java are examples of safe languages; C is not. 21

Type Systems A multi-step process: 1. Define syntax (expressions, types, binding and scoping) 2. Static rules 3. Compare static type rules with dynamic execution 4. Relate static and dynamic semantics: Well-typed programs do not go wrong : (a) Progress: a well-typed term is not stuck: it is either a value or can be further evaluated. (b) Preservation: if a well-typed term reduces to another term, the resulting term is also well-typed. 22