Semantic Analysis. Lecture 9. February 7, 2018

Similar documents
Scoping and Type Checking

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler so far

The role of semantic analysis in a compiler

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

The Compiler So Far. Lexical analysis Detects inputs with illegal tokens. Overview of Semantic Analysis

Anatomy of a Compiler. Overview of Semantic Analysis. The Compiler So Far. Why a Separate Semantic Analysis?

Types. Type checking. Why Do We Need Type Systems? Types and Operations. What is a type? Consensus

Overview of Semantic Analysis. Lecture 9

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

Announcements. Working on requirements this week Work on design, implementation. Types. Lecture 17 CS 169. Outline. Java Types

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

CSE450. Translation of Programming Languages. Lecture 11: Semantic Analysis: Types & Type Checking

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

Introduction to Parsing. Lecture 5

Building Compilers with Phoenix

Operational Semantics. One-Slide Summary. Lecture Outline

MIDTERM EXAM (Solutions)

CSE 401 Midterm Exam Sample Solution 2/11/15

CMSC 330: Organization of Programming Languages

Introduction to Parsing. Lecture 5

Topics Covered Thus Far. CMSC 330: Organization of Programming Languages. Language Features Covered Thus Far. Programming Languages Revisited

Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery. Last modified: Mon Feb 23 10:05: CS164: Lecture #14 1

Operational Semantics of Cool

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

Semantic Analysis and Type Checking

Project Compiler. CS031 TA Help Session November 28, 2011

Static Semantics. Winter /3/ Hal Perkins & UW CSE I-1

Tail Calls. CMSC 330: Organization of Programming Languages. Tail Recursion. Tail Recursion (cont d) Names and Binding. Tail Recursion (cont d)

Context Analysis. Mooly Sagiv. html://

Syntax Errors; Static Semantics

CSCE 314 Programming Languages. Type System

CS558 Programming Languages

Lecture Outline. COOL operational semantics. Operational Semantics of Cool. Motivation. Lecture 13. Notation. The rules. Evaluation Rules So Far

LECTURE NOTES ON COMPILER DESIGN P a g e 2

Type Checking. Outline. General properties of type systems. Types in programming languages. Notation for type rules.

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

First Midterm Exam CS164, Fall 2007 Oct 2, 2007

Outline. General properties of type systems. Types in programming languages. Notation for type rules. Common type rules. Logical rules of inference

Introduction to Lexical Analysis

CMSC 330: Organization of Programming Languages. Operational Semantics

Compiler Theory. (Semantic Analysis and Run-Time Environments)

Midterm 2 Solutions Many acceptable answers; one was the following: (defparameter g1

COMP 181. Agenda. Midterm topics. Today: type checking. Purpose of types. Type errors. Type checking

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Informal Semantics of Data. semantic specification names (identifiers) attributes binding declarations scope rules visibility

Lexical Analysis. Chapter 2

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done

UNIVERSITY OF CALIFORNIA

Lexical Analysis. Lecture 3-4

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology

Semantic Analysis. Compiler Architecture

LECTURE 3. Compiler Phases

Lecture Outline. COOL operational semantics. Operational Semantics of Cool. Motivation. Notation. The rules. Evaluation Rules So Far.

Closures. Mooly Sagiv. Michael Clarkson, Cornell CS 3110 Data Structures and Functional Programming

Briefly describe the purpose of the lexical and syntax analysis phases in a compiler.

Programming in C. main. Level 2. Level 2 Level 2. Level 3 Level 3

Compilers. Type checking. Yannis Smaragdakis, U. Athens (original slides by Sam

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square)

Closures. Mooly Sagiv. Michael Clarkson, Cornell CS 3110 Data Structures and Functional Programming

Lexical Analysis. Lecture 2-4

CS164: Midterm I. Fall 2003

Time : 1 Hour Max Marks : 30

CS152 Programming Language Paradigms Prof. Tom Austin, Fall Syntax & Semantics, and Language Design Criteria

CSE P 501 Exam 11/17/05 Sample Solution

What is a compiler? Xiaokang Qiu Purdue University. August 21, 2017 ECE 573

Outline. Introduction Concepts and terminology The case for static typing. Implementing a static type system Basic typing relations Adding context

CSE 413 Languages & Implementation. Hal Perkins Winter 2019 Structs, Implementing Languages (credits: Dan Grossman, CSE 341)

G Programming Languages Spring 2010 Lecture 4. Robert Grimm, New York University

CS558 Programming Languages

COMPILER CONSTRUCTION Seminar 02 TDDB44

Outline. Java Models for variables Types and type checking, type safety Interpretation vs. compilation. Reasoning about code. CSCI 2600 Spring

Program Abstractions, Language Paradigms. CS152. Chris Pollett. Aug. 27, 2008.

The compilation process is driven by the syntactic structure of the program as discovered by the parser

ECE251 Midterm practice questions, Fall 2010

CMSC 330: Organization of Programming Languages. Formal Semantics of a Prog. Lang. Specifying Syntax, Semantics

JavaCC Parser. The Compilation Task. Automated? JavaCC Parser

Syntax-Directed Translation. Lecture 14

The Environment Model. Nate Foster Spring 2018

Topics Covered Thus Far CMSC 330: Organization of Programming Languages

CSE 401 Midterm Exam Sample Solution 11/4/11

Operational Semantics of Cool

Introduction to Parsing. Lecture 8

Crafting a Compiler with C (II) Compiler V. S. Interpreter

Formal Semantics. Chapter Twenty-Three Modern Programming Languages, 2nd ed. 1

Type Checking in COOL (II) Lecture 10

In Our Last Exciting Episode

Announcements. Written Assignment 2 due today at 5:00PM. Programming Project 2 due Friday at 11:59PM. Please contact us with questions!

The Environment Model

CMSC 350: COMPILER DESIGN

More Static Semantics. Static. One-Slide Summary. Lecture Outline. Typing Rules. Dispatch Rules SELF_TYPE

Lecture 7: Type Systems and Symbol Tables. CS 540 George Mason University

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

COS 320. Compiling Techniques

Scoping rules match identifier uses with identifier definitions. A type is a set of values coupled with a set of operations on those values.

G Programming Languages - Fall 2012

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing

Static Semantics. Lecture 15. (Notes by P. N. Hilfinger and R. Bodik) 2/29/08 Prof. Hilfinger, CS164 Lecture 15 1

COMPILER DESIGN. For COMPUTER SCIENCE

Transcription:

Semantic Analysis Lecture 9 February 7, 2018

Midterm 1 Compiler Stages 12 / 14 COOL Programming 10 / 12 Regular Languages 26 / 30 Context-free Languages 17 / 21 Parsing 20 / 23 Extra Credit 4 / 6 Average 92 Compiler Construction 2/48

Formal Languages Wrap-up You made it! Takeaways from lexical analysis and parsing Regular languages define tokens Context-free languages define parse trees Classes of languages require different levels of expressive power Compiler Construction 3/48

Compilers Reminder 1. Lexical Analysis Source code Tokens 2. Syntactic Analysis Tokens Abstract Syntax Tree 3. Semantic Analysis (later today) 4. Code Generation 5. Optimization Compiler Construction 4/48

Compilers Reminder Remember, we want to support developers effort to write software They give our compiler source code Our job is to turn it into a program Automatic conversion requires various levels of analysis On the front end, we want to extract structure from the string of source code Compiler Construction 5/48

Parsing to Extract Structure Consider two input strings: x = 1; y = 1; And what about: z = 42; y = 100; = Is there a structural difference between these strings? x 1 ; Compiler Construction 6/48

Parsing to Extract Structure Consider two input strings: x = 1; y = 1; And what about: z = 42; y = 100; = Is there a structural difference between these strings? y 1 ; Compiler Construction 6/48

Parsing to Extract Structure Consider two input strings: x = 1; y = 1; And what about: z = 42; y = 100; = Is there a structural difference between these strings? z 42 ; Compiler Construction 6/48

Parsing to Extract Structure Consider two input strings: x = 1; y = 1; And what about: z = 42; y = 100; = Is there a structural difference between these strings? y 100 ; Assign x = 1; y = 1; z = 42; y = 100;... Compiler Construction 6/48

Parsing to Extract Structure Consider two input strings: x = 1; y = 1; And what about: z = 42; y = 100; = Is there a structural difference between these strings? y 100 ; Assign x = 1; y = 1; z = 42; y = 100;... Compiler Construction 6/48

Parsing to Extract Structure Consider two input strings: x = 1; y = 1; And what about: z = 42; y = 100; = Assign ID = VALUE ; Is there a structural difference between these strings? ID VALUE ; Compiler Construction 6/48

Parsing to Extract Structure We want a concise grammar Don t want to depend on specific values! S ID = VAL ;, not S x = 42 ; Well, we need an earlier lexical analysis step Bin lexemes into tokens x, y, z all the same thing: identifiers 1, 42, 100 all the same thing: integer constants Compiler Construction 7/48

Wrapping Up Formal Languages Lexical Analysis Regular Language = DFA = NFA = Regular Expression Syntactic Analysis / Parsing Context-free Language = Context-free Grammar = Nondeterministic Pushdown Automaton = Parsing Automaton Compiler Construction 8/48

Formal Languages Highlights Regular Languages are easier to recognize than CFLs! Need a stack with CFLs As compiler designers, we restrict our choices of CFL LL and LR are less expressive, but easier to parse Parsing DFA is kind of a lie! Technically, CFLs are recognized with NPDAs This is not a formal languages course Compiler Construction 9/48

Parse Tree vs. Abstract Syntax Tree Almost the same Input: 6 + ( ( 5 ) ) Parse Tree: EXP EXP EXP EXP int EXP 6 + int ( ( 5 ) ) Compiler Construction 10/48

Parse Tree vs. Abstract Syntax Tree Almost the same Input: 6 + ( ( 5 ) ) Abstract Syntax Tree: ADD_EXP int:6 int:5 Compiler Construction 10/48

Parsing Highlights You should understand that LL and LR are different restricted subsets of CFLs Trade language expressiveness for parsing ease You should know top-down and bottom-up parsing Real parser generators build bottom-up parsers We talked about LL(1), recursive descent, and LR(1) parsing There are many other parsing algorithms: LALR (lookahead LR) SLR (simple) GLR (generalized) IELR (inadequacy elimination) Earley Compiler Construction 11/48

Compiler Construction 12/48

The Front End Lexical Analysis Detects inputs with illegal tokens Syntactic Analysis Detects inputs with ill-formed parse trees Semantic Analysis Last front end phase Detects more invalid inputs Compiler Construction 12/48

Intro to Type Checking We have just parsed our program It followed the rules of a context-free grammar We know it is a structurally valid program Compiler Construction 13/48

Intro to Type Checking We have just parsed our program It followed the rules of a context-free grammar We know it is a structurally valid program Ultimately, programming languages are not context-free Need a rigorous analysis to tell us about semantic validity of the program Basically, a laundry list of tasks Compiler Construction 13/48

Type Checking Summary Scoping rules match identifier uses with identifier definitions. A type is a set of values coupled with a set of operations on those values. A type system specifies which operations are valid for which types. Type checking can be done statically (at compile time) or dynamically (at run time). Compiler Construction 14/48

What s Wrong? Example 1: 1 l e t y : I n t in 2 x + 3 Example 2: 1 l e t y : S t r i n g < " abc " in 2 y + 3 Compiler Construction 15/48

Semantic Analysis: Why? Parsing cannot catch some errors Some language constructs are not context-free. All used variables must have been declared (scoping) All methods must be invoked with arguments of the proper type (i.e. typing) Compiler Construction 16/48

What does semantic analysis do? Lots of things! In Cool: All identifiers are declared Static types Inheritance relationships Classes defined only once Methods in a class defined only once Reserved identifiers are not misused several others... The requirements are language-dependent. Which of these are checked by Python? C++? Compiler Construction 17/48

Scope Scoping rules match identifier uses with identifier declarations Critical analysis in most languages (including Cool) Compiler Construction 18/48

Scope The scope of an identifier is the portion of a program in which that identifier is accessible The same identifier may refer to different things in different parts of the program Different scopes for the same name do not overlap An identifier may have restricted scope Compiler Construction 19/48

Static vs. Dynamic Scope Most languages have static scope Scope depends only on the program text, not the run-time behavior Cool is statically scoped A few languages are dynamically scoped L A TEX, SNOBOL Compiler Construction 20/48

Static vs. Dynamic Scope 1 int x; 2 3 int main () { 4 x = 2; 5 f(); 6 g(); 7 } 8 void f () { 9 int x = 3; 10 h(); 11 } 12 void g () { 13 int x = 4; 14 h(); 15 } 16 void h() { 17 printf ("%d\n", x); 18 } Compiler Construction 21/48

Static vs. Dynamic Scope 1 int x; 2 3 int main () { 4 x = 2; 5 f(); 6 g(); 7 } 8 void f () { 9 int x = 3; 10 h(); 11 } 12 void g () { 13 int x = 4; 14 h(); 15 } 16 void h() { 17 printf ("%d\n", x); 18 } Static scoping output: 2 2 Dynamic scoping output: 3 4 Compiler Construction 21/48

Static Scoping Example 1 let x : Int <- 0 in 2 { 3 x; 4 { let x : Int <- 1 in 5 x ; 6 }; 7 x; 8 } Compiler Construction 22/48

Static Scoping Example 9 let x : Int <- 0 in 10 { 11 x; 12 { let x : Int <- 1 in 13 x ; 14 }; 15 x; 16 } Uses of x refer to closest enclosing definition. Compiler Construction 22/48

Scope in Cool Cool identifier bindings are introduced by Class declarations (class names) Method definitions (method names) Let expressions (object id s) Formal parameters (object id s) Attribute definitions in a class (object id s) Case expressions (object id s) Compiler Construction 23/48

Implementing the Most-Closely Nested Rule Much of semantic analysis can be expressed as a recursive descent of an AST! Process an AST node n Process the children of n Finish processing the node n Compiler Construction 24/48

Implementing... (2) Example: the scope of let bindings is one subtree 1 l e t x : I n t < 0 in expr x can be used in subtree expr Compiler Construction 24/48

Symbol Tables Recall: let x : Int <- 0 in expr Idea: Before processing expr, add definition of x to current definitions (override any previous definition) After processing expr, remove definition of x and restore old definition of x (if any) A symbol table is a data structure that tracks the current bindings of identifiers Part of PA4 How might we implement this? Compiler Construction 25/48

Exceptions to Most-closely Nested Not all kinds of identifiers follow the most-closely nested rule Class definitions in Cool are globally visible throughout all of the program Implication: class names can be used before it is defined Compiler Construction 26/48

Example: Use before define 1 class Foo { 2... let y : Test in... 3 }; 4 class Test { 5... 6 }; Compiler Construction 27/48

More Scope in Cool Attribute names are global within the class in which they are defined 1 class Foo { 2 f() : Int { tm }; 3 tm : Int <- 0; 4 }; Compiler Construction 28/48

More Scope in Cool Method and attribute names have complex rules A method need not be defined in the class in which it is used, but in some parent class that s inheritance Methods may also be redefined (overridden) Which languages allow this? Cool does Compiler Construction 29/48

Class Definitions Class names can be used before being defined We can t check this property in one pass :( Solution Pass 1: Collect all the class names Pass 2: Do the checking? Pass 4: Profit! Semantic analysis requires multiple passes Compiler Construction 30/48

Types What is a Type? The notion varies from language to language Consensus A set of values A set of operations on those values Classes are one instantiation of the modern notion of type Compiler Construction 31/48

Why do we need types? Consider this assembly: addi, $r1, $r2, $r3 What are the types of r1, r2, and r3? Compiler Construction 32/48

Types and Operations Cretain operations are legal or valid for values of each type It doesn t make sense to add a function pointer and an integer in C It does make sense to add two integers Both are the same in assembly! Compiler Construction 33/48

Type Systems A language s type system specifies which operations are valid for which types The goal of type checking is to ensure that operations are used with the correct types Enforces intended interpretation of values, because nothing else will! Our last, best hope... for victorious execution Type systems provide a concise formalization of the semantic checking rules Compiler Construction 34/48

What Can Types Do? Can detect certain kinds of errors Memory errors: Reading from an invalid pointer, etc. Violation of abstraction boundaries class FileSystem { open ( x : String ) : File {... }; }; The f method can t see data inside fdesc! class Client { f( fs : FileSystem ) { File fdesc <- fs. open (" foo ")... } }; Compiler Construction 35/48

Type Checking Overview Three kinds of languages: Statically typed: All or almost all checking of types is done as part of compilation (C, Java, Cool) Dynamically typed: Almost all checking of types is done during execution (Python, PHP, Ruby) Untyped: No type checking (machine code) Compiler Construction 36/48

The Type Wars Competing views on static vs. dynamic typing Static typing proponents say: Static checking catches many programming errors at compile time Avoids overhead of runtime type checks Dynamic typing proponents say: Static type systems are restrictive Rapid prototyping is easier in a dynamic type system Compiler Construction 37/48

The Type Wars (2) In practice, most code is written in statically-typed languages with an escape mechanism. Unsafe casts in C, native methods in Java,... Dynamic typing (a.k.a. duck typing) is big in the scripting world Compiler Construction 38/48

Cool Types In Cool, types are Class names SELF_TYPE Technically, there are no native types Everything is boxed as an object Java Integer int The use declares types for all identifiers The compiler infers types for expressions every expression Compiler Construction 39/48

Type checking and Type Inference Type Checking is the process of verifying fully-typed programs Type Inference is the process of filling in missing type information Sometimes used interchangably Compiler Construction 40/48

Rules of Inference You have survived formal languages in specifying parts ot he compiler Regular expressions (lexer) Context-free grammars (parser) We use logical rules of inference as a formalism for type inference Compiler Construction 41/48

Rules of Inference? Inference rules have the form If Hypothesis is true, then Conclusion is true Type checking computes via reasoning: If E 1 and E 2 have certain types, then E 3 has a certain type Rules of inference are a compact notation for If-then statements Compiler Construction 42/48

From English to Inference Rules Start with a simplified system and gradually add features... Building blocks Symbol is and Symbol is if-then x:t is x has type T Compiler Construction 43/48

English to Inference Rules (2) If e 1 has type Int and e 2 has type Int, then e 1 + e 2 has type Int (e 1 has type Int e 2 has type Int) e 1 + e 2 has type Int (e 1 : Int e 2 : Int) e 1 + e 2 : Int This is an inference rule! e 1 : Int e 2 : Int are two hypotheses, e 1 + e 2 : Int is a conclusion! Compiler Construction 44/48

Notation for Inference Rules By tradition, inference rules are written Hypothesis 1... Hypothesis n Conclusion Cool type rules have hypotheses and conclusions of the form: e : T means we can prove that... Compiler Construction 45/48

Examples (i is an Integer) i : Int Int e 1 : Int e 2 : Int e 1 + e 2 : Int Add (the result of adding to Ints is an Int) Compiler Construction 46/48

Examples (2) e 1 : Int e 2 : Int e 1 + e 2 : Int Add These rules give templates describing how to type integers and + expressions... We can fill in the templates, producing a complete typing for any expression! false : Int true : Int true + false : Int Compiler Construction 47/48

Example: 1+2 1 : Int 2 : Int 1 + 2 : Int Compiler Construction 48/48