CSE Compilers. Reminders/ Announcements. Lecture 15: Seman9c Analysis, Part III Michael Ringenburg Winter 2013

Similar documents
CSE P 501 Compilers. Static Semantics Hal Perkins Winter /22/ Hal Perkins & UW CSE I-1

CSE P 501 Compilers. Static Semantics Hal Perkins Spring UW CSE P 501 Spring 2018 I-1

Static Semantics. Winter /3/ Hal Perkins & UW CSE I-1

Compiler Passes. Semantic Analysis. Semantic Analysis/Checking. Symbol Tables. An Example

Principles of Programming Languages

Research opportuni/es with me

CSE 431S Type Checking. Washington University Spring 2013

CS101: Fundamentals of Computer Programming. Dr. Tejada www-bcf.usc.edu/~stejada Week 1 Basic Elements of C++

CSE 401/M501 Compilers

The role of semantic analysis in a compiler

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

Ways to implement a language

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

Objec+ves. Review. Basics of Java Syntax Java fundamentals. What are quali+es of good sooware? What is Java? How do you compile a Java program?

CSE Lecture 10: Modules and separate compila5on 18 Feb Nate Nystrom University of Texas at Arlington

Lecture 2: Memory in C

Programming Languages and Techniques (CIS120)

Lecture 13: Abstract Data Types / Stacks

Why OO programming? want but aren t. Ø What are its components?

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler so far

Instructor: Randy H. Katz hap://inst.eecs.berkeley.edu/~cs61c/fa13. Fall Lecture #7. Warehouse Scale Computer

Lecture 7: Type Systems and Symbol Tables. CS 540 George Mason University

CSE 401/M501 Compilers

NAMES, SCOPES AND BINDING A REVIEW OF THE CONCEPTS

Project Compiler. CS031 TA Help Session November 28, 2011

Anatomy of a Compiler. Overview of Semantic Analysis. The Compiler So Far. Why a Separate Semantic Analysis?

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

Topics Covered Thus Far CMSC 330: Organization of Programming Languages

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 17: Types and Type-Checking 25 Feb 08

Compila(on /15a Lecture 6. Seman(c Analysis Noam Rinetzky

CSCE 314 Programming Languages. Type System

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

CSE P 501 Compilers. Memory Management and Garbage Collec<on Hal Perkins Winter UW CSE P 501 Winter 2016 W-1

Data Types. Every program uses data, either explicitly or implicitly to arrive at a result.

Principles of Programming Languages

CSE P 501 Compilers. Implementing ASTs (in Java) Hal Perkins Winter /22/ Hal Perkins & UW CSE H-1

CMSC 330: Organization of Programming Languages

The Compiler So Far. Lexical analysis Detects inputs with illegal tokens. Overview of Semantic Analysis

Syntax Errors; Static Semantics

Introduction to Programming Using Java (98-388)

Topics Covered Thus Far. CMSC 330: Organization of Programming Languages. Language Features Covered Thus Far. Programming Languages Revisited

Programming Languages and Techniques (CIS120)

Agenda. Address vs. Value Consider memory to be a single huge array. Review. Pointer Syntax. Pointers 9/9/12

Encapsula)on, cont d. Polymorphism, Inheritance part 1. COMP 401, Spring 2015 Lecture 7 1/29/2015

CSE P 501 Compilers. Implementing ASTs (in Java) Hal Perkins Autumn /20/ Hal Perkins & UW CSE H-1

Outline. Pointers arithme.c and others Func.ons & pointers

Programming Languages Third Edition. Chapter 7 Basic Semantics

Data Types (cont.) Administrative Issues. Academic Dishonesty. How do we detect plagiarism? Strongly Typed Languages. Type System

Compilers. Type checking. Yannis Smaragdakis, U. Athens (original slides by Sam

Principles of Programming Languages

CSE 374 Programming Concepts & Tools. Hal Perkins Spring 2010

Advanced Topics in MNIT. Lecture 1 (27 Aug 2015) CADSL

Effec%ve So*ware. Lecture 9: JVM - Memory Analysis, Data Structures, Object Alloca=on. David Šišlák

SQLite with a Fine-Toothed Comb. John Regehr Trust-in-So1 / University of Utah

CSE 5317 Midterm Examination 4 March Solutions

Lecture 4: Build Systems, Tar, Character Strings

Garbage collec,on Parameter passing in Java. Sept 21, 2016 Sprenkle - CSCI Assignment 2 Review. public Assign2(int par) { onevar = par; }

CS 101: Computer Programming and Utilization

CS 61C: Great Ideas in Computer Architecture Strings and Func.ons. Anything can be represented as a number, i.e., data or instruc\ons

CSE P 501 Compilers. Java Implementation JVMs, JITs &c Hal Perkins Winter /11/ Hal Perkins & UW CSE V-1

Semantic analysis. Semantic analysis. What we ll cover. Type checking. Symbol tables. Symbol table entries

Semantic Analysis. How to Ensure Type-Safety. What Are Types? Static vs. Dynamic Typing. Type Checking. Last time: CS412/CS413

Structure of Programming Languages Lecture 10

BBM 101 Introduc/on to Programming I Fall 2014, Lecture 3. Aykut Erdem, Erkut Erdem, Fuat Akal

Semantic Analysis and Type Checking

Object Oriented Programming: In this course we began an introduction to programming from an object-oriented approach.

Lecture 10: Potpourri: Enum / struct / union Advanced Unix #include function pointers

8/25/17. Demo: Create application. CS2110, Recita.on 1. Arguments to method main, Packages, Wrapper Classes, Characters, Strings.

Array Basics: Outline

Topic 9: Type Checking

Topic 9: Type Checking

COSC 121: Computer Programming II. Dr. Bowen Hui University of Bri?sh Columbia Okanagan

Vectors and Pointers CS 16: Solving Problems with Computers I Lecture #13

Compilers CS S-05 Semantic Analysis

1 Terminology. 2 Environments and Static Scoping. P. N. Hilfinger. Fall Static Analysis: Scope and Types

Class Information ANNOUCEMENTS

COMP 181. Agenda. Midterm topics. Today: type checking. Purpose of types. Type errors. Type checking

Genericity. Philippe Collet. Master 1 IFI Interna3onal h9p://dep3nfo.unice.fr/twiki/bin/view/minfo/sofeng1314. P.

CS 61C: Great Ideas in Computer Architecture Func%ons and Numbers

CSE 374 Programming Concepts & Tools. Hal Perkins Fall 2015 Lecture 19 Introduction to C++

Principles of Programming Languages

Outline. Java Models for variables Types and type checking, type safety Interpretation vs. compilation. Reasoning about code. CSCI 2600 Spring

CSE 413 Languages & Implementation. Hal Perkins Winter 2019 Structs, Implementing Languages (credits: Dan Grossman, CSE 341)

Semantic Analysis. Lecture 9. February 7, 2018

Object Oriented Design (OOD): The Concept

Static Semantics. Lecture 15. (Notes by P. N. Hilfinger and R. Bodik) 2/29/08 Prof. Hilfinger, CS164 Lecture 15 1

Dynamic Languages. CSE 501 Spring 15. With materials adopted from John Mitchell

CS558 Programming Languages

Memory Management. CS449 Fall 2017

Types and Type Inference

CS 314 Principles of Programming Languages. Lecture 13

Generic programming POLYMORPHISM 10/25/13

CSE 401 Final Exam. December 16, 2010

G Programming Languages - Fall 2012

The plan. Racket will return! Lecture will not recount every single feature of Java. Final project will be wri)ng a Racket interpreter in Java.

Programming Languages and Techniques (CIS120)

Func+on applica+ons (calls, invoca+ons)

History of Java. Java was originally developed by Sun Microsystems star:ng in This language was ini:ally called Oak Renamed Java in 1995

Outline. Introduction Concepts and terminology The case for static typing. Implementing a static type system Basic typing relations Adding context

C++ Overview (1) COS320 Heejin Ahn

Transcription:

CSE 401 - Compilers Lecture 15: Seman9c Analysis, Part III Michael Ringenburg Winter 2013 Winter 2013 UW CSE 401 (Michael Ringenburg) Reminders/ Announcements Project Part 2 due Wednesday Midterm Friday Sec9ons this week will be devoted to midterm review Lined up a guest lecture on register alloca9on of the last day of class (3/15) from Preston Briggs Affiliate faculty here, who previously did some of the founda9onal work in register alloca9on he s men9oned in your textbook (Chapter 13 notes). Also looking at a guest lecture that week about real- world, non- compiler applica9ons of parsing. Winter 2013 UW CSE 401 (Michael Ringenburg) 2

Today s Agenda Symbol Tables And symbol tables for MiniJava Typechecking Winter 2013 UW CSE 401 (Michael Ringenburg) 3 Symbol Tables Primary place where informa9on collected during Seman9c analysis is stored Maps iden9fiers to proper9es such as type, size, loca9on, etc. Opera9ons Lookup(id) => informa9on Enter(id, informa9on) Open/close scopes Build & use during seman9cs pass Build first from declara9ons Then use to check seman9c rules Use (and add to) during later phases as well Winter 2013 UW CSE 401 (Michael Ringenburg) 4

Aside: Implemen9ng Symbol Tables Big topic in old compiler courses: implemen9ng a hashed symbol table These days: use the collec9on classes that are provided with the standard language libraries (Java, C#, C++, ML, Haskell, etc.) Then tune & op9mize if it really macers In produc9on compilers, it really macers Up to a point Java: Map (HashMap) will handle most cases List (ArrayList) for ordered lists (parameters, etc.) Winter 2013 UW CSE 401 (Michael Ringenburg) 5 Symbol Tables for MiniJava Consider this a general outline, based on recommenda9ons courtesy of Hal Perkins (whose given this project many 9mes). Feel free to modify to fit your needs A mix of global and local tables First Global Table Per Program Informa9on Single global table to map class names to per- class symbol tables Created in a pass over class defini9ons in AST Used in remaining parts of compiler to check class types and extract informa9on about them (e.g., fields and methods) Winter 2013 UW CSE 401 (Michael Ringenburg) 6

Symbol Tables for MiniJava Other Global Tables Per Class Informa9on 1 Symbol table for each class 1 entry per method/field declared in the class Contents: type informa9on, public/private/protected (if implemen9ng not required in basic MiniJava), parameter types (for methods), storage loca9ons (offset of fields in class - will be discussed later), etc. Note: Storage info probably not needed for project part 3, but will be in part 4. Make sure it s easy to extend your implementa9on. In full Java, need mul9ple symbol tables (or more complex symbol table) per class Ex.: Java allows the same iden9fier to name both a method and a field in a class mul9ple namespaces Winter 2013 UW CSE 401 (Michael Ringenburg) 7 Conceptual Diagram of Global Tables Class List Global Table class foo class bar This is conceptual real implementation will likely have a Map for classes (global class list table) or fields and methods (per class tables) Global Table: class foo Field x à type, etc. Field y à type, etc. Method a à param/return types, etc. Method b Global Table: class bar Field z à type, etc. Winter 2013 UW CSE 401 (Michael Ringenburg) 8

Symbol Tables for MiniJava Global (cont) All global tables persist throughout the compila9on And beyond in a real compiler (e.g., symbolic informa9on in Java.class or MSIL files, link- 9me op9miza9on informa9on in gcc) Cray compilers generate program libraries, which contain full symbols tables and full post- front- end IR for every func9on in every module.» Can use this for interprocedural op9miza9on across source files (modules). Tradi9onally, each module compiled and op9mized individually into a.o/.class file (containing object- or byte- code). Winter 2013 UW CSE 401 (Michael Ringenburg) 9 Symbol Tables for MiniJava 1 local symbol table for each method 1 entry for each local variable or parameter Contents: type informa9on, storage loca9ons (offset from stack - filled in/discussed later), etc. Needed only while compiling the method; in a single pass compiler, you could discard when done with the method But if type checking and code gen, etc. are done in separate passes, this table needs to persist un9l we re done with it Your project implementa9on will be mul9pass Winter 2013 UW CSE 401 (Michael Ringenburg) 10

Beyond MiniJava What we aren t dealing with: nested scopes Inner classes Nested scopes in methods reuse of iden9fiers in parallel or inner scopes; nested func9ons Conceptual idea: keep a stack of symbol tables (pointers to tables, really) Push a new symbol table when we enter an inner scope Look for iden9fier in inner scope; if not found look at the element above it in the stack, recursively. Pop symbol table when we exit scope (conceptually but can t really ) Winter 2013 UW CSE 401 (Michael Ringenburg) 11 Scopes (Conceptual) void foo() {! int a, b;!! while (a!= b) {! int x, y;!! }! }! Stack Winter 2013 UW CSE 401 (Michael Ringenburg) 12

Scopes (Conceptual) void foo() {! int a, b;!! while (a!= b) {! int x, y;!! }! }! Table1 (a,b) Stack Winter 2013 UW CSE 401 (Michael Ringenburg) 13 Scopes (Conceptual) void foo() {! int a, b;!! while (a!= b) {! int x, y;!! }! }! Table2 (x,y) Table1 (a,b) Stack Winter 2013 UW CSE 401 (Michael Ringenburg) 14

Scopes (Conceptual) void foo() {! int a, b;!! while (a!= b) {! int x, y;!! }! }! Table1 (a,b) Stack Winter 2013 UW CSE 401 (Michael Ringenburg) 15 Scopes (Conceptual) void foo() {! int a, b;!! while (a!= b) {! int x, y;!! }! }! Stack Winter 2013 UW CSE 401 (Michael Ringenburg) 16

Engineering Issues In mul9pass compilers, symbol table info needs to persist aner analysis of inner scopes for use on later passes So popping can t really delete the scope s table. Keep around with pointer to parent scope. Effec9vely creates an upside- down tree of scopes (nodes have parent pointers rather than children pointers). Statements have pointers to their innermost scope. May want to retain O(1) lookup Not O(depth of scope nes9ng) although some compilers just assume this will be small enough to not macer. Compilers that care may use hash tables with addi9onal informa9on to get the scope nes9ng right. Winter 2013 UW CSE 401 (Michael Ringenburg) 17 Error Recovery What to do when an undeclared iden9fier is encountered? Prefer to only complain once (Why?) Can forge a symbol table entry for it once you ve complained so it will be found in the future Assign the forged entry a type of unknown Unknown is the type of all malformed expressions and is compa9ble with all other types Allows you to only complain once! (How?) Winter 2013 UW CSE 401 (Michael Ringenburg) 18

Predefined Things Many languages have some predefined items (func9ons, classes, standard library, ) Include ini9aliza9on code or declara9ons in the compiler to manually create symbol table entries for these when the compiler starts up Rest of compiler generally doesn t need to know the difference between predeclared items and ones found in the program Possible to put standard prelude informa9on in a file or data resource and use that to ini9alize Tradeoffs? Winter 2013 UW CSE 401 (Michael Ringenburg) 19 Today s Agenda Symbol Tables And symbol tables for MiniJava Typechecking Winter 2013 UW CSE 401 (Michael Ringenburg) 20

Types Types play a key role in most programming languages. E.g., Run- 9me safety Compile- 9me error detec9on Improved expressiveness (method or operator overloading, for example) Provide informa9on to op9mizer Strongly typed languages what data might be used where Type qualifiers (e.g., const and restrict in C) Winter 2013 UW CSE 401 (Michael Ringenburg) 21 Type Checking Terminology Sta9c vs. dynamic typing sta9c: checking done prior to execu9on (e.g. compile- 9me) dynamic: checking during execu9on Strong vs. weak typing strong: guarantees no illegal opera9ons performed weak: can t make guarantees Caveats: Hybrids common static dynamic Inconsistent usage common strong Java, ML Scheme, Ruby untyped, typeless could mean dynamic or weak weak C PERL Winter 2013 UW CSE 401 (Michael Ringenburg) 22

Type Systems Base Types Fundamental, atomic types Typical examples: int, double, char, bool Compound/Constructed Types Built up from other types (recursively) via type constructors Constructors include arrays, records/structs/ classes, pointers, enumera9ons, func9ons, modules, Winter 2013 UW CSE 401 (Michael Ringenburg) 23 Constructed/Compound Types Array Type Dimensions: 2 Class Name: foo Field x Field y Winter 2013 UW CSE 401 (Michael Ringenburg) 24

Type Representa9on Typical compiler representa9ons create a shallow class hierarchy, for example: abstract class Type { } // or interface class ClassType extends Type { } class BaseType extends Type { } Should not need too many of these Winter 2013 UW CSE 401 (Michael Ringenburg) 25 Types vs ASTs Types are not AST nodes! AST nodes may have a type field, however AST = abstract representa9on of source program (including source program type info) Types = abstract representa9on of type seman9cs for type checking, inference, etc. Can include informa9on not explicitly represented in the source code, or may describe types in ways more convenient for processing Be sure you have a separate type class hierarchy in your compiler dis9nct from the AST Winter 2013 UW CSE 401 (Michael Ringenburg) 26

Base Types For each base type (int, boolean, others in other languages), create a single object to represent it Base types in symbol table entries and AST nodes are direct references to these objects Base type objects usually created at compiler startup Useful to create a type void object to tag func9ons that do not return a value Also useful to create a type unknown object for errors ( void and unknown types reduce the need for special case code in various places in the type checker no null type or return type fields)) Winter 2013 UW CSE 401 (Michael Ringenburg) 27 Compound Types Basic idea: use a appropriate type constructor object that refers to the component types Limited number of these correspond directly to type constructors in the language (record/struct/ class, array, func9on,) Winter 2013 UW CSE 401 (Michael Ringenburg) 28

Class Types Type for: class Id { fields and methods } class ClassType extends Type { Type baseclasstype; // ref to base class Map fields; // type info for fields Map methods; // type info for methods } Base class pointer, so we can check field references against base class if we don t find in this class. (Note: may not want to do this literally, depending on how class symbol tables are represented; i.e., class symbol tables might be useful or sufficient as the representa9on of the class type.) Winter 2013 UW CSE 401 (Michael Ringenburg) 29 Array Types For regular Java this is simple: only possibility is # of dimensions and element type class ArrayType extends Type { int ndims; Type elementtype; } More interes9ng in languages like Pascal (more complex array indexing) Winter 2013 UW CSE 401 (Michael Ringenburg) 30

Methods/Func9ons Type of a method is its result type plus an ordered list of parameter types class MethodType extends Type { Type resulttype; // type or void List parametertypes; } Winter 2013 UW CSE 401 (Michael Ringenburg) 31 Type Equivalance For base types this is simple If you have just a single instance of each base type (as recommend), then types are the same if and only if they are iden9cal Pointer/reference comparison in the type checker Normally there are well defined rules for coercions between arithme9c types Compiler inserts these automa9cally or when requested by programmer (casts) onen involves inser9ng cast/conversion nodes in AST Basic MiniJava doesn t need these only int s Winter 2013 UW CSE 401 (Michael Ringenburg) 32

Type Equivalence for Compound Types Two basic strategies Structural equivalence: two types are the same if they are the same kind of type and their component types are equivalent, recursively E.g., two struct types, each with exactly two int fields Name equivalence: two types are the same only if they have the same name. If their structures match, but have dis9nct names, they are not equal. I.e., two variables only have the same type if they are declared from the same class (even if two classes are structurally iden9cal). Different language design philosophies Same languages use a mixture, as well Winter 2013 UW CSE 401 (Michael Ringenburg) 33 Structural Equivalence Structural equivalence says two types are equal iff they have same structure Iden9cal base types clearly have the same structure if type constructors: same constructor recursively, equivalent arguments to constructor Ex: atomic types, array types, ML record types Implement with recursive implementa9on of equals, or by canonicaliza9on of types when types created then use pointer/reference equality Winter 2013 UW CSE 401 (Michael Ringenburg) 34

Name Equivalence Name equivalence says that two types are equal iff they came from the same textual occurrence of a type constructor Ex: class types, C struct types (struct tag name), datatypes in ML special case: type synonyms (e.g. typedef in C) do not define new types Implement with pointer/reference equality assuming appropriate representa9on of type info Winter 2013 UW CSE 401 (Michael Ringenburg) 35 Type Equivalence and Inheritance Suppose we have class Base { } class Extended extends Base { } A variable declared with type Base has a compile- <me type of Base During execu9on, that variable may refer to an object of class Base or any of its subclasses like Extended (or can be null) Some9mes called the run<me type Subclasses guaranteed to have all fields/methods of base class, so typechecking as base class suffices Winter 2013 UW CSE 401 (Michael Ringenburg) 36

Type Casts In most languages, one can explicitly cast an object of one type to another some9mes cast means a conversion (e.g., casts between numeric types) some9mes cast means a change of sta9c type without doing any computa9on (casts between pointer types or pointer and numeric types) With class types, may also mean upcast (free) or downcast (run9me check) Note: Casts not present in basic MiniJava Winter 2013 UW CSE 401 (Michael Ringenburg) 37 Type Conversions and Coercions In Java, we can explicitly convert an value of type double to one of type int Can represent as unary operator Typecheck, generate code normally In Java, can implicitly coerce an value of type int to one of type double Compiler must insert unary conversion operators, based on result of type checking Once again only ints in basic MiniJava Winter 2013 UW CSE 401 (Michael Ringenburg) 38

C and Java: type casts In C: safety/correctness of casts not checked Allows wri9ng low- level code that s type- unsafe Result is onen implementa9on dependent/undefined. Not portable, but some9mes useful. In Java: downcasts from superclass to subclass need run- 9me check to preserve type safety Otherwise, might use field (or call method) that is not present in superclass Sta9c typechecker allows the cast Code generator introduces run- 9me check (same code needed to handle instanceof ) Java s main form of dynamic type checking Winter 2013 UW CSE 401 (Michael Ringenburg) 39 Various No9ons of Equivalance There are usually several rela9ons on types that we need to deal with: is the same as is assignable to is same or a subclass of is conver9ble to Be sure to check for the right one(s) Winter 2013 UW CSE 401 (Michael Ringenburg) 40

Useful Compiler Func9ons Create a handful of methods to decide different kinds of type compa9bility: Types are iden9cal Type t1 is assignment compa9ble with t2 Parameter list is compa9ble with types of expressions in the call Usual modularity reasons: isolates these decisions in one place and hides the actual type representa9on from the rest of the compiler Probably belongs in the same package with the type representa9on classes (package for dealing with types) Winter 2013 UW CSE 401 (Michael Ringenburg) 41 Implemen9ng Type Checking for MiniJava Create mul9ple visitors for the AST First pass/passes: gather class informa9on Collect global type informa9on for classes Could do this in one pass, or might want to do one pass to collect class informa9on, then a second one to collect per- class informa9on about fields, methods you decide Next set of passes: go through method bodies to check types, other seman9c constraints Winter 2013 UW CSE 401 (Michael Ringenburg) 42

Disclaimer This discussion of seman9cs, type representa9on, etc. should give you a good idea of what needs to be done in your project, but you ll need to adapt the ideas to the project specifics. Project part 3 out later this week targe9ng Thursday (day aner part 2 is due). You ll also find good ideas in your compiler book(s). Winter 2013 UW CSE 401 (Michael Ringenburg) 43 Coming Acrac9ons Need to start thinking about transla9ng to object code (actually x86(- 64) assembly language, the default for this project) Next lectures x86 overview (as a target for simple compilers) Run9me representa9on of classes, objects, data, and method stack frames Assembly language code for higher- level language statements And there s a midterm on Friday! Winter 2013 UW CSE 401 (Michael Ringenburg) 44