CSE P 501 Compilers. Implementing ASTs (in Java). Hal Perkins, Autumn 2009


CSE P 501 Compilers
Implementing ASTs (in Java)
Hal Perkins, Autumn 2009
10/20/2009 2002-09 Hal Perkins & UW CSE H-1

Agenda
- Representing ASTs as Java objects
- Parser actions
- Operations on ASTs
- Modularity and encapsulation
- Visitor pattern
This is a general sketch of the ideas; more details and sample code are online for MiniJava.

Review: ASTs
- An Abstract Syntax Tree captures the essential structure of the program, without the extra concrete grammar details needed to guide the parser
- Example:
    while ( n > 0 ) { n = n - 1; }
- AST: [tree diagram not reproduced in this transcript]

Representation in Java
- Basic idea is simple: use small classes as records (or structs) to represent nodes in the AST
    - Simple data structures, not too smart
- But also use a bit of inheritance so we can treat related nodes polymorphically
- Following slides sketch the ideas; not necessarily what you'll use in your project

AST Nodes - Sketch

    // Base class of AST node hierarchy
    public abstract class ASTNode {
        // constructors (for convenience)
        // operations
        // string representation
        public abstract String toString();
        // visitor methods, etc.
    }

Some Statement Nodes

    // Base class for all statements
    public abstract class StmtNode extends ASTNode { ... }

    // while (exp) stmt
    public class WhileNode extends StmtNode {
        public ExpNode exp;
        public StmtNode stmt;
        public WhileNode(ExpNode exp, StmtNode stmt) {
            this.exp = exp; this.stmt = stmt;
        }
        public String toString() {
            return "While(" + exp + ") " + stmt;
        }
    }

(Note on toString: most of the time we'll want to print the tree in a separate traversal, so this is mostly useful for limited debugging)

More Statement Nodes

    // if (exp) stmt [else stmt]
    public class IfNode extends StmtNode {
        public ExpNode exp;
        public StmtNode thenStmt, elseStmt;
        public IfNode(ExpNode exp, StmtNode thenStmt, StmtNode elseStmt) {
            this.exp = exp; this.thenStmt = thenStmt; this.elseStmt = elseStmt;
        }
        // if without else: the else branch is null
        public IfNode(ExpNode exp, StmtNode thenStmt) {
            this(exp, thenStmt, null);
        }
        public String toString() { ... }
    }

Expressions

    // Base class for all expressions
    public abstract class ExpNode extends ASTNode { ... }

    // exp1 op exp2
    public class BinExp extends ExpNode {
        public ExpNode exp1, exp2;  // operands
        public int op;              // operator (lexical token)
        // op is declared int to match the field it initializes
        public BinExp(int op, ExpNode exp1, ExpNode exp2) {
            this.op = op; this.exp1 = exp1; this.exp2 = exp2;
        }
        public String toString() { ... }
    }

More Expressions

    import java.util.List;

    // Method call: id(arguments)
    public class MethodExp extends ExpNode {
        public ExpNode id;          // method
        public List<ExpNode> args;  // list of argument expressions
        // the constructor must carry the class name, MethodExp
        public MethodExp(ExpNode id, List<ExpNode> args) {
            this.id = id; this.args = args;
        }
        public String toString() { ... }
    }

&c
- These examples are meant to get across the ideas, not necessarily to be used literally
    - E.g., you might find it much better to have a specific AST node for "argument list" that encapsulates the List of arguments
- You'll also need nodes for class and method declarations, parameter lists, and so forth
- Starter code on the web for MiniJava

Position Information in Nodes
- To produce useful error messages, it's helpful to record the source program location corresponding to a node in that node
    - Most scanner/parser generators have a hook for this, usually storing source position information in tokens
- Included in the MiniJava starter code we distributed; useful to take advantage of it in your code
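One way to picture this is a base node class that carries a line and column, sketched below. All class, field, and method names here (PosNode, PosIdent, errorAt) are hypothetical and chosen for illustration; they are not taken from the MiniJava starter code, which may store positions differently.

```java
class PositionDemo {
    public static void main(String[] args) {
        PosIdent n = new PosIdent("count", 12, 5);
        // prints "line 12, column 5: undeclared identifier count"
        System.out.println(n.errorAt("undeclared identifier " + n.name));
    }
}

// Hypothetical base class that remembers where in the source the node came from.
abstract class PosNode {
    final int line;    // source line, assumed to be copied from a token by the parser
    final int column;  // source column, likewise copied from a token

    PosNode(int line, int column) {
        this.line = line;
        this.column = column;
    }

    // Format a message prefixed with this node's source position.
    String errorAt(String message) {
        return "line " + line + ", column " + column + ": " + message;
    }
}

// An identifier node; the parser would pass along the position of its token.
class PosIdent extends PosNode {
    final String name;

    PosIdent(String name, int line, int column) {
        super(line, column);
        this.name = name;
    }
}
```

Later passes such as type checking can then report errors through the node itself, without needing access to the scanner.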

AST Generation
- Idea: each time the parser recognizes a complete production, it produces as its result an AST node (with links to the subtrees that are the components of the production in its instance variables)
- When we finish parsing, the result of the goal symbol is the complete AST for the program

Example: Recursive-Descent AST Generation

    // parse while (exp) stmt
    WhileNode whileStmt() {
        // skip "while" "("
        getNextToken();
        getNextToken();
        // parse exp
        ExpNode condition = exp();
        // skip ")"
        getNextToken();
        // parse stmt
        StmtNode body = stmt();
        // return AST node for while
        return new WhileNode(condition, body);
    }
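To see the same idea run end to end, here is a minimal self-contained sketch of a recursive-descent parser that returns AST nodes. The grammar, the whitespace-separated token format, and every name in it (MiniParser, PStmt, PWhile, PIdStmt) are assumptions made for this illustration; the condition is simplified to a single identifier.

```java
import java.util.Arrays;
import java.util.Iterator;

class MiniParserDemo {
    public static void main(String[] args) {
        MiniParser p = new MiniParser("while ( n ) while ( m ) x ;");
        System.out.println(p.stmt());  // prints "While(n) While(m) x;"
    }
}

// Tiny AST: a statement is either a while loop or a bare identifier statement.
abstract class PStmt { }

class PWhile extends PStmt {
    final String cond;
    final PStmt body;
    PWhile(String cond, PStmt body) { this.cond = cond; this.body = body; }
    @Override public String toString() { return "While(" + cond + ") " + body; }
}

class PIdStmt extends PStmt {
    final String id;
    PIdStmt(String id) { this.id = id; }
    @Override public String toString() { return id + ";"; }
}

// Grammar assumed for this sketch:
//   stmt ::= "while" "(" ID ")" stmt  |  ID ";"
class MiniParser {
    private final Iterator<String> tokens;
    private String tok;  // current token, or null at end of input

    MiniParser(String source) {
        tokens = Arrays.asList(source.trim().split("\\s+")).iterator();
        advance();
    }

    private void advance() { tok = tokens.hasNext() ? tokens.next() : null; }

    private void expect(String s) {
        if (!s.equals(tok)) {
            throw new IllegalStateException("expected " + s + " but found " + tok);
        }
        advance();
    }

    // Each parsing method returns the AST node for the construct it recognized.
    PStmt stmt() {
        if ("while".equals(tok)) {
            advance();            // skip "while"
            expect("(");
            String cond = tok;    // the condition is a single identifier here
            advance();
            expect(")");
            PStmt body = stmt();  // the recursive call builds the subtree
            return new PWhile(cond, body);
        }
        String id = tok;
        advance();
        expect(";");
        return new PIdStmt(id);
    }
}
```

The value returned by the outermost call to stmt() is the root of the tree, just as the result of the goal symbol is the complete AST.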

AST Generation in YACC/CUP
- A result type can be specified for each item in the grammar specification
- Each parser rule can be annotated with a semantic action, which is just a piece of Java code that returns a value of the result type
- The semantic action is executed when the rule is reduced

YACC/CUP Parser Specification

Specification:

    non terminal StmtNode stmt, whilestmt;
    non terminal ExpNode exp;

    stmt ::= WHILE LPAREN exp:e RPAREN stmt:s
             {: RESULT = new WhileNode(e,s); :}
           ;

See the starter code for a version with line numbers.

ANTLR/JavaCC/others
- Integrated tools like these provide tools to generate syntax trees automatically
    - Advantage: saves work, don't need to define AST classes and write semantic actions
    - Disadvantage: generated trees might not have the right level of abstraction for what you want to do
- For our project, do-it-yourself with CUP
    - The (revised) starter code contains the AST classes from the minijava web site

Operations on ASTs
Once we have the AST, we may want to:
- Print a readable dump of the tree (pretty printing)
- Do static semantic analysis
    - Type checking
    - Verify that things are declared and initialized properly
    - Etc. etc. etc. etc.
- Perform optimizing transformations on the tree
- Generate code from the tree, or
- Generate another IR from the tree for further processing

Where do the Operations Go?
- Pure "object-oriented" style
    - Really smart AST nodes
    - Each node knows how to perform every operation on itself

    // sketch only: return types and parameters omitted
    public class WhileNode extends StmtNode {
        public WhileNode();
        public typeCheck();
        public StrengthReductionOptimize();
        public generateCode();
        public prettyPrint();
    }

Critique
- This is nicely encapsulated: all details about a WhileNode are hidden in that class
- But it is poor modularity
- What happens if we want to add a new Optimize operation?
    - Have to open up every node class
- Furthermore, it means that the details of any particular operation (optimization, type checking) are scattered across the node classes

Modularity Issues
- Smart nodes make sense if the set of operations is relatively fixed, but we expect to need flexibility to add new kinds of nodes
- Example: graphics system
    - Operations: draw, move, iconify, highlight
    - Objects: textbox, scrollbar, canvas, menu, dialog box, plus new objects defined as the system evolves

Modularity in a Compiler
- Abstract syntax does not change frequently over time
    - Kinds of nodes are relatively fixed
- As a compiler evolves, it is common to modify or add operations on the AST nodes
    - Want to modularize each operation (type check, optimize, code gen) so its components are together
    - Want to avoid having to change node classes when we modify or add an operation on the tree

Two Views of Modularity

Compiler (rows: AST node kinds; columns: operations)

            Type check   Optimize   Generate x86   Flatten   Print
    IDENT       X           X            X            X        X
    exp         X           X            X            X        X
    while       X           X            X            X        X
    if          X           X            X            X        X
    Binop       X           X            X            X        X

Graphics system (rows: object kinds; columns: operations)

            draw   move   iconify   highlight   transmogrify
    circle   X      X        X          X            X
    text     X      X        X          X            X
    canvas   X      X        X          X            X
    scroll   X      X        X          X            X
    dialog   X      X        X          X            X

Visitor Pattern
- Idea: Package each operation in a separate class
    - One operation method for each AST node kind
- Create one instance of this visitor class
    - Sometimes called a "function object"
- Include a generic "accept visitor" method in every node class
- To perform the operation, pass the visitor object around the AST during a traversal
    - This object contains separate methods to process each AST node type

Avoiding instanceof
- Next issue: we'd like to avoid huge if-elseif nests to check the node type in the visitor

    void checkTypes(ASTNode p) {
        if (p instanceof WhileNode) { ... }
        else if (p instanceof IfNode) { ... }
        else if (p instanceof BinExp) { ... }
        ...
    }

- Solution: Include an overloaded "visit" method for each AST node type (in the visitor) and get the AST node to call back to the correct operation for that node(!)
    - "Double dispatch"
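A minimal runnable sketch of double dispatch follows. The first dispatch is the virtual call to accept, which selects the node's class; the second is the call to visit, where the overload is chosen from the static type of `this` inside that class. The names DNode, DWhile, DIf, DVisitor, and NameVisitor are made up for this example.

```java
class DispatchDemo {
    public static void main(String[] args) {
        DNode[] nodes = { new DWhile(), new DIf() };
        NameVisitor namer = new NameVisitor();
        for (DNode n : nodes) {
            n.accept(namer);  // no instanceof test at the call site
        }
        System.out.println(namer.seen.toString().trim());  // prints "while if"
    }
}

// One overloaded visit method per node kind.
interface DVisitor {
    void visit(DWhile n);
    void visit(DIf n);
}

abstract class DNode {
    abstract void accept(DVisitor v);
}

class DWhile extends DNode {
    // Inside DWhile, "this" has static type DWhile, so visit(DWhile) is selected.
    @Override void accept(DVisitor v) { v.visit(this); }
}

class DIf extends DNode {
    // The same source text, but here "this" is a DIf, so visit(DIf) is selected.
    @Override void accept(DVisitor v) { v.visit(this); }
}

// A trivial operation: record which kinds of node were visited, in order.
class NameVisitor implements DVisitor {
    final StringBuilder seen = new StringBuilder();
    @Override public void visit(DWhile n) { seen.append("while "); }
    @Override public void visit(DIf n)    { seen.append("if "); }
}
```

Note that accept has to be repeated in every concrete node class: moving it up into DNode would not compile, because there is no visit overload for the abstract base type.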

One More Issue
- We want to be able to add new operations easily, so the nodes shouldn't know anything specific about the actual visitor class(es)
- Solution: an abstract Visitor interface
    - AST nodes include an "accept visitor" method for the interface
    - Specific operations (type check, code gen) are implementations of this interface

Visitor Interface

    interface Visitor {
        // overload visit for each AST node type
        public void visit(WhileNode s);
        public void visit(IfNode s);
        public void visit(BinExp e);
        ...
    }

Aside: The result type can be whatever is convenient, doesn't have to be void
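As one illustration of the aside about result types, the sketch below makes the visitor generic in its result, so an evaluator can return Integer from each visit. Using a type parameter is a choice made for this example rather than something the slides prescribe, and the names GVisitor, GExp, GNum, GBin, and EvalVisitor are hypothetical.

```java
class EvalDemo {
    public static void main(String[] args) {
        // (2 + 3) * 4
        GExp e = new GBin('*', new GBin('+', new GNum(2), new GNum(3)), new GNum(4));
        System.out.println(e.accept(new EvalVisitor()));  // prints 20
    }
}

// The visitor is parameterized by the type R that each visit returns.
interface GVisitor<R> {
    R visit(GNum n);
    R visit(GBin b);
}

abstract class GExp {
    abstract <R> R accept(GVisitor<R> v);
}

class GNum extends GExp {
    final int value;
    GNum(int value) { this.value = value; }
    @Override <R> R accept(GVisitor<R> v) { return v.visit(this); }
}

class GBin extends GExp {
    final char op;
    final GExp left, right;
    GBin(char op, GExp left, GExp right) {
        this.op = op; this.left = left; this.right = right;
    }
    @Override <R> R accept(GVisitor<R> v) { return v.visit(this); }
}

// An operation that computes a value: R is Integer.
class EvalVisitor implements GVisitor<Integer> {
    @Override public Integer visit(GNum n) { return n.value; }

    @Override public Integer visit(GBin b) {
        int l = b.left.accept(this);
        int r = b.right.accept(this);
        switch (b.op) {
            case '+': return l + r;
            case '*': return l * r;
            default: throw new IllegalArgumentException("unknown operator " + b.op);
        }
    }
}
```

A type checker could use the same interface with R bound to whatever class represents types, without any change to the node classes.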

Specific class TypeCheckVisitor

    // Perform type checks on the AST
    public class TypeCheckVisitor implements Visitor {
        // override operations for each node type
        public void visit(BinExp e) {
            // visit subexpressions; pass this visitor object
            e.exp1.accept(this); e.exp2.accept(this);
            // do additional processing on e before or after
        }
        public void visit(WhileNode s) { ... }
        public void visit(IfNode s) { ... }
    }

Add Visitor Method to AST Nodes
- Add a new method to class ASTNode (base class or interface describing all AST nodes)

    public abstract class ASTNode {
        ...
        // accept a visit from a Visitor object v
        public abstract void accept(Visitor v);
        ...
    }

Override Accept Method in Each Specific AST Node Class
- Example

    public class WhileNode extends StmtNode {
        ...
        // accept a visit from a Visitor object v
        public void accept(Visitor v) {
            v.visit(this);  // dynamic dispatch on "this" (WhileNode)
        }
        ...
    }

- Key points
    - Visitor object passed as a parameter to WhileNode
    - WhileNode calls visit, which dispatches to visit(WhileNode) automatically, i.e., the correct method for this kind of node

Encapsulation
- A visitor object often needs to be able to access state in the AST nodes
    - May need to expose more state than we might otherwise
    - Overall a good tradeoff: better modularity (plus, the nodes are relatively simple data objects anyway, not hiding much of anything)

Composite Objects
- If the node contains references to subnodes, we often visit them first (i.e., pass the visitor along in a depth-first traversal of the AST)

    public class WhileNode extends StmtNode {
        ...
        // accept a visit from Visitor object v
        public void accept(Visitor v) {
            this.exp.accept(v);
            this.stmt.accept(v);
            v.visit(this);
        }
        ...
    }

- Other traversals can be added if needed

Visitor Actions
- A visitor function has a reference to the node it is visiting (the parameter)
    - It can access subtrees via that node
- It's also possible for the visitor object to contain local state (data), used to accumulate information during the traversal
    - Effectively "global data" shared by visit methods

    public class TypeCheckVisitor implements Visitor {
        public void visit(WhileNode s) { ... }
        public void visit(IfNode s) { ... }
        private <local state>;
    }
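The following sketch shows visitor-local state in a runnable form: a visitor that collects every identifier it sees and counts while loops. The children are traversed by the nodes' accept methods, children first, in the style of the Composite Objects slide. All names (SNode, SVar, SAssign, SWhile, SVisitor, NameCollector) are invented for this example.

```java
import java.util.Set;
import java.util.TreeSet;

class StateDemo {
    public static void main(String[] args) {
        // while (n) { i = n; }
        SNode tree = new SWhile(new SVar("n"), new SAssign("i", new SVar("n")));
        NameCollector c = new NameCollector();
        tree.accept(c);
        System.out.println(c.names + " loops=" + c.loops);  // prints "[i, n] loops=1"
    }
}

interface SVisitor {
    void visit(SVar n);
    void visit(SAssign n);
    void visit(SWhile n);
}

abstract class SNode {
    abstract void accept(SVisitor v);
}

class SVar extends SNode {
    final String name;
    SVar(String name) { this.name = name; }
    @Override void accept(SVisitor v) { v.visit(this); }
}

class SAssign extends SNode {
    final String target;
    final SNode value;
    SAssign(String target, SNode value) { this.target = target; this.value = value; }
    // visit the right-hand side first, then this node
    @Override void accept(SVisitor v) { value.accept(v); v.visit(this); }
}

class SWhile extends SNode {
    final SNode exp, stmt;
    SWhile(SNode exp, SNode stmt) { this.exp = exp; this.stmt = stmt; }
    // visit condition and body first, then this node
    @Override void accept(SVisitor v) { exp.accept(v); stmt.accept(v); v.visit(this); }
}

// The fields are the visitor's local state, shared by all of its visit methods.
class NameCollector implements SVisitor {
    final Set<String> names = new TreeSet<>();  // every identifier seen, kept sorted
    int loops = 0;                              // number of while nodes seen

    @Override public void visit(SVar n)    { names.add(n.name); }
    @Override public void visit(SAssign n) { names.add(n.target); }
    @Override public void visit(SWhile n)  { loops++; }
}
```

Because the state lives in the visitor object, a fresh traversal only needs a fresh visitor instance; the tree itself is left untouched.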

Responsibility for the Traversal
- Possible choices
    - The node objects (as done above)
    - The visitor object (the visitor has access to the node, so it can traverse any substructure it wishes)
    - Some sort of iterator object
- In a compiler, the first choice will handle many common cases
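For contrast with node-driven traversal, here is a sketch of the second choice, where the visitor owns the traversal. Each accept method only dispatches, and the visitor decides when to descend, which lets it act on a node before its children and track nesting depth. The names VNode, VLeaf, VWhile, VVisitor, and IndentPrinter are assumptions for this illustration.

```java
class IndentDemo {
    public static void main(String[] args) {
        // while (a) while (b) x
        VNode tree = new VWhile(new VLeaf("a"),
                                new VWhile(new VLeaf("b"), new VLeaf("x")));
        IndentPrinter p = new IndentPrinter();
        tree.accept(p);
        System.out.print(p.out);  // one node per line, indented by nesting depth
    }
}

interface VVisitor {
    void visit(VLeaf n);
    void visit(VWhile n);
}

abstract class VNode {
    abstract void accept(VVisitor v);
}

class VLeaf extends VNode {
    final String text;
    VLeaf(String text) { this.text = text; }
    @Override void accept(VVisitor v) { v.visit(this); }
}

class VWhile extends VNode {
    final VNode exp, stmt;
    VWhile(VNode exp, VNode stmt) { this.exp = exp; this.stmt = stmt; }
    // accept only dispatches; it does not walk the children
    @Override void accept(VVisitor v) { v.visit(this); }
}

class IndentPrinter implements VVisitor {
    final StringBuilder out = new StringBuilder();
    private int depth = 0;  // current nesting level

    private void line(String s) {
        for (int i = 0; i < depth; i++) out.append("  ");
        out.append(s).append('\n');
    }

    @Override public void visit(VLeaf n) { line(n.text); }

    @Override public void visit(VWhile n) {
        line("while");       // handle this node before its children
        depth++;
        n.exp.accept(this);  // the visitor chooses which subtrees to walk, and when
        n.stmt.accept(this);
        depth--;
    }
}
```

The cost of this flexibility is that every visitor must repeat the descent into children, which is one reason node-driven traversal is convenient for the common cases.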

References
- For the Visitor pattern (and many others): Design Patterns: Elements of Reusable Object-Oriented Software, Gamma, Helm, Johnson, and Vlissides, Addison-Wesley, 1995
- Specific information for MiniJava AST and visitors in the Appel textbook & online

Coming Attractions
- Static Analysis
    - Type checking & representation of types
    - Non-context-free rules (variables and types must be declared, etc.)
- Symbol Tables
- & more