CSE P 501 Compilers. Implementing ASTs (in Java) Hal Perkins Winter /22/ Hal Perkins & UW CSE H-1

Similar documents
CSE P 501 Compilers. Implementing ASTs (in Java) Hal Perkins Autumn /20/ Hal Perkins & UW CSE H-1

CSE 401/M501 Compilers

CSE P 501 Compilers. Static Semantics Hal Perkins Winter /22/ Hal Perkins & UW CSE I-1

Static Semantics. Winter /3/ Hal Perkins & UW CSE I-1

Chapter 4. Abstract Syntax

CSE 401/M501 Compilers

An Automatic Visitor Generator for Ada

CSE 401/M501 Compilers

CSE P 501 Compilers. Intermediate Representations Hal Perkins Spring UW CSE P 501 Spring 2018 G-1

Project 1: Scheme Pretty-Printer

CSE P 501 Exam 8/5/04

Introduction to Programming Using Java (98-388)

CSE 413 Languages & Implementation. Hal Perkins Winter 2019 Structs, Implementing Languages (credits: Dan Grossman, CSE 341)

CSE 401 Midterm Exam Sample Solution 11/4/11

MiniJava Parser and AST. Winter /27/ Hal Perkins & UW CSE E1-1

LL(k) Compiler Construction. Top-down Parsing. LL(1) parsing engine. LL engine ID, $ S 0 E 1 T 2 3

Pattern Examples Behavioural

LL(k) Compiler Construction. Choice points in EBNF grammar. Left recursive grammar

Think of drawing/diagramming editors. ECE450 Software Engineering II. The problem. The Composite pattern

ASTs, Objective CAML, and Ocamlyacc

CS 406: Syntax Directed Translation

CSE 401 Final Exam. December 16, 2010

Last Time. What do we want? When do we want it? An AST. Now!

A Type Graph Model for Java Programs

CPS2000 Compiler Theory & Practice

Semantic analysis. Compiler Construction. An expression evaluator. Examples of computations. Computations on ASTs Aspect oriented programming

Object-oriented Compiler Construction

CSE 401/M501 Compilers

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

CSE P 501 Compilers. Static Semantics Hal Perkins Spring UW CSE P 501 Spring 2018 I-1

Visitors. Move functionality

Day 4. COMP1006/1406 Summer M. Jason Hinek Carleton University

.jj file with actions

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Spring UW CSE P 501 Spring 2018 C-1

INCORPORATING ADVANCED PROGRAMMING TECHNIQUES IN THE COMPUTER INFORMATION SYSTEMS CURRICULUM

Chapter 8, Design Patterns Visitor

Program Representations

Project Compiler. CS031 TA Help Session November 28, 2011

CS 4120 Introduction to Compilers

Semantic Analysis. Compiler Architecture

D Programming Language

Administration CS 412/413. Why build a compiler? Compilers. Architectural independence. Source-to-source translator

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter /15/ Hal Perkins & UW CSE C-1

C++ Important Questions with Answers

Rules and syntax for inheritance. The boring stuff

UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division

Lecture 8 CS 412/413 Spring '00 -- Andrew Myers 2. Lecture 8 CS 412/413 Spring '00 -- Andrew Myers 4

Semantic Analysis. Lecture 9. February 7, 2018

A programming language requires two major definitions A simple one pass compiler

Repeating patterns of boilerplate code (like parent/child navigation, Visitor pattern implementation, factory methods for parser tree construction)

CMPT 379 Compilers. Parse trees

drawobject Circle draw

Context-free grammars (CFG s)

Abstract Syntax. Mooly Sagiv. html://

Lab 2 Tutorial (An Informative Guide)

C++ Inheritance and Encapsulation

CSE 401 Midterm Exam Sample Solution 2/11/15

Lecture 3. COMP1006/1406 (the Java course) Summer M. Jason Hinek Carleton University

CPSC 411: Introduction to Compiler Construction

CSE 401/M501 Compilers

Implementing Object Equivalence in Java Using the Template Method Design Pattern

Table-Driven Top-Down Parsers

Writing a Lexer. CS F331 Programming Languages CSCE A331 Programming Language Concepts Lecture Slides Monday, February 6, Glenn G.

CSE 331 Software Design & Implementation

LECTURE 3. Compiler Phases

CSE302: Compiler Design

CSE P 501 Compilers. Java Implementation JVMs, JITs &c Hal Perkins Winter /11/ Hal Perkins & UW CSE V-1

Grammars and Parsing, second week

CS 251 Intermediate Programming Inheritance

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax

Visitor Pattern.» Represent an operation to be performed on all of the components of an object structure

The Visitor Pattern. Design Patterns In Java Bob Tarr

Topics in Object-Oriented Design Patterns

CSE 5317 Midterm Examination 4 March Solutions

Type Checking Binary Operators

Inheritance & Polymorphism Recap. Inheritance & Polymorphism 1

Examples of attributes: values of evaluated subtrees, type information, source file coordinates,

Tecniche di Progettazione: Design Patterns

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1

David Talby March 21, 2006

CS 536 Spring Programming Assignment 1 Identifier Cross-Reference Analysis

BFH/HTA Biel/DUE/Course 355/ Software Engineering 2

The Object Recursion Pattern

Error Detection in LALR Parsers. LALR is More Powerful. { b + c = a; } Eof. Expr Expr + id Expr id we can first match an id:

Visitor Pattern CS356 Object-Oriented Design and Programming November 5, 2014 Yu Sun, Ph.D.

Design patterns (part 3)

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

Languages, Automata, Regular Expressions & Scanners. Winter /8/ Hal Perkins & UW CSE B-1

Static Semantics. Lecture 15. (Notes by P. N. Hilfinger and R. Bodik) 2/29/08 Prof. Hilfinger, CS164 Lecture 15 1

What are the characteristics of Object Oriented programming language?

OOPS Viva Questions. Object is termed as an instance of a class, and it has its own state, behavior and identity.

Intermediate Code Generation

Object-Oriented Oriented Programming

CSE 413 Final Exam Spring 2011 Sample Solution. Strings of alternating 0 s and 1 s that begin and end with the same character, either 0 or 1.

AP Computer Science Chapter 10 Implementing and Using Classes Study Guide

CSE 12 Abstract Syntax Trees

Abstract Syntax Trees

EDA180: Compiler Construc6on. More Top- Down Parsing Abstract Syntax Trees Görel Hedin Revised:

Time : 1 Hour Max Marks : 30

Implementing Actions

Transcription:

CSE P 501 Compilers Implementing ASTs (in Java) Hal Perkins Winter 2008 1/22/2008 2002-08 Hal Perkins & UW CSE H-1

Agenda Representing ASTs as Java objects Parser actions Operations on ASTs Modularity and encapsulation Visitor pattern This is a general sketch of the ideas more details and sample code online for MiniJava 1/22/2008 2002-08 Hal Perkins & UW CSE H-2

Review: ASTs An Abstract Syntax Tree captures the essential structure of the program, without the extra concrete grammar details needed to guide the parser AST: Example: while ( n > 0 ) { n = n 1; 1/22/2008 2002-08 Hal Perkins & UW CSE H-3

Representation in Java Basic idea is simple: use small classes as records (or structs) to represent nodes in the AST Simple data structures, not too smart But also use a bit of inheritance so we can treat related nodes polymorphically 1/22/2008 2002-08 Hal Perkins & UW CSE H-4

AST Nodes - Sketch // Base class of AST node hierarchy public abstract class ASTNode { // constructors (for convenience) // operations // string representation public abstract String tostring() ; // etc. Note: In a production compiler, we would put the node classes into a separate Java package. Use your own judgment for your project. 1/22/2008 2002-08 Hal Perkins & UW CSE H-5

Some Statement Nodes // Base class for all statements public abstract class StmtNode extends ASTNode { // while (exp) stmt public class WhileNode extends StmtNode { public ExpNode exp; public StmtNode stmt; public WhileNode(ExpNode exp, StmtNode stmt) { this.exp = exp; this.stmt = stmt; public String tostring() { return While( + exp + ) + stmt; (Note on tostring: most of the time we ll want to print the tree in a separate traversal, so this is mostly useful for limited debugging) 1/22/2008 2002-08 Hal Perkins & UW CSE H-6

More Statement Nodes // if (exp) stmt [else stmt] public class IfNode extends StmtNode { public ExpNode exp; public StmtNode thenstmt, elsestmt; public IfNode(ExpNode exp,stmtnode thenstmt,stmtnode elsestmt) { this.exp=exp; this.thenstmt=thenstmt;this.elsestmt=elsestmt; public IfNode(ExpNode exp, StmtNode thenstmt) { this(exp, thenstmt, null); public String tostring() { 1/22/2008 2002-08 Hal Perkins & UW CSE H-7

Java Style Note (1) Some good housekeeping reminders use your own judgement about what to use in your project Normally, any significant Java type should be defined by an interface interface ASTNode {... If there are at least some methods that will be used by most implementations of the interface, provide a default implementation public class ASTNodeImpl {... Similarly for subclasses and subinterfaces interface Statement implements ASTNode {... or public class StatementIMPL implements Statement {... public class StatementIMPL extends ASTNodeIMPL implements Statement {... 1/22/2008 2002-08 Hal Perkins & UW CSE H-8

Java Style Note (2) Method parameters and variables should use the interface names as types for maximum flexibility wherever possible Implementations of nodes can either extend some other class or directly implement an interface as appropriate Specific kinds of nodes that will not be extended can be defined directly no interface needed These slides use inheritance only (historical laziness and it s more compact) 1/22/2008 2002-08 Hal Perkins & UW CSE H-9

Expressions // Base class for all expressions public abstract class ExpNode extends ASTNode { // exp1 op exp2 public class BinExp extends ExpNode { public ExpNode exp1, exp2; // operands public int op; // operator (lexical token) public BinExp(Token op, ExpNode exp1, ExpNode exp2) { this.op = op; this.exp1 = exp1; this.exp2 = exp2; public String tostring() { 1/22/2008 2002-08 Hal Perkins & UW CSE H-10

More Expressions // Method call: id(arguments) public class MethodExp extends ExpNode { public ExpNode id; // method public List args; // list of argument expressions public BinExp(ExpNode id, List args) { this.id = id; this.args = args; public String tostring() { 1/22/2008 2002-08 Hal Perkins & UW CSE H-11

&c These examples are meant to give you some ideas, not necessarily to be used literally E.g., you might find it much better to have a specific AST node for argument list that encapsulates the List of arguments You ll also need nodes for class and method declarations, parameter lists, and so forth Starter code on the web for MiniJava 1/22/2008 2002-08 Hal Perkins & UW CSE H-12

Position Information in Nodes To produce useful error messages, it s helpful to record the source program location corresponding to a node in that node Most scanner/parser generators have a hook for this, usually storing source position information in tokens Would be nice in our projects, but not required (i.e., get the parser/ast construction working first) 1/22/2008 2002-08 Hal Perkins & UW CSE H-13

AST Generation Idea: each time the parser recognizes a complete production, it produces as its result an AST node (with links to the subtrees that are the components of the production in its instance variables) When we finish parsing, the result of the goal symbol is the complete AST for the program 1/22/2008 2002-08 Hal Perkins & UW CSE H-14

Example: Recursive-Descent AST Generation // parse while (exp) stmt WhileNode whilestmt() { // skip while ( getnexttoken(); getnexttoken(); // skip ) getnexttoken; // parse stmt StmtNode body = stmt(); // parse exp ExpNode condition = exp(); // return AST node for while return new WhileNode (condition, body); 1/22/2008 2002-08 Hal Perkins & UW CSE H-15

AST Generation in YACC/CUP A result type can be specified for each item in the grammar specification Each parser rule can be annotated with a semantic action, which is just a piece of Java code that returns a value of the result type The semantic action is executed when the rule is reduced 1/22/2008 2002-08 Hal Perkins & UW CSE H-16

YACC/CUP Parser Specification Specification non terminal StmtNode stmt, whilestmt; non terminal ExpNode exp; stmt ::= WHILE LPAREN exp:e RPAREN stmt:s {: RESULT = new WhileNode(e,s); : ; 1/22/2008 2002-08 Hal Perkins & UW CSE H-17

ANTLR/JavaCC/others Integrated tools like these provide tools to generate syntax trees automatically Advantage: saves work, don t need to define AST classes and write semantic actions Disadvantage: generated trees might not have the right level of abstraction for what you want to do 1/22/2008 2002-08 Hal Perkins & UW CSE H-18

Operations on ASTs Once we have the AST, we may want to Print a readable dump of the tree (pretty printing) Do static semantic analysis Type checking Verify that things are declared and initialized properly Etc. etc. etc. etc. Perform optimizing transformations on the tree Generate code from the tree, or Generate another IR from the tree for further processing (often flatten to a linear IR) 1/22/2008 2002-08 Hal Perkins & UW CSE H-19

Where do the Operations Go? Pure object-oriented style Really smart AST nodes Each node knows how to perform every operation on itself public class WhileNode extends StmtNode { public WhileNode( ); public typecheck( ); public StrengthReductionOptimize( ); public generatecode( ); public prettyprint( ); 1/22/2008 2002-08 Hal Perkins & UW CSE H-20

Critique This is nicely encapsulated all details about a WhileNode are hidden in that class But it is poor modularity What happens if we want to add a new Optimize operation? Have to open up every node class Furthermore, it means that the details of any particular operation (optimization, type checking) are scattered across the node classes 1/22/2008 2002-08 Hal Perkins & UW CSE H-21

Modularity Issues Smart nodes make sense if the set of operations is relatively fixed, but we expect to need flexibility to add new kinds of nodes Example: graphics system Operations: draw, move, iconify, highlight Objects: textbox, scrollbar, canvas, menu, dialog box, plus new objects defined as the system evolves 1/22/2008 2002-08 Hal Perkins & UW CSE H-22

Modularity in a Compiler Abstract syntax does not change frequently over time Kinds of nodes are relatively fixed As a compiler evolves, it is common to modify or add operations on the AST nodes Want to modularize each operation (type check, optimize, code gen) so its components are together Want to avoid having to change node classes when we modify or add an operation on the tree 1/22/2008 2002-08 Hal Perkins & UW CSE H-23

Two Views of Modularity transmogrify highlight iconify move draw Print Flatten Generate x86 Optimize Type check IDENT X X X X X exp X X X X X while X X X X X if X X X X X Binop X X X X X circle X X X X X text X X X X X canvas X X X X X scroll X X X X X dialog X X X X X 1/22/2008 2002-08 Hal Perkins & UW CSE H-24

Visitor Pattern Idea: Package each operation in a separate class One method for each AST node kind Create one instance of this visitor class Sometimes called a function object Include a generic accept visitor method in every node class To perform the operation, pass the visitor object around the AST during a traversal This object contains separate methods to process each AST node type 1/22/2008 2002-08 Hal Perkins & UW CSE H-25

Avoiding instanceof Next issue: we d like to avoid huge if-elseif nests to check the node type in the visitor void checktypes(astnode p) { if (p instanceof WhileNode) { else if (p instanceof IfNode) { else if (p instanceof BinExp) { Solution: Include an overloaded visit method for each node type and get the node to call back to the correct operation for that node(!) Double dispatch 1/22/2008 2002-08 Hal Perkins & UW CSE H-26

One More Issue We want to be able to add new operations easily, so the nodes shouldn t know anything specific about the actual visitor class(es) Solution: an abstract Visitor interface AST nodes include accept visitor method for the interface Specific operations (type check, code gen) are implementations of this interface 1/22/2008 2002-08 Hal Perkins & UW CSE H-27

Visitor Interface interface Visitor { // overload visit for each AST node type public void visit(whilenode s); public void visit(ifnode s); public void visit(binexp e); Aside: The result type can be whatever is convenient, doesn t have to be void 1/22/2008 2002-08 Hal Perkins & UW CSE H-28

Specific class TypeCheckVisitor // Perform type checks on the AST public class TypeCheckVisitor implements Visitor { // override operations for each node type public void visit(binexp e) { e.exp1.accept(this); e.exp2.accept(this); // do additional processing on e before or after public void visit(whilenode s) { public void visit(ifnode s) { 1/22/2008 2002-08 Hal Perkins & UW CSE H-29

Add Visitor Method to AST Nodes Add a new method to class ASTNode (base class or interface describing all AST nodes) public abstract class ASTNode { // accept a visit from a Visitor object v public abstract void accept(visitor v); 1/22/2008 2002-08 Hal Perkins & UW CSE H-30

Override Accept Method in Each Specific AST Node Class Example public class WhileNode extends StmtNode { // accept a visit from a Visitor object v public void accept(visitor v) { v.visit(this); // dynamic dispatch on this (WhileNode) Key points Visitor object passed as a parameter to WhileNode WhileNode calls visit, which dispatches to visit(whilenode) automatically i.e., the correct method for this kind of node 1/22/2008 2002-08 Hal Perkins & UW CSE H-31

Encapsulation A visitor object often needs to be able to access state in the AST nodes May need to expose more state than we might do to otherwise Overall a good tradeoff better modularity (plus, the nodes are relatively simple data objects anyway) 1/22/2008 2002-08 Hal Perkins & UW CSE H-32

Composite Objects If the node contains references to subnodes, we often visit them first (i.e., pass the visitor along in a depth-first traversal of the AST) public class WhileNode extends StmtNode { // accept a visit from Visitor object v public void accept(visitor v) { this.exp.accept(v); this.stmt.accept(v); v.visit(this); Other traversals can be added if needed 1/22/2008 2002-08 Hal Perkins & UW CSE H-33

Visitor Actions A visitor function has a reference to the node it is visiting (the parameter) can access subtrees via that node It s also possible for the visitor object to contain local instance data, used to accumulate information during the traversal Effectively global data shared by visit methods public class TypeCheckVisitor extends NodeVisitor { public void visit(whilenode s) { public void visit(ifnode s) { private <local state>; 1/22/2008 2002-08 Hal Perkins & UW CSE H-34

Responsibility for the Traversal Possible choices The node objects (as done above) The visitor object (the visitor has access to the node, so it can traverse any substructure it wishes) Some sort of iterator object In a compiler, the first choice will handle many common cases 1/22/2008 2002-08 Hal Perkins & UW CSE H-35

References For Visitor pattern (and many others) Design Patterns: Elements of Reusable Object-Oriented Software Gamma, Helm, Johnson, and Vlissides Addison-Wesley, 1995 Specific information for MiniJava AST and visitors in Appel textbook & online 1/22/2008 2002-08 Hal Perkins & UW CSE H-36

Coming Attractions Static Analysis Type checking & representation of types Non-context-free rules (variables and types must be declared, etc.) Symbol Tables & more 1/22/2008 2002-08 Hal Perkins & UW CSE H-37