Compiler Construction
Visitor pattern, Semantic analysis
Lennart Andersson
Revision 2011-02-08
2011

Compiler Construction 2011 F07-1

Visitors

    How to modularize in Java (or any other OO language) if we do not have access to AOP mechanisms?

Example

    How can we factor out the code for print()?

    class Add extends Expr {
        Expr expr1, expr2;
        void print() {
            expr1.print();
            System.out.print("+");
            expr2.print();
        }
    }

    class IntExpr extends Expr {
        int value;
        void print() {
            System.out.print(value);
        }
    }

Move functionality

    class Add extends Expr {
        Expr expr1, expr2;
        void print() {
            expr1.print();
            System.out.print("+");
            expr2.print();
        }
    }

    class IntExpr extends Expr {
        int value;
        void print() {
            System.out.print(value);
        }
    }

    class Visitor {
        void visit(Add node) {
            node.expr1.print();
            System.out.print("+");
            node.expr2.print();
        }
        void visit(IntExpr node) {
            System.out.print(node.value);
        }
    }
Missing functionality

    class Add extends Expr {
        Expr expr1, expr2;
    }

    class IntExpr extends Expr {
        int value;
    }

    class Visitor {
        void visit(Add node) {
            node.expr1.print();
            System.out.print("+");
            node.expr2.print();
        }
        void visit(IntExpr node) {
            System.out.print(node.value);
        }
    }

Delegate

    class Add extends Expr {
        Expr expr1, expr2;
        void accept(Visitor v) {
            v.visit(this);
        }
    }

    class IntExpr extends Expr {
        int value;
        void accept(Visitor v) {
            v.visit(this);
        }
    }

    class Visitor {
        void visit(Add node) {
            node.expr1.accept(this);
            System.out.print("+");
            node.expr2.accept(this);
        }
        void visit(IntExpr node) {
            System.out.print(node.value);
        }
    }

Why not call visit directly?

    class Visitor {
        void visit(Add node) {
            visit(node.expr1);
            System.out.print("+");
            visit(node.expr2);
        }
        void visit(IntExpr node) {
            System.out.print(node.value);
        }
    }

Generalise

    interface Visitor {
        void visit(Add node);
        void visit(IntExpr node);
    }

    class PrintVisitor implements Visitor {
        void visit(Add node) {
            node.expr1.accept(this);
            System.out.print("+");
            node.expr2.accept(this);
        }
    }
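The delegation step can be tried out with a small self-contained program. This is only a sketch: Expr, Add, IntExpr, Visitor and PrintVisitor follow the slides, while the constructors, the StringBuilder field (used instead of System.out so the result can be inspected) and the PrintDemo class are assumptions added for the example.

```java
interface Visitor {
    void visit(Add node);
    void visit(IntExpr node);
}

abstract class Expr {
    abstract void accept(Visitor v);
}

class Add extends Expr {
    final Expr expr1, expr2;
    Add(Expr expr1, Expr expr2) { this.expr1 = expr1; this.expr2 = expr2; }
    @Override void accept(Visitor v) { v.visit(this); }   // here 'this' has static type Add
}

class IntExpr extends Expr {
    final int value;
    IntExpr(int value) { this.value = value; }
    @Override void accept(Visitor v) { v.visit(this); }   // here 'this' has static type IntExpr
}

class PrintVisitor implements Visitor {
    // Assumption: output is collected in a buffer rather than printed directly.
    final StringBuilder out = new StringBuilder();

    @Override public void visit(Add node) {
        // visit(node.expr1) would not compile here: node.expr1 has static type Expr
        // and the interface has no visit(Expr). accept() selects the right overload
        // because each AST class passes 'this' with its own static type.
        node.expr1.accept(this);
        out.append("+");
        node.expr2.accept(this);
    }

    @Override public void visit(IntExpr node) {
        out.append(node.value);
    }
}

class PrintDemo {
    static String unparse(Expr e) {
        PrintVisitor v = new PrintVisitor();
        e.accept(v);
        return v.out.toString();
    }

    public static void main(String[] args) {
        Expr e = new Add(new IntExpr(1), new Add(new IntExpr(2), new IntExpr(3)));
        System.out.println(unparse(e));   // prints 1+2+3
    }
}
```

The comment in visit(Add) is one way to answer the question "Why not call visit directly?": overloads are chosen from static types, so the call has to go through accept.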
Generalise with parameter and return

    interface Visitor {
        Object visit(Add node, Object data);
        Object visit(IntExpr node, Object data);
    }

    class IntExpr extends Expr {
        int value;
        Object accept(Visitor v, Object data) {
            return v.visit(this, data);
        }
    }

    class PrintVisitor implements Visitor {
        Object visit(IntExpr node, Object data) {
            System.out.print(node.value);
            return null;
        }
    }

Another visitor

    class ValueVisitor implements Visitor {
        Object visit(Add node, Object data) {
            int n1 = (Integer) node.expr1.accept(this, data);
            int n2 = (Integer) node.expr2.accept(this, data);
            return Integer.valueOf(n1 + n2);
        }
        Object visit(IntExpr node, Object data) {
            return Integer.valueOf(node.value);
        }
    }

    Expr expr = new Add(...);
    expr.accept(new PrintVisitor(), null);
    int value = (Integer) expr.accept(new ValueVisitor(), null);

The Visitor Pattern

    Intent
        Represent an operation to be performed on the elements of an object structure.
        Visitor lets you define a new operation without changing the classes of the
        elements on which it operates.

Sketch

    Expr, with subclasses Add and Sub
        each AST class has an Object accept(Visitor, Object) method
        accept() delegates the computation to the visitor

    Visitor.java
        interface Visitor {
            Object visit(Add, Object);
            Object visit(Sub, Object);
            // one visit method for each AST class
        }

    ValueVisitor.java
        class ValueVisitor implements Visitor {
            Object visit(Add, Object) { ... }
            Object visit(Sub, Object) { ... }
        }

    UnparseVisitor.java
        class UnparseVisitor implements Visitor {
            Object visit(Add, Object) { ... }
            Object visit(Sub, Object) { ... }
        }
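To see the data parameter and the return value working together, the fragments above can be assembled into one runnable sketch. The interface and ValueVisitor follow the slides. The constructors, the ValueDemo class, and the choice to pass a StringBuilder as data to the unparsing visitor are assumptions made for this example, not something the slides prescribe.

```java
interface Visitor {
    Object visit(Add node, Object data);
    Object visit(IntExpr node, Object data);
}

abstract class Expr {
    abstract Object accept(Visitor v, Object data);
}

class Add extends Expr {
    final Expr expr1, expr2;
    Add(Expr expr1, Expr expr2) { this.expr1 = expr1; this.expr2 = expr2; }
    @Override Object accept(Visitor v, Object data) { return v.visit(this, data); }
}

class IntExpr extends Expr {
    final int value;
    IntExpr(int value) { this.value = value; }
    @Override Object accept(Visitor v, Object data) { return v.visit(this, data); }
}

// Uses the return value: each visit returns the value of its subtree as an Integer.
class ValueVisitor implements Visitor {
    @Override public Object visit(Add node, Object data) {
        int n1 = (Integer) node.expr1.accept(this, data);
        int n2 = (Integer) node.expr2.accept(this, data);
        return n1 + n2;                     // autoboxed to Integer
    }
    @Override public Object visit(IntExpr node, Object data) {
        return node.value;                  // autoboxed to Integer
    }
}

// Uses the data argument: assumes the caller passes a StringBuilder to write into.
class UnparseVisitor implements Visitor {
    @Override public Object visit(Add node, Object data) {
        node.expr1.accept(this, data);
        ((StringBuilder) data).append("+");
        node.expr2.accept(this, data);
        return null;
    }
    @Override public Object visit(IntExpr node, Object data) {
        ((StringBuilder) data).append(node.value);
        return null;
    }
}

class ValueDemo {
    static int valueOf(Expr e) {
        return (Integer) e.accept(new ValueVisitor(), null);   // cast needed: result is untyped
    }
    static String unparse(Expr e) {
        StringBuilder sb = new StringBuilder();
        e.accept(new UnparseVisitor(), sb);
        return sb.toString();
    }
    public static void main(String[] args) {
        Expr e = new Add(new IntExpr(2), new Add(new IntExpr(3), new IntExpr(4)));
        System.out.println(unparse(e) + " = " + valueOf(e));   // prints 2+3+4 = 9
    }
}
```

The casts on both the argument and the result show the price of the untyped Object signature: type errors in how a visitor is used surface only at run time.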
Interface Visitor

    interface Visitor {
        Object visit(Add node, Object data);
        Object visit(Sub node, Object data);
        Object visit(IntExpr node, Object data);
    }

    The visit method is overloaded for different AST argument types
    Each method returns an untyped object
    Each method has an untyped argument (data)

Visitor support in AST nodes

    Method accept that delegates the computation to a Visitor

    abstract class Expr {
        abstract Object accept(Visitor v, Object data);
    }

    abstract class BinExpr extends Expr {
        Object accept(Visitor v, Object data) {
            return v.visit(this, data);
        }
    }

    class Add extends BinExpr {
        Object accept(Visitor v, Object data) {
            return v.visit(this, data);
        }
    }

    class IntExpr extends Expr {
        Object accept(Visitor v, Object data) {
            return v.visit(this, data);
        }
    }

One more example

    Count the number of identifiers in a program

    abstract Stmt;
    IfStmt : Stmt ::= Cond:Expr Then:Stmt [Else:Stmt];
    abstract Expr;
    abstract BinExpr : Expr ::= Left:Expr Right:Expr;
    Add : BinExpr ::= ;
    Sub : BinExpr ::= ;
    Int : Expr ::= <INT:String>;
    IdExpr : Expr ::= <ID:String>;

    How can the Visitor be implemented?

TraversingVisitor

    class TraversingVisitor implements Visitor {
        Object visit(IfStmt node, Object data) {
            node.getCond().accept(this, data);
            node.getThen().accept(this, data);
            if (node.hasElse()) {
                node.getElse().accept(this, data);
            }
            return null;
        }
        Object visit(Add node, Object data) {
            node.getLeft().accept(this, data);
            node.getRight().accept(this, data);
            return null;
        }
    }

    The code above is independent of the computation and could have been generated
    from the abstract grammar (but this is currently not done in JJTree or JastAdd).
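One possible answer to "How can the Visitor be implemented?" is sketched below for the expression part of the grammar. It is a hedged illustration, not the course's solution: the AST classes are hand-written stand-ins for what a tool such as JastAdd would generate, IfStmt is left out to keep the example short, and the count field and the countIn helper are hypothetical names.

```java
interface Visitor {
    Object visit(Add node, Object data);
    Object visit(Sub node, Object data);
    Object visit(Int node, Object data);
    Object visit(IdExpr node, Object data);
}

abstract class Expr {
    abstract Object accept(Visitor v, Object data);
}

abstract class BinExpr extends Expr {
    private final Expr left, right;
    BinExpr(Expr left, Expr right) { this.left = left; this.right = right; }
    Expr getLeft()  { return left; }
    Expr getRight() { return right; }
}

class Add extends BinExpr {
    Add(Expr left, Expr right) { super(left, right); }
    @Override Object accept(Visitor v, Object data) { return v.visit(this, data); }
}

class Sub extends BinExpr {
    Sub(Expr left, Expr right) { super(left, right); }
    @Override Object accept(Visitor v, Object data) { return v.visit(this, data); }
}

class Int extends Expr {
    final String value;
    Int(String value) { this.value = value; }
    @Override Object accept(Visitor v, Object data) { return v.visit(this, data); }
}

class IdExpr extends Expr {
    final String id;
    IdExpr(String id) { this.id = id; }
    @Override Object accept(Visitor v, Object data) { return v.visit(this, data); }
}

// Visits every node and does nothing else; subclasses override only what they need.
class TraversingVisitor implements Visitor {
    @Override public Object visit(Add node, Object data)    { return traverse(node, data); }
    @Override public Object visit(Sub node, Object data)    { return traverse(node, data); }
    @Override public Object visit(Int node, Object data)    { return null; }   // leaf
    @Override public Object visit(IdExpr node, Object data) { return null; }   // leaf

    private Object traverse(BinExpr node, Object data) {
        node.getLeft().accept(this, data);
        node.getRight().accept(this, data);
        return null;
    }
}

// The computation itself: only the IdExpr case differs from plain traversal.
class CountIdentifiers extends TraversingVisitor {
    private int count = 0;

    @Override public Object visit(IdExpr node, Object data) {
        count++;
        return null;
    }

    static int countIn(Expr e) {
        CountIdentifiers v = new CountIdentifiers();
        e.accept(v, null);
        return v.count;
    }
}
```

For example, CountIdentifiers.countIn(new Add(new IdExpr("a"), new Sub(new Int("1"), new IdExpr("b")))) visits two IdExpr nodes and returns 2.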
CountIdentifiers as a visitor

    class CountIdentifiers extends TraversingVisitor { ... }

    Example of use
        ...

Intertype declarations vs. Visitor

                                  Intertype declarations          Visitor
    what can be modularized?      instance variables, methods,    only methods
                                  implements clauses
    types for arguments and       arbitrary                       Object visit(..., Object)
    return values                                                 (one untyped argument and one result)
    separate compilation?         no (preprocessor required)      yes
    pure Java?                    no (requires additional tools)  yes

Using the modularization techniques

    Semantic analysis
        Name analysis: connect an identifier to its declaration
        Type analysis: compute the type of an expression
        ...
    Interpretation
    Unparsing
    Metrics
    Code generation
        Compute the size needed for objects and methods
        Generate instructions
        ...

Semantic Analysis

    Name analysis
        bind each identifier to the appropriate declaration
    Type analysis
        compute the type for each expression
    Error checking
        Is the identifier declared?
        Is the expression of a legal type?
        Does the procedure call have the correct number of arguments?
Name analysis

    We introduce two different AST classes for identifiers:
        IdDecl: a declared occurrence, an identifier that names a declaration
        IdUse: an applied occurrence, an identifier that refers to a declaration

    Name analysis
        Bind each IdUse to the IdDecl to which it refers.

Program example

    Which identifiers are IdDecls? Which are IdUses?

    class GraphicalObject {
        Position pos;
        Position getPos() {
            return pos;
        }
    }

    class Circle extends GraphicalObject {
        float radius;
        float area() {
            return Math.PI*radius*radius;
        }
    }

Scope (Swedish: synlighetsområde)

    The scope of a declaration
        the parts of the program where the name of the declaration is visible

Block

    Block
        a syntactic unit with declarations and statements
        may require memory allocation during execution (once or several times)
    Block structure (nesting)
        a block can have inner blocks (recursively)
        declarations in a block are visible also in the inner blocks
    Example of block structure
        an anonymous block inside a method
        a method inside a class
        a method inside another method
        a class inside another class
Special blocks

    Global declarations
        can be viewed as belonging to an outermost block
    Static fields
        A global block can be created for each class to hold its static fields

Scope rules (visibility rules)

    Govern how IdUses are bound to IdDecls

    Typical factors (differ in different languages)
        combination: how can blocks be combined?
        name collisions: what happens if the same name is declared in many blocks?
        declaration order: does it affect the bindings?
        method overloading: can there be several methods of the same name, but with
            different argument types? What are the binding rules?
        parameters: how do they relate to local variables?
        return values: are they named explicitly?
        visibility restrictions: private, public, ...
        qualified access: access via another name

How can blocks be combined?

    Block structure
        declarations in an outer block are visible also in an inner block
    Inheritance
        declarations in a class are visible also in subclasses
    Combined block structure and inheritance
        e.g., a method in a subclass can access instance variables in a superclass

Name collisions

    Shadowing (Swedish: skuggning)
        inner declarations shadow outer declarations of the same name

        { int x;
          ...
          { int x;
            ...
          }
        }

    Forbidden shadowing
        Some languages prohibit inner blocks from declaring a name that is already
        present in an outer block.
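Java happens to show both behaviours from the "Name collisions" slide, which makes for a compact illustration. In the sketch below (ShadowDemo and its members are hypothetical names, not from the slides), a local variable is allowed to shadow a field, while redeclaring a local variable in a nested block is rejected by the compiler.

```java
class ShadowDemo {
    static int x = 1;                 // declared in the outer, class-level block

    static int useField() {
        return x;                     // no inner declaration: binds to the field
    }

    static int useLocal() {
        int x = 2;                    // allowed: the local x shadows the field x
        int result;
        {
            // int x = 3;             // forbidden: javac reports that x is already
            //                        // defined in useLocal()
            result = x;               // binds to the local x
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(useField());   // prints 1
        System.out.println(useLocal());   // prints 2
    }
}
```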
Declaration order

    Homogeneous blocks
        the order between the declarations is irrelevant
            Declarations in Java classes
            All declarations in Algol
    Declare-before-use
        a name must be declared before it is used
            Declarations in Java methods
            Declarations in C
            Declarations in Pascal

Other name issues

    Can the same name be used for declarations of different kinds?
        E.g., fields and methods in Java: int c, int c()
        What about class c?
    Do all names share the same lexical definition, or are there different kinds of identifiers?
        In Smalltalk, attributes must start with a lower case letter, classes with an
        upper case letter.
        In Java it is just a convention that class names should start with a capital letter.

Overloaded method names

    Overloaded method
        The same method name
        Different signatures (method signature: name, parameter types, return type)
        Bind to the method with the most specific signature (with respect to the static types)

    void visit(Expr node) { ... }
    void visit(Add node) { ... }

    Expr e = new Add();
    visit(e);             // which method is called?
    e.accept(visitor);    // which method is called?

Parameters

    Usually, parameters can be seen as special local variables

    void m(int x, int y) {
        int s = 2;
        x = s + y;
    }

    Usually, it is an error to declare a local variable with the same name

    void m(int x, int y) {
        int x;    // Multiple declaration of x
    }
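The two "which method is called?" questions on the overloading slide can be checked with a short experiment. This is a sketch under assumptions: the visit methods return a descriptive String so the outcome can be observed, and OverloadDemo with its two helper methods is a hypothetical wrapper.

```java
abstract class Expr {
    abstract String accept(Visitor v);
}

class Add extends Expr {
    // Inside Add, 'this' has static type Add, so visit(Add) is selected at compile time.
    @Override String accept(Visitor v) { return v.visit(this); }
}

class Visitor {
    String visit(Expr node) { return "visit(Expr)"; }
    String visit(Add node)  { return "visit(Add)"; }
}

class OverloadDemo {
    // Overload resolution looks only at the static type of e, which is Expr.
    static String direct() {
        Visitor visitor = new Visitor();
        Expr e = new Add();
        return visitor.visit(e);
    }

    // accept is dispatched dynamically to Add.accept, which then calls visit(Add).
    static String viaAccept() {
        Visitor visitor = new Visitor();
        Expr e = new Add();
        return e.accept(visitor);
    }

    public static void main(String[] args) {
        System.out.println(direct());      // prints visit(Expr)
        System.out.println(viaAccept());   // prints visit(Add)
    }
}
```

The overload is fixed at compile time from static types, whereas the accept call is resolved at run time from the object's class. Combining the two is what lets a visitor reach the method for the node's actual class.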
Return value

    The value can be returned by a special return statement (C, Java, ...)

        int m() {
            return 3;
        }

    The value can be returned by using the function name as a variable (Algol, Pascal, ...)

        int m() {
            m = 3;
        }

Visibility modifiers

    Explicit modifiers
        private, protected, public (Java)
        hidden, protected (Simula)
        friend, ... (C++)
    Default rules
        what rules hold for classes/methods/fields without modifiers?

Qualified access

    Indirect access via another name

        circle.area()
        circle.getPos();

AST based name analysis

    Bindings
        Represent the binding as a reference variable in the IdUses
    Algorithm
        Traverse the AST
        Keep track of visible declarations in a symbol table
        Look up the declaration for each IdUse
    Symbol table
        maps names to declarations
        a stack can be used to handle block structure
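The algorithm outlined above can be sketched in plain Java before looking at any particular tool. Everything here is an illustrative assumption except the names IdDecl and IdUse: the Node and Block classes are hand-written stand-ins for generated AST classes, an IdUse is treated directly as a statement for brevity, and the symbol table is simply a java.util.Deque of maps.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IdDecl {
    final String id;
    IdDecl(String id) { this.id = id; }
}

abstract class Node {
    // The symbol table is a stack of maps, one map per open block.
    abstract void nameAnalysis(Deque<Map<String, IdDecl>> table);
}

class IdUse extends Node {
    final String id;
    IdDecl decl;                      // the binding; stays null if id is undeclared
    IdUse(String id) { this.id = id; }

    @Override void nameAnalysis(Deque<Map<String, IdDecl>> table) {
        for (Map<String, IdDecl> block : table) {   // iterates innermost block first
            decl = block.get(id);
            if (decl != null) return;
        }
    }
}

class Block extends Node {
    final List<IdDecl> decls;
    final List<Node> stmts;
    Block(List<IdDecl> decls, List<Node> stmts) { this.decls = decls; this.stmts = stmts; }

    @Override void nameAnalysis(Deque<Map<String, IdDecl>> table) {
        Map<String, IdDecl> scope = new HashMap<String, IdDecl>();
        for (IdDecl d : decls) scope.put(d.id, d);  // collect this block's declarations
        table.push(scope);                          // enter block
        for (Node s : stmts) s.nameAnalysis(table); // traverse the statements
        table.pop();                                // exit block
    }
}

class NameAnalysisDemo {
    // Outer block declares x; inner block declares z and uses x, z and an undeclared w.
    static boolean run() {
        IdDecl x = new IdDecl("x"), z = new IdDecl("z");
        IdUse useX = new IdUse("x"), useZ = new IdUse("z"), useW = new IdUse("w");
        Block inner = new Block(Arrays.asList(z), Arrays.<Node>asList(useX, useZ, useW));
        Block outer = new Block(Arrays.asList(x), Arrays.<Node>asList(inner));
        outer.nameAnalysis(new ArrayDeque<Map<String, IdDecl>>());
        return useX.decl == x && useZ.decl == z && useW.decl == null;
    }
}
```

After the traversal each IdUse holds a direct reference to its IdDecl, and a null binding marks the use of an undeclared name, which a later error-checking pass could report.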
Bindings

    Program
        Block
            List(Decl*)
                VarDecl
                    IntType
                    IdDecl x
            List(Stmt*)
                Assignment
                    IdUse x
                    IntConst 3

    Distinguish between identifier uses and declarations in the AST

    Replace IdExpr by
        IdDecl: declared identifier occurrence
        IdUse: applied identifier occurrence (has a reference to an IdDecl node)

Example

    Program ::= Block;
    Block ::= Decl* Stmt*;
    abstract Decl;
    VarDecl: Decl ::= Type IdDecl;
    abstract Type;
    IntType: Type ::=;
    IdDecl ::= <ID>;
    abstract Stmt;
    AssignStmt: Stmt ::= IdUse Expr;
    BlockStmt: Stmt ::= Block;
    abstract Expr;
    IdUse: Expr ::= <ID>;
    IntConst: Expr ::= <INT>;
    Add: Expr ::= Left:Expr Right:Expr;

    begin
        int x;
        x = 3;
    end;

    begin
        int x;
        int y;
        x = 3;
        begin
            int z;
            z = x + 5;
            y = z + 1;
        end;
    end;

Properties of this simple language

    declarations appear before statements (cannot be mixed like in C or Java)
    no IdUses inside the declaration part
    blocks can be nested

Implementation of name analysis

    aspect NameAnalysis {
        void Program.nameAnalysis() {
            SymbolTable table = new SymbolTable();
            nameAnalysis(table);
        }
        void ASTNode.nameAnalysis(SymbolTable table) {
            for (int k = 0; k < getNumChild(); k++) {
                getChild(k).nameAnalysis(table);
            }
        }
        void Block.nameAnalysis(SymbolTable table) {
            table.enterBlock();
            getDeclList().addDecls(table);
            getStmtList().nameAnalysis(table);
            table.exitBlock();
        }
... Implementation of name analysis

        void ASTNode.addDecls(SymbolTable table) {
            for (int k = 0; k < getNumChild(); k++) {
                getChild(k).addDecls(table);
            }
        }
        void IdDecl.addDecls(SymbolTable table) {
            table.add(getID(), this);
        }
        IdDecl IdUse.decl;
        void IdUse.nameAnalysis(SymbolTable table) {
            decl = (IdDecl) table.lookup(getID());
        }
    }

Implementation of the symbol table

    a stack of hashmaps (one hashmap for each block)
    a new hashmap is pushed when entering a block
    a hashmap is popped when exiting a block
    the symbol table is empty after the name analysis

Symbol table API

    Constructor summary
        SymbolTable()

    Method summary
        void add(String symbol, Object meaning)
            Adds the symbol and its associated meaning to the top table
        boolean alreadyDeclared(String symbol)
            Returns true if the symbol is already in the top table
        int blockLevel()
            Returns the current block level (= the number of dictionaries on the stack)
        void enterBlock()
            Adds a new table to the stack
        void exitBlock()
            Removes the top table from the stack
        Object lookup(String symbol)
            Returns the meaning of symbol
Implementation of SymbolTable

    import java.util.HashMap;

    public class SymbolTable {
        // One hashmap per block, linked to the hashmap of the enclosing block
        private class LinkedHashMap extends HashMap<String, Object> {
            LinkedHashMap next;
            LinkedHashMap(LinkedHashMap next) {
                this.next = next;
            }
            Object lookup(String symbol) {
                Object result = get(symbol);
                if (result != null || next == null)
                    return result;
                else
                    return next.lookup(symbol);
            }
        }

        private LinkedHashMap top = null;

        public void add(String symbol, Object meaning) {
            top.put(symbol, meaning);
        }
        public Object lookup(String symbol) {
            return top.lookup(symbol);
        }
        public void enterBlock() {
            top = new LinkedHashMap(top);
        }
        public void exitBlock() {
            top = top.next;
        }
    }

Variations

    The declaration part could contain IdUses, e.g.,
        variables with initial values
        names of user defined types (e.g., classes and interfaces)
    The declaration part could contain blocks, e.g.,
        methods, inner classes
    Mixed declarations and statements
        as in C and Java
    Order between the declarations
        Decl-before-use like in C and Java methods
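For comparison, the same symbol table API can be sketched on top of java.util.ArrayDeque instead of hand-linked hashmaps. This is an alternative written for illustration, not the course's implementation: the class name StackSymbolTable and the demo are hypothetical, and the sketch also fills in alreadyDeclared and blockLevel, which the API summary lists but the implementation slides do not show.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class StackSymbolTable {
    // The head of the deque is the innermost (most recently entered) block.
    private final Deque<Map<String, Object>> blocks = new ArrayDeque<Map<String, Object>>();

    void enterBlock() { blocks.push(new HashMap<String, Object>()); }

    void exitBlock() { blocks.pop(); }

    int blockLevel() { return blocks.size(); }

    // Like the linked version, add requires that a block has been entered first.
    void add(String symbol, Object meaning) { blocks.peek().put(symbol, meaning); }

    boolean alreadyDeclared(String symbol) { return blocks.peek().containsKey(symbol); }

    Object lookup(String symbol) {
        for (Map<String, Object> block : blocks) {   // innermost block first
            Object meaning = block.get(symbol);
            if (meaning != null) return meaning;
        }
        return null;                                 // not declared in any open block
    }
}

class SymbolTableDemo {
    static String run() {
        StackSymbolTable t = new StackSymbolTable();
        t.enterBlock();
        t.add("x", "outer x");
        t.add("y", "outer y");
        t.enterBlock();
        t.add("x", "inner x");                       // shadows the outer x
        String inside = t.lookup("x") + ", " + t.lookup("y");
        t.exitBlock();
        String after = (String) t.lookup("x");       // the outer x is visible again
        t.exitBlock();
        return inside + "; " + after;
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints inner x, outer y; outer x
    }
}
```

A separate alreadyDeclared check against only the top table is what lets a compiler report a multiple declaration within one block while still permitting shadowing of names from outer blocks.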