Syntax Analysis: The Parser Generator (BYacc/J)
CMPSC 470 Lecture 09-2
Topics: Yacc, BYacc/J

A. Yacc

Yacc is a computer program that generates LALR parsers. Yacc stands for "Yet Another Compiler-Compiler." It was originally developed and written in the B programming language by Stephen C. Johnson at AT&T Corporation in the early 1970s, and was later rewritten in C.

The Yacc program
1. takes, as input, a program containing a CFG and the action for each production,
2. creates an LALR parse table, and
3. generates a C program.

Yacc usually works with a lexical analyzer generator, such as Lex or Flex.

How to use the Yacc program
1. From the source file translate.y, generate the file y.tab.c using the yacc program:
       yacc translate.y
2. Compile the C program to produce a.out:
       cc y.tab.c
3. Run a.out.

Format of a yacc file

    declarations
    %%
    translation rules
    %%
    supporting C routines
B. BYacc/J

BYacc/J is an extension of Berkeley Yacc. In addition to C/C++ parsers, it can generate Java parsers (using the -J flag). Its Windows and Linux binaries are available at http://byaccj.sourceforge.net/

Steps to use BYacc/J
1. Download the binaries for Win32 or Linux from http://byaccj.sourceforge.net/#download
2. Unzip the yacc.exe file.
3. Compile your grammar file (here, Parser.y) as follows:
       yacc.exe -J Parser.y
4. BYacc/J generates two parser files: Parser.java and ParserVal.java.

C. BYacc/J input format and output format

Input file Parser.y:

    %{
    import java.io.*;
    %}

    %token ADD MUL
    %token <ival> NUM
    %type <dval> expr
    %type <obj> exprs

    %left ADD
    %left MUL

    %%

    start : exprs                  { action 1 }
          ;
    exprs : exprs expr SEMI        { action 2 }
          |                        { action 3 }
          ;
    expr  : expr ADD expr          { action 4 }
          | expr MUL expr          { action 5 }
          | LPAREN expr RPAREN     { action 6 }
          | NUM                    { action 7 }
          ;

    %%

    /* ParserVal yylval is already defined */
    private int yylex() {
        yylval = new ParserVal(0);
        int yyl_return = lexer.yylex();
        return yyl_return;
    }

    public void yyerror(String error) {
        System.err.println("Error: " + error);
    }

    public Parser(Reader r) {
        lexer = new Lexer(r, this);
    }

Output file Parser.java. The import, the semantic actions, and the supporting routines are copied from Parser.y into the generated class:

    import java.io.*;

    public class Parser {
        int yyparse() {
            switch (yyn) {
            case 1 : { action 1 } break;
            case 10: { action 2 } break;
            case 12: { action 3 } break;
            case 20: { action 4 } break;
            case ..: { action 7 } break;
            case ..: { action 6 } break;
            case ..: { action 5 } break;
            }
        }

        private int yylex() {
            yylval = new ParserVal(0);
            int yyl_return = lexer.yylex();
            return yyl_return;
        }

        public void yyerror(String error) {
            System.err.println("Error: " + error);
        }

        public Parser(Reader r) {
            lexer = new Lexer(r, this);
        }
    }
D. Example

Input file Parser.y:

    %{
    import java.io.*;
    import java.util.ArrayList;
    %}

    %token <ival> NUM
    %token ADD SUB MUL DIV SEMI LPAREN RPAREN
    %type <obj> exprs start
    %type <ival> expr

    %left ADD SUB
    %left MUL DIV

    %%

    start : exprs
            {
                ArrayList<Integer> vals = (ArrayList<Integer>)$1;
                for (int i = 0; i < vals.size(); i++)
                    System.out.println(vals.get(i));
            }
          ;

    exprs : exprs expr SEMI
            {
                ArrayList<Integer> vals = (ArrayList<Integer>)$1;
                int val = $2;
                vals.add(val);
                $$ = vals;
            }
          | /* empty */
            { $$ = new ArrayList<Integer>(); }
          ;

    expr  : expr ADD expr        { $$ = $1 + $3; }
          | expr SUB expr        { $$ = $1 - $3; }
          | expr MUL expr        { $$ = $1 * $3; }
          | expr DIV expr        { int val1 = $1; int val3 = $3; $$ = val1 / val3; }
          | LPAREN expr RPAREN   { int val = $2; $$ = val; }
          | NUM                  { int num = $1; $$ = num; }
          ;

    %%

    private Lexer lexer;

    private int yylex() {
        int yyl_return = -1;
        try {
            yylval = new ParserVal(0);
            yyl_return = lexer.yylex();
        } catch (IOException e) {
            System.out.println("IO error :" + e);
        }
        return yyl_return;
    }

    public void yyerror(String error) {
        System.err.println("Error: " + error);
    }

    public Parser(Reader r) {
        lexer = new Lexer(r, this);
    }
Generated file Parser.java (excerpt):

    import java.io.*;
    import java.util.ArrayList;

    public class Parser {
        public final static short NUM=257;
        public final static short ADD=258;
        // ...
        public final static short RPAREN=264;

        private Lexer lexer;

        private int yylex() {
            int yyl_return = -1;
            try {
                yylval = new ParserVal(0);
                yyl_return = lexer.yylex();
            } catch (IOException e) {
                System.out.println("IO error :" + e);
            }
            return yyl_return;
        }

        public void yyerror(String error) {
            System.err.println("Error: " + error);
        }

        public Parser(Reader r) {
            lexer = new Lexer(r, this);
        }

        ParserVal yyval;   //used to return semantic vals from action routines
        ParserVal yylval;  //the 'lval' (result) I got from yylex()

        //###############################################################
        // method: yyparse : parse input and execute indicated items
        //###############################################################
        int yyparse() {
            switch (yyn) {
            //########## USER-SUPPLIED ACTIONS ##########
            case 1:
                {
                    ArrayList<Integer> vals = (ArrayList<Integer>)val_peek(0).obj;
                    for (int i = 0; i < vals.size(); i++)
                        System.out.println(vals.get(i));
                }
                break;
            case 2:
                {
                    ArrayList<Integer> vals = (ArrayList<Integer>)val_peek(2).obj;
                    int val = val_peek(1).ival;
                    vals.add(val);
                    yyval.obj = vals;
                }
                break;
            case 3:
                { yyval.obj = new ArrayList<Integer>(); }
                break;
            case 4:
                { yyval.ival = val_peek(2).ival + val_peek(0).ival; }
                break;
            case 5:
                { yyval.ival = val_peek(2).ival - val_peek(0).ival; }
                break;
            case 6:
                { yyval.ival = val_peek(2).ival * val_peek(0).ival; }
                break;
            case 7:
                { int val1 = val_peek(2).ival; int val3 = val_peek(0).ival; yyval.ival = val1 / val3; }
                break;
            case 8:
                { int val = val_peek(1).ival; yyval.ival = val; }
                break;
            case 9:
                { int num = val_peek(0).ival; yyval.ival = num; }
                break;
            //########## END OF USER-SUPPLIED ACTIONS ##########
            }
        }
    }

Note how BYacc/J rewrites the action code: $$ becomes a field of yyval, and each $k becomes a field of val_peek(n - k), where n is the number of symbols in the rule body. The field (ival, obj, ...) is chosen from the %token and %type declarations.
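To make the val_peek() indexing in the generated actions concrete, here is a minimal sketch of a value stack. It is a simplified model written for illustration, not BYacc/J's actual implementation; the class names DemoVal and ValueStackDemo and the method names valPeek and reduceAdd are assumptions introduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the generated ParserVal class (only ival is modeled).
class DemoVal {
    int ival;
    DemoVal(int ival) { this.ival = ival; }
}

// Toy value stack that mimics how val_peek(k) addresses the symbols of a rule body.
class ValueStackDemo {
    private final List<DemoVal> stack = new ArrayList<>();

    void push(DemoVal v) { stack.add(v); }

    // valPeek(0) is the top of the stack, i.e. the LAST symbol of the rule body.
    DemoVal valPeek(int k) { return stack.get(stack.size() - 1 - k); }

    private void pop(int n) {
        for (int i = 0; i < n; i++) stack.remove(stack.size() - 1);
    }

    // Reduce "expr : expr ADD expr" the way case 4 does:
    // $1 is valPeek(2), $3 is valPeek(0), and $$ is yyval.
    void reduceAdd() {
        DemoVal yyval = new DemoVal(0);
        yyval.ival = valPeek(2).ival + valPeek(0).ival;
        pop(3);        // remove the three body symbols
        push(yyval);   // push the value of the rule head
    }

    int top() { return valPeek(0).ival; }

    public static void main(String[] args) {
        ValueStackDemo d = new ValueStackDemo();
        d.push(new DemoVal(2));  // expr
        d.push(new DemoVal(0));  // ADD (carries no attribute)
        d.push(new DemoVal(3));  // expr
        d.reduceAdd();
        System.out.println(d.top());  // prints 5
    }
}
```

The sketch pops the body symbols itself; in the generated parser the stack bookkeeping happens outside the user-supplied action, which only reads val_peek() and writes yyval.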
E. Declaration

    %token <ival> NUM1 NUM2 NUM3
    %token <dval> REAL1 REAL2 REAL3
    %token <sval> ID1 ID2 ID3
    %token ADD SUB MUL DIV SEMI LPAREN RPAREN

    %type <obj> exprs start
    %type <ival> expr

    %right ASSIGN
    %left ADD SUB
    %left MUL DIV
F. Translation Rules

A translation rule is written as follows:

    head : body 1    { semantic action 1 }
         | body 2    { semantic action 2 }
         ...
         | body n    { semantic action n }
         ;

Example) Given the grammar G:

    list   → list expr ; | ε
    expr   → expr + term | term
    term   → term * factor | factor
    factor → NUM | ( expr )

The translation rules can be written as follows:

    list   : list expr SEMI       { $1.add($2); $$ = $1; }
           |                      { $$ = new list(); }
           ;
    expr   : expr ADD term        { $$ = $1 + $3; }
           | term
           ;
    term   : term MUL factor      { $$ = $1 * $3; }
           | factor               { $$ = $1; }
           ;
    factor : NUM                  { $$ = str2int($1); }
           | LPAREN expr RPAREN   { $$ = $2; }
           ;
G. Supporting runtime routine part

The third part of a yacc file specifies member functions and member variables of the Parser class. To generate a Java parser using BYacc/J, the user must provide two methods in the yacc source:

    void yyerror(String msg)
    int yylex()

int yylex()

This method must return a token ID, a value less than 0 if there is an error, or 0 when it encounters the end of input. The following sample yylex() method uses a lexer generated by JFlex:

    private int yylex() {
        int yyl_return = -1;
        try {
            yylval = new ParserVal(0);
            yyl_return = lexer.yylex();
        } catch (IOException e) {
            System.err.println("IO error :" + e);
        }
        return yyl_return;
    }

The method yylex() returns the token type as an integer and stores the corresponding token attribute in yylval. The token attribute yylval, whose type is ParserVal, can hold an integer, double, string, or object value, as defined below:

    public class ParserVal {
        public int ival;
        public double dval;
        public String sval;
        public Object obj;
    }

void yyerror(String msg)

BYacc/J calls this method to report error messages. The following is a sample yyerror() method:

    public void yyerror(String error) {
        System.err.println("Error: " + error);
    }

You can also add a custom Parser constructor in this part.
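The yylex() contract described above (token ID, 0 at end of input, negative on error, attribute stored in yylval) can also be illustrated without JFlex. The following is a minimal hand-written sketch under stated assumptions: the classes TokVal and TinyLexer are hypothetical stand-ins, and the token IDs simply mirror the NUM and ADD constants shown in the example Parser.java.

```java
// Simplified stand-in for ParserVal; only the integer attribute is modeled.
class TokVal {
    int ival;
    TokVal(int ival) { this.ival = ival; }
}

// Hypothetical hand-written scanner that follows the yylex() contract.
class TinyLexer {
    // Assumed token IDs, mirroring the constants in the example Parser.java.
    static final int NUM = 257;
    static final int ADD = 258;

    private final String input;
    private int pos = 0;
    TokVal yylval;  // attribute of the most recently returned token

    TinyLexer(String input) { this.input = input; }

    // Returns a token ID, 0 at end of input, or -1 on an unrecognized character.
    int yylex() {
        // Skip whitespace between tokens.
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
        yylval = new TokVal(0);
        if (pos >= input.length()) return 0;  // end of input

        char c = input.charAt(pos);
        if (Character.isDigit(c)) {
            int start = pos;
            while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
            // Store the numeric value as the token attribute.
            yylval = new TokVal(Integer.parseInt(input.substring(start, pos)));
            return NUM;
        }
        pos++;  // consume the single character
        if (c == '+') return ADD;
        return -1;  // lexical error
    }
}
```

For the input "12 + 7", successive calls return NUM (with yylval.ival equal to 12), ADD, NUM (with yylval.ival equal to 7), and then 0. In the course setup the parser's own yylex() would delegate to a JFlex-generated Lexer instead of a class like this.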
H. Compile options of BYacc/J

BYacc/J supports several options for Java parsers. The following options are useful in this class.

    -J                          Generates a Java parser instead of a C/C++ parser.
    -Jclass=<classname>         Changes the name of the Java class.
    -Jpackage=<packagename>     Sets the package in which the parser resides.
    -Jextends=<extendname>      Sets the superclass of the parser class.
    -Jnorun                     Tells BYacc/J not to generate a run() method. You do not need a run() method in your assignment.
    -Jthrows=<exception_list>   Tells BYacc/J to declare the listed exceptions as thrown by the yyparse() method.

The following command line will be useful in this class:

    yacc.exe -Jthrows="Exception" -Jextends=ParserBase -Jnorun -J Parser.y
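As a rough picture of what the command line above asks for, the sketch below is a hand-written stand-in for the shape of the resulting class; it is not generated output. The contents of ParserBase, the class name ParserShape, and the divisor check are assumptions made only so that the example is self-contained.

```java
// Hypothetical superclass named by -Jextends=ParserBase; its members are assumed.
class ParserBase {
    protected int errorCount = 0;

    protected void reportError(String msg) {
        errorCount++;
        System.err.println("Error: " + msg);
    }
}

// Stand-in for the generated Parser class under the options shown above.
class ParserShape extends ParserBase {        // effect of -Jextends=ParserBase
    private final int divisor;

    ParserShape(int divisor) { this.divisor = divisor; }

    // Effect of -Jthrows="Exception": semantic actions may throw checked exceptions.
    int yyparse() throws Exception {
        if (divisor == 0) {
            reportError("division by zero");
            throw new Exception("division by zero");
        }
        return 0;  // 0 signals a successful parse
    }

    // Effect of -Jnorun: no run() method is declared here.
}
```

Declaring the exception on yyparse() lets an action signal a semantic error with a checked exception and leaves the handling to the caller of yyparse().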