Operator Precedence a+b*c b*c E + T T * P ( E )

Similar documents
Let s look at the CUP specification for CSX-lite. Recall its CFG is

Operator Precedence. Java CUP. E E + T T T * P P P id id id. Does a+b*c mean (a+b)*c or

Java CUP. Java CUP Specifications. User Code Additions. Package and Import Specifications

Properties of Regular Expressions and Finite Automata

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Fall

CS 536 Spring Programming Assignment 1 Identifier Cross-Reference Analysis

CS 541 Fall Programming Assignment 3 CSX_go Parser

Last Time. What do we want? When do we want it? An AST. Now!

Optimizing Finite Automata

CS 536 Spring Programming Assignment 3 CSX Parser. Due: Tuesday, March 24, 2015 Not accepted after Tuesday, March 31, 2015

CUP. Lecture 18 CUP User s Manual (online) Robb T. Koether. Hampden-Sydney College. Fri, Feb 27, 2015

Defining syntax using CFGs

For example, we might have productions such as:

The Structure of a Syntax-Directed Compiler

Action Table for CSX-Lite. LALR Parser Driver. Example of LALR(1) Parsing. GoTo Table for CSX-Lite

JavaCUP. There are also many parser generators written in Java

Fall Compiler Principles Lecture 4: Parsing part 3. Roman Manevich Ben-Gurion University of the Negev

The Structure of a Compiler

CSE 401 Midterm Exam 11/5/10

CSE 401 Midterm Exam Sample Solution 11/4/11

The Structure of a Syntax-Directed Compiler

Configuration Sets for CSX- Lite. Parser Action Table

Part III : Parsing. From Regular to Context-Free Grammars. Deriving a Parser from a Context-Free Grammar. Scanners and Parsers.

Simple LR (SLR) LR(0) Drawbacks LR(1) SLR Parse. LR(1) Start State and Reduce. LR(1) Items 10/3/2012

Fall Compiler Principles Lecture 5: Parsing part 4. Roman Manevich Ben-Gurion University

A clarification on terminology: Recognizer: accepts or rejects strings in a language. Parser: recognizes and generates parse trees (imminent topic)

Using JFlex. "Linux Gazette...making Linux just a little more fun!" by Christopher Lopes, student at Eastern Washington University April 26, 1999

Context-Free Grammars

CS 536 Midterm Exam Spring 2013

Lecture 12: Parser-Generating Tools

LL(1) Grammars. Example. Recursive Descent Parsers. S A a {b,d,a} A B D {b, d, a} B b { b } B λ {d, a} D d { d } D λ { a }

Table-Driven Top-Down Parsers

Assignment 1. Due 08/28/17

The Structure of a Syntax-Directed Compiler

Reading Assignment. Scanner. Read Chapter 3 of Crafting a Compiler.

Error Detection in LALR Parsers. LALR is More Powerful. { b + c = a; } Eof. Expr Expr + id Expr id we can first match an id:

COMP 181. Prelude. Prelude. Summary of parsing. A Hierarchy of Grammar Classes. More power? Syntax-directed translation. Analysis

Defining syntax using CFGs

Compilers. Bottom-up Parsing. (original slides by Sam

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

POLITECNICO DI TORINO. (01JEUHT) Formal Languages and Compilers. Laboratory N 3. Lab 3. Cup Advanced Use

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square)

Building A Recursive Descent Parser. Example: CSX-Lite. match terminals, and calling parsing procedures to match nonterminals.

Type Checking Binary Operators

When do We Run a Compiler?

CS241 Final Exam Review

Introduction to Compiler Design

How do LL(1) Parsers Build Syntax Trees?

CSX-lite Example. LL(1) Parse Tables. LL(1) Parser Driver. Example of LL(1) Parsing. An LL(1) parse table, T, is a twodimensional

CS 230 Programming Languages

A Bison Manual. You build a text file of the production (format in the next section); traditionally this file ends in.y, although bison doesn t care.

EDA180: Compiler Construc6on. Top- down parsing. Görel Hedin Revised: a

Type Checking Simple Variable Declarations

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 2

Compilers. History of Compilers. A compiler allows programmers to ignore the machine-dependent details of programming.

Syntax Analysis The Parser Generator (BYacc/J)

A Simple Syntax-Directed Translator

Character Classes. Our specification of the reserved word if, as shown earlier, is incomplete. We don t (yet) handle upper or mixedcase.

Last time. What are compilers? Phases of a compiler. Scanner. Parser. Semantic Routines. Optimizer. Code Generation. Sunday, August 29, 2010

Topic 5: Syntax Analysis III

Lecture Code Generation for WLM

Programming Languages (CS 550) Lecture 4 Summary Scanner and Parser Generators. Jeremy R. Johnson

Parser and syntax analyzer. Context-Free Grammar Definition. Scanning and parsing. How bottom-up parsing works: Shift/Reduce tecnique.

ASTs, Objective CAML, and Ocamlyacc

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications

. PLT: PROGRAMMING LANGUAGE TOOL JARED RUBIN

PLT 4115 LRM: JaTesté

Syntax/semantics. Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing

Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

TML Language Reference Manual

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1

Formats of Translated Programs

CS 315 Programming Languages Syntax. Parser. (Alternatively hand-built) (Alternatively hand-built)

CSE P 501 Compilers. Implementing ASTs (in Java) Hal Perkins Autumn /20/ Hal Perkins & UW CSE H-1

Classes, Structs and Records. Internal and External Field Access

A simple syntax-directed

Project 3 - Parsing. Misc Details. Parser.java Recursive Descent Parser... Using Lexer.java

ECE251 Midterm practice questions, Fall 2010

Lab 2 Tutorial (An Informative Guide)

Sticker. CS 241 Fall 2009 Final Exam. Draft of 6 December 01:43 hrs

Syntax and Parsing COMS W4115. Prof. Stephen A. Edwards Fall 2003 Columbia University Department of Computer Science

Context-free grammars (CFGs)

LECTURE 11. Semantic Analysis and Yacc

CS606- compiler instruction Solved MCQS From Midterm Papers

Abstract Syntax. Mooly Sagiv. html://

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Spring 2015

Building a Parser II. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

What do Compilers Produce?

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

More Examples. Lex/Flex/JLex

cmps104a 2002q4 Assignment 3 LALR(1) Parser page 1

Semantic Analysis. Lecture 9. February 7, 2018

CSE 401 Midterm Exam Sample Solution 2/11/15

Syntax Analysis, III Comp 412

JavaCC Parser. The Compilation Task. Automated? JavaCC Parser

CMSC 330: Organization of Programming Languages

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Fall

CSE 582 Autumn 2002 Exam Sample Solution

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Topic 3: Syntax Analysis I

Transcription:

Operator Precedence Most programming languages have operator precedence rules that state the order in which operators are applied (in the absence of explicit parentheses). Thus in C and Java and CSX, a+b*c means compute b*c, then add in a. These operators precedence rules can be incorporated directly into a CFG. Consider E E + T T T T * P P P id ( E ) 217

Does a+b*c mean (a+b)*c or a+(b*c)? The grammar tells us! Look at the derivation tree: E E + T T T * P P P id id id The other grouping can t be obtained unless explicit parentheses are used. (Why?) 218

Java CUP Java CUP is a parser-generation tool, similar to Yacc. CUP builds a Java parser for LALR(1) grammars from production rules and associated Java code fragments. When a particular production is recognized, its associated code fragment is executed (typically to build an AST). CUP generates a Java source file parser.java. It contains a class parser, with a method Symbol parse() The Symbol returned by the parser is associated with the grammar s start symbol and contains the AST for the whole source program. 219

The file sym.java is also built for use with a JLex-built scanner (so that both scanner and parser use the same token codes). If an unrecovered syntax error occurs, Exception() is thrown by the parser. CUP and Yacc accept exactly the same class of grammars all LL(1) grammars, plus many useful non- LL(1) grammars. CUP is called as java java_cup.main < file.cup 220

Java CUP Specifications Java CUP specifications are of the form: Package and import specifications User code additions Terminal and non-terminal declarations A context-free grammar, augmented with Java code fragments Package and Import Specifications You define a package name as: package name ; You add imports to be used as: import java_cup.runtime.*; 221

User Code Additions You may define Java code to be included within the generated parser: action code {: /*java code */ :} This code is placed within the generated action class (which holds user-specified production actions). parser code {: /*java code */ :} This code is placed within the generated parser class. init with{: /*java code */ :} This code is used to initialize the generated parser. scan with{: /*java code */ :} This code is used to tell the generated parser how to get tokens from the scanner. 222

Terminal and Non-terminal Declarations You define terminal symbols you will use as: terminal classname name 1, name 2,... classname is a class used by the scanner for tokens (CSXToken, CSXIdentifierToken, etc.) You define non-terminal symbols you will use as: non terminal classname name 1, name 2,... classname is the class for the AST node associated with the non-terminal (stmtnode, exprnode, etc.) 223

Production Rules Production rules are of the form name ::= name 1 name 2... action ; or name ::= name 1 name 2... action 1 name 3 name 4... action 2... ; Names are the names of terminals or non-terminals, as declared earlier. Actions are Java code fragments, of the form {: /*java code */ :} The Java object assocated with a symbol (a token or AST node) may be named by adding a :id suffix to a terminal or non-terminal in a rule. 224

RESULT names the left-hand side non-terminal. The Java classes of the symbols are defined in the terminal and non-terminal declaration sections. For example, prog ::= LBRACE:l stmts:s RBRACE {: RESULT = new csxlitenode(s, l.linenum,l.colnum); :} This corresponds to the production prog { stmts } The left brace is named l; the stmts non-terminal is called s. In the action code, a new CSXLiteNode is created and assigned to prog. It is constructed from the AST node associated with s. Its line and column numbers are those given to the left brace, l (by the scanner). 225

To tell CUP what non-terminal to use as the start symbol (prog in our example), we use the directive: start with prog; 226

Example Let s look at the CUP specification for CSX-lite. Recall its CFG is program { stmts } stmts stmt stmts λ stmt id = expr ; if ( expr ) stmt expr expr + id expr - id id 227

The corresponding CUP specification is: /*** This Is A Java CUP Specification For CSX-lite, a Small Subset of The CSX Language, Used In Cs536 ***/ /* Preliminaries to set up and use the scanner. */ import java_cup.runtime.*; parser code {: public void syntax_error (Symbol cur_token){ report_error( CSX syntax error at line + String.valueOf(((CSXToken) cur_token.value).linenum), null);} :}; init with {: :}; scan with {: return Scanner.next_token(); :}; 228

/* Terminals (tokens returned by the scanner). */ terminal CSXIdentifierToken IDENTIFIER; terminal CSXToken SEMI, LPAREN, RPAREN, ASG, LBRACE, RBRACE; terminal CSXToken PLUS, MINUS, rw_if; /* Non terminals */ non terminal csxlitenode prog; non terminal stmtsnode stmts; non terminal stmtnode stmt; non terminal exprnode exp; non terminal namenode ident; start with prog; prog::= LBRACE:l stmts:s RBRACE {: RESULT= new csxlitenode(s, l.linenum,l.colnum); :} ; stmts::= stmt:s1 stmts:s2 {: RESULT= new stmtsnode(s1,s2, s1.linenum,s1.colnum); :} 229

{: RESULT= stmtsnode.null; :} ; stmt::= ident:id ASG exp:e SEMI {: RESULT= new asgnode(id,e, id.linenum,id.colnum); :} rw_if:i LPAREN exp:e RPAREN stmt:s {: RESULT=new ifthennode(e,s, stmtnode.null, i.linenum,i.colnum); :} ; exp::= exp:leftval PLUS:op ident:rightval {: RESULT=new binaryopnode(leftval, sym.plus, rightval, op.linenum,op.colnum); :} exp:leftval MINUS:op ident:rightval {: RESULT=new binaryopnode(leftval, sym.minus,rightval, op.linenum,op.colnum); :} ident:i {: RESULT = i; :} ; 230

ident::= IDENTIFIER:i {: RESULT = new namenode( new identnode(i.identifiertext, i.linenum,i.colnum), exprnode.null, i.linenum,i.colnum); :} ; 231

Let s parse { a = b ; } First, a is parsed using ident::= IDENTIFIER:i {: RESULT = new namenode( new identnode(i.identifiertext, i.linenum,i.colnum), exprnode.null, i.linenum,i.colnum); :} We build namenode identnode a nullexprnode 232

Next, a is parsed using ident::= IDENTIFIER:i {: RESULT = new namenode( new identnode(i.identifiertext, i.linenum,i.colnum), exprnode.null, i.linenum,i.colnum); :} We build namenode identnode b nullexprnode 233

Then b s subtree is recognized as an exp: ident:i {: RESULT = i; :} Now the assignment statement is recognized: stmt::= ident:id ASG exp:e SEMI {: RESULT= new asgnode(id,e, id.linenum,id.colnum); :} We build asgnode namenode namenode identnode nullexprnode identnode nullexprnode a b 234

The stmts λ production is matched (indicating that there are no more statements in the program). CUP matches stmts::= {: RESULT= stmtsnode.null; :} and we build nullstmtsnode Next, stmts stmt stmts is matched using stmts::= stmt:s1 stmts:s2 {: RESULT= new stmtsnode(s1,s2, s1.linenum,s1.colnum); :} 235

This builds stmtsnode asgnode nullstmtsnode namenode namenode identnode nullexprnode identnode nullexprnode a b As the last step of the parse, the parser matches program { stmts } using the CUP rule prog::= LBRACE:l stmts:s RBRACE {: RESULT= new csxlitenode(s, l.linenum,l.colnum); :} ; 236

The final AST reurned by the parser is csxlitenode stmtsnode asgnode nullstmtsnode namenode namenode identnode nullexprnode identnode nullexprnode a b 237