JavaCC Parser

Transcription:

JavaCC Parser
CC&P 2003

The Compilation Task
    Input character stream -> Lexer -> Token stream -> Parser -> Abstract Syntax Tree -> Analyser -> Annotated AST -> Code Generator -> Code

Automated?
The initial stages of compilation are very mechanistic.
- Lexer: text to stream of tokens; pattern matching (example on the slide: apple)
- Parser: application of grammar rules; recovery from errors
- AST Generation: data structure for subsequent stages

JavaCC Parser
- JavaCC generates a parser class: a recursive descent parser.
- Each BNF production in the .jj file is translated into a method.
- Algorithm: if a prefix of the input token sequence matches the current nonterminal definition, remove that prefix from the input sequence; otherwise throw a ParseException.
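The Lexer stage (text to a stream of tokens by pattern matching) can be sketched in plain Java. This is only an illustration, not what JavaCC generates: the class name LexerSketch, the three token kinds and the "KIND:image" string encoding are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical hand-written lexer; JavaCC builds its own token manager from the .jj file.
final class LexerSketch {
    // One named group per token kind. "**" is listed before the single-character
    // operators so that it is matched as one token.
    private static final Pattern TOKEN = Pattern.compile(
        "\\s*(?:(?<NUMBER>\\d+)|(?<IDENT>[A-Za-z_]\\w*)|(?<OP>\\*\\*|[-+*/();,]))");

    // Turns the input text into a list of "KIND:image" strings.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        int pos = 0;
        while (pos < text.length()) {
            if (text.substring(pos).trim().isEmpty()) {
                break; // only trailing whitespace is left
            }
            m.region(pos, text.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("No token matches at offset " + pos);
            }
            String kind = m.group("NUMBER") != null ? "NUMBER"
                        : m.group("IDENT") != null ? "IDENT" : "OP";
            tokens.add(kind + ":" + m.group(kind));
            pos = m.end();
        }
        return tokens;
    }
}
```

Calling LexerSketch.tokenize("-2 + x") yields the four tokens OP:-, NUMBER:2, OP:+ and IDENT:x, which is the kind of stream the parser stage then consumes.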

An Example Grammar
    start ::= expr ";"
    expr  ::= (addop)? term (addop term)*
    addop ::= "+" | "-"

    void start() : {} { expr() ";" }
    void expr()  : {} { [ addop() ] term() ( addop() term() )* }
    void addop() : {} { "+" | "-" }

- Quoted literals such as ";", "+" and "-" are implicit token declarations.
- [ ... ] means 0 or 1 occurrences.

An Example Grammar
    term    ::= factor (multop factor)*
    multop  ::= "*" | "/"
    factor  ::= primary ( "**" primary )?
    primary ::= number
              | "(" expr ")"
              | ident ( "(" rvalue ( "," rvalue )* ")" )?

    void term()    : {} { factor() ( multop() factor() )* }
    void multop()  : {} { "*" | "/" }
    void factor()  : {} { primary() [ "**" primary() ] }
    void primary() : {}
    {
      number()
    | "(" expr() ")"
    | ident() [ "(" rvalue() ( "," rvalue() )* ")" ]
    }

- The {} after the colon is the Java declaration section.
- Quoted literals are again implicit token declarations.

An Example Grammar
    rvalue ::= expr | string
    number ::= <INTEGER_LITERAL> | <REAL_LITERAL> | <HEX_LITERAL>
    ident  ::= <IDENTIFIER>
    string ::= <STRING>

    void rvalue() : {} { expr() | string() }
    void number() : {} { <INTEGER_LITERAL> | <REAL_LITERAL> | <HEX_LITERAL> }
    void ident()  : {} { <IDENTIFIER> }
    void string() : {} { <STRING> }

JavaCC Grammar Productions
- The previous slides completely define the grammar of our small expression language.
- This grammar is LL(1).
- What do we do if the grammar is not LL(1)?
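To make the "one method per production" idea concrete, here is a hand-written recognizer that follows the productions above. It is a sketch under stated assumptions and not JavaCC output: the class name ExprRecognizer, the token kinds NUM, ID and STR, and the small regex tokenizer are invented for the example, and hex literals are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical hand-written analogue of the parser class JavaCC would generate.
final class ExprRecognizer {
    static final class ParseException extends Exception {
        ParseException(String msg) { super(msg); }
    }

    private static final Pattern TOKEN = Pattern.compile(
        "\\s*(?:(\\d+(?:\\.\\d+)?)|([A-Za-z_]\\w*)|(\"[^\"]*\")|(\\*\\*|[-+*/();,]))");

    private final List<String> toks = new ArrayList<>();
    private int pos = 0;

    // Tokenizes eagerly: numbers become NUM, identifiers ID, strings STR,
    // and operators are kept as their own text.
    private ExprRecognizer(String text) throws ParseException {
        Matcher m = TOKEN.matcher(text);
        int offset = 0;
        while (!text.substring(offset).trim().isEmpty()) {
            m.region(offset, text.length());
            if (!m.lookingAt()) throw new ParseException("Lexical error at offset " + offset);
            toks.add(m.group(1) != null ? "NUM" : m.group(2) != null ? "ID"
                   : m.group(3) != null ? "STR" : m.group(4));
            offset = m.end();
        }
        toks.add("EOF");
    }

    // Returns true if the whole input matches start ::= expr ";".
    static boolean accepts(String text) {
        try {
            ExprRecognizer p = new ExprRecognizer(text);
            p.start();
            p.expect("EOF");
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    private String peek() { return toks.get(pos); }

    private boolean at(String... kinds) {
        for (String k : kinds) if (peek().equals(k)) return true;
        return false;
    }

    // Removes the expected token from the input or throws a ParseException.
    private void expect(String kind) throws ParseException {
        if (!peek().equals(kind)) throw new ParseException("Expected " + kind + ", found " + peek());
        pos++;
    }

    private void start() throws ParseException { expr(); expect(";"); }

    private void expr() throws ParseException {
        if (at("+", "-")) pos++;                    // [ addop() ]
        term();
        while (at("+", "-")) { pos++; term(); }     // ( addop() term() )*
    }

    private void term() throws ParseException {
        factor();
        while (at("*", "/")) { pos++; factor(); }   // ( multop() factor() )*
    }

    private void factor() throws ParseException {
        primary();
        if (at("**")) { pos++; primary(); }         // [ "**" primary() ]
    }

    private void primary() throws ParseException {
        if (at("NUM")) {
            pos++;
        } else if (at("(")) {
            pos++; expr(); expect(")");
        } else if (at("ID")) {
            pos++;
            if (at("(")) {                          // optional argument list
                pos++; rvalue();
                while (at(",")) { pos++; rvalue(); }
                expect(")");
            }
        } else {
            throw new ParseException("Unexpected token " + peek());
        }
    }

    private void rvalue() throws ParseException {
        if (at("STR")) pos++; else expr();          // string() | expr()
    }
}
```

Every decision in this sketch is made by looking at a single upcoming token, which is what the LL(1) property of the grammar allows.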

What do we do if the grammar is not LL(1)?
1. Identify the first and follow sets of each nonterminal symbol.
2. Prove that the grammar is LL(1): for each pair of alternatives at a branch point, we need first(a1) ∩ first(a2) = ∅.
3. If the grammar is not LL(1), then transform it and go to (1).

Grammar is not LL(1)?
- If the grammar is not LL(1), JavaCC will warn you:
    Warning: Choice conflict... Consider using a lookahead of n for...

Choice conflict?
- Consider
    a ::= <ID> b | <ID> c

JavaCC Look-ahead mechanism
- For alternation the first choice is the default.
- Therefore in a ::= <ID> b | <ID> c, the second choice is unreachable.
- Add a LOOKAHEAD specification to the first alternative, e.g.
    void a() : {} { LOOKAHEAD(2) <ID> b() | <ID> c() }

JavaCC Look-ahead mechanism
- If, however, b and c start out the same and are only distinguishable by how they end, no statically determined limit on the length of the lookahead will do.
- E.g. b ::= (<NUMBER>)* ; c ::= (<NUMBER>)+ . (<NUMBER>)+
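The disjointness test in step 2 is easy to mechanise once the first sets are known. The sketch below is an illustration only: ChoiceConflictCheck is a made-up name, the first sets are supplied by hand rather than computed from a grammar, and a k-token lookahead sequence is encoded as a space-separated string such as "ID =".

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: reports which lookahead sequences are shared by
// two or more alternatives at one branch point.
final class ChoiceConflictCheck {
    // firstSets holds one set per alternative. An empty result means the
    // alternatives are pairwise disjoint, so the choice is deterministic.
    static Set<String> conflicts(List<Set<String>> firstSets) {
        Set<String> clash = new HashSet<>();
        for (int i = 0; i < firstSets.size(); i++) {
            for (int j = i + 1; j < firstSets.size(); j++) {
                Set<String> common = new HashSet<>(firstSets.get(i));
                common.retainAll(firstSets.get(j)); // intersection of the pair
                clash.addAll(common);
            }
        }
        return clash;
    }
}
```

For a ::= <ID> b | <ID> c, both alternatives have the one-token first set {"ID"}, so the check reports "ID" as a conflict. If we assume, purely for illustration, that b starts with "=" and c starts with "(", the two-token sets {"ID ="} and {"ID ("} are disjoint, which is why LOOKAHEAD(2) would resolve the choice in that case.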

JavaCC Look-ahead mechanism
- Use "syntactic lookahead": the parser looks ahead to see whether a particular syntactic pattern is matched before committing to a choice.
    void a() : {}
    {
      // Take the first alternative if an <ID> followed by a b() appears next
      LOOKAHEAD( <ID> b() ) <ID> b()
    | <ID> c()
    }

JavaCC Look-ahead mechanism
- However, the sequence <ID> b() may be parsed twice: once for lookahead and once for regular parsing.
- Another way to resolve conflicts is to rewrite the grammar. The above nonterminal can be rewritten as
    a ::= <ID> (b | c)
- So do NOT use LOOKAHEAD indiscriminately.

JavaCC Semantic Lookahead
Two types of look-ahead mechanisms:
- Syntactic: a particular token is looked ahead in the input stream.
- Semantic: any arbitrary Boolean expression can be specified as a lookahead parameter.

JavaCC Semantic Lookahead
- Example: A ::= "a" B "c" and B ::= "b" ("c")?
- Valid strings: abc and abcc
- The optional "c" inside B may only be taken when another "c" is left over for A:
    void B() : {}
    {
      "b"
      [ LOOKAHEAD( { getToken(1).kind == C && getToken(2).kind == C } ) <C:"c"> ]
    }
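The semantic lookahead example can be traced by hand with a small Java recognizer for A ::= "a" B "c" and B ::= "b" ("c")?. This is a sketch, not generated code: SemanticLookaheadSketch and its token(k) helper are invented names, with token(k) standing in for getToken(k).kind. The guard takes the optional "c" only when two "c" tokens follow, because that is the only case in which one "c" still remains for A.

```java
// Hypothetical hand-written recognizer; each character is treated as one token.
final class SemanticLookaheadSketch {
    private final String input;
    private int pos = 0;

    private SemanticLookaheadSketch(String input) { this.input = input; }

    // The k-th upcoming token (1-based), or '\0' once the input is exhausted.
    private char token(int k) {
        int i = pos + k - 1;
        return i < input.length() ? input.charAt(i) : '\0';
    }

    private void expect(char c) {
        if (token(1) != c) {
            throw new IllegalStateException("Expected '" + c + "' at offset " + pos);
        }
        pos++;
    }

    // A ::= "a" B "c"
    private void parseA() { expect('a'); parseB(); expect('c'); }

    // B ::= "b" [ "c" ], guarded by an arbitrary Boolean condition on the
    // next two tokens, which is the role a semantic LOOKAHEAD plays.
    private void parseB() {
        expect('b');
        if (token(1) == 'c' && token(2) == 'c') {
            expect('c');
        }
    }

    // True only if the whole input is derived from A.
    static boolean accepts(String s) {
        SemanticLookaheadSketch p = new SemanticLookaheadSketch(s);
        try {
            p.parseA();
            return p.pos == s.length();
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

Without the guard, the default behaviour of always taking the optional "c" would consume the only "c" in abc and leave nothing for A, so abc would be rejected.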

JavaCC Lookahead Summary
- Exploration of tokens further ahead in the input stream.
- Backtracking is unacceptable due to the performance hit.
- By default JavaCC has 1 token look-ahead.
- Can specify any number for look-ahead, globally and locally.

How does JavaCC differ from standard LL(1) parsing?
- JavaCC is more flexible. It lets you use:
    - multiple token lookahead
    - syntactic lookahead
    - semantic lookahead
- Without lookahead it is only subtly different from LL(1) parsing.

Finally
- Since (roughly) LL(1) ⊂ LALR(1) ⊂ LR(1), wouldn't a tool based on LALR or LR parsing be better? True, but a hard problem.
- JavaCC is based on LL(1) parsing, but it allows you to use grammars that are not LL(1).
- JavaCC can handle any grammar that is not left-recursive.
- If necessary, use the ambiguous set of grammar rules, and use other mechanisms to resolve the ambiguity.

Building and Executing
    C:\Rob\jj>javacc Expression.jj
    Java Compiler Compiler Version 3.0 (Parser Generator)
    (type "javacc" with no arguments for help)
    Reading from file Expression.jj...
    Parser generated successfully.
    C:\Rob\jj>javac *.java
    C:\Rob\jj>java Expression
    Reading from standard input...
    -2+p("fred",3/4);
    Thank you.

Questions
- Write a grammar for the Lexer portion of JavaCC.
- What happens if the input language is incorrect?