Compilation 2010 SableCC

Similar documents
Compilation 2012 ocamllex and menhir

Compilation 2013 Parser Generators, Conflict Management, and ML-Yacc

Compilation 2012 Context-Free Languages Parsers and Scanners. Jan Midtgaard Michael I. Schwartzbach Aarhus University

Parsing. COMP 520: Compiler Design (4 credits) Professor Laurie Hendren.

A Bison Manual. You build a text file of the production (format in the next section); traditionally this file ends in.y, although bison doesn t care.

Action Table for CSX-Lite. LALR Parser Driver. Example of LALR(1) Parsing. GoTo Table for CSX-Lite

CS 411 Midterm Feb 2008

Configuration Sets for CSX- Lite. Parser Action Table

Error Detection in LALR Parsers. LALR is More Powerful. { b + c = a; } Eof. Expr Expr + id Expr id we can first match an id:

Principle of Compilers Lecture IV Part 4: Syntactic Analysis. Alessandro Artale

Simple LR (SLR) LR(0) Drawbacks LR(1) SLR Parse. LR(1) Start State and Reduce. LR(1) Items 10/3/2012

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

In One Slide. Outline. LR Parsing. Table Construction

EDA180: Compiler Construc6on. Top- down parsing. Görel Hedin Revised: a

More Bottom-Up Parsing

UNIVERSITY OF CALIFORNIA

Part III : Parsing. From Regular to Context-Free Grammars. Deriving a Parser from a Context-Free Grammar. Scanners and Parsers.

Conflicts in LR Parsing and More LR Parsing Types

Lex and Yacc. More Details

S Y N T A X A N A L Y S I S LR

LR Parsing LALR Parser Generators

Topic 5: Syntax Analysis III

Formal Languages and Compilers Lecture VII Part 4: Syntactic A

Bottom-Up Parsing. Lecture 11-12

Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery. Last modified: Mon Feb 23 10:05: CS164: Lecture #14 1

Parser and syntax analyzer. Context-Free Grammar Definition. Scanning and parsing. How bottom-up parsing works: Shift/Reduce tecnique.

Lecture 7: Deterministic Bottom-Up Parsing

Bottom-Up Parsing. Lecture 11-12

LR Parsing LALR Parser Generators

Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals.

Lecture 12: Parser-Generating Tools

Chapter 3: Lexing and Parsing

3. Parsing. 3.1 Context-Free Grammars and Push-Down Automata 3.2 Recursive Descent Parsing 3.3 LL(1) Property 3.4 Error Handling

CS 11 Ocaml track: lecture 6

LR Parsing. Table Construction

Compilers. Bottom-up Parsing. (original slides by Sam

CSE 401 Midterm Exam Sample Solution 11/4/11

G53CMP: Lecture 4. Syntactic Analysis: Parser Generators. Henrik Nilsson. University of Nottingham, UK. G53CMP: Lecture 4 p.1/32

Lecture Notes on Bottom-Up LR Parsing

Compiler Construction

Compiler Construction

Note that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2).

shift-reduce parsing

Lecture Notes on Bottom-Up LR Parsing

Lecture 8: Deterministic Bottom-Up Parsing

Building Compilers with Phoenix

CS453 : JavaCUP and error recovery. CS453 Shift-reduce Parsing 1

HW8 Use Lex/Yacc to Turn this: Into this: Lex and Yacc. Lex / Yacc History. A Quick Tour. if myvar == 6.02e23**2 then f(..!

LR Parsing - Conflicts

Parser Combinators 11/3/2003 IPT, ICS 1

Syntax and Parsing COMS W4115. Prof. Stephen A. Edwards Fall 2003 Columbia University Department of Computer Science

Compilers and Language Processing Tools

Lexical and Syntax Analysis. Bottom-Up Parsing

JavaCC: SimpleExamples

CPS 506 Comparative Programming Languages. Syntax Specification

COMP3131/9102: Programming Languages and Compilers

JavaCC Parser. The Compilation Task. Automated? JavaCC Parser

Syntax-Directed Translation

COMPILER (CSE 4120) (Lecture 6: Parsing 4 Bottom-up Parsing )

Parsing Techniques. CS152. Chris Pollett. Sep. 24, 2008.

A simple implementation of grammar libraries

Syntax/semantics. Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing

MP 3 A Lexer for MiniJava

Abstract Syntax Trees (1)

Parser Tools: lex and yacc-style Parsing

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

LALR stands for look ahead left right. It is a technique for deciding when reductions have to be made in shift/reduce parsing. Often, it can make the

Lexical Analysis. Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast!

MODULE 14 SLR PARSER LR(0) ITEMS

Parser Tools: lex and yacc-style Parsing

COMP 520 Fall 2012 Scanners and Parsers (1) Scanners and parsers

Fall Compiler Principles Lecture 4: Parsing part 3. Roman Manevich Ben-Gurion University of the Negev

A clarification on terminology: Recognizer: accepts or rejects strings in a language. Parser: recognizes and generates parse trees (imminent topic)

CS 132 Compiler Construction, Fall 2012 Instructor: Jens Palsberg Multiple Choice Exam, Nov 6, 2012

CS Parsing 1

Part II : Lexical Analysis

Parsing - 1. What is parsing? Shift-reduce parsing. Operator precedence parsing. Shift-reduce conflict Reduce-reduce conflict

Lex and Yacc. A Quick Tour

Introduction to Yacc. General Description Input file Output files Parsing conflicts Pseudovariables Examples. Principles of Compilers - 16/03/2006

Using an LALR(1) Parser Generator

Compiler Construction

Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

CSE 3302 Programming Languages Lecture 2: Syntax

CSE 401 Midterm Exam Sample Solution 2/11/15

Fall Compiler Principles Lecture 5: Parsing part 4. Roman Manevich Ben-Gurion University

Syntactic Analysis. Syntactic analysis, or parsing, is the second phase of compilation: The token file is converted to an abstract syntax tree.

Programs as data Parsing cont d; first-order functional language, type checking

Compiler Passes. Syntactic Analysis. Context-free Grammars. Syntactic Analysis / Parsing. EBNF Syntax of initial MiniJava.

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

Syntax Analysis Part IV

UNIT III & IV. Bottom up parsing

How do LL(1) Parsers Build Syntax Trees?

Extra Credit Question

CUP. Lecture 18 CUP User s Manual (online) Robb T. Koether. Hampden-Sydney College. Fri, Feb 27, 2015

Building Compilers with Phoenix

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square)

CSE 582 Autumn 2002 Exam Sample Solution

CSE302: Compiler Design

ECE251 Midterm practice questions, Fall 2010

LALR Parsing. What Yacc and most compilers employ.

Transcription:

Compilation 2010 Jan Midtgaard Michael I. Schwartzbach Aarhus University

The Tool is a (lexer and) parser generator The input is: a sequence of token definitions a context-free grammar The output is: a LALR(1) parser for the defined language available as a Java class 2

Our Favorite Grammar in Helpers tab = 9; cr = 13; lf = 10; Tokens eol = cr lf cr lf; blank = ' ' tab; star = '*'; slash = '/'; plus = '+'; minus = '-'; lpar = '('; rpar = ')'; id = 'x' 'y' 'z'; Productions start = {plus} start plus term {minus} start minus term {term} term; term = {mult} term star factor {div} term slash factor {factor} factor; factor = {id} id {paren} lpar start rpar; Ignored Tokens blank,eol; 3

Generated Classes drwxr-xr-x 2 mis users 4096 Sep 7 09:28 analysis/ drwxr-xr-x 2 mis users 4096 Sep 7 09:28 lexer/ drwxr-xr-x 2 mis users 4096 Sep 7 09:32 node/ drwxr-xr-x 2 mis users 4096 Sep 7 09:28 parser/ -rw-r--r-- 1 mis users 536 Sep 7 09:32 xyz.sablecc We never need to look at this output 4

The Main Application import parser.*; import lexer.*; import node.*; import java.io.*; class Main { } public static void main(string args[]) { } try { Parser p = new Parser ( new Lexer ( new PushbackReader(new InputStreamReader(System.in)))); Start tree = p.parse(); /* parse the input */ } catch(exception e) { System.out.println(e); } 5

An Ambiguous Grammar X a X a a X Any string in this language has exponentially many different parse trees a a... a has exactly Fib(n) parse trees n 6

The Version Tokens a = 'a'; Productions x = {empty} {one} a x {two} [first]:a [second]:a x; Note that all symbols must have unique names The default name for foo is [foo]: 7

is Unhappy reduce/reduce conflict in state [stack: TA TA PX *] on EOF in { [ PX = TA PX * ] followed by EOF (reduce), [ PX = TA TA PX * ] followed by EOF (reduce) } The LALR(1) table contains conflicting actions 8

Solution: Less Stupid Grammar Tokens a = 'a'; Productions x = {empty} {one} a x ; 9

A Grammar for If-Statements Tokens eol = cr lf cr lf; blank = ' ' tab; exp = 'exp'; if = 'if'; then = 'then'; else = 'else'; assign = 'assign'; Ignored Tokens blank,eol; Productions stm = {one} if exp then stm {both} if exp then [thenbranch]:stm else [elsebranch]:stm {assign} assign; 10

is Unhappy shift/reduce conflict in state [stack: TIf TExp TThen PStm *] on TElse in { [ PStm = TIf TExp TThen PStm * TElse PStm ] (shift), [ PStm = TIf TExp TThen PStm * ] followed by TElse (reduce) } But the grammar does not appear to be stupid... 11

Solution: Less Natural Grammar Productions stm = {one} if exp then stm {both} if exp then [thenbranch]:stm2 else [elsebranch]:stm {assign} assign; stm2 = {both} if exp then [thenbranch]:stm2 else [elsebranch]:stm2 {assign} assign; 12

Dangling Else Problem An example statement: if exp then if exp then assign else assign To which if does the else belong? The first grammar is ambiguous Our modified grammar parses the string as: if exp then ( if exp then assign else assign) 13

The Palindrome Grammar Tokens zero = '0'; one = '1'; Productions pal = {empty} {one} one {zero} zero {oneone} [first]:one pal [second]:one {zerozero} [first]:zero pal [second]:zero; 14

is Unhappy shift/reduce conflict in state [stack: TZero *] on TZero in { [ PPal = * TZero PPal TZero ] (shift), [ PPal = * TZero ] (shift), } [ PPal = * ] followed by TZero (reduce), [ PPal = TZero * ] followed by TZero (reduce) shift/reduce conflict in state [stack: TZero *] on TZero in { [ PPal = * TZero PPal TZero ] (shift), } [ PPal = * TZero ] (shift), [ PPal = * ] followed by TZero (reduce), [ PPal = TZero * ] followed by TZero (reduce) shift/reduce conflict in state [stack: TZero *] on TOne in { [ PPal = * TOne PPal TOne ] (shift), [ PPal = * TOne ] (shift), [ PPal = TZero * ] followed by TOne (reduce) } 15

No Solution! There is no LALR(1) grammar for this language Some grammars are not LALR(1) And some languages are not LALR(1) Some grammars are ambiguous And some languages are ambiguous 16

Language Containments { a i b j c k i=j or j=k } Context-Free Unambiguous LALR(1) palindromes 17

EBNF Features allows right-hand side abbreviations: Optional: x = y? List: x = y* Non-empty list: x = y+ This has many benefits: shorter less error-prone fewer names must be invented 18

EBNF Example block = lbrace decl* stm+ rbrace ; decl = type id init? semicolon ; init = equals exp; 19

EBNF Expansion x = y? x = {some} y {none} ; x = y* x = {zero} {more} y x ; x = y+ x = {one} y {more} y x ; 20