Compilation 2013 Parser Generators, Conflict Management, and ML-Yacc


Compilation 2013 Parser Generators, Conflict Management, and ML-Yacc Erik Ernst Aarhus University

Parser generators, ML-Yacc
- LR parsers are tedious to write, but can be generated, e.g., by ML-Yacc
- Input: a parser specification
    - Grammar rules (context-free, typically LALR(1))
    - Directives (manage types, conflicts, ...)
    - User code, supporting semantic actions
- Output: source code of a parser, ready to use
- Hurdle: conflict management!

Example Grammar
Grammar:
    P -> L
    L -> S
    L -> L ; S
    S -> id := id
    S -> while id do S
    S -> begin L end
    S -> if id then S
    S -> if id then S else S
Expressing the same thing for ML-Yacc
- Rule format: stm : WHILE ID DO stm ()
- Sections:
    user declarations
    %%
    parser declarations
    %%
    grammar rules

Example Grammar Spec
ML-Yacc specification:

    %%
    %term ID | WHILE | BEGIN | END | DO | IF | THEN | ELSE | SEMI | ASSIGN | EOF
    %nonterm prog | stm | stmlist
    %pos int
    %verbose
    %start prog
    %eop EOF
    %noshift EOF
    %%
    prog    : stmlist ()
    stm     : ID ASSIGN ID ()
            | WHILE ID DO stm ()
            | BEGIN stmlist END ()
            | IF ID THEN stm ()
            | IF ID THEN stm ELSE stm ()
    stmlist : stm ()
            | stmlist SEMI stm ()

Example Grammar Spec
%verbose ensures feedback, especially on conflicts:

    1 shift/reduce conflict

    error: state 17: shift/reduce conflict (shift ELSE, reduce by rule 4)

    state 0:
        ...
        prog : . stmlist

        ID       shift 6
        WHILE    shift 5
        BEGIN    shift 4
        IF       shift 3
        prog     goto 21
        stm      goto 2
        stmlist  goto 1
        .        error

    state 17:
        ...
        stm : IF ID THEN stm .            (reduce by rule 4)
        stm : IF ID THEN stm . ELSE stm

        ELSE     shift 19
        .        reduce by rule 4

    state 21:
        EOF      accept
        .        error

    15 of 57 action table entries left after compaction
    9 goto table entries

Revisit Ambiguous Grammar
Grammar:
    E -> id
    E -> num
    E -> E + E
    E -> E - E
    E -> E * E
    E -> E / E
    E -> ( E )
- Much worse: 16 shift/reduce conflicts!
- Last time: rewritten (E, T, F)
- Can be managed using conflict resolution hints
- Default: prefer shift for shift/reduce conflicts, first rule for reduce/reduce conflicts

Wishes for Revisited Grammar
Grammar:
    E -> id
    E -> num
    E -> E + E
    E -> E - E
    E -> E * E
    E -> E / E
    E -> ( E )
- Multiplication and division bind stronger than addition and subtraction (precedence)
- Operators are left associative
- May also specify no associativity

Shift-reduce and Precedence
A partial LR(1) state (item, then lookahead):
    E -> E * E .        +
    E -> E . + E        (any)
- If we prefer shift: the stack E * E becomes E * E + and later E * E + E, which must reduce the addition first
- Must prefer reduce to get the correct precedence: E * E becomes E and later E + E, which has reduced the multiplication first

Shift-reduce and Associativity
A partial LR(1) state (item, then lookahead):
    E -> E + E .        +
    E -> E . + E        (any)
- If we prefer shift: the stack E + E becomes E + E + and later E + E + E, reducing the rightmost addition first
- Must prefer reduce to get left associativity: E + E becomes E and later E + E, which has reduced the leftmost addition first
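
The effect of the two choices can be made concrete with a toy shift-reduce loop. This is only a sketch under simplifying assumptions: the input is a non-empty list alternating operands and binary operators, there is a single rule E -> E op E, and the names parse, always_shift and always_reduce are made up for this illustration. The decide callback is consulted exactly in the conflict situation from the slides, with E op E on the stack and another operator as lookahead.

```python
def parse(tokens, decide):
    """Toy shift-reduce parser for 'operand (operator operand)*'.

    decide(op_on_stack, lookahead_op) returns "shift" or "reduce" and is
    asked only when E op E is on top of the stack and an operator follows.
    Returns a tree of nested tuples (op, left, right).
    """
    stack, pos = [], 0
    while True:
        lookahead = tokens[pos] if pos < len(tokens) else None
        # The stack alternates E, op, E, ...; an odd length >= 3 means
        # that a complete handle E op E sits on top.
        handle_on_top = len(stack) >= 3 and len(stack) % 2 == 1
        if handle_on_top and (lookahead is None
                              or decide(stack[-2], lookahead) == "reduce"):
            right, op, left = stack.pop(), stack.pop(), stack.pop()
            stack.append((op, left, right))   # reduce by E -> E op E
        elif lookahead is None:
            return stack[0]                   # input consumed, one E left
        else:
            stack.append(lookahead)           # shift
            pos += 1


def always_shift(op_on_stack, lookahead_op):
    return "shift"


def always_reduce(op_on_stack, lookahead_op):
    return "reduce"


# 2 * 3 + 4: shifting groups the addition first, reducing the multiplication.
print(parse([2, "*", 3, "+", 4], always_shift))
print(parse([2, "*", 3, "+", 4], always_reduce))
# 8 - 3 - 2: reducing gives the left-associative reading (8 - 3) - 2.
print(parse([8, "-", 3, "-", 2], always_reduce))
```

Neither fixed policy is right for every operator pair, which is why the choice is made per table entry from the declared precedence and associativity.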

Non-Associativity
A partial LR(1) state (item, then lookahead):
    E -> E - E .        -
    E -> E . - E        (any)
- Make that position in the parsing table an error
- Point: 2 - 3 - 4 is inherently error-prone, just outlaw it!
- Similarly for i == j == k

Precedence & Associativity Control
Related specification declarations:

    %%
    %term INT | PLUS | MINUS | TIMES | UMINUS | EXP | EQ | NEQ | EOF
    %nonterm exp
    %start exp
    %eop EOF
    %nonassoc EQ NEQ
    %left PLUS MINUS
    %left TIMES
    %right EXP
    %left UMINUS        (* never returned from lexer! *)
    %%
    exp : ...
        | MINUS exp %prec UMINUS ()

- Later declarations bind tighter, so UMINUS has the highest precedence here
- Use %prec to force the precedence of a rule
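
How such declarations settle a shift/reduce conflict can be sketched as a small lookup, following the usual Yacc-style convention: a rule takes the precedence of its rightmost terminal unless %prec overrides it, the higher precedence wins, and on a tie the associativity decides. The sketch below assumes that terminals are written in upper case, as in the specification above; the names DECLS, PREC, rule_prec and resolve are hypothetical and not part of ML-Yacc.

```python
# Precedence declarations, lowest level first, copied from the slide above.
DECLS = [
    ("nonassoc", ["EQ", "NEQ"]),
    ("left", ["PLUS", "MINUS"]),
    ("left", ["TIMES"]),
    ("right", ["EXP"]),
    ("left", ["UMINUS"]),
]

# terminal -> (level, associativity); a larger level binds tighter
PREC = {tok: (level, assoc)
        for level, (assoc, toks) in enumerate(DECLS)
        for tok in toks}


def rule_prec(rhs, prec_override=None):
    """Precedence of a rule: the %prec token if given, else the rightmost terminal."""
    if prec_override is not None:
        return PREC[prec_override]
    terminals = [sym for sym in rhs if sym.isupper()]  # assumption: upper case = terminal
    return PREC.get(terminals[-1]) if terminals else None


def resolve(rhs, lookahead, prec_override=None):
    """Decide a shift/reduce conflict between reducing by rhs and shifting lookahead."""
    rp, tp = rule_prec(rhs, prec_override), PREC.get(lookahead)
    if rp is None or tp is None:
        return "shift"                      # no hint available: default is shift
    if rp[0] != tp[0]:
        return "reduce" if rp[0] > tp[0] else "shift"
    # Same level: associativity decides.
    return {"left": "reduce", "right": "shift", "nonassoc": "error"}[tp[1]]


print(resolve(["exp", "TIMES", "exp"], "PLUS"))       # precedence case
print(resolve(["exp", "PLUS", "exp"], "PLUS"))        # left associativity
print(resolve(["exp", "EQ", "exp"], "NEQ"))           # non-associativity
print(resolve(["MINUS", "exp"], "TIMES", "UMINUS"))   # %prec UMINUS
```

Without the %prec override, the last call would compare MINUS against TIMES and choose shift, so -a * b would be read as -(a * b).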

Error Handling
- Starting point: do not just report one error and stop; keep running and report many
- Local strategy: correct at the location that made the parser engine enter the error state
- Global strategy: perform the minimal change that makes the program syntactically correct
- Actions: change parser stack, change input, change inverse lookahead

Error Handling Using error Symbol
- A local strategy, known from YACC
- Idea: grammar rules contain error tokens, e.g.
    E -> id
    E -> E + E
    E -> ( E )
    E -> ( error )
    E -> error ; E
    L -> E
    L -> L ; E
- Generator: treats error as a terminal and generates the usual shift actions
- Engine: on an error, pop the stack until the action for error is shift; shift it; discard input until a non-error action is found; resume parsing
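
The engine steps above can be sketched as a single function. This is a minimal illustration, not the Yacc or ML-Yacc implementation: the table layout (a dict from (state, terminal) to an action tuple, with missing entries meaning error) and the state numbers in the example are assumptions made up for this sketch.

```python
def recover(stack, tokens, pos, action):
    """Error-token recovery on a stack of LR states.

    stack  : list of state numbers, top of stack last (modified in place)
    tokens : list of terminal names
    pos    : index of the token on which the parser entered the error state
    action : dict (state, terminal) -> ("shift", new_state) or another
             action tuple; a missing entry is an error action (assumed layout)
    Returns the input position at which parsing can resume, or None.
    """
    # 1. Pop states until the top state can shift the error token.
    while stack and action.get((stack[-1], "error"), ("error",))[0] != "shift":
        stack.pop()
    if not stack:
        return None                      # no state can handle error
    # 2. Shift the error token.
    stack.append(action[(stack[-1], "error")][1])
    # 3. Discard input until the new top state has a non-error action.
    while pos < len(tokens) and (stack[-1], tokens[pos]) not in action:
        pos += 1
    return pos if pos < len(tokens) else None


# Hypothetical fragment of a table for E -> ( error ):
# state 1 is "after (", state 7 is "after ( error", state 8 is "after ( error )".
action = {(1, "error"): ("shift", 7), (7, "RPAREN"): ("shift", 8)}
stack = [0, 1, 4, 5]                     # error detected somewhere inside ( ... )
pos = recover(stack, ["PLUS", "PLUS", "RPAREN", "EOF"], 0, action)
print(stack, pos)
```

In the example the two inner states are popped, error is shifted from state 1, and the two PLUS tokens are skipped, so parsing resumes at RPAREN.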

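Burke-Fisher Error Handling
- A limited global strategy
- Choose K, create a K-place token buffer
- Maintain two parse stacks, K tokens apart
- Semantic actions run only on the delayed stack (they may have side effects!)
- On error: try all possible deletions, insertions, and changes of a single token in the buffer
- Complexity not bad: with N token kinds there are (1 + 2N)K candidate edits (K deletions, K*N insertions, K*N substitutions)
- Advantage: grammar and table unchanged, only the engine algorithm changes
- Need default values for inserted tokens that carry a value (ID, INT)
- May use %change directives for edits involving more than one token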
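
The candidate edits and their count can be enumerated directly. The sketch below assumes the buffer is a plain list of token names; single_token_edits and best_repair are hypothetical names. The progress callback stands in for re-running the parser from the delayed stack, and choosing the edit that lets the parser get farthest is an assumption of this sketch, since the slide only says that all edits are tried.

```python
def single_token_edits(window, vocabulary):
    """Yield (kind, position, new_window) for every single-token edit.

    For K = len(window) and N = len(vocabulary) this yields
    K deletions + K*N insertions + K*N substitutions = (1 + 2N) * K edits.
    """
    for i in range(len(window)):
        yield ("delete", i, window[:i] + window[i + 1:])
        for tok in vocabulary:
            yield ("insert", i, window[:i] + [tok] + window[i:])
            yield ("substitute", i, window[:i] + [tok] + window[i + 1:])


def best_repair(window, vocabulary, progress):
    """Pick the edit whose repaired window scores highest under progress.

    progress(tokens) -> int is assumed to report how many tokens the
    parser could consume when restarted on the repaired window.
    """
    return max(single_token_edits(window, vocabulary),
               key=lambda edit: progress(edit[2]))


def alternating(tokens):
    """Toy stand-in for a parser: counts the prefix matching ID PLUS ID PLUS ..."""
    n = 0
    for i, tok in enumerate(tokens):
        if tok != ("ID" if i % 2 == 0 else "PLUS"):
            break
        n += 1
    return n


window = ["ID", "PLUS", "PLUS", "ID"]
vocab = ["ID", "PLUS"]
print(len(list(single_token_edits(window, vocab))))   # (1 + 2*2) * 4 edits
print(best_repair(window, vocab, alternating))
```

Because only the buffer is edited, the grammar and the parse table stay untouched, which matches the advantage noted on the slide.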

Summary
- LR parsing is nice, but implementation is tedious
- Use generated code! E.g., ML-Yacc
- Specification file (3 sections, separated by %%)
- Conflicts reported: shift/reduce, reduce/reduce
- Control the parser table: precedence, associativity
- Error handling: local, global
    - Using the error token
    - Using the Burke-Fisher "try all edits" approach