CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter /15/ Hal Perkins & UW CSE C-1

Similar documents
CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Spring UW CSE P 501 Spring 2018 C-1

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter UW CSE P 501 Winter 2016 C-1

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1

CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1

CSE 413 Programming Languages & Implementation. Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions

Defining syntax using CFGs

Syntax Analysis Part I

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CSE 413 Programming Languages & Implementation. Hal Perkins Winter 2019 Grammars, Scanners & Regular Expressions

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications

CSE 401 Compilers. LR Parsing Hal Perkins Autumn /10/ Hal Perkins & UW CSE D-1

Syntax Analysis Check syntax and construct abstract syntax tree

Introduction to Parsing. Lecture 5

EECS 6083 Intro to Parsing Context Free Grammars

Introduction to Parsing. Lecture 8

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

Introduction to Parsing

Parsing II Top-down parsing. Comp 412

Outline. Limitations of regular languages. Introduction to Parsing. Parser overview. Context-free grammars (CFG s)

Parsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1)

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Formal Languages and Grammars. Chapter 2: Sections 2.1 and 2.2

Comp 411 Principles of Programming Languages Lecture 3 Parsing. Corky Cartwright January 11, 2019

Introduction to Parsing Ambiguity and Syntax Errors

Wednesday, September 9, 15. Parsers

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Introduction to Parsing Ambiguity and Syntax Errors

CMSC 330: Organization of Programming Languages

Languages, Automata, Regular Expressions & Scanners. Winter /8/ Hal Perkins & UW CSE B-1

Introduction to Parsing. Lecture 5

Part 5 Program Analysis Principles and Techniques

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield

CS 314 Principles of Programming Languages

Outline. Parser overview Context-free grammars (CFG s) Derivations Syntax-Directed Translation

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

CMSC 330: Organization of Programming Languages. Context Free Grammars

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών

Formal Languages and Compilers Lecture V: Parse Trees and Ambiguous Gr

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:

CMSC 330: Organization of Programming Languages

COP 3402 Systems Software Syntax Analysis (Parser)

LR Parsing Techniques

Lecture 8: Deterministic Bottom-Up Parsing

Monday, September 13, Parsers

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

Architecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End

CMSC 330: Organization of Programming Languages. Context Free Grammars

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax

Derivations of a CFG. MACM 300 Formal Languages and Automata. Context-free Grammars. Derivations and parse trees

Context-free grammars (CFG s)

Lecture 7: Deterministic Bottom-Up Parsing

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

3. Parsing. Oscar Nierstrasz

A Simple Syntax-Directed Translator

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

CMSC 330: Organization of Programming Languages

CSE302: Compiler Design

CSE 582 Autumn 2002 Exam 11/26/02

Principles of Programming Languages COMP251: Syntax and Grammars

Grammars and ambiguity. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 8 1

Grammars and Parsing. Paul Klint. Grammars and Parsing

Derivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012

Intro To Parsing. Step By Step

( ) i 0. Outline. Regular languages revisited. Introduction to Parsing. Parser overview. Context-free grammars (CFG s) Lecture 5.

Wednesday, August 31, Parsers

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

Compilation 2012 Context-Free Languages Parsers and Scanners. Jan Midtgaard Michael I. Schwartzbach Aarhus University

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

Introduction to Parsing. Lecture 5. Professor Alex Aiken Lecture #5 (Modified by Professor Vijay Ganesh)

Outline. Regular languages revisited. Introduction to Parsing. Parser overview. Context-free grammars (CFG s) Lecture 5. Derivations.

Chapter 3. Describing Syntax and Semantics ISBN

Context-Free Grammars

Introduction to Syntax Analysis

Optimizing Finite Automata

A programming language requires two major definitions A simple one pass compiler

CMSC 330: Organization of Programming Languages. Context Free Grammars

Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5

programming languages need to be precise a regular expression is one of the following: tokens are the building blocks of programs

COMP3131/9102: Programming Languages and Compilers

Compiler Design Concepts. Syntax Analysis

Introduction to Syntax Analysis. The Second Phase of Front-End

Bottom-Up Parsing. Lecture 11-12

CMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters

Syntax. In Text: Chapter 3

Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.

CSE 401/M501 Compilers

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Syntactic Analysis. Syntactic analysis, or parsing, is the second phase of compilation: The token file is converted to an abstract syntax tree.

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part2 3.3 Parse Trees and Abstract Syntax Trees

A simple syntax-directed

E E+E E E (E) id. id + id E E+E. id E + E id id + E id id + id. Overview. derivations and parse trees. Grammars and ambiguity. ambiguous grammars

CSCI312 Principles of Programming Languages!

Properties of Regular Expressions and Finite Automata

Compiler Construction 2016/2017 Syntax Analysis

Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;}

LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States

PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309

2.2 Syntax Definition

Transcription:

CSE P 501 Compilers Parsing & Context-Free Grammars Hal Perkins Winter 2008 1/15/2008 2002-08 Hal Perkins & UW CSE C-1

Agenda for Today Parsing overview Context free grammars Ambiguous grammars Reading: Dragon book, ch. 4 1/15/2008 2002-08 Hal Perkins & UW CSE C-2

Parsing The syntax of most programming languages can be specified by a context-free grammar (CGF) Parsing: Given a grammar G and a sentence w in L(G ), traverse the derivation (parse tree) for w in some standard order and do something useful at each node The tree might not be produced explicitly, but the control flow of a parser corresponds to a traversal 1/15/2008 2002-08 Hal Perkins & UW CSE C-3

Old Example program G program ::= statement program statement statement ::= assignstmt ifstmt assignstmt ::= id = expr ; ifstmt ::= if ( expr ) stmt expr ::= id int expr + expr Id ::= a b c i j k n x y z int ::= 0 1 2 3 4 5 6 7 8 9 program statement assignstmt expr statement ifstmt statement assignstmt id expr expr expr id expr int id int int w a = 1 ; if ( a + 1 ) b = 2 ; 1/15/2008 2002-08 Hal Perkins & UW CSE C-4

Standard Order For practical reasons we want the parser to be deterministic (no backtracking), and we want to examine the source program from left to right. (i.e., parse the program in linear time in the order it appears in the source file) 1/15/2008 2002-08 Hal Perkins & UW CSE C-5

Common Orderings Top-down Start with the root Traverse the parse tree depth-first, left-to-right (leftmost derivation) LL(k) Bottom-up Start at leaves and build up to the root Effectively a rightmost derivation in reverse(!) LR(k) and subsets (LALR(k), SLR(k), etc.) 1/15/2008 2002-08 Hal Perkins & UW CSE C-6

Something Useful At each point (node) in the traversal, perform some semantic action Construct nodes of full parse tree (rare) Construct abstract syntax tree (common) Construct linear, lower-level representation (more common in later parts of a modern compiler) Generate target code on the fly (1-pass compiler; not common in production compilers can t generate very good code in one pass but great if you need a quick n dirty working compiler) 1/15/2008 2002-08 Hal Perkins & UW CSE C-7

Context-Free Grammars Formally, a grammar G is a tuple <N,Σ,P,S> where N a finite set of non-terminal symbols Σ a finite set of terminal symbols P a finite set of productions A subset of N (N Σ )* S the start symbol, a distinguished element of N If not specified otherwise, this is usually assumed to be the non-terminal on the left of the first production 1/15/2008 2002-08 Hal Perkins & UW CSE C-8

Standard Notations a, b, c elements of Σ w, x, y, z elements of Σ* A, B, C elements of N X, Y, Z elements of N Σ,, elements of (N Σ )* A or A ::= if <A, > in P 1/15/2008 2002-08 Hal Perkins & UW CSE C-9

Derivation Relations (1) A => derives iff A ::= in P A =>* w if there is a chain of productions starting with A that generates w transitive closure 1/15/2008 2002-08 Hal Perkins & UW CSE C-10

Derivation Relations (2) w A => lm w derives leftmost iff A ::= in P A w => rm w iff A ::= in P derives rightmost We will only be interested in leftmost and rightmost derivations not random orderings 1/15/2008 2002-08 Hal Perkins & UW CSE C-11

Languages For A in N, L(A) = { w A =>* w } If S is the start symbol of grammar G, define L(G ) = L(S ) 1/15/2008 2002-08 Hal Perkins & UW CSE C-12

Reduced Grammars Grammar G is reduced iff for every production A ::= in G there is a derivation S =>* x A z => x z =>* xyz i.e., no production is useless Convention: we will use only reduced grammars 1/15/2008 2002-08 Hal Perkins & UW CSE C-13

Ambiguity Grammar G is unambiguous iff every w in L(G ) has a unique leftmost (or rightmost) derivation Fact: unique leftmost or unique rightmost implies the other A grammar without this property is ambiguous Note that other grammars that generate the same language may be unambiguous We need unambiguous grammars for parsing 1/15/2008 2002-08 Hal Perkins & UW CSE C-14

Example: Ambiguous Grammar for Arithmetic Expressions expr ::= expr + expr expr - expr expr * expr expr / expr int int ::= 0 1 2 3 4 5 6 7 8 9 Exercise: show that this is ambiguous How? Show two different leftmost or rightmost derivations for the same string Equivalently: show two different parse trees for the same string 1/15/2008 2002-08 Hal Perkins & UW CSE C-15

Example (cont) expr ::= expr + expr expr - expr expr * expr expr / expr int int ::= 0 1 2 3 4 5 6 7 8 9 Give a leftmost derivation of 2+3*4 and show the parse tree 1/15/2008 2002-08 Hal Perkins & UW CSE C-16

Example (cont) expr ::= expr + expr expr - expr expr * expr expr / expr int int ::= 0 1 2 3 4 5 6 7 8 9 Give a different leftmost derivation of 2+3*4 and show the parse tree 1/15/2008 2002-08 Hal Perkins & UW CSE C-17

Another example expr ::= expr + expr expr - expr expr * expr expr / expr int int ::= 0 1 2 3 4 5 6 7 8 9 Give two different derivations of 5+6+7 1/15/2008 2002-08 Hal Perkins & UW CSE C-18

What s going on here? The grammar has no notion of precedence or associatively Solution Create a non-terminal for each level of precedence Isolate the corresponding part of the grammar Force the parser to recognize higher precedence subexpressions first 1/15/2008 2002-08 Hal Perkins & UW CSE C-19

Classic Expression Grammar expr ::= expr + term expr term term term ::= term * factor term / factor factor factor ::= int ( expr ) int ::= 0 1 2 3 4 5 6 7 1/15/2008 2002-08 Hal Perkins & UW CSE C-20

Check: Derive 2 + 3 * 4 expr ::= expr + term expr term term term ::= term * factor term / factor factor factor ::= int ( expr ) int ::= 0 1 2 3 4 5 6 7 1/15/2008 2002-08 Hal Perkins & UW CSE C-21

expr ::= expr + term expr term term term ::= term * factor term / factor factor factor ::= int ( expr ) int ::= 0 1 2 3 4 5 6 7 Check: Derive 5 + 6 + 7 Note interaction between left- vs right-recursive rules and resulting associativity 1/15/2008 2002-08 Hal Perkins & UW CSE C-22

expr ::= expr + term expr term term term ::= term * factor term / factor factor factor ::= int ( expr ) int ::= 0 1 2 3 4 5 6 7 Check: Derive 5 + (6 + 7) 1/15/2008 2002-08 Hal Perkins & UW CSE C-23

Another Classic Example Grammar for conditional statements ifstmt ::= if ( cond ) stmt if ( cond ) stmt else stmt Exercise: show that this is ambiguous How? 1/15/2008 2002-08 Hal Perkins & UW CSE C-24

ifstmt ::= if ( cond ) stmt if ( cond ) stmt else stmt One Derivation if ( cond ) if ( cond ) stmt else stmt 1/15/2008 2002-08 Hal Perkins & UW CSE C-25

ifstmt ::= if ( cond ) stmt if ( cond ) stmt else stmt Another Derivation if ( cond ) if ( cond ) stmt else stmt 1/15/2008 2002-08 Hal Perkins & UW CSE C-26

Solving if Ambiguity Fix the grammar to separate if statements with else clause and if statements with no else Done in Java reference grammar Adds lots of non-terminals Use some ad-hoc rule in parser else matches closest unpaired if 1/15/2008 2002-08 Hal Perkins & UW CSE C-27

Parser Tools and Operators Most parser tools can cope with ambiguous grammars Makes life simpler if used with discipline Typically one can specify operator precedence & associativity Allows simpler, ambiguous grammar with fewer nonterminals as basis for generated parser, without creating problems 1/15/2008 2002-08 Hal Perkins & UW CSE C-28

Parser Tools and Ambiguous Grammars Possible rules for resolving other problems Earlier productions in the grammar preferred to later ones Longest match used if there is a choice Parser tools normally allow for this But be sure that what the tool does is really what you want 1/15/2008 2002-08 Hal Perkins & UW CSE C-29

Coming Attractions Next topic: LR parsing Continue reading ch. 4 1/15/2008 2002-08 Hal Perkins & UW CSE C-30