Defining syntax using CFGs

Similar documents
Defining syntax using CFGs

EDA180: Compiler Construc6on Context- free grammars. Görel Hedin Revised:

Related Course Objec6ves

HW2 posted. Announcements

Principles of Programming Languages COMP251: Syntax and Grammars

CS 314 Principles of Programming Languages

Parsing II Top-down parsing. Comp 412

Dr. D.M. Akbar Hussain

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter /15/ Hal Perkins & UW CSE C-1

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter UW CSE P 501 Winter 2016 C-1

Topic 3: Syntax Analysis I

Eng. Maha Talaat Page 1

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

4 (c) parsing. Parsing. Top down vs. bo5om up parsing

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield

EDA180: Compiler Construc6on. Top- down parsing. Görel Hedin Revised: a

Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal)

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Syntax Analysis Check syntax and construct abstract syntax tree

Ambiguity. Lecture 8. CS 536 Spring

Context-free grammars (CFGs)

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Compilers Course Lecture 4: Context Free Grammars

Parsing II Top-down parsing. Comp 412

CS 536 Midterm Exam Spring 2013

Architecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End

CMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters

Parsing #1. Leonidas Fegaras. CSE 5317/4305 L3: Parsing #1 1

Context-Free Grammars


CIT Lecture 5 Context-Free Grammars and Parsing 4/2/2003 1

Context-Free Grammars

CMSC 330: Organization of Programming Languages. Context-Free Grammars Ambiguity

Chapter 3. Parsing #1

Compilers. Yannis Smaragdakis, U. Athens (original slides by Sam

3. Context-free grammars & parsing

CSE 3302 Programming Languages Lecture 2: Syntax

EECS 6083 Intro to Parsing Context Free Grammars

Introduction to Parsing

Introduction to Parsing. Comp 412

CSX-lite Example. LL(1) Parse Tables. LL(1) Parser Driver. Example of LL(1) Parsing. An LL(1) parse table, T, is a twodimensional

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

CSE302: Compiler Design

CMSC 330: Organization of Programming Languages

Syntax/semantics. Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing

CS 406/534 Compiler Construction Parsing Part I

Properties of Regular Expressions and Finite Automata

Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5

Compiler Design Concepts. Syntax Analysis

Introduction to Parsing. Lecture 5

Topic 5: Syntax Analysis III

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Introduction to Lexing and Parsing

10/4/18. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntactic Analysis

CMSC 330: Organization of Programming Languages

Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;}

Context-Free Languages and Parse Trees

Habanero Extreme Scale Software Research Project

Introduction to Parsing. Lecture 8

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Spring UW CSE P 501 Spring 2018 C-1

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages. Context Free Grammars

Specifying Syntax. An English Grammar. Components of a Grammar. Language Specification. Types of Grammars. 1. Terminal symbols or terminals, Σ

Where We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax

Syntax Analysis Part I

CSCI312 Principles of Programming Languages

COP 3402 Systems Software Syntax Analysis (Parser)

Administrativia. PA2 assigned today. WA1 assigned today. Building a Parser II. CS164 3:30-5:00 TT 10 Evans. First midterm. Grammars.

Outline. Limitations of regular languages. Introduction to Parsing. Parser overview. Context-free grammars (CFG s)

Introduction to Parsing. Lecture 5

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

3. Parsing. Oscar Nierstrasz

Formal Languages and Compilers Lecture V: Parse Trees and Ambiguous Gr

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Derivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012

Compila(on (Semester A, 2013/14)

Parsing III. (Top-down parsing: recursive descent & LL(1) )

CMSC 330: Organization of Programming Languages. Context Free Grammars

Syntax Analysis, III Comp 412

CS1622. Today. A Recursive Descent Parser. Preliminaries. Lecture 9 Parsing (4)

Syntax Analysis. Prof. James L. Frankel Harvard University. Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved.

Introduction to Parsing Ambiguity and Syntax Errors

A simple syntax-directed

Wednesday, September 9, 15. Parsers

A programming language requires two major definitions A simple one pass compiler

Introduction to Parsing Ambiguity and Syntax Errors

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:

Building Compilers with Phoenix

Optimizing Finite Automata

Recursive Descent Parsers

Part 5 Program Analysis Principles and Techniques

Table-Driven Top-Down Parsers

10/5/17. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntax Analysis

Last Time. What do we want? When do we want it? An AST. Now!

CMPT 755 Compilers. Anoop Sarkar.

Transcription:

Defining syntax using CFGs

Roadmap Last 8me Defined context-free grammar This 8me CFGs for syntax design Language membership List grammars Resolving ambiguity

CFG Review G = (N,Σ,P,S) means derives derives in 1 or more steps CFG generates a string by applying produc8ons un8l no non-terminals remain Example: Nested parens N = { Q } Σ = { (, ) } P = Q ( Q ) ε S = Q

Formal CFG Language Defini8on Let G= (N,Σ,P,S) be a CFG. Then L(G) = w S + w where S is the start nonterminal of G w is a sequence of terminals or ε

CFGs as Language Defini8on CFG produc8ons define the syntax of a language 1. Prog begin end 2. semicolon Stmt 3. Stmt 4. Stmt id assign 5. id 6. plus id We call this nota8on BNF or extended BNF HTTP grammar using BNF: hrp://www.w3.org/protocols/rfc2616/rfc2616-sec2.html

List Grammars Useful to repeat a structure arbitrarily oyen semicolon Stmt Stmt ; Stmt List skews lec ; Stmt ; Stmt ; Stmt

List Grammars Useful to repeat a structure arbitrarily oyen Stmt semicolon Stmt Stmt ; Stmt ; List skews right Stmt ; Stmt ;

List Grammars What if we allowed both skews? semicolon Stmt ; ; ; Stmt ;

Deriva8on Order LeYmost Deriva8on: always expand the leymost nonterminal Rightmost Deriva8on: always expand the rightmost nonterminal 1. Prog begin end 2. semicolon Stmt 3. Stmt 4. Stmt id assign 5. id 6. plus id begin LeCmost expands this nonterminal Prog semicolon end Stmt Rightmost expands this nonterminal

Ambiguity Even with a fixed deriva8on order, it is possible to derive the same string in mul8ple ways For Grammar G and string w G is ambiguous if >1 leymost deriva8on of w >1 rightmost deriva8on of w > 1 parse tree for w

Example: Ambiguous Grammars minus 6mes Derive the string 4-7 * 3 (assume tokenizaron) lparen rparen Parse Tree 1 Parse Tree 2 minus 6mes 6mes minus 4 3 7 3 4 7

Why is Ambiguity Bad? Eventually, we ll be using CFGs as the basis for our parser Parsing is much easier when there is no ambiguity in the grammar The parse tree may mismatch user understanding! 4-7 * 3 Operator precedence minus 6mes 6mes minus 4 3 7 3 4 7

Resolving Grammar Ambiguity: Precedence minus 6mes lparen rparen Intui8ve problem Context-freeness Nonterminals are the same for both operators To fix precedence 1 nonterminal per precedence level Parse lowest level first

Resolving Grammar Ambiguity: Precedence minus 6mes lparen rparen lowest precedence level first 1 nonterm per precedence level Derive the string 4-7 * 3 minus minus 6mes Factor Factor lparen rparen Factor 4 Factor 7 6mes Factor 3

Resolving Grammar Ambiguity: Precedence Fixed Grammar minus 6mes Factor Factor lparen rparen Derive the string 4-7 * 3 Let s try to re-build the wrong parse tree 6mes minus Factor 6mes 4 Factor 7 Factor 3 We ll never be able to derive minus without parens Factor 3

Did we fix all ambiguity? Fixed Grammar minus 6mes Factor Factor lparen rparen Derive the string 4-7 - 3 minus Factor minus Factor Factor NO! These subtrees could have been swapped!

Where we are so far Precedence We want correct behavior on 4 7 * 9 A new nonterminal for each precedence level Associa8vity We want correct behavior on 4 7 9 Minus should be le: associa<ve: a b c = (a b) c Problem: the recursion in a rule like minus

Defini8on: Recursion in Grammars A grammar is recursive in (nonterminal) X if X + αxγ for non-empty strings of symbols α and γ A grammar is le:-recursive in X if X + Xγ for non-empty string of symbols γ A grammar is right-recursive in X if X + αx for non-empty string of symbols α

Resolving Grammar Ambiguity: Associa8vity Recognize ley-assoc operators with ley-associa8ve produc8ons Recognize right-assoc operators with right-associa8ve produc8ons minus Factor 6mes Factor Factor lparen rparen Example: 4 7 9 E E - E - T T F F 7 4 T F 9

Resolving Grammar Ambiguity: AssociaRvity minus 6mes Factor Factor Factor lparen rparen E Example: 4 7 9 Let s try to re-build the wrong parse tree again E - T T F 4 We ll never be able to derive minus without parens

Example Language of Boolean expressions bexp TRUE FALSE bexp OR bexp bexp AND bexp NOT bexp LPAREN bexp RPAREN Add nonterminals so that OR has lowest precedence, then AND, then NOT. Then change the grammar to reflect the fact that both AND and OR are ley associa8ve. Draw a parse tree for the expression: true AND NOT true

Another ambiguous example Stmt if Cond then Stmt if Cond then Stmt else Stmt Consider this word in this grammar: if a then if b then s else s2 How would you derive it?

Summary To understand how a parser works, we start by understanding context-free grammars, which are used to define the language recognized by the parser. terminal symbol (non)terminal symbol grammar rule (or produc8on) deriva8on (leymost deriva8on, rightmost deriva8on) parse (or deriva8on) tree the language defined by a grammar ambiguous grammar