Fall Compiler Principles Context-free Grammars Refresher. Roman Manevich Ben-Gurion University of the Negev

Fall 2016-2017. Compiler Principles: Context-free Grammars Refresher. Roman Manevich, Ben-Gurion University of the Negev

Example grammar
  S → S ; S
  S → id := E
  S → print (L)
  E → id
  E → num
  E → E + E
  L → E
  L → L, E
S is shorthand for Statement, E for Expression, and L for List (of expressions).

CFG terminology
Symbols:
  Terminals (tokens): ; := ( ) , + id num print
  Non-terminals: S E L
  Start non-terminal: S
  Convention: the start non-terminal is the one appearing in the first derivation rule
Grammar productions (rules): N → α
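To make the terminology concrete, the example grammar can be stored as plain data. This is only a minimal sketch: the names PRODUCTIONS, nonterminals, terminals and START are illustrative and do not come from the slides. It assumes every non-terminal appears on the left-hand side of at least one rule.

```python
# Each production is (left-hand side, right-hand side as a tuple of symbols).
PRODUCTIONS = [
    ("S", ("S", ";", "S")),
    ("S", ("id", ":=", "E")),
    ("S", ("print", "(", "L", ")")),
    ("E", ("id",)),
    ("E", ("num",)),
    ("E", ("E", "+", "E")),
    ("L", ("E",)),
    ("L", ("L", ",", "E")),
]

def nonterminals(productions):
    """Symbols that appear on the left-hand side of some rule."""
    return {lhs for lhs, _ in productions}

def terminals(productions):
    """Right-hand-side symbols that are not non-terminals."""
    nts = nonterminals(productions)
    return {sym for _, rhs in productions for sym in rhs if sym not in nts}

# Convention from the slide: the start symbol is the left-hand side of the first rule.
START = PRODUCTIONS[0][0]
```

Computing the terminal set from the rules, rather than listing it by hand, keeps it in sync with the productions.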

More definitions
Sentential form: a sequence of symbols, terminals (tokens) and non-terminals
Sentence: a sequence of terminals (tokens)
Derivation step: given a sentential form αNβ and a rule N → µ, a step is the transition αNβ => αµβ
Derivation sequence: a sequence of derivation steps φ1 => … => φk such that each φi => φi+1 is the result of applying one production and φk is a sentence

Language of a CFG
A word ω is in L(G) (a valid program) if there exists a corresponding derivation sequence:
  Start from the start symbol
  Repeatedly replace one of the non-terminals by the right-hand side of one of its productions
  Stop when the sentential form contains only terminals
ω is in L(G) if S =>* ω
Leftmost derivation: every step replaces the leftmost non-terminal
Rightmost derivation: every step replaces the rightmost non-terminal

Leftmost derivation
Input: a := 56 ; b := 7 + 3 (token sequence: id := num ; id := num + num)
Productions:
  1. S → S ; S
  2. S → id := E
  3. S → print (L)
  4. E → id
  5. E → num
  6. E → E + E
  7. L → E
  8. L → L, E
Derivation:
  S => S ; S
    => id := E ; S
    => id := num ; S
    => id := num ; id := E
    => id := num ; id := E + E
    => id := num ; id := num + E
    => id := num ; id := num + num

Rightmost derivation
Input: a := 56 ; b := 7 + 3 (token sequence: id := num ; id := num + num)
Derivation:
  S => S ; S
    => S ; id := E
    => S ; id := E + E
    => S ; id := E + num
    => S ; id := num + num
    => id := E ; id := num + num
    => id := num ; id := num + num

Canonical derivations
Leftmost and rightmost derivations may not be unique, but each can be described by just the sequence of production rules applied, since the non-terminal to expand at every step is already known.
Leftmost derivation example: 1, 2, 5, 2, 6, 5, 5
Rightmost derivation example: 1, 2, 6, 5, 5, 2, 5
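The claim that a rule sequence alone pins down a canonical derivation can be checked mechanically. The following Python sketch replays a list of production numbers, always expanding the leftmost (or rightmost) non-terminal. The function name derive and the dictionary layout are illustrative choices, not part of the slides; the rule numbers follow the numbering 1 to 8 used in the derivation slides.

```python
# Productions of the example grammar, keyed by rule number.
PRODUCTIONS = {
    1: ("S", ["S", ";", "S"]),
    2: ("S", ["id", ":=", "E"]),
    3: ("S", ["print", "(", "L", ")"]),
    4: ("E", ["id"]),
    5: ("E", ["num"]),
    6: ("E", ["E", "+", "E"]),
    7: ("L", ["E"]),
    8: ("L", ["L", ",", "E"]),
}
NONTERMINALS = {"S", "E", "L"}

def derive(rule_numbers, leftmost=True):
    """Replay a canonical derivation and return every sentential form."""
    form = ["S"]
    steps = [list(form)]
    for n in rule_numbers:
        lhs, rhs = PRODUCTIONS[n]
        positions = [i for i, sym in enumerate(form) if sym in NONTERMINALS]
        if not positions:
            raise ValueError("no non-terminal left to expand")
        i = positions[0] if leftmost else positions[-1]
        if form[i] != lhs:
            raise ValueError(f"rule {n} does not apply to {form[i]}")
        form[i:i + 1] = rhs  # replace the chosen non-terminal by the rule body
        steps.append(list(form))
    return steps

for form in derive([1, 2, 5, 2, 6, 5, 5], leftmost=True):
    print(" ".join(form))
```

Calling derive([1, 2, 6, 5, 5, 2, 5], leftmost=False) replays the rightmost derivation and ends in the same sentence.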

Parse trees
Tree nodes are symbols, with children ordered left-to-right
Each internal node is a non-terminal, and its children correspond to one of its productions: a node N with children µ1 … µk corresponds to N → µ1 … µk
Root is the start non-terminal
Leaves are tokens
Yield of a parse tree: left-to-right walk over the leaves

Parse tree exercise
Draw the parse tree, under the example grammar, for the input id := num ; id := num + num

Parse tree exercise (solution)
Input: a := 56 ; b := 7 + 3 (token sequence: id := num ; id := num + num)
The parse tree is an order-independent representation of the derivation.
Equivalently, add parentheses labeled by non-terminal names:
  (S (S a := (E 56 )E )S ; (S b := (E (E 7 )E + (E 3 )E )E )S )S
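The labeled-parentheses form and the yield can both be computed from a tree. The sketch below encodes the parse tree of the exercise as nested Python tuples; this encoding, and the names TREE, tree_yield and bracketed, are assumptions made for illustration. Leaves hold the lexemes (a, 56, ...) rather than the token kinds.

```python
# A tree is either a token (str) or a pair (non-terminal, list of children).
TREE = ("S", [
    ("S", ["a", ":=", ("E", ["56"])]),
    ";",
    ("S", ["b", ":=", ("E", [("E", ["7"]), "+", ("E", ["3"])])]),
])

def tree_yield(tree):
    """Left-to-right walk over the leaves."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    return [tok for child in children for tok in tree_yield(child)]

def bracketed(tree):
    """Parentheses labeled by non-terminal names, as on the slide."""
    if isinstance(tree, str):
        return tree
    label, children = tree
    inner = " ".join(bracketed(child) for child in children)
    return f"({label} {inner} ){label}"

print(" ".join(tree_yield(TREE)))
print(bracketed(TREE))
```

The same tree results whether the leftmost or the rightmost derivation is used, which is what makes it order-independent.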

Capabilities and limitations of CFGs
CFGs naturally express:
  Hierarchical structure: a program is a list of classes, a class is a list of definitions
  Alternatives: a definition is either a field definition or a method definition
  Beginning-end type of constraints: balanced parentheses, S → (S)S | ε
CFGs cannot express (p. 173):
  Correlations between unbounded strings (identifiers)
  For example, that variables are declared before use: ω S ω
  These are handled by semantic analysis (attribute grammars)
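The balanced-parentheses grammar S → (S)S | ε translates almost directly into a recursive recognizer. The Python sketch below is one possible rendering, with one function call per use of S; the names balanced and parse_s are illustrative. It relies on the observation that the next character decides which alternative applies: "(" selects (S)S, anything else selects ε.

```python
def balanced(text):
    """Return True if text is in the language of S -> ( S ) S | epsilon."""

    def parse_s(i):
        # Try to derive a prefix of text[i:] from S.
        # Returns the index just past that prefix, or None on failure.
        if i < len(text) and text[i] == "(":
            j = parse_s(i + 1)                  # inner S
            if j is None or j >= len(text) or text[j] != ")":
                return None                     # missing closing parenthesis
            return parse_s(j + 1)               # trailing S
        return i                                # S -> epsilon

    return parse_s(0) == len(text)

print(balanced("(()())()"))
print(balanced("(()"))
```

Very deeply nested inputs would hit Python's recursion limit, so this is meant for illustration rather than production use.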

Badly-formed grammars
[Image by Oren neu dag (Own work), CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0), via Wikimedia Commons]

Badly-formed grammars
A non-terminal N is reachable if S =>* αNβ
A non-terminal N is generating if N =>* ω
A grammar G is badly-formed if it contains unreachable non-terminals or non-generating non-terminals
  G1 = { S → x, N → y } (N is unreachable)
  G2 = { S → x | N, N → a N b N } (N is non-generating)
Exercise: give an algorithm to test whether a grammar is badly-formed
Theorem: for every grammar G there exists an equivalent well-formed grammar G' (that is, L(G) = L(G'))
Proof: exercise
From now on, we will only handle well-formed grammars
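One possible approach to the exercise is a pair of fixed-point computations: grow the set of generating non-terminals until nothing changes, and search from the start symbol for the reachable ones. The Python sketch below is not the slides' solution; the function names and the three small example grammars are hypothetical, chosen only to exercise each case. Non-terminals are passed in explicitly as a set.

```python
def generating(productions, nonterminals):
    """Non-terminals that derive at least one string of terminals."""
    gen = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            # A rule is usable once every symbol in its body is a terminal
            # or an already-generating non-terminal (an empty body qualifies).
            if lhs not in gen and all(s in gen or s not in nonterminals for s in rhs):
                gen.add(lhs)
                changed = True
    return gen

def reachable(productions, nonterminals, start):
    """Non-terminals that occur in some sentential form derived from start."""
    seen, work = {start}, [start]
    while work:
        current = work.pop()
        for lhs, rhs in productions:
            if lhs != current:
                continue
            for s in rhs:
                if s in nonterminals and s not in seen:
                    seen.add(s)
                    work.append(s)
    return seen

def badly_formed(productions, nonterminals, start):
    return (reachable(productions, nonterminals, start) != nonterminals
            or generating(productions, nonterminals) != nonterminals)

# Hypothetical inputs for illustration.
UNREACHABLE_EXAMPLE = [("S", ("x",)), ("N", ("y",))]
NON_GENERATING_EXAMPLE = [("S", ("x",)), ("S", ("N",)), ("N", ("a", "N"))]
WELL_FORMED_EXAMPLE = [("S", ("(", "S", ")", "S")), ("S", ())]

print(badly_formed(UNREACHABLE_EXAMPLE, {"S", "N"}, "S"))
print(badly_formed(WELL_FORMED_EXAMPLE, {"S"}, "S"))
```

Both loops only ever add elements to a set bounded by the number of non-terminals, so they terminate.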

Ambiguity in Context-free grammars

Sometimes there are two parse trees
Arithmetic expressions:
  E → id | num | E + E | E * E | ( E )
Input: 1 + 2 + 3 (token sequence: num(1) + num(2) + num(3))
First parse tree, 1 + (2 + 3), given by the leftmost derivation:
  E => E + E
    => num + E
    => num + E + E
    => num + num + E
    => num + num + num
Second parse tree, (1 + 2) + 3, given by the rightmost derivation:
  E => E + E
    => E + num
    => E + E + num
    => E + num + num
    => num + num + num

Is ambiguity a problem for compilers?
Depends on semantics.
Arithmetic expressions:
  E → id | num | E + E | E * E | ( E )
Input: 1 + 2 + 3 (token sequence: num(1) + num(2) + num(3))
  Parse tree for 1 + (2 + 3): evaluates to 6
  Parse tree for (1 + 2) + 3: evaluates to 6
Here both parse trees give the same value, so the ambiguity is harmless.

Problematic ambiguity example
Arithmetic expressions:
  E → id | num | E + E | E * E | ( E )
Input: 1 + 2 * 3 (token sequence: num(1) + num(2) * num(3))
First parse tree, 1 + (2 * 3) = 7, given by the leftmost derivation. This is what we usually want: * has precedence over +.
  E => E + E
    => num + E
    => num + E * E
    => num + num * E
    => num + num * num
Second parse tree, (1 + 2) * 3 = 9, given by the rightmost derivation:
  E => E * E
    => E * num
    => E + E * num
    => E + num * num
    => num + num * num
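The two values can be reproduced by evaluating each parse tree bottom-up. In the Python sketch below, trees are nested tuples of the form (left, operator, right) with integers at the leaves; this encoding and the name evaluate are assumptions made for illustration.

```python
def evaluate(tree):
    """Evaluate an expression tree bottom-up."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    lhs, rhs = evaluate(left), evaluate(right)
    if op == "+":
        return lhs + rhs
    if op == "*":
        return lhs * rhs
    raise ValueError(f"unknown operator {op!r}")

# The two parse trees of 1 + 2 * 3.
plus_at_root = (1, "+", (2, "*", 3))    # 1 + (2 * 3)
times_at_root = ((1, "+", 2), "*", 3)   # (1 + 2) * 3

print(evaluate(plus_at_root), evaluate(times_at_root))
```

For 1 + 2 + 3 the analogous pair of trees evaluates to the same number, which is why that ambiguity does not affect the result.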

Ambiguous grammars
A grammar is ambiguous if there exists a word for which there are, equivalently:
  Two different leftmost derivations
  Two different rightmost derivations
  Two different parse trees
Ambiguity is a property of grammars, not of languages
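For a fixed word, the number of parse trees can be counted directly, and any count above one witnesses ambiguity. The Python sketch below does this for a simplified expression grammar, E → E + E | E * E | num, leaving out id and parentheses to keep it short. That simplification and the name count_parse_trees are assumptions for illustration.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count parse trees of tokens under E -> E + E | E * E | num."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways to derive tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "num" else 0
        total = 0
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                # tokens[k] is the operator at the root of the subtree.
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

print(count_parse_trees("num + num * num".split()))
```

This only tests one word at a time; it does not decide whether an arbitrary grammar is ambiguous.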

Facts about ambiguous grammars
Some languages are inherently ambiguous: no unambiguous grammars exist for them [Parikh 1961]
Checking whether an arbitrary grammar is ambiguous is undecidable [Hopcroft, Motwani, Ullman, 2001]