Dr. D.M. Akbar Hussain

Compiler Construction F6S/Chapter 3

Syntax Analysis

Syntax analysis (parsing) determines the syntax, or structure, of a program. That structure is given by the grammar rules of a context-free grammar.

Context-Free Grammars (CFG)

A context-free grammar provides the syntactic structure of a language. A grammar is a quadruple (V_T, V_N, S, R):

  V_T: a finite set of terminals, the basic symbols from which sentences are formed.
  V_N: a finite set of non-terminals, syntactic variables denoting sets of sentences.
  R: a finite set of productions, rules specifying how the terminals and non-terminals can be combined to form sentences (written with an arrow, →).
  S: a unique start symbol, a distinguished non-terminal denoting the language (S ∈ V_N).

Context-Free Grammars (CFG): Conditions

A CFG is a quadruple (V_T, V_N, S, R) that must fulfil three conditions:

1. V_T ∩ V_N = Ø. V_T and V_N may not have any symbol in common, so terminals and non-terminals can always be told apart.
2. S ∈ V_N. The start symbol is a non-terminal.
3. R ⊆ {(N, α) | N ∈ V_N, α ∈ (V_N ∪ V_T)*}. The left side of each production must be a single non-terminal. The right side may consist of terminals and non-terminals, and may not include any other symbol.

Example

Grammar rules:
  exp → exp op exp
  op → + | - | * | /

The first rule defines an expression structure (named exp) consisting of an expression followed by an operator and another expression. The second rule defines an operator (named op) as one of the symbols for addition, subtraction, multiplication, and division. (This notation was introduced by John Backus and adopted by Peter Naur, hence the name Backus-Naur form, BNF.)

Terminals: id, +, -, *, /
Non-terminals: exp, op
Start symbol: exp
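The three conditions can be checked mechanically. The following is a minimal sketch, assuming the example grammar is encoded as Python sets and (left side, right side) tuples; the names V_T, V_N, S, R and is_cfg are illustrative choices, not part of the slides.

```python
# Hypothetical encoding of the example grammar as a quadruple (V_T, V_N, S, R).
V_T = {"id", "+", "-", "*", "/"}
V_N = {"exp", "op"}
S = "exp"
R = [
    ("exp", ("exp", "op", "exp")),
    ("op", ("+",)),
    ("op", ("-",)),
    ("op", ("*",)),
    ("op", ("/",)),
]


def is_cfg(v_t, v_n, start, rules):
    """Return True if the quadruple satisfies the three CFG conditions."""
    disjoint = v_t.isdisjoint(v_n)          # condition 1: V_T and V_N share no symbol
    start_ok = start in v_n                 # condition 2: S is a non-terminal
    rules_ok = all(                         # condition 3: N -> alpha over V_N and V_T only
        lhs in v_n and all(sym in v_t | v_n for sym in rhs)
        for lhs, rhs in rules
    )
    return disjoint and start_ok and rules_ok


print(is_cfg(V_T, V_N, S, R))
```

Choosing a terminal such as "id" as the start symbol, or placing "exp" in both sets, makes the check fail.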

Specifications for CFG

A CFG uses naming conventions and operations similar to those of regular expressions (RE). The difference is that grammar rules are recursive, so no meta-symbol is required for repetition.

Given an alphabet, a CFG rule in BNF consists of a string of symbols. The first symbol is the name of a structure, followed by the meta-symbol →. After the →, each symbol is either a symbol from the alphabet, a structure name, or the meta-symbol |.

Informally, a BNF rule defines a structure whose name is on the left side of the arrow. The structure consists of one of the choices separated by |. The sequence of symbols and structure names within each choice defines the layout of the structure.

Meta-symbol alternatives (there is no universal standard): →, =, :, ::=

For text files, structure names are written in angle brackets. Example:
  <exp> ::= <exp> <op> <exp>

Derivations

  E → E + E | E * E | ( E ) | - E | id

A derivation step is an application of a production as a rewriting rule. For example, E derives -E:
  E ⇒ -E

A sequence of derivation steps
  E ⇒ -E ⇒ -( E ) ⇒ -( id )
is called a derivation of -( id ) from E.

The symbol ⇒* denotes "derives in zero or more steps", and the symbol ⇒+ denotes "derives in one or more steps".
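A derivation step is list rewriting: one non-terminal is replaced by the right side of one of its productions. The sketch below replays the derivation of -( id ) from E. The helper derive_step and the table PRODUCTIONS are hypothetical names introduced only for illustration.

```python
# Productions of the grammar E -> E + E | E * E | ( E ) | - E | id
PRODUCTIONS = {
    "E": [["E", "+", "E"], ["E", "*", "E"], ["(", "E", ")"], ["-", "E"], ["id"]],
}


def derive_step(form, position, choice):
    """Rewrite the non-terminal at `position` using its `choice`-th production."""
    symbol = form[position]
    if symbol not in PRODUCTIONS:
        raise ValueError(f"{symbol!r} is not a non-terminal")
    return form[:position] + PRODUCTIONS[symbol][choice] + form[position + 1:]


form = ["E"]
steps = [form]
# (position of the non-terminal to rewrite, index of the production to apply)
for position, choice in [(0, 3), (1, 2), (2, 4)]:
    form = derive_step(form, position, choice)
    steps.append(form)

print(" => ".join(" ".join(f) for f in steps))
```

Each entry of steps is one sentential form, so the list as a whole is the derivation.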

Language Determined by Grammar Rules

How do grammar rules determine a language? They determine the legal strings of token symbols by means of derivations. A derivation is a sequence of replacements of a structure name by one of the choices on the right-hand side of its grammar rule. A derivation begins with a single structure name and ends with a string of token symbols.

String: (43-63) * 100

Grammar rules:
  exp → exp op exp | ( exp ) | number
  op → + | - | *

Derivation:
  exp ⇒ exp op exp
      ⇒ exp op number
      ⇒ exp * number
      ⇒ ( exp ) * number
      ⇒ ( exp op exp ) * number
      ⇒ ( exp op number ) * number
      ⇒ ( exp - number ) * number
      ⇒ ( number - number ) * number

Example of CFG

String: (())((())())()

Rule:
  S → SS | (S) | ()

Derivation:
  S ⇒ SS
    ⇒ SSS
    ⇒ (S)SS
    ⇒ (())SS
    ⇒ (())(S)S
    ⇒ (())(SS)S
    ⇒ (())((S)S)S
    ⇒ (())((())S)S
    ⇒ (())((())())S
    ⇒ (())((())())()
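Whether a string belongs to the language of S → SS | (S) | () can be decided by trying each choice of the rule on every substring. This is a minimal sketch of that idea, not an efficient parser; the function name in_language is an assumption made for illustration.

```python
from functools import lru_cache


def in_language(s):
    """Return True if S => ... => s for the rule S -> SS | (S) | ()."""

    @lru_cache(maxsize=None)
    def derives(i, j):
        # Can S derive the substring s[i:j]?
        if j - i < 2:
            return False
        if s[i:j] == "()":                                    # choice ()
            return True
        if s[i] == "(" and s[j - 1] == ")" and derives(i + 1, j - 1):
            return True                                       # choice (S)
        # choice SS: split into two non-empty parts of length >= 2
        return any(derives(i, k) and derives(k, j) for k in range(i + 2, j - 1))

    return derives(0, len(s))


print(in_language("(())((())())()"))
```

The memoisation via lru_cache keeps the number of distinct subproblems quadratic in the length of the string.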

Example

  Grammar: E → ( E ) | a
  Derivation: E ⇒ ( E ) ⇒ (( E )) ⇒ (( a ))
  L(G) = {a, (a), ((a)), (((a))), ...}

  Grammar: E → E + a | a
  Derivation: E ⇒ E + a ⇒ E + a + a ⇒ E + a + a + a ⇒ ...
  L(G) = {a, a+a, a+a+a, ...}

  Grammar: E → ( E )
  L(G) = { }, because no derivation ever reaches a string of terminals.

Leftmost & Rightmost Derivation

A leftmost derivation always chooses the leftmost non-terminal to rewrite. A rightmost derivation always chooses the rightmost one.

Rules:
  E → E + E
  E → E - E
  E → E * E
  E → E / E
  E → ( E )
  E → id

String: (x + y)/(x - y)

Leftmost:
  E ⇒ E / E
    ⇒ ( E ) / E
    ⇒ ( E + E ) / E
    ⇒ ( id + E ) / E
    ⇒ ( id + id ) / E
    ⇒ ( id + id ) / ( E )
    ⇒ ( id + id ) / ( E - E )
    ⇒ ( id + id ) / ( id - E )
    ⇒ ( id + id ) / ( id - id )

Rightmost:
  E ⇒ E / E
    ⇒ E / ( E )
    ⇒ E / ( E - E )
    ⇒ E / ( E - id )
    ⇒ E / ( id - id )
    ⇒ ( E ) / ( id - id )
    ⇒ ( E + E ) / ( id - id )
    ⇒ ( E + id ) / ( id - id )
    ⇒ ( id + id ) / ( id - id )
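The two derivations of (id + id) / (id - id) differ only in which E is rewritten at each step. The sketch below replays both; rewrite and derive are hypothetical helper names, and the lists of right-hand sides are read off the two derivations above.

```python
def rewrite(form, rhs, leftmost=True):
    """Replace the leftmost (or rightmost) E in `form` with the symbols in `rhs`."""
    if leftmost:
        idx = form.index("E")
    else:
        idx = len(form) - 1 - form[::-1].index("E")
    return form[:idx] + rhs + form[idx + 1:]


def derive(choices, leftmost):
    """Apply the right-hand sides in `choices` one after another, starting from E."""
    form = ["E"]
    for rhs in choices:
        form = rewrite(form, rhs.split(), leftmost)
    return " ".join(form)


left = derive(
    ["E / E", "( E )", "E + E", "id", "id", "( E )", "E - E", "id", "id"],
    leftmost=True,
)
right = derive(
    ["E / E", "( E )", "E - E", "id", "id", "( E )", "E + E", "id", "id"],
    leftmost=False,
)
print(left)
print(right)
```

Both calls end in the same sentence, which illustrates that the order of rewriting changes the derivation but not the string derived.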

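Example: (34-3)*42

Leftmost derivation (LMD), with the rule applied at each step in brackets:
  (1) exp ⇒ exp op exp                    [exp → exp op exp]
  (2)     ⇒ ( exp ) op exp                [exp → ( exp )]
  (3)     ⇒ ( exp op exp ) op exp         [exp → exp op exp]
  (4)     ⇒ ( number op exp ) op exp      [exp → number]
  (5)     ⇒ ( number - exp ) op exp       [op → -]
  (6)     ⇒ ( number - number ) op exp    [exp → number]
  (7)     ⇒ ( number - number ) * exp     [op → *]
  (8)     ⇒ ( number - number ) * number  [exp → number]

Rightmost derivation (RMD):
  (1) exp ⇒ exp op exp                    [exp → exp op exp]
  (2)     ⇒ exp op number                 [exp → number]
  (3)     ⇒ exp * number                  [op → *]
  (4)     ⇒ ( exp ) * number              [exp → ( exp )]
  (5)     ⇒ ( exp op exp ) * number       [exp → exp op exp]
  (6)     ⇒ ( exp op number ) * number    [exp → number]
  (7)     ⇒ ( exp - number ) * number     [op → -]
  (8)     ⇒ ( number - number ) * number  [exp → number]

Non-Context Free Grammar

String: xxxxbbbbcccc

Rules:
  S → xSBC
  S → xbC
  CB → BC
  bB → bb
  bC → bc
  cC → cc

Several of these rules have more than one symbol on the left side, so the grammar is not context-free.

Derivation:
  S ⇒ xSBC
    ⇒ xxSBCBC
    ⇒ xxxSBCBCBC
    ⇒ xxxxbCBCBCBC
    ⇒ xxxxbBCCBCBC
    ⇒ xxxxbbCCBCBC
    ⇒ xxxxbbCBCCBC
    ⇒ xxxxbbBCCCBC
    ⇒ xxxxbbbCCCBC
    ⇒ xxxxbbbCCBCC
    ⇒ xxxxbbbCBCCC
    ⇒ xxxxbbbBCCCC
    ⇒ xxxxbbbbCCCC
    ⇒ xxxxbbbbcCCC
    ⇒ xxxxbbbbccCC
    ⇒ xxxxbbbbcccC
    ⇒ xxxxbbbbcccc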
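The derivation of xxxxbbbbcccc can be replayed as plain string rewriting. This is a sketch under an assumption: the rules are read as S → xSBC, S → xbC, CB → BC, bB → bb, bC → bc, cC → cc, with upper-case letters as non-terminals. The helper apply_rules and the short rule names are hypothetical.

```python
def apply_rules(start, steps):
    """Apply each (lhs, rhs) rule to the first occurrence of lhs, recording every form."""
    form = start
    history = [form]
    for lhs, rhs in steps:
        if lhs not in form:
            raise ValueError(f"rule {lhs} -> {rhs} does not apply to {form}")
        form = form.replace(lhs, rhs, 1)  # rewrite the first occurrence only
        history.append(form)
    return history


# Assumed reading of the rules (upper case = non-terminal).
grow, stop = ("S", "xSBC"), ("S", "xbC")
swap, bB, bC, cC = ("CB", "BC"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc")

steps = [grow, grow, grow, stop,
         swap, bB, swap, swap, bB, swap, swap, swap, bB,
         bC, cC, cC, cC]
history = apply_rules("S", steps)
print("\n".join(history))
```

The rules CB → BC, bB → bb, bC → bc and cC → cc depend on the neighbouring symbol, which is exactly what a context-free production cannot express.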

Example: Left and Right Recursion

Left recursion, deriving yxx:
  A → A x | y
  A ⇒ A x ⇒ A x x ⇒ y x x

Right recursion, deriving xxy:
  A → x A | y
  A ⇒ x A ⇒ x x A ⇒ x x y

Difference

A leftmost derivation corresponds to a pre-order traversal of the parse tree. A rightmost derivation corresponds to a post-order traversal of the parse tree in reverse order.

The two kinds of derivation correspond to two types of parser:

  LMD: top-down parser. Top-down parsers traverse the input from left to right and construct a leftmost derivation.
  RMD: bottom-up parser. Bottom-up parsers traverse the input from left to right and construct a rightmost derivation (in reverse).

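Parse Tree

Derivation of number + number:
  exp ⇒ exp op exp
      ⇒ number op exp
      ⇒ number + exp
      ⇒ number + number

Parse tree:
  exp
    exp
      number
    op
      +
    exp
      number

Parse Tree (pre-order numbering)

Leftmost derivation:
  (1) exp ⇒ exp op exp
  (2)     ⇒ number op exp
  (3)     ⇒ number + exp
  (4)     ⇒ number + number

Each interior node is numbered with the derivation step that rewrites it:
  1 exp
    2 exp
      number
    3 op
      +
    4 exp
      number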

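Parse Tree (post-order numbering)

Rightmost derivation:
  (1) exp ⇒ exp op exp
  (2)     ⇒ exp op number
  (3)     ⇒ exp + number
  (4)     ⇒ number + number

The interior nodes are numbered in reverse post-order:
  1 exp
    4 exp
      number
    3 op
      +
    2 exp
      number

A derivation that rewrites op first:
  (1) exp ⇒ exp op exp
  (2)     ⇒ exp + exp   (?)
  (3)     ⇒ number + exp
  (4)     ⇒ number + number

Here step (2) rewrites op, which is neither the leftmost nor the rightmost non-terminal.

Parse Tree for (100 - 200) * 300

  1 exp
    4 exp
      (
      5 exp
        8 exp
          number
        7 op
          -
        6 exp
          number
      )
    3 op
      *
    2 exp
      number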
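The two numberings of the number + number tree can be reproduced by traversing it. This is a minimal sketch: the Node class and the labels "exp:root", "exp:left" and "exp:right" are assumptions made to tell the three exp nodes apart. Only interior nodes are listed, since leaves are not rewritten.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def preorder(node):
    """Interior nodes in pre-order: a node first, then its children left to right."""
    if not node.children:
        return []
    result = [node.label]
    for child in node.children:
        result += preorder(child)
    return result


def postorder(node):
    """Interior nodes in post-order: children left to right, then the node itself."""
    if not node.children:
        return []
    result = []
    for child in node.children:
        result += postorder(child)
    return result + [node.label]


tree = Node("exp:root", [
    Node("exp:left", [Node("number")]),
    Node("op", [Node("+")]),
    Node("exp:right", [Node("number")]),
])

leftmost_order = preorder(tree)            # order of rewriting in the leftmost derivation
rightmost_order = postorder(tree)[::-1]    # reverse post-order: the rightmost derivation
print(leftmost_order)
print(rightmost_order)
```

Position k in each list is the node that carries number k in the corresponding tree.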

Abstract Syntax Tree for (100 - 200) * 300

  *
    -
      100
      200
    300

Example

Grammar rules:
  statement → if-stmt | other
  if-stmt → if ( exp ) statement | if ( exp ) statement else statement
  exp → 0 | 1

Possible strings:
  other
  if(0) other
  if(1) other
  if(0) other else other
  if(1) other else other
  if(0) if(0) other
  if(0) if(1) other else other
  if(1) other else if(0) other else other
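Returning to the abstract syntax tree for (100 - 200) * 300: because the tree keeps only operators and operands, it can be evaluated with a short recursive walk. The sketch below assumes a nested-tuple encoding (operator, left, right); the names OPS, ast and evaluate are illustrative.

```python
import operator

# Operators of the expression grammar mapped to Python functions.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# AST for (100 - 200) * 300: a node is (operator, left, right) or a plain number.
ast = ("*", ("-", 100, 200), 300)


def evaluate(node):
    """Evaluate an AST bottom-up: operands first, then the operator at the node."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left), evaluate(right))
    return node


print(evaluate(ast))  # -30000
```

The parentheses of the source string do not appear in the tree: the grouping is carried by the nesting itself.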

ε-Productions

A grammar generating sequences of one or more statements separated by semicolons:
  stmt-seq → stmt ; stmt-seq | stmt
  stmt → s
  L(G) = {s, s;s, s;s;s, ...}

To include ε:
  stmt-seq → stmt ; stmt-seq | ε
  stmt → s
  L(G) = {ε, s;, s;s;, s;s;s;, ...}

In this case ; has become a statement terminator instead of a statement separator (zero or more statements, each terminated by a ;).

To fix the problem:
  stmt-seq → non-stmt-seq | ε
  non-stmt-seq → stmt ; non-stmt-seq | stmt
  stmt → s

Dangling else Problem

  statement → if-stmt | other
  if-stmt → if ( exp ) statement | if ( exp ) statement else statement
  exp → 0 | 1

Consider the following string:
  if(0) if(1) other else other

It will produce the following two trees:

Parse trees for Dangling Else Problem

Tree 1 (else attached to the outer if):
  statement
    if-stmt: if ( exp ) statement else statement
      exp: 0
      statement
        if-stmt: if ( exp ) statement
          exp: 1
          statement: other
      statement: other

Tree 2 (else attached to the inner if). Correct (reason?):
  statement
    if-stmt: if ( exp ) statement
      exp: 0
      statement
        if-stmt: if ( exp ) statement else statement
          exp: 1
          statement: other
          statement: other

Solutions for Dangling Problems

The most-closely-nested rule is easy to state, but hard to put into the grammar itself. There are two possibilities for dealing with the dangling else:

1. Always associate the else-part with the nearest if-statement that does not yet have an associated else-part.
2. Use a bracketing keyword to remove the ambiguity:
     if-stmt → if ( exp ) stmt end | if ( exp ) stmt else stmt end
   Here end is the bracketing keyword.
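The first possibility falls out naturally in a hand-written recursive-descent parser: the innermost call that is still open sees the else first and claims it. The sketch below assumes a tuple ("if", condition, then-part, else-part) as the tree shape; tokenize, parse_statement and parse are hypothetical names.

```python
import re


def tokenize(text):
    """Split the input into tokens; any other characters are silently skipped."""
    return re.findall(r"if|else|other|[()01]", text)


def parse_statement(tokens, pos=0):
    """Parse statement -> if-stmt | other; return (tree, next position)."""
    if tokens[pos:pos + 1] == ["other"]:
        return "other", pos + 1
    header_ok = (
        tokens[pos:pos + 2] == ["if", "("]
        and tokens[pos + 2:pos + 3] in (["0"], ["1"])
        and tokens[pos + 3:pos + 4] == [")"]
    )
    if not header_ok:
        raise SyntaxError(f"unexpected input at token {pos}")
    condition = tokens[pos + 2]
    then_part, pos = parse_statement(tokens, pos + 4)
    else_part = None
    # Most-closely-nested rule: the innermost unfinished if takes the else.
    if tokens[pos:pos + 1] == ["else"]:
        else_part, pos = parse_statement(tokens, pos + 1)
    return ("if", condition, then_part, else_part), pos


def parse(text):
    tokens = tokenize(text)
    tree, pos = parse_statement(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected input at token {pos}")
    return tree


print(parse("if(0) if(1) other else other"))
```

For the string above, the result nests the else inside the inner if, which matches the tree in which the else belongs to the nearest if.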

EBNF

Standard Backus-Naur Form (BNF): the meta-symbols are →, |, and ε.

Extended BNF (EBNF): new meta-symbols [ ] and { }. The need for ε is largely eliminated by these new symbols.

Brackets [ ] mean optional (like ? in regular expressions):
  exp → term '|' exp | term
becomes:
  exp → term [ '|' exp ]
(the quoted '|' is a terminal symbol, not the meta-symbol), and
  if-stmt → if ( exp ) stmt | if ( exp ) stmt else stmt
becomes:
  if-stmt → if ( exp ) stmt [ else stmt ]

EBNF continued

Braces { } mean repetition:
  exp → exp + term | term
becomes:
  exp → term { + term }

Choices:
  exp → exp + term | exp - term | term
  exp → term { + term } | term { - term }
Are they the same?

EBNF expression example

  exp → term { addop term }
  addop → + | -
  term → factor { mulop factor }
  mulop → *
  factor → ( exp ) | number

Syntax Diagram for EBNF

[Figure: syntax diagrams for exp (a term, with a loop back through addop and another term) and for factor (either ( exp ) or number).]
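The EBNF expression grammar maps almost line for line onto a recursive-descent parser: each non-terminal becomes a function, and each { ... } becomes a while loop. The following is a minimal sketch that evaluates the expression while parsing. The function name evaluate and the regular expression used for tokenizing are assumptions, not part of the slides.

```python
import re


def evaluate(text):
    """Parse and evaluate `text` using the EBNF expression grammar."""
    tokens = re.findall(r"\d+|[-+*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def exp():                       # exp -> term { addop term }
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():                      # term -> factor { mulop factor }
        value = factor()
        while peek() == "*":
            advance()
            value *= factor()
        return value

    def factor():                    # factor -> ( exp ) | number
        if peek() == "(":
            advance()
            value = exp()
            if peek() != ")":
                raise SyntaxError("expected )")
            advance()
            return value
        token = peek()
        if token is None or not token.isdigit():
            raise SyntaxError(f"expected number, got {token!r}")
        advance()
        return int(token)

    result = exp()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result


print(evaluate("(34-3)*42"))  # 1302
```

Because the loops accumulate from left to right, operators of equal precedence associate to the left, so 10-4-3 is read as (10-4)-3.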