Context free grammars and predictive parsing

Programming Language Concepts and Implementation, Fall 2011, Lecture 6

Mandatory ex 5

Only 8/15 submitted! Why?

Merge:

    public static List<T> Merge<T>(List<T> first, List<T> second)
        where T : IComparable<T>
    {
        List<T> result = new List<T>();
        result.AddRange(first);
        result.AddRange(second);
        result.Sort();
        return result;
    }

Complexity?
Does the right thing, but what did we learn?
Exercises important! Also non-mandatory ones
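The "Complexity?" question can be made concrete. Concatenating and then sorting costs O(n log n) comparisons for n elements in total. If both inputs are already sorted (an assumption here, since the slide does not restate the exercise text), a two-pointer merge needs only O(n). The following is a minimal Java sketch of that alternative, not the course's reference solution; the class name MergeSketch is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class MergeSketch {
    // Assumption: both inputs are already sorted in ascending order.
    static <T extends Comparable<T>> List<T> merge(List<T> first, List<T> second) {
        List<T> result = new ArrayList<>(first.size() + second.size());
        int i = 0, j = 0;
        // Repeatedly take the smaller head element of the two lists.
        while (i < first.size() && j < second.size()) {
            if (first.get(i).compareTo(second.get(j)) <= 0) {
                result.add(first.get(i++));
            } else {
                result.add(second.get(j++));
            }
        }
        // One list is exhausted; copy what is left of the other.
        while (i < first.size()) result.add(first.get(i++));
        while (j < second.size()) result.add(second.get(j++));
        return result;
    }
}
```

The linear bound relies on get(i) being constant time, which holds for array-backed lists but not for linked lists.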

Overview

- Context free grammars
  - Describing programming language syntax
  - Ambiguities and eliminating these
- The parser generator coco/r
- Predictive parsing: Under the hood of coco/r
- Next week: LR parsing

Context free grammars

An example and a derivation

    Expr = Expr + Expr | Expr * Expr | ( Expr ) | num

    Expr => Expr + Expr => Expr + Expr * Expr => 2 + 3*4

Think of it as regular expressions + recursion

Terminology:
- 1 non-terminal: Expr
- 5 terminals (tokens): +, *, (, ), num
- 4 productions (right hand sides)
- Terminals and nonterminals collectively are symbols

Another example

Straight line programs (from book)

    S = S ; S | id := E | print ( L )
    E = id | num | E + E | ( S , E )
    L = E | L , E

A derivation:

    S
    S ; S
    S ; id := E
    id := E ; id := E
    id := num ; id := E
    id := num ; id := E + E
    id := num ; id := E + (S, E)
    id := num ; id := id + (S, E)
    id := num ; id := id + (id := E, E)
    id := num ; id := id + (id := E + E, E)
    id := num ; id := id + (id := E + E, id)
    id := num ; id := id + (id := num + E, id)
    id := num ; id := id + (id := num + num, id)

Official definition

A context free grammar consists of
- A finite set of nonterminals
- A finite set of terminals
- A finite set of productions

A production consists of
- A nonterminal (called the left hand side)
- A string of symbols (terminals or nonterminals)

This is called Backus-Naur Form (BNF)

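Ambiguity

    Expr = Expr + Expr | Expr * Expr | ( Expr ) | num

The same string has two derivations with different parse trees:

    Expr => Expr + Expr => Expr + Expr * Expr => 2 + 3*4
    Expr => Expr * Expr => Expr + Expr * Expr => 2 + 3*4

[Figure: the two parse trees, grouping as 2 + (3 * 4) and as (2 + 3) * 4]

Encoding operator precedence

Multiplication has higher precedence (binds stronger) than addition

One nonterminal per precedence level

    Expr = Expr + Expr | Term
    Term = Term * Term | num | ( Expr )

Exercise:
- How many ways can you parse 2+3*4?
- How about 2 + 3 + 4?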
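To see ambiguity as something countable, the sketch below enumerates every parse tree that the ambiguous grammar Expr = Expr + Expr | Expr * Expr | num allows for a flat sequence of numbers and operators, and records the value each tree computes. It is an illustration only: parentheses are left out, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class AmbiguitySketch {
    // Values of every parse tree for nums[lo..hi] joined by ops[lo..hi-1],
    // under the ambiguous grammar Expr = Expr + Expr | Expr * Expr | num.
    static List<Integer> values(int[] nums, char[] ops, int lo, int hi) {
        List<Integer> out = new ArrayList<>();
        if (lo == hi) {                 // production Expr = num
            out.add(nums[lo]);
            return out;
        }
        // Choose each operator in turn as the root of the parse tree.
        for (int k = lo; k < hi; k++) {
            for (int l : values(nums, ops, lo, k)) {
                for (int r : values(nums, ops, k + 1, hi)) {
                    out.add(ops[k] == '+' ? l + r : l * r);
                }
            }
        }
        return out;
    }

    // Convenience overload covering the whole input.
    static List<Integer> values(int[] nums, char[] ops) {
        return values(nums, ops, 0, nums.length - 1);
    }
}
```

For 2 + 3 * 4 the list has two entries, 14 and 20, one per tree. For 2 + 3 + 4 it also has two entries, both 9: the grammar is still ambiguous there, even though the value happens to agree.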

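Ambiguity and associativity

    Expr = Expr - Expr | num

[Figure: the two parse trees for 5 - 3 - 2, grouping as (5 - 3) - 2 and as 5 - (3 - 2)]

Forcing left associativity

    Expr = Expr - num | num

Exercise

What ambiguities exist in the following grammar, and how do we get rid of them?

    Expr = Expr + Expr | Expr * Expr | Expr - Expr | Expr / Expr | ( Expr ) | num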
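Why associativity matters for subtraction can be checked numerically. The sketch below evaluates a chain of subtractions both ways: the left-recursive production Expr = Expr - num | num corresponds to folding from the left, while a mirror-image right-recursive production Expr = num - Expr | num (not on the slides, shown only for contrast) folds from the right. Names are hypothetical.

```java
class AssociativitySketch {
    // Left-associative reading, as forced by Expr = Expr - num | num:
    // 5 - 3 - 2 is grouped as (5 - 3) - 2.
    static int evalLeft(int[] nums) {
        int acc = nums[0];
        for (int i = 1; i < nums.length; i++) {
            acc = acc - nums[i];
        }
        return acc;
    }

    // Right-associative reading, as Expr = num - Expr | num would give:
    // 5 - 3 - 2 is grouped as 5 - (3 - 2).
    static int evalRight(int[] nums, int from) {
        if (from == nums.length - 1) {
            return nums[from];
        }
        return nums[from] - evalRight(nums, from + 1);
    }
}
```

For the operands 5, 3, 2 the left reading gives 0 and the right reading gives 4, so the grammar has to pick one.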

Exercise (hints)

* and / have higher precedence than -, +

All operators associate to the left, e.g.,
- 3/6*2 = (3/6)*2 ≠ 3/(6*2)
- 3-6+2 = (3-6)+2 ≠ 3-(6+2)

Exercise (solution)

Original grammar:

    Expr = Expr + Expr | Expr * Expr | Expr - Expr | Expr / Expr | ( Expr ) | num

Encoding operator precedence:

    Expr = Expr + Expr | Expr - Expr | Term
    Term = Term * Term | Term / Term | num | ( Expr )

Encoding associativity:

    Expr = Expr + Term | Expr - Term | Term
    Term = Term * num | Term * ( Expr ) | Term / num | Term / ( Expr ) | num | ( Expr )

or (better):

    Expr = Expr + Term | Expr - Term | Term
    Term = Term * Prim | Term / Prim | Prim
    Prim = num | ( Expr )

Associativity of operators

Most binary operators are left associative, e.g., +, -, *, /

Few are right associative, e.g. = in C: x = y = 2 is parsed as x = (y = 2)

Forcing right associativity

    Expr = ident = Expr | ident

Some are not associative, e.g., 1<2<3 is not legal

    LogExpr = Expr < Expr | Expr = Expr | ...

Ambiguity: Dangling else

Consider the grammar

    Stmt = if Expr then Stmt else Stmt
         | if Expr then Stmt
         | id = Expr

Ambiguity: How to parse?

    if Expr then if Expr then Stmt else Stmt

Ambiguity: Dangling else

Resolving the ambiguity

    Stmt           = Matched_Stmt | Unmatched_Stmt
    Matched_Stmt   = if Expr then Matched_Stmt else Matched_Stmt
                   | id = Expr
    Unmatched_Stmt = if Expr then Stmt
                   | if Expr then Matched_Stmt else Unmatched_Stmt

Better to handle this using parser tricks. See later.

Example: Mini Java

From MCIJ (note mixed notation)

SQL specification (in extended BNF)

    ...
    <query specification> ::= SELECT [ <set quantifier> ] <select list> <table expression>
    <select list> ::= <asterisk> | <select sublist> [ { <comma> <select sublist> }... ]
    <select sublist> ::= <derived column> | <qualifier> <period> <asterisk>
    <derived column> ::= <value expression> [ <as clause> ]
    <as clause> ::= [ AS ] <column name>
    <table expression> ::= <from clause> [ <where clause> ] [ <group by clause> ] [ <having clause> ]
    <from clause> ::= FROM <table reference> [ { <comma> <table reference> }... ]
    ...

Source: http://savage.net.au/SQL/sql-92.bnf

Extended BNF

Example

    Expr = Term { + Term | - Term }
    Term = num { * num }

Extra symbols
- {α} means zero, one or many α
- [α] means zero or one α
- (α) is used for grouping

EBNF is no more expressive than BNF, only more convenient
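One practical payoff of EBNF is that a repetition {α} maps directly onto a loop in a hand-written parser. The following Java sketch evaluates the example grammar Expr = Term { + Term | - Term }, Term = num { * num } that way. It is a hand-written illustration, not parser-generator output; the class name, the eval helper, and the character-level scanning are all assumptions made for the sketch.

```java
class EbnfSketch {
    private final String src;
    private int pos = 0;

    EbnfSketch(String src) { this.src = src; }

    // Hypothetical entry point: evaluate a complete input string.
    static int eval(String src) {
        EbnfSketch p = new EbnfSketch(src);
        int v = p.expr();
        if (p.peek() != '\0') {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return v;
    }

    // Skips blanks, then returns the next character ('\0' at end of input).
    private char peek() {
        while (pos < src.length() && src.charAt(pos) == ' ') pos++;
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    // Expr = Term { + Term | - Term }
    private int expr() {
        int n = term();
        while (peek() == '+' || peek() == '-') {   // { ... } becomes a loop
            char op = src.charAt(pos++);
            int n2 = term();
            n = (op == '+') ? n + n2 : n - n2;
        }
        return n;
    }

    // Term = num { * num }
    private int term() {
        int n = num();
        while (peek() == '*') {
            pos++;
            n = n * num();
        }
        return n;
    }

    // num: one or more decimal digits
    private int num() {
        if (!Character.isDigit(peek())) {
            throw new IllegalArgumentException("number expected at " + pos);
        }
        int n = 0;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) {
            n = n * 10 + (src.charAt(pos++) - '0');
        }
        return n;
    }
}
```

Because each loop accumulates into n from the left, the sketch gets left associativity for free, and the Expr/Term split gives * higher precedence than + and -.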

Using coco/r

    COMPILER Expressions
    ...
    PRODUCTIONS
    /*-------------------------------------------------------------------*/
    Expr = Term { '+' Term | '-' Term }.
    Term = number { '*' number }.
    Expressions = Expr.
    END Expressions.

Semantic actions in coco/r

    COMPILER Expressions
    public int res;
    ...
    PRODUCTIONS
    /*-------------------------------------------------------------------*/
    Expr<out int n>          (. int n1, n2; .)
    = Term<out n1>           (. n = n1; .)
      { '+' Term<out n2>     (. n = n+n2; .)
      | '-' Term<out n2>     (. n = n-n2; .)
      }.
    Term<out int n>
    = number                 (. n = Convert.ToInt32(t.val); .)
      { '*' number           (. n = n*Convert.ToInt32(t.val); .)
      }.
    Expressions              (. int n; .)
    = Expr<out n>            (. res = n; .).
    END Expressions.

The generated parser

Method for parsing expressions, in resulting Parser.cs:

    void Expr(out int n) {      // out: pass by reference, similar to ref
      int n1, n2;
      Term(out n1);
      n = n1;
      while (la.kind == 3 || la.kind == 4) {
        if (la.kind == 3) {     // if next token is +
          Get();
          Term(out n2);
          n = n+n2;
        } else {
          Get();
          Term(out n2);
          n = n-n2;
        }
      }
    }

[Figure: Using coco/r with semantic actions]

Predictive parsing

- Top-down parsing method aka LL-parsing
- coco/r generates LL parsers
- Produces left-most derivations
- Guess a production based on the next token

Example grammar 3.11

    S = if E then S else S | begin S L | print E
    L = end | ; S L
    E = num | ident

Example parsing on board

Parser implementation

    final int IF=1, THEN=2, ELSE=3, BEGIN=4, END=5, PRINT=6, SEMI=7, NUM=8, EQ=9;
    int tok = getToken();
    void advance() { tok = getToken(); }
    void eat(int t) { if (tok == t) advance(); else error(); }

    void S() {
      switch (tok) {
        case IF:    eat(IF); E(); eat(THEN); S(); eat(ELSE); S(); break;
        case BEGIN: eat(BEGIN); S(); L(); break;
        case PRINT: eat(PRINT); E(); break;
        default:    error();
      }
    }

    void L() {
      switch (tok) {
        case END:  eat(END); break;
        case SEMI: eat(SEMI); S(); L(); break;
        default:   error();
      }
    }

Parsing table

            S                           L               E
    ------------------------------------------------------------------
    if      S -> if E then S else S
    begin   S -> begin S L
    print   S -> print E
    end                                 L -> end
    ;                                   L -> ; S L
    num                                                 E -> num
    ident                                               E -> ident

Grammar:

    S = if E then S else S | begin S L | print E
    L = end | ; S L
    E = num | ident
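The parsing table can also drive a parser directly, replacing the recursive methods with an explicit stack. The Java sketch below does this for grammar 3.11. It is an illustration under assumptions: tokens are plain strings rather than the integer codes used on the slide, "$" is a made-up end-of-input marker, and the class name is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;

class LlTableSketch {
    // TABLE.get(nonterminal).get(lookahead) = right hand side to expand to.
    static final Map<String, Map<String, List<String>>> TABLE = Map.of(
        "S", Map.of(
            "if",    List.of("if", "E", "then", "S", "else", "S"),
            "begin", List.of("begin", "S", "L"),
            "print", List.of("print", "E")),
        "L", Map.of(
            "end", List.of("end"),
            ";",   List.of(";", "S", "L")),
        "E", Map.of(
            "num",   List.of("num"),
            "ident", List.of("ident")));

    // Returns true if the token list can be derived from S.
    static boolean parse(List<String> tokens) {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("S");
        int pos = 0;
        while (!stack.isEmpty()) {
            String top = stack.pop();
            String la = pos < tokens.size() ? tokens.get(pos) : "$"; // "$": end of input
            if (TABLE.containsKey(top)) {
                // Nonterminal: look up the production for the next token.
                List<String> rhs = TABLE.get(top).get(la);
                if (rhs == null) return false;      // empty table cell: syntax error
                // Push the right hand side reversed so its first symbol is on top.
                for (int i = rhs.size() - 1; i >= 0; i--) {
                    stack.push(rhs.get(i));
                }
            } else {
                // Terminal: must match the next token exactly.
                if (!top.equals(la)) return false;
                pos++;
            }
        }
        return pos == tokens.size();                // all input must be consumed
    }
}
```

Each table cell holds at most one production, which is what lets the parser commit to a choice after looking at a single token.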

Intended learning outcomes

- Construct grammars for programming languages
- Eliminate ambiguity by
  - Encoding operator precedence
  - Encoding operator associativity
- Use coco/r to create parsers and lexers