Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

Similar documents
COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

A Simple Syntax-Directed Translator

A programming language requires two major definitions A simple one pass compiler

Building Compilers with Phoenix

A simple syntax-directed

Examples of attributes: values of evaluated subtrees, type information, source file coordinates,

Question Points Score

2.2 Syntax Definition

Review main idea syntax-directed evaluation and translation. Recall syntax-directed interpretation in recursive descent parsers

Lexical and Syntax Analysis. Top-Down Parsing

Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

Syntax Analysis Part I

Lexical and Syntax Analysis

More Assigned Reading and Exercises on Syntax (for Exam 2)

Earlier edition Dragon book has been revised. Course Outline Contact Room 124, tel , rvvliet(at)liacs(dot)nl

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

Principles of Programming Languages COMP251: Syntax and Grammars

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

CSE 3302 Programming Languages Lecture 2: Syntax

Parsing II Top-down parsing. Comp 412

Chapter 3. Parsing #1

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

Parsing #1. Leonidas Fegaras. CSE 5317/4305 L3: Parsing #1 1

How do LL(1) Parsers Build Syntax Trees?

Syntax Analysis Check syntax and construct abstract syntax tree

Syntax. In Text: Chapter 3

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Syntax-Directed Translation. Lecture 14

Context-free grammars

COP 3402 Systems Software Syntax Analysis (Parser)

Using an LALR(1) Parser Generator

Compiler Construction: Parsing

CS 314 Principles of Programming Languages

CSCI312 Principles of Programming Languages

RYERSON POLYTECHNIC UNIVERSITY DEPARTMENT OF MATH, PHYSICS, AND COMPUTER SCIENCE CPS 710 FINAL EXAM FALL 96 INSTRUCTIONS

MIT Top-Down Parsing. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

LANGUAGE PROCESSORS. Introduction to Language processor:

Part 3. Syntax analysis. Syntax analysis 96

Lecture 8: Deterministic Bottom-Up Parsing

CSCE 314 Programming Languages

Chapter 3. Describing Syntax and Semantics ISBN

Syntax Analysis. COMP 524: Programming Language Concepts Björn B. Brandenburg. The University of North Carolina at Chapel Hill

Lecture 7: Deterministic Bottom-Up Parsing

Table-Driven Top-Down Parsers

Lexical and Syntax Analysis (2)

Context-free grammars (CFG s)

Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.

PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309

Top down vs. bottom up parsing

Principles of Programming Languages COMP251: Syntax and Grammars

3.5 Practical Issues PRACTICAL ISSUES Error Recovery

Chapter 4. Syntax - the form or structure of the expressions, statements, and program units

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:

Parsing Algorithms. Parsing: continued. Top Down Parsing. Predictive Parser. David Notkin Autumn 2008

Theoretical Part. Chapter one:- - What are the Phases of compiler? Answer:

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax

JavaCC Parser. The Compilation Task. Automated? JavaCC Parser

PSD3A Principles of Compiler Design Unit : I-V. PSD3A- Principles of Compiler Design

Outline CS412/413. Administrivia. Review. Grammars. Left vs. Right Recursion. More tips forll(1) grammars Bottom-up parsing LR(0) parser construction

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield

CS606- compiler instruction Solved MCQS From Midterm Papers

More on Syntax. Agenda for the Day. Administrative Stuff. More on Syntax In-Class Exercise Using parse trees

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

PRINCIPLES OF COMPILER DESIGN

CSE 12 Abstract Syntax Trees

Comp 411 Principles of Programming Languages Lecture 3 Parsing. Corky Cartwright January 11, 2019

Derivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012

Syntax Analysis. Chapter 4

Chapter 4. Lexical and Syntax Analysis

8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing

1. Explain the input buffer scheme for scanning the source program. How the use of sentinels can improve its performance? Describe in detail.

Downloaded from Page 1. LR Parsing

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Syntax Analysis. Amitabha Sanyal. ( as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

Compilers. Predictive Parsing. Alex Aiken

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Syntax/semantics. Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing

Parsing III. (Top-down parsing: recursive descent & LL(1) )

Compiler Design Concepts. Syntax Analysis

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part2 3.3 Parse Trees and Abstract Syntax Trees

LL(k) Compiler Construction. Choice points in EBNF grammar. Left recursive grammar

LL(k) Compiler Construction. Top-down Parsing. LL(1) parsing engine. LL engine ID, $ S 0 E 1 T 2 3

Midterm Solutions COMS W4115 Programming Languages and Translators Wednesday, March 25, :10-5:25pm, 309 Havemeyer

CMPT 379 Compilers. Parse trees

Chapter 4. Lexical and Syntax Analysis. Topics. Compilation. Language Implementation. Issues in Lexical and Syntax Analysis.

Review of CFGs and Parsing II Bottom-up Parsers. Lecture 5. Review slides 1

Introduction to Compiler Construction

Compiler Design Aug 1996

Compiler Code Generation COMP360

10/5/17. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntax Analysis

10/26/17. Attribute Evaluation Order. Attribute Grammar for CE LL(1) CFG. Attribute Grammar for Constant Expressions based on LL(1) CFG

Introduction to Parsing. Comp 412

UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE. M.Sc. in Computational Science & Engineering

1. The output of lexical analyser is a) A set of RE b) Syntax Tree c) Set of Tokens d) String Character

VIVA QUESTIONS WITH ANSWERS

COMPILERS BASIC COMPILER FUNCTIONS

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Syntax Analysis. The Big Picture. The Big Picture. COMP 524: Programming Languages Srinivas Krishnan January 25, 2011

Transcription:

Concepts Introduced in Chapter 2 A more detailed overview of the compilation process. Parsing Scanning Semantic Analysis Syntax-Directed Translation Intermediate Code Generation Context-Free Grammar A grammar can be used to describe the possible hierarchical structure of a program. A context free grammar has 4 components: A set of tokens, known as terminal symbols. A set of nonterminals. A set of productions where each production consists of a nonterminal, called the left side of the production, an arrow, and a sequence of tokens and/or nonterminals, called the right side of the production. A designation of one of the nonterminals as the start symbol. The token strings that can be derived from the start symbol form the language defined by the grammar. Example Grammar and Derivation list list + digit list list - digit list digit digit 0 1 2 3 4 5 6 7 8 9 list => list+digit => list digit + digit => digit digit + digit => 9 digit + digit => 9 5 + digit => 9 5 + 2 Parse Trees A parse tree pictorially shows how the start symbol of a grammar derives a specific string in the language. Given a context free grammar, a parse tree is a tree with the following properties: The root is labeled by the start symbol. Each leaf is labeled by a token or by. Each interior node is labeled by a nonterminal. If A is the nonterminal labeling some interior node and X1, X2,..., Xn are the labels of the children of that node from left to right, then A X1X2...Xn is a production.

Ambiguous Grammars The leaves (tokens) of a parse tree read from left to right form a legal string in the language defined by the associated grammar. If a grammar can have more than one parse tree generating the same string of tokens, then the grammar is said to be ambiguous. For a grammar representing a programming language, we need to ensure that the grammar is unambiguous or there are additional rules to resolve the ambiguities. string string + string string string string 0 1 2 3 4 5 6 7 8 9 Precedence and Associativity Precedence determines which operator is applied first when different operators appear in an expression and parentheses do not explicitly indicate the order. Associativity is used to define the order of operations when there are multiple operators with the same precedence in an expression. Left associativity means that (x op1 y) is applied first in the expression (x op1 y op2 z) when op1 and op2 have the same precedence. Right associativity means that (y op2 z) is applied first in the expression (x op1 y op2 z) when op1 and op2 have the same precedence. Converting Infix to Postfix If E is a variable or constant, then the postfix notation for E is E itself. If E is an expression of the form E1 op E2, where op is any binary operator, then the postfix notation for E is E1' E2' op, where E1' and E2' are the postfix notations for E1 and E2, respectively. If E is an expression of the form ( E1 ), then the postfix notation for E1 is also the postfix notation for E. (9-5)+2 95-2+ 9-(5+2) 952+- Syntax-Directed Definition Uses a grammar to define the syntactic structure. Associates attributes with each grammar symbol. Associates semantic rules for computing the values of the attributes.

Translation Scheme A translation scheme is a grammar with program fragments called semantic actions that are embedded within the right hand side of the productions. Unlike a syntax-directed definition, the order of the evaluation of the semantic rules is explicitly shown. Parsing Parsing is the process of determining if a string of tokens can be generated by a grammar. Parsing Methods Top-Down Construction starts at the root and proceeds to the leaves. Can be more easily constructed by hand. Bottom-Up Construction starts at the leaves and proceeds to the root. Can accept a larger class of grammars. Syntax Trees Concrete Syntax Tree - a parse tree Abstract Syntax Tree Each interior node is an operator rather than a nonterminal. Convenient for translation. Recursive Descent Parsing Top-down method for syntax analysis. A procedure is associated with each nonterminal of a grammar. Can be implemented by hand. Decides which production to use by examining the lookahead symbol. The appropriate procedure is invoked for each nonterminal in the rhs of the production. Predictive parsing means that a single lookahead symbol can be used to determine the procedure to be called for the next nonterminal.

Example Grammar for Recursive Descent Parsing Must not be left recursive. Must be left factored. expr term rest rest + term { print('+') } rest - term { print('-') } rest term 0 { print('0') } term 1 { print('1') }... term 9 { print('9') } Lexical Analysis Terms A token is a group of characters having a collective meaning. id A lexeme is an actual character sequence forming a specific instance of a token. num Characters between tokens are called whitespace. blanks, tabs, newlines, comments Buffering I/O It is too expensive to access a file one character at a time. Buffers are used for both input and output. Data is read from or written to the buffer until another buffer needs to be read or written. Symbol Table Used to save lexemes (identifiers) and their attributes. It is common to initialize a symbol table to include reserved words so the form of an identifier can be handled in a uniform manner. Attributes are stored in the symbol table for later use in semantic checks and translation.

l-values and r-values l-value Used on the left side of an assignment statement. Used to refer to a location. r-value Used on the right side of an assignment statement. Used to refer to a value. Abstract Stack Machine Stack machines are a common form used for the intermediate representation of a program. push v push v onto the stack rvalue l push contents of data location l lvalue l push address of data location l pop throw away value on top of the stack := the r-value on top is placed in the l-value below it and both are popped copy push a copy of the top value on the stack Example of Stack Machine Intermediate Code day := (1461*y) div 4 + (153*m + 2) div 5 + d; lvalue day push 2 push 1461 + rvalue y push 5 * div push 4 + div rvalue d push 153 + rvalue m := * Control Flow Support for control flow is needed when translating statements. label l target of jumps to l; has no other effect goto l next instruction is taken from statement with label l gofalse l pop the top value; jump if it is zero gotrue l pop the top value; jump if it is nonzero halt stop execution