Language Processing Systems. Syntax Analysis (Parsing). Some Basic Definitions.


Language Processing Systems
Prof. Mohamed Hamada
Software Engineering Lab.
The University of Aizu, Japan

Syntax Analysis (Parsing)

Some Basic Definitions

syntax: the way in which words are put together to form phrases, clauses, or sentences. In a programming language, the rules governing the formation of statements.

syntax analysis: the task concerned with fitting a sequence of tokens into a specified syntax.

parsing: to break a sentence down into its component parts of speech, with an explanation of the form, function, and syntactical relationship of each part.

parsing = lexical analysis + syntax analysis

semantic analysis: the task concerned with calculating the program's meaning.

Syntactic structure: the syntactic structure of programming languages can be informally expressed by the following hierarchy, from the largest unit to the smallest:

    Program → Block(s) → Statement(s) → Expression(s) → Token(s)

The stages that process a source program are:

1. Lexical analyzer: reads the source program one character at a time ("get next char" / "next char"). It uses regular expressions to define tokens and finite automata to recognize tokens.
2. Syntax analyzer: requests tokens from the lexical analyzer ("get next token" / "next token"). It uses top-down parsing or bottom-up parsing to construct a parse tree.
3. Symbol table: contains a record for each identifier.
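To make the lexical analyzer stage concrete, here is a minimal sketch of a regular-expression-based tokenizer. The token classes (`id`, `num`, `op`) and the function name `tokenize` are assumptions chosen for illustration; the slides do not define a token set.

```python
import re

# Hypothetical token classes for a tiny expression language (not from the slides).
TOKEN_SPEC = [
    ("id",   r"[A-Za-z_][A-Za-z0-9_]*"),
    ("num",  r"\d+"),
    ("op",   r"[+*()]"),
    ("skip", r"\s+"),
]
# One master pattern with a named group per token class.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token_class, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "skip":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

A syntax analyzer would consume this list one token at a time, which corresponds to the "get next token" request in the pipeline above.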

Syntax Errors and Error Recovery

Parsing errors include:
1. misspelling of an identifier, keyword, or operator
2. arithmetic expressions with unbalanced parentheses
3. punctuation errors, such as using a comma in place of a semicolon
4. missing brackets, semicolons, etc.

The error handler in a parser has the following jobs:
1. report the presence of errors clearly and accurately
2. recover from each error quickly
3. not slow down the processing of programs

Example: The following C code shows some examples of syntax errors:

    #include<stdio.h>
    int max(int i ; int j)
    {
        if(i>j) return(i)
        return(j);
    }
    void main()
    {
        int x, y,
        scanf("%d %d", x, y);
        printf("%d", max(x,y) ;
    }

Example: A typical compilation of this erroneous program gives the following list of errors:
1. error C2235: ';' in formal parameter list
2. error C2059: syntax error : ')'
3. error C2239: unexpected token 'f' following declaration of 'j'
4. error C2078: too many initializers
5. error C2660: 'max' : function does not take 2 parameters
6. error C2143: syntax error : missing ')' before ';'

Example: The correct version of this program is

    #include<stdio.h>
    int max(int i, int j)
    {
        if(i>j) return(i);
        return(j);
    }
    void main()
    {
        int x, y;
        scanf("%d %d", &x, &y);
        printf("%d", max(x,y));
    }

Note that scanf must be given the addresses &x and &y. Passing x and y, as in the erroneous version, is not a syntax error, so the parser does not report it.

Definition of Context-Free Grammars

A context-free grammar G = (T, N, S, P) consists of:
1. T, a set of terminals (scanner tokens).
2. N, a set of nonterminals (syntactic variables generated by productions).
3. S, a designated start nonterminal.
4. P, a set of productions. Each production has the form A ::= α, where A is a nonterminal and α is a sentential form, i.e., a string of zero or more grammar symbols (terminals/nonterminals).
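The four-part definition of a context-free grammar maps naturally onto a small data structure. The sketch below is one possible encoding, not a prescribed one: the class name `Grammar`, its field names, the `check` method, and the sample expression grammar `EXPR` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset      # T: scanner tokens
    nonterminals: frozenset   # N: syntactic variables
    start: str                # S: designated start nonterminal
    productions: tuple        # P: pairs (A, alpha); alpha is a tuple of symbols, () means λ

    def check(self):
        """Return a list of problems; an empty list means the grammar is well-formed."""
        problems = []
        if self.terminals & self.nonterminals:
            problems.append("terminals and nonterminals overlap")
        if self.start not in self.nonterminals:
            problems.append("start symbol is not a nonterminal")
        symbols = self.terminals | self.nonterminals
        for head, body in self.productions:
            if head not in self.nonterminals:
                problems.append(f"left-hand side {head!r} is not a nonterminal")
            for sym in body:
                if sym not in symbols:
                    problems.append(f"unknown symbol {sym!r} in {head} -> {' '.join(body)}")
        return problems


# Illustrative grammar for arithmetic expressions (an assumed example).
EXPR = Grammar(
    terminals=frozenset({"+", "*", "(", ")", "id"}),
    nonterminals=frozenset({"E", "T", "F"}),
    start="E",
    productions=(
        ("E", ("E", "+", "T")), ("E", ("T",)),
        ("T", ("T", "*", "F")), ("T", ("F",)),
        ("F", ("(", "E", ")")), ("F", ("id",)),
    ),
)
```

Calling `EXPR.check()` returns an empty list, while a grammar whose start symbol is missing from N or whose productions mention undeclared symbols yields one message per problem.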

Syntax Analysis

Grammars offer several significant advantages:
1. Easy to understand and construct programs
2. Easy parsing
3. Easy error detection and handling
4. Easy language extension

Syntax Analysis Problem Statement: To find a derivation sequence in a grammar G for the input token stream (or say that none exists).

Parsing

Parsing methods fall into two families:
1. Top-Down Parsing: Predictive Parsing, LL(k) Parsing
2. Bottom-Up Parsing: Shift-reduce Parsing, LR(k) Parsing

Top-down parsers start constructing the parse tree at the top (root) of the tree and move down towards the leaves. They are easy to implement by hand, but work only with restricted grammars. Example: predictive parsers.

Consider the grammar:

    E → E + T | T
    T → T * F | F
    F → ( E ) | id

A top-down parser might loop forever when parsing an expression using this grammar: to expand E it first tries E → E + T, which requires expanding E again before any input has been consumed.

A grammar that has at least one production of the form A → Aα is a left-recursive grammar. Top-down parsers do not work with left-recursive grammars. Left recursion can often be eliminated by rewriting the grammar.
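The usual rewrite for immediate left recursion replaces A → Aα | β with A → β A' and A' → α A' | λ. The following is a minimal sketch of that transformation, assuming a grammar stored as a dictionary from nonterminal to a list of alternatives (tuples of symbols, with the empty tuple standing for λ). The function name and the convention of naming the new nonterminal by appending a prime are assumptions, and indirect left recursion is not handled.

```python
def eliminate_immediate_left_recursion(grammar):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | λ.

    grammar: dict mapping nonterminal -> list of alternatives (tuples of symbols).
    Assumes head + "'" is not already used as a symbol name.
    """
    result = {}
    for head, alternatives in grammar.items():
        # alpha parts of the left-recursive alternatives A -> A alpha
        recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
        # beta alternatives that do not start with A
        others = [alt for alt in alternatives if not alt or alt[0] != head]
        if not recursive:
            result[head] = list(alternatives)
            continue
        new_head = head + "'"
        result[head] = [beta + (new_head,) for beta in others]
        result[new_head] = [alpha + (new_head,) for alpha in recursive] + [()]
    return result
```

Applied to the expression grammar with E → E + T | T and T → T * F | F, this yields E → T E', E' → + T E' | λ, T → F T', T' → * F T' | λ, and leaves F unchanged.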

This left-recursive grammar:

    E → E + T | T
    T → T * F | F
    F → ( E ) | id

can be rewritten to eliminate the immediate left recursion:

    E  → T E'
    E' → + T E' | λ
    T  → F T'
    T' → * F T' | λ
    F  → ( E ) | id

Predictive Parsing

Consider the grammar:

    stmt → if expr then stmt else stmt
         | while expr do stmt
         | begin stmt_list end

A parser for this grammar can be written with the following simple structure:

    switch(gettoken())
    {
        case if:    ... break;
        case while: ... break;
        case begin: ... break;
        default:    reject input;
    }

Based only on the first token, the parser knows which rule to use to derive a statement. Therefore this is called a predictive parser.

The following grammar:

    stmt → if expr then stmt else stmt
         | if expr then stmt

cannot be parsed by a predictive parser that looks one element ahead, because both alternatives begin with the same token. But the grammar can be rewritten:

    stmt  → if expr then stmt stmt'
    stmt' → else stmt | λ

where λ is the empty string.

Rewriting a grammar to eliminate multiple productions starting with the same token is called left factoring. The basic idea is, in general, as follows:
1. let A → αβ1 | αβ2 be two production rules for the nonterminal symbol A
2. if the input begins with a nonempty string derived from α
3. and we do not know whether to expand A to αβ1 or αβ2
4. then we may defer the decision by expanding A to αA'
5. after seeing the input derived from α, we expand A' to β1 or to β2
6. this means that, left-factored, the original productions become

    A  → αA'
    A' → β1 | β2

A Predictive Parser

How does it work?
1. Construct the parsing table from the given grammar
2. Apply the predictive parsing algorithm to construct the parse tree

Step 1: construct the parsing table from the given grammar. The following algorithm shows how we can construct the parsing table:

Input: a grammar G
Output: the corresponding parsing table M
Method: For each production A → α of the grammar do the following steps:
1. For each terminal a in FIRST(α), add A → α to M[A, a].
2. If λ is in FIRST(α), add A → α to M[A, b] for each terminal b in FOLLOW(A).
3. If λ is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A, $].

How are the FIRST and FOLLOW sets constructed?
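The left-factoring idea can be sketched in a few lines. This is an illustrative single pass for one nonterminal, assuming alternatives are tuples of symbols and the empty tuple stands for λ; the function names and the scheme of appending primes to create new nonterminals are assumptions. Alternatives are grouped by their first symbol only, so a full implementation would repeat the pass until no two alternatives share a prefix.

```python
def common_prefix(a, b):
    """Longest common prefix of two symbol tuples."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def left_factor(head, alternatives):
    """One pass of left factoring for a single nonterminal.

    A -> alpha beta1 | alpha beta2 becomes A -> alpha A', A' -> beta1 | beta2.
    Returns a dict of the resulting productions.
    """
    groups = {}
    for alt in alternatives:
        key = alt[0] if alt else None   # group by the first symbol
        groups.setdefault(key, []).append(alt)

    factored = {head: []}
    fresh = head
    for key, group in groups.items():
        if key is None or len(group) == 1:
            factored[head].extend(group)   # nothing to factor
            continue
        prefix = group[0]
        for alt in group[1:]:
            prefix = common_prefix(prefix, alt)
        fresh += "'"                       # hypothetical naming of the new nonterminal
        factored[head].append(prefix + (fresh,))
        factored[fresh] = [alt[len(prefix):] for alt in group]
    return factored
```

For the two `if` productions above, the pass produces stmt → if expr then stmt stmt' and stmt' → else stmt | λ, matching the rewritten grammar.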

The Parsing Table

How are the FIRST and FOLLOW sets constructed?

Example. Given this grammar:

    E  → T E'
    E' → + T E' | λ
    T  → F T'
    T' → * F T' | λ
    F  → ( E ) | id

how is this parsing table built?

TABLE M (rows are nonterminals, columns are input symbols):

    Nonterminal | id        | +            | *            | (          | )       | $
    E           | E → T E'  |              |              | E → T E'   |         |
    E'          |           | E' → + T E'  |              |            | E' → λ  | E' → λ
    T           | T → F T'  |              |              | T → F T'   |         |
    T'          |           | T' → λ       | T' → * F T'  |            | T' → λ  | T' → λ
    F           | F → id    |              |              | F → ( E )  |         |

FIRST and FOLLOW

We need to build a FIRST set and a FOLLOW set for each symbol in the grammar. The elements of FIRST and FOLLOW are terminal symbols; in addition, FIRST may contain λ and FOLLOW may contain the end marker $.

FIRST(α) is the set of terminal symbols that can begin any string derived from α.

FOLLOW(α) is the set of terminal symbols that can follow α: t ∈ FOLLOW(α) ⇔ there is a derivation containing αt.

Rules to Create FIRST

FIRST rules:
1. If X is a terminal, FIRST(X) = {X}
2. If X → λ, then λ ∈ FIRST(X)
3. If X → Y1 Y2 ... Yk and Y1 ... Yi-1 ⇒* λ and a ∈ FIRST(Yi), then a ∈ FIRST(X)

FIRST SETS:

    FIRST(id) = {id}    FIRST(*) = {*}    FIRST(+) = {+}    FIRST(() = {(}    FIRST()) = {)}
    FIRST(E') = {+, λ}
    FIRST(T') = {*, λ}
    FIRST(F)  = {(, id}
    FIRST(T)  = FIRST(F) = {(, id}
    FIRST(E)  = FIRST(T) = {(, id}

Rules to Create FOLLOW

FOLLOW rules:
1. If S is the start symbol, then $ ∈ FOLLOW(S)
2. If A → αBβ and a ∈ FIRST(β), a ≠ λ, then a ∈ FOLLOW(B)
3. If A → αB, or A → αBβ with λ ∈ FIRST(β), then FOLLOW(A) ⊆ FOLLOW(B)

FOLLOW SETS, built step by step:

    FOLLOW(E)  = {$}           by rule 1
    FOLLOW(E)  = {), $}        by rule 2 applied to F → ( E )
    FOLLOW(E') = {), $}        by rule 3 applied to E → T E'
    FOLLOW(T)  = {+, ), $}     + by rule 2 applied to E → T E'; ) and $ by rule 3, since E' ⇒ λ
    FOLLOW(T') = {+, ), $}     by rule 3 applied to T → F T'
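The FIRST and FOLLOW sets for the example grammar can be checked mechanically. The sketch below uses the standard fixed-point formulation: keep applying the rules until no set changes. The dictionary encoding of the grammar, the helper names, and the use of the string "λ" as a marker are assumptions made for this illustration.

```python
LAMBDA = "λ"
END = "$"

# The example grammar after left-recursion elimination; () is the λ alternative.
GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",)],
}


def first_of_string(symbols, first):
    """FIRST of a sequence of symbols, given the FIRST sets of the nonterminals."""
    out = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})   # FIRST of a terminal is the terminal itself
        out |= sym_first - {LAMBDA}
        if LAMBDA not in sym_first:
            return out
    out.add(LAMBDA)                         # every symbol (or none) can derive λ
    return out


def compute_first(grammar):
    first = {head: set() for head in grammar}
    changed = True
    while changed:                          # iterate to a fixed point
        changed = False
        for head, alternatives in grammar.items():
            for body in alternatives:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def compute_follow(grammar, first, start):
    follow = {head: set() for head in grammar}
    follow[start].add(END)                  # the end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for body in alternatives:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue            # FOLLOW is computed for nonterminals only
                    rest = first_of_string(body[i + 1:], first)
                    new = rest - {LAMBDA}
                    if LAMBDA in rest:      # the tail can vanish, so FOLLOW(head) applies
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow
```

Running `compute_first(GRAMMAR)` and then `compute_follow(GRAMMAR, first, "E")` reproduces the sets derived by hand for this grammar, for example FOLLOW(F) = {+, *, ), $}.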

    FOLLOW(F)  = {+, ), $}     by rule 3 applied to T → F T', since T' ⇒ λ
    FOLLOW(F)  = {+, *, ), $}  * added because * ∈ FIRST(T') in T → F T'

Summary for the grammar

    E  → T E'
    E' → + T E' | λ
    T  → F T'
    T' → * F T' | λ
    F  → ( E ) | id

FIRST SETS:

    FIRST(E') = {+, λ}
    FIRST(T') = {*, λ}
    FIRST(F)  = {(, id}
    FIRST(T)  = {(, id}
    FIRST(E)  = {(, id}

FOLLOW SETS:

    FOLLOW(E)  = {), $}
    FOLLOW(E') = {), $}
    FOLLOW(T)  = {+, ), $}
    FOLLOW(T') = {+, ), $}
    FOLLOW(F)  = {+, *, ), $}

Rules to Build the Parsing Table

1. If A → α: if a ∈ FIRST(α), add A → α to M[A, a].

Applying rule 1 to every production with a non-λ right-hand side fills these entries of table M:

    M[E, id]  = M[E, (] = E → T E'
    M[E', +]  = E' → + T E'
    M[T, id]  = M[T, (] = T → F T'
    M[T', *]  = T' → * F T'
    M[F, (]   = F → ( E )
    M[F, id]  = F → id

2. If A → α: if λ ∈ FIRST(α), add A → α to M[A, b] for each terminal b ∈ FOLLOW(A).

3. If A → α: if λ ∈ FIRST(α) and $ ∈ FOLLOW(A), add A → α to M[A, $].

In this grammar only E' → λ and T' → λ have λ in the FIRST set of their right-hand side, so rules 2 and 3 add:

    M[E', )] = M[E', $] = E' → λ                since FOLLOW(E') = {), $}
    M[T', +] = M[T', )] = M[T', $] = T' → λ     since FOLLOW(T') = {+, ), $}

Together with the entries produced by rule 1, these complete table M. Every cell holds at most one production, so one input symbol always suffices to choose the production. All remaining cells are empty and signal a syntax error.
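The three table-building rules, followed by the standard stack-based predictive parsing loop, can be sketched as below. The FIRST and FOLLOW sets are entered by hand from the example; the function names, the dictionary-keyed table, and the boolean result are assumptions for illustration. The driver only accepts or rejects input; it does not build the parse tree.

```python
LAMBDA = "λ"

GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",)],
}
# FIRST and FOLLOW sets of the example grammar, entered by hand.
FIRST = {"E": {"(", "id"}, "E'": {"+", LAMBDA}, "T": {"(", "id"},
         "T'": {"*", LAMBDA}, "F": {"(", "id"}}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"+", "*", ")", "$"}}


def first_of_string(symbols):
    """FIRST of a right-hand side; a terminal's FIRST set is the terminal itself."""
    out = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        out |= sym_first - {LAMBDA}
        if LAMBDA not in sym_first:
            return out
    out.add(LAMBDA)
    return out


def build_table():
    """Fill M[A, a] following rules 1 to 3; raise if a cell would hold two productions."""
    table = {}
    for head, alternatives in GRAMMAR.items():
        for body in alternatives:
            body_first = first_of_string(body)
            targets = body_first - {LAMBDA}        # rule 1
            if LAMBDA in body_first:
                targets |= FOLLOW[head]            # rules 2 and 3 ($ is stored in FOLLOW)
            for terminal in targets:
                if (head, terminal) in table:
                    raise ValueError(f"conflict at M[{head}, {terminal}]: not LL(1)")
                table[head, terminal] = body
    return table


def parse(tokens, table, start="E"):
    """Return True if the token list can be derived from the start symbol."""
    stack = ["$", start]
    tokens = list(tokens) + ["$"]
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in GRAMMAR:                         # nonterminal: consult the table
            body = table.get((top, look))
            if body is None:
                return False                       # empty cell: syntax error
            stack.extend(reversed(body))           # push the right-hand side in reverse
        elif top == look:                          # terminal or $: match and advance
            pos += 1
        else:
            return False
    return pos == len(tokens)
```

With `table = build_table()`, the call `parse(["id", "+", "id", "*", "id"], table)` accepts the input, while `parse(["id", "+"], table)` is rejected because M[T, $] is empty.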