Exercises II. Exercise: Lexical Analysis


Exercises II
Text adapted from: Alessandro Artale, Free University of Bolzano
Slides adapted from: Enrico Cimitan, Università di Padova

Exercise: Lexical Analysis
Describe the notions of token, token name, lexeme, and attribute, and provide examples of their use.

Exercise: Lexical Analysis
Input: y = 42
The lexemes are y, =, 42.
The tokens are ⟨id, y⟩, ⟨assign⟩, ⟨num, 42⟩.
The token names are id, assign, num.
The attributes of the first and third tokens are y and 42; the second token does not need an attribute.

Exercise: Lexical Analysis
To describe the set of lexemes we need patterns. Examples of patterns expressed by means of regular expressions (REs):
Identifier: [a-zA-Z][a-zA-Z_0-9]* (N.B. keywords ignored)
Number: [0-9]+

Exercise: Lexical Analysis
During lexical analysis there are two kinds of conflicts:
- Several portions of a lexeme are recognized by the same REs (for example, both 4 and 42 match the number pattern)
- The same lexeme is recognized by several REs
Describe how to resolve these conflicts.

Exercise: Lexical Analysis
The conflict between several portions of different lengths is resolved by taking the longest match.
The conflict between several REs on the same lexeme is resolved by taking the RE with the highest precedence.
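The two resolution rules can be illustrated with a minimal Python sketch. It is not the lexer from the slides: the `if` keyword rule, the rule list `RULES`, and the function name `tokenize` are assumptions added only to show precedence (the slides explicitly ignore keywords). Rules are tried in order, a strictly longer match always wins, and on a tie the earlier rule is kept.

```python
import re

# Rules in precedence order: on a tie in length, the earlier rule wins.
RULES = [
    ("if", re.compile(r"if")),                      # hypothetical keyword rule
    ("id", re.compile(r"[a-zA-Z][a-zA-Z_0-9]*")),   # identifier pattern from the slides
    ("num", re.compile(r"[0-9]+")),                 # number pattern from the slides
    ("assign", re.compile(r"=")),
]
WHITESPACE = re.compile(r"[ \t\n]+")


def tokenize(text):
    """Return a list of (token name, lexeme) pairs."""
    pos, tokens = 0, []
    while pos < len(text):
        ws = WHITESPACE.match(text, pos)
        if ws:
            pos = ws.end()
            continue
        best_name, best_len = None, 0
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            # Strictly longer match replaces the current best (longest match);
            # an equally long match does not, so the earlier rule keeps priority.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens
```

On `y = 42` this yields the three tokens of the example; on `if iffy` the first lexeme is a keyword by precedence and the second is an identifier by longest match.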

Exercise: Finite Automata
Describe a lexer that recognizes identifiers and numbers (integers), and show the finite automaton.

Exercise: Finite Automata
Flex code:
    ws [\n \t]+
    %%
    [0-9]+{ws} {printf("int\n");}
    [a-zA-Z][a-zA-Z_0-9]*{ws} {printf("id\n");}
    %%

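Exercise: Finite Automata
[Figure: finite automaton. From the start state, two ε-transitions lead to two branches. The first reads [a-zA-Z], loops on [a-zA-Z_0-9], and after ws prints «id». The second reads [0-9], loops on [0-9], and after ws prints «integer».]

Exercise: Top-Down Parsing
Consider the following grammar with terminals T = { [, ], a, b, c, +, - }:
S → [ S X ] | a
X → + S Y | Y b | ε
Y → - S X c | ε
Provide the parsing table for the LL(1) top-down parser.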

Recall: FIRST()
FIRST(α) = the set of terminals that begin all strings derived from α
FIRST(a) = {a} if a ∈ T
FIRST(ε) = {ε}
FIRST(A) = ∪ FIRST(α) over all productions A → α ∈ P
FIRST(X1 X2 … Xk):
- if for all j = 1, …, i-1: ε ∈ FIRST(Xj), then add FIRST(Xi)\{ε} to FIRST(X1 X2 … Xk)
- if for all j = 1, …, k: ε ∈ FIRST(Xj), then add ε to FIRST(X1 X2 … Xk)

Exercise: Top-Down Parsing
FIRST(a) = {a}, if a ∈ T = { [, ], a, b, c, +, - }
FIRST(Y) = { -, ε }
FIRST(S) = { [, a }
FIRST(X) = {+} ∪ FIRST(Y)\{ε} ∪ {b} ∪ {ε} (since Y derives ε) = { +, -, b, ε }
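The FIRST rules recalled above can be checked mechanically with a small fixed-point iteration. The following is a sketch under stated assumptions: the dictionary encoding of the grammar and the names `GRAMMAR`, `first_of_seq`, and `compute_first` are illustrative and not taken from the slides; the empty list stands for an ε-production.

```python
EPS = "ε"

# Exercise grammar: nonterminal -> list of right-hand sides ([] is ε).
GRAMMAR = {
    "S": [["[", "S", "X", "]"], ["a"]],
    "X": [["+", "S", "Y"], ["Y", "b"], []],
    "Y": [["-", "S", "X", "c"], []],
}


def first_of_seq(seq, first):
    """FIRST of a symbol sequence, given the FIRST sets known so far."""
    out = set()
    for sym in seq:
        # A terminal's FIRST set is the terminal itself.
        sym_first = first[sym] if sym in first else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out          # this symbol cannot vanish: stop here
    out.add(EPS)                # every symbol can derive ε (or seq is empty)
    return out


def compute_first(grammar):
    """Iterate until no FIRST set grows any more."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                new = first_of_seq(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first
```

Running `compute_first(GRAMMAR)` reproduces the three sets computed by hand in the exercise.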

Exercise: Top-Down Parsing
A → α            FIRST(α)
S → [ S X ]      [
S → a            a
X → + S Y        +
X → Y b          - b
X → ε            ε
Y → - S X c      -
Y → ε            ε

Recall: FOLLOW()
FOLLOW(A) = the set of terminals that can immediately follow nonterminal A
- for all (B → α A β) ∈ P do add FIRST(β)\{ε} to FOLLOW(A)
- for all (B → α A β) ∈ P and ε ∈ FIRST(β) do add FOLLOW(B) to FOLLOW(A)
- for all (B → α A) ∈ P do add FOLLOW(B) to FOLLOW(A)
- if A is the start symbol S then add $ to FOLLOW(A)

Exercise: Top-Down Parsing
FOLLOW(X) = { ], c }
FOLLOW(Y) = FOLLOW(X) ∪ {b} = { ], c, b }
FOLLOW(S) = {$}
    ∪ FIRST(X)\{ε} ∪ {]}          (S → [ S X ] and X → ε)
    ∪ FIRST(X)\{ε} ∪ {c}          (Y → - S X c and X → ε)
    ∪ FIRST(Y)\{ε} ∪ FOLLOW(X)    (X → + S Y and Y → ε)
  = {$} ∪ FIRST(X)\{ε} ∪ {]} ∪ {c} ∪ FOLLOW(X)    (since FIRST(Y) ⊆ FIRST(X))
  = { $, +, -, b, ], c }

Exercise: Top-Down Parsing
A      FOLLOW(A)
S      $ + - b ] c
X      ] c
Y      ] c b
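The FOLLOW computation can be sketched in the same fixed-point style. To keep the block short, the FIRST sets are hard-coded from the exercise rather than recomputed; the grammar encoding and the names `FIRST`, `first_of_seq`, and `compute_follow` are assumptions made for illustration.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "S": [["[", "S", "X", "]"], ["a"]],
    "X": [["+", "S", "Y"], ["Y", "b"], []],
    "Y": [["-", "S", "X", "c"], []],
}
# FIRST sets as computed by hand in the exercise.
FIRST = {"S": {"[", "a"}, "X": {"+", "-", "b", EPS}, "Y": {"-", EPS}}


def first_of_seq(seq):
    """FIRST of a symbol sequence; the empty sequence yields {ε}."""
    out = set()
    for sym in seq:
        sym_first = FIRST.get(sym, {sym})   # terminal: its own FIRST set
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)
    return out


def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                  # $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, productions in grammar.items():
            for rhs in productions:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:  # only nonterminals have FOLLOW
                        continue
                    rest = first_of_seq(rhs[i + 1:])
                    new = rest - {EPS}
                    if EPS in rest:         # the tail can vanish (or is empty)
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow
```

Calling `compute_follow(GRAMMAR, "S")` gives the same three sets as the table above.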

Recall: Constructing an LL(1) Predictive Parsing Table
    for each production A → α do
        for each a ∈ FIRST(α) do
            add A → α to M[A, a]
        enddo
        if ε ∈ FIRST(α) then
            for each b ∈ FOLLOW(A) do
                add A → α to M[A, b]
            enddo
        endif
    enddo
    Mark each undefined entry in M as error

Exercise: Top-Down Parsing
FIRST and FOLLOW as computed before:

A → α            FIRST(α)
S → [ S X ]      [
S → a            a
X → + S Y        +
X → Y b          - b
X → ε            ε
Y → - S X c      -
Y → ε            ε

A      FOLLOW(A)
S      $ + - b ] c
X      ] c
Y      ] c b
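The table-construction loop recalled above translates almost line by line into code. This is a sketch, assuming the FIRST and FOLLOW sets of the exercise are supplied as constants; the names `build_table` and `first_of_seq` and the dictionary representation of M are illustrative choices. A conflict (two productions in one cell) raises an error, which for an LL(1) grammar should never happen.

```python
EPS = "ε"

GRAMMAR = {
    "S": [["[", "S", "X", "]"], ["a"]],
    "X": [["+", "S", "Y"], ["Y", "b"], []],
    "Y": [["-", "S", "X", "c"], []],
}
# FIRST and FOLLOW sets as computed by hand in the exercise.
FIRST = {"S": {"[", "a"}, "X": {"+", "-", "b", EPS}, "Y": {"-", EPS}}
FOLLOW = {
    "S": {"$", "+", "-", "b", "]", "c"},
    "X": {"]", "c"},
    "Y": {"]", "c", "b"},
}


def first_of_seq(seq):
    out = set()
    for sym in seq:
        sym_first = FIRST.get(sym, {sym})
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)
    return out


def build_table():
    """Return M as a dict mapping (nonterminal, terminal) to a right-hand side."""
    table = {}
    for lhs, productions in GRAMMAR.items():
        for rhs in productions:
            rhs_first = first_of_seq(rhs)
            lookaheads = rhs_first - {EPS}      # first phase: FIRST(α)
            if EPS in rhs_first:
                lookaheads |= FOLLOW[lhs]       # second phase: FOLLOW(A)
            for terminal in lookaheads:
                if (lhs, terminal) in table:
                    raise ValueError(f"LL(1) conflict at M[{lhs}, {terminal}]")
                table[(lhs, terminal)] = rhs
    return table
```

Missing keys play the role of the error entries. For this grammar the table has 11 defined entries and no conflicts.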

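Exercise: Top-Down Parsing
LL(1) parsing table M, with columns [ ] a b c + - $ (entries not listed are errors):
Row S: M[S, [] = S → [ S X ];  M[S, a] = S → a
Row X: M[X, ]] = X → ε (FOLLOW);  M[X, b] = X → Y b;  M[X, c] = X → ε (FOLLOW);  M[X, +] = X → + S Y;  M[X, -] = X → Y b
Row Y: M[Y, ]] = Y → ε (FOLLOW);  M[Y, b] = Y → ε (FOLLOW);  M[Y, c] = Y → ε (FOLLOW);  M[Y, -] = Y → - S X c
Productions marked as FOLLOW are inserted in the second phase of the algorithm.

Exercise: Top-Down Parsing
Provide the stack and the moves of the LL(1) parser on input [ a b ]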
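A table-driven LL(1) parser for this exercise can be sketched as follows. The table is copied from the slide; the function name `ll1_parse`, the token-list input, and the choice to return the list of applied productions are assumptions made for illustration.

```python
# LL(1) table from the exercise: (nonterminal, lookahead) -> right-hand side.
TABLE = {
    ("S", "["): ["[", "S", "X", "]"],
    ("S", "a"): ["a"],
    ("X", "]"): [],
    ("X", "b"): ["Y", "b"],
    ("X", "c"): [],
    ("X", "+"): ["+", "S", "Y"],
    ("X", "-"): ["Y", "b"],
    ("Y", "]"): [],
    ("Y", "b"): [],
    ("Y", "c"): [],
    ("Y", "-"): ["-", "S", "X", "c"],
}
NONTERMINALS = {"S", "X", "Y"}


def ll1_parse(tokens):
    """Return the productions applied, or raise SyntaxError."""
    inp = list(tokens) + ["$"]
    stack = ["$", "S"]              # top of the stack is the end of the list
    pos, applied = 0, []
    while stack:
        top, look = stack.pop(), inp[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"no entry M[{top}, {look}]")
            applied.append(f"{top} -> {' '.join(rhs) or 'ε'}")
            stack.extend(reversed(rhs))     # push the RHS in reverse order
        elif top == look:
            pos += 1                        # match a terminal (or the final $)
        else:
            raise SyntaxError(f"expected {top}, found {look}")
    return applied
```

On the input `[ a b ]` the sketch applies S → [ S X ], S → a, X → Y b, Y → ε, in that order.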

Exercise: Top-Down Parsing
Stack           Input           Production applied
$ S             [ a b ] $       S → [ S X ]
$ ] X S [       [ a b ] $       (match!)
$ ] X S         a b ] $         S → a
$ ] X a         a b ] $         (match!)
$ ] X           b ] $           X → Y b
$ ] b Y         b ] $           Y → ε
$ ] b           b ] $           (match!)
$ ]             ] $             (match!)
$               $               (end! accept.)

Exercise: Top-Down Parsing
Explain the backtracking technique for top-down parsing and provide an example considering the grammar
S → [ S X ] | a
X → + S Y | Y b | ε
Y → - S X c | ε

Exercise: Top-Down Parsing
Backtracking is exploited in top-down parsers that do not use predictive parsing. The parser keeps a record of all previous decisions for production application and sets up a trial-and-error strategy. Note that choosing a wrong production leads to the repeated reading of a portion of the input.

Exercise: Top-Down Parsing
(* marks choices where alternatives are still available)
Stack             Input           Production applied      Choice
$ S               [ a b ] $       S → [ S X ]             * choice (1/2)
$ ] X S [         [ a b ] $       (match!)
$ ] X S           a b ] $         S → [ S X ]             * choice (1/2)
$ ] X ] X S [     a b ] $         Not matching! Back to last choice that can be changed (*)
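The trial-and-error strategy can be sketched as a small backtracking recognizer. This is one possible formulation, not the stack machine drawn in the slides: the generator `derive` and the wrapper `accepts` are hypothetical names, alternatives are tried in the order in which the grammar lists them, and a failed alternative simply yields nothing so that the caller moves on to the next one. The approach relies on the grammar not being left-recursive, which holds here.

```python
# Exercise grammar; alternatives are tried in the order listed ([] is ε).
GRAMMAR = {
    "S": [["[", "S", "X", "]"], ["a"]],
    "X": [["+", "S", "Y"], ["Y", "b"], []],
    "Y": [["-", "S", "X", "c"], []],
}


def derive(symbols, tokens, pos):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Nonterminal: try each alternative; exhausting the generator of one
        # alternative is the "back to the last choice" step.
        for rhs in GRAMMAR[head]:
            yield from derive(rhs + rest, tokens, pos)
    elif pos < len(tokens) and tokens[pos] == head:
        # Terminal that matches the current input token.
        yield from derive(rest, tokens, pos + 1)
    # Terminal mismatch: yield nothing, which forces backtracking.


def accepts(tokens):
    """True if some sequence of choices consumes the whole input."""
    return any(end == len(tokens) for end in derive(["S"], tokens, 0))
```

For `[ a b ]` the recognizer first tries S → [ S X ] for the inner S, fails on `a`, and only then falls back to S → a, mirroring the first trace above.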

Exercise: Top-Down Parsing
Stack             Input           Production applied      Choice
$ S               [ a b ] $       S → [ S X ]             * choice (1/2)
$ ] X S [         [ a b ] $       (match!)
$ ] X S           a b ] $         S → a                   choice (2/2)
$ ] X a           a b ] $         (match!)
$ ] X             b ] $           X → + S Y               * choice (1/3)
$ ] Y S +         b ] $           Not matching! Back to last choice that can be changed (*)

Exercise: Top-Down Parsing
Stack             Input           Production applied      Choice
$ S               [ a b ] $       S → [ S X ]             * choice (1/2)
$ ] X S [         [ a b ] $       (match!)
$ ] X S           a b ] $         S → a                   choice (2/2)
$ ] X a           a b ] $         (match!)
$ ] X             b ] $           X → Y b                 * choice (2/3)
$ ] b Y           b ] $           Y → - S X c             * choice (1/2)
$ ] b c X S -     b ] $           Not matching! Back to last choice that can be changed (*)

Exercise: Top-Down Parsing
Stack             Input           Production applied      Choice
$ S               [ a b ] $       S → [ S X ]             * choice (1/2)
$ ] X S [         [ a b ] $       (match!)
$ ] X S           a b ] $         S → a                   choice (2/2)
$ ] X a           a b ] $         (match!)
$ ] X             b ] $           X → Y b                 * choice (2/3)
$ ] b Y           b ] $           Y → ε                   choice (2/2)
$ ] b             b ] $           (match!)
$ ]               ] $             (match!)
$                 $               (end! accept.)

Exercise: Bottom-Up Parsing
Consider the grammar
L → S ; L | S
S → T VL
T → array Idx of T | int
VL → id , VL | id
Idx → num
Provide the parsing table for the SLR bottom-up parser.

Recall: Function closure()
1. Start with closure(I) = I
2. If [A → α • B β] ∈ closure(I) then for each production B → γ in the grammar, add the item [B → • γ] to I if not already in I
3. Repeat 2 until no new items can be added

Recall: Function goto()
1. For each [A → α • X β] ∈ I, add [A → α X • β] to goto(I, X), if not already there
2. Compute closure() of the resulting set

Recall: build LR(0) collection
Steps:
- augment the grammar with the initial production Sʹ → S
- start from the state containing the closure of the set containing only the item derived from the initial production: closure( {[Sʹ → • S]} )
- iteratively add the states that are reachable from the existing states using goto(I, X), for some state I already added and some symbol (terminal or nonterminal) X

Recall: build the collection
Procedure:
    C = { closure( {[Sʹ → • S]} ) }
    repeat
        for each set of items I in C and each grammar symbol X
            such that goto(I, X) is not empty and not in C do
            add goto(I, X) to C
    until no new sets of items can be added to C
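The closure(), goto() and collection procedures can be sketched together for the grammar of this exercise, whose augmented start production is Lʹ → L. The item representation (a triple of left-hand side, right-hand side, and dot position) and the names `closure`, `goto`, and `build_collection` are assumptions for illustration. The states come out in breadth-first order, which need not coincide with the numbering used in the slides.

```python
# Augmented grammar of the exercise; production 0 is L' -> L.
GRAMMAR = [
    ("L'", ("L",)),
    ("L", ("S", ";", "L")),
    ("L", ("S",)),
    ("S", ("T", "VL")),
    ("T", ("array", "Idx", "of", "T")),
    ("T", ("int",)),
    ("VL", ("id", ",", "VL")),
    ("VL", ("id",)),
    ("Idx", ("num",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """An item is (lhs, rhs, dot); add [B -> . γ] for every B right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs, prod in GRAMMAR:
                item = (lhs, prod, 0)
                if lhs == rhs[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(l, r, d + 1) for l, r, d in items if d < len(r) and r[d] == symbol}
    return closure(moved) if moved else frozenset()


def build_collection():
    start = closure({("L'", ("L",), 0)})
    states, work = [start], [start]
    while work:
        state = work.pop(0)
        symbols = {r[d] for _, r, d in state if d < len(r)}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
                work.append(target)
    return states
```

For this grammar the collection contains 16 sets of items, matching states 0 to 15 of the exercise.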

Exercise: Bottom-Up Parsing
Augmented grammar:
Lʹ → L
L → S ; L | S
S → T VL
T → array Idx of T | int
VL → id , VL | id
Idx → num
In the next slide, * means kernel item.

Exercise: Bottom-Up Parsing
State 0:  * Lʹ → • L;  L → • S ; L;  L → • S;  S → • T VL;  T → • array Idx of T;  T → • int
State 1:  * T → array • Idx of T;  Idx → • num
State 2:  * T → int •
State 3:  * Lʹ → L •
State 4:  * L → S • ; L;  * L → S •
State 5:  * S → T • VL;  VL → • id , VL;  VL → • id
State 6:  * T → array Idx • of T
State 7:  * Idx → num •
State 8:  * L → S ; • L;  L → • S ; L;  L → • S;  S → • T VL;  T → • array Idx of T;  T → • int
State 9:  * S → T VL •
State 10: * VL → id • , VL;  * VL → id •
State 11: * T → array Idx of • T;  T → • array Idx of T;  T → • int
State 12: * L → S ; L •
State 13: * VL → id , • VL;  VL → • id , VL;  VL → • id
State 14: * T → array Idx of T •
State 15: * VL → id , VL •

Exercise: Bottom-Up Parsing
[Figure: LR(0) automaton with start state 0 and states 0 to 15; its transitions on array, int, num, id, the comma, ;, of, L, S, T, VL and Idx are the shift and goto entries of the parsing table.]

Recall: SLR parsing table
1. Augment the grammar with Lʹ → L (done)
2. Construct the set C = {I0, I1, …, In} of LR(0) states (done)
3. If [A → α • a β] ∈ Ii and goto(Ii, a) = Ij then set action[i, a] = shift j
4. If [A → α •] ∈ Ii then set action[i, a] = reduce A → α for all a ∈ FOLLOW(A) (apply only if A ≠ Lʹ). So we need FOLLOW!
5. If [Lʹ → L •] is in Ii then set action[i, $] = accept
6. If goto(Ii, A) = Ij then set goto[i, A] = j
7. Repeat 3-6 until no more entries are added
8. The initial state i is the Ii holding item [Lʹ → • L]

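Exercise: Bottom-Up Parsing
Grammar:
Lʹ → L
L → S ; L | S
S → T VL
T → array Idx of T | int
VL → id , VL | id
Idx → num

A      FOLLOW(A)
Idx    of
T      FIRST(VL) = id
L      $
S      ; ∪ FOLLOW(L) = ; $
VL     FOLLOW(S) = ; $

Numbered productions:
1. Lʹ → L
2. L → S ; L
3. L → S
4. S → T VL
5. T → array Idx of T
6. T → int
7. VL → id , VL
8. VL → id
9. Idx → num

Parsing table (action columns: array int num id , ; of $; goto columns: L S VL T Idx). sx: shift and go to state x; ry: reduce using production y. The reduce entries (red in the original slides) are the SLR reduce actions derived from lookahead (FOLLOW).
State 0:  action[array] = s1, action[int] = s2;  goto[L] = 3, goto[S] = 4, goto[T] = 5
State 1:  action[num] = s7;  goto[Idx] = 6
State 2:  action[id] = r6
State 3:  action[$] = acc
State 4:  action[;] = s8, action[$] = r3
State 5:  action[id] = s10;  goto[VL] = 9
State 6:  action[of] = s11
State 7:  action[of] = r9
State 8:  action[array] = s1, action[int] = s2;  goto[L] = 12, goto[S] = 4, goto[T] = 5
State 9:  action[;] = r4, action[$] = r4
State 10: action[,] = s13, action[;] = r8, action[$] = r8
State 11: action[array] = s1, action[int] = s2;  goto[T] = 14
State 12: action[$] = r2
State 13: action[id] = s10;  goto[VL] = 15
State 14: action[id] = r5
State 15: action[;] = r7, action[$] = r7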

Exercise: Bottom-Up Parsing
Show the stack and the moves of the SLR parser on input array 5 of int x

Exercise: Bottom-Up Parsing
Stack                                  Input                   Action
$ 0                                    array 5 of int x $      start from state 0
$ 0                                    array 5 of int x $      shift & goto 1
$ 0 array 1                            5 of int x $            shift & goto 7
$ 0 array 1 5 7                        of int x $              reduce with 9: Idx → num; goto(1, Idx) = 6
$ 0 array 1 Idx 6                      of int x $              shift & goto 11
$ 0 array 1 Idx 6 of 11                int x $                 shift & goto 2
$ 0 array 1 Idx 6 of 11 int 2          x $                     reduce with 6: T → int; goto(11, T) = 14
$ 0 array 1 Idx 6 of 11 T 14           x $                     reduce with 5: T → array Idx of T; goto(0, T) = 5
$ 0 T 5                                x $                     shift & goto 10
$ 0 T 5 x 10                           $                       reduce with 8: VL → id; goto(5, VL) = 9
$ 0 T 5 VL 9                           $                       reduce with 4: S → T VL; goto(0, S) = 4
$ 0 S 4                                $                       reduce with 3: L → S; goto(0, L) = 3
$ 0 L 3                                $                       action(3, $) = accept!
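The moves above follow the standard shift-reduce driver loop, sketched here with the action and goto tables of the exercise hard-coded. The names `ACTION`, `GOTO`, `PRODUCTIONS`, and `lr_parse` are illustrative; the input is assumed to be already tokenized, so `5` arrives as `num` and `x` as `id`, and only state numbers are kept on the stack.

```python
# Production number -> (left-hand side, length of right-hand side).
PRODUCTIONS = {
    2: ("L", 3), 3: ("L", 1), 4: ("S", 2), 5: ("T", 4),
    6: ("T", 1), 7: ("VL", 3), 8: ("VL", 1), 9: ("Idx", 1),
}
ACTION = {
    (0, "array"): "s1", (0, "int"): "s2",
    (1, "num"): "s7",
    (2, "id"): "r6",
    (3, "$"): "acc",
    (4, ";"): "s8", (4, "$"): "r3",
    (5, "id"): "s10",
    (6, "of"): "s11",
    (7, "of"): "r9",
    (8, "array"): "s1", (8, "int"): "s2",
    (9, ";"): "r4", (9, "$"): "r4",
    (10, ","): "s13", (10, ";"): "r8", (10, "$"): "r8",
    (11, "array"): "s1", (11, "int"): "s2",
    (12, "$"): "r2",
    (13, "id"): "s10",
    (14, "id"): "r5",
    (15, ";"): "r7", (15, "$"): "r7",
}
GOTO = {
    (0, "L"): 3, (0, "S"): 4, (0, "T"): 5, (1, "Idx"): 6, (5, "VL"): 9,
    (8, "L"): 12, (8, "S"): 4, (8, "T"): 5, (11, "T"): 14, (13, "VL"): 15,
}


def lr_parse(tokens):
    """Return the production numbers used for reductions, in order."""
    inp = list(tokens) + ["$"]
    stack, pos, reductions = [0], 0, []
    while True:
        action = ACTION.get((stack[-1], inp[pos]))
        if action is None:
            raise SyntaxError(f"unexpected {inp[pos]} in state {stack[-1]}")
        if action == "acc":
            return reductions
        if action[0] == "s":                    # shift: push state, consume token
            stack.append(int(action[1:]))
            pos += 1
        else:                                   # reduce: pop |rhs| states, then goto
            number = int(action[1:])
            lhs, size = PRODUCTIONS[number]
            del stack[len(stack) - size:]
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(number)
```

On `array num of int id` the sketch performs the reductions 9, 6, 5, 8, 4, 3, the same sequence as the trace.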

Exercise: Semantic Analysis
4) Consider the SDD in the next slides, where
- newlabel() generates a fresh symbolic label
- newtemp() generates a fresh variable name
- gen() generates strings, || is string concatenation
- code is the attribute containing the three-address code (3AC)
- id.place is the name of the variable associated with the token id
- relop.op is a comparison operator (<, <=, =, ...)

Exercise 4: Semantic Analysis
Production: Prog → S
Semantic rules: S.next = newlabel(); Prog.code = S.code || gen(S.next ':')

Production: S → S1 ; S2
Semantic rules: S1.next = newlabel(); S2.next = S.next; S.code = S1.code || gen(S1.next ':') || S2.code

Production: S → if Test then { S1 }
Semantic rules: Test.true = newlabel(); Test.false = S.next; S1.next = S.next; S.code = Test.code || gen(Test.true ':') || S1.code

Production: S → id = E
Semantic rules: S.code = E.code || gen(id.place '=' E.place)

Exercise 4: Semantic Analysis
Production: Test → E1 relop E2
Semantic rules: Test.code = gen('if' E1.place relop.op E2.place 'goto' Test.true) || gen('goto' Test.false)

Production: E → E1 + id
Semantic rules: E.place = newtemp(); E.code = E1.code || gen(E.place '=' E1.place '+' id.place)

Production: E → id
Semantic rules: E.place = id.place; E.code = '' (the empty string)

Exercise 4: Semantic Analysis
Consider the input:
if y > w then { y = x + z }; x = z + v
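The semantic rules can be sketched as a recursive walk over a hand-built syntax tree. Everything about the encoding is an assumption for illustration: the tuple-based tree, the class name `Gen`, and the simplification that the operands of a test are plain identifiers. Inherited attributes (next, true, false) are passed down as arguments or allocated on the way down, while synthesized attributes (place, code) are returned. Labels are allocated top-down here, so their numbering can differ from a hand evaluation that numbers them in another order; the structure of the code is the same.

```python
from itertools import count


class Gen:
    def __init__(self):
        self._labels = count(1)
        self._temps = count(1)

    def newlabel(self):
        return f"LABEL{next(self._labels)}"

    def newtemp(self):
        return f"t{next(self._temps)}"

    def expr(self, e):
        """Return (E.place, E.code) for E -> id | E1 + id."""
        if isinstance(e, str):                  # E -> id
            return e, []
        _, e1, ident = e                        # E -> E1 + id
        place1, code1 = self.expr(e1)
        place = self.newtemp()
        return place, code1 + [f"{place} = {place1} + {ident}"]

    def stmt(self, s, nxt):
        """Return S.code; `nxt` is the inherited attribute S.next."""
        if s[0] == "seq":                       # S -> S1 ; S2
            _, s1, s2 = s
            s1_next = self.newlabel()
            return self.stmt(s1, s1_next) + [f"{s1_next}:"] + self.stmt(s2, nxt)
        if s[0] == "if":                        # S -> if Test then { S1 }
            _, (left, op, right), body = s
            true = self.newlabel()              # Test.true; Test.false = S.next
            test = [f"if {left} {op} {right} goto {true}", f"goto {nxt}"]
            return test + [f"{true}:"] + self.stmt(body, nxt)
        if s[0] == "assign":                    # S -> id = E
            _, ident, e = s
            place, code = self.expr(e)
            return code + [f"{ident} = {place}"]
        raise ValueError(f"unknown statement kind {s[0]!r}")

    def prog(self, s):                          # Prog -> S
        nxt = self.newlabel()
        return self.stmt(s, nxt) + [f"{nxt}:"]


# if y > w then { y = x + z }; x = z + v
AST = ("seq",
       ("if", ("y", ">", "w"), ("assign", "y", ("+", "x", "z"))),
       ("assign", "x", ("+", "z", "v")))
```

`Gen().prog(AST)` returns the three-address code as a list of lines, with each label on its own line.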

Exercise 4: Semantic Analysis
4.1) Show the annotated parse tree (without the code attribute) for the input, together with the values of the attributes.

Exercise 4: Semantic Analysis
[Figure: parse tree of the input if y > w then { y = x + z } ; x = z + v, with root Prog → S and S → S ; S.]

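Exercise 4: Semantic Analysis
[Figure: annotated parse tree of if y > w then { y = x + z } ; x = z + v. Attribute values:
- S below Prog: next = LABEL3
- first S of S ; S (the if statement): next = LABEL2; second S (x = z + v): next = LABEL3
- Test.true = LABEL1, Test.false = LABEL2; inside Test: E.place = y, relop.op = >, E.place = w, with id.place = y and id.place = w
- S inside the braces (y = x + z): next = LABEL2; id.place = y, E.place = t1, E1.place = x with id.place = x, id.place = z
- x = z + v: id.place = x, E.place = t2, E1.place = z with id.place = z, id.place = v]

Exercise 4: Semantic Analysis
[Figure: the same parse tree, drawn with := as the assignment symbol, for the input if y > w then { y := x + z } ; x := z + v.]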

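Exercise 4: Semantic Analysis
[Figure: upper part of the annotated parse tree with the code attribute. Prog carries code; the S below it carries code and next = LABEL3; the second statement S (x = z + v) carries code and next = LABEL3, with id.place = x, E.place = t2 with code, E1.place = z with code, id.place = z and id.place = v.]

Exercise 4: Semantic Analysis
[Figure: parse tree of the input if y > w then { y = x + z } ; x = z + v.]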

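Exercise 4: Semantic Analysis
[Figure: annotated subtree of the if statement with the code attribute. The if statement S carries code and next = LABEL2; Test carries code, true = LABEL1 and false = LABEL2, with relop.op = >, id.place = y and id.place = w; the inner S (y = x + z) carries code and next = LABEL2, with id.place = y, E.place = t1 with code, E1.place = x with code, id.place = x and id.place = z.]

Exercise 4: Semantic Analysis
4.2) Show the three-address code produced by the semantic actions for the given input.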

Exercise 4: Semantic Analysis
    if y > w goto LABEL1
    goto LABEL2
    LABEL1 : t1 = x + z
    y = t1
    LABEL2 : t2 = z + v
    x = t2
    LABEL3 :
The first two lines come from Test, the lines up to y = t1 from the if statement, and the final label from Prog.

Exercise 4: Semantic Analysis
4.3) Give the definition of inherited attribute. For the rule S → if Test then { S1 }, show which attributes are synthesized and which are inherited.

Exercise 4: Semantic Analysis
In semantic analysis, an attribute of a nonterminal A1 is called inherited if it is derived only from operations on attributes of A1's parent or siblings in the parse tree; i.e., if it is computed in the semantic rule associated with a production that has A1 on the right-hand side. Synthesized attributes of A1, on the other hand, are generated only from below, i.e., from the attributes of the children of A1; in other words, in the rule where A1 appears on the left-hand side.

Exercise 4: Semantic Analysis
Production: S → if Test then { S1 }
Semantic rules:
Test.true = newlabel();                                   Synthesized (convention)
Test.false = S.next;                                      Inherited
S1.next = S.next;                                         Inherited
S.code = Test.code || gen(Test.true ':') || S1.code       Synthesized