Syn S t yn a t x a Ana x lysi y s si 1

Size: px
Start display at page:

Download "Syn S t yn a t x a Ana x lysi y s si 1"

Transcription

1 Syntax Analysis 1

2 Position of a Parser in the Compiler Model Source Program Lexical Analyzer Token, tokenval Get next token Parser and rest of front-end Intermediate representation Lexical error Syntax error Semantic error Symbol Table 2

3 The Role Of Parser A parser implements a C-F grammar The role of the parser is twofold: 1. To check syntax (= string recognizer) And to report syntax errors accurately 2. To invoke semantic actions For static semantics checking, e.g. type checking of expressions, functions, etc. For syntax-directed translation of the source code to an intermediate representation 3

4 Syntax-Directed Translation One of the major roles of the parser is to produce an intermediate representation (IR) of the source program using syntax-directed translation methods Possible IR output: Abstract syntax trees (ASTs) Control-flow graphs (CFGs) with triples, three-address code, or register transfer list notation WHIRL (SGI Pro64 compiler) has 5 IR levels! 4

5 Error Handling A good compiler should assist in identifying and locating errors Lexical errors: important, compiler can easily recover and continue Syntax errors: most important for compiler, can almost always recover Static semantic errors: important, can sometimes recover Dynamic semantic errors: hard or impossible to detect at compile time, runtime checks are required Logical errors: hard or impossible to detect 5

6 Viable-Prefix Property The viable-prefix property of LL/LR parsers allows early detection of syntax errors Goal: detection of an error as soon as possible without further consuming unnecessary input How: detect an error as soon as the prefix of the input does not match a prefix of any string in the language Prefix for (;) Error is detected here Prefix Error is detected here DO 10 I = 1;0 6

7 Error Recovery Strategies Panic mode Discard input until a token in a set of designated synchronizing tokens is found Phrase-level recovery Perform local correction on the input to repair the error Error productions Augment grammar with productions for erroneous constructs Global correction Choose a minimal sequence of changes to obtain a global least-cost correction 7

8 Grammars (Recap) Context-free grammar is a 4-tuple G = (N, T, P, S) where T is a finite set of tokens (terminal symbols) N is a finite set of nonterminals P is a finite set of productions of the form where (N T)* N (N T)* and (N T)* S N is a designated start symbol 8

9 Notational Conventions Used Terminals a,b,c, T specific terminals: 0, 1, id, + Nonterminals A,B,C, N specific nonterminals: expr, term, stmt Grammar symbols X,Y,Z (N T) Strings of terminals u,v,w,x,y,z T* Strings of grammar symbols,, (N T)* 9

10 Derivations (Recap) The one-step derivation is defined by A where A is a production in the grammar In addition, we define is leftmost lm if does not contain a nonterminal is rightmost rm if does not contain a nonterminal Transitive closure * (zero or more steps) Positive closure + (one or more steps) The language generated by G is defined by L(G) = {w T* S + w} 10

11 Derivation (Example) Grammar G = ({E}, {+,*,(,),-,id}, P, E) with productions P = E E + E E E * E E ( E ) E - E E id Example derivations: E - E - id E rm E + E rm E + id rm id + id E * E E * id + id E + id * id + id 11

12 Chomsky Hierarchy: Language Classification A grammar G is said to be Regular if it is right linear where each production is of the form A w B or A w or left linear where each production is of the form A B w or A w Context free if each production is of the form A where A N and (N T)* Context sensitive if each production is of the form A where A N,,, (N T)*, > 0 Unrestricted 12

13 Chomsky Hierarchy L(regular) L(context free) L(context sensitive) L(unrestricted) Where L(T) = { L(G) G is of type T } That is: the set of all languages generated by grammars G of type T Examples: Every finite language is regular! (construct a FSA for strings in L(G)) L 1 = { a n b n n 1 } is context free L 2 = { a n b n c n n 1 } is context sensitive 13

14 Parsing Universal (any C-F grammar) Cocke-Younger-Kasimi Earley Top-down (C-F grammar with restrictions) Recursive descent (predictive parsing) LL (Left-to-right, Leftmost derivation) methods Bottom-up (C-F grammar with restrictions) Operator precedence parsing LR (Left-to-right, Rightmost derivation) methods SLR, canonical LR, LALR 14

15 Top-Down Parsing LL methods (Left-to-right, Leftmost derivation) and recursive-descent parsing Grammar: E T + T T ( E ) T - E T id Leftmost derivation: E lm T + T lm id + T lm id + id E E E E T T T T T T + id + id + id 15

16 Left Recursion (Recap) Productions of the form A A are left recursive When one of the productions in a grammar is left recursive then a predictive parser loops forever on certain inputs 16

17 General Left Recursion Elimination Method Arrange the nonterminals in some order A 1, A 2,, A n for i = 1,, n do for j = 1,, i-1 do replace each A i A j with A i 1 2 k where enddo A j 1 2 k enddo eliminate the immediate left recursion in A i 17

18 Immediate Left-Recursion Elimination Method Rewrite every left-recursive production A A A into a right-recursive production: A A R A R A R A R A R 18

19 Example Left Recursion Elim. A B C a B C A A b C A B C C a Choose arrangement: A, B, C i = 1: i = 2, j = 1: i = 3, j = 1: i = 3, j = 2: nothing to do B C A A b B C A B C b a b (imm) B C A B R a b B R B R C b B R C A B C C a C B C B a B C C a C B C B a B C C a C C A B R C B a b B R C B a B C C a (imm) C a b B R C B C R a B C R a C R C R A B R C B C R C C R 19

20 Left Factoring When a nonterminal has two or more productions whose right-hand sides start with the same grammar symbols, the grammar is not LL(1) and cannot be used for predictive parsing Replace productions A 1 2 n with A A R A R 1 2 n 20

21 Predictive Parsing Eliminate left recursion from grammar Left factor the grammar Compute FIRST and FOLLOW Two variants: Recursive (recursive calls) Non-recursive (table-driven) 21

22 FIRST (Revisited) FIRST( ) = { the set of terminals that begin all strings derived from } FIRST(a) = {a} if a T FIRST( ) = { } FIRST(A) = A FIRST( ) for A P FIRST(X 1 X 2 X k ) = if for all j = 1,, i-1 : FIRST(X j ) then add non- in FIRST(X i ) to FIRST(X 1 X 2 X k ) if for all j = 1,, k : FIRST(X j ) then add to FIRST(X 1 X 2 X k ) 22

23 FOLLOW FOLLOW(A) = { the set of terminals that can immediately follow nonterminal A } FOLLOW(A) = for all (B A ) P do add FIRST( )\{ } to FOLLOW(A) for all (B A ) P and FIRST( ) do add FOLLOW(B) to FOLLOW(A) for all (B A) P do add FOLLOW(B) to FOLLOW(A) if A is the start symbol S then add $ to FOLLOW(A) 23

24 Red : A Blue : First Set (2) G 0 S ase S B B bbe B C C cce C d

25 Red : A Step 1: Blue : First (S ase) First (S B ) First (B bbe) First Set (2) = First(a) ={a} = First(B) = First(b)={b} First (B C ) = First(C) First (C cce) = First(c) ={c} First (C d ) = First(d)={d} G 0 S ase S B B bbe B C C cce C d

26 Red : A Step 1: Blue : First (S ase) First (S B ) First (B bbe) First Set (2) = {a} = First(B) = {b} First (B C ) = First(C) First (C cce) = {c} First (C d ) = {d} G 0 S ase S B B bbe B C C cce C d Step First Set S B C a b c d Step 1 {a} First(B) {b} First(C) {c, d}

27 Red : A Step 2: Blue : First (S ase) First Set (2) = {a} First (S B ) = First(B) = {b} First(C) First (B bbe) = {b} First (B C ) = First(C) First (C cce) = {c} First (C d ) = {d} G 0 S ase S B B bbe B C C cce C d Step First Set S B C a b c d Step 1 {a} First(B) {b} First(C) {c, d}

28 Red : A Step 2: Blue : First (S ase) First Set (2) = {a} First (S B ) = {b} First(C) First (B bbe) = {b} First (B C ) = First(C) First (C cce) = {c} First (C d ) = {d} G 0 S ase S B B bbe B C C cce C d Step First Set S B C a b c d Step 1 {a} First(B) {b} First(C) {c, d} Step 2 {a} {b} First(C)

29 Red : A Step 2: Blue : First (S ase) First Set (2) = {a} First (S B ) = {b} First(C) First (B bbe) = {b} First (B C ) = First(C) = {c, d} First (C cce) = {c} First (C d ) = {d} G 0 S ase S B B bbe B C C cce C d Step First Set S B C a b c d Step 1 {a} First(B) {b} First(C) {c, d} Step 2 {a} {b} First(C)

30 Red : A Step 2: Blue : First (S ase) First Set (2) = {a} First (S B ) = {b} First(C) First (B bbe) = {b} First (B C ) = {c, d} First (C cce) = {c} First (C d ) = {d} G 0 S ase S B B bbe B C C cce C d Step First Set S B C a b c d Step 1 {a} First(B) {b} First(C) {c, d} Step 2 {a} {b} First(C) {b} {c, d} = {b,c,d} {c, d}

31 Red : A Step 3: Blue : First (S ase) First Set (2) = {a} First (S B ) = {b} First(C) = {b} {c, d} First (B bbe) = {b} First (B C ) = {c, d} First (C cce) = {c} First (C d ) = {d} G 0 S ase S B B bbe B C C cce C d Step First Set S B C a b c d Step 1 {a} First(B) {b} First(C) {c, d} Step 2 {a} {b} First(C) {b} {c, d} = {b,c,d} {c, d}

32 Red : A Step 3: Blue : First (S ase) First Set (2) = {a} First (S B ) = {b, c, d} First (B bbe) = {b} First (B C ) = {c, d} First (C cce) = {c} First (C d ) = {d} G 0 S ase S B B bbe B C C cce C d Step Step 1 Step 2 First Set S B C a b c d {a} First(B) {b} First(C) {c, d} {a} {b} First(C) {b} {c, d} = {b,c,d} {c, d} Step {a} {b} {c, d} = {a,b,c,d} {b} {c, d} = {b,c,d} {c, d}

33 Red : A Step 3: Blue : First (S ase) First Set (2) = {a} First (S B ) = {b, c, d} First (B bbe) = {b} First (B C ) = {c, d} First (C cce) First (C d ) Step = {c} = {d} G 0 S ase S B B bbe B C C cce C d If no more change The first set of a terminal symbol is itself First Set S B C a b c d Step 1 {a} First(B) {b} First(C) {c, d} Step 2 {a} {b} First(C) {b} {c, d} = {b,c,d} {c, d} Step 3 {a} {b} {c, d} = {a,b,c,d} {b} {c, d} = {b,c,d} {c, d} {a} {b} {c} {d}

34 Another Example.

35 Red : A Blue : First Set (2) G 0 S ABc A a A B b B

36 Red : A Step 1: Blue : First (S ABc) First (A a ) First (A ) First Set (2) = First(ABc) = First(a) = First( ) First( ) First (B b ) = First(b) First (B ) = First( ) First( ) G 0 S ABc A a A B b B

37 Red : A Step 1: Blue : First (S ABc) First (A a ) First (A ) First (B b ) = {b} First (B ) = { } First Set (2) = First(ABc) = {a} = { } G 0 S ABc A a A B b B Step First Set S A B a b c Step 1 First(ABc) {a, } {b, }

38 Red : A Step 2: Blue : First Set (2) First (S ABc) = First(ABc) = {a, } = {a, } - { } First(Bc) = {a} First(Bc) First (A a )) = {a} First (A ) = { } First (B b ) = {b} G 0 S ABc A a A B b B First (B ) = { } Step First Set S A B a b c Step 1 First(ABc) {a, } {b, } Step 2 {a} First(Bc) {a, } {b, }

39 Red : A Step 3: Blue : First (S ABc) First (A a ) First (A ) First (B b ) First (B ) First Set (2) = {a} First(Bc) = {a} {b, } = {a} {b, } - { } First(c) = {a} {b,c} = {a} = { } = {b} Step First Set = { } G 0 S ABc A a A B b B S A B a b c Step 1 First(ABc) {a, } {b, } Step 2 {a} First(Bc) {a, } {b, } Step 3 {a} {b, c}= {a,b,c} {a, } {b, }

40 Red : A Step 3: Blue : First (S ABc) First (A a ) First (A ) First Set (2) = {a,b,c} = {a} = { } First (B b ) = {b} First (B ) = { } Step Step 1 Step 2 First Set G 0 S ABc A a A B b B If no more change The first set of a terminal symbol is itself S A B a b c First(ABc) {a, } {b, } {a} First(Bc) {a, } {b, } Step {a} {b, c}= {a,b,c} {a, } {b, } {a} {b} {c}

41 LL(1) Grammar A grammar G is LL(1) if it is not left recursive and for each collection of productions A 1 2 n for nonterminal A the following holds: 1. FIRST( i ) FIRST( j ) = for all i j 2. if i * then 2.a. j * for all i j 2.b. FIRST( j ) FOLLOW(A) = for all i j 41

42 Non-LL(1) Examples Grammar S S a a S a S a S a R R S S a R a R S Not LL(1) because: Left recursive FIRST(a S) FIRST(a) For R: S * and * For R: FIRST(S) FOLLOW(R) 42

43 Recursive Descent Parsing (Recap) Grammar must be LL(1) Every nonterminal has one (recursive) procedure responsible for parsing the nonterminal s syntactic category of input tokens When a nonterminal has multiple productions, each production is implemented in a branch of a selection statement based on input look-ahead information 43

44 Using FIRST and FOLLOW to Write a Recursive Descent Parser expr term rest rest + term rest - term rest term id procedure rest(); begin if lookahead in FIRST(+ term rest) then match( + ); term(); rest() else if lookahead in FIRST(- term rest) then match( - ); term(); rest() else if lookahead in FOLLOW(rest) then return else error() end; FIRST(+ term rest) = { + } FIRST(+ term rest) = { + } FIRST(- term rest) = { - } FOLLOW(rest) = { $ } 44

45 Non-Recursive Predictive Parsing: Table-Driven Parsing Given an LL(1) grammar G = (N, T, P, S) construct a table M[A,a] for A N, a T and use a driver program with a stack input a + b $ stack X Y Z $ Predictive parsing program (driver) Parsing table M output 45

46 Constructing an LL(1) Predictive Parsing Table for each production A do for each a FIRST( ) do add A to M[A,a] enddo if FIRST( ) then for each b FOLLOW(A) do add A to M[A,b] enddo endif enddo Mark each undefined entry in M error 46

47 Example Table E T E R E R + T E R T F T R T R * F T R F ( E ) id A FIRST( ) FOLLOW(A) E T E R ( id $ ) E R + T E R + $ ) E R $ ) T F T R ( id + $ ) T R * F T R * + $ ) T R + $ ) F ( E ) ( * + $ ) F id id * + $ ) id + * ( ) $ E E T E R E T E R E R E R + T E R E R E R T T F T R T F T R T R T R T R * F T R T R T R F F id F ( E ) 47

48 LL(1) Grammars are Unambiguous Ambiguous grammar S i E t S S R a S R e S E b Error: duplicate table entry A FIRST( ) FOLLOW(A) S i E t S S R i e $ S a a e $ S R e S e e $ S R e $ E b b t a b e i t $ S S a S i E t S S R S R E E b S R S R e S S R 48

49 Predictive Parsing Program (Driver) push($) push(s) a := lookahead repeat X := pop() if X is a terminal or X = $ then match(x) // moves to next token and a := lookahead else if M[X,a] = X Y 1 Y 2 Y k then push(y k, Y k-1,, Y 2, Y 1 ) // such that Y 1 is on top invoke actions and/or produce IR output else error() endif until X = $ 49

50 Example Table-Driven Parsing Stack $E $E R T $E R T R F $E R T R id $E R T R $E R $E R T+ $E R T $E R T R F $E R T R id $E R T R $E R T R F* $E R T R F $E R T R id $E R T R $E R $ Input id+id*id$ id+id*id$ id+id*id$ id+id*id$ +id*id$ +id*id$ +id*id$ id*id$ id*id$ id*id$ *id$ *id$ id$ id$ $ $ $ Production applied E T E R T F T R F id T R E R + T E R T F T R F id T R * F T R F id T R E R R 50

51 Panic Mode Recovery Add synchronizing actions to undefined entries based on FOLLOW Pro: Cons: Can be automated Error messages are needed FOLLOW(E) = { ) $ } FOLLOW(E R ) = { ) $ } FOLLOW(T) = { + ) $ } FOLLOW(T R ) = { + ) $ } FOLLOW(F) = { + * ) $ } id + * ( ) $ E E T E R E T E R synch synch E R E R + T E R E R E R T T F T R synch T F T R synch synch T R T R T R * F T R T R T R F F id synch synch F ( E ) synch synch synch: the driver pops current nonterminal A and skips input till synch token or skips input until one of FIRST(A) is found 51

52 Phrase-Level Recovery Change input stream by inserting missing tokens For example: id id is changed into id * id Pro: Cons: Can be automated Recovery not always intuitive Can then continue here id + * ( ) $ E E T E R E T E R synch synch E R E R + T E R E R E R T T F T R synch T F T R synch synch T R insert * T R T R * F T R T R T R F F id synch synch F ( E ) synch synch insert *: driver inserts missing * and retries the production 52

53 Error Productions E T E R E R + T E R T F T R T R * F T R F ( E ) id Add error production : T R F T R to ignore missing *, e.g.: id id Pro: Cons: Powerful recovery method Cannot be automated id + * ( ) $ E E T E R E T E R synch synch E R E R + T E R E R E R T T F T R synch T F T R synch synch T R T R F T R T R T R * F T R T R T R F F id synch synch F ( E ) synch synch 53

54 Bottom-Up Parsing LR methods (Left-to-right, Rightmost derivation) SLR, Canonical LR, LALR Other special cases: Shift-reduce parsing Operator-precedence parsing 54

55 Operator-Precedence Parsing Special case of shift-reduce parsing We will not further discuss (you can skip textbook section 4.6) 55

56 Shift-Reduce Parsing Grammar: S a A B e A A b c b B d These match These match production s right-hand sides Reducing a sentence: a b b c d e a A b c d e a A d e a A B e S Shift-reduce corresponds to a rightmost derivation: S rm a A B e rm a A d e rm a A b c d e rm a b b c d e S A A A A A A B A B a b b c d e a b b c d e a b b c d e a b b c d e 56

57 Handles A handle is a substring of grammar symbols in a right-sentential form that matches a right-hand side of a production Grammar: S a A B e A A b c b B d a b b c d e a A b c d e a A d e a A B e S Handle a b b c d e a A b c d e a A A e? NOT a handle, because further reductions will fail (result is not a sentential form) 57

58 Stack Implementation of Shift-Reduce Parsing Grammar: E E + E E E * E E ( E ) E id Find handles to reduce Stack $ $id $E $E+ $E+id $E+E $E+E* $E+E*id $E+E*E $E+E $E Input id+id*id$ +id*id$ +id*id$ id*id$ *id$ *id$ id$ $ $ $ $ Action shift reduce E id shift shift reduce E id shift (or reduce?) shift reduce E id reduce E E * E reduce E E + E accept How to resolve conflicts? 58

59 Conflicts Shift-reduce and reduce-reduce conflicts are caused by The limitations of the LR parsing method (even when the grammar is unambiguous) Ambiguity of the grammar 59

60 Shift-Reduce Parsing: Shift-Reduce Conflicts Ambiguous grammar: S if E then S if E then S else S other Stack $ $ if E then S Input $ else $ Action shift or reduce? Resolve in favor Resolve in favor of shift, so else matches closest if 60

61 Shift-Reduce Parsing: Reduce-Reduce Conflicts Grammar: C A B A a B a Stack $ $a Input aa$ a$ Action shift reduce A a or B a? Resolve in favor Resolve in favor of reduce A a, otherwise we re stuck! 61

62 LR(k) Parsers: Use a DFA for Shift/Reduce Decisions start 0 C A a Grammar: S C C A B A a B a B a 4 5 goto(i 0,C) State I 0 : S C C A B A a State I 1 : S C goto(i 0,A) State I 2 : C A B B a State I 4 : C A B goto(i 2,B) goto(i 2,a) Can only reduce A a (not B a) goto(i 0,a) State I 3 : A a State I 5 : B a 62

63 DFA for Shift/Reduce Decisions Grammar: S C C A B A a B a State I 0 : S C C A B A a goto(i 0,a) The states of the DFA are used to determine if a handle is on top of the stack State I 3 : A a Stack $ 0 $ 0 $ 0 a 3 $ 0 A 2 $ 0 A 2 a 5 $ 0 A 2 B 4 $ 0 C 1 Input aa$ aa$ a$ a$ $ $ $ Action start in state 0 shift (and goto state 3) reduce A a (goto 2) shift (goto 5) reduce B a (goto 4) reduce C AB (goto 1) accept (S C) 63

64 DFA for Shift/Reduce Decisions Grammar: S C C A B A a B a State I 0 : S C C A B A a goto(i 0,A) The states of the DFA are used to determine if a handle is on top of the stack State I 2 : C A B B a Stack $ 0 $ 0 $ 0 a 3 $ 0 A 2 $ 0 A 2 a 5 $ 0 A 2 B 4 $ 0 C 1 Input aa$ aa$ a$ a$ $ $ $ Action start in state 0 shift (and goto state 3) reduce A a (goto 2) shift (goto 5) reduce B a (goto 4) reduce C AB (goto 1) accept (S C) 64

65 DFA for Shift/Reduce Decisions Grammar: S C C A B A a B a State I 2 : C A B B a goto(i 2,a) The states of the DFA are used to determine if a handle is on top of the stack State I 5 : B a Stack $ 0 $ 0 $ 0 a 3 $ 0 A 2 $ 0 A 2 a 5 $ 0 A 2 B 4 $ 0 C 1 Input aa$ aa$ a$ a$ $ $ $ Action start in state 0 shift (and goto state 3) reduce A a (goto 2) shift (goto 5) reduce B a (goto 4) reduce C AB (goto 1) accept (S C) 65

66 DFA for Shift/Reduce Decisions Grammar: S C C A B A a B a State I 2 : C A B B a goto(i 2,B) The states of the DFA are used to determine if a handle is on top of the stack State I 4 : C A B Stack $ 0 $ 0 $ 0 a 3 $ 0 A 2 $ 0 A 2 a 5 $ 0 A 2 B 4 $ 0 C 1 Input aa$ aa$ a$ a$ $ $ $ Action start in state 0 shift (and goto state 3) reduce A a (goto 2) shift (goto 5) reduce B a (goto 4) reduce C AB (goto 1) accept (S C) 66

67 DFA for Shift/Reduce Decisions Grammar: S C C A B A a B a State I 0 : S C C A B A a goto(i 0,C) The states of the DFA are used to determine if a handle is on top of the stack State I 1 : S C Stack $ 0 $ 0 $ 0 a 3 $ 0 A 2 $ 0 A 2 a 5 $ 0 A 2 B 4 $ 0 C 1 Input aa$ aa$ a$ a$ $ $ $ Action start in state 0 shift (and goto state 3) reduce A a (goto 2) shift (goto 5) reduce B a (goto 4) reduce C AB (goto 1) accept (S C) 67

68 DFA for Shift/Reduce Decisions Grammar: S C C A B A a B a State I 0 : S C C A B A a goto(i 0,C) The states of the DFA are used to determine if a handle is on top of the stack State I 1 : S C Stack $ 0 $ 0 $ 0 a 3 $ 0 A 2 $ 0 A 2 a 5 $ 0 A 2 B 4 $ 0 C 1 Input aa$ aa$ a$ a$ $ $ $ Action start in state 0 shift (and goto state 3) reduce A a (goto 2) shift (goto 5) reduce B a (goto 4) reduce C AB (goto 1) accept (S C) 68

69 Model of an LR Parser stack input a 1... a i... a n $ S m X m S m-1 LR Parsing Algorithm output X m-1.. Action Table Goto Table S 1 X 1 S 0 s t a t e s terminals and $ four different actions s t a t e s non-terminal each item is a state number 69

70 A Configuration of LR Parsing Algorithm A configuration of a LR parsing is: ( S o X 1 S 1... X m S m, a i a i+1... a n $ ) Stack Rest of Input S m and a i decides the parser action by consulting the parsing action table. (Initial Stack contains just S o ) A configuration of a LR parsing represents the right sentential form: X 1... X m a i a i+1... a n $

71 Actions of A LR-Parser 1. shift s -- shifts the next input symbol and the state s onto the stack ( S o X 1 S 1... X m S m, a i a i+1... a n $ ) ( S o X 1 S 1... X m S m a i s, a i+1... a n $ ) 2. reduce A (or rn where n is a production number) pop 2 (=r) items from the stack; then push A and s where s=goto[s m-r,a] ( S o X 1 S 1... X m S m, a i a i+1... a n $ ) ( S o X 1 S 1... X m-r S m-r A s, a i... a n $ ) Output is the reducing production reduce A 3. Accept Parsing successfully completed 4. Error -- Parser detected an error (an empty entry in the action table)

72 Reduce Action pop 2 (=r) items from the stack; let us assume that = Y 1 Y 2...Y r then push A and s where s=goto[s m-r,a] ( S o X 1 S 1... X m-r S m-r Y 1 S m-r+1...y r S m, a i a i+1... a n $ ) ( S o X 1 S 1... X m-r S m-r A s, a i... a n $ ) In fact, Y 1 Y 2...Y r is a handle. X 1... X m-r A a i... a n $ X 1... X m Y 1...Y r a i a i+1... a n $

73 (SLR) Parsing Tables for Expression 1) E E+T 2) E T 3) T T*F 4) T F 5) F (E) 6) F id Grammar Action Table state id + * ( ) $ E T F 0 s5 s s6 acc 2 r2 s7 r2 r2 3 r4 r4 r4 r4 Goto Table 4 s5 s r6 r6 r6 r6 6 s5 s s5 s s6 s11 9 r1 s7 r1 r1 10 r3 r3 r3 r3 11 r5 r5 r5 r5

74 Actions of A (S)LR-Parser -- Example stack input action output 0 id*id+id$ shift 5 0id5 *id+id$ reduce by F id F id 0F3 *id+id$ reduce by T F T F 0T2 *id+id$ shift 7 0T2*7 id+id$ shift 5 0T2*7id5 +id$ reduce by F id F id 0T2*7F10 +id$ reduce by T T*FT T*F 0T2 +id$ reduce by E T E T 0E1 +id$ shift 6 0E1+6 id$ shift 5 0E1+6id5 $ reduce by F id F id 0E1+6F3 $ reduce by T F T F 0E1+6T9$ reduce by E E+T E E+T 0E1 $ accept

75 SLR Grammars SLR (Simple LR): a simple extension of LR(0) shift-reduce parsing SLR eliminates some conflicts by populating the parsing table with reductions A on symbols in FOLLOW(A) S E E id + E E id State I 0 : S E E id + E E id State I 2 : Shift on + goto(i 0,id) E id + E E id goto(i 3,+) FOLLOW(E)={$} thus reduce on $ 75

76 SLR Parsing Table Reductions do not fill entire rows Otherwise the same as LR(0) 1. S E 2. E id + E 3. E id Shift on id + $ E s2 s2 s3 ac c r3 r2 1 4 FOLLOW(E)={$} thus reduce on $ 76

77 SLR Parsing An LR(0) state is a set of LR(0) items An LR(0) item is a production with a (dot) in the right-hand side Build the LR(0) DFA by Closure operation to construct LR(0) items Goto operation to determine transitions Construct the SLR parsing table from the DFA LR parser program uses the SLR parsing table to determine shift/reduce operations 77

78 LR(0) Items of a Grammar An LR(0) item of a grammar G is a production of G with a at some position of the right-hand side Thus, a production A X Y Z has four items: [A X Y Z] [A X Y Z] [A X Y Z] [A X Y Z ] Note that production A has one item [A ] 78

79 Constructing the set of LR(0) Items of a Grammar 1. The grammar is augmented with a new start symbol S and production S S 2. Initially, set C = closure({[s S]}) (this is the start state of the DFA) 3. For each set of items I C and each grammar symbol X (N T) such that goto(i,x) C and goto(i,x), add the set of items goto(i,x) to C 4. Repeat 3 until no more sets can be added to C 79

80 The Closure Operation for LR(0) Items 1. Initially, every LR(0) item in I is added to closure(i) 2. If [A B ] closure(i) then for each production B in the grammar, add the item [B ] to I if not already in I 3. Repeat 2 until no new items can be added 80

81 The Closure Operation (Example) closure({[e E]}) = { [E E] } Grammar: E E + T T T T * F F F ( E ) F id Add [E ] { [E E] [E E + T] [E T] } Add [T ] { [E E] [E E + T] [E T] [T T * F] [T F] } Add [F ] { [E E] [E E + T] [E T] [T T * F] [T F] [F ( E )] [F id] } 81

82 The Goto Operation for LR(0) Items 1. For each item [A X ] I, add the set of items closure({[a X ]}) to goto(i,x) if not already there 2. Repeat step 1 until no more items can be added to goto(i,x) 3. Intuitively, goto(i,x) is the set of items that are valid for the viable prefix X when I is the set of items that are valid for 82

83 The Goto Operation (Example 1) Suppose I = { [E E] [E E + T] [E T] [T T * F] [T F] [F ( E )] [F id] } Then goto(i,e) = closure({[e E, E E + T]}) = { [E E ] [E E + T] } Grammar: E E + T T T T * F F F ( E ) F id 83

84 The Goto Operation (Example 2) Suppose I = { [E E ], [E E + T] } Then goto(i,+) = closure({[e E + T]}) = { [E E + T] [T T * F] [T F] [F ( E )] [F id] } Grammar: E E + T T T T * F F F ( E ) F id 84

85 Constructing SLR Parsing Tables 1. Augment the grammar with S S 2. Construct the set C={I 0,I 1,,I n } of LR(0) items 3. If [A a ] I i and goto(i i,a)=i j then set action[i,a]=shift j 4. If [A ] I i then set action[i,a]=reduce A for all a FOLLOW(A) A (apply only if A S ) 5. If [S S ] is in I i then set action[i,$]=accept 6. If goto(i i,a)=i j then set goto[i,a]=j 7. Repeat 3-6 until no more entries added 8. The initial state i is the I i holding item [S S] 85

86 The Canonical LR(0) Collection -- Example I 0 : E.E I 1 : E E. I 6 : E E+.T I 9 : E E+T. E.E+T E E.+T T.T*F T T.*F E.T T.F T.T*F I 2 : E T. F.(E) I 10 : T T*F. T.F T T.*F F.id F.(E) F.id I 3 : T F. I 7 : T T*.F I 11 : F (E). I 4 : F (.E) E.E+T E.T T.T*F T.F F.(E) F.id F.(E) F.id I 8 : F (E.) E E.+T

87 Transition Diagram (DFA) of Goto I 0 E I 1 T F id ( I 2 I 3 I 4 I 5 id + * E T F ( I 6 I 7 I 8 to I 2 to I 3 to I 4 Function T ) + F ( F id ( id I 9 to I 3 to I 4 to I 5 I 10 to I 4 to I 5 I 11 to I 6 * to I 7

88 Example SLR Grammar and LR(0) Items Augmented grammar: 1. C C 2. C A B 3. A a 4. B a I 0 = closure({[c C]}) I 1 = goto(i 0,C) = closure({[c C ]}) goto(i 0,C) State I 1 : C C final State I 4 : C A B start State I 0 : C C C A B A a goto(i 0,A) State I 2 : C A B B a goto(i 2,B) goto(i 2,a) goto(i 0,a) State I 3 : A a State I 5 : B a 88

89 Example SLR Parsing Table State I 0 : C C C A B A a State I 1 : C C State I 2 : C A B B a State I 3 : A a State I 4 : C A B State I 5 : B a a $ C A B start 0 C a A B a s3 s5 r3 ac c r2 r Grammar: 1. C C 2. C A B 3. A a 4. B a 89

90 SLR and Ambiguity Every SLR grammar is unambiguous, but not every unambiguous grammar is SLR Consider for example the unambiguous grammar S L = R R L * R id R L I 0 : S S S L=R S R L *R L id R L I 1 : S S I 2 : S L =R R L action[2,=]=s6 action[2,=]=r5 I 3 : S R no I 4 : L * R R L L *R L id Has no SLR parsing table I 5 : L id I 6 : S L= R R L L *R L id I 7 : L *R I 8 : R L I 9 : S L=R 90

91 LR(1) Grammars SLR too simple LR(1) parsing uses lookahead to avoid unnecessary conflicts in parsing table LR(1) item = LR(0) item + lookahead LR(0) item: [A ] LR(1) item: [A, a] 91

92 SLR Versus LR(1) Split the SLR states by adding LR(1) lookahead Unambiguous grammar 1. S L = R 2. R 3. L * R 4. id 5. R L I 2 : S L =R R L split S L =R R L lookahead=$ action[2,=]=s6 action[2,$]=r5 Should not reduce on =, because no right-sentential form begins with R= 92

93 LR(1) Items An LR(1) item [A, a] contains a lookahead terminal a, meaning already on top of the stack, expect to see a For items of the form [A, a] the lookahead a is used to reduce A only if the next input is a For items of the form [A, a] with the lookahead has no effect 93

94 The Closure Operation for LR(1) Items 1. Start with closure(i) = I 2. If [A B, a] closure(i) then for each production B in the grammar and each terminal b FIRST( a), add the item [B, b] to I if not already in I 3. Repeat 2 until no new items can be added 94

95 The Goto Operation for LR(1) Items 1. For each item [A X, a] I, add the set of items closure({[a X, a]}) to goto(i,x) if not already there 2. Repeat step 1 until no more items can be added to goto(i,x) 95

96 Constructing the set of LR(1) Items of a Grammar 1. Augment the grammar with a new start symbol S and production S S 2. Initially, set C = closure({[s S, $]}) (this is the start state of the DFA) 3. For each set of items I C and each grammar symbol X (N T) such that goto(i,x) C and goto(i,x), add the set of items goto(i,x) to C 4. Repeat 3 until no more sets can be added to C 96

97 Example Grammar and LR(1) Items Unambiguous LR(1) grammar: S L = R R L * R id R L Augment with S S LR(1) items (next slide) 97

98 I 0 : [S S, $] goto(i 0,S)=I 1 [S L=R, $] goto(i 0,L)=I 2 [S R, $] goto(i 0,R)=I 3 [L *R, =/$] goto(i 0,*)=I 4 [L id, =/$] goto(i 0,id)=I 5 [R L, $] goto(i 0,L)=I 2 I 6 : I 7 : [S L= R, $] goto(i 6,R)=I 9 [R L, $] goto(i 6,L)=I 10 [L *R, $] goto(i 6,*)=I 11 [L id, $] goto(i 6,id)=I 12 [L *R, =/$] I 1 : [S S, $] I 8 : [R L, =/$] I 2 : I 3 : I 4 : I 5 : [S L =R, $] goto(i 0,=)=I 6 [R L, $] [S R, $] [L * R, =/$] goto(i 4,R)=I 7 [R L, =/$] goto(i 4,L)=I 8 [L *R, =/$] goto(i 4,*)=I 4 [L id, =/$] goto(i 4,id)=I 5 [L id, =/$] I 9 : I 10 : I 11 : I 12 : I 13 : [S L=R, $] [R L, $] [L * R, $] goto(i 11,R)=I 13 [R L, $] goto(i 11,L)=I 10 [L *R, $] goto(i 11,*)=I 11 [L id, $] goto(i 11,id)=I 12 [L id, $] [L *R, $] 98

99 Constructing Canonical LR(1) Parsing Tables 1. Augment the grammar with S S 2. Construct the set C={I 0,I 1,,I n } of LR(1) items 3. If [A a, b] I i and goto(i i,a)=i j then set action[i,a]=shift j 4. If [A, a] I i then set action[i,a]=reduce A (apply only if A S ) 5. If [S S, $] is in I i then set action[i,$]=accept 6. If goto(i i,a)=i j then set goto[i,a]=j 7. Repeat 3-6 until no more entries added 8. The initial state i is the I i holding item [S S,$] 99

100 Example LR(1) Parsing Table 0 id * = $ s5 s4 S L R Grammar: 1. S S 2. S L = R 3. S R 4. L * R 5. L id 6. R L s5 s1 2 s1 2 s4 s1 1 s1 1 s6 r5 r4 r6 ac c r6 r3 r5 r4 r6 r2 r

101 LALR(1) Grammars LR(1) parsing tables have many states LALR(1) parsing (Look-Ahead LR) combines LR(1) states to reduce table size Less powerful than LR(1) Will not introduce shift-reduce conflicts, because shifts do not use lookaheads May introduce reduce-reduce conflicts, but seldom do so for grammars of programming languages 101

102 Constructing LALR(1) Parsing Tables 1. Construct sets of LR(1) items 2. Combine LR(1) sets with sets of items that share the same first part I 4 : I 11 : [L * R, =] [R L, =] [L *R, =] [L id, =] [L * R, $] [R L, $] [L *R, $] [L id, $] [L * R, =/$] [R L, =/$] [L *R, =/$] [L id, =/$] Shorthand for two items in the same set 102

103 Example LALR(1) Grammar Unambiguous LR(1) grammar: S L = R R L * R id R L Augment with S S LALR(1) items (next slide) 103

104 I 0 : I 1 : [S S,$] goto(i 0,S)=I 1 [S L=R,$] goto(i 0,L)=I 2 [S R,$] goto(i 0,R)=I 3 [L *R,=/$] goto(i 0,*)=I 4 [L id,=/$] goto(i 0,id)=I 5 [R L,$] goto(i 0,L)=I 2 goto(i 0,=)= I 6 [S S,$] I 6 : I 7 : I 8 : [S L= R, $] goto(i 6,R)=I 8 [R L, $] goto(i 6,L)=I 9 [L *R, $] goto(i 6,*)=I 4 [L id, $] goto(i 6,id)=I 5 [L *R,=/$] [S L=R,$] I 2 : [S L =R,$] [R L,$] I 9 : [R L,=/$] I 3 : I 4 : [S R,$] [L * R,=/$] goto(i 4,R)=I 7 [R L,=/$] goto(i 4,L)=I 9 [L *R,=/$] goto(i 4,*)=I 4 [L id,=/$] goto(i 4,id)=I 5 [R L,=] [R L,$] Shorthand for two items I 5 : [L id,=/$] 104

105 Example LALR(1) Parsing Table 0 id * = $ s5 s4 S L R Grammar: 1. S S 2. S L = R 3. S R 4. L * R 5. L id 6. R L s5 s5 s4 s4 s6 r5 r4 r6 ac c r6 r3 r5 r4 r2 r

106 LL, SLR, LR, LALR Summary LL parse tables computed using FIRST/FOLLOW Nonterminals terminals productions Computed using FIRST/FOLLOW LR parsing tables computed using closure/goto LR states terminals shift/reduce actions LR states nonterminals goto state transitions A grammar is LL(1) if its LL(1) parse table has no conflicts SLR if its SLR parse table has no conflicts LALR(1) if its LALR(1) parse table has no conflicts LR(1) if its LR(1) parse table has no conflicts 106

107 LL, SLR, LR, LALR Grammars LR(1) LALR(1) LL(1) SLR LR(0) 107

108 Dealing with Ambiguous Grammars 1. S E 2. E E + E 3. E id id + $ s2 s2 s3 r3 s3/r 2 ac c r3 r2 E 1 4 $ 0 stack $ 0 E E 4 input id+id+id$ +id$ When shifting on +: yields right associativity id+(id+id) Shift/reduce conflict: action[4,+] = shift 4 action[4,+] = reduce E E + E When reducing on +: yields left associativity (id+id)+id 108

109 Using Associativity and Precedence to Resolve Conflicts Left-associative operators: reduce Right-associative operators: shift Operator of higher precedence on stack: reduce Operator of lower precedence on stack: shift stack $ 0 S E E E + E E E * E E id $ 0 E 1 * 3 E 5 input id*id+id$ +id$ reduce E E * E 109

110 Error Detection in LR Parsing Canonical LR parser uses full LR(1) parse tables and will never make a single reduction before recognizing the error when a syntax error occurs on the input SLR and LALR may still reduce when a syntax error occurs on the input, but will never shift the erroneous input symbol 110

111 Error Recovery in LR Parsing Panic mode Pop until state with a goto on a nonterminal A is found, (where A represents a major programming construct), push A Discard input symbols until one is found in the FOLLOW set of A Phrase-level recovery Implement error routines for every error entry in table Error productions Pop until state has error production, then shift on stack Discard input until symbol is encountered that allows parsing to continue 111

112 ANTLR, Yacc, and Bison ANTLR tool Generates LL(k) parsers Yacc (Yet Another Compiler Compiler) Generates LALR(1) parsers Bison Improved version of Yacc 112

113 Creating an LALR(1) Parser with Yacc/Bison yacc specification yacc.y Yacc or Bison compiler y.tab.c y.tab.c C compiler a.out input stream a.out output stream 113

114 Yacc Specification A yacc specification consists of three parts: yacc declarations, and C declarations within %{ %} %% translation rules %% user-defined auxiliary procedures The translation rules are productions with actions: production 1 { semantic action 1 } production 2 { semantic action 2 } production n { semantic action n } 114

115 Writing a Grammar in Yacc Productions in Yacc are of the form Nonterminal: tokens/nonterminals { action } tokens/nonterminals { action } ; Tokens that are single characters can be used directly within productions, e.g. + Named tokens must be declared first in the declaration part using %token TokenName 115

116 Synthesized Attributes Semantic actions may refer to values of the synthesized attributes of terminals and nonterminals in a production: X : Y 1 Y 2 Y 3 Y n { action } $$ refers to the value of the attribute of X $i refers to the value of the attribute of Y i For example factor : ( expr ) { $$=$2; } factor.val=x ( expr.val=x ) $$=$2 116

117 Example 1 Also results in definition of #define DIGIT xxx %{ #include <ctype.h> %} %token DIGIT %% line : expr \n { printf( %d\n, $1); } ; expr : expr + term { $$ = $1 + $3; } term { $$ = $1; } ; term : term * factor { $$ = $1 * $3; } factor { $$ = $1; } ; factor : ( expr ) { $$ = $2; } DIGIT { $$ = $1; } ; %% int yylex() { int c = getchar(); if (isdigit(c)) { yylval = c- 0 ; return DIGIT; } return c; } Attribute of term (parent) Example of a very crude lexical analyzer invoked by the parser Attribute of factor (child) Attribute of token (stored in yylval) 117

118 Dealing With Ambiguous Grammars By defining operator precedence levels and left/right associativity of the operators, we can specify ambiguous grammars in Yacc, such as E E+E E-E E*E E/E (E) -E num To define precedence levels and associativity in Yacc s declaration part: %left + - %left * / %right UMINUS 118

119 Example 2 %{ #include <ctype.h> #include <stdio.h> #define YYSTYPE double %} %token NUMBER %left + - %left * / %right UMINUS %% Double type for attributes and yylval lines : lines expr \n { printf( %g\n, $2); } lines \n /* empty */ ; expr : expr + expr { $$ = $1 + $3; } expr - expr { $$ = $1 - $3; } expr * expr { $$ = $1 * $3; } expr / expr { $$ = $1 / $3; } ( expr ) { $$ = $2; } - expr %prec UMINUS { $$ = -$2; } NUMBER ; %% 119

120 Example 2 (cont d) %% int yylex() { int c; while ((c = getchar()) == ) ; if ((c ==. ) isdigit(c)) { ungetc(c, stdin); scanf( %lf, &yylval); return NUMBER; } return c; } int main() { if (yyparse()!= 0) fprintf(stderr, Abnormal exit\n ); return 0; } int yyerror(char *s) { fprintf(stderr, Error: %s\n, s); } Crude lexical analyzer for fp doubles and arithmetic operators Run the parser Invoked by parser to report parse errors 120

121 Combining Lex/Flex with Yacc/Bison yacc specification yacc.y Yacc or Bison compiler y.tab.c y.tab.h Lex specification lex.l and token definitions y.tab.h Lex or Flex compiler lex.yy.c lex.yy.c y.tab.c C compiler a.out input stream a.out output stream 121

122 Lex Specification for Example 2 %option noyywrap %{ #include y.tab.h extern double yylval; %} number [0-9]+\.? [0-9]*\.[0-9]+ %% [ ] { /* skip blanks */ } {number} { sscanf(yytext, %lf, &yylval); return NUMBER; } \n. { return yytext[0]; } Generated by Yacc, contains #define NUMBER xxx Defined in y.tab.c yacc -d example2.y lex example2.l gcc y.tab.c lex.yy.c./a.out bison -d -y example2.y flex example2.l gcc y.tab.c lex.yy.c./a.out 122

123 Error Recovery in Yacc %{ %} %% lines : lines expr \n { printf( %g\n, $2; } lines \n /* empty */ error \n { yyerror( reenter last line: ); yyerrok; } ; Error production: set error mode and skip input until newline Reset parser to normal mode 123

Principles of Programming Languages

Principles of Programming Languages Principles of Programming Languages h"p://www.di.unipi.it/~andrea/dida2ca/plp- 14/ Prof. Andrea Corradini Department of Computer Science, Pisa Lesson 8! Bo;om- Up Parsing Shi?- Reduce LR(0) automata and

More information

Principles of Programming Languages

Principles of Programming Languages Principles of Programming Languages h"p://www.di.unipi.it/~andrea/dida2ca/plp- 15/ Prof. Andrea Corradini Department of Computer Science, Pisa Lesson 10! LR parsing with ambiguous grammars Error recovery

More information

Syntax Analysis Part IV

Syntax Analysis Part IV Syntax Analysis Part IV Chapter 4: Bison Slides adapted from : Robert van Engelen, Florida State University Yacc and Bison Yacc (Yet Another Compiler Compiler) Generates LALR(1) parsers Bison Improved

More information

Syntax Analysis Part I

Syntax Analysis Part I Syntax Analysis Part I Chapter 4: Context-Free Grammars Slides adapted from : Robert van Engelen, Florida State University Position of a Parser in the Compiler Model Source Program Lexical Analyzer Token,

More information

LR Parsing Techniques

LR Parsing Techniques LR Parsing Techniques Introduction Bottom-Up Parsing LR Parsing as Handle Pruning Shift-Reduce Parser LR(k) Parsing Model Parsing Table Construction: SLR, LR, LALR 1 Bottom-UP Parsing A bottom-up parser

More information

Bottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S.

Bottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S. Bottom up parsing Construct a parse tree for an input string beginning at leaves and going towards root OR Reduce a string w of input to start symbol of grammar Consider a grammar S aabe A Abc b B d And

More information

LALR Parsing. What Yacc and most compilers employ.

LALR Parsing. What Yacc and most compilers employ. LALR Parsing Canonical sets of LR(1) items Number of states much larger than in the SLR construction LR(1) = Order of thousands for a standard prog. Lang. SLR(1) = order of hundreds for a standard prog.

More information

CS308 Compiler Principles Syntax Analyzer Li Jiang

CS308 Compiler Principles Syntax Analyzer Li Jiang CS308 Syntax Analyzer Li Jiang Department of Computer Science and Engineering Shanghai Jiao Tong University Syntax Analyzer Syntax Analyzer creates the syntactic structure of the given source program.

More information

Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table. MODULE 18 LALR parsing After understanding the most powerful CALR parser, in this module we will learn to construct the LALR parser. The CALR parser has a large set of items and hence the LALR parser is

More information

Parsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1)

Parsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1) TD parsing - LL(1) Parsing First and Follow sets Parse table construction BU Parsing Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1) Problems with SLR Aho, Sethi, Ullman, Compilers

More information

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino 3. Syntax Analysis Andrea Polini Formal Languages and Compilers Master in Computer Science University of Camerino (Formal Languages and Compilers) 3. Syntax Analysis CS@UNICAM 1 / 54 Syntax Analysis: the

More information

UNIT-III BOTTOM-UP PARSING

UNIT-III BOTTOM-UP PARSING UNIT-III BOTTOM-UP PARSING Constructing a parse tree for an input string beginning at the leaves and going towards the root is called bottom-up parsing. A general type of bottom-up parser is a shift-reduce

More information

Context-free grammars

Context-free grammars Context-free grammars Section 4.2 Formal way of specifying rules about the structure/syntax of a program terminals - tokens non-terminals - represent higher-level structures of a program start symbol,

More information

Compiler Construction: Parsing

Compiler Construction: Parsing Compiler Construction: Parsing Mandar Mitra Indian Statistical Institute M. Mitra (ISI) Parsing 1 / 33 Context-free grammars. Reference: Section 4.2 Formal way of specifying rules about the structure/syntax

More information

PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309

PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309 PART 3 - SYNTAX ANALYSIS F. Wotawa (IST @ TU Graz) Compiler Construction Summer term 2016 64 / 309 Goals Definition of the syntax of a programming language using context free grammars Methods for parsing

More information

LR Parsing Techniques

LR Parsing Techniques LR Parsing Techniques Bottom-Up Parsing - LR: a special form of BU Parser LR Parsing as Handle Pruning Shift-Reduce Parser (LR Implementation) LR(k) Parsing Model - k lookaheads to determine next action

More information

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous. Section A 1. What do you meant by parser and its types? A parser for grammar G is a program that takes as input a string w and produces as output either a parse tree for w, if w is a sentence of G, or

More information

Compilerconstructie. najaar Rudy van Vliet kamer 140 Snellius, tel rvvliet(at)liacs(dot)nl. college 3, vrijdag 22 september 2017

Compilerconstructie. najaar Rudy van Vliet kamer 140 Snellius, tel rvvliet(at)liacs(dot)nl. college 3, vrijdag 22 september 2017 Compilerconstructie najaar 2017 http://www.liacs.leidenuniv.nl/~vlietrvan1/coco/ Rudy van Vliet kamer 140 Snellius, tel. 071-527 2876 rvvliet(at)liacs(dot)nl college 3, vrijdag 22 september 2017 + werkcollege

More information

Syntax Analysis. Amitabha Sanyal. (www.cse.iitb.ac.in/ as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

Syntax Analysis. Amitabha Sanyal. (www.cse.iitb.ac.in/ as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay Syntax Analysis (www.cse.iitb.ac.in/ as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay September 2007 College of Engineering, Pune Syntax Analysis: 2/124 Syntax

More information

Syntax-Directed Translation

Syntax-Directed Translation Syntax-Directed Translation ALSU Textbook Chapter 5.1 5.4, 4.8, 4.9 Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 What is syntax-directed translation? Definition: The compilation

More information

Formal Languages and Compilers Lecture VII Part 3: Syntactic A

Formal Languages and Compilers Lecture VII Part 3: Syntactic A Formal Languages and Compilers Lecture VII Part 3: Syntactic Analysis Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/

More information

A left-sentential form is a sentential form that occurs in the leftmost derivation of some sentence.

A left-sentential form is a sentential form that occurs in the leftmost derivation of some sentence. Bottom-up parsing Recall For a grammar G, with start symbol S, any string α such that S α is a sentential form If α V t, then α is a sentence in L(G) A left-sentential form is a sentential form that occurs

More information

Bottom-up parsing. Bottom-Up Parsing. Recall. Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form

Bottom-up parsing. Bottom-Up Parsing. Recall. Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form Bottom-up parsing Bottom-up parsing Recall Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form If α V t,thenα is called a sentence in L(G) Otherwise it is just

More information

WWW.STUDENTSFOCUS.COM UNIT -3 SYNTAX ANALYSIS 3.1 ROLE OF THE PARSER Parser obtains a string of tokens from the lexical analyzer and verifies that it can be generated by the language for the source program.

More information

Monday, September 13, Parsers

Monday, September 13, Parsers Parsers Agenda Terminology LL(1) Parsers Overview of LR Parsing Terminology Grammar G = (Vt, Vn, S, P) Vt is the set of terminals Vn is the set of non-terminals S is the start symbol P is the set of productions

More information

UNIT III & IV. Bottom up parsing

UNIT III & IV. Bottom up parsing UNIT III & IV Bottom up parsing 5.0 Introduction Given a grammar and a sentence belonging to that grammar, if we have to show that the given sentence belongs to the given grammar, there are two methods.

More information

MODULE 14 SLR PARSER LR(0) ITEMS

MODULE 14 SLR PARSER LR(0) ITEMS MODULE 14 SLR PARSER LR(0) ITEMS In this module we shall discuss one of the LR type parser namely SLR parser. The various steps involved in the SLR parser will be discussed with a focus on the construction

More information

CSE302: Compiler Design

CSE302: Compiler Design CSE302: Compiler Design Instructor: Dr. Liang Cheng Department of Computer Science and Engineering P.C. Rossin College of Engineering & Applied Science Lehigh University March 27, 2007 Outline Recap General/Canonical

More information

Wednesday, September 9, 15. Parsers

Wednesday, September 9, 15. Parsers Parsers What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

More information

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs: What is a parser Parsers A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

More information

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468 Parsers Xiaokang Qiu Purdue University ECE 468 August 31, 2018 What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure

More information

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing Roadmap > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing The role of the parser > performs context-free syntax analysis > guides

More information

Parsing Wrapup. Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing. This lecture LR(1) parsing

Parsing Wrapup. Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing. This lecture LR(1) parsing Parsing Wrapup Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing LR(1) items Computing closure Computing goto LR(1) canonical collection This lecture LR(1) parsing Building ACTION

More information

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised: EDAN65: Compilers, Lecture 06 A LR parsing Görel Hedin Revised: 2017-09-11 This lecture Regular expressions Context-free grammar Attribute grammar Lexical analyzer (scanner) Syntactic analyzer (parser)

More information

Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals.

Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals. Bottom-up Parsing: Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals. Basic operation is to shift terminals from the input to the

More information

Acknowledgements. The slides for this lecture are a modified versions of the offering by Prof. Sanjeev K Aggarwal

Acknowledgements. The slides for this lecture are a modified versions of the offering by Prof. Sanjeev K Aggarwal Acknowledgements The slides for this lecture are a modified versions of the offering by Prof. Sanjeev K Aggarwal Syntax Analysis Check syntax and construct abstract syntax tree if == = ; b 0 a b Error

More information

MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology MIT 6.035 Parse Table Construction Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Parse Tables (Review) ACTION Goto State ( ) $ X s0 shift to s2 error error goto s1

More information

Concepts Introduced in Chapter 4

Concepts Introduced in Chapter 4 Concepts Introduced in Chapter 4 Grammars Context-Free Grammars Derivations and Parse Trees Ambiguity, Precedence, and Associativity Top Down Parsing Recursive Descent, LL Bottom Up Parsing SLR, LR, LALR

More information

SYNTAX ANALYSIS 1. Define parser. Hierarchical analysis is one in which the tokens are grouped hierarchically into nested collections with collective meaning. Also termed as Parsing. 2. Mention the basic

More information

Wednesday, August 31, Parsers

Wednesday, August 31, Parsers Parsers How do we combine tokens? Combine tokens ( words in a language) to form programs ( sentences in a language) Not all combinations of tokens are correct programs (not all sentences are grammatically

More information

Syntax Analyzer --- Parser

Syntax Analyzer --- Parser Syntax Analyzer --- Parser ASU Textbook Chapter 4.2--4.9 (w/o error handling) Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 A program represented by a sequence of tokens

More information

Bottom-Up Parsing. Parser Generation. LR Parsing. Constructing LR Parser

Bottom-Up Parsing. Parser Generation. LR Parsing. Constructing LR Parser Parser Generation Main Problem: given a grammar G, how to build a top-down parser or a bottom-up parser for it? parser : a program that, given a sentence, reconstructs a derivation for that sentence ----

More information

3. Parsing. Oscar Nierstrasz

3. Parsing. Oscar Nierstrasz 3. Parsing Oscar Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes. http://www.cs.ucla.edu/~palsberg/ http://www.cs.purdue.edu/homes/hosking/

More information

Syntactic Analysis. Chapter 4. Compiler Construction Syntactic Analysis 1

Syntactic Analysis. Chapter 4. Compiler Construction Syntactic Analysis 1 Syntactic Analysis Chapter 4 Compiler Construction Syntactic Analysis 1 Context-free Grammars The syntax of programming language constructs can be described by context-free grammars (CFGs) Relatively simple

More information

Bottom Up Parsing. Shift and Reduce. Sentential Form. Handle. Parse Tree. Bottom Up Parsing 9/26/2012. Also known as Shift-Reduce parsing

Bottom Up Parsing. Shift and Reduce. Sentential Form. Handle. Parse Tree. Bottom Up Parsing 9/26/2012. Also known as Shift-Reduce parsing Also known as Shift-Reduce parsing More powerful than top down Don t need left factored grammars Can handle left recursion Attempt to construct parse tree from an input string eginning at leaves and working

More information

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38 Syntax Analysis Martin Sulzmann Martin Sulzmann Syntax Analysis 1 / 38 Syntax Analysis Objective Recognize individual tokens as sentences of a language (beyond regular languages). Example 1 (OK) Program

More information

Syntactic Analysis. Top-Down Parsing

Syntactic Analysis. Top-Down Parsing Syntactic Analysis Top-Down Parsing Copyright 2017, Pedro C. Diniz, all rights reserved. Students enrolled in Compilers class at University of Southern California (USC) have explicit permission to make

More information

shift-reduce parsing

shift-reduce parsing Parsing #2 Bottom-up Parsing Rightmost derivations; use of rules from right to left Uses a stack to push symbols the concatenation of the stack symbols with the rest of the input forms a valid bottom-up

More information

LR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing.

LR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing. LR Parsing Compiler Design CSE 504 1 Shift-Reduce Parsing 2 LR Parsers 3 SLR and LR(1) Parsers Last modifled: Fri Mar 06 2015 at 13:50:06 EST Version: 1.7 16:58:46 2016/01/29 Compiled at 12:57 on 2016/02/26

More information

Downloaded from Page 1. LR Parsing

Downloaded from  Page 1. LR Parsing Downloaded from http://himadri.cmsdu.org Page 1 LR Parsing We first understand Context Free Grammars. Consider the input string: x+2*y When scanned by a scanner, it produces the following stream of tokens:

More information

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table COMPILER CONSTRUCTION Lab 2 Symbol table LABS Lab 3 LR parsing and abstract syntax tree construction using ''bison' Lab 4 Semantic analysis (type checking) PHASES OF A COMPILER Source Program Lab 2 Symtab

More information

Lecture 7: Deterministic Bottom-Up Parsing

Lecture 7: Deterministic Bottom-Up Parsing Lecture 7: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Tue Sep 20 12:50:42 2011 CS164: Lecture #7 1 Avoiding nondeterministic choice: LR We ve been looking at general

More information

CS2210: Compiler Construction Syntax Analysis Syntax Analysis

CS2210: Compiler Construction Syntax Analysis Syntax Analysis Comparison with Lexical Analysis The second phase of compilation Phase Input Output Lexer string of characters string of tokens Parser string of tokens Parse tree/ast What Parse Tree? CS2210: Compiler

More information

Lecture 8: Deterministic Bottom-Up Parsing

Lecture 8: Deterministic Bottom-Up Parsing Lecture 8: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Fri Feb 12 13:02:57 2010 CS164: Lecture #8 1 Avoiding nondeterministic choice: LR We ve been looking at general

More information

Top down vs. bottom up parsing

Top down vs. bottom up parsing Parsing A grammar describes the strings that are syntactically legal A recogniser simply accepts or rejects strings A generator produces sentences in the language described by the grammar A parser constructs

More information

8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing

8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing 8 Parsing Parsing A grammar describes syntactically legal strings in a language A recogniser simply accepts or rejects strings A generator produces strings A parser constructs a parse tree for a string

More information

CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1

CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1 CSE P 501 Compilers LR Parsing Hal Perkins Spring 2018 UW CSE P 501 Spring 2018 D-1 Agenda LR Parsing Table-driven Parsers Parser States Shift-Reduce and Reduce-Reduce conflicts UW CSE P 501 Spring 2018

More information

Introduction to parsers

Introduction to parsers Syntax Analysis Introduction to parsers Context-free grammars Push-down automata Top-down parsing LL grammars and parsers Bottom-up parsing LR grammars and parsers Bison/Yacc - parser generators Error

More information

Parser Generation. Bottom-Up Parsing. Constructing LR Parser. LR Parsing. Construct parse tree bottom-up --- from leaves to the root

Parser Generation. Bottom-Up Parsing. Constructing LR Parser. LR Parsing. Construct parse tree bottom-up --- from leaves to the root Parser Generation Main Problem: given a grammar G, how to build a top-down parser or a bottom-up parser for it? parser : a program that, given a sentence, reconstructs a derivation for that sentence ----

More information

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

COP4020 Programming Languages. Syntax Prof. Robert van Engelen COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview n Tokens and regular expressions n Syntax and context-free grammars n Grammar derivations n More about parse trees n Top-down and

More information

Simple LR (SLR) LR(0) Drawbacks LR(1) SLR Parse. LR(1) Start State and Reduce. LR(1) Items 10/3/2012

Simple LR (SLR) LR(0) Drawbacks LR(1) SLR Parse. LR(1) Start State and Reduce. LR(1) Items 10/3/2012 LR(0) Drawbacks Consider the unambiguous augmented grammar: 0.) S E $ 1.) E T + E 2.) E T 3.) T x If we build the LR(0) DFA table, we find that there is a shift-reduce conflict. This arises because the

More information

CSE 401 Compilers. LR Parsing Hal Perkins Autumn /10/ Hal Perkins & UW CSE D-1

CSE 401 Compilers. LR Parsing Hal Perkins Autumn /10/ Hal Perkins & UW CSE D-1 CSE 401 Compilers LR Parsing Hal Perkins Autumn 2011 10/10/2011 2002-11 Hal Perkins & UW CSE D-1 Agenda LR Parsing Table-driven Parsers Parser States Shift-Reduce and Reduce-Reduce conflicts 10/10/2011

More information

SYED AMMAL ENGINEERING COLLEGE (An ISO 9001:2008 Certified Institution) Dr. E.M. Abdullah Campus, Ramanathapuram

SYED AMMAL ENGINEERING COLLEGE (An ISO 9001:2008 Certified Institution) Dr. E.M. Abdullah Campus, Ramanathapuram CS6660 COMPILER DESIGN Question Bank UNIT I-INTRODUCTION TO COMPILERS 1. Define compiler. 2. Differentiate compiler and interpreter. 3. What is a language processing system? 4. List four software tools

More information

CSCI Compiler Design

CSCI Compiler Design Syntactic Analysis Automatic Parser Generators: The UNIX YACC Tool Portions of this lecture were adapted from Prof. Pedro Reis Santos s notes for the 2006 Compilers class lectured at IST/UTL in Lisbon,

More information

VIVA QUESTIONS WITH ANSWERS

VIVA QUESTIONS WITH ANSWERS VIVA QUESTIONS WITH ANSWERS 1. What is a compiler? A compiler is a program that reads a program written in one language the source language and translates it into an equivalent program in another language-the

More information

S Y N T A X A N A L Y S I S LR

S Y N T A X A N A L Y S I S LR LR parsing There are three commonly used algorithms to build tables for an LR parser: 1. SLR(1) = LR(0) plus use of FOLLOW set to select between actions smallest class of grammars smallest tables (number

More information

LR Parsers. Aditi Raste, CCOEW

LR Parsers. Aditi Raste, CCOEW LR Parsers Aditi Raste, CCOEW 1 LR Parsers Most powerful shift-reduce parsers and yet efficient. LR(k) parsing L : left to right scanning of input R : constructing rightmost derivation in reverse k : number

More information

Compiler Construction 2016/2017 Syntax Analysis

Compiler Construction 2016/2017 Syntax Analysis Compiler Construction 2016/2017 Syntax Analysis Peter Thiemann November 2, 2016 Outline 1 Syntax Analysis Recursive top-down parsing Nonrecursive top-down parsing Bottom-up parsing Syntax Analysis tokens

More information

Compiler Design 1. Bottom-UP Parsing. Goutam Biswas. Lect 6

Compiler Design 1. Bottom-UP Parsing. Goutam Biswas. Lect 6 Compiler Design 1 Bottom-UP Parsing Compiler Design 2 The Process The parse tree is built starting from the leaf nodes labeled by the terminals (tokens). The parser tries to discover appropriate reductions,

More information

The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program.

The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program. COMPILER DESIGN 1. What is a compiler? A compiler is a program that reads a program written in one language the source language and translates it into an equivalent program in another language-the target

More information

Formal Languages and Compilers Lecture VII Part 4: Syntactic A

Formal Languages and Compilers Lecture VII Part 4: Syntactic A Formal Languages and Compilers Lecture VII Part 4: Syntactic Analysis Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/

More information

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING Subject Name: CS2352 Principles of Compiler Design Year/Sem : III/VI

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING Subject Name: CS2352 Principles of Compiler Design Year/Sem : III/VI DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING Subject Name: CS2352 Principles of Compiler Design Year/Sem : III/VI UNIT I - LEXICAL ANALYSIS 1. What is the role of Lexical Analyzer? [NOV 2014] 2. Write

More information

COP 3402 Systems Software Syntax Analysis (Parser)

COP 3402 Systems Software Syntax Analysis (Parser) COP 3402 Systems Software Syntax Analysis (Parser) Syntax Analysis 1 Outline 1. Definition of Parsing 2. Context Free Grammars 3. Ambiguous/Unambiguous Grammars Syntax Analysis 2 Lexical and Syntax Analysis

More information

Compilers. Yannis Smaragdakis, U. Athens (original slides by Sam

Compilers. Yannis Smaragdakis, U. Athens (original slides by Sam Compilers Parsing Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Next step text chars Lexical analyzer tokens Parser IR Errors Parsing: Organize tokens into sentences Do tokens conform

More information

Syntax Analysis. Chapter 4

Syntax Analysis. Chapter 4 Syntax Analysis Chapter 4 Check (Important) http://www.engineersgarage.com/contributio n/difference-between-compiler-andinterpreter Introduction covers the major parsing methods that are typically used

More information

Principle of Compilers Lecture IV Part 4: Syntactic Analysis. Alessandro Artale

Principle of Compilers Lecture IV Part 4: Syntactic Analysis. Alessandro Artale Free University of Bolzano Principles of Compilers Lecture IV Part 4, 2003/2004 AArtale (1) Principle of Compilers Lecture IV Part 4: Syntactic Analysis Alessandro Artale Faculty of Computer Science Free

More information

Using an LALR(1) Parser Generator

Using an LALR(1) Parser Generator Using an LALR(1) Parser Generator Yacc is an LALR(1) parser generator Developed by S.C. Johnson and others at AT&T Bell Labs Yacc is an acronym for Yet another compiler compiler Yacc generates an integrated

More information

Compilers. Bottom-up Parsing. (original slides by Sam

Compilers. Bottom-up Parsing. (original slides by Sam Compilers Bottom-up Parsing Yannis Smaragdakis U Athens Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Bottom-Up Parsing More general than top-down parsing And just as efficient Builds

More information

Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery. Last modified: Mon Feb 23 10:05: CS164: Lecture #14 1

Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery. Last modified: Mon Feb 23 10:05: CS164: Lecture #14 1 Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 1 Shift/Reduce Conflicts If a DFA state contains both [X: α aβ, b] and [Y: γ, a],

More information

LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States

LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States TDDD16 Compilers and Interpreters TDDB44 Compiler Construction LR Parsing, Part 2 Constructing Parse Tables Parse table construction Grammar conflict handling Categories of LR Grammars and Parsers An NFA

More information

Bottom-Up Parsing. Lecture 11-12

Bottom-Up Parsing. Lecture 11-12 Bottom-Up Parsing Lecture 11-12 (From slides by G. Necula & R. Bodik) 2/20/08 Prof. Hilfinger CS164 Lecture 11 1 Administrivia Test I during class on 10 March. 2/20/08 Prof. Hilfinger CS164 Lecture 11

More information

Lex & Yacc. by H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages:

Lex & Yacc. by H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages: Lex & Yacc by H. Altay Güvenir A compiler or an interpreter performs its task in 3 stages: 1) Lexical Analysis: Lexical analyzer: scans the input stream and converts sequences of characters into tokens.

More information

Introduction to Yacc. General Description Input file Output files Parsing conflicts Pseudovariables Examples. Principles of Compilers - 16/03/2006

Introduction to Yacc. General Description Input file Output files Parsing conflicts Pseudovariables Examples. Principles of Compilers - 16/03/2006 Introduction to Yacc General Description Input file Output files Parsing conflicts Pseudovariables Examples General Description A parser generator is a program that takes as input a specification of a

More information

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan Language Processing Systems Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan Syntax Analysis (Parsing) 1. Uses Regular Expressions to define tokens 2. Uses Finite Automata to

More information

Note that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2).

Note that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2). LL(k) Grammars We need a bunch of terminology. For any terminal string a we write First k (a) is the prefix of a of length k (or all of a if its length is less than k) For any string g of terminal and

More information

4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

More information

Yacc Yet Another Compiler Compiler

Yacc Yet Another Compiler Compiler LEX and YACC work as a team Yacc Yet Another Compiler Compiler How to work? Some material adapted from slides by Andy D. Pimentel LEX and YACC work as a team Availability call yylex() NUM + NUM next token

More information

Lex & Yacc. By H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages:

Lex & Yacc. By H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages: Lex & Yacc By H. Altay Güvenir A compiler or an interpreter performs its task in 3 stages: 1) Lexical Analysis: Lexical analyzer: scans the input stream and converts sequences of characters into tokens.

More information

Bottom-Up Parsing. Lecture 11-12

Bottom-Up Parsing. Lecture 11-12 Bottom-Up Parsing Lecture 11-12 (From slides by G. Necula & R. Bodik) 9/22/06 Prof. Hilfinger CS164 Lecture 11 1 Bottom-Up Parsing Bottom-up parsing is more general than topdown parsing And just as efficient

More information

CSE302: Compiler Design

CSE302: Compiler Design CSE302: Compiler Design Instructor: Dr. Liang Cheng Department of Computer Science and Engineering P.C. Rossin College of Engineering & Applied Science Lehigh University February 20, 2007 Outline Recap

More information

4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

More information

COMPILER (CSE 4120) (Lecture 6: Parsing 4 Bottom-up Parsing )

COMPILER (CSE 4120) (Lecture 6: Parsing 4 Bottom-up Parsing ) COMPILR (CS 4120) (Lecture 6: Parsing 4 Bottom-up Parsing ) Sungwon Jung Mobile Computing & Data ngineering Lab Dept. of Computer Science and ngineering Sogang University Seoul, Korea Tel: +82-2-705-8930

More information

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F). CS 2210 Sample Midterm 1. Determine if each of the following claims is true (T) or false (F). F A language consists of a set of strings, its grammar structure, and a set of operations. (Note: a language

More information

CA Compiler Construction

CA Compiler Construction CA4003 - Compiler Construction David Sinclair A top-down parser starts with the root of the parse tree, labelled with the goal symbol of the grammar, and repeats the following steps until the fringe of

More information

Parsing How parser works?

Parsing How parser works? Language Processing Systems Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan Syntax Analysis (Parsing) 1. Uses Regular Expressions to define tokens 2. Uses Finite Automata to

More information

CS453 : JavaCUP and error recovery. CS453 Shift-reduce Parsing 1

CS453 : JavaCUP and error recovery. CS453 Shift-reduce Parsing 1 CS453 : JavaCUP and error recovery CS453 Shift-reduce Parsing 1 Shift-reduce parsing in an LR parser LR(k) parser Left-to-right parse Right-most derivation K-token look ahead LR parsing algorithm using

More information

CS415 Compilers. LR Parsing & Error Recovery

CS415 Compilers. LR Parsing & Error Recovery CS415 Compilers LR Parsing & Error Recovery These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Review: LR(k) items The LR(1) table construction

More information

PRACTICAL CLASS: Flex & Bison

PRACTICAL CLASS: Flex & Bison Master s Degree Course in Computer Engineering Formal Languages FORMAL LANGUAGES AND COMPILERS PRACTICAL CLASS: Flex & Bison Eliana Bove eliana.bove@poliba.it Install On Linux: install with the package

More information

TDDD55 - Compilers and Interpreters Lesson 3

TDDD55 - Compilers and Interpreters Lesson 3 TDDD55 - Compilers and Interpreters Lesson 3 November 22 2011 Kristian Stavåker (kristian.stavaker@liu.se) Department of Computer and Information Science Linköping University LESSON SCHEDULE November 1,

More information

Yacc: A Syntactic Analysers Generator

Yacc: A Syntactic Analysers Generator Yacc: A Syntactic Analysers Generator Compiler-Construction Tools The compiler writer uses specialised tools (in addition to those normally used for software development) that produce components that can

More information