CS2210: Compiler Construction Syntax Analysis Syntax Analysis

Size: px
Start display at page:

Download "CS2210: Compiler Construction Syntax Analysis Syntax Analysis"

Transcription

1

2 Comparison with Lexical Analysis The second phase of compilation Phase Input Output Lexer string of characters string of tokens Parser string of tokens Parse tree/ast

3 What Parse Tree? CS2210: Compiler Construction A parse tree represents the program structure of the input Programming language constructs usually have recursive structures If-stmt if (EXPR) then Stmt else Stmt fi Stmt If-stmt While-stmt...

4 A Parse Tree Example Code to be compiled:... if x==y then... else... fi Lexer: Parser: Input: sequence of tokens... IF ID==ID THEN... ELSE... FI Desired output: IF-STMT == STMT STMT ID ID

5 How to Specify a Syntax How can we specify a syntax with nested structures? Is it possible to use RE/FA? RE(Regular Expression) FA(Finite Automata)

6 How to Specify a Syntax How can we specify a syntax with nested structures? Is it possible to use RE/FA? RE(Regular Expression) FA(Finite Automata) RE/FA is not powerful enough Example: matching parenthesis: # of ( equals # of ) (x+y)*z ((x+y)+y)*z... (...(((x+y)+y)+y)...) ((x+y)+y)+y)*z

7 RE/FA is Not Powerful Enough Pumping Lemma for Regular Languages In FA, all sufficiently long words may be pumped that is, have a substring repeated an arbitrary number of times If an FA is to accept a string longer than the number of states, there must be a set of states that are repeated (q s... q t in above picture) Let y be the substring that is repeated. Then all strings xy i z (i >= 0) are also accepted and is part of language

8 RE/FA is Not Powerful Enough Matching parentheses is not a Regular Language Suppose it is RL and there is an FA for this language Now construct a string with matching ()s long enough such that there are more (s than the number of states Then, there would be a pumping y filled entirely with ( Then, any string xy i z will also be accepted by FA but not be matching Contradiction QED Similarly, any language with nested structures is not an RL

9 What Language is C Syntax? An example of a Context Free Language (CFL) A broader category of languages that includes languages with nested structures Before discussing CFL, we need to learn a more general way of specifying languages than RE, called Grammars Can specify both RL and CFL and more...

10 Grammar General Language Specification Recall language definition Language set of strings over alphabet Alphabet: finite set of symbols Sentences: strings in the language A grammar is a general way to specify a language E.g. The English grammar specifies the English language

11 An Example CS2210: Compiler Construction Language L = { any string with 00 at the end } 1 A 0 B 0 C 0 Grammar G = { T, N, s, δ } where T = { 0, 1 }, N = { A, B }, s = A, and grammar rule set δ = { A 0A 1A 0B, B 0 } Derivation: from grammar to language A 0A 00B 000 A 1A 10B 100 A 0A 00A 000B 0000 A 0A 01A...

12 Grammar CS2210: Compiler Construction A grammar consists of 4 components (T, N, s, δ) T set of terminal symbols Essentially tokens leaves in the parse tree N set of non-terminal symbols Internal nodes in the parse tree One or more symbols grouped into higher construct example: declaration, statement, loop,... s the non-terminal start symbol where derivation starts δ a set of production rules LHS RHS : left-hand-side produces right-hand-side

13 Production Rule and Derivation Production Rule: LHS RHS Meaning: LHS can be replaced with RHS A derivation a series of applications of production rules Derivation: β α Meaning: String α is derived from β β α 1 step β α 0 or more steps β = α 0 or more steps example: A 0A 00B 000 A = 000 A = + 000

14 Noam Chomsky Grammars Classification of languages based on form of grammar rules Four(4) types of grammars: Type 0 recursive grammar Type 1 context sensitive grammar Type 2 context free grammar Type 3 regular grammar

15 Type 0: Unrestricted Grammar Type 0 grammar unrestricted or recursively enumerable Form of rules α β where α (N T ) +, β (N T ) Implied restrictions: LHS: no ε allowed Example: ab acd ; LHS is shorter than RHS aab ab ; LHS is longer than RHS A ε ; ε-productions are allowed Computational complexity: unbounded Derivation strings may contract and expand repeatedly (Since LHS may be longer or shorter than RHS) Unbounded number of productions before target string L(Unrestricted) L(Turing Machine)

16 Type 1: Context Sensitive Grammar Type 1 grammar context sensitive grammar Form of rules αaβ αγβ where A N, α, β (N T ), γ (N T ) + Replace A by γ only if found in the context of α and β Implied restrictions: LHS: shorter or equal to RHS RHS: no ε allowed Example: aab acb ; Replace A with C when in between a and B A C ; Replace A with C regardless of context Computational complexity: likely NP-Complete Derivation strings may only expand Bounded number of derivations before target string L(CSG) L(Linear Bounded Turing Machine)

17 Type 2: Context Free Grammar Type 2 grammar context free grammar Form of rules A γ where A N, γ (N T ) + Replace A by γ (no context can be specified) Implied restrictions: LHS: a single non-terminal RHS: no ε allowed Sometimes relaxed to simplify grammar but rules can always be rewritten to exclude ε-productions Example: A abc ; Replace A with abc regardless of context Computational complexity: Polynomial O(n ), but most real world CFGs are O(n) L(CFG) L(Pushdown Automaton)

18 Type 3: Regular Grammar Type 3 grammar regular grammar Form of rules A α, or A αb where A, B N, α T In terms of FA: Move from state A to state B on input α Implied restrictions: LHS: a single non-terminal RHS: a terminal or a terminal followed by a non-terminal Example: A 1A 0 Computational complexity: Linear O(n) Derivation string length increases by 1 at each step L(RG) L(Finite Automaton) L(Regular Expression)

19 Differentiating Type 2 (RG) and 3 (CFG) Language L1 = { [ i ] j i, j >= 1} Regular grammar S [ S [ T T ] T ] Language L2 = { [ i ] i i >= 1} Context free grammar S [ S ] [ ]

20 Differentiating Type 1 (CSG) and 2 (CFG) Type 2 grammar (context free) S D U D int x; int y; U x=1; y=1; Type 1 grammar (context sensitive) S D U D int x; int y; int x; U int x; x=1; int y; U int y; y=1;

21 Differentiating Type 1 (CSG) and 2 (CFG) Language from type 2 (CFG): S DU int x; U int x; x=1; S DU int x; U int x; y=1; S DU int y; U int y; x=1; S DU int y; U int y; y=1; Language from type 1 (CSG): S DU int x; U int x; x=1; S DU int y; U int y; y=1;

22 Differentiating Type 1 (CSG) and 2 (CFG) Language from type 2 (CFG): S DU int x; U int x; x=1; S DU int x; U int x; y=1; S DU int y; U int y; x=1; S DU int y; U int y; y=1; Language from type 1 (CSG): S DU int x; U int x; x=1; S DU int y; U int y; y=1; If PLs are context-sensitive, why use CFGs for parsing? CSG parsers are provably inefficient (PSPACE-Complete) Most PL constructs are context-free: If-stmt, declarations The remaining context-sensitive constructs can be analyzed at the semantic analysis stage (e.g. def-before-use, matching formal/actual parameters)

23 Language Classification Regular Grammar CFG CSG Recursive Grammar Type 0 Type 1 Type 2 Type 3

24 Language Classification Regular Grammar CFG CSG Recursive Grammar Type 0 Type 1 Type 2 Type 3 However, L y L x where L x :[ i ] k RG, L y :[ i ] i CFG Is it a problem?

25 Context Free Grammars

26 Grammar and Grammar is used to derive string or construct parser A derivation is a sequence of applications of rules Starting from the start symbol S (sentence) Leftmost and Rightmost derivations At each derivation step, leftmost derivation always replaces the leftmost non-terminal symbol Rightmost derivation always replaces the rightmost one

27 Examples CS2210: Compiler Construction E E * E E + E ( E ) id leftmost derivation E E + E E * E + E id * E + E id * id + E... id * id + id * id rightmost derivation E E + E E + E * E E + E * id E + id * id... id * id + id * id

28 Parse Trees CS2210: Compiler Construction Parse tree structure Describes program structure (defined by the rules applied) Internal nodes are non-terminals, leaves are terminals Structure depends on what production rules were applied Grammars should specify precisely what rules to apply to avoid ambiguity in program structure Structure does not depend on order of application Same tree for previous rightmost/leftmost derivations Order of rule application is implementation dependent

29 Parse Trees CS2210: Compiler Construction Parse tree structure Describes program structure (defined by the rules applied) Internal nodes are non-terminals, leaves are terminals Structure depends on what production rules were applied Grammars should specify precisely what rules to apply to avoid ambiguity in program structure Structure does not depend on order of application Same tree for previous rightmost/leftmost derivations Order of rule application is implementation dependent E E + E E * E E * E id id id id

30 Different Parse Trees Consider the string id * id + id * id can result in 3 different trees (and more) for this grammar E E + E E E * E E E * E E * E E * E id E + E id E * E id id id id id E * E E + E id id id id id

31 Ambiguity CS2210: Compiler Construction A grammar G is ambiguous if there exist a string str L(G) such that more than one parse tree derives str In practice, we prefer unambiguous grammars Fortunately, ambiguity is (often) the property of a grammar and not the language It is (often) possible to rewrite grammar to remove ambiguity Programming languages are designed not to be ambiguous it s the job of compiler writers to express those rules accurately in the grammar

32 How to Remove Ambiguity Step I: Specify precedence Build precedence into grammar by recursion on a different non-terminal for each precedence level. Insight: Lower precedence tend to be higher in tree (close to root) Higher precedence tend to be lower in tree (far from root) Place lower precedence non-terminals higher up in the tree E E + E E - E E * E E / E E E ( E ) id Rewrite it to: E E + T E - T T ; Lowest precedence: +, - T T * F T / F F ; Middle precedence: *, / F B F B ; Highest precedence: B id ( E ) ; Even higher precedence: ()

33 How to Remove Ambiguity Step II: Specify associativity Allow recursion only on either left or right non-terminal Left associative recursion on left non-terminal Right associative recursion on right non-terminal In the previous example, E E + E... ; allows both left/right associativity Rewrite it to: E E + T... F B F... ; + is left associative ; exponent is right associative

34 Properties of Context Free Grammars Decidable: computable using a Turing Machine In other words, guaranteed an answer in finite time E.g. if a string is in a context free language: decidable In fact we can run a CFL parser in polynomial time If a CFG is ambiguous: Undecidable In general, impossible to tell if a CFG is ambiguous Can only check whether it is ambiguous for a given string In practice, parser generators like Yacc check for a more restricted grammar (e.g. LALR(1)) instead LALR(1) is a subset of unambiguous grammars Whether grammar is LALR(1) is easily decidable If two CFGs generate same language: Undecidable Impossible to tell if language changed by tweaking grammar Parsers are regression tested against a test set frequently

35 From Grammar to Parser What exactly is parsing, or syntax analysis? To process an input string for a given grammar, and compose the derivation if the string is in the language Two subtasks to determine if string is in the language or not to construct a parse tree (a representation of the derivation)

36 From Grammar to Parser What exactly is parsing, or syntax analysis? To process an input string for a given grammar, and compose the derivation if the string is in the language Two subtasks to determine if string is in the language or not to construct a parse tree (a representation of the derivation) How would you construct a parser from a grammar?

37 Types of Parsers CS2210: Compiler Construction Universal parser Can parse any CFG e.g. Early s algorithm Powerful but can be very inefficient (O(N 3 ) where N is length of string) Top-down parser Starts from root and expands into leaves Tries to expand start symbol to input string Finds leftmost derivation Only works for a certain class of CFGs called LL But provably works in O(n) time Parser code structure closely mimics grammar Amenable to implementation by hand Automated tools exist to convert to code (e.g. ANTLR)

38 Types of Parsers (cont.) Bottom-up parser Starts at leaves and builds up to root Tries to reduce the input string to the start symbol Finds reverse order of the rightmost derivation Works for a wider class of CFGs called LR Also provably works in O(n) time Parser code structure nothing like grammar Very difficult to implement by hand Automated tools exist to convert to code (e.g. Yacc, Bison)

39 What Output do We Want? The output of parsing is parse tree, or abstract syntax tree An abstract syntax tree is abbreviated representation of a parse tree drops some details without compromising meaning some terminal symbols that no longer contribute to semantics are dropped (e.g. parentheses) internal nodes may contain terminal symbols

40 An Example CS2210: Compiler Construction Consider the grammar E int ( E ) E + E and an input 5 + ( ) After lexical analysis, we have a sequence of tokens INT 5 + ( INT 2 + INT 3 )

41 Parse Tree of the Input E E + E INT 5 ( E ) E + E INT 2 INT A parse tree 3 Traces the process of derivation Captures all the rules that were applied but contains redundant information parentheses single-successor nodes

42 Abstract Syntax Tree An Abstract Syntax Tree (AST) for the input PLUS PLUS AST captures all that is semantically relevant AST but is more compact in representation AST abstracts from parse tree (a.k.a. concrete syntax tree) ASTs are used in most compilers rather than parse trees

43 How are ASTs constructed in a compiler? Through implementation of semantic actions Semantic actions: actions triggered on a rule match Already used in lexical analysis to return token tuples For syntactic analysis, involves computing semantic value of whole construct from values of its parts To construct AST, attach an attribute for each symbol X Attributes contain the semantic value for each symbol X.ast the constructed AST for symbol X Then attach semantic actions to production rules, e.g. X Y 1 Y 2...Y n { actions } actions define X.ast using Y i.ast (1 i n)

44 Example CS2210: Compiler Construction For the previous example, we have E int { E.ast = mkleaf(int.lval) } E1 + E2 { E.ast = mkplus(e1.ast, E2.ast) } (E1) { E.ast = E1.ast } Two functions that need to be defined ptr1=mkleaf(n) create a leaf node and assign value n ptr2=mkplus(t1, t2) create a tree node and assign value PLUS, then attach two child trees t1 and t2 ptr1 ptr2 n PLUS t1 t2

45 AST Construction Steps For input INT 5 + ( INT 2 + INT 3 ) Construction order given is for a bottom-up LR(1) parser (Order can change depending on parser implementation)

46 AST Construction Steps For input INT 5 + ( INT 2 + INT 3 ) Construction order given is for a bottom-up LR(1) parser (Order can change depending on parser implementation) E1.ast=mkleaf(5) 5

47 AST Construction Steps For input INT 5 + ( INT 2 + INT 3 ) Construction order given is for a bottom-up LR(1) parser (Order can change depending on parser implementation) E1.ast=mkleaf(5) E2.ast=mkleaf(2) 5 2

48 AST Construction Steps For input INT 5 + ( INT 2 + INT 3 ) Construction order given is for a bottom-up LR(1) parser (Order can change depending on parser implementation) E1.ast=mkleaf(5) E2.ast=mkleaf(2) E3.ast=mkleaf(3) 5 2 3

49 AST Construction Steps For input INT 5 + ( INT 2 + INT 3 ) Construction order given is for a bottom-up LR(1) parser (Order can change depending on parser implementation) E4.ast=mkplus(E2.ast, E3.ast) PLUS E1.ast=mkleaf(5) E2.ast=mkleaf(2) E3.ast=mkleaf(3) 5 2 3

50 AST Construction Steps For input INT 5 + ( INT 2 + INT 3 ) Construction order given is for a bottom-up LR(1) parser (Order can change depending on parser implementation) E4.ast=mkplus(E2.ast, E3.ast) PLUS E1.ast=mkleaf(5) 5 2 3

51 AST Construction Steps For input INT 5 + ( INT 2 + INT 3 ) Construction order given is for a bottom-up LR(1) parser (Order can change depending on parser implementation) E5.ast=mkplus(E1.ast, E4.ast) PLUS E4.ast=mkplus(E2.ast, E3.ast) PLUS E1.ast=mkleaf(5) 5 2 3

52 AST Construction Steps For input INT 5 + ( INT 2 + INT 3 ) Construction order given is for a bottom-up LR(1) parser (Order can change depending on parser implementation) E5.ast=mkplus(E1.ast, E4.ast) PLUS PLUS 5 2 3

53 Summary CS2210: Compiler Construction Compilers specify program structure using CFG Most programming languages are not context free Context sensitive analysis can easily separate out to semantic analysis phase A parser uses CFG to... answer if an input str L(G)... and build a parse tree... or build an AST instead... and pass it to the rest of compiler

54 Parsing

55 Parsing CS2210: Compiler Construction We will study two approaches Top-down Easier to understand and implement manually Bottom-up More powerful, but hard to implement manually

56 Example CS2210: Compiler Construction Consider a CFG grammar G S A B A a C B b D D d C c Actually, this language has only one sentence, i.e. L(G) = { acbd } Leftmost Derivation: S AB (1) acb (2) acb (3) acbd (4) acbd (5) S Rightmost Derivation: S AB (5) AbD (4) Abd (3) acbd (2) acbd (1)

57 Example CS2210: Compiler Construction Consider a CFG grammar G S A B A a C B b D D d C c Actually, this language has only one sentence, i.e. L(G) = { acbd } Leftmost Derivation: S AB (1) acb (2) acb (3) acbd (4) acbd (5) S Rightmost Derivation: S AB (5) AbD (4) Abd (3) acbd (2) acbd (1) A B

58 Example CS2210: Compiler Construction Consider a CFG grammar G S A B A a C B b D D d C c Actually, this language has only one sentence, i.e. L(G) = { acbd } Leftmost Derivation: S AB (1) acb (2) acb (3) acbd (4) acbd (5) S Rightmost Derivation: S AB (5) AbD (4) Abd (3) acbd (2) acbd (1) A B a C

59 Example CS2210: Compiler Construction Consider a CFG grammar G S A B A a C B b D D d C c Actually, this language has only one sentence, i.e. L(G) = { acbd } Leftmost Derivation: S AB (1) acb (2) acb (3) acbd (4) acbd (5) S Rightmost Derivation: S AB (5) AbD (4) Abd (3) acbd (2) acbd (1) A B a C c

60 Example CS2210: Compiler Construction Consider a CFG grammar G S A B A a C B b D D d C c Actually, this language has only one sentence, i.e. L(G) = { acbd } Leftmost Derivation: S AB (1) acb (2) acb (3) acbd (4) acbd (5) S Rightmost Derivation: S AB (5) AbD (4) Abd (3) acbd (2) acbd (1) A B a C b D c

61 Example CS2210: Compiler Construction Consider a CFG grammar G S A B A a C B b D D d C c Actually, this language has only one sentence, i.e. L(G) = { acbd } Leftmost Derivation: S AB (1) acb (2) acb (3) acbd (4) acbd (5) S Rightmost Derivation: S AB (5) AbD (4) Abd (3) acbd (2) acbd (1) A B a C b D c d

62 Example CS2210: Compiler Construction Consider a CFG grammar G S A B A a C B b D D d C c Actually, this language has only one sentence, i.e. L(G) = { acbd } Leftmost Derivation: S AB (1) acb (2) acb (3) acbd (4) acbd (5) S Rightmost Derivation: S AB (5) AbD (4) Abd (3) acbd (2) acbd (1) A B a C b D c d a c b d

63 Example CS2210: Compiler Construction Consider a CFG grammar G S A B A a C B b D D d C c Actually, this language has only one sentence, i.e. L(G) = { acbd } Leftmost Derivation: S AB (1) acb (2) acb (3) acbd (4) acbd (5) S Rightmost Derivation: S AB (5) AbD (4) Abd (3) acbd (2) acbd (1) A B a C b D C c d a c b d

64 Example CS2210: Compiler Construction Consider a CFG grammar G S A B A a C B b D D d C c Actually, this language has only one sentence, i.e. L(G) = { acbd } Leftmost Derivation: S AB (1) acb (2) acb (3) acbd (4) acbd (5) S Rightmost Derivation: S AB (5) AbD (4) Abd (3) acbd (2) acbd (1) A B A a C b D C c d a c b d

65 Example CS2210: Compiler Construction Consider a CFG grammar G S A B A a C B b D D d C c Actually, this language has only one sentence, i.e. L(G) = { acbd } Leftmost Derivation: S AB (1) acb (2) acb (3) acbd (4) acbd (5) S Rightmost Derivation: S AB (5) AbD (4) Abd (3) acbd (2) acbd (1) A B A a C b D C D c d a c b d

66 Example CS2210: Compiler Construction Consider a CFG grammar G S A B A a C B b D D d C c Actually, this language has only one sentence, i.e. L(G) = { acbd } Leftmost Derivation: S AB (1) acb (2) acb (3) acbd (4) acbd (5) S Rightmost Derivation: S AB (5) AbD (4) Abd (3) acbd (2) acbd (1) A B A B a C b D C D c d a c b d

67 Example CS2210: Compiler Construction Consider a CFG grammar G S A B A a C B b D D d C c Actually, this language has only one sentence, i.e. L(G) = { acbd } Leftmost Derivation: S AB (1) acb (2) acb (3) acbd (4) acbd (5) S Rightmost Derivation: S AB (5) AbD (4) Abd (3) acbd (2) acbd (1) S A B A B a C b D C D c d a c b d

68 Top Down Parsers CS2210: Compiler Construction Recursive descent parser Implemented using recursive calls to functions that implement the expansion of each non-terminal Goes through all possible expansions by trial-and-error until match with input; backtracks when mismatch detected Simple to implement, but may take exponential time Predictive parser Recursive descent parser with prediction (no backtracking) Predict next rule by looking ahead k number of symbols Restrictions on the grammar to avoid backtracking Only works for a class of grammars called LL(k) Nonrecursive predictive parser Predictive parser with no recursive calls Table driven suitable for automated parser generators

69 Recursive Descent Example E T + E T T int * T int ( E ) input string: start symbol: int * int E initial parse tree is E

70 Recursive Descent Example E T + E T T int * T int ( E ) input string: start symbol: int * int E initial parse tree is E Assume: when there are alternative rules, try right rule first

71 Parsing Sequence (using Backtracking) E

72 Parsing Sequence (using Backtracking) E T pick right most rule E T

73 Parsing Sequence (using Backtracking) E T ( E ) pick right most rule E T pick right most rule T (E)

74 Parsing Sequence (using Backtracking) E T ( E ) pick right most rule E T pick right most rule T (E) ( does not match int

75 Parsing Sequence (using Backtracking) E T ( E ) pick right most rule E T pick right most rule T (E) ( does not match int failure, backtrack one level

76 Parsing Sequence (using Backtracking) E T ( E ) pick right most rule E T pick right most rule T (E) ( does not match int failure, backtrack one level

77 Parsing Sequence (using Backtracking) E T ( E ) int pick right most rule E T pick right most rule T (E) ( does not match int failure, backtrack one level pick next T int int matches input int

78 Parsing Sequence (using Backtracking) E T ( E ) int pick right most rule E T pick right most rule T (E) ( does not match int failure, backtrack one level pick next T int int matches input int however, more tokens remain failure, backtrack one level

79 Parsing Sequence (using Backtracking) E T ( E ) int pick right most rule E T pick right most rule T (E) ( does not match int failure, backtrack one level pick next T int int matches input int however, more tokens remain failure, backtrack one level

80 Parsing Sequence (using Backtracking) E T ( E ) int int * T pick right most rule E T pick right most rule T (E) ( does not match int failure, backtrack one level pick next T int int matches input int however, more tokens remain failure, backtrack one level pick next T int * T int * matches input int *

81 Parsing Sequence (using Backtracking) E T ( E ) int int * T int * ( E ) pick right most rule E T pick right most rule T (E) ( does not match int failure, backtrack one level pick next T int int matches input int however, more tokens remain failure, backtrack one level pick next T int * T int * matches input int * pick right most rule T (E)

82 Parsing Sequence (using Backtracking) E T ( E ) int int * T int * ( E ) pick right most rule E T pick right most rule T (E) ( does not match int failure, backtrack one level pick next T int int matches input int however, more tokens remain failure, backtrack one level pick next T int * T int * matches input int * pick right most rule T (E) ( does not match input int failure, backtrack one level

83 Parsing Sequence (using Backtracking) E T ( E ) int int * T int * ( E ) pick right most rule E T pick right most rule T (E) ( does not match int failure, backtrack one level pick next T int int matches input int however, more tokens remain failure, backtrack one level pick next T int * T int * matches input int * pick right most rule T (E) ( does not match input int failure, backtrack one level

84 Parsing Sequence (using Backtracking) E T ( E ) int int * T int * ( E ) int * int pick right most rule E T pick right most rule T (E) ( does not match int failure, backtrack one level pick next T int int matches input int however, more tokens remain failure, backtrack one level pick next T int * T int * matches input int * pick right most rule T (E) ( does not match input int failure, backtrack one level pick next T int

85 Parsing Sequence (using Backtracking) E T ( E ) int int * T int * ( E ) int * int pick right most rule E T pick right most rule T (E) ( does not match int failure, backtrack one level pick next T int int matches input int however, more tokens remain failure, backtrack one level pick next T int * T int * matches input int * pick right most rule T (E) ( does not match input int failure, backtrack one level pick next T int match, accept

86 Recursive Descent Parsing uses Backtracking Approach: for a non-terminal in the derivation, productions are tried in some order until A production is found that generates a portion of the input, or No production is found that generates a portion of the input, in which case backtrack to previous non-terminal Terminals of the derivation are compared against input Match advance input, continue parsing Mismatch backtrack, or fail Parsing fails if no production for the start symbol generates the entire input

87 Implementation CS2210: Compiler Construction Create a procedure for each non-terminal 1. For RHS of each production rule, a. For a terminal, match with input symbol and consume b. For a non-terminal, call procedure for that non-terminal c. If match succeeds for entire RHS, return success d. If match fails, regurgitate input and try next production rule 2. If match succeeds for any rule, apply that rule to LHS

88 Sample Code CS2210: Compiler Construction Sample implementation of parser for previous grammar: E T + E T T int * T int ( E ) int match(int token) { return getnext() == token; } int expr() { R1: if (!term() ) goto R2; if (!match(addnum) ) goto R2; if (!expr() ) goto R2; return 1; R2: /* backtrack R1 */ if (!term() ) goto Fail: return 1; Fail: /* backtrack R2 */ return 0; } int term() { R1: if (!match(intnum) ) goto R2; if (!match(starnum) ) goto R2; if (!term() ) goto R2; return 1; R2: /* backtrack R1 */ if (!match(intnum) ) goto R3: return 1; R3: /* backtrack R2 */ if (!match(leftparennum) ) goto Fail; if (!expr() ) goto Fail; if (!match(rightparennum) ) goto Fail; return 1; Fail: /* backtrack R3 */ return 0; }

89 Left Recursion Problem Recursive descent doesn t work with left recursion Right recursion is okay Why is left recursion a problem? For left recursive grammar A A b c We may repeatedly choose to apply A b A A b A b b... Sentence can grow indefinitely w/o consuming input How do you know when to stop recursion and choose c?

90 Left Recursion Problem Recursive descent doesn t work with left recursion Right recursion is okay Why is left recursion a problem? For left recursive grammar A A b c We may repeatedly choose to apply A b A A b A b b... Sentence can grow indefinitely w/o consuming input How do you know when to stop recursion and choose c? Rewrite the grammar so that it is right recursive Which expresses the same language

91 Remove Left Recursion In general, we can eliminate all immediate left recursion A A x y change to A y A A x A ε Not all left recursion is immediate may be hidden in multiple production rules A BC D B AE F... see Section 4.3 for elimination of general left recursion... (not required for this course)

92 Summary of Recursive Descent Recursive descent is a simple and general parsing strategy Left-recursion must be eliminated first Can be eliminated automatically as in previous slide L(Recursive_Descent) L(CFG) CFL However it is not popular because of backtracking Backtracking requires re-parsing the same string Which is inefficient (can take exponential time) Also undoing semantic actions may be difficult E.g. removing already added nodes in parse tree Techniques used in practice do no backtracking... at the cost of restricting the class of grammar

93 Predictive Parsers CS2210: Compiler Construction A top-down parser with no backtracking: choose the appropriate production given next input terminal If first terminal of every alternative production is unique A a B D b B B B c b c e D d parsing input abced requires no backtracking

94 Predictive Parsers CS2210: Compiler Construction A top-down parser with no backtracking: choose the appropriate production given next input terminal If first terminal of every alternative production is unique A a B D b B B B c b c e D d parsing input abced requires no backtracking Rewriting grammars for predictive parsing Left factor to enable prediction A αβ αγ can be changed to A α A A β γ Eliminate left recursion, just like for recursive descent

95 LL(k) Parsers CS2210: Compiler Construction LL(k) Parser A predictive parser that uses k lookahead tokens L left to right scan L leftmost derivation k k symbols of lookahead LL(k) Grammar A grammar that can parsed using a LL(k) parser LL(k) Language A language that can be expressed as a LL(k) grammar LL(k) languages are a restricted subset of CFLs But many languages are LL(k).. in fact many are LL(1)! Can be implemented in a recursive or nonrecursive fashion

96 Nonrecursive Predictive Parser input tokens $ read head syntax stack TOP parser driver output $ parse table M[N,T] Syntax stack holds unmatched portion of derivation string Parser driver next action based on (current token, stack top) Parse table M[A,b] an entry containing rule A... or error Table-driven: amenable to automatic code generation (just like lexers)

97 A Sample LL(1) Parse Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε Implementation with 2D parse table First column lists all non-terminals First row lists all possible terminals and $ A table entry contains one production One action for each (non-terminal, input) combination It predicts the correct action based on one lookahead No backtracking required

98 Algorithm for Parsing X symbol at the top of the syntax stack a current input symbol Parsing based on (X,a) If X==a==$, then parser halts with success If X==a!=$, then pop X from stack and advance input head If X!=a, then Case (a): if X T, then parser halts with failed, input rejected Case (b): if X N, M[X,a] = X RHS pop X and push RHS to stack in reverse order

99 then CS2210: Compiler Construction Push RHS in Reverse Order X symbol at the top of the syntax stack a current input symbol if M[X,a] = X B c D X... $ B c D... $

100 Applying LL(1) Parsing to a Grammar Given our old grammar E T + E T T int * T int ( E ) No left recursion But require left factoring After rewriting grammar, we have E T X X + E ε T int Y ( E ) Y * T ε

101 Using the Parse Table To recognize int * int input int * int $ parser driver E $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

102 Using the Parse Table To recognize int * int input int * int $ parser driver E $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

103 Using the Parse Table To recognize int * int input int * int $ parser driver T X $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

104 Using the Parse Table To recognize int * int input int * int $ parser driver T X $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

105 Using the Parse Table To recognize int * int input int * int $ parser driver int Y X $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

106 Using the Parse Table To recognize int * int input int * int $ parser driver Y X $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

107 Using the Parse Table To recognize int * int input int * int $ parser driver Y X $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

108 Using the Parse Table To recognize int * int input int * int $ parser driver * T X $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

109 Using the Parse Table To recognize int * int input int * int $ parser driver T X $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

110 Using the Parse Table To recognize int * int input int * int $ parser driver T X $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

111 Using the Parse Table To recognize int * int input int * int $ parser driver int Y X $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

112 Using the Parse Table To recognize int * int input int * int $ parser driver Y X $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

113 Using the Parse Table To recognize int * int input int * int $ parser driver Y X $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

114 Using the Parse Table To recognize int * int input int * int $ parser driver X $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

115 Using the Parse Table To recognize int * int input int * int $ parser driver X $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

116 Using the Parse Table To recognize int * int input int * int $ parser driver $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

117 Using the Parse Table To recognize int * int input int * int $ parser driver Accept!!! $ Stack Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

118 Recognition Sequence It is possible to write in a action list Stack Input Action E $ int * int $ E TX T X $ int * int $ T int Y int Y X $ int * int $ match Y X $ * int $ Y * T * T X $ * int $ match T X $ int $ T int Y int Y X $ int $ match Y X $ $ Y ε X $ $ X ε $ $ halt and accept

119 Constructing the Parse Table Need to know 2 sets First(A): set of terminals that can appear first in A Follow(A): set of terminals that can appear following A

120 Intuitive Meaning of First and Follow S α A a β c γ c First(A) a Follow(A) Why is the Follow Set important?

121 First(α) CS2210: Compiler Construction First(α) = set of terminals that start string of terminals derived from α. Apply followsing rules until no terminal or ε can be added 1). If t T, then First(t)={t}. For example First(+)={+}. 2). If X N and X ε exists, then add ε to First(X). For example, First(Y) = {*, ε}. 3). If X N and X Y 1 Y 2 Y 3...Y m, then Add (First(Y 1 ) - ε) to First(X). If First(Y 1 ),..., First(Y k 1 ) all contain ε, then add ( P 1 i k First(Y i) ε) to First(X). If First(Y 1 ),..., First(Y m) all contain ε, then add ε to First(X).

122 Follow(α) CS2210: Compiler Construction Follow(α) = {t S αtβ} Intuition: if X A B, then First(B) Follow(A) little trickier because B may be ε i.e. B * ε Apply followsing rules until no terminal or ε can be added 1). $ Follow(S), where S is the start symbol. e.g. Follow(E) = {$... }. 2). Look at the occurrence of a non-terminal on the right hand side of a production which is followed by something If A αbβ, then First(β)-{ε} Follow(B) 3). Look at N on the RHS that is not followed by anything, if (A αb) or (A αbβ and ε First(β)), then Follow(A) Follow(B)

123 Informal Interpretation of First and Follow Sets First set of X (focus on LHS) If X T, then terminal symbol itself If X Y Z, then add First(Y) If X ε, then add ε Follow set of X (focus on RHS) If X is start symbol, add $ If... X Y, then add First(Y) If Y X, then add Follow(Y)

124 For the example CS2210: Compiler Construction E T X X + E ε T int Y ( E ) Y * T ε First set (focus on LHS) E T X X + E X ε T int Y T ( E ) Y * T Y ε Follow set (focus on RHS) $ T ( E ) X + E E T X Y * T T int Y

125 Example CS2210: Compiler Construction Symbol ( ) + int Y X T E E T X X + E ε T int Y ( E ) Y * T ε First ( ) + int *, ε +, ε (, int (, int Symbol E X T Y Follow $,) $,) $,),+ $,),+

126 Construction of LL(1) Parse Table To construct the parse table, we check each A α For each terminal a First(α), add A α to M[A,a]. If ε First(α), then for each terminal b Follow(A), add A α to M[A,b].

127 Example CS2210: Compiler Construction E T X X + E ε T int Y ( E ) Y * T ε Symbol ( ) + int Y X T E First ( ) + int *, ε +, ε (, int (, int Symbol E X T Y Follow $,) $,) $,),+ $,),+ Table int * + ( ) $ E E TX E TX X X +E X ε X ε T T int Y T (E) Y Y *T Y ε Y ε Y ε

128 Determine if Grammar G is LL(1) Observation If a grammar is LL(1), then each of its LL(1) table entry contains at most one rule. Otherwise, it is not LL(1). Two methods to determine if a grammar is LL(1) or not (1) Construct LL(1) table, and check if there is a multi-rule entry or (2) Checking each rule as if the table is getting constructed. G is LL(1) iff for a rule A α β First(α) First(β) = φ at most one of α and β can derive ε If β derives ε, then First(α) Follow(A) = φ

129 Non-LL(1) Grammars Suppose a grammar is not LL(1). What then? (1) The language may still be LL(1). Try to massage grammar to LL(1) grammar: Apply left-factoring Apply left-recursion removal Try to remove ambiguity in grammar: Encode precendence into rules Encode associativity into rules (2) If (1) fails, language may not be LL(1) Impossible to resolve conflict at the grammar level Programmer chooses which rule to use for conflicting entry (if choosing that rule is always semantically correct) Otherwise, use a more powerful parser (e.g. LL(k), LR(1))

130 Non-LL(1) Grammar Example Some grammars are not LL(1) even after left-factoring and left-recursion removal S if C then S if C then S else S a (other statements) C b change to S if C then S X a X else S ε C b problem sentence: if b then if b then a else a else First(X) First(X)-ε Follow(S) X else... ε else Follow(X) Such grammars are potentially ambiguous

131 Non-LL(1) Even After Removing Ambiguity For the if-then-else example, how would you rewrite it? S if C then S if C then S2 else S S2 if C then S2 else S2 a C b Now grammar is unambiguous but it is not LL(k) for any k Must lookahead until matching else to choose rule for S That lookahead may be an arbirary number of tokens with arbitrary nesting of if-then-else statements Some languages are not LL(1) or even LL(k) No amount of lookahead can tell parser whether this statement is an if-then or if-then-else Fortunately in this case, can resolve conflicts in table Always choose X else S over X ε on else token Semantically correct, since else should match closest if

132 LL(1) Time and Space Complexity Linear time and space relative to length of input

133 LL(1) Time and Space Complexity Linear time and space relative to length of input Time each input symbol is consumed within a constant number of steps. Why?

134 LL(1) Time and Space Complexity Linear time and space relative to length of input Time each input symbol is consumed within a constant number of steps. Why? If symbol at top of stack is a terminal: Matched immediately in one step If symbol at top of stack is a non-terminal: Matched in at most N steps, where N = number of rules Since no left-recursion, cannot apply same rule twice without consuming input

135 LL(1) Time and Space Complexity Linear time and space relative to length of input Time each input symbol is consumed within a constant number of steps. Why? If symbol at top of stack is a terminal: Matched immediately in one step If symbol at top of stack is a non-terminal: Matched in at most N steps, where N = number of rules Since no left-recursion, cannot apply same rule twice without consuming input Space smaller than input (after removing X ε). Why?

136 LL(1) Time and Space Complexity Linear time and space relative to length of input Time each input symbol is consumed within a constant number of steps. Why? If symbol at top of stack is a terminal: Matched immediately in one step If symbol at top of stack is a non-terminal: Matched in at most N steps, where N = number of rules Since no left-recursion, cannot apply same rule twice without consuming input Space smaller than input (after removing X ε). Why? RHS is always longer or equal to LHS Derivation string expands monotonically Derivation string is always shorter than final input string Stack is a subset of derivation string (unmatched portion)

137 Nonrecursive Parser Summary First and Follow sets are used to construct predictive parsing tables Intuitively, First and Follow sets guide the choice of rules For non-terminal A and lookahead t, use the production rule A α where t First(α) OR For non-terminal A and lookahead t, use the production rule A α where ε First(α) and t Follow(A) There can only be ONE such rule or grammar is not LL(1)

138 Questions CS2210: Compiler Construction What is LL(0)? Why LL(2)... LL(k) are not widely used?

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing Roadmap > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing The role of the parser > performs context-free syntax analysis > guides

More information

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38 Syntax Analysis Martin Sulzmann Martin Sulzmann Syntax Analysis 1 / 38 Syntax Analysis Objective Recognize individual tokens as sentences of a language (beyond regular languages). Example 1 (OK) Program

More information

Derivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012

Derivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012 Derivations vs Parses Grammar is used to derive string or construct parser Context ree Grammars A derivation is a sequence of applications of rules Starting from the start symbol S......... (sentence)

More information

Top down vs. bottom up parsing

Top down vs. bottom up parsing Parsing A grammar describes the strings that are syntactically legal A recogniser simply accepts or rejects strings A generator produces sentences in the language described by the grammar A parser constructs

More information

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing Review of Parsing Abstract Syntax Trees & Top-Down Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree

More information

3. Parsing. Oscar Nierstrasz

3. Parsing. Oscar Nierstrasz 3. Parsing Oscar Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes. http://www.cs.ucla.edu/~palsberg/ http://www.cs.purdue.edu/homes/hosking/

More information

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7 Top-Down Parsing and Intro to Bottom-Up Parsing Lecture 7 1 Predictive Parsers Like recursive-descent but parser can predict which production to use Predictive parsers are never wrong Always able to guess

More information

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing Abstract Syntax Trees & Top-Down Parsing Review of Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree

More information

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing Review of Parsing Abstract Syntax Trees & Top-Down Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree

More information

Types of parsing. CMSC 430 Lecture 4, Page 1

Types of parsing. CMSC 430 Lecture 4, Page 1 Types of parsing Top-down parsers start at the root of derivation tree and fill in picks a production and tries to match the input may require backtracking some grammars are backtrack-free (predictive)

More information

Syntactic Analysis. Top-Down Parsing

Syntactic Analysis. Top-Down Parsing Syntactic Analysis Top-Down Parsing Copyright 2017, Pedro C. Diniz, all rights reserved. Students enrolled in Compilers class at University of Southern California (USC) have explicit permission to make

More information

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7 Top-Down Parsing and Intro to Bottom-Up Parsing Lecture 7 1 Predictive Parsers Like recursive-descent but parser can predict which production to use Predictive parsers are never wrong Always able to guess

More information

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012 Predictive Parsers LL(k) Parsing Can we avoid backtracking? es, if for a given input symbol and given nonterminal, we can choose the alternative appropriately. his is possible if the first terminal of

More information

CS1622. Today. A Recursive Descent Parser. Preliminaries. Lecture 9 Parsing (4)

CS1622. Today. A Recursive Descent Parser. Preliminaries. Lecture 9 Parsing (4) CS1622 Lecture 9 Parsing (4) CS 1622 Lecture 9 1 Today Example of a recursive descent parser Predictive & LL(1) parsers Building parse tables CS 1622 Lecture 9 2 A Recursive Descent Parser. Preliminaries

More information

8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing

8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing 8 Parsing Parsing A grammar describes syntactically legal strings in a language A recogniser simply accepts or rejects strings A generator produces strings A parser constructs a parse tree for a string

More information

Monday, September 13, Parsers

Monday, September 13, Parsers Parsers Agenda Terminology LL(1) Parsers Overview of LR Parsing Terminology Grammar G = (Vt, Vn, S, P) Vt is the set of terminals Vn is the set of non-terminals S is the start symbol P is the set of productions

More information

CS 321 Programming Languages and Compilers. VI. Parsing

CS 321 Programming Languages and Compilers. VI. Parsing CS 321 Programming Languages and Compilers VI. Parsing Parsing Calculate grammatical structure of program, like diagramming sentences, where: Tokens = words Programs = sentences For further information,

More information

CS 314 Principles of Programming Languages

CS 314 Principles of Programming Languages CS 314 Principles of Programming Languages Lecture 5: Syntax Analysis (Parsing) Zheng (Eddy) Zhang Rutgers University January 31, 2018 Class Information Homework 1 is being graded now. The sample solution

More information

Parsing III. (Top-down parsing: recursive descent & LL(1) )

Parsing III. (Top-down parsing: recursive descent & LL(1) ) Parsing III (Top-down parsing: recursive descent & LL(1) ) Roadmap (Where are we?) Previously We set out to study parsing Specifying syntax Context-free grammars Ambiguity Top-down parsers Algorithm &

More information

CA Compiler Construction

CA Compiler Construction CA4003 - Compiler Construction David Sinclair A top-down parser starts with the root of the parse tree, labelled with the goal symbol of the grammar, and repeats the following steps until the fringe of

More information

Wednesday, September 9, 15. Parsers

Wednesday, September 9, 15. Parsers Parsers What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

More information

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs: What is a parser Parsers A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

More information

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones Parsing III (Top-down parsing: recursive descent & LL(1) ) (Bottom-up parsing) CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones Copyright 2003, Keith D. Cooper,

More information

Compilers. Yannis Smaragdakis, U. Athens (original slides by Sam

Compilers. Yannis Smaragdakis, U. Athens (original slides by Sam Compilers Parsing Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Next step text chars Lexical analyzer tokens Parser IR Errors Parsing: Organize tokens into sentences Do tokens conform

More information

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468 Parsers Xiaokang Qiu Purdue University ECE 468 August 31, 2018 What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure

More information

SYNTAX ANALYSIS 1. Define parser. Hierarchical analysis is one in which the tokens are grouped hierarchically into nested collections with collective meaning. Also termed as Parsing. 2. Mention the basic

More information

Wednesday, August 31, Parsers

Wednesday, August 31, Parsers Parsers How do we combine tokens? Combine tokens ( words in a language) to form programs ( sentences in a language) Not all combinations of tokens are correct programs (not all sentences are grammatically

More information

Syntax Analysis Part I

Syntax Analysis Part I Syntax Analysis Part I Chapter 4: Context-Free Grammars Slides adapted from : Robert van Engelen, Florida State University Position of a Parser in the Compiler Model Source Program Lexical Analyzer Token,

More information

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino 3. Syntax Analysis Andrea Polini Formal Languages and Compilers Master in Computer Science University of Camerino (Formal Languages and Compilers) 3. Syntax Analysis CS@UNICAM 1 / 54 Syntax Analysis: the

More information

CSCI312 Principles of Programming Languages

CSCI312 Principles of Programming Languages Copyright 2006 The McGraw-Hill Companies, Inc. CSCI312 Principles of Programming Languages! LL Parsing!! Xu Liu Derived from Keith Cooper s COMP 412 at Rice University Recap Copyright 2006 The McGraw-Hill

More information

CS 406/534 Compiler Construction Parsing Part I

CS 406/534 Compiler Construction Parsing Part I CS 406/534 Compiler Construction Parsing Part I Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy and Dr.

More information

Building a Parser III. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

Building a Parser III. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1 Building a Parser III CS164 3:30-5:00 TT 10 Evans 1 Overview Finish recursive descent parser when it breaks down and how to fix it eliminating left recursion reordering productions Predictive parsers (aka

More information

Concepts Introduced in Chapter 4

Concepts Introduced in Chapter 4 Concepts Introduced in Chapter 4 Grammars Context-Free Grammars Derivations and Parse Trees Ambiguity, Precedence, and Associativity Top Down Parsing Recursive Descent, LL Bottom Up Parsing SLR, LR, LALR

More information

Parsing Part II (Top-down parsing, left-recursion removal)

Parsing Part II (Top-down parsing, left-recursion removal) Parsing Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit

More information

Administrativia. WA1 due on Thu PA2 in a week. Building a Parser III. Slides on the web site. CS164 3:30-5:00 TT 10 Evans.

Administrativia. WA1 due on Thu PA2 in a week. Building a Parser III. Slides on the web site. CS164 3:30-5:00 TT 10 Evans. Administrativia Building a Parser III CS164 3:30-5:00 10 vans WA1 due on hu PA2 in a week Slides on the web site I do my best to have slides ready and posted by the end of the preceding logical day yesterday,

More information

Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10 Ambiguity, Precedence, Associativity & Top-Down Parsing Lecture 9-10 (From slides by G. Necula & R. Bodik) 9/18/06 Prof. Hilfinger CS164 Lecture 9 1 Administrivia Please let me know if there are continued

More information

Compilerconstructie. najaar Rudy van Vliet kamer 140 Snellius, tel rvvliet(at)liacs(dot)nl. college 3, vrijdag 22 september 2017

Compilerconstructie. najaar Rudy van Vliet kamer 140 Snellius, tel rvvliet(at)liacs(dot)nl. college 3, vrijdag 22 september 2017 Compilerconstructie najaar 2017 http://www.liacs.leidenuniv.nl/~vlietrvan1/coco/ Rudy van Vliet kamer 140 Snellius, tel. 071-527 2876 rvvliet(at)liacs(dot)nl college 3, vrijdag 22 september 2017 + werkcollege

More information

Bottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S.

Bottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S. Bottom up parsing Construct a parse tree for an input string beginning at leaves and going towards root OR Reduce a string w of input to start symbol of grammar Consider a grammar S aabe A Abc b B d And

More information

Bottom-Up Parsing. Lecture 11-12

Bottom-Up Parsing. Lecture 11-12 Bottom-Up Parsing Lecture 11-12 (From slides by G. Necula & R. Bodik) 9/22/06 Prof. Hilfinger CS164 Lecture 11 1 Bottom-Up Parsing Bottom-up parsing is more general than topdown parsing And just as efficient

More information

PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309

PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309 PART 3 - SYNTAX ANALYSIS F. Wotawa (IST @ TU Graz) Compiler Construction Summer term 2016 64 / 309 Goals Definition of the syntax of a programming language using context free grammars Methods for parsing

More information

Table-Driven Parsing

Table-Driven Parsing Table-Driven Parsing It is possible to build a non-recursive predictive parser by maintaining a stack explicitly, rather than implicitly via recursive calls [1] The non-recursive parser looks up the production

More information

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised: EDAN65: Compilers, Lecture 06 A LR parsing Görel Hedin Revised: 2017-09-11 This lecture Regular expressions Context-free grammar Attribute grammar Lexical analyzer (scanner) Syntactic analyzer (parser)

More information

Syntax Analysis Check syntax and construct abstract syntax tree

Syntax Analysis Check syntax and construct abstract syntax tree Syntax Analysis Check syntax and construct abstract syntax tree if == = ; b 0 a b Error reporting and recovery Model using context free grammars Recognize using Push down automata/table Driven Parsers

More information

Note that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2).

Note that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2). LL(k) Grammars We need a bunch of terminology. For any terminal string a we write First k (a) is the prefix of a of length k (or all of a if its length is less than k) For any string g of terminal and

More information

Compilers. Predictive Parsing. Alex Aiken

Compilers. Predictive Parsing. Alex Aiken Compilers Like recursive-descent but parser can predict which production to use By looking at the next fewtokens No backtracking Predictive parsers accept LL(k) grammars L means left-to-right scan of input

More information

CS502: Compilers & Programming Systems

CS502: Compilers & Programming Systems CS502: Compilers & Programming Systems Top-down Parsing Zhiyuan Li Department of Computer Science Purdue University, USA There exist two well-known schemes to construct deterministic top-down parsers:

More information

4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

More information

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University CS415 Compilers Syntax Analysis These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Limits of Regular Languages Advantages of Regular Expressions

More information

4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Context Free Grammars and Parsing 1 Recall: Architecture of Compilers, Interpreters Source Parser Static Analyzer Intermediate Representation Front End Back

More information

4 (c) parsing. Parsing. Top down vs. bo5om up parsing

4 (c) parsing. Parsing. Top down vs. bo5om up parsing 4 (c) parsing Parsing A grammar describes syntac2cally legal strings in a language A recogniser simply accepts or rejects strings A generator produces strings A parser constructs a parse tree for a string

More information

Compiler Design Concepts. Syntax Analysis

Compiler Design Concepts. Syntax Analysis Compiler Design Concepts Syntax Analysis Introduction First task is to break up the text into meaningful words called tokens. newval=oldval+12 id = id + num Token Stream Lexical Analysis Source Code (High

More information

VIVA QUESTIONS WITH ANSWERS

VIVA QUESTIONS WITH ANSWERS VIVA QUESTIONS WITH ANSWERS 1. What is a compiler? A compiler is a program that reads a program written in one language the source language and translates it into an equivalent program in another language-the

More information

Ambiguity. Grammar E E + E E * E ( E ) int. The string int * int + int has two parse trees. * int

Ambiguity. Grammar E E + E E * E ( E ) int. The string int * int + int has two parse trees. * int Administrivia Ambiguity, Precedence, Associativity & op-down Parsing eam assignments this evening for all those not listed as having one. HW#3 is now available, due next uesday morning (Monday is a holiday).

More information

Bottom-Up Parsing. Lecture 11-12

Bottom-Up Parsing. Lecture 11-12 Bottom-Up Parsing Lecture 11-12 (From slides by G. Necula & R. Bodik) 2/20/08 Prof. Hilfinger CS164 Lecture 11 1 Administrivia Test I during class on 10 March. 2/20/08 Prof. Hilfinger CS164 Lecture 11

More information

Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5

Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5 Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5 1 Not all languages are regular So what happens to the languages which are not regular? Can we still come up with a language recognizer?

More information

CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1

CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1 CSE P 501 Compilers LR Parsing Hal Perkins Spring 2018 UW CSE P 501 Spring 2018 D-1 Agenda LR Parsing Table-driven Parsers Parser States Shift-Reduce and Reduce-Reduce conflicts UW CSE P 501 Spring 2018

More information

Chapter 4. Lexical and Syntax Analysis

Chapter 4. Lexical and Syntax Analysis Chapter 4 Lexical and Syntax Analysis Chapter 4 Topics Introduction Lexical Analysis The Parsing Problem Recursive-Descent Parsing Bottom-Up Parsing Copyright 2012 Addison-Wesley. All rights reserved.

More information

Syntax Analysis. Prof. James L. Frankel Harvard University. Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved.

Syntax Analysis. Prof. James L. Frankel Harvard University. Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved. Syntax Analysis Prof. James L. Frankel Harvard University Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved. Context-Free Grammar (CFG) terminals non-terminals start

More information

More Bottom-Up Parsing

More Bottom-Up Parsing More Bottom-Up Parsing Lecture 7 Dr. Sean Peisert ECS 142 Spring 2009 1 Status Project 1 Back By Wednesday (ish) savior lexer in ~cs142/s09/bin Project 2 Due Friday, Apr. 24, 11:55pm My office hours 3pm

More information

Optimizing Finite Automata

Optimizing Finite Automata Optimizing Finite Automata We can improve the DFA created by MakeDeterministic. Sometimes a DFA will have more states than necessary. For every DFA there is a unique smallest equivalent DFA (fewest states

More information

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis Chapter 4 Lexical and Syntax Analysis Introduction - Language implementation systems must analyze source code, regardless of the specific implementation approach - Nearly all syntax analysis is based on

More information

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F). CS 2210 Sample Midterm 1. Determine if each of the following claims is true (T) or false (F). F A language consists of a set of strings, its grammar structure, and a set of operations. (Note: a language

More information

Introduction to parsers

Introduction to parsers Syntax Analysis Introduction to parsers Context-free grammars Push-down automata Top-down parsing LL grammars and parsers Bottom-up parsing LR grammars and parsers Bison/Yacc - parser generators Error

More information

Lecture 7: Deterministic Bottom-Up Parsing

Lecture 7: Deterministic Bottom-Up Parsing Lecture 7: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Tue Sep 20 12:50:42 2011 CS164: Lecture #7 1 Avoiding nondeterministic choice: LR We ve been looking at general

More information

Part 3. Syntax analysis. Syntax analysis 96

Part 3. Syntax analysis. Syntax analysis 96 Part 3 Syntax analysis Syntax analysis 96 Outline 1. Introduction 2. Context-free grammar 3. Top-down parsing 4. Bottom-up parsing 5. Conclusion and some practical considerations Syntax analysis 97 Structure

More information

Context-free grammars

Context-free grammars Context-free grammars Section 4.2 Formal way of specifying rules about the structure/syntax of a program terminals - tokens non-terminals - represent higher-level structures of a program start symbol,

More information

Topic 3: Syntax Analysis I

Topic 3: Syntax Analysis I Topic 3: Syntax Analysis I Compiler Design Prof. Hanjun Kim CoreLab (Compiler Research Lab) POSTECH 1 Back-End Front-End The Front End Source Program Lexical Analysis Syntax Analysis Semantic Analysis

More information

Properties of Regular Expressions and Finite Automata

Properties of Regular Expressions and Finite Automata Properties of Regular Expressions and Finite Automata Some token patterns can t be defined as regular expressions or finite automata. Consider the set of balanced brackets of the form [[[ ]]]. This set

More information

Extra Credit Question

Extra Credit Question Top-Down Parsing #1 Extra Credit Question Given this grammar G: E E+T E T T T * int T int T (E) Is the string int * (int + int) in L(G)? Give a derivation or prove that it is not. #2 Revenge of Theory

More information

Example CFG. Lectures 16 & 17 Bottom-Up Parsing. LL(1) Predictor Table Review. Stacks in LR Parsing 1. Sʹ " S. 2. S " AyB. 3. A " ab. 4.

Example CFG. Lectures 16 & 17 Bottom-Up Parsing. LL(1) Predictor Table Review. Stacks in LR Parsing 1. Sʹ  S. 2. S  AyB. 3. A  ab. 4. Example CFG Lectures 16 & 17 Bottom-Up Parsing CS 241: Foundations of Sequential Programs Fall 2016 1. Sʹ " S 2. S " AyB 3. A " ab 4. A " cd Matt Crane University of Waterloo 5. B " z 6. B " wz 2 LL(1)

More information

PESIT Bangalore South Campus Hosur road, 1km before Electronic City, Bengaluru -100 Department of Computer Science and Engineering

PESIT Bangalore South Campus Hosur road, 1km before Electronic City, Bengaluru -100 Department of Computer Science and Engineering TEST 1 Date : 24 02 2015 Marks : 50 Subject & Code : Compiler Design ( 10CS63) Class : VI CSE A & B Name of faculty : Mrs. Shanthala P.T/ Mrs. Swati Gambhire Time : 8:30 10:00 AM SOLUTION MANUAL 1. a.

More information

The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program.

The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program. COMPILER DESIGN 1. What is a compiler? A compiler is a program that reads a program written in one language the source language and translates it into an equivalent program in another language-the target

More information

Compiler Construction: Parsing

Compiler Construction: Parsing Compiler Construction: Parsing Mandar Mitra Indian Statistical Institute M. Mitra (ISI) Parsing 1 / 33 Context-free grammars. Reference: Section 4.2 Formal way of specifying rules about the structure/syntax

More information

Chapter 4. Lexical and Syntax Analysis. Topics. Compilation. Language Implementation. Issues in Lexical and Syntax Analysis.

Chapter 4. Lexical and Syntax Analysis. Topics. Compilation. Language Implementation. Issues in Lexical and Syntax Analysis. Topics Chapter 4 Lexical and Syntax Analysis Introduction Lexical Analysis Syntax Analysis Recursive -Descent Parsing Bottom-Up parsing 2 Language Implementation Compilation There are three possible approaches

More information

Introduction to Parsing. Comp 412

Introduction to Parsing. Comp 412 COMP 412 FALL 2010 Introduction to Parsing Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make

More information

Syntax Analysis. Amitabha Sanyal. (www.cse.iitb.ac.in/ as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

Syntax Analysis. Amitabha Sanyal. (www.cse.iitb.ac.in/ as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay Syntax Analysis (www.cse.iitb.ac.in/ as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay September 2007 College of Engineering, Pune Syntax Analysis: 2/124 Syntax

More information

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING KATHMANDU UNIVERSITY SCHOOL OF ENGINEERING DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING REPORT ON NON-RECURSIVE PREDICTIVE PARSER Fourth Year First Semester Compiler Design Project Final Report submitted

More information

Lecture 8: Deterministic Bottom-Up Parsing

Lecture 8: Deterministic Bottom-Up Parsing Lecture 8: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Fri Feb 12 13:02:57 2010 CS164: Lecture #8 1 Avoiding nondeterministic choice: LR We ve been looking at general

More information

Homework & Announcements

Homework & Announcements Homework & nnouncements New schedule on line. Reading: Chapter 18 Homework: Exercises at end Due: 11/1 Copyright c 2002 2017 UMaine School of Computing and Information S 1 / 25 COS 140: Foundations of

More information

Syntax Analysis. COMP 524: Programming Language Concepts Björn B. Brandenburg. The University of North Carolina at Chapel Hill

Syntax Analysis. COMP 524: Programming Language Concepts Björn B. Brandenburg. The University of North Carolina at Chapel Hill Syntax Analysis Björn B. Brandenburg The University of North Carolina at Chapel Hill Based on slides and notes by S. Olivier, A. Block, N. Fisher, F. Hernandez-Campos, and D. Stotts. The Big Picture Character

More information

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous. Section A 1. What do you meant by parser and its types? A parser for grammar G is a program that takes as input a string w and produces as output either a parse tree for w, if w is a sentence of G, or

More information

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice. Parsing Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students

More information

COMP 181. Prelude. Next step. Parsing. Study of parsing. Specifying syntax with a grammar

COMP 181. Prelude. Next step. Parsing. Study of parsing. Specifying syntax with a grammar COMP Lecture Parsing September, 00 Prelude What is the common name of the fruit Synsepalum dulcificum? Miracle fruit from West Africa What is special about miracle fruit? Contains a protein called miraculin

More information

Lexical and Syntax Analysis. Top-Down Parsing

Lexical and Syntax Analysis. Top-Down Parsing Lexical and Syntax Analysis Top-Down Parsing Easy for humans to write and understand String of characters Lexemes identified String of tokens Easy for programs to transform Data structure Syntax A syntax

More information

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax Susan Eggers 1 CSE 401 Syntax Analysis/Parsing Context-free grammars (CFG s) Purpose: determine if tokens have the right form for the language (right syntactic structure) stream of tokens abstract syntax

More information

Compilation Lecture 3: Syntax Analysis: Top-Down parsing. Noam Rinetzky

Compilation Lecture 3: Syntax Analysis: Top-Down parsing. Noam Rinetzky Compilation 0368-3133 Lecture 3: Syntax Analysis: Top-Down parsing Noam Rinetzky 1 Recursive descent parsing Define a function for every nonterminal Every function work as follows Find applicable production

More information

Compiler Design 1. Top-Down Parsing. Goutam Biswas. Lect 5

Compiler Design 1. Top-Down Parsing. Goutam Biswas. Lect 5 Compiler Design 1 Top-Down Parsing Compiler Design 2 Non-terminal as a Function In a top-down parser a non-terminal may be viewed as a generator of a substring of the input. We may view a non-terminal

More information

Review of CFGs and Parsing II Bottom-up Parsers. Lecture 5. Review slides 1

Review of CFGs and Parsing II Bottom-up Parsers. Lecture 5. Review slides 1 Review of CFGs and Parsing II Bottom-up Parsers Lecture 5 1 Outline Parser Overview op-down Parsers (Covered largely through labs) Bottom-up Parsers 2 he Functionality of the Parser Input: sequence of

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Parsing CMSC 330 - Spring 2017 1 Recall: Front End Scanner and Parser Front End Token Source Scanner Parser Stream AST Scanner / lexer / tokenizer converts

More information

Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal)

Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal) Parsing Part II (Ambiguity, Top-down parsing, Left-recursion Removal) Ambiguous Grammars Definitions If a grammar has more than one leftmost derivation for a single sentential form, the grammar is ambiguous

More information

Prelude COMP 181 Tufts University Computer Science Last time Grammar issues Key structure meaning Tufts University Computer Science

Prelude COMP 181 Tufts University Computer Science Last time Grammar issues Key structure meaning Tufts University Computer Science Prelude COMP Lecture Topdown Parsing September, 00 What is the Tufts mascot? Jumbo the elephant Why? P. T. Barnum was an original trustee of Tufts : donated $0,000 for a natural museum on campus Barnum

More information

Fall Compiler Principles Lecture 3: Parsing part 2. Roman Manevich Ben-Gurion University

Fall Compiler Principles Lecture 3: Parsing part 2. Roman Manevich Ben-Gurion University Fall 2014-2015 Compiler Principles Lecture 3: Parsing part 2 Roman Manevich Ben-Gurion University Tentative syllabus Front End Intermediate Representation Optimizations Code Generation Scanning Lowering

More information

Syntax Analysis. The Big Picture. The Big Picture. COMP 524: Programming Languages Srinivas Krishnan January 25, 2011

Syntax Analysis. The Big Picture. The Big Picture. COMP 524: Programming Languages Srinivas Krishnan January 25, 2011 Syntax Analysis COMP 524: Programming Languages Srinivas Krishnan January 25, 2011 Based in part on slides and notes by Bjoern Brandenburg, S. Olivier and A. Block. 1 The Big Picture Character Stream Token

More information

COP 3402 Systems Software Syntax Analysis (Parser)

COP 3402 Systems Software Syntax Analysis (Parser) COP 3402 Systems Software Syntax Analysis (Parser) Syntax Analysis 1 Outline 1. Definition of Parsing 2. Context Free Grammars 3. Ambiguous/Unambiguous Grammars Syntax Analysis 2 Lexical and Syntax Analysis

More information

CSE443 Compilers. Dr. Carl Alphonce 343 Davis Hall

CSE443 Compilers. Dr. Carl Alphonce 343 Davis Hall CSE443 Compilers Dr. Carl Alphonce alphonce@buffalo.edu 343 Davis Hall Phases of a compiler Syntactic structure Figure 1.6, page 5 of text Recap Lexical analysis: LEX/FLEX (regex -> lexer) Syntactic analysis:

More information

Lexical and Syntax Analysis

Lexical and Syntax Analysis Lexical and Syntax Analysis (of Programming Languages) Top-Down Parsing Lexical and Syntax Analysis (of Programming Languages) Top-Down Parsing Easy for humans to write and understand String of characters

More information

Parser Generation. Bottom-Up Parsing. Constructing LR Parser. LR Parsing. Construct parse tree bottom-up --- from leaves to the root

Parser Generation. Bottom-Up Parsing. Constructing LR Parser. LR Parsing. Construct parse tree bottom-up --- from leaves to the root Parser Generation Main Problem: given a grammar G, how to build a top-down parser or a bottom-up parser for it? parser : a program that, given a sentence, reconstructs a derivation for that sentence ----

More information

Languages and Compilers

Languages and Compilers Principles of Software Engineering and Operational Systems Languages and Compilers SDAGE: Level I 2012-13 3. Formal Languages, Grammars and Automata Dr Valery Adzhiev vadzhiev@bournemouth.ac.uk Office:

More information

Chapter 3. Parsing #1

Chapter 3. Parsing #1 Chapter 3 Parsing #1 Parser source file get next character scanner get token parser AST token A parser recognizes sequences of tokens according to some grammar and generates Abstract Syntax Trees (ASTs)

More information