Lecture Compiler Construction

1 Lecture Compiler Construction Franz Wotawa Institute for Software Technology Technische Universität Graz Inffeldgasse 16b/2, A-8010 Graz, Austria Summer term 2017 F. Wotawa TU Graz) Compiler Construction Summer term / 309

2 Compilers are everywhere
Programming languages like Java, C#, C, C++, Pascal, Modula, SML, Lisp, VHDL, Basic, ...
Graphical languages also need compilers (HTML, LaTeX, ...)
Means for communication between computers like XML
Natural languages
Compilers translate a sentence written in one language to another language.

3 Example HTML
<!DOCTYPE HTML PUBLIC "...">
<html>
<head>
<meta content="text/html;..." http-equiv="content-type">
<title>Hello World</title>
</head>
<body>
<h1>Hello World!</h1>
</body>
</html>

4 Another example (Source: en.wikipedia.org/wiki/Java_bytecode)

5 Compiler vs. Interpreter
Compiler: Translates a program from one language into another language in which it can be executed directly (C, C++, Pascal, ...).
Interpreter: Executes a program directly without converting it (BASIC, batch languages, ...).
The exact boundary is difficult to define nowadays (byte code interpreter vs. CPU, which executes machine code statements).
The front end (analysis phase) of compilers is always needed!

6 Organizational issues
Lecture (Vorlesung) 2h
Practical exercises (Übung) 1h
Examples from the Compiler Construction theory
Hands on part: Development of a compiler
More information soon (via and/or the webpage)
Office hours:
Franz Wotawa: Tuesday, 13:00-14:00
Roxane Koitz: Monday, 13:00-14:00

7 Lecture dates
Monday, , 16:00-18:00 (preliminary discussion)
Tuesday, , 16:00-17:30 (lexical analysis)
Monday, , 16:00-19:00 (parsing)
Tuesday, , 16:00-17:30 (attributed grammars)
Monday, , 16:00-19:00 (type checking)
Tuesday, , 16:00-17:00 (runtime environment)
Monday, , 16:00-19:00 (code generation)
Tuesday, , 16:00-17:30 (code optimization, Q&A)

8 Exam
Written form only; no papers, books etc. allowed!
Content: DFA, NFA, LL(1), SLR(1), Attributed Grammars, Type Checking, Code Generation, Code Optimization, etc.
0. date: Monday, , 18:00-19:00, i13, (1 group, 1 hour)
1. date: Monday, , 18:00-20:00, i13, (2 groups, each 1 hour)
2. date: in October 2017, will be announced soon

9 Literature, etc.
Aho, Sethi, Ullman, Compilers: Principles, Techniques, and Tools, Addison-Wesley, 1985, ISBN
Tremblay, Sorenson, The Theory and Practice of Compiler Writing, McGraw Hill, 1985, ISBN
Copy of slides but no lecture notes
Compiler construction webpage:

10 PART 1 - INTRODUCTION

11 What is a compiler?
Program: source language → target language
Error reports if source code contains errors
Diagram: source program → compiler → target program; the compiler also emits error messages

12 Implications of definition
There are (very) many compilers
Thousands of source languages (e.g. Pascal, Modula, C, C++, Java, ...)
Thousands of target languages (other high-level programming languages, machine code)
Classification: single-pass, multi-pass, debugging, or optimizing
But: basics of creating a compiler are mostly consistent

13 Concept of compilation
Analysis phase
  Split source program into tokens
  Generate intermediate code (intermediate representation of source program)
Synthesis phase
  Generate target program based on intermediate code
Enables reuse of program parts for different source and target languages

14 Concept of compilation
Analysis
Operations used within the program are stored in a hierarchical structure: the syntax tree
Other tools which perform an analysis:
  Structure editors
  Pretty printers
  Static checkers
  Interpreters
  Web browsers

15 Language-processing systems
skeletal source program → preprocessor → source program → compiler → target assembly language → assembler → relocatable machine code → loader/link-editor → absolute machine code
The loader/link-editor additionally reads libraries and relocatable object files.

16 Components of a compiler
source code → lexical analyser → syntax analyser → semantic analyser → intermediate code generator → code optimizer → code generator → target program
The symbol-table manager and the error handler interact with all of these phases.

17 Lexical analysis, scanning
Example: pos := init + rate * 60
Tokens:
1 Identifier pos
2 Assignment symbol :=
3 Identifier init
4 Plus operator +
5 Identifier rate
6 Multiplication operator *
7 Constant (number) 60
Syntax tree: :=(pos, +(init, *(rate, 60)))
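The token list above can be reproduced with a small regular-expression scanner. This is only a sketch: the token names (IDENT, ASSIGN, ...) and the function `tokenize` are illustrative choices, not taken from the slides.

```python
import re

# Token kinds for the example statement; names are hypothetical.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z][A-Za-z0-9]*"),
    ("ASSIGN", r":="),
    ("PLUS", r"\+"),
    ("TIMES", r"\*"),
    ("SKIP", r"\s+"),  # white space is filtered out
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (kind, lexeme) pairs; raise ValueError on an unmatched character."""
    tokens, pos = [], 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("pos := init + rate * 60")` yields seven pairs in the order of the list above.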

18 Syntax analysis, parsing
Grammatical analysis: tokens are matched against the rules of a grammar
Rules of grammar (example):
1 Each identifier is an expression
2 Each number is an expression
3 Assuming that ex_1 and ex_2 are expressions, then ex_1 + ex_2 and ex_1 * ex_2 are expressions as well
4 A term of the form identifier := expression is a statement
Result: parse tree

19 Example - Parse tree
assignment statement
  identifier: pos
  :=
  expression
    expression
      identifier: init
    +
    expression
      expression
        identifier: rate
      *
      expression
        number: 60

20 Semantic analysis
Check for semantic errors
Type checking
Identification of operators and operands
Necessary for code generation
Example: conversion from int to real
The statement pos := init + rate * 60 becomes pos := init + rate * int2real(60)

21 Intermediate code generation
E.g.: three-address code
Example:
1. tmp1 := int2real(60)
2. tmp2 := id3 * tmp1
3. tmp3 := id2 + tmp2
4. id1 := tmp3
using the symbol table
pos (id1) real ...
init (id2) real ...
rate (id3) real ...

22 Code optimizer
Goal: faster machine code
Example (before):
1. tmp1 := int2real(60)
2. tmp2 := id3 * tmp1
3. tmp3 := id2 + tmp2
4. id1 := tmp3
Example (after):
1. tmp1 := id3 * 60.0
2. id1 := id2 + tmp1

23 Code generator
Generation of actual target code (machine code, assembler)
Example:
1. MOVF id3, R2
2. MULF #60.0, R2
3. MOVF id2, R1
4. ADDF R2, R1
5. MOVF R1, id1

24 PART 2 - LEXICAL ANALYSIS

25 Goals of this section
Specification and implementation of lexical analysers
Easiest method:
1 Create a diagram which describes the structure of tokens
2 Manually translate the diagram into a program
Regular expressions and finite state machines

26 Tasks
Diagram: source program → lexical analyser → token → parser; the parser requests tokens via "get next token"; both use the symbol table
Filter comments, white spaces, ...
Establish relations between error messages of the compiler and source code (e.g. via line number)
Preprocessor functions (e.g. in C)
Create a copy of the source program containing error messages

27 Why separate lexical analysis and parsing?
1 Simpler design: e.g. removal of comments and white spaces enables use of a simpler parser design
2 Improved efficiency: a large portion of time is spent reading programs and converting them into tokens
3 Enhanced portability of compilers
4 (There are tools for both phases)

28 Tokens, patterns, lexemes
Pattern: Rule describing a set of strings (a token)
Token: Set of strings described by a pattern
Lexeme: Sequence of characters which is matched by the pattern of a token
Token | Lexemes | Pattern (verbal)
if | if | if
relation | >, <, ... | < or > or ...
id | pi, test_cases | letter followed by letters, digits, or _
num | 2.7, 0, 12e-4 | any numeric constant
literal | "segmentation fault" | any characters between " and "

29 Tokens
Terminals in the grammar (checked by the parser)
In programming languages: keywords, operators, identifiers, constants, strings, brackets
Language conventions:
1 Tokens have to occur in specific places within the code
2 Tokens are separated by white spaces
  Example (Fortran): DO 5 I = 1.25 is interpreted as DO5I = 1.25
3 Keywords are reserved and must not be used as identifiers
  Example (keywords not reserved): IF THEN THEN THEN = ELSE; ELSE ELSE = THEN;
Convention 1 is (currently) not used, 2 is accepted, 3 is reasonable (and simplifies compiler construction)

30 Token attributes
Store lexeme for further processing
Store pointers to entries in symbol table
Line number or position of token within source code
Example: U = 2 * R * PI
< id, pointer to U >
< assign op >
< num, integer value 2 >
< mult op >
< id, pointer to R >
< mult op >
< id, pointer to PI >

31 Lexical errors
fi ( a == 1) ... : fi is interpreted as an undefined identifier
String sequences which cannot be matched to a token, e.g. 2.3e*4 instead of 2.3e+4
Error correction:
1 Panic mode: Removal of faulty characters until a token can be matched
2 Remove a faulty character
3 Include a new character
4 Replace a character
5 Change order of neighboring characters
Some errors (like the first example) may only be recognizable by the parser

32 Specification of tokens
Strings over an alphabet:
Alphabet ... finite set of symbols, example: {0, 1} is the binary alphabet
String (word, sentence) ... finite sequence of symbols of the alphabet
ɛ ... empty string
Language: a set of strings over an alphabet
Operators:
Concatenation: xy is the string in which y is attached to the end of x
x^i is defined as: x^0 = ɛ, and for i > 0: x^i = x^(i-1) x
Prefix: comp is a prefix of computer
Suffix: uter is a suffix of computer
Substring: put is a substring of computer
Subsequence: opt is a subsequence of computer

33 Operations on languages
Union: L ∪ M = {x | x ∈ L ∨ x ∈ M}
Concatenation: LM = {xy | x ∈ L ∧ y ∈ M}
Kleene closure: L* = ⋃ (i = 0..∞) L^i
Positive closure: L+ = ⋃ (i = 1..∞) L^i
Example: L = {A, ..., Z}, M = {0, 1, ..., 9}
L ∪ M: set of letters and digits
LM: strings consisting of a letter followed by a digit
L^4: strings which are made up of four letters
L*: strings made up of letters, including ɛ
M+: digit strings containing at least one digit
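For finite languages these operators map directly onto set operations. The following sketch uses hypothetical helper names (`concat`, `power`, `closure_up_to`); since L* is infinite for any non-empty L other than {ɛ}, the closure is only approximated up to a chosen exponent n.

```python
from itertools import product


def concat(L, M):
    """LM = {xy | x in L and y in M}."""
    return {x + y for x, y in product(L, M)}


def power(L, i):
    """L^i, with L^0 = {ɛ} (ɛ is the empty Python string)."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result


def closure_up_to(L, n, positive=False):
    """Finite approximation of L* (or L+): the union of L^i for i up to n."""
    start = 1 if positive else 0
    out = set()
    for i in range(start, n + 1):
        out |= power(L, i)
    return out


L = {"a", "b"}
M = {"0", "1"}
union = L | M  # L ∪ M
```

With these definitions, `concat(L, M)` contains exactly the four strings a0, a1, b0 and b1, and the empty string belongs to `closure_up_to(L, 2)` but not to the positive variant.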

34 Regular expressions
Example: Pascal identifier: letter (letter | digit)*
A regular expression r defines a language over an alphabet Σ using the following rules:
1 The regular expression ɛ describes {ɛ}
2 a ∈ Σ: the regular expression a describes {a}
3 r, s are regular expressions (languages L(r), L(s)):
  1 (r)|(s) describes L(r) ∪ L(s)
  2 (r)(s) describes L(r)L(s)
  3 (r)* describes (L(r))*
  4 (r)+ describes (L(r))+
The languages described in this way are the regular sets

35 Examples
Assuming Σ = {a, b}
1 a|b describes the language {a, b}
2 (a|b)(a|b) describes {aa, ab, ba, bb}
3 a* describes {ɛ, a, aa, aaa, aaaa, ...}
4 (a|b)* describes all strings of zero or more a's and b's
Not all languages can be described by regular expressions, e.g. {wcw | w is a string made up of a's and b's}
Regular expressions may only describe words containing either a fixed number of repetitions or an unbounded number of repetitions

36 Regular definitions
A regular definition is a sequence of the form:
d_1 → r_1
...
d_n → r_n
whereby d_i is a distinct name and r_i is a regular expression over Σ ∪ {d_1, ..., d_n}
Example:
letter → A | ... | Z | a | ... | z
digit → 0 | 1 | ... | 9
id → letter (letter | digit)*

37 Token detection
Goal: Identify lexeme of token in input buffer
Output: Token-attribute pair
Regular expression | Token | Attribute value
white space | - | -
if | if | -
then | then | -
id | id | pointer to table entry
num | num | pointer to table entry
< | relop | LT
> | relop | GT

38 Transition diagrams
Actions performed by the lexical analyser (after "get next token" of the parser)
States connected by directed edges
Edges are labeled: labels contain the symbols expected in the input buffer in order to progress to the next state
The label "other" denotes all symbols not used by any other edge originating from the same node
One start state
End states are reached after matching a token
Deterministic (not necessarily)

39 Example
The regular expression letter (letter | digit)* can be described by the following transition diagram:
start → state 1 --letter--> state 2 --other--> state 3: return(gettoken(), install_id())
State 2 has a loop labeled "letter or digit".
Assume the input buffer contains P at position 1, I at position 2, a space at position 3, = at position 4, ... and the current-symbol pointer is pointing to position 1

40 Token matching - Example
1 Current symbol = P, position = 1: transition from state 1 to state 2 and read new symbol
2 Current symbol = I, position = 2: remain in state 2 and read new symbol
3 Current symbol = SPACE, position = 3: transition from state 2 to state 3
4 State 3 is an end state: return(gettoken(), install_id())
[Figure: the transition diagram of the previous slide, annotated with the input symbols P and I]

41 Example: Implementing a lexer directly
Pseudo code:
lexem = "";
if (next_char ∈ letter) then
    lexem = lexem + next_char;
    get_next_char();
    while (next_char ∈ letter ∪ digit) do
        lexem = lexem + next_char;
        get_next_char();
    return IDENTIFIER(lexem);
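A runnable counterpart of this pseudo code might look as follows. It is a sketch under assumptions: the function name `scan_identifier` is made up, letters and digits are restricted to ASCII, and the scanner returns the lexeme together with the position after it instead of an IDENTIFIER token object.

```python
def is_letter(ch):
    """ASCII letters only; the slides do not fix the exact character set."""
    return ch.isascii() and ch.isalpha()


def is_digit(ch):
    return ch.isascii() and ch.isdigit()


def scan_identifier(text, pos=0):
    """Return (lexeme, next_pos), or (None, pos) if no identifier starts at pos."""
    if pos >= len(text) or not is_letter(text[pos]):
        return None, pos
    end = pos + 1
    # Stay in the "letter or digit" loop state for as long as possible.
    while end < len(text) and (is_letter(text[end]) or is_digit(text[end])):
        end += 1
    return text[pos:end], end
```

For the input buffer of the preceding example, `scan_identifier("PI = ...")` stops at the space and returns the lexeme PI.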

42 Generalization: Finite automata
Nondeterministic finite automaton (NFA) (S, Σ, move, s_0, F)
1 Set of states S
2 Set of input symbols Σ
3 Transition relation move: S × (Σ ∪ {ɛ}) → 2^S, relating state-symbol pairs to sets of states
4 Initial state s_0 ∈ S
5 Set of end states F ⊆ S
An NFA can be visualized in the form of a directed graph (transition graph)
Distinctions from the transition diagram:
  Nondeterministic
  ɛ-moves allowed

43 NFA Example
NFA for aa*|bb* (language, accepting NFA): ({0, 1, 2, 3, 4}, {a, b}, move, 0, {2, 4}) with move:
state | symbol | new states
0 | ɛ | {1,3}
1 | a | {2}
2 | a | {2}
3 | b | {4}
4 | b | {4}

44 Deterministic finite automaton
A deterministic finite automaton (DFA) is a special case of an NFA whereby:
1 No state has an ɛ-move
2 For each state s and input symbol a there is at most one edge labeled a leaving s
DFAs contain exactly one transition for each input
Each entry in the transition table contains exactly one state
Transition table layout: one row per state, one column per input symbol (a, b)

45 DFA simulation
Input: Input string x, DFA D
Output: Yes if D accepts x, No otherwise
s := s_0
c := nextchar
while c ≠ eof do
    s := move(s, c)
    if s is undefined then return No
    c := nextchar
end
if s ∈ F then return Yes else return No

46 DFA Example
DFA for aa*|bb*: ({0, 1, 2}, {a, b}, move, 0, {1, 2}) whereby move:
state | symbol | new state
0 | a | 1
0 | b | 2
1 | a | 1
2 | b | 2
[Figure: transition graph with start state 0, an a-edge to state 1 (with a-loop) and a b-edge to state 2 (with b-loop)]
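The DFA simulation algorithm can be tried out on this example automaton. The sketch below stores move as a Python dictionary; the names `MOVE`, `START`, `ACCEPTING` and `dfa_accepts` are illustrative, and a missing dictionary entry plays the role of an undefined transition.

```python
# Transition table of the example DFA for aa*|bb*.
MOVE = {
    (0, "a"): 1,
    (0, "b"): 2,
    (1, "a"): 1,
    (2, "b"): 2,
}
START = 0
ACCEPTING = {1, 2}


def dfa_accepts(x):
    """Return True if the DFA accepts the string x, False otherwise."""
    s = START
    for c in x:
        s = MOVE.get((s, c))
        if s is None:  # no transition defined: reject immediately
            return False
    return s in ACCEPTING
```

The running time is proportional to the length of x, matching the O(|x|) bound for DFAs.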

47 Conversion NFA → DFA
Subset construction algorithm
Idea:
1 Each DFA state corresponds to a set of NFA states
2 After reading a_1 a_2 ... a_n the DFA is in a state which is equivalent to a subset of the NFA states. This subset corresponds to the paths of the NFA when reading a_1 a_2 ... a_n.
The number of DFA states is exponential in the number of NFA states (worst case)

48 Subset construction
Input: NFA N = (S, Σ, move, s_0, F)
Output: DFA D, accepting the same language as N
Initially, ɛ-closure(s_0) is the only state in Dstates and it is unmarked
while there is an unmarked state T in Dstates do
    mark T
    for each input symbol a do
        U := ɛ-closure(move(T, a))
        if U ∉ Dstates then
            Add U as unmarked state to Dstates
        end
        Dtran[T, a] := U
    end
end

49 ɛ-closure, move
ɛ-closure(s): Set of NFA states which are accessible from a state s via ɛ-moves
ɛ-closure(T): Set of NFA states which are accessible from some state s ∈ T via ɛ-moves
move(T, a): Set of NFA states which are accessible from some state s ∈ T via a move on the input symbol a

50 Calculation of ɛ-closure
Input: NFA N, set of states T
Output: ɛ-closure(T)
Push all states in T onto stack
Initialize ɛ-closure(T) to T
while stack ≠ ∅ do
    Pop t, the top element of stack
    for each state u with an edge from t to u labeled ɛ do
        if u ∉ ɛ-closure(T) then
            Add u to ɛ-closure(T)
            Push u onto stack
        end
    end
end
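The ɛ-closure computation and the subset construction can be sketched together, here applied to the five-state NFA for aa*|bb* from the NFA example. All Python names are illustrative. One deliberate deviation, labeled as an assumption: the empty set is not added as a DFA state, so the resulting transition table is partial.

```python
EPS = ""  # stands for the ɛ label
NFA_MOVE = {
    (0, EPS): {1, 3},
    (1, "a"): {2},
    (2, "a"): {2},
    (3, "b"): {4},
    (4, "b"): {4},
}
NFA_START, NFA_FINAL, ALPHABET = 0, {2, 4}, ["a", "b"]


def eps_closure(states):
    """All NFA states reachable from `states` via ɛ-moves only."""
    closure = set(states)
    stack = list(states)
    while stack:
        t = stack.pop()
        for u in NFA_MOVE.get((t, EPS), ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return frozenset(closure)


def move(states, a):
    """All NFA states reachable from `states` by one move on symbol a."""
    out = set()
    for s in states:
        out |= NFA_MOVE.get((s, a), set())
    return out


def subset_construction():
    start = eps_closure({NFA_START})
    dstates, unmarked, dtran = {start}, [start], {}
    while unmarked:
        T = unmarked.pop()  # popping from the work list marks T
        for a in ALPHABET:
            U = eps_closure(move(T, a))
            if not U:  # assumption: leave out the dead state
                continue
            if U not in dstates:
                dstates.add(U)
                unmarked.append(U)
            dtran[(T, a)] = U
    accepting = {T for T in dstates if T & NFA_FINAL}
    return start, dtran, accepting


start, dtran, accepting = subset_construction()
```

The result has three DFA states, {0,1,3}, {2} and {4}, which corresponds to the three-state DFA of the DFA example up to renaming.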

51 Example NFA → DFA
[Figure: NFA transition graph with edges labeled a, b and ɛ]

52 Thompson's Construction
Input: Regular expression r over an alphabet Σ
Output: NFA N, which accepts L(r)
[Figure: NFA fragments with initial state i and final state f for the cases ɛ, a ∈ Σ, s|t, st, s* and [s], composed from N(s) and N(t) using ɛ-edges]
whereby [s] = (s|ɛ)

53 NFA simulation
Input: NFA N, input string x
Output: Yes if N accepts x, No otherwise
S := ɛ-closure({s_0})
a := nextchar
while a ≠ eof do
    S := ɛ-closure(move(S, a))
    a := nextchar
end
if S ∩ F ≠ ∅ then return Yes else return No
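As a small illustration, the simulation can be run on the NFA for aa*|bb* from the NFA example. The names `NFA_MOVE`, `eps_closure` and `nfa_accepts` are illustrative; the set S of current states is recomputed for every input symbol, which is where the O(|r| × |x|) time bound comes from.

```python
EPS = ""  # stands for the ɛ label
NFA_MOVE = {
    (0, EPS): {1, 3},
    (1, "a"): {2},
    (2, "a"): {2},
    (3, "b"): {4},
    (4, "b"): {4},
}
NFA_START, NFA_FINAL = 0, {2, 4}


def eps_closure(states):
    """All NFA states reachable from `states` via ɛ-moves only."""
    closure, stack = set(states), list(states)
    while stack:
        t = stack.pop()
        for u in NFA_MOVE.get((t, EPS), ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return closure


def nfa_accepts(x):
    """Return True if the NFA accepts x, tracking the set of current states."""
    S = eps_closure({NFA_START})
    for a in x:
        S = eps_closure({u for s in S for u in NFA_MOVE.get((s, a), ())})
    return bool(S & NFA_FINAL)
```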

54 Summary
Check whether an input string x is contained in a language defined by the regular expression r:
1 Construct NFA (Thompson's construction) and apply simulation algorithm for NFAs
2 Construct NFA (Thompson's construction), convert NFA to DFA and apply simulation algorithm for DFAs
Time-space tradeoffs
Automaton | Space | Time
NFA | O(|r|) | O(|r| × |x|)
DFA | O(2^|r|) | O(|x|)

55 Regular expression → DFA
Direct conversion
Notation:
NFA state s is important ⇔ s has at least one non-ɛ-move to a successor
Extended regular expression (augmented regular expression): (r)#
Illustration of extended expressions via syntax trees (cat-node, or-node, star-node)
Each leaf node is labeled either with a symbol (of the alphabet) or with ɛ
Each leaf node (≠ ɛ) has a designated number (position)
Remark: NFA-DFA conversion only via important states = positions

56 Syntax tree Example
Syntax tree for (a|b)*abb#:
cat
  cat
    cat
      cat
        star
          or
            a (position 1)
            b (position 2)
        a (position 3)
      b (position 4)
    b (position 5)
  # (position 6)

57 Algorithm - Idea
Approach:
1 Create syntax tree for extended regular expression
2 Calculate four functions: nullable, firstpos, lastpos, followpos
3 Construct DFA using followpos
DFA states correspond to sets of positions (important NFA states)
For a position i, followpos(i) contains the positions j for which there exists an input string ...cd... where i corresponds to c and j corresponds to d

58 followpos Example
Syntax tree for (a|b)*abb#
followpos(1) = {1, 2, 3}
Explanation: Suppose we see an a. This symbol could belong either to (a|b)* or to the following a. If we see a b then this symbol has to belong to (a|b)*. Thus positions 1, 2, 3 are contained in followpos(1).
Informal definition of functions (node n, string s):
firstpos ... positions which match the first symbol of s
lastpos ... positions which match the last symbol of s
nullable ... true if node n can create a language containing the empty string, false otherwise

59 Rules for nullable, firstpos, lastpos
Leaf n labeled ɛ: nullable = true; firstpos = ∅; lastpos = ∅
Leaf n labeled with position i: nullable = false; firstpos = {i}; lastpos = {i}
or-node c_1 | c_2: nullable = nullable(c_1) ∨ nullable(c_2); firstpos = firstpos(c_1) ∪ firstpos(c_2); lastpos = lastpos(c_1) ∪ lastpos(c_2)
cat-node c_1 c_2: nullable = nullable(c_1) ∧ nullable(c_2)
  firstpos = if nullable(c_1) then firstpos(c_1) ∪ firstpos(c_2) else firstpos(c_1)
  lastpos = if nullable(c_2) then lastpos(c_1) ∪ lastpos(c_2) else lastpos(c_2)
star-node c_1*: nullable = true; firstpos = firstpos(c_1); lastpos = lastpos(c_1)

60 firstpos, followpos Example
Each node of the syntax tree for (a|b)*abb# is written as: firstpos node lastpos
{1,2,3} cat {6}
  {1,2,3} cat {5}
    {1,2,3} cat {4}
      {1,2,3} cat {3}
        {1,2} star {1,2}
          {1,2} or {1,2}
            {1} a (position 1) {1}
            {2} b (position 2) {2}
        {3} a (position 3) {3}
      {4} b (position 4) {4}
    {5} b (position 5) {5}
  {6} # (position 6) {6}

61 Calculation of followpos
1 If n is a cat-node, c_1 is the left child, c_2 is the right child and i is a position in lastpos(c_1), then all positions of firstpos(c_2) are contained in followpos(i)
2 If n is a star-node with child c and i is contained in lastpos(n), then all positions of firstpos(c) are contained in followpos(i)
Node | followpos
1 | {1,2,3}
2 | {1,2,3}
3 | {4}
4 | {5}
5 | {6}
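The rules for nullable, firstpos, lastpos and followpos can be checked mechanically for (a|b)*abb#. In this sketch the syntax tree is encoded as nested tuples; that encoding and the names `leaf`, `TREE` and `analyse` are assumptions made for illustration.

```python
from collections import defaultdict


def leaf(symbol, position):
    return ("leaf", symbol, position)


# Syntax tree for (a|b)*abb#, positions 1..6 as on the slides.
TREE = ("cat",
        ("cat",
         ("cat",
          ("cat", ("star", ("or", leaf("a", 1), leaf("b", 2))), leaf("a", 3)),
          leaf("b", 4)),
         leaf("b", 5)),
        leaf("#", 6))


def analyse(node, followpos):
    """Return (nullable, firstpos, lastpos) of node; fill followpos on the way."""
    kind = node[0]
    if kind == "eps":  # leaf labeled ɛ
        return True, set(), set()
    if kind == "leaf":
        return False, {node[2]}, {node[2]}
    if kind == "or":
        n1, f1, l1 = analyse(node[1], followpos)
        n2, f2, l2 = analyse(node[2], followpos)
        return n1 or n2, f1 | f2, l1 | l2
    if kind == "cat":
        n1, f1, l1 = analyse(node[1], followpos)
        n2, f2, l2 = analyse(node[2], followpos)
        for i in l1:  # rule 1: lastpos(c_1) is followed by firstpos(c_2)
            followpos[i] |= f2
        first = f1 | f2 if n1 else f1
        last = l1 | l2 if n2 else l2
        return n1 and n2, first, last
    if kind == "star":
        _, f, l = analyse(node[1], followpos)
        for i in l:  # rule 2: lastpos(n) is followed by firstpos(c)
            followpos[i] |= f
        return True, f, l
    raise ValueError(f"unknown node kind {kind!r}")


followpos = defaultdict(set)
nullable, firstpos, lastpos = analyse(TREE, followpos)
```

The computed sets agree with the followpos table above; position 6 (the end marker #) has no followpos entry.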

62 followpos-graph → NFA
An NFA without ɛ-moves can be created based on a followpos-graph:
1 Mark all positions in firstpos of the root node of the syntax tree as initial states
2 Label each directed edge (i, j) with the symbol of position i
3 Mark the position which belongs to # as end state
The followpos-graph can be converted to a DFA (using subset construction)

63 DFA-algorithm
Input: Regular expression r
Output: DFA D, accepting the language L(r)
Initially, the only unmarked state in Dstates is firstpos(root)
while there is an unmarked state T ∈ Dstates do
    Mark T
    for each input symbol a do
        Let U be the set of positions that are in followpos(p) for some position p ∈ T where the symbol of p is a
        if U ≠ ∅ and U ∉ Dstates then
            Add U as unmarked state to Dstates
        end
        Dtran[T, a] := U
    end
end

64 PART 3 - SYNTAX ANALYSIS

65 Goals
Definition of the syntax of a programming language using context-free grammars
Methods for parsing programs: determine whether a program is syntactically correct
Advantages (of grammars):
  Precise, easily comprehensible language definition
  Automatic construction of parsers
  Declaration of the structure of a programming language (important for translation and error detection)
  Easy language extensions and modifications

66 Tasks
Diagram: source program → lexical analyser → token → parser → parse tree → rest of the front end → intermediate representation; the parser requests tokens via "get next token"; all components use the symbol table
Parser types:
  Universal parsers (inefficient)
  Top-down parsers
  Bottom-up parsers
  Only subclasses of grammars (LL, LR)
Collect token information
Type checking
Intermediate code generation

67 Syntax error handling
Error types:
  Lexical errors (spelling of a keyword)
  Syntactic errors (closing bracket is missing)
  Semantic errors (operand is incompatible with operator)
  Logic errors (infinite loop)
Tasks:
  Exact error description
  Error recovery: subsequent errors should be detectable
  Error correction should not slow down the processing of correct programs

68 Problems during error handling
Spurious errors: consecutive errors created by error recovery
Example:
  Compiler issues error recovery resulting in the removal of the declaration of pi
  Error during semantic analysis: pi undefined
Error is detected late in the process: the error message does not point to the correct position within the code
Too many error messages are issued

69 Error-recovery
Panic mode: Skip symbols until the input can be synchronized to a token
Phrase-level recovery: Local error corrections, e.g. replacement of "," by ";"
Error productions: Extension of the grammar to handle common errors
Global correction: Minimal correction of the program in order to find a matching derivation (cost intensive)

70 Grammars
A grammar is a 4-tuple G = (V_N, V_T, S, Φ) whereby:
V_N: Set of nonterminal symbols
V_T: Set of terminal symbols
S ∈ V_N: Start symbol
Φ ⊆ ((V_N ∪ V_T)* V_N (V_N ∪ V_T)*) × (V_N ∪ V_T)*: Set of production rules (rewriting rules)
(α, β) is represented as α → β
Example: ({S, A, Z}, {a, b, 1, 2}, S, {S → AZ, A → a, A → b, Z → ɛ, Z → 1, Z → 2, Z → ZZ})

71 Derivations in grammars
Direct derivation: Let σ, ψ ∈ (V_T ∪ V_N)*. σ can be directly derived from ψ (in one step; ψ ⇒ σ), if there are two strings φ_1, φ_2, so that σ = φ_1 β φ_2 and ψ = φ_1 α φ_2 and α → β ∈ Φ.
Example:
ψ | σ | Rule used | φ_1 | φ_2
S | AZ | S → AZ | ɛ | ɛ
aZ | a1 | Z → 1 | a | ɛ
AZZ | A2Z | Z → 2 | A | Z

72 Derivation
Production: A string ψ produces σ (ψ ⇒+ σ), if there are strings φ_0, ..., φ_n (n > 0), so that ψ = φ_0 ⇒ φ_1, φ_1 ⇒ φ_2, ..., φ_(n-1) ⇒ φ_n = σ.
Example: S ⇒ AZ ⇒ AZZ ⇒ A2Z ⇒ a2Z ⇒ a21
Reflexive, transitive closure: ψ ⇒* σ ⇔ ψ ⇒+ σ or ψ = σ
Accepted language: A grammar G accepts the following language: L(G) = {σ | S ⇒* σ, σ ∈ (V_T)*}

73 Parse trees
Example: E → E + E | E * E | id
2 derivations (and parse trees) for id+id*id:
Tree 1: E( E(id) + E( E(id) * E(id) ) )
Tree 2: E( E( E(id) + E(id) ) * E(id) )
The grammar is ambiguous

74 Classification of grammars
Chomsky (restriction of production rules α → β)
Unrestricted grammar: no restrictions
Context-sensitive grammar: |α| ≤ |β|
Context-free grammar: |α| ≤ |β| and α ∈ V_N
Regular grammar: |α| ≤ |β|, α ∈ V_N and β is of the form aB or a, whereby a ∈ V_T and B ∈ V_N

75 Grammar examples
Regular grammar for (a|b)*abb:
A_0 → aA_0 | bA_0 | aA_1
A_1 → bA_2
A_2 → bA_3
A_3 → ɛ
Context-sensitive grammars:
L_1 = {wcw | w ∈ (a|b)*}
But L_1' = {wcw^R | w ∈ (a|b)*} is context-free
L_2 = {a^n b^m c^n d^m | n ≥ 1, m ≥ 1}
L_3 = {a^n b^n c^n | n ≥ 1}

76 Conversions
Remove ambiguities
stmnt → if expr then stmnt | if expr then stmnt else stmnt | other
2 parse trees for if E_1 then if E_2 then S_1 else S_2:
Left tree: stmnt( if E_1 then stmnt( if E_2 then S_1 else S_2 ) )
Right tree: stmnt( if E_1 then stmnt( if E_2 then S_1 ) else S_2 )
Prefer the left tree: associate each else with the closest preceding then

77 Removing left recursions
A grammar is left-recursive if there is a nonterminal A with a derivation A ⇒+ Aα
Top-down parsing can't handle left recursions
Example: convert A → Aα | β to:
A → βA_1
A_1 → αA_1 | ɛ
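The transformation for immediate left recursion can be written down in a few lines. The sketch assumes a grammar representation that is not from the slides: right-hand sides are lists of symbol strings, the empty list stands for ɛ, and the fresh nonterminal is named by appending "1" (assumed not to clash with an existing name).

```python
def eliminate_immediate_left_recursion(A, productions):
    """Rewrite A → Aα | β as A → βA1, A1 → αA1 | ɛ.

    productions: list of right-hand sides (lists of symbols) of nonterminal A.
    Returns a dict mapping nonterminals to their new right-hand sides.
    """
    alphas = [rhs[1:] for rhs in productions if rhs and rhs[0] == A]
    betas = [rhs for rhs in productions if not rhs or rhs[0] != A]
    if not alphas:  # nothing to do
        return {A: productions}
    A1 = A + "1"
    return {
        A: [beta + [A1] for beta in betas],
        A1: [alpha + [A1] for alpha in alphas] + [[]],  # [] is ɛ
    }


# Hypothetical usage: E → E + T | T
result = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
```

Here `result` describes E → T E1 and E1 → + T E1 | ɛ. Indirect left recursion is not handled by this helper.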

78 Algorithm to eliminate left recursions
Input: Grammar G without cycles and ɛ-productions
Output: Grammar without left recursions
Arrange the nonterminals in some order A_1, A_2, ..., A_n
for i := 1 to n do
    for j := 1 to i-1 do
        Replace each production A_i → A_j γ by the productions A_i → δ_1 γ | ... | δ_k γ,
        where A_j → δ_1 | ... | δ_k are all the current A_j-productions
    end
    Eliminate the immediate left recursion among the A_i-productions
end

79 Left factoring
Important for predictive parsing
Elimination of alternative productions
Example: stmnt → if expr then stmnt else stmnt | if expr then stmnt
Solution:
For each nonterminal A find the longest prefix α common to two or more alternative productions
If α ≠ ɛ then replace all A-productions A → αβ_1 | αβ_2 | ... | αβ_n | γ (γ does not start with α) with:
A → αA_1 | γ
A_1 → β_1 | β_2 | ... | β_n
Apply the transformation until no prefixes α ≠ ɛ can be found

80 Top-down-parsing
Idea: Construct the parse tree for a given input, starting at the root node
Recursive-descent parsing (with backtracking)
Example: S → cAd, A → ab | a; matching of cad
[Figure: three stages of the parse tree for cad: (1) S with children c, A, d; (2) A expanded to a b; (3) A expanded to a]
Predictive parsing (without backtracking, special case of recursive-descent parsing)
Left-recursive grammars can lead to infinite loops!

81 Predictive parsers
Recursive-descent parser without backtracking
Possible if the production which needs to be used is obvious for each input symbol
Transition diagrams:
1 Remove left recursions
2 Left factoring
3 For each nonterminal A:
  1 Create an initial state and an end state
  2 For each production A → X_1 X_2 ... X_n create a path leading from the initial state to the end state, labeling the edges X_1, ..., X_n

82 Predictive parsers (II)
Processing:
1 Start at the initial state of the current start symbol
2 Suppose we are currently in the state s which has an edge whose label contains a terminal a and leads to the state t. If the next input symbol is a then go to state t and read a new input symbol.
3 Suppose the edge (from s) is marked by a nonterminal A. In that case go to the initial state of A (without reading a new input symbol). If we reach the end state of A then go to state t which is succeeding s.
4 If the edge is marked by ɛ then go directly to t without reading the input.
Easily implemented by recursive procedures

83 Example - Predictive parser
Grammar:
E → id E_1 | (E)
E_1 → op E | ɛ
E():
    if nexttoken = id then
        getnexttoken
        E1()
    if nexttoken = ( then
        getnexttoken
        E()
        if nexttoken = ) then accept
E1():
    if nexttoken = op then
        getnexttoken
        E()
    else return
[Figure: transition diagrams for E (edges labeled id, E1, (, E, )) and E1 (edges labeled op, E, ɛ)]
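The two procedures translate almost directly into a runnable recursive-descent parser. This sketch makes a few assumptions beyond the slide: tokens are given as a list of token kinds ("id", "op", "(", ")"), an end marker "$" is appended, errors are reported through a hypothetical `ParseError` exception, and the closing bracket is consumed explicitly.

```python
class ParseError(Exception):
    pass


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of the input
        self.pos = 0

    @property
    def next_token(self):
        return self.tokens[self.pos]

    def expect(self, kind):
        if self.next_token != kind:
            raise ParseError(f"expected {kind!r}, found {self.next_token!r}")
        self.pos += 1

    def E(self):
        # E → id E1 | ( E )
        if self.next_token == "id":
            self.expect("id")
            self.E1()
        elif self.next_token == "(":
            self.expect("(")
            self.E()
            self.expect(")")
        else:
            raise ParseError(f"unexpected {self.next_token!r}")

    def E1(self):
        # E1 → op E | ɛ
        if self.next_token == "op":
            self.expect("op")
            self.E()
        # otherwise choose E1 → ɛ and consume nothing


def accepts(tokens):
    parser = Parser(tokens)
    try:
        parser.E()
        parser.expect("$")  # the whole input must be consumed
    except ParseError:
        return False
    return True
```

For example, `accepts(["id", "op", "id"])` succeeds, while `accepts(["id", "op"])` fails because E cannot start with the end marker.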

84 Non-recursive predictive parser
[Figure: predictive parsing program connected to the INPUT buffer (a + b $), the STACK (X Y Z $), the parsing table M and the OUTPUT]
Input buffer: String to be parsed (terminated by a $)
Stack: Initialized with the start symbol; contains nonterminals which are not derived yet (terminated by a $)
Parsing table M(A, a): A is a nonterminal, a is a terminal or $

85 Top-down parsing with stack
Mode of operation: X is the top element of the stack, a the current input symbol
1 X is a terminal: If X = a = $, then the input was matched. If X = a ≠ $, pop X off the stack and read the next input symbol. Otherwise an error occurred.
2 X is a nonterminal: Fetch the entry M(X, a). If this entry is an error, skip to error recovery. Otherwise the entry is a production of the form X → UVW. Replace X on the stack with WVU (afterward U is the topmost element on the stack).

86 Example
Grammar:
E → id E_1 | (E)
E_1 → op E | ɛ
Parsing table M(X, a):
NONTERMINAL | id | op | ( | ) | $
E | E → id E_1 | | E → (E) | |
E_1 | | E_1 → op E | | E_1 → ɛ | E_1 → ɛ
Derivation of id op id.

87 Example (II)
STACK | INPUT | OUTPUT
$ E | id op id $ |
$ E_1 id | id op id $ | E → id E_1
$ E_1 | op id $ |
$ E op | op id $ | E_1 → op E
$ E | id $ |
$ E_1 id | id $ | E → id E_1
$ E_1 | $ |
$ | $ | E_1 → ɛ
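The stack trace above can be reproduced with a short table-driven parser. In this sketch the parsing table of the example grammar is hard-coded as a dictionary; the names `TABLE`, `NONTERMINALS` and `ll1_parse` are illustrative, and a missing table entry is treated as an error without any recovery.

```python
# Parsing table M(X, a) of the example grammar; [] stands for ɛ.
TABLE = {
    ("E", "id"): ["id", "E1"],
    ("E", "("): ["(", "E", ")"],
    ("E1", "op"): ["op", "E"],
    ("E1", ")"): [],
    ("E1", "$"): [],
}
NONTERMINALS = {"E", "E1"}


def ll1_parse(tokens):
    """Return the list of productions used, or None if the input is rejected."""
    inp = list(tokens) + ["$"]
    stack = ["$", "E"]  # top of the stack is the end of the list
    pos, output = 0, []
    while True:
        X, a = stack[-1], inp[pos]
        if X not in NONTERMINALS:  # X is a terminal or $
            if X != a:
                return None
            if X == "$":
                return output  # X = a = $: input matched
            stack.pop()
            pos += 1
        else:
            rhs = TABLE.get((X, a))
            if rhs is None:  # error entry
                return None
            stack.pop()
            stack.extend(reversed(rhs))  # push the right-hand side in reverse
            output.append((X, rhs))


trace = ll1_parse(["id", "op", "id"])
```

`trace` lists the same four productions as the OUTPUT column of the table above.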

88 FIRST and FOLLOW
Used when calculating the parse table
FIRST(α): Set of terminals that begin the strings which can be derived from α (α is a string of grammar symbols)
FOLLOW(A): Set of terminals which occur directly on the right side next to the nonterminal A in a derivation
If A is the rightmost element of a derivation, then $ is contained in FOLLOW(A)

89 Calculation of FIRST
FIRST(X) for a grammar symbol X:
1 X is a terminal: FIRST(X) = {X}
2 X → ɛ is a production: Add ɛ to FIRST(X)
3 X is a nonterminal and X → Y_1 Y_2 ... Y_k is a production: a is in FIRST(X) if:
  1 An i exists such that a is in FIRST(Y_i) and ɛ is in every set FIRST(Y_1), ..., FIRST(Y_(i-1))
  2 a = ɛ and ɛ is in every set FIRST(Y_1), ..., FIRST(Y_k)
FIRST(X_1 X_2 ... X_n):
Each non-ɛ symbol of FIRST(X_1) is in the result
If ɛ ∈ FIRST(X_1), then each non-ɛ symbol of FIRST(X_2) is in the result, and so on
If ɛ is in every FIRST set, then it is also contained in the result

90 Calculation of FOLLOW
To calculate FOLLOW(A) of a nonterminal A, apply the following rules until no set changes:
1. Add $ to FOLLOW(S), where S is the start symbol.
2. For each production A → αBβ, add all elements of FIRST(β) except ε to FOLLOW(B).
3. For each production A → αB, and for each production A → αBβ with ε ∈ FIRST(β), add each element of FOLLOW(A) to FOLLOW(B).

91 Example
Grammar:
  E  → id E1 | (E)
  E1 → op E | ε
FIRST sets:
  FIRST(E) = {id, (}
  FIRST(E1) = {op, ε}
FOLLOW sets:
  FOLLOW(E) = FOLLOW(E1) = {$, )}
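The FIRST and FOLLOW rules are usually applied as a fixpoint iteration: repeat until no set grows any more. The following Python sketch does this for the example grammar. The representation (a dictionary from nonterminal to a list of right-hand sides, with an empty list for the ε-production) and all function names are assumptions made for illustration.

```python
EPS = "ε"

GRAMMAR = {                      # nonterminal -> list of right-hand sides
    "E":  [["id", "E1"], ["(", "E", ")"]],
    "E1": [["op", "E"], []],     # [] is the ε-production
}
START = "E"


def first_of(symbols, first):
    """FIRST of a string of grammar symbols, given FIRST of the nonterminals."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}   # terminal: {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)              # every symbol can vanish (or the string is empty)
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:               # iterate until no FIRST set grows
        changed = False
        for nt, rhss in grammar.items():
            for rhs in rhss:
                new = first_of(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                                   # rule 1
    changed = True
    while changed:
        changed = False
        for a, rhss in grammar.items():
            for rhs in rhss:
                for i, b in enumerate(rhs):
                    if b not in grammar:                     # only nonterminals
                        continue
                    beta_first = first_of(rhs[i + 1:], first)
                    new = beta_first - {EPS}                 # rule 2
                    if EPS in beta_first:
                        new |= follow[a]                     # rule 3
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow
```

For the example grammar this reproduces the sets listed above.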

92 Construction of parsing tables
Input: Grammar G
Output: Parsing table M
1. For each production A → α do steps 2 and 3.
2. For each terminal a in FIRST(α), add A → α to M(A, a).
3. If ε is in FIRST(α), add A → α to M(A, b) for each terminal b in FOLLOW(A). If ε is in FIRST(α) and $ is in FOLLOW(A), add A → α to M(A, $).
4. Make each undefined entry of M an error.
Example: See the table of the last example grammar.

93 LL(1) Grammars
Parsing table construction can be used with arbitrary grammars.
Multiple elements per entry may occur.
LL(1) grammar: Grammar whose parsing table contains no multiple entries.
  L ... Scanning the input from LEFT to right
  L ... Producing the LEFTMOST derivation
  1 ... Using 1 input symbol of lookahead
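The table construction and the LL(1) test can be sketched as follows. The FIRST and FOLLOW sets of the example grammar are simply written down as constants here; the names build_table and is_ll1 and the representation of the table (a dictionary whose cells hold lists of right-hand sides) are assumptions of this sketch.

```python
EPS = "ε"

PRODUCTIONS = [                  # (left side, right side); () is the ε-production
    ("E", ("id", "E1")),
    ("E", ("(", "E", ")")),
    ("E1", ("op", "E")),
    ("E1", ()),
]
FIRST = {"E": {"id", "("}, "E1": {"op", EPS}}
FOLLOW = {"E": {"$", ")"}, "E1": {"$", ")"}}


def first_of(symbols):
    """FIRST of a right-hand side, using the precomputed FIRST sets."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})      # a terminal is its own FIRST set
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(productions):
    table = {}                   # (nonterminal, terminal) -> list of right sides
    for lhs, rhs in productions:
        f = first_of(rhs)
        targets = f - {EPS}                    # step 2
        if EPS in f:
            targets |= FOLLOW[lhs]             # step 3 (FOLLOW may contain $)
        for terminal in targets:
            table.setdefault((lhs, terminal), []).append(rhs)
    return table                 # step 4: cells that are absent mean "error"


def is_ll1(table):
    """A grammar is LL(1) if no cell holds more than one production."""
    return all(len(entries) == 1 for entries in table.values())
```

For the example grammar every cell receives exactly one production, so is_ll1(build_table(PRODUCTIONS)) holds.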

94 Properties of LL(1)
No ambiguous or left-recursive grammar is LL(1).
G is LL(1) ⇔ for each two different productions A → α | β it is necessary that:
1. No strings may be derived from both α and β which start with the same terminal a.
2. At most one of α and β may derive ε.
3. If β ⇒* ε, then α may not derive any string which starts with an element of FOLLOW(A).
Multiple entries in the parsing table can occasionally be removed by hand (without changing the language recognized by the automaton).

95 Error recovery in predictive parsing
Heuristics in panic-mode error recovery:
1. Initially, all symbols in FOLLOW(A) can be used for synchronization: Skip all tokens until an element of FOLLOW(A) is read, then remove A from the stack.
2. If FOLLOW sets don't suffice: Use the hierarchical structure of program constructs, e.g. add keywords that occur at the beginning of a statement to the synchronization set.
3. FIRST(A) can be used as well: If an element of FIRST(A) is read, continue parsing at A.
4. If a terminal which can't be matched is at the top of the stack, remove it.

96 Bottom-up parsing
Shift-reduce parsing
Reduction of an input towards the start symbol of the grammar.
Reduction step: Replace a substring which matches the right side of a production with the left side of that same production.
Example:
  S → aABe
  A → Abc | b
  B → d
  abbcde → aAbcde → aAde → aABe → S

97 Handles
Substring which matches the right side of a production and whose reduction is one step of a rightmost derivation in reverse.
Example (ambiguous grammar):
  E → E + E
  E → E * E
  E → (E)
  E → id
Rightmost derivation of id1 + id2 * id3 in reverse (the subscripts only distinguish the occurrences of id):
  Right-sentential form | Handle | Reducing production
  id1 + id2 * id3       | id1    | E → id
  E + id2 * id3         | id2    | E → id
  E + E * id3           | id3    | E → id
  E + E * E             | E * E  | E → E * E
  E + E                 | E + E  | E → E + E
  E                     |        |

98 Stack implementation
Initially:
  Stack | Input
  $     | w$
Shift n ≥ 0 symbols from the input onto the stack until a handle can be found.
Reduce the handle (replace the handle with the left side of the production).

99 Example shift-reduce parsing
       Stack          | Input           | Action
  (1)  $              | id + id * id $  | shift
  (2)  $ id           | + id * id $     | reduce by E → id
  (3)  $ E            | + id * id $     | shift
  (4)  $ E +          | id * id $       | shift
  (5)  $ E + id       | * id $          | reduce by E → id
  (6)  $ E + E        | * id $          | shift
  (7)  $ E + E *      | id $            | shift
  (8)  $ E + E * id   | $               | reduce by E → id
  (9)  $ E + E * E    | $               | reduce by E → E * E
  (10) $ E + E        | $               | reduce by E → E + E
  (11) $ E            | $               | accept

100 Viable prefixes, conflicts
Viable prefix: Prefix of a right-sentential form which can occur on the stack of a shift-reduce parser.
Conflicts (ambiguous grammars):
  stmt → if expr then stmt
       | if expr then stmt else stmt
       | other
Configuration:
  Stack                      | Input
  ... if expr then stmt      | else ...
No unambiguous handle (shift-reduce conflict).

101 LR parser
LR(k) parsing
  L ... Left-to-right scanning
  R ... Rightmost derivation in reverse
Advantages:
  Can be used for (nearly) every programming language construct
  Most general backtrack-free shift-reduce parsing method
  The class of LR grammars is larger than that of LL grammars
  LR parsers identify errors as early as possible
Disadvantage:
  An LR parser is hard to build manually

102 LR-parsing algorithm
[Figure: model of an LR parser. INPUT a_1 ... a_i ... a_n $; STACK s_0 ... X_{m-1} s_{m-1} X_m s_m (s_m on top); the LR parsing program consults the action and goto tables and produces the OUTPUT.]
The stack stores s_0 X_1 s_1 X_2 s_2 ... X_m s_m (X_i grammar symbol, s_i state).
Parsing table = action table and goto table.
s_m is the current state, a_i the current input symbol.
action[s_m, a_i] ∈ {shift, reduce, accept, error}
goto[s, A] (s state, A nonterminal): transition function of a DFA

103 LR-parsing mode of operation
Configuration: (s_0 X_1 s_1 ... X_m s_m, a_i a_{i+1} ... a_n)
The next step (move) is determined by reading a_i and depends on action[s_m, a_i]:
1. action[s_m, a_i] = shift s
   New configuration: (s_0 X_1 s_1 ... X_m s_m a_i s, a_{i+1} ... a_n)
2. action[s_m, a_i] = reduce A → β
   New configuration: (s_0 X_1 s_1 ... X_{m-r} s_{m-r} A s, a_i a_{i+1} ... a_n), where s = goto[s_{m-r}, A] and r is the length of β
3. action[s_m, a_i] = accept
4. action[s_m, a_i] = error

104 Example
  (1) E → E + T
  (2) E → T
  (3) T → T * F
  (4) T → F
  (5) F → (E)
  (6) F → id
  State | action                              | goto
        | id   +    *    (    )    $          | E   T   F
  0     | s5             s4                   | 1   2   3
  1     |      s6                  acc        |
  2     |      r2   s7        r2   r2         |
  3     |      r4   r4        r4   r4         |
  4     | s5             s4                   | 8   2   3
  5     |      r6   r6        r6   r6         |
  6     | s5             s4                   |     9   3
  7     | s5             s4                   |         10
  8     |      s6             s11             |
  9     |      r1   s7        r1   r1         |
  10    |      r3   r3        r3   r3         |
  11    |      r5   r5        r5   r5         |
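The moves of the LR parser can be sketched as a short driver loop in Python. The ACTION and GOTO dictionaries below are an assumed encoding of the standard SLR table for the expression grammar (1) to (6), with "sN" for shift to state N, "rN" for reduce by production N and "acc" for accept. As a simplification, the stack holds only states, since the grammar symbols are implied by them.

```python
PRODUCTIONS = {          # production number -> (left side, length of right side)
    1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1),
}
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def lr_parse(tokens):
    """Return the numbers of the productions used for reductions, in order."""
    stack = [0]                       # state stack, initial state 0
    tokens = list(tokens) + ["$"]
    pos = 0
    reductions = []
    while True:
        state, a = stack[-1], tokens[pos]
        act = ACTION[state].get(a)
        if act is None:               # empty table entry: error
            raise SyntaxError(f"unexpected {a} in state {state}")
        if act == "acc":
            return reductions
        if act[0] == "s":             # shift: push the new state, advance input
            stack.append(int(act[1:]))
            pos += 1
        else:                         # reduce A -> beta with r = |beta|
            number = int(act[1:])
            lhs, r = PRODUCTIONS[number]
            del stack[-r:]            # pop r states
            stack.append(GOTO[stack[-1]][lhs])
            reductions.append(number)
```

For the input id * id + id the driver reports the reductions 6, 4, 6, 3, 2, 6, 4, 1, which is a rightmost derivation in reverse.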

105 Construction of SLR parsing tables
LR(0) item: Production with a dot at one position of the right side.
Example: Production A → XYZ has 4 items: A → .XYZ, A → X.YZ, A → XY.Z and A → XYZ.
Exception: Production A → ε only has the item A → .
Augmented grammar: Grammar with new start symbol S' and production S' → S.
Functions: closure and goto
closure(I) (I ... set of items):
1. All items of I are in closure(I).
2. If A → α.Bβ is in closure(I) and B → γ is a production, then add B → .γ to closure(I).

106 Construction, goto
goto(I, X) with I a set of items and X a symbol of the grammar:
goto(I, X) = closure of the set of all items A → αX.β such that A → α.Xβ is in I
Example: I = {E' → E., E → E. + T}
  goto(I, +) = {E → E + .T, T → .T * F, T → .F, F → .(E), F → .id}
Sets-of-items construction (construction of all sets of LR(0) items):
  items(G'):
    I_0 = closure({S' → .S})
    C = {I_0}
    repeat
      for each set of items I ∈ C and each grammar symbol X
          such that goto(I, X) is not empty and not in C do
        add goto(I, X) to C
    until no more sets of items can be added to C
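closure, goto and the sets-of-items construction translate almost directly into Python. In this sketch an item is a tuple (left side, right side, dot position), which is a representation chosen here for illustration; the grammar is the augmented expression grammar with E' → E as production 0.

```python
GRAMMAR = [                     # production 0 is the augmenting production E' -> E
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

# An item is (lhs, rhs, dot); the dot sits in front of rhs[dot].


def closure(items):
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs2, rhs2 in GRAMMAR:
                if lhs2 == rhs[dot]:
                    item = (lhs2, rhs2, 0)          # B -> .gamma
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return frozenset(result)


def goto(items, symbol):
    moved = {(l, r, d + 1) for l, r, d in items if d < len(r) and r[d] == symbol}
    return closure(moved) if moved else frozenset()


def canonical_collection():
    start = closure({GRAMMAR[0] + (0,)})            # closure({E' -> .E})
    collection = [start]
    symbols = sorted({s for _, rhs in GRAMMAR for s in rhs})
    for item_set in collection:                     # the list grows while iterating
        for x in symbols:
            target = goto(item_set, x)
            if target and target not in collection:
                collection.append(target)
    return collection
```

The order in which the sets are discovered, and therefore their numbering, depends on the iteration order over the grammar symbols; only the collection as a whole is determined by the grammar.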

107 SLR parsing table
Input: Augmented grammar G'
Output: SLR parsing table
1. Calculate C = {I_0, I_1, ..., I_n}, the collection of sets of LR(0) items of G'.
2. State i is created from I_i as follows:
   1. If A → α.aβ is in I_i and goto(I_i, a) = I_j, then action[i, a] = shift j (a is a terminal symbol).
   2. If A → α. is in I_i, then action[i, a] = reduce A → α for all a ∈ FOLLOW(A), where A ≠ S'.
   3. If S' → S. is in I_i, then action[i, $] = accept.
3. For all nonterminal symbols A: goto[i, A] = j if goto(I_i, A) = I_j.
4. Every other table entry is set to error.
5. The initial state is the one created from the item set containing S' → .S.

108 SLR(1), conflicts, error handling
If the SLR parsing table algorithm yields a table without multiple entries, then the grammar is SLR(1).
Otherwise the algorithm fails and a more powerful method (like LR) needs to be used; this generally results in increased processing requirements.
Shift/reduce conflicts can be partially resolved. The process usually involves determining operator binding strength and associativity.
Error handling can be directly incorporated into the parsing table.

109 PART 4 - SYNTAX DIRECTED TRANSLATION

110 Setting
Translation of context-free languages
Information is attached to grammar symbols as attributes.
Values of attributes are defined by semantic rules.
2 possibilities:
  Syntax directed definitions (high-level specification)
  Translation schemes (implementation details)
Evaluation: (1) Parse input, (2) Generate parse tree, (3) Evaluate parse tree

111 Syntax directed definitions
Generalization of context-free grammars
Each grammar symbol has a set of attributes.
Synthesized vs. inherited attributes
Attribute: string, number, type, memory location, ...
The value of an attribute is defined by semantic rules:
  Synthesized: computed from the attribute values of the child nodes in the parse tree
  Inherited: computed from the attribute values of the parent node (and siblings) in the parse tree
Semantic rules define dependencies between attributes.
The dependency graph defines the calculation order of the semantic rules.
Semantic rules can have side effects.

112 Form of a syntax directed definition
Grammar production: A → α
Associated semantic rule: b := f(c_1, ..., c_k), where f is a function
Synthesized: b is a synthesized attribute of A, and c_1, ..., c_k are attributes of grammar symbols of the production.
Inherited: b is an inherited attribute of a grammar symbol on the right side of the production, and c_1, ..., c_k are attributes of grammar symbols of the production.
b depends on c_1, ..., c_k.

113 Example
"Calculator" program: val is a synthesized attribute for the nonterminals E, T and F.
  Production    | Semantic Rule
  L → E n       | print(E.val)
  E → E1 + T    | E.val := E1.val + T.val
  E → T         | E.val := T.val
  T → T1 * F    | T.val := T1.val * F.val
  T → F         | T.val := F.val
  F → (E)       | F.val := E.val
  F → digit     | F.val := digit.lexval

114 S-attributed grammar
Attributed grammar exclusively using synthesized attributes
Example evaluation: 3*5+4n (annotated parse tree)
  L
    E.val=19
      E.val=15
        T.val=15
          T.val=3
            F.val=3
              digit.lexval=3
          *
          F.val=5
            digit.lexval=5
      +
      T.val=4
        F.val=4
          digit.lexval=4
    n

115 Inherited attributes
Definition of dependencies between programming language constructs and their context
Example (type checking):
  Production    | Semantic Rule
  D → T L       | L.in := T.type
  T → int       | T.type := integer
  T → real      | T.type := real
  L → L1 , id   | L1.in := L.in
                | addtype(id.entry, L.in)
  L → id        | addtype(id.entry, L.in)

116 Inherited attributes
Annotated parse tree for real id1, id2, id3:
  D
    T.type=real
      real
    L.in=real
      L.in=real
        L.in=real
          id1
        ,
        id2
      ,
      id3

117 Dependency graphs
Show dependencies between attributes.
Each rule is represented in the form b := f(c_1, ..., c_k).
Nodes correspond to attributes, edges to dependencies.
Definition:
  for each node n in the parse tree do
    for each attribute a of the grammar symbol at n do
      construct a node in the dependency graph for a
  for each node n in the parse tree do
    for each semantic rule b := f(c_1, ..., c_k) associated with the production used at n do
      for i := 1 to k do
        construct an edge from the node for c_i to the node for b

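118 Dependency graph
Example:
[Figure: dependency graph drawn over the parse tree of real id1, id2, id3. Nodes 1, 2, 3: entry of id1, id2, id3; node 4: T.type; nodes 5, 7, 9: the in attributes of the three L nodes (from top to bottom); nodes 6, 8, 10: the addtype rules at the three L nodes.]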

119 Topological sort
Ordering m_1, ..., m_k of the nodes of a directed acyclic graph such that every edge points from an earlier node to a later node.
If m_i → m_j is an edge, then the node m_i comes before the node m_j.
Important for the order in which the attributes are calculated.
Example (cont.):
1. a_4 := real
2. a_5 := a_4
3. addtype(id3.entry, a_5)
4. a_7 := a_5
5. addtype(id2.entry, a_7)
6. a_9 := a_7
7. addtype(id1.entry, a_9)
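One common way to obtain such an order is Kahn's algorithm, sketched below in Python. The edge list is an assumed reading of the dependency graph of the type declaration example (node numbers as in the evaluation order above, with nodes 6, 8 and 10 standing for the three addtype calls); the function name topological_order is made up for this sketch.

```python
from collections import deque

# Assumed edges of the example dependency graph: (from node, to node).
EDGES = [(4, 5), (5, 6), (3, 6), (5, 7), (7, 8), (2, 8), (7, 9), (9, 10), (1, 10)]


def topological_order(nodes, edges):
    """Kahn's algorithm: repeatedly output a node without unprocessed predecessors."""
    successors = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for m in successors[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    if len(order) != len(nodes):
        raise ValueError("dependency graph has a cycle")
    return order


ORDER = topological_order(range(1, 11), EDGES)
```

A topological order is generally not unique; any order in which every edge points forward is a valid evaluation order for the semantic rules.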

120 Example - syntax trees
Abstract syntax tree = simplified form of a parse tree
Operators and keywords do not appear as leaf nodes; they are attached to the interior nodes.
Chains of productions with only one element can be collapsed.
Examples:
[Figure: syntax tree with root if-then-else and children B, S1, S2; syntax tree of an arithmetic expression with the operator nodes + and *.]

121 Syntax trees: Expressions
Functions (return value: pointer to new node):
  mknode(op, left, right): node with label op and 2 child nodes left, right
  mkleaf(id, entry): leaf id with symbol table entry entry
  mkleaf(num, val): leaf num with value val
Syntax directed definition:
  Production    | Semantic Rule
  E → E1 + T    | E.nptr := mknode('+', E1.nptr, T.nptr)
  E → E1 - T    | E.nptr := mknode('-', E1.nptr, T.nptr)
  E → T         | E.nptr := T.nptr
  T → (E)       | T.nptr := E.nptr
  T → id        | T.nptr := mkleaf(id, id.entry)
  T → num       | T.nptr := mkleaf(num, num.val)

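122 Syntax trees: Expressions (ex.)
Syntax tree for a-4+c
[Figure: annotated parse tree for a-4+c whose nptr attributes point into the syntax tree. The syntax tree has root +, whose left child is the node - (with the leaves id, pointing to the entry for a, and num 4) and whose right child is the leaf id, pointing to the entry for c.]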
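The bottom-up construction of this syntax tree can be sketched with two small Python classes. Leaf and Node are hypothetical stand-ins for the node records behind mknode and mkleaf, and symbol table entries are represented simply by the identifier names.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    kind: str            # "id" or "num"
    value: object        # symbol table entry (here: a name) or numeric value


@dataclass(frozen=True)
class Node:
    op: str              # operator label, e.g. "+" or "-"
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def mkleaf(kind, value):
    return Leaf(kind, value)


def mknode(op, left, right):
    return Node(op, left, right)


# Calls in the order in which a bottom-up parser would execute the rules for a-4+c
p1 = mkleaf("id", "a")       # T -> id
p2 = mkleaf("num", 4)        # T -> num
p3 = mknode("-", p1, p2)     # E -> E1 - T
p4 = mkleaf("id", "c")       # T -> id
p5 = mknode("+", p3, p4)     # E -> E1 + T
```

p5 is the root of the tree: because the grammar is left recursive, a-4 is grouped first and becomes the left operand of +.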

123 Evaluation of S-attributed definitions
Attributed definition exclusively using synthesized attributes
Evaluation using a bottom-up parser (LR parser)
Idea: store attribute information on the stack
         State | Val
         ...   | ...
         X     | X.x
         Y     | Y.y
  top →  Z     | Z.z
Semantic rule: A.a := f(X.x, Y.y, Z.z)
Production: A → XYZ
Before XYZ is reduced to A, the value of Z.z is stored in val[top], Y.y in val[top-1] and X.x in val[top-2].

124 Example - S-attributed evaluation
"Calculator" example:
  Production    | Code Fragment
  L → E n       | print(val[top-1])
  E → E1 + T    | val[ntop] := val[top-2] + val[top]
  E → T         |
  T → T1 * F    | val[ntop] := val[top-2] * val[top]
  T → F         |
  F → (E)       | val[ntop] := val[top-1]
  F → digit     |
The code fragment is executed before the reduction, with ntop = top - r + 1 (r is the length of the right side); after the reduction: top := ntop.

125 Result for 3*5+4n
(_ marks a stack position without an attribute value)
  Input     | state   | val      | Production used
  3*5+4n    |         |          |
  *5+4n     | 3       | 3        |
  *5+4n     | F       | 3        | F → digit
  *5+4n     | T       | 3        | T → F
  5+4n      | T *     | 3 _      |
  +4n       | T * 5   | 3 _ 5    |
  +4n       | T * F   | 3 _ 5    | F → digit
  +4n       | T       | 15       | T → T * F
  +4n       | E       | 15       | E → T
  4n        | E +     | 15 _     |
  n         | E + 4   | 15 _ 4   |
  n         | E + F   | 15 _ 4   | F → digit
  n         | E + T   | 15 _ 4   | T → F
  n         | E       | 19       | E → E + T
            | E n     | 19 _     |
            | L       | 19       | L → E n
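The effect of the code fragments on the value stack can be replayed without a full LR parser by feeding in the sequence of shift and reduce moves by hand. In the Python sketch below, RULES, MOVES and evaluate are illustrative names; the fragment for L → E n returns the value instead of printing it, and None plays the role of a stack position without a value.

```python
RULES = {   # production -> (length r of the right side, code fragment or None)
    "L -> E n":   (2, lambda val, top: val[top - 1]),           # stands in for print
    "E -> E + T": (3, lambda val, top: val[top - 2] + val[top]),
    "E -> T":     (1, None),
    "T -> T * F": (3, lambda val, top: val[top - 2] * val[top]),
    "T -> F":     (1, None),
    "F -> ( E )": (3, lambda val, top: val[top - 1]),
    "F -> digit": (1, None),
}


def evaluate(moves):
    """Replay shift/reduce moves on the value stack and return the final value."""
    val = []
    for kind, arg in moves:
        if kind == "shift":
            val.append(arg)                  # lexval for digits, None otherwise
        else:
            r, fragment = RULES[arg]
            top = len(val) - 1
            ntop = top - r + 1
            if fragment is not None:
                val[ntop] = fragment(val, top)   # executed before the reduction
            del val[ntop + 1:]                   # after the reduction: top := ntop
    return val[-1]


# Moves of the parser for the input 3*5+4n
MOVES = [
    ("shift", 3), ("reduce", "F -> digit"), ("reduce", "T -> F"),
    ("shift", None), ("shift", 5), ("reduce", "F -> digit"),
    ("reduce", "T -> T * F"), ("reduce", "E -> T"),
    ("shift", None), ("shift", 4), ("reduce", "F -> digit"),
    ("reduce", "T -> F"), ("reduce", "E -> E + T"),
    ("shift", None), ("reduce", "L -> E n"),
]
```

evaluate(MOVES) ends with the single value 19 on the stack, matching the last row of the table.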

126 L-attributed definitions
Definition: A syntax directed definition is L-attributed if each inherited attribute of X_j, 1 ≤ j ≤ n, on the right side of A → X_1 ... X_n depends only on:
1. the attributes of the symbols X_1, ..., X_{j-1} to the left of X_j, and
2. the inherited attributes of A.
Each S-attributed grammar is an L-attributed grammar.
Evaluation in depth-first order:
  procedure dfvisit(n: node)
    for each child m of n, from left to right do
      evaluate inherited attributes of m
      dfvisit(m)
    end
    evaluate synthesized attributes of n
  end
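The depth-first evaluation can be illustrated with the type declaration example (D → T L). The Python sketch below hard-codes the semantic rules of that one definition; the Node class, the SYMBOL_TABLE dictionary standing in for addtype, and the helper names inherit and synthesize are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    symbol: str
    children: list = field(default_factory=list)
    attrs: dict = field(default_factory=dict)


SYMBOL_TABLE = {}                              # filled by the addtype rules
TYPE_NAMES = {"int": "integer", "real": "real"}


def inherit(parent, index):
    """Evaluate the inherited attributes of parent.children[index]."""
    child = parent.children[index]
    if parent.symbol == "D" and child.symbol == "L":
        child.attrs["in"] = parent.children[0].attrs["type"]   # L.in := T.type
    elif parent.symbol == "L" and child.symbol == "L":
        child.attrs["in"] = parent.attrs["in"]                  # L1.in := L.in


def synthesize(node):
    """Evaluate synthesized attributes and side-effect rules of node."""
    if node.symbol == "T":
        node.attrs["type"] = TYPE_NAMES[node.children[0].symbol]
    elif node.symbol == "L":
        ident = node.children[-1]                               # id is the last child
        SYMBOL_TABLE[ident.attrs["entry"]] = node.attrs["in"]   # addtype(id.entry, L.in)


def dfvisit(node):
    for i, child in enumerate(node.children):   # children from left to right
        inherit(node, i)
        dfvisit(child)
    synthesize(node)


# Parse tree for: real id1, id2, id3
tree = Node("D", [
    Node("T", [Node("real")]),
    Node("L", [
        Node("L", [
            Node("L", [Node("id", attrs={"entry": "id1"})]),
            Node(","),
            Node("id", attrs={"entry": "id2"}),
        ]),
        Node(","),
        Node("id", attrs={"entry": "id3"}),
    ]),
])
dfvisit(tree)
```

After the traversal, SYMBOL_TABLE maps id1, id2 and id3 to real. The order works because T.type is computed before the sibling L is visited, which is exactly what the L-attributed restriction guarantees.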

127 Translation schemes
Translation scheme = context-free grammar with attributes for the grammar symbols and with semantic actions, which are placed on the right side of a production between grammar symbols and are enclosed in {}.
Example: T → T1 * F {T.val := T1.val * F.val}
If only synthesized attributes are used, the action is always placed at the end of the right side of a production.
Note: Actions may not access attributes which have not been calculated yet (this limits the positions of semantic actions).

128 Translation schemes (cont.)
If both inherited and synthesized attributes are used, the following needs to be taken into consideration:
1. An inherited attribute of a symbol on the right side of a production has to be calculated in an action which is positioned to the left of the symbol.
2. An action may not reference a synthesized attribute belonging to a symbol which is positioned to the right of the action.
3. A synthesized attribute of the nonterminal on the left side can only be calculated if all referenced attributes have already been calculated. Actions like these are usually placed at the end of the right side.


More information

Parser Generation. Bottom-Up Parsing. Constructing LR Parser. LR Parsing. Construct parse tree bottom-up --- from leaves to the root

Parser Generation. Bottom-Up Parsing. Constructing LR Parser. LR Parsing. Construct parse tree bottom-up --- from leaves to the root Parser Generation Main Problem: given a grammar G, how to build a top-down parser or a bottom-up parser for it? parser : a program that, given a sentence, reconstructs a derivation for that sentence ----

More information

Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 4. Y.N. Srikant

Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 4. Y.N. Srikant Syntax Analysis: Context-free Grammars, Pushdown Automata and Part - 4 Department of Computer Science and Automation Indian Institute of Science Bangalore 560 012 NPTEL Course on Principles of Compiler

More information

3. Parsing. Oscar Nierstrasz

3. Parsing. Oscar Nierstrasz 3. Parsing Oscar Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes. http://www.cs.ucla.edu/~palsberg/ http://www.cs.purdue.edu/homes/hosking/

More information

CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1

CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1 CSE P 501 Compilers LR Parsing Hal Perkins Spring 2018 UW CSE P 501 Spring 2018 D-1 Agenda LR Parsing Table-driven Parsers Parser States Shift-Reduce and Reduce-Reduce conflicts UW CSE P 501 Spring 2018

More information

Compilerconstructie. najaar Rudy van Vliet kamer 140 Snellius, tel rvvliet(at)liacs(dot)nl. college 3, vrijdag 22 september 2017

Compilerconstructie. najaar Rudy van Vliet kamer 140 Snellius, tel rvvliet(at)liacs(dot)nl. college 3, vrijdag 22 september 2017 Compilerconstructie najaar 2017 http://www.liacs.leidenuniv.nl/~vlietrvan1/coco/ Rudy van Vliet kamer 140 Snellius, tel. 071-527 2876 rvvliet(at)liacs(dot)nl college 3, vrijdag 22 september 2017 + werkcollege

More information

VETRI VINAYAHA COLLEGE OF ENGINEERING AND TECHNOLOGY

VETRI VINAYAHA COLLEGE OF ENGINEERING AND TECHNOLOGY VETRI VINAYAHA COLLEGE OF ENGINEERING AND TECHNOLOGY DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING CS6660 COMPILER DESIGN III year/ VI sem CSE (Regulation 2013) UNIT I -INTRODUCTION TO COMPILER PART A

More information

Principles of Compiler Design Presented by, R.Venkadeshan,M.Tech-IT, Lecturer /CSE Dept, Chettinad College of Engineering &Technology

Principles of Compiler Design Presented by, R.Venkadeshan,M.Tech-IT, Lecturer /CSE Dept, Chettinad College of Engineering &Technology Principles of Compiler Design Presented by, R.Venkadeshan,M.Tech-IT, Lecturer /CSE Dept, Chettinad College of Engineering &Technology 6/30/2010 Principles of Compiler Design R.Venkadeshan 1 Preliminaries

More information

Monday, September 13, Parsers

Monday, September 13, Parsers Parsers Agenda Terminology LL(1) Parsers Overview of LR Parsing Terminology Grammar G = (Vt, Vn, S, P) Vt is the set of terminals Vn is the set of non-terminals S is the start symbol P is the set of productions

More information

CSE 401 Compilers. LR Parsing Hal Perkins Autumn /10/ Hal Perkins & UW CSE D-1

CSE 401 Compilers. LR Parsing Hal Perkins Autumn /10/ Hal Perkins & UW CSE D-1 CSE 401 Compilers LR Parsing Hal Perkins Autumn 2011 10/10/2011 2002-11 Hal Perkins & UW CSE D-1 Agenda LR Parsing Table-driven Parsers Parser States Shift-Reduce and Reduce-Reduce conflicts 10/10/2011

More information

LR Parsing LALR Parser Generators

LR Parsing LALR Parser Generators LR Parsing LALR Parser Generators Outline Review of bottom-up parsing Computing the parsing DFA Using parser generators 2 Bottom-up Parsing (Review) A bottom-up parser rewrites the input string to the

More information

SYNTAX ANALYSIS 1. Define parser. Hierarchical analysis is one in which the tokens are grouped hierarchically into nested collections with collective meaning. Also termed as Parsing. 2. Mention the basic

More information

4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

More information

Introduction to Syntax Analysis

Introduction to Syntax Analysis Compiler Design 1 Introduction to Syntax Analysis Compiler Design 2 Syntax Analysis The syntactic or the structural correctness of a program is checked during the syntax analysis phase of compilation.

More information

Syntax-Directed Translation

Syntax-Directed Translation Syntax-Directed Translation 1 Syntax-Directed Translation 2 Syntax-Directed Translation 3 Syntax-Directed Translation In a syntax-directed definition, each production A α is associated with a set of semantic

More information

PSD3A Principles of Compiler Design Unit : I-V. PSD3A- Principles of Compiler Design

PSD3A Principles of Compiler Design Unit : I-V. PSD3A- Principles of Compiler Design PSD3A Principles of Compiler Design Unit : I-V 1 UNIT I - SYLLABUS Compiler Assembler Language Processing System Phases of Compiler Lexical Analyser Finite Automata NFA DFA Compiler Tools 2 Compiler -

More information

Compilers. Bottom-up Parsing. (original slides by Sam

Compilers. Bottom-up Parsing. (original slides by Sam Compilers Bottom-up Parsing Yannis Smaragdakis U Athens Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Bottom-Up Parsing More general than top-down parsing And just as efficient Builds

More information

UNIT III & IV. Bottom up parsing

UNIT III & IV. Bottom up parsing UNIT III & IV Bottom up parsing 5.0 Introduction Given a grammar and a sentence belonging to that grammar, if we have to show that the given sentence belongs to the given grammar, there are two methods.

More information

Lexical Analysis (ASU Ch 3, Fig 3.1)

Lexical Analysis (ASU Ch 3, Fig 3.1) Lexical Analysis (ASU Ch 3, Fig 3.1) Implementation by hand automatically ((F)Lex) Lex generates a finite automaton recogniser uses regular expressions Tasks remove white space (ws) display source program

More information

Part 5 Program Analysis Principles and Techniques

Part 5 Program Analysis Principles and Techniques 1 Part 5 Program Analysis Principles and Techniques Front end 2 source code scanner tokens parser il errors Responsibilities: Recognize legal programs Report errors Produce il Preliminary storage map Shape

More information

Chapter 3 Lexical Analysis

Chapter 3 Lexical Analysis Chapter 3 Lexical Analysis Outline Role of lexical analyzer Specification of tokens Recognition of tokens Lexical analyzer generator Finite automata Design of lexical analyzer generator The role of lexical

More information

Introduction to Syntax Analysis. The Second Phase of Front-End

Introduction to Syntax Analysis. The Second Phase of Front-End Compiler Design IIIT Kalyani, WB 1 Introduction to Syntax Analysis The Second Phase of Front-End Compiler Design IIIT Kalyani, WB 2 Syntax Analysis The syntactic or the structural correctness of a program

More information

A programming language requires two major definitions A simple one pass compiler

A programming language requires two major definitions A simple one pass compiler A programming language requires two major definitions A simple one pass compiler [Syntax: what the language looks like A context-free grammar written in BNF (Backus-Naur Form) usually suffices. [Semantics:

More information

Lecture 7: Deterministic Bottom-Up Parsing

Lecture 7: Deterministic Bottom-Up Parsing Lecture 7: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Tue Sep 20 12:50:42 2011 CS164: Lecture #7 1 Avoiding nondeterministic choice: LR We ve been looking at general

More information

Lecture 8: Deterministic Bottom-Up Parsing

Lecture 8: Deterministic Bottom-Up Parsing Lecture 8: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Fri Feb 12 13:02:57 2010 CS164: Lecture #8 1 Avoiding nondeterministic choice: LR We ve been looking at general

More information

MODULE 14 SLR PARSER LR(0) ITEMS

MODULE 14 SLR PARSER LR(0) ITEMS MODULE 14 SLR PARSER LR(0) ITEMS In this module we shall discuss one of the LR type parser namely SLR parser. The various steps involved in the SLR parser will be discussed with a focus on the construction

More information

Formal Languages and Compilers Lecture IV: Regular Languages and Finite. Finite Automata

Formal Languages and Compilers Lecture IV: Regular Languages and Finite. Finite Automata Formal Languages and Compilers Lecture IV: Regular Languages and Finite Automata Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/

More information

LR Parsing Techniques

LR Parsing Techniques LR Parsing Techniques Bottom-Up Parsing - LR: a special form of BU Parser LR Parsing as Handle Pruning Shift-Reduce Parser (LR Implementation) LR(k) Parsing Model - k lookaheads to determine next action

More information

Concepts Introduced in Chapter 3. Lexical Analysis. Lexical Analysis Terms. Attributes for Tokens

Concepts Introduced in Chapter 3. Lexical Analysis. Lexical Analysis Terms. Attributes for Tokens Concepts Introduced in Chapter 3 Lexical Analysis Regular Expressions (REs) Nondeterministic Finite Automata (NFA) Converting an RE to an NFA Deterministic Finite Automatic (DFA) Lexical Analysis Why separate

More information

UNIT II LEXICAL ANALYSIS

UNIT II LEXICAL ANALYSIS UNIT II LEXICAL ANALYSIS 2 Marks 1. What are the issues in lexical analysis? Simpler design Compiler efficiency is improved Compiler portability is enhanced. 2. Define patterns/lexeme/tokens? This set

More information

Compiler course. Chapter 3 Lexical Analysis

Compiler course. Chapter 3 Lexical Analysis Compiler course Chapter 3 Lexical Analysis 1 A. A. Pourhaji Kazem, Spring 2009 Outline Role of lexical analyzer Specification of tokens Recognition of tokens Lexical analyzer generator Finite automata

More information

1. INTRODUCTION TO LANGUAGE PROCESSING The Language Processing System can be represented as shown figure below.

1. INTRODUCTION TO LANGUAGE PROCESSING The Language Processing System can be represented as shown figure below. UNIT I Translator: It is a program that translates one language to another Language. Examples of translator are compiler, assembler, interpreter, linker, loader and preprocessor. Source Code Translator

More information

General Overview of Compiler

General Overview of Compiler General Overview of Compiler Compiler: - It is a complex program by which we convert any high level programming language (source code) into machine readable code. Interpreter: - It performs the same task

More information

4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

More information

Bottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S.

Bottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S. Bottom up parsing Construct a parse tree for an input string beginning at leaves and going towards root OR Reduce a string w of input to start symbol of grammar Consider a grammar S aabe A Abc b B d And

More information

Syntax-Directed Translation

Syntax-Directed Translation Syntax-Directed Translation What is syntax-directed translation? The compilation process is driven by the syntax. The semantic routines perform interpretation based on the syntax structure. Attaching attributes

More information

LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States

LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States TDDD16 Compilers and Interpreters TDDB44 Compiler Construction LR Parsing, Part 2 Constructing Parse Tables Parse table construction Grammar conflict handling Categories of LR Grammars and Parsers An NFA

More information

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis Chapter 4 Lexical and Syntax Analysis Introduction - Language implementation systems must analyze source code, regardless of the specific implementation approach - Nearly all syntax analysis is based on

More information

Gujarat Technological University Sankalchand Patel College of Engineering, Visnagar B.E. Semester VII (CE) July-Nov Compiler Design (170701)

Gujarat Technological University Sankalchand Patel College of Engineering, Visnagar B.E. Semester VII (CE) July-Nov Compiler Design (170701) Gujarat Technological University Sankalchand Patel College of Engineering, Visnagar B.E. Semester VII (CE) July-Nov 2014 Compiler Design (170701) Question Bank / Assignment Unit 1: INTRODUCTION TO COMPILING

More information

Lexical and Syntax Analysis. Bottom-Up Parsing

Lexical and Syntax Analysis. Bottom-Up Parsing Lexical and Syntax Analysis Bottom-Up Parsing Parsing There are two ways to construct derivation of a grammar. Top-Down: begin with start symbol; repeatedly replace an instance of a production s LHS with

More information

Bottom-Up Parsing. Lecture 11-12

Bottom-Up Parsing. Lecture 11-12 Bottom-Up Parsing Lecture 11-12 (From slides by G. Necula & R. Bodik) 2/20/08 Prof. Hilfinger CS164 Lecture 11 1 Administrivia Test I during class on 10 March. 2/20/08 Prof. Hilfinger CS164 Lecture 11

More information

Unit 13. Compiler Design

Unit 13. Compiler Design Unit 13. Compiler Design Computers are a balanced mix of software and hardware. Hardware is just a piece of mechanical device and its functions are being controlled by a compatible software. Hardware understands

More information

2068 (I) Attempt all questions.

2068 (I) Attempt all questions. 2068 (I) 1. What do you mean by compiler? How source program analyzed? Explain in brief. 2. Discuss the role of symbol table in compiler design. 3. Convert the regular expression 0 + (1 + 0)* 00 first

More information

Syntax Analysis. Prof. James L. Frankel Harvard University. Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved.

Syntax Analysis. Prof. James L. Frankel Harvard University. Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved. Syntax Analysis Prof. James L. Frankel Harvard University Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved. Context-Free Grammar (CFG) terminals non-terminals start

More information

LR Parsing LALR Parser Generators

LR Parsing LALR Parser Generators Outline LR Parsing LALR Parser Generators Review of bottom-up parsing Computing the parsing DFA Using parser generators 2 Bottom-up Parsing (Review) A bottom-up parser rewrites the input string to the

More information

Earlier edition Dragon book has been revised. Course Outline Contact Room 124, tel , rvvliet(at)liacs(dot)nl

Earlier edition Dragon book has been revised. Course Outline Contact Room 124, tel , rvvliet(at)liacs(dot)nl Compilerconstructie najaar 2013 http://www.liacs.nl/home/rvvliet/coco/ Rudy van Vliet kamer 124 Snellius, tel. 071-527 5777 rvvliet(at)liacs(dot)nl college 1, dinsdag 3 september 2013 Overview 1 Why this

More information

PART 4 - SYNTAX DIRECTED TRANSLATION. F. Wotawa TU Graz) Compiler Construction Summer term / 264

PART 4 - SYNTAX DIRECTED TRANSLATION. F. Wotawa TU Graz) Compiler Construction Summer term / 264 PART 4 - SYNTAX DIRECTED TRANSLATION F. Wotawa (IST @ TU Graz) Compiler Construction Summer term 2015 109 / 264 Setting Translation of context-free languages Information attributes of grammar symbols Values

More information

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! Any questions about the syllabus?! Course Material available at www.cs.unic.ac.cy/ioanna! Next time reading assignment [ALSU07]

More information

COMPILER DESIGN. For COMPUTER SCIENCE

COMPILER DESIGN. For COMPUTER SCIENCE COMPILER DESIGN For COMPUTER SCIENCE . COMPILER DESIGN SYLLABUS Lexical analysis, parsing, syntax-directed translation. Runtime environments. Intermediate code generation. ANALYSIS OF GATE PAPERS Exam

More information

COMPILER DESIGN - QUICK GUIDE COMPILER DESIGN - OVERVIEW

COMPILER DESIGN - QUICK GUIDE COMPILER DESIGN - OVERVIEW COMPILER DESIGN - QUICK GUIDE http://www.tutorialspoint.com/compiler_design/compiler_design_quick_guide.htm COMPILER DESIGN - OVERVIEW Copyright tutorialspoint.com Computers are a balanced mix of software

More information

Bottom-Up Parsing II (Different types of Shift-Reduce Conflicts) Lecture 10. Prof. Aiken (Modified by Professor Vijay Ganesh.

Bottom-Up Parsing II (Different types of Shift-Reduce Conflicts) Lecture 10. Prof. Aiken (Modified by Professor Vijay Ganesh. Bottom-Up Parsing II Different types of Shift-Reduce Conflicts) Lecture 10 Ganesh. Lecture 10) 1 Review: Bottom-Up Parsing Bottom-up parsing is more general than topdown parsing And just as efficient Doesn

More information