Topic 3: Syntax Analysis I


1 Topic 3: Syntax Analysis I Compiler Design Prof. Hanjun Kim CoreLab (Compiler Research Lab) POSTECH 1

2 The Front End
  Front-End: Source Program -> Lexical Analysis -> Syntax Analysis -> Semantic Analysis -> IR Code Generation -> Intermediate Representation
  Back-End: Intermediate Representation -> IR Optimization -> Target Code Generation -> Target Code Optimization -> Target Program
  Lexical Analysis: break into tokens (think words, punctuation)
  Syntax Analysis: parse phrase structure (think document, paragraphs, sentences)
  Semantic Analysis: calculate meaning

3 Parser in the Front-End
  Source -> Lexer -> Stream of Tokens -> Parser -> Abstract Syntax Tree -> FE -> IR
  Parser functions:
    Verify that the token stream is valid
    If it is not valid, report the syntax error and recover
    Build the Abstract Syntax Tree (AST)

4 Analogy to English Parsing Understanding sentence structure Check grammar Ex: This line is a longer sentence article noun verb article adjective noun subject complement sentence

5 Syntax Analysis (Parsing)
  A process that verifies that the token stream is valid
  Checks the grammar of the programming language
  Ex: if a < b then c = 1 else c = 2
    expression: ID LT ID
    statement: ID ASSIGN NUM
    statement: ID ASSIGN NUM
    IF expression THEN statement ELSE statement forms an IF-THEN-ELSE statement

6 Syntax Analysis (Parsing)
  Every programming language has a set of rules that describe the syntax of well-formed programs
  Syntax analysis (parsing) is the process that determines whether a source program satisfies these rules
  Why do we need a parser in addition to a lexer?
    Some program constructs have recursive structure
      digits = [0-9]+
      expr = {digits} | "(" {expr} "+" {expr} ")"
      Ex: 28, (28+301), ((28+301)+9)
    Finite automata cannot recognize recursive constructs

7 Limitation of Finite Automata
  Cannot recognize recursive constructs
  A machine with N states cannot remember a parenthesis-nesting depth greater than N
  Can an FA check correctness for (( ))?
  Then, can the FA check correctness for ((( )))?
  Can an FA remember how deeply it is nested?
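To make the limitation concrete, here is a minimal Python sketch (the function name `balanced` is an assumption made for illustration). It checks nesting with an integer counter that can grow without bound, which is exactly the memory a machine with a fixed number of states does not have.

```python
def balanced(text: str) -> bool:
    """Return True if every '(' in text is closed by a matching ')'."""
    depth = 0  # unbounded counter: the part a finite automaton cannot emulate
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared with no open '(' to match
                return False
    return depth == 0


print(balanced("(())"))    # well nested
print(balanced("((()))"))  # one level deeper, still handled
print(balanced("())"))     # rejected: one ')' too many
```

An automaton with N states could hard-code the counter values 0..N, but it would then fail on an input nested N+1 levels deep.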

8 We need a more powerful formalism: Context-Free Grammar 8

9 Context-Free Grammar
  Regular expressions describe the lexical structure of tokens
    Regular Expressions -> Lexer Generator -> Lexer
  Context-free grammars describe the syntactic nature of programs
    Context-Free Grammar -> Parser Generator -> Parser

10 Analogy
                        Lexical Analysis    Syntax Analysis
  Output                Set of tokens       Set of source programs
  Output of Each Rule   Token               Source Program
  Input                 ASCII character     Token

11 Context-Free Grammars
  A context-free grammar consists of a set of productions: symbol -> symbol symbol ... symbol
  Symbol types:
    Terminal: a token type
    Non-terminal: a symbol that appears on the left-hand side of some production
  Left-Hand Side (LHS): a non-terminal
  Right-Hand Side (RHS): terminals or non-terminals
  Start Symbol: a special non-terminal that stands for a whole program accepted by the grammar
  Each production specifies how terminals and non-terminals may be combined to form a substring in the language
  Easy to specify recursion: stmt -> IF exp THEN stmt ELSE stmt

12 End-of-File Marker
  The parser must also recognize the End-of-File (EOF)
  The EOF marker in the grammar is $
  Introduce a new start symbol S' and the production S' -> S$

13 Derivation
  Derivation (execution of parsing):
    1. Begin with the start symbol
    2. While any non-terminal remains, replace a non-terminal with the RHS of one of its productions
  Multiple derivations exist for a given sentence
    Left-most derivation: replace the left-most non-terminal in each step
    Right-most derivation: replace the right-most non-terminal in each step

14 Example
  Terminals: SEMI ;   ID   NUM   ASSIGN :=   LPAREN (   RPAREN )   PLUS +   PRINT print   COMMA ,
  Non-terminals: stmt (statement), expr (expression), expr_list (expression list)
  Rules:
    stmt -> stmt ; stmt
    stmt -> ID := expr
    stmt -> PRINT ( expr_list )
    expr -> ID
    expr -> NUM
    expr -> expr + expr
    expr -> ( stmt , expr )
    expr_list -> expr
    expr_list -> expr_list , expr

15 Example
  Terminals: SEMI ;   ID   NUM   ASSIGN :=   LPAREN (   RPAREN )   PLUS +   PRINT print   COMMA ,
  Non-terminals: stmt (statement), expr (expression), expr_list (expression list)
  Rules (written with token names):
    stmt -> stmt SEMI stmt
    stmt -> ID ASSIGN expr
    stmt -> PRINT LPAREN expr_list RPAREN
    expr -> ID
    expr -> NUM
    expr -> expr PLUS expr
    expr -> LPAREN stmt COMMA expr RPAREN
    expr_list -> expr
    expr_list -> expr_list COMMA expr

16 Example: Left-most Derivation
  Input: a := 12; print(23)
  Result from lexical analysis: ID ASSIGN NUM SEMI PRINT LPAREN NUM RPAREN
  Left-most derivation:
    1. stmt
    2. stmt SEMI stmt
    3. ID ASSIGN expr SEMI stmt
    4. ID ASSIGN NUM SEMI stmt
    5. ID ASSIGN NUM SEMI PRINT LPAREN expr_list RPAREN
    6. ID ASSIGN NUM SEMI PRINT LPAREN expr RPAREN
    7. ID ASSIGN NUM SEMI PRINT LPAREN NUM RPAREN

17 Example: Right-most Derivation
  Input: a := 12; print(23)
  Result from lexical analysis: ID ASSIGN NUM SEMI PRINT LPAREN NUM RPAREN
  Right-most derivation:
    1. stmt
    2. stmt SEMI stmt
    3. stmt SEMI PRINT LPAREN expr_list RPAREN
    4. stmt SEMI PRINT LPAREN expr RPAREN
    5. stmt SEMI PRINT LPAREN NUM RPAREN
    6. ID ASSIGN expr SEMI PRINT LPAREN NUM RPAREN
    7. ID ASSIGN NUM SEMI PRINT LPAREN NUM RPAREN
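The two derivation orders can be replayed mechanically. The following Python sketch is illustrative only: `derive`, `LEFT_STEPS` and `RIGHT_STEPS` are hypothetical names, and the production sequences are written out by hand for the input a := 12; print(23). Each step rewrites the left-most or right-most non-terminal of the current sentential form.

```python
NONTERMINALS = {"stmt", "expr", "expr_list"}


def derive(start, steps, leftmost=True):
    """Apply each (lhs, rhs) production to the left-most or right-most non-terminal."""
    form = [start]
    history = [" ".join(form)]
    for lhs, rhs in steps:
        positions = [i for i, sym in enumerate(form) if sym in NONTERMINALS]
        i = positions[0] if leftmost else positions[-1]
        if form[i] != lhs:
            raise ValueError(f"expected {lhs} at position {i}, found {form[i]}")
        form[i:i + 1] = rhs  # replace the non-terminal by the production's RHS
        history.append(" ".join(form))
    return history


SPLIT = ("stmt", ["stmt", "SEMI", "stmt"])
ASSIGN = ("stmt", ["ID", "ASSIGN", "expr"])
PRINT = ("stmt", ["PRINT", "LPAREN", "expr_list", "RPAREN"])
ONE = ("expr_list", ["expr"])
NUM = ("expr", ["NUM"])

LEFT_STEPS = [SPLIT, ASSIGN, NUM, PRINT, ONE, NUM]
RIGHT_STEPS = [SPLIT, PRINT, ONE, NUM, ASSIGN, NUM]

for line in derive("stmt", LEFT_STEPS, leftmost=True):
    print(line)
```

Both orders use the same six productions and end in the same token string; only the order in which non-terminals are expanded differs.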

18 Parsing Tree
  Graphical representation of a derivation
  Each internal node is labeled with a non-terminal
  Each leaf node is labeled with a terminal
  Parsing tree of the example ID ASSIGN NUM SEMI PRINT LPAREN NUM RPAREN:
    stmt
      stmt
        ID
        ASSIGN
        expr
          NUM
      SEMI
      stmt
        PRINT
        LPAREN
        expr_list
          expr
            NUM
        RPAREN

19 Inefficiency in Parsing Tree
  Concrete parse tree
    Each internal node is labeled with a non-terminal
    Children are labeled with the symbols in the RHS of the production
  Concrete parse trees are inconvenient to use!
    Punctuation is needed to specify structure when writing code, but the tree already describes the program structure
  Make trees simple: remove tokens that carry no additional information

20 Inefficiency in Parsing Tree
  Grammar:
    P -> ( S )
    S -> S ; S
    S -> ID := E
    E -> ID
    E -> NUM
    E -> E + E
    E -> E - E
    E -> E * E
    E -> E / E
  Input: ( a := 4 ; b := 5 )
  Concrete parse tree:
    P
      (
      S
        S
          ID(a)
          :=
          E
            NUM(4)
        ;
        S
          ID(b)
          :=
          E
            NUM(5)
      )
  Do we need (, ) or ; in the tree?

21 Abstract Syntax Tree
  Solution: generate an abstract parse tree (abstract syntax tree, AST)
  An AST is similar to a concrete parse tree, except that redundant tokens are left out
    CompoundStm
      AssignStm
        ID(a)
        NUM(4)
      AssignStm
        ID(b)
        NUM(5)

22 Abstract Syntax Tree Example
  Grammar:
    P -> ( S )
    S -> S ; S
    S -> ID := E
    E -> ID
    E -> NUM
    E -> E + E
    E -> E - E
    E -> E * E
    E -> E / E
  How can you describe the abstract syntax tree structure?
    type id = string
    datatype binop = PLUS | MINUS | TIMES | DIV
    (* stm refers to exp, so the two datatypes are declared together with "and" *)
    datatype stm = CompoundStm of stm * stm
                 | AssignStm of id * exp
    and exp = IDExp of id
            | NUMExp of int
            | OpExp of exp * binop * exp
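For readers more comfortable with Python than ML, the same tree shape can be sketched with dataclasses. This is an illustrative translation, not the course's code: the class names follow the constructors above, while the field names (`name`, `value`, `left`, `op`, `right`, `target`, `first`, `second`) are assumptions.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class IDExp:
    name: str


@dataclass
class NUMExp:
    value: int


@dataclass
class OpExp:
    left: "Exp"
    op: str  # one of "+", "-", "*", "/"
    right: "Exp"


Exp = Union[IDExp, NUMExp, OpExp]


@dataclass
class AssignStm:
    target: str
    value: Exp


@dataclass
class CompoundStm:
    first: "Stm"
    second: "Stm"


Stm = Union[AssignStm, CompoundStm]

# AST for ( a := 4 ; b := 5 ): parentheses, ":=" and ";" no longer appear.
program = CompoundStm(AssignStm("a", NUMExp(4)), AssignStm("b", NUMExp(5)))
print(program)
```

The punctuation tokens are gone because the node types themselves record the structure that the punctuation expressed in the source text.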

23 Ambiguous Grammars
  A grammar is ambiguous if it can derive a string of tokens with two or more different parsing trees
  Example:
    expr -> NUM
    expr -> expr + expr
    expr -> expr * expr
  Consider 4 + 5 * 6: is this 34 or 54?
    Tree with * at the root (value 54):
      expr
        expr
          expr
            NUM(4)
          +
          expr
            NUM(5)
        *
        expr
          NUM(6)
    Tree with + at the root (value 34):
      expr
        expr
          NUM(4)
        +
        expr
          expr
            NUM(5)
          *
          expr
            NUM(6)

24 Ambiguous Grammars
  Problem: the compiler uses the parse tree to interpret the meaning of parsed expressions
    Different parse trees may have different meanings, resulting in different interpreted results
    For example, does 4+5*6 equal 34 or 54?
  Solution: rewrite the grammar to eliminate ambiguity
    Operators have a relative precedence: * binds tighter than +
    Operators with the same precedence must be resolved by associativity
    Some operators have left associativity; others have right associativity

25 Ambiguous Grammars
  Non-terminals: expr (expression), term (term, add), fact (factor, mult)
  Rules:
    expr -> expr + term
    expr -> term
    term -> term * fact
    term -> fact
    fact -> NUM
  Parse tree for 4 + 5 * 6:
    expr
      expr
        term
          fact
            NUM(4)
      +
      term
        term
          fact
            NUM(5)
        *
        fact
          NUM(6)
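A small evaluator shows the effect of the layered grammar. The Python sketch below is an assumption-laden illustration: `evaluate` is a hypothetical name, tokens are plain strings, and the left-recursive rules are implemented as loops, which gives the same left-associative grouping without recursing on the left.

```python
def evaluate(tokens):
    """Evaluate NUM, + and * tokens following the expr/term/fact layering."""
    pos = 0

    def fact():  # fact -> NUM
        nonlocal pos
        value = int(tokens[pos])
        pos += 1
        return value

    def term():  # term -> term * fact | fact, written as a loop
        nonlocal pos
        value = fact()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            value *= fact()
        return value

    def expr():  # expr -> expr + term | term, written as a loop
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            value += term()
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result


print(evaluate(["4", "+", "5", "*", "6"]))  # 34, because 5 * 6 is grouped first
```

Because `*` is only handled inside `term`, the product 5 * 6 is always completed before the surrounding `+` is applied, so the reading that yields 54 can no longer be derived.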

26 How to analyze the syntax of a program? 26

27 Back to analogy
  How do you recognize an English sentence?
  Prediction-based approach
    If you see a subject, you expect a verb to follow.
    If you see a verb at the beginning of a sentence, you know the sentence is a question.
    Predictive parsing (LL parsing)
  Bottom-up approach
    Read a sentence, and then figure out its structure.
    Bottom-up parsing (LR parsing, shift-reduce parsing)

28 Recursive Descent Parsing 1. LL(k) Parsing 28

29 Recursive Descent Parsing
  One recursive function for each non-terminal
  Each production becomes a clause in a function
  A.K.A. predictive parsing, top-down parsing, LL(1)
  LL(1): Left-to-right parse, Leftmost derivation, 1 symbol lookahead

30 Example
  Grammar:
    Non-terminals: S, E, L
    Terminals: IF (if), THEN (then), ELSE (else), BEGIN (begin), END (end), SEMI (;), NUM, EQ (=)
    S -> if E then S else S
    S -> begin S L
    L -> end
    L -> ; S L
    E -> num = num
  Parser:
    datatype token = EOF | IF | THEN | ELSE | BEGIN | END | SEMI | NUM | EQ
    val tok = ref (gettoken())
    fun advance() = tok := gettoken()
    fun eat(t) = if (!tok = t) then advance() else error()
    (* S, L and E call each other, so they are defined together with "and" *)
    fun S() = case !tok of
                  IF => (eat(IF); E(); eat(THEN); S(); eat(ELSE); S())
                | BEGIN => (eat(BEGIN); S(); L())
                | _ => error()
    and L() = case !tok of
                  END => (eat(END))
                | SEMI => (eat(SEMI); S(); L())
                | _ => error()
    and E() = (eat(NUM); eat(EQ); eat(NUM))
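The same parser can be sketched in Python as one method per non-terminal. Two things here are assumptions rather than part of the slide: the class and function names (`Parser`, `parse`), and an extra production S -> print E. As written on the slide, every S contains another S, so no finite statement can be derived; the added production gives statements a way to end.

```python
class Parser:
    """Recursive descent parser: one method per non-terminal."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["EOF"]
        self.pos = 0

    @property
    def tok(self):
        return self.tokens[self.pos]

    def eat(self, expected):
        if self.tok != expected:
            raise SyntaxError(f"expected {expected}, found {self.tok}")
        self.pos += 1

    def S(self):
        if self.tok == "IF":
            self.eat("IF")
            self.E()
            self.eat("THEN")
            self.S()
            self.eat("ELSE")
            self.S()
        elif self.tok == "BEGIN":
            self.eat("BEGIN")
            self.S()
            self.L()
        elif self.tok == "PRINT":  # assumed production S -> print E
            self.eat("PRINT")
            self.E()
        else:
            raise SyntaxError(f"unexpected {self.tok} at start of statement")

    def L(self):
        if self.tok == "END":
            self.eat("END")
        elif self.tok == "SEMI":
            self.eat("SEMI")
            self.S()
            self.L()
        else:
            raise SyntaxError(f"expected END or SEMI, found {self.tok}")

    def E(self):
        self.eat("NUM")
        self.eat("EQ")
        self.eat("NUM")


def parse(tokens):
    """Return True if tokens form one complete statement, else raise SyntaxError."""
    parser = Parser(tokens)
    parser.S()
    parser.eat("EOF")
    return True


print(parse("BEGIN PRINT NUM EQ NUM SEMI PRINT NUM EQ NUM END".split()))
```

Each method inspects only the current token to choose its branch, which is the single symbol of lookahead that LL(1) refers to.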

31 Formal Techniques
  Before making a parser, we need to compute 3 values
  Nullable: for each γ corresponding to the RHS of a production, γ is nullable if γ can derive the empty string (ε)
  First(γ): for each γ corresponding to the RHS of a production, First(γ) is the set of all terminal symbols that can begin any string derived from γ
    Ex: S -> if E then S else S    First(S): if
  Follow(X): for each non-terminal X in the grammar, Follow(X) is the set of all terminal symbols that can immediately follow X in a derivation
    Ex: S -> if E then S else S    Follow(E): then

32 Computation of Nullable
  γ is nullable if every symbol S ∈ γ is nullable
  Check whether S can derive ε
  Example:
    Z -> X Y Z    Y -> c    X -> a
    Z -> d        Y -> ε    X -> b Y e
        Initial   Iteration 1   Iteration 2
    X   No        No            No
    Y   No        Yes           Yes
    Z   No        No            No

33 Computation of First
  If T is a terminal symbol, then First(T) = {T}
  If X is a non-terminal and X -> Y1 Y2 Y3 ... Yn, then
    First(Y1) ⊆ First(X)
    First(Y2) ⊆ First(X) if Y1 is nullable
    First(Y3) ⊆ First(X) if Y1, Y2 are nullable
    ...
    First(Yn) ⊆ First(X) if Y1, Y2, ..., Yn-1 are nullable

34 Computation of Follow
  Let X, Y be non-terminals; γ, γ1, γ2 be strings of terminals and non-terminals
  If the grammar includes the production X -> γ Y:
    Follow(X) ⊆ Follow(Y)
  If the grammar includes the production X -> γ1 Y γ2:
    First(γ2) ⊆ Follow(Y)
    Follow(X) ⊆ Follow(Y), if γ2 is nullable
  Apply these rules iteratively to compute the nullable, first and follow sets for each non-terminal in the grammar

35 Example
  Z -> X Y Z    Y -> c    X -> a
  Z -> d        Y -> ε    X -> b Y e
  Initial:
        nullable   first      follow
    X   No
    Y   No
    Z   No
  Iteration 1:
        nullable   first      follow
    X   No         a, b
    Y   Yes        c
    Z   No         d
  Iteration 2:
        nullable   first      follow
    X   No         a, b
    Y   Yes        c
    Z   No         d, a, b
  Iteration 3:
        nullable   first      follow
    X   No         a, b       c, d, a, b
    Y   Yes        c          e, d, a, b
    Z   No         d, a, b
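The iteration can be sketched as a fixed-point loop in Python. The representation is an assumption made for illustration: productions are (lhs, rhs) pairs, an empty list stands for ε, and `analyze` is a hypothetical name. The loop applies the nullable, first and follow rules repeatedly until no set grows.

```python
GRAMMAR = [
    ("Z", ["X", "Y", "Z"]), ("Z", ["d"]),
    ("Y", ["c"]), ("Y", []),  # [] stands for ε
    ("X", ["a"]), ("X", ["b", "Y", "e"]),
]


def analyze(grammar):
    """Iterate the nullable/first/follow rules until nothing changes."""
    nonterminals = {lhs for lhs, _ in grammar}
    nullable = set()
    first = {n: set() for n in nonterminals}
    follow = {n: set() for n in nonterminals}
    changed = True

    def first_of(sym):
        return first[sym] if sym in nonterminals else {sym}

    def all_nullable(symbols):
        return all(s in nullable for s in symbols)

    def grow(target, items):
        nonlocal changed
        if not items <= target:
            target |= items
            changed = True

    while changed:
        changed = False
        for lhs, rhs in grammar:
            if all_nullable(rhs):
                grow(nullable, {lhs})
            for i, sym in enumerate(rhs):
                if all_nullable(rhs[:i]):  # everything before sym can vanish
                    grow(first[lhs], first_of(sym))
                if sym not in nonterminals:
                    continue
                if all_nullable(rhs[i + 1:]):  # everything after sym can vanish
                    grow(follow[sym], follow[lhs])
                for j in range(i + 1, len(rhs)):
                    if all_nullable(rhs[i + 1:j]):
                        grow(follow[sym], first_of(rhs[j]))
    return nullable, first, follow


nullable, first, follow = analyze(GRAMMAR)
print(sorted(nullable))
print({n: sorted(s) for n, s in sorted(first.items())})
print({n: sorted(s) for n, s in sorted(follow.items())})
```

The sketch interleaves the three computations in one loop, so it may converge in a different number of passes than the hand-worked tables, but the final sets are the same.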

36 Example
  Z -> X Y Z    Y -> c    X -> a
  Z -> d        Y -> ε    X -> b Y e
        nullable   first      follow
    X   No         a, b       c, d, a, b
    Y   Yes        c          e, d, a, b
    Z   No         d, a, b
  Build the predictive parsing table from the nullable, first and follow sets (columns a, b, c, d, e):
    X:  a: X -> a    b: X -> b Y e
    Y:  a: Y -> ε    b: Y -> ε    c: Y -> c    d: Y -> ε    e: Y -> ε
    Z:  a: Z -> X Y Z    b: Z -> X Y Z    d: Z -> d
  Enter S -> γ in row S, column T for each T ∈ First(γ)
  If γ is nullable, enter S -> γ in row S, column T for each T ∈ Follow(S)
  The entry in row S, column T tells the parser which clause to execute if the current function is S and the next token is T
  Blank entries are syntax errors
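The two table-filling rules can be sketched in Python. The sets are copied from the slide rather than recomputed; `first_of_string` and `build_table` are hypothetical names, and an empty list again stands for ε.

```python
NULLABLE = {"Y"}
FIRST = {"X": {"a", "b"}, "Y": {"c"}, "Z": {"a", "b", "d"}}
FOLLOW = {"X": {"a", "b", "c", "d"}, "Y": {"a", "b", "d", "e"}, "Z": set()}
GRAMMAR = [
    ("Z", ["X", "Y", "Z"]), ("Z", ["d"]),
    ("Y", ["c"]), ("Y", []),  # [] stands for ε
    ("X", ["a"]), ("X", ["b", "Y", "e"]),
]


def first_of_string(symbols):
    """Return (first set, is_nullable) for a right-hand side."""
    result = set()
    for sym in symbols:
        result |= FIRST.get(sym, {sym})  # a terminal's first set is itself
        if sym not in NULLABLE:
            return result, False
    return result, True


def build_table(grammar):
    """Map (non-terminal, lookahead) to the list of applicable right-hand sides."""
    table = {}
    for lhs, rhs in grammar:
        firsts, nullable = first_of_string(rhs)
        columns = firsts | (FOLLOW[lhs] if nullable else set())
        for terminal in columns:
            table.setdefault((lhs, terminal), []).append(rhs)
    return table


table = build_table(GRAMMAR)
for (row, col), entries in sorted(table.items()):
    print(row, col, entries)
```

A cell holding more than one right-hand side would signal a conflict; here every cell holds exactly one, so a parser can pick its clause from the current non-terminal and the next token alone.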

37 Another Example
  S' -> S$
  S -> IF E THEN A ELSE A
  S -> IF E THEN A
  S -> E
  E -> E + T
  E -> T
  T -> NUM
  A -> ID = NUM
  Compute nullable, first and follow for S', S, E, T and A

38 Another Example
  S' -> S$
  S -> IF E THEN A ELSE A
  S -> IF E THEN A
  S -> E
  E -> E + T
  E -> T
  T -> NUM
  A -> ID = NUM
         nullable   first      follow
    S'   No         IF, NUM
    S    No         IF, NUM    $
    E    No         NUM        $, THEN, +
    T    No         NUM        $, THEN, +
    A    No         ID         $, ELSE

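39 Another Example
  S' -> S$
  S -> IF E THEN A ELSE A
  S -> IF E THEN A
  S -> E
  E -> E + T
  E -> T
  T -> NUM
  A -> ID = NUM
  Predictive parsing table (columns IF, THEN, ELSE, +, NUM, ID, =, $):
    S':  IF: S' -> S$    NUM: S' -> S$
    S:   IF: S -> IF E THEN A, S -> IF E THEN A ELSE A    NUM: S -> E
    E:   NUM: E -> E + T, E -> T
    T:   NUM: T -> NUM
    A:   ID: A -> ID = NUM
  Cells with two productions are conflicts: one token of lookahead cannot choose between them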

40 Left-Recursion Problem
  E -> E + T
  E -> T
  First(E + T) = First(T)
  When in function E(), if the next token is NUM, the parser cannot choose between the two productions and gets stuck
  A left-recursive grammar cannot be LL(1)
  Solution: rewrite the grammar so that it is right-recursive
    E -> T E'
    E' -> + T E'
    E' -> ε
  Rule: X -> X γ and X -> α become
    X -> α X'
    X' -> γ X'
    X' -> ε
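The rewriting rule can be sketched as a small Python function. It handles only immediate left recursion (a production whose first RHS symbol is its own LHS); the name `eliminate_left_recursion` and the convention of naming the new non-terminal by appending a prime are assumptions for illustration.

```python
def eliminate_left_recursion(nt, productions):
    """Rewrite X -> X γ | α as X -> α X', X' -> γ X' | ε (immediate recursion only)."""
    recursive = [rhs[1:] for rhs in productions if rhs and rhs[0] == nt]  # the γ parts
    others = [rhs for rhs in productions if not rhs or rhs[0] != nt]      # the α parts
    if not recursive:
        return {nt: productions}  # nothing to rewrite
    new = nt + "'"
    return {
        nt: [alpha + [new] for alpha in others],
        new: [gamma + [new] for gamma in recursive] + [[]],  # [] stands for ε
    }


print(eliminate_left_recursion("E", [["E", "+", "T"], ["T"]]))
```

For E -> E + T and E -> T this produces E -> T E' together with E' -> + T E' and E' -> ε, matching the rewritten grammar on the slide.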

41 Left-Factoring
  S -> IF E THEN A
  S -> IF E THEN A ELSE A
  Two productions begin with the same symbols:
    first(IF E THEN A) = first(IF E THEN A ELSE A)
  Solution: left-factoring
    S -> IF E THEN A V
    V -> ε
    V -> ELSE A
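Left-factoring can be sketched the same way. The function below is a simplified illustration, assuming that all productions passed in share one common prefix; `left_factor` and its `new_nt` parameter are hypothetical names.

```python
def left_factor(nt, productions, new_nt):
    """Pull the longest prefix common to all productions of nt into one production."""
    prefix = []
    for symbols in zip(*productions):  # walk the right-hand sides in lockstep
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    if not prefix or len(productions) < 2:
        return {nt: productions}  # nothing to factor
    return {
        nt: [prefix + [new_nt]],
        new_nt: [rhs[len(prefix):] for rhs in productions],  # [] stands for ε
    }


print(left_factor(
    "S",
    [["IF", "E", "THEN", "A"], ["IF", "E", "THEN", "A", "ELSE", "A"]],
    "V",
))
```

After factoring, the choice between the two forms is postponed until the parser has consumed the shared prefix, at which point one token (ELSE or not) is enough to decide.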

42 Modified Example
  S' -> S$
  S -> E
  S -> IF E THEN A V
  V -> ELSE A
  V -> ε
  E -> T E'
  E' -> + T E'
  E' -> ε
  T -> NUM
  A -> ID = NUM
  Compute nullable, first and follow for S', S, V, E, E', T and A

43 Modified Example
  S' -> S$
  S -> E
  S -> IF E THEN A V
  V -> ELSE A
  V -> ε
  E -> T E'
  E' -> + T E'
  E' -> ε
  T -> NUM
  A -> ID = NUM
         nullable   first      follow
    S'   No         IF, NUM
    S    No         IF, NUM    $
    V    Yes        ELSE       $
    E    No         NUM        $, THEN
    E'   Yes        +          $, THEN
    T    No         NUM        $, THEN, +
    A    No         ID         $, ELSE

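44 Modified Example
  S' -> S$
  S -> E
  S -> IF E THEN A V
  V -> ELSE A
  V -> ε
  E -> T E'
  E' -> + T E'
  E' -> ε
  T -> NUM
  A -> ID = NUM
  Predictive parsing table (columns IF, THEN, ELSE, +, NUM, ID, =, $):
    S':  IF: S' -> S$    NUM: S' -> S$
    S:   IF: S -> IF E THEN A V    NUM: S -> E
    V:   ELSE: V -> ELSE A    $: V -> ε
    E:   NUM: E -> T E'
    E':  THEN: E' -> ε    +: E' -> + T E'    $: E' -> ε
    T:   NUM: T -> NUM
    A:   ID: A -> ID = NUM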
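With a conflict-free table, parsing can also be driven by an explicit stack instead of recursive functions. The Python sketch below hard-codes the table for the modified grammar; the dictionary layout and the name `parse` are assumptions, and the function only accepts or rejects a token list without building a tree.

```python
TABLE = {
    ("S'", "IF"): ["S", "$"], ("S'", "NUM"): ["S", "$"],
    ("S", "IF"): ["IF", "E", "THEN", "A", "V"], ("S", "NUM"): ["E"],
    ("V", "ELSE"): ["ELSE", "A"], ("V", "$"): [],  # [] stands for ε
    ("E", "NUM"): ["T", "E'"],
    ("E'", "THEN"): [], ("E'", "+"): ["+", "T", "E'"], ("E'", "$"): [],
    ("T", "NUM"): ["NUM"],
    ("A", "ID"): ["ID", "=", "NUM"],
}
NONTERMINALS = {"S'", "S", "V", "E", "E'", "T", "A"}


def parse(tokens):
    """Return True if tokens (ending in "$") are accepted, else False."""
    stack = ["S'"]
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos] if pos < len(tokens) else None
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                return False  # blank table entry: syntax error
            stack.extend(reversed(rhs))  # push RHS so its first symbol is on top
        elif top == look:
            pos += 1  # terminal on the stack matches the input
        else:
            return False
    return pos == len(tokens)


print(parse("IF NUM + NUM THEN ID = NUM ELSE ID = NUM $".split()))
print(parse("NUM + $".split()))
```

The first call is accepted; the second is rejected because the table has no entry for T with lookahead $, which corresponds to a blank cell.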


More information

Syntactic Analysis. CS345H: Programming Languages. Lecture 3: Lexical Analysis. Outline. Lexical Analysis. What is a Token? Tokens

Syntactic Analysis. CS345H: Programming Languages. Lecture 3: Lexical Analysis. Outline. Lexical Analysis. What is a Token? Tokens Syntactic Analysis CS45H: Programming Languages Lecture : Lexical Analysis Thomas Dillig Main Question: How to give structure to strings Analogy: Understanding an English sentence First, we separate a

More information

CSE 401 Midterm Exam Sample Solution 2/11/15

CSE 401 Midterm Exam Sample Solution 2/11/15 Question 1. (10 points) Regular expression warmup. For regular expression questions, you must restrict yourself to the basic regular expression operations covered in class and on homework assignments:

More information

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square)

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square) CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square) Introduction This semester, through a project split into 3 phases, we are going

More information

Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal)

Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal) Parsing Part II (Ambiguity, Top-down parsing, Left-recursion Removal) Ambiguous Grammars Definitions If a grammar has more than one leftmost derivation for a single sentential form, the grammar is ambiguous

More information

Part 3. Syntax analysis. Syntax analysis 96

Part 3. Syntax analysis. Syntax analysis 96 Part 3 Syntax analysis Syntax analysis 96 Outline 1. Introduction 2. Context-free grammar 3. Top-down parsing 4. Bottom-up parsing 5. Conclusion and some practical considerations Syntax analysis 97 Structure

More information

CS 536 Midterm Exam Spring 2013

CS 536 Midterm Exam Spring 2013 CS 536 Midterm Exam Spring 2013 ID: Exam Instructions: Write your student ID (not your name) in the space provided at the top of each page of the exam. Write all your answers on the exam itself. Feel free

More information

Software II: Principles of Programming Languages

Software II: Principles of Programming Languages Software II: Principles of Programming Languages Lecture 4 Language Translation: Lexical and Syntactic Analysis Translation A translator transforms source code (a program written in one language) into

More information

Chapter 4. Lexical and Syntax Analysis

Chapter 4. Lexical and Syntax Analysis Chapter 4 Lexical and Syntax Analysis Chapter 4 Topics Introduction Lexical Analysis The Parsing Problem Recursive-Descent Parsing Bottom-Up Parsing Copyright 2012 Addison-Wesley. All rights reserved.

More information

Monday, September 13, Parsers

Monday, September 13, Parsers Parsers Agenda Terminology LL(1) Parsers Overview of LR Parsing Terminology Grammar G = (Vt, Vn, S, P) Vt is the set of terminals Vn is the set of non-terminals S is the start symbol P is the set of productions

More information

Programming Languages & Compilers. Programming Languages and Compilers (CS 421) Programming Languages & Compilers. Major Phases of a Compiler

Programming Languages & Compilers. Programming Languages and Compilers (CS 421) Programming Languages & Compilers. Major Phases of a Compiler Programming Languages & Compilers Programming Languages and Compilers (CS 421) Three Main Topics of the Course I II III Sasa Misailovic 4110 SC, UIUC https://courses.engr.illinois.edu/cs421/fa2017/cs421a

More information

Fall Compiler Principles Lecture 2: LL parsing. Roman Manevich Ben-Gurion University of the Negev

Fall Compiler Principles Lecture 2: LL parsing. Roman Manevich Ben-Gurion University of the Negev Fall 2017-2018 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion University of the Negev 1 Books Compilers Principles, Techniques, and Tools Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman

More information

ASTs, Objective CAML, and Ocamlyacc

ASTs, Objective CAML, and Ocamlyacc ASTs, Objective CAML, and Ocamlyacc Stephen A. Edwards Columbia University Fall 2012 Parsing and Syntax Trees Parsing decides if the program is part of the language. Not that useful: we want more than

More information

Compilers. Yannis Smaragdakis, U. Athens (original slides by Sam

Compilers. Yannis Smaragdakis, U. Athens (original slides by Sam Compilers Parsing Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Next step text chars Lexical analyzer tokens Parser IR Errors Parsing: Organize tokens into sentences Do tokens conform

More information

Introduction to Lexical Analysis

Introduction to Lexical Analysis Introduction to Lexical Analysis Outline Informal sketch of lexical analysis Identifies tokens in input string Issues in lexical analysis Lookahead Ambiguities Specifying lexical analyzers (lexers) Regular

More information

A simple syntax-directed

A simple syntax-directed Syntax-directed is a grammaroriented compiling technique Programming languages: Syntax: what its programs look like? Semantic: what its programs mean? 1 A simple syntax-directed Lexical Syntax Character

More information

CA Compiler Construction

CA Compiler Construction CA4003 - Compiler Construction David Sinclair A top-down parser starts with the root of the parse tree, labelled with the goal symbol of the grammar, and repeats the following steps until the fringe of

More information

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38 Syntax Analysis Martin Sulzmann Martin Sulzmann Syntax Analysis 1 / 38 Syntax Analysis Objective Recognize individual tokens as sentences of a language (beyond regular languages). Example 1 (OK) Program

More information

A programming language requires two major definitions A simple one pass compiler

A programming language requires two major definitions A simple one pass compiler A programming language requires two major definitions A simple one pass compiler [Syntax: what the language looks like A context-free grammar written in BNF (Backus-Naur Form) usually suffices. [Semantics:

More information

4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

More information

COP 3402 Systems Software Syntax Analysis (Parser)

COP 3402 Systems Software Syntax Analysis (Parser) COP 3402 Systems Software Syntax Analysis (Parser) Syntax Analysis 1 Outline 1. Definition of Parsing 2. Context Free Grammars 3. Ambiguous/Unambiguous Grammars Syntax Analysis 2 Lexical and Syntax Analysis

More information

Syntax. In Text: Chapter 3

Syntax. In Text: Chapter 3 Syntax In Text: Chapter 3 1 Outline Syntax: Recognizer vs. generator BNF EBNF Chapter 3: Syntax and Semantics 2 Basic Definitions Syntax the form or structure of the expressions, statements, and program

More information

Syntax Analysis. COMP 524: Programming Language Concepts Björn B. Brandenburg. The University of North Carolina at Chapel Hill

Syntax Analysis. COMP 524: Programming Language Concepts Björn B. Brandenburg. The University of North Carolina at Chapel Hill Syntax Analysis Björn B. Brandenburg The University of North Carolina at Chapel Hill Based on slides and notes by S. Olivier, A. Block, N. Fisher, F. Hernandez-Campos, and D. Stotts. The Big Picture Character

More information

Homework & Announcements

Homework & Announcements Homework & nnouncements New schedule on line. Reading: Chapter 18 Homework: Exercises at end Due: 11/1 Copyright c 2002 2017 UMaine School of Computing and Information S 1 / 25 COS 140: Foundations of

More information

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino 3. Syntax Analysis Andrea Polini Formal Languages and Compilers Master in Computer Science University of Camerino (Formal Languages and Compilers) 3. Syntax Analysis CS@UNICAM 1 / 54 Syntax Analysis: the

More information

Defining syntax using CFGs

Defining syntax using CFGs Defining syntax using CFGs Roadmap Last 8me Defined context-free grammar This 8me CFGs for syntax design Language membership List grammars Resolving ambiguity CFG Review G = (N,Σ,P,S) means derives derives

More information

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation Concepts Introduced in Chapter 2 A more detailed overview of the compilation process. Parsing Scanning Semantic Analysis Syntax-Directed Translation Intermediate Code Generation Context-Free Grammar A

More information

Compilers. Bottom-up Parsing. (original slides by Sam

Compilers. Bottom-up Parsing. (original slides by Sam Compilers Bottom-up Parsing Yannis Smaragdakis U Athens Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Bottom-Up Parsing More general than top-down parsing And just as efficient Builds

More information

Last time. What are compilers? Phases of a compiler. Scanner. Parser. Semantic Routines. Optimizer. Code Generation. Sunday, August 29, 2010

Last time. What are compilers? Phases of a compiler. Scanner. Parser. Semantic Routines. Optimizer. Code Generation. Sunday, August 29, 2010 Last time Source code Scanner Tokens Parser What are compilers? Phases of a compiler Syntax tree Semantic Routines IR Optimizer IR Code Generation Executable Extra: Front-end vs. Back-end Scanner + Parser

More information

4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

More information

Introduction to Lexing and Parsing

Introduction to Lexing and Parsing Introduction to Lexing and Parsing ECE 351: Compilers Jon Eyolfson University of Waterloo June 18, 2012 1 Riddle Me This, Riddle Me That What is a compiler? 1 Riddle Me This, Riddle Me That What is a compiler?

More information

Context-free grammars

Context-free grammars Context-free grammars Section 4.2 Formal way of specifying rules about the structure/syntax of a program terminals - tokens non-terminals - represent higher-level structures of a program start symbol,

More information

CS 11 Ocaml track: lecture 6

CS 11 Ocaml track: lecture 6 CS 11 Ocaml track: lecture 6 n Today: n Writing a computer language n Parser generators n lexers (ocamllex) n parsers (ocamlyacc) n Abstract syntax trees Problem (1) n We want to implement a computer language

More information

Syntax-Directed Translation. Lecture 14

Syntax-Directed Translation. Lecture 14 Syntax-Directed Translation Lecture 14 (adapted from slides by R. Bodik) 9/27/2006 Prof. Hilfinger, Lecture 14 1 Motivation: parser as a translator syntax-directed translation stream of tokens parser ASTs,

More information

Fall Compiler Principles Lecture 4: Parsing part 3. Roman Manevich Ben-Gurion University of the Negev

Fall Compiler Principles Lecture 4: Parsing part 3. Roman Manevich Ben-Gurion University of the Negev Fall 2016-2017 Compiler Principles Lecture 4: Parsing part 3 Roman Manevich Ben-Gurion University of the Negev Tentative syllabus Front End Intermediate Representation Optimizations Code Generation Scanning

More information

CSCI312 Principles of Programming Languages

CSCI312 Principles of Programming Languages Copyright 2006 The McGraw-Hill Companies, Inc. CSCI312 Principles of Programming Languages! LL Parsing!! Xu Liu Derived from Keith Cooper s COMP 412 at Rice University Recap Copyright 2006 The McGraw-Hill

More information

Lexical Analysis. Finite Automata

Lexical Analysis. Finite Automata #1 Lexical Analysis Finite Automata Cool Demo? (Part 1 of 2) #2 Cunning Plan Informal Sketch of Lexical Analysis LA identifies tokens from input string lexer : (char list) (token list) Issues in Lexical

More information

Programming Language Syntax and Analysis

Programming Language Syntax and Analysis Programming Language Syntax and Analysis 2017 Kwangman Ko (http://compiler.sangji.ac.kr, kkman@sangji.ac.kr) Dept. of Computer Engineering, Sangji University Introduction Syntax the form or structure of

More information

CS 406/534 Compiler Construction Parsing Part I

CS 406/534 Compiler Construction Parsing Part I CS 406/534 Compiler Construction Parsing Part I Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy and Dr.

More information

ECE251 Midterm practice questions, Fall 2010

ECE251 Midterm practice questions, Fall 2010 ECE251 Midterm practice questions, Fall 2010 Patrick Lam October 20, 2010 Bootstrapping In particular, say you have a compiler from C to Pascal which runs on x86, and you want to write a self-hosting Java

More information