# Introduction to Parsing. Lecture 5

Size: px
Start display at page:

Transcription

1 Introduction to Parsing Lecture 5 1

2 Outline Regular languages revisited Parser overview Context-free grammars (CFG s) Derivations Ambiguity 2

3 Languages and Automata Formal languages are very important in CS specially in programming languages Regular languages The weakest formal languages widely used Many applications (as we ve seen) We will also study context-free languages, tree languages 3

4 Beyond Regular Languages Difficulty with regular languages is that many languages are not regular Some are very important They can t be expressed using Rs and FAs x. Strings of balanced parentheses are not regular: Note this is fairly representative of lots of programming constructs {() i i i 0} Note: given as set not R 4

5 Beyond Regular Languages x. Nested arithmetic expressions ((1+2) * 3) x. Nested if then else statements if if if fi fi then fi then then if here acts like ( in previous example Note that even if language doesn t have the fi like Cool, it is usually implied 5

6 An xample To Help Understand the Limitations Consider the following DFA 1 0 x: What does it recognize? 0 Note: doesn t have any way of knowing length of input string 6

7 Beyond Regular Languages In general: Nesting constructs cannot be handled by regular expressions Raises the questions: What can be expressed? Why are Rs insufficient for recognizing arbitrary nesting constructs? 7

8 What Can Regular Languages xpress? Languages requiring counting modulo a fixed integer.g., parity Intuition: A finite automaton that runs long enough must repeat states Finite automaton can t remember # of times it has visited a particular state 8

9 The Functionality of the Parser Input: sequence of tokens from lexer Output: parse tree of the program (But some parsers never produce a parse tree...) 9

10 xample Cool if x = y then 1 else 2 fi Parser input (from lexical analyzer) IF ID = ID THN INT LS INT FI Parser output IF-THN-LS = INT INT ID ID 10

11 xample Note: nesting structure has been made explicit by tree Also the three components of the if then else Predicate Then branch lse branch IF-THN-LS = INT INT ID ID 11

12 Comparison with Lexical Analysis Phase Input Output Lexer Parser String of characters String of tokens String of tokens Parse tree 12

13 Couple of things As mentioned, sometimes parse tree is only implicit More on this later Many compilers do build full parse tree, many do not There are compilers that combine lexer and parser phases into one phase verything done by the parser Parsing technology powerful enough to express lexical analysis in addition to parsing But most compilers use two phases, because Rs are such a good match for lexical analysis 13

14 The Role of the Parser Not all strings of tokens are programs parser must distinguish between valid and invalid strings of tokens And give error messages for the invalid ones We need A language for describing valid strings of tokens An algorithm for distinguishing valid from invalid strings of tokens 14

15 Context-Free Grammars Programming language constructs have recursive structure An XPR in Cool can be if XPR then XPR else XPR fi while XPR loop XPR pool Note: Recursively composed of other expressions Context-free grammars are a natural notation for this recursive structure 15

16 What is a Context-Free Grammar (CFG)? A CFG consists of A set of terminals T A set of non-terminals N A start symbol S (a non-terminal) A set of productions X Y 1 Y 2!Y n where X N and Y T N { ε} i 16

17 Notational Conventions In these lecture notes Non-terminals are written upper-case Terminals are written lower-case The start symbol is the left-hand side of the first production This is standard for CFGs 17

18 xamples of CFGs S ( S ) S ε 18

19 xamples of CFGs S ( S ) S ε What are the parts of the grammar: N =? T =? Start =? 19

20 xamples of CFGs S ( S ) S ε What are the parts of the grammar: N = { S } T = { (, ) } Start = S (the only nonterminal) 20

21 xamples of CFGs S ( S ) S ε What are the parts of the grammar: N = { S } T = { (, ) } Start = S (the only nonterminal) Productions? 21

22 xamples of CFGs A fragment of Cool: XPR if XPR then XPR else XPR fi while XPR loop XPR pool id 22

23 xamples of CFGs (cont.) Simple arithmetic expressions: + ( ) id 23

24 The Language of a CFG Read productions as rules: X Y 1!Y n Means X can be replaced by Y 1!Y n That is, in general, the right hand side can replace the left hand side. 24

25 Key Idea 1. Begin with a string consisting of the start symbol S 2. Replace any non-terminal X in the string by a the right-hand side of some production X Y 1!Y n 3. Repeat (2) until there are no non-terminals in the string So note, the string is changing over time 25

26 The Language of a CFG (Cont.) More formally, write X 1! X i! X n X 1! X i 1 Y 1!Y m X i+1! X n if there is a production X i Y 1!Y m and say that the left hand side derives the right, or can derive the right hand side, etc. This is one step of a context-free derivation. 26

27 The Language of a CFG (Cont.) Write if X 1! X n Y 1!Y m in 0 or more steps X 1! X n!! Y 1!Y m We say the left hand side rewrites in zero or more steps to the right hand side 27

28 So, in general When we write X 0 X n it is shorthand for saying that there is some sequence of individual productions (rules) that get us from X 0 to X n in zero or more steps 28

29 The Language of a CFG Let G be a context-free grammar with start symbol S. Then the language, L(G), of G is: # & \$ a 1 a n S a 1 a n and every a i is a terminal' % ( 29

30 Terminals Terminals are so-called because there are no rules for replacing them Once generated, terminals are permanent feature of the string Terminals ought to be tokens of the language 30

31 Recall earlier xample L(G) is the language of CFG G Strings of balanced parentheses {() i i i 0} Two grammars: S S ( S) ε OR S ( S) ε 31

32 Cool xample A fragment of COOL: XPR if XPR then XPR else XPR fi while XPR loop XPR pool id Recall: Non-terminals are written upper-case Terminals are written lower-case Also, could have written as three productions 32

33 Cool xample (Cont.) Some elements of the language (why?) id if id then id else id fi while id loop id pool if while id loop id pool then id else id if if id then id else id fi then id else id fi 33

34 Arithmetic xample Simple arithmetic expressions: + () id Some elements of the language: id id + id (id) id id (id) id id (id) 34

35 Notes The idea of a CFG is a big step. But: Membership in a language is yes or no ; also need parse tree of the input Must handle errors gracefully Need an implementation of CFG s (e.g., bison) 35

36 More Notes Form of the grammar is important Many grammars generate the same language Tools are sensitive to the grammar Note: Tools for regular languages (e.g., flex) are sensitive to the form of the regular expression, but this is rarely a problem in practice 36

37 Completely Off Topic, But Relevant What the heck do we mean by context-free when we use the term context-free grammar Well, it s not context-sensitive Let s look at a couple of context sensitive grammars: 37

38 Completely Off Topic, But Relevant How do we recognize CFG vs CSG? In a CFG, all productions have only a single nonterminal on the left side of any production In a CSG, some production has more than a single non-terminal on the left side of a production 38

39 Completely Off Topic, But Relevant What does this have to do with context? In a CFG, you can replace a given nonterminal, using a production, without regard to what symbols surround it (the context ) 39

40 Completely Off Topic, But Relevant What does this have to do with context? But in a CSG, context matters. For example, in the first grammar, if c preceeds a B, you can replace that with WB. But if b preceeds a B, then you replace it with bb. That is, what you can replace B with depends on the surrounding symbols (the context ) 40

41 Completely Off Topic, But Relevant What does this have to do with context? In the second CSG below, one can replace an A with the terminal a, but one cannot replace B with a unless B is preceded by A (again, context matters) 41

42 Why is this Relevant Because when we later cover some of the material in semantic analysis, we ll mention that some programming language constructs are not context-free (and thus cannot be represented by a CFG) 42

43 Derivations and Parse Trees A derivation is a sequence of productions S!!! A derivation can be drawn as a tree Start symbol is the tree s root For a production add children to node X X Y 1!Y n X Y 1!Y n Y 1 Y n 43

44 Derivation xample Grammar + () id String id id + id We wish to parse the string 44

45 Derivation xample (Cont.) + + id id id + id + id + id + * id id id parse tree (of the input string) 45

46 Derivation in Detail (1) 46

47 Derivation in Detail (2)

48 Derivation in Detail (3) * 48

49 Derivation in Detail (4) * id + id 49

50 Derivation in Detail (5) id + * id id + id id 50

51 Derivation in Detail (6) id + * id id id id + id + id id id 51

52 Some Interesting Things About Parse Trees A parse tree has Terminals at the leaves Non-terminals at the interior nodes An in-order traversal of the leaves is the original input Let s go back and take a look The parse tree shows the association of operations, the input string does not Note * binds more tightly than + because * is a subtree of the parse tree 52

53 Aside: Inorder Traversal 53

54 An Interesting Question How did I know to pick this particular parse tree for the derivation? It turns out that there is more than one 54

55 Left-most and Right-most Derivations The example we did is a left-most derivation At each step, replace the left-most non-terminal There is an equivalent notion of a right-most derivation + + id + id id + id id + id 55

56 Left-most and Right-most Derivations The example we did is a left-most derivation At each step, replace the left-most non-terminal NOT unique Depends on sequence of productions There is an equivalent notion of a right-most derivation + +id + id id + id id id + id 56

57 Right-most Derivation in Detail (1) 57

58 Right-most Derivation in Detail (2)

59 Right-most Derivation in Detail (3) id id 59

60 Right-most Derivation in Detail (4) + + +id * id + id 60

61 Right-most Derivation in Detail (5) + + +id + id * id id + id id 61

62 Right-most Derivation in Detail (6) + +id + + id * id id + id id id + id id id 62

63 Derivations and Parse Trees Note that right-most and left-most derivations have the same parse tree In this case And this is not an accident The difference is the order in which branches are added Finally, there could be other parse trees that arise from neither left-most or right-most derivation But we are most interested in left-most and rightmost 63

64 Summary of Derivations We are not just interested in whether s is in L(G) We need a parse tree for s A derivation defines a parse tree But one parse tree may have many derivations Left-most and right-most derivations are important in parser implementation 64

65 Ambiguity Grammar + () id String id id + id 65

66 Ambiguity (Cont.) This string has two parse trees + * * id id + id id id id 66

67 Ambiguity (Cont.) A grammar is ambiguous if it has more than one parse tree for some string quivalently, there is more than one right-most or left-most derivation for some string Ambiguity is BAD Leaves meaning of some programs ill-defined ffectively, you re leaving it up to the compiler to pick which of the multiple interpretations of the program is to be used 67

68 Dealing with Ambiguity There are several ways to handle ambiguity Most direct method is to rewrite grammar unambiguously + ' ' id ʹ id () ʹ () ' nforces precedence of * over + Because it forces the + to be used higher in tree

69 Ambiguity in Arithmetic xpressions Recall the grammar + * ( ) int The string int * int + int has two parse trees: + * * int int + int int int int 69

70 Ambiguity: The Dangling lse Consider the grammar if then if then else OTHR This grammar is also ambiguous Recall parse subtree for if-then-else has 3 branches predicate then clause else clause Parse subtree for if-then has 2 branches predicate then clause 70

71 The Dangling lse: xample The expression if 1 then if 2 then 3 else 4 has two parse trees if if 1 if 4 1 if if ( 1 ) then (if 2 then 3 ) else 4 if ( 1 ) then (if 2 then 3 else 4 )

72 The Dangling lse: xample The expression if 1 then if 2 then 3 else 4 has two parse trees if if 1 if 4 1 if Typically we want the second form else goes with closest if 72

73 The Dangling lse: A Fix else matches the closest unmatched then We can describe this in the grammar matched if MIF /* all then are matched */ UIF /* some then is unmatched */ MIF if then MIF else MIF OTHR UIF if then if then MIF else UIF Describes the same set of strings unmatched if 73

74 The Dangling lse: xample Revisited The expression if 1 then if 2 then 3 else 4 if if 1 if 1 if A valid parse tree (for a UIF) Not valid because the then expression is not a MIF 74

75 Ambiguity No general techniques for handling ambiguity Impossible to automatically convert an ambiguous grammar to an unambiguous one Used with care, ambiguity can simplify the grammar Sometimes allows more natural definitions We need disambiguation mechanisms 75

76 Precedence and Associativity Declarations Instead of rewriting the grammar Use the more natural (ambiguous) grammar Along with disambiguating declarations Most tools allow precedence and associativity declarations to disambiguate grammars 76

77 Precedence and Associativity Declarations Associativity: An operator is left-associative if an operand with this operator on both sides of it belongs to the left operator x. The + operator is typically left-associative, so in the expression , the 6 belongs to the first plus sign, i.e., (4+6) + 10 Similar definition for right-associative x. The assignment operator = is typically rightassociative, so a = b = c is interpreted as a = (b = c) 77

78 Precedence and Associativity Declarations What about 5 * 3 6? Not a case of associativity, since different operators on each side of 3 But, we d expect this to be (5 *3) 6, because of the usual precedence rules Tools like Bison and Java CUP allow users to specify associativity and precedence rules using, amazingly enough, precedence and associativity declarations 78

79 Associativity Declarations Consider the grammar + int Ambiguous: two parse trees of int + int + int int int + int int int int Left associativity declaration: %left + 79

80 Precedence Declarations Consider the grammar + * int And the string int + int * int * + + int int * int int int Precedence declarations: %left + %left * int 80

81 Precedence Declarations Consider the grammar + * int And the string int + int * int * + + int int * int int int Precedence declarations: %left + %left * Note that higher precedence operator is lower in list int

### Introduction to Parsing. Lecture 5

Introduction to Parsing Lecture 5 1 Outline Regular languages revisited Parser overview Context-free grammars (CFG s) Derivations Ambiguity 2 Languages and Automata Formal languages are very important

### Introduction to Parsing. Lecture 5. Professor Alex Aiken Lecture #5 (Modified by Professor Vijay Ganesh)

Introduction to Parsing Lecture 5 (Modified by Professor Vijay Ganesh) 1 Outline Regular languages revisited Parser overview Context-free grammars (CFG s) Derivations Ambiguity 2 Languages and Automata

### Outline. Regular languages revisited. Introduction to Parsing. Parser overview. Context-free grammars (CFG s) Lecture 5. Derivations.

Outline Regular languages revisited Introduction to Parsing Lecture 5 Parser overview Context-free grammars (CFG s) Derivations Prof. Aiken CS 143 Lecture 5 1 Ambiguity Prof. Aiken CS 143 Lecture 5 2 Languages

### ( ) i 0. Outline. Regular languages revisited. Introduction to Parsing. Parser overview. Context-free grammars (CFG s) Lecture 5.

Outline Regular languages revisited Introduction to Parsing Lecture 5 Parser overview Context-free grammars (CFG s) Derivations Prof. Aiken CS 143 Lecture 5 1 Ambiguity Prof. Aiken CS 143 Lecture 5 2 Languages

### Introduction to Parsing Ambiguity and Syntax Errors

Introduction to Parsing Ambiguity and Syntax rrors Outline Regular languages revisited Parser overview Context-free grammars (CFG s) Derivations Ambiguity Syntax errors Compiler Design 1 (2011) 2 Languages

### Introduction to Parsing Ambiguity and Syntax Errors

Introduction to Parsing Ambiguity and Syntax rrors Outline Regular languages revisited Parser overview Context-free grammars (CFG s) Derivations Ambiguity Syntax errors 2 Languages and Automata Formal

### Introduction to Parsing. Lecture 8

Introduction to Parsing Lecture 8 Adapted from slides by G. Necula Outline Limitations of regular languages Parser overview Context-free grammars (CFG s) Derivations Languages and Automata Formal languages

### Grammars and ambiguity. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 8 1

Grammars and ambiguity CS164 3:30-5:00 TT 10 vans 1 Overview derivations and parse trees different derivations produce may produce same parse tree ambiguous grammars what they are and how to fix them 2

### Parsing: Derivations, Ambiguity, Precedence, Associativity. Lecture 8. Professor Alex Aiken Lecture #5 (Modified by Professor Vijay Ganesh)

Parsing: Derivations, Ambiguity, Precedence, Associativity Lecture 8 (Modified by Professor Vijay Ganesh) 1 Topics covered so far Regular languages and Finite automaton Parser overview Context-free grammars

### Outline. Limitations of regular languages. Introduction to Parsing. Parser overview. Context-free grammars (CFG s)

Outline Limitations of regular languages Introduction to Parsing Parser overview Lecture 8 Adapted from slides by G. Necula Context-free grammars (CFG s) Derivations Languages and Automata Formal languages

### Intro To Parsing. Step By Step

#1 Intro To Parsing Step By Step #2 Self-Test from Last Time Are practical parsers and scanners based on deterministic or non-deterministic automata? How can regular expressions be used to specify nested

### E E+E E E (E) id. id + id E E+E. id E + E id id + E id id + id. Overview. derivations and parse trees. Grammars and ambiguity. ambiguous grammars

Overview Grammars and ambiguity CS164 3:30-5:00 TT 10 vans derivations and parse trees dferent derivations produce may produce same parse tree ambiguous grammars what they are and how to fix them 1 2 Recall:

### Context-Free Grammars

Context-Free Grammars Lecture 7 http://webwitch.dreamhost.com/grammar.girl/ Outline Scanner vs. parser Why regular expressions are not enough Grammars (context-free grammars) grammar rules derivations

### Ambiguity. Lecture 8. CS 536 Spring

Ambiguity Lecture 8 CS 536 Spring 2001 1 Announcement Reading Assignment Context-Free Grammars (Sections 4.1, 4.2) Programming Assignment 2 due Friday! Homework 1 due in a week (Wed Feb 21) not Feb 25!

### Outline. Parser overview Context-free grammars (CFG s) Derivations Syntax-Directed Translation

Outline Introduction to Parsing (adapted from CS 164 at Berkeley) Parser overview Context-free grammars (CFG s) Derivations Syntax-Directed ranslation he Functionality of the Parser Input: sequence of

### Outline. Limitations of regular languages Parser overview Context-free grammars (CFG s) Derivations Syntax-Directed Translation

Outline Introduction to Parsing Lecture 8 Adapted from slides by G. Necula and R. Bodik Limitations of regular languages Parser overview Context-free grammars (CG s) Derivations Syntax-Directed ranslation

### Programming Languages & Translators PARSING. Baishakhi Ray. Fall These slides are motivated from Prof. Alex Aiken: Compilers (Stanford)

Programming Languages & Translators PARSING Baishakhi Ray Fall 2018 These slides are motivated from Prof. Alex Aiken: Compilers (Stanford) Languages and Automata Formal languages are very important in

### Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

Ambiguity, Precedence, Associativity & Top-Down Parsing Lecture 9-10 (From slides by G. Necula & R. Bodik) 9/18/06 Prof. Hilfinger CS164 Lecture 9 1 Administrivia Please let me know if there are continued

### Compilers and computer architecture From strings to ASTs (2): context free grammars

1 / 1 Compilers and computer architecture From strings to ASTs (2): context free grammars Martin Berger October 2018 Recall the function of compilers 2 / 1 3 / 1 Recall we are discussing parsing Source

### Context-Free Grammars

CFG2: Ambiguity Context-Free Grammars CMPT 379: Compilers Instructor: Anoop Sarkar anoopsarkar.github.io/compilers-class Ambiguity - / - / ( ) - / / - 16-06-22 2 Ambiguity Grammar is ambiguous if more

### Syntax Analysis Check syntax and construct abstract syntax tree

Syntax Analysis Check syntax and construct abstract syntax tree if == = ; b 0 a b Error reporting and recovery Model using context free grammars Recognize using Push down automata/table Driven Parsers

### Compilers Course Lecture 4: Context Free Grammars

Compilers Course Lecture 4: Context Free Grammars Example: attempt to define simple arithmetic expressions using named regular expressions: num = [0-9]+ sum = expr "+" expr expr = "(" sum ")" num Appears

### Bottom-Up Parsing. Lecture 11-12

Bottom-Up Parsing Lecture 11-12 (From slides by G. Necula & R. Bodik) 9/22/06 Prof. Hilfinger CS164 Lecture 11 1 Bottom-Up Parsing Bottom-up parsing is more general than topdown parsing And just as efficient

### Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal)

Parsing Part II (Ambiguity, Top-down parsing, Left-recursion Removal) Ambiguous Grammars Definitions If a grammar has more than one leftmost derivation for a single sentential form, the grammar is ambiguous

### Bottom-Up Parsing. Lecture 11-12

Bottom-Up Parsing Lecture 11-12 (From slides by G. Necula & R. Bodik) 2/20/08 Prof. Hilfinger CS164 Lecture 11 1 Administrivia Test I during class on 10 March. 2/20/08 Prof. Hilfinger CS164 Lecture 11

### Formal Languages and Compilers Lecture V: Parse Trees and Ambiguous Gr

Formal Languages and Compilers Lecture V: Parse Trees and Ambiguous Grammars Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/

### Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part2 3.3 Parse Trees and Abstract Syntax Trees

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part2 3.3 Parse Trees and Abstract Syntax Trees 3.3.1 Parse trees 1. Derivation V.S. Structure Derivations do not uniquely represent the structure of the strings

### Ambiguity. Grammar E E + E E * E ( E ) int. The string int * int + int has two parse trees. * int

Administrivia Ambiguity, Precedence, Associativity & op-down Parsing eam assignments this evening for all those not listed as having one. HW#3 is now available, due next uesday morning (Monday is a holiday).

### CS 314 Principles of Programming Languages

CS 314 Principles of Programming Languages Lecture 5: Syntax Analysis (Parsing) Zheng (Eddy) Zhang Rutgers University January 31, 2018 Class Information Homework 1 is being graded now. The sample solution

### ([1-9] 1[0-2]):[0-5][0-9](AM PM)? What does the above match? Matches clock time, may or may not be told if it is AM or PM.

What is the corresponding regex? [2-9]: ([1-9] 1[0-2]):[0-5][0-9](AM PM)? What does the above match? Matches clock time, may or may not be told if it is AM or PM. CS 230 - Spring 2018 4-1 More CFG Notation

### CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter /15/ Hal Perkins & UW CSE C-1

CSE P 501 Compilers Parsing & Context-Free Grammars Hal Perkins Winter 2008 1/15/2008 2002-08 Hal Perkins & UW CSE C-1 Agenda for Today Parsing overview Context free grammars Ambiguous grammars Reading:

### Ambiguity and Errors Syntax-Directed Translation

Outline Ambiguity (revisited) Ambiguity and rrors Syntax-Directed Translation xtensions of CFG for parsing Precedence declarations rror handling Semantic actions Constructing a parse tree CS780(Prasad)

### Compiler Design Concepts. Syntax Analysis

Compiler Design Concepts Syntax Analysis Introduction First task is to break up the text into meaningful words called tokens. newval=oldval+12 id = id + num Token Stream Lexical Analysis Source Code (High

### COP 3402 Systems Software Syntax Analysis (Parser)

COP 3402 Systems Software Syntax Analysis (Parser) Syntax Analysis 1 Outline 1. Definition of Parsing 2. Context Free Grammars 3. Ambiguous/Unambiguous Grammars Syntax Analysis 2 Lexical and Syntax Analysis

### Optimizing Finite Automata

Optimizing Finite Automata We can improve the DFA created by MakeDeterministic. Sometimes a DFA will have more states than necessary. For every DFA there is a unique smallest equivalent DFA (fewest states

### CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Spring UW CSE P 501 Spring 2018 C-1

CSE P 501 Compilers Parsing & Context-Free Grammars Hal Perkins Spring 2018 UW CSE P 501 Spring 2018 C-1 Administrivia Project partner signup: please find a partner and fill out the signup form by noon

### LR Parsing LALR Parser Generators

Outline LR Parsing LALR Parser Generators Review of bottom-up parsing Computing the parsing DFA Using parser generators 2 Bottom-up Parsing (Review) A bottom-up parser rewrites the input string to the

### CMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters

: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Scanner Parser Static Analyzer Intermediate Representation Front End Back End Compiler / Interpreter

### Where We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars

CMSC 330: Organization of Programming Languages Context Free Grammars Where We Are Programming languages Ruby OCaml Implementing programming languages Scanner Uses regular expressions Finite automata Parser

### CSE450 Translation of Programming Languages. Lecture 4: Syntax Analysis

CSE450 Translation of Programming Languages Lecture 4: Syntax Analysis http://xkcd.com/859 Structure of a Today! Compiler Source Language Lexical Analyzer Syntax Analyzer Semantic Analyzer Int. Code Generator

### EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing Görel Hedin Revised: 2017-09-04 This lecture Regular expressions Context-free grammar Attribute grammar

### Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;}

Compiler Construction Grammars Parsing source code scanner tokens regular expressions lexical analysis Lennart Andersson parser context free grammar Revision 2012 01 23 2012 parse tree AST builder (implicit)

### Architecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End

Architecture of Compilers, Interpreters : Organization of Programming Languages ource Analyzer Optimizer Code Generator Context Free Grammars Intermediate Representation Front End Back End Compiler / Interpreter

### Lecture 8: Deterministic Bottom-Up Parsing

Lecture 8: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Fri Feb 12 13:02:57 2010 CS164: Lecture #8 1 Avoiding nondeterministic choice: LR We ve been looking at general

### A Simple Syntax-Directed Translator

Chapter 2 A Simple Syntax-Directed Translator 1-1 Introduction The analysis phase of a compiler breaks up a source program into constituent pieces and produces an internal representation for it, called

### programming languages need to be precise a regular expression is one of the following: tokens are the building blocks of programs

Chapter 2 :: Programming Language Syntax Programming Language Pragmatics Michael L. Scott Introduction programming languages need to be precise natural languages less so both form (syntax) and meaning

### LR Parsing LALR Parser Generators

LR Parsing LALR Parser Generators Outline Review of bottom-up parsing Computing the parsing DFA Using parser generators 2 Bottom-up Parsing (Review) A bottom-up parser rewrites the input string to the

### CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages Context Free Grammars and Parsing 1 Recall: Architecture of Compilers, Interpreters Source Parser Static Analyzer Intermediate Representation Front End Back

### Fall Compiler Principles Context-free Grammars Refresher. Roman Manevich Ben-Gurion University of the Negev

Fall 2016-2017 Compiler Principles Context-free Grammars Refresher Roman Manevich Ben-Gurion University of the Negev 1 xample grammar S S ; S S id := S print (L) id num + L L L, shorthand for Statement

### Properties of Regular Expressions and Finite Automata

Properties of Regular Expressions and Finite Automata Some token patterns can t be defined as regular expressions or finite automata. Consider the set of balanced brackets of the form [[[ ]]]. This set

### COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! Any questions about the syllabus?! Course Material available at www.cs.unic.ac.cy/ioanna! Next time reading assignment [ALSU07]

### Lecture 7: Deterministic Bottom-Up Parsing

Lecture 7: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Tue Sep 20 12:50:42 2011 CS164: Lecture #7 1 Avoiding nondeterministic choice: LR We ve been looking at general

### CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers Syntax Analysis These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Limits of Regular Languages Advantages of Regular Expressions

### afewadminnotes CSC324 Formal Language Theory Dealing with Ambiguity: Precedence Example Office Hours: (in BA 4237) Monday 3 4pm Wednesdays 1 2pm

afewadminnotes CSC324 Formal Language Theory Afsaneh Fazly 1 Office Hours: (in BA 4237) Monday 3 4pm Wednesdays 1 2pm January 16, 2013 There will be a lecture Friday January 18, 2013 @2pm. 1 Thanks to

### Parsing II Top-down parsing. Comp 412

COMP 412 FALL 2018 Parsing II Top-down parsing Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2018, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled

### Derivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012

Derivations vs Parses Grammar is used to derive string or construct parser Context ree Grammars A derivation is a sequence of applications of rules Starting from the start symbol S......... (sentence)

### Semantic Analysis. Lecture 9. February 7, 2018

Semantic Analysis Lecture 9 February 7, 2018 Midterm 1 Compiler Stages 12 / 14 COOL Programming 10 / 12 Regular Languages 26 / 30 Context-free Languages 17 / 21 Parsing 20 / 23 Extra Credit 4 / 6 Average

### Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5

Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5 1 Not all languages are regular So what happens to the languages which are not regular? Can we still come up with a language recognizer?

### CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

### Syntax-Directed Translation. Lecture 14

Syntax-Directed Translation Lecture 14 (adapted from slides by R. Bodik) 9/27/2006 Prof. Hilfinger, Lecture 14 1 Motivation: parser as a translator syntax-directed translation stream of tokens parser ASTs,

### Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Parsing Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students

### CMSC 330: Organization of Programming Languages. Context Free Grammars

CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

### In One Slide. Outline. LR Parsing. Table Construction

LR Parsing Table Construction #1 In One Slide An LR(1) parsing table can be constructed automatically from a CFG. An LR(1) item is a pair made up of a production and a lookahead token; it represents a

### Conflicts in LR Parsing and More LR Parsing Types

Conflicts in LR Parsing and More LR Parsing Types Lecture 10 Dr. Sean Peisert ECS 142 Spring 2009 1 Status Project 2 Due Friday, Apr. 24, 11:55pm The usual lecture time is being replaced by a discussion

### CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

### ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών Lecture 5a Syntax Analysis lias Athanasopoulos eliasathan@cs.ucy.ac.cy Syntax Analysis Συντακτική Ανάλυση Context-free Grammars (CFGs) Derivations Parse trees

### EECS 6083 Intro to Parsing Context Free Grammars

EECS 6083 Intro to Parsing Context Free Grammars Based on slides from text web site: Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. 1 Parsing sequence of tokens parser

### Syntax Analysis Part I

Syntax Analysis Part I Chapter 4: Context-Free Grammars Slides adapted from : Robert van Engelen, Florida State University Position of a Parser in the Compiler Model Source Program Lexical Analyzer Token,

### MIT Specifying Languages with Regular Expressions and Context-Free Grammars

MIT 6.035 Specifying Languages with Regular essions and Context-Free Grammars Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Language Definition Problem How to precisely

### LECTURE 3. Compiler Phases

LECTURE 3 Compiler Phases COMPILER PHASES Compilation of a program proceeds through a fixed series of phases. Each phase uses an (intermediate) form of the program produced by an earlier phase. Subsequent

### Building a Parser II. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

Building a Parser II CS164 3:30-5:00 TT 10 Evans 1 Grammars Programming language constructs have recursive structure. which is why our hand-written parser had this structure, too An expression is either:

### CSE 401 Midterm Exam Sample Solution 2/11/15

Question 1. (10 points) Regular expression warmup. For regular expression questions, you must restrict yourself to the basic regular expression operations covered in class and on homework assignments:

### Context-Free Languages and Parse Trees

Context-Free Languages and Parse Trees Mridul Aanjaneya Stanford University July 12, 2012 Mridul Aanjaneya Automata Theory 1/ 41 Context-Free Grammars A context-free grammar is a notation for describing

### Introduction to Lexing and Parsing

Introduction to Lexing and Parsing ECE 351: Compilers Jon Eyolfson University of Waterloo June 18, 2012 1 Riddle Me This, Riddle Me That What is a compiler? 1 Riddle Me This, Riddle Me That What is a compiler?

### Defining syntax using CFGs

Defining syntax using CFGs Roadmap Last time Defined context-free grammar This time CFGs for specifying a language s syntax Language membership List grammars Resolving ambiguity CFG Review G = (N,Σ,P,S)

### EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

EDAN65: Compilers, Lecture 06 A LR parsing Görel Hedin Revised: 2017-09-11 This lecture Regular expressions Context-free grammar Attribute grammar Lexical analyzer (scanner) Syntactic analyzer (parser)

### CMSC 330: Organization of Programming Languages. Context Free Grammars

CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

### Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Parsing III (Top-down parsing: recursive descent & LL(1) ) (Bottom-up parsing) CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones Copyright 2003, Keith D. Cooper,

### Announcements. Written Assignment 1 out, due Friday, July 6th at 5PM.

Syntax Analysis Announcements Written Assignment 1 out, due Friday, July 6th at 5PM. xplore the theoretical aspects of scanning. See the limits of maximal-munch scanning. Class mailing list: There is an

### MIT Specifying Languages with Regular Expressions and Context-Free Grammars. Martin Rinard Massachusetts Institute of Technology

MIT 6.035 Specifying Languages with Regular essions and Context-Free Grammars Martin Rinard Massachusetts Institute of Technology Language Definition Problem How to precisely define language Layered structure

### CS2210: Compiler Construction Syntax Analysis Syntax Analysis

Comparison with Lexical Analysis The second phase of compilation Phase Input Output Lexer string of characters string of tokens Parser string of tokens Parse tree/ast What Parse Tree? CS2210: Compiler

### Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Roadmap > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing The role of the parser > performs context-free syntax analysis > guides

### Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

More detailed overview of compiler front end Structure of a compiler Today we ll take a quick look at typical parts of a compiler. This is to give a feeling for the overall structure. source program lexical

### Administrativia. PA2 assigned today. WA1 assigned today. Building a Parser II. CS164 3:30-5:00 TT 10 Evans. First midterm. Grammars.

Administrativia Building a Parser II CS164 3:30-5:00 TT 10 Evans PA2 assigned today due in 12 days WA1 assigned today due in a week it s a practice for the exam First midterm Oct 5 will contain some project-inspired

### 2.2 Syntax Definition

42 CHAPTER 2. A SIMPLE SYNTAX-DIRECTED TRANSLATOR sequence of "three-address" instructions; a more complete example appears in Fig. 2.2. This form of intermediate code takes its name from instructions

### CSCI312 Principles of Programming Languages!

CSCI312 Principles of Programming Languages!! Chapter 3 Regular Expression and Lexer Xu Liu Recap! Copyright 2006 The McGraw-Hill Companies, Inc. Clite: Lexical Syntax! Input: a stream of characters from

### Formal Languages and Grammars. Chapter 2: Sections 2.1 and 2.2

Formal Languages and Grammars Chapter 2: Sections 2.1 and 2.2 Formal Languages Basis for the design and implementation of programming languages Alphabet: finite set Σ of symbols String: finite sequence

### Lecture 4: Syntax Specification

The University of North Carolina at Chapel Hill Spring 2002 Lecture 4: Syntax Specification Jan 16 1 Phases of Compilation 2 1 Syntax Analysis Syntax: Webster s definition: 1 a : the way in which linguistic

### Context-Free Grammars

Context-Free Grammars 1 Informal Comments A context-free grammar is a notation for describing languages. It is more powerful than finite automata or RE s, but still cannot define all possible languages.

### Theoretical Part. Chapter one:- - What are the Phases of compiler? Answer:

Theoretical Part Chapter one:- - What are the Phases of compiler? Six phases Scanner Parser Semantic Analyzer Source code optimizer Code generator Target Code Optimizer Three auxiliary components Literal

### CS 406/534 Compiler Construction Parsing Part I

CS 406/534 Compiler Construction Parsing Part I Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy and Dr.

### CMSC 330: Organization of Programming Languages. Context-Free Grammars Ambiguity

CMSC 330: Organization of Programming Languages Context-Free Grammars Ambiguity Review Why should we study CFGs? What are the four parts of a CFG? How do we tell if a string is accepted by a CFG? What

### CSE 3302 Programming Languages Lecture 2: Syntax

CSE 3302 Programming Languages Lecture 2: Syntax (based on slides by Chengkai Li) Leonidas Fegaras University of Texas at Arlington CSE 3302 L2 Spring 2011 1 How do we define a PL? Specifying a PL: Syntax:

### CIT Lecture 5 Context-Free Grammars and Parsing 4/2/2003 1

CIT3136 - Lecture 5 Context-Free Grammars and Parsing 4/2/2003 1 Definition of a Context-free Grammar: An alphabet or set of basic symbols (like regular expressions, only now the symbols are whole tokens,

### Topic 5: Syntax Analysis III

Topic 5: Syntax Analysis III Compiler Design Prof. Hanjun Kim CoreLab (Compiler Research Lab) POSTECH 1 Back-End Front-End The Front End Source Program Lexical Analysis Syntax Analysis Semantic Analysis

### CSCI312 Principles of Programming Languages!

CSCI312 Principles of Programming Languages! Chapter 2 Syntax! Xu Liu Review! Principles of PL syntax, naming, types, semantics Paradigms of PL design imperative, OO, functional, logic What makes a successful

### LR Parsing. Table Construction

#1 LR Parsing Table Construction #2 Outline Review of bottom-up parsing Computing the parsing DFA Closures, LR(1) Items, States Transitions Using parser generators Handling Conflicts #3 In One Slide An

### syntax tree - * * * - * * * * * 2 1 * * 2 * (2 * 1) - (1 + 0)

0//7 xpression rees rom last time: we can draw a syntax tree for the Java expression ( 0). 0 ASS, GRAMMARS, PARSING, R RAVRSALS Lecture 3 CS0 all 07 Preorder, Postorder, and Inorder Preorder, Postorder,