EDA180: Compiler Construction. Context-free grammars. Görel Hedin. Revised:


2 Compiler phases and program representations
Analysis:
  source code -> Lexical analysis (scanning) -> tokens
  tokens -> Syntactic analysis (parsing) -> AST
  AST -> Semantic analysis -> Attributed AST
Synthesis:
  Attributed AST -> Intermediate code generation -> intermediate code
  intermediate code -> Optimization -> intermediate code
  intermediate code -> Machine code generation -> machine code
3 A closer look at the parser
  text -> Scanner -> tokens -> Parser -> AST
The scanner is defined by regular expressions.
The parser has two parts:
  Pure parsing, defined by a context-free grammar (this lecture). Its result is the concrete parse tree (implicit).
  AST building, defined by an abstract grammar. Its result is the AST.
The concrete parse tree is never constructed. The parser builds the AST directly using semantic actions.
4 Regular Expressions vs Context-Free Grammars
Example RE:
  ID = [a-z][a-z0-9]*
Example CFG:
  Stmt -> WhileStmt | AssignStmt
  WhileStmt -> WHILE LPAR Exp RPAR Stmt
An RE can have iteration.
A CFG can also have recursion (it is possible to derive a symbol, e.g., Stmt, from itself).
5 Elements of a Context-Free Grammar
Example CFG:
  Stmt -> WhileStmt | AssignStmt
  WhileStmt -> WHILE LPAR Exp RPAR Stmt
Nonterminal symbols (e.g., Stmt, WhileStmt)
Terminal symbols (tokens) (e.g., WHILE, LPAR)
Production rules: N -> s1 s2 ... sn, where each sk is a symbol (terminal or nonterminal)
Start symbol (one of the nonterminals, usually the first one)
6 Exercise
Construct a grammar covering the following program.
Example program:
  while (k <= n) {sum = sum + k; k = k+1;}
CFG:
  Stmt -> WhileStmt | AssignStmt | CompoundStmt
  WhileStmt -> "while" "(" Exp ")" Stmt
  AssignStmt -> ID "=" Exp
  CompoundStmt ->
  Exp ->
  LessEq ->
  Add ->
Usually, simple tokens are written directly as text strings.
7 Solution
Construct a grammar covering the following program.
Example program:
  while (k <= n) {sum = sum + k; k = k+1;}
CFG:
  Stmt -> WhileStmt | AssignStmt | CompoundStmt
  WhileStmt -> "while" "(" Exp ")" Stmt
  AssignStmt -> ID "=" Exp
  CompoundStmt -> "{" (Stmt ";")* "}"
  Exp -> LessEq | Add | ID
  LessEq -> Exp "<=" Exp
  Add -> Exp "+" Exp
8 Real world example: The Java Language Specification
CompilationUnit:
  PackageDeclaration_opt ImportDeclarations_opt TypeDeclarations_opt
ImportDeclarations:
  ImportDeclaration
  ImportDeclarations ImportDeclaration
TypeDeclarations:
  TypeDeclaration
  TypeDeclarations TypeDeclaration
PackageDeclaration:
  Annotations_opt package PackageName ;
Take a look at Chapter 2 about the Java grammar definitions, and look at some parts of the specification.
9 Parsing
Use the grammar to derive a tree for a program.
Example program: sum = sum + k
Figure: the start symbol Stmt at the top and the tokens sum = sum + k at the bottom; the tree in between is to be derived.
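10 Parse tree
Use the grammar to derive a parse tree for a program.
Example program: sum = sum + k
Figure: parse tree with the start symbol Stmt at the root, the nonterminals AssignStmt, Exp and Add below it, and the tokens sum = sum + k at the bottom.
Nonterminals are inner nodes.
Terminals are leaves.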
11 Corresponding abstract syntax tree (will be discussed in Lecture 5)
Example program: sum = sum + k
Figure: abstract syntax tree with AssignStmt at the root, Add below it, and three Id nodes for sum, sum and k.
12 Forms of CFGs
EBNF:
  Stmt -> AssignStmt | CompoundStmt
  AssignStmt -> ID "=" Exp
  CompoundStmt -> "{" (Stmt ";")* "}"
  Exp -> Add | ID
  Add -> Exp "+" Exp
Canonical form:
  Stmt -> ID "=" Exp
  Stmt -> "{" Stmts "}"
  Stmts -> ε
  Stmts -> Stmt ";" Stmts
  Exp -> Exp "+" Exp
  Exp -> ID
Extended Backus-Naur Form:
  Compact, easy to read and write
  BNF has alternatives
  EBNF also has repetition, optionals, parentheses
  Common notation for practical use
  EBNF is used in JavaCC
  BNF is often used in LR parser generators
Canonical form:
  Core formalism for CFGs
  Replaces repetition with recursion
  Replaces alternatives, optionals, parentheses with several production rules
  Useful for proving properties and explaining algorithms
13 Formal definition of CFGs (canonical form)
A context-free grammar G = (N, T, P, S), where
  N is the set of nonterminal symbols
  T is the set of terminal symbols
  P is the set of production rules, each with the form X -> Y1 Y2 ... Yn, where X ∈ N and Yk ∈ N ∪ T
  S is the start symbol (one of the nonterminals), i.e., S ∈ N
So, the left-hand side X of a rule is a nonterminal, and the right-hand side Y1 Y2 ... Yn is a sequence of nonterminals and terminals.
If the right-hand side of a production is empty, i.e., n = 0, we write X -> ε
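The definition above can be written down as a small data structure. The following is a minimal sketch, not part of the lecture: the names Grammar and is_well_formed are made up for illustration, and the check that N and T are disjoint is a customary assumption that the slide does not state explicitly.

```python
from typing import NamedTuple


class Grammar(NamedTuple):
    """G = (N, T, P, S) in canonical form (hypothetical helper type)."""
    nonterminals: frozenset
    terminals: frozenset
    productions: tuple  # pairs (lhs, rhs); rhs is a tuple of symbols, () means ε
    start: str


def is_well_formed(g):
    """Check the side conditions of the definition G = (N, T, P, S)."""
    symbols = g.nonterminals | g.terminals
    return (
        g.nonterminals.isdisjoint(g.terminals)  # assumption, see lead-in
        and g.start in g.nonterminals           # S ∈ N
        and all(lhs in g.nonterminals for lhs, _ in g.productions)  # X ∈ N
        and all(sym in symbols for _, rhs in g.productions for sym in rhs)  # Yk ∈ N ∪ T
    )


# The statement grammar used in the lecture, in canonical form.
example = Grammar(
    nonterminals=frozenset({"Stmt", "Stmts", "Exp"}),
    terminals=frozenset({"ID", "=", "{", "}", ";", "+"}),
    productions=(
        ("Stmt", ("ID", "=", "Exp")),
        ("Stmt", ("{", "Stmts", "}")),
        ("Stmts", ()),
        ("Stmts", ("Stmt", ";", "Stmts")),
        ("Exp", ("Exp", "+", "Exp")),
        ("Exp", ("ID",)),
    ),
    start="Stmt",
)

print(is_well_formed(example))  # True
```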
14 A grammar G defines a language L(G)
A context-free grammar G = (N, T, P, S), where
  N is the set of nonterminal symbols
  T is the set of terminal symbols
  P is the set of production rules, each with the form X -> Y1 Y2 ... Yn, where X ∈ N and Yk ∈ N ∪ T
  S is the start symbol (one of the nonterminals), i.e., S ∈ N
G defines a language L(G) over the alphabet T.
T* is the set of all possible sequences of T symbols.
L(G) is the subset of T* that can be derived from the start symbol S, by following the production rules P.
15 Exercise
G = (N, T, P, S)
P = {
  Stmt -> ID "=" Exp,
  Stmt -> "{" Stmts "}",
  Stmts -> ε,
  Stmts -> Stmt ";" Stmts,
  Exp -> Exp "+" Exp,
  Exp -> ID
}
N = { }
T = { }
S =
L(G) = { }
16 Solution
G = (N, T, P, S)
P = {
  Stmt -> ID "=" Exp,
  Stmt -> "{" Stmts "}",
  Stmts -> ε,
  Stmts -> Stmt ";" Stmts,
  Exp -> Exp "+" Exp,
  Exp -> ID
}
N = {Stmt, Exp, Stmts}
T = {ID, "=", "{", "}", ";", "+"}
S = Stmt
L(G) = {
  ID "=" ID,
  ID "=" ID "+" ID,
  ID "=" ID "+" ID "+" ID,
  "{" "}",
  "{" ID "=" ID ";" "}",
  "{" ID "=" ID ";" "{" "}" ";" "}",
  ...
}
The sequences in L(G) are usually called sentences or strings.
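Since L(G) is infinite, a program can only list its sentences up to some bound. The sketch below is an illustration under assumptions, not the lecture's method: it searches breadth-first over sentential forms, always expanding the leftmost nonterminal, and prunes forms with more than max_len terminals. That pruning terminates for this grammar because every non-empty production adds at least one terminal. The names PRODUCTIONS and sentences are hypothetical.

```python
from collections import deque

# Canonical-form productions of the statement grammar; [] stands for ε.
PRODUCTIONS = {
    "Stmt": [["ID", "=", "Exp"], ["{", "Stmts", "}"]],
    "Stmts": [[], ["Stmt", ";", "Stmts"]],
    "Exp": [["Exp", "+", "Exp"], ["ID"]],
}


def sentences(start, max_len):
    """Return all sentences with at most max_len tokens, shortest first."""
    found = set()
    seen = {(start,)}
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal, if there is one.
        pos = next((i for i, s in enumerate(form) if s in PRODUCTIONS), None)
        if pos is None:
            found.add(" ".join(form))  # only terminals left: a sentence
            continue
        for rhs in PRODUCTIONS[form[pos]]:
            new = form[:pos] + tuple(rhs) + form[pos + 1:]
            terminals = sum(1 for s in new if s not in PRODUCTIONS)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=lambda s: (len(s.split()), s))


for sentence in sentences("Stmt", 5):
    print(sentence)
```

With a bound of 5 tokens this lists { }, ID = ID, ID = ID + ID and { { } ; }, in that order. Raising the bound yields longer sentences.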
17 Derivation step
If we have a sequence of terminals and nonterminals, e.g.,
  X a Y Y b
we can replace one of the nonterminals, applying a production rule. This is called a derivation step. (Swedish: härledningssteg)
Suppose there is a production
  Y -> X a
and we apply it to the first Y in the sequence. We write the derivation step as follows:
  X a Y Y b => X a X a Y b
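A derivation step is a plain list operation: remove one nonterminal and splice in the right-hand side of a production. A minimal sketch, with derivation_step as a made-up helper name:

```python
def derivation_step(form, index, production):
    """Replace the nonterminal at position `index` using `production`.

    `form` is a list of symbols and `production` is a pair (lhs, rhs).
    """
    lhs, rhs = production
    if form[index] != lhs:
        raise ValueError(f"symbol at {index} is {form[index]!r}, not {lhs!r}")
    return form[:index] + list(rhs) + form[index + 1:]


# The example from the slide: apply Y -> X a to the first Y (position 2).
form = ["X", "a", "Y", "Y", "b"]
result = derivation_step(form, 2, ("Y", ["X", "a"]))
print(" ".join(form), "=>", " ".join(result))  # X a Y Y b => X a X a Y b
```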
18 Derivation
A derivation is simply a sequence of derivation steps, e.g.:
  γ1 => γ2 => ... => γn (n ≥ 0)
where each γi is a sequence of terminals and nonterminals.
If there is a derivation from γ1 to γn, we can write this as
  γ1 =>* γn
This means that it is possible to get from the sequence γ1 to the sequence γn by following the production rules.
19 Definition of the language L(G)
Recall that:
  G = (N, T, P, S)
  T* is the set of all possible sequences of T symbols.
  L(G) is the subset of T* that can be derived from the start symbol S, by following the production rules P.
Using the concept of derivations, we can formally define L(G) as follows:
  L(G) = { w ∈ T* | S =>* w }
20 Exercise: Prove that a sentence belongs to a language
Prove that
  INT + INT * INT
belongs to the language of the following grammar:
  p1: Exp -> Exp "+" Exp
  p2: Exp -> Exp "*" Exp
  p3: Exp -> INT
Proof (by showing all the derivation steps from the start symbol Exp):
  Exp =>
21 Solution: Prove that a sentence belongs to a language
Prove that
  INT + INT * INT
belongs to the language of the following grammar:
  p1: Exp -> Exp "+" Exp
  p2: Exp -> Exp "*" Exp
  p3: Exp -> INT
Proof (by showing all the derivation steps from the start symbol Exp):
  Exp
  => Exp "+" Exp            (p1)
  => INT "+" Exp            (p3)
  => INT "+" Exp "*" Exp    (p2)
  => INT "+" INT "*" Exp    (p3)
  => INT "+" INT "*" INT    (p3)
22 Leftmost and rightmost derivations
In a leftmost derivation, the leftmost nonterminal is replaced in each derivation step, e.g.:
  Exp
  => Exp "+" Exp
  => INT "+" Exp
  => INT "+" Exp "*" Exp
  => INT "+" INT "*" Exp
  => INT "+" INT "*" INT
In a rightmost derivation, the rightmost nonterminal is replaced in each derivation step, e.g.:
  Exp
  => Exp "+" Exp
  => Exp "+" Exp "*" Exp
  => Exp "+" Exp "*" INT
  => Exp "+" INT "*" INT
  => INT "+" INT "*" INT
LL parsing algorithms use leftmost derivation. LR parsing algorithms use rightmost derivation. This will be discussed in later lectures.
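The two derivation orders can be replayed mechanically. The sketch below is illustrative only: it takes a list of production names (p1, p2, p3 as in the earlier exercise) and applies each one to the leftmost or the rightmost nonterminal. The function name derive and the token spellings without quotes are assumptions made for brevity.

```python
PRODUCTIONS = {
    "p1": ("Exp", ["Exp", "+", "Exp"]),
    "p2": ("Exp", ["Exp", "*", "Exp"]),
    "p3": ("Exp", ["INT"]),
}
NONTERMINALS = {"Exp"}


def derive(start, rule_names, leftmost=True):
    """Apply the named productions in order, always to the leftmost
    (or rightmost) nonterminal, and return every sentential form."""
    form = [start]
    forms = [" ".join(form)]
    for name in rule_names:
        lhs, rhs = PRODUCTIONS[name]
        positions = [i for i, s in enumerate(form) if s in NONTERMINALS]
        if not positions:
            raise ValueError("no nonterminal left to replace")
        i = positions[0] if leftmost else positions[-1]
        if form[i] != lhs:
            raise ValueError(f"{name} does not apply to {form[i]}")
        form = form[:i] + rhs + form[i + 1:]
        forms.append(" ".join(form))
    return forms


left = derive("Exp", ["p1", "p3", "p2", "p3", "p3"], leftmost=True)
right = derive("Exp", ["p1", "p2", "p3", "p3", "p3"], leftmost=False)
print(" => ".join(left))
print(" => ".join(right))
```

Both calls end in INT + INT * INT, but the intermediate sentential forms differ, as in the two derivations on the slide.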
23 A derivation corresponds to building a parse tree
Grammar:
  Exp -> Exp "+" Exp
  Exp -> Exp "*" Exp
  Exp -> INT
Example derivation:
  Exp
  => Exp "+" Exp
  => INT "+" Exp
  => INT "+" Exp "*" Exp
  => INT "+" INT "*" Exp
  => INT "+" INT "*" INT
Exercise: build the parse tree (also called derivation tree).
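24 A derivation corresponds to building a parse tree
Grammar:
  Exp -> Exp "+" Exp
  Exp -> Exp "*" Exp
  Exp -> INT
Example derivation:
  Exp
  => Exp "+" Exp
  => INT "+" Exp
  => INT "+" Exp "*" Exp
  => INT "+" INT "*" Exp
  => INT "+" INT "*" INT
Parse tree (in bracket form):
  Exp( Exp(INT) "+" Exp( Exp(INT) "*" Exp(INT) ) )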
25 Exercise: Can we do another derivation of the same sentence, that gives a different parse tree?
Grammar:
  Exp -> Exp "+" Exp
  Exp -> Exp "*" Exp
  Exp -> INT
Another derivation:
  Exp =>
Parse tree:
26 Solution: Can we do another derivation of the same sentence, that gives a different parse tree?
Grammar:
  Exp -> Exp "+" Exp
  Exp -> Exp "*" Exp
  Exp -> INT
Another derivation:
  Exp
  => Exp "*" Exp
  => Exp "+" Exp "*" Exp
  => INT "+" Exp "*" Exp
  => INT "+" INT "*" Exp
  => INT "+" INT "*" INT
Parse tree (in bracket form):
  Exp( Exp( Exp(INT) "+" Exp(INT) ) "*" Exp(INT) )
Which parse tree would we prefer?
27 Ambiguous context-free grammars
A CFG is ambiguous if a sentence in the language can be derived by two (or more) different parse trees.
A CFG is unambiguous if each sentence in the language can be derived by only one parse tree.
(Swedish: tvetydig, otvetydig)
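For one particular grammar, ambiguity of a given sentence can be demonstrated by counting its parse trees. The sketch below is hard-wired to the expression grammar Exp -> Exp "+" Exp, Exp -> Exp "*" Exp, Exp -> INT and is not a general ambiguity test; count_parse_trees is a hypothetical name. It tries every operator position as the root of the tree and multiplies the counts for the two operands.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count the parse trees of `tokens` for the grammar
    Exp -> Exp "+" Exp | Exp "*" Exp | INT."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways to derive tokens[i:j] from Exp.
        if j - i == 1:
            return 1 if tokens[i] == "INT" else 0
        total = 0
        for k in range(i + 1, j - 1):  # candidate position of the root operator
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("INT + INT * INT".split()))  # 2
```

A count above 1 for some sentence shows that the grammar is ambiguous. A count of 1 for the sentences tried says nothing about the grammar as a whole.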
28 How can we know if a CFG is ambiguous?
If we find an example of an ambiguity, we know the grammar is ambiguous.
There are algorithms for deciding if a CFG belongs to certain subsets of CFGs, e.g. LL, LR, etc. (See later lectures.) These grammars are unambiguous.
But in the general case, the problem is undecidable: it is not possible to construct a general algorithm that decides ambiguity for any CFG.
29 Strategies for eliminating ambiguities
We should try to eliminate ambiguities, without changing the language.
First, decide which parse tree is the desired one.
Create an equivalent unambiguous grammar from which only the desired parse trees can be derived.
Or, use additional priority and associativity rules to instruct the parser to derive the "right" parse tree. (Works for some ambiguities.)
30 Equivalent grammars
Two grammars, G1 and G2, are equivalent if they generate the same language.
I.e., each sentence that can be derived in one of the grammars can also be derived in the other grammar:
  L(G1) = L(G2)
31 Example ambiguity: Priority (also called precedence)
  Exp -> Exp "+" Exp
  Exp -> Exp "*" Exp
  Exp -> INT
Two parse trees for INT "+" INT "*" INT:
  Root "+", with "*" in its right subtree: prio("*") > prio("+") (according to tradition)
  Root "*", with "+" in its left subtree: prio("+") > prio("*") (would be unexpected and confusing)
32 Example ambiguity: Associativity
  Exp -> Exp "+" Exp
  Exp -> Exp "-" Exp
  Exp -> Exp "**" Exp
  Exp -> INT
For operators with the same priority, how do several in a sequence associate?
Left-associative (usual for most operators): INT "-" INT "+" INT is parsed as (INT "-" INT) "+" INT.
Right-associative (usual for the power operator): INT "**" INT "**" INT is parsed as INT "**" (INT "**" INT).
33 Example ambiguity: Non-associativity
  Exp -> Exp "<" Exp
  Exp -> INT
For some operators, it does not make sense to have several in a sequence at all. They are non-associative.
Figure: the two possible parse trees for INT "<" INT "<" INT.
We would like to forbid both trees, i.e., rule out the sentence from the language.
34 Disambiguating expression grammars
How can we change the grammar so that only the desired trees can be derived?
Idea: Restrict certain subtrees by introducing new nonterminals.
Priority: Introduce a new nonterminal for each priority level: Term, Factor, Primary, ...
Left associativity: Restrict the right operand so that it can only contain expressions of higher priority.
Right associativity: ...
35 Exercise
Ambiguous grammar:
  Expr -> Expr "+" Expr
  Expr -> Expr "*" Expr
  Expr -> INT
  Expr -> "(" Expr ")"
Equivalent unambiguous grammar:
36 Solution
Ambiguous grammar:
  Expr -> Expr "+" Expr
  Expr -> Expr "*" Expr
  Expr -> INT
  Expr -> "(" Expr ")"
Equivalent unambiguous grammar:
  Expr -> Expr "+" Term
  Expr -> Term
  Term -> Term "*" Factor
  Term -> Factor
  Factor -> INT
  Factor -> "(" Expr ")"
Here, we introduce a new nonterminal, Term, that is more restricted than Expr. That is, from Term, we cannot derive any new additions.
We use Term as the right operand in the addition production, to make sure no new additions will appear to the right. This gives left associativity.
We use Term, and the even more restricted nonterminal Factor, in the multiplication production, to make sure no additions can appear there, without using parentheses. This gives multiplication higher priority than addition.
37 Writing the grammar in different notations
Canonical form:
  Expr -> Expr "+" Term
  Expr -> Term
  Term -> Term "*" Factor
  Term -> Factor
  Factor -> INT
  Factor -> "(" Expr ")"
Equivalent BNF (Backus-Naur Form):
  Use alternatives instead of several productions per nonterminal.
Equivalent EBNF (Extended BNF):
  Use repetition instead of recursion, where possible.
38 Writing the grammar in different notations
Canonical form:
  Expr -> Expr "+" Term
  Expr -> Term
  Term -> Term "*" Factor
  Term -> Factor
  Factor -> INT
  Factor -> "(" Expr ")"
Equivalent BNF (Backus-Naur Form). Use alternatives instead of several productions per nonterminal:
  Expr -> Expr "+" Term | Term
  Term -> Term "*" Factor | Factor
  Factor -> INT | "(" Expr ")"
Equivalent EBNF (Extended BNF). Use repetition instead of recursion, where possible:
  Expr -> Term ("+" Term)*
  Term -> Factor ("*" Factor)*
  Factor -> INT | "(" Expr ")"
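The EBNF form maps naturally onto a hand-written parser with one function per nonterminal and one loop per repetition. The following is only a sketch to show how the grammar structure yields priority and left associativity; it is not the parser construction taught in the course, and the nested-tuple tree shape and the name parse are assumptions made for illustration.

```python
def parse(tokens):
    """Parse a token list and return a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, found {peek()!r}")
        pos += 1

    def expr():  # Expr -> Term ("+" Term)*
        tree = term()
        while peek() == "+":
            expect("+")
            tree = ("+", tree, term())  # fold to the left: left associativity
        return tree

    def term():  # Term -> Factor ("*" Factor)*
        tree = factor()
        while peek() == "*":
            expect("*")
            tree = ("*", tree, factor())
        return tree

    def factor():  # Factor -> INT | "(" Expr ")"
        if peek() == "(":
            expect("(")
            tree = expr()
            expect(")")
            return tree
        expect("INT")
        return "INT"

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse("INT + INT * INT".split()))  # ('+', 'INT', ('*', 'INT', 'INT'))
print(parse("INT + INT + INT".split()))  # ('+', ('+', 'INT', 'INT'), 'INT')
```

Multiplication ends up deeper in the tree than addition because term() is called from expr(), which mirrors the priority levels in the grammar.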
39 Translating EBNF to Canonical form
                        EBNF                   Equivalent canonical form
Top-level repetition    X -> γ1 γ2* γ3
Top-level alternative   X -> γ1 | γ2
Top-level parentheses   X -> γ1 (...) γ2
Where γk is a sequence of terminals and nonterminals.
40 Translating EBNF to Canonical form
                        EBNF                   Equivalent canonical form
Top-level repetition    X -> γ1 γ2* γ3         X -> γ1 N γ3
                                               N -> γ2 N
                                               N -> ε
Top-level alternative   X -> γ1 | γ2           X -> γ1
                                               X -> γ2
Top-level parentheses   X -> γ1 (...) γ2       X -> γ1 N γ2
                                               N -> ...
41 Exercise: Translate from EBNF to Canonical form
EBNF:
  Expr -> Term ("+" Term)*
Equivalent canonical form:
42 Solution: Translate from EBNF to Canonical form
EBNF:
  Expr -> Term ("+" Term)*
Equivalent canonical form:
  Expr -> Term N
  N -> "+" Term N
  N -> ε
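The rule for a top-level repetition, X -> γ1 γ2* γ3, is mechanical enough to write as a function. A minimal sketch, with eliminate_repetition as a made-up name, lists of symbols standing for γ1, γ2 and γ3, and the caller supplying the name of the new nonterminal:

```python
def eliminate_repetition(lhs, before, repeated, after, fresh):
    """Translate  lhs -> before (repeated)* after  into canonical form.

    `fresh` must be a nonterminal name not used elsewhere in the grammar.
    """
    return [
        (lhs, before + [fresh] + after),  # X -> γ1 N γ3
        (fresh, repeated + [fresh]),      # N -> γ2 N
        (fresh, []),                      # N -> ε (empty right-hand side)
    ]


# Expr -> Term ("+" Term)*
for lhs, rhs in eliminate_repetition("Expr", ["Term"], ["+", "Term"], [], fresh="N"):
    print(lhs, "->", " ".join(rhs) or "ε")
```

For the exercise grammar this prints the three productions Expr -> Term N, N -> + Term N and N -> ε.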
43 Can we show that these are equivalent?
EBNF:
  Expr -> Term ("+" Term)*
Equivalent canonical form (trivial):
  Expr -> Term N
  N -> "+" Term N
  N -> ε
Alternative equivalent canonical form (non-trivial):
  Expr -> Expr "+" Term
  Expr -> Term
44 Example proof
1. We start with this:
  Expr -> Term ("+" Term)*
2. We can move the repetition:
  Expr -> (Term "+")* Term
3. Introduce a nonterminal N:
  Expr -> N Term
  N -> (Term "+")*
4. Eliminate the repetition:
  Expr -> N Term
  N -> N Term "+"
  N -> ε
5. Replace N Term by Expr in the second production:
  Expr -> N Term
  N -> Expr "+"
  N -> ε
6. Eliminate N:
  Expr -> Expr "+" Term
  Expr -> Term
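A bounded comparison can serve as a sanity check alongside such a proof, although it is not a proof itself. The sketch below enumerates all sentences up to a length bound for the right-recursive and the left-recursive canonical forms and compares the two sets. It is an illustration under assumptions: Term is treated as a terminal to keep the example small, the names language, RIGHT and LEFT are made up, and the pruning relies on every non-empty production adding at least one terminal.

```python
from collections import deque


def language(productions, start, max_len):
    """All sentences of at most max_len tokens derivable from start.

    Symbols that are not keys of `productions` count as terminals.
    """
    found, seen, queue = set(), {(start,)}, deque([(start,)])
    while queue:
        form = queue.popleft()
        pos = next((i for i, s in enumerate(form) if s in productions), None)
        if pos is None:
            found.add(form)
            continue
        for rhs in productions[form[pos]]:
            new = form[:pos] + tuple(rhs) + form[pos + 1:]
            if sum(s not in productions for s in new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found


# Expr -> Term N,  N -> "+" Term N,  N -> ε
RIGHT = {"Expr": [["Term", "N"]], "N": [["+", "Term", "N"], []]}
# Expr -> Expr "+" Term,  Expr -> Term
LEFT = {"Expr": [["Expr", "+", "Term"], ["Term"]]}

same = language(RIGHT, "Expr", 9) == language(LEFT, "Expr", 9)
print(same)  # True
```

Agreement up to a bound only rules out small counterexamples.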
45 Equivalence of grammars
Given two context-free grammars, G1 and G2, are they equivalent? I.e., is L(G1) = L(G2)?
Undecidable problem: a general algorithm cannot be constructed.
We need to rely on our ingenuity to find out. (In the general case.)
46 Regular Expressions vs Context-Free Grammars
              RE                          CFG
Alphabet      characters                  terminal symbols (tokens)
Language      strings (char sequences)    sentences (token sequences)
Used for      tokens                      parse trees
Power         iteration                   recursion
Recognizer    DFA                         NFA with stack
47 The Chomsky grammar hierarchy
Grammar              Rule patterns                     Type
regular              X -> aY or X -> a or X -> ε       3
context-free         X -> γ                            2
context-sensitive    α X β -> α γ β                    1
arbitrary            γ -> δ                            0
Type(3) ⊂ Type(2) ⊂ Type(1) ⊂ Type(0)
Regular grammars have the same power as regular expressions (tail recursion = iteration).
Type 2 and 3 are of practical use in compiler construction.
Type 0 and 1 are only of theoretical interest.
48 Summary questions
Define a small example language with a CFG.
What is a nonterminal symbol? A terminal symbol? A production? A start symbol? A parse tree?
What is a left-hand side of a production? A right-hand side?
Given a grammar G, what is meant by the language L(G)?
What is a derivation step? A derivation? A leftmost derivation?
How does a derivation correspond to a parse tree?
When is a grammar ambiguous? Unambiguous?
How can ambiguities for expressions be removed?
When are two grammars equivalent?
When should we use canonical form, and when EBNF?
Translate an EBNF grammar to canonical form.
Explain why context-free grammars are more powerful than regular expressions.
What is the Chomsky hierarchy?
49 Readings
F3: Context-free grammars, derivations, ambiguity, EBNF
  Appel, chapter
  Appel, relevant exercises:
  Try to solve the problems in Seminar 2.
F4: Predictive parsing. Recursive descent. LL grammars and parsing. Left recursion and factorization.
  Appel, chapter
EDA180: Compiler Construc6on. Top down parsing. Görel Hedin Revised: a
EDA180: Compiler Construc6on Top down parsing Görel Hedin Revised: 201301 30a Compiler phases and program representa6ons source code Lexical analysis (scanning) Intermediate code genera6on tokens intermediate
More informationParsing. source code. while (k<=n) {sum = sum+k; k=k+1;}
Compiler Construction Grammars Parsing source code scanner tokens regular expressions lexical analysis Lennart Andersson parser context free grammar Revision 2012 01 23 2012 parse tree AST builder (implicit)
More informationEDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:
EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing Görel Hedin Revised: 20170904 This lecture Regular expressions Contextfree grammar Attribute grammar
More informationEDA180: Compiler Construc6on. More Top Down Parsing Abstract Syntax Trees Görel Hedin Revised:
EDA180: Compiler Construc6on More Top Down Parsing Abstract Syntax Trees Görel Hedin Revised: 201302 05 Compiler phases and program representa6ons source code Lexical analysis (scanning) Intermediate
More informationSyntax. In Text: Chapter 3
Syntax In Text: Chapter 3 1 Outline Syntax: Recognizer vs. generator BNF EBNF Chapter 3: Syntax and Semantics 2 Basic Definitions Syntax the form or structure of the expressions, statements, and program
More informationDefining syntax using CFGs
Defining syntax using CFGs Roadmap Last 8me Defined contextfree grammar This 8me CFGs for syntax design Language membership List grammars Resolving ambiguity CFG Review G = (N,Σ,P,S) means derives derives
More informationDefining syntax using CFGs
Defining syntax using CFGs Roadmap Last time Defined contextfree grammar This time CFGs for specifying a language s syntax Language membership List grammars Resolving ambiguity CFG Review G = (N,Σ,P,S)
More informationCS 314 Principles of Programming Languages
CS 314 Principles of Programming Languages Lecture 5: Syntax Analysis (Parsing) Zheng (Eddy) Zhang Rutgers University January 31, 2018 Class Information Homework 1 is being graded now. The sample solution
More informationEDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:
EDAN65: Compilers, Lecture 06 A LR parsing Görel Hedin Revised: 20170911 This lecture Regular expressions Contextfree grammar Attribute grammar Lexical analyzer (scanner) Syntactic analyzer (parser)
More informationDescribing Syntax and Semantics
Describing Syntax and Semantics Introduction Syntax: the form or structure of the expressions, statements, and program units Semantics: the meaning of the expressions, statements, and program units Syntax
More informationLL parsing Nullable, FIRST, and FOLLOW
EDAN65: Compilers LL parsing Nullable, FIRST, and FOLLOW Görel Hedin Revised: 201409 22 Regular expressions Context free grammar ATribute grammar Lexical analyzer (scanner) SyntacKc analyzer (parser)
More informationCMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield
CMPS 3500 Programming Languages Dr. Chengwei Lei CEECS California State University, Bakersfield Chapter 3 Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing
More informationLexical and Syntax Analysis. TopDown Parsing
Lexical and Syntax Analysis TopDown Parsing Easy for humans to write and understand String of characters Lexemes identified String of tokens Easy for programs to transform Data structure Syntax A syntax
More informationParsing. Roadmap. > Contextfree grammars > Derivations and precedence > Topdown parsing > Leftrecursion > Lookahead > Tabledriven parsing
Roadmap > Contextfree grammars > Derivations and precedence > Topdown parsing > Leftrecursion > Lookahead > Tabledriven parsing The role of the parser > performs contextfree syntax analysis > guides
More informationLexical and Syntax Analysis
Lexical and Syntax Analysis (of Programming Languages) TopDown Parsing Lexical and Syntax Analysis (of Programming Languages) TopDown Parsing Easy for humans to write and understand String of characters
More informationCMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters
: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Scanner Parser Static Analyzer Intermediate Representation Front End Back End Compiler / Interpreter
More informationLecture 4: Syntax Specification
The University of North Carolina at Chapel Hill Spring 2002 Lecture 4: Syntax Specification Jan 16 1 Phases of Compilation 2 1 Syntax Analysis Syntax: Webster s definition: 1 a : the way in which linguistic
More informationIntroduction to Lexing and Parsing
Introduction to Lexing and Parsing ECE 351: Compilers Jon Eyolfson University of Waterloo June 18, 2012 1 Riddle Me This, Riddle Me That What is a compiler? 1 Riddle Me This, Riddle Me That What is a compiler?
More informationCOP4020 Programming Languages. Syntax Prof. Robert van Engelen
COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview n Tokens and regular expressions n Syntax and contextfree grammars n Grammar derivations n More about parse trees n Topdown and
More informationSyntax. Syntax. We will study three levels of syntax Lexical Defines the rules for tokens: literals, identifiers, etc.
Syntax Syntax Syntax defines what is grammatically valid in a programming language Set of grammatical rules E.g. in English, a sentence cannot begin with a period Must be formal and exact or there will
More informationRelated Course Objec6ves
Syntax 9/18/17 1 Related Course Objec6ves Develop grammars and parsers of programming languages 9/18/17 2 Syntax And Seman6cs Programming language syntax: how programs look, their form and structure Syntax
More informationProgramming Language Specification and Translation. ICOM 4036 Fall Lecture 3
Programming Language Specification and Translation ICOM 4036 Fall 2009 Lecture 3 Some parts are Copyright 2004 Pearson AddisonWesley. All rights reserved. 31 Language Specification and Translation Topics
More informationCOP4020 Programming Languages. Syntax Prof. Robert van Engelen
COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview Tokens and regular expressions Syntax and contextfree grammars Grammar derivations More about parse trees Topdown and bottomup
More informationICOM 4036 Spring 2004
Language Specification and Translation ICOM 4036 Spring 2004 Lecture 3 Copyright 2004 Pearson AddisonWesley. All rights reserved. 31 Language Specification and Translation Topics Structure of a Compiler
More informationArchitecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End
Architecture of Compilers, Interpreters : Organization of Programming Languages ource Analyzer Optimizer Code Generator Context Free Grammars Intermediate Representation Front End Back End Compiler / Interpreter
More informationCMSC 330: Organization of Programming Languages. Context Free Grammars
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationDr. D.M. Akbar Hussain
Syntax Analysis Parsing Syntax Or Structure Given By Determines Grammar Rules Context Free Grammar 1 Context Free Grammars (CFG) Provides the syntactic structure: A grammar is quadruple (V T, V N, S, R)
More informationChapter 4. Syntax  the form or structure of the expressions, statements, and program units
Syntax  the form or structure of the expressions, statements, and program units Semantics  the meaning of the expressions, statements, and program units Who must use language definitions? 1. Other language
More information3. Parsing. Oscar Nierstrasz
3. Parsing Oscar Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes. http://www.cs.ucla.edu/~palsberg/ http://www.cs.purdue.edu/homes/hosking/
More informationSyntax Analysis Part I
Syntax Analysis Part I Chapter 4: ContextFree Grammars Slides adapted from : Robert van Engelen, Florida State University Position of a Parser in the Compiler Model Source Program Lexical Analyzer Token,
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Context Free Grammars and Parsing 1 Recall: Architecture of Compilers, Interpreters Source Parser Static Analyzer Intermediate Representation Front End Back
More informationA programming language requires two major definitions A simple one pass compiler
A programming language requires two major definitions A simple one pass compiler [Syntax: what the language looks like A contextfree grammar written in BNF (BackusNaur Form) usually suffices. [Semantics:
More informationCS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University
CS415 Compilers Syntax Analysis These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Limits of Regular Languages Advantages of Regular Expressions
More informationCPS 506 Comparative Programming Languages. Syntax Specification
CPS 506 Comparative Programming Languages Syntax Specification Compiling Process Steps Program Lexical Analysis Convert characters into a stream of tokens Lexical Analysis Syntactic Analysis Send tokens
More informationFormal Languages and Grammars. Chapter 2: Sections 2.1 and 2.2
Formal Languages and Grammars Chapter 2: Sections 2.1 and 2.2 Formal Languages Basis for the design and implementation of programming languages Alphabet: finite set Σ of symbols String: finite sequence
More informationCSE 3302 Programming Languages Lecture 2: Syntax
CSE 3302 Programming Languages Lecture 2: Syntax (based on slides by Chengkai Li) Leonidas Fegaras University of Texas at Arlington CSE 3302 L2 Spring 2011 1 How do we define a PL? Specifying a PL: Syntax:
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationLanguages and Compilers
Principles of Software Engineering and Operational Systems Languages and Compilers SDAGE: Level I 201213 3. Formal Languages, Grammars and Automata Dr Valery Adzhiev vadzhiev@bournemouth.ac.uk Office:
More informationOptimizing Finite Automata
Optimizing Finite Automata We can improve the DFA created by MakeDeterministic. Sometimes a DFA will have more states than necessary. For every DFA there is a unique smallest equivalent DFA (fewest states
More informationWhere We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars
CMSC 330: Organization of Programming Languages Context Free Grammars Where We Are Programming languages Ruby OCaml Implementing programming languages Scanner Uses regular expressions Finite automata Parser
More informationCS 406/534 Compiler Construction Parsing Part I
CS 406/534 Compiler Construction Parsing Part I Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy and Dr.
More informationParsing II Topdown parsing. Comp 412
COMP 412 FALL 2017 Parsing II Topdown parsing Comp 412 source code IR Front End OpMmizer Back End IR target code Copyright 2017, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled
More informationCOP 3402 Systems Software Syntax Analysis (Parser)
COP 3402 Systems Software Syntax Analysis (Parser) Syntax Analysis 1 Outline 1. Definition of Parsing 2. Context Free Grammars 3. Ambiguous/Unambiguous Grammars Syntax Analysis 2 Lexical and Syntax Analysis
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationPrinciples of Programming Languages COMP251: Syntax and Grammars
Principles of Programming Languages COMP251: Syntax and Grammars Prof. Dekai Wu Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong, China Fall 2006
More informationChapter 3. Describing Syntax and Semantics
Chapter 3 Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute Grammars Describing the Meanings of Programs:
More informationCMSC 330: Organization of Programming Languages. Context Free Grammars
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationSyntax/semantics. Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing
Syntax/semantics Program program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing Metamodels 8/27/10 1 Program program execution Syntax Semantics
More informationprogramming languages need to be precise a regular expression is one of the following: tokens are the building blocks of programs
Chapter 2 :: Programming Language Syntax Programming Language Pragmatics Michael L. Scott Introduction programming languages need to be precise natural languages less so both form (syntax) and meaning
More informationBuilding Compilers with Phoenix
Building Compilers with Phoenix SyntaxDirected Translation Structure of a Compiler Character Stream Intermediate Representation Lexical Analyzer MachineIndependent Optimizer token stream Intermediate
More informationDerivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012
Derivations vs Parses Grammar is used to derive string or construct parser Context ree Grammars A derivation is a sequence of applications of rules Starting from the start symbol S......... (sentence)
More informationChapter 3. Describing Syntax and Semantics ISBN
Chapter 3 Describing Syntax and Semantics ISBN 0321493621 Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Copyright 2009 AddisonWesley. All
More informationIntroduction to Syntax Analysis. The Second Phase of FrontEnd
Compiler Design IIIT Kalyani, WB 1 Introduction to Syntax Analysis The Second Phase of FrontEnd Compiler Design IIIT Kalyani, WB 2 Syntax Analysis The syntactic or the structural correctness of a program
More informationBottomUp Parsing. Lecture 1112
BottomUp Parsing Lecture 1112 (From slides by G. Necula & R. Bodik) 9/22/06 Prof. Hilfinger CS164 Lecture 11 1 BottomUp Parsing Bottomup parsing is more general than topdown parsing And just as efficient
More informationBottomUp Parsing. Lecture 1112
BottomUp Parsing Lecture 1112 (From slides by G. Necula & R. Bodik) 2/20/08 Prof. Hilfinger CS164 Lecture 11 1 Administrivia Test I during class on 10 March. 2/20/08 Prof. Hilfinger CS164 Lecture 11
More informationCMPT 755 Compilers. Anoop Sarkar.
CMPT 755 Compilers Anoop Sarkar http://www.cs.sfu.ca/~anoop Parsing source program Lexical Analyzer token next() Parser parse tree Later Stages Lexical Errors Syntax Errors Contextfree Grammars Set of
More informationCSE450 Translation of Programming Languages. Lecture 4: Syntax Analysis
CSE450 Translation of Programming Languages Lecture 4: Syntax Analysis http://xkcd.com/859 Structure of a Today! Compiler Source Language Lexical Analyzer Syntax Analyzer Semantic Analyzer Int. Code Generator
More informationGrammars and Parsing. Paul Klint. Grammars and Parsing
Paul Klint Grammars and Languages are one of the most established areas of Natural Language Processing and Computer Science 2 N. Chomsky, Aspects of the theory of syntax, 1965 3 A Language...... is a (possibly
More informationProgramming Language Syntax and Analysis
Programming Language Syntax and Analysis 2017 Kwangman Ko (http://compiler.sangji.ac.kr, kkman@sangji.ac.kr) Dept. of Computer Engineering, Sangji University Introduction Syntax the form or structure of
More informationChapter 3. Describing Syntax and Semantics ISBN
Chapter 3 Describing Syntax and Semantics ISBN 0321493621 Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute Grammars Describing the
Syntax Analysis/Parsing. Context-free grammars (CFGs). Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax
Susan Eggers 1 CSE 401 Syntax Analysis/Parsing Context-free grammars (CFGs) Purpose: determine if tokens have the right form for the language (right syntactic structure) stream of tokens abstract syntax
CSCI312 Principles of Programming Languages
CSCI312 Principles of Programming Languages Chapter 3 Regular Expression and Lexer Xu Liu Recap Copyright 2006 The McGraw-Hill Companies, Inc. Clite: Lexical Syntax Input: a stream of characters from
CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).
CS 2210 Sample Midterm 1. Determine if each of the following claims is true (T) or false (F). F A language consists of a set of strings, its grammar structure, and a set of operations. (Note: a language
CS 315 Programming Languages Syntax. Parser. (Alternatively hand-built) (Alternatively hand-built)
Programming languages must be precise Remember instructions This is unlike natural languages CS 315 Programming Languages Syntax Precision is required for syntax think of this as the format of the language
Introduction to Syntax Analysis
Compiler Design 1 Introduction to Syntax Analysis Compiler Design 2 Syntax Analysis The syntactic or the structural correctness of a program is checked during the syntax analysis phase of compilation.
CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter UW CSE P 501 Winter 2016 C1
CSE P 501 Compilers Parsing & Context-Free Grammars Hal Perkins Winter 2016 UW CSE P 501 Winter 2016 C1 Administrivia Project partner signup: please find a partner and fill out the signup form by noon
Compilation (Semester A, 2013/14)
Compilation 03683133 (Semester A, 2013/14) Lecture 4: Syntax Analysis (Top Down Parsing) Modern Compiler Design: Chapter 2.2 Noam Rinetzky Slides credit: Roman Manevich, Mooly Sagiv, Jeff Ullman, Eran
Syntax Analysis Check syntax and construct abstract syntax tree
Syntax Analysis Check syntax and construct abstract syntax tree [Figure: abstract syntax tree for an if statement] Error reporting and recovery Model using context free grammars Recognize using push-down automata/table-driven parsers
CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter 2008 1/8/2008 Hal Perkins & UW CSE B1
CSEP 501 Compilers Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter 2008 1/8/2008 2002-08 Hal Perkins & UW CSE B1 Agenda Basic concepts of formal grammars (review) Regular expressions
3. Context-free grammars & parsing
3. Context-free grammars & parsing The parsing process sequences of tokens parse tree or syntax tree a / [ / index / ]/= / 4 / + / 2 The parsing process sequences of tokens parse tree or syntax tree a
Wednesday, September 9, 15. Parsers
Parsers What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda
Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:
What is a parser Parsers A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda
List of Figures. About the Authors. Acknowledgments
List of Figures; Preface; About the Authors; Acknowledgments; 1 Compilation; 1.1 Compilers; 1.1.1 Programming Languages
CS5363 Final Review. cs5363 1
CS5363 Final Review cs5363 1 Programming language implementation Programming languages Tools for describing data and algorithms Instructing machines what to do Communicate between computers and programmers
Compilers. Yannis Smaragdakis, U. Athens (original slides by Sam
Compilers Parsing Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Next step text chars Lexical analyzer tokens Parser IR Errors Parsing: Organize tokens into sentences Do tokens conform
Chapter 3. Describing Syntax and Semantics
Chapter 3 Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute Grammars Describing the Meanings of Programs:
4 (c) parsing. Parsing. Top down vs. bottom up parsing
4 (c) parsing Parsing A grammar describes syntactically legal strings in a language A recogniser simply accepts or rejects strings A generator produces strings A parser constructs a parse tree for a string
Compiler phases. Non-tokens
Compiler phases Compiler Construction Scanning Lexical Analysis source code scanner tokens regular expressions lexical analysis Lennart Andersson parser context free grammar Revision 2011 01 21 parse tree
Principles of Programming Languages
Principles of Programming Languages http://www.di.unipi.it/~andrea/didattica/plp-14/ Prof. Andrea Corradini Department of Computer Science, Pisa Lesson 11 Syntax Directed Translation The Structure of the
COMP421 Compiler Design. Presented by Dr Ioanna Dionysiou
COMP421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative Any questions about the syllabus? Course Material available at www.cs.unic.ac.cy/ioanna Next time reading assignment [ALSU07]
CMSC 330: Organization of Programming Languages. Context Free Grammars
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
Part III : Parsing. From Regular to Context-Free Grammars. Deriving a Parser from a Context-Free Grammar. Scanners and Parsers.
Part III : Parsing From Regular to Context-Free Grammars Deriving a Parser from a Context-Free Grammar Scanners and Parsers A Parser for EBNF Left-Parsable Grammars Martin Odersky, LAMP/DI 1 From Regular
Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1
Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1 1. Introduction Parsing is the task of Syntax Analysis Determining the syntax, or structure, of a program. The syntax is defined by the grammar rules
Chapter 3. Syntax - the form or structure of the expressions, statements, and program units
Syntax - the form or structure of the expressions, statements, and program units Semantics - the meaning of the expressions, statements, and program units Who must use language definitions? 1. Other language
Outline CS412/413. Administrivia. Review. Grammars. Left vs. Right Recursion. More tips for LL(1) grammars Bottom-up parsing LR(0) parser construction
CS412/413 Introduction to Compilers and Translators Spring '00 Outline More tips for LL(1) grammars Bottom-up parsing LR(0) parser construction Lecture 5: Bottom-up parsing Lecture 5 CS 412/413 Spring '00 Andrew Myers
Homework & Announcements
Homework & Announcements New schedule on line. Reading: Chapter 18 Homework: Exercises at end Due: 11/1 Copyright c 2002 2017 UMaine School of Computing and Information S 1 / 25 COS 140: Foundations of
Introduction to Parsing
Introduction to Parsing The Front End Source code Scanner tokens Parser IR Errors Parser Checks the stream of words and their parts of speech (produced by the scanner) for grammatical correctness Determines
ECE251 Midterm practice questions, Fall 2010
ECE251 Midterm practice questions, Fall 2010 Patrick Lam October 20, 2010 Bootstrapping In particular, say you have a compiler from C to Pascal which runs on x86, and you want to write a self-hosting Java
Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal)
Parsing Part II (Ambiguity, Top-down parsing, Left-recursion Removal) Ambiguous Grammars Definitions If a grammar has more than one leftmost derivation for a single sentential form, the grammar is ambiguous
MIT Specifying Languages with Regular Expressions and Context-Free Grammars
MIT 6.035 Specifying Languages with Regular Expressions and Context-Free Grammars Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Language Definition Problem How to precisely
Theoretical Part. Chapter one: What are the Phases of compiler? Answer:
Theoretical Part Chapter one: What are the Phases of compiler? Six phases Scanner Parser Semantic Analyzer Source code optimizer Code generator Target Code Optimizer Three auxiliary components Literal
Part 5 Program Analysis Principles and Techniques
1 Part 5 Program Analysis Principles and Techniques Front end 2 source code scanner tokens parser il errors Responsibilities: Recognize legal programs Report errors Produce il Preliminary storage map Shape
Compilers Course Lecture 4: Context Free Grammars
Compilers Course Lecture 4: Context Free Grammars Example: attempt to define simple arithmetic expressions using named regular expressions: num = [0-9]+ sum = expr "+" expr expr = "(" sum ")" | num Appears
EECS 6083 Intro to Parsing Context Free Grammars
EECS 6083 Intro to Parsing Context Free Grammars Based on slides from text web site: Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. 1 Parsing sequence of tokens parser
A simple syntax-directed
Syntax-directed is a grammar-oriented compiling technique Programming languages: Syntax: what its programs look like? Semantic: what its programs mean? 1 A simple syntax-directed Lexical Syntax Character
CSCE 314 Programming Languages
CSCE 314 Programming Languages Syntactic Analysis Dr. Hyunyoung Lee 1 What Is a Programming Language? Language = syntax + semantics The syntax of a language is concerned with the form of a program: how
4. Lexical and Syntax Analysis
4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal
CSCI312 Principles of Programming Languages
CSCI312 Principles of Programming Languages Chapter 2 Syntax Xu Liu Review Principles of PL syntax, naming, types, semantics Paradigms of PL design imperative, OO, functional, logic What makes a successful
Context-free grammars (CFGs)
Syntax Analysis/Parsing Purpose: determine if tokens have the right form for the language (right syntactic structure) stream of tokens abstract syntax tree (AST) AST: captures hierarchical structure of
More informationSYNTAX ANALYSIS 1. Define parser. Hierarchical analysis is one in which the tokens are grouped hierarchically into nested collections with collective meaning. Also termed as Parsing. 2. Mention the basic