Related Course Objec6ves

Size: px
Start display at page:

Download "Related Course Objec6ves"

Transcription

1 Syntax 9/18/17 1

2 Related Course Objec6ves Develop grammars and parsers of programming languages 9/18/17 2

3 Syntax And Seman6cs Programming language syntax: how programs look, their form and structure Syntax is defined using a kind of formal grammar Programming language seman>cs: what programs do, their behavior and meaning Seman>cs is harder to define -- mechanisms for precisely defining seman>cs are harder to learn. 9/18/17 3

4 Outline Grammar and parse tree examples BNF and parse tree defini>ons Construc>ng grammars Phrase structure and lexical structure Dealing with Ambiguity 9/18/17 4

5 An English Grammar A sentence is a noun phrase, a verb, and a noun phrase. A noun phrase is an article and a noun. A verb is An article is A noun is... <S> ::= <NP> <V> <NP> <NP> ::= <A> <N> <V> ::= loves hates eats <A> ::= a the <N> ::= dog cat rat 9/18/17 5

6 Deriva6on A grammar says how a sentence (program) is generated. A sentence can be generated by the grammar as follows: Start with the string <S> Repeatedly replace a symbol that is the lem-hand side of a rule by the right-hand side of the same rule Un>l no symbols in the string appears as the lem-hand side of any rule. 9/18/17 6

7 Deriva6on A deriva>on step: replacing a symbol in a string that is the lem-hand side symbol of a rule by the right-hand side of the rule. A deriva>on step is a rela>on èover strings Mul>ple step deriva>on is a rela>on that is the reflexive and transi>ve closure of è LeMmost/rightmost deriva>ons: Always choose the lemmost (rightmost) non-terminal symbol to replace. A parse tree corresponds to a unique lemmost/ rightmost deriva>on. 9/18/17 7

8 Building a parse tree The grammar is a set of rules that say how to build a parse tree You put <S> at the root of the tree The grammar s rules say how children can be added at any point in the tree For instance, the rule says you can add nodes <NP>, <V>, and <NP>, in that order, as children of <S> <S> ::= <NP> <V> <NP> 9/18/17 8

9 A Parse Tree <S> <NP> <V> <NP> <A> <N> loves <A> <N> the dog the cat 9/18/17 9

10 A Programming Language Grammar <exp> ::= <exp> + <exp> <exp> * <exp> ( <exp> ) id An expression can be the sum of two expressions, or the product of two expressions, or a parenthesized subexpression Or it can be one of the variable iden>fier 9/18/17 10

11 A Parse Tree <exp> ( <exp> ) ((a+b)*c) <exp> * <exp> ( <exp> ) id <exp> + <exp> id id 9/18/17 11

12 start symbol <S> ::= <NP> <V> <NP> a production <NP> ::= <A> <N> <V> ::= loves hates eats non-terminal symbols <A> ::= a the <N> ::= dog cat rat tokens 9/18/17 12

13 BNF Grammar Defini6on A BNF grammar consists of four parts: The set of tokens The set of non-terminal symbols The start symbol The set of produc8ons 9/18/17 13

14 Defini6on, Con6nued The tokens are the smallest units of syntax Strings of one or more characters of program text They are atomic: not treated as being composed from smaller parts The non-terminal symbols stand for larger pieces of syntax They are strings enclosed in angle brackets, as in <NP> They are not strings that occur literally in program text The grammar says how they can be expanded into strings of tokens The start symbol is the par>cular non-terminal that forms the root of any parse tree for the grammar 9/18/17 14

15 Defini6on, Con6nued The produc8ons are the tree-building rules Each one has a lem-hand side, the separator ::=, and a right-hand side The lem-hand side is a single non-terminal The right-hand side is a sequence of one or more things, each of which can be either a token or a non-terminal A produc>on gives one possible way of building a parse tree. 9/18/17 15

16 Alterna6ves When there is more than one produc>on with the same lem-hand side, an abbreviated form can be used The BNF grammar can give the lem-hand side, the separator ::=, and then a list of possible right-hand sides separated by the special symbol 9/18/17 16

17 Example <exp> ::= <exp> + <exp> <exp> * <exp> ( <exp> ) id Note that there are 4 productions in this grammar. It is equivalent to this one: <exp> ::= <exp> + <exp> <exp> ::= <exp> * <exp> <exp> ::= ( <exp> ) <exp> ::= id 9/18/17 17

18 Empty The special nonterminal <empty> is for places where you want the grammar to generate nothing For example, this grammar defines a typical if-then construct with an op>onal else part: <if-stmt> ::= if <expr> then <stmt> <else-part> <else-part> ::= else <stmt> <empty> 9/18/17 18

19 Parse Trees To build a parse tree, put the start symbol at the root Add children to every non-terminal, following any one of the produc8ons for that non-terminal in the grammar Done when all the leaves are tokens Read off leaves from lem to right that is the string derived by the tree 9/18/17 19

20 Prac6ce <exp> ::= <exp> + <exp> <exp> * <exp> ( <exp> ) id Show a parse tree for each of these sentences: a+b a*b+c (a+b) (a+(b)) 9/18/17 20

21 Deriva6on αaγèαβγ if A::=β is a produc>on rule Wri>ng deriva>ons for the above sentences/programs. 9/18/17 21

22 Languages The language defined by a grammar is the set of all strings that can be derived by some parse tree for the grammar, I.e. all terminal strings that can be derived from the start symbol. As in the previous example, that set is omen infinite (though grammars are finite) 9/18/17 22

23 Construc6ng Grammars Most important trick: divide and conquer Example: the language of Java declara>ons: a type name, a list of variables separated by commas, and a semicolon Each variable can be followed by an ini>alizer: float a; boolean a,b,c; int a=1, b, c=1+2; 9/18/17 23

24 Example, Con6nued Easy if we postpone defining the comma-separated list of variables with ini>alizers: Primi>ve type names are easy enough too: <var-dec> ::= <type-name> <declarator-list> ; <type-name> ::= boolean byte short int long char float double 9/18/17 24

25 Example, Con6nued That leaves the comma-separated list of variables with ini>alizers Again, postpone defining variables with ini>alizers, and just do the comma-separated list part: <declarator-list> ::= <declarator> <declarator>, <declarator-list> 9/18/17 25

26 Example, Con6nued That leaves the variables with ini>alizers: <declarator> ::= id id = <expr> For full Java, we would need to allow pairs of square brackets amer the variable name There is also a syntax for array ini>alizers 9/18/17 26

27 Logic terms A term is either a variable or a constant or a func>on symbol followed by a comma separated sequence of terms enclosed in a pair of parenthesis. 9/18/17 27

28 λ-calculus terms A term is either a variable, or λ followed by a variable, a dot and a term, or juxtaposi>on of two terms. 9/18/17 28

29 Where Do Tokens Come From? Tokens are pieces of program text that we do not choose to think of as being built from smaller pieces Iden>fiers (count), keywords (if), operators (==), constants (123.4), etc. Programs stored in files are just sequences of characters How is such a file divided into a sequence of tokens? 9/18/17 29

30 Lexical Structure And Phrase Structure Grammars so far have defined phrase structure: how a program is built from a sequence of tokens We also need to define lexical structure: how a text file is divided into tokens 9/18/17 30

31 Separate Grammars Usually there are two separate grammars One says how to construct a sequence of tokens from a file of characters One says how to construct a parse tree from a sequence of tokens 9/18/17 31

32 Separate Compiler Passes The scanner reads the input file and divides it into tokens according to the first grammar The scanner discards white space and comments The parser constructs a parse tree (or at least goes through the mo>ons more about this later) from the token stream according to the second grammar 9/18/17 32

33 Formal Context-Free Grammars In the study of formal languages and automata, grammars are expressed in yet another nota>on: S asb X X cx These are called context-free grammars Other kinds of grammars are also studied: regular grammars (weaker), context-sensi8ve grammars (stronger), etc. 9/18/17 33

34 Working Grammar G: <exp> ::= <exp> + <exp> <exp> * <exp> (<exp>) id This generates a language of arithmetic expressions using parentheses, the operators + and *, and the variables a, b and c 9/18/17 34

35 Precedence <exp> <exp> * <exp> <exp> + <exp> c a b Our grammar generates this tree for a+b*c. In this tree, the addition is performed before the multiplication, which is not the usual convention for operator precedence. 9/18/17 35

36 Operator Precedence Applies when the order of evalua>on is not completely decided by parentheses Each operator has a precedence level, and those with higher precedence are performed before those with lower precedence. a+b*c = a+(b*c) Examples C (15 levels of precedence) Pascal (5 levels) Smalltalk (1 level for all binary operators) 9/18/17 36

37 Precedence In The Grammar G: <exp> ::= <exp> + <exp> <exp> * <exp> (<exp>) id To fix the precedence problem, we modify the grammar so that it is forced to put * below + in the parse tree. G1: <exp> ::= <exp> + <exp> <mulexp> <mulexp> ::= <mulexp> * <mulexp> (<exp>) id 9/18/17 37

38 Correct Precedence < exp > < exp > + < exp > G1 parse tree: < mulexp > < mulexp > a < mulexp > * < mulexp > b c Our new grammar generates this tree for a+b*c. It generates the same language as before, but no longer generates parse trees with incorrect precedence. 9/18/17 38

39 Associa6vity <exp> <exp> + <exp> <exp> <exp> + <exp> <mulexp> <exp> + <exp> <exp> + <exp> <mulexp> a <mulexp> <mulexp> <mulexp> <mulexp> c b c a b Our grammar G1 generates both these trees for a+b+c. The first one is not the usual convention for operator associativity. 9/18/17 39

40 Operator Associa6vity Applies when the order of evalua>on is not decided by parentheses or by precedence LeA-associa8ve operators group lem to right: a+b+c+d = ((a+b) +c)+d Right-associa8ve operators group right to lem: a+b+c+d = a+(b+ (c+d)) Most operators in most languages are lem-associa>ve, but there are excep>ons 9/18/17 40

41 Associa6vity Examples C ML Fortran a<<b<<c most operators are left-associative a=b=0 right-associative (assignment) most operators are left-associativ 1::2::nil right-associative (list builder) a/b*c most operators are left-associativ a**b**c right-associative (exponentiation) 9/18/17 41

42 Associa6vity In The Grammar G1: <exp> ::= <exp> + <exp> <mulexp> <mulexp> ::= <mulexp> * <mulexp> (<exp>) id To fix the associativity problem, we modify the grammar to make trees of +s grow down to the left (and likewise for *s) G2: <exp> ::= <exp> + <mulexp> <mulexp> <mulexp> ::= <mulexp> * <rootexp> <rootexp> <rootexp> ::= (<exp>) id 9/18/17 42

43 Correct Associa6vity <exp> <exp> + <mulexp> <exp> <mulexp> <rootexp> + <mulexp> <rootexp> b <rootexp> c a Our new grammar generates this tree for a+b+c. It generates the same language as before, but no longer generates trees with incorrect associativity. 9/18/17 43

44 Prac6ce Starting with this grammar: G2: <exp> ::= <exp> + <mulexp> <mulexp> <mulexp> ::= <mulexp> * <rootexp> <rootexp> <rootexp> ::= (<exp>) id 1.) Add a left-associative & operator, at lower precedence than any of the others 2.) Then add a right-associative ** operator, at higher precedence than any of the others 9/18/17 44

45 Ambiguity G was ambiguous: it generated more than one parse tree for the same string Fixing the associa>vity and precedence problems eliminated all the ambiguity This is usually a good thing: the parse tree corresponds to the meaning of the program, and we don t want ambiguity about that 9/18/17 45

46 Dangling Else In Grammars <stmt> ::= <if-stmt> s1 s2 <if-stmt> ::= if <expr> then <stmt> else <stmt> if <expr> then <stmt> <expr> ::= e1 e2 This grammar has a classic dangling-else ambiguity. The statement we want derive is if e1 then if e2 then s1 else s2 and the next slide shows two different parse trees for it... 9/18/17 46

47 <if-stmt> if <exp> then <stmt> else <stmt> e1 <if-stmt> s2 if <exp> then <stmt> e2 s1 <if-stmt> if <exp> then <stmt> e1 <if-stmt> Most languages that have this problem choose this parse tree: else goes with nearest unmatched then if <exp> then <stmt> else <stmt> e2 s1 9/18/17 47 s2

48 Elimina6ng The Ambiguity <stmt> ::= <if-stmt> s1 s2 <if-stmt> ::= if <expr> then <stmt> else <stmt> if <expr> then <stmt> <expr> ::= e1 e2 We want to insist that if this expands into an if, that if must already have its own else. First, we make a new non-terminal <matched-stmt> that generates everything <stmt> generates, except that it can not generate if statements without else: <matched-stmt> ::= <matched-if> s1 s2 <matched-if> ::= if <expr> then <matched-stmt> else <matched-stmt> 9/18/17 48

49 Elimina6ng The Ambiguity <stmt> ::= <if-stmt> s1 s2 <if-stmt> ::= if <expr> then <matched-stmt> else <stmt> if <expr> then <stmt> <expr> ::= e1 e2 Then we use the new non-terminal here. The effect is that the new grammar can match an else part with an if part only if all the nearer if parts are already matched. 9/18/17 49

50 Correct Parse Tree - Exercise Please draw the parse tree for if e1 then if e2 then s1 else s2 using the revised grammar. 9/18/17 50

51 Conclusion We use grammars to define programming language syntax, both lexical structure and phrase structure Connec>on between theory and prac>ce Two grammars, two compiler passes Parser-generators can write code for those two passes automa>cally from grammars 9/18/17 51

Defining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1

Defining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1 Defining Program Syntax Chapter Two Modern Programming Languages, 2nd ed. 1 Syntax And Semantics Programming language syntax: how programs look, their form and structure Syntax is defined using a kind

More information

Principles of Programming Languages COMP251: Syntax and Grammars

Principles of Programming Languages COMP251: Syntax and Grammars Principles of Programming Languages COMP251: Syntax and Grammars Prof. Dekai Wu Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong, China Fall 2006

More information

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF) Chapter 3: Describing Syntax and Semantics Introduction Formal methods of describing syntax (BNF) We can analyze syntax of a computer program on two levels: 1. Lexical level 2. Syntactic level Lexical

More information

EDA180: Compiler Construc6on Context- free grammars. Görel Hedin Revised:

EDA180: Compiler Construc6on Context- free grammars. Görel Hedin Revised: EDA180: Compiler Construc6on Context- free grammars Görel Hedin Revised: 2013-01- 28 Compiler phases and program representa6ons source code Lexical analysis (scanning) Intermediate code genera6on tokens

More information

Defining syntax using CFGs

Defining syntax using CFGs Defining syntax using CFGs Roadmap Last 8me Defined context-free grammar This 8me CFGs for syntax design Language membership List grammars Resolving ambiguity CFG Review G = (N,Σ,P,S) means derives derives

More information

Principles of Programming Languages COMP251: Syntax and Grammars

Principles of Programming Languages COMP251: Syntax and Grammars Principles of Programming Languages COMP251: Syntax and Grammars Prof. Dekai Wu Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong, China Fall 2007

More information

A simple syntax-directed

A simple syntax-directed Syntax-directed is a grammaroriented compiling technique Programming languages: Syntax: what its programs look like? Semantic: what its programs mean? 1 A simple syntax-directed Lexical Syntax Character

More information

Syntax. A. Bellaachia Page: 1

Syntax. A. Bellaachia Page: 1 Syntax 1. Objectives & Definitions... 2 2. Definitions... 3 3. Lexical Rules... 4 4. BNF: Formal Syntactic rules... 6 5. Syntax Diagrams... 9 6. EBNF: Extended BNF... 10 7. Example:... 11 8. BNF Statement

More information

2.2 Syntax Definition

2.2 Syntax Definition 42 CHAPTER 2. A SIMPLE SYNTAX-DIRECTED TRANSLATOR sequence of "three-address" instructions; a more complete example appears in Fig. 2.2. This form of intermediate code takes its name from instructions

More information

Specifying Syntax. An English Grammar. Components of a Grammar. Language Specification. Types of Grammars. 1. Terminal symbols or terminals, Σ

Specifying Syntax. An English Grammar. Components of a Grammar. Language Specification. Types of Grammars. 1. Terminal symbols or terminals, Σ Specifying Syntax Language Specification Components of a Grammar 1. Terminal symbols or terminals, Σ Syntax Form of phrases Physical arrangement of symbols 2. Nonterminal symbols or syntactic categories,

More information

CPS 506 Comparative Programming Languages. Syntax Specification

CPS 506 Comparative Programming Languages. Syntax Specification CPS 506 Comparative Programming Languages Syntax Specification Compiling Process Steps Program Lexical Analysis Convert characters into a stream of tokens Lexical Analysis Syntactic Analysis Send tokens

More information

A Simple Syntax-Directed Translator

A Simple Syntax-Directed Translator Chapter 2 A Simple Syntax-Directed Translator 1-1 Introduction The analysis phase of a compiler breaks up a source program into constituent pieces and produces an internal representation for it, called

More information

Syntax Analysis Check syntax and construct abstract syntax tree

Syntax Analysis Check syntax and construct abstract syntax tree Syntax Analysis Check syntax and construct abstract syntax tree if == = ; b 0 a b Error reporting and recovery Model using context free grammars Recognize using Push down automata/table Driven Parsers

More information

This book is licensed under a Creative Commons Attribution 3.0 License

This book is licensed under a Creative Commons Attribution 3.0 License 6. Syntax Learning objectives: syntax and semantics syntax diagrams and EBNF describe context-free grammars terminal and nonterminal symbols productions definition of EBNF by itself parse tree grammars

More information

for (i=1; i<=100000; i++) { x = sqrt (y); // square root function cout << x+i << endl; }

for (i=1; i<=100000; i++) { x = sqrt (y); // square root function cout << x+i << endl; } Ex: The difference between Compiler and Interpreter The interpreter actually carries out the computations specified in the source program. In other words, the output of a compiler is a program, whereas

More information

Compiler Design Concepts. Syntax Analysis

Compiler Design Concepts. Syntax Analysis Compiler Design Concepts Syntax Analysis Introduction First task is to break up the text into meaningful words called tokens. newval=oldval+12 id = id + num Token Stream Lexical Analysis Source Code (High

More information

Syntax and Grammars 1 / 21

Syntax and Grammars 1 / 21 Syntax and Grammars 1 / 21 Outline What is a language? Abstract syntax and grammars Abstract syntax vs. concrete syntax Encoding grammars as Haskell data types What is a language? 2 / 21 What is a language?

More information

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! Any questions about the syllabus?! Course Material available at www.cs.unic.ac.cy/ioanna! Next time reading assignment [ALSU07]

More information

Non-deterministic Finite Automata (NFA)

Non-deterministic Finite Automata (NFA) Non-deterministic Finite Automata (NFA) CAN have transitions on the same input to different states Can include a ε or λ transition (i.e. move to new state without reading input) Often easier to design

More information

Semester Review CSC 301

Semester Review CSC 301 Semester Review CSC 301 Programming Language Classes There are many different programming language classes, but four classes or paradigms stand out: l l l l Imperative Languages l assignment and iteration

More information

Syntax Intro and Overview. Syntax

Syntax Intro and Overview. Syntax Syntax Intro and Overview CS331 Syntax Syntax defines what is grammatically valid in a programming language Set of grammatical rules E.g. in English, a sentence cannot begin with a period Must be formal

More information

Introduction to Lexical Analysis

Introduction to Lexical Analysis Introduction to Lexical Analysis Outline Informal sketch of lexical analysis Identifies tokens in input string Issues in lexical analysis Lookahead Ambiguities Specifying lexers Regular expressions Examples

More information

for (i=1; i<=100000; i++) { x = sqrt (y); // square root function cout << x+i << endl; }

for (i=1; i<=100000; i++) { x = sqrt (y); // square root function cout << x+i << endl; } Ex: The difference between Compiler and Interpreter The interpreter actually carries out the computations specified in the source program. In other words, the output of a compiler is a program, whereas

More information

ECE251 Midterm practice questions, Fall 2010

ECE251 Midterm practice questions, Fall 2010 ECE251 Midterm practice questions, Fall 2010 Patrick Lam October 20, 2010 Bootstrapping In particular, say you have a compiler from C to Pascal which runs on x86, and you want to write a self-hosting Java

More information

CSE 3302 Programming Languages Lecture 2: Syntax

CSE 3302 Programming Languages Lecture 2: Syntax CSE 3302 Programming Languages Lecture 2: Syntax (based on slides by Chengkai Li) Leonidas Fegaras University of Texas at Arlington CSE 3302 L2 Spring 2011 1 How do we define a PL? Specifying a PL: Syntax:

More information

B The SLLGEN Parsing System

B The SLLGEN Parsing System B The SLLGEN Parsing System Programs are just strings of characters. In order to process a program, we need to group these characters into meaningful units. This grouping is usually divided into two stages:

More information

Syntax. Syntax. We will study three levels of syntax Lexical Defines the rules for tokens: literals, identifiers, etc.

Syntax. Syntax. We will study three levels of syntax Lexical Defines the rules for tokens: literals, identifiers, etc. Syntax Syntax Syntax defines what is grammatically valid in a programming language Set of grammatical rules E.g. in English, a sentence cannot begin with a period Must be formal and exact or there will

More information

announcements CSE 311: Foundations of Computing review: regular expressions review: languages---sets of strings

announcements CSE 311: Foundations of Computing review: regular expressions review: languages---sets of strings CSE 311: Foundations of Computing Fall 2013 Lecture 19: Regular expressions & context-free grammars announcements Reading assignments 7 th Edition, pp. 878-880 and pp. 851-855 6 th Edition, pp. 817-819

More information

CSE 401 Midterm Exam Sample Solution 2/11/15

CSE 401 Midterm Exam Sample Solution 2/11/15 Question 1. (10 points) Regular expression warmup. For regular expression questions, you must restrict yourself to the basic regular expression operations covered in class and on homework assignments:

More information

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών Lecture 5a Syntax Analysis lias Athanasopoulos eliasathan@cs.ucy.ac.cy Syntax Analysis Συντακτική Ανάλυση Context-free Grammars (CFGs) Derivations Parse trees

More information

Wednesday, August 31, Parsers

Wednesday, August 31, Parsers Parsers How do we combine tokens? Combine tokens ( words in a language) to form programs ( sentences in a language) Not all combinations of tokens are correct programs (not all sentences are grammatically

More information

Part 5 Program Analysis Principles and Techniques

Part 5 Program Analysis Principles and Techniques 1 Part 5 Program Analysis Principles and Techniques Front end 2 source code scanner tokens parser il errors Responsibilities: Recognize legal programs Report errors Produce il Preliminary storage map Shape

More information

Formal Semantics. Chapter Twenty-Three Modern Programming Languages, 2nd ed. 1

Formal Semantics. Chapter Twenty-Three Modern Programming Languages, 2nd ed. 1 Formal Semantics Chapter Twenty-Three Modern Programming Languages, 2nd ed. 1 Formal Semantics At the beginning of the book we saw formal definitions of syntax with BNF And how to make a BNF that generates

More information

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield CMPS 3500 Programming Languages Dr. Chengwei Lei CEECS California State University, Bakersfield Chapter 3 Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing

More information

Week 2: Syntax Specification, Grammars

Week 2: Syntax Specification, Grammars CS320 Principles of Programming Languages Week 2: Syntax Specification, Grammars Jingke Li Portland State University Fall 2017 PSU CS320 Fall 17 Week 2: Syntax Specification, Grammars 1/ 62 Words and Sentences

More information

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

COP4020 Programming Languages. Syntax Prof. Robert van Engelen COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview Tokens and regular expressions Syntax and context-free grammars Grammar derivations More about parse trees Top-down and bottom-up

More information

CSCE 531 Spring 2009 Final Exam

CSCE 531 Spring 2009 Final Exam CSCE 531 Spring 2009 Final Exam Do all problems. Write your solutions on the paper provided. This test is open book, open notes, but no electronic devices. For your own sake, please read all problems before

More information

Introduction to Lexical Analysis

Introduction to Lexical Analysis Introduction to Lexical Analysis Outline Informal sketch of lexical analysis Identifies tokens in input string Issues in lexical analysis Lookahead Ambiguities Specifying lexical analyzers (lexers) Regular

More information

CSE450 Translation of Programming Languages. Lecture 4: Syntax Analysis

CSE450 Translation of Programming Languages. Lecture 4: Syntax Analysis CSE450 Translation of Programming Languages Lecture 4: Syntax Analysis http://xkcd.com/859 Structure of a Today! Compiler Source Language Lexical Analyzer Syntax Analyzer Semantic Analyzer Int. Code Generator

More information

EDA180: Compiler Construc6on. Top- down parsing. Görel Hedin Revised: a

EDA180: Compiler Construc6on. Top- down parsing. Görel Hedin Revised: a EDA180: Compiler Construc6on Top- down parsing Görel Hedin Revised: 2013-01- 30a Compiler phases and program representa6ons source code Lexical analysis (scanning) Intermediate code genera6on tokens intermediate

More information

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

COP4020 Programming Languages. Syntax Prof. Robert van Engelen COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview n Tokens and regular expressions n Syntax and context-free grammars n Grammar derivations n More about parse trees n Top-down and

More information

Lexical Analysis. Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata

Lexical Analysis. Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata Lexical Analysis Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata Phase Ordering of Front-Ends Lexical analysis (lexer) Break input string

More information

EECS 6083 Intro to Parsing Context Free Grammars

EECS 6083 Intro to Parsing Context Free Grammars EECS 6083 Intro to Parsing Context Free Grammars Based on slides from text web site: Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. 1 Parsing sequence of tokens parser

More information

Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;}

Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;} Compiler Construction Grammars Parsing source code scanner tokens regular expressions lexical analysis Lennart Andersson parser context free grammar Revision 2012 01 23 2012 parse tree AST builder (implicit)

More information

Chapter 3. Describing Syntax and Semantics

Chapter 3. Describing Syntax and Semantics Chapter 3 Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute Grammars Describing the Meanings of Programs:

More information

Compilers and computer architecture From strings to ASTs (2): context free grammars

Compilers and computer architecture From strings to ASTs (2): context free grammars 1 / 1 Compilers and computer architecture From strings to ASTs (2): context free grammars Martin Berger October 2018 Recall the function of compilers 2 / 1 3 / 1 Recall we are discussing parsing Source

More information

3. Context-free grammars & parsing

3. Context-free grammars & parsing 3. Context-free grammars & parsing The parsing process sequences of tokens parse tree or syntax tree a / [ / index / ]/= / 4 / + / 2 The parsing process sequences of tokens parse tree or syntax tree a

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

More information

Formal Languages and Grammars. Chapter 2: Sections 2.1 and 2.2

Formal Languages and Grammars. Chapter 2: Sections 2.1 and 2.2 Formal Languages and Grammars Chapter 2: Sections 2.1 and 2.2 Formal Languages Basis for the design and implementation of programming languages Alphabet: finite set Σ of symbols String: finite sequence

More information

Where We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars

Where We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars CMSC 330: Organization of Programming Languages Context Free Grammars Where We Are Programming languages Ruby OCaml Implementing programming languages Scanner Uses regular expressions Finite automata Parser

More information

CMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters

CMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters : Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Scanner Parser Static Analyzer Intermediate Representation Front End Back End Compiler / Interpreter

More information

A programming language requires two major definitions A simple one pass compiler

A programming language requires two major definitions A simple one pass compiler A programming language requires two major definitions A simple one pass compiler [Syntax: what the language looks like A context-free grammar written in BNF (Backus-Naur Form) usually suffices. [Semantics:

More information

Homework & Announcements

Homework & Announcements Homework & nnouncements New schedule on line. Reading: Chapter 18 Homework: Exercises at end Due: 11/1 Copyright c 2002 2017 UMaine School of Computing and Information S 1 / 25 COS 140: Foundations of

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

More information

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part2 3.3 Parse Trees and Abstract Syntax Trees

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part2 3.3 Parse Trees and Abstract Syntax Trees Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part2 3.3 Parse Trees and Abstract Syntax Trees 3.3.1 Parse trees 1. Derivation V.S. Structure Derivations do not uniquely represent the structure of the strings

More information

Formal Languages and Compilers Lecture V: Parse Trees and Ambiguous Gr

Formal Languages and Compilers Lecture V: Parse Trees and Ambiguous Gr Formal Languages and Compilers Lecture V: Parse Trees and Ambiguous Grammars Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/

More information

CMSC 330: Organization of Programming Languages. Context Free Grammars

CMSC 330: Organization of Programming Languages. Context Free Grammars CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

More information

Some Basic Definitions. Some Basic Definitions. Some Basic Definitions. Language Processing Systems. Syntax Analysis (Parsing) Prof.

Some Basic Definitions. Some Basic Definitions. Some Basic Definitions. Language Processing Systems. Syntax Analysis (Parsing) Prof. Language Processing Systems Prof. Mohamed Hamada Software ngineering Lab. he University of Aizu Japan Syntax Analysis (Parsing) Some Basic Definitions Some Basic Definitions syntax: the way in which words

More information

Syntax. In Text: Chapter 3

Syntax. In Text: Chapter 3 Syntax In Text: Chapter 3 1 Outline Syntax: Recognizer vs. generator BNF EBNF Chapter 3: Syntax and Semantics 2 Basic Definitions Syntax the form or structure of the expressions, statements, and program

More information

CMSC 330: Organization of Programming Languages. Context Free Grammars

CMSC 330: Organization of Programming Languages. Context Free Grammars CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

More information

COP 3402 Systems Software Top Down Parsing (Recursive Descent)

COP 3402 Systems Software Top Down Parsing (Recursive Descent) COP 3402 Systems Software Top Down Parsing (Recursive Descent) Top Down Parsing 1 Outline 1. Top down parsing and LL(k) parsing 2. Recursive descent parsing 3. Example of recursive descent parsing of arithmetic

More information

CSE302: Compiler Design

CSE302: Compiler Design CSE302: Compiler Design Instructor: Dr. Liang Cheng Department of Computer Science and Engineering P.C. Rossin College of Engineering & Applied Science Lehigh University February 20, 2007 Outline Recap

More information

Introduction to Syntax Analysis. Compiler Design Syntax Analysis s.l. dr. ing. Ciprian-Bogdan Chirila

Introduction to Syntax Analysis. Compiler Design Syntax Analysis s.l. dr. ing. Ciprian-Bogdan Chirila Introduction to Syntax Analysis Compiler Design Syntax Analysis s.l. dr. ing. Ciprian-Bogdan Chirila chirila@cs.upt.ro http://www.cs.upt.ro/~chirila Outline Syntax Analysis Syntax Rules The Role of the

More information

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1 Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1 1. Introduction Parsing is the task of Syntax Analysis Determining the syntax, or structure, of a program. The syntax is defined by the grammar rules

More information

CS 315 Programming Languages Syntax. Parser. (Alternatively hand-built) (Alternatively hand-built)

CS 315 Programming Languages Syntax. Parser. (Alternatively hand-built) (Alternatively hand-built) Programming languages must be precise Remember instructions This is unlike natural languages CS 315 Programming Languages Syntax Precision is required for syntax think of this as the format of the language

More information

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications Agenda for Today Regular Expressions CSE 413, Autumn 2005 Programming Languages Basic concepts of formal grammars Regular expressions Lexical specification of programming languages Using finite automata

More information

COP 3402 Systems Software Syntax Analysis (Parser)

COP 3402 Systems Software Syntax Analysis (Parser) COP 3402 Systems Software Syntax Analysis (Parser) Syntax Analysis 1 Outline 1. Definition of Parsing 2. Context Free Grammars 3. Ambiguous/Unambiguous Grammars Syntax Analysis 2 Lexical and Syntax Analysis

More information

Chapter 4. Syntax - the form or structure of the expressions, statements, and program units

Chapter 4. Syntax - the form or structure of the expressions, statements, and program units Syntax - the form or structure of the expressions, statements, and program units Semantics - the meaning of the expressions, statements, and program units Who must use language definitions? 1. Other language

More information

CS /534 Compiler Construction University of Massachusetts Lowell. NOTHING: A Language for Practice Implementation

CS /534 Compiler Construction University of Massachusetts Lowell. NOTHING: A Language for Practice Implementation CS 91.406/534 Compiler Construction University of Massachusetts Lowell Professor Li Xu Fall 2004 NOTHING: A Language for Practice Implementation 1 Introduction NOTHING is a programming language designed

More information

Syntactic Analysis. The Big Picture Again. Grammar. ICS312 Machine-Level and Systems Programming

Syntactic Analysis. The Big Picture Again. Grammar. ICS312 Machine-Level and Systems Programming The Big Picture Again Syntactic Analysis source code Scanner Parser Opt1 Opt2... Optn Instruction Selection Register Allocation Instruction Scheduling machine code ICS312 Machine-Level and Systems Programming

More information

Syntax and Semantics

Syntax and Semantics Syntax and Semantics Syntax - The form or structure of the expressions, statements, and program units Semantics - The meaning of the expressions, statements, and program units Syntax Example: simple C

More information

Compilers Course Lecture 4: Context Free Grammars

Compilers Course Lecture 4: Context Free Grammars Compilers Course Lecture 4: Context Free Grammars Example: attempt to define simple arithmetic expressions using named regular expressions: num = [0-9]+ sum = expr "+" expr expr = "(" sum ")" num Appears

More information

Lexical Scanning COMP360

Lexical Scanning COMP360 Lexical Scanning COMP360 Captain, we re being scanned. Spock Reading Read sections 2.1 3.2 in the textbook Regular Expression and FSA Assignment A new assignment has been posted on Blackboard It is due

More information

CS101: Fundamentals of Computer Programming. Dr. Tejada www-bcf.usc.edu/~stejada Week 1 Basic Elements of C++

CS101: Fundamentals of Computer Programming. Dr. Tejada www-bcf.usc.edu/~stejada Week 1 Basic Elements of C++ CS101: Fundamentals of Computer Programming Dr. Tejada stejada@usc.edu www-bcf.usc.edu/~stejada Week 1 Basic Elements of C++ 10 Stacks of Coins You have 10 stacks with 10 coins each that look and feel

More information

Chapter 3. Describing Syntax and Semantics ISBN

Chapter 3. Describing Syntax and Semantics ISBN Chapter 3 Describing Syntax and Semantics ISBN 0-321-49362-1 Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Copyright 2009 Addison-Wesley. All

More information

Syntax Analysis. Chapter 4

Syntax Analysis. Chapter 4 Syntax Analysis Chapter 4 Check (Important) http://www.engineersgarage.com/contributio n/difference-between-compiler-andinterpreter Introduction covers the major parsing methods that are typically used

More information

Optimizing Finite Automata

Optimizing Finite Automata Optimizing Finite Automata We can improve the DFA created by MakeDeterministic. Sometimes a DFA will have more states than necessary. For every DFA there is a unique smallest equivalent DFA (fewest states

More information

CS251 Programming Languages Spring 2016, Lyn Turbak Department of Computer Science Wellesley College

CS251 Programming Languages Spring 2016, Lyn Turbak Department of Computer Science Wellesley College Functions in Racket CS251 Programming Languages Spring 2016, Lyn Turbak Department of Computer Science Wellesley College Racket Func+ons Functions: most important building block in Racket (and 251) Functions/procedures/methods/subroutines

More information

Introduction to Parsing Ambiguity and Syntax Errors

Introduction to Parsing Ambiguity and Syntax Errors Introduction to Parsing Ambiguity and Syntax rrors Outline Regular languages revisited Parser overview Context-free grammars (CFG s) Derivations Ambiguity Syntax errors Compiler Design 1 (2011) 2 Languages

More information

ADTS, GRAMMARS, PARSING, TREE TRAVERSALS

ADTS, GRAMMARS, PARSING, TREE TRAVERSALS 1 Pointers to material ADS, GRAMMARS, PARSING, R RAVRSALS Lecture 13 CS110 all 016 Parse trees: text, section 3.36 Definition of Java Language, sometimes useful: docs.oracle.com/javase/specs/jls/se8/html/index.html

More information

Wednesday, September 9, 15. Parsers

Wednesday, September 9, 15. Parsers Parsers What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

More information

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs: What is a parser Parsers A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

More information

Syntactic Analysis. CS345H: Programming Languages. Lecture 3: Lexical Analysis. Outline. Lexical Analysis. What is a Token? Tokens

Syntactic Analysis. CS345H: Programming Languages. Lecture 3: Lexical Analysis. Outline. Lexical Analysis. What is a Token? Tokens Syntactic Analysis CS45H: Programming Languages Lecture : Lexical Analysis Thomas Dillig Main Question: How to give structure to strings Analogy: Understanding an English sentence First, we separate a

More information

More Assigned Reading and Exercises on Syntax (for Exam 2)

More Assigned Reading and Exercises on Syntax (for Exam 2) More Assigned Reading and Exercises on Syntax (for Exam 2) 1. Read sections 2.3 (Lexical Syntax) and 2.4 (Context-Free Grammars) on pp. 33 41 of Sethi. 2. Read section 2.6 (Variants of Grammars) on pp.

More information

Stating the obvious, people and computers do not speak the same language.

Stating the obvious, people and computers do not speak the same language. 3.4 SYSTEM SOFTWARE 3.4.3 TRANSLATION SOFTWARE INTRODUCTION Stating the obvious, people and computers do not speak the same language. People have to write programs in order to instruct a computer what

More information

Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.

Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity. Eliminating Ambiguity Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity. Example: consider the following grammar stat if expr then stat if expr then stat else stat other One can

More information

Question Bank. 10CS63:Compiler Design

Question Bank. 10CS63:Compiler Design Question Bank 10CS63:Compiler Design 1.Determine whether the following regular expressions define the same language? (ab)* and a*b* 2.List the properties of an operator grammar 3. Is macro processing a

More information

Introduction to Parsing Ambiguity and Syntax Errors

Introduction to Parsing Ambiguity and Syntax Errors Introduction to Parsing Ambiguity and Syntax rrors Outline Regular languages revisited Parser overview Context-free grammars (CFG s) Derivations Ambiguity Syntax errors 2 Languages and Automata Formal

More information

Parsing. Zhenjiang Hu. May 31, June 7, June 14, All Right Reserved. National Institute of Informatics

Parsing. Zhenjiang Hu. May 31, June 7, June 14, All Right Reserved. National Institute of Informatics National Institute of Informatics May 31, June 7, June 14, 2010 All Right Reserved. Outline I 1 Parser Type 2 Monad Parser Monad 3 Derived Primitives 4 5 6 Outline Parser Type 1 Parser Type 2 3 4 5 6 What

More information

Lexical Analysis. Chapter 2

Lexical Analysis. Chapter 2 Lexical Analysis Chapter 2 1 Outline Informal sketch of lexical analysis Identifies tokens in input string Issues in lexical analysis Lookahead Ambiguities Specifying lexers Regular expressions Examples

More information

CSCE 314 Programming Languages

CSCE 314 Programming Languages CSCE 314 Programming Languages Syntactic Analysis Dr. Hyunyoung Lee 1 What Is a Programming Language? Language = syntax + semantics The syntax of a language is concerned with the form of a program: how

More information

Lexical and Syntax Analysis. Top-Down Parsing

Lexical and Syntax Analysis. Top-Down Parsing Lexical and Syntax Analysis Top-Down Parsing Easy for humans to write and understand String of characters Lexemes identified String of tokens Easy for programs to transform Data structure Syntax A syntax

More information

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter /15/ Hal Perkins & UW CSE C-1

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter /15/ Hal Perkins & UW CSE C-1 CSE P 501 Compilers Parsing & Context-Free Grammars Hal Perkins Winter 2008 1/15/2008 2002-08 Hal Perkins & UW CSE C-1 Agenda for Today Parsing overview Context free grammars Ambiguous grammars Reading:

More information

ITEC2620 Introduction to Data Structures

ITEC2620 Introduction to Data Structures ITEC2620 Introduction to Data Structures Lecture 10a Grammars II Review Three main problems Derive a sentence from a grammar Develop a grammar for a language Given a grammar, determine if an input is valid

More information

Defining syntax using CFGs

Defining syntax using CFGs Defining syntax using CFGs Roadmap Last time Defined context-free grammar This time CFGs for specifying a language s syntax Language membership List grammars Resolving ambiguity CFG Review G = (N,Σ,P,S)

More information

Consider a description of arithmetic. It includes two equations that define the structural types of digit and operator:

Consider a description of arithmetic. It includes two equations that define the structural types of digit and operator: Syntax A programming language consists of syntax, semantics, and pragmatics. We formalize syntax first, because only syntactically correct programs have semantics. A syntax definition of a language lists

More information

Syntax and Parsing COMS W4115. Prof. Stephen A. Edwards Fall 2003 Columbia University Department of Computer Science

Syntax and Parsing COMS W4115. Prof. Stephen A. Edwards Fall 2003 Columbia University Department of Computer Science Syntax and Parsing COMS W4115 Prof. Stephen A. Edwards Fall 2003 Columbia University Department of Computer Science Lexical Analysis (Scanning) Lexical Analysis (Scanning) Goal is to translate a stream

More information

CS 403: Scanning and Parsing

CS 403: Scanning and Parsing CS 403: Scanning and Parsing Stefan D. Bruda Fall 2017 THE COMPILATION PROCESS Character stream Scanner (lexical analysis) Token stream Parser (syntax analysis) Parse tree Semantic analysis Abstract syntax

More information

1 Lexical Considerations

1 Lexical Considerations Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2013 Handout Decaf Language Thursday, Feb 7 The project for the course is to write a compiler

More information