Context-free grammars (CFGs)

Size: px
Start display at page:

Download "Context-free grammars (CFGs)"

Transcription

1 Context-free grammars (CFGs)

2 Roadmap Last time RegExp == DFA Jlex: a tool for generating (Java code for) a lexer/scanner Mainly a collection of regexp, action pairs This time CFGs, the underlying abstraction for parsers Next week Java CUP: a tool for generating (Java code for) a parser Mainly a collection of CFG-rule, action pairs regexp : JLex :: CFG : Java CUP

3 RegExps Are Great! Perfect for tokenizing a language However, they have some limitations Can only define a limited family of languages Cannot use a RegExp to specify all the programming constructs we need No notion of structure Let s explore both of these issues

4 Limitations of RegExps Cannot handle matching E.g., language of balanced parentheses L () = { ( n ) n where n > 0} No DFA exists for this language Intuition: A given FSM only has a fixed, finite amount of memory For an FSM, memory = the states With a fixed, finite amount of memory, how could an FSM remember how many ( characters it has seen?

5 Theorem: No RegExp/DFA can describe the language L () Proof by contradiction: Suppose that there exists a DFA A for L () and A has N states A has to accept the string ( N ) N with some path q 0 q 1 q N q 2N+1 By the pigeonhole principle some state has to repeat: q i = q j for some i<j<n Therefore the run q 0 q 1 q i q j+1 q N q 2N+1 is also accepting A accepts the string ( N-(j-i) ) N L (), which is a contradiction!

6 Limitations of RegExps: No Structure Our Enhanced-RegExp scanner can emit a stream of tokens: X = Y + Z ID ASSIGN ID PLUS ID but this doesn t really enforce any order of operations

7 The Chomsky Hierarchy LANGUAGE CLASS: power efficiency Recursively enumerable Turing machine Context-Sensitive Context-Free Happy medium? Regular FSM Noam Chomsky

8 Context Free Grammars (CFGs) A set of (recursive) rewriting rules to generate patterns of strings Can envision a parse tree that keeps structure

9 CFG: Intuition S ( S ) A rule that says that you can rewrite S to be an S surrounded by a single set of parenthesis Before applying rule S After applying rule S ( S )

10 Context Free Grammars (CFGs) A CFG is a 4-tuple (N,Σ,P,S) N is a set of non-terminals, e.g., A, B, S, Σ is the set of terminals P is a set of production rules S N is the initial non-terminal symbol ( start symbol )

11 Context Free Grammars (CFGs) A CFG is a 4-tuple (N,Σ,P,S) N is a set of non-terminals, e.g., A, B, S Σ is the set of terminals P is a set of production rules S (in N) is the initial non-terminal symbol Placeholder / interior nodes in the parse tree Tokens from scanner Rules for deriving strings If not otherwise specified, use the non-terminal that appears on the LHS of the first production as the start

12 Production Syntax LHS RHS Single nonterminal symbol Expression: Sequence of terminals and nonterminals Examples: S à ( S ) S à ε

13 Production Shorthand Nonterm expression Nonterm ε equivalently: Nonterm expression ε equivalently: Nonterm expression ε S à ( S ) S à ε S à ( S ) ε S à ( S ) ε

14 Derivations To derive a string: Start by setting Current Sequence to the start symbol Repeatedly, Find a Nonterminal X in the Current Sequence Find a production of the form X α Apply the production: create a new current sequence in which α replaces X Stop when there are no more non-terminals This process derives a string of terminal symbols

15 Derivation Syntax We ll use the symbol for derives We ll use the symbol & for derives in one or more steps (also written as & ) We ll use the symbol for derives in zero or more steps (also written as )

16 An Example Grammar

17 Terminals begin end semicolon assign id plus An Example Grammar

18 For readability, bold and lowercase Terminals begin end semicolon assign id plus An Example Grammar

19 For readability, bold and lowercase Terminals begin Program end boundary semicolon assign id plus An Example Grammar

20 An Example Grammar For readability, bold and lowercase Terminals begin Program end boundary semicolon Represents ; assign Separates statements id plus

21 An Example Grammar For readability, bold and lowercase Terminals begin Program end boundary semicolon Represents ; assign Separates statements id plus Represents = in an assignment statement

22 An Example Grammar For readability, bold and lowercase Terminals begin Program end boundary semicolon Represents ; assign Separates statements id plus Represents = in an assignment statement Identifier / variable name

23 An Example Grammar For readability, bold and lowercase Terminals begin Program end boundary semicolon Represents ; assign Separates statements id plus Represents = in an assignment statement Identifier / variable name Represents + operator in an expression

24 For readability, bold and lowercase Terminals begin end semicolon assign id plus An Example Grammar Nonterminals Prog Stmts Stmt Expr

25 For readability, bold and lowercase Terminals begin end semicolon assign id plus An Example Grammar For readability, Italics and UpperCamelCase Nonterminals Prog Stmts Stmt Expr

26 For readability, bold and lowercase Terminals begin end semicolon assign id plus An Example Grammar For readability, Italics and UpperCamelCase Nonterminals Prog Root of the parse tree Stmts Stmt Expr

27 For readability, bold and lowercase Terminals begin end semicolon assign id plus An Example Grammar For readability, Italics and UpperCamelCase Nonterminals Prog Root of the parse tree Stmts List of statements Stmt Expr

28 For readability, bold and lowercase Terminals begin end semicolon assign id plus An Example Grammar For readability, Italics and UpperCamelCase Nonterminals Prog Root of the parse tree Stmts List of statements Stmt A single statement Expr

29 For readability, bold and lowercase Terminals begin end semicolon assign id plus An Example Grammar For readability, Italics and UpperCamelCase Nonterminals Prog Root of the parse tree Stmts List of statements Stmt A single statement Expr A mathematical expression

30 An Example Grammar For readability, bold and lowercase Terminals begin end semicolon assign id plus For readability, Italics and UpperCamelCase Nonterminals Prog Stmts Stmt Expr Defines the syntax of legal programs Productions Prog begin Stmts end Stmts Stmts semicolon Stmt Stmt Stmt id assign Expr Expr id Expr plus id

31 An Example Grammar For readability, bold and lowercase Terminals begin Program end boundary semicolon Represents ; assign Separates statements id plus Represents = statement Identifier / variable name Represents + expression For readability, Italics and UpperCamelCase Nonterminals Prog Root of the parse tree Stmts List of statements Stmt A single statement Expr An expression Defines the syntax of legal programs Productions Prog begin Stmts end Stmts Stmts semicolon Stmt Stmt Stmt id assign Expr Expr id Expr plus id

32 Productions 1. Prog begin Stmts end 2. Stmts Stmts semicolon Stmt 3. Stmt 4. Stmt id assign Expr 5. Expr id 6. Expr plus id

33 Productions 1. Prog begin Stmts end 2. Stmts Stmts semicolon Stmt 3. Stmt 4. Stmt id assign Expr 5. Expr id 6. Expr plus id Derivation Sequence

34 Productions 1. Prog begin Stmts end Parse Tree 2. Stmts Stmts semicolon Stmt 3. Stmt 4. Stmt id assign Expr 5. Expr id 6. Expr plus id Derivation Sequence

35 Productions 1. Prog begin Stmts end Parse Tree 2. Stmts Stmts semicolon Stmt 3. Stmt 4. Stmt id assign Expr 5. Expr id 6. Expr plus id Derivation Sequence Key terminal Nonterminal Rule used

36 Productions 1. Prog begin Stmts end 2. Stmts Stmts semicolon Stmt 3. Stmt 4. Stmt id assign Expr 5. Expr id 6. Expr plus id Parse Tree Prog Derivation Sequence Prog Key terminal Nonterminal Rule used

37 Productions 1. Prog begin Stmts end 2. Stmts Stmts semicolon Stmt 3. Stmt 4. Stmt id assign Expr 5. Expr id 6. Expr plus id Parse Tree Prog Derivation Sequence Prog begin Stmts end 1 Key terminal Nonterminal Rule used

38 Productions 1. Prog begin Stmts end 2. Stmts Stmts semicolon Stmt 3. Stmt begin 4. Stmt id assign Expr 5. Expr id 6. Expr plus id Parse Tree Prog Stmts end Derivation Sequence Prog begin Stmts end 1 Key terminal Nonterminal Rule used

39 Productions 1. Prog begin Stmts end 2. Stmts Stmts semicolon Stmt 3. Stmt begin Prog Stmts end 4. Stmt id assign Expr 5. Expr id Stmts semicolon Stmt 6. Expr plus id Parse Tree Derivation Sequence Prog begin Stmts end 1 begin Stmts semicolon Stmt end 2 Key terminal Nonterminal Rule used

40 Productions 1. Prog begin Stmts end 2. Stmts Stmts semicolon Stmt 3. Stmt begin Prog Stmts end 4. Stmt id assign Expr 5. Expr id Stmts semicolon Stmt 6. Expr plus id Stmt Parse Tree Derivation Sequence Prog begin Stmts end 1 begin Stmts semicolon Stmt end begin Stmt semicolon Stmt end 2 3 Key terminal Nonterminal Rule used

41 Productions 1. Prog begin Stmts end 2. Stmts Stmts semicolon Stmt 3. Stmt begin Prog Stmts end 4. Stmt id assign Expr 5. Expr id Stmts semicolon Stmt 6. Expr plus id Stmt Parse Tree Derivation Sequence id Prog begin Stmts end 1 2 begin Stmts semicolon Stmt end 3 begin Stmt semicolon Stmt end begin id assign Expr semicolon Stmt end assign Expr 4 Key terminal Nonterminal Rule used

42 Productions 1. Prog begin Stmts end 2. Stmts Stmts semicolon Stmt 3. Stmt begin Prog Stmts end 4. Stmt id assign Expr 5. Expr id Stmts Parse Tree semicolon Stmt 6. Expr plus id Stmt id assign Expr Derivation Sequence id assign Prog begin Stmts end 1 2 begin Stmts semicolon Stmt end 3 begin Stmt semicolon Stmt end 4 begin id assign Expr semicolon Stmt end begin id assign Expr semicolon id assign Expr end Expr 4 Key terminal Nonterminal Rule used

43 Productions 1. Prog begin Stmts end 2. Stmts Stmts semicolon Stmt 3. Stmt begin Prog Stmts end 4. Stmt id assign Expr 5. Expr id Stmts Parse Tree semicolon Stmt 6. Expr plus id Stmt id assign Expr Derivation Sequence id assign Prog begin Stmts end 1 2 begin Stmts semicolon Stmt end 3 begin Stmt semicolon Stmt end 4 begin id assign Expr semicolon Stmt end begin id assign Expr semicolon id assign Expr end begin id assign id semicolon id assign Expr end Expr id 4 5 Key terminal Nonterminal Rule used

44 Productions 1. Prog begin Stmts end 2. Stmts Stmts semicolon Stmt 3. Stmt begin Prog Stmts end 4. Stmt id assign Expr 5. Expr id Stmts Parse Tree semicolon Stmt 6. Expr plus id Stmt id assign Expr Derivation Sequence id assign Expr Prog begin Stmts end 1 2 begin Stmts semicolon Stmt end id 3 begin Stmt semicolon Stmt end 4 4 begin id assign Expr semicolon Stmt end 5 begin id assign Expr semicolon id assign Expr end begin id assign id semicolon id assign Expr end begin id assign id semicolon id assign Expr plus id end 6 Expr plus id Key terminal Nonterminal Rule used

45 Productions 1. Prog begin Stmts end 2. Stmts Stmts semicolon Stmt 3. Stmt begin Prog Stmts end 4. Stmt id assign Expr 5. Expr id Stmts Parse Tree semicolon Stmt 6. Expr plus id Stmt id assign Expr Derivation Sequence id assign Expr Prog begin Stmts end 1 2 begin Stmts semicolon Stmt end id 3 begin Stmt semicolon Stmt end 4 4 begin id assign Expr semicolon Stmt end 5 begin id assign Expr semicolon id assign Expr end begin id assign id semicolon id assign Expr end begin id assign id semicolon id assign Expr plus id end begin id assign id semicolon id assign id plus id end 5 6 Expr plus id id Key terminal Nonterminal Rule used

46 A five minute introduction MAKEFILE

47 Makefiles: Motivation Typing the series of commands to generate our code can be tedious Multiple steps that depend on each other Somewhat complicated commands May not need to rebuild everything Makefiles solve these issues Record a series of commands in a script-like DSL Specify dependency rules and Make generates the results

48 Makefiles: Basic Structure <target>: <dependency list> (tab) <command to satisfy target> Example Example.class: Example.java IO.class javac Example.java IO.class: IO.java javac IO.java

49 Makefiles: Basic Structure <target>: <dependency list> (tab) <command to satisfy target> Example Example.class: Example.java IO.class javac Example.java IO.class: IO.java javac IO.java

50 Makefiles: Basic Structure <target>: <dependency list> (tab) <command to satisfy target> Example Example.class: Example.java IO.class javac Example.java IO.class: IO.java javac IO.java Example.class depends on example.java and IO.class

51 Makefiles: Basic Structure <target>: <dependency list> (tab) <command to satisfy target> Example Example.class: Example.java IO.class javac Example.java IO.class: IO.java javac IO.java Example.class depends on example.java and IO.class Example.class is generated by javac Example.java

52 Makefiles: Dependencies Example.class Example.java IO.class Example Example.class: Example.java IO.class javac Example.java IO.class: IO.java javac IO.java IO.java

53 Makefiles: Dependencies Internal Dependency graph Example.class Example.java IO.class Example Example.class: Example.java IO.class javac Example.java IO.class: IO.java javac IO.java IO.java

54 Makefiles: Dependencies Internal Dependency graph Example.class Example.java Example Example.class: Example.java IO.class javac Example.java IO.class: IO.java javac IO.java IO.class IO.java A file is rebuilt if one of it s dependencies changes

55 Makefiles: Variables You can thread common configuration values through your makefile

56 Makefiles: Variables You can thread common configuration values through your makefile Example JC = /s/std/bin/javac JFLAGS = -g

57 Makefiles: Variables You can thread common configuration values through your makefile Example JC = /s/std/bin/javac JFLAGS = -g Build for debug

58 Makefiles: Variables You can thread common configuration values through your makefile Example JC = /s/std/bin/javac JFLAGS = -g Build for debug Example.class: Example.java IO.class $(JC) $(JFLAGS) Example.java IO.class: IO.java $(JC) $(JFLAGS) IO.java

59 Makefiles: Phony Targets You can run commands via make Write a target with no dependencies (called phony) Will cause it to execute the command every time Example clean: rm f *.class test: java cp. Test.class

60 Makefiles: Phony Targets You can run commands via make Write a target with no dependencies (called phony) Will cause it to execute the command every time Example clean: rm f *.class test: java cp. Test.class

61 Makefiles: Phony Targets You can run commands via make Write a target with no dependencies (called phony) Will cause it to execute the command every time Example clean: rm f *.class test: java cp. Test.class

62 Recap We ve defined context-free grammars More powerful than regular expressions Learned a bit about makefiles Next time: we ll look at grammars in more detail

HW2 posted. Announcements

HW2 posted. Announcements HW2 posted Announcements Context-free grammars Roadmap Last :me Regex == DFA JLex for genera:ng Lexers/Scanners This :me CFGs, the underlying abstrac:on for Parsers RegExs Are Great! Perfect for tokenizing

More information

Defining syntax using CFGs

Defining syntax using CFGs Defining syntax using CFGs Roadmap Last time Defined context-free grammar This time CFGs for specifying a language s syntax Language membership List grammars Resolving ambiguity CFG Review G = (N,Σ,P,S)

More information

CMSC 330: Organization of Programming Languages. Context Free Grammars

CMSC 330: Organization of Programming Languages. Context Free Grammars CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

More information

Context-Free Grammars

Context-Free Grammars Context-Free Grammars Describing Languages We've seen two models for the regular languages: Finite automata accept precisely the strings in the language. Regular expressions describe precisely the strings

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Context Free Grammars and Parsing 1 Recall: Architecture of Compilers, Interpreters Source Parser Static Analyzer Intermediate Representation Front End Back

More information

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University CS415 Compilers Syntax Analysis These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Limits of Regular Languages Advantages of Regular Expressions

More information

Defining syntax using CFGs

Defining syntax using CFGs Defining syntax using CFGs Roadmap Last 8me Defined context-free grammar This 8me CFGs for syntax design Language membership List grammars Resolving ambiguity CFG Review G = (N,Σ,P,S) means derives derives

More information

CMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters

CMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters : Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Scanner Parser Static Analyzer Intermediate Representation Front End Back End Compiler / Interpreter

More information

Context Free Grammars. CS154 Chris Pollett Mar 1, 2006.

Context Free Grammars. CS154 Chris Pollett Mar 1, 2006. Context Free Grammars CS154 Chris Pollett Mar 1, 2006. Outline Formal Definition Ambiguity Chomsky Normal Form Formal Definitions A context free grammar is a 4-tuple (V, Σ, R, S) where 1. V is a finite

More information

A Simple Syntax-Directed Translator

A Simple Syntax-Directed Translator Chapter 2 A Simple Syntax-Directed Translator 1-1 Introduction The analysis phase of a compiler breaks up a source program into constituent pieces and produces an internal representation for it, called

More information

CS 314 Principles of Programming Languages

CS 314 Principles of Programming Languages CS 314 Principles of Programming Languages Lecture 5: Syntax Analysis (Parsing) Zheng (Eddy) Zhang Rutgers University January 31, 2018 Class Information Homework 1 is being graded now. The sample solution

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

More information

Optimizing Finite Automata

Optimizing Finite Automata Optimizing Finite Automata We can improve the DFA created by MakeDeterministic. Sometimes a DFA will have more states than necessary. For every DFA there is a unique smallest equivalent DFA (fewest states

More information

Announcements. Working in pairs is only allowed for programming assignments and not for homework problems. H3 has been posted

Announcements. Working in pairs is only allowed for programming assignments and not for homework problems. H3 has been posted Announcements Working in pairs is only allowed for programming assignments and not for homework problems H3 has been posted 1 Syntax Directed Transla@on 2 CFGs so Far CFGs for Language Defini&on The CFGs

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

More information

Theory and Compiling COMP360

Theory and Compiling COMP360 Theory and Compiling COMP360 It has been said that man is a rational animal. All my life I have been searching for evidence which could support this. Bertrand Russell Reading Read sections 2.1 3.2 in the

More information

COP 3402 Systems Software Syntax Analysis (Parser)

COP 3402 Systems Software Syntax Analysis (Parser) COP 3402 Systems Software Syntax Analysis (Parser) Syntax Analysis 1 Outline 1. Definition of Parsing 2. Context Free Grammars 3. Ambiguous/Unambiguous Grammars Syntax Analysis 2 Lexical and Syntax Analysis

More information

Context-Free Grammars

Context-Free Grammars Context-Free Grammars Describing Languages We've seen two models for the regular languages: Finite automata accept precisely the strings in the language. Regular expressions describe precisely the strings

More information

Syntax Analysis. Prof. James L. Frankel Harvard University. Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved.

Syntax Analysis. Prof. James L. Frankel Harvard University. Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved. Syntax Analysis Prof. James L. Frankel Harvard University Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved. Context-Free Grammar (CFG) terminals non-terminals start

More information

Syntax Analysis Part I

Syntax Analysis Part I Syntax Analysis Part I Chapter 4: Context-Free Grammars Slides adapted from : Robert van Engelen, Florida State University Position of a Parser in the Compiler Model Source Program Lexical Analyzer Token,

More information

Properties of Regular Expressions and Finite Automata

Properties of Regular Expressions and Finite Automata Properties of Regular Expressions and Finite Automata Some token patterns can t be defined as regular expressions or finite automata. Consider the set of balanced brackets of the form [[[ ]]]. This set

More information

CMSC 330: Organization of Programming Languages. Context Free Grammars

CMSC 330: Organization of Programming Languages. Context Free Grammars CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

More information

Introduction to Parsing. Lecture 5

Introduction to Parsing. Lecture 5 Introduction to Parsing Lecture 5 1 Outline Regular languages revisited Parser overview Context-free grammars (CFG s) Derivations Ambiguity 2 Languages and Automata Formal languages are very important

More information

CS 406/534 Compiler Construction Parsing Part I

CS 406/534 Compiler Construction Parsing Part I CS 406/534 Compiler Construction Parsing Part I Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy and Dr.

More information

CSE450 Translation of Programming Languages. Lecture 4: Syntax Analysis

CSE450 Translation of Programming Languages. Lecture 4: Syntax Analysis CSE450 Translation of Programming Languages Lecture 4: Syntax Analysis http://xkcd.com/859 Structure of a Today! Compiler Source Language Lexical Analyzer Syntax Analyzer Semantic Analyzer Int. Code Generator

More information

Ambiguous Grammars and Compactification

Ambiguous Grammars and Compactification Ambiguous Grammars and Compactification Mridul Aanjaneya Stanford University July 17, 2012 Mridul Aanjaneya Automata Theory 1/ 44 Midterm Review Mathematical Induction and Pigeonhole Principle Finite Automata

More information

Lexical Analysis. Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast!

Lexical Analysis. Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast! Lexical Analysis Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast! Compiler Passes Analysis of input program (front-end) character stream

More information

Architecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End

Architecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End Architecture of Compilers, Interpreters : Organization of Programming Languages ource Analyzer Optimizer Code Generator Context Free Grammars Intermediate Representation Front End Back End Compiler / Interpreter

More information

Where We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars

Where We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars CMSC 330: Organization of Programming Languages Context Free Grammars Where We Are Programming languages Ruby OCaml Implementing programming languages Scanner Uses regular expressions Finite automata Parser

More information

Parsing II Top-down parsing. Comp 412

Parsing II Top-down parsing. Comp 412 COMP 412 FALL 2018 Parsing II Top-down parsing Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2018, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled

More information

Context-Free Languages and Parse Trees

Context-Free Languages and Parse Trees Context-Free Languages and Parse Trees Mridul Aanjaneya Stanford University July 12, 2012 Mridul Aanjaneya Automata Theory 1/ 41 Context-Free Grammars A context-free grammar is a notation for describing

More information

Context-Free Grammars

Context-Free Grammars Context-Free Grammars Describing Languages We've seen two models for the regular languages: Automata accept precisely the strings in the language. Regular expressions describe precisely the strings in

More information

CMSC 330: Organization of Programming Languages. Context Free Grammars

CMSC 330: Organization of Programming Languages. Context Free Grammars CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

More information

Syntax Analysis Check syntax and construct abstract syntax tree

Syntax Analysis Check syntax and construct abstract syntax tree Syntax Analysis Check syntax and construct abstract syntax tree if == = ; b 0 a b Error reporting and recovery Model using context free grammars Recognize using Push down automata/table Driven Parsers

More information

Lecture 8: Context Free Grammars

Lecture 8: Context Free Grammars Lecture 8: Context Free s Dr Kieran T. Herley Department of Computer Science University College Cork 2017-2018 KH (12/10/17) Lecture 8: Context Free s 2017-2018 1 / 1 Specifying Non-Regular Languages Recall

More information

Formal Languages and Grammars. Chapter 2: Sections 2.1 and 2.2

Formal Languages and Grammars. Chapter 2: Sections 2.1 and 2.2 Formal Languages and Grammars Chapter 2: Sections 2.1 and 2.2 Formal Languages Basis for the design and implementation of programming languages Alphabet: finite set Σ of symbols String: finite sequence

More information

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Spring UW CSE P 501 Spring 2018 C-1

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Spring UW CSE P 501 Spring 2018 C-1 CSE P 501 Compilers Parsing & Context-Free Grammars Hal Perkins Spring 2018 UW CSE P 501 Spring 2018 C-1 Administrivia Project partner signup: please find a partner and fill out the signup form by noon

More information

CSX-lite Example. LL(1) Parse Tables. LL(1) Parser Driver. Example of LL(1) Parsing. An LL(1) parse table, T, is a twodimensional

CSX-lite Example. LL(1) Parse Tables. LL(1) Parser Driver. Example of LL(1) Parsing. An LL(1) parse table, T, is a twodimensional LL(1) Parse Tables CSX-lite Example An LL(1) parse table, T, is a twodimensional array. Entries in T are production numbers or blank (error) entries. T is indexed by: A, a non-terminal. A is the nonterminal

More information

CSE302: Compiler Design

CSE302: Compiler Design CSE302: Compiler Design Instructor: Dr. Liang Cheng Department of Computer Science and Engineering P.C. Rossin College of Engineering & Applied Science Lehigh University February 20, 2007 Outline Recap

More information

Grammars and Parsing. Paul Klint. Grammars and Parsing

Grammars and Parsing. Paul Klint. Grammars and Parsing Paul Klint Grammars and Languages are one of the most established areas of Natural Language Processing and Computer Science 2 N. Chomsky, Aspects of the theory of syntax, 1965 3 A Language...... is a (possibly

More information

Administrativia. PA2 assigned today. WA1 assigned today. Building a Parser II. CS164 3:30-5:00 TT 10 Evans. First midterm. Grammars.

Administrativia. PA2 assigned today. WA1 assigned today. Building a Parser II. CS164 3:30-5:00 TT 10 Evans. First midterm. Grammars. Administrativia Building a Parser II CS164 3:30-5:00 TT 10 Evans PA2 assigned today due in 12 days WA1 assigned today due in a week it s a practice for the exam First midterm Oct 5 will contain some project-inspired

More information

CSE 401/M501 18au Midterm Exam 11/2/18. Name ID #

CSE 401/M501 18au Midterm Exam 11/2/18. Name ID # Name ID # There are 7 questions worth a total of 100 points. Please budget your time so you get to all of the questions. Keep your answers brief and to the point. The exam is closed books, closed notes,

More information

Compilers Course Lecture 4: Context Free Grammars

Compilers Course Lecture 4: Context Free Grammars Compilers Course Lecture 4: Context Free Grammars Example: attempt to define simple arithmetic expressions using named regular expressions: num = [0-9]+ sum = expr "+" expr expr = "(" sum ")" num Appears

More information

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice. Parsing Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students

More information

Introduction to Parsing. Lecture 8

Introduction to Parsing. Lecture 8 Introduction to Parsing Lecture 8 Adapted from slides by G. Necula Outline Limitations of regular languages Parser overview Context-free grammars (CFG s) Derivations Languages and Automata Formal languages

More information

Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal)

Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal) Parsing Part II (Ambiguity, Top-down parsing, Left-recursion Removal) Ambiguous Grammars Definitions If a grammar has more than one leftmost derivation for a single sentential form, the grammar is ambiguous

More information

CSE 3302 Programming Languages Lecture 2: Syntax

CSE 3302 Programming Languages Lecture 2: Syntax CSE 3302 Programming Languages Lecture 2: Syntax (based on slides by Chengkai Li) Leonidas Fegaras University of Texas at Arlington CSE 3302 L2 Spring 2011 1 How do we define a PL? Specifying a PL: Syntax:

More information

Languages and Compilers

Languages and Compilers Principles of Software Engineering and Operational Systems Languages and Compilers SDAGE: Level I 2012-13 3. Formal Languages, Grammars and Automata Dr Valery Adzhiev vadzhiev@bournemouth.ac.uk Office:

More information

Introduction to Parsing. Lecture 5

Introduction to Parsing. Lecture 5 Introduction to Parsing Lecture 5 1 Outline Regular languages revisited Parser overview Context-free grammars (CFG s) Derivations Ambiguity 2 Languages and Automata Formal languages are very important

More information

Table-Driven Top-Down Parsers

Table-Driven Top-Down Parsers Table-Driven Top-Down Parsers Recursive descent parsers have many attractive features. They are actual pieces of code that can be read by programmers and extended. This makes it fairly easy to understand

More information

Part 3. Syntax analysis. Syntax analysis 96

Part 3. Syntax analysis. Syntax analysis 96 Part 3 Syntax analysis Syntax analysis 96 Outline 1. Introduction 2. Context-free grammar 3. Top-down parsing 4. Bottom-up parsing 5. Conclusion and some practical considerations Syntax analysis 97 Structure

More information

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter /15/ Hal Perkins & UW CSE C-1

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter /15/ Hal Perkins & UW CSE C-1 CSE P 501 Compilers Parsing & Context-Free Grammars Hal Perkins Winter 2008 1/15/2008 2002-08 Hal Perkins & UW CSE C-1 Agenda for Today Parsing overview Context free grammars Ambiguous grammars Reading:

More information

CSE 413 Programming Languages & Implementation. Hal Perkins Winter 2019 Grammars, Scanners & Regular Expressions

CSE 413 Programming Languages & Implementation. Hal Perkins Winter 2019 Grammars, Scanners & Regular Expressions CSE 413 Programming Languages & Implementation Hal Perkins Winter 2019 Grammars, Scanners & Regular Expressions 1 Agenda Overview of language recognizers Basic concepts of formal grammars Scanner Theory

More information

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1 Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1 1. Introduction Parsing is the task of Syntax Analysis Determining the syntax, or structure, of a program. The syntax is defined by the grammar rules

More information

Introduction to Parsing. Comp 412

Introduction to Parsing. Comp 412 COMP 412 FALL 2010 Introduction to Parsing Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make

More information

2.2 Syntax Definition

2.2 Syntax Definition 42 CHAPTER 2. A SIMPLE SYNTAX-DIRECTED TRANSLATOR sequence of "three-address" instructions; a more complete example appears in Fig. 2.2. This form of intermediate code takes its name from instructions

More information

Languages, Automata, Regular Expressions & Scanners. Winter /8/ Hal Perkins & UW CSE B-1

Languages, Automata, Regular Expressions & Scanners. Winter /8/ Hal Perkins & UW CSE B-1 CSE 401 Compilers Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter 2010 1/8/2010 2002-10 Hal Perkins & UW CSE B-1 Agenda Quick review of basic concepts of formal grammars Regular

More information

([1-9] 1[0-2]):[0-5][0-9](AM PM)? What does the above match? Matches clock time, may or may not be told if it is AM or PM.

([1-9] 1[0-2]):[0-5][0-9](AM PM)? What does the above match? Matches clock time, may or may not be told if it is AM or PM. What is the corresponding regex? [2-9]: ([1-9] 1[0-2]):[0-5][0-9](AM PM)? What does the above match? Matches clock time, may or may not be told if it is AM or PM. CS 230 - Spring 2018 4-1 More CFG Notation

More information

Syntax. Syntax. We will study three levels of syntax Lexical Defines the rules for tokens: literals, identifiers, etc.

Syntax. Syntax. We will study three levels of syntax Lexical Defines the rules for tokens: literals, identifiers, etc. Syntax Syntax Syntax defines what is grammatically valid in a programming language Set of grammatical rules E.g. in English, a sentence cannot begin with a period Must be formal and exact or there will

More information

THE COMPILATION PROCESS EXAMPLE OF TOKENS AND ATTRIBUTES

THE COMPILATION PROCESS EXAMPLE OF TOKENS AND ATTRIBUTES THE COMPILATION PROCESS Character stream CS 403: Scanning and Parsing Stefan D. Bruda Fall 207 Token stream Parse tree Abstract syntax tree Modified intermediate form Target language Modified target language

More information

CS 4120 Introduction to Compilers

CS 4120 Introduction to Compilers CS 4120 Introduction to Compilers Andrew Myers Cornell University Lecture 6: Bottom-Up Parsing 9/9/09 Bottom-up parsing A more powerful parsing technology LR grammars -- more expressive than LL can handle

More information

Syntax. In Text: Chapter 3

Syntax. In Text: Chapter 3 Syntax In Text: Chapter 3 1 Outline Syntax: Recognizer vs. generator BNF EBNF Chapter 3: Syntax and Semantics 2 Basic Definitions Syntax the form or structure of the expressions, statements, and program

More information

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing Roadmap > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing The role of the parser > performs context-free syntax analysis > guides

More information

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones Parsing III (Top-down parsing: recursive descent & LL(1) ) (Bottom-up parsing) CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones Copyright 2003, Keith D. Cooper,

More information

3. Parsing. Oscar Nierstrasz

3. Parsing. Oscar Nierstrasz 3. Parsing Oscar Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes. http://www.cs.ucla.edu/~palsberg/ http://www.cs.purdue.edu/homes/hosking/

More information

CS 403: Scanning and Parsing

CS 403: Scanning and Parsing CS 403: Scanning and Parsing Stefan D. Bruda Fall 2017 THE COMPILATION PROCESS Character stream Scanner (lexical analysis) Token stream Parser (syntax analysis) Parse tree Semantic analysis Abstract syntax

More information

Context-Free Grammars

Context-Free Grammars Context-Free Grammars Lecture 7 http://webwitch.dreamhost.com/grammar.girl/ Outline Scanner vs. parser Why regular expressions are not enough Grammars (context-free grammars) grammar rules derivations

More information

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! Any questions about the syllabus?! Course Material available at www.cs.unic.ac.cy/ioanna! Next time reading assignment [ALSU07]

More information

CS 536 Midterm Exam Spring 2013

CS 536 Midterm Exam Spring 2013 CS 536 Midterm Exam Spring 2013 ID: Exam Instructions: Write your student ID (not your name) in the space provided at the top of each page of the exam. Write all your answers on the exam itself. Feel free

More information

Part 5 Program Analysis Principles and Techniques

Part 5 Program Analysis Principles and Techniques 1 Part 5 Program Analysis Principles and Techniques Front end 2 source code scanner tokens parser il errors Responsibilities: Recognize legal programs Report errors Produce il Preliminary storage map Shape

More information

Project 1: Scheme Pretty-Printer

Project 1: Scheme Pretty-Printer Project 1: Scheme Pretty-Printer CSC 4101, Fall 2017 Due: 7 October 2017 For this programming assignment, you will implement a pretty-printer for a subset of Scheme in either C++ or Java. The code should

More information

CSE 413 Programming Languages & Implementation. Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions

CSE 413 Programming Languages & Implementation. Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions CSE 413 Programming Languages & Implementation Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions 1 Agenda Overview of language recognizers Basic concepts of formal grammars Scanner Theory

More information

Chapter 4. Lexical and Syntax Analysis

Chapter 4. Lexical and Syntax Analysis Chapter 4 Lexical and Syntax Analysis Chapter 4 Topics Introduction Lexical Analysis The Parsing Problem Recursive-Descent Parsing Bottom-Up Parsing Copyright 2012 Addison-Wesley. All rights reserved.

More information

Introduction to Parsing

Introduction to Parsing Introduction to Parsing The Front End Source code Scanner tokens Parser IR Errors Parser Checks the stream of words and their parts of speech (produced by the scanner) for grammatical correctness Determines

More information

CSCI312 Principles of Programming Languages!

CSCI312 Principles of Programming Languages! CSCI312 Principles of Programming Languages!! Chapter 3 Regular Expression and Lexer Xu Liu Recap! Copyright 2006 The McGraw-Hill Companies, Inc. Clite: Lexical Syntax! Input: a stream of characters from

More information

MIT Specifying Languages with Regular Expressions and Context-Free Grammars

MIT Specifying Languages with Regular Expressions and Context-Free Grammars MIT 6.035 Specifying Languages with Regular essions and Context-Free Grammars Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Language Definition Problem How to precisely

More information

Lexical Scanning COMP360

Lexical Scanning COMP360 Lexical Scanning COMP360 Captain, we re being scanned. Spock Reading Read sections 2.1 3.2 in the textbook Regular Expression and FSA Assignment A new assignment has been posted on Blackboard It is due

More information

Lexical and Syntax Analysis. Top-Down Parsing

Lexical and Syntax Analysis. Top-Down Parsing Lexical and Syntax Analysis Top-Down Parsing Easy for humans to write and understand String of characters Lexemes identified String of tokens Easy for programs to transform Data structure Syntax A syntax

More information

CSE 401 Midterm Exam Sample Solution 2/11/15

CSE 401 Midterm Exam Sample Solution 2/11/15 Question 1. (10 points) Regular expression warmup. For regular expression questions, you must restrict yourself to the basic regular expression operations covered in class and on homework assignments:

More information

Dr. D.M. Akbar Hussain

Dr. D.M. Akbar Hussain Syntax Analysis Parsing Syntax Or Structure Given By Determines Grammar Rules Context Free Grammar 1 Context Free Grammars (CFG) Provides the syntactic structure: A grammar is quadruple (V T, V N, S, R)

More information

EECS 6083 Intro to Parsing Context Free Grammars

EECS 6083 Intro to Parsing Context Free Grammars EECS 6083 Intro to Parsing Context Free Grammars Based on slides from text web site: Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. 1 Parsing sequence of tokens parser

More information

Habanero Extreme Scale Software Research Project

Habanero Extreme Scale Software Research Project Habanero Extreme Scale Software Research Project Comp215: Grammars Zoran Budimlić (Rice University) Grammar, which knows how to control even kings - Moliere So you know everything about regular expressions

More information

Building a Parser II. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

Building a Parser II. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1 Building a Parser II CS164 3:30-5:00 TT 10 Evans 1 Grammars Programming language constructs have recursive structure. which is why our hand-written parser had this structure, too An expression is either:

More information

Outline. Limitations of regular languages. Introduction to Parsing. Parser overview. Context-free grammars (CFG s)

Outline. Limitations of regular languages. Introduction to Parsing. Parser overview. Context-free grammars (CFG s) Outline Limitations of regular languages Introduction to Parsing Parser overview Lecture 8 Adapted from slides by G. Necula Context-free grammars (CFG s) Derivations Languages and Automata Formal languages

More information

Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;}

Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;} Compiler Construction Grammars Parsing source code scanner tokens regular expressions lexical analysis Lennart Andersson parser context free grammar Revision 2012 01 23 2012 parse tree AST builder (implicit)

More information

CSCI312 Principles of Programming Languages

CSCI312 Principles of Programming Languages Copyright 2006 The McGraw-Hill Companies, Inc. CSCI312 Principles of Programming Languages! LL Parsing!! Xu Liu Derived from Keith Cooper s COMP 412 at Rice University Recap Copyright 2006 The McGraw-Hill

More information

CS2210: Compiler Construction Syntax Analysis Syntax Analysis

CS2210: Compiler Construction Syntax Analysis Syntax Analysis Comparison with Lexical Analysis The second phase of compilation Phase Input Output Lexer string of characters string of tokens Parser string of tokens Parse tree/ast What Parse Tree? CS2210: Compiler

More information

Homework & Announcements

Homework & Announcements Homework & nnouncements New schedule on line. Reading: Chapter 18 Homework: Exercises at end Due: 11/1 Copyright c 2002 2017 UMaine School of Computing and Information S 1 / 25 COS 140: Foundations of

More information

Reflection in the Chomsky Hierarchy

Reflection in the Chomsky Hierarchy Reflection in the Chomsky Hierarchy Henk Barendregt Venanzio Capretta Dexter Kozen 1 Introduction We investigate which classes of formal languages in the Chomsky hierarchy are reflexive, that is, contain

More information

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised: EDAN65: Compilers, Lecture 06 A LR parsing Görel Hedin Revised: 2017-09-11 This lecture Regular expressions Context-free grammar Attribute grammar Lexical analyzer (scanner) Syntactic analyzer (parser)

More information

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis CS 1622 Lecture 2 Lexical Analysis CS 1622 Lecture 2 1 Lecture 2 Review of last lecture and finish up overview The first compiler phase: lexical analysis Reading: Chapter 2 in text (by 1/18) CS 1622 Lecture

More information

Introduction to Parsing Ambiguity and Syntax Errors

Introduction to Parsing Ambiguity and Syntax Errors Introduction to Parsing Ambiguity and Syntax rrors Outline Regular languages revisited Parser overview Context-free grammars (CFG s) Derivations Ambiguity Syntax errors 2 Languages and Automata Formal

More information

CSE 311 Lecture 21: Context-Free Grammars. Emina Torlak and Kevin Zatloukal

CSE 311 Lecture 21: Context-Free Grammars. Emina Torlak and Kevin Zatloukal CSE 311 Lecture 21: Context-Free Grammars Emina Torlak and Kevin Zatloukal 1 Topics Regular expressions A brief review of Lecture 20. Context-free grammars Syntax, semantics, and examples. 2 Regular expressions

More information

Syntax Analysis, VI Examples from LR Parsing. Comp 412

Syntax Analysis, VI Examples from LR Parsing. Comp 412 Midterm Exam: Thursday October 18, 7PM Herzstein Amphitheater Syntax Analysis, VI Examples from LR Parsing Comp 412 COMP 412 FALL 2018 source code IR IR target Front End Optimizer Back End code Copyright

More information

1 Parsing (25 pts, 5 each)

1 Parsing (25 pts, 5 each) CSC173 FLAT 2014 ANSWERS AND FFQ 30 September 2014 Please write your name on the bluebook. You may use two sides of handwritten notes. Perfect score is 75 points out of 85 possible. Stay cool and please

More information

Last Time. What do we want? When do we want it? An AST. Now!

Last Time. What do we want? When do we want it? An AST. Now! Java CUP 1 Last Time What do we want? An AST When do we want it? Now! 2 This Time A little review of ASTs The philosophy and use of a Parser Generator 3 Translating Lists CFG IdList -> id IdList comma

More information

B The SLLGEN Parsing System

B The SLLGEN Parsing System B The SLLGEN Parsing System Programs are just strings of characters. In order to process a program, we need to group these characters into meaningful units. This grouping is usually divided into two stages:

More information

Programming Language Syntax and Analysis

Programming Language Syntax and Analysis Programming Language Syntax and Analysis 2017 Kwangman Ko (http://compiler.sangji.ac.kr, kkman@sangji.ac.kr) Dept. of Computer Engineering, Sangji University Introduction Syntax the form or structure of

More information

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38 Syntax Analysis Martin Sulzmann Martin Sulzmann Syntax Analysis 1 / 38 Syntax Analysis Objective Recognize individual tokens as sentences of a language (beyond regular languages). Example 1 (OK) Program

More information

4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

More information