Specifying Syntax: Language Specification, Components of a Grammar, Types of Grammars, An English Grammar


Specifying Syntax (Chapter 1)

Language Specification
  Syntax: form of phrases; physical arrangement of symbols.
  Semantics: meaning of phrases; result of executing a program.
  Pragmatics: usage of the language; features of implementations.

Components of a Grammar
  1. Terminal symbols or terminals, Σ
  2. Nonterminal symbols or syntactic categories, N
     Vocabulary = Σ ∪ N
  3. Productions or rules, α ::= β, where α, β ∈ (Σ ∪ N)*
  4. Start symbol (a nonterminal)

Types of Grammars
  Type 0: Unrestricted grammars
          α ::= β, where α contains a nonterminal
  Type 1: Context-sensitive grammars
          αAγ ::= αβγ, where A is a nonterminal
  Type 2: Context-free grammars
          A ::= string of vocabulary symbols
  Type 3: Regular grammars
          A ::= a or A ::= aB, where a is a terminal and A, B are nonterminals

An English Grammar
  <sentence> ::= <noun phrase> <verb phrase> .
  <noun phrase> ::= <determiner> <noun>
                  | <determiner> <noun> <prepositional phrase>
  <verb phrase> ::= <verb>
                  | <verb> <noun phrase>
                  | <verb> <noun phrase> <prepositional phrase>
  <prepositional phrase> ::= <preposition> <noun phrase>
  <noun> ::= boy | girl | cat | telescope | song | feather
  <determiner> ::= a | the
  <verb> ::= saw | touched | surprised | sang
  <preposition> ::= by | with
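The four components of a grammar can be written down directly as data. The sketch below is one possible encoding of the English grammar, not something taken from the slides: the names GRAMMAR, derives, and matches are illustrative, nonterminals are the dictionary keys, and every other symbol is treated as a terminal. The recognizer relies on the fact that this particular grammar has no ε productions and no left recursion.

```python
from functools import lru_cache

# Productions: nonterminal -> list of alternative right-hand sides.
# Any symbol that is not a key of GRAMMAR is a terminal.
GRAMMAR = {
    "sentence": [["noun phrase", "verb phrase", "."]],
    "noun phrase": [["determiner", "noun"],
                    ["determiner", "noun", "prepositional phrase"]],
    "verb phrase": [["verb"],
                    ["verb", "noun phrase"],
                    ["verb", "noun phrase", "prepositional phrase"]],
    "prepositional phrase": [["preposition", "noun phrase"]],
    "noun": [[w] for w in "boy girl cat telescope song feather".split()],
    "determiner": [["a"], ["the"]],
    "verb": [[w] for w in "saw touched surprised sang".split()],
    "preposition": [["by"], ["with"]],
}


@lru_cache(maxsize=None)
def derives(symbol, tokens):
    """True if `symbol` can derive exactly the tuple `tokens`."""
    if symbol not in GRAMMAR:                      # terminal symbol
        return tokens == (symbol,)
    return any(matches(tuple(rhs), tokens) for rhs in GRAMMAR[symbol])


@lru_cache(maxsize=None)
def matches(rhs, tokens):
    """True if the symbol sequence `rhs` can derive exactly `tokens`."""
    if not rhs:
        return not tokens
    head, rest = rhs[0], rhs[1:]
    # Every symbol derives at least one token, so reserve len(rest) tokens.
    for i in range(1, len(tokens) - len(rest) + 1):
        if derives(head, tokens[:i]) and matches(rest, tokens[i:]):
            return True
    return False


sentence = tuple("the girl touched the cat with a feather .".split())
print(derives("sentence", sentence))
```

The start symbol is passed explicitly as the first argument, so the same function can also test whether a phrase belongs to any other syntactic category.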

A Derivation (for Figure 1.4)
  <sentence>
    ⇒ <noun phrase> <verb phrase> .
    ⇒ <determiner> <noun> <verb phrase> .
    ⇒ the <noun> <verb phrase> .
    ⇒ the girl <verb phrase> .
    ⇒ the girl <verb> <noun phrase> .
    ⇒ the girl touched <noun phrase> .
    ⇒ the girl touched <determiner> <noun> <prep phrase> .
    ⇒ the girl touched the <noun> <prep phrase> .
    ⇒ the girl touched the cat <prep phrase> .
    ⇒ the girl touched the cat <preposition> <noun phrase> .
    ⇒ the girl touched the cat with <noun phrase> .
    ⇒ the girl touched the cat with <determiner> <noun> .
    ⇒ the girl touched the cat with a <noun> .
    ⇒ the girl touched the cat with a feather .
  Note: example of a context-sensitive grammar.

Concrete Syntax for Wren
Phrase-structure syntax:
  <program> ::= program <ident> is <block>
  <block> ::= <declaration seq> begin <cmd seq> end
  <declaration seq> ::= ε | <declaration> <declaration seq>
  <declaration> ::= var <var list> : <type> ;
  <type> ::= integer | boolean
  <var list> ::= <var> | <var> , <var list>
  <cmd seq> ::= <cmd> | <cmd> ; <cmd seq>
  <cmd> ::= <var> := <expr>
          | skip
          | read <var>
          | write <int expr>
          | while <bool expr> do <cmd seq> end while
          | if <bool expr> then <cmd seq> end if
          | if <bool expr> then <cmd seq> else <cmd seq> end if
  <expr> ::= <int expr> | <bool expr>
  <int expr> ::= <term> | <int expr> <weak op> <term>
  <term> ::= <element> | <term> <strong op> <element>
  <element> ::= <numeral> | <var> | ( <int expr> ) | - <element>
  <bool expr> ::= <bool term> | <bool expr> or <bool term>
  <bool term> ::= <bool element> | <bool term> and <bool element>
  <bool element> ::= true | false | <var> | <comparison>
                   | not ( <bool expr> ) | ( <bool expr> )
  <comparison> ::= <int expr> <relation> <int expr>
  <var> ::= <ident>

Lexical syntax:
  <relation> ::= <= | < | = | > | >= | <>
  <weak op> ::= + | -
  <strong op> ::= * | /
  <ident> ::= <letter> | <ident> <letter> | <ident> <digit>
  <letter> ::= a | b | c | d | e | f | g | h | i | j | k | l | m
             | n | o | p | q | r | s | t | u | v | w | x | y | z
  <numeral> ::= <digit> | <digit> <numeral>
  <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Tokens:
  <token> ::= <ident> | <numeral> | <reserved word> | <relation>
            | <weak op> | <strong op> | := | ( | ) | , | ; | :
  <reserved word> ::= program | is | begin | end | var | integer | boolean
                    | read | write | skip | while | do | if | then | else
                    | and | or | true | false | not
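The lexical syntax and token list above are regular, so a scanner for them can be sketched with regular expressions. The following is a minimal sketch, assuming Python's standard re module; the names RESERVED, TOKEN_RE, and scan, as well as the token kind labels, are illustrative and do not come from the slides.

```python
import re

# Reserved words of Wren, as listed under <reserved word>.
RESERVED = {
    "program", "is", "begin", "end", "var", "integer", "boolean",
    "read", "write", "skip", "while", "do", "if", "then", "else",
    "and", "or", "true", "false", "not",
}

# One named alternative per token class; longer operators come first
# so that "<=" is not split into "<" and "=".
TOKEN_RE = re.compile(r"""
    (?P<numeral>[0-9]+)
  | (?P<word>[a-z][a-z0-9]*)
  | (?P<assign>:=)
  | (?P<relation><=|<>|>=|<|=|>)
  | (?P<weak_op>[+-])
  | (?P<strong_op>[*/])
  | (?P<punct>[(),;:])
  | (?P<space>\s+)
""", re.VERBOSE)


def scan(text):
    """Map a character string to a list of (kind, lexeme) pairs."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at position {pos}")
        pos = m.end()
        kind, lexeme = m.lastgroup, m.group()
        if kind == "space":
            continue                      # white space separates tokens only
        if kind == "word":                # reserved word or identifier?
            kind = "reserved" if lexeme in RESERVED else "ident"
        tokens.append((kind, lexeme))
    return tokens


print(scan("while divisor<= num/2 and not(done) do"))
```

Matching a whole word before checking it against RESERVED is a design choice: it keeps an identifier such as done from being split into the reserved word do followed by ne.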

Sample Wren Program
  program prime is
     var num,divisor : integer;
     var done : boolean;
  begin
     read num;
     while num>0 do
        divisor := 2;
        done := false;
        while divisor<= num/2 and not(done) do
           done := num = divisor*(num/divisor);
           divisor := divisor+1
        end while;
        if done then write 0 else write num end if;
        read num
     end while
  end

Ambiguity
  One phrase has two different derivation trees, thus two different syntactic structures.
  Ambiguous structure may lead to ambiguous semantics.
  Examples:
    Dangling else
    Associativity of expressions (ex 2, page 16)
    Ada: Is a(2) a subscripted variable or a function call?

Classifying Errors in Wren
  Context-free syntax, specified by concrete syntax
    lexical syntax
    phrase-structure syntax
  Context-sensitive syntax (context constraints), also called static semantics
  Semantic or dynamic errors

Context Constraints in Wren
  1. The program name identifier may not be declared elsewhere in the program.
  2. All identifiers that appear in a block must be declared in that block.
  3. No identifier may be declared more than once in a block.
  4. The identifier on the left side of an assignment command must be declared as a variable, and the expression on the right must be of the same type.
  5. An identifier occurring as an (integer) element must be an integer variable.
  6. An identifier occurring as a Boolean element must be a Boolean variable.
  7. An identifier occurring in a read command must be an integer variable.

Syntax of Wren
  <token>*: arbitrary strings of tokens
  <program>: strings derived from the context-free grammar
  Syntactically well-formed Wren programs: derived from <program> and also satisfying the context constraints

Semantic Errors in Wren
  1. An attempt is made to divide by zero.
  2. A variable that has not been initialized is accessed.
  3. A read command is executed when the input file is empty.
  4. An iteration command (while) does not terminate.
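Context constraints cannot be expressed in the context-free grammar, but they can be checked with a symbol table after parsing. The sketch below illustrates constraints 2, 3, and 7 only. The representation of declarations as (names, type) pairs and the function names check_declarations and check_read are assumptions made for this example, not part of Wren or the slides.

```python
def check_declarations(declarations):
    """Build a symbol table from (names, type) pairs.

    Returns (table, errors); a repeated name violates constraint 3.
    """
    table, errors = {}, []
    for names, typ in declarations:
        for name in names:
            if name in table:
                errors.append(f"{name} is declared more than once")
            else:
                table[name] = typ
    return table, errors


def check_read(var, table):
    """Check `read var` against constraints 2 and 7."""
    if var not in table:
        return [f"{var} is not declared"]
    if table[var] != "integer":
        return [f"read {var}: {var} is not an integer variable"]
    return []


# Declarations of the sample program `prime`.
table, errors = check_declarations([
    (["num", "divisor"], "integer"),
    (["done"], "boolean"),
])
print(errors)                      # no duplicate declarations
print(check_read("num", table))    # legal: num is an integer variable
print(check_read("done", table))   # violates constraint 7
```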

Regular Expressions Denote Languages
  ε
  a, for each a ∈ Σ
  (E | F)   union
  (E • F)   concatenation
  (E*)      Kleene closure
  May omit some parentheses following the precedence rules.

  Abbreviations
    E1 E2 = E1 • E2
    E+ = E • E*
    E^n = E • E • ... • E, repeated n times
    E? = ε | E
  Note: E* = ε | E^1 | E^2 | E^3 | ...

Nonessential Ambiguity
  Command sequences:
    <cmd seq> ::= <command> | <command> ; <cmd seq>
    <cmd seq> ::= <command> | <cmd seq> ; <command>
    <cmd seq> ::= <command> ( ; <command> )*
  Or include command sequences with commands:
    <command> ::= <command> ; <command> | skip | read <var> | ...

Language Processing
  scan : Character* → Token*
  parse : Token* → ????

Concrete Syntax (BNF)
  Specifies the physical form of programs.
  Guides the parser in determining the phrase structure of a program.
  Should not be ambiguous.
  A derivation proceeds by replacing one nonterminal at a time by its corresponding right-hand side in some production for that nonterminal.
  A derivation following a BNF definition of a language can be summarized in a derivation (parse) tree.
  Although a parser may not produce a derivation tree, the structure of the tree is embodied in the parsing process.

Derivation Tree
  read m; if m=-9 then m:=0 end if; write 22*m
  [Figure: derivation tree for the command sequence above, rooted at <cmd seq>, with leaves including num(22), num(0), and num(9).]
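The iterative form <cmd seq> ::= <command> ( ; <command> )* maps directly onto a loop, which sidesteps the choice between the left-recursive and right-recursive rules. The sketch below is a minimal illustration restricted to two commands, skip and read <var>. The function names, the use of plain strings as tokens, and the tuple results are assumptions for this example, and the variable check is deliberately loose.

```python
def parse_command(tokens, pos):
    """Parse one command starting at tokens[pos]; return (command, next pos)."""
    if pos < len(tokens) and tokens[pos] == "skip":
        return ("skip",), pos + 1
    if (pos + 1 < len(tokens) and tokens[pos] == "read"
            and tokens[pos + 1].isalnum()):
        # Loose check only: a real parser would require a non-reserved <ident>.
        return ("read", tokens[pos + 1]), pos + 2
    raise SyntaxError(f"command expected at position {pos}")


def parse_cmd_seq(tokens):
    """<cmd seq> ::= <command> ( ; <command> )*  as a flat list of commands."""
    cmd, pos = parse_command(tokens, 0)
    commands = [cmd]
    while pos < len(tokens) and tokens[pos] == ";":
        cmd, pos = parse_command(tokens, pos + 1)
        commands.append(cmd)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return commands


print(parse_cmd_seq(["read", "m", ";", "skip", ";", "read", "n"]))
```

The result is a flat list, so the question of whether the semicolon associates to the left or to the right never arises; that is the sense in which the ambiguity is nonessential.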

Abstract Syntax
  Representation of the structure of language constructs as determined by the parser.
  The hierarchical structure of language constructs can be represented as abstract syntax trees.
  The goal of abstract syntax is to reduce language constructs to their essential properties.
  Specifying the abstract syntax of a language means providing the possible templates for the various constructs of the language.
  Abstract syntax generally allows ambiguity since parsing has already been done.
  Semantic specifications based on abstract syntax are much simpler.

Abstract Syntax Tree
  read m; if m=-9 then m:=0 end if; write 22 * m
  [Figure: two alternative abstract syntax trees for the command sequence above, with nodes commands, read, if, write, assign, equal, minus, times, expr, num(22), num(9), and num(0).]

Designing Abstract Syntax
  Consider concrete syntax for expressions:
    <expr> ::= <int expr> | <bool expr>
    <int expr> ::= <term> | <int expr> <weak op> <term>
    <term> ::= <element> | <term> <strong op> <element>
    <element> ::= <numeral> | <var> | ( <int expr> ) | - <element>
    <bool expr> ::= <bool term> | <bool expr> or <bool term>
    <bool term> ::= <bool element> | <bool term> and <bool element>
    <bool element> ::= true | false | <var> | <comparison>
                     | ( <bool expr> ) | not ( <bool expr> )
    <comparison> ::= <int expr> <relation> <int expr>

  Assume we only have to deal with programs that are syntactically correct.
  What basic forms can an expression take?
    <int expr> <weak op> <term>
    <term> <strong op> <element>
    <numeral>
    <variable>
    - <element>
    <bool expr> or <bool term>
    <bool term> and <bool element>
    true
    false
    not ( <bool expr> )
    <int expr> <relation> <int expr>

  Abstract Syntax:
    Expression ::= Numeral | Identifier | true | false
                 | Expression Operator Expression
                 | - Expression | not( Expression )
    Operator ::= + | - | * | / | or | and | <= | < | = | > | >= | <>
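The step from concrete to abstract syntax can be sketched for integer expressions. In the example below, the precedence levels <int expr>, <term>, and <element> exist only as parsing functions, while the resulting tree uses a single operator node, so parentheses and precedence categories vanish from the result. The tuple shapes ("exp", op, left, right), ("minus", e), ("num", n), and ("ide", x) and all function names are assumptions for this sketch. The left-recursive rules are implemented as loops, which preserves left associativity.

```python
def parse_int_expr(tokens):
    """Parse a list of token strings as <int expr>; return an abstract tree."""
    tree, pos = _int_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


def _int_expr(tokens, pos):
    # <int expr> ::= <term> | <int expr> <weak op> <term>
    left, pos = _term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = _term(tokens, pos + 1)
        left = ("exp", op, left, right)
    return left, pos


def _term(tokens, pos):
    # <term> ::= <element> | <term> <strong op> <element>
    left, pos = _element(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("*", "/"):
        op = tokens[pos]
        right, pos = _element(tokens, pos + 1)
        left = ("exp", op, left, right)
    return left, pos


def _element(tokens, pos):
    # <element> ::= <numeral> | <var> | ( <int expr> ) | - <element>
    if pos >= len(tokens):
        raise SyntaxError("element expected at end of input")
    tok = tokens[pos]
    if tok.isdigit():
        return ("num", int(tok)), pos + 1
    if tok.isalnum() and tok[0].isalpha():
        return ("ide", tok), pos + 1
    if tok == "-":
        operand, pos = _element(tokens, pos + 1)
        return ("minus", operand), pos
    if tok == "(":
        inner, pos = _int_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("missing )")
        return inner, pos + 1          # parentheses leave no trace in the tree
    raise SyntaxError(f"unexpected token {tok!r}")


print(parse_int_expr(["(", "a", "+", "1", ")", "*", "2"]))
```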

Abstract Syntax
  Abstract Syntactic Categories
    Program      Type        Operator
    Block        Command     Numeral
    Declaration  Expression  Identifier

  Abstract Production Rules
    Program ::= program Identifier is Block
    Block ::= Declaration* begin Command+ end
    Declaration ::= var Identifier+ : Type ;
    Type ::= integer | boolean
    Command ::= Identifier := Expression
              | skip
              | read Identifier
              | write Expression
              | while Expression do Command+
              | if Expression then Command+
              | if Expression then Command+ else Command+
    Expression ::= Numeral | Identifier | true | false
                 | Expression Operator Expression
                 | - Expression | not( Expression )
    Operator ::= + | - | * | / | or | and | <= | < | = | > | >= | <>

Alternative Abstract Syntax
  Abstract Production Rules
    Program ::= prog (Identifier, Block)
    Block ::= block (Declaration*, Command+)
    Declaration ::= dec (Identifier+, Type)
    Type ::= integer | boolean
    Command ::= assign (Identifier, Expression)
              | skip
              | read (Identifier)
              | write (Expression)
              | while (Expression, Command+)
              | if (Expression, Command+)
              | ifelse (Expression, Command+, Command+)
    Expression ::= Numeral | Identifier | true | false
                 | not (Expression)
                 | minus (Expression)
                 | exp (Operator, Expression, Expression)
    Operator ::= + | - | * | / | or | and | <= | < | = | > | >= | <>

Example
  read m; if m=-9 then m:=0 end if; write 22*m

  [ read (m),
    if (exp (equal, ide(m), minus (num(9))), [ assign (m, num(0)) ]),
    write (exp (times, num(22), ide(m))) ]

Language Processing
  scan : Character* → Token*
  parse : Token* → Abstract Syntax Tree
  Is either of these functions one-to-one? Explain.
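The alternative abstract syntax reads naturally as a set of record types, one per template. The sketch below covers only the templates needed for the example; the class names, field names, and the show function are illustrative choices, not part of the slides. It builds the tree for the example command sequence by hand and prints it in the bracketed notation used above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Num:
    value: int

@dataclass(frozen=True)
class Ide:
    name: str

@dataclass(frozen=True)
class Minus:
    operand: object

@dataclass(frozen=True)
class Exp:
    op: str
    left: object
    right: object

@dataclass(frozen=True)
class Read:
    var: str

@dataclass(frozen=True)
class Write:
    expr: object

@dataclass(frozen=True)
class Assign:
    var: str
    expr: object

@dataclass(frozen=True)
class If:
    test: object
    then: tuple


def show(node):
    """Render a tree (or a sequence of commands) in prefix notation."""
    if isinstance(node, (list, tuple)):
        return "[" + ", ".join(show(n) for n in node) + "]"
    if isinstance(node, Num):
        return f"num({node.value})"
    if isinstance(node, Ide):
        return f"ide({node.name})"
    if isinstance(node, Minus):
        return f"minus({show(node.operand)})"
    if isinstance(node, Exp):
        return f"exp({node.op}, {show(node.left)}, {show(node.right)})"
    if isinstance(node, Read):
        return f"read({node.var})"
    if isinstance(node, Write):
        return f"write({show(node.expr)})"
    if isinstance(node, Assign):
        return f"assign({node.var}, {show(node.expr)})"
    if isinstance(node, If):
        return f"if({show(node.test)}, {show(node.then)})"
    raise TypeError(f"not an abstract syntax node: {node!r}")


# read m; if m=-9 then m:=0 end if; write 22*m
example = (
    Read("m"),
    If(Exp("equal", Ide("m"), Minus(Num(9))), (Assign("m", Num(0)),)),
    Write(Exp("times", Num(22), Ide("m"))),
)
print(show(example))
```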