Syntax/semantics. Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing

Similar documents
A simple syntax-directed

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield

Syntax. A. Bellaachia Page: 1

A Simple Syntax-Directed Translator

Syntax. In Text: Chapter 3

Syntax Analysis Check syntax and construct abstract syntax tree

Parsing II Top-down parsing. Comp 412

COP 3402 Systems Software Top Down Parsing (Recursive Descent)

Principles of Programming Languages COMP251: Syntax and Grammars

EECS 6083 Intro to Parsing Context Free Grammars

A programming language requires two major definitions A simple one pass compiler

CSE 3302 Programming Languages Lecture 2: Syntax

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Introduction to Parsing

Chapter 4. Syntax - the form or structure of the expressions, statements, and program units

Chapter 4: Syntax Analyzer

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

3. Context-free grammars & parsing

Lexical Scanning COMP360

CPS 506 Comparative Programming Languages. Syntax Specification

COP 3402 Systems Software Syntax Analysis (Parser)

Theoretical Part. Chapter one:- - What are the Phases of compiler? Answer:

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.

Context-free grammars (CFG s)

Lexical and Syntax Analysis. Top-Down Parsing

Compiler Theory. (Semantic Analysis and Run-Time Environments)

ECE251 Midterm practice questions, Fall 2010

EDA180: Compiler Construc6on. Top- down parsing. Görel Hedin Revised: a

Describing Syntax and Semantics

Properties of Regular Expressions and Finite Automata

Course Overview. Introduction (Chapter 1) Compiler Frontend: Today. Compiler Backend:

Lexical and Syntax Analysis

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

Optimizing Finite Automata

UNIVERSITY OF CALIFORNIA

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

Chapter 3. Syntax - the form or structure of the expressions, statements, and program units

Chapter 3. Describing Syntax and Semantics

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:

Chapter 3. Describing Syntax and Semantics ISBN

Theory and Compiling COMP360

Defining syntax using CFGs

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

EDA180: Compiler Construc6on Context- free grammars. Görel Hedin Revised:

Chapter 3. Describing Syntax and Semantics ISBN

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

Specifying Syntax. An English Grammar. Components of a Grammar. Language Specification. Types of Grammars. 1. Terminal symbols or terminals, Σ

Dr. D.M. Akbar Hussain

COP4020 Programming Assignment 2 - Fall 2016

Chapter 3. Describing Syntax and Semantics

CSE 12 Abstract Syntax Trees

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών

CS 314 Principles of Programming Languages

Context-Free Grammars

LECTURE 3. Compiler Phases

Programming Language Definition. Regular Expressions

CSCI 1260: Compilers and Program Analysis Steven Reiss Fall Lecture 4: Syntax Analysis I

CMSC 330: Organization of Programming Languages. Formal Semantics of a Prog. Lang. Specifying Syntax, Semantics

Chapter 3. Describing Syntax and Semantics

Chapter 3. Parsing #1

Chapter 4. Abstract Syntax

CSE 401 Midterm Exam Sample Solution 2/11/15

Syntax Analysis Part I

A language is a subset of the set of all strings over some alphabet. string: a sequence of symbols alphabet: a set of symbols

Programming Languages

Review main idea syntax-directed evaluation and translation. Recall syntax-directed interpretation in recursive descent parsers

Syntax Analysis. Chapter 4

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input.

Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;}

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

Programming Language Syntax and Analysis

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

JavaCC Parser. The Compilation Task. Automated? JavaCC Parser

7. Introduction to Denotational Semantics. Oscar Nierstrasz

CSE 401 Midterm Exam Sample Solution 11/4/11

Habanero Extreme Scale Software Research Project

CS 536 Midterm Exam Spring 2013

Parsing #1. Leonidas Fegaras. CSE 5317/4305 L3: Parsing #1 1

CSE302: Compiler Design

COP4020 Spring 2011 Midterm Exam

Programming Languages, Summary CSC419; Odelia Schwartz

Programming Language Specification and Translation. ICOM 4036 Fall Lecture 3

1. Explain the input buffer scheme for scanning the source program. How the use of sentinels can improve its performance? Describe in detail.

22c:111 Programming Language Concepts. Fall Syntax III

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

Recursive Descent Parsers

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Part 5 Program Analysis Principles and Techniques

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

Chapter 3: Syntax and Semantics. Syntax and Semantics. Syntax Definitions. Matt Evett Dept. Computer Science Eastern Michigan University 1999

Examples of attributes: values of evaluated subtrees, type information, source file coordinates,

ICOM 4036 Spring 2004

Defining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1

Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal)

Compilers Course Lecture 4: Context Free Grammars

Transcription:

Syntax/semantics Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing Meta-models 8/27/10 1

Program <> program execution Syntax Semantics 8/27/10 2

Syntax <> Semantics A description of a programming language consists of two main components: Syntactic rules: what form does a legal program have. Semantic rules: what the sentences in the language mean. Static semantics: rules that may be checked before the execution of the program, e.g.: All variables must be declared. Declaration and use of variables coincide (type check). Dynamic semantics: rules saying what shall happen during the execution of the program, e.g. in terms of an operational semantics, that is a semantics that describes the behaviour of a (idealised) abstract machine performing a program, or by mapping to something else (but well-known and well-defined) - denotational semantics. 8/27/10 3

Compiler/interpreter A compiler translates a program to another language, typically a machine language or to a language for a virtual machine. An interpreter reads a program and simulates its operations. Both are based upon an abstract syntax tree representation of the program 8/27/10 4

Syntax described by BNF-grammars production, rule e ::= n e ::= e + e e ::= e - e n ::= d n ::= nd d ::= 0 d ::= 1... d ::= 9 nonterminal terminal terminal metasymbol meta- language 8/27/10 5

Extended BNF In Extended BNF we can use the following metasymbols on the righthand side: alternatives? optionality * zero or more times + one or more times {...} grouping symbols e ::= n e + e e - e n ::= d nd d ::= 0 1 2 3 4 5 6 7 8 9 8/27/10 6

Derivation of sentences The possible sentences in a language defined by a BNF-grammar are those that emerge by following this procedure: 1. Start with the start symbol (e). 2. For each nonterminal symbol (e, n, d) exchange this with one of the alternatives on the right hand side of the production defining this nonterminal. 3. Repeat 2 until only terminal symbols. This is called a derivation from the start symbol to a sentence, and it may be represented by a parse tree / syntax tree Removing unnecessary derivations and nodes like - and + gives an abstract syntax tree e ::= n e + e e - e n ::= d nd d ::= 0 1 2 3 4 5 6 7 8 9 8/27/10 10-15 + 12

10-15 + 12? 10 (15 + 12) = 10 27 = -17 (10 15) + 12 = -5 + 12 = 7

Unambiguous/ Ambiguous Grammars If every sentence in the language can be derived by one and only one parse tree, then the grammar is unambiguous, otherwise it is ambiguous. e ::= 0 1 e + e e - e e * e Ambiguity handled by precedence and associativity rules

s ::= v ::= e ::= b ::= v := e s ; s if b then s if b then s else s x y z v 0 1 2 3 4 e = e 8/27/10 10

x:=1; y:=2; if x=y then y:=3 8/27/10 11

if b 1 then if b 2 then s 1 else s 2 is s 2 executed? b 1 = true b 2 = false yes no b 1 = false b 2 = true 8/27/10 12 no yes

8/27/10 13

Alternatives to grammars Syntax diagrams Automata/State Machines 8/27/10 14

Syntax diagram <exp> ::= <exp> + <term> <term> <term> ::= <term> * <num> <num> 8/27/10 15

Automata/State Machines Transitions marked with terminals, one start state and a number of stop states Recognizes a string in the language if the terminals represent a valid sequence of transitions ending up in a stop state upon reading the last symbol 8/27/10 16

Scanning A scanner groups characters to symbols called tokens BEGIN IDENT LPAR TEXT RPAR END begin OutText ( Hello ) end A scanner is normally constructed as an automata/state machine 8/27/10 17

Parsing To check that a sentence (or a program) is syntactically correct, that is to construct the corresponding syntax tree. In general we would like to construct the tree by reading the sentence once, from left to right. Example grammar <exp> ::= <exp> + <term> <term> <term> ::= <term> * <num> <num> 8/27/10 18

Top-down parsing The parse tree is constructed downwards, that is we start with the start symbol and try to derive the actual sentence by selecting appropriate rules: <exp> ::= <exp> + <term> <term> <term> ::= <term> * <num> <num> 8/27/10

Bottom-up parsing The tree is constructed upwards. Starts by finding part of the sentence the corresponds to the right hand side of a production and reduces this part of the sentence to the corresponding nonterminal. The goal is to reduce until the start symbol. <exp> ::= <exp> + <term> <term> <term> ::= <term> * <num> <num> 8/27/10 20

LL(1)-parsing LL(1)-parsing is a top-down strategy with a left derivation from the start symbol. Recursive descent To each nonterminal there is a method. The method takes care of the rule for for this nonterminal, and may call other methods. For each terminal in the right hand side: Check that the next symbol (from the scanner) is this terminal. For each nonterminal in the right hand side: Call the corresponding method. When the method is called, the scanner shall have as its next symbol the first symbol of the corresponding rule. When the method is finished, the scanner shall have as its next symbol the first symbol after the sentence. 8/27/10 21

Example <program> <stmtlist> <stmtlist> <stmt> + <stmt> <input> <output> <assignment> <input>? <variable> <output>! <variable> <assignment> <variable> = <variable> <operator> <operand> <operator> + - <operand> <variable> <number> <variable> v <digit> <digit> 0 1 2 3 4 5 6 7 8 9 <number> <digit> + static void assignment() { variable(); readsymbol('='); variable(); operator(); operand(); } 8/27/10 22

Example <program> <stmtlist> <stmtlist> <stmt> + <stmt> <input> <output> <assignment> <input>? <variable> <output>! <variable> <assignment> <variable> = <variable> <operator> <operand> <operator> + - <operand> <variable> <number> <variable> v <digit> <digit> 0 1 2 3 4 5 6 7 8 9 <number> <digit> + static void stmt() { if (checksymbol('v')) { assignment(); } else if (checksymbol('?')) { input(); } else if (checksymbol('!')) { output(); } 8/27/10 23

Yet another alternative: Meta-models Object model representing the program (not the execution) <statement> <assignment> <if-then-else> <while-do> 8/27/10 24

Why meta models? Inspired by abstract syntax trees in terms of object structures, interchange formats between tools Not all modeling/programming tools are parser-based (e.g. wizards) Growing interest in domain specific languages, often with a mixture of text and graphics Meta models often include name binding and type information in addition to the pure abstract syntax tree 8/27/10 25

Example Metamodel 8/27/10 26