Describing Syntax and Semantics

Similar documents
Chapter 3. Describing Syntax and Semantics ISBN

Chapter 3. Describing Syntax and Semantics

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield

Chapter 3. Describing Syntax and Semantics

Chapter 4. Syntax - the form or structure of the expressions, statements, and program units

Syntax. In Text: Chapter 3

Chapter 3. Syntax - the form or structure of the expressions, statements, and program units

Chapter 3. Describing Syntax and Semantics ISBN

Chapter 3. Describing Syntax and Semantics

Syntax and Semantics

Chapter 3: Syntax and Semantics. Syntax and Semantics. Syntax Definitions. Matt Evett Dept. Computer Science Eastern Michigan University 1999

Programming Language Syntax and Analysis

3. DESCRIBING SYNTAX AND SEMANTICS

Programming Languages

Chapter 3. Topics. Languages. Formal Definition of Languages. BNF and Context-Free Grammars. Grammar 2/4/2019

Habanero Extreme Scale Software Research Project

Programming Language Definition. Regular Expressions

Programming Language Specification and Translation. ICOM 4036 Fall Lecture 3

ICOM 4036 Spring 2004

Chapter 3. Describing Syntax and Semantics. Introduction. Why and How. Syntax Overview

P L. rogramming anguages. Fall COS 301 Programming Languages. Syntax & Semantics. UMaine School of Computing and Information Science

Syntax & Semantics. COS 301 Programming Languages

Chapter 3 (part 3) Describing Syntax and Semantics

Syntax. A. Bellaachia Page: 1

Defining Languages GMU

COP 3402 Systems Software Syntax Analysis (Parser)

EECS 6083 Intro to Parsing Context Free Grammars

Dr. D.M. Akbar Hussain

Where We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars

CPS 506 Comparative Programming Languages. Syntax Specification

Principles of Programming Languages COMP251: Syntax and Grammars

Homework & Announcements

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

10/18/18. Outline. Semantic Analysis. Two types of semantic rules. Syntax vs. Semantics. Static Semantics. Static Semantics.

3. Context-free grammars & parsing

CMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters

Architecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End

Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;}

Lecture 4: Syntax Specification

Theoretical Part. Chapter one:- - What are the Phases of compiler? Answer:

CMSC 330: Organization of Programming Languages

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part2 3.3 Parse Trees and Abstract Syntax Trees

COSC252: Programming Languages: Semantic Specification. Jeremy Bolton, PhD Adjunct Professor

COP 3402 Systems Software Top Down Parsing (Recursive Descent)

CMSC 330: Organization of Programming Languages. Context Free Grammars

A programming language requires two major definitions A simple one pass compiler

Formal Languages. Formal Languages

CMSC 330: Organization of Programming Languages

EDA180: Compiler Construc6on Context- free grammars. Görel Hedin Revised:

CMSC 330: Organization of Programming Languages. Context Free Grammars

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1

Introduction to Parsing

This book is licensed under a Creative Commons Attribution 3.0 License

CSE 3302 Programming Languages Lecture 2: Syntax

CMSC 330: Organization of Programming Languages

Part 5 Program Analysis Principles and Techniques

Defining syntax using CFGs

A simple syntax-directed

CS 315 Programming Languages Syntax. Parser. (Alternatively hand-built) (Alternatively hand-built)

Lexical Scanning COMP360

Parsing II Top-down parsing. Comp 412

Formal Languages and Grammars. Chapter 2: Sections 2.1 and 2.2

CSE 311 Lecture 21: Context-Free Grammars. Emina Torlak and Kevin Zatloukal

Introduction to Lexing and Parsing

CMSC 330: Organization of Programming Languages. Context Free Grammars

CSE 12 Abstract Syntax Trees

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:

Principles of Programming Languages COMP251: Syntax and Grammars

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Languages and Compilers

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

Syntax Analysis Check syntax and construct abstract syntax tree

CSE302: Compiler Design

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

Theory and Compiling COMP360

Chapter 4. Lexical and Syntax Analysis

Optimizing Finite Automata

programming languages need to be precise a regular expression is one of the following: tokens are the building blocks of programs

Lecture 10 Parsing 10.1

Syntax/semantics. Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing

CIT Lecture 5 Context-Free Grammars and Parsing 4/2/2003 1

CSCE 314 Programming Languages

UNIT I Programming Language Syntax and semantics. Kainjan Sanghavi

Unit-1. Evaluation of programming languages:

Context-free grammars (CFG s)

4. Lexical and Syntax Analysis

CSE450 Translation of Programming Languages. Lecture 4: Syntax Analysis

Wednesday, September 9, 15. Parsers

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Context-Free Languages and Parse Trees

Syntax Analysis. Prof. James L. Frankel Harvard University. Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved.

CS 230 Programming Languages

Introduction to Syntax Analysis

CSCI312 Principles of Programming Languages!

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax

4. Lexical and Syntax Analysis

Specifying Syntax COMP360

More Assigned Reading and Exercises on Syntax (for Exam 2)

Introduction. Introduction. Introduction. Lexical Analysis. Lexical Analysis 4/2/2019. Chapter 4. Lexical and Syntax Analysis.

Transcription:

Describing Syntax and Semantics

Introduction Syntax: the form or structure of the expressions, statements, and program units Semantics: the meaning of the expressions, statements, and program units Syntax and semantics provide a language s definition Users of a language definition Other language designers Implementers Programmers (the users of the language)

The General Problem of Describing Syntax: Terminology A sentence is a string of characters over some alphabet A language is a set of sentences A lexeme lexeme is the lowest level syntactic unit of a language (e.g., *, sum, begin) A token is a category of lexemes (e.g., identifier)

Formal Definition of Languages Recognizers A recognition device reads input strings over the alphabet of the language and decides whether the input strings belong to the language Example: syntax analysis part of a compiler Generators A device that generates sentences of a language One can determine if the syntax of a particular sentence is syntactically correct by comparing it to the structure of the generator

BNF and Context-Free Grammars Context-Free Grammars Developed by Noam Chomsky in the mid- 1950s Language generators, meant to describe the syntax of natural languages Define a class of languages called contextfree languages Backus-Naur Form (1959) Invented by John Backus to describe Algol 58 BNF is equivalent to generator for contextfree grammars

BNF Fundamentals In BNF, abstractions are used to represent classes of syntactic structures--they act like syntactic variables (also called nonterminal symbols, or just nonterminals) Terminals are lexemes or tokens A rule (production) has a left-hand side (LHS), which is a nonterminal, and a right-hand side (RHS), which is a string of terminals and/or nonterminals Nonterminals are often enclosed in angle brackets Examples of BNF rules: <ident_list> identifier identifier, <ident_list> <if_stmt> if <logic_expr> then <stmt> Grammar: a finite non-empty set of rules A start symbol is a special element of the nonterminals of a grammar

BNF Rules An abstraction (or nonterminal symbol) can have more than one RHS <stmt> -> <single_stmt> begin <stmt_list> end

Describing Lists Syntactic lists are described using recursion <ident_list> -> ident ident, <ident_list> A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols)

An Example Grammar <program> -> <stmts> <stmts> -> <stmt> <stmt> ; <stmts> <stmt> -> <var> = <var> -> a b c d -> <term> + <term> <term> - <term> <term> -> <var> const

An Example Derivation <program> => <stmts> => <stmt> => <var> = => a = => a = <term> + <term> => a = <var> + <term> => a = b + <term> => a = b + const

Derivations Every string of symbols in a derivation is a sentential form A sentence is a sentential form that has only terminal symbols A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded A derivation may be neither leftmost nor rightmost

Parse Tree A hierarchical representation of a derivation <program> <stmts> <stmt> <var> = a <term> + <var> b <term> const

Ambiguity in Grammars A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees

An Ambiguous Expression Grammar <op> const <op> / - <op> <op> <op> <op> const - const / const const - const / const

An Unambiguous Expression Grammar If we use the parse tree to indicate precedence levels of the operators, we cannot have ambiguity - <term> <term> <term> <term> / const const - <term> <term> <term> / const const const

Associativity of Operators Operator associativity can also be indicated by a grammar -> + const (ambiguous) -> + const const (unambiguous) + const + const const

Extended BNF Optional parts are placed in brackets [ ] <proc_call> -> ident [(<expr_list>)] Alternative parts of RHSs are placed inside parentheses and separated via vertical bars <term> <term> (+ -) const Repetitions (0 or more) are placed inside braces { } <ident> letter {letter digit}

BNF and EBNF BNF -> + <term> - <term> <term> <term> -> <term> * <factor> <term> / <factor> <factor> EBNF -> <term> {(+ -) <term>} <term> -> <factor> {(* /) <factor>}

Recent Variations in EBNF Alternative RHSs are put on separate lines Use of a colon instead of => Use of opt for optional parts Use of oneof for choices

Static Semantics Context-free grammars (CFGs) cannot describe all of the syntax of programming languages Categories of constructs that are trouble: - Context-free, but cumbersome (e.g., types of operands in expressions) - Non-context-free (e.g., variables must be declared before they are used)

Attribute Grammars Attribute grammars (AGs) have additions to CFGs to carry some semantic info on parse tree nodes Primary value of AGs: Static semantics specification Compiler design (static semantics checking)

Attribute Grammars : Definition Def: An attribute grammar is a context-free grammar G = (S, N, T, P) with the following additions: For each grammar symbol x there is a set A(x) of attribute values Each rule has a set of functions that define certain attributes of the nonterminals in the rule Each rule has a (possibly empty) set of predicates to check for attribute consistency

Attribute Grammars: Definition Let X0 -> X 1... X n be a rule Functions of the form S(X0 ) = f(a(x 1 ),..., A(X n )) define synthesized attributes Functions of the form I(Xj ) = f(a(x 0 ),..., A(X n )), for i <= j <= n, define inherited attributes Initially, there are intrinsic attributes on the leaves

Attribute Grammars: An Example Syntax <assign> -> <var> = -> <var> + <var> <var> <var> -> A B C actual_type: synthesized for <var> and expected_type: inherited for

Attribute Grammar (continued) Syntax rule: -> <var>[1] + <var>[2] Semantic rules:.actual_type -> <var>[1].actual_type Predicate: <var>[1].actual_type == <var>[2].actual_type.expected_type ==.actual_type Syntax rule: <var> -> id Semantic rule: <var>.actual_type -> lookup (<var>.string)

Attribute Grammars (continued) How are attribute values computed? If all attributes were inherited, the tree could be decorated in top-down order. If all attributes were synthesized, the tree could be decorated in bottom-up order. In many cases, both kinds of attributes are used, and it is some combination of top-down and bottom-up that must be used.

Attribute Grammars (continued).expected_type -> inherited from parent <var>[1].actual_type -> lookup (A) <var>[2].actual_type -> lookup (B) <var>[1].actual_type =? <var>[2].actual_type.actual_type -> <var>[1].actual_type.actual_type =?.expected_type

Semantics There is no single widely acceptable notation or formalism for describing semantics Several needs for a methodology and notation for semantics: Programmers need to know what statements mean Compiler writers must know exactly what language constructs do Correctness proofs would be possible Compiler generators would be possible Designers could detect ambiguities and inconsistencies