Informatica 3 Syntax and Semantics

Similar documents
Programmiersprachen (Programming Languages)

Programming Languages Third Edition. Chapter 7 Basic Semantics

Syntax. A. Bellaachia Page: 1

UNIT I Programming Language Syntax and semantics. Kainjan Sanghavi

Binding and Variables

CS 330 Lecture 18. Symbol table. C scope rules. Declarations. Chapter 5 Louden Outline

Informatica 3 Marcello Restelli

Informal Semantics of Data. semantic specification names (identifiers) attributes binding declarations scope rules visibility

CSCI312 Principles of Programming Languages!

CSC 533: Organization of Programming Languages. Spring 2005

Attributes, Bindings, and Semantic Functions Declarations, Blocks, Scope, and the Symbol Table Name Resolution and Overloading Allocation, Lifetimes,

Programming Languages, Summary CSC419; Odelia Schwartz

G Programming Languages - Fall 2012

CS 345. Functions. Vitaly Shmatikov. slide 1

Imperative Programming

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

Topic IV. Parameters. Chapter 5 of Programming languages: Concepts & constructs by R. Sethi (2ND EDITION). Addison-Wesley, 1996.

Chapter 5. Names, Bindings, and Scopes

Fundamentals of Programming Languages

CSCE 314 Programming Languages. Type System

1 Lexical Considerations

Lexical Considerations

CS 415 Midterm Exam Spring 2002

Lexical Considerations

Anatomy of a Compiler. Overview of Semantic Analysis. The Compiler So Far. Why a Separate Semantic Analysis?

This book is licensed under a Creative Commons Attribution 3.0 License

Subprograms. Bilkent University. CS315 Programming Languages Pinar Duygulu

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

Chapter 3:: Names, Scopes, and Bindings (cont.)

Topic IV. Block-structured procedural languages Algol and Pascal. References:

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler so far

SE352b: Roadmap. SE352b Software Engineering Design Tools. W3: Programming Paradigms

The role of semantic analysis in a compiler

CSE 307: Principles of Programming Languages

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

Chapter 3:: Names, Scopes, and Bindings (cont.)

The SPL Programming Language Reference Manual

CSE 307: Principles of Programming Languages

COMPILER DESIGN LECTURE NOTES

St. MARTIN S ENGINEERING COLLEGE Dhulapally, Secunderabad

Programming Languages Third Edition. Chapter 10 Control II Procedures and Environments

CPS 506 Comparative Programming Languages. Syntax Specification

G Programming Languages - Fall 2012

EI326 ENGINEERING PRACTICE & TECHNICAL INNOVATION (III-G) Kenny Q. Zhu Dept. of Computer Science Shanghai Jiao Tong University

Chapter 5 Names, Binding, Type Checking and Scopes

1. true / false By a compiler we mean a program that translates to code that will run natively on some machine.

Chapter 9 Subprograms

Syntax. In Text: Chapter 3

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 17: Types and Type-Checking 25 Feb 08

Names, Scopes, and Bindings II. Hwansoo Han

INSTITUTE OF AERONAUTICAL ENGINEERING

Short Notes of CS201

Semantic Analysis and Type Checking

CST-402(T): Language Processors

Answer: Early binding generally leads to greater efficiency (compilation approach) Late binding general leads to greater flexibility

STUDY NOTES UNIT 1 - INTRODUCTION TO OBJECT ORIENTED PROGRAMMING

Concepts of Programming Languages

CS201 - Introduction to Programming Glossary By

Chapter 5 Names, Bindings, Type Checking, and Scopes

Implementing Subprograms

TYPES, VALUES AND DECLARATIONS

COP4020 Programming Languages. Subroutines and Parameter Passing Prof. Robert van Engelen

CMSC 331 Final Exam Section 0201 December 18, 2000

Outline. Programming Languages 1/16/18 PROGRAMMING LANGUAGE FOUNDATIONS AND HISTORY. Current

Decaf Language Reference Manual

6. Names, Scopes, and Bindings

Implementing Subprograms

Lecture 16: Static Semantics Overview 1

G Programming Languages Spring 2010 Lecture 4. Robert Grimm, New York University

CS 242. Fundamentals. Reading: See last slide

Organization of Programming Languages CS 3200/5200N. Lecture 09

Names, Bindings, Scopes

Chapter 3. Describing Syntax and Semantics ISBN

Type Bindings. Static Type Binding

Subprograms. Copyright 2015 Pearson. All rights reserved. 1-1

LECTURE 3. Compiler Phases

Theoretical Part. Chapter one:- - What are the Phases of compiler? Answer:

COP4020 Programming Languages. Compilers and Interpreters Robert van Engelen & Chris Lacher

CS 415 Midterm Exam Spring SOLUTION

Motivation was to facilitate development of systems software, especially OS development.

Chapter 4. Syntax - the form or structure of the expressions, statements, and program units

Compiling and Interpreting Programming. Overview of Compilers and Interpreters

CS 314 Principles of Programming Languages

Programming Languages Third Edition

Computer Science & Information Technology (CS) Rank under AIR 100. Examination Oriented Theory, Practice Set Key concepts, Analysis & Summary

Static Semantics. Lecture 15. (Notes by P. N. Hilfinger and R. Bodik) 2/29/08 Prof. Hilfinger, CS164 Lecture 15 1

Chapter 5: Procedural abstraction. Function procedures. Function procedures. Proper procedures and function procedures

Chapter 8. Fundamental Characteristics of Subprograms. 1. A subprogram has a single entry point

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1

The PCAT Programming Language Reference Manual

Fundamental Concepts and Definitions

Chapter 8 ( ) Control Abstraction. Subprograms Issues related to subprograms How is control transferred to & from the subprogram?

Chapter 6: User-Defined Functions. Objectives (cont d.) Objectives. Introduction. Predefined Functions 12/2/2016

9. Subprograms. 9.2 Fundamentals of Subprograms

Programming Methodology

MIDTERM EXAM (Solutions)

Semantic Processing. Semantic Errors. Semantics - Part 1. Semantics - Part 1

Concepts Introduced in Chapter 7

CS 314 Principles of Programming Languages

Tail Calls. CMSC 330: Organization of Programming Languages. Tail Recursion. Tail Recursion (cont d) Names and Binding. Tail Recursion (cont d)

Transcription:

Informatica 3 Syntax and Semantics Marcello Restelli 9/15/07 Laurea in Ingegneria Informatica Politecnico di Milano

Introduction Introduction to the concepts of syntax and semantics Binding Variables Routines 2

Programming Languages Tools for writing software Formal notations for describing algorithms for execution by computers Formal notations need Syntax set of rules specifying how programs are written specifies the validity of a program formal notations: EBNF, Syntax Diagrams Semantics set of rules specifying the meaning of syntactically correct programs specifies the effect of a program operational semantics 3

Syntax Rules that define admissible sequences of symbols Lexical rules Define the alphabet of the language E.g, set of chars, combinations Example Pascal does not distinguish between uppercase and lowercase chars, while C and Ada do <> is a valid operator in Pascal, but denoted by!= in C, and /= in Ada Syntactic rules Define how to compose words to build syntactically correct programs Helps programmer to write syntactically correct program Allows to check whether a program is syntactically correct 4

EBNF Extended Backus Naur Form Early descriptions were done using natural language ALGOL60 was described by John Backus using a context free grammar (BNF) EBNF is a metalanguage I.e., a language to describe other languages metasymbols ::= symbol definition < > elements defined later (nonterminals) { ; } if else terminals choice * zero or more occurrences of the preceding element + one or more occurrences of the preceding element 5

Syntax Rules Example: <program> ::= { <statement>* } rule <statement> ::= <assignment> <conditional> <loop> <assignment>::= <identifier> = <expr> ; <conditional>::= if <expr> { <statement>+ } if <expr> {<statement>+} else {<statement>+} <loop> ::= while <expr> {<statement>+} <expr> ::= <identifier> <number> ( <expr> ) <expr> <operator> <expr> 6

Lexical Rules Example <operator> ::= + - * / = /= < > <= >= <identifier> ::= <letter> <ld>* <ld> ::= <letter> <digit> <number> ::= <digit>+ <letter> ::= a b c... z <digit> ::= 0 1... 9 7

Syntax Diagrams Graphical representation of the syntax rules terminal symbols are put inside circles nonterminals symbols are put inside rectangles admissible sentences are obtained by traveling the diagram according to the arrows 8

Syntax Diagrams Graphical representation of the syntax rules terminal symbols are put inside circles nonterminals symbols are put inside rectangles admissible sentences are obtained by traveling the diagram according to the arrows 9

Syntax Diagrams Graphical representation of the syntax rules terminal symbols are put inside circles nonterminals symbols are put inside rectangles admissible sentences are obtained by traveling the diagram according to the arrows 10

Syntax Diagrams Graphical representation of the syntax rules terminal symbols are put inside circles nonterminals symbols are put inside rectangles admissible sentences are obtained by traveling the diagram according to the arrows 11

Syntax Diagrams Graphical representation of the syntax rules terminal symbols are put inside circles nonterminals symbols are put inside rectangles admissible sentences are obtained by traveling the diagram according to the arrows 12

Syntax Diagrams Graphical representation of the syntax rules terminal symbols are put inside circles nonterminals symbols are put inside rectangles admissible sentences are obtained by traveling the diagram according to the arrows 13

Abstract Syntax vs Concrete Syntax There can be construction in different languages that have the same conceptual structure, but different lexical rules C Language while (x!= y) {... } Pascal while x <> y do begin... end Differences { and } instead of begin and end!= instead of <> In Pascal, ( and ) can be omitted Same abstract syntax, but different concrete syntax 14

Pragmatics Concrete syntax has a high impact on usability Example In C language, when loops have only one instruction, parentheses may be omitted This feature can be error prone Other languages (e.g., Modula-2 and Ada) address this issue by closing the conditional/loop statements with an end keyword It is useful to adopt a programming style that avoids such problems (i.e., in C you should always use the curly braces even when there is only a single statement) 15

Language processing Interpretation Language statements are directly executed Translation High-level language statements are translated into machine-level statements before being executed 16

Interpretation Main cycle Get the next statement Determine the actions to be executed Perform the actions Interpretation is a simulation on a host of a special purpose computer with a high-level machine language 17

Translation Cross-Translation translation is performed on a machine that is different from the one used for execution needed when the execution machine has a special purpose processor 18

Interpretation vs Translation Translation to intermediate code Interpretation of the intermediate code of a virtual machine Portability of executable code Examples Java bytecode 19

Basic Concepts of a Programming Language Programs deal with different types of entities variables routines statements Entities have different properties or attributes Variables Name, type, storage area Routines Name, formal parameters, parameter-passing information Statements Actions to be taken Attribute information is stored in the descriptor Binding sets attributes for entities 20

The Concept of Binding Is a key element to define the semantics of a programming language Different programming languages have different number of entities have different number of attributes bound to entities allow different binding time have different binding stability Binding time Determine when the binding occurs Stability Determine whether certain bindings are fixed or modifiable Static binding cannot be modified 21

Binding Time Language definition time E.g., in most languages (FORTRAN, Ada, C++) the integer type is bound at language definition Language implementation time E.g., then during implementation, the integer type is bound to a memory representation Compile-time (translation-time) Pascal provides a way to redefine the integer type, the new representation is bound to the type when the program is compiled Execution-time (run-time) Variables are usually bounded to a value during execution Static Binding Dynamic Binding 22

Static Binding vs Dynamic Binding Static Binding Established before execution and cannot be changed thereafter E.g., FORTRAN, Ada, C, C++ integer bound to a set of values at definition/implementation time Dynamic Binding Established at run-time and usually modifiable during execution E.g., variable values Exceptions in Pascal, read-only constant are variables whose value is initialized at run-time but cannot be modified thereafter 23

Variables Conventional language variables are memory abstractions In main memory, cells are identified by an address The contents of a cell are encoded representation of a value The value can be modified at run-time Variables Variables abstract the concept of memory cell The variable name abstracts the address information The assignment statement abstracts the cell modification 24

Formal Definition of a Variable A variable is a 5-tuple <name, scope, type, l-value, r-value> Name String used in program statement to denote the variable Scope Range of program instructions over which the name is defined Type L-value Memory location associated to the variable R-value Encoded value stored in the variable s location 25

Name and Scope The variable's name is a string of chars that is introduced by a declaration statement The scope is the range of program statements over which the name is known The variable s scope usually extends from the declaration until a later closing point A program can use variables through their name inside their scope it means that a variable is visible with its name only inside its scope 26

Scope and Binding The rules of binding of a variable inside its scope are language dependent Static scope binding The variable s scope is defined in terms of the syntactic structure of the program The scope of a variable can be determined just examining the program text Many programming languages adopt static scope binding Dynamic scope binding The variable s scope is defined during program execution Each declaration extends its effect over all the instructions executed thereafter, until a new declaration for the same variable is encountered APL, SNOBOL4, and LISP use dynamic scope binding 27

Example: Static Scope Binding main() { int x,y;... { int temp;... temp = x-y; } } main() { int x,y;... { int temp; int x;... temp = x-y; } } 28

Example: Dynamic Scope Binding { } { } { } /* block A */ int x; /* block B */ float x; /* block C */ x =...; Case 1: A then C Variable x in block C refers to x declared in block A Case 2: B then C Variable x in block C refers to x declared in block B 29

Dynamic Scope Binding Rules for dynamic scope binding are simple and rather easy to implement The major disadvantages are in terms of programming discipline efficiency of implementation Programs are hard to read and difficult to debug 30

Type Set of values that can be associated to a variable, and... Set of operations that can be legally performed on such values creating, reading, and accessing the values It protects variables from nonsensical operations A variable of a given type is also called an instance of the type A language can be Typeless or untyped, e.g., assembly languages Dynamically typed, e.g., LISP Statically typed 31

Type and Binding Binding between type and set of admissible values and operations that modify the values Built-in types (e.g., integer) the set of admissible value is associated when the language is defined the operations are defined when the type is implemented Declarations of new types (e.g., typedef float reale;) Binding between the type name and implementation at translation time New type inherits all operations Abstract Data Types allow to define new types by specifying the set of admissible values and the set of legal operations 32

Abstract Data Types Typical ADT declaration is structured as typedef new_type_name { } data structure for new_type_name; operations to manipulate data objects; 33

Abstract Data Types: Example class StackOfChar { private: int size; char* top; char* s; public: StackOfChar (int sz) { top = s = new char [size=sz]; } ~StackOfChar ( ) {delete [ ] s;} void push (char c) {*top++ = c;} char pop ( ) {return *--top;} int length ( ) {return top - s;} }; 34

Static vs Dynamic Type Binding Statically typed language (C, C++, Pascal, etc.) Binding occurs at compile-time Binding cannot be changed at run-time type checking can be performed at compile-time Dynamically typed language (LISP, APL, Smalltalk, etc.) Binding occurs at run-time Binding can be changed at run-time variables are polymorphic the type depends on the values assigned to the variable type checking cannot be performed statically No typed languages (script and assembly languages) memory cells and registries contain strings that can be interpreted as values of any type 35

l value The l-value of a variable is the storage area bound to the variable during execution The lifetime or extent of a variable is the period of time in which such a binding exists The l-value is used to hold the r-value Data object is the pair <l-value,r-value> The binding between the variable and the l-value is the memory allocation Memory allocation acquires storage and binds the variable Lifetime extends from allocation to deallocation 36

Memory Allocation Static allocation Allocation is performed before run-time, deallocation is performed upon termination Dynamic allocation Allocation and deallocation are at runt-time performed on explicit request (through creation statements) or automatically 37

r value The r-value of a variable is the encoded value stored in the location (l-value) of the variable The r-value is interpreted according to variable's type Instructions Access variables through their l-value (lefthand side of assignments) Modify their r-value (righthand side of assignments) x = y; // x is a location, while y is a value The binding between variables and values is usually dynamic Generally, it is static only for constants 38

Initialization What is the r-value when the variable is created? Different languages, different solutions Some languages (e.g., ML) requires that the binding is established at variable creation Other languages, like C and Ada, support such a binding but do not require it (int I, J=0;) What if initialization is not provided? Ignore the bit string found in the storage area is considered the variable initial value System-defined initialization int = zero, char = blank Special undefined value The variable is considered initialized to an undefined value, and any read access to an undefined variable is trapped 39

Reference and Unnamed Variables Some languages allow variables that can be accessed through the r-value of another variable Reference or pointers are variables whose r-value allow to access the content of another variable of an unnamed variable Access via a pointer is called dereferencing int x = 5; int* px; px = &x; px x 5 40

Reference and Unnamed Variables Some languages allow variables that can be accessed through the r-value of another variable Reference or pointers are variables whose r-value allow to access the content of another variable of an unnamed variable Access via a pointer is called dereferencing int* px; px = malloc(sizeof(int)); px 5 41

Reference and Unnamed Variables Some languages allow variables that can be accessed through the r-value of another variable Reference or pointers are variables whose r-value allow to access the content of another variable of an unnamed variable Access via a pointer is called dereferencing type PI = ^integer; type PPI = ^PI; var pxi: PI; var ppxi: PPI;... new(pxi); pxi^ := 0;... new(ppxi); ppxi^ := pxi; ppxi pxi 0 42

Routines Routines allow programs to be decomposed in a number of functional units Routines are general concepts found in most languages Subprograms in assembly language Subroutines in FORTRAN Procedures and functions in Pascal and Ada Functions in C Functions are routines that return a value Procedures are routines that do not return a value Formally, they can be defined as variables <name, scope, type, l-value, r-value> 43

Routines: Name and Scope Name Introduced by a routine declaration int sum(int n); // C prototype Scope The scope of the routine name extends from the declaration point to some closing point, either statically or dynamically determined In C, the end of the file where the routine is defined Activation Routines activation is achieved through a routine invocation or routine call Must be specified routine's name and arguments The call statement must be in routine s scope 44

Routines: Scope Scope Local scope Routines also define a scope for the declarations that are nested in them local declarations such local declarations are only visible within the routine non-local declarations visible from other units, but not from the routine global declarations visible from all the units in the program 45

Routines: Type The type of a routine is defined from its header name of the function type of input parameters Signature type of returned value int function(int,int,float) Type correct if the call conforms to the routine type int sum(int n) {... }... i = sum(10); int sum(int n) {... }... i = sum(5.3); 46

Routines: l value and r value l-value The reference to the area where the routine body is stored r-value The body that is currently bound to the routine Usually this binding is statically determined at translation time Some languages allow variables of type routine in C, we can define pointers to routines Routines are considered first-class objects when... int sum(int); int (*ps)(int); ps = int i = (*ps)(5);...it is possible to distinguish between their l-value and r- value...so as to allow the definition of variables of type routine 47

Routines: Declaration and Definition Some languages distinguish between routine declaration and definition Declaration introduces the routine s header without specifying the body specifies scope Definition specifies both the header and the body The distinction between declaration and definition support mutual recursion /* declaration of A, A is visible from now on */ int A (int x, int y); /* definition of B, B is visible from now on */ float B (int z) { int w, u; /* A is visible here */ w = A (z, u);... }; int A(int x, int y) {... /* B is visible here */ float t = B(x); } 48

Routine instance The representation of a routine during execution is called routine instance Code segment Contains the instructions of the unit, the content is fixed Activation record (or frame) Includes all the data objects associated with the local variables of a specific routine instance Contains all the information needed to execute the routine It is changeable Referencing Environment of an instance U contains: U s local variable, i.e., its local environment U s non local variables, i.e., its non local environment Non local variables are modifiable via side-effect 49

Recursive Attivation All instances of the same unit composed of same code segment different activation records The binding between code segment and activation record constitutes a new unit instance The binding between an activation record and its code segment is dynamic 50

Routines: Parameters Information passing data routine (some languages) Parameters formal: routine's definition actual: routine's call Binding positional method routine R(F1, F2,..., Fn); call R(A1, A2,..., An); named association routine call R(A:T1, B:T2, C:T3); R(C=>Z, A=>X, B=>Y); 51

Generic Routines Example Routine to sort an array of integers... int i,j;... swap(i,j); //caso 1 Routine to sort an array of floats... float i,j;... swap(i,j); //caso 2 void swap(int& a, int& b) { int temp = a; a = b; b = temp; } void swap(float& a, float& b) { float temp = a; a = b; b = temp; } template <class T> void swap(t& a, T& b) { T temp = a; a = b; Binding at compile-time b = temp; } 52

Overloading It occurs when...more than one entity is bound to a name at a given point... and name occurrence provides enough information to disambiguate the binding Example int i,j,k; float a,b,c; i = j + k; a = b + c; // int addition // float addition a = b + c + b();// b is overloaded a = b + c + b(i);// b is overloaded 53

Aliasing It is the opposite of overloading Two names, N1 and N2, are aliases if they denote the same entity at the same program point N1 and N2 share the same object in the same referencing environment Main issue: modifications under N1 have an effect visible under N2 Aliasing may lead to error prone and difficult to read programs int x =0; int *i = &x; int *j = &x; int q1 (int a) { } return a *= a; void q2 (int &a) { }... a *= a; int x = 2; cout << q1(x); cout << x; q2(x); cout << x; Stampa 4 Stampa 2 Stampa 4 54