CSc 520 Principles of Programming Languages 50: Semantics Introduction Christian Collberg collberg@cs.arizona.edu Department of Computer Science University of Arizona Copyright c 2005 Christian Collberg [1]
Formal Semantics In order for 1. compiler writers to know exactly how to implement a language, and 2. language users to know exactly what (combinations of) language constructs mean, the meaning of a language needs to be defined. Most definitions of real languages are in a stylized, but informal, English. It is also possible to give formal semantic language definitions. [2]
Formal Semantics... In practice, most languages are not defined in a formal, precise, mathematical way. There have been some attempts, for example Modula-2, Algol 68, and PL/I. Simple languages such as Scheme and Haskell are comparatively easy to define formally, compared to C, C++, Java, etc. [3]
Formal Semantics Modula-2 The Modula-2 specification was written in VDM-SL (Vienna Development Method - Specification Language), a formalism for giving a precise definition of a programming language in a denotational style. It was over 500 pages long, and didn t include specifications of the standard libraries. Wirth s original Modula-2 report was 28 pages. For a history of this disastrous standardization effort, see http://www.scifac.ru.ac.za/cspt/sc22wg13.htm. Note also that Modula-2 is a very simple language compared to Ada, C++, Java, etc. [4]
Formal Semantics PL/I VDL (Vienna Definition Language) was used to specify PL/I. A specification has two parts: 1. A translator that specified a translation into an abstract syntax tree, 2. an interpreter of the abstract syntax tree. VDL is a kind of operational semantics. PL/I is large and complex. The resulting (large) document was called the Vienna Telephone Directory. It was impossible to comprehend. [5]
Methods In this class we will consider two methods for defining the semantics of programming languages: Operational semantics define a computation by giving step-by-step transformations on a abstract machine that simulate the execution of the program. Denotational semantics constructs a mathematical object (typically a function) which is the meaning of the program. [6]
Contextual Constraints A compiler performs syntactic and semantic analysis. There really isn t a sharp distinction between the two. Is String x; ; print x/2 a syntactic or semantic error? Some would say that it violates the static semantic rules of the language, and hence is a semantic (not a syntactic) error. Others would say it violates context-sensitive syntax rules of the language. I.e., they d consider the program as a whole to determine if it is well-formed or not. We will use the term contextual constraints for those rules that restricts the programs which are considered well-formed. [7]
Operational Semantics Operational Semantics specifies a language through the steps by which each program is executed. This is often done informally. For example, the statement while E do C is specified as 1. Evaluate E to a truthvalue B; 2. If B = true then execute C, then repeat from 1). 3. If B = false, terminate. The emphasis is on specifying the steps needed to execute the program. This makes the specification useful for language implementers. [8]
Operational Semantics... We need two things: 1. an abstract syntax, and 2. an interpreter. The abstract syntax defines the structure of each construct in the language, for example, that an if-statement consists of three parts: the test e, the then-part c 1 and the else-part c 2 : if ::= e:bool expr c 1 :statement c 2 :statement Note that no syntactic information is given. The interpreter generates a sequence of machine configurations that define the program s semantics. The interpreter is defined by rewriting rules. [9]
Operational Sem. Peano Arithmetic Abstract Syntax (N Nat, the Natural Numbers): N ::= 0 S(N) (N + N) ( N N) Interpreter: I : N N I [(n + 0)]] n I [(m + S(n))]] S(I [(m + n)]]) I [(n 0)]] 0 I [(m S(n))]] I [((m n) + m)]] where m, n Nat [10]
Operational Sem. Peano Arithmetic The rewrite rules are used to turn an expression into standard form, containing only S (succ) and 0. S(S(S(S(0)))) = 4. [11]
Operational Sem. Simple... Simple is a language with if-statements, while-statements, assignment-statements, and integer arithmetic. The semantic function I interprets commands. The semantic function ν interprets expressions. The store σ maps variables to their values. Assignments update the store. The result of the interpretation (the semantics of the program) is the resulting store. [12]
Operational Sem. Simple... Interpreter: I : C Σ Σ ν : E Σ T Z Semantic Equations: I(skip, σ) = σ I(V := E, σ) = σ[v ν(e, σ)] I(C 1 ; C 2, σ) = E(C 2, E(C 1, σ)) I(if E then C 1 else C 2 end, σ) = I(C 1, σ) if ν(e, σ) = true I(C 2, σ) if ν(e, σ) = false [13]
Operational Sem. Simple... Interpreter: while E do C end = if E then (C; while Edo C end) else skip ν(v, σ) = σ[v ] ν(n, σ) = N ν(e 1 + E 2, σ) = ν(e 1, σ) + ν(e 2, σ) ν(e 1 = E 2, σ) = true if ν(e, σ) = ν(e, σ) = false if ν(e, σ) ν(e, σ) [14]
Denotational Semantics We think of each program as implementing a mathematical function. An imperative program is a function from inputs to outputs. This function is the meaning of the program. Example exec [while E do C]] = let exec-while env sto = let Boolean tr = evaluate [E]] env sto in if tr then exec-while env (exec [C]] env sto) else sto in exec-while [15]
Denotational Semantics... We need three things: 1. an abstract syntax, 2. a semantic algebra defining a computational model, and 3. valuation functions. The valuation functions map the syntactic constructs of the language to the semantic algebra. Denotational semantics relies on defining an object in terms of its constituent parts. [16]
Denotational Sem. Peano Arithmetic Abstract Syntax (N Nat, the Natural Numbers): N ::= 0 S(N) (N + N) (N N) Semantic Algebra: + : Nat Nat Nat Valuation Function: D : Nat Nat D [(n + 0)]] = D [n]] D [(m + S(n))]] = D [(m + n)]] + 1 D [(n 0)]] = 0 D [(m S(n))]] = D [((m n) + m)]] where m, n Nat [17]
Denotational Sem. Simple Abstract Syntax: C Command E Expression O Operator N Numeral V Variable C ::= V := E if E then C 1 else C 2 end while E do C end C 1 ; C 2 skip E ::= V N E 1 O E 2 (E) O ::= + - * / = < > <> [18]
Denotational Sem. Simple... Semantic Algebra: τ T = true, false; the boolean values ζ Z = {... 1, 0, 1,...}; the integers + : Z Z Z = : Z Z T σ S = Variable Numeral; the state Valuation Functions: C C (S S) E E E (N T ) [19]
Denotational Sem. Simple... C [skip]] σ = σ C [V :=E]] σ = σ [V E [E]]]] σ C [C1; C2]] = C [C2]] C [C1]] C [if E then C 1 else C 2 end]] σ = C [C 1 ]] σ if E [E]] σ = true C [while E do C end]] σ = = C [C 2 ]] σ if E [E]] σ = false lim C [(if E then C else skip end n E [V ]] σ = σ(v ) E [N]] = ζ E [E 1 + E 2 ]] = E [E 1 ]] σ + E [E 2 ]] σ E [E 1 = E 2 ]] σ = E [E 1 ]] σ = E [E 2 ]] σ [20]
Concrete Syntax of Wren [21]
Wren Wren is a small imperative language that we will be using as a running example. The complete concrete syntax of Wren is given in the next few slides. [22]
Concrete Syntax program ::= program identifier is block block ::= declaration seq begin command seq end declaration seq ::= declaration declaration seq declaration ::= var variable list : type ; type ::= integer boolean variable list ::= variable variable, variable list command seq ::= command command ; command seq command ::= variable := expr skip read variable write integer expr while boolean expr do command seq end while if boolean expr then command seq end if if boolean expr then command seq else command seq end if [23]
Concrete Syntax... expr ::= integer expr boolean expr integer expr ::= term integer expr weak op term term ::= element term strong op element element ::= numeral variable ( integer expr ) element boolean expr ::= boolean term boolean expr or boolean term boolean term ::= boolean element boolean term and boolean element boolean element ::= true false variable comparison not ( boolean expr ) ( boolean expr ) comparison ::= integer expr relation integer expr [24]
Concrete Syntax... variable ::= identifier identifier::= letter identifier letter identifier digit relation ::= <= < = > >= <> weak op ::= + strong op ::= * / letter ::= a b c d e f g h i j k l m n o p q r s t u v w x y z numeral ::= digit digit numeral digit ::= 0 1 2 3 4 5 6 7 8 9 [25]
Wren Example program binary is var n,p : integer; begin read n; p := 2; while p<=n do p := 2*p end while; p := p/2; while p>0 do if n>= p then write 1; n := np else write 0 end if; p := p/2 end while end [26]
Readings and References Read Chapter 1, in Syntax and Semantics of Programming Languages, by Ken Slonneger and Barry Kurtz, http://www.cs.uiowa.edu/ slonnegr/plf/book. [27]
Acknowledgments Some examples are taken from Introduction to Programming Languages, by Anthony A. Aaby, http://burks.brighton.ac.uk/burks/pcinfo/progdocs/plbook/semantic.htm The Wren lanuage is taken from the book Syntax and Semantics of Programming Languages, by Ken Slonneger and Barry Kurtz, http://www.cs.uiowa.edu/ slonnegr/plf/book. [28]