CS558 Programming Languages


CS558 Programming Languages Fall 2016 Lecture 4a Andrew Tolmach Portland State University 1994-2016

Pragmatics of Large Values
Real machines are very efficient at handling word-size chunks of data (e.g. 16-64 bits, depending on hardware).
Things that fit easily in a word:
  Numbers, characters, booleans, enumerations, class tags, etc.
  Memory addresses (locations)
Words are very easy to move, load, store, supply to operations, etc.
But how can we manipulate larger chunks of data, such as records or arrays, which may occupy many words?

Boxing
Two basic ways to represent large values:
  The unboxed representation holds the actual bits of the value, using as many machine words as necessary (roughly the textbook's "value model").
  The boxed representation allocates separate storage (the "box") for the actual bits, and then represents the value by the location of that storage (roughly the textbook's "reference model").
Boxes are usually, but not necessarily, stored in the heap.
Boxing may be performed implicitly or explicitly.

Boxed vs. Unboxed
Example: an array of 100 (machine) integers
  Unboxed representation: the value occupies 100 consecutive words.
  Boxed representation: the value occupies 1 word, a pointer to 100 consecutive words of contents.
Choice of representation can make a big difference to the semantics of operations on the data:
  What does assignment mean?
  How does parameter passing work?
  What do equality comparisons mean?
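To make the difference for assignment concrete, here is a minimal C++ sketch. The names Unboxed, unboxed_copy_is_independent, and boxed_copy_is_alias are hypothetical, chosen only for this illustration. It assumes std::array stands in for the unboxed array and a raw heap pointer stands in for the box.

```cpp
#include <array>

// Unboxed: the value is the 100 ints themselves.
using Unboxed = std::array<int, 100>;

// Assigning an unboxed array copies every element, so the copy is independent.
bool unboxed_copy_is_independent() {
    Unboxed a{};        // all elements zero
    Unboxed b = a;      // copies 100 ints
    a[0] = 42;
    return b[0] == 0;   // b did not see the update
}

// Assigning a boxed array copies only the pointer, so both names share one box.
bool boxed_copy_is_alias() {
    Unboxed* a = new Unboxed{};   // the box lives in the heap, zero-initialized
    Unboxed* b = a;               // copies one pointer, not 100 ints
    (*a)[0] = 42;
    bool aliased = ((*b)[0] == 42);
    delete a;
    return aliased;
}
```

Both functions return true: the unboxed copy is a separate value, while the boxed copy is an alias for the same storage.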

Unboxed Assignment Semantics
Early languages often used unboxed records and arrays.

    (* Pascal *)
    TYPE Employee =
      RECORD
        name : ARRAY [1..80] OF CHAR;
        age : INTEGER;
      END;   (* occupies 80x1 + 1x4 = 84 bytes *)

Semantics of assignment is to copy the entire representation:

    VAR e1,e2 : Employee;
    e1.age := 91;
    e2 := e1;
    e1.age := 19;
    WRITE(e1.age, e2.age);   (* prints 19,91 *)

Unboxed representation issues
This assignment semantics seems simple and appealing, but it has problems:
  Assignment of a large value is expensive, since lots of words may need to be copied.
  It is especially hard to generate efficient code if the size of a large value is not known statically.

Boxed Assignment Semantics
Most modern languages (e.g. Java, Python, Haskell) box all values (e.g. objects, records, constructions) that are larger than one word.
These languages naturally use reference semantics for assignment: just the pointer is copied, creating an alias.

    // Scala
    case class emp(var name: String, var age: Int)
    object Bat {
      def main(argv: Array[String]) = {
        val e1 = emp("fred", 91)
        val e2 = e1
        e1.age = 19
        println(e1.age + " " + e2.age)   // prints 19 19
      }
    }

Explicit Pointers
Languages that use unboxed semantics may also have explicit pointer types to support reference-style operations.

    // C++
    #include <iostream>
    using namespace std;

    struct Emp {
      char name[80];
      int age;
    };

    int main() {
      Emp *e1 = new Emp();   // explicitly boxed
      e1->age = 91;
      Emp *e2 = e1;          // copies the pointer: e2 aliases e1
      Emp e3 = *e1;          // copies the contents
      e1->age = 19;
      cout << e1->age << " " << e2->age << " " << e3.age << "\n";   // prints 19 19 91
      delete e1;
    }

In C/C++, struct and class instances are fundamentally unboxed, but programmers usually box them explicitly (using new or malloc) and manipulate them via pointers.

Varieties of Equality
Languages typically provide some form of built-in equality testing on values. When are two (large) values equal?
Under structural equality, values are equal when their contents are equal, bit for bit. (Only sane definition for unboxed values.)
Under reference equality, values are equal when their locations are identical.
Reference equality implies structural equality, but not vice versa.
Reference equality may be cheaper to check than structural equality.

Multiple kinds of equality
Some languages provide both structural and reference equality, under different names.
They may also provide a standard way for the programmer to define equality for a given type in an ad-hoc way.
E.g. in Scala:
  the eq operator gives reference equality
  the == operator invokes a user-defined equals method
  for case classes, equals is pre-defined to be structural equality
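The two notions of equality can be sketched in C++ as follows. This is only an illustration under stated assumptions: the struct Emp mirrors the earlier employee example, same_location is a hypothetical helper name, and structural equality is approximated by comparing fields rather than raw bits.

```cpp
#include <string>

struct Emp {
    std::string name;
    int age;
};

// Structural equality: two values are equal when their contents are equal.
// (Compared field by field here, as an approximation of "bit for bit".)
bool operator==(const Emp& a, const Emp& b) {
    return a.name == b.name && a.age == b.age;
}

// Reference equality: two values are equal when they live at the same location.
bool same_location(const Emp* a, const Emp* b) {
    return a == b;
}
```

Two separately created employees with the same name and age compare equal under ==, but same_location reports false for them. Any value is both structurally and reference-equal to itself.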

Procedures and Functions
Procedures have a long history as an essential programming tool.
Low-level view: subroutines let us avoid duplicating frequently-used code.
Higher-level view: procedural abstraction lets us divide programs into components with hidden internals.
Procedural abstractions are parameterized over values and (sometimes) types.
A function is just a procedure that returns a result (or, conversely, a procedure is just a function whose result we don't care about).

Procedure Activation Data
Each invocation of a procedure is specialized by associated activation data, such as:
  the actual values corresponding to the formal parameters of the procedure
  locations allocated for the values of local variables
  the return address in the caller
Activation data lives from the time the procedure is applied until the time it returns.
If one procedure calls another, directly or indirectly, their activation data must be kept separate, because their lifetimes overlap.
In particular, each recursive invocation needs new activation data.

Activation Stacks
In most languages, activation data can be stored on a stack, and we speak of pushing and popping activation frames on the stack.

    int f(int x, int y){
      int z = y+y;
      if (z > 0)
        z = f(z,0);
      return z+y;
    }
    void main() {
      int w = 10;
      w = f(w,w);
    }

Typical activation stack, shown just before the inner call to f returns (stack grows upward; tos marks the top of stack, and fp points into the top frame):

    tos ->  z = 0               locals    frame for f (inner call)
            saved fp
            ret addr: line 5
            y = 0               args
            x = 20              args
            z = 20              locals    frame for f (outer call)
            saved fp
            ret addr: line 10
            y = 10              args
            x = 10              args
            w = 10              locals    frame for main
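The push and pop discipline can be imitated with an explicit stack. The following C++ sketch is a hypothetical model, not how a compiler lays out frames: Frame, frames, and snapshot are names invented for this illustration, and the saved fp and return address are omitted. It reruns the slide's function f while keeping its parameters and local in explicit frames.

```cpp
#include <cstddef>
#include <vector>

struct Frame { int x; int y; int z; };

std::vector<Frame> frames;     // hypothetical explicit activation stack
std::vector<Frame> snapshot;   // copy of the stack at its deepest point

int f(int x, int y) {
    frames.push_back(Frame{x, y, 0});           // push: args plus space for local z
    const std::size_t me = frames.size() - 1;   // use an index: push_back may move frames
    frames[me].z = y + y;
    if (frames[me].z > 0) {
        const int result = f(frames[me].z, 0);  // the recursive call gets its own frame
        frames[me].z = result;
    }
    if (frames.size() > snapshot.size()) snapshot = frames;  // record the deepest stack
    const int ret = frames[me].z + frames[me].y;
    frames.pop_back();                          // pop on return
    return ret;
}
```

Calling f(10, 10) leaves snapshot holding two frames, (x=10, y=10, z=20) below (x=20, y=0, z=0), which matches the two frames for f in the stack picture.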

Calling conventions
In compiled language implementations, we want to be able to generate the code for procedures separately from the code for their applications, e.g. a procedure may live in a pre-compiled library.
This requires a calling convention between caller and callee, e.g. the caller places parameter values on the stack in a fixed order, and the callee looks for them there.
In an interpreter, where caller and callee are visible at the same time, it is easy to be imprecise about this, but we have been trying to build a careful model.

What about registers?
For simplicity, we view store locations as memory addresses. But most real machines also have registers, which:
  are much faster to access than memory
  are very limited in number (e.g. 4 to 64)
  don't have addresses, so cannot be accessed via an indirection
Compilers try to keep variables (and pass parameters) in registers when possible, but always need memory as a backup.
Using registers is just an (important!) optimization.

Procedure Parameter Passing
When we apply a function in an imperative language, the formal parameters get bound to locations containing values.
How is this done, and which locations are used?
  Do we pass addresses or contents of variables from the caller?
  How do we pass actual values that aren't variables?
  What does it mean to pass a large value like an array?
Two main approaches: call-by-value (CBV) and call-by-reference (CBR). Also call-by-name/need (CBN).

Call-by-value
Each actual parameter is evaluated to a value before the call.
On entry to the function, each formal parameter is bound to a freshly-allocated location, and the actual parameter value is copied into that location.
Much like processing the declaration and initialization of a local variable.
Semantics are just like assignment of the actual expression to the formal parameter.
Simple; easy to understand!

Issues with call-by-value
Updating a formal parameter doesn't affect actuals in the caller.
Usually a good thing! But sometimes not what we want:

    /* C */
    void swap(int i, int j) {
      int t;
      t = i; i = j; j = t;
    }
    ...
    swap(a[p],a[q]);   /* call has no effect on a */

More issues
Can be inefficient for large unboxed values, e.g. C structs (records):

    /* C */
    typedef struct {double a1,a2,...,a10;} vector;
    double dotp(vector v, vector w) {
      return v.a1 * w.a1 + v.a2 * w.a2 +... + v.a10 * w.a10;
    }
    vector v1,v2;
    double d = dotp(v1,v2);   /* call to dotp copies 20 doubles */

Call-by-reference
Pass a pointer to the existing location of each actual parameter.
Within the function, references to the formal parameter are indirected through this pointer, so the parameter can be dereferenced to get the value, but can also be updated.
If the actual argument doesn't have a location (e.g. is an expression such as (x+3)), then either evaluate it into a temporary location and pass the address of the temporary, or treat it as an error.
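C++ lets the programmer choose the passing mode per parameter, so the contrast can be sketched directly. The function names below are hypothetical, picked only for this illustration.

```cpp
// Call-by-value: i and j are fresh locations holding copies of the actuals.
void swap_by_value(int i, int j) {
    int t = i; i = j; j = t;   // only the local copies are exchanged
}

// Call-by-reference: i and j are aliases for the caller's locations.
void swap_by_reference(int& i, int& j) {
    int t = i; i = j; j = t;   // the caller's variables are exchanged
}

// A const reference may bind to an argument with no location of its own:
// the compiler evaluates it into a temporary and passes that.
int twice(const int& x) {
    return x + x;
}
```

With int a[2] = {1, 2}, calling swap_by_value(a[0], a[1]) leaves the array unchanged, while swap_by_reference(a[0], a[1]) exchanges the two elements. C++ takes both routes for expressions without a location: twice(n + 3) is accepted via a temporary, whereas swap_by_reference(n + 3, n) is rejected at compile time.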

Issues with Call-by-reference
Now procedures like swap work fine!
Can also return values from a procedure by assigning to parameters.
Lots of opportunity for aliasing problems, e.g.

    PROCEDURE matmult(a,b,c: MATRIX)
    ... (* sets c := a * b *)
    matmult(a,b,a)   (* oops! overwrites parts of argument as it computes result *)
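The matmult hazard can be reproduced in a few lines of C++. This is a sketch under assumptions: the 2x2 Matrix type and the loop structure are invented for the illustration, since the original gives only the procedure header.

```cpp
#include <array>

using Matrix = std::array<std::array<int, 2>, 2>;

// Sets c := a * b, storing each entry of c as soon as it is computed.
// All three parameters are passed by reference, so c may alias a or b.
void matmult(const Matrix& a, const Matrix& b, Matrix& c) {
    for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 2; ++j) {
            int sum = 0;
            for (int k = 0; k < 2; ++k) {
                sum += a[i][k] * b[k][j];
            }
            c[i][j] = sum;   // if c aliases a, later entries read this new value
        }
    }
}
```

With a = [[1,2],[3,4]] and b = [[0,1],[1,0]], a call matmult(a, b, c) with a separate c yields [[2,1],[4,3]]. The aliased call matmult(a, b, a) leaves [[2,2],[4,4]] in a, because entries of a are overwritten while they are still needed.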

Hybrid methods
In Pascal, Ada, and C++, the programmer can specify (in the procedure header) for each parameter whether to use CBV or CBR.
C always uses CBV, but programmers can take the address of a variable explicitly, and pass that to obtain CBR-like behavior:

    void swap(int *a, int *b) {
      int t;
      t = *a; *a = *b; *b = t;
    }

    swap(&a[p],&a[q]);

Values can be References
In many modern languages, like Java or Python, both records (objects) and arrays are always boxed, so values of these types are already pointers (or references).
Thus, even if the language uses CBV, the values that are passed are actually references: calls don't cause any actual copying of the large values.
But it is a mistake (which some otherwise good authors make) to say that these languages use call-by-reference. (If they did, they would be passing a reference to the reference!)
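The distinction can be sketched in C++, where an explicit pointer plays the role of a boxed value. The struct Emp and the three function names are hypothetical, used only for this illustration.

```cpp
struct Emp { int age; };

// The pointer is passed by value: the callee gets its own copy of the reference.
// Updating the box through it is visible to the caller, since the box is shared.
void birthday(Emp* p) {
    p->age += 1;
}

// Reassigning the parameter changes only the callee's copy of the reference.
void rebind(Emp* p, Emp* other) {
    p = other;   // the caller's pointer is unaffected
}

// True call-by-reference on a reference: a reference to the reference.
void rebind_by_reference(Emp*& p, Emp* other) {
    p = other;   // the caller's pointer now points at other
}
```

Java- or Python-style parameter passing behaves like birthday and rebind: the callee can mutate the shared object but cannot make the caller's variable refer to a different one. Only rebind_by_reference changes what the caller's pointer refers to.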

Substitution and macros
Recall that one simple way to give semantics to procedure calls is to say they behave as if the procedure body were textually substituted for the call, substituting actual parameters for formal ones.
This is very similar to macro-expansion, which really does this substitution (statically):

    /* C */
    #define swap(x,y) {int t;t = x;x = y;y = t;}
    ...
    swap(a[p],a[q]);

expands to

    {int t; t = a[p]; a[p] = a[q]; a[q] = t;}

Avoiding capture
Blind substitution is dangerous!

    #define swap(x,y) {int t;t = x;x = y;y = t;}
    ...
    swap(a[t],a[q])

expands to

    {int t; t = a[t]; a[t] = a[q]; a[q] = t;}

Nonsense! We say t has been captured by the declaration in the macro block.

Call-by-name (CBN)
One solution is to note that names of local variables are not important, e.g. we can rename to

    {int u; u = a[t]; a[t] = a[q]; a[q] = u;}

Call-by-name can be thought of as substitution with renaming where necessary.
On real machines, CBN is implemented by passing to the function the AST for the actual argument plus the values of its free variables.
This makes CBN much less efficient to implement than CBV or CBR. (We may see more later.)
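One way to picture "code for the argument plus the values of its free variables" is a closure, often called a thunk. The C++ sketch below simulates call-by-name with std::function. This is an assumed encoding for illustration, not how any particular CBN language is implemented, and the names Thunk, costly, cbn_evaluations, and foo_by_name are hypothetical.

```cpp
#include <functional>

// Hypothetical counter, used only to observe how often the argument is evaluated.
int cbn_evaluations = 0;

long long costly() {
    ++cbn_evaluations;
    return 7;   // stand-in for an expensive computation
}

// A by-name parameter is a thunk: using the parameter means calling it.
using Thunk = std::function<long long()>;

long long foo_by_name(const Thunk& x, const Thunk& y) {
    // Each use of x or y re-evaluates the corresponding argument expression.
    return x() > 0 ? x() : y() * y();
}
```

When x is positive, y is never called, so costly does not run at all. When x is not positive, y is called twice, so costly runs twice.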

Call-by-need
A very useful feature of call-by-name is that arguments are evaluated only if needed:

    -- Haskell
    foo x y = if x > 0 then x else y
    foo 1 (factorial 1000000)      -- avoids expensive computation

As a further refinement, pure functional languages typically use call-by-need (or lazy) evaluation, in which arguments are evaluated at most once:

    foo x y = if x > 0 then x else y * y
    foo (-1) (factorial 1000000)   -- avoids expensive recomputation
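Call-by-need can be simulated by a thunk that caches its result after the first evaluation. The following C++ sketch is a hypothetical encoding, not how Haskell is implemented: the class Lazy, the counter need_evaluations, and the functions costly_once and foo_by_need are all assumptions made for this illustration.

```cpp
#include <functional>
#include <utility>

// Hypothetical counter, used only to observe how often the argument is evaluated.
int need_evaluations = 0;

long long costly_once() {
    ++need_evaluations;
    return 7;   // stand-in for an expensive computation
}

// A by-need argument: evaluated on first use, then remembered.
class Lazy {
public:
    explicit Lazy(std::function<long long()> compute)
        : compute_(std::move(compute)) {}

    long long force() {
        if (!evaluated_) {          // first use: run the computation and cache it
            value_ = compute_();
            evaluated_ = true;
        }
        return value_;              // later uses: return the cached value
    }

private:
    std::function<long long()> compute_;
    bool evaluated_ = false;
    long long value_ = 0;
};

long long foo_by_need(Lazy& x, Lazy& y) {
    return x.force() > 0 ? x.force() : y.force() * y.force();
}
```

When x is not positive, y is forced twice but costly_once runs only once. When x is positive, y is never forced and costly_once does not run at all.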