CS 230 Programming Languages 11 / 20 / 2015 Instructor: Michael Eckmann
Today's Topics
- Questions/comments?
- Chapter 6
  - Arrays
  - Pointers
We all know what arrays are.
Design issues
- What types are legal for subscripts?
- Are subscript expressions in element references range checked?
- When are subscript ranges bound?
- When is allocation bound?
- Are ragged (jagged) and multidimensional arrays allowed?
- Is initialization allowed at allocation time?
- Are slices supported?
Most languages specify an element of an array with the name of the array followed by an index in square brackets, e.g. array_of_names[5].
Ada uses parentheses around the index. Ada also uses parentheses to enclose the parameters of subprogram (function) calls.
Any idea why they might have done that? Is it a good idea?
Ada chose to use parentheses in these two ways because array references and function calls are similar: both map the index (or parameters) to a value that is returned. It certainly reduces readability to some degree, because the reader of the program can't tell the difference between an array reference and a function call with one parameter.
There are two types associated with any array
- The type of the elements of the array
- The type of the subscript
  - Most commonly integers (or subranges of integers)
  - Ada also allows Boolean, character, and enumeration types as subscripts
Range checking the subscripts (indices)
- When would this checking take place?
- C, C++, Perl, and Fortran: no range checking
- Java, ML, and C#: range checked
- Ada: range checked by default, but the programmer can disable it
- Why not range check?
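The difference between checked and unchecked subscripts can be sketched in C++. Built-in subscripting with [] is not range checked, but the standard library's at() member is. The function name checked_access_rejects and the sample values are made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Returns true if a range-checked access at `index` is rejected at run time.
// (Hypothetical helper, not from the slides.)
bool checked_access_rejects(std::size_t index) {
    std::array<int, 5> values{10, 20, 30, 40, 50};
    try {
        // at() compares index against the array length and throws
        // std::out_of_range on failure. values[index] would skip the check,
        // and an out-of-range index there is undefined behavior.
        (void)values.at(index);
        return false;  // index was within 0..4
    } catch (const std::out_of_range&) {
        return true;   // index was outside 0..4
    }
}
```

The check costs a comparison on every access, which is one reason a language might leave it out or let the programmer turn it off.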
Lower bound of subscripts
- Usually 0 (C-based languages)
- Fortran 95 defaults to 1
- Why use 0?
Five categories of arrays
- Static: subscript ranges are statically bound and storage allocation is static. Statically bound means bound BEFORE run time.
- Fixed stack-dynamic: subscript ranges are statically bound, but allocation is done at declaration elaboration time. When is declaration elaboration time?
Let's compare these two. What are the advantages of each?
Static vs. fixed stack-dynamic arrays
- Static: speed efficient. There is no dynamic allocation or deallocation at run time, so they're faster.
- Fixed stack-dynamic: space efficient. Block-scoped arrays are allocated only when they need to be in memory, instead of all being allocated before run time, and they can be deallocated from the stack when they're no longer needed.
Back to the five categories of arrays
Besides static and fixed stack-dynamic:
- Stack-dynamic
  - Subscript ranges are dynamically bound
  - Storage allocation is dynamic (on the stack)
Stack-dynamic vs. static and fixed stack-dynamic
- Stack-dynamic
  - Subscript ranges are dynamically bound
  - Storage allocation is dynamic (on the stack)
  - Both are fixed after they're bound
- What advantage does this have over static and fixed stack-dynamic?
Static and fixed stack-dynamic vs. stack-dynamic
- With a stack-dynamic array, the size of the array doesn't need to be known until the array is about to be used.
Back to the five categories of arrays
Besides static, fixed stack-dynamic, and stack-dynamic:
- Fixed heap-dynamic
  - Subscript ranges are dynamically bound
  - Storage allocation is dynamic (on the heap)
  - Both are fixed after they're bound
which is different from:
- Heap-dynamic
  - Subscript ranges are dynamically bound
  - Storage allocation is dynamic (on the heap)
  - Both are changeable during execution
In functions (therefore block scope):
- C and C++ use static arrays if the programmer uses the static modifier
- C and C++ use fixed stack-dynamic arrays by default (without the static modifier)
C and C++ also provide fixed heap-dynamic arrays.
Java: fixed heap-dynamic arrays
- Their subscript ranges and allocation are dynamically bound but fixed afterwards. What does that mean?
What kind of arrays do you think Perl has?
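As a rough illustration of the three C++ cases, the sketch below declares one array of each kind inside a function. The function names (count_calls, sum_stack, sum_heap) and the data are invented for the example.

```cpp
#include <cstddef>

// Static array: storage is bound before run time and survives across calls.
int count_calls() {
    static int counters[1] = {0};  // initialized once, not on every call
    return ++counters[0];
}

// Fixed stack-dynamic array: the size is a compile-time constant, but the
// storage is allocated on the stack each time the declaration is elaborated.
int sum_stack() {
    int local[4] = {1, 2, 3, 4};
    int sum = 0;
    for (int v : local) sum += v;
    return sum;
}

// Fixed heap-dynamic array: the size n is chosen at run time, the storage
// comes from the heap, and the size stays fixed until the array is released.
int sum_heap(std::size_t n) {
    int* data = new int[n];
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        data[i] = static_cast<int>(i) + 1;  // fill with 1, 2, ..., n
        sum += data[i];
    }
    delete[] data;  // explicit deallocation is the programmer's job here
    return sum;
}
```

Calling count_calls() twice returns 1 and then 2, which shows that the static array keeps its contents between calls, while the array in sum_stack() is created afresh on each call.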
Perl and JavaScript have heap-dynamic arrays (and C# has the capability).
- Perl has push and pop functions that treat an array like a stack, thereby changing its length. I had not mentioned these before.
- I did mention, though, that you could do things like:
    @arr = (1, 2, 3, 4);
    $arr[4] = 5;
  which also lengthens the array dynamically.
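For comparison with the C++ examples elsewhere in these notes, std::vector behaves much like a heap-dynamic array: its length can change during execution. This is only an analogy to the Perl behavior, not a Perl example, and the function name grow_and_shrink is made up. Unlike Perl, assigning past the end of a vector with [] does not lengthen it, so the sketch uses push_back instead.

```cpp
#include <cstddef>
#include <vector>

// Grows and shrinks a vector, then reports its final length.
std::size_t grow_and_shrink() {
    std::vector<int> arr{1, 2, 3, 4};  // comparable to @arr = (1, 2, 3, 4)
    arr.push_back(5);                  // length is now 5
    arr.push_back(6);                  // length is now 6
    arr.pop_back();                    // removes the 6, length is back to 5
    return arr.size();
}
```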
Array initialization
- Sometimes the initialization implicitly states the length; this happens in C, C++, C#, and Java.
- Ada has an interesting feature that allows some elements of the array to be initialized to specific values and all others to another value.
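A small C++ sketch of an initializer that implies the length: no size appears in the declaration, so the compiler counts the initial values. The name implied_length and the values are illustrative only.

```cpp
#include <cstddef>

// Returns the number of elements the compiler inferred from the initializer.
std::size_t implied_length() {
    int primes[] = {2, 3, 5};  // no explicit length: three values, so length 3
    return sizeof(primes) / sizeof(primes[0]);
}
```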
Array operations (operate on the array as a whole)
- Some languages allow concatenation of arrays and relational operators on arrays.
- Others, like APL, have arithmetic operations on arrays, such as vector multiplication and matrix transpose.
Rectangular vs. jagged arrays
- The distinction only makes sense for multidimensional arrays.
- An array is jagged if its rows can have different numbers of columns.
- Note the word "can" above: Java, C, and C++ support jagged arrays but do not support rectangular arrays, because their multidimensional arrays are implemented as arrays of arrays.
- Fortran, Ada, and C# provide rectangular arrays.
- When are different numbers of columns per row useful?
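One way to build a jagged array in C++ is as an array of arrays using std::vector, so each row carries its own length. This is a sketch with made-up data and a hypothetical function name (row_lengths).

```cpp
#include <cstddef>
#include <vector>

// Builds a jagged array whose rows have 1, 2, and 3 columns,
// and returns the length of each row.
std::vector<std::size_t> row_lengths() {
    std::vector<std::vector<int>> jagged = {{1}, {2, 3}, {4, 5, 6}};
    std::vector<std::size_t> lengths;
    for (const auto& row : jagged) {
        lengths.push_back(row.size());  // every row knows its own size
    }
    return lengths;
}
```

A triangular table, such as the lower half of a symmetric distance matrix, is a typical case where rows of different lengths save space.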
Slices
- Not a new data type
- Instead, a slice is a way to reference part of an array as a separate unit, e.g. one row, some subset of consecutive elements in a column, or some rectangular subset of rows.
Implementation of array types
- How to access an array element
  - The access function code needs to produce the address of an element
- Descriptor for arrays
  - Needs to contain the information required to construct the access function
  - That is, when an element of an array is referenced, the index together with the information in the descriptor must be enough to find the address in memory where that element lives.
- So if run-time range checking is being done, what do you think the descriptor will contain?
Multidimensional Arrays
Design issue: How should a multidimensional array be stored in memory?
- A 1-D array is typically stored sequentially in memory
Multidimensional Arrays
Design issue: How should a multidimensional array be stored in memory?
- Row-major order vs. column-major order
- Fortran uses column-major order; most other languages use row-major order
- Would the programmer care whether the multidimensional array is stored in row-major or column-major order?
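The two storage orders differ only in how a pair of subscripts is turned into a position in the underlying 1-D sequence. The sketch below assumes 0-based subscripts; the function and parameter names are hypothetical.

```cpp
#include <cstddef>

// Row-major: rows are stored one after another, so moving to the next
// row skips num_cols elements.
std::size_t row_major_offset(std::size_t i, std::size_t j, std::size_t num_cols) {
    return i * num_cols + j;
}

// Column-major: columns are stored one after another, so moving to the
// next column skips num_rows elements.
std::size_t column_major_offset(std::size_t i, std::size_t j, std::size_t num_rows) {
    return j * num_rows + i;
}
```

For a 2 x 3 array, the element at row 1, column 0 sits at offset 3 in row-major order but at offset 1 in column-major order.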
Multidimensional Arrays
How to access 2-dimensional array elements at run time (assuming row-major order, all rows have the same number of columns, and indices start at 0):
- We have the subscripts to look for
- We need the address of the first element
- What else do we need?
Multidimensional Arrays
How to access 2-dimensional array elements at run time (assuming row-major order, all rows have the same number of columns, and indices start at 0):
- We have the subscripts to look for
- We need the address of the first element
- We need the size of one element (determined from the element type of the array)
- We need to know the number of columns per row
Luckily, all this information is stored in the descriptor for the array.
Multidimensional Arrays
Address of the element at row i, column j:
    address = addressofarray + (numcols * i + j) * sizeofelement
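The address computation can be sketched in C++ with a hand-written descriptor. The struct ArrayDescriptor and the functions below are hypothetical and only illustrate the formula; a real compiler's descriptor layout will differ. The num_rows field is an assumption added so that range checking is possible; the address formula itself does not need it.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical descriptor for a row-major 2-D array with 0-based subscripts.
struct ArrayDescriptor {
    const unsigned char* base;  // address of the first element
    std::size_t num_rows;       // used only for range checking
    std::size_t num_cols;       // used for range checking and addressing
    std::size_t element_size;   // size of one element in bytes
};

// Access function: base + (num_cols * i + j) * element_size.
const void* element_address(const ArrayDescriptor& d, std::size_t i, std::size_t j) {
    if (i >= d.num_rows || j >= d.num_cols) {
        throw std::out_of_range("subscript out of range");
    }
    return d.base + (d.num_cols * i + j) * d.element_size;
}

// Compares the computed address with the one the compiler produces.
bool matches_builtin_indexing() {
    int grid[3][4] = {};
    ArrayDescriptor d{reinterpret_cast<const unsigned char*>(&grid[0][0]),
                      3, 4, sizeof(int)};
    return element_address(d, 2, 1) == static_cast<const void*>(&grid[2][1]);
}
```

The comparison in matches_builtin_indexing() holds because a C++ array declared as int grid[3][4] is laid out in row-major order.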
Pointers
Pointer types
- Store memory addresses or nil (no address)
- Used for indirect addressing
- Used for dynamic storage management
  - Can access data at a particular memory address in the heap
  - Provide a way to have data structures that shrink and grow during execution
Pointers
Using pointers
- Pointer reference (the address)
- Indirect reference (a.k.a. dereference) gives us the data in the memory address that is stored in the pointer
- Dereferencing is usually explicit (as in C and C++ with the * operator) but is sometimes implicit (as in Ada)
Dereferencing example
- Variable ptr is a pointer and contains the memory address 7080
- Memory address 7080 contains the integer value 206
- j is a variable of type integer
- In C or C++, the assignment j = *ptr sets j to 206
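The dereferencing example can be written out in C++ as a minimal sketch. A program cannot choose the address 7080 itself, so an ordinary variable (here called cell, a made-up name) stands in for that memory location.

```cpp
// Mirrors the example: ptr holds an address, and that address holds 206.
int dereference_demo() {
    int cell = 206;    // stands in for the memory cell at address 7080
    int* ptr = &cell;  // pointer reference: ptr stores the cell's address
    int j = *ptr;      // explicit dereference: j gets the value stored there
    return j;
}
```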
Pointers
Accessing fields of records via pointers to records
- C++ (pointer to a struct)
  - (*ptr).field_name   (* dereferences, . accesses the field)
  - ptr->field_name     (-> does both; because this is a common operation, there's a shorthand)
- Ada
  - ptr.field_name      (because of implicit dereferencing, this acts just like -> in C++)
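Both C++ forms can be seen side by side in a short sketch. The struct Point and its fields are invented for the example.

```cpp
// Hypothetical record type used only for illustration.
struct Point {
    int x;
    int y;
};

// Reads one field with each notation and returns their sum.
int sum_fields() {
    Point p{3, 4};
    Point* ptr = &p;
    int via_star = (*ptr).x;  // dereference first, then select the field
    int via_arrow = ptr->y;   // shorthand that does both steps at once
    return via_star + via_arrow;
}
```

The parentheses in (*ptr).x are required because . binds more tightly than the unary * operator.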