COS 140: Foundations of Computer Science

Variables and Primitive Data Types
Fall 2017
Copyright © 2002–2017 UMaine School of Computing and Information Science

Homework
- Reading: Chapter 16
- Homework: Exercises at end of chapter
- Homework due 10/23

What is a variable?
- In this lecture: imperative/OO languages
- A variable is an area of memory that:
  - can have a value
  - can have its value changed by the program
- Different sizes: byte, word, multiple words

Variable attributes
- Name: what do we call it?
- Address: where does it live in memory?
- Type: what kind of thing can it hold?
- Value: contents of the variable
- Scope: who can see this variable?
- Lifetime: duration of program, or shorter period?

Binding
- Associates an attribute value with an attribute
- Not just the value attribute
- Two types:
  - Static binding: occurs before run time, not changed afterward
    - Allows compiler to check for errors, but...
    - ...needs information about the binding at compile time
  - Dynamic binding: takes place/is changed at run time
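
The attributes and binding times above can be seen on a single C variable. This is a minimal sketch, not part of the original slides; the function name next_ticket and the variable name count are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns 1, 2, 3, ... on successive calls. */
int next_ticket(void)
{
    /* Name: count. Type: int, bound statically (at compile time).
       Scope: this function only. Lifetime: the whole program run,
       because the variable is declared static. */
    static int count = 0;

    assert(count < INT_MAX);   /* avoid signed overflow in this sketch */
    count = count + 1;         /* value: rebound dynamically, on every call */
    return count;
}
```

Because count keeps its storage between calls, three calls in a row return 1, 2, and 3. Dropping the static keyword would shorten the lifetime to a single call, and every call would return 1.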

Names
- Name: identifier associated with a variable
- Same name can refer to different variables in different contexts
- Different names can refer to the same variable
- Unnamed variables: access via pointer

Names
- Names are strings (symbols) in a symbol table
- Symbol tables are created at compile/interpretation time
- They associate names with entities (e.g., variables)

Names
- Which characters to allow?
- Have to recognize names during lexical analysis, so no spaces
- Sometimes a special initial character is required, e.g., a letter
- For enhanced readability, may allow connectors (e.g., -, _) and case-sensitivity

Length of variable names
- Names should be meaningful, which favors longer names
- But the symbol table should be a reasonable size and easy to maintain, which favors shorter names
- BASIC: one letter + optional number
  - Can't complain about the size!
  - Readable for mathematical applications, but not much else
- Limited number of characters, e.g., FORTRAN (6 or 31), C (63 significant)
  - Trades off readability and wasted space
  - Table maintenance easy, but whatever the size, someone always wants more
- Unlimited number of characters, e.g., Ada, Lisp
  - Complete flexibility, but...
  - ...potentially a great deal of space, hard to maintain

Special names
- Some names are not allowed or are limited
- Reserved words: can only be used in language-defined context(s), e.g., if in C
- Predefined words: defined in the language, but can be changed (e.g., Ada, C [libraries])
- Keywords: special meaning in some contexts, but can be used as a name (e.g., for in Lisp, if in FORTRAN)

Giving variables values
- An assignment construct binds a value to a variable
- Often, a construct such as: x = a+b
  - x's value is bound to the result of a+b
- Not algebra: can have x = x+1

Assignment construct examples
    C, Java, Python...   x = x + 1;
    Pascal               x := x + 1
    Lisp                 (setf x (1+ x))
    R                    x <- x + 1   or   x + 1 -> x
    Smalltalk            x := x + 1
    Forth                x @ 1 + x !
    COBOL                ADD X 1 GIVING X
    Tcl                  set x [expr {$x + 1}]

Data types
- All data has a type: integer, floating point, etc.
- Type specifies size, interpretation, and operations for data
- All variables have a type attribute
- Depending on the language, a variable's type is static or dynamic
- Determining variable type:
  - Declared somewhere in program (static, explicit)
  - Inferred by compiler/interpreter (static, implicit; dynamic)

What are primitive data types?
- Primitive data types: defined by the language
  - Not (usually) defined in terms of other types
  - Building blocks of new types
- Unstructured/atomic data types (scalars):
  - Integers
  - Floating point numbers
  - Booleans
  - Characters
  - Pointers
- Structured data types:
  - Strings
  - Arrays
  - Records

Integers
- Integer type represents a subset of the integers $\mathbb{Z}$ (duh)
- Most common data type
- You've seen: sign-magnitude, two's complement
- Most languages: several sizes (byte, 2 byte, 4 byte...)
- Can't represent all possible integers in $\mathbb{Z}$; instead $i \ldots j$, where $i \le 0 \le j$
- For two's complement, with $n$ bits, can represent $-2^{n-1} \ldots 2^{n-1} - 1$
- E.g., 8 bits can represent $10000000_2$ ($= -128_{10}$) to $01111111_2$ ($= 127_{10}$)
- Some languages: unsigned integers (C)
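
The two's complement range can be checked with a short C sketch. The function names twos_min and twos_max are hypothetical, and the sketch assumes n is between 1 and 63 so that the shifts fit in a 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest value representable in n-bit two's complement: -2^(n-1). */
int64_t twos_min(int n)
{
    assert(n >= 1 && n <= 63);   /* keep the shift inside int64_t */
    return -((int64_t)1 << (n - 1));
}

/* Largest value representable in n-bit two's complement: 2^(n-1) - 1. */
int64_t twos_max(int n)
{
    assert(n >= 1 && n <= 63);
    return ((int64_t)1 << (n - 1)) - 1;
}
```

For n = 8 these give -128 and 127, matching the 8-bit example on the slide.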

Floating point numbers
- Represent a subset of the real numbers $\mathbb{R}$
- Usually two sizes
- Two parts: fraction and exponent
- E.g.: fraction $0.1101_2$, exponent $101_2$
  - Fractional part $= \frac{1}{2} + \frac{1}{4} + \frac{0}{8} + \frac{1}{16} = 0.8125_{10}$
  - Exponent $= 5$, so multiply fraction by $2^5$
  - So number is $0.8125 \times 32 = 26.0_{10}$
- Can't represent all numbers accurately, e.g., $\pi$, $\frac{1}{3}$
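
The fraction-and-exponent arithmetic can be sketched in C. This models only the simplified format used on the slide (a binary fraction times a non-negative power of two), not a real hardware format such as IEEE 754. The names fraction_value and decode are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Value of the binary fraction 0.b1b2b3..., with the digits after the
   point given as a string of '0' and '1' characters. */
double fraction_value(const char *bits)
{
    double value = 0.0;
    double weight = 0.5;              /* first digit is worth 1/2 */

    assert(bits != NULL);
    for (; *bits != '\0'; bits++) {
        assert(*bits == '0' || *bits == '1');
        if (*bits == '1')
            value += weight;
        weight /= 2.0;                /* next digit is worth half as much */
    }
    return value;
}

/* fraction * 2^exponent, for a non-negative exponent. */
double decode(const char *fraction_bits, unsigned exponent)
{
    double value = fraction_value(fraction_bits);

    while (exponent-- > 0)
        value *= 2.0;
    return value;
}
```

With the slide's numbers, fraction_value("1101") is 0.8125 and decode("1101", 5) is 26.0.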

Booleans
- True/false values
- Good for flags (switches), conditions to be checked
- Some languages provide a Boolean primitive data type
- If not (or when other values are used as conditions):
  - 0 is false, everything else is true (C)
  - 0, "", [], etc., are false, everything else is true (Python)
- Representation: smallest efficiently-addressable unit (often a byte)
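
A small C sketch of both uses: C's zero-is-false rule, and a Boolean used as a flag. It assumes a C99 or later compiler, where stdbool.h supplies bool, true, and false. The function names is_truthy and contains are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* C's rule for conditions: zero is false, any other value is true. */
bool is_truthy(int x)
{
    return x != 0;
}

/* Uses a Boolean flag to record whether target was seen in items. */
bool contains(const int *items, int count, int target)
{
    bool found = false;               /* the flag */

    assert(count >= 0);
    for (int i = 0; i < count && !found; i++)
        found = (items[i] == target);
    return found;
}
```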

Characters
- Character represented as a pattern of 1s and 0s
- Representations:
  - ASCII: 7-bit code; 8-bit variants of this (e.g., ISO 8859-1) are the most common codes used
    - E.g.: a = 97, b = 98, ...; space = 32
  - Unicode: 16 bits (or longer); accommodates other alphabets and symbols
    - E.g.: 로 and 이 are code points 47196 and 51060
  - EBCDIC: 8 bits; old IBM code
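
Character codes can be inspected directly in C, where a char is just a small integer. The sketch below assumes an ASCII-based execution character set, and checks that assumption at run time. The names char_code and ascii_upper are hypothetical.

```c
#include <assert.h>

/* Numeric code of a character in the execution character set. */
int char_code(char c)
{
    return (int)(unsigned char)c;
}

/* Upper-case an ASCII lower-case letter by subtracting the fixed
   distance between 'a' and 'A'; other characters are returned as is. */
char ascii_upper(char c)
{
    assert(' ' == 32 && 'a' == 97);   /* this sketch assumes ASCII codes */
    if (c >= 'a' && c <= 'z')
        return (char)(c - ('a' - 'A'));
    return c;
}
```

On an ASCII-based system, char_code('a') is 97 and char_code(' ') is 32.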

Strings
- Sequence of characters: textual data
- Special operators (e.g., substring, concatenation), special relational operators, sometimes special assignment
- Lengths:
  - Fixed length: always the same size
  - Limited-dynamic: can change, but there's a maximum (e.g., C)
    - Length + sequence (often array) of characters
    - Null-terminated sequence of characters
  - Dynamic: can change without limit (e.g., Lisp)
    - More flexible, but overhead for allocation/deallocation
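
The two limited-dynamic layouts can be put side by side in C. This is a sketch under assumptions: the type counted_string, the function counted_from_c, and the limit MAX_LEN are all hypothetical, chosen only to illustrate "length + array of characters" next to C's usual null-terminated form.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LEN 15   /* hypothetical maximum length for this sketch */

/* Length + array of characters: the length is stored, not searched for. */
struct counted_string {
    size_t length;
    char   chars[MAX_LEN];
};

/* Build a counted string from a null-terminated C string. */
struct counted_string counted_from_c(const char *text)
{
    struct counted_string s = {0};
    size_t n = strlen(text);          /* walks to the terminating '\0' */

    assert(n <= MAX_LEN);             /* limited-dynamic: there is a maximum */
    s.length = n;
    memcpy(s.chars, text, n);         /* no terminator is stored */
    return s;
}
```

The trade-off is visible in the code: the null-terminated form must scan the text to learn its length, while the counted form spends extra space on the length field to avoid the scan.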

Arrays
- Sometimes need to represent groups of related data items (e.g., integers, characters, etc.)
- An array is a data type that stores homogeneous data in contiguous memory
- Can be single-dimensional (vectors) or multi-dimensional
- Use an index to find an element
- Value of index doesn't affect time taken to find element: random access storage

Design issues for arrays
- Syntax for array indices, e.g., temperatures(20) or Temperatures[20]
- Subscripts:
  - what is the lower bound?
  - bounds checking or not?
  - must subscripts be integers?
- Number of dimensions allowed
  - Usually no real limits
  - C: limits to one dimension, but allows array elements to be themselves arrays

Array descriptors
- What information is needed by the programming language/program about an array?
  - Base address: where is the first element?
  - Element type
  - Index type: doesn't have to be integer for some languages
  - Index lower, upper bounds
  - Number of locations in array
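
One way to picture a descriptor is as a C struct. This is an illustrative sketch, not how any particular compiler stores the information; the type array_descriptor and the functions element_count and in_bounds are hypothetical, and the element type is reduced to its size in bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct array_descriptor {
    unsigned long base;       /* address of the first element */
    size_t        elem_size;  /* bytes per element (stands in for element type) */
    long          lower;      /* index lower bound */
    long          upper;      /* index upper bound */
};

/* Number of locations in the array, derived from the bounds. */
long element_count(const struct array_descriptor *d)
{
    assert(d->upper >= d->lower);
    return d->upper - d->lower + 1;
}

/* Bounds check: is this subscript legal for the array? */
bool in_bounds(const struct array_descriptor *d, long index)
{
    return index >= d->lower && index <= d->upper;
}
```

A language that does bounds checking would run something like in_bounds on each subscript; a language that skips it saves the comparison but lets bad subscripts through.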

Finding address of array element
- Element location (or address, $A$) depends on base address ($B$), index ($I$), index lower bound ($L$), and element size ($S$):
  $A = B + (I - L) \times S$
- Given an array Taxes whose start location is 1024, whose element type is a long integer (8 bytes), and whose indices start at 0, find the location of element Taxes[15]
  [Figure: array "Taxes" in memory starting at address 1024, showing Taxes[0], Taxes[1], ..., Taxes[15]]
- $A = 1024 + (15 - 0) \times 8 = 1144$

Another example
- Suppose array foo has elements of a type that requires 6 bytes to store, that it begins at location 4096, and that its indices begin at 10. What is the address of foo[201]?
- Answer: $A = B + (I - L) \times S = 4096 + (201 - 10) \times 6 = 5242$
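
The address formula translates directly into C. This is a minimal sketch that treats addresses as plain unsigned numbers; the function name element_address and its parameter names are hypothetical.

```c
#include <assert.h>

/* A = B + (I - L) * S */
unsigned long element_address(unsigned long base, long index,
                              long lower, unsigned long elem_size)
{
    assert(index >= lower);   /* subscript must not fall below the lower bound */
    return base + (unsigned long)(index - lower) * elem_size;
}
```

Both slide examples check out: element_address(1024, 15, 0, 8) gives 1144 for Taxes[15], and element_address(4096, 201, 10, 6) gives 5242 for foo[201].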

Optimizing address computation
- Can write the expression so the static parts can be calculated at compile time and stored in a constant:
  $A = B + (I - L) \times S = B + (I \times S) - (L \times S) = (I \times S) + (B - L \times S)$
- $(B - L \times S)$ is the constant part
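
The rearranged formula can be sketched in C by computing the constant part once and reusing it. In a compiler this happens at compile time; here it happens once at run time when the descriptor is built, which is an assumption of the sketch. The names fast_array, make_fast_array, and fast_address are hypothetical.

```c
#include <assert.h>

struct fast_array {
    long elem_size;   /* S */
    long constant;    /* B - L*S, computed once */
};

/* Precompute the constant part of the address expression. */
struct fast_array make_fast_array(long base, long lower, long elem_size)
{
    struct fast_array a;

    assert(elem_size > 0);
    a.elem_size = elem_size;
    a.constant  = base - lower * elem_size;
    return a;
}

/* A = I*S + (B - L*S): one multiply and one add per access. */
long fast_address(const struct fast_array *a, long index)
{
    return index * a->elem_size + a->constant;
}
```

For an array starting at 4096 with lower bound 10 and 6-byte elements, the constant is 4036, and index 201 yields 201 × 6 + 4036 = 5242, the same address as the unoptimized formula.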

Multidimensional arrays
- Can have more than 1 dimension
- E.g., a $10 \times 10$ array is 2D: 100 entries, 10 rows, 10 columns
- Indices: [row,column] or [row][column]
- How to store dimensions?
  - Row-major order: rows are kept together, elements stored one row after another
    - T: array 1..10, 1..10 of integers
    - [Figure: memory layout of T as row 1, row 2, ..., row 10]
  - Column-major order: keeps columns together
  - Row-major probably more common

Address calculation
- Array T is a 2D array of temperatures taken over a 1-meter grid, 100 m on a side
- Assume row-major order, float data type (assume 4 bytes), base address for T = 2048, indices each start at 0 (so 0..99)
- What is the address of T[20,15]?
- First: what is the start of row 20? Each row holds 100 elements of 4 bytes, so the "element size" for a whole row is $4 \times 100$:
  $A_{[20,0]} = B + (I - L) \times S = 2048 + (20 - 0) \times (4 \times 100) = 10{,}048$
- Next, find the offset of the element in row 20, using the same kind of calculation:
  $A_{[20,15]} = 10{,}048 + (I - L) \times S = 10{,}048 + (15 - 0) \times 4 = 10{,}108$
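
The two-step calculation can be folded into one C function, with a column-major counterpart for comparison. This is a sketch that assumes both dimensions share the same lower bound, as in the slide's example; the function and parameter names are hypothetical.

```c
#include <assert.h>

/* Row-major: skip (row - lower) whole rows, then (col - lower) elements. */
long row_major_address(long base, long row, long col,
                       long lower, long cols_per_row, long elem_size)
{
    assert(row >= lower && col >= lower);
    assert(col - lower < cols_per_row);
    return base + (row - lower) * (cols_per_row * elem_size)
                + (col - lower) * elem_size;
}

/* Column-major: skip (col - lower) whole columns, then (row - lower) elements. */
long column_major_address(long base, long row, long col,
                          long lower, long rows_per_col, long elem_size)
{
    assert(row >= lower && col >= lower);
    assert(row - lower < rows_per_col);
    return base + (col - lower) * (rows_per_col * elem_size)
                + (row - lower) * elem_size;
}
```

With base 2048, 100 columns per row, and 4-byte elements, row_major_address(2048, 20, 15, 0, 100, 4) gives 10108, matching the slide. The same element in column-major order would sit at 8128, which shows that the storage order must be known before any address can be computed.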

Heterogeneous types: Records
- Records, structs: contain heterogeneous related data
- E.g., data about an employee: name (string), address (string), salary (fixed-point integer), height (float), ...
- Most languages support this type: struct in C, defstruct in Lisp, record in Pascal, etc.
- Design issues:
  - How are fields selected?
  - How is field checked?
  - Can we assign one structure to another?
  - How to implement, e.g., how to find element from selector? (field must have type, offset)
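
In C, the "type and offset" implementation is visible through the standard offsetof macro. The sketch below is illustrative only: the struct employee layout, its field names and sizes, and the helper field_address are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical employee record with heterogeneous fields. */
struct employee {
    char  name[32];
    char  address[64];
    long  salary_cents;   /* fixed-point: whole cents */
    float height_m;
};

/* Address of a field = address of the record + the field's offset. */
void *field_address(void *record, size_t offset)
{
    assert(record != NULL);
    return (char *)record + offset;
}
```

For a variable e of type struct employee, field_address(&e, offsetof(struct employee, salary_cents)) is the same address as &e.salary_cents, which is the computation the compiler performs for the selector e.salary_cents. C also answers the assignment question directly: one struct employee can be assigned to another with a plain =.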

Pointers
- These point to an object: contain its address
- Used for user-created dynamic variables (from heap)
- Used for indirect addressing (e.g., C): abstraction of assembly language's indirect addressing mode
- Can assign to and through them:
      int a = 3;    // integer
      int* p;       // pointer to integer
      p = &a;       // p = a's address
      *p = 4;       // a now = 4
- Often used (in C, e.g.) to access array elements:
      int a[100];   // 100-element array of ints
      int* p = a;   // p = addr of a's start (no &)
      for (int i=0;i<100;i++) {
        *p = 0;
        p++;
      }
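
The slide's two uses of pointers, heap variables and array traversal, can be packaged as small C functions. This is a sketch; the names new_int and zero_all are hypothetical, and the caller is assumed to release heap storage with free.

```c
#include <assert.h>
#include <stdlib.h>

/* Create an unnamed int on the heap; it is reachable only through
   the returned pointer. Returns NULL if allocation fails. */
int *new_int(int initial)
{
    int *p = malloc(sizeof *p);

    if (p != NULL)
        *p = initial;     /* assign through the pointer */
    return p;
}

/* Zero an array by walking a pointer across it, like the slide's loop. */
void zero_all(int *first, size_t count)
{
    assert(first != NULL);
    for (int *p = first; p < first + count; p++)
        *p = 0;
}
```

The variable created by new_int has no name of its own, which is the "unnamed variable" case from earlier in the lecture: its lifetime runs from the malloc call until the matching free, regardless of scope.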