Data Types and Computer Storage Arrays and Pointers. K&R, chapter 5

Fundamental Data Types
Most fundamental types are numeric (integer):
  - char (signed or unsigned)
  - short int (signed or unsigned)
  - int (signed or unsigned)
  - long int (signed or unsigned)
  - long long int (standard since C99; an extension on some C89/C90 compilers)
The char data type counts as an integer, even though it usually holds characters
  - characters are encoded as small integers, for example by ASCII

Additional Data Types
Additional data types:
  - float: "real" numbers (i.e. with a decimal point)
  - double: real numbers with increased range and precision
  - pointer: computer memory addresses
Real numbers are handled completely differently from integers
  - even though many arithmetic operations are "the same"
Pointer data can differ from computer type to computer type
  - e.g. Intel- or AMD-based desktops and laptops, versus ARM-based smartphones and tablets
  - often similar to integers

Data Types and Memory Usage
Variables occupy space in computer memory
Formally, sizes are "implementation-dependent", but some sizes are very common:
  - char: 1 byte (8 bits)
  - short: 2 bytes (16 bits)
  - int: 4 bytes (32 bits)
  - long: 8 bytes (64 bits) on most 64-bit Unix-like systems; 4 bytes on 64-bit Windows and on most 32-bit systems
  - pointer: 4 bytes or 8 bytes
Real numbers almost always follow the IEEE 754 standard:
  - float: 4 bytes
  - double: 8 bytes
  - the C standard does not strictly require these sizes, but in practice they do not vary between implementations

Experiment:
Use the sizeof operator (it looks like a function, but it is evaluated by the compiler) to examine object sizes in memory
Use the limits.h and float.h headers to report the minimum and maximum values of each type
Notes
  - Counting signed and unsigned versions, there are 10 integer types
  - float and double have separate limits for the digits part (precision) and the exponent part (range)

Some Code:

Where Is the Data in Memory?
Variables refer to areas of memory containing data
  - The variable's type describes the data
The address of a variable is the location of its memory area
  - Commonly called a pointer to the memory
Addresses are a data type too

Data in Memory
Declare an integer and a double:
  - int myint;
  - double mydouble;
This typically reserves 4 bytes of memory for the integer and 8 bytes for the double
"myint" and "mydouble" refer to the reserved memory locations
  - A variable's address is the address of its first byte

Arrays Read K&R, sections 1.6, 1.9, and 5.3

Arrays
Array: a collection of elements of the same type
  - Array elements can be used as individual variables
Elements are contiguous in memory
Elements are numbered starting at 0
  - The index is the offset, in elements, from the beginning address
[Figure: myarray drawn as ten adjacent cells, indexed 0 through 9]

Array sizing
Array size in bytes is the size of one element times the number of elements
In ANSI C (C89/C90), array size is static
  - it must be specified by a constant expression, at compile time
C99 adds variable-length arrays
  - a function can size a local array with a variable's value (support became optional for compilers in C11)
Arrays cannot be resized after creation
  - "Resizing" (adding elements) requires creating a new, bigger array and then copying values from the old version to the new one

Array declaration examples
  - Fixed-size array, with initialization (unused)
  - Fixed-size array, with string initialization (unused)
  - Variable-size array

(A little extra: sort the array)
Add another #include
Sort the array before printing it out
Print:
  - smallest to largest?
  - largest to smallest?

Groups of Data: Arrays
Declare an array of integers:
  - int myarray[10];
Reserves memory for 10 integers, typically 4 bytes per integer (40 bytes in all)
[Figure: myarray drawn as ten adjacent cells, indexed 0 through 9]
"myarray" refers to the beginning of the first element of the array
  - in most expressions, "myarray" is converted to a pointer to that first element

Array Indexing
Array indexes are 0-based
  - "myarray[0]" is the first element of myarray
  - "myarray[9]" is the last element
Array elements are just like individual variables, but stored contiguously in memory
No error checking
  - myarray[10] is not a legal element, but the compiler will still accept a reference to it
  - The result is undefined: it may be garbage data, or a crashed program

sine-plotter revisited:
Generate sine values as before, but save them in an array
Use array size to scale the "x" (horizontal) axis
Use min, max array values to scale the "y" (vertical) axis
Nested for loops "fill in" a horizontal graph

    #include <stdio.h>

    #define ARRAYSIZE 10

    int main(int argc, char **argv)
    {
        unsigned ary[ARRAYSIZE];
        int i;

        /* fill the array with squares */
        for (i = 0; i < ARRAYSIZE; i++)
            ary[i] = i * i;

        /* %p expects a void pointer */
        printf("ary: %p\n", (void *)ary);
        for (i = 0; i < ARRAYSIZE; i++)
            printf("%2d: %p %#02x %u\n", i, (void *)&ary[i], ary[i], ary[i]);
        return 0;
    }

Array assignments
You can assign new values to individual elements of an array
You cannot assign one array to another
  - do it element-by-element in a loop, instead
This doesn't work:

    int i, a1[10], a2[10];

    // fill a1 first (ok)
    for (i = 0; i < 10; i++)
        a1[i] = i*i/2;

    // try to copy a1
    a2 = a1;        // does not compile: an array name is not assignable

    // change original a1
    a1[0] = -9;

What happens to a2?

Copying Arrays
Use a loop:
  - for (i = 0; i < ARRAY_SIZE; i++) new_array[i] = old_array[i];
For text strings (arrays of characters terminated by a NUL character, '\0') use strncpy():
  - strncpy(new_str, old_str, ARRAY_SIZE);
    - ARRAY_SIZE specifies the maximum number of characters to copy, which avoids "buffer overflow"
    - if old_str holds ARRAY_SIZE or more characters, strncpy() does not write the terminating '\0'; you must add it yourself
    - strcpy() is simpler, but permits buffer overflows

Initializing arrays
When declaring an array, its elements can be given initial values
List the initial values in curly braces
If not enough initial values are supplied, the remaining elements are set to zero
  - with no initializer list at all, a local array is left uninitialized
  - What happens if too many initial values?
Project: write a program to fill an array of factorials.

try:
Create an array of chars, and fill it with something
Print the array's address
Loop over each element; print:
  - element's index
  - element's address
  - element's contents

    #include <stdio.h>

    #define ARRAYSIZE 10

    int main(int argc, char **argv)
    {
        char ary[ARRAYSIZE];
        int i;

        /* fill the array with small squares (the largest, 81, fits in a char) */
        for (i = 0; i < ARRAYSIZE; i++)
            ary[i] = i * i;

        /* %p expects a void pointer */
        printf("ary: %p\n", (void *)ary);
        for (i = 0; i < ARRAYSIZE; i++)
            printf("%2d: %p %#02x %d\n", i, (void *)&ary[i], ary[i], ary[i]);
        return 0;
    }

remarks
Use of the ARRAYSIZE macro
& is the "address-of" operator
Compare the array's value (address) to the first element's address: they're the same
Observe how the element addresses change: compare the step to the size of a char

modify:
Change the array type from "char" to "int"
Leave the rest unchanged:
  - Print the array's address
  - Loop over each element; print:
    - element's index
    - element's address
    - element's contents
How do element addresses change now?

    #include <stdio.h>

    #define ARRAYSIZE 10

    int main(int argc, char **argv)
    {
        int ary[ARRAYSIZE];
        int i;

        for (i = 0; i < ARRAYSIZE; i++)
            ary[i] = i * i;

        /* %p expects a void pointer; %x expects an unsigned value */
        printf("ary: %p\n", (void *)ary);
        for (i = 0; i < ARRAYSIZE; i++)
            printf("%2d: %p %#02x %d\n", i, (void *)&ary[i], (unsigned)ary[i], ary[i]);
        return 0;
    }

Pointers
Pointers are memory addresses
The address-of operator ( & ) gives the memory address of a variable
  - printf("%x %p\n", a, &a);
Pointer variables hold pointers
  - int x; int *xp = &x;
What happens if you try to print the address of a constant?
  - printf("%x %p\n", 77, &77);

Pointers and Coding
"a" refers to someplace in memory
  - that place contains "7"
"pa" refers to another place in memory
  - that place gets the address of the memory that holds the "7"
Print: the contents of "a", the address of "a", the contents of "pa"

    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int a;
        int *pa;

        a = 7;
        pa = &a;
        /* the last two values printed are the same address */
        printf("%d %p %p\n", a, (void *)&a, (void *)pa);
        return 0;
    }

Arrays and Pointers
"ary" and "&ary[0]" both refer to the same memory address
The "dereference" operator (*) works on an address, and returns the contents of memory at that address
Dereference:
  - ary[0] contains 0*0, or 0
  - *ary contains the same 0*0, or 0
  - What about ary[1]?
  - *(ary+1) is the same

back to the integer version:
Refer to array elements via array indexes, and via pointers
"Pointer arithmetic" adds to pointers according to the element size
Array notation and pointer notation show the same values
  - print values in hex and in decimal

    #include <stdio.h>

    #define ARRAYSIZE 6

    int main(int argc, char **argv)
    {
        int ary[ARRAYSIZE];
        int i;

        /* fill through pointer notation */
        for (i = 0; i < ARRAYSIZE; i++)
            *(ary + i) = i * i;

        printf("ary: %p\n", (void *)ary);
        for (i = 0; i < ARRAYSIZE; i++) {
            /* pointer notation, value in hex */
            printf("%2d: %p %#04x ", i, (void *)(ary + i), (unsigned)*(ary + i));
            /* array notation, value in decimal */
            printf(" %p %2d \n", (void *)&ary[i], ary[i]);
        }
        return 0;
    }

Array Bounds
C does not do bounds-checking
Array references outside the array itself are possible
  - this is equally true with array notation and with pointer notation
This is the source of buffer overflows

Try overrunning the integer version:
Define a small array of 6 elements
Overfill it with 12 values
  - both before and after the array's location
Print the overfilled array, and even more
  - What happens?

    #include <stdio.h>

    #define ARRAYSIZE 6

    int main(int argc, char **argv)
    {
        int ary[ARRAYSIZE];
        int i;

        /* deliberately out of bounds: the behavior is undefined */
        for (i = -3; i < 9; i++)
            *(ary + i) = i * i;

        printf("ary: %p\n", (void *)ary);
        /* read even further outside the array */
        for (i = -6; i < 12; i++) {
            printf("%2d: %p %#04x ", i, (void *)&ary[i], (unsigned)ary[i]);
            printf(" %p %2d \n", (void *)(ary + i), *(ary + i));
        }
        return 0;
    }

Arrays, Pointers, and Functions
In C, arrays can't be passed as function arguments, nor returned as function values
  - Pointers can be passed and returned
Functions can receive an array name, which is a pointer to the beginning of the array
  - The function also needs to know the size of the array; this is usually passed in as another argument
Example:

    void fill_array(float *ary, unsigned size)
    {
        unsigned i;

        for (i = 0; i < size; i++)
            ary[i] = 0.01;
    }

The first parameter can also be written as: float ary[]

be careful returning pointers
A pointer can be a function's return value, but don't return a pointer to one of the function's local variables or arrays!
  - These variables are destroyed when the function finishes executing.
  - The pointer will therefore point to an invalid location
You can return a pointer that points into an array that was originally passed into the function as an argument
Example:

    float *split_array(float *ary, unsigned size)
    {
        // return a pointer to the middle element of the array
        // (when size is 1 this points one past the end: do not dereference it)
        return &(ary[(size+1) / 2]);
    }

Using Pointers
Pointers are the address of an item in memory
Q. What if you want to use the item itself?
A. The dereference operator, *

    char ary[3], *p;
    p = ary;
    *p = 'a';
    printf("%u\n", (ary[0] == *p));   // True! (prints 1)

Pointer Arithmetic
You can assign to, add to, and subtract from, a pointer

    int ary[10], *pary;
    pary = ary;     // pary points to 1st array element
    pary += 3;      // pary now points to 4th array element
    pary--;         // pary backs up to 3rd array element

Addition, subtraction, increment, decrement work in units equal to the size of the element being pointed to
  - int, unsigned: a unit is typically 4 bytes
  - double: a unit is typically 8 bytes
  - etc.

Using Pointers
Pointers point to variables (in memory)
Variable contents can be accessed by dereferencing the pointer
  - the dereference operator is *
  - example:

    // add 7 to whatever pary points to
    x = *pary + 7;

Dereferenced pointers can be assigned to:
  - example:

    // add previous two elements, put in this element
    *pfibo = *(pfibo-1) + *(pfibo-2);

Pointers example: print cmd-line args

Fibonacci example, using pointers

More examples

more examples 2
In mystrncpy() below, note the use of the pre-increment operator, ++n.
  - This adds one before testing the value, in contrast to the post-increment operator n++, which tests the old value and then adds one.

done