C++ Arrays

Joseph Spring/Bob Dickerson
School of Computer Science
Operating Systems and Computer Networks

Areas for Discussion
Strings and

Arrays
To declare an array, follow the name with square brackets containing a constant expression:
    int i, a[10];
    char line[132];
The element type is determined by the type name at the front of the declaration.
int i, a[10]; declares i to be an integer and a to be an array of 10 ints.
char line[132]; declares line to be an array of 132 chars.

Access to an element of an array is by indexing, so line[3] accesses the 4th element of the array line.
All arrays in C++ start at zero, so the first element of a is a[0] and the last is a[9].
Note: C++ never checks the index value of array accesses, so line[201] will compile, but the access is outside the array: its behaviour is undefined and at best it yields some unknown value at runtime.

Arrays
    // The following program reads a line of characters from the
    // terminal and then writes them out again:
    char line[132], c;
    int i, nxt = 0;
    do {
        cin.get(c);
        line[nxt] = c;
        nxt++;
    } while (c != '\n');
    for (i = 0; i < nxt; i++)
        cout.put(line[i]);
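
Because array indexes are never checked, the program above overruns line if the input has more than 132 characters before the newline, and it never stops if the input ends without a newline. The following is a minimal sketch of one way to guard against both, not the lecture's own code. The helper names read_line and echo_first_line are hypothetical, and string streams stand in for cin and cout so the sketch can be run without a terminal.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: stores characters from `in` into buf until a newline
// has been stored, the input ends, or the buffer is full.
// Returns the number of characters stored.
std::size_t read_line(std::istream& in, char buf[], std::size_t capacity)
{
    std::size_t nxt = 0;
    char c;
    while (nxt < capacity && in.get(c)) {   // index checked before every store
        buf[nxt] = c;
        nxt++;
        if (c == '\n')
            break;
    }
    return nxt;
}

// Hypothetical helper: echoes the first line of `input`, using a
// 132-element array as in the slide.
std::string echo_first_line(const std::string& input)
{
    std::istringstream in(input);   // stands in for cin
    std::ostringstream out;         // stands in for cout
    char line[132];
    std::size_t nxt = read_line(in, line, sizeof line);
    for (std::size_t i = 0; i < nxt; i++)
        out.put(line[i]);
    return out.str();
}
```

Passing the capacity alongside the array is needed because the function only receives the address of the first element and cannot discover the array's size for itself.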
The do while Loop and ++ Operator
Format:
    do statement while ( expression );
If the body contains several statements they must be enclosed in { ... }. The terminating ; is part of the syntax.
The loop executes its statement, then if expression is true (non-zero) it repeats.

The ++ operator
Any storage location can be incremented by following (or preceding) the expression designating it with ++:
    i++ increments i
    ++line[3] increments the 4th element of line.
There is also a decrement operator, --:
    x-- decrements x by 1

The for Loop
The syntax is:
    for ( expr1 ; expr2 ; expr3 ) statement
where expr1, expr2 and expr3 are expressions. The meaning is given by:
    expr1 ;
    while ( expr2 ) {
        statement ;
        expr3 ;
    }
Example
    for (i = 0; i < nxt; i++) stat
Here i is set to 0 and, while i is less than nxt, the for loop executes stat, then increments i, re-checks i and so on.

The new standard library class string provides a wide range of special functions for extracting sub-strings and other manipulations. In addition there are operations between strings such as assignment, comparison and concatenation; some of this is shown below:
    #include <string>
    string s1("hello");
    string s2, s3;
    s2 = "world";
    s3 = s1 + " " + s2;
    cout << s3 << "\n";

With most standard library classes, including string, it is necessary to make the names and definitions directly available in an unqualified way, for example with a using directive such as:
    using namespace std;
Strings can be defined and given an initial value, as in string s1("hello"); this can be written in a different way but with identical effect as
    string s1 = "hello";

Strings can be assigned. This can be one string to another:
    s1 = s2
or from a quoted string (which is an array of char that is treated as a const char *, see next section).
Strings can be concatenated using the + operator.
This produces a new value; it does not modify either of its operands.
Strings can be input and output: in the example above s3 contains "hello world", which is sent to the standard output.
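
The points above about initialisation, assignment and concatenation can be checked with a small sketch. It assumes nothing beyond the standard string class; the function names join_words and operands_unchanged are hypothetical and chosen only for illustration.

```cpp
#include <string>

// Hypothetical helper: joins two words with a single space.
// The + operator builds a new string each time it is applied.
std::string join_words(const std::string& first, const std::string& second)
{
    return first + " " + second;
}

// Returns true if concatenation produced "hello world" and left both
// operands with their original values.
bool operands_unchanged()
{
    std::string s1("hello");       // initialised with constructor syntax
    std::string s2 = "world";      // same effect, written with =
    std::string s3;
    s3 = join_words(s1, s2);       // assignment of one string to another
    return s1 == "hello" && s2 == "world" && s3 == "hello world";
}
```

The sketch writes std::string in full rather than relying on a using directive, which is one common alternative for short examples.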
In C there are many library functions that treat an array of characters as if it is a string to be manipulated as a whole. C++ is the same; the C function declarations can be obtained by including cstring (or, for backwards compatibility with C, just string.h).
Although the standard now has the string library class it is still important to understand arrays of characters because they
(a) are the underlying form of the string,
(b) sometimes provide more flexibility and efficiency,
(c) are a fundamental part of the language, and
(d) are probably used by most existing programs instead of the newer library class string.

Since there is no index checking and arrays of any length can be passed to functions expecting strings, the convention is to terminate the string with a byte containing zero (the null byte), written as '\0', as an end marker.
The output operator cout << will take a null terminated character array as argument:
    char line[100];
    line[0] = 'h';
    line[1] = 'i';
    line[2] = '\0';
    cout << line;
The above will print hi; line must be null terminated, otherwise the undefined values of elements later in the array will be printed.

The standard library of C contains various functions to manipulate arrays of char as if they are strings. To get the correct declarations for them use the header file string.h or cstring.
    #include <string.h>
    char s1[6] = "hello";
    char s2[7] = " world";
    char s3[64], s4[64];
    strcpy(s3,s1);
    strcat(s3,s2);
    cout << s3 << "\n";
    cout << "length of: \"" << s3 << "\" is " << strlen(s3) << "\n";

Notes on the use of char arrays and other library routines:
When an array of characters is declared, storage is reserved and it can be initialised, as
    char s1[6] = "hello";
does. This is equivalent to writing:
    char s1[6] = { 'h', 'e', 'l', 'l', 'o', '\0' };
and shows that a quoted string like "hello"
    is a sort of array of characters,
    contains a null as its last character,
    is 6 characters long, not 5.
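
The difference between the storage a quoted string occupies and the length strlen reports can be seen directly. The sketch below is an illustration only; the three function names are hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// Storage occupied by an array initialised from "hello", null byte included.
std::size_t storage_of_hello()
{
    char s1[] = "hello";           // size deduced from the initialiser
    return sizeof s1;
}

// Length of the same string as strlen sees it, null byte not counted.
std::size_t length_of_hello()
{
    char s1[] = "hello";
    return std::strlen(s1);
}

// Returns true when the quoted-string form and the brace form of the
// initialiser fill the array with the same six bytes.
bool same_bytes()
{
    char a[6] = "hello";
    char b[6] = { 'h', 'e', 'l', 'l', 'o', '\0' };
    return std::memcmp(a, b, sizeof a) == 0;
}
```

Writing char s1[] with empty brackets lets the compiler count the characters, which avoids declaring an array one element too small for the null byte.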
Note: Arrays in C and C++ cannot be assigned to (unlike the new library class objects), so
    s3 = s1;
is illegal. Although the initialisation appears to be like assignment it is different: initialisation is not assignment.
Since there is no built-in assignment of arrays there is a library routine, strcpy, which copies the elements of the second argument into the storage of the first, up to and including the null byte; so strcpy(s3,s1) copies "hello" (+ null) into the first 6 elements of s3.

strcat assumes there is a string in its first argument array and copies characters from the second into positions after it; it overwrites the null byte at the end of the first string; so strcat(s3,s2) copies " world" into array s3 after the string "hello", producing "hello world".
strlen gives the number of characters in the string (not counting the null).
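
Neither strcpy nor strcat knows how large the destination array is, so the caller has to make sure the result fits. The sketch below shows one possible way of doing that check before copying; join_into is a hypothetical helper, not a library routine.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies s1 into dest and appends s2, but only if the
// combined string and its null byte fit in `capacity` elements.
// Returns false, leaving dest untouched, when there is not enough room.
bool join_into(char dest[], std::size_t capacity,
               const char s1[], const char s2[])
{
    if (std::strlen(s1) + std::strlen(s2) + 1 > capacity)
        return false;
    std::strcpy(dest, s1);   // copies s1 up to and including its null byte
    std::strcat(dest, s2);   // overwrites that null byte, then appends s2
    return true;
}
```

With a 64-element destination, joining "hello" and " world" succeeds and leaves an 11-character string; with an 8-element destination the same call is refused rather than writing past the end of the array.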
strcmp is not used above but is important. It compares 2 arrays lexically and gives < 0 if the first precedes the second, > 0 if the second precedes the first and 0 if they are the same.

The value of an array name by itself, eg: line, is the address of the first element of the sequence of storage units allocated for the array (see figure 1).
Indexing an array name involves following the pointer and using the index value as an offset into the sequence of element storage locations.
No storage location is allocated for the pointer itself; the array name is like a constant pointer used where necessary by the compiler.

Note that this is completely different from Pascal, Modula-2 or Ada, where the use of an array name by itself stands for the whole array.
When an array is given as an argument, eg: strlen(line), the value used is the address of the first of the element sequence. The formal argument in the function acts as a pointer variable initialised with the address, so all references in the function through the formal argument access the storage of the elements of the actual argument.

    extern void lower(char[]);
    char line[132];
    while( cin.getline(line,132) ) {
        lower(line);
        cout << line << "\n";
    }

    void lower(char s[ ])
    {
        for(int i=0; i<strlen(s); i++)
            if(s[i] >= 'A' && s[i] <= 'Z')
                s[i] = s[i] - 'A' + 'a';
    }

Example
The example program above loops through a file reading lines, converting all upper case letters to lower case and printing them out.
The main program is just a test for the function lower; if it was only required to lower the case of all the upper case letters in a file it would obviously be simpler to have a single while loop reading the file character by character.
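
Two of the points made earlier in this part, the sign of the strcmp result and the fact that an array argument is passed as the address of its first element, can be illustrated with a short sketch. All three function names are hypothetical and exist only for this illustration.

```cpp
#include <cstring>

// Reduces the strcmp result to -1, 0 or 1, since only its sign is defined.
int compare_sign(const char a[], const char b[])
{
    int r = std::strcmp(a, b);
    return (r > 0) - (r < 0);
}

// Overwrites the first character of the caller's array: s holds the address
// of the caller's first element, so no copy of the array is made.
void set_first(char s[], char c)
{
    s[0] = c;
}

// Returns true when the array name used by itself and the address of
// element 0 are the same pointer value.
bool name_is_first_address()
{
    char line[132] = "abc";
    const char* p = line;          // array name converts to a pointer
    return p == &line[0];
}
```

Calling set_first on an array holding "cat" with the character 'b' leaves "bat" in the caller's array, which is the same mechanism that lets lower change line in place.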
In the function lower above, the formal array parameter is defined as type char s[ ], which means that:
    it can be passed any size array of characters;
    no size is given for the formal parameter (a size written there would be ignored).

getline
There is a standard iostream library function called getline that takes a character array and size as arguments and reads the standard input file up to, and not including, newline or end of file and then stores the characters in the array. At most size-1 characters are stored, leaving room for the terminating null.

getline
So:
    char s[100];
    cin.getline(s,100);
when given the input:
    hi there
will store h in s[0], the rest in consecutive array elements up to e in s[7], and put null '\0' in s[8]; it has not stored the newline character.
(Note that cin >> s will only read in "hi" because space is treated as a separator; cin.get(s,100) reads the whole line as getline does, but leaves the newline character in the input.)

Character Macros
There is a standard library of macros (inline expanded definitions) in ctype.h. These allow characters to be tested and converted.
Amongst these macros is a test called isupper(c) and a converter called tolower(c) (NB tolower doesn't change a variable; it returns the converted character value, which then needs assigning back to the variable).

Character Macros
If we use these macros the function becomes:
    void lower(char s[])
    {
        for(int i=0; i<strlen(s); i++)
            if( isupper(s[i]) )
                s[i] = tolower(s[i]);
    }
It is also necessary to include the definitions at the start using:
    #include <ctype.h>

Summary
Strings and
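
As a closing illustration, the lower function and the getline loop from the slides can be combined into one sketch. This is an assumption-laden variant rather than the lecture's code: it uses the C++ header cctype instead of ctype.h, converts each char to unsigned char before calling isupper and tolower (which avoids trouble with negative char values), and reads from a string stream standing in for cin. The name lower_lines is hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>

// Converts the upper case letters of a null terminated array to lower case.
// The loop stops at the null byte instead of calling strlen on every pass.
void lower(char s[])
{
    for (std::size_t i = 0; s[i] != '\0'; i++) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        if (std::isupper(c))
            s[i] = static_cast<char>(std::tolower(c));
    }
}

// Hypothetical helper: lowers every line of `input` and returns the result,
// one output line per input line.
std::string lower_lines(const std::string& input)
{
    std::istringstream in(input);   // stands in for cin
    std::ostringstream out;         // stands in for cout
    char line[132];
    while (in.getline(line, sizeof line)) {   // newline is read but not stored
        lower(line);
        out << line << "\n";
    }
    return out.str();
}
```

Note that getline sets the stream's failure state when a line does not fit in the array, so this loop stops at the first line of more than 131 characters.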