CS102 Lecture 7 xkcd xkcd palindramas Sean Cusack 2018
Overview
    Not just int's any more, also characters
    What's the difference?
    1-D arrays and a little of 2-D again
    Strings
    What's a "char*" or a "char**"?
    Labs that do things to characters and strings and command line arguments
    Project to manipulate strings and command-line arguments
What is a "char"?
    int i = 65;
    f(i);
    ...
    void f( int i )
    {
        fprintf( stdout, "%d\n", i );
    }

    char i = 'A';
    f(i);
    ...
    void f( char i )
    {
        fprintf( stdout, "%c\n", i );
    }
Small Numbers But only 1's and 0's fit in a computer, and travel along wires 01000001 = 65 = 'A'?
A Ski Accident? This is the standard for turning a character into a number. It's called ASCII. American Standard Code for Information Interchange.
Char Array
Just like int's, we can have arrays of char's:
    char name[3] = { 'B', 'o', 'b' };
And just like int's, we can alter the array one element at a time:
    name[0] = 'R'; /* Now it's Rob instead of Bob */
Printing Magic
However, unlike arrays of integers, arrays of characters can be printed all at once using fprintf:
    char name[3] = { 'B', 'o', 'b' };
    fprintf( stdout, "%s\n", name );
Sorta... Ever think about what the prototype for fprintf() might be? In your functions, you knew the size of your arrays. Here, we don't. But somehow it sorts it out. For this, we need an "any size" array...
Chars and Stars
This is where we meet our new friend char*, who is very much like an array:
    char name[3]={'b','o','b'};
    foo(name);
    ...
    void foo( char* s )
    {
        int i;
        for(i=0; ???; i=i+1)
        {
            fprintf(stdout, "%c", s[i]);
        }
    }
Compare with the fixed-size version:
    char name[3]={'b','o','b'};
    foo(name);
    ...
    void foo( char s[3] )
    {
        int i;
        for(i=0; i<3; i=i+1)
        {
            fprintf(stdout, "%c", s[i]);
        }
    }
Using "any size" arrays They actually aren't arrays, and until we explain them better, they are only to be used in the very limited case of function arguments. But do use them just like the int arrays from last week... Do not declare or otherwise use these until next week, except in the below: void foo( char* good ) { char* dont_do_this;
To Reiterate In this class, until otherwise noted: char* good Is the same as: char good[unknown_size] But only in function parameters, not: char* dont_do_this; Is the latter useful in some way? Yes. Will we be covering it this week? No. So if you're typing a star after char or int, other than in a parameter list, it's wrong for now, guaranteed.
Where the Sidewalk Ends As we are stepping through an array, one at a time, you may have noticed that we didn't know when to end, since we don't know the length. In English, words end when you hit space or punctuation. That wouldn't work here because we want to be able to store any kind of characters, i.e.: 555 Elm St., NY NY 11111 So how do we know when is enough?
Nullification When you hit a space in an English sentence, you know the last word is done because it's a letter that can't be part of a word. We just have to pick one for ourselves. How about 'B'? That would be fine, it's a letter. But then you can't ever have a set of characters with a 'B' in it because it would be interpreted as an "end of word". Enter '\0'
Bob's True Name
So, in order for Bob to have an end to his name, we have to mark it as such:
    char name[3]={'b','o','b','\0'};
But don't forget that the null character is a character, too, and it needs its own room:
    char name[4]={'b','o','b','\0'};
Now this will work:
    fprintf( stdout, "%s\n", name );
Bob is talkative?
    -bash-4.1$ cat ex1.c
    #include <stdio.h>
    int main( int argc, char** argv )
    {
        int a = 123;
        char name[3]={'b','o','b','\0'};
        int b = 456;
        fprintf( stdout, "%s\n", name );
    }
    -bash-4.1$ gcc ex1.c -o ex1
    ex1.c: In function `main':
    ex1.c:5: warning: excess elements in array initializer
    ex1.c:5: warning: (near initialization for `name')
    -bash-4.1$ ./ex1
    Bo4???X???a4??)
Strings That's what we commonly refer to as a "string"... a set of characters with a null at the end. And the more common way of writing it is: char name[4] = "Bob"; Don't forget that 4th character. The doublequotes do you the favor of adding the null to the end, but you need to remember to leave room for it, or you will have a:
Arguments
You know, all the commands you run on UNIX are their own programs: "ls", "cp", et cetera. But they each do different things depending on what you type after the command name. I.e.:
    cp foo bar
is different from:
    cp baz blurf ../grok
How'd they do that?...
Argc and Argv Remember the header stuff we type in all C programs? Today we decode some of it. The first parameter is argc, an int, which we understand, and the second is argv, a char**, which we are just starting to understand... Since a char* name is kinda like a char name[10], a char** argv is much like a char argv[10][10], except without "limits" on size
Arguing with your program
The arguments you pass on the command line on UNIX, when calling your program, are what are available in main()'s parameters:
    command         argc   argv
    ./prog          1      prog
    ./prog hello    2      prog hello
    ./prog a b c    4      prog a b c
Dissecting argv
    ./foo bar baz
    f o o \0
    b a r \0
    b a z \0
Dissecting argv
    ./foo bar baz
    argv         all three strings
    argv[0]      f o o \0
    argv[0][2]   the second o in foo
Dissecting argv
    ./foo bar baz
                      [0]   [1]   [2]
    char* argv[0]      f     o     o
    char* argv[1]      b     a     r
    char* argv[2]      b     a     z
    Each cell argv[i][j] is a char, each row argv[i] is a char*, and the whole thing, argv, is a char**.
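Argv coding example
    #include <stdio.h>
    /* Assumes at least three arguments, e.g.: ./prog a b c */
    int main( int argc, char** argv )
    {
        fprintf( stdout, "%s\n", argv[3] );    /* third argument, whole string */
        fprintf( stdout, "%s\n", argv[2] );    /* second argument, whole string */
        fprintf( stdout, "%c\n", argv[1][0] ); /* first letter of first argument */
        fprintf( stdout, "%c\n", argv[0][0] ); /* first letter of program name */
        return 0;
    }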
Lab 1
    cs102*/lab-7
    git init
Write a program that prints out how many arguments were on the command line when the program was run. I.e.:
    #> ./prog one two
    3
    #> ./prog
    1
    #> ./prog one two three
    4
Lab 2a
Write a program that prints out the first command line argument, not including the name of the program itself. For example:
    #> ./prog hello world
    hello
Lab 2b
Write a program that prints out the first two command line arguments, not including the name of the program itself. For example:
    #> ./prog hello world
    hello
    world
    #> ./prog hello world many args
    hello
    world
Lab 2c
Write a program that prints out all the command line arguments, not including the name of the program itself. How many are there? (See lab 1.) Instead of printing just two, print from 1 up to the number of them we have. For example:
    #> ./prog hello world
    hello
    world
    #> ./prog hello world many args
    hello
    world
    many
    args
Lab 3
Now, IF there's at least one argument, print all of them, just like lab 2c. Otherwise, if there's not, then print no arguments found! For example:
    #> ./prog hello world 1 2 3
    hello
    world
    1
    2
    3
    #> ./prog
    no arguments found!
Lab 4
Write a program that prints out all the command line arguments, in reverse order, not including the name of the program itself. Same thing as lab 3 (or lab 2c), but print starting from the last one and make the counter less rather than more each time. For example:
    #> ./prog hello world 1 2 3
    3
    2
    1
    world
    hello
Lab 5a
Make a function that finds string length:
    int string_length( char* string )
Remember, this is C, so a prototype above main and a definition below. However, FOR NOW, don't bother making string_length() do any real work, just have it return 42. Yes, it's fake, it doesn't work, but get this much to work first. In your print loop in main, print each argument and ALSO the string length of that argument (use the function). Example output:
    #> ./lab5a foo bar baz
    baz 42
    bar 42
    foo 42
Lab 5b - 1
Now, fix string_length(). Instead of just returning 42, you need to figure out how big string is. Remember, you're inside a function, so you are not allowed to touch any variables outside, which means you don't have to worry about argv or argc; you're ONLY worried about the variable string. In order to count the elements in that, you need a loop, but instead of i<number, you need to go up until you reach the end of the string. Remember how every string has a null at the end?
Lab 5b - 2
So you need to see if string[0] is a null, if string[1] is a null, if string[2] is a null, etc... until it really is... then stop. Remember that "is equals" is == and "not equals" is !=. And remember that a null character is '\0'. Output example:
    #> ./lab5b foo bar baz
    baz 3
    bar 3
    foo 3
    #> ./lab5b hello gobbledegook
    gobbledegook 12
    hello 5
Meanwhile, inside String Length
I recommend using a while loop for this, rather than a for-loop...
    string      counter   character at counter'th position in string   action
    "hello\0"   0         h                                             go to next
    "hello\0"   1         e                                             go to next
    "hello\0"   2         l                                             go to next
    "hello\0"   3         l                                             go to next
    "hello\0"   4         o                                             go to next
    "hello\0"   5         \0                                            done! length is 5
Lab 6a
Now we're going to work towards adding the ability to reverse the characters in each argument, so given BAR print RAB. Note: do NOT print anything in reverse_string. We're literally altering the string, so that after we alter it, we can print it like normal. However, one step at a time. New file, lab6.c. Keep the string length thing in there, you'll eventually need it. Make a new function (yes, prototype above main, definition below):
    void reverse_string( char* string )
In main, use that function on an argument before you print it. I.e. reverse argv[whatever] and then print argv[whatever]. Now, let's make reverse_string do something. Not EVERYTHING, but something. Remember that we're inside a new function, so we can't see or touch anything in string_length or main. No argv, no argc. All we have is string, that's it. FOR NOW: set the first character of string to the second character of string. Yes, I know that's not reversing anything, but one step at a time. Example output:
    #> ./lab6a foo bar baz hello
    eello 5
    aaz 3
    aar 3
    ooo 3
Lab 6b
OK let's make it a little better than that. Let's SWAP the first two characters. Now, in order to SWAP, we have to be careful: we need a third "bucket" to keep a spare copy. It takes THREE lines to swap. So your function will go from one line to one new variable plus 3 lines. Example of "swapping" in the next two slides. Output example:
    #> ./lab6b foo bar baz hello
    ehllo 5
    abz 3
    abr 3
    ofo 3
Swap juggle
You can't just assign it:
    string[first] = string[second];
              first          second
    before    first value    second value
    after     second value   second value
Because that wipes out the first... and you can't do the reverse, because it wipes out the second:
    string[second] = string[first];
              first          second
    before    first value    second value
    after     first value    first value
Juggling
              first          second         spare
    start     first value    second value
    step 1    first value    second value   second value
    step 2    first value    first value    second value
    step 3    second value   first value    second value
Lab 6c
OK let's make it EVEN BETTER. Now let's swap the first and the LAST. Remember that we don't know how big string is ahead of time, we can't assume it's 5 or something. However, we DO have a function that tells us how long a string is (you wrote it in lab 5). Use that to figure out the LAST index of string and swap that one and the first one. So if string just so happens to be hello: string_length(string) will return 5 but string[4] is the last element. Output example:
    #> ./lab6c foo bar baz hello
    oellh 5
    zab 3
    rab 3
    oof 3
Reversing the WHOLE String
Look at your swapping algorithm. If only we could swap:
    string[0] with string[ the last one ]
    string[1] with string[ the last one -1 ]
    string[2] with string[ the last one -2 ]
    etc
Hmm, I see a number changing, I wonder what kind of construct that is... The question is... how many times do you do this?
Reverse String 2
    string   h           e           l           l           o           \0
             string[0]   string[1]   string[2]   string[3]   string[4]   string[5]?
Reverse String 3
    string      index   index'th char in string   index'th char from end of string   action
    "hello\0"   0       h                         o                                  swap
    "oellh\0"   1       e                         l                                  swap
    "olleh\0"   2       l                         l                                  swap (or not)
    "olleh\0"   3                                                                    stop