CS 322 Operating Systems
Programming Assignment 2
Using malloc and free
Due: February 15, 11:30 PM

Goals

To get more experience programming in C
To learn how to use malloc and free
To learn how to use valgrind to find memory problems
To learn how to call functions from the C library

Assignment

Unix provides a lot of utility programs. One of these is sort. Run man sort to see what it does and all the options available. In this assignment, you will write a much simpler version of Unix's sort utility. Your version will sort lines of text. This program should be invoked as follows:

    ./textsort file

or

    ./textsort -3 file

textsort is the name of the executable program. In the first line, "file" is the name of the file that contains the text being sorted. The input file should be sorted and the sorted output printed to the screen; it is assumed that the file is a text file (full of ASCII characters).

When only the filename is given, the lines are sorted on the first word in each line. If the optional argument is included (-3 in the example on the second line), the program should sort the input file using the specified word as the sort key (with -3, the program should find the 3rd word in each line and sort the lines based upon that). It is normal Unix convention to start optional arguments with a -, so here -3 does not mean negative 3, but rather the optional argument 3.

Examples

Let's say you have the following file, called short_alma.txt:

If you run ./textsort short_alma.txt, it should print:

because In is alphabetically before Oh, which is before Our, which is before The.

If, however, you pass in a flag to sort on a different word, you'll get a different output. For example, if you call ./textsort -2 short_alma.txt, you should get:

In ASCII, uppercase letters appear before lowercase letters and therefore they will get sorted first. As a result, we see Mount before courage before lives before the.

Yes, we are assuming -2 means the second word in each line (like most people would, except computer scientists who always want to start at 0).

Hints on Sorting

You can use C's qsort function to sort an array. Here is an excerpt from the man page:

    void qsort (void *base, size_t nmemb, size_t size,
                int (*compar) (const void *, const void *));

    The qsort() function sorts an array with nmemb elements of size size. The base argument points to the start of the array.

    The comparison function must return an integer less than, equal to, or greater than zero if the first argument is considered to be respectively less than, equal to, or greater than the second. If two members compare as equal, their order in the sorted array is undefined.

Here are some tips about using qsort.

The last parameter to qsort is a function. To pass a function as a parameter in C, just pass in the name of the function. In this case, the function you pass in will be called by qsort to do the pairwise comparison of values in the array you are sorting. The signature of this function should be:

    int my_compare (const void *, const void *)

You can call the function whatever you want; it does not need to be my_compare.

The trickier part is understanding what the parameters to your comparison function are. The man page for qsort says that the comparison function "is called with two arguments that point to the objects being compared". That is, they are pointers to the array elements. Therefore, if you are calling qsort with an array of strings, then the elements in your array have type char *.
The parameters to the comparison function are pointers to array elements, so their type is char **. To use the parameters, you need to do two things:

1. Cast the void * to its actual type of char **.
2. Dereference the char ** so that you have a char *.

Then you will be able to just do the comparison. For example:

    int my_compare (const void *elem1, const void *elem2) {
        /* Cast to its actual type. */
        char **strptr1 = (char **) elem1;
        char **strptr2 = (char **) elem2;

        /* Dereference to get the strings */
        char *str1 = *strptr1;
        char *str2 = *strptr2;

        /* Then use strcmp to compare the strings */
        ...
    }

The same principle applies if the array you are sorting contains something other than char *. The compare function will get a pointer to whatever element type you have and you will need to dereference the pointer to get the actual value to compare.

To give yourself confidence that you understand how to call qsort, I recommend that you write a very simple test program in which you define a comparison function. In the main function of this test program, call qsort and print out the sorted array.

Other Hints

You will be using a number of other functions provided by the C libraries. You need to use the man command to find out how to call these functions and which .h files you need to include. Here, I identify functions you will likely want to use.

In your sorting program, you should use fopen() to open the input file, fgets() to read each line of the file, and fclose() when you are done with the input file.

If you want to figure out how big the input file is before reading it in, use the stat() function. This will not tell you exactly how many lines the file contains, but the file size lets you make a better guess.

To compare strings, use the strcmp() function.

To chop lines into words, you can use strtok(). Be careful, though; it is destructive and will change the contents of the lines. Thus, if you use it, make sure to make a copy of the original line for later use.

The routine strtol() can be used to transform a string into an integer.

To exit, call exit() with a single argument. This argument to exit() is then available to the user to see if the program returned an error (that is, returned 1 by calling exit(1)) or exited cleanly (that is, returned 0 by calling exit(0)).
You can also just return from the main() function and pass the return code that way where appropriate.

The routine malloc() is used for memory allocation. If you have a data structure that you need to make bigger, use realloc() to increase the amount of memory for that data structure. You can use calloc() to allocate an array dynamically. Use free() to free memory when it is no longer needed. Your program should not have any memory leaks; all the memory allocated by malloc should be freed before your program ends.

Check the return value of every library function that you use. For many functions, the return value indicates whether the function succeeded or encountered an error.
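The hints above about qsort, malloc(), realloc(), and free() can be tried out together in a small experiment before writing textsort itself. The following is only a sketch under stated assumptions: the names line_list, line_list_append, line_list_free, and compare_strings are hypothetical and not part of the assignment, and doubling the capacity is just one possible growth policy.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable array of heap-allocated strings. */
struct line_list {
    char **items;     /* array of pointers to copied strings */
    size_t count;     /* number of strings currently stored */
    size_t capacity;  /* number of slots allocated in items */
};

/* Append a private copy of text. Returns 0 on success, -1 if an
 * allocation fails; on failure the list is unchanged and still valid. */
int line_list_append(struct line_list *list, const char *text)
{
    if (list->count == list->capacity) {
        size_t new_capacity = list->capacity == 0 ? 4 : list->capacity * 2;
        /* Assign to a temporary so the old block is not lost on failure. */
        char **grown = realloc(list->items, new_capacity * sizeof *grown);
        if (grown == NULL) {
            return -1;
        }
        list->items = grown;
        list->capacity = new_capacity;
    }

    size_t size = strlen(text) + 1;   /* include the terminating '\0' */
    char *copy = malloc(size);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, text, size);
    list->items[list->count++] = copy;
    return 0;
}

/* Free every string and then the array itself, so nothing leaks. */
void line_list_free(struct line_list *list)
{
    for (size_t i = 0; i < list->count; i++) {
        free(list->items[i]);
    }
    free(list->items);
    list->items = NULL;
    list->count = 0;
    list->capacity = 0;
}

/* Comparison function for qsort on an array of char *: each argument
 * points to an array element, so it is really a pointer to a char *. */
int compare_strings(const void *elem1, const void *elem2)
{
    const char *str1 = *(const char * const *) elem1;
    const char *str2 = *(const char * const *) elem2;
    return strcmp(str1, str2);
}
```

To use it, start from a zero-initialized struct line_list, append a few strings, call qsort(list.items, list.count, sizeof list.items[0], compare_strings), print the array, and finish with line_list_free(&list). Running that test program under valgrind should report no leaks if every allocation is matched by a free.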
Assumptions and Errors

The return code upon success is zero. When the program runs normally and no errors are encountered, you should return 0.

Only space characters (that is, what you get when you hit spacebar) will be used to separate words in the input. Thus, you don't have to worry about tabs or other whitespace. However, your program should correctly handle the case where there are two or more spaces between words, that is, it should treat that as one big separator between the words.

Max line length will be 128. If you get a line longer than this (detected by the lack of a newline character in the last position), please print

    Line too long

to standard error and exit with return code 1.

You should check the arguments of textsort carefully. If more than two arguments are passed, or two are passed but the second does not fit the format of a dash followed by a number, you should EXACTLY print (to standard error):

    Error: Bad command line parameters

and exit with return code 1.

Key does not exist on one line of input file: If the specified key does not exist on a particular line of the input file, you should just use the last word of that line as the key. For example, if the user wants to sort on the 4th word (-4), and the sort encounters a line like this (sample line), the sort should use the word line to sort that line.

Empty line: You should use an empty string to sort any empty lines (that is, lines that are just a newline or spaces and a newline character).

File length: May be pretty long! However, you can assume the data set will fit into memory and you shouldn't have to do anything special to handle this. However, if malloc() does fail, please print

    malloc failed

to standard error and exit with code 1.

Invalid files: If the user specifies an input file that you cannot open (for whatever reason), the sort should EXACTLY print (to standard error):

    Error: Cannot open file foo

with no extra spaces (if the file was named foo) and then exit with return code 1.
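The rules above for picking the sort key (runs of spaces count as one separator, a missing key falls back to the last word, an empty line sorts as an empty string) can be sketched as one helper. This is a sketch, not the required design: the name extract_key and its parameters are hypothetical, it scans with strspn() and strcspn() rather than strtok() so the original line is left unmodified, and it assumes the line comes from fgets(), so a trailing newline is treated as the end of the line.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the key_index-th word (counting from 1) of
 * line into key, which has room for key_size bytes. If the line has
 * fewer words, the last word is used; if it has no words at all, key
 * becomes the empty string. The line itself is never modified. */
void extract_key(const char *line, int key_index, char *key, size_t key_size)
{
    const char *word = NULL;  /* start of the most recent word found */
    size_t word_len = 0;      /* length of that word */
    const char *p = line;
    int seen = 0;             /* number of words found so far */

    while (seen < key_index) {
        p += strspn(p, " \n");            /* skip a run of separators */
        if (*p == '\0') {
            break;                        /* ran out of words */
        }
        word = p;
        word_len = strcspn(p, " \n");     /* length up to next separator */
        p += word_len;
        seen++;
    }

    if (key_size == 0) {
        return;                           /* no room to store anything */
    }
    if (word_len >= key_size) {
        word_len = key_size - 1;          /* truncate to fit the buffer */
    }
    if (word != NULL) {
        memcpy(key, word, word_len);
    }
    key[word_len] = '\0';
}
```

With a 129-byte buffer, calling extract_key on the line "sample line" with a key index of 4 leaves the word line in the buffer, matching the example in the text. One possible design is to compute each line's key once, before sorting, and store it alongside the line so the comparison function only has to call strcmp().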
Important: On any error, you should print the error message using fprintf(), and send it to stderr (standard error) and not stdout (standard output). This is accomplished in your C code as follows:

    fprintf(stderr, "whatever the error message is\n");

General Advice

Start small, and get things working incrementally. For example, first get a program that simply reads in the input file, one line at a time, and prints out what it reads in. Then, slowly add features and test them as you go.

Testing is critical. Write tests to see if your code handles all the cases you think it should. Be as comprehensive as you can be.

Save copies of your code as you get things working.

Use valgrind to help you find memory problems. valgrind is a program that will help you find memory errors in your C programs. You need to install it in your Linux VM:

    sudo apt install valgrind

To run valgrind, use a command like the following, where file.txt is the name of the file being read by textsort. You can also optionally pass in the number used to indicate which word to sort by, just as you would if you were calling textsort from the command line.

    valgrind --leak-check=yes ./textsort file.txt

Teams

You will work in pairs on this assignment. Here are the teams:

Ammal Abbasi, Mimi Shahzad
Eeman Abbasi, Crystal Seo
Sabrina Accime, Ana Saverchenko
Shanzeh Agrawala, Sophie Manum
Momal Baloch, Alice Richardson
Ranjini Das, Xiaolei Ni
Anne Demosthene, Sophie Le
Darienne Dewalt, Vivian Le
Zineb El Mechrafi, Rebecca Kim
Hiwete Fetene, Hyeji Kim
Emma Grotto, Miral Khalil
Zhiling Hu, Van Trinh
Tracy Keya, Jenny Lee
Sujin Kim, Zoe Liang
Sara Rutkowski, Jane Yoo

Grading

Code that is turned in that does not compile will be returned ungraded.

Grading will be based on correctness, documentation, and coding style in these proportions:

 5%  Makefile
20%  Sorts correctly on a short text file
10%  Sorts correctly on a long text file
10%  Sorting by other than the first word
 5%  Word to sort by > number of words on some lines
10%  Error checking
20%  Correct memory management
10%  Comments
10%  Coding style

Turning in your solution

Place your C file(s) and makefile into a single tar.gz file, using your name and your partner's name rather than mine in the name of the file. (First names are enough.)

    tar -cvzf Barbara_Lerner_Assign1.tar.gz *.c makefile

Upload your tar.gz file to Moodle. Only one of you should submit this.