CSI 402 Systems Programming LECTURE 4 FILES AND FILE OPERATIONS
A mini Quiz
Consider the following struct definition:
    struct name {
        int a;
        float b;
    };
Then somewhere in main():
    struct name *ptr, p;
    ptr = &p;   // ptr now refers to the memory address of p
How to access variable a?
    (*ptr).a
    ptr->a
A mini Quiz
Consider the following code snippet:
    void avg_sum(double a[], int n, double *avg, double *sum)
    {
        int i;
        sum = 0.0;
        for(i = 0; i < n; i++)
            sum += a[i];
        avg = sum / n;
    }
A mini Quiz
Consider the following code snippet:
    void avg_sum(double a[], int n, double *avg, double *sum)
    {
        int i;
        *sum = 0.0;
        for(i = 0; i < n; i++)
            *sum += a[i];
        *avg = *sum / n;
    }
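A minimal sketch of how a caller might use the pointer-based version: the caller owns the two result variables and passes their addresses. The function print_avg_sum and the sample array values are hypothetical, chosen only for illustration.

```c
#include <stdio.h>

/* Pointer-based version from the slide: results are written through avg and sum. */
void avg_sum(double a[], int n, double *avg, double *sum)
{
    int i;
    *sum = 0.0;
    for (i = 0; i < n; i++)
        *sum += a[i];
    *avg = *sum / n;
}

/* Hypothetical caller: passes the addresses of two local doubles. */
void print_avg_sum(void)
{
    double data[] = {1.0, 2.0, 3.0, 6.0};
    double avg, sum;

    avg_sum(data, 4, &avg, &sum);
    printf("sum = %.1f, avg = %.1f\n", sum, avg);
}
```

Calling print_avg_sum() from main prints "sum = 12.0, avg = 3.0".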
Files
- A collection of data on disk managed by the user and the operating system
- A way to permanently store data
- A file name is how the user and the OS identify the file; it follows the OS naming rules
- Why (or when to) use files?
  - Large volume of input/output data
  - More permanent storage of data
  - Transfer to other programs
  - Multiple simultaneous input and/or output streams
Examples of Files
- Books, papers, reports
- Scientific data: biological data, weather data, environmental data
- Web data: HTML pages, images (jpg, gif, png), videos
- Programs
Files and File Variables
- Remember: a file is a collection of data (on disk)
- But a C program can only operate directly on variables
- Need to make a connection between the data on the disk and variables in a C program
- File variables
  - A file variable is a data structure (FILE in <stdio.h>) which represents a file
  - Temporary: exists only while the program runs
Files in C
- Type for file variables: FILE * (a pointer to a FILE struct)
- File operations use functions from stdio.h
  - fopen for opening a file
  - fclose for closing a file
  - getc for reading characters from a file
  - putc for writing characters to a file
  - fscanf for reading other data types from a file
  - fprintf for writing other data types to files
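The six functions listed above can be combined in one short round trip. This is a sketch under stated assumptions, not code from the lecture: the function name write_then_read, the value 42, and the text "hi" are all illustrative.

```c
#include <stdio.h>

/* Writes an integer and a short line to `path`, then reads them back.
   Stores the integer in *value_out and the number of characters that
   follow it in *nchars_out. Returns 0 on success, -1 on failure. */
int write_then_read(const char *path, int *value_out, int *nchars_out)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    fprintf(fp, "%d\n", 42);          /* formatted output */
    putc('h', fp);                    /* character output */
    putc('i', fp);
    putc('\n', fp);
    if (fclose(fp) != 0)
        return -1;

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%d", value_out) != 1) {   /* formatted input */
        fclose(fp);
        return -1;
    }
    int c;                            /* int, not char, so EOF is representable */
    *nchars_out = 0;
    while ((c = getc(fp)) != EOF)     /* character input */
        (*nchars_out)++;
    fclose(fp);
    return 0;
}
```

After a successful call, *value_out is 42 and *nchars_out is 4: the newline that fscanf left unread, then 'h', 'i', and the final newline.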
Key Unix I/O Design Concepts
- Uniformity
  - File operations, device I/O, and interprocess communication all go through open, read/write, close
  - Allows simple composition of programs, e.g., the find or grep tools
- Open before use
  - Access control and arbitration
  - Sets up the underlying machinery, i.e., data structures
- Explicit close
[Figure: I/O layers, top to bottom: High Level I/O (streams), Low Level I/O (handles), syscall (registers), File System (descriptors), I/O Drivers (data transferring)]
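The open, read/write, close pattern can be sketched with the low-level calls directly. This assumes a POSIX system (<fcntl.h>, <unistd.h>); the helper names copy_fd and cat_file are hypothetical. Because copy_fd only sees descriptors, the same code works whether they refer to files, pipes, or a terminal, which is the uniformity point above.

```c
#include <fcntl.h>
#include <unistd.h>

/* Copies everything readable from descriptor `in` to descriptor `out`.
   Returns the total number of bytes copied, or -1 on error. */
long copy_fd(int in, int out)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    while ((n = read(in, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < n) {                    /* write may be partial */
            ssize_t w = write(out, buf + done, (size_t)(n - done));
            if (w < 0)
                return -1;
            done += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}

/* Hypothetical helper: copies a named file to standard output. */
long cat_file(const char *path)
{
    int fd = open(path, O_RDONLY);            /* open before use */
    if (fd < 0)
        return -1;
    long copied = copy_fd(fd, STDOUT_FILENO);
    close(fd);                                /* explicit close */
    return copied;
}
```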
The File System Abstraction
- High-level idea: files live in a hierarchical namespace of filenames
- File
  - Named collection of data in a file system
  - File data: text, binary
  - File metadata: information about the file
    - Size, modification time, owner, security information
    - Basis for access control
- Directory (more in a future lecture)
  - Folder containing files and directories
  - Hierarchical namespace: a path uniquely identifies a file or directory
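On a POSIX system, the metadata fields named above can be inspected with stat(). The sketch below is illustrative only; the function name show_metadata is hypothetical, and the fields are printed as raw numbers rather than formatted dates or user names.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Prints a few metadata fields for `path` and returns its size in bytes,
   or -1 if stat() fails (for example, if the file does not exist). */
long show_metadata(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    printf("size:     %ld bytes\n", (long)st.st_size);
    printf("modified: %ld (seconds since the epoch)\n", (long)st.st_mtime);
    printf("owner:    uid %ld\n", (long)st.st_uid);
    printf("mode:     %lo (file type and permission bits)\n",
           (unsigned long)st.st_mode);
    return (long)st.st_size;
}
```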
Opening and Closing Files
- Prototype: FILE *fopen(const char *filename, const char *mode);
  - filename: identifies the file to open
  - mode:
    - "r" for reading: the file must exist
    - "w" for writing: creates an empty file; if the file exists, its contents are erased (!)
    - "a" for appending: data is added at the end of the file; the file is created if it doesn't exist
  - Returns NULL if the file cannot be opened
- Prototype: int fclose(FILE *stream);
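The effect of the three modes can be observed with two small helpers. This is a sketch; count_bytes and put_text are hypothetical names, and fputs (also from stdio.h) is used for brevity.

```c
#include <stdio.h>

/* Returns the number of bytes in `path`, or -1 if it cannot be opened
   with "r" (for example, because it does not exist). */
long count_bytes(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    long n = 0;
    while (getc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}

/* Opens `path` in `mode` ("w" or "a"), writes `text`, and closes the file.
   Returns 0 on success, -1 on failure. */
int put_text(const char *path, const char *mode, const char *text)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    int ok = (fputs(text, fp) != EOF);
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Starting from a file that does not exist: count_bytes returns -1; after put_text(path, "w", "abc") it returns 3; after put_text(path, "a", "de") it returns 5; and after put_text(path, "w", "x") it returns 1, because "w" erased the earlier contents.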
Positioning in Files
- File offset
  - For input files, it gives the number of the byte to be read next
    - It is set to zero when the file is opened (using "r" mode)
    - The offset value increases as bytes are read from the file
  - For output files, it gives the number of the byte to be written next
    - It is set to zero when the file is opened (with mode "w")
    - The offset value increases as bytes are written to the file
- For both input and output files, the current value of the file offset can be obtained using the ftell function
Library Function ftell
- Part of stdio.h
- Prototype: long ftell(FILE *fp);
  - fp specifies the (input or output) file
- Returns the offset for the file specified by fp
- Returns -1L in case of error, and the global variable errno is set to a positive value
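A small sketch of how the offset reported by ftell grows as bytes are read. The function name offset_after_reading is hypothetical; the file is opened in "rb" mode here as an assumption so that the offset is a plain byte count on every platform (on Unix, "r" behaves the same).

```c
#include <stdio.h>

/* Opens `path`, reads up to `k` bytes, and returns the file offset
   reported by ftell afterwards. Returns -1L if the file cannot be
   opened or ftell fails. */
long offset_after_reading(const char *path, int k)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1L;

    printf("offset after open: %ld\n", ftell(fp));   /* zero for a fresh "r" stream */
    for (int i = 0; i < k && getc(fp) != EOF; i++)
        ;                                 /* each successful read advances the offset by 1 */

    long pos = ftell(fp);
    if (pos == -1L)
        perror("ftell");                  /* errno was set by the failed call */
    fclose(fp);
    return pos;
}
```

For a 6-byte file, offset_after_reading(path, 3) returns 3, and asking for more bytes than the file holds returns 6, since reading stops at EOF.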
Library Function fseek
- Part of stdio.h
- Prototype: int fseek(FILE *fp, long offset, int origin);
  - Sets the file position of the stream to the given offset
  - fp specifies the (input or output) file
  - offset (which may be negative) specifies the number of bytes to move from origin
  - origin is the position from which offset is added/subtracted
    - SEEK_SET: beginning of file
    - SEEK_CUR: current position of the file pointer
    - SEEK_END: end of file
- Returns zero if successful and a non-zero value otherwise
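Two common uses of fseek, sketched with hypothetical helper names: finding a file's size by seeking to the end, and reading a byte at a position counted back from the end. Seeking relative to SEEK_END on a binary stream is assumed to work as it does on Unix systems; the C standard does not guarantee it everywhere.

```c
#include <stdio.h>

/* Returns the size of `path` in bytes using fseek + ftell, or -1L on failure. */
long size_by_seeking(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1L;
    long size = -1L;
    if (fseek(fp, 0L, SEEK_END) == 0)     /* jump to the end of the file */
        size = ftell(fp);                 /* offset at the end == size in bytes */
    fclose(fp);
    return size;
}

/* Returns the byte `back` positions before the end of the file
   (back == 1 is the last byte), or EOF on failure. */
int byte_from_end(const char *path, long back)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return EOF;
    int c = EOF;
    if (fseek(fp, -back, SEEK_END) == 0)  /* negative offset: move backwards */
        c = getc(fp);
    fclose(fp);
    return c;
}
```

For a file containing "abcdef", size_by_seeking returns 6, byte_from_end(path, 1) returns 'f', and byte_from_end(path, 6) returns 'a'.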
Library Function rewind
- Part of stdio.h
- Prototype: void rewind(FILE *fp);
- Sets the file offset to zero (i.e., gets us back to the beginning of a file)
- rewind(fp) is equivalent to (void) fseek(fp, 0L, SEEK_SET), except that rewind also clears the stream's error indicator and returns no value
- Note: Handout 4.1 illustrates the use of library functions fseek, ftell, and rewind to move around in an input file specified by a command line argument
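A minimal sketch of a two-pass read over the same stream, with rewind between the passes. It is not taken from Handout 4.1; the function name count_twice is hypothetical.

```c
#include <stdio.h>

/* Counts the bytes in an open, readable stream twice, rewinding in between.
   Returns the count if both passes agree, or -1L otherwise. */
long count_twice(FILE *fp)
{
    long first = 0, second = 0;

    while (getc(fp) != EOF)
        first++;

    rewind(fp);                   /* back to offset 0; EOF and error flags are cleared */
    if (ftell(fp) != 0L)
        return -1L;

    while (getc(fp) != EOF)
        second++;

    return first == second ? first : -1L;
}
```

The second loop only sees data because rewind reset both the offset and the end-of-file indicator that the first loop left set.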
Common Pitfalls with File Accesses
- Moving outside the file boundary
  - Function fseek allows any offset value
  - It doesn't check whether the specified move is within the file
  - The result of illegal moves is implementation dependent
  - On most Unix systems:
    - fseek does not move the offset value below the beginning of the file
    - The offset can be changed to a value beyond the end of file; however, trying to read from a non-existent position produces EOF
    - For an output file, fseek allows forward jumps; positions where nothing was written contain \0
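The Unix behavior described above can be checked with a short experiment. This sketch assumes a Unix-like system; the helper names write_with_gap and byte_at are hypothetical.

```c
#include <stdio.h>

/* Writes 'A', jumps `gap` bytes forward without writing, then writes 'B'.
   Returns 0 on success, -1 on failure. */
int write_with_gap(const char *path, long gap)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    putc('A', fp);
    if (fseek(fp, gap, SEEK_CUR) != 0) {  /* forward jump past the end of the data */
        fclose(fp);
        return -1;
    }
    putc('B', fp);
    return fclose(fp) == 0 ? 0 : -1;
}

/* Returns the byte at `offset` in `path`, or EOF if there is no byte there. */
int byte_at(const char *path, long offset)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return EOF;
    int c = EOF;
    if (fseek(fp, offset, SEEK_SET) == 0) /* succeeds even beyond the end of file */
        c = getc(fp);                     /* reading there yields EOF */
    fclose(fp);
    return c;
}
```

After write_with_gap(path, 3), the file is 5 bytes long: byte_at returns 'A' at offset 0, 0 (the \0 byte) at offsets 1 through 3, 'B' at offset 4, and EOF at any offset of 5 or more.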
Common Pitfalls with File Accesses
- Forgetting to close a file
  - The file descriptor is "leaked"
    - The operating system associates some resources with a file descriptor that is linked to an open file
    - Not closing the file means that the descriptor is not cleaned up, i.e., it persists until the program exits
  - Opening a file multiple times without closing it first
    - More and more descriptors are leaked
    - Eventually the operating system either refuses or is unable to create another descriptor
    - The call to fopen fails (it returns NULL)
- Accessing a file after it has been closed
  - The behavior is undefined; a segmentation fault is a typical result
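One defensive pattern against both pitfalls, sketched under assumptions: always check the result of fopen, close every stream that was opened, and set the pointer to NULL after closing so that a later use can be detected. The helper names close_and_clear and open_close_repeatedly are hypothetical.

```c
#include <stdio.h>

/* Closes *fpp if it is open and sets it to NULL, so later code can
   test for a closed stream instead of using a stale pointer.
   Returns the result of fclose, or 0 if the stream was already closed. */
int close_and_clear(FILE **fpp)
{
    int rc = 0;
    if (*fpp != NULL) {
        rc = fclose(*fpp);
        *fpp = NULL;
    }
    return rc;
}

/* Opens and closes `path` `times` times. Because every open is paired
   with a close, no descriptors accumulate. Returns the number of
   successful opens. */
int open_close_repeatedly(const char *path, int times)
{
    int ok = 0;
    for (int i = 0; i < times; i++) {
        FILE *fp = fopen(path, "r");
        if (fp == NULL) {             /* fopen reports failure by returning NULL */
            perror("fopen");
            break;
        }
        ok++;
        close_and_clear(&fp);
    }
    return ok;
}
```

If the call to close_and_clear inside the loop were removed, a large enough value of times would eventually make fopen fail once the per-process limit on open descriptors is reached.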