File Structures

Chap 2. Fundamental File Processing Operations

Things you have to learn
  - Physical files and logical files
  - File processing operations: create, open, close, read, write, and seek
  - C++ input and output classes
  - Overloading in C++
  - Organization of hierarchical file systems
  - Unix view of a file

Physical Files and Logical Files
  - Physical file
      - A collection of bytes stored on a storage device (e.g., disk, tape)
  - Logical file
      - The file as seen by the program
      - Can be viewed as a channel (like a telephone line) that connects the program to a physical file
      - Logical files enable a program to perform operations on a file without knowing which physical file will be used
  - The operating system is responsible for making the hookup between a logical file and some physical file
  - The number of logical files that can be open at the same time is limited to a small number

Opening Files
  - Opening a file makes it ready for use by the program
  - It may also bind a logical file name to a physical file
  - After opening, we are positioned at the beginning of the file
  - Creating a file also opens it, in the sense that the file is ready for use after creation

Example: opening a file in C and C++
      fd = open(filename, flags [, pmode])
  - fd: int, the file descriptor (logical file identifier)
  - filename: char*, a string containing the physical file name
  - flags: int, controls how the file is opened (for example, whether an existing file is opened for reading or for writing)
  - pmode: int, the protection mode for the file when a file is created
  - The value of flags is set by performing a bitwise OR of the following values:
    O_APPEND, O_CREAT, O_EXCL, O_RDONLY, O_RDWR, O_TRUNC, and O_WRONLY
  - Specification of pmode: a three-digit octal number (r = read, w = write, e = execute)

                       owner    group    world
                       r w e    r w e    r w e
        pmode = 0751 = 1 1 1    1 0 1    0 0 1

  - Examples
      fd = open(filename, O_RDWR | O_CREAT, 0751)
      fd = open(filename, O_RDWR | O_CREAT | O_TRUNC, 0751)
      fd = open(filename, O_RDWR | O_CREAT | O_EXCL, 0751)
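
The open() examples above can be sketched as two small functions. This is only an illustration that assumes a POSIX system (the headers <fcntl.h> and <unistd.h>); the helper names create_exclusive and create_or_truncate are hypothetical and do not come from the slides. Note that the permissions actually given to a new file are pmode as modified by the process umask.

```cpp
#include <fcntl.h>     // open and the O_* flag constants
#include <unistd.h>    // close, unlink
#include <cerrno>      // errno, EEXIST

// Hypothetical helper (not from the slides): create a new file for
// reading and writing, failing if the file already exists.
// Returns the file descriptor, or -1 on error (open sets errno).
int create_exclusive(const char* filename) {
    // O_CREAT | O_EXCL: create the file, but fail if it is already there.
    // 0751: owner r w e, group r - e, world - - e (before the umask is applied).
    return open(filename, O_RDWR | O_CREAT | O_EXCL, 0751);
}

// Hypothetical helper: open for reading and writing, creating the file
// if needed and discarding any previous contents (O_TRUNC).
int create_or_truncate(const char* filename) {
    return open(filename, O_RDWR | O_CREAT | O_TRUNC, 0751);
}
```

A caller would check the returned descriptor against -1 before using it, and pass it to close() when done so that the descriptor becomes available for another file.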

Closing Files
  - When you close a file, the logical file name or file descriptor becomes available for use with another file
  - Closing a file that has been used for output also ensures that everything has been written to the file
      - Closing ensures that the buffer for that file has been flushed to the physical file
  - Files are usually closed automatically by the operating system when a program terminates normally
  - It is better to close a file as soon as possible, to protect it against data loss if the program is interrupted and to free up logical file names for reuse

Reading and Writing
  - Reading
      - Read data from a file and place it in a variable inside the program
          Read(Source_file, Destination_addr, Size)
      - Source_file: logical name of the file to be read
      - Destination_addr: first address of the memory block where we want to store the data
      - Size: number of bytes to be read
  - Writing
      - Write data from a variable inside the program into a file
          Write(Destination_file, Source_addr, Size)
      - Destination_file: logical name of the file to be written
      - Source_addr: first address of the memory block where the data is stored
      - Size: number of bytes to be written

Stream: a sequence of bytes
  - C Streams
      - Use the standard C functions in stdio.h
      - stdio.h contains definitions of the types and the operations on C streams
      - I/O functions: fread, fgetc, fgets, fwrite, fputc, fputs
      - Formatted I/O functions: fscanf, fprintf
  - C++ Stream Classes
      - Use the stream classes of iostream.h and fstream.h (<iostream> and <fstream> in standard C++)
      - fstream: class for access to files, with methods open, read, and write, among others
      - >> (extraction) and << (insertion): overloaded for input and output

How to open C Streams
      FILE *outfile;
      outfile = fopen("myfile.txt", "w");
  - The first argument is the physical file name
  - The second argument is the mode, which determines the way the file is opened
      "r"  : open an existing file for input (reading)
      "w"  : create a new file, or truncate an existing one, for output
      "a"  : create a new file, or append to an existing one, for output
      "r+" : open an existing file for input and output
      "w+" : create a new file, or truncate an existing one, for input and output
      "a+" : create a new file, or append to an existing one, for input and output
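
The difference between the "w", "a", and "r" modes can be illustrated with a minimal sketch. The helper names write_text and read_text are hypothetical (they are not part of the slides); the only library calls used are fopen, fwrite, fread, and fclose from the C stream library.

```cpp
#include <cstdio>
#include <cstddef>
#include <string>

// Hypothetical helper: write `text` to `path` using the given fopen mode
// ("w" truncates an existing file, "a" appends to it).
// Returns true only if every byte was written and the close succeeded.
bool write_text(const char* path, const char* mode, const std::string& text) {
    FILE* f = std::fopen(path, mode);
    if (f == nullptr) return false;            // the file could not be opened
    std::size_t n = std::fwrite(text.data(), 1, text.size(), f);
    // fclose flushes the stream buffer to the physical file.
    bool closed = (std::fclose(f) == 0);
    return closed && n == text.size();
}

// Hypothetical helper: read the whole file through a stream opened with "r".
// Returns an empty string if the file does not exist.
std::string read_text(const char* path) {
    std::string result;
    FILE* f = std::fopen(path, "r");           // "r" fails if the file is missing
    if (f == nullptr) return result;
    char buf[256];
    std::size_t n;
    while ((n = std::fread(buf, 1, sizeof buf, f)) > 0)
        result.append(buf, n);                 // keep only the bytes actually read
    std::fclose(f);
    return result;
}
```

Writing "abc" with mode "w" and then "def" with mode "a" leaves "abcdef" in the file; a later write with mode "w" discards that content and starts over.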

How to open C++ Stream Classes
      fstream outfile;
      outfile.open("myfile.txt", ios::out);
  - The second argument is an integer indicating the mode
  - Its value is set as a bitwise OR of constants defined in the class ios:
      ios::in         open for input
      ios::out        open for output
      ios::app        open for append
      ios::trunc      truncate an existing file, so that the file always starts empty
      ios::nocreate   fail if the file doesn't exist (pre-standard iostream library)
      ios::noreplace  create a new file, but fail if it already exists (pre-standard iostream library)
      ios::binary     open in binary mode (rather than text mode)

Read/Write example using C streams

    // listc.cpp
    // program using C streams to read characters from a file
    // and write them to the terminal screen
    #include <stdio.h>
    #include <string.h>

    int main() {
        char ch;
        FILE * file;                                  // pointer to the C stream (logical file)
        char filename[20];
        printf("Enter the name of the file: ");       // Step 1
        if (fgets(filename, sizeof filename, stdin) == NULL)  // Step 2 (fgets instead of the unsafe gets)
            return 1;
        filename[strcspn(filename, "\n")] = '\0';     // strip the trailing newline
        file = fopen(filename, "r");                  // Step 3
        if (file == NULL) return 1;                   // the file could not be opened
        while (fread(&ch, 1, 1, file) != 0)           // Step 4a
            fwrite(&ch, 1, 1, stdout);                // Step 4b
        fclose(file);                                 // Step 5
        return 0;
    }

Read/Write example using C++ stream classes

    // listcpp.cpp
    // list contents of file using C++ stream classes
    #include <fstream>      // <fstream.h> in pre-standard C++
    #include <iomanip>
    #include <iostream>
    using namespace std;

    int main() {
        char ch;
        fstream file;                                 // declare unattached fstream
        char filename[20];
        cout << "Enter the name of the file: "        // Step 1
             << flush;                                // force output
        cin >> setw(sizeof filename) >> filename;     // Step 2 (setw limits input to the buffer size)
        file.open(filename, ios::in);                 // Step 3
        file.unsetf(ios::skipws);                     // include white space in read
        while (1) {
            file >> ch;                               // Step 4a
            if (file.fail()) break;                   // end-of-file or error
            cout << ch;                               // Step 4b
        }
        file.close();                                 // Step 5
        return 0;
    }

Detecting end-of-file
  - C streams: fread returns the number of elements read, so a return value of 0 means nothing more could be read
  - C++ streams: use the fail function to check the status of the previous operation
  - Ada: use the function end_of_file before trying to read the next byte
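
The two end-of-file tests for C streams and C++ streams can be put side by side in a minimal sketch. The function names count_bytes_c and count_bytes_cpp are hypothetical, and the sketch uses the standard headers <cstdio> and <fstream> rather than the pre-standard ones named on the slides.

```cpp
#include <cstdio>
#include <fstream>

// C streams: fread returns the number of elements actually read,
// so a return value of 0 signals end-of-file (or an error).
// Returns the number of bytes in the file, or -1 if it cannot be opened.
long count_bytes_c(const char* path) {
    FILE* f = std::fopen(path, "rb");
    if (f == nullptr) return -1;
    long count = 0;
    char ch;
    while (std::fread(&ch, 1, 1, f) != 0) ++count;
    std::fclose(f);
    return count;
}

// C++ streams: attempt the read first, then ask fail() whether it worked.
// Returns the number of bytes in the file, or -1 if it cannot be opened.
long count_bytes_cpp(const char* path) {
    std::ifstream file(path, std::ios::in | std::ios::binary);
    if (!file.is_open()) return -1;
    file.unsetf(std::ios::skipws);    // count white space too
    long count = 0;
    char ch;
    while (true) {
        file >> ch;
        if (file.fail()) break;       // the previous read found nothing
        ++count;
    }
    return count;
}
```

In both versions the test comes after an attempted read, unlike the Ada style, where end_of_file is checked before reading.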

Seeking
  - Action of moving directly to a certain position in a file
  - Seeking with C Streams
        status = fseek(file, byte_offset, origin)
      - fseek returns 0 on success and a nonzero value on failure; ftell(file) reports the current position
      - Values of origin: SEEK_SET, SEEK_CUR, SEEK_END
  - Seeking with C++ Stream Classes
      - Two file pointers: get pointer and put pointer
      - Two functions for seeking: seekg and seekp
        file.seekg(byte_offset, origin)
      - Values of origin: ios::beg, ios::cur, ios::end

Special Characters in Files
  - I/O support packages may add or delete some characters during their I/O processing (automatic modification)
  - Usually associated with the concepts of a line of text or the end of a file
  - Examples
      - Control-Z (ASCII value 26) is appended at the end of files (MS-DOS systems)
      - A single CR (carriage return, ASCII value 13) or LF (line feed, ASCII value 10) is automatically expanded into a CR-LF pair
          - Opening the file in binary mode turns off this translation
      - CR characters are removed and replaced with a count of the characters in the preceding line (VMS systems)

The Unix Directory Structure
  - A tree-structured organization with two kinds of files: regular files (programs and data) and directories
  - Devices such as tape or disk drives are also files (in the /dev directory)
  - "/" indicates the root directory and also separates directory names from the file name
  - Absolute pathnames and relative pathnames are used for file identification
  - Current directory: .
  - Parent directory: ..

Sample Unix Directory Structure
  (figure)
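
Returning to the seek operations covered earlier in this section, the following minimal sketch shows one seek with each library. The helper names file_size_c and byte_at_cpp are hypothetical; files are opened in binary mode so that no character translation affects the byte offsets.

```cpp
#include <cstdio>
#include <fstream>

// C streams: fseek returns 0 on success; ftell then reports the position.
// Hypothetical helper: return the file size in bytes, or -1 on error.
long file_size_c(const char* path) {
    FILE* f = std::fopen(path, "rb");
    if (f == nullptr) return -1;
    long size = -1;
    if (std::fseek(f, 0L, SEEK_END) == 0)   // move to the end of the file
        size = std::ftell(f);               // offset from the beginning
    std::fclose(f);
    return size;
}

// C++ streams: seekg moves the get pointer, then one byte is read there.
// Hypothetical helper: return the byte at `offset`, or -1 on failure.
int byte_at_cpp(const char* path, long offset) {
    std::ifstream file(path, std::ios::in | std::ios::binary);
    if (!file.is_open()) return -1;
    file.seekg(offset, std::ios::beg);      // position relative to the start
    char ch;
    if (!file.get(ch)) return -1;           // past end-of-file, or seek failed
    return static_cast<unsigned char>(ch);
}
```

For a file containing the ten characters 0123456789, file_size_c yields 10 and byte_at_cpp with offset 3 yields the character '3'; an offset of 10 points just past the last byte, so the read fails.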

Physical Devices as Files (in UNIX)
  - Unix view of a file: a sequence of bytes (=> very few operations)
  - Magnetic disks and devices like the keyboard and the console are also files (/dev/kbd, /dev/console)
  - Each is represented logically by an integer (file descriptor)
  - The same file operations apply to all of them

The Console, the Keyboard, and Standard Error
  - Logical file names associated with standard I/O devices

      purpose           default meaning    in C streams    in C++ streams
      standard input    keyboard           stdin           cin
      standard output   console/screen     stdout          cout
      standard error    console/screen     stderr          cerr

  - These streams don't need to be opened or closed in the program

I/O Redirection and Pipes
  - Convenient shortcuts for switching between standard I/O (stdin and stdout) and regular file I/O
  - The default meanings of stdin and stdout are changed
  - I/O redirection: specifying alternate files for input or output at execution time
        date > myfile
        sort < data_file
  - Pipe: carrying data from one process to another
        type large_file | more

File-Related Header Files
  - Contain special names and values that you must use when performing file operations
  - C streams: stdio.h
      - EOF, stdin, stdout, stderr
  - C++ streams: iostream.h and fstream.h
      - Class definitions
  - Unix operations: fcntl.h and file.h
      - O_RDONLY, O_WRONLY, O_RDWR
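
The point that standard output, standard error, and regular files all accept the same operations can be sketched with a single function that writes to whichever C stream it is handed. The name report is hypothetical and is not part of the slides.

```cpp
#include <cstdio>

// Hypothetical helper: write one line to any open C stream.
// The same fprintf call works for stdout, stderr, or a regular file.
// Returns the number of characters written, or a negative value on error.
int report(FILE* out, const char* message) {
    return std::fprintf(out, "%s\n", message);
}
```

A program might call report(stdout, "result") for normal output and report(stderr, "warning") for diagnostics. If that program's standard output is redirected to a file with >, only the first message goes to the file; the second still goes to wherever standard error points, which by default is the console.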