File System
CS170 Discussion, Week 9
*Some slides taken from the textbook authors' presentation
File-System Structure
- File structure
  - Logical storage unit
  - Collection of related information
- The file system resides on secondary storage (disks).
- The file system is organized into layers.
- File control block (FCB): a storage structure consisting of information about a file; called an inode in UNIX/Linux.
11.2 Silberschatz, Galvin and Gagne 2005
[Figure: A Typical File Control Block]
[Figure: Layered File System]
File System Layers
- Application programs
- Logical file system
  - Manages metadata, which includes all information about the file-system structure.
  - Manages the directory structure to give the file-organization module the information it needs, given a symbolic file name.
  - Maintains file structure via file control blocks (FCBs).
  - Is also responsible for protection and security.
- File-organization module
  - Knows about files and their logical and physical blocks.
  - Translates logical block addresses to physical block addresses for the basic file system to transfer.
  - Includes a free-space manager, which keeps track of unallocated blocks.
- Basic file system
  - Needs only to issue generic commands to the appropriate device drivers to read and write physical blocks on the disk.
  - Each physical block is identified by its disk address (e.g., drive 3, cylinder 10, track 2, sector 20).
- I/O control
  - Contains the device drivers and interrupt handlers that transfer information between main memory and the disk.
File System Implementation
- Several on-disk and in-memory structures are generally used to implement a file system.
- A sector is the smallest user-accessible portion of the disk: usually 512 bytes on magnetic disks and 2048 bytes on optical disks. The term "block" was used earlier for what is now commonly called a sector.
- A file block usually consists of several (a power of 2) contiguous disk blocks. The file block size is usually a configurable parameter.
- On-disk structures
  - Boot control block: contains information needed by the system to boot an operating system from that partition.
  - Volume control block: contains volume (partition) details such as the number of blocks in the partition, the free-block count and free-block pointers, and the free-FCB count and FCB pointers. In UNIX file systems it is called the superblock; in NTFS this information is stored in the Master File Table.
  - Directory structure: used to organize files.
  - FCB: contains the file's details such as owner, permissions, and location of data blocks. In UFS it is called an inode.
File System Implementation (In-Memory Structures)
In-memory information is used for file system management as well as for performance improvement:
- Mount table: contains information about each mounted partition.
- Directory structure: holds directory information of recently accessed directories.
- System-wide open-file table: contains a copy of the FCB of each open file, as well as other information.
- Per-process open-file table: contains a pointer to the appropriate entry in the system-wide open-file table.
File System Implementation (Create and Open)
To illustrate the use of these structures, consider how create and open work.
- Creating a file: the application program calls the logical file system to create a file. The logical file system allocates a new FCB, reads the appropriate directory into memory, updates it with the new file name and FCB, and writes it back to disk.
- Opening a file: the open call passes a file name to the file system. The file system searches the directory structure for the given file name and copies the file's FCB into the system-wide open-file table in memory.
- An entry is also made in the per-process open-file table, with a pointer to the entry in the system-wide open-file table and some other fields, such as a pointer to the current location in the file and the access mode in which the file is open.
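The open path described above can be sketched in C as follows. This is only an illustration under stated assumptions: the reduced `fcb` layout, the table sizes, and the names `fs_state`, `sys_open_file`, `proc_open_file`, and `sketch_open` are all hypothetical, and a small in-memory array stands in for the on-disk directory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILES 8   /* assumed directory capacity */
#define MAX_OPEN  4   /* assumed size of each open-file table */

struct fcb {                      /* hypothetical, heavily reduced FCB */
    char name[16];
    int  size;
    int  in_use;
};

struct sys_open_file {            /* system-wide open-file table entry */
    struct fcb fcb;               /* in-memory copy of the FCB */
    int open_count;               /* 0 means the slot is free */
};

struct proc_open_file {           /* per-process open-file table entry */
    struct sys_open_file *sys;    /* NULL means the slot is free */
    long offset;                  /* current location in the file */
    int  mode;                    /* access mode the file was opened with */
};

struct fs_state {
    struct fcb            directory[MAX_FILES];
    struct sys_open_file  sys_table[MAX_OPEN];
    struct proc_open_file proc_table[MAX_OPEN];
};

/* Returns an index into the per-process table, or -1 on failure. */
int sketch_open(struct fs_state *fs, const char *name, int mode)
{
    struct fcb *found = NULL;
    struct sys_open_file *sys = NULL;
    int i;

    assert(fs != NULL && name != NULL);

    /* 1. Search the directory for the file name. */
    for (i = 0; i < MAX_FILES && found == NULL; i++)
        if (fs->directory[i].in_use && strcmp(fs->directory[i].name, name) == 0)
            found = &fs->directory[i];
    if (found == NULL)
        return -1;

    /* 2. Reuse the system-wide entry if the file is already open ... */
    for (i = 0; i < MAX_OPEN && sys == NULL; i++)
        if (fs->sys_table[i].open_count > 0 &&
            strcmp(fs->sys_table[i].fcb.name, name) == 0)
            sys = &fs->sys_table[i];
    /* ... otherwise copy the FCB into a free system-wide slot. */
    for (i = 0; i < MAX_OPEN && sys == NULL; i++)
        if (fs->sys_table[i].open_count == 0) {
            sys = &fs->sys_table[i];
            sys->fcb = *found;
        }
    if (sys == NULL)
        return -1;

    /* 3. Make a per-process entry that points at the system-wide entry. */
    for (i = 0; i < MAX_OPEN; i++)
        if (fs->proc_table[i].sys == NULL) {
            fs->proc_table[i].sys = sys;
            fs->proc_table[i].offset = 0;
            fs->proc_table[i].mode = mode;
            sys->open_count++;
            return i;
        }
    return -1;
}
```

Opening the same file twice yields two per-process entries that share one system-wide entry, which is why the current offset lives in the per-process table while the FCB copy is kept only once.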
Directory Implementation Methods
- Linear list of file names with pointers to the data blocks
  - Simple to program
  - Time-consuming to execute: for example, to create a file, the entire list must be searched to check whether a file with the same name already exists.
  - To delete a file, the directory must be searched as well.
- Hash table: a linear list combined with a hash data structure
  - Decreases directory search time
  - Collisions: situations where two file names hash to the same location
  - A chained-overflow hash table can be used to overcome the fixed table size, but lookups may then have to step through a linked list of colliding entries.
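A chained-overflow hash directory might look like the sketch below. The bucket count, entry pool size, hash function, and all names (`hash_dir`, `dir_create`, `dir_lookup`) are assumptions made for illustration; colliding names are linked through the `next` field, and `dir_create` rejects duplicates by doing a lookup first.

```c
#include <assert.h>
#include <string.h>

#define NBUCKETS    8    /* assumed number of hash buckets */
#define MAX_ENTRIES 32   /* assumed directory capacity */
#define NAME_LEN    16   /* assumed maximum name length, including '\0' */

struct dir_entry {
    char name[NAME_LEN];
    int  fcb_index;      /* which FCB describes this file (must be >= 0) */
    int  next;           /* next entry in the same bucket's chain, or -1 */
};

struct hash_dir {
    int bucket[NBUCKETS];                 /* head of each chain, -1 if empty */
    struct dir_entry entries[MAX_ENTRIES];
    int used;                             /* number of entries in use */
};

void dir_init(struct hash_dir *d)
{
    assert(d != NULL);
    for (int i = 0; i < NBUCKETS; i++)
        d->bucket[i] = -1;
    d->used = 0;
}

/* Simple multiplicative string hash, reduced to a bucket number. */
static unsigned hash_name(const char *name)
{
    unsigned h = 5381;
    while (*name)
        h = h * 33 + (unsigned char)*name++;
    return h % NBUCKETS;
}

/* Returns the FCB index for name, or -1 if the name is not present. */
int dir_lookup(const struct hash_dir *d, const char *name)
{
    assert(d != NULL && name != NULL);
    /* Only the chain of one bucket is walked, not the whole directory. */
    for (int i = d->bucket[hash_name(name)]; i != -1; i = d->entries[i].next)
        if (strcmp(d->entries[i].name, name) == 0)
            return d->entries[i].fcb_index;
    return -1;
}

/* Returns 0 on success, -1 if the name is too long, a duplicate, or the
   directory is full. */
int dir_create(struct hash_dir *d, const char *name, int fcb_index)
{
    assert(d != NULL && name != NULL);
    if (strlen(name) >= NAME_LEN || d->used == MAX_ENTRIES)
        return -1;
    if (dir_lookup(d, name) != -1)
        return -1;
    unsigned b = hash_name(name);
    struct dir_entry *e = &d->entries[d->used];
    strcpy(e->name, name);
    e->fcb_index = fcb_index;
    e->next = d->bucket[b];      /* push onto the front of the chain */
    d->bucket[b] = d->used++;
    return 0;
}
```

With a linear list the duplicate check in `dir_create` would scan every entry; here it scans only the entries that collide in one bucket.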
Allocation Methods
An allocation method refers to how disk blocks are allocated for files:
- Contiguous allocation
- Linked allocation
- Indexed allocation
Contiguous Allocation
- Each file occupies a set of contiguous blocks on the disk.
- Simple: only the starting location (block number) and length (number of blocks) are required.
- Random access is easy.
- Wastes space: external and internal fragmentation.
- Files cannot grow beyond their allocated space without being moved.
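Why random access is easy under contiguous allocation can be shown with a few lines of C: the physical block is just the start block plus the logical block number. The struct and function names here are hypothetical, and the block size is passed in as a parameter rather than taken from any particular system.

```c
#include <assert.h>
#include <stddef.h>

/* Under contiguous allocation, a directory entry only needs these two fields. */
struct contig_file {
    int start;    /* first physical block of the file */
    int length;   /* number of blocks allocated to the file */
};

/* Returns the physical block holding byte `offset`, or -1 if the offset
   lies outside the file's allocated blocks. */
int contig_block_for_offset(const struct contig_file *f, long offset, int block_size)
{
    assert(f != NULL && block_size > 0);
    if (offset < 0)
        return -1;
    long logical = offset / block_size;   /* logical block number */
    if (logical >= f->length)
        return -1;
    return f->start + (int)logical;       /* one addition, no disk reads */
}

/* Internal fragmentation: unused bytes in the last block of a file. */
long contig_internal_waste(long file_size, int block_size)
{
    assert(file_size >= 0 && block_size > 0);
    return (block_size - file_size % block_size) % block_size;
}
```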
[Figure: Contiguous Allocation of Disk Space]
Extent-Based Systems
- Many newer file systems (e.g., Veritas File System) use a modified contiguous allocation scheme.
- Extent-based file systems allocate disk blocks in extents.
- An extent is a contiguous set of blocks on the disk.
- A file is initially allocated one or more extents. As the need arises, more extents are allocated dynamically.
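Mapping a logical block to a physical block in an extent-based file can be sketched as a walk over the file's extent list. The `extent` struct and `extent_lookup` are hypothetical names, not taken from any real file system.

```c
#include <assert.h>
#include <stddef.h>

/* An extent is a run of contiguous blocks: first block plus block count. */
struct extent {
    int start;
    int count;
};

/* Returns the physical block for logical block `logical` of a file made of
   `n_extents` extents, or -1 if the file has no such block. */
int extent_lookup(const struct extent *ext, int n_extents, int logical)
{
    assert(ext != NULL || n_extents == 0);
    if (logical < 0)
        return -1;
    for (int i = 0; i < n_extents; i++) {
        if (logical < ext[i].count)
            return ext[i].start + logical;   /* falls inside this extent */
        logical -= ext[i].count;             /* skip past this extent */
    }
    return -1;
}
```

Growing the file means appending one more `extent` to the list, so the file no longer has to fit into a single contiguous hole.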
Linked Allocation
- Each file is a linked list of disk blocks; the blocks may be scattered anywhere on the disk.
- Each block contains a pointer to the next block of the file.
Linked Allocation (Cont.)
- Simple: only the starting address is needed.
- The size of a file need not be declared at the time of creation.
- Free-space management system: no waste of space.
- No external fragmentation.
- Effective for sequential access, but no efficient random access.
[Figure: Linked Allocation]
Advantages and Disadvantages of Linked Allocation
- It can be used effectively only for sequential access.
- Random access is expensive: to find the i-th block, we must start at the beginning of the file and follow the pointers until we reach the i-th block.
- Pointers take space: if a pointer needs 4 bytes out of a 512-byte block, 0.78 percent of the disk is used for pointers rather than data.
- Pointers could be lost or damaged. A bug in the OS or a disk hardware failure might result in picking up the wrong pointer.
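Both costs above can be made concrete with a small simulation. In this sketch the disk is an array in which `next[b]` holds the pointer stored in block `b`; the names `linked_nth_block` and `pointer_overhead_percent` and the `NIL` marker are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NIL (-1)   /* assumed end-of-chain marker */

/* next[b] is the pointer stored in block b (data payload omitted).
   Returns the i-th block of the file (i = 0 is the first block), or NIL
   if the chain is too short or runs into an out-of-range pointer. */
int linked_nth_block(const int *next, int n_blocks, int first, int i)
{
    assert(next != NULL);
    if (i < 0)
        return NIL;
    int b = first;
    for (int hops = 0; hops < i; hops++) {   /* one pointer read per hop */
        if (b < 0 || b >= n_blocks)
            return NIL;
        b = next[b];
    }
    return (b >= 0 && b < n_blocks) ? b : NIL;
}

/* Share of each block spent on the pointer, as a percentage. */
double pointer_overhead_percent(int pointer_bytes, int block_bytes)
{
    assert(block_bytes > 0);
    return 100.0 * pointer_bytes / block_bytes;
}
```

Reaching block `i` costs `i` pointer reads, each of which would be a disk access on a real disk, and `pointer_overhead_percent(4, 512)` evaluates to 0.78125, the figure the slide rounds to 0.78 percent.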
File Allocation Table (FAT) Method
- This is a variation of linked allocation used in MS-DOS and OS/2.
- A section of the disk at the beginning of each partition is set aside for the table.
- The table has one entry for each block and is indexed by block number.
- The directory entry contains the block number of the first block of the file.
- The table entry indexed by that block number contains the block number of the next block of the file.
- The chain continues until the last block, whose table entry holds a special end-of-file value.
- Unused blocks are indicated by a table value of 0.
- Allocating a new block involves finding the first 0-valued table entry, replacing the file's previous end-of-file value with the address of the new block, and marking the new block's entry as end-of-file.
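The FAT steps above can be sketched with the table held as an `int` array. The marker values and function names are assumptions for illustration; in particular, because 0 means "free", this sketch reserves block 0 so that 0 never has to appear as a next-block number.

```c
#include <assert.h>
#include <stddef.h>

#define FAT_FREE 0      /* table value for an unused block (from the slide) */
#define FAT_EOF  (-1)   /* assumed end-of-file marker */

/* Claims the first free block (block 0 is reserved in this sketch), marks it
   as the end of a chain, and returns its number, or -1 if the disk is full. */
int fat_alloc(int *fat, int n)
{
    assert(fat != NULL);
    for (int b = 1; b < n; b++)
        if (fat[b] == FAT_FREE) {
            fat[b] = FAT_EOF;
            return b;
        }
    return -1;
}

/* Appends one block to the file whose chain starts at `first`.
   Returns the new block number, or -1 if no block is free. */
int fat_append(int *fat, int n, int first)
{
    assert(fat != NULL && first > 0 && first < n);
    assert(fat[first] != FAT_FREE);       /* `first` must already be allocated */
    int last = first;
    while (fat[last] != FAT_EOF)          /* walk to the end of the chain */
        last = fat[last];
    int nb = fat_alloc(fat, n);
    if (nb != -1)
        fat[last] = nb;                   /* old end-of-file entry now links on */
    return nb;
}

/* Returns the i-th block of the file (i = 0 is the first block), or FAT_EOF
   if the file has fewer blocks. */
int fat_nth(const int *fat, int first, int i)
{
    assert(fat != NULL);
    int b = first;
    while (i-- > 0 && b != FAT_EOF)
        b = fat[b];
    return b;
}
```

Because the whole table can be cached in memory, following the chain in `fat_nth` needs no disk reads, which is the main improvement over plain linked allocation.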
[Figure: File-Allocation Table]
Indexed Allocation
- Brings all pointers together into one location: the index block.
- This solves the random-access problem of linked allocation.
[Figure: logical view of the index table]
[Figure: Example of Indexed Allocation]
Indexed Allocation (Cont.)
- Needs an index table.
- Random access is easy.
- Dynamic allocation without external fragmentation, but with the overhead of the index block.
Indexed Allocation: Two-Level Indexing
[Figure: an outer index points to index tables, which point to the blocks of the file]
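Single-level and two-level index lookups reduce to simple arithmetic, as the sketch below shows. The 512-byte block and 4-byte pointer are assumed values chosen for the example, and all macro and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define IDX_BLOCK_SIZE 512   /* assumed block size in bytes */
#define IDX_PTR_SIZE   4     /* assumed size of one block pointer in bytes */
#define PTRS_PER_BLOCK (IDX_BLOCK_SIZE / IDX_PTR_SIZE)   /* 128 */

/* Single level: logical block n is simply entry n of the index block.
   Returns -1 when n is beyond what one index block can address. */
int index_lookup(const int *index_block, int logical)
{
    assert(index_block != NULL);
    if (logical < 0 || logical >= PTRS_PER_BLOCK)
        return -1;
    return index_block[logical];
}

/* Two levels: the outer index selects an index table, the inner slot selects
   the data block within that table. Returns 0 on success, -1 if out of range. */
int two_level_split(long logical, int *outer, int *inner)
{
    assert(outer != NULL && inner != NULL);
    if (logical < 0 || logical >= (long)PTRS_PER_BLOCK * PTRS_PER_BLOCK)
        return -1;
    *outer = (int)(logical / PTRS_PER_BLOCK);
    *inner = (int)(logical % PTRS_PER_BLOCK);
    return 0;
}

/* Largest file, in bytes, addressable with the given number of index levels. */
long max_index_file_bytes(int levels)
{
    assert(levels >= 1 && levels <= 2);
    long blocks = 1;
    for (int i = 0; i < levels; i++)
        blocks *= PTRS_PER_BLOCK;
    return blocks * IDX_BLOCK_SIZE;
}
```

Under these assumptions one index block addresses at most 128 data blocks (64 KB), which is why larger files need a second level.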
Combined Scheme: UNIX (4K bytes per block)
- In the UNIX file system, the first (say) 15 pointers of the index block are kept in the file's inode.
- The first 12 point to direct blocks.
- The next three point to indirect blocks:
  - The first points to a single indirect block.
  - The second points to a double indirect block.
  - The third points to a triple indirect block.
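To see how far each kind of pointer reaches, the sketch below classifies a logical block number by the level of indirection needed to find it. The 4 KB block size and the 12 direct pointers come from the slide; the 4-byte pointer size is an assumption, and the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define UNIX_BLOCK_SIZE 4096u   /* from the slide: 4K bytes per block */
#define UNIX_PTR_SIZE   4u      /* assumption: 4-byte block pointers */
#define UNIX_PTRS       (UNIX_BLOCK_SIZE / UNIX_PTR_SIZE)   /* 1024 per block */
#define UNIX_N_DIRECT   12u     /* from the slide: 12 direct pointers */

/* Returns 0 if logical block n is reached through a direct pointer, 1, 2, or 3
   for single, double, or triple indirection, and -1 if n is too large. */
int inode_level_for_block(uint64_t n)
{
    uint64_t span = UNIX_PTRS;   /* blocks reachable at the current level */
    if (n < UNIX_N_DIRECT)
        return 0;
    n -= UNIX_N_DIRECT;
    for (int level = 1; level <= 3; level++) {
        if (n < span)
            return level;
        n -= span;
        span *= UNIX_PTRS;       /* each extra level multiplies the reach */
    }
    return -1;
}

/* Largest file size in bytes under the assumptions above. */
uint64_t inode_max_file_bytes(void)
{
    uint64_t p = UNIX_PTRS;
    assert(UNIX_BLOCK_SIZE % UNIX_PTR_SIZE == 0);
    return (UNIX_N_DIRECT + p + p * p + p * p * p) * UNIX_BLOCK_SIZE;
}
```

With these numbers the direct pointers cover the first 48 KB of a file, so small files never touch an indirect block, while the triple indirect pointer pushes the maximum file size to roughly 4 TB.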
[Figure: Combined Scheme: UNIX (4K bytes per block)]
Project 5 Implementation
- Implement a simple file system on top of a virtual disk library; your file system offers a set of basic file system calls.
- The file data and file system meta-information will be stored on a virtual disk.
Virtual Disk Access (code provided to you)
    #define DISK_BLOCKS 8192
    #define BLOCK_SIZE  4096
    int make_disk(char *name);              // create an empty, virtual disk file
    int open_disk(char *name);              // open a virtual disk (file)
    int close_disk();                       // close a previously opened disk (file)
    int block_write(int block, char *buf);  // write a block of size BLOCK_SIZE to disk
    int block_read(int block, char *buf);   // read a block of size BLOCK_SIZE from disk
Block Size
- The virtual disk has 8,192 blocks of 4 KB each.
- 4,096 blocks are reserved as data blocks.
- The remaining 4,096 blocks can be used for meta-information.
File System Basic Functions: make_fs
    int make_fs(char *disk_name);
- Creates a fresh (and empty) file system on the virtual disk named disk_name.
- As part of this function, you should first invoke make_disk(disk_name) to create a new disk.
- Then open this disk and write/initialize the meta-information your file system needs so that it can later be used (mounted).
- Returns 0 on success, and -1 when the disk disk_name could not be created, opened, or properly initialized.
File System Basic Functions: mount_fs
    int mount_fs(char *disk_name);
- Mounts a file system that is stored on the virtual disk named disk_name. With the mount operation, a file system becomes "ready for use."
- You need to open the disk and then load the meta-information that is necessary to handle the file system operations.
- Returns 0 on success, and -1 when the disk disk_name could not be opened or when the disk does not contain a valid file system (one that you previously created with make_fs).
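One possible way for mount_fs to recognize a file system written by make_fs is to keep a magic number in a superblock stored in a fixed block. The sketch below shows only the packing and checking of such a superblock in a BLOCK_SIZE buffer. The `superblock` layout, the magic value, and the helper names `sb_to_block` and `sb_from_block` are hypothetical and not part of the project handout; only BLOCK_SIZE comes from the provided header.

```c
#include <assert.h>
#include <string.h>

#define BLOCK_SIZE   4096          /* matches the provided disk library */
#define SKETCH_MAGIC 0x43533137u   /* arbitrary, hypothetical signature */

/* Hypothetical superblock: where each region of the disk starts. */
struct superblock {
    unsigned int magic;   /* identifies a disk formatted by make_fs */
    int dir_start;        /* first block of the root directory */
    int fat_start;        /* first block of the allocation table */
    int data_start;       /* first data block */
};

/* Packs the superblock into a zero-filled buffer of BLOCK_SIZE bytes. */
void sb_to_block(const struct superblock *sb, char *buf)
{
    assert(sb != NULL && buf != NULL);
    assert(sizeof *sb <= BLOCK_SIZE);
    memset(buf, 0, BLOCK_SIZE);
    memcpy(buf, sb, sizeof *sb);
}

/* Unpacks a block into *sb. Returns 0 if the magic number matches,
   -1 if the block does not look like one of our superblocks. */
int sb_from_block(const char *buf, struct superblock *sb)
{
    assert(sb != NULL && buf != NULL);
    memcpy(sb, buf, sizeof *sb);
    return sb->magic == SKETCH_MAGIC ? 0 : -1;
}
```

In this design make_fs would call `sb_to_block` and then `block_write(0, buf)`, while mount_fs would call `block_read(0, buf)` and return -1 whenever `sb_from_block` fails. Copying the raw struct is not portable across machines with different byte order or padding, which is acceptable only because the same program writes and reads the virtual disk.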
File System Basic Functions: umount_fs
    int umount_fs(char *disk_name);
- Unmounts your file system from the virtual disk named disk_name.
- As part of this operation, you need to write back all meta-information so that the disk persistently reflects all changes that were made to the file system.
- You should also close the disk.
- Returns 0 on success, and -1 when the disk disk_name could not be closed or when data could not be written to the disk (this should not happen).
File System Operations
1. int fs_open(char *name);
2. int fs_close(int fildes);
3. int fs_create(char *name);
4. int fs_delete(char *name);
5. int fs_read(int fildes, void *buf, size_t nbyte);
6. int fs_write(int fildes, void *buf, size_t nbyte);
7. int fs_get_filesize(int fildes);
8. int fs_truncate(int fildes, off_t length);
Design
- File Allocation Table (FAT) based or UNIX (inode) based design
- Simple vs. complex
DOS FAT
[Figure: File Allocation Table example. The directory maps each file to its first block (A: 6, B: 1); the table entries chain each file's remaining blocks up to an "end" marker, and unused entries are marked "free".]
Implementation Details
Data structures are needed for:
- the super block
- a root directory
- information about free (unallocated) blocks on disk
- file meta-information (such as file size)
- a mapping from files to data blocks
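The free-block information listed above could be kept in a bitmap, one bit per data block. This is a sketch of one option, not a required design: the names `free_map`, `free_map_alloc`, and `free_map_release` are hypothetical, and the count of 4,096 data blocks follows the layout given earlier in these slides.

```c
#include <assert.h>
#include <string.h>

#define DATA_BLOCKS 4096   /* number of data blocks in the project's layout */

/* One bit per data block: 1 = in use, 0 = free. 4096 bits = 512 bytes,
   so the whole map fits in a single 4 KB disk block. */
struct free_map {
    unsigned char bits[DATA_BLOCKS / 8];
};

void free_map_init(struct free_map *m)
{
    assert(m != NULL);
    memset(m->bits, 0, sizeof m->bits);   /* every block starts out free */
}

/* Marks the lowest-numbered free block as used and returns its index,
   or -1 if no block is free. */
int free_map_alloc(struct free_map *m)
{
    assert(m != NULL);
    for (int b = 0; b < DATA_BLOCKS; b++)
        if (!(m->bits[b / 8] & (1u << (b % 8)))) {
            m->bits[b / 8] |= (unsigned char)(1u << (b % 8));
            return b;
        }
    return -1;
}

/* Marks block b as free again; out-of-range indices are ignored. */
void free_map_release(struct free_map *m, int b)
{
    assert(m != NULL);
    if (b >= 0 && b < DATA_BLOCKS)
        m->bits[b / 8] &= (unsigned char)~(1u << (b % 8));
}

/* Counts the blocks that are currently free. */
int free_map_count_free(const struct free_map *m)
{
    assert(m != NULL);
    int n = 0;
    for (int b = 0; b < DATA_BLOCKS; b++)
        if (!(m->bits[b / 8] & (1u << (b % 8))))
            n++;
    return n;
}
```

In a FAT-based design a separate bitmap is optional, since a table entry of 0 already marks a free block; the bitmap is more natural for an inode-based design.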
Questions?