CS370 Operating Systems Colorado State University Yashwant K Malaiya Fall 2017 Lecture 25 File Systems Slides based on Text by Silberschatz, Galvin, Gagne Various sources 1 1
FAQ Q 2
Data and Metadata Storage abstraction: File system metadata (size, free lists), File metadata (attributes, disk block maps), Data blocks 3
OS File Data Structures Open file table - shared by all processes with an open file. open count file attributes, including ownership, protection information, access times,... location(s) of file on disk pointers to location(s) of file in memory Per-process file descriptor table - for each file, pointer to entry in the open file table FD: int current position in file (offset) mode in which the process will access the file (r, w, rw) pointers to file buffer 4
In-Memory File System Structures Opening a file fopen( ) returns fid Reading a file Inode refers to an individual file 5
File System -variations Index structure: linked list/tree (fixed or dynamic) Granularity: block/extent Free space allocation Etc. File System Max File Size Max Partition Size Journaling Notes Fat32 4 GiB 8 TiB No Commonly supported ExFAT 128 PiB 128 PiB No Optimized for flash NTFS 2 TiB 256 TiB Yes For Windows Compatibility ext2 2 TiB 32 TiB No Legacy ext3 2 TiB 32 TiB Yes Standard linux filesystem for many years. ext4 16 TiB 1 EiB Yes Modern iteration of ext3. 6
File-System Implementation Based on several on-disk and in-memory structures. On-disk Boot control block (per volume) boot block in unix Volume control block (per volume) master file table in UNIX Directory structure (per file system) file names and pointers to corresponding FCBs File control block (per file) inode in unix In-memory Mount table about mounted volumes The open-file tables (system-wide and per process) Directory structure cache Buffers of the file-system blocks Volume: logical disk drive, perhaps a partition 7
On-disk File-System Structures 1. Boot control block contains info needed by system to boot OS from that volume Needed if volume contains OS, usually first block of volume Volume: logical disk drive, perhaps a partition 2. Volume control block (superblock ext or master file tablentfs) contains volume details Total # of blocks, # of free blocks, block size, free block pointers or array 3. Directory structure organizes the files File Names and inode numbers UFS, master file table NTFS Boot block Super block Directory, FCBs File data blocks 8
File-System Implementation (Cont.) 4. Per-file File Control Block (FCB or inode ) contains many details about the file Indexed using inode number; permissions, size, dates UFS (unix file system) master file table using relational DB structures NTFS 9
Create a file Allocates a new FCB. Update directory Reads the appropriate directory into memory, in unix a directory is a file with special type field up-dates it with the new file name and FCB, writes it back to the disk. 10
Directory Implementation Linear list of file names with pointer to the data blocks Simple to program Time-consuming to execute Linear search time Could keep ordered alphabetically via linked list or use B+ tree Hash Table linear list with hash data structure Decreases directory search time Collisions situations where two file names hash to the same location. use chained-overflow method Each hash entry can be a linked list instead of an individual value. 11
CS370 Operating Systems Colorado State University Yashwant K Malaiya Fall 2017 Lecture 25 File-system Implementation Slides based on Text by Silberschatz, Galvin, Gagne Various sources 12 12
15 Allocation Methods
Block Allocation Methods An allocation method refers to how disk blocks are allocated for files: Contiguous (not common, except for DVDs etc.) Linked (e.g. FAT32) Indexed (e.g. ex4) Thoughts about optimization: - Sequential vs random access - What is the common file size? 16
Allocation Methods i.contiguous i. Contiguous allocation each file occupies set of contiguous blocks Simple only starting location (block #) and length (number of blocks) are required Occupies n block b, b+1, b+n-1 Minimal disk head movement Problems include finding space for file, knowing file size, external fragmentation, need for compaction off-line (downtime) or on-line 17
Contiguous Allocation File tr: 3 blocks Starting at block 14 18
Contiguous Allocation Mapping logical byte address LA to physical address Q Assume block size =512 LA/512 R Block to be accessed = starting block number (address) + Q Displacement into block = R 19
Extent-Based Systems Some file systems use a modified contiguous allocation scheme Extent-based file systems allocate disk blocks in extents An extent is a contiguous set of blocks Metadata: beginning block, number of blocks Extents are allocated for file allocation A file consists of one or more extents 20
Allocation Methods - Linked ii. Linked allocation each file a linked list of blocks Each block contains pointer to next block. File ends at null pointer No external fragmentation, no compaction Free space management system called when new block needed Locating a block can take many I/Os and disk seeks. Improve efficiency by clustering blocks into groups but increases internal fragmentation Reliability can be a problem, since every block in a file is linked 21
Allocation Methods Linked (Cont.) FAT (File Allocation Table) variation Beginning of volume has table, indexed by block number Much like a linked list, but faster on disk and cacheable New block allocation simple Each FAT entry corresponds to the corresponding block of Storage. Free block entries are also linked. 22
Allocation Methods - Indexed Indexed allocation Each file has its own index block(s) of pointers to its data blocks Logical view Pointers to Data blocks index table 23
24 Example of Indexed Allocation
Indexed Allocation (Cont.) Need index table Random access Dynamic access without external fragmentation, but have overhead of index block even for a small file Assuming pointer size is 1 byte, block is 512 bytes 1 block for index table can be used for a file of maximum size of 512x512 = 256K bytes Larger files? Multi-block index table? 25
Indexed Allocation Two level Two-level index: Ex: 4K blocks could store 1,024 four-byte pointers in outer index - > 1024x1024 data blocks and file size of up to 4GB 2 10 = 1024 1024 4byte pointers 1024 tables, 1024 pointers each 4KB each 26
Most files are.. Most file are small, but the few very large files* occupy a large fraction of the space Optimize for small files, but allow for large files also * Y. K. Malaiya and J. Denton " Module Size Distribution and Defect Density," Proc. IEEE International Symposium on Software Reliability Engineering, Oct. 2000, pp. 62-71 27
Inode idea Inode: hold file metadata including pointers to actual data Small file stored in the inode itself. Typical inode size 128 bytes Medium size: direct pointers to data If the file is large: Indirect pointer To a block of pointers that point to additional data blocks If the file is larger: Double indirect pointer Pointer to a block of indirect pointers If the file is huge: Triple indirect pointer Pointer to a block of double-indirect pointers 28
Combined Scheme: UNIX inodes Assume 4K bytes per block, 32-bit addresses Volume block: Table with file names Points to this inode (file control block) Common: 12 direct+3. Indirect block could contain 1024 pointers. Max file size: k.k.k.4k (triple)+ More index blocks than can be addressed with 32-bit file pointer 29
On-disk layout of a typical UNIX file system superblock - empty blocks, location of the inode tables etc 30
Performance Best method depends on file access type Contiguous great for sequential and random Linked good for sequential, not random Indexed more complex Single block access could require 0-3 index block reads then data block read Clustering can help improve throughput, reduce CPU overhead Cluster: set of contiguous sectors 31
Performance (Cont.) Adding instructions to the execution path to save one disk I/O is reasonable Intel Core i7 Extreme Edition 990x (2011) at 3.46Ghz = 159,000 MIPS http://en.wikipedia.org/wiki/instructions_per_second Typical disk drive at 250 I/Os per second 159,000 MIPS / 250 = 630 million instructions during one disk I/O Fast SSD drives provide 60,000 IOPS 159,000 MIPS / 60,000 = 2.65 millions instructions during one disk I/O 32
Free-Space Management File system maintains free-space list to track available blocks/clusters (Using term block for simplicity) Approaches: i. Bit vector ii. Linked list iii. Grouping iv. Counting Bit vector or bit map (n blocks) 0 1 2 n-1 bit[i] = 1 block[i] free 0 block[i] occupied Block number calculation for first free block (number of bits per word) *(number of 0-value words) + offset of first 1 bit 00000000 00000000 00111110.. CPUs may have instructions to return offset within word of first 1 bit 35
Free-Space Management (Cont.) Bit map requires extra space Example: block size = 4KB = 2 12 bytes disk size = 2 40 bytes (1 terabyte) blocks: n = 2 40 /2 12 = 2 28 bits (or 32MB) for map if clusters of 4 blocks -> 8MB of memory Easy to get contiguous files if desired 36
Linked Free Space List on Disk ii. Linked list (free list) Cannot get contiguous space easily No waste of space Superblock Can hold pointer to head of linked list 37
Free-Space Management (Cont.) iii. Grouping Modify linked list to store address of next n-1 free blocks in first free block, plus a pointer to next block that contains free-block-pointers iv. Counting Because space is frequently contiguously used and freed, with contiguous-allocation allocation, extents, or clustering Keep address of first free block and count of following free contiguous blocks Free space list then has entries containing addresses and counts 38
UNIX directory structure Contains only file names and the corresponding inode numbers Use ls i to retrieve inode numbers of the files in the directory Looking up path names in UNIX Example: /usr/tom/mbox Lookup inode for /, then for usr, then for tom, then for mbox 39
Advantages of directory entries that have name and inode information Changing filename only requires changing the directory entry Only 1 physical copy of file needs to be on disk File may have several names (or the same name) in different directories Directory entries are small Most file info is kept in the inode 40
Hard and symbolic links Both refer to the same inode (and hence same file) Directory entry in /dira..[12345 filename1].. Directory entry in /dirb..[12345 filename2].. To create a hard link ln /dira/filename1 /dirb/filename2 To create a symbolic link shortcut in windows ln -s /dira/filenmame1 /dirb/filename3 filename3 just contains a pointer 41
Limitations File system based on inodes File must fit in a single disk partition Partition size and number of files are fixed when system is set up inode preallocation and distribution inodes are preallocated on a volume Even on empty disks % of space lost to inodes Preallocating inodes Improves performance Keep file s data block close to its inode Reduce seek times 42
Command: df -i Checking up on the inodes Gives inode statistics for a given system: total, free and used nodes Filesystem Inodes IUsed IFree IUse% Mounted on devtmpfs 2045460 484 2044976 1% /dev tmpfs 2053722 1 2053721 1% /dev/shm tmpfs 2053722 695 2053027 1% /run tmpfs 2053722 16 2053706 1% /sys/fs/cgroup Command: ls i 13320302 diskusage.txt 2408538 Documents/ 680003 downloads/ 43