CS 423 Operating Systems Design Lecture 18 File Systems and their Management and Optimization Klara Nahrstedt Fall 2011 Based on slides by YY Zhou and Andrew S. Tanenbaum
Overview Administrative announcements MP2 interviews today Homework 1 posted today October 3 Homework 1 - deadline October 10 in class File Systems Log-Structured File Systems Journaling File Systems Disk Space Management File System Backups File System Consistency File System Performance Summary
Log-Structured File Systems: CPUs are getting faster and disks are getting bigger and cheaper, but disk seek time is not improving => performance bottleneck. Idea: utilize the fast CPU and large RAM/disk caches, so that most read requests are satisfied directly from the FS cache with no disk access needed. What remains is dealing with small writes. Consider creating a new file: to write this file, the i-node for the directory, the directory block, the i-node for the file, and the file itself must all be written. While the writes can be delayed, doing so exposes the file system to serious consistency problems if a crash occurs before the writes are done. Hence i-node writes are generally done immediately.
Log-Structured FS (LFS): Idea: structure the whole disk as a log. Process: all pending writes are buffered in memory and collected into a single segment; periodically (or when needed) they are written to the disk as a single contiguous segment at the end of the log. An i-node map, indexed by i-number, is maintained: entry i in this map points to i-node i on the disk. The map is kept on disk, but it is also cached. Opening a file now consists of using the map to locate the i-node for the file; once the i-node has been located, the addresses of the blocks can be found from it. LFS has a cleaner thread that spends its time scanning the log circularly to compact it, since after some time not all blocks in the log are still in use.
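The LFS write path and i-node map can be illustrated with a toy model. This is a minimal sketch, not how a real LFS is laid out: the class `TinyLFS`, its `segment_size` parameter, and the use of a Python list as the "disk" are all assumptions made for illustration.

```python
class TinyLFS:
    """Toy log-structured store: the 'disk' is an append-only list (assumption)."""

    def __init__(self, segment_size=4):
        self.log = []            # simulated disk, written only at the end
        self.inode_map = {}      # i-number -> log address of the newest i-node
        self.pending = []        # writes buffered in memory
        self.segment_size = segment_size

    def write_file(self, inum, data_blocks):
        # Buffer the update; nothing touches the log yet.
        self.pending.append((inum, list(data_blocks)))
        if len(self.pending) >= self.segment_size:
            self.flush()

    def flush(self):
        # Write all pending updates as one contiguous segment at the log's end.
        for inum, blocks in self.pending:
            addrs = []
            for block in blocks:
                addrs.append(len(self.log))
                self.log.append(("data", block))
            self.inode_map[inum] = len(self.log)   # map now points at new i-node
            self.log.append(("inode", inum, addrs))
        self.pending.clear()

    def read_file(self, inum):
        # Open = find the i-node through the map, then follow its block addresses.
        _, _, addrs = self.log[self.inode_map[inum]]
        return [self.log[a][1] for a in addrs]
```

Rewriting a file appends new blocks and a new i-node, so the old copies stay in the log as dead entries. Those dead entries are what the cleaner thread would reclaim; the sketch does not implement the cleaner.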
Journaling File Systems: Idea: keep a log of what the FS is going to do before it does it. If the system crashes before it can do its planned work, then upon rebooting the system can look in the log to see what was going on at the time of the crash and finish the job. Examples of journaling file systems (JFS): Linux ext3 and Microsoft NTFS.
JFS - Example: Consider the remove-file operation: 1. Remove the file from its directory. 2. Release the i-node to the pool of free i-nodes. 3. Return all disk blocks to the pool of free disk blocks. Suppose the first step completes and then the system crashes: the i-node and file blocks are not accessible from any file, but are also not available for reassignment, so available resources decrease. If the crash occurs after the second step, only the blocks are lost. What a JFS does: write a log entry listing the three steps to be completed; write the log entry to disk; only after the log entry has been written, perform the individual steps. JFS only works if the logged operations are idempotent, i.e. they can be repeated as often as necessary without harm.
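The three-step removal and its recovery can be sketched as follows. This is an illustrative model only: the dictionary layout of `fs`, the in-memory `journal` list, and the function names are assumptions, and a real journal would live on disk. Free pools are modeled as sets so that repeating a step has no further effect.

```python
def apply_entry(fs, entry):
    # Each step is idempotent: repeating it leaves the state unchanged.
    fs["dir"].pop(entry["name"], None)          # 1. remove file from directory
    fs["inodes"].pop(entry["inum"], None)       # 2. release the i-node
    fs["free_inodes"].add(entry["inum"])
    fs["free_blocks"].update(entry["blocks"])   # 3. return the disk blocks

def remove_file(fs, journal, name):
    inum = fs["dir"][name]
    entry = {"name": name, "inum": inum,
             "blocks": list(fs["inodes"][inum]), "done": False}
    journal.append(entry)       # log the whole operation first
    apply_entry(fs, entry)      # only then perform the individual steps
    entry["done"] = True        # mark the log entry as completed

def recover(fs, journal):
    # After a crash, redo every logged operation that did not finish.
    for entry in journal:
        if not entry["done"]:
            apply_entry(fs, entry)
            entry["done"] = True
```

If a crash happens after step 1, `recover` simply replays all three steps. Step 1 is then a no-op, which is why idempotence matters.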
Disk Space Management (1) Disk Block Size Decision Small Blocks or Large Blocks? Trade-offs between space efficiency on the disk and access time (data rates)
Disk Space Management (2) With small disk block sizes, we get high space efficiency (little wasted space) but low performance (data rates). With large block sizes, we get high data rates (high performance) but low space utilization.
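The trade-off can be made concrete with a small calculation. The sketch below assumes a simple disk model (fixed seek time, half a rotation of latency on average, transfer time proportional to block size). The default values for `seek_ms`, `rotation_ms`, and `track_bytes` are illustrative assumptions, not figures from the lecture.

```python
import math

def space_efficiency(file_size, block_size):
    """Fraction of allocated space that holds file data (file_size > 0)."""
    blocks = math.ceil(file_size / block_size)
    return file_size / (blocks * block_size)

def data_rate(block_size, seek_ms=5.0, rotation_ms=8.33, track_bytes=1_000_000):
    """Bytes per second when every block costs a seek plus rotational delay."""
    # Time for one block: seek + half a rotation + transfer time.
    t_ms = seek_ms + rotation_ms / 2 + (block_size / track_bytes) * rotation_ms
    return block_size / (t_ms / 1000.0)

# A 4 KB file wastes nothing with 1 KB blocks but most of a 64 KB block.
small = space_efficiency(4096, 1024)
large = space_efficiency(4096, 65536)
```

Under this model the per-block overhead (seek and rotation) is constant, so larger blocks amortize it better and `data_rate` grows with block size while `space_efficiency` for small files falls.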
Free Space Management Bit vector A bit map is kept of free blocks Each bit in a vector represents one block If the block is free, the bit is zero Simple to find n consecutive free blocks Overhead is bit map Example BSD file system
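A bit vector allocator can be sketched in a few lines. This is a minimal illustration using the convention stated on this slide (0 = free, 1 = allocated). The class name `BitVector` and its methods are hypothetical, and a Python list stands in for a packed bit map.

```python
class BitVector:
    def __init__(self, nblocks):
        self.bits = [0] * nblocks   # 0 = free, 1 = allocated

    def find_run(self, n):
        """Return the start of the first run of n >= 1 free blocks, or -1."""
        run = 0
        for i, bit in enumerate(self.bits):
            run = run + 1 if bit == 0 else 0
            if run == n:
                return i - n + 1
        return -1

    def allocate(self, n):
        """Allocate n consecutive blocks; return the start index or -1."""
        start = self.find_run(n)
        if start >= 0:
            for i in range(start, start + n):
                self.bits[i] = 1
        return start

    def free(self, start, n):
        for i in range(start, start + n):
            self.bits[i] = 0
```

Finding n consecutive free blocks is a single linear scan of the map, which is the property the slide highlights.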
Free Space Management Free list: Keep a linked list of free blocks. Not very efficient because the linked list needs traversal. Example: System V R1
Free Space Management Linked list of indices A linked list of index blocks is kept Each index block contains addresses of free blocks and a Pointer to the next index block A large number of free blocks can be found quickly
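The index-block scheme can be sketched as a linked list whose nodes each hold many free addresses. The names `IndexBlock`, `build_free_list`, and `allocate`, and the `per_block` capacity, are assumptions for illustration.

```python
class IndexBlock:
    def __init__(self, addrs, next_block=None):
        self.addrs = list(addrs)   # addresses of free blocks
        self.next = next_block     # pointer to the next index block

def build_free_list(free_addrs, per_block):
    """Pack free addresses into a chain of index blocks."""
    chunks = [free_addrs[i:i + per_block]
              for i in range(0, len(free_addrs), per_block)]
    head = None
    for chunk in reversed(chunks):
        head = IndexBlock(chunk, head)
    return head

def allocate(head, n):
    """Take up to n free addresses; return (addresses, new_head)."""
    out = []
    while head is not None and len(out) < n:
        take = n - len(out)
        out.extend(head.addrs[:take])
        head.addrs = head.addrs[take:]
        if not head.addrs:         # index block exhausted, move to the next
            head = head.next
    return out, head
```

One index block yields many free addresses per list step, which is why a large number of free blocks can be found quickly compared with a plain free list that stores one block per node.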
Free Space Management Linked list of contiguous blocks that are free The free list node consists of a pointer and the number of free blocks starting from that address Blocks are joined together into larger blocks as necessary
Free Space Management (Example)
Free Space Management Issues (a) Almost-full block of pointers to free disk blocks in RAM - three blocks of pointers on disk (b) Result of freeing a 3-block file (c) Alternative strategy for handling 3 free blocks - shaded entries are pointers to free disk blocks
Disk Quota Management Quotas for keeping track of each user's disk use
File System Reliability (1) Figure: a file system to be dumped (logical dump of directories/files). Squares are directories, circles are files; shaded items have been modified since the last dump, unshaded items have not changed; each directory and file is labeled by its i-node number. Figure: bit maps used by the logical dumping algorithm.
File System Reliability (2) Dump Algorithm: Phase 1: for each modified file, its i-node is marked in the bitmap, and each directory is also marked (whether or not it has been modified). Phase 2: recursively walk the tree again, unmarking any directories that have no modified files or directories in them or under them. Phase 3: scan the i-nodes in numerical order and dump all directories that are marked for dumping. Phase 4: scan the i-nodes in numerical order and dump the files that are marked for dumping.
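The four phases can be sketched over a small in-memory tree. This is an illustration under stated assumptions: the tree is a dictionary keyed by i-node number, the field names `is_dir`, `modified`, and `children` are hypothetical, and a directory that was itself modified is kept marked in phase 2 (one reading of "modified files or directories").

```python
def logical_dump(tree, root):
    """Return i-node numbers in dump order: directories first, then files."""
    marked = set()   # stands in for the i-node bitmap

    def phase1(inum):
        node = tree[inum]
        if node["is_dir"]:
            marked.add(inum)                  # every directory is marked
            for child in node["children"]:
                phase1(child)
        elif node["modified"]:
            marked.add(inum)                  # only modified files are marked

    def phase2(inum):
        node = tree[inum]
        if not node["is_dir"]:
            return inum in marked
        keep = node["modified"]
        for child in node["children"]:
            if phase2(child):                 # visit every child, no short-circuit
                keep = True
        if not keep:
            marked.discard(inum)              # nothing modified in or under it
        return keep

    phase1(root)
    phase2(root)
    dirs = [i for i in sorted(marked) if tree[i]["is_dir"]]       # phase 3
    files = [i for i in sorted(marked) if not tree[i]["is_dir"]]  # phase 4
    return dirs + files
```

Directories on the path to a modified file stay marked even if they are unmodified themselves, so the dumped files can be restored into the right place on a fresh file system.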
File System Consistency: fsck utility in UNIX. Block consistency uses two tables: table 1 keeps track of assigned blocks (blocks in use by files); table 2 keeps track of free blocks.
File System Consistency: File system states: (a) Consistent. (b) After a crash, missing block: causes no harm, but wastes space and reduces the capacity of the disk. Solution: just add the missing block to the free list. (c) After a crash, duplicate block in the free list: happens only if a list (not a bitmap) is used for free space. Solution: rebuild the free list. (d) After a crash, duplicate data block (the same block present in two files). Solution: allocate a free block, copy the contents of the duplicated block (block 5 in the example) into it, and insert the copy into one of the files; the error should be reported to allow the user to inspect the damage.
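The two-table block check can be sketched as two counter arrays, one per table. The function name `check_blocks` and the input shapes (a dictionary of file name to block list, and a free list) are assumptions for illustration; the real fsck reads i-nodes and the free list from the raw device.

```python
def check_blocks(nblocks, files, free_list):
    """Classify blocks as missing, duplicated in the free list, or duplicated in files."""
    in_use = [0] * nblocks   # table 1: times each block appears in a file
    free = [0] * nblocks     # table 2: times each block appears in the free list
    for blocks in files.values():
        for b in blocks:
            in_use[b] += 1
    for b in free_list:
        free[b] += 1

    report = {"missing": [], "dup_free": [], "dup_data": []}
    for b in range(nblocks):
        if in_use[b] == 0 and free[b] == 0:
            report["missing"].append(b)     # case (b): in neither table
        if free[b] > 1:
            report["dup_free"].append(b)    # case (c): twice in the free list
        if in_use[b] > 1:
            report["dup_data"].append(b)    # case (d): present in two files
    return report
```

In a consistent file system every block has a 1 in exactly one of the two tables, so all three lists in the report come back empty.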
File System Performance: Access to disk is much slower than access to memory. Reading a memory word: 10 nsec. Reading from a hard disk with 10 MB/s: 5-10 msec. Methods to speed up: cache data in memory; use the block read-ahead method; reduce disk arm motion.
File System Performance (Caching): The cache here is a collection of blocks. If the cache is full, use replacement techniques: LRU (Least Recently Used), FIFO. Important method for multimedia playback.
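An LRU block cache can be sketched with the standard library's `OrderedDict`. The class `BlockCache` and the `read_from_disk` callback are hypothetical names, and the callback stands in for a real disk read.

```python
from collections import OrderedDict

class BlockCache:
    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity
        self.read_from_disk = read_from_disk   # called on a cache miss
        self.blocks = OrderedDict()            # least recently used block first
        self.hits = 0
        self.misses = 0

    def get(self, block_no):
        if block_no in self.blocks:
            self.blocks.move_to_end(block_no)  # now the most recently used
            self.hits += 1
            return self.blocks[block_no]
        self.misses += 1
        data = self.read_from_disk(block_no)
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # evict the least recently used
        return data
```

Dropping the `move_to_end` call on a hit would turn the same structure into a FIFO cache, since blocks would then be evicted purely in insertion order.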
File System Performance (Block Read Ahead): Block read-ahead means trying to get blocks into the cache before they are needed, increasing the hit rate by prefetching anticipated blocks. Approach: the user requests block k; the FS gets block k; the FS checks whether block k+1 is in the cache, and if not, the FS gets block k+1 from disk, anticipating that it will be needed in the future. Advantage: if the user needs block k+1, the access is fast; a great method for video playback. Disadvantage: if the user does not need block k+1, unnecessary extra work has been done. Recommendation: the FS keeps track of the access pattern of each open file (sequential access mode or random access mode). The FS may use a bit associated with each file to keep track of the access pattern (1 for sequential access, 0 for random access).
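Read-ahead with a per-file access-pattern bit can be sketched as follows. The class `ReadAheadFile`, the `read_from_disk` callback, and the choice to start every file in sequential mode are assumptions; the sketch also ignores end-of-file and cache eviction.

```python
class ReadAheadFile:
    def __init__(self, read_from_disk):
        self.read_from_disk = read_from_disk
        self.cache = {}            # block number -> data
        self.sequential = True     # the per-file bit (assumed to start at 1)
        self.last_block = None
        self.disk_reads = 0

    def _fetch(self, k):
        if k not in self.cache:
            self.cache[k] = self.read_from_disk(k)
            self.disk_reads += 1
        return self.cache[k]

    def read(self, k):
        # Update the access-pattern bit from the previous request.
        if self.last_block is not None:
            self.sequential = (k == self.last_block + 1)
        self.last_block = k
        data = self._fetch(k)
        if self.sequential:
            self._fetch(k + 1)     # prefetch the block expected next
        return data
```

Once a request breaks the sequence, the bit drops to random mode and prefetching stops until two consecutive requests are sequential again, which limits the wasted work named as the disadvantage.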
File System Performance (Reduce Disk Arm Motion) I-nodes placed at the start of the disk Disk divided into cylinder groups each with its own blocks and i-nodes
Questions Which of the following free space management schemes allows a large number of free blocks to be found quickly? Bit vector Free list Linked list with indices Consider a system in which free space is kept in a free space list. If the pointer to the free-space list is lost, the system cannot reconstruct the free space list: Is this true or false?
Conclusion: Performance optimization of file systems is crucial. Pay attention to: block sizes; placement of i-nodes; free space management; file system reliability; file system performance (caching, prefetching, ...)