CSE325 Principles of Operating Systems File Systems David P. Duggan dduggan@sandia.gov March 21, 2013
External View of File Manager Application Program mount() write() close() open() lseek() read() WriteFile() CreateFile() CloseHandle() ReadFile() SetFilePointer() UNIX File Mgr Device Mgr Memory Mgr Process Mgr File Mgr Device Mgr Memory Mgr Process Mgr Windows Hardware Why do we need files? 3/21/13 2
File Management File is a named, ordered collection of information File manager s job Storing the information on a device Mapping the block storage to a logical view Allocating/deallocating storage Organizing files efficiently What abstraction should be presented to programmer? 3/21/13 CSE325 - File Systems 3
Outline File Management Overview File and Directory Structures File Sharing & Protection Allocation Methods Contiguous allocation Linked allocation Indexed allocation 3/21/13 CSE325 - File Systems 4
File Structure None - sequence of words, bytes Simple record structure Lines Fixed length Variable length Complex Structures Formatted document Relocatable load file Who decides: Operating system Program 3/21/13 5
Set of files Used to organize files Directories Translates external file names to internal file names (file labels, i-nodes, file control blocks) Treated as a file but with special operations Why do we need directories? Operations Search for a file Create a file Delete a file List a directory Rename a file Traverse the file system 3/21/13 6
Organize the Directory (Logically) Grouping logical grouping of files by properties, (e.g., all Java programs, all games, ) Efficiency locating a file quickly (performance) Naming convenient to users Two users can have same name for different files The same file can have several different names Directory structures? 3/21/13 CSE325 - File Systems 7
Single-Level Directory A single directory for all users Good News? Problems? Grouping problem Naming problem 3/21/13 CSE325 - File Systems 8
Two-Level Directory Separate directory for each user Path name" Can have the same file name for different user" Efficient searching" Advantages? Problems? No grouping capability" 9
Tree-Structured Directories 3/21/13 CSE325 - File Systems 10
Tree-Structured Directories (Cont.) Efficient searching Grouping Capability Current directory (working directory) cd /spell/mail/prog type list 3/21/13 CSE325 - File Systems 11
Tree-Structured Directories (Cont.) Absolute or relative path name Creating a new file is done in current directory Delete a file rm <file-name> Creating a new subdirectory is done in current directory mkdir <dir-name> Example: if in current directory /mail mkdir count mail" prog" copy" prt" exp" count" Deleting mail deleting the entire subtree rooted by mail " CSE325 - File Systems 12
Acyclic-Graph Directories Have shared subdirectories and files 3/21/13 CSE325 - File Systems 13
Acyclic-Graph Directories (Cont.) Two different names (aliasing) New directory entry type Link another name (pointer) to an existing file Resolve the link follow pointer to locate the file If dict deletes count dangling pointer Solutions: Backpointers, so we can delete all pointers Variable size records a problem Entry-hold-count solution 3/21/13 CSE325 - File Systems 14
General Graph Directory 3/21/13 CSE325 - File Systems 15
General Graph Directory (Cont.) How do we guarantee no cycles? Allow only links to file not subdirectories Every time a new link is added use a cycle detection algorithm to determine whether it is OK 3/21/13 CSE325 - File Systems 16
Directory Implementation Linear list of file names with pointer to the data blocks. simple to program time-consuming to execute Hash Table linear list with hash data structure. decreases directory search time collisions situations where two file names hash to the same location fixed size 3/21/13 CSE325 - File Systems 17
File Sharing Sharing of files on multi-user systems is desirable On distributed systems, files may be shared across a network Network File System (NFS) is a common distributed file-sharing method Sharing may be done through a protection scheme 3/21/13 CSE325 - File Systems 18
File Sharing Remote File Systems Uses networking to allow file system access between systems Manually via programs like FTP Automatically, seamlessly using distributed file systems Semi-automatically via the world wide web Client-server model allows clients to mount remote file systems from servers Server can serve multiple clients Client and user-on-client identification is insecure or complicated NFS is standard UNIX client-server file sharing protocol CIFS is standard Windows protocol Standard operating system file calls are translated into remote calls 3/21/13 19
File Sharing Consistency Semantics Consistency semantics: specify how multiple users are to access a shared file simultaneously Similar to process synchronization algorithms Unix file system (UFS) implements: Writes to an open file visible immediately to other users of the same open file Sharing file pointer to allow multiple users to share the pointer of current location into the file Andrew File System (AFS) implemented complex remote file sharing semantics AFS has session semantics Writes only visible to sessions starting after the file is closed 3/21/13 CSE325 - File Systems 20
Access Lists and Groups Mode of access: read, write, execute Three classes of access RWX a) owner access 7 1 1 1 RWX b) group access 6 1 1 0 RWX c) public access 1 0 0 1 Ask manager to create a group (unique name), say G, and add some users to the group. For a particular file (say game) or subdirectory, define an appropriate access. Attach a group to a file chgrp G game owner" group" public" chmod" 761" game" 3/21/13 CSE325 - File Systems 21
Protection Mechanisms Files are OS objects: unique names and a finite set of operations that processes can perform on them Protection domain is a set of {object,rights}, where right is the permission to perform one of the operations Each process runs in some protection domain. Domain" How do we store all the protection domains? 22
Access Control Lists (ACL) Example: Owner s ID stored in PCB File owner s ID and access permissions stored in inode 3/21/13 CSE325 - File Systems 23
Capabilities List (C-List) Each process has a capability for every resource it can access Kept with other process meta data Checked by the kernel on every access 3/21/13 CSE325 - File Systems 24
Allocation Methods An allocation method refers to how disk blocks are allocated for files: Contiguous allocation Linked allocation Indexed allocation 3/21/13 CSE325 - File Systems 25
Contiguous Allocation Each file occupies a set of contiguous blocks on the disk Maps the N file blocks into N contiguous blocks on the secondary storage device File descriptor Head position 237 First block 785 Number of blocks 25 Good news? Simple only starting location (block #) and length (number of blocks) are required Direct access 26
Contiguous Allocation (cont.) Mapping from logical address to physical LA/512" Q" R" Block to be accessed = Q + starting address Displacement into block = R 3/21/13 CSE325 - File Systems 27
Contiguous Allocation of Disk Space Drawbacks? Wasteful of space (dynamic storageallocation problem) Difficult to support dynamic file sizes (files cannot grow) How do we solve this problem? 3/21/13 CSE325 - File Systems 28
Linked Allocation Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk. Each block contains a header with Number of bytes in the block Pointer to next block block =" pointer" 3/21/13 CSE325 - File Systems 29
Linked Allocation Good news? Simple need only starting address Free-space management system no waste of space Bad news? No random access Seeks can be slow Mapping of free space 3/21/13 CSE325 - File Systems 30
File-Allocation Table 3/21/13 CSE325 - File Systems 31
Indexed Allocation Brings all pointers together into the index block. Logical view. index table" 3/21/13 CSE325 - File Systems 32
Example of Indexed Allocation Good news? Simplify seeks Random access Dynamic access without external fragmentation May link indices together (for large files) Bad news? Overhead of index block. 3/21/13 CSE325 - File Systems 33
Indexed Allocation Mapping (Cont.) " outer-index" index table" file" 3/21/13 CSE325 - File Systems 34
Combined Scheme: UNIX (4K bytes per block) 3/21/13 CSE325 - File Systems 35
Unallocated Blocks How should unallocated blocks be managed? Need a data structure to keep track of them Linked list Very large Hard to manage spatial locality Block status map ( bit map or bit vector ) Bit per block Easy to identify nearby free blocks Useful for disk recovery 3/21/13 CSE325 - File Systems 36
Efficiency and Performance Efficiency dependent on disk allocation and directory algorithms types of data kept in file s directory entry Performance disk cache separate section of main memory for frequently used disk blocks free-behind and read-ahead techniques to optimize sequential access improve PC performance by dedicating section of memory as virtual disk, or RAM disk 3/21/13 CSE325 - File Systems 37