TDDB68 Concurrent Programming and Operating Systems
Lecture: File systems
Mikael Asplund, Senior Lecturer
Real-time Systems Laboratory, Department of Computer and Information Science
Copyright Notice: Thanks to Christoph Kessler for much of the material behind these slides. The lecture notes are partly based on Silberschatz's, Galvin's and Gagne's book (Operating System Concepts, 7th ed., Wiley, 2005). No part of the lecture notes may be reproduced in any form, due to the copyrights reserved by Wiley. These lecture notes should only be used for internal teaching purposes at Linköping University.
A file system consists of interface + implementation
Storing data
- Primary memory is volatile, so secondary storage is needed for long-term storage
- A disk is essentially a linear sequence of numbered blocks
- Two operations: write block b, read block b
- Low level of abstraction
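The block view of a disk can be sketched in a few lines. This is only a toy model: the class name `BlockDevice`, its method names, and the 512-byte block size are assumptions made for illustration, not part of any real driver interface.

```python
BLOCK_SIZE = 512  # assumed block size in bytes


class BlockDevice:
    """A toy disk: a linear sequence of numbered, fixed-size blocks."""

    def __init__(self, num_blocks: int) -> None:
        # Every block starts out filled with zero bytes.
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def read_block(self, b: int) -> bytes:
        """Return the contents of block number b."""
        return self.blocks[b]

    def write_block(self, b: int, data: bytes) -> None:
        """Overwrite block number b; a block is always written whole."""
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        self.blocks[b] = bytes(data)


disk = BlockDevice(num_blocks=8)
disk.write_block(3, b"x" * BLOCK_SIZE)
```

Everything a file system offers (names, directories, growing files) has to be built on top of these two operations.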
The file abstraction
- Provided by the OS
- Smallest allotment of secondary storage known to the user
- Typically a contiguous logical address space
- Organized in a directory of files
- Has attributes (name, id, size, ...)
- Has an API (operations on files and directories)
File Attributes
- Name: the only information kept in human-readable form
- Identifier: unique tag (number) that identifies the file within the file system
- Type: needed for systems that support different types
- Location: pointer to the file location on the device
- Size: current file size
- Protection: controls who can read, write, execute
- Time, date, and user identification: data for protection, security, and usage monitoring
Metadata
- This information about files (i.e., metadata) is kept in a directory structure, which is maintained on the disk
- Stored in a File Control Block (FCB) data structure for each file
File Operations (API)
A file is an abstract data type with operations:
- Create
- Write
- Read
- Reposition within file
- Delete
- ...
- Open(Fi): search the directory structure on disk for entry Fi, and move the content of that entry to memory
- Close(Fi): move the content of entry Fi in memory to the directory structure on disk
Open in Unix: open(filename, mode) returns a file descriptor (handle), which is an index into a per-process table of open files (part of the PCB), or an error code
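As a rough illustration, Python's `os.open` is a thin wrapper around the open() system call and returns the raw descriptor as a small integer. The file name `newfile` and the temporary directory are assumptions for this sketch; note that Python reports the error code through an `OSError` exception rather than as a return value.

```python
import errno
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "newfile")  # hypothetical file name

# The returned descriptor is an index into this process's open file table.
fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
os.write(fd, b"hello")
os.lseek(fd, 0, os.SEEK_SET)   # reposition to the start of the file
data = os.read(fd, 5)
os.close(fd)

# Opening a file that does not exist fails with an error code (errno).
try:
    os.open(path + ".missing", os.O_RDONLY)
    error_code = None
except OSError as exc:
    error_code = exc.errno
```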
File descriptors and open file tables
[Figure: two processes, kernel memory space, and disk]
- Each process has a process-local open file table: entries 0 stdin, 1 stdout, 2 stderr (each with a position) are opened upon process start; entry d is newfile (pos, ...)
- The system-wide open file table in kernel memory space holds console input, console output, and newfile (location, ..., FCB contents)
- The FCB and the file data reside on disk
- The open() syscall returns a file descriptor = index in the local open file table
- The fopen() C library call returns a pointer fp to a FILE data structure {..., fd, ...} in the logical address space of the process
Data to manage open files
- Disk location of the file (and other metadata from the FCB)
- File-open count: number of times the file is currently open, so that its data can be removed from the open-file table when the last process closes it; shared by all processes that opened the file
- File pointer (seek position): pointer to the next read/write location; one for every open() system call (process)
Storing open file data
- Collected in a system-wide table of open files and process-local open file tables (part of the PCB)
- Process-local open file table entries point to system-wide open file table entries
- Semantics of fork()?
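The two-table design can be observed from user space. The sketch below assumes a POSIX system and a hypothetical file `shared.txt`: a descriptor duplicated with `dup()` refers to the same system-wide entry and therefore shares the file position, while a second `open()` gets its own entry and position. On POSIX systems a child created by fork() inherits descriptors in the same way as `dup()`, which is one way to approach the question above.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "shared.txt")  # hypothetical file
with open(path, "wb") as f:
    f.write(b"abcdefgh")

fd_a = os.open(path, os.O_RDONLY)
fd_dup = os.dup(fd_a)              # new local entry, same system-wide entry
fd_b = os.open(path, os.O_RDONLY)  # separate open(): new system-wide entry

first = os.read(fd_a, 3)      # advances the shared position to 3
via_dup = os.read(fd_dup, 3)  # continues from position 3
via_open = os.read(fd_b, 3)   # has its own position, starts at 0

for fd in (fd_a, fd_dup, fd_b):
    os.close(fd)
```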
Access Methods
Sequential access:
- read next block
- write next block
- reset (rewind)
Direct access:
- read block n
- write block n
- position to n, then read next block or write next block
(n = relative block number from the beginning of the file)
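A minimal sketch of the two access methods on top of a seekable byte stream. The helper names `read_next`, `read_block` and `reset`, and the tiny 4-byte block size, are assumptions chosen to keep the example readable.

```python
import io

BLOCK_SIZE = 4  # assumed, deliberately tiny block size

f = io.BytesIO(b"AAAABBBBCCCCDDDD")  # stands in for a file of four blocks


def read_next(stream):
    """Sequential access: read the block at the current position."""
    return stream.read(BLOCK_SIZE)


def read_block(stream, n):
    """Direct access: position to relative block n, then read it."""
    stream.seek(n * BLOCK_SIZE)
    return stream.read(BLOCK_SIZE)


def reset(stream):
    """Rewind to the beginning of the file."""
    stream.seek(0)


seq = [read_next(f), read_next(f)]  # blocks 0 and 1, in order
direct = read_block(f, 3)           # jump straight to block 3
reset(f)
after_reset = read_next(f)          # block 0 again
```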
Directory Structure
- Files in a system are organised in directories
- A directory is a collection of nodes containing information about all files
- Both the directory structure and the files reside on disk
[Figure: a directory pointing to files F1, F2, F3, F4, ..., Fn]
Directory API
- Search for a file
- Create a file
- Delete a file
- List a directory
- Rename a file
- Traverse the file system
Examples of File-system Organization
Organize the Directory (Logically) to Obtain
- Efficiency: locating a file quickly
- Naming: convenient to users
  - Two users can use the same name for different files
  - The same file can have several different names
- Grouping: logical grouping of files by properties (e.g., all Java programs, all games, ...)
Single-Level Directory
- A single directory for all users
- Very simple
- Naming problem
- Grouping problem
- Still used on simple devices, embedded systems, Pintos
Two-Level Directory
- Separate directory for each user
- Path name: username / filename
- Different users can have files with the same name
- Efficient searching
- No grouping capability
Tree-Structured Directories
Acyclic-Graph Directories
Hard links
- Direct pointer (block address) to a directory or file
- Cannot span partition boundaries
- Must be updated when the file moves on disk
- Unix: ln <filename> <linkname>
Soft links
- Soft links (symbolic links, shortcuts, aliases) are files containing the actual (full) file name
- Still valid if the file moves on disk
- No longer valid if the file name (or path) changes
- Not as efficient as hard links (one extra block read)
- Unix: ln -s <filename> <linkname>
Hard links vs. soft links
Example directory:
Name         Location
myfile       371
file2        524
mylink_hard  371
mylink_soft  ./myfile
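The example directory can be reproduced on a POSIX system roughly as follows. The temporary directory is an assumption of this sketch, and on Unix the "location" that a hard link shares with the original name is the inode number, which `os.stat` exposes as `st_ino`.

```python
import os
import tempfile

d = tempfile.mkdtemp()
myfile = os.path.join(d, "myfile")
hard = os.path.join(d, "mylink_hard")
soft = os.path.join(d, "mylink_soft")

with open(myfile, "w") as f:
    f.write("data")

os.link(myfile, hard)         # like: ln myfile mylink_hard
os.symlink("./myfile", soft)  # like: ln -s ./myfile mylink_soft

# The hard link refers to the same inode; the soft link stores a path.
same_inode = os.stat(myfile).st_ino == os.stat(hard).st_ino
soft_target = os.readlink(soft)

# After the name changes, only the hard link still reaches the data.
os.rename(myfile, os.path.join(d, "renamed"))
hard_still_valid = os.path.exists(hard)
soft_still_valid = os.path.exists(soft)  # exists() follows the link
```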
File System Mounting
- A file system must be mounted before it can be accessed
- Mounting combines multiple file systems in one namespace
- An unmounted file system is mounted at a mount point
- In Windows, mount points are given names C:, D:, ...
Example
[Figure: an existing file system; an unmounted volume residing on /device/dsk; the mount point, with /device/dsk mounted over /users]
File Sharing
- Sharing of files on multi-user systems is desirable
- Sharing may be done through a protection scheme
- In order to have a protection scheme, the system should have
  - User IDs: identify users, allowing permissions and protections to be per-user
  - Group IDs: allow users to be in groups, permitting group access rights
Sharing across a network
- Distributed system
- Network File System (NFS) is a common distributed file-sharing method
- SMB (Windows shares) is another
- Protection is a challenge!
Protection
- File owner/creator should be able to control what can be done, and by whom
- Types of access: Read, Write, Execute, Append, Delete, List
Access Lists and Groups
- 3 modes of access: read, write, execute
- 3 classes of users:
                        RWX
  a) owner access   7 = 111
  b) group access   6 = 110
  c) public access  1 = 001
- Ask the manager to create a group (unique name), say G, and add some users to the group
- For a particular file (say game) or subdirectory, define an appropriate access: chmod 761 game (owner 7, group 6, public 1)
- Attach a group to a file: chgrp G game
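The octal mode 761 from the example can be decoded bit by bit. The helper `triplet` is a hypothetical name introduced for this sketch; `stat.filemode` is the standard-library function that renders a mode the way `ls -l` does.

```python
import stat

mode = 0o761  # owner 7, group 6, public 1


def triplet(bits: int) -> str:
    """Render three permission bits as an rwx string."""
    return "".join(
        ch if bits & mask else "-" for ch, mask in zip("rwx", (4, 2, 1))
    )


owner = triplet((mode >> 6) & 7)   # highest three bits
group = triplet((mode >> 3) & 7)   # middle three bits
public = triplet(mode & 7)         # lowest three bits

# The same mode for a regular file, formatted as in a directory listing.
listing = stat.filemode(stat.S_IFREG | mode)
```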
A Sample UNIX Directory Listing
> ls -l
[Figure: directory listing with the owner, group, and name columns marked]
File system implementation
File-System Structure
- File system resides on secondary storage (disks)
- File system is organized into layers
File-System Layers
- File API: filenames, directories, attributes, access, ...
- Logical block addresses (1D array of blocks)
- Physical block addresses on disk (cylinder, sector, ...)
- read/write block commands
File control block (FCB)
- Resides at the logical FS layer
- Storage structure consisting of information about a file
In-Memory File System Structures
[Figure: (a) creating a new file, (b) reading an open file]
Virtual File System (VFS)
Allocation Methods
An allocation method refers to how disk blocks are allocated for files:
- Contiguous allocation
- Linked allocation
- Indexed allocation
Contiguous Allocation
- Pros: simple; allows random access
- Cons: wasteful; files cannot grow easily
- Works well on CD-ROM
Linked Allocation
[Figure: each block holds a next-pointer]
- Pros: simple (need only the starting address); free-space management; no external fragmentation
- Cons: no random access; overhead (space and time); reliability
File-Allocation Table (FAT)
- Disk-space allocation scheme used by older Windows versions in the FAT file system
- Variant of linked allocation: the FAT resides in a reserved section at the beginning of each disk volume
- One entry for each disk block, indexed by block number, pointing to its successor
- The entry for the last block in a chain has table value -1
- Unused blocks have table value 0
- Finding free blocks is easy
- Does not scale well to large disks or small block sizes
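A minimal sketch of how such a table is walked, using the values -1 and 0 as described above. The 10-block volume, the file starting at block 2, and the function names are made up for illustration; block 0 is assumed to be reserved, since the value 0 doubles as the "unused" marker.

```python
END_OF_CHAIN = -1
FREE = 0

# Hypothetical 10-block volume: index = block number, value = successor.
fat = [FREE] * 10
fat[2] = 7             # the file starts at block 2, continues in block 7
fat[7] = 4             # then block 4
fat[4] = END_OF_CHAIN  # block 4 is the last block of the file


def chain(table, start):
    """Follow successor entries from the starting block to the end."""
    blocks = []
    b = start
    while b != END_OF_CHAIN:
        blocks.append(b)
        b = table[b]
    return blocks


def free_blocks(table):
    """Free blocks are found by scanning the table, not the disk."""
    return [b for b in range(1, len(table)) if table[b] == FREE]
```

Reading block k of a file still needs k table lookups, but all of them hit the table rather than the data blocks themselves.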
Indexed Allocation
- Brings all pointers together into an index block
Indexed Allocation (Cont.)
- Direct access once the index block is loaded, without external fragmentation, but with the overhead of the index block
- All block pointers of a file must fit into the index block
- How large should an index block be?
  - Small: limits file size
  - Large: wastes space for small files
- Solution: multi-level indexed allocation
Multilevel-indexed allocation
[Figure: directory, outer index, index table, file]
Combined Scheme: UNIX inode
- Block size 4 KB: with 12 direct block pointers kept in the inode, 48 KB can be addressed directly
- Small overhead for small files
- Still allows large files
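The 48 KB figure, and the reason large files are still possible, can be checked with a little arithmetic. The block size and the 12 direct pointers come from the slide; the 4-byte pointer size and the presence of one single, one double and one triple indirect pointer are assumptions based on the classic UNIX inode layout.

```python
BLOCK_SIZE = 4 * 1024   # 4 KB, from the slide
DIRECT_POINTERS = 12    # from the slide
POINTER_SIZE = 4        # assumption: 4-byte block pointers

# Number of block pointers that fit into one index block.
pointers_per_block = BLOCK_SIZE // POINTER_SIZE

direct = DIRECT_POINTERS * BLOCK_SIZE          # addressed from the inode
single = pointers_per_block * BLOCK_SIZE       # via one index block
double = pointers_per_block ** 2 * BLOCK_SIZE  # via two index levels
triple = pointers_per_block ** 3 * BLOCK_SIZE  # via three index levels

max_file_size = direct + single + double + triple
```

Under these assumptions the direct pointers cover 48 KB, and each further level of indirection multiplies the reachable size by 1024 (4 MB, 4 GB, 4 TB).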
B-trees
- Self-balancing tree
- Efficient operations: O(log n)
- Popular in newer (less old) file systems: NTFS, Ext4, HFS+
Free-Space Management
- Where is there free space on the disk? A free-space list
- Two basic approaches:
  - Free-space map (bit vector)
  - Linked list
Bit vector
- Each block is represented by one bit (blocks 0, 1, 2, ..., n-1)
- bit[i] = 1 if block[i] is free, 0 if block[i] is occupied
- First free block: (number of bits per word) * (number of 0-value words) + offset of first 1 bit
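The formula for the first free block translates almost directly into code. This sketch assumes 8-bit words and that the leftmost (most significant) bit of each word stands for the lowest block number; both are choices made for the example, as is the function name.

```python
BITS_PER_WORD = 8  # assumed word size


def first_free_block(words):
    """Return the number of the first free block, or None if none is free.

    A 1 bit marks a free block. Whole words that are 0 are skipped.
    """
    for zero_words, word in enumerate(words):
        if word != 0:
            # Offset of the first 1 bit, counted from the leftmost bit.
            offset = BITS_PER_WORD - word.bit_length()
            return BITS_PER_WORD * zero_words + offset
    return None
```

For example, with words `[0, 0, 0b00010000]` two 0-value words are skipped and the first 1 bit sits at offset 3, giving block 8 * 2 + 3 = 19.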
Bit vector
- Easy to get contiguous files
- Bit map requires extra space. Example:
  block size = 1 KB = 2^10 bytes
  disk size = 68 GB ≈ 2^36 bytes
  n = 2^36 / 2^10 = 2^26 bits (about 67 Mbit, i.e., 8 MB)
- Inefficient unless the entire bit vector is kept in main memory
Linked list
- Only need to store the pointer to the first free block
- Finding k free blocks means reading in k blocks from disk
- No waste of space
Grouping
[Figure: the first free block holds n-1 references to really free blocks]
Counting
- Often, multiple consecutive blocks are allocated or freed together
- For a sequence of free blocks located consecutively on disk, keep only a reference to the first block and the length of the sequence
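The counting approach amounts to a run-length encoding of the free list. A minimal sketch, where the function name and the (first block, length) tuple layout are assumptions:

```python
def free_runs(free_blocks):
    """Compress free block numbers into (first block, length) pairs."""
    runs = []
    for b in sorted(free_blocks):
        if runs and runs[-1][0] + runs[-1][1] == b:
            # Block b directly follows the current run: extend it.
            first, length = runs[-1]
            runs[-1] = (first, length + 1)
        else:
            # Otherwise start a new run of length 1.
            runs.append((b, 1))
    return runs


runs = free_runs([2, 3, 4, 5, 8, 9, 17])
```

Seven free blocks are stored as three entries here; the saving grows with the length of the runs.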
Fact #1: File systems contain multiple data structures
Fact #2: These data structures have interdependencies
Conclusion: Modification of the file system should be atomic
What happens if the computer is suddenly turned off?
File system repair
For each block:
- Find which files use the block
- Check if the block is marked as free
Then:
- The block is used by 1 file xor is free: OK
- Two files use the same block: BAD, duplicate the block and give one copy to each file
- The block is both used and marked free: BAD, remove it from the free list
- The block is neither free nor used: wasted block, mark it as free
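The per-block check can be sketched as follows. The in-memory representation (a dictionary from file name to block list, and a plain list of free blocks) is an assumption made for the example; a real checker would read these structures from disk. For brevity the sketch reports only the first problem it finds for each block.

```python
def check_blocks(num_blocks, files, free_list):
    """Return the repair action needed for each inconsistent block."""
    used = [0] * num_blocks
    for blocks in files.values():
        for b in blocks:
            used[b] += 1          # count how many files use block b
    free = set(free_list)

    report = {}
    for b in range(num_blocks):
        if used[b] > 1:
            report[b] = "duplicate the block"
        elif used[b] == 1 and b in free:
            report[b] = "remove from free list"
        elif used[b] == 0 and b not in free:
            report[b] = "mark as free"
        # Otherwise: used by exactly one file xor free, which is OK.
    return report


# Hypothetical 5-block volume with three different inconsistencies.
report = check_blocks(
    num_blocks=5,
    files={"a": [0, 1], "b": [1, 2]},
    free_list=[2, 3],
)
```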
Modern alternatives
- Log-based, transaction-oriented
  - Each modification is made as a transaction
  - Keep a journal (log) of all pending transactions
  - An interrupted transaction can be rolled back
  - Examples: NTFS, ext4, ...
- Snapshot-based
  - Copy-on-write
  - Often combined with checksums
  - Examples: ZFS, Btrfs, APFS
Memory-Mapped Files
- Mapping a disk block to a page in memory
- A page-sized portion of the file is read from the file system into a physical page
- Subsequent reads/writes to/from the file are treated as ordinary memory accesses
- Simplifies file access by treating file I/O through memory rather than read() / write() system calls
- Also allows several processes to map the same file, allowing the pages in memory to be shared
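A short sketch of the idea using Python's `mmap` module, which wraps the mmap() system call. The file name `mapped.bin` and its contents are assumptions; the point is that the file is read and modified with ordinary slice operations instead of read() / write() calls.

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "mapped.bin")  # hypothetical file
with open(path, "wb") as f:
    f.write(b"hello world")

with open(path, "r+b") as f:
    # Length 0 maps the whole file into the address space of the process.
    with mmap.mmap(f.fileno(), 0) as m:
        before = bytes(m[0:5])  # a read is a plain memory access
        m[0:5] = b"HELLO"       # so is a write
        m.flush()               # ask for modified pages to be written back

with open(path, "rb") as f:
    after = f.read()
```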
[Figure: memory-mapped files]