COMP091 Operating Systems 1 File Systems
Media
- File systems organize the storage space on persistent media such as disk, tape, CD/DVD/BD, and USB drives.
- Disks, USB drives, and virtual drives are referred to as block storage devices because data are transferred to and from the devices in fixed-length blocks.
- The first few blocks on a disk or USB drive contain a table that allows the disk to be divided into partitions.
Partitions
- Partitions divide a disk (or USB drive) into separate extents of contiguous blocks that function like separate devices.
- A partition table at the start of the drive contains the addresses of the start and end blocks of each partition.
Partition Schemes
- MBR partitions
  - Introduced in 1983 with PC DOS 2.0
  - The Master Boot Record contains a partition table allowing for four partitions
  - Later, one extended partition could contain additional logical partitions
  - Disk size limited to 2 TiB (with 512-byte sectors)
- GPT (GUID Partition Table) is now replacing MBR
  - Allows for 128 partitions by default and disks of up to 8 ZiB (9,444,732,965,739,290,427,392 bytes, with 512-byte sectors)
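The two size limits follow from the width of the block addresses stored in the partition table. The check below is a minimal sketch that assumes 512-byte sectors, 32-bit sector addresses for MBR, and 64-bit sector addresses for GPT. The sector size is an assumption (the slide does not state it), and `max_disk_bytes` is a name made up for this example.

```python
SECTOR_BYTES = 512  # assumed sector size; not stated on the slide


def max_disk_bytes(address_bits: int, sector_bytes: int = SECTOR_BYTES) -> int:
    """Largest disk reachable with sector addresses of the given width."""
    return (2 ** address_bits) * sector_bytes


mbr_limit = max_disk_bytes(32)  # MBR stores 32-bit sector addresses
gpt_limit = max_disk_bytes(64)  # GPT stores 64-bit sector addresses

print(mbr_limit // 2**40, "TiB")   # 2 TiB
print(gpt_limit // 2**70, "ZiB")   # 8 ZiB
print(f"{gpt_limit:,} bytes")      # 9,444,732,965,739,290,427,392 bytes
```

With larger sectors (for example 4096 bytes) both limits scale up by the same factor.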
Terminology
- Disk partitions function much like separate devices and are often referred to as volumes.
- In Windows they are assigned a drive letter as if they were separate devices.
- Our discussion of storage often talks about devices when what we really mean is partitions.
- A file system organizes the space on one partition of a partitioned device, or on an entire unpartitioned device, to make the space available for files.
Spanned Volumes
- A partition is a subdivision of the space on a single drive
- A spanned volume functions like a partition, but includes space from more than one drive
- Spanned volumes are more flexible
  - Can add space to a spanned volume even if the original disk is fully allocated
- The Microsoft system volume and boot partition can't be spanned
- The space must be on separate disks
Logical Volumes
- Linux concept
- Combines partitions into one logical volume
- Can add space to an existing volume
- Unlike MS spanned volumes, no restrictions
  - Can be the boot partition or root partition
  - Can be on the same disk
  - Can participate in RAID
File Allocation
- Different operating systems use various methods to position files on disk
- Contiguous allocation: each file is stored on consecutive disk blocks
- Advantages
  - Simple to implement, because we need to know only the disk address of the first block of the file and the number of blocks
  - Read performance is excellent, because a single disk operation may be enough to read the entire file
Contiguous Allocation
Contiguous Allocation
- Disadvantages
  - The disk becomes fragmented as files are removed
  - Compaction is difficult because all the blocks following the holes need to be copied
  - The final size of a new file must be known in advance in order to choose a suitable hole for it
- Contiguous allocation is good for write-once media such as CD-ROMs, DVDs, and BDs
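The trade-offs of contiguous allocation can be seen in a toy model. The sketch below is illustrative only: `ContiguousDisk` and its first-fit policy are assumptions made for this example, not something the slides prescribe. It shows that locating a block needs just the start address and an offset, and that removing files can leave enough free space in total but no single hole large enough for a new file.

```python
class ContiguousDisk:
    """Toy disk where every file occupies one run of consecutive blocks."""

    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.files = {}  # name -> (start_block, length)

    def holes(self):
        """Return the free extents as (start, length), in disk order."""
        free, cursor = [], 0
        for start, length in sorted(self.files.values()):
            if start > cursor:
                free.append((cursor, start - cursor))
            cursor = start + length
        if cursor < self.total_blocks:
            free.append((cursor, self.total_blocks - cursor))
        return free

    def create(self, name, length):
        """First-fit placement; the final size must be known up front."""
        for start, hole_length in self.holes():
            if hole_length >= length:
                self.files[name] = (start, length)
                return start
        return None  # no single hole is large enough

    def remove(self, name):
        del self.files[name]

    def block_address(self, name, n):
        """Disk address of the n-th block of a file: one addition."""
        start, length = self.files[name]
        if not 0 <= n < length:
            raise IndexError(n)
        return start + n


disk = ContiguousDisk(10)
disk.create("a", 3)  # blocks 0-2
disk.create("b", 3)  # blocks 3-5
disk.create("c", 4)  # blocks 6-9
disk.remove("a")
disk.remove("c")
print(disk.holes())          # [(0, 3), (6, 4)]
print(disk.create("d", 5))   # None: 7 blocks are free, but no hole holds 5
```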
Linked List Allocation
- Each file is kept as a linked list of disk blocks
- The first word of each block is a pointer to the next block
- Every disk block can be used
- Sequential reading of a file's blocks is easy
- Random access is hard, because we have to read all the blocks of the file up to the desired block
- Because of the pointer, the amount of data stored in each block is less than a whole block
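A small simulation makes the random-access cost concrete. This is a sketch under made-up parameters: the 8-byte blocks, the one-byte pointer, the `END` marker, and the function names are all hypothetical. The disk is modelled as a dictionary from block number to block contents.

```python
BLOCK_SIZE = 8    # bytes per block (tiny, for illustration)
POINTER_SIZE = 1  # bytes at the start of each block used for the pointer
END = 255         # hypothetical end-of-chain marker


def write_file(disk, blocks, data):
    """Store data in the given block numbers, chaining them together."""
    payload = BLOCK_SIZE - POINTER_SIZE  # usable bytes per block
    chunks = [data[i:i + payload] for i in range(0, len(data), payload)]
    if len(chunks) > len(blocks):
        raise ValueError("not enough blocks for this file")
    for i, chunk in enumerate(chunks):
        next_block = blocks[i + 1] if i + 1 < len(chunks) else END
        disk[blocks[i]] = bytes([next_block]) + chunk
    return blocks[0]


def read_nth_block(disk, first, n):
    """Return (payload, disk_reads) for the n-th block of a file."""
    current, reads = first, 0
    for _ in range(n):
        current = disk[current][0]  # must read a block to find the next one
        reads += 1
    reads += 1  # the read of the wanted block itself
    return disk[current][POINTER_SIZE:], reads


disk = {}
first = write_file(disk, [4, 9, 2, 7], b"abcdefghijklmnopqrstuvwx")
payload, reads = read_nth_block(disk, first, 3)
print(payload, reads)  # b'vwx' 4
```

Each block holds only 7 bytes of file data out of 8, and reaching block 3 costs four reads.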
Linked List Allocation
Linked List with Memory Table
- Like linked list allocation, but the table of pointers to the blocks is kept in memory
- This File Allocation Table (FAT) was used in MS-DOS and early Windows versions
- Random access to blocks is easy, because following the chain involves no disk reads
- To reduce the number of table entries, the allocation unit is a cluster of blocks
  - Larger clusters mean fewer FAT entries
  - But more wasted space
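The same chain can be followed through an in-memory table, and the cluster-size trade-off is simple arithmetic. The sketch below is not the on-disk FAT format: the dictionary layout, the `EOF` marker, and the helper names are assumptions for illustration.

```python
EOF = -1  # hypothetical end-of-chain marker

# fat[i] holds the number of the cluster that follows cluster i.
fat = {4: 9, 9: 2, 2: 7, 7: EOF}


def nth_cluster(fat, first, n):
    """Follow the chain in memory; no disk reads are needed."""
    cluster = first
    for _ in range(n):
        cluster = fat[cluster]
        if cluster == EOF:
            raise IndexError("past end of file")
    return cluster


def fat_entries(volume_bytes, cluster_bytes):
    """Number of table entries needed to cover a volume."""
    return volume_bytes // cluster_bytes


def wasted_bytes(file_bytes, cluster_bytes):
    """Unused space in the last cluster of a file."""
    return -file_bytes % cluster_bytes


print(nth_cluster(fat, 4, 3))                                  # 7
print(fat_entries(2**30, 4096), fat_entries(2**30, 32768))     # 262144 32768
print(wasted_bytes(1000, 4096), wasted_bytes(1000, 32768))     # 3096 31768
```

Going from 4 KiB to 32 KiB clusters on a 1 GiB volume cuts the table to one eighth of its size, but a 1000-byte file then wastes about ten times as much space.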
File Allocation Table
Bad Blocks
- Most hard disks have bad blocks, which can be avoided using a bad block table in the file system
- The bad block table points to spare blocks elsewhere on the disk that can be used instead of the bad blocks
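One way to picture a bad block table is as a remapping from bad block numbers to spares. The sketch below is purely illustrative: `BadBlockTable`, its methods, and the block numbers are hypothetical and do not describe any particular file system's layout.

```python
class BadBlockTable:
    """Maps each known bad block to a spare block used in its place."""

    def __init__(self, spare_blocks):
        self.spares = list(spare_blocks)  # unused spare block numbers
        self.remap = {}                   # bad block -> spare block

    def mark_bad(self, block):
        """Assign the next spare to a newly discovered bad block."""
        if block not in self.remap:
            # pop raises IndexError once the spares are exhausted
            self.remap[block] = self.spares.pop(0)

    def resolve(self, block):
        """Return the block that should actually be read or written."""
        return self.remap.get(block, block)


table = BadBlockTable(spare_blocks=[1000, 1001])
table.mark_bad(17)
print(table.resolve(17))  # 1000
print(table.resolve(18))  # 18
```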
Crash Recovery
- Consistency checking
  - If the system crashes before all of the modified blocks have been written, the file system can be left inconsistent
  - File system utilities can often resolve the inconsistencies
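One kind of check such a utility might perform is to verify that every block is either in exactly one file or on the free list, but never both. The sketch below assumes hypothetical in-memory structures (a mapping from file name to block list, and a free list); it illustrates the idea rather than what any specific tool does.

```python
def check_blocks(total_blocks, files, free_list):
    """Report blocks whose usage counts are inconsistent."""
    in_use = [0] * total_blocks
    free = [0] * total_blocks
    for blocks in files.values():
        for block in blocks:
            in_use[block] += 1
    for block in free_list:
        free[block] += 1

    problems = {}
    for block in range(total_blocks):
        if in_use[block] > 1:
            problems[block] = "in two files"
        elif in_use[block] and free[block]:
            problems[block] = "in use and free"
        elif free[block] > 1:
            problems[block] = "duplicate in free list"
        elif in_use[block] == 0 and free[block] == 0:
            problems[block] = "missing"
    return problems


report = check_blocks(
    total_blocks=6,
    files={"a": [0, 1], "b": [1, 2]},
    free_list=[2, 3, 3],
)
print(report)
```

Here block 0 is consistent, block 1 belongs to two files, block 2 is both used and free, block 3 appears twice in the free list, and blocks 4 and 5 are accounted for nowhere.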
Microsoft FAT file systems
The MS-DOS FAT File System (1) Directory entry
The MS-DOS File System (2) Partition (entire file system) and Cluster (Block) Size
The Windows 98 File System (1) Extended MS-DOS directory entry
The Windows 98 File System (2) Long name entry
The Windows 98 File System (3) A long file name in storage
Reference http://en.wikipedia.org/wiki/fat32#fat32 Lots of detail
NTFS
NTFS
- Windows NT file system
- More secure than FAT
  - ACLs
- Scales well to large disks
  - Cluster size depends on disk size
  - 64-bit file pointers
  - Can address up to 16 exabytes of disk
- Multiple data streams
- Compression and encryption
- Journal
File System Structure
- Each NTFS volume (e.g., disk partition) contains files, directories, bitmaps, and other data structures
- Each volume is organized as a linear sequence of blocks (called clusters), usually 4 KB in size (can be 512 bytes to 64 KB), addressed by 64-bit pointers
- The main data structure in each volume is the MFT (Master File Table), which is a linear sequence of 1 KB records
NTFS Master File Table (1)
- Each MFT record describes one file or directory and contains the file's attributes
- The MFT is itself a file and can be placed anywhere within the volume (eliminating the problem of defective sectors in the first track)
- The starting address of the MFT file is stored in the boot sector, at offset 30h bytes from its beginning
- The first 16 MFT records are reserved for NTFS metadata files, which contain system data describing the volume
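Reading the MFT location out of a boot sector might look like the sketch below. Only the 30h offset comes from the slide. The other field positions (bytes per sector at 0Bh, sectors per cluster at 0Dh) and the interpretation of the stored value as a little-endian 64-bit cluster number are assumptions based on the commonly documented boot sector layout, and the sketch ignores special encodings used for very large clusters. The boot sector here is synthetic, built only for the example.

```python
import struct

MFT_CLUSTER_OFFSET = 0x30  # offset given on the slide
BPB_OFFSET = 0x0B          # assumed: bytes/sector (2 bytes), sectors/cluster (1 byte)


def mft_byte_offset(boot_sector: bytes) -> int:
    """Return the byte position of the MFT within the volume."""
    bytes_per_sector, sectors_per_cluster = struct.unpack_from(
        "<HB", boot_sector, BPB_OFFSET
    )
    (mft_cluster,) = struct.unpack_from("<Q", boot_sector, MFT_CLUSTER_OFFSET)
    return mft_cluster * sectors_per_cluster * bytes_per_sector


# Build a synthetic boot sector: 512-byte sectors, 8 sectors per cluster,
# MFT starting at cluster 786432.
sector = bytearray(512)
struct.pack_into("<HB", sector, BPB_OFFSET, 512, 8)
struct.pack_into("<Q", sector, MFT_CLUSTER_OFFSET, 786432)

offset = mft_byte_offset(bytes(sector))
print(offset, offset // 2**30, "GiB")  # 3221225472 3 GiB
```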
NTFS Master File Table (2)
Attributes Used in MFT Records
- Each record consists of a sequence of (attribute header, value) pairs, where the header gives the attribute's name and length
- If an attribute is small it is kept in the record; if it is long it is put in another block on disk and pointed to from the record
MFT Record for A File
An MFT Record for A Small Directory
File Compression
- Transforms a file to take less space on disk
- Lempel-Ziv compression algorithm
- Transparent
  - Applications access files using standard API calls
  - The system compresses and decompresses files
  - Applications are unaware whether a file is compressed
- The compression algorithm considers 16 consecutive blocks at a time
  - If the compressed form takes fewer than 16 blocks, compression is applied; otherwise the blocks are stored uncompressed
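The per-unit decision can be sketched as follows. This example substitutes Python's `zlib` (DEFLATE) for the NTFS compressor, so it shows only the "keep the compressed form if it saves at least one block" rule, not the real algorithm or on-disk format. The 4096-byte block size and the function names are assumptions.

```python
import random
import zlib

BLOCK = 4096      # assumed block (cluster) size in bytes
UNIT_BLOCKS = 16  # blocks considered together, as on the slide


def blocks_needed(n_bytes):
    """Whole blocks required to hold n_bytes."""
    return -(-n_bytes // BLOCK)


def store_unit(unit: bytes):
    """Return ("compressed", data) if that saves a block, else ("raw", data)."""
    packed = zlib.compress(unit)  # stand-in for the real compressor
    if blocks_needed(len(packed)) < UNIT_BLOCKS:
        return "compressed", packed
    return "raw", unit


rng = random.Random(0)
text_like = b"A" * (UNIT_BLOCKS * BLOCK)
noise = bytes(rng.getrandbits(8) for _ in range(UNIT_BLOCKS * BLOCK))

kind_text, stored_text = store_unit(text_like)
kind_noise, stored_noise = store_unit(noise)
print(kind_text, blocks_needed(len(stored_text)))    # compressed 1
print(kind_noise, blocks_needed(len(stored_noise)))  # raw 16
```

Highly repetitive data shrinks to a single block, while random data does not compress and is stored as is.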
File Encryption
- Protects files from illicit access
- Encryption performed in compression units
- Keys
  - Public key / private key encryption is used to encrypt copies of the key
  - Keys stored in X.509 certificates
  - Recovery key given to the system administrator, in case the user forgets the password
  - Encrypted versions of keys stored on disk
  - Decrypted keys stored in non-paged pool
NTFS Log
- NTFS is a journaling file system
  - Changes are stored in a reliable log first, then applied to disk
  - After a crash, changes can be reconstructed from the log
- The NTFS journal is logical (not physical)
  - It contains only changes to metadata
- The USN (update sequence number) journal can be enabled to track all changes
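The log-first idea can be illustrated with a generic redo log. This is a sketch of the general technique, not the NTFS log format: the class, the record tuples, and the commit marker are all invented for the example, and plain Python lists and dictionaries stand in for on-disk structures.

```python
class JournaledMetadata:
    """Toy metadata store that logs changes before applying them."""

    def __init__(self):
        self.log = []       # stands in for the on-disk journal
        self.metadata = {}  # stands in for the on-disk metadata

    def log_changes(self, txid, changes):
        """Append the intended changes to the journal."""
        for key, value in changes.items():
            self.log.append(("change", txid, key, value))

    def commit(self, txid):
        """Mark the transaction as complete in the journal."""
        self.log.append(("commit", txid))

    def recover(self):
        """After a crash, redo every committed transaction in the log."""
        committed = {rec[1] for rec in self.log if rec[0] == "commit"}
        for rec in self.log:
            if rec[0] == "change" and rec[1] in committed:
                _, _, key, value = rec
                self.metadata[key] = value


fs = JournaledMetadata()
fs.log_changes(1, {"a.txt": "size=10"})
fs.commit(1)
fs.log_changes(2, {"b.txt": "size=99"})  # crash happens before commit(2)

fs.recover()
print(fs.metadata)  # {'a.txt': 'size=10'}
```

The committed change is reconstructed from the log, while the half-finished transaction leaves no trace in the metadata.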
More NTFS Features
- Volume Shadow Copy
  - Copy-on-write keeps a before-image journal
  - Allows a file to be reverted to a previous state
- Transactional NTFS
  - Users can define multi-operation transactions
  - A technique similar to copy-on-write allows backout of partially completed transactions
- Quotas
  - Administrators can limit users' use of disk space
Links
- Symbolic links
  - Like an alias or bookmark to a file or directory
  - The symbolic name link points to the file's real name
- Hard links
  - Two directory entries (hence two names) for the same file
- Volume mount points
  - Declare a directory to be a mount point
  - Other volumes can be mounted there rather than at a drive letter
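The difference between the two link types can be observed with Python's standard `os` module. This is a small sketch using a temporary directory and made-up file names. It assumes a platform and account that permit creating both link types (on Windows, creating symbolic links may require extra privileges).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "report.txt")
    hard = os.path.join(folder, "hard.txt")
    soft = os.path.join(folder, "soft.txt")

    with open(target, "w") as handle:
        handle.write("hello")

    os.link(target, hard)     # second directory entry for the same file
    os.symlink(target, soft)  # separate entry that stores the target's name

    same_file = os.path.samefile(target, hard)
    link_count = os.stat(target).st_nlink
    soft_is_link = os.path.islink(soft)

    os.remove(target)  # remove the original name

    hard_survives = os.path.exists(hard)      # the data is still reachable
    soft_dangles = not os.path.exists(soft)   # the stored name no longer resolves

print(same_file, link_count, soft_is_link, hard_survives, soft_dangles)
```

Removing the original name leaves the hard link usable because it refers to the file itself, while the symbolic link is left pointing at a name that no longer exists.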
Links
- Directory junctions
  - Similar to mount points, but mount a directory from the same file system rather than another volume
- Single instance storage
  - Two identical files are linked and stored only once
ADS, Sparse Files
- Alternate Data Streams
  - Rarely used
  - Introduced so Services for Macintosh could support Mac resource forks
  - filename:streamname refers to the alternate stream
  - Sketchy support
  - Have been used to store malware
- Sparse files
  - Files with lots of unused segments
  - Unused segments aren't stored; the file system returns zeros when they are read
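The sparse file behaviour can be modelled with a map that stores only the blocks that were actually written. The sketch below is an illustration, not how NTFS records sparse ranges: the `SparseFile` class, its whole-block interface, and the 4096-byte block size are assumptions.

```python
class SparseFile:
    """Toy file that stores only written blocks; holes read back as zeros."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}  # block index -> bytes; a missing index is a hole
        self.size = 0     # logical file size in bytes

    def write_block(self, index, data):
        if len(data) != self.block_size:
            raise ValueError("this sketch writes whole blocks only")
        self.blocks[index] = data
        self.size = max(self.size, (index + 1) * self.block_size)

    def read_block(self, index):
        if index * self.block_size >= self.size:
            raise IndexError("read past end of file")
        return self.blocks.get(index, bytes(self.block_size))

    def stored_bytes(self):
        """Space actually consumed by written blocks."""
        return len(self.blocks) * self.block_size


sparse = SparseFile()
sparse.write_block(0, b"x" * 4096)
sparse.write_block(1000, b"y" * 4096)

print(sparse.size)            # 4100096 bytes of logical size
print(sparse.stored_bytes())  # 8192 bytes actually stored
print(sparse.read_block(500) == bytes(4096))  # True: the hole reads as zeros
```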
References Old but official http://technet.microsoft.com/en-us/library/cc758691%2 http://technet.microsoft.com/en-us/library/cc781134%28 Comprehensive http://en.wikipedia.org/wiki/ntfs