Microsoft File System Instructor: Chia-Tsun Wu. 11/25/2004 ACCESS IC LAB
Outline
- Types of File System
- Principles of File System
- Microsoft File System
  - General Comments
  - Boot Sector and BPB
  - Boot Sector and BPB Structure
  - FAT Data Structure
- Lab
Types of File System
- DOS FAT 12/16/32, VFAT
- High Performance File System (HPFS)
- New Technology File System (NTFS)
- Extended filesystems (Ext, Ext2, Ext3)
- Macintosh Hierarchical Filesystem (HFS)
- ISO 9660 (CD-ROM filesystem)
Other filesystems
- ADFS - Acorn Disc File System
- AFFS - Amiga fast filesystem
- BeFS - BeOS filesystem
- BFS - UnixWare Boot Filesystem
- CrosStor filesystem
- DTFS - Desktop filesystem
- EFS - Enhanced filesystem (Linux)
- EFS - Extent filesystem (IRIX)
- FFS - BSD Fast filesystem
- GPFS - General Parallel Filesystem
- HFS - HP-UX Hi performance filesystem
- HTFS - High throughput filesystem
- LFS - Linux log structured filesystem
- JFS - Journaled filesystem (HP-UX, AIX, OS/2 5, Linux)
- MFS - Macintosh filesystem
- Minix filesystem
- NWFS - Novell NetWare filesystem
- NSS - Novell Storage Services
- ODS - On Disk Structure filesystem
- QNX filesystem
- Reiser filesystem
- RFS (CD-ROM Filesystem)
- RomFS - Rom filesystem
- SFS - Secure filesystem
- Spiralog filesystem (OpenVMS)
- System V and derived filesystems
- Text - (Philips' CD-ROM Filesystem)
- UDF - Universal Disk Format (DVD-ROM filesystem)
- UFS
- V7 Filesystem
- VxFS - Veritas filesystem (HP-UX, SCO UnixWare, Solaris)
- XFS - Extended filesystem (IRIX)
- Xia FS
File-System Structure
Introduction
- File structure
  - Logical storage unit
  - Collection of related information
- The file system resides on secondary storage (disks)
- The file system is organized into layers
- File control block (FCB): storage structure consisting of information about a file
  - Ownership, permissions, and location of the file content
- I/O transfers between memory and disk are performed in units of blocks (one or more sectors)
Layered File System
Layered File System (Cont.)
- I/O control: device drivers and interrupt handlers
  - Transfer information between main memory and the disk system
  - Translate a request such as "retrieve block 123" into hardware-specific instructions
- Basic file system
  - Issues generic commands to the device driver to read and write physical blocks on the disk
  - Example of a physical block address: drive 1, cylinder 73, track 2, sector 10
Layered File System (Cont.)
- File-organization module
  - Knows about files, their logical blocks, and their physical blocks
  - Translates logical blocks to physical blocks (similar to virtual memory)
  - Logical blocks are numbered 0 to N
  - Free-space manager: block allocation
- Logical file system
  - Manages metadata information
  - Metadata: the file-system structure, excluding the actual file contents
  - Manages the directory structure via file control blocks (FCB)
Layered File System (Cont.)
- Why a layered file system?
  - All the advantages of the layered approach
  - File system standards: UFS, FAT, FAT32, NTFS
  - Duplication of code is minimized across different file system standards
  - Usually the I/O control and basic file system code can be shared by multiple file system formats
A Typical FCB
File System Implementation
On-Disk Structures
- Boot control block: information needed by the system to boot an OS from that partition
  - UFS: boot block; NTFS: partition boot sector
- Partition control block: partition details
  - Number of blocks, block size, free-block count and free-block pointers, free-FCB count and FCB pointers
  - UFS: superblock; NTFS: Master File Table
- A directory structure is used to organize the files
- File control block: many of the file's details
  - File permissions, ownership, size, location of the data blocks
  - UFS: inode; NTFS: stored within the Master File Table
In-Memory Structures
- An in-memory partition table containing information about each mounted partition
- An in-memory directory structure that holds the directory information of recently accessed directories
- The system-wide open-file table (Chapter 11)
- The per-process open-file table (Chapter 11)
- These structures cache information so that it does not have to be retrieved from the disk every time
In-Memory File-System Structures (figure panels: File Open, File Read)
Virtual File Systems
- Virtual File Systems (VFS) provide an object-oriented way of implementing file systems
- VFS separates file-system-generic operations from their implementation by defining a clean VFS interface
- VFS allows the same system call interface (the API) to be used for different types of file systems
- VFS is based on a file-representation structure, called a vnode, that contains a numerical designator for a network-wide unique file
- The API is to the VFS interface, rather than to any specific type of file system
Schematic View of Virtual File System (figure; labeled operations: open, read, write)
Directory Implementation
- Linear list of file names with pointers to the data blocks
  - Simple to program
  - Time-consuming to execute: finding a particular entry requires a linear search
  - A cache and a sorted list may help
- Hash table: a linear list with a hash data structure
  - Decreases directory search time
  - Collisions: situations where two file names hash to the same location
  - Drawbacks: fixed size, and the dependence of the hash function on that size
Allocation Methods
How to allocate space to files so that disk space is utilized effectively and files can be accessed quickly
Contiguous Allocation
- A file occupies a set of contiguous blocks on disk
  - Only the starting block (block #) and length (number of blocks) are required in the directory entry (FCB)
  - Fast: minimal seek time and head movement
  - Random access to any block within the file
- Similar to the dynamic storage-allocation problem
  - External fragmentation; may need compaction
- Files are difficult to grow
  - Find a larger hole and copy the file to the new space
Contiguous Allocation (Cont.)
Extent-Based Systems
- Many newer file systems (e.g. Veritas File System) use a modified contiguous allocation scheme
- Extent-based file systems allocate disk blocks in extents
- An extent is a contiguous set of disk blocks
  - Extents are the unit of file allocation
  - A file consists of one or more extents
- Integrates contiguous allocation and linked allocation (see later)
Linked Allocation
- Each file is a linked list of disk blocks
  - Blocks may be scattered anywhere on the disk
  - The directory contains a pointer to the first and last blocks
  - Each block contains a pointer to the next block
- Advantages
  - No external fragmentation: any free block is OK
  - Easy to grow
- Disadvantages
  - Effective only for sequential-access files
  - Space required for the pointers
  - Reliability: what if the pointers are lost?
Linked Allocation (Cont.)
block = pointer + data
Linked Allocation (Cont.)
- Solution for the pointer space overhead
  - Collect blocks into clusters, and allocate clusters rather than blocks
  - Fewer disk head seeks, and less space needed for block allocation and free-list management
  - Cost: internal fragmentation
- Solution for reliability
  - Use a doubly linked list, or store the file name and relative block number in each block
  - Cost: more overhead for each file
Linked Allocation (Cont.)
- FAT (File Allocation Table): used by OS/2 and MS-DOS
  - The table has one entry for each disk block and is indexed by block number
  - Works like the linked list: each entry contains the block number of the next block in the file
- Can cause a significant number of disk head seeks: one for the FAT, one for the data
  - Improved by caching the FAT
- Random access time is improved, because the chain is followed in the FAT rather than through the data blocks
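The FAT lookup described above can be sketched in C. This is only an illustration, assuming the whole FAT is cached in memory as an array of 32-bit entries; the function name `fat_nth_block` and the end-of-chain value `FAT_EOC` are hypothetical and are not taken from any real FAT variant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAT_EOC 0xFFFFFFFFu  /* hypothetical end-of-chain marker for this sketch */

/* Return the disk block that holds logical block `n` of a file whose first
 * block is `first`, or FAT_EOC if the chain ends (or is corrupt) before that. */
uint32_t fat_nth_block(const uint32_t *fat, size_t fat_len,
                       uint32_t first, uint32_t n)
{
    assert(fat != NULL);
    uint32_t block = first;
    while (n > 0 && block != FAT_EOC) {
        if (block >= fat_len)
            return FAT_EOC;     /* entry points outside the table */
        block = fat[block];     /* one table lookup per hop, no data-block read */
        n--;
    }
    if (block != FAT_EOC && block >= fat_len)
        return FAT_EOC;
    return block;
}
```

Reaching block n still takes n hops, but every hop is a memory lookup in the cached table, which is why random access improves compared with plain linked allocation.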
Indexed Allocation
- Bring all pointers together into the index block
  - An array of disk-block addresses
  - The ith entry points to the ith block of the file
- The directory contains the address of the index block
- Similar to the paging scheme for memory management
Example of Indexed Allocation
Indexed Allocation (Cont.)
- Advantages
  - Supports random access
  - Dynamic access without external fragmentation
  - No size-declaration problem
- Disadvantages
  - Overhead of the index block: an index table is needed
  - Wasted space: worse than linked allocation for small files
- How large should the index block be?
  - Large index block: wastes space for small files
  - Small index block: how to handle large files?
Indexed Allocation (Cont.)
- Mechanisms for handling the index block
  - Linked scheme: link together several index blocks
  - Multilevel index: like multi-level paging
    - With 4096-byte blocks, an index block can store 1024 4-byte pointers
    - Two levels of indexes allow 1,048,576 data blocks, which allows a file of up to 4 gigabytes
  - Combined scheme: for example, the BSD UNIX system
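The 4-gigabyte figure can be checked with a few lines of C. This is a minimal sketch of the arithmetic only; `max_file_bytes` is a hypothetical helper, and it assumes every level of the index is a full block of fixed-size pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Largest file, in bytes, reachable through `levels` levels of index blocks. */
uint64_t max_file_bytes(uint32_t block_size, uint32_t ptr_size, unsigned levels)
{
    assert(ptr_size > 0 && block_size >= ptr_size);
    uint64_t ptrs_per_block = block_size / ptr_size;  /* 4096 / 4 = 1024 */
    uint64_t data_blocks = 1;
    for (unsigned i = 0; i < levels; i++)
        data_blocks *= ptrs_per_block;                /* 1024^levels data blocks */
    return data_blocks * block_size;
}
```

With 4096-byte blocks and 4-byte pointers, `max_file_bytes(4096, 4, 2)` is 1024 * 1024 * 4096 = 2^32 bytes, the 4 gigabytes quoted on the slide.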
Indexed Allocation (Cont.): Multilevel Index
Combined Scheme: UNIX (4K bytes per block)
- The UNIX inode
- How large can a file be, if each pointer in the index blocks is 4 bytes?
Free Space Management
Bit Vector
- One bit per block, for blocks 0, 1, 2, ..., n-1
  - bit[i] = 0: block[i] is free
  - bit[i] = 1: block[i] is occupied
- Simple and efficient to find the first free block, or consecutive free blocks, by bit manipulation
- Requires extra space
  - block size = 2^12 bytes
  - disk size = 2^30 bytes
  - n = 2^30 / 2^12 = 2^18 bits (or 32K bytes)
- Efficient only when the entire vector is kept in main memory
  - Write it back to the disk occasionally for recovery needs
- Example: 001111001111100011000011100
  - Question: what is the block number of the first free block?
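A possible C sketch of the "find the first free block" search follows. It keeps the slide's convention (0 = free, 1 = occupied); the bit order within a byte (block i is bit i % 8 of byte i / 8, least significant bit first) and the name `first_free_block` are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the number of the first free block, or -1 if every block is occupied.
 * Assumed layout: block i is bit (i % 8) of byte (i / 8); 0 = free. */
long first_free_block(const uint8_t *map, size_t nblocks)
{
    assert(map != NULL);
    for (size_t byte = 0; byte * 8 < nblocks; byte++) {
        if (map[byte] == 0xFF)
            continue;                     /* all 8 blocks occupied: skip the byte */
        for (size_t bit = 0; bit < 8; bit++) {
            size_t i = byte * 8 + bit;
            if (i >= nblocks)
                break;                    /* padding bits past the last block */
            if (((map[byte] >> bit) & 1u) == 0)
                return (long)i;
        }
    }
    return -1;
}
```

Skipping fully occupied bytes (or whole machine words) in one comparison is what makes the bit vector fast in practice.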
Linked List
- Link together all free blocks
  - Keep a pointer to the first free block in a special location on the disk, and cache it in memory
- Cannot get contiguous space easily
- No waste of space
- Not efficient: the disk has to be traversed to find free space
  - Usually, however, the OS needs only one free block at a time
- FAT incorporates the linked-list mechanism
Grouping and Counting
- Grouping: store the addresses of n free blocks in the first free block
  - The first n-1 are actually free
  - The final block contains the addresses of another n free blocks
- Counting: each entry has a disk address and a count
  - Several contiguous blocks may be allocated or freed simultaneously
Example of Free-Space Management
- Bit vector (0 = free, 1 = occupied): 11000011000000111001111110001111
- Grouping
  - Block 2: 3, 4, 5
  - Block 5: 8, 9, 10
  - Block 10: 11, 12, 13
  - Block 13: 17, 18, 25
  - Block 25: 26, 27
- Counting (first free block, number of free blocks): (2, 4), (8, 6), (17, 2), (25, 3)
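The counting representation can be derived mechanically from the bit vector. The sketch below is one way to do it in C, assuming the vector is given as a text string of '0' (free) and '1' (occupied) characters as on the slide; `struct free_run` and `count_free_runs` are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct free_run { size_t start; size_t count; };

/* Record each run of free blocks ('0') as (start, count).
 * Returns the number of runs written to `out` (at most `max_runs`). */
size_t count_free_runs(const char *bits, struct free_run *out, size_t max_runs)
{
    assert(bits != NULL && out != NULL);
    size_t n = strlen(bits), runs = 0, i = 0;
    while (i < n && runs < max_runs) {
        if (bits[i] != '0') {       /* occupied block: keep scanning */
            i++;
            continue;
        }
        size_t start = i;
        while (i < n && bits[i] == '0')
            i++;                    /* extend the run of free blocks */
        out[runs].start = start;
        out[runs].count = i - start;
        runs++;
    }
    return runs;
}
```

Applied to the example vector, this yields the four pairs listed under Counting.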
Efficiency and Performance
Efficiency and Performance
- Efficiency depends on
  - Disk allocation and directory algorithms
  - Types of data kept in a file's directory entry
- Performance
  - On-board cache: local memory in the disk controller, able to store entire tracks at a time
  - Disk cache: separate section of main memory for frequently used blocks (LRU is a reasonable algorithm for block replacement)
  - Free-behind and read-ahead: techniques to optimize sequential access (they tune the disk cache's block replacement algorithm)
  - Improve PC performance by dedicating a section of memory as a virtual disk, or RAM disk
Various Disk-Caching Locations
Page Cache
- Non-unified buffer cache
  - A page cache caches pages rather than disk blocks, using virtual memory techniques
  - Memory-mapped I/O uses the page cache
  - Routine I/O through the file system uses the buffer (disk) cache
- Unified buffer cache
  - Uses the same buffer cache to cache both memory-mapped pages and ordinary file system I/O
I/O Without/With a Unified Buffer Cache
Recovery
- Consistency checker: compares data in the directory structure with data blocks on disk, and tries to fix inconsistencies
- Use system programs to back up data from disk to another storage device (floppy disk, magnetic tape)
- Recover a lost file or disk by restoring data from backup
Log Structured File Systems
- Log structured (or journaling) file systems record each update to the file system as a transaction
- All transactions are written to a log
  - A transaction is considered committed once it is written to the log
  - However, the file system may not yet be updated
- The transactions in the log are asynchronously written to the file system
  - When the file system is modified, the transaction is removed from the log
- If the file system crashes, all remaining transactions in the log must still be performed
Microsoft File System
General Comments
- All FAT file system on-disk data structures are little endian; you will have to translate them if your machine is big endian
- A FAT file system volume is composed of four basic regions, which are laid out in this order on the volume:
  0. Reserved Region
  1. FAT Region
  2. Root Directory Region (does not exist on FAT32 volumes)
  3. File and Directory Data Region
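One portable way to handle the endianness issue is to assemble multi-byte fields from individual bytes instead of casting the sector buffer to a struct. The helpers below are a small sketch of that idea; the names `rd_le16` and `rd_le32` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit little-endian value from a byte buffer, on any host. */
uint16_t rd_le16(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)((uint16_t)p[0] | ((uint16_t)p[1] << 8));
}

/* Read a 32-bit little-endian value from a byte buffer, on any host. */
uint32_t rd_le32(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Because the result is built with shifts, the same code gives the right answer on little-endian and big-endian machines, and it also avoids unaligned accesses into the sector buffer.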
Boot Sector and BPB
- The BPB (BIOS Parameter Block) is located in the first sector of the volume, in the Reserved Region
  - This sector is also known as the boot sector, the reserved sector, or the 0th sector
- There is no BPB in MS-DOS 1.x
- The BPB defined for MS-DOS 2.x only allowed a FAT volume with strictly fewer than 65,536 sectors (32 MB worth of 512-byte sectors), because the total sectors field was 16 bits wide
- MS-DOS 3.x modified the BPB to include a new 32-bit field for the total sectors value
- FAT32 was introduced later, with Windows 95 OSR2, which changed the BPB again
Boot Sector and BPB Structure
    // On-disk layout is byte-packed: enable structure packing or read fields by offset.
    struct BootSector {
        u8  BS_jmpBoot[3];     // jump instruction to the boot code
        u8  BS_OEMName[8];     // name of OEM
        u16 BPB_BytsPerSec;    // bytes per sector: 512, 1024, 2048, or 4096
        u8  BPB_SecPerClus;    // sectors per cluster: 1, 2, 4, 8, 16, 32, 64, or 128
        u16 BPB_RsvdSecCnt;    // number of sectors in the Reserved Region, counted from sector 0
        u8  BPB_NumFATs;       // number of FATs, normally 2
        u16 BPB_RootEntCnt;    // count of 32-byte directory entries in the root directory
        u16 BPB_TotSec16;      // 16-bit total sector count of the volume
        u8  BPB_Media;         // media type
        u16 BPB_FATSz16;       // number of sectors occupied by one FAT (FAT12/FAT16)
        u16 BPB_SecPerTrk;     // sectors per track for interrupt 0x13
        u16 BPB_NumHeads;      // number of heads for interrupt 0x13
        u32 BPB_HiddSec;       // count of hidden sectors
        u32 BPB_TotSec32;      // 32-bit total sector count of the volume
    };
FAT12 and FAT16 Structure Starting at Offset 36
    struct Fat12_16 {
        u8  BS_DrvNum;         // drive number
        u8  BS_Reserved1;      // reserved
        u8  BS_BootSig;        // extended boot signature
        u32 BS_VolID;          // volume serial number
        u8  BS_VolLab[11];     // volume label
        u8  BS_FilSysType[8];  // "FAT12", "FAT16", or "FAT"
    };
FAT32 Structure Starting at Offset 36
    struct Fat32 {
        u32 BPB_FATSz32;       // number of sectors occupied by one FAT
        u16 BPB_ExtFlags;      // active-FAT and mirroring flags
        u8  BPB_FSVer[2];      // file system version
        u32 BPB_RootClus;      // first cluster of the root directory
        u16 BPB_FSInfo;        // sector number of the FSInfo structure
        u16 BPB_BkBootSec;     // sector number of the backup boot sector
        u8  BPB_Reserved[12];  // reserved
        u8  BS_DrvNum;         // drive number
        u8  BS_Reserved1;      // reserved
        u8  BS_BootSig;        // extended boot signature
        u32 BS_VolID;          // volume serial number
        u8  BS_VolLab[11];     // volume label
        u8  BS_FilSysType[8];  // file system type ("FAT32")
    };
FAT Data Structure
- The File Allocation Table (FAT) is a linked list that describes where a file is stored
- The FAT maps the data region of the volume by cluster number
- The first data cluster is cluster 2
- Number of sectors occupied by the root directory (rounded up):
      RootDirSectors = ((BPB_RootEntCnt * 32) + (BPB_BytsPerSec - 1)) / BPB_BytsPerSec;
- The start of the data region, the first sector of cluster 2:
      if (BPB_FATSz16 != 0)
          FATSz = BPB_FATSz16;
      else
          FATSz = BPB_FATSz32;
      FirstDataSector = BPB_RsvdSecCnt + (BPB_NumFATs * FATSz) + RootDirSectors;
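The layout formulas above translate directly into C. The sketch below assumes the BPB fields have already been read into a plain struct (`struct fat_geometry`, a name invented here); `first_sector_of_cluster` is an added helper that follows from cluster 2 being the first data cluster.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the BPB fields the formulas need. */
struct fat_geometry {
    uint16_t byts_per_sec;   /* BPB_BytsPerSec */
    uint8_t  sec_per_clus;   /* BPB_SecPerClus */
    uint16_t rsvd_sec_cnt;   /* BPB_RsvdSecCnt */
    uint8_t  num_fats;       /* BPB_NumFATs */
    uint16_t root_ent_cnt;   /* BPB_RootEntCnt (0 on FAT32) */
    uint16_t fat_sz16;       /* BPB_FATSz16 (0 on FAT32) */
    uint32_t fat_sz32;       /* BPB_FATSz32 */
};

/* Sectors occupied by the fixed root directory, rounded up. */
uint32_t root_dir_sectors(const struct fat_geometry *g)
{
    assert(g != NULL && g->byts_per_sec != 0);
    return ((uint32_t)g->root_ent_cnt * 32u + (g->byts_per_sec - 1u))
           / g->byts_per_sec;
}

/* First sector of the data region, i.e. the first sector of cluster 2. */
uint32_t first_data_sector(const struct fat_geometry *g)
{
    uint32_t fat_sz = (g->fat_sz16 != 0) ? g->fat_sz16 : g->fat_sz32;
    return g->rsvd_sec_cnt + g->num_fats * fat_sz + root_dir_sectors(g);
}

/* First sector of data cluster n (n >= 2). */
uint32_t first_sector_of_cluster(const struct fat_geometry *g, uint32_t n)
{
    assert(n >= 2);
    return (n - 2u) * g->sec_per_clus + first_data_sector(g);
}
```

On a FAT32 volume BPB_RootEntCnt is 0, so `root_dir_sectors` evaluates to 0 and the data region starts right after the last FAT.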
FAT32 FSInfo Sector Structure and Backup Boot Sector
- Size of the FAT data structure on FAT12/16/32
  - On a FAT32 volume, the FAT can be a large data structure
  - On FAT16 it is limited to a maximum of 128K worth of sectors
  - On FAT12 it is limited to a maximum of 6K worth of sectors
- For this reason, a provision is made to store the last known free cluster count on the FAT32 volume, so that it does not have to be recomputed from the FAT
- The FSInfo sector number is the value in the BPB_FSInfo field; for Microsoft operating systems it is always set to 1
- Here is the structure of the FSInfo sector:
    struct FSInfo {                // exactly one 512-byte sector
        u32 FSI_LeadSig;           // lead signature = 0x41615252
        u8  FSI_Reserved1[480];    // reserved (should not be used)
        u32 FSI_StrucSig;          // structure signature = 0x61417272
        u32 FSI_Free_Count;        // last known free cluster count
        u32 FSI_Nxt_Free;          // next free cluster pointer (hint)
        u8  FSI_Reserved2[12];     // reserved (should not be used)
        u32 FSI_TrailSig;          // trail signature = 0xAA550000, validates the FSInfo sector
    };
FAT32 FSInfo Sector Structure and Backup Boot Sector
- The Microsoft FAT32 boot sector is actually three 512-byte sectors long
  - There is a copy of all three of these sectors starting at the BPB_BkBootSec sector
  - The backup is a complete boot record, including the FSInfo sector
- BPB_BkBootSec is not present on FAT16/FAT12
  - FAT16/FAT12 volumes can be totally lost if the contents of sector 0 of the volume are overwritten, or if sector 0 goes bad and cannot be read
- The BPB_BkBootSec field reduces the severity of this problem for FAT32; no value other than 6 is recommended
  - When the sector 0 information has been accidentally overwritten, all a disk repair utility has to do is restore the boot sector(s) from the backup copy
  - When sector 0 goes bad, the backup allows the volume to be mounted so that the user can access data before replacing the disk
  - When sector 0 goes bad, check for backup boot sector(s) starting at sector 6 of the FAT32 volume
- NOTE: all three of these sectors have the 0xAA55 signature at sector offsets 510 and 511, just like the first boot sector does
- A FAT directory is a file composed of a linear list of 32-byte structures
- The root directory must always be present
- For FAT12 and FAT16 media, the root directory is located in a fixed location on the disk immediately following the last FAT, and is of a fixed size in sectors computed from the BPB_RootEntCnt value
  - The first sector of the root directory, as a sector number relative to the first sector of the FAT volume:
        FirstRootDirSecNum = BPB_RsvdSecCnt + (BPB_NumFATs * BPB_FATSz16);
- For FAT32, the root directory can be of variable size and is a cluster chain, just like any other directory
  - The first cluster of the root directory on a FAT32 volume is stored in BPB_RootClus
- Unlike other directories, the root directory on any FAT type does not have any date or time stamps, does not have a file name (other than the implied file name "\"), and does not contain "." and ".." files as the first two directory entries
- The only other special aspect of the root directory is that it is the only directory on the FAT volume for which it is valid to have a file that has only the ATTR_VOLUME_ID attribute bit set (see below)
FAT 32-Byte Directory Entry Structure
    struct DirectoryEntryStruct {
        u8  DIR_Name[11];       // short name (8-character name + 3-character extension)
        u8  DIR_Attr;           // file attributes:
                                //   ATTR_READ_ONLY  0x01
                                //   ATTR_HIDDEN     0x02
                                //   ATTR_SYSTEM     0x04
                                //   ATTR_VOLUME_ID  0x08
                                //   ATTR_DIRECTORY  0x10
                                //   ATTR_ARCHIVE    0x20
                                //   ATTR_LONG_NAME  = ATTR_READ_ONLY | ATTR_HIDDEN |
                                //                     ATTR_SYSTEM | ATTR_VOLUME_ID
        u8  DIR_NTRes;          // reserved for Windows NT
        u8  DIR_CrtTimeTenth;   // creation time, tenths of a second added to DIR_CrtTime
        u16 DIR_CrtTime;        // creation time, 2-second granularity
        u16 DIR_CrtDate;        // creation date
        u16 DIR_LstAccDate;     // last access date
        u16 DIR_FstClusHI;      // high word of this entry's first cluster number
        u16 DIR_WrtTime;        // last write time
        u16 DIR_WrtDate;        // last write date
        u16 DIR_FstClusLO;      // low word of this entry's first cluster number
        u32 DIR_FileSize;       // file size in bytes
    };
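As an illustration of how these fields are used, the C sketch below reads the first cluster number and tests the attribute byte from a raw 32-byte entry. The byte offsets (11, 20, 26) are derived from the field order above assuming no padding. `ATTR_LONG_NAME_MASK` and the function names are not on the slide; the mask follows the convention in Microsoft's FAT specification and should be treated as an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ATTR_READ_ONLY 0x01
#define ATTR_HIDDEN    0x02
#define ATTR_SYSTEM    0x04
#define ATTR_VOLUME_ID 0x08
#define ATTR_DIRECTORY 0x10
#define ATTR_ARCHIVE   0x20
#define ATTR_LONG_NAME (ATTR_READ_ONLY | ATTR_HIDDEN | ATTR_SYSTEM | ATTR_VOLUME_ID)
/* Assumed mask: all six attribute bits are compared when testing for a long name. */
#define ATTR_LONG_NAME_MASK (ATTR_LONG_NAME | ATTR_DIRECTORY | ATTR_ARCHIVE)

/* Combine DIR_FstClusHI (offset 20) and DIR_FstClusLO (offset 26),
 * both stored little endian, into one 32-bit cluster number. */
uint32_t dir_first_cluster(const uint8_t entry[32])
{
    assert(entry != NULL);
    uint32_t hi = (uint32_t)entry[20] | ((uint32_t)entry[21] << 8);
    uint32_t lo = (uint32_t)entry[26] | ((uint32_t)entry[27] << 8);
    return (hi << 16) | lo;
}

/* Nonzero if DIR_Attr (offset 11) marks a long-name entry. */
int dir_is_long_name(const uint8_t entry[32])
{
    assert(entry != NULL);
    return (entry[11] & ATTR_LONG_NAME_MASK) == ATTR_LONG_NAME;
}

/* Nonzero if the entry is an ordinary directory. */
int dir_is_directory(const uint8_t entry[32])
{
    return !dir_is_long_name(entry) && (entry[11] & ATTR_DIRECTORY) != 0;
}
```

Reading the two halves byte by byte keeps the code independent of host endianness and of struct packing.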
Directory Name
- Special notes about the first byte (DIR_Name[0]) of a FAT directory entry:
  - If DIR_Name[0] == 0xE5, the directory entry is free (there is no file or directory name in this entry)
  - If DIR_Name[0] == 0x00, the directory entry is free (same as for 0xE5), and there are no allocated directory entries after this one (all of the DIR_Name[0] bytes in all of the entries after this one are also set to 0)
    - The special 0 value, rather than 0xE5, tells FAT file system driver code that the rest of the entries in this directory do not need to be examined because they are all free
  - If DIR_Name[0] == 0x05, the actual file name character for this byte is 0xE5
    - 0xE5 is a valid KANJI lead byte value for the character set used in Japan
    - The special 0x05 value lets this file name case be handled properly without FAT file system code thinking that the entry is free
- The DIR_Name field is broken into two parts: the 8-character main part of the name, and the 3-character extension
  - Both parts are trailing-space padded with bytes of 0x20
  - DIR_Name[0] may not equal 0x20
- Lower case characters are not allowed in DIR_Name
- The following characters are not legal in any byte of DIR_Name:
  - Values less than 0x20, except for the special case of 0x05 in DIR_Name[0] described above
  - 0x22, 0x2A, 0x2B, 0x2C, 0x2E, 0x2F, 0x3A, 0x3B, 0x3C, 0x3D, 0x3E, 0x3F, 0x5B, 0x5C, 0x5D, and 0x7C
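The first-byte rules and the 8 + 3 layout can be sketched in C as follows. The enum and function names are invented for this example, and the conversion simply strips the trailing 0x20 padding and inserts a dot; it does not validate the characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum entry_state { ENTRY_IN_USE, ENTRY_FREE, ENTRY_FREE_LAST };

/* Classify a directory entry from DIR_Name[0]. */
enum entry_state classify_entry(const uint8_t name[11])
{
    assert(name != NULL);
    if (name[0] == 0x00) return ENTRY_FREE_LAST;  /* free, and nothing allocated after it */
    if (name[0] == 0xE5) return ENTRY_FREE;       /* free, keep scanning */
    return ENTRY_IN_USE;
}

/* Convert the padded 11-byte DIR_Name into "NAME.EXT".
 * `out` must hold at least 13 bytes: 8 + '.' + 3 + terminating NUL. */
void short_name_to_string(const uint8_t name[11], char out[13])
{
    assert(name != NULL && out != NULL);
    size_t base_len = 8, ext_len = 3, n = 0;
    while (base_len > 0 && name[base_len - 1] == 0x20)
        base_len--;                               /* drop trailing padding of the name */
    while (ext_len > 0 && name[8 + ext_len - 1] == 0x20)
        ext_len--;                                /* drop trailing padding of the extension */
    for (size_t i = 0; i < base_len; i++)
        out[n++] = (char)((i == 0 && name[0] == 0x05) ? 0xE5 : name[i]);
    if (ext_len > 0) {
        out[n++] = '.';
        for (size_t i = 0; i < ext_len; i++)
            out[n++] = (char)name[8 + i];
    }
    out[n] = '\0';
}
```

A directory scan would stop at the first ENTRY_FREE_LAST entry and skip ENTRY_FREE entries.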
Directory Name
- ATTR_VOLUME_ID
  - There should be only one file on the volume that has this attribute set, and that file must be in the root directory
  - DIR_FstClusHI and DIR_FstClusLO must always be 0 for the volume label (no data clusters are allocated to the volume label file)
- When a directory is created (a file with the ATTR_DIRECTORY bit set in its DIR_Attr field), set its DIR_FileSize to 0
  - DIR_FileSize is not used and is always 0 on a file with the ATTR_DIRECTORY attribute (directories are sized by simply following their cluster chains to the EOC mark)
  - One cluster is allocated to the directory (unless it is the root directory on a FAT16/FAT12 volume); set DIR_FstClusLO and DIR_FstClusHI to that cluster number and place an EOC mark in that cluster's entry in the FAT
  - Next, initialize all bytes of that cluster to 0
  - If the directory is the root directory, you are done (there are no dot or dotdot entries in the root directory)
Directory Name
- If the directory is not the root directory, you need to create two special entries in the first two 32-byte FAT directory entries of the directory (the first two 32-byte entries in the data region of the cluster you just allocated)
  - The first directory entry has DIR_Name set to "."
  - The second has DIR_Name set to ".."
  - These are called the dot and dotdot entries
- The DIR_FileSize field on both entries is set to 0, and all of the date and time fields in both entries are set to the same values as in the directory entry for the directory that you just created
- Set DIR_FstClusLO and DIR_FstClusHI for the dot entry (the first entry) to the same values you put in those fields for the directory's own directory entry (the cluster number of the cluster that contains the dot and dotdot entries)
- Finally, set DIR_FstClusLO and DIR_FstClusHI for the dotdot entry (the second entry) to the first cluster number of the directory in which you just created the directory (the value is 0 if that directory is the root directory, even for FAT32 volumes)
- Summary for the dot and dotdot entries:
  - The dot entry is a directory that points to itself
  - The dotdot entry points to the starting cluster of the parent of this directory (which is 0 if this directory's parent is the root directory)
Date and Time Formats
- Many FAT file systems do not support date/time fields other than DIR_WrtTime and DIR_WrtDate
  - For this reason, DIR_CrtTimeTenth, DIR_CrtTime, DIR_CrtDate, and DIR_LstAccDate are actually optional fields; set them to 0 if they are not supported
  - DIR_WrtTime and DIR_WrtDate must be supported
- Date format: a FAT directory entry date stamp is a 16-bit field that is basically a date relative to the MS-DOS epoch of 01/01/1980. Here is the format:
  - Bits 0-4: day of month, valid value range 1-31 inclusive
  - Bits 5-8: month of year, 1 = January, valid value range 1-12 inclusive
  - Bits 9-15: count of years from 1980, valid value range 0-127 inclusive (1980-2107)
- Time format: a FAT directory entry time stamp is a 16-bit field that has a granularity of 2 seconds. Here is the format:
  - Bits 0-4: 2-second count, valid value range 0-29 inclusive (0-58 seconds)
  - Bits 5-10: minutes, valid value range 0-59 inclusive
  - Bits 11-15: hours, valid value range 0-23 inclusive
  - The valid time range is from midnight 00:00:00 to 23:59:58
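The bit layouts above map onto a few shifts and masks. The following C sketch decodes a date/time pair and encodes a date; `struct fat_datetime`, `fat_decode`, and `fat_encode_date` are names chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

struct fat_datetime { unsigned year, month, day, hour, minute, second; };

/* Unpack the 16-bit FAT date and time stamps into calendar fields. */
struct fat_datetime fat_decode(uint16_t date, uint16_t time)
{
    struct fat_datetime t;
    t.day    = date & 0x1F;                   /* bits 0-4 */
    t.month  = (date >> 5) & 0x0F;            /* bits 5-8 */
    t.year   = 1980 + ((date >> 9) & 0x7F);   /* bits 9-15, years since 1980 */
    t.second = (time & 0x1F) * 2;             /* bits 0-4, 2-second units */
    t.minute = (time >> 5) & 0x3F;            /* bits 5-10 */
    t.hour   = (time >> 11) & 0x1F;           /* bits 11-15 */
    return t;
}

/* Pack a calendar date into the 16-bit FAT date stamp. */
uint16_t fat_encode_date(unsigned year, unsigned month, unsigned day)
{
    assert(year >= 1980 && year <= 2107);
    assert(month >= 1 && month <= 12 && day >= 1 && day <= 31);
    return (uint16_t)(((year - 1980) << 9) | (month << 5) | day);
}
```

For example, 25 November 2004 encodes as (24 << 9) | (11 << 5) | 25 = 0x3179, and the latest valid time, 23:59:58, is 0xBF7D.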
Lab
- Create a virtual disk from SDRAM
- Create a boot sector on the virtual disk
- Create two FATs and one directory on the virtual disk, compatible with the Microsoft FAT32 file system
- Provide Delete, Create, Write, and Read functions for the virtual disk, for use in the next lab
About Memory Allocation on Blackfin
- malloc()? malloc will allocate memory from internal memory. Try it:
      void *tt;
      tt = malloc(sizeof(char) * 32);
      printf("%p", tt);
- What do you find out?
- How about trying this instead?
      void *tt;
      char *BootSector[2], *FAT32[2];
      tt = (void *)(0x..);                    // point to a free SDRAM space
      BootSector[0] = (char *)tt;
      BootSector[1] = (char *)tt + 512 * 3;
      FAT32[0] = ;
About Memory Allocation on Blackfin
- How about trying to manage memory yourself, if you are an expert?
      #define malloc MyMalloc
      #define free   MyFree
      struct MemAllocTable { };
      struct MemFreeTable { };
      struct MemAllocTable *MatHead, *MatCurrent;
      struct MemFreeTable *MftHead, *MftCurrent;
      void InitMyMem(void) {
          MatHead = (struct MemAllocTable *)(0x.);   // point to a free SDRAM space
      }
      void *MyMalloc(int a) { }
      void MyFree(void *a) { }