UNIX File System
Filesystem structure Disk divided into one or more partitions independent file system on each partition Sector 0 contains the Master Boot Record (MBR) MBR contains partition table one partition marked as active boot block first block of active partition BIOS reads and executes MBR MBR reads boot block and executes it program in boot block loads OS and runs it Often file system contains superblock which contains key file system parameters 2
Example disk and filesystem layout Partition Table MBR partition 1 partition 2 (active) partition 3 boot block super block free space management inode list root dir files & dirs 3
File allocation on disk Disk is divided into blocks or sectors Files are stored on secondary storage in blocks or sectors Blocks are the unit of I/O transfer with secondary storage (on LINUX and UNIX, clusters on Windows) Blocks can be of fixed length or variable-length Need file allocation table (FAT) to keep track of files on disk Each file has a FAT entry 4
Disk blocks Q1 5
UNIX filesystem framework Provides persistent storage Provides facilities for managing data Interface includes exported abstractions: files, directories, file descriptors and file systems Kernel does not interpret file contents Files and directories form tree structure Q2 6
The Unix filesystem Powerful, elegant filesystem from a small number of mechanisms Support these types of files Ordinary File: An unstructured sequence of bytes Directory: A set of files (even other directories) Special File: An I/O device (Normally in /dev/) Names Pipe: or FIFO, used for IPC Referred to as: UFS FFS: Berkeley Fast File System Q3 7
The Unix filesystem filesystem think of everything as being a file Directories are really nothing more than files containing the names of other files The filesystem is even used to represent physical devices such as tty lines or even disk and tape units. Refs: http://userpages.umbc.edu/~jack/ifsm498d/filesystem. html http://www2.lib.uchicago.edu/~keith//tclcourse/topics/unix-files.html 8
File and directory organization (hard) links / bin etc dev usr vmunix sh local etc /usr/local/bin/bash bin bash 9
File attributes Type - for example regular, FIFO, special Link count ( or reference count) Size in bytes Device id Ownership Access modes Timestamps 10
Inode (index node) structure Q4 11
Inode 128 bytes (64 bytes in older systems) each Statically allocated Root inode (inode 2) is the root (/) of the filesystem 12
i-node: 13
User view of files File Descriptors (open, dup, dup2, fork) All I/O is through file descriptors references the open file object per process object File Object - holds context created by an open() system call stores file offset reference to vnode vnode - internal representation of an opened file Q5 14
vnode struct typedef struct vnode { kmutex_t v_lock; /* protects vnode fields */ u_short v_flag; /* vnode flags */ u_long v_count; /* reference (link) count */ struct vfs *v_vfsmountedhere; /* ptr to vfs mounted here */ struct vnodeops *v_op; /* vnode operations */ struct vfs *v_vfsp; /* ptr to containing VFS */ struct stdata *v_stream; /* associated stream */ struct page *v_pages; /* vnode pages list */ enum vtype v_type; /* vnode type */ dev_t v_rdev; /* device (VCHR, VBLK) */ caddr_t v_data; /* private data for fs */ struct filock *v_filocks; /* ptr to filock list */ kcondvar_t v_cv; /* synchronize locking */ } vnode_t; http://www.am-utils.org/docs/zadok-thesis-proposal/node80.html 15
How it works File Descriptors {{0, uf_ofile} {1, uf_ofile} {2, uf_ofile} {3, uf_ofile} {4, uf_ofile} {5, uf_ofile}} Open File Objects {*f_vnode,f_offset,f_count,...}, {*f_vnode,f_offset,f_count,...}, {*f_vnode,f_offset,f_count,...}, {*f_vnode,f_offset,f_count,...}, {*f_vnode,f_offset,f_count,...}} Vnode/vfs Vnode/vfs In-memory Vnode/vfs In-memory Vnode/vfs representation In-memory Vnode/vfs representation In-memory of file representation In-memory of file representation of file representation of file of file 16
File systems File hierarchy composed of one or more file systems One file system is designated the Root File System Attached to mount points File can not span multiple file systems resides on one logical disk 17
Logical disks Viewed as linear sequence of fixed sized, randomly accessible blocks. A file system must reside in a logical disk, however a logical disk need not contain a file system. Typically physical disks divided into partitions that correspond to logical disks 18
BSD Unix space allocation Disk partition divided into cylinder groups superblocks restructured and replicated across partition Constant information cylinder group contains summary info such as free inodes and free block Support block fragments Long file names new disk block allocation strategy 19
BSD Unix space allocation Example: 4,096-byte blocks, 1,024-byte fragments (4096/1024) Block Map One associated with each cylinder group Records free fragments in the cylinder group Free fragments: 0-3, 4-5, 10-11 Bit Map **** **11 11** 1111 Fragment Numbers 0-3 4-7 8-11 12-15 Block Numbers 0 1 2 3 Q6 20
FFS allocation strategy Goal: Collocate similar data/info. file inodes located in same cyl group as dir. new dirs created in different cyl groups. Place file data blocks/inode in same cyl group - for size < 48K allocate sequential blocks at a rotationally optimal position. Choose cyl group with best free count Q7 21
Disk space management Block size disk utilization and performance dependent on file size. Disk space utilization with larger blocks you increase internal fragmentation 22
Free space management Bit vector (n blocks) 0 1 2 n-1 bit[i] = 0 block[i] free 1 block[i] occupied 23
Free space management Bit map requires extra space. Example: block size = 2 12 bytes (4096 Bytes) disk size = 2 34 bytes (16 GByte) n = 2 34 /2 12 = 2 22 bits (4Mbits=512 KBytes) Linked list (free blocks) Cannot get contiguous space easily No waste of space Grouping Q8, 9 24
Free space management Need to protect: Pointer to free list Bit map Must be kept on disk Copy in memory and disk may differ. Cannot allow for block[i] to have a situation where bit[i] = 1 in memory and bit[i] = 0 on disk. Solution: Set bit[i] = 1 in disk. Allocate block[i] Set bit[i] = 1 in memory 25
Free space management Every block is assigned an ID in sequential order In a reserved part of the disk, a list of the IDs of free blocks is maintained The ID size depends on the number of blocks on disk The number of blocks required to hold the IDs depends on how many blocks are free Q1026