File System Case Studies
Jin-Soo Kim (jinsookim@skku.edu)
Computer Systems Laboratory
Sungkyunkwan University
http://csl.skku.edu

Today's Topics
- The original UNIX file system
- FFS
- Ext2
- FAT

UNIX FS (1)
On-disk layout
  | Boot Block | Super Block | i-node List | Data Blocks |
- Boot block: stores boot code
- Super block
  - Basic info. of the file system
  - Head of the free lists of i-nodes and data blocks
- i-node list
  - Referenced by index into the i-node list
  - All i-nodes are the same size
- Data blocks
  - A data block belongs to only one file

UNIX FS (2)
i-node
- File metadata
- Pointers to the data blocks belonging to the file

UNIX FS (3)
Directory
- A directory is a special file.
- Array of <file name, i-node number>:

  File name   i-node number
  .           100
  ..          24
  foo         212
  bar         37

- The i-node number of the root directory is fixed or specified in the superblock.
- Different names can point to the same i-node number ("hard link").
  - Each i-node keeps track of the link count.

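The directory-as-array idea and the link count can be sketched in a few lines of Python. This is a toy in-memory model, not the on-disk format: the `Inode` class and the `lookup`, `link`, and `unlink` helpers are hypothetical names introduced only for illustration, and "." and ".." are left out for brevity.

```python
# Toy model of a UNIX directory: a list of (file name, i-node number) pairs.
# The Inode class and the helper names are illustrative assumptions.

class Inode:
    def __init__(self, number):
        self.number = number
        self.link_count = 0  # how many directory entries name this i-node


def lookup(directory, name):
    """Linear scan of the <file name, i-node number> array."""
    for entry_name, ino in directory:
        if entry_name == name:
            return ino
    return None


def link(directory, inodes, name, ino):
    """Add a directory entry for an existing i-node (a hard link)."""
    if lookup(directory, name) is not None:
        raise FileExistsError(name)
    directory.append((name, ino))
    inodes[ino].link_count += 1


def unlink(directory, inodes, name):
    """Remove one name; return True once no names refer to the i-node."""
    ino = lookup(directory, name)
    if ino is None:
        raise FileNotFoundError(name)
    directory.remove((name, ino))
    inodes[ino].link_count -= 1
    return inodes[ino].link_count == 0


inodes = {212: Inode(212), 37: Inode(37)}
directory = []
link(directory, inodes, "foo", 212)
link(directory, inodes, "bar", 37)
link(directory, inodes, "foo2", 212)  # a second name for i-node 212
```

In this sketch, removing "foo" leaves i-node 212 alive because "foo2" still refers to it; only removing the last name makes the file reclaimable.
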
FFS (1)
Fast file system (FFS)
- The original UNIX file system (1970s) was very simple and straightforwardly implemented.
  - Easy to implement and understand.
  - But very poor utilization of disk bandwidth (lots of seeking).
- The BSD UNIX developers redesigned the file system; the result is called FFS.
  - McKusick, Joy, Fabry, and Leffler (mid-1980s)
  - It is now the file system against which all other UNIX file systems are compared.
- The basic idea is to be aware of the disk structure.
  - Place related things on nearby cylinders to reduce seeks.
  - Improved disk utilization, decreased response time.

FFS (2)
Problems in the original UNIX file system
- In an aging file system, a file's blocks end up allocated randomly over the disk.
  - As the file system ages and fills, new blocks must come from the space freed when other files were deleted.
  - Deleted files are essentially randomly placed.
- i-nodes are allocated far from data blocks.
  - All i-nodes are at the beginning of the disk, far from the data.
  - Traversing file name paths and manipulating files and directories require going back and forth between i-nodes and data blocks.
- Files in a directory are typically not allocated consecutive i-node slots.
- The block size is small: 512 bytes.

FFS (3)
Cylinder groups
- BSD FFS addressed these problems using the notion of a cylinder group.
  - The disk is partitioned into groups of cylinders.
  - Data blocks from a file are all placed in the same cylinder group.
  - Files in the same directory are placed in the same cylinder group.
  - i-nodes for files are allocated in the same cylinder group as the file's data blocks.

Ext2 FS (1)
History
- Evolved from the Minix filesystem.
  - Block addresses are stored in 16-bit integers, so the maximal file system size is restricted to 64MB.
  - Directories contain fixed-size entries, and the maximal file name length was 14 characters.
- Virtual File System (VFS) is added.
- Extended Filesystem (Ext FS), 1992.
  - Added to Linux 0.96c.
  - Maximum file system size was 2GB, and the maximal file name length was 255 characters.
- Second Extended Filesystem (Ext2 FS), 1994.
- Evolved to the Ext3 file system (with journaling).

Ext2 FS (2)
Ext2 features
- Configurable block sizes (from 1KB to 4KB) depending on the expected average file size.
- Configurable number of i-nodes depending on the expected number of files.
- Partitions disk blocks into groups.
  - Lower average disk seek time.
- Preallocates disk data blocks to regular files.
  - Reduces file fragmentation.
- Fast symbolic links
  - If the pathname of the symbolic link is 60 bytes or less, it is stored in the i-node.
- Automatic consistency check at boot time.

Ext2 FS (3)
Disk layout
- Boot block
  - Reserved for the partition boot sector.
- Block group
  - Similar to the cylinder group in FFS.
  - All the block groups have the same size and are stored sequentially.

  | Boot Block | Block group 0 | ... | Block group n |

  Layout of each block group:

  Region               Size
  Super Block          1 block
  Group Descriptors    n blocks
  Data block Bitmap    1 block
  i-node Bitmap        1 block
  i-node Table         n blocks
  Data Blocks          n blocks

Ext2 FS (4)
Block group
- Superblock: stores file system metadata
  - Total number of i-nodes, file system size in blocks
  - Free blocks / i-nodes counter
  - Number of blocks / i-nodes per group, block size, etc.
- Group descriptor
  - Number of free blocks / i-nodes / directories in the group
  - Block number of the block / i-node bitmap, etc.
- Both the superblock and the group descriptors are duplicated in each block group.
  - Only those in block group 0 are used by the kernel.
  - fsck copies them into all other block groups.
  - When data corruption occurs, fsck uses the old copies to bring the file system back to a consistent state.

Ext2 FS (5)
Block group size
- The block bitmap must be stored in a single block.
  - In each block group, there can be at most 8 × b blocks, where b is the block size in bytes.
  - The smaller the block size, the larger the number of block groups.
- Example: 8GB Ext2 partition with 4KB block size
  - Each 4KB block bitmap describes 32K data blocks = 32K × 4KB = 128MB.
  - 8GB / 128MB = 64, so at most 64 block groups are needed.

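The arithmetic in this example can be reproduced with a short Python sketch. The function name `ext2_block_groups` is made up for illustration; the only rule it encodes is the one stated on the slide, namely that a one-block bitmap holds one bit per block in the group.

```python
KB, MB, GB = 1024, 1024 ** 2, 1024 ** 3


def ext2_block_groups(partition_bytes, block_size):
    """Return (blocks per group, bytes per group, number of groups)."""
    blocks_per_group = 8 * block_size            # one bitmap bit per block
    group_bytes = blocks_per_group * block_size
    total_blocks = partition_bytes // block_size
    groups = -(-total_blocks // blocks_per_group)  # ceiling division
    return blocks_per_group, group_bytes, groups


print(ext2_block_groups(8 * GB, 4 * KB))  # (32768, 134217728, 64)
print(ext2_block_groups(8 * GB, 1 * KB))  # (8192, 8388608, 1024)
```

The second call illustrates the remark that smaller blocks mean more block groups: with 1KB blocks each group covers only 8MB, so the same 8GB partition needs 1024 groups.
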
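Ext2 FS (6)
Directory structure
- Each entry holds: inode, record length, name length, file type, name (padded with \0).

  Offset   inode   record length   name length   file type   name
  0        21      12              1             2           .  \0 \0 \0
  12       22      12              2             2           .. \0 \0
  24       53      16              5             2           home1 \0 \0 \0
  40       67      28              3             2           usr \0
  52       0       16              7             1           oldfile \0    <deleted file>
  68       34      12              3             2           bin \0

- The deleted entry has its inode set to 0, and the record length of the preceding entry (usr) is enlarged from 12 to 28 so that it skips over the deleted entry.
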
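How the record length lets a reader skip a deleted entry can be shown with a small Python sketch that rebuilds the example block and walks it. The field widths (4-byte inode, 2-byte record length, 1-byte name length, 1-byte file type, little-endian) are assumed from the usual ext2 on-disk entry, and `pack_entry` and `walk` are hypothetical helper names.

```python
import struct

# Assumed header layout: inode (4), rec_len (2), name_len (1), file_type (1).
HEADER = struct.Struct("<IHBB")


def pack_entry(inode, rec_len, file_type, name):
    """Build one record; the name is NUL-padded to a multiple of 4 bytes."""
    raw = HEADER.pack(inode, rec_len, len(name), file_type) + name
    return raw.ljust(HEADER.size + (len(name) + 3) // 4 * 4, b"\0")


def walk(block):
    """Yield (offset, inode, name) for each record reachable via rec_len."""
    offset = 0
    while offset < len(block):
        inode, rec_len, name_len, _ftype = HEADER.unpack_from(block, offset)
        if rec_len < HEADER.size:
            break  # corrupt record: stop rather than loop forever
        if inode != 0:  # inode 0 marks an unused record
            start = offset + HEADER.size
            yield offset, inode, block[start:start + name_len].decode("ascii")
        offset += rec_len


block = b"".join([
    pack_entry(21, 12, 2, b"."),
    pack_entry(22, 12, 2, b".."),
    pack_entry(53, 16, 2, b"home1"),
    pack_entry(67, 28, 2, b"usr"),      # rec_len covers the deleted record
    pack_entry(0, 16, 1, b"oldfile"),   # deleted: inode set to 0
    pack_entry(34, 12, 2, b"bin"),
])

for offset, inode, name in walk(block):
    print(offset, inode, name)
```

The walk visits ".", "..", "home1", "usr", and "bin"; the bytes of "oldfile" are still in the block but are never reached because the "usr" record length jumps past them.
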
FAT FS (1)
FAT filesystem
- Used in MS-DOS based OSes
  - MS-DOS, Windows 3.1, 95, 98, ...
- Originally developed as a simple file system suitable for floppy disk drives less than 500KB in size.
- FAT stands for File Allocation Table.
  - Each FAT entry contains a pointer to a region on the disk.
- Currently there are three FAT file system types:
  - FAT12, FAT16, FAT32

FAT FS (2)
FAT filesystem organization
  | Boot Sector | FAT1 | FAT2 | Root Dir. | Files and Directories |
- All on-disk data structures of the FAT file system are little-endian.
- The data area is divided into clusters.
  - Used for subdirectories and files.
- The root directory region doesn't exist on FAT32.

FAT FS (3)
Boot sector
- The first sector on the disk.
- Contains the BPB (BIOS Parameter Block):
  - Sectors per cluster
  - The number of sectors on the volume
  - Volume label
  - The number of root directory entries
  - File system type (FAT12, FAT16, FAT32)
  - and many more.
- If the volume is bootable, the first sector also contains the code required to boot the OS.

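FAT FS (4)
FAT (File Allocation Table)
- Starts at sector 1 (after the boot sector).
- A singly linked list of the clusters of a file.
- Two copies are maintained so that corruption can be detected and repaired.
- Cached in memory.
- Unused / bad / reserved clusters are also marked.

  Cluster     10   11   12   13   14   15   16   17   18
  FAT entry   11   17   13   15    0   -1    0   18   -1

  Directory entries: foo starts at cluster 12, bar starts at cluster 10.
  (In this figure, 0 marks an unused cluster and -1 marks the end of a chain.)
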
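Following a file through the FAT is a plain linked-list traversal. The sketch below uses the entry values from this slide's figure; representing the FAT as a Python dict, the `END` and `FREE` constants, and the `cluster_chain` helper are illustrative choices, not the on-disk encoding.

```python
END = -1   # end-of-chain marker as drawn in the figure
FREE = 0   # unused cluster

# FAT entries for clusters 10..18, taken from the slide's example.
fat = {10: 11, 11: 17, 12: 13, 13: 15, 14: FREE,
       15: END, 16: FREE, 17: 18, 18: END}


def cluster_chain(fat, start):
    """Return the list of clusters of a file, given its first cluster."""
    chain, seen = [], set()
    cluster = start
    while cluster != END:
        if cluster in seen:
            raise ValueError("loop in cluster chain")
        if fat.get(cluster, FREE) == FREE:
            raise ValueError("chain runs into an unused cluster")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]  # the FAT entry points at the next cluster
    return chain


print("foo:", cluster_chain(fat, 12))  # foo: [12, 13, 15]
print("bar:", cluster_chain(fat, 10))  # bar: [10, 11, 17, 18]
```

Because reaching the n-th cluster of a file means walking n links, keeping the FAT cached in memory matters for random access.
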
FAT FS (5)
FAT12 example
- Each FAT12 entry is 12 bits.
  - When designed, space was tight.
  - Pack 2 entries into 3 bytes.
- 4096 possible clusters.
  - If a sector is 512 bytes and a cluster is 1 sector, it can represent 2MB of data.
- FAT12 entry values:

  Value           Meaning
  0               Unused cluster
  0xFF0-0xFF6     Reserved cluster
  0xFF7           Bad cluster
  0xFF8-0xFFF     End of cluster chain mark
  Other           Next cluster in file

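Packing two 12-bit entries into three bytes is easiest to see in code. The following Python sketch assumes the usual little-endian FAT12 packing (entry n starts at byte offset n + n/2, even entries take the low 12 bits and odd entries the high 12 bits of the 16-bit word found there); `fat12_get`, `fat12_set`, and `classify` are hypothetical helper names, and `classify` simply mirrors the value table on the slide.

```python
def fat12_get(fat, n):
    """Read 12-bit entry n from a packed FAT12 table (bytes or bytearray)."""
    offset = n + n // 2                            # n * 1.5, rounded down
    pair = fat[offset] | (fat[offset + 1] << 8)    # little-endian 16 bits
    return pair >> 4 if n & 1 else pair & 0x0FFF


def fat12_set(fat, n, value):
    """Write 12-bit entry n without disturbing its neighbour."""
    offset = n + n // 2
    if n & 1:  # odd entry: high nibble of first byte plus the whole second
        fat[offset] = (fat[offset] & 0x0F) | ((value << 4) & 0xF0)
        fat[offset + 1] = (value >> 4) & 0xFF
    else:      # even entry: whole first byte plus low nibble of the second
        fat[offset] = value & 0xFF
        fat[offset + 1] = (fat[offset + 1] & 0xF0) | ((value >> 8) & 0x0F)


def classify(value):
    """Interpret an entry value according to the table on the slide."""
    if value == 0:
        return "unused"
    if 0xFF0 <= value <= 0xFF6:
        return "reserved"
    if value == 0xFF7:
        return "bad"
    if 0xFF8 <= value <= 0xFFF:
        return "end of chain"
    return "next cluster"


table = bytearray(3)          # room for exactly two entries
fat12_set(table, 0, 0xABC)
fat12_set(table, 1, 0xDEF)
print(table.hex())            # bcfade
```

The two entries 0xABC and 0xDEF share the middle byte 0xFA: its low nibble belongs to entry 0 and its high nibble to entry 1.
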
FAT FS (6)
Maximum partition size allowed

  Block size   FAT-12   FAT-16    FAT-32
  0.5 KB       2 MB     -         -
  1 KB         4 MB     -         -
  2 KB         8 MB     128 MB    -
  4 KB         16 MB    256 MB    1 TB
  8 KB         -        512 MB    2 TB
  16 KB        -        1024 MB   2 TB
  32 KB        -        2048 MB   2 TB

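The filled-in cells of this table follow from multiplying the number of addressable clusters by the block size. The Python sketch below rests on two assumptions that are not stated on the slide: FAT32 uses only 28 of its 32 bits as a cluster number, and the 2 TB ceiling is modelled as a simple cap (commonly attributed to 32-bit sector counts with 512-byte sectors). The function name `max_partition_bytes` is made up, and the sketch ignores the few reserved entry values, as the table's round numbers do.

```python
KB, MB, TB = 2 ** 10, 2 ** 20, 2 ** 40


def max_partition_bytes(entry_bits, block_size, cap=2 * TB):
    """Upper bound on partition size for a given FAT type and block size."""
    # Assumption: FAT32 cluster numbers are 28 bits wide.
    usable_bits = 28 if entry_bits == 32 else entry_bits
    return min((2 ** usable_bits) * block_size, cap)


print(max_partition_bytes(12, KB // 2) // MB)   # 2    (MB)
print(max_partition_bytes(16, 32 * KB) // MB)   # 2048 (MB)
print(max_partition_bytes(32, 4 * KB) // TB)    # 1    (TB)
print(max_partition_bytes(32, 16 * KB) // TB)   # 2    (TB, capped)
```

The sketch does not explain the empty cells; it only reproduces the values that are present.
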
FAT FS (7)
Directories
- In FAT12 and FAT16, the root directory is fixed in length and always located at the start of the volume (after the FAT).
- FAT32 treats the root directory as just another cluster chain in the data area.
- A subdirectory is nothing but a regular file that has a special attribute indicating it is a directory.
  - No size restriction.
  - The contents of the file are a series of 32-byte FAT directory entries.
  - The first character of the filename is a usage indicator:
    - 0x00: never been used.
    - 0xE5: used before, but the entry has been released.

FAT FS (8)
FAT directory entry
[Figure: layout of a FAT directory entry; one field is annotated "(Last written)"]
- Attributes: Read Only, Hidden, System, Volume Label, Subdirectory, Archive

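A directory scan that honours the usage indicator and decodes the attribute bits might look like the Python sketch below. The byte offsets and attribute bit values are assumed from the commonly documented 32-byte short entry (they are not recoverable from the slide text), treating 0x00 as the end of the directory is likewise an assumption, and `parse_entries` is a hypothetical helper.

```python
import struct

# Assumed layout: name (8), extension (3), attributes (1), 8 bytes skipped,
# first cluster high (2), write time (2), write date (2),
# first cluster low (2), file size (4). Total: 32 bytes, little-endian.
ENTRY = struct.Struct("<8s3sB8xHHHHI")

# Assumed attribute bit assignments.
ATTRS = {0x01: "Read Only", 0x02: "Hidden", 0x04: "System",
         0x08: "Volume Label", 0x10: "Subdirectory", 0x20: "Archive"}
LONG_NAME = 0x0F  # attribute value that marks a long-file-name entry


def parse_entries(data):
    """Return the live short-name entries found in raw directory bytes."""
    entries = []
    for off in range(0, len(data) - ENTRY.size + 1, ENTRY.size):
        raw = data[off:off + ENTRY.size]
        if raw[0] == 0x00:
            break      # never used; assumed here to end the directory
        if raw[0] == 0xE5:
            continue   # used before, but the entry has been released
        name, ext, attr, hi, _wtime, _wdate, lo, size = ENTRY.unpack(raw)
        if attr == LONG_NAME:
            continue   # long-file-name entries are not decoded here
        base = name.decode("ascii").rstrip()
        suffix = ext.decode("ascii").rstrip()
        entries.append({
            "name": base + ("." + suffix if suffix else ""),
            "attributes": [label for bit, label in ATTRS.items() if attr & bit],
            "first_cluster": (hi << 16) | lo,
            "size": size,
        })
    return entries


live = ENTRY.pack(b"FOO     ", b"TXT", 0x20, 0, 0, 0, 12, 1234)
gone = b"\xe5" + ENTRY.pack(b"OLD     ", b"TXT", 0x20, 0, 0, 0, 9, 10)[1:]
subdir = ENTRY.pack(b"DOCS    ", b"   ", 0x10, 0, 0, 0, 10, 0)
data = live + gone + subdir + bytes(32)

for entry in parse_entries(data):
    print(entry)
```
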
FAT FS (9)
FAT32 directory entry
[Figure: layout of a FAT32 directory entry for a long file name; the attribute byte is 0x0F, and the entry carries a checksum field]

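The slide only names the checksum field. As a hedged illustration, the widely documented scheme computes it over the 11 bytes of the associated short (8.3) name by rotating an 8-bit sum right one bit and adding each byte; the Python sketch below assumes that scheme, and `lfn_checksum` is a made-up name.

```python
def lfn_checksum(short_name: bytes) -> int:
    """8-bit rotate-right-and-add checksum over an 11-byte short name.

    Assumed algorithm: for each byte, rotate the running sum right by
    one bit (within 8 bits), then add the byte, keeping the low 8 bits.
    """
    if len(short_name) != 11:
        raise ValueError("short name must be exactly 11 bytes (8 + 3)")
    total = 0
    for byte in short_name:
        total = (((total & 1) << 7) + (total >> 1) + byte) & 0xFF
    return total


checksum = lfn_checksum(b"FOO     TXT")
print(hex(checksum))
```

Storing this value in every long-name entry lets a reader check that the long-name entries still belong to the short-name entry that follows them.
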
FAT FS (10)
Representing a long file name in FAT32
[Figure: directory entries storing the long file name "The quick brown fox jumps over the lazy dog"]
