Chap 12: File System Implementation


Chap 12: File System Implementation
Implementing local file systems and directory structures. Implementation of remote file systems. Block allocation and free-block algorithms and trade-offs.
Slides based on: text by Silberschatz, Galvin, Gagne; Berkeley Operating Systems group; S. Pallikara; other sources.
CS370 Operating Systems, Yashwant K Malaiya, Fall 2015

Chap 12: File System Implementation
File-System Structure; File-System Implementation; Directory Implementation; Allocation Methods; Free-Space Management; Efficiency and Performance; Recovery.

Questions from last time
How do you access data when one drive in RAID 0 fails? RAID 0 is non-redundant striping: if a drive fails, the data is lost.
Upcoming deadlines (see Schedule on website): HW2 Dec 1; Slides Dec 3 on Piazza; Peer reviews Dec 8 (it is a homework); Final project report Dec 9; Final test Dec 15.
Done: File System Interface. Now: implementation.

File-System Structure
File structure: a logical storage unit; a collection of related information.
The file system resides on secondary storage (disks/SSD). It provides the user interface to storage, mapping logical to physical, and provides efficient and convenient access to disk by allowing data to be stored, located, and retrieved easily.
It can be on other media (flash etc.), with a different file system.
Disk provides in-place rewrite and random access. I/O transfers are performed in blocks of sectors (usually 512 bytes).
File control block: storage structure holding information about a file (inode in Linux), including the location of its data.
Device driver controls the physical device.

Layered File System
(Figure: files and metadata are handled by the file system, which maps logical blocks to physical blocks; device drivers present the disk as a linear array of blocks.)

Layered File System
Processes: fd = open(afilename, ..); read(fd, buf, size); write(fd, buf, size); close(fd).
Logical File System Layer: search directory, find file location, determine which file blocks will be used. Passes down logical block number(s) and the file_start block on disk.
File Organization Layer: map file blocks (logical blocks) to disk blocks (physical blocks); disk allocation. Passes down physical block numbers.
Basic File System Layer: commands to device driver, buffering of disk data, caching of disk blocks. In cache? If not, get block.
Disk Driver and Disk Controller: cylinder, track, sector, R/W.

File System Layers (from bottom)
Device drivers manage I/O devices at the I/O control layer. Given commands like "read drive1, cylinder 72, track 2, sector 10, into memory location 1060", a driver outputs low-level hardware-specific commands to the hardware controller.
Basic file system: given a command like "retrieve block 123", translates it for the device driver. It also manages memory buffers and caches (allocation, freeing, replacement). Buffers hold data in transit; caches hold frequently used data.
File organization module understands files, logical addresses, and physical blocks. It translates logical block # to physical block # and manages free space and disk allocation.

File System Layers (Cont.)
Logical file system manages metadata information. It translates a file name into a file number, file handle, and location by maintaining file control blocks (inodes in UNIX). It also handles directory management and protection.

File Systems
Many file systems, sometimes several within an operating system, each with its own format.
Windows has FAT (1977), FAT32 (1996), NTFS (1993).
Linux has more than 40 types, with extended file system (1992), ext2 (1993), ext3 (2001), ext4 (2008); plus distributed file systems.
Also: floppy, CD, DVD, Blu-ray.
New ones still arriving: ZFS, GoogleFS, Oracle ASM, FUSE, exFAT.

File Systems
File System  Max File Size  Max Partition Size  Journaling  Notes
FAT32        4 GiB          8 TiB               No          Commonly supported
exFAT        128 PiB        128 PiB             No          Optimized for flash
NTFS         2 TiB          256 TiB             Yes         For Windows compatibility
ext2         2 TiB          32 TiB              No          Legacy
ext3         2 TiB          32 TiB              Yes         Standard Linux filesystem for many years. Best choice for super-standard installation.
ext4         16 TiB         1 EiB               Yes         Modern iteration of ext3. Best choice for new installations where super-standard isn't necessary.

File-System Implementation
Based on several on-disk and in-memory structures.
On-disk: boot control block (per volume); volume control block (per volume); directory structure (per file system); file control block (per file).
In-memory: mount table; directory structure cache; the open-file table (system-wide and per process); buffers of the file-system blocks.
Volume: logical disk drive, perhaps a partition.

On-disk File-System Structures
1. Boot control block contains info needed by the system to boot the OS from that volume. Needed if the volume contains an OS; usually the first block of the volume. (Volume: logical disk drive, perhaps a partition.)
2. Volume control block (superblock in UFS, master file table in NTFS) contains volume details: total # of blocks, # of free blocks, block size, free block pointers or array.
3. Directory structure organizes the files: file names and inode numbers in UFS, master file table in NTFS.
On-disk layout: boot block | super block | directory, FCBs | file data blocks.

File-System Implementation (Cont.)
4. Per-file File Control Block (FCB or inode) contains many details about the file: permissions, size, dates. In UFS it is indexed using the inode number; in NTFS it is kept in the master file table using relational DB structures.

In-Memory File System Structures
An in-memory mount table contains information about each mounted volume.
An in-memory directory-structure cache holds the directory information of recently accessed directories.
The system-wide open-file table contains a copy of the FCB of each open file, as well as other information.
The per-process open-file table contains a pointer to the appropriate entry in the system-wide open-file table.
Plus, buffers hold data blocks from secondary storage.
Open returns a file handle (file descriptor) for subsequent use. Data from a read is eventually copied to the specified user process memory address.

Create a file
A program calls the logical file system. The logical file system knows the format of the directory structures, and allocates a new FCB. The system then reads the appropriate directory into memory, updates it with the new file name and FCB, and writes it back to the disk.

Open a File
The file must be opened. The open() call passes a file name to the logical file system. open() first searches the system-wide open-file table to see whether the file is already in use by another process.
If yes: a per-process open-file table entry is created.
If no: the directory structure is searched for the given file name; once the file is found, the FCB is cached into the system-wide open-file table in memory.
This table stores the FCB as well as the number of processes that have the file open.

Open a file, Read From a File
The open() call returns an index to the appropriate entry in the per-process open-file table. This index is called a file descriptor in Unix and a file handle in Windows. All file read/write operations are then performed via this index.

Close a File
When a process closes the file: the per-process table entry is removed, and the system-wide entry's open count is decremented.
When all users that have opened the file close it, any updated metadata is copied back to the disk-based directory structure, and the system-wide open-file table entry is removed.
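The open/close bookkeeping described on the last few slides can be sketched as a toy model. This is a minimal illustration, not how any particular kernel stores its tables: the class name OpenFileTables, the dict-based FCBs, and keying the system-wide table by file name are all assumptions made for the example.

```python
class OpenFileTables:
    """Toy model of the system-wide and per-process open-file tables."""

    def __init__(self, directory):
        self.directory = directory   # file name -> FCB (a plain dict here)
        self.system_wide = {}        # file name -> {"fcb": ..., "open_count": n}
        self.per_process = {}        # pid -> list of file names; list index = fd

    def open(self, pid, name):
        entry = self.system_wide.get(name)
        if entry is None:
            # Not open anywhere yet: search the directory and cache the FCB.
            entry = {"fcb": self.directory[name], "open_count": 0}
            self.system_wide[name] = entry
        entry["open_count"] += 1
        table = self.per_process.setdefault(pid, [])
        table.append(name)
        return len(table) - 1        # index into the per-process table (the fd)

    def close(self, pid, fd):
        name = self.per_process[pid][fd]
        self.per_process[pid][fd] = None     # remove the per-process entry
        entry = self.system_wide[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:
            # Last user closed the file: copy metadata back, drop the entry.
            self.directory[name] = entry["fcb"]
            del self.system_wide[name]
```

Two processes opening the same file share one system-wide entry with an open count of 2; the entry disappears only after both have closed it.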

Partitions and Mounting
A partition can be a volume containing a file system, or raw: just a sequence of blocks with no file system.
The boot block can point to the boot volume or to a boot loader: a set of blocks that contain enough code to know how to load the kernel from the file system.
The root partition contains the OS and is mounted at boot time; other partitions can hold other OSes, other file systems, or be raw. Other partitions can mount automatically or manually.
At mount time, file system consistency is checked. Is all metadata correct? If not, fix it and try again. If yes, add to the mount table and allow access.

Virtual File Systems
Virtual File Systems (VFS) on Unix provide an object-oriented way of implementing file systems.
VFS allows the same system call interface (the API) to be used for different types of file systems.
The API (POSIX system calls) is to the VFS interface, rather than to any specific type of file system.
(Figure: virtual to specific FS interface.)

Virtual File Systems
The VFS layer serves two important functions:
1. It separates file-system-generic operations from their implementation, and allows transparent access to different types of file systems mounted locally.
2. It provides a mechanism for uniquely representing a file throughout a network.
The VFS is based on a structure called a vnode, which contains a numerical designator for a network-wide unique file. Unix inodes are unique within only a single file system. The kernel maintains one vnode structure for each active node (file or directory).

Virtual File System Implementation
VFS defines a set of operations on the objects that must be implemented. Every object has a pointer to a function table. The function table has the addresses of the routines that implement each function on that object. For example:
int open(...): open a file
int close(...): close an already-open file
ssize_t read(...): read from a file
ssize_t write(...): write to a file
int mmap(...): memory-map a file
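The function-table idea can be illustrated with a small sketch in which each object carries a table of routines supplied by its file system, and the generic layer only ever calls through that table. The two file systems (MemFS, UpperFS) and the helper names vfs_read and vfs_write are hypothetical, invented purely for this example.

```python
class MemFS:
    """Hypothetical file system that keeps file contents in a dict."""

    def __init__(self):
        self.files = {}

    def read(self, name):
        return self.files.get(name, b"")

    def write(self, name, data):
        self.files[name] = data
        return len(data)


class UpperFS(MemFS):
    """Second hypothetical file system: stores everything upper-cased."""

    def write(self, name, data):
        return super().write(name, data.upper())


class VNode:
    """Each object holds a function table filled in by its file system."""

    def __init__(self, fs, name):
        self.name = name
        self.ops = {"read": fs.read, "write": fs.write}


def vfs_read(vnode):
    # Generic layer: dispatch through the table, never on the FS type.
    return vnode.ops["read"](vnode.name)


def vfs_write(vnode, data):
    return vnode.ops["write"](vnode.name, data)
```

The caller uses the same vfs_read and vfs_write for both file systems; only the table contents differ.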

Directory Implementation
Linear list of file names with pointers to the data blocks: simple to program, but time-consuming to execute (linear search time). Could keep it ordered alphabetically via a linked list, or use a B+ tree.
Hash table: linear list with a hash data structure. Decreases directory search time. Collisions: situations where two file names hash to the same location. Use the chained-overflow method: each hash entry can be a linked list instead of an individual value.
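A hashed directory with chained overflow might look like the sketch below. The class name HashedDirectory, the bucket count, and the deliberately weak hash (sum of byte values, chosen so that collisions are easy to demonstrate) are assumptions for illustration only.

```python
class HashedDirectory:
    """Directory as a hash table whose entries are chains of (name, inode)."""

    def __init__(self, buckets=8):
        self.buckets = [[] for _ in range(buckets)]

    def _bucket(self, name):
        # Simple deterministic hash; "ab" and "ba" collide on purpose.
        return self.buckets[sum(name.encode()) % len(self.buckets)]

    def add(self, name, inode):
        chain = self._bucket(name)
        for i, (existing, _) in enumerate(chain):
            if existing == name:
                chain[i] = (name, inode)   # replace an existing entry
                return
        chain.append((name, inode))        # collision: extend the chain

    def lookup(self, name):
        for existing, inode in self._bucket(name):
            if existing == name:
                return inode
        return None
```

Lookup only scans the one chain the name hashes to, rather than the whole directory.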

Questions from last time
Which file structure is the fastest? Probably the newer ones?
How do device drivers work? Complex. They translate the logical view to controller commands.
What is mounting? Making the file structure on a storage device accessible through the computer's filing structure.
How does a flash memory work? By trapping or not trapping electrons.
On-disk file system structures.

Allocation Methods: i. Contiguous
An allocation method refers to how disk blocks are allocated for files: contiguous (not common now), linked (e.g. FAT32), indexed (e.g. ext2).
i. Contiguous allocation: each file occupies a set of contiguous blocks.
Simple: only the starting location (block #) and length (number of blocks) are required. First fit/best fit/worst fit.
Problems include finding space for a file, knowing the file size, external fragmentation, and the need for compaction, off-line (downtime) or on-line.

Contiguous Allocation
Mapping logical byte address LA to physical (assume block size = 512): $Q = \lfloor LA / 512 \rfloor$ (quotient), $R = LA \bmod 512$ (remainder).
Block to be accessed = Q + starting block number (address). Displacement into block = R.
(Figure example: file tr occupies 3 blocks, starting at block 14.)
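The mapping is a single divide, which a short sketch makes concrete. The function name contiguous_map and the bounds check are additions for illustration; the start block 14 and length 3 in the usage note come from the slide's example file.

```python
def contiguous_map(la, start_block, length_blocks, block_size=512):
    """Map a logical byte address to (physical block, displacement)."""
    q, r = divmod(la, block_size)      # Q = LA div 512, R = LA mod 512
    if q >= length_blocks:
        raise ValueError("address beyond end of file")
    return start_block + q, r
```

For the file starting at block 14 with 3 blocks, contiguous_map(1000, 14, 3) gives block 15 at displacement 488, with no disk reads needed to find it.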

Extent-Based Systems
Many newer file systems (e.g., Veritas File System, which actually dates from 1991) use a modified contiguous allocation scheme.
Extent-based file systems allocate disk blocks in extents. An extent is a contiguous set of disk blocks. Extents are allocated for file allocation; a file consists of one or more extents.

Allocation Methods - Linked
ii. Linked allocation: each file is a linked list of blocks. Each block contains a pointer to the next block; the file ends at a null pointer.
No external fragmentation, no compaction. The free-space management system is called when a new block is needed.
Locating a block can take many I/Os and disk seeks. Efficiency can be improved by clustering blocks into groups, but this increases internal fragmentation.
Reliability can be a problem, since every block in a file is linked.

Allocation Methods - Linked (Cont.)
FAT (File Allocation Table) variation: the beginning of the volume has a table, indexed by block number. Much like a linked list, but faster on disk and cacheable. New block allocation is simple. Each FAT entry corresponds to one block of storage. Free block entries are also linked.

Linked Allocation
Each file is a linked list of disk blocks; blocks may be scattered anywhere on the disk. Each block holds a pointer plus data (assuming pointer size is 1 byte, so 511 bytes of a 512-byte block hold data).
Mapping logical byte address LA: $Q = \lfloor LA / 511 \rfloor$, $R = LA \bmod 511$.
Block to be accessed is the Qth block in the linked chain of blocks representing the file. Displacement into block = R + 1.

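Linked Allocation
(Figure, marked on the slide as a bad diagram. Block chain: 9[16] - 16[1] - 1[10] - 10[25] - 25[-1], where the bracketed number is the next block and -1 marks the end of the file.)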
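The linked mapping and the cost of walking the chain can be sketched together. The table NEXT reproduces the chain from the figure (9, 16, 1, 10, 25, end); the function name linked_map and the dict representation are assumptions for illustration, and the pointer is assumed to sit at the start of each block, which is why the displacement is R + 1.

```python
# Block number -> next block in the file; -1 marks the end of the chain.
NEXT = {9: 16, 16: 1, 1: 10, 10: 25, 25: -1}


def linked_map(la, first_block, next_block, block_size=512, ptr_size=1):
    """Map a logical byte address to (physical block, displacement)."""
    data_per_block = block_size - ptr_size   # 511 data bytes per block
    q, r = divmod(la, data_per_block)
    block = first_block
    for _ in range(q):                       # one pointer follow per hop
        block = next_block[block]
        if block == -1:
            raise ValueError("address beyond end of file")
    return block, r + ptr_size               # skip over the stored pointer
```

Reaching logical byte 1200 takes two hops (9 to 16 to 1) before the data can be read, which is why linked allocation suits sequential rather than random access.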

Linked Allocation
(Figure: correct version of the diagram.)

Allocation Methods - Indexed
Indexed allocation: each file has its own index block(s) of pointers to its data blocks.
(Figure: logical view of an index table with pointers to data blocks.)

Example of Indexed Allocation (figure)

Indexed Allocation (Cont.)
Need an index table. Random access. Dynamic access without external fragmentation, but there is the overhead of an index block even for a small file.
Mapping from logical to physical in a file of maximum size 512 x 512 = 256K bytes with a block size of 512 bytes: we need only 1 block for the index table (assuming pointer size is 1 byte, block is 512 bytes).
$Q = \lfloor LA / 512 \rfloor$ = displacement into index table; $R = LA \bmod 512$ = displacement into block.
Larger files? Coming up.

Indexed Allocation Mapping (Cont.)
Mapping a file of unbounded length from logical to physical (block size of 512 words). Large file? Options: linked scheme, multi-level index, combined scheme.
Linked scheme: link blocks of a multi-block index table (no limit on size).
$Q_1 = \lfloor LA / (512 \times 511) \rfloor$ = block of index table; $R_1 = LA \bmod (512 \times 511)$ is used as follows:
$Q_2 = \lfloor R_1 / 512 \rfloor$ = displacement into block of index table; $R_2 = R_1 \bmod 512$ = displacement into block of file.

Indexed Allocation Mapping (Cont.)
Two-level index (4K blocks could store 1,024 four-byte pointers in the outer index -> 1,048,576 data blocks and a file size of up to 4GB): 1024 x 1024 blocks, each 4K.
$Q_1 = \lfloor LA / (1024 \times 1024) \rfloor$ = displacement into outer index; $R_1 = LA \bmod (1024 \times 1024)$ is used as follows:
$Q_2 = \lfloor R_1 / 1024 \rfloor$ = displacement into block of index table; $R_2 = R_1 \bmod 1024$ = displacement into block of file.

Indexed Allocation Mapping (Cont.)
(Figure: outer index pointing to 1024 index tables of 1024 pointers each; data blocks of 4KB each.)

Combined Scheme: UNIX UFS
4K bytes per block, 32-bit addresses.
(Figure: the volume block's table of file names points to the inode (file control block). Common layout: 12 direct + 3 indirect pointers.)
An indirect block could contain 1024 pointers. Max file size: k.k.k.4K+ (k = 1024).
More index blocks than can be addressed with a 32-bit file pointer.
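The "k.k.k.4K+" bound can be worked out with a short calculation. This sketch assumes the layout read from the slide, namely 12 direct pointers plus one single, one double, and one triple indirect pointer, with 4-byte pointers in 4K blocks; the function name ufs_max_file_size is hypothetical.

```python
def ufs_max_file_size(block_size=4096, ptr_size=4, direct=12):
    """Upper bound on file size, in bytes, for the combined scheme."""
    k = block_size // ptr_size               # pointers per indirect block
    data_blocks = direct + k + k**2 + k**3   # direct + single + double + triple
    return data_blocks * block_size
```

With the defaults, k is 1024 and the triple-indirect term alone contributes 1024 x 1024 x 1024 x 4K = 2^42 bytes, so the bound is a little over 4 TiB, far more than a 32-bit byte offset (2^32 bytes) can address.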

Performance
Best method depends on file access type. Contiguous is great for sequential and random. Linked is good for sequential, not random. Declare access type at creation -> select either contiguous or linked.
Indexed is more complex: a single block access could require 2 index block reads, then the data block read.
Clustering can help improve throughput and reduce CPU overhead. Cluster: set of contiguous sectors.

Performance (Cont.)
Adding instructions to the execution path to save one disk I/O is reasonable.
Intel Core i7 Extreme Edition 990x (2011) at 3.46GHz = 159,000 MIPS (http://en.wikipedia.org/wiki/instructions_per_second).
Typical disk drive at 250 I/Os per second: 159,000 MIPS / 250 = 636 million instructions during one disk I/O.
Fast SSD drives provide 60,000 IOPS: 159,000 MIPS / 60,000 = 2.65 million instructions during one disk I/O.
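The ratio behind this argument is simple enough to check directly. The helper name instructions_per_io is hypothetical; the MIPS and IOPS figures are the ones quoted on the slide.

```python
def instructions_per_io(mips, iops):
    """Instructions the CPU can execute in the time of one I/O operation."""
    return mips * 1_000_000 / iops
```

instructions_per_io(159_000, 250) evaluates to 636,000,000 for the disk drive, and instructions_per_io(159_000, 60_000) to 2,650,000 for the SSD.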

Questions from last time
Locating a byte address with a 2-level index structure: use the outer index entry, then the inner index entry (block), then the offset within the block. LA = 788505 = 3(512x512) + 4(512) + 25. Disadvantage: delay. Advantage: large files (512x512x512x4).
Page vs block: a page/frame is a virtual addressing unit; a block is one or more sectors on a disk.
Indexed: good for random access (not seq).
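The decomposition of LA = 788505 can be reproduced with two divisions. This is a sketch under the slide's assumptions (512 entries per index block, 512 bytes per block, byte addressing); the function name two_level_map and its parameters are illustrative.

```python
def two_level_map(la, entries_per_index=512, block_size=512):
    """Split a logical address into (outer entry, inner entry, offset)."""
    q1, r1 = divmod(la, entries_per_index * block_size)  # outer index entry
    q2, r2 = divmod(r1, block_size)                      # inner entry, offset
    return q1, q2, r2
```

two_level_map(788505) returns (3, 4, 25), matching the worked example. The same function with 1024 entries and 4096-byte blocks maps the last byte of a 4 GB file, two_level_map(2**32 - 1, 1024, 4096), to (1023, 1023, 4095).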

Free-Space Management
File system maintains a free-space list to track available blocks/clusters (using the term block for simplicity). Approaches: i. Bit vector; ii. Linked list; iii. Grouping; iv. Counting.
i. Bit vector or bit map (n blocks, numbered 0, 1, 2, ..., n-1): bit[i] = 1 means block[i] is free; bit[i] = 0 means block[i] is occupied.
Block number calculation for the first free block: (number of bits per word) * (number of 0-value words) + offset of first 1 bit. Example bit map: 00000000 00000000 00111110 ...
CPUs have instructions to return the offset within a word of the first 1 bit.
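The block-number calculation can be sketched as follows. The example assumes 8-bit words written most-significant bit first, as in the slide's bit map, and uses int.bit_length() in place of the CPU's find-first-set instruction; the function name first_free_block is hypothetical.

```python
def first_free_block(words, bits_per_word=8):
    """Return the number of the first free block (bit = 1), or None."""
    for zero_words, word in enumerate(words):
        if word != 0:
            # Offset of the first 1 bit, counted from the most significant bit.
            offset = bits_per_word - word.bit_length()
            return bits_per_word * zero_words + offset
    return None                      # every block is occupied
```

For the bit map 00000000 00000000 00111110, two zero words are skipped and the first 1 bit is at offset 2, so the first free block is 8 * 2 + 2 = 18.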

Free-Space Management (Cont.)
Bit map requires extra space. Example: block size = 4KB = 2^12 bytes; disk size = 2^40 bytes (1 terabyte); n = 2^40 / 2^12 = 2^28 blocks, so 2^28 bits (or 32MB) for the map; if clusters of 4 blocks -> 8MB of memory.
Easy to get contiguous files if desired.
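The sizing arithmetic generalizes to other disk and block sizes. The helper below is a sketch with a hypothetical name (bitmap_bytes); it assumes one bit per block, or one bit per cluster when blocks are grouped.

```python
def bitmap_bytes(disk_bytes, block_bytes, blocks_per_cluster=1):
    """Size of the free-space bit map in bytes (one bit per block/cluster)."""
    units = disk_bytes // (block_bytes * blocks_per_cluster)
    return units // 8                # 8 bits per byte
```

bitmap_bytes(2**40, 2**12) is 2^25 bytes (32MB), and bitmap_bytes(2**40, 2**12, 4) is 2^23 bytes (8MB), as on the slide.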

Linked Free Space List on Disk
ii. Linked list (free list): cannot get contiguous space easily; no waste of space; no need to traverse the entire list (if # free blocks recorded).
The superblock can hold the pointer to the head of the linked list.

Free-Space Management (Cont.)
iii. Grouping: modify the linked list to store the addresses of the next n-1 free blocks in the first free block, plus a pointer to the next block that contains free-block pointers (like this one).
iv. Counting: because space is frequently used and freed contiguously (with contiguous allocation, extents, or clustering), keep the address of the first free block and a count of the following free blocks. The free-space list then has entries containing addresses and counts.
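The counting representation can be derived from a bit vector by collapsing runs of free blocks. This is a minimal sketch; the function name count_runs and the list-of-bits input are assumptions for illustration.

```python
def count_runs(bits):
    """bits[i] == 1 means block i is free; return [(first_block, count), ...]."""
    runs = []
    start = None
    for i, bit in enumerate(bits):
        if bit and start is None:
            start = i                          # a free run begins here
        elif not bit and start is not None:
            runs.append((start, i - start))    # the run ended at block i-1
            start = None
    if start is not None:
        runs.append((start, len(bits) - start))
    return runs
```

count_runs([0, 0, 1, 1, 1, 0, 1, 1]) yields [(2, 3), (6, 2)]: two entries stand in for five free blocks, which is where the space saving comes from when free space is mostly contiguous.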

Efficiency and Performance
Efficiency is dependent on: disk allocation and directory algorithms; types of data kept in a file's directory entry; pre-allocation or as-needed allocation of metadata structures; fixed-size or varying-size data structures.

Efficiency and Performance (Cont.)
Performance is impacted by:
Keeping data and metadata close together on disk.
Buffer cache (disk cache): a separate section of main memory for frequently used blocks.
Synchronous writes, sometimes requested by apps or needed by the OS: no buffering or caching; writes must hit the disk before acknowledgement. Asynchronous writes are more common, buffer-able, and faster.
Free-behind and read-ahead techniques to optimize sequential access. Read-ahead: read blocks into memory in anticipation. Free-behind: free blocks after access.

Memory mapped files
I/O using open(), read(), write() requires system calls and disk access.
I/O using memory mapping maps a disk block to a page (or pages) in memory, so files are manipulated through memory. Writes to files in memory are not necessarily immediate.
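Memory-mapped I/O can be tried out with Python's standard mmap module. The sketch below edits a temporary file through its mapping instead of calling write(); the function names and the file contents are made up for the example, and the explicit flush() reflects the point that writes to the mapping are not necessarily immediate.

```python
import mmap
import os
import tempfile


def uppercase_prefix_via_mmap(path, n):
    """Upper-case the first n bytes of a file by editing its memory mapping."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:   # length 0 maps the whole file
            m[:n] = m[:n].upper()             # plain memory write, no write()
            m.flush()                         # push the dirty pages to the file


def demo():
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"hello mapped file")
        uppercase_prefix_via_mmap(path, 5)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

Calling demo() returns b"HELLO mapped file": the change made through memory is visible when the file is read back normally.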

Page Cache vs buffer cache
A page cache caches pages rather than disk blocks, using virtual memory techniques and addresses. Memory-mapped I/O uses a page cache. Routine I/O through the file system uses the buffer (disk) cache. This leads to double caching (following figure).

I/O Without a Unified Buffer Cache (figure)

Unified Buffer Cache
A unified buffer cache uses the same page cache to cache both memory-mapped pages and ordinary file system I/O, to avoid double caching. But which caches get priority, and what replacement algorithms should be used?

I/O Using a Unified Buffer Cache (figure)

Recovery
Consistency checking compares data in the directory structure with data blocks on disk, and tries to fix inconsistencies. It can be slow and sometimes fails. Sometimes metadata is duplicated.
Use system programs to back up data from disk to another storage device (magnetic tape, other magnetic disk, optical). Recover a lost file or disk by restoring data from backup.

Log Structured File Systems
Log structured (or journaling) file systems record each metadata update to the file system as a transaction.
All transactions are written to a log. A transaction is considered committed once it is written to the log (sequentially), sometimes to a separate device or section of disk. However, the file system may not yet be updated.
The transactions in the log are asynchronously written to the file system structures. When the file system structures are modified, the transaction is removed from the log.
If the file system crashes, all remaining transactions in the log must still be performed.
Faster recovery from a crash; removes the chance of inconsistency of metadata.
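The commit, apply, and recover steps can be sketched with an in-memory log. This is a toy model under stated assumptions: the class name JournaledStore is hypothetical, a Python list stands in for the on-disk log, and a dict stands in for the file system's metadata structures.

```python
class JournaledStore:
    """Toy journal: metadata updates are logged first, applied later."""

    def __init__(self):
        self.log = []        # committed transactions not yet applied
        self.metadata = {}   # stands in for the on-disk FS structures

    def commit(self, key, value):
        # The transaction counts as committed once it is in the log.
        self.log.append((key, value))

    def apply_one(self):
        key, value = self.log[0]
        self.metadata[key] = value   # update the real structures first...
        self.log.pop(0)              # ...then remove the transaction

    def recover(self):
        # After a crash, everything still in the log must be performed.
        while self.log:
            self.apply_one()
```

If a crash happens after commit() but before apply_one(), the update is still in the log, so recover() replays it and the metadata ends up consistent.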

UNIX directory structure
Contains only file names and the corresponding inode numbers. Use ls -i to retrieve the inode numbers of the files in the directory.
Looking up path names in UNIX. Example: /usr/tom/mbox. Look up the inode for /, then for usr, then for tom, then for mbox.
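The component-by-component lookup can be sketched with directories modeled as name-to-inode maps. All inode numbers below, including the root inode number, are invented for the example, and the function name lookup is hypothetical.

```python
ROOT_INODE = 2   # assumed inode number of the root directory

# inode number -> directory contents (file name -> inode number)
DIRECTORIES = {
    2: {"usr": 11},
    11: {"tom": 57},
    57: {"mbox": 203},
}


def lookup(path, directories=DIRECTORIES, root=ROOT_INODE):
    """Resolve an absolute path to an inode number, one component at a time."""
    inode = root
    for component in path.strip("/").split("/"):
        if component == "":
            continue                           # the path was just "/"
        inode = directories[inode][component]  # one directory read per step
    return inode
```

lookup("/usr/tom/mbox") walks /, usr, tom, and mbox in turn and returns 203 in this made-up tree; each step only needs the name-to-inode map of the current directory.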

Advantages of directory entries that have name and inode information
Changing a filename only requires changing the directory entry.
Only 1 physical copy of the file needs to be on disk. The file may have several names (or the same name) in different directories.
Directory entries are small: most file info is kept in the inode.

Two hard links to the same file
Directory entry in /dira: ..[12345 filename1]..
Directory entry in /dirb: ..[12345 filename2]..
Both refer to the same inode.
To create a hard link: ln /dira/filename1 /dirb/filename2
To create a symbolic link: ln -s /dira/filename1 /dirb/filename3 (filename3 just contains a pointer)
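The difference between the two kinds of link can be observed from Python using os.link and os.symlink, the counterparts of ln and ln -s. The sketch assumes a POSIX-style file system that supports both; the function name link_demo and the temporary directory are illustrative.

```python
import os
import tempfile


def link_demo():
    """Create a hard link and a symbolic link, then compare inode data."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "filename1")
        hard = os.path.join(d, "filename2")
        soft = os.path.join(d, "filename3")
        with open(original, "w") as f:
            f.write("data")
        os.link(original, hard)       # like: ln filename1 filename2
        os.symlink(original, soft)    # like: ln -s filename1 filename3
        same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
        link_count = os.stat(original).st_nlink
        soft_is_link = os.path.islink(soft)
        # lstat() inspects the link itself rather than its target.
        soft_has_own_inode = os.lstat(soft).st_ino != os.stat(original).st_ino
        return same_inode, link_count, soft_is_link, soft_has_own_inode
```

On such a system the hard link shares the original's inode and raises its link count to 2, while the symbolic link is a separate small file with its own inode that just points at the name.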

File system based on inodes
Limitations: a file must fit in a single disk partition; partition size and number of files are fixed when the system is set up.
inode preallocation and distribution: inodes are preallocated on a volume, so even on empty disks some % of space is lost to inodes. Preallocating inodes and spreading them improves performance: keeping a file's data blocks close to its inode reduces seek times.

Checking up on the inodes
Command: df -i
Gives inode statistics for a given system: total, free and used inodes.
    Filesystem      512-blocks       Used  Available  Capacity     iused     ifree  %iused  Mounted on
    /dev/disk0s2     488555536  126143120  361900416       26%  15831888  45237552     26%  /
    devfs                  361        361          0      100%       626         0    100%  /dev
    map -hosts               0          0          0      100%         0         0    100%  /net
    map auto_home            0          0          0      100%         0         0    100%  /home
Command: ls -i
    211655579 exfile.txt
    211655593 exfile2.txt
Command: stat *.*
    234881026 211655579 -rw-r--r-- 1 ymalaiya staff 0 25 "Dec 3 18:11:02 2015" "Dec 3 18:11:00 2015" "Dec 3 18:11:00 2015" "Dec 3 18:11:00 2015" 4096 8 0 exfile.txt
    234881026 211655593 -rw-r--r-- 1 ymalaiya staff 0 0 "Dec 3 18:11:46 2015" "Dec 3 18:11:46 2015" "Dec 3 18:11:46 2015" "Dec 3 18:11:46 2015" 4096 0 0 exfile2.txt