Logical disks. Bach 2.2.1

Logical disks (Bach 2.2.1)

- A physical disk is divided into partitions, or logical disks.
- Logical disk: a linear sequence of fixed-size, randomly accessible blocks.
  - The disk device driver maps the underlying physical storage to logical disks.
  - Larger blocks give faster access but more fragmentation.
- File system: a logical disk whose blocks have been arranged suitably so that files may be created and accessed.

NOTE:
1. A logical disk may also be used as a swap partition.
2. A logical disk may be located on multiple physical disks (e.g. volumes, striping, RAID).

Operating Systems: File Systems p. 1/37

File systems

- File = inode (header) + data
- Layout of a file system on a logical disk:

    [ Boot block | Super block | Inode list | Data blocks ]

- Boot block: usually contains bootstrap code
- Super block: size, # files, free blocks, etc.
- Inode list: contains all inodes; its size is fixed when the file system is configured
- Data blocks: file data
- Files and directories are organized into a tree-like structure.

Super block (Bach 4.5)

- Size of the file system
- Free data blocks: # of free blocks, list of free blocks, pointer to the first free block on the free list
- Size of the inode list
- Free inodes: (as above)
- Locks for the free block list and the free inode list
- Dirty flag

Inodes (Bach 4.1)

File attributes:
- Type: regular file, directory, device special file, pipe
- Owner: individual + group
- Permissions: read, write, execute, for owner, group, others
- Access times: last modified, last accessed, last modification of inode
- Number of links
- File size: 1 + highest byte offset written into
- Table of contents: disk addresses of the (discontiguous) disk blocks containing the file data

Example (output of ls -l):

    -rw-rw-r-- 1 mandar mandar 2647 Mar 11 14:58 filesys.tex

Table of contents (Bach 4.2)

- File data is stored in non-contiguous disk blocks.
  - With contiguous storage, fragmentation and expansion of files are problematic.
  - File space is allocated one block at a time (i.e. data can be spread throughout the file system).
- 10 direct entries: numbers of disk blocks that contain file data
- Single indirect: number of a disk block that contains a list of direct block numbers
- Double indirect, triple indirect: likewise, with two and three levels of indirection
- Processes access data by byte offset; the kernel converts the byte offset into a block no.
- Block no. = 0 means that the corresponding logical block contains no data, so no disk space is wasted.
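The conversion from byte offset to table-of-contents entry can be sketched as follows. This is a minimal illustration, not the kernel's bmap: the block size (1024 bytes) and the number of block numbers per indirect block (256) are assumptions chosen for the example, and the function name `bmap` and its return format are hypothetical.

```python
BLOCK_SIZE = 1024       # assumed block size in bytes
PTRS_PER_BLOCK = 256    # assumed: block numbers held by one indirect block
N_DIRECT = 10           # from the slide: 10 direct entries


def bmap(offset):
    """Map a file byte offset to (level, indices, byte offset in block).

    `indices` lists the index used at each step: the direct slot, or the
    position inside each successive indirect block.
    """
    lbn, off = divmod(offset, BLOCK_SIZE)   # logical block no., offset in block
    if lbn < N_DIRECT:
        return ("direct", [lbn], off)
    lbn -= N_DIRECT
    span = PTRS_PER_BLOCK                   # blocks reachable at this level
    for depth, level in enumerate(("single", "double", "triple"), start=1):
        if lbn < span:
            indices = []
            for _ in range(depth):
                span //= PTRS_PER_BLOCK
                indices.append(lbn // span)
                lbn %= span
            return (level, indices, off)
        lbn -= span
        span *= PTRS_PER_BLOCK
    raise ValueError("offset beyond maximum file size")
```

Under these assumptions, `bmap(9000)` lands in direct entry 8 at byte 808, and `bmap(350000)` needs the double indirect block.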

Directories (SVR2) (Bach 4.3)

- Data blocks contain a sequence of 16-byte entries.
- Each entry = inode number (2 bytes) + null-terminated file name (14 bytes)
- Compulsory entries: current directory (.) and parent directory (..)
  - For root, parent directory = root.
- Inode number = 0 marks an empty directory entry.
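A directory block in this format can be decoded with a few lines of code. The sketch below assumes little-endian inode numbers, which the slide does not specify; `parse_directory` is a hypothetical helper name.

```python
import struct

# One entry: 2-byte inode number + 14-byte name field (16 bytes in total).
# Little-endian byte order is an assumption made for this example.
ENTRY = struct.Struct("<H14s")


def parse_directory(block):
    """Return the (inode number, name) pairs of the non-empty entries."""
    entries = []
    for ino, raw in ENTRY.iter_unpack(block):
        if ino == 0:                      # inode number 0: empty entry
            continue
        name = raw.split(b"\0", 1)[0].decode("ascii")
        entries.append((ino, name))
    return entries
```

For example, a block built from `ENTRY.pack(2, b".")`, `ENTRY.pack(2, b"..")` and `ENTRY.pack(0, b"deleted")` yields only the first two entries.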

Inode cache/table (Bach 4.1)

- List of buffers stored in main memory
- Each buffer contains the in-core copy of a disk inode.
- At most one copy of any inode is present in the cache.
- Additional information stored in each buffer:
  - logical device number of the file system that contains the file
  - inode number (the inode list on disk is a linear array; numbering starts from 1)
  - pointers to other in-core inodes
  - status (locked/free/awaited, dirty bit, mount point flag)
  - reference count: # of active uses of the file

[Figure: inode cache, showing the hash queues and the free list]

Inode cache

- Free list: doubly linked circular list of inodes that are not in active use
  - reference count = 0 means that the inode is on the free list, and the kernel can reallocate the buffer to hold another disk inode
  - initially, all buffers are free
  - the kernel takes an inode buffer from the head of the list when it needs a free buffer
  - the kernel returns a buffer to the end of the list when done
  - buffer allocation policy = LRU
- Hash queue
  - the buffer pool is organized into separate circular, doubly linked lists
  - an inode is assigned to a queue using a hash function of (dev. no., inode no.)
- An inode buffer can simultaneously be on the free list and on a hash queue.

Accessing inodes

    [ Boot block | Super block | Inode list | Data blocks ]

Input: inode no. i
Output: location (block no., byte offset)
Method:
1. Let n = number of inodes per block.
2. Block no. B = ((i - 1) div n) + starting block of inode list
3. Byte offset b = ((i - 1) mod n) * size of disk inode
4. Return B, b.
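The method translates directly into integer arithmetic. In the sketch below the function name and parameter names are hypothetical, and the numbers used afterwards (inode list starting at block 2, 8 inodes per block, 64-byte disk inodes) are illustrative assumptions.

```python
def inode_location(i, inodes_per_block, inode_size, ilist_start):
    """Return (block no., byte offset) of disk inode i; numbering starts at 1."""
    if i < 1:
        raise ValueError("inode numbers start from 1")
    block = (i - 1) // inodes_per_block + ilist_start
    offset = ((i - 1) % inodes_per_block) * inode_size
    return block, offset
```

With those assumed parameters, inode 8 is the last inode of block 2 (byte offset 448) and inode 9 is the first inode of block 3 (byte offset 0).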

Algorithms

- iget, iput: get / release a known inode; used for opening / closing a file
- ialloc, ifree: allocate / free a new inode; used for creating / deleting files
- alloc, free: allocate / free a new disk block; used for adding / removing blocks from a file

iget (Bach 4.1)

Input: device no., inode no.
Output: locked inode
Algorithm:

    while (not done) {
        if (inode in cache) {
            if (inode locked) {
                sleep till inode becomes unlocked;
                continue;    /* !! */
            }
            remove from free list if necessary;
            increment ref_count;
            return inode;
        }
        /* inode not in cache */
        if (free list is empty) return error;    /* why? */
        remove buffer from free list;
        reassign to correct hash queue;
        read inode from disk;
        initialize ref_count = 1;
        return inode;
    }

iput

Input: device no., inode no.
Output: none
Algorithm:

    lock inode (if not already locked);
    decrement ref_count;
    if (ref_count == 0) {
        if (link count == 0) {
            free disk blocks for the file;
            set file type = 0;    /* type = 0 <=> inode is free */
            free inode;           /* ifree() */
        }
        if (file accessed / changed, or inode changed)
            update disk copy;     /* since in-core copy is "dirty" */
        put inode on free list;
    }
    release lock;
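The interplay of the cache lookup, the reference count and the LRU free list can be sketched in a few lines. This is a single-threaded simplification, not Bach's algorithm: locking, sleeping and link counts are left out, the hash queues are replaced by a dictionary, and `InodeCache`, `nbuf` and the `disk` dictionary are hypothetical names.

```python
from collections import OrderedDict


class InodeCache:
    def __init__(self, nbuf, disk):
        self.nbuf = nbuf            # number of in-core inode buffers
        self.disk = disk            # assumed: dict (dev, ino) -> disk inode
        self.cache = {}             # stands in for the hash queues
        self.free = OrderedDict()   # free list; first item = head (LRU)

    def iget(self, dev, ino):
        key = (dev, ino)
        node = self.cache.get(key)
        if node is not None:                     # inode in cache
            self.free.pop(key, None)             # remove from free list if necessary
            node["ref"] += 1
            return node
        if len(self.cache) >= self.nbuf:         # must recycle a buffer
            if not self.free:
                raise RuntimeError("no free in-core inode")
            victim, _ = self.free.popitem(last=False)   # take from head
            del self.cache[victim]
        node = {"key": key, "ref": 1, "data": self.disk[key]}   # read from disk
        self.cache[key] = node
        return node

    def iput(self, node):
        node["ref"] -= 1
        if node["ref"] == 0:
            self.free[node["key"]] = node        # return to end of free list
```

An inode with a zero reference count stays in `cache`, so a later `iget` for it is a hit; it is evicted only when its buffer is needed for another inode.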

Locks vs. reference count

- Locks are used to ensure mutual exclusion.
- Locks never last across system calls; they are always released at the end of a system call.
- Reference count = number of active uses
- The reference count remains set across multiple system calls; this prevents the kernel from reassigning an inode that is in use.

ialloc (Bach 4.6)

The superblock has a cache containing the numbers of free inodes (type = 0).

Algorithm:

    while (!done) {
        if (SB is locked) {
            sleep until SB is free;
            continue;
        }
        if (inode list in SB is empty) {
            lock SB;
            starting from "remembered inode", search disk for free inodes;
            add free inodes to SB list until full, or no more free inodes;
            update "remembered inode" for next search;
            unlock SB;    /* wakes up other procs */
            if (no free inodes) return error;
        }
        get inode from SB inode list;
        get corresponding inode from inode cache;    /* iget */
        if (inode is not free) {    /* race condition! */
            release inode;          /* iput() */
            continue;
        }
        initialize inode;
        write inode to disk;
        decrement free inode count;
        return inode;
    }

ialloc

Race condition:
1. The kernel assigns inode I to process P_A; P_A goes to sleep before reading the disk copy into memory.
2. P_B needs an inode, finds the list is empty, searches for free inodes, finds that I is free, and puts I on the free list.
3. P_A wakes up, initializes I and uses it.
4. When P_C requires an inode, it gets I, but I is not free!

ifree

Algorithm:

    increment file system free inode count;
    if (SB is locked) return;
    lock SB;
    if (inode list is full) {
        if (inode no. < "remembered inode")
            update "remembered inode";
    } else
        store inode no. in free inode list;
    unlock SB;
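The role of the superblock list and the "remembered inode" may be easier to see in a small simulation. The sketch below is single-threaded, so the SB lock and the race condition check are omitted; the class name `SuperBlock`, the `types` array (type 0 = free) and the list capacity are assumptions of the example.

```python
class SuperBlock:
    def __init__(self, types, capacity):
        self.types = types          # types[i - 1] == 0 means disk inode i is free
        self.capacity = capacity    # size of the free inode list in the SB
        self.free_list = []         # cached numbers of free inodes
        self.remembered = 1         # where the next disk search starts

    def ialloc(self):
        if not self.free_list:
            ino = self.remembered
            # search the disk for free inodes until the list is full
            while ino <= len(self.types) and len(self.free_list) < self.capacity:
                if self.types[ino - 1] == 0:
                    self.free_list.append(ino)
                ino += 1
            self.remembered = ino
            if not self.free_list:
                raise RuntimeError("no free inodes")
        ino = self.free_list.pop(0)
        self.types[ino - 1] = 1     # initialize inode: mark it as in use
        return ino

    def ifree(self, ino):
        self.types[ino - 1] = 0
        if len(self.free_list) >= self.capacity:
            # list full: only make sure the next search will find this inode
            self.remembered = min(self.remembered, ino)
        else:
            self.free_list.append(ino)
```

Lowering the remembered inode is what keeps a freed inode discoverable when there is no room for it in the superblock list.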

Disk block allocation (Bach 4.7)

- The SB contains a list of free disk block nos.
- Initially, mkfs organizes all data blocks in a linked list; each link (disk block) contains (i) a list of free disk block nos., and (ii) the no. of the next block on the list.

[Figure: the super block list (a1, a2, a3, ...) links to block a1, whose list (b1, b2, b3, ...) links to block b1, and so on]

Why the scheme differs from the one used for inodes:
- Free inodes are identifiable by their type field; free disk blocks are not identifiable by content.
- Disk blocks are consumed more quickly than inodes.
- Disk blocks are large enough to contain a long list of free block nos.

alloc

Algorithm:

    while (SB is locked) sleep until SB is free;
    remove block from SB free list;
    if (last block was removed) {
        lock SB;
        read block just removed;
        copy block nos. into SB list;
        unlock SB;    /* wake up other procs */
    }
    zero block contents;
    decrement total count of free blocks;
    mark SB modified;
    return buffer;

free

Algorithm:

    if (SB list is not full)
        put block on SB list;
    else {    /* SB list is full */
        copy SB list into freed block;
        write block to disk;
        put block no. of freed block into SB list;    /* now its only entry */
    }
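The linked list of free block numbers behind alloc and free can be simulated with a dictionary standing in for the disk. This is a sketch under simplifying assumptions: no SB lock, no buffer handling, and a hypothetical `FreeBlocks` class with a configurable list capacity.

```python
class FreeBlocks:
    def __init__(self, capacity):
        self.capacity = capacity   # max block nos. held in the SB list
        self.sb_list = []          # entry 0 is the link to the next list block
        self.disk = {}             # block no. -> list of block nos. stored in it

    def free(self, blk):
        if len(self.sb_list) < self.capacity:
            self.sb_list.append(blk)
        else:
            # SB list full: the freed block becomes a link block holding the
            # old list, and is the only entry left in the SB list
            self.disk[blk] = self.sb_list
            self.sb_list = [blk]

    def alloc(self):
        if not self.sb_list:
            raise RuntimeError("no free blocks")
        blk = self.sb_list.pop()
        if not self.sb_list:
            # last block removed: refill the SB list from the block's contents
            self.sb_list = self.disk.pop(blk, [])
        return blk
```

Freeing blocks 1 to 7 with a capacity of 3 turns blocks 4 and 7 into link blocks; allocating afterwards hands the blocks back in the order 7, 6, 5, 4, 3, 2, 1.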

System calls

[Figure: file descriptor table -> global file table -> inode table]

- File descriptor table (per process): pointers to all open files
- Global file table: mode and offset for each opened file
- Inode table: in-memory copy of the on-disk inode (only one per file)

System calls covered:
- creat, open, close
- read, write
- mount, umount

open (Bach 5.1)

Syntax: fd = open(pathname, flags, mode);
- flags = O_RDONLY, O_RDWR, etc.
- mode: used only if the file is created

Algorithm:
1. Find the in-core inode for the given filename (pathname lookup, followed by iget).
2. If the file does not exist, or access is denied, return error.
3. Allocate and initialize a global file table entry: pointer to inode, mode, ref. count, offset (0 or file size).
4. Allocate a user fd table entry; set its pointer to the global file table entry.
5. If the file needs to be truncated, free all file blocks (free).
6. Unlock inode, return descriptor.

close (Bach 5.6)

Syntax: close(fd);

Algorithm:
1. Set the user fd table entry to NULL.
2. Decrement the ref. count of the global file table entry.
3. If the ref. count is 0: free the file table entry, release the inode (iput).
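How open and close touch the three tables can be illustrated with a toy model. Everything here is a simplification made up for the example: inodes are identified by a bare number, pathname lookup, permissions and locking are skipped, and `Tables` and `FileEntry` are hypothetical names.

```python
class FileEntry:
    """One global file table entry."""
    def __init__(self, inode, mode):
        self.inode, self.mode, self.offset, self.ref = inode, mode, 0, 1


class Tables:
    def __init__(self):
        self.file_table = []   # global file table
        self.inode_ref = {}    # inode no. -> in-core reference count

    def open(self, fds, inode, mode):
        """fds is the caller's per-process fd table (a list)."""
        self.inode_ref[inode] = self.inode_ref.get(inode, 0) + 1   # iget
        entry = FileEntry(inode, mode)
        self.file_table.append(entry)
        for fd, slot in enumerate(fds):      # first free slot in the fd table
            if slot is None:
                fds[fd] = entry
                return fd
        fds.append(entry)
        return len(fds) - 1

    def close(self, fds, fd):
        entry, fds[fd] = fds[fd], None       # set user fd entry to NULL
        entry.ref -= 1
        if entry.ref == 0:
            self.file_table.remove(entry)    # free the file table entry
            self.inode_ref[entry.inode] -= 1 # iput
```

Opening the same inode twice creates two file table entries (each with its own offset) but only one in-core inode, whose reference count becomes 2.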

creat (Bach 5.7)

Syntax: fd = creat(pathname, modes);

Action:
- If the file does not exist: a new file with the specified permissions is created (the parent directory must have write permission).
- If the file exists: it is truncated and opened in write mode (the file itself should have write permission; parent directory permissions are not checked).

Algorithm:
1. Parse the given pathname.
   - Remember the inode of the parent directory in the u area; keep the inode locked.
   - Note the byte offset of the first empty directory slot in the directory and save it in the u area.
2. If the file already exists and permissions are improper: release inode (iput), return error.

creat (continued)

4. Otherwise (the file does not exist):
   4.1 assign a free inode from the file system (ialloc).
   4.2 initialize a new directory entry in the parent directory with the new name and inode no.
   4.3 write the directory with the new name to disk.
   4.4 release the inode of the parent directory (iput).
5. Allocate a file table entry for the inode, initialize ref. count.
6. If the file existed, free all disk blocks (free). (Owner and permissions of the old file are retained.)
7. Unlock inode, return file descriptor.

NB: The order of writes is important. With the order above (inode first, then directory), a system crash leaves an allocated inode that is not reachable. If the writes were done in the reverse order, a system crash would leave a path name that refers to a bad inode.

read (Bach 5.2)

Syntax: num_read = read(fd, buffer, num_bytes);

Algorithm:
1. Get the file table entry from the user fd.
2. Check the file mode (read / write).
3. Set parameters in the u area:
   (i) mode = read
   (ii) number of bytes to read
   (iii) offset in file
   (iv) target memory address
   (v) flag to indicate that the target is in user memory
4. Get the inode from the file table, lock the inode.
5. While the requested byte count is not satisfied:
   5.1 convert the file offset into a disk block no. and an offset in the block.
   5.2 calculate the no. of bytes to read; return EOF if necessary.
   5.3 read the block into a system buffer; copy data from the system buffer to the user address.
   5.4 update the u area fields.

read (continued)

6. Unlock inode.
7. Update the file table offset for the next read.
8. Return the total no. of bytes read.

Notes:
- If a process reads data from a "non-existent" block, the kernel returns null (zero) bytes.
- The inode is locked for the duration of the call to ensure that a single read call returns consistent data (otherwise, a single read could return a mix of old and new data).
- The inode is unlocked at the end of the call, so concurrent reading and writing across calls may return a mix of data.
- After

      fd1 = open("abc.txt",...);
      fd2 = open("abc.txt",...);

  fd1 and fd2 are manipulated independently.

write (Bach 5.3)

Syntax: num_written = write(fd, buffer, count);

Algorithm: as for read

Notes:
- When writing a block:
  - if only a part of the block is written, the block must first be read from disk;
  - if the entire block is written, the block need not be read.
- If the file does not contain a block corresponding to the byte offset to be written, the kernel allocates a new block (alloc) and fills in the ToC slot with this block no.
  - Multiple blocks may have to be allocated if the offset corresponds to an indirect block.
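The notes on holes (blocks that were never written) can be observed from user space with ordinary system calls. The sketch below uses Python's `os` wrappers around lseek, write and read on a temporary file; the 4096-byte block size is only an assumption for choosing the offset, and `hole_demo` is a hypothetical name.

```python
import os
import tempfile


def hole_demo(block=4096):
    """Write 3 bytes at offset 2*block, then read the first block back."""
    fd, path = tempfile.mkstemp()
    try:
        os.lseek(fd, 2 * block, os.SEEK_SET)   # skip two blocks without writing
        os.write(fd, b"end")
        size = os.fstat(fd).st_size            # 1 + highest byte offset written
        os.lseek(fd, 0, os.SEEK_SET)
        head = os.read(fd, block)              # data from a never-written block
        return size, head
    finally:
        os.close(fd)
        os.unlink(path)
```

The file size reported is `2 * block + 3`, and the bytes read from the unwritten region are all zero. Whether the file system really leaves those blocks unallocated depends on the file system in use.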

mount (Bach 5.14)

[Figure: a root file system (/, mnt, floppy, cdrom) and a file system to be mounted (/, audio, images)]

Syntax: mount(dev, dir, options);
- dev = device special file for the file system to be mounted
- dir = mount point
- options = read-only, etc.

Mount table: 1 entry for each mounted file system
- device no.
- pointer to the buffer containing the super block of the mounted f.s.
- pointer to the root inode of the mounted f.s.
- pointer to the inode of the mount point directory

mount (continued)

Algorithm:
1. If not super user, return error.
2. Get the inode for the block special file corresponding to the file system to be mounted; extract major and minor numbers.
3. Get the inode for the mount point directory. If (not directory or ref. count > 1), release inodes; return error.
4. Find a free slot in the mount table; mark the slot in use; initialize the device # field.
   - The process could go to sleep in the ensuing read; marking the slot prevents another process from using the same M.T. entry.
   - Recording the device # prevents other processes from mounting the same file system again.
5. Call the block device open routine (legality checks, initialization of hardware and driver data structures, etc.).

mount (continued)

6. Read the SB into a system buffer; initialize SB fields.
   - Locks are cleared.
   - The no. of free inodes in the SB list is set to 0. This minimizes the chance of file system corruption when mounting a f.s. after a crash, since ialloc scans the disk and constructs an accurate list of free inodes.
7. Get the root inode of the mounted f.s. (iget); save the pointer in the mount table.
8. Set the mount_point flag in the inode of the mount point directory; save the pointer to this inode.
   - This allows path names to use .. to traverse the mount point directory.
9. Release the special file inode (iput); unlock the inode of the mount point directory.

umount

Syntax: umount(dev);

Algorithm:
1. If not superuser, return error.
2. Get the inode of the device special file; extract major and minor nos. of the device being unmounted; release the inode of the special file.
3. Get the mount table entry, based on major/minor #.
4. Check whether files on the f.s. are still in use:
   4.1 search the inode table for all files whose dev. # matches the f.s. to be unmounted;
   4.2 if any such file has ref. count > 0 (current directory of some process / open files that have not been closed), return error.
5. Update SB, inodes; flush any unwritten data.
6. Get the root inode of the mounted f.s. from the M.T.; release the inode (iput).

umount (continued)

7. Invoke the close routine for the device.
8. Get and lock the inode of the mount point via the M.T.
9. Clear the mount_point flag, release the inode (iput).
10. Free the system buffer used for the SB; free the M.T. entry.

dup (Bach 5.13)

Syntax: newfd = dup(fd);

Algorithm:
1. Find the first free slot in the user fd table.
2. Copy the given file descriptor into the free slot.
3. Increment the ref. count of the corresponding global file table entry.
4. Return the descriptor (index) of this slot.
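Because dup copies a pointer to an existing global file table entry, the old and new descriptors share one offset, whereas two separate open calls get two entries with independent offsets. The sketch below shows this from user space through Python's `os` wrappers; the temporary file and the function name `offsets_demo` are assumptions of the example.

```python
import os
import tempfile


def offsets_demo():
    """Compare two independent opens with a dup of the first descriptor."""
    fd, path = tempfile.mkstemp()
    os.write(fd, b"abcdef")
    os.close(fd)
    fd1 = os.open(path, os.O_RDONLY)
    fd2 = os.open(path, os.O_RDONLY)   # second file table entry, own offset
    fd3 = os.dup(fd1)                  # same file table entry as fd1
    try:
        a = os.read(fd1, 2)            # advances the offset shared with fd3
        b = os.read(fd2, 2)            # unaffected by the read on fd1
        c = os.read(fd3, 2)            # continues where fd1 stopped
        return a, b, c
    finally:
        for f in (fd1, fd2, fd3):
            os.close(f)
        os.unlink(path)
```

The call returns `(b"ab", b"ab", b"cd")`: the read on fd2 starts at offset 0 again, while the read on fd3 picks up at offset 2.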

Pipes (Bach 5.12)

Definition: a pseudo file with a maximum size, which has two file descriptors

Syntax:

    int fd[2];
    status = pipe(fd);

- read from fd[0]
- write to fd[1]
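A pipe can be exercised within a single process as long as the message is small enough to fit in the pipe's buffer. The sketch below uses Python's `os.pipe`, which returns the read end and the write end in the same order as fd[0] and fd[1]; `pipe_roundtrip` is a hypothetical name.

```python
import os


def pipe_roundtrip(msg):
    """Send a short message through a pipe and read it back."""
    r, w = os.pipe()                  # r plays the role of fd[0], w of fd[1]
    os.write(w, msg)
    os.close(w)                       # no writers left: the reader will see EOF
    data = os.read(r, len(msg) + 1)
    eof = os.read(r, 1)               # empty result signals end of file
    os.close(r)
    return data, eof
```

Closing the write end matters: with a writer still open, the second read would block instead of returning an empty result.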

Shell (Bach 7.8)

Algorithm:
1. Parse the command line.
2. If the command is an internal command, call a suitable function.
3. If the command is an external command, fork a child. In the child:
   3.1 if input / output redirection is required:

           fd = /* create new file */
           close(stdout);
           dup(fd);
           close(fd);

   3.2 if pipes are required:
       3.2.1 create a pipe; fork a child
       3.2.2 in the child: set up pipes s.t. stdout goes to the pipe, exec the first component of the command line;
             in the parent: set up pipes s.t. stdin comes from the pipe.
   3.3 exec the command (or the last component of the command).
4. If the command is run in the foreground, wait for the child to exit.
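Steps 3.1 to 3.3 and step 4 can be sketched for a command line of the form `first | last > file`. This is a POSIX-only illustration with several assumptions: the function name is hypothetical, command-line parsing is skipped, and `os.dup2` is used in place of the close-then-dup idiom because it does both steps in one call and leaves the new descriptor inheritable across exec in Python.

```python
import os


def run_pipeline_to_file(first, last, out_path):
    """Run `first | last > out_path`; return the exit status of `last`."""
    pid = os.fork()
    if pid == 0:                                   # child (step 3)
        try:
            # 3.1: redirect stdout to a newly created file
            fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            os.dup2(fd, 1)
            os.close(fd)
            # 3.2.1: create a pipe, fork a child
            r, w = os.pipe()
            if os.fork() == 0:
                # 3.2.2, child: stdout goes to the pipe, exec first component
                os.dup2(w, 1)
                os.close(r)
                os.close(w)
                os.execvp(first[0], first)
            # 3.2.2, parent: stdin comes from the pipe
            os.dup2(r, 0)
            os.close(r)
            os.close(w)
            os.execvp(last[0], last)               # 3.3
        finally:
            os._exit(127)                          # reached only if exec fails
    _, status = os.waitpid(pid, 0)                 # step 4: foreground command
    return os.WEXITSTATUS(status)
```

For instance, `run_pipeline_to_file(["echo", "hello"], ["tr", "a-z", "A-Z"], path)` behaves like the shell command `echo hello | tr a-z A-Z > path`, assuming `echo` and `tr` are available on the PATH.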