File System Interface. ICS332 Operating Systems

Similar documents
Chapter 10: File System

Chapter 11: File-System Interface

Chapter 11: File-System Interface

EI 338: Computer Systems Engineering (Operating Systems & Computer Architecture)

Chapter 11: File-System Interface. Operating System Concepts 9 th Edition

I.-C. Lin, Assistant Professor. Textbook: Operating System Concepts 8ed CHAPTER 10: FILE SYSTEM

Chapter 10: File-System Interface. File Concept Access Methods Directory Structure File-System Mounting File Sharing Protection

Chapter 10: File-System Interface

Chapter 10: File-System Interface. Operating System Concepts 8 th Edition

Chapter 11: File-System Interface. Long-term Information Storage. File Structure. File Structure. File Concept. File Attributes

Chapter 11: File-System Interface

Chapter 11: File-System Interface. File Concept. File Structure

Chapter 11: File System Interface

File-System Interface

Files and File Systems

Modulo V Sistema de Arquivos

Chapter 10: File System. Operating System Concepts 9 th Edition

Chapter 11: File-System Interface

File Systems: Interface and Implementation

File Systems: Interface and Implementation

Chapter 10: File-System Interface

File System File Concept File Structure File Attributes Name Identifier Type Location Size Protection Time, date, and user identification

Chapter 10: File-System Interface

Chapter 10: File-System Interface. Operating System Concepts with Java 8 th Edition

File System Interface: Overview. Objective. File Concept UNIT-IV FILE SYSTEMS

CS3600 SYSTEMS AND NETWORKS

File-System Interface. File Structure. File Concept. File Concept Access Methods Directory Structure File-System Mounting File Sharing Protection

Introduction to File Systems

Chapter 6 Storage Management File-System Interface 11.1

Lecture 10 File Systems - Interface (chapter 10)

Chapter 7: File-System

Outlook. File-System Interface Allocation-Methods Free Space Management

ICS Principles of Operating Systems

Part Four - Storage Management. Chapter 10: File-System Interface

File Concept Access Methods Directory and Disk Structure File-System Mounting File Sharing Protection

File Systems: Interface and Implementation

File System (FS) Highlights

CSE325 Principles of Operating Systems. File Systems. David P. Duggan. March 21, 2013

Chapter 11: File-System Interface

TDDB68 Concurrent Programming and Operating Systems. Lecture: File systems

Chapter 11: File-System Interface

CS720 - Operating Systems

Chapter 11: File-System Interface. File Concept. File Structure

UNIT V SECONDARY STORAGE MANAGEMENT

Implementation should be efficient. Provide an abstraction to the user. Abstraction should be useful. Ownership and permissions.

File Systems Ch 4. 1 CS 422 T W Bennet Mississippi College

UNIX File System. UNIX File System. The UNIX file system has a hierarchical tree structure with the top in root.

Motivation: I/O is Important. CS 537 Lecture 12 File System Interface. File systems

Virtual Memory cont d.; File System Interface. 03/30/2007 CSCI 315 Operating Systems Design 1

Principles of Operating Systems

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 18: Naming, Directories, and File Caching

Chapter 9: File System Interface

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 18: Naming, Directories, and File Caching

File Management By : Kaushik Vaghani

CSE 120 Principles of Operating Systems

Chapter 11: File System Implementation. Objectives

F 4. Both the directory structure and the files reside on disk Backups of these two structures are kept on tapes

File Systems. CS170 Fall 2018

MODULE 4. FILE SYSTEM AND SECONDARY STORAGE

Fall 2017 :: CSE 306. File Systems Basics. Nima Honarmand

A file system is a clearly-defined method that the computer's operating system uses to store, catalog, and retrieve files.

Last Class: Memory management. Per-process Replacement

File Systems. What do we need to know?

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto

Problem Overhead File containing the path must be read, and then the path must be parsed and followed to find the actual I-node. o Might require many

File-System Interface

There is a general need for long-term and shared data storage: Files meet these requirements The file manager or file system within the OS

FILE SYSTEMS. CS124 Operating Systems Winter , Lecture 23

Directory Structure and File Allocation Methods

V. File System. SGG9: chapter 11. Files, directories, sharing FS layers, partitions, allocations, free space. TDIU11: Operating Systems

File Systems. File system interface (logical view) File system implementation (physical view)

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

CMSC421: Principles of Operating Systems

FILE SYSTEMS, PART 2. CS124 Operating Systems Fall , Lecture 24

File System: Interface and Implmentation

CSE 120 Principles of Operating Systems

File-System. File Concept. File Types Name, Extension. File Attributes. File Operations. Access Methods. CS307 Operating Systems

Virtual File System. Don Porter CSE 306

Chapter 11: File System Interface Capítulo 10 no livro adotado!

File Systems. ECE 650 Systems Programming & Engineering Duke University, Spring 2018

OPERATING SYSTEMS CS136

Today: File System Functionality. File System Abstraction

Operating System: Chap10 File System Interface. National Tsing-Hua University 2016, Fall Semester

Introduction to OS. File Management. MOS Ch. 4. Mahmoud El-Gayyar. Mahmoud El-Gayyar / Introduction to OS 1

Typical File Extensions File Structure

COS 318: Operating Systems. Journaling, NFS and WAFL

Virtual File System. Don Porter CSE 506

File Systems. CS 4410 Operating Systems

Segmentation with Paging. Review. Segmentation with Page (MULTICS) Segmentation with Page (MULTICS) Segmentation with Page (MULTICS)

Introduction. Secondary Storage. File concept. File attributes

Long-term Information Storage Must store large amounts of data Information stored must survive the termination of the process using it Multiple proces

Operating Systems. No. 9 ศร ณย อ นทโกส ม Sarun Intakosum

Page 1. Recap: File System Goals" Recap: Linked Allocation"

GFS: The Google File System

Operating Systems CMPSC 473. File System Implementation April 1, Lecture 19 Instructor: Trent Jaeger

Operating Systems. Operating Systems Professor Sina Meraji U of T

Secondary Storage (Chp. 5.4 disk hardware, Chp. 6 File Systems, Tanenbaum)

What is a file system

CS 143A - Principles of Operating Systems

Transcription:

File System Interface ICS332 Operating Systems

Files and Directories Features A file system implements the file abstraction for secondary storage It also implements the directory abstraction to organize files logically Usage It is used for users to organize their data It is used to permit data sharing among processes and users It provides mechanisms for protection

File System The term File System is a bit confusing The component of the OS that knows how to do file stuff A set of algorithms and techniques The content on disk that describes a set of files directory table of contents on-disk file system data files Remember that a disk can be partitioned arbitrarily into logically independent partitions Each partition can contain a file system In this case the partition is often called a volume (e.g., C:, A:) One can have multiple disks, each with arbitrary partitions, each with a different file system on it

File Systems Example with 2 disks, and 3 file systems

File and File Type A file is data + properties (or attributes) Content, size, owner, last read/write time, protection, etc. A file can also have a type Understood by the File System e.g., regular file, logical link, device Or understood by the OS Executable, shared library, object file, text, binary, etc. In Windows a file type is encoded in its name.com,.exe,.bat,... Some known to the OS, some just known to applications In Mac OS X, each file is associated with a type and the name of the program that created it Done by the create() system call for all files Allows for double clicks to remember which program to use In Linux a file type is encoded only in its content Magic numbers, first bytes (#!...) Some files have no type and filenames are arbitrary

File Structure Question: should the OS know about the structure of a file? The more different structures the OS knows about the more help it can provide applications that use particular file types But then, the more complicated the OS code is And it may be too restrictive: e.g., assume all binary files are executable! Modern OSes support very few files structures: Files are sequences of bytes that the OS doesn t know about but that have meaning to the applications Certain files are executables and must have a specific format that the OS knows about Executable formats have evolved throughout the years, partly to accommodate dynamic loading The OS may expect a certain directory structure defining an application e.g., Mac OS X application bundles

Internal File Structure We ve seen that the disk provides the OS with a block abstraction (e.g., 512 bytes) All disk I/O is performed in number of blocks Each file is stored in a number of blocks Internal Fragmentation

File Operations A file is an abstraction, i.e., an abstract data type As such the OS defines several file operations Basic operations Creating Writing/Reading A current-file-position pointer is kept per process Updated after each write/read operation Repositioning the current-file-position pointer This is called a seek Appending at the end of a file Truncating Down to zero size Deleting Renaming

Open Files The OS requires that processes open and close files Otherwise, the OS would spend its time searching directories for file names for each file operation After an open, the OS copies the file system s file entry (i.e., attributes) into an open-file table that is kept in RAM in the kernel The OS keeps two kinds of open-file tables One table per process One global table for all processes A process specifies which file the operation is on by giving an index in its local table The famous filed (file descriptor) in Linux The OS keeps track of a reference count for each open file in the global table Incremented each time a process opens the file Decremented each time a process closes the file Let s see an example

Open File Tables Disk A B C file A file B file C OS Data structure that contains Location of file on disk Block number of the first block Number of blocks Size in bytes Attributes permissions owner time of creation date of last modification... This is kept on disk, but for now we take a logical view of it Let s just say it s in the OS, or in the File System

Open File Tables Disk A B Process 1: Open File A C file A file B file C OS refcount=1 Global table 13 file pos. ptr Process 1 s table

Open File Tables Disk A B Process 1: Open File A Open File C C file A file B file C OS refcount=1 refcount=1 Global table 13 file pos. ptr 42 file pos. ptr Process 1 s table

Open File Tables Disk A B Process 1: Open File A Open File C Process 2: Open File A Open File B C file A file B refcount=2 refcount=1 13 file pos. ptr 42 file pos. ptr 37 file pos. ptr file C OS Global table Process 1 s table Process 2 s table

Open File Tables Disk A B Process 1: Open File A Open File C Process 2: Open File A Open File B C file A file B refcount=2 refcount=1 refcount=1 13 file pos. ptr 42 file pos. ptr 37 file pos. ptr 15 file pos. ptr file C OS Global table Process 1 s table Process 2 s table

Open File Tables Disk A B Process 1: Open File A Open File C Close File A Process 2: Open File A Open File B C file A file B refcount=1 refcount=1 refcount=1 42 file pos. ptr 37 file pos. ptr 15 file pos. ptr file C OS Global table Process 1 s table Process 2 s table

Open File Tables Disk A B Process 1: Open File A Open File C Close File A Process 2: Open File A Open File B Close File A C file A file B refcount=1 refcount=1 42 file pos. ptr 15 file pos. ptr file C OS Global table Process 1 s table Process 2 s table

File Locking Bad things may happen when multiple processes reference the same file Just like when threads reference the same memory A file lock can be acquired for a full file or for a portion of a file The OS may require mandatory locking for some files e.g., for writing for a log file that many system calls write to Typically applications have to implement their own locking And of courses there can be deadlocks and all the messiness of thread synchronization Let s look at the Java example in Fig. 10.1 in the book

File Locking in Java import java.io.*; import java.nio.channels.*; public class LockingExample { public static final boolean EXCLUSIVE = false; public static final boolean SHARED = true; public static void main(string arsg[]) throws IOException { FileLock sharedlock = null; FileLock exclusivelock = null; try { RandomAccessFile raf = new RandomAccessFile("file.txt", "rw"); FileChannel ch = raf.getchannel(); // this locks the first half of the file - exclusive (one writer) exclusivelock = ch.lock(0, raf.length()/2, EXCLUSIVE); /** Now modify the data... **/ // release the lock exclusivelock.release();

File Locking in Java (cont.) } } // this locks the second half of the file - shared (multiple readers) sharedlock = ch.lock(raf.length()/2+1, raf.length(), SHARED); /** Now read the data... **/ // release the lock sharedlock.release(); } catch (java.io.ioexception ioe) { System.err.println(ioe); } finally { if (exclusivelock!= null) exclusivelock.release(); if (sharedlock!= null) sharedlock.release(); }

Access Methods Sequential Access One byte at a time, in order, until the end Read next, write next, reset to the beginning Direct Access Ability to position anywhere in the file Position to block #n, Read next, write next Block number is relative to the beginning of the file Indexed Access Just like a logical page number is relative to the first page in a process address space A file contains an index of file record locations One can then look for the object in the index, and then jump directly to the beginning of the record Linux and Windows: Direct access It s up to you to implement your own application-specific index But internally the FS does some indexing of blocks, as we ll see Older OSes provided other, more involved methods, including indexing e.g., you could tell the OS more information about the logical structure of your file

Directories We re used to file systems that support multiple directory levels the notion of a current directory Single-Level directory Naming conflicts Have to pick longer, and longer unique names Slow searching

Directories Two-Level Directory Faster searching Still naming conflict for each user

Tree-Structured Directories

Tree-Structured Directories More general than 1- and 2-level schemes Each directory can contain files or directories Differentiated internally by a bit set to 0 for files and 1 for directories Each process has a current directory Relative paths Absolute paths Path name translation, e.g., for /one/two/three Open / (the file system knows where that is on disk) Search it for one and get its location Open one, search it for two and get its location Open two, search it for three and get its location Open three The OS spends a lot of time walking directory paths Another reason why one separates open from read/write The OS attempts to cache common path prefixs But what we have in modern systems is actually more complicated...

Acyclic Graph Directories Files/directories can be shared by directories A hard link is created in a directory, to point to or reference another file or directory Identified in the file system as a special file The file system keeps track of reference count for each file, and deletes the file when the last reference is removed A symbolic link does not count toward the reference count You can think of it as an alias for the file (if you remove the alias, nothing happens) If the target file is removed then the alias simply becomes invalid This is the UNIX view of links, as implemented by the ln command No hard-linking of directories Acyclic is good for quick/simple traversals Simple way to prohibit cycles: no hard linking of directories!

General Graph Directory In this scheme users can do whatever they want Directory traversals algorithms must be smarter to avoid infinite loops bogus1 bogus2 Garbage collection could be useful because ref counts may never reach zero But way too expensive in practice

Mac OSX Time Machine Time Machine is the backup mechanism introduced with Leopard It uses hard links Every time a new backup is made, a new backup directory is created that contains a snapshot of the current state of the file system Files that haven t been modified are hard links to previously backed up version A new backup should be mostly hard links instead of file copies(space saving) When an old backup directory is wiped out, then whatever files have a reference count of zero are removed (no longer part of more recent data) At time t=0, the first backup is initialized, meaning that it s a full copy of the directory structure and files t=0 backup 0

Mac OSX Time Machine Time Machine is the backup mechanism introduced with Leopard It uses hard links Every time a new backup is made, a new backup directory is created that contains a snapshot of the current state of the file system Files that haven t been modified are hard links to previously backed up version A new backup should be mostly hard links instead of file copies(space saving) When an old backup directory is wiped out, then whatever files have a reference count of zero are removed (no longer part of more recent data) X By time t=1, a file has been modified, another one is added, an another one is deleted t=1 backup 0

Mac OSX Time Machine Time Machine is the backup mechanism introduced with Leopard It uses hard links Every time a new backup is made, a new backup directory is created that contains a snapshot of the current state of the file system Files that haven t been modified are hard links to previously backed up version A new backup should be mostly hard links instead of file copies(space saving) When an old backup directory is wiped out, then whatever files have a reference count of zero are removed (no longer part of more recent data) X At time t=1, a new backup is triggered by the user t=1 backup 0

Mac OSX Time Machine Time Machine is the backup mechanism introduced with Leopard It uses hard links Every time a new backup is made, a new backup directory is created that contains a snapshot of the current state of the file system Files that haven t been modified are hard links to previously backed up version A new backup should be mostly hard links instead of file copies(space saving) When an old backup directory is wiped out, then whatever files have a reference count of zero are removed (no longer part of more recent data) hard links X copied from copied file from system file system t=1 backup 0 backup 1

Mac OSX Time Machine Time Machine is the backup mechanism introduced with Leopard It uses hard links Every time a new backup is made, a new backup directory is created that contains a snapshot of the current state of the file system Files that haven t been modified are hard links to previously backed up version A new backup should be mostly hard links instead of file copies(space saving) When an old backup directory is wiped out, then whatever files have a reference count of zero are removed (no longer part of more recent data) The user can now remove backup 0 backup 0 backup 1

Mac OSX Time Machine Advantages Extremely simple to implement The back up can be navigated in all the normal ways, without Time Machine Provided backups are frequent, they are done by creating mostly hard links (which is MUCH faster than copying data) Drawback If you change 1 byte in a 10GB file, then you copy the whole 10GB But how often does this happen?? For efficiency, Mac OSX allows hard linking of directories Cycles in the directory hierarchy must be detected, i.e., more complicated file system code Complexity deemed worthwhile by Mac OSX developers

Time Machine on Linux? Give how simple and elegant Time Machine is, one may want to implement it on Linux Could be an interesting course project But because Linux doesn t allow hard-linking of directories, one would have to recreate the whole directory structure for each backup While would take space and, more importantly perhaps, a lot of time Such implementations exist, but if you use a standard Linux file system that does not allow cycles in the directory structures, it won t be efficient

Hard Links on Linux It turns out that, on Linux, whenever a file is opened by a process, a hard link to the file is created Say that process with PID 2233 calls the open() system call to open a file /home/casanova/somefile open() returns a file descriptor, i.e., an integer, say 55 At that point, a hard link to /home/casanova/somefile is created in /proc/2233/fd/55 If, while the process is running, /bin/rm /home/casanova/somefile is executed, then the file survives because its reference count is nonzero Essentially, you can t remove the data for a file while a process is using it, which is probably a good thing This allows you to retrieve a file that you ve erased by mistake as long as some process has it opened You might want to create hard links to your important files anyway Let s try this on a Linux box...

File System Mounting There can be multiple file systems Each file system is mounted at a special location, the mount point Typically seen as an empty directory When given a mount point, a volume, a file system type, the OS asks the device driver to read the device s directory checks that the volume does contain a valid file system makes note of the new file system at the specified mount point The OS keeps a list of mount points Mac OS X: all volumes are mounted in the /Volumes/ directory Including temporary volumes on USB keys, CDs, etc. UNIX: volumes can be mounted anywhere Windows: volumes were identified with a letter (e.g., A:, C:), but current versions, like UNIX, allow mounting anywhere On Linux the mount command lists all mounted volumes

Protection File systems provide controlled access General approach: Access Control Lists (ACLs) For each file/directory, keep a list of all users and of all allowed accesses for each user Protection violations are raised upon invalid access Problem: ACLs can be very large Solution: consider only a few groups of users and only a few possible actions UNIX: User, Group, Others not in Group, All (ugoa) Read, Write, Execute (rwx) Represented by a few bits chmod command: e.g., chmod g+w foo (add write permission to Group users) e.g., chmod o-r foo (remove read permission to Other users)

Conclusion In the next set of lecture notes we ll look at how a file system is implemented...