CIS Operating Systems File Systems. Professor Qiang Zeng Spring 2018

Similar documents
CIS Operating Systems File Systems. Professor Qiang Zeng Fall 2017

CIS Operating Systems I/O Systems & Secondary Storage. Professor Qiang Zeng Fall 2017

CIS Operating Systems I/O Systems & Secondary Storage. Professor Qiang Zeng Spring 2018

File Systems. CS170 Fall 2018

FILE SYSTEMS. CS124 Operating Systems Winter , Lecture 23

File System Implementation. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

File Systems: Fundamentals

File Systems. CS 4410 Operating Systems. [R. Agarwal, L. Alvisi, A. Bracy, M. George, E. Sirer, R. Van Renesse]

File Systems: Fundamentals

Main Points. File layout Directory layout

Da-Wei Chang CSIE.NCKU. Professor Hao-Ren Ke, National Chiao Tung University Professor Hsung-Pin Chang, National Chung Hsing University

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto

File System Implementation

Filesystem. Disclaimer: some slides are adopted from book authors slides with permission 1

CHAPTER 11: IMPLEMENTING FILE SYSTEMS (COMPACT) By I-Chen Lin Textbook: Operating System Concepts 9th Ed.

File Systems. ECE 650 Systems Programming & Engineering Duke University, Spring 2018

Introduction to OS. File Management. MOS Ch. 4. Mahmoud El-Gayyar. Mahmoud El-Gayyar / Introduction to OS 1

CS3600 SYSTEMS AND NETWORKS

File System Internals. Jo, Heeseung

Chapter 11: File System Implementation. Objectives

ECE 650 Systems Programming & Engineering. Spring 2018

File Systems. Chapter 11, 13 OSPP

we are here Page 1 Recall: How do we Hide I/O Latency? I/O & Storage Layers Recall: C Low level I/O

CS370 Operating Systems

Operating Systems. Week 9 Recitation: Exam 2 Preview Review of Exam 2, Spring Paul Krzyzanowski. Rutgers University.

Filesystem. Disclaimer: some slides are adopted from book authors slides with permission 1

Filesystem. Disclaimer: some slides are adopted from book authors slides with permission

File Systems: Interface and Implementation

Input-Output (I/O) Input - Output. I/O Devices. I/O Devices. I/O Devices. I/O Devices. operating system must control all I/O devices.

File System Implementation. Sunu Wibirama

we are here I/O & Storage Layers Recall: C Low level I/O Recall: C Low Level Operations CS162 Operating Systems and Systems Programming Lecture 18

Files and File Systems

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto

Chapter 12: File System Implementation

Chapter 14: File-System Implementation

ICS Principles of Operating Systems

Operating Systems. File Systems. Thomas Ropars.

File Systems. File system interface (logical view) File system implementation (physical view)

Outlook. File-System Interface Allocation-Methods Free Space Management

File System Internals. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

SMD149 - Operating Systems - File systems

I/O Systems. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

CS370: Operating Systems [Spring 2017] Dept. Of Computer Science, Colorado State University

CS370 Operating Systems

COMP 530: Operating Systems File Systems: Fundamentals

Main Points. File layout Directory layout

Operating Systems. Operating Systems Professor Sina Meraji U of T

UNIX File Systems. How UNIX Organizes and Accesses Files on Disk

Operating Systems Design Exam 2 Review: Spring 2012

I/O Handling. ECE 650 Systems Programming & Engineering Duke University, Spring Based on Operating Systems Concepts, Silberschatz Chapter 13

File System: Interface and Implmentation

CIS Operating Systems Memory Management Cache. Professor Qiang Zeng Fall 2017

File system internals Tanenbaum, Chapter 4. COMP3231 Operating Systems

EECE.4810/EECE.5730: Operating Systems Spring 2017

Computer Systems Laboratory Sungkyunkwan University

CS330: Operating System and Lab. (Spring 2006) I/O Systems

CS 550 Operating Systems Spring File System

File Systems. Kartik Gopalan. Chapter 4 From Tanenbaum s Modern Operating System

Files and File Systems

Operating Systems. Lecture File system implementation. Master of Computer Science PUF - Hồ Chí Minh 2016/2017

CS370: System Architecture & Software [Fall 2014] Dept. Of Computer Science, Colorado State University

Chapter 11: Implementing File

OPERATING SYSTEM. Chapter 12: File System Implementation

Chapter 11: Implementing File Systems. Operating System Concepts 9 9h Edition

File Layout and Directories

Chapter 4 File Systems. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved

Frequently asked questions from the previous class survey

CS 318 Principles of Operating Systems

Chapter 11: Implementing File Systems

File Systems: Interface and Implementation

File Systems: Interface and Implementation

V. File System. SGG9: chapter 11. Files, directories, sharing FS layers, partitions, allocations, free space. TDIU11: Operating Systems

Operating System Concepts Ch. 11: File System Implementation

Advanced Operating Systems

TDDB68 Concurrent Programming and Operating Systems. Lecture: File systems

Fall 2017 :: CSE 306. File Systems Basics. Nima Honarmand

Principles of Operating Systems

CS 4284 Systems Capstone

Motivation. Operating Systems. File Systems. Outline. Files: The User s Point of View. File System Concepts. Solution? Files!

FILE SYSTEM IMPLEMENTATION. Sunu Wibirama

[537] Fast File System. Tyler Harter

File System Case Studies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Chapter 12 File-System Implementation

Chapter 10: File System Implementation

CS 318 Principles of Operating Systems

File Systems. CSE 2431: Introduction to Operating Systems Reading: Chap. 11, , 18.7, [OSC]

CIS Operating Systems Memory Management Page Replacement. Professor Qiang Zeng Spring 2018

EI 338: Computer Systems Engineering (Operating Systems & Computer Architecture)

Week 12: File System Implementation

Chapter 12: File System Implementation

Introduction. Secondary Storage. File concept. File attributes

Frequently asked questions from the previous class survey

Device-Functionality Progression

Chapter 12: I/O Systems. I/O Hardware

File Management 1/34

Local File Stores. Job of a File Store. Physical Disk Layout CIS657

I/O Systems. Jo, Heeseung

Chapter 11: Implementing File-Systems

CIS Operating Systems Memory Management Cache and Demand Paging. Professor Qiang Zeng Spring 2018

Chapter 11: Implementing File Systems

Transcription:

CIS 3207 - Operating Systems File Systems Professor Qiang Zeng Spring 2018

Previous class I/O subsystem: hardware aspect Terms: controller, bus, port Addressing: port-mapped IO and memory-mapped IO I/O subsystem: software aspect Device drivers: polling, interrupt-driven, and DMA API styles: blocking, non-blocking, & asynchronous I/O subsystem: performance aspect Caching, Buffering and Spooling I/O scheduling Some slides are courtesy of Dr. Abraham Silberschatz and Dr. Thomas Anderson CIS 3207 Operating Systems 2

Previous class Compare buffering and caching Data in cache is repetitively accessed, while data in buffer is consumed like a stream Buffering is mainly used to cope with the speed mismatch between producers and consumers (and other purposes), while caching is used to speed up the access CIS 3207 Operating Systems 3

Previous class When to use blocking I/O and when to use nonblocking I/O? If the current I/O is the only thing the current thread/ process should do, then use blocking I/O; it is easier If the current thread/process needs to receive whatever data it can receive and consume it ASAP, or send whatever data it can send out and get back to continue producing (e.g., voice call), or the process has to handle multiple I/O streams simultaneously, then use nonblocking I/O; it is a little more complex CIS 3207 Operating Systems 4

Previous class Advantage and disadvantage of DMA The data passing does not go through CPU, so it saves CPU cycles when passing a large amount of data But it is not worthwhile if you only need to pass a small amount of data each time, because setting up a DMA operation is costly CIS 3207 Operating Systems 5

Previous class What is the benefit of having the drivers to expose the same set of interfaces to the kernel? Modularized and layered design, so that the core part of the kernel can be coded according to those small set of interfaces, no matter how diverse and complex the devices are CIS 3207 Operating Systems 6

Outline File System concepts Mounting File and directory Hard link and soft link File system design FAT32 UFS Some slides are courtesy of Dr. Thomas Anderson CIS 3207 Operating Systems 7

Volume and Partition A partition is just a storage segment crafted out of a disk A physical concept Volume: a collection of physical storage resources that form a logical storage device One volume contains one file system After a partition is formatted into some file system, that partition can be called a volume CIS 3207 Operating Systems 8

Volume CIS 3207 Operating Systems 9

Mount Each volume contains a file system Mounting volumes is to organize the multiple file systems into a single logical hierarchy E.g., assuming your usb disk contains one volume, when you insert your usb stick, the system mounts the volume to the existing file system hierarchy Mount point: the path where a volume is mounted to mount device dir tells the kernel to attach the file system found on device at the directory dir CIS 3207 Operating Systems 10

Mounting Alice s Disk Bob s Disk / / Volumes USB Volume Media USB Volume USB1 Mount Point / Disk-1 Mount Point / Bin Movies Programs Movies Home vacation.mov Users vacation.mov Alice Backup Bob Backup How to access the movie? Alice: /Volumes/usb1/Movies/vacation.mov Bob: /Media/disk-1/Movies/vacation.mov Where does the original content in /Volume/usb1 and /Media/Disk-1 go? Temporarily hidden CIS 3207 Operating Systems 11

Mounting associates file system with drivers --- very important idea Once a file system, which is on some device, is mounted, the kernel associates the file system F with the driver D for the device So that when you read/write/seek a file in F, it will route the device operation to functions in D CIS 3207 Operating Systems 12

The transparency benefits from layering Application Library File System Block Cache Block Device Interface Device Driver Memory-Mapped I/O, DMA, Interrupts File System API and Performance Device Access Physical Device CIS 3207 Operating Systems 13

open() vs fopen() open(), read(), write(), and close() are system calls fopen(), fread(), fwrite(), and fclose() are stdio lib calls The library includes buffers to aggregate a program s small reads and writes into system calls that access large blocks If a program uses the library function fread() to read 1 byte of data, the library may use read() system call to read in large block of data, e.g., 4KB, into a user space buffer, so that when the process calls fread() again to read another byte, it reads from the buffer instead of issuing another system call CIS 3207 Operating Systems 14

Files in kernel CIS 3207 Operating Systems 15

Users usually don t need to know the file system types and device drivers CIS 3207 Operating Systems 16

Block In the view of file systems, disk space is divided into blocks A disk block consists of a fixed number of consecutive disk sectors E.g., sector size = 512B, block size = 4KB A file is stored in blocks E.g., a file may be stored in blocks 0x0A35, 0xB98C, 0xD532 CIS 3207 Operating Systems 18

File File: a named collection of data E.g., /home/qiang/file-systems.pptx A file consists of two parts Metadata Data Metadata: data that describes other data File type File size Creation/access/modification time Owner and group a file belongs to Access permissions CIS 3207 Operating Systems 19

Metadata of a Linux/Unix file ls -l CIS 3207 Operating Systems 20

File type in linux/unix (the first column in ls -l) CIS 3207 Operating Systems 21

Metadata is a generic concept Name the possible metadata of a song Singer Author Style Creation time How does Pandora infer which songs you like? Use the metadata (and other info) as features The features will be used in a set of machine learning and data mining algorithms to infer your likes CIS 3207 Operating Systems 22

Directory Directory maps the names of files the directory contains to their locations Directory is a special type of file ; like a file, a directory also consists of Metadata: Size, owner, date, etc.; same as ordinary files Data: An array of entries, each of which is a mapping between file name and the index of the first block of the file Logically, a file is a linked list of blocks; once you locate the first block, you have located the file Let s call the index of the first block of a file as a file number, which describes the location of a file CIS 3207 Operating Systems 23

Question When you rename a file, do you modify the file? No, you actually modify the containing directory File a.txt is in directory A; assume you don t have write permission with a.txt but do have write permission with directory A, can you rename a.txt? Yes Because you are modifying the containing directory CIS 3207 Operating Systems 24

Hard link and soft link A hard link: a mapping from a file name to some file s data ln original.txt hard.txt A soft/symbolic link: a mapping from a file name to another file name; whenever you refer to a symbolic link, the system will automatically translate it to the target name ln s original.txt soft.txt Question: if a program execution opens a file with a fixed name, but you want to the program execution to refer to your another file, what should you do? CIS 3207 Operating Systems 25

Hard link and soft (symbolic) link We have created a file original.txt, and a hard link named hard.txt, and a symbolic link named soft.txt Can you distinguish original.txt and soft.txt? Certainly Can you distinguish original.txt and hard.txt? Hmmm CIS 3207 Operating Systems 26

Truth about hard link Link count Only after the last hard link is removed, will the underlying file be deleted inode block: this is the formal name used to refer to the first block of a file in Linux; Hey, you! Don t call me a hard link; we are equal Hard link: another directory entry for a given file CIS 3207 Operating Systems 27

Symbolic link vs Hard link The inode of a symbolic file contains: A flag saying that I am symbolic link A file name of the target file Very important for software upgrade After upgrade, you just redirect the symbolic link to the new version CIS 3207 Operating Systems 28

Question If you modify a file through a hard link, will the modification time of another hard link of the same file be updated as well? Yes They point to the same inode block, which stores the modification time (and other metadata) Hard links of a file share the same piece of metadata and data of the file The only difference is the names CIS 3207 Operating Systems 29

Question Most systems do not allow creating a hard link for a directory. Why? Avoiding it ensures that the directory structures form a directed acyclic graph (DAG). The benefits: It simplifies the directory traversal No circles during traversal Consider the implementation of du (disk use by a directory) It simplifies the deletion logic If the link count of a file reaches zero, you can delete the file and reclaim the space If the directories form a circle, even when the hard link count of files do not reach zero, they should be garbage collected; but now you need a dedicated garbage collector to detect such circles CIS 3207 Operating Systems 30

File System Design Need to consider data structures for Directories: file name -> location of file File: locations of the disk blocks used for each file Tracking the list of free disk blocks How to design and organize these data structures is the task of file system design CIS 3207 Operating Systems 31

Extra considerations Efficiency How to quickly map the offset of a file (e.g., the 1000 th byte) to the location of the containing data block? How to quickly insert a byte into a file? What block size do we use? How to decrease fragmentation? How do we preserve spatial locality? Reliability What if machine crashes in middle of a file system op? CIS 3207 Operating Systems 32

File System Design Contiguous file allocation Each file occupies a contiguous area in disk Advantage: very good locality and high performance What are the disadvantages? Bad performance when files grow and shrink Severe external fragmentation when files are created and deleted Noncontiguous file allocation CIS 3207 Operating Systems 33

A simple design: linked list of blocks Now consider a seek operation targeting the fifth blocks The seek operation has to search from the first hop by hop, leading to frequent disk seek operations CIS 3207 Operating Systems 34

Tabular design (FAT32 as an example) File Allocation Table (FAT), an array of 32-bit entries in a reserved area of the volume volume space = space for the table + space for real data Each FAT entry corresponds to a block in the data area That is, one-to-one mapping If a file has entry i in the FAT, it owns block i in the data blocks Each file in the system corresponds to a linked list of FAT entries CIS 3207 Operating Systems 35

FAT Data Blocks 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 File 9 File 12 File 9 Block 3 File 9 Block 0 File 9 Block 1 File 9 Block 2 File 12 Block 0 File 12 Block 1 File 9 Block 4 CIS 3207 Operating Systems 36

Analysis on tabular design Advantage: Pointers that locate file data are stored in a central location, that is, the FAT The table can be cached in memory so that the chain of blocks that compose a file can be traversed quickly Improves the seek performance Disadvantage: the locality of data itself is poor, as data blocks may spread across the disk => more disk seeking operations No support for hard links Limitations on volume and file size: take the file size as an example; the system uses a 32-bit field to describe the number of bytes in a file; so the file size limit is 4GB CIS 3207 Operating Systems 37

Multilevel Index design The metadata of a file is stored at an inode, which contains An array of direct pointers pointing to data blocks One indirect pointer One double-indirect pointer One triple-indirect pointer Each file is a tree of blocks rooted at its inode Volume space = inodes + indirect blocks + file data The design is widely used in Berkeley Fast File System (FFS), also known as Unix File System (UFS) Linux ext2 and ext3 CIS 3207 Operating Systems 38

Multilevel Index design Inode Array Inode Triple Indirect Blocks Double Indirect Blocks Indirect Blocks Data Blocks File Metadata Direct Pointer DP DP DP DP DP DP DP DP DP DP Direct Pointer Indirect Pointer Dbl. Indirect Ptr. Tripl. Indirect Ptr. CIS 3207 Operating Systems 39

Analysis Advantages Each indirect block is 4KB, and each pointer takes 4 bytes, so a block contains 1024 pointers. That is, the tree has a very high fan-out degree, which implies that the search tree can be very shallow, and hence, efficient Scalable design: for small files all the data blocks can be reached through the direct pointers, while the multi-level tree supports very large files (4TB max) CIS 3207 Operating Systems 40

Writing assignments Why does not the file system design use a uniform, say, two-level index design, just as the design of the two-level page table? What are the difference and relation between read and fread What does mounting do under the hood? CIS 3207 Operating Systems 41