CS Lab 2: fs


Vedant Kumar, Palmer Dabbelt
February 27, 2014

Contents

1 Getting Started
2 lpfs Structures and Interfaces
3 The Linux VFS Layer
    3.1 Operation Tables
    3.2 Inode Cache
    3.3 Directory Cache
4 The Linux Block Layer
    4.1 Page Cache
    4.2 Device Mapper
5 Other Useful Kernel Primitives
    5.1 Slab Allocation
    5.2 Work Queues
    5.3 The RCU Subsystem
    5.4 Wait Queues
6 Schedule and Grading
    6.1 Design Document
    6.2 Checkpoint 1
    6.3 Checkpoint 2
    6.4 Checkpoint 3
    6.5 Evaluation

For this lab you will implement a filesystem that supports efficient snapshots, copy-on-write updates, encrypted storage, checksumming, and fast crash recovery. Our goal is to give you a deeper understanding of how real filesystems are designed and implemented. We have provided the on-disk data structures and some support code for a log-structured filesystem (lpfs). You can either build on top of the distributed code or implement a novel, feature-equivalent design.

The first rule of kernel programming may well be "don't mess with the kernel", which is why we've built fsdb. The idea here is to run your kernel code in userspace via a thin compatibility layer. This gives you the chance to debug and test in a relatively forgiving environment. You will extend fsdb to host ramfs as well as your own filesystem.

1 Getting Started

Pull the latest sources from the class project repo. You should see some new directories:

- lpfs: A filesystem skeleton. The compatibility layer also lives in here.
- ramfs: A compact version of linux/fs/ramfs. Note that ramfs/compat.c is symlinked to lpfs/compat.c. You can mount a ramfs by running mount -t ramfs ramfs /mnt.
- userspace: Miscellaneous tools to help build and debug your filesystem.

Run sudo make fsdb. You should see some interesting output. A reduced version follows:

    dd if=/dev/zero of=.lpfs/disk.img bs=1M count=128
    make reset_loop
    .lpfs/mkfs-lp /dev/loop0
    Disk formatted successfully.
    .lpfs/fsdb /dev/loop0 snapshot=0
    (info) lpfs: mount /dev/sda, snapshot=0
    Registered filesystem
    fsdb>

The build system creates, mounts and formats a disk image for you. It uses a loop device to accomplish this. Then it invokes fsdb on your new disk, leaving you ready to debug. Since we're relying on the build system to do some interesting work, it's crucial that you thoroughly understand */Makefile.mk. You may occasionally need to extend the build system, so reading through these files early on is worthwhile.

Let's make a small modification to lpfs to see how everything works. Go to the bottom of lpfs/struct.h and uncomment the LPFS_DARRAY_TEST macro. This will cause the filesystem to run sanity checks on its block layer abstraction code ("darray") instead of actually mounting. Now when you run sudo make fsdb, you should see this:

    (info) lpfs: mount /dev/sda, snapshot=0
    (info) lpfs: Starting darray tests.
    (info) lpfs: darray tests passed!
    (info) Note: proceeding to graceful crash...

Looks like the tests pass in userspace. The next step is to run them in the kernel to make sure this wasn't a fluke. Run make linux, then ./boot qemu, and finally mount -t lpfs /dev/sda /mnt in the guest's shell. If you see the same success messages, feel free to do a happy hacker dance. Life is short.

2 lpfs Structures and Interfaces

In an attempt to make this lab manageable, we've designed a set of on-disk structures that define lpfs. These structures are defined in lpfs/lpfs.h. If you need to modify these structures, make sure that you also update the formatting program (lpfs/mkfs-lp.c). Failure to do this will result in corrupted images.

The main structure you'll find inside here is struct lp_superblock_fmt, which defines the on-disk format of an lpfs superblock. Superblocks are a concept that exists in most UNIX-derived systems. The superblock is the first block in the on-disk filesystem image and contains all the information necessary to initialize a filesystem image. This block is loaded when the OS attempts to mount a block device using a particular filesystem implementation, and it is parsed by that implementation.

As you can probably see from the superblock structure, lpfs is a log-structured filesystem. LFS, the first log-structured file system, is described in a research paper online: edu/~brewer/cs262/lfs.pdf. lpfs largely follows the design of LFS: data is stored in segments that are written serially, the SUT contains segment utilization information, and garbage collection must be performed to free segments for later use.

The one major difference is that lpfs uses a statically-placed journal instead of LFS's dynamic journal. The goal here is to aid crash recovery: if the journal is static then it should be easier to find. Another minor difference is that lpfs supports snapshots. Effectively what this means is that you can make a system call that tells lpfs to keep around an exact copy of the filesystem at some particular point in time. This maps well to log-structured filesystems.

lpfs/lpfs.h also contains the on-disk structures that describe files, directories, and journal entries. These pretty much mirror the structure of a traditional UNIX filesystem; most of the interesting bits in lpfs are in the log.
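
To make the relationship between the formatting program and the mount path concrete, here is a minimal userspace sketch of writing and then parsing a superblock in block 0. Everything in it is an assumption made for illustration: the demo_superblock fields, DEMO_MAGIC, and the little-endian layout are hypothetical and are not the layout in lpfs/lpfs.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAGIC      0x4c504653u  /* hypothetical magic number */
#define DEMO_BLOCK_SIZE 4096u        /* hypothetical block size */

/* Hypothetical in-memory view of a superblock; NOT the format in lpfs/lpfs.h. */
struct demo_superblock {
    uint32_t magic;
    uint32_t block_size;
    uint32_t segment_blocks;  /* blocks per segment */
    uint32_t nr_segments;     /* segments on the device */
};

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Load a 32-bit little-endian value. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* What a formatting program does: lay the superblock out in block 0. */
static void demo_format(uint8_t *block0, uint32_t segment_blocks,
                        uint32_t nr_segments)
{
    assert(block0 != NULL);
    memset(block0, 0, DEMO_BLOCK_SIZE);
    put_le32(block0 + 0, DEMO_MAGIC);
    put_le32(block0 + 4, DEMO_BLOCK_SIZE);
    put_le32(block0 + 8, segment_blocks);
    put_le32(block0 + 12, nr_segments);
}

/* What mount does: read block 0 back and refuse anything that looks wrong.
 * Returns 0 on success and -1 if the block is not a valid superblock. */
static int demo_parse(const uint8_t *block0, struct demo_superblock *sb)
{
    sb->magic = get_le32(block0 + 0);
    sb->block_size = get_le32(block0 + 4);
    sb->segment_blocks = get_le32(block0 + 8);
    sb->nr_segments = get_le32(block0 + 12);

    if (sb->magic != DEMO_MAGIC || sb->block_size != DEMO_BLOCK_SIZE)
        return -1;
    if (sb->segment_blocks == 0 || sb->nr_segments == 0)
        return -1;
    return 0;
}
```

The offsets in demo_format() and demo_parse() have to agree byte for byte. That is the same coupling that exists between lpfs/lpfs.h and lpfs/mkfs-lp.c: change one side without the other and mount will either reject the image or misread it.
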

lpfs/struct.h summarizes the important interfaces the filesystem relies on. You will notice that much of the code (including the entire transaction system and some of the mount logic) is far from complete. You will need to implement all of this. lpfs/inode.c takes care of loading and filling in batches of inodes. lpfs/inode_map.c tracks inode mappings: these objects define a snapshot by specifying an on-disk byte address for every live inode. lpfs/darray.c implements an abstraction on top of the buffer_head API. It presents a picture of a segment as a contiguous array, handles locking, and can sync your buffers to disk. The problem with darray is that the buffer_head interface is quite bloated. When the rest of your filesystem is done, you should rewrite darray using the lighter bio interface.

3 The Linux VFS Layer

Filesystems are one of the more complicated aspects of an operating system (by lines of code, only drivers/ and arch/ are bigger than fs). Luckily for you, Linux provides something known as the VFS (Virtual Filesystem Switch) layer that is designed to help manage this complexity. In Linux, all filesystems [1] are implemented using the VFS layer. Because UNIX is designed to map pretty much every operation to the filesystem, the VFS layer plays a central role in Linux. Figure 1 shows exactly where the VFS layer lives and how it plugs into the rest of Linux.

Linux's VFS documentation is very good and can be found at linux/Documentation/filesystems/vfs.txt. You will need to read this document to complete this lab. In that directory you'll also find documentation for other filesystems, which may or may not be useful. VFS was kind of hacked on top of an early UNIX filesystem, which still more-or-less exists as ext2 in Linux today.

[1] There's also FUSE, which maps VFS calls into userspace, but FUSE itself hooks into VFS so I think it still counts.

Figure 1: A map of Linux's VFS layer

3.1 Operation Tables

The primary means of interfacing your filesystem with the VFS layer consists of filling out operation tables with callbacks that perform the operations specific to your filesystem. There are three of these tables: struct super_operations, which defines operations that are global across your filesystem; struct inode_operations, which defines methods on inodes; and struct file_operations, which defines operations that are local to a particular file. This split is largely historical: on the original UNIX directories and files were both accessed via the same system calls, so a different function was required to differentiate between the two.

The simplest disk-based filesystem I know of is ext2, which you can find in the Linux sources (linux/fs/ext2/). If you look at how they define these operation tables, you'll notice that a significant fraction of them can be filled out using generic mechanisms provided by Linux. You'll want to take advantage of this so you can avoid re-writing a whole bunch of stuff that already works.

3.2 Inode Cache

Linux's VFS layer was designed with traditional UNIX filesystems in mind, and as such has a number of UNIX filesystem concepts baked into it. You've already seen one of these with the whole directory/file distinction, but another important one is that the VFS layer directly talks to your filesystem in terms of inodes. While this was probably originally a decision that stemmed from Sun's attempts at hacking NFS into their UNIX, today it has important performance implications: specifically, Linux will cache inodes in something (quite sensibly) known as the inode cache.

The VFS layer handles the majority of the inode caching logic for you. The one thing you'll have to be aware of is that reference counting is done on these inodes to ensure that they're never freed while they can still be accessed from anywhere else within the VFS layer.
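
The reference counting discipline can be modeled in a few lines of userspace C. This is only a sketch of the idea: struct demo_inode and the demo_inode_* helpers are hypothetical names invented here, and the counter is a plain int with no locking, which would not be safe in the kernel. The kernel has its own helpers for this (ihold() and iput(), for example); check the VFS documentation before relying on their exact semantics.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cached inode; not the kernel's struct inode. */
struct demo_inode {
    unsigned long ino;
    int refcount;            /* number of live references */
};

/* Number of demo inodes currently allocated, so leaks are observable. */
static int demo_live_inodes;

/* Create an inode; the creator holds the first reference. */
static struct demo_inode *demo_inode_new(unsigned long ino)
{
    struct demo_inode *inode = malloc(sizeof(*inode));

    if (!inode)
        return NULL;
    inode->ino = ino;
    inode->refcount = 1;
    demo_live_inodes++;
    return inode;
}

/* Take an extra reference before handing the inode to another subsystem,
 * for instance a garbage collector that will look at it later. */
static struct demo_inode *demo_inode_get(struct demo_inode *inode)
{
    assert(inode->refcount > 0);
    inode->refcount++;
    return inode;
}

/* Drop one reference; the inode is freed when the last one goes away. */
static void demo_inode_put(struct demo_inode *inode)
{
    assert(inode->refcount > 0);
    if (--inode->refcount == 0) {
        demo_live_inodes--;
        free(inode);
    }
}
```

Every demo_inode_get() needs exactly one matching demo_inode_put(). Too few puts leaks the inode; too many frees it while somebody else still holds a pointer to it.
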
If you end up manually passing around inodes (for example, your garbage collection layer might do this) then you'll need to be sure to keep the reference counts coherent.

3.3 Directory Cache

A number of VFS system calls do path lookup operations. In order to speed these up, Linux maintains something known as the dcache, which is a cache of partially resolved directory lookups. What this means to you as a VFS programmer is that you're pretty much isolated from doing any sort of name resolution: you simply need to provide methods that re-populate the dcache on requests from Linux. Since dentries are cached, you're going to have to remove them from the cache on rmdir(), as otherwise Linux won't know that they've disappeared. The VFS documentation describes how to modify the dcache in order to do this.

4 The Linux Block Layer

The whole purpose of a filesystem is to provide access to block devices in a more friendly manner for users. Thus, one of the primary interfaces that filesystems use is the block layer. Linux's block layer was designed to provide high performance access to rotating disks, which imparts a significant amount of complexity into it. Luckily for you, the Linux Device Drivers book contains a great description of the block layer (the multiqueue changes aren't in until 3.13, so you're safe). The section on Request Processing contains all the information you'll need to deal with the block layer for this lab.

Note that before you dive into the device driver book, you'll want to look at the page cache. Linux abstracts the vast majority of the block layer behind an in-RAM cache in order to improve performance, so you're going to want to read up on the page cache first.

4.1 Page Cache

Modern systems tend to have significantly more physical memory than would be required just to hold each running process's segments. In order to take advantage of this extra memory, Linux fills otherwise unused memory pages with a cache of disk blocks in the hope that these in-memory copies can be used by programs. It turns out this cache is one of the most important performance considerations for the sorts of machines that are common today.

Caching disk blocks in memory is extremely important: it hides latency and increases bandwidth for both disk reads and writes. On most machines almost all operations hit in the page cache, and disk IO is relegated to simply providing a persistent backing store. Since the performance impact of the page cache is so large, Linux provides significant shared code to help manage the cache for your filesystem. In fact, the page cache is so ingrained into Linux file systems that it would be pretty much impossible to write an on-disk filesystem without using the page cache.

Luckily, Linux's implementation of the page cache is pretty much transparent to your filesystem. You register your filesystem with the page cache by filling out a struct address_space_operations and attaching it to your inodes. These callbacks will then get called at the appropriate times by the page cache when it wants to operate on your filesystem. For example, your filesystem will probably need to have some specific page cache eviction function that ensures pages make it back to disk before eviction.

4.2 Device Mapper

So far we have discussed block devices assuming they map to a physical block device such as a hard disk. While this was the original purpose of the block layer, it has since been expanded with what's known as the Device Mapper (oftentimes just "DM") framework. DM allows code that targets the block layer to be backed by a virtual block device, an example of which may be a software RAID configuration.
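
A virtual block device of this kind can be pictured as one device stacked on top of another behind a common interface. The sketch below is a userspace toy, not the Device Mapper API: struct demo_blkdev, the RAM-backed lower device, and the stacked device are all hypothetical names, and the XOR transform is a placeholder for "some computation" that provides no security whatsoever.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_SECTOR  512   /* bytes per sector */
#define DEMO_SECTORS 8     /* sectors on the toy device */

/* Anything that can read and write whole sectors. */
struct demo_blkdev {
    int (*read)(struct demo_blkdev *dev, unsigned sector, uint8_t *buf);
    int (*write)(struct demo_blkdev *dev, unsigned sector, const uint8_t *buf);
    void *priv;
};

/* The "physical" device: a chunk of RAM. */
struct demo_ram {
    uint8_t data[DEMO_SECTORS][DEMO_SECTOR];
};

static int ram_read(struct demo_blkdev *dev, unsigned sector, uint8_t *buf)
{
    struct demo_ram *ram = dev->priv;

    if (sector >= DEMO_SECTORS)
        return -1;
    memcpy(buf, ram->data[sector], DEMO_SECTOR);
    return 0;
}

static int ram_write(struct demo_blkdev *dev, unsigned sector,
                     const uint8_t *buf)
{
    struct demo_ram *ram = dev->priv;

    if (sector >= DEMO_SECTORS)
        return -1;
    memcpy(ram->data[sector], buf, DEMO_SECTOR);
    return 0;
}

/* The virtual device: transform the data, then forward to the lower device. */
struct demo_xor {
    struct demo_blkdev *lower;
    uint8_t key;             /* toy transform only; NOT encryption */
};

static int xor_read(struct demo_blkdev *dev, unsigned sector, uint8_t *buf)
{
    struct demo_xor *x = dev->priv;
    int i;

    if (x->lower->read(x->lower, sector, buf) != 0)
        return -1;
    for (i = 0; i < DEMO_SECTOR; i++)
        buf[i] ^= x->key;
    return 0;
}

static int xor_write(struct demo_blkdev *dev, unsigned sector,
                     const uint8_t *buf)
{
    struct demo_xor *x = dev->priv;
    uint8_t tmp[DEMO_SECTOR];
    int i;

    for (i = 0; i < DEMO_SECTOR; i++)
        tmp[i] = buf[i] ^ x->key;
    return x->lower->write(x->lower, sector, tmp);
}
```

Code that talks to the upper struct demo_blkdev cannot tell that a second device sits underneath it. That is the property that lets a filesystem run unchanged on top of a stacked device.
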

DM then loops back into the block layer to actually satisfy requests, after performing some sort of arbitrary computation. For this assignment, you will be using device mapper to provide the relevant cryptographic operations required by the lab document. This boils down to invoking cryptsetup on your loop device.

5 Other Useful Kernel Primitives

The VFS layer depends on a large amount of shared kernel code. While you may be familiar with some of these systems (atomic operations, for example), there are a number of systems you probably haven't seen before.

5.1 Slab Allocation

By this point you've probably noticed that the VFS layer interacts with a number of different caches. All of these caches use slab allocation in order to allocate memory as efficiently as possible. The default Linux slab allocator is known as SLUB (yes, that's not a typo: the old one was SLAB). While the various caches should hide the details of slab allocation from you, it can still be useful to know what's going on behind the scenes.

The general idea behind slab allocation is to speed up memory management when objects are reused many times. It does this by keeping around a cache of already initialized objects and returning one of those rather than creating a new object. This saves the overhead of re-initializing objects over and over again. Slab caches also provide significant additional advantages in both time and space by coalescing objects of the same kind during allocation. Linux's slab allocator can be accessed through the kmem_cache_* methods.
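
The "cache of already initialized objects" idea can be sketched in userspace as follows. This is a deliberately tiny model and not how SLUB works internally: struct demo_cache, its fixed-size free array, and struct demo_entry are hypothetical. In the kernel you would use the real interface (kmem_cache_create(), kmem_cache_alloc(), kmem_cache_free()) rather than anything like this.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_CACHE_SLOTS 16  /* how many idle objects the cache retains */

/* A cache of equally sized objects that are constructed only once. */
struct demo_cache {
    size_t obj_size;
    void (*ctor)(void *obj);             /* runs on fresh objects only */
    void *free_objs[DEMO_CACHE_SLOTS];   /* idle, still-constructed objects */
    int nr_free;
    int nr_fresh;                        /* how often we had to call malloc */
};

static void demo_cache_init(struct demo_cache *c, size_t obj_size,
                            void (*ctor)(void *obj))
{
    memset(c, 0, sizeof(*c));
    c->obj_size = obj_size;
    c->ctor = ctor;
}

/* Hand out an idle object if there is one; otherwise build a new one. */
static void *demo_cache_alloc(struct demo_cache *c)
{
    void *obj;

    if (c->nr_free > 0)
        return c->free_objs[--c->nr_free];   /* no malloc, no constructor */

    obj = malloc(c->obj_size);
    if (!obj)
        return NULL;
    c->nr_fresh++;
    if (c->ctor)
        c->ctor(obj);
    return obj;
}

/* Return an object. The caller must hand it back in its constructed state,
 * because the constructor will not run again when it is reused. */
static void demo_cache_free(struct demo_cache *c, void *obj)
{
    if (c->nr_free < DEMO_CACHE_SLOTS)
        c->free_objs[c->nr_free++] = obj;
    else
        free(obj);
}

/* Release every idle object still held by the cache. */
static void demo_cache_destroy(struct demo_cache *c)
{
    while (c->nr_free > 0)
        free(c->free_objs[--c->nr_free]);
}

/* Hypothetical cached object and its constructor. */
struct demo_entry {
    int state;
    char name[32];
};

static void demo_entry_ctor(void *obj)
{
    memset(obj, 0, sizeof(struct demo_entry));
}
```

Allocating, freeing, and allocating again returns the same object without a second malloc() or constructor call, which is where the savings described above come from.
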

5.2 Work Queues

Log-structured filesystems essentially involve garbage collection: you'll need to clean up unused segments and merge mostly-empty segments when the machine isn't particularly busy. Userspace garbage collection tends to involve either a signal or a background thread, but neither of these approaches is correct within the kernel. You can't use a signal because your filesystem isn't attached to a particular userspace thread, and forking kernel threads is generally frowned upon because of the resource overhead involved (though you'll notice that there's a whole bunch of them anyway...).

To implement garbage collection you will instead need to use Linux's work queue functionality to defer work for a later time. The idea behind work queues is that there is a pool of event-handling kernel threads that exist the entire time the system is running. The work queue mechanism allows you to enqueue an item of work that will later be dequeued by one of these threads. This allows work to be performed asynchronously without the overhead of creating a bunch of threads. Linux's work queue mechanisms are documented in linux/Documentation/workqueue.txt.

Note that the provided lpfs code currently spawns off placeholder syncer and cleaner threads. We suggest that you get rid of these and use work queues.

5.3 The RCU Subsystem

The RCU (Read, Copy, Update) subsystem is a mechanism for synchronizing particular sorts of operations without actually using any synchronization primitives directly. RCU is commonly used to manage lists of buffers. While this sort of buffer management is common in filesystems, we believe that the RCU subsystem should be hidden from you until you need to start using BIO.

The RCU subsystem is somewhat complicated. We'll eventually cover it in sections, but the Linux documentation for it is very good. It can be found at linux/Documentation/RCU/ (my personal favorite is whatisRCU.txt, but Vedant likes one of the other ones so YMMV).
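
The read, copy, update pattern itself is small enough to sketch with C11 atomics. This is not the kernel's RCU API, and it leaves out the hard part: real RCU waits for a grace period before the old copy is reclaimed, whereas here the caller is simply told to free it once no reader can still be using it. The sketch also assumes a single writer. struct demo_policy and the demo_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical state that is read constantly and changed rarely. */
struct demo_policy {
    int cleaner_threshold;
    int sync_interval;
};

/* Readers and the writer share only this pointer. */
static _Atomic(struct demo_policy *) demo_current;

/* Publish the initial version. Returns 0 on success, -1 on failure. */
static int demo_policy_init(int threshold, int interval)
{
    struct demo_policy *p = malloc(sizeof(*p));

    if (!p)
        return -1;
    p->cleaner_threshold = threshold;
    p->sync_interval = interval;
    atomic_store_explicit(&demo_current, p, memory_order_release);
    return 0;
}

/* Read: one atomic load, no lock taken. The version that was loaded stays
 * internally consistent even if a writer publishes a newer one meanwhile. */
static int demo_read_threshold(void)
{
    struct demo_policy *p =
        atomic_load_explicit(&demo_current, memory_order_acquire);

    return p->cleaner_threshold;
}

/* Copy and update: duplicate the current version, change the copy, then
 * publish it with a single atomic store. Returns the superseded version,
 * which the caller may free only once no reader can still hold it, or
 * NULL if allocation failed (in which case nothing was changed). */
static struct demo_policy *demo_update_threshold(int threshold)
{
    struct demo_policy *old =
        atomic_load_explicit(&demo_current, memory_order_acquire);
    struct demo_policy *fresh = malloc(sizeof(*fresh));

    if (!fresh)
        return NULL;
    *fresh = *old;                        /* copy */
    fresh->cleaner_threshold = threshold; /* update the private copy */
    atomic_store_explicit(&demo_current, fresh, memory_order_release);
    return old;
}
```

Readers never see a half-updated structure, because the only shared write is the pointer swap. Deciding when the old version is safe to free is the part that the real RCU subsystem handles for you.
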

5.4 Wait Queues

Operations that touch block devices are very high latency. This means you're going to have to sleep whenever you submit a request that doesn't hit in the page cache. Linux provides a generic mechanism for sleeping until the completion of an event, known as wait_event. You shouldn't need to access this directly for the majority of your code, as the page cache and BIO code will handle these for you, but you will probably want to look at wait queues for putting your cleaner thread to sleep under certain conditions. Later on, when you convert darray over to the bio interface, you may need to use wait_for_completion to synchronize your I/Os.

6 Schedule and Grading

We'll be following the same sort of checkpoint system that was used for Lab 1: three checkpoints, once a week on Thursdays at 9pm. You'll probably notice that this lab is more heavily loaded towards the third checkpoint. This means it will be particularly important to ensure that your early checkpoints are useful for your later checkpoints.

The requirements for this assignment are specified in terms of the Linux system calls (or, where their names don't match, libc functions). You'll have to translate those into the corresponding VFS operations in order to actually implement the lab.

6.1 Design Document

Here's a subset of the items we're looking for:

- Task separation, work distribution, time estimates

- A plan to manage transactions, snapshots, and the cleaner
- A plan to manage the journal and crash recovery
- Remarks on how on-disk structures will be used
- Proposed changes to lpfs, or a description of your own filesystem
- A brief description of your tests (no mention of ramfs is needed)
- Any questions you may have about the lab

Avoid being vague, since this inhibits our helping you. As a special case of avoiding vagueness, avoid listing "Correctness Constraints" in your documents. Avoid being too specific (i.e., prefer pseudocode over code, aim for brevity and clarity).

6.2 Checkpoint 1

For the first checkpoint you will be implementing a userspace compatibility layer for ramfs, an in-memory filesystem that's already been written for Linux. We've written much of this compatibility layer already, so all you really have to do is implement the struct dentry management code and some of the VFS-provided generic functions in userspace. You'll also need to extend userspace/fsdb.c by invoking the appropriate file_operations and inode_operations methods from command handlers.

- (15 points) Make ramfs work in userspace. All of the fsdb commands should work.
- (15 points) A filesystem test suite, running against ramfs inside of Linux and against your compatibility layer. Your tests may take the form of commands to fsdb (which can be piped in) or shell scripts which can be run in the kernel.
- (10 points) A design document that describes how you will complete the remaining checkpoints.

6.3 Checkpoint 2

For the second checkpoint you'll be implementing a read-only version of lpfs that gets its data from a real block device and runs within Linux (as well as the userspace compatibility wrapper).

- (5 points) mount()
- (5 points) umount()
- (5 points) open()
- (5 points) close()
- (5 points) readdir()
- (5 points) read()
- (5 points) seek()
- (5 points) stat()
- (5 points) statfs()
- (5 points) Encryption
- (5 points) Checksums

6.4 Checkpoint 3

The final checkpoint involves making the core functionality of the lab work. You'll need to be able to perform read and write operations that target a live, on-disk filesystem image. Be sure not to break anything from Checkpoint 2!

- (5 points) mkdir()
- (5 points) rmdir()
- (5 points) write()
- (5 points) truncate()
- (5 points) link()
- (5 points) unlink()
- (5 points) rename()
- (5 points) sync()
- (5 points) fsync()
- (10 points) Snapshotting
- (10 points) Efficient crash recovery
- (1 point) Updates to the darray interface

6.5 Evaluation

We're going to grade your filesystems by invoking fsdb on your tests and by replacing the default init process with a script that stresses a bunch of filesystem operations. As we get the stress tester for your httpd project up and running, we'll also set up a separate test environment where your webserver's DATA_ROOT is backed by your new filesystem.


More information

Announcements. Persistence: Log-Structured FS (LFS)

Announcements. Persistence: Log-Structured FS (LFS) Announcements P4 graded: In Learn@UW; email 537-help@cs if problems P5: Available - File systems Can work on both parts with project partner Watch videos; discussion section Part a : file system checker

More information

Topics. File Buffer Cache for Performance. What to Cache? COS 318: Operating Systems. File Performance and Reliability

Topics. File Buffer Cache for Performance. What to Cache? COS 318: Operating Systems. File Performance and Reliability Topics COS 318: Operating Systems File Performance and Reliability File buffer cache Disk failure and recovery tools Consistent updates Transactions and logging 2 File Buffer Cache for Performance What

More information

FILE SYSTEMS, PART 2. CS124 Operating Systems Winter , Lecture 24

FILE SYSTEMS, PART 2. CS124 Operating Systems Winter , Lecture 24 FILE SYSTEMS, PART 2 CS124 Operating Systems Winter 2015-2016, Lecture 24 2 Files and Processes The OS maintains a buffer of storage blocks in memory Storage devices are often much slower than the CPU;

More information

Long-term Information Storage Must store large amounts of data Information stored must survive the termination of the process using it Multiple proces

Long-term Information Storage Must store large amounts of data Information stored must survive the termination of the process using it Multiple proces File systems 1 Long-term Information Storage Must store large amounts of data Information stored must survive the termination of the process using it Multiple processes must be able to access the information

More information

RCU. ò Walk through two system calls in some detail. ò Open and read. ò Too much code to cover all FS system calls. ò 3 Cases for a dentry:

RCU. ò Walk through two system calls in some detail. ò Open and read. ò Too much code to cover all FS system calls. ò 3 Cases for a dentry: Logical Diagram VFS, Continued Don Porter CSE 506 Binary Formats RCU Memory Management File System Memory Allocators System Calls Device Drivers Networking Threads User Today s Lecture Kernel Sync CPU

More information

OPERATING SYSTEMS CS136

OPERATING SYSTEMS CS136 OPERATING SYSTEMS CS136 Jialiang LU Jialiang.lu@sjtu.edu.cn Based on Lecture Notes of Tanenbaum, Modern Operating Systems 3 e, 1 Chapter 4 FILE SYSTEMS 2 File Systems Many important applications need to

More information

VFS, Continued. Don Porter CSE 506

VFS, Continued. Don Porter CSE 506 VFS, Continued Don Porter CSE 506 Logical Diagram Binary Formats Memory Allocators System Calls Threads User Today s Lecture Kernel RCU File System Networking Sync Memory Management Device Drivers CPU

More information

CHAPTER 11: IMPLEMENTING FILE SYSTEMS (COMPACT) By I-Chen Lin Textbook: Operating System Concepts 9th Ed.

CHAPTER 11: IMPLEMENTING FILE SYSTEMS (COMPACT) By I-Chen Lin Textbook: Operating System Concepts 9th Ed. CHAPTER 11: IMPLEMENTING FILE SYSTEMS (COMPACT) By I-Chen Lin Textbook: Operating System Concepts 9th Ed. File-System Structure File structure Logical storage unit Collection of related information File

More information

Tricky issues in file systems

Tricky issues in file systems Tricky issues in file systems Taylor Riastradh Campbell campbell@mumble.net riastradh@netbsd.org EuroBSDcon 2015 Stockholm, Sweden October 4, 2015 What is a file system? Standard Unix concept: hierarchy

More information

CS-736 Midterm: Beyond Compare (Spring 2008)

CS-736 Midterm: Beyond Compare (Spring 2008) CS-736 Midterm: Beyond Compare (Spring 2008) An Arpaci-Dusseau Exam Please Read All Questions Carefully! There are eight (8) total numbered pages Please put your NAME ONLY on this page, and your STUDENT

More information

CS510 Operating System Foundations. Jonathan Walpole

CS510 Operating System Foundations. Jonathan Walpole CS510 Operating System Foundations Jonathan Walpole File System Performance File System Performance Memory mapped files - Avoid system call overhead Buffer cache - Avoid disk I/O overhead Careful data

More information

Operating Systems. Week 9 Recitation: Exam 2 Preview Review of Exam 2, Spring Paul Krzyzanowski. Rutgers University.

Operating Systems. Week 9 Recitation: Exam 2 Preview Review of Exam 2, Spring Paul Krzyzanowski. Rutgers University. Operating Systems Week 9 Recitation: Exam 2 Preview Review of Exam 2, Spring 2014 Paul Krzyzanowski Rutgers University Spring 2015 March 27, 2015 2015 Paul Krzyzanowski 1 Exam 2 2012 Question 2a One of

More information

Chapter 11: Implementing File Systems

Chapter 11: Implementing File Systems Chapter 11: Implementing File Systems Operating System Concepts 99h Edition DM510-14 Chapter 11: Implementing File Systems File-System Structure File-System Implementation Directory Implementation Allocation

More information

Block Device Scheduling. Don Porter CSE 506

Block Device Scheduling. Don Porter CSE 506 Block Device Scheduling Don Porter CSE 506 Logical Diagram Binary Formats Memory Allocators System Calls Threads User Kernel RCU File System Networking Sync Memory Management Device Drivers CPU Scheduler

More information

Block Device Scheduling

Block Device Scheduling Logical Diagram Block Device Scheduling Don Porter CSE 506 Binary Formats RCU Memory Management File System Memory Allocators System Calls Device Drivers Interrupts Net Networking Threads Sync User Kernel

More information
