A4: Layered Block-Structured File System

Similar documents
Filesystem. Disclaimer: some slides are adopted from book authors slides with permission 1

Filesystem. Disclaimer: some slides are adopted from book authors slides with permission 1

Filesystem. Disclaimer: some slides are adopted from book authors slides with permission

we are here Page 1 Recall: How do we Hide I/O Latency? I/O & Storage Layers Recall: C Low level I/O

we are here I/O & Storage Layers Recall: C Low level I/O Recall: C Low Level Operations CS162 Operating Systems and Systems Programming Lecture 18

CS 318 Principles of Operating Systems

File Systems. CS 4410 Operating Systems. [R. Agarwal, L. Alvisi, A. Bracy, M. George, E. Sirer, R. Van Renesse]

File System Case Studies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

File System CS170 Discussion Week 9. *Some slides taken from TextBook Author s Presentation

CS 318 Principles of Operating Systems

Lecture 19: File System Implementation. Mythili Vutukuru IIT Bombay

Lab 4: Tracery Recursion in C with Linked Lists

Da-Wei Chang CSIE.NCKU. Professor Hao-Ren Ke, National Chiao Tung University Professor Hsung-Pin Chang, National Chung Hsing University

File System Definition: file. File management: File attributes: Name: Type: Location: Size: Protection: Time, date and user identification:

File System Case Studies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

CS 4284 Systems Capstone

CS-537: Final Exam (Spring 2011) The Black Box

Operating Systems CMPSCI 377 Spring Mark Corner University of Massachusetts Amherst

CS370 Operating Systems

ECE 598 Advanced Operating Systems Lecture 18

File System Implementation. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

CS370 Operating Systems

Typical File Extensions File Structure

CS370 Operating Systems

FILE SYSTEM IMPLEMENTATION. Sunu Wibirama

File System Implementation. Sunu Wibirama

CS 550 Operating Systems Spring File System

CS 111. Operating Systems Peter Reiher

File System Implementation

Reminder. COP4600 Discussion 9 Assignment4. Discussion 8 Recap. Question-1(Segmentation) Question-2(I-node) Question-1(Segmentation) 10/24/2010

Basic filesystem concepts. Tuesday, November 22, 2011

Virtual File System. Don Porter CSE 306

OPERATING SYSTEMS ASSIGNMENT 4 FILE SYSTEMS

Chapter 11: File System Implementation

File Systems. CS170 Fall 2018

CSE380 - Operating Systems

Chapter 11: File System Implementation. Objectives

Topics. File Buffer Cache for Performance. What to Cache? COS 318: Operating Systems. File Performance and Reliability

Operating Systems. Week 9 Recitation: Exam 2 Preview Review of Exam 2, Spring Paul Krzyzanowski. Rutgers University.

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

Lesson 09: SD Card Interface

EL2310 Scientific Programming

Operating System Concepts Ch. 11: File System Implementation

File Management 1/34

Outline. Operating Systems. File Systems. File System Concepts. Example: Unix open() Files: The User s Point of View

Virtual File System. Don Porter CSE 506

PRINCIPLES OF OPERATING SYSTEMS

Operating Systems. Operating Systems Professor Sina Meraji U of T

Lecture 24: Filesystems: FAT, FFS, NTFS

CMSC 313 Lecture 13 Project 4 Questions Reminder: Midterm Exam next Thursday 10/16 Virtual Memory

Memory and C/C++ modules

CSE 333 Midterm Exam 5/10/13

File System Internals. Jo, Heeseung

File System Case Studies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

CS 326 Operating Systems C Programming. Greg Benson Department of Computer Science University of San Francisco

Operating Systems Design Exam 2 Review: Fall 2010

File System Structure. Kevin Webb Swarthmore College March 29, 2018

Page 1. Recap: File System Goals" Recap: Linked Allocation"

File Systems Part 2. Operating Systems In Depth XV 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.

Long-term Information Storage Must store large amounts of data Information stored must survive the termination of the process using it Multiple proces

File. File System Implementation. Operations. Permissions and Data Layout. Storing and Accessing File Data. Opening a File

Operating System Labs. Yuanbin Wu

Cache and Virtual Memory Simulations

Operating Systems. Lecture File system implementation. Master of Computer Science PUF - Hồ Chí Minh 2016/2017

File Systems. Kartik Gopalan. Chapter 4 From Tanenbaum s Modern Operating System

537 Final Exam, Old School Style. "The Job Interview" Name: This exam contains 13 old-school pages.

Project 5: GeekOS File System 2

COMP 3361: Operating Systems 1 Final Exam Winter 2009

File System Management

CSC/ECE 506: Computer Architecture and Multiprocessing Program 3: Simulating DSM Coherence Due: Tuesday, Nov 22, 2016

SOFTWARE ARCHITECTURE 2. FILE SYSTEM. Tatsuya Hagino lecture URL.

Unix System Architecture, File System, and Shell Commands

FILE SYSTEMS. Tanzir Ahmed CSCE 313 Fall 2018

FCFS: On-Disk Design Revision: 1.8

CS240: Programming in C

CSE 333 Midterm Exam Sample Solution 5/10/13

The EXT2FS Library. The EXT2FS Library Version 1.37 January by Theodore Ts o

This Unit: Main Memory. Virtual Memory. Virtual Memory. Other Uses of Virtual Memory

CSE 410 Final Exam 6/09/09. Suppose we have a memory and a direct-mapped cache with the following characteristics.

File Systems: Fundamentals

UNIX File Systems. How UNIX Organizes and Accesses Files on Disk

3/26/2014. Contents. Concepts (1) Disk: Device that stores information (files) Many files x many users: OS management

File Systems. What do we need to know?

ò Server can crash or be disconnected ò Client can crash or be disconnected ò How to coordinate multiple clients accessing same file?

The EXT2FS Library. The EXT2FS Library Version 1.38 June by Theodore Ts o

Introduction to OS. File Management. MOS Ch. 4. Mahmoud El-Gayyar. Mahmoud El-Gayyar / Introduction to OS 1

Motivation. Operating Systems. File Systems. Outline. Files: The User s Point of View. File System Concepts. Solution? Files!

File System (FS) Highlights

NFS. Don Porter CSE 506

File Systems: Fundamentals

In Java we have the keyword null, which is the value of an uninitialized reference type

Computer Science 162, Fall 2014 David Culler University of California, Berkeley Midterm 2 November 14, 2014

History. 5/1/2006 Computer System Software CS012 BE 7th Semester 2

1 / 23. CS 137: File Systems. General Filesystem Design

CSE 124 Discussion (10/3) C/C++ Basics

CSE351 Winter 2016, Final Examination March 16, 2016

The bigger picture. File systems. User space operations. What s a file. A file system is the user space implementation of persistent storage.

Recitation 7 Caches and Blocking. 9 October 2017

CSC369 Lecture 9. Larry Zhang, November 16, 2015

3.Constructors and Destructors. Develop cpp program to implement constructor and destructor.

Transcription:

A4: Layered Block-Structured File System CS 4410 Operating Systems Slides originally by Robbert van Renesse.

Introduction File System abstraction that provides persistent, named data Block Store abstraction providing access to a sequence of numbered blocks. (No names.) Physical Device (e.g., DISK) Disk: sectors identified with logical block addresses, specifying surface, track, and sector to be accessed. Layered Abstractions to access storage (HIGHLY SIMPLIFIED FIGURE 11.7 from book) 2

Block Store Abstraction Provides a disk-like interface: a sequence of blocks numbered 0, 1, (typically a few KB) you can read or write 1 block at a time nblocks() read(block_num) write(block_num, block) setsize(size) returns size of the block store in #blocks returns contents of given block number writes block contents at given block num sets the size of the block store A4 has you work with multiple versions / instantiations of this abstraction. 3

Heads up about the code! This entire code base is what happens when you want object oriented programming, but you only have C. Put on your C++ / Java Goggles! block_store_t (a block store type) is essentially an abstract class 4

Contents of block_store.h #define BLOCK_SIZE 512 // # bytes in a block typedef unsigned int block_no; // index of a block typedef struct block { char bytes[block_size]; } block_t; typedef struct block_store { void *state; int (*nblocks)(struct block_store *this_bs); int (*read)(struct block_store *this_bs, block_no offset, block_t *block); int (*write)(struct block_store *this_bs, block_no offset, block_t *block); int (*setsize)(struct block_store *this_bs, block_no size); void (*destroy)(struct block_store *this_bs); } block_store_t; ß poor man s class None of this is data! All typedefs! 5

Block Store Instructions block_store_t *xxx_init( ) Name & signature varies, sets up the fn pointers int nblocks( ) read( ) write( ) setsize( ) destroy() ß destructor ß constructor frees everything associated with this block store 6

sample.c -- just a lone disk #include... #include block_store.h int main(){ block_store_t *disk = disk_init( disk.dev, 1024); block_t block; strcpy(block.bytes, Hello World ); (*disk->write)(disk, 0, &block); (*disk->destroy)(disk); return 0; } RUN IT! IT S COOL! > gcc -g block_store.c sample.c >./a.out > less disk.dev 7

Block Stores can be Layered! Each layer presents a block store abstraction block_store CACHEDISK STATDISK DISK keeps a cache of recently used blocks keeps track of #reads and #writes for statistics keeps blocks in a Linux file 8

A Cache for the Disk? Yes! All requests for a given block go through block cache File System AKA treedisk Block Cache AKA cachedisk Disk Benefit #1: Performance Caches recently read blocks Buffers recently written blocks (to be written later) Benefit #2: Synchronization: For each entry, OS adds information to: prevent a process from reading block while another writes ensure that a given block is only fetched from storage device once, even if it is simultaneously read by many processes 9

layer.c -- code with layers #define CACHE_SIZE 10 // #blocks in cache block_t cache[cache_size]; int main(){ block_store_t *disk = disk_init( disk2.dev, 1024); block_store_t *sdisk = statdisk_init(disk); block_store_t *cdisk = cachedisk_init(sdisk, cache, CACHE_SIZE); } block_t block; strcpy(block.bytes, Farewell World! ); (*cdisk->write)(cdisk, 0, &block); (*cdisk->destroy)(cdisk); (*sdisk->destroy)(sdisk); (*disk->destroy)(disk); return 0; CACHEDISK STATDISK DISK RUN IT! IT S COOL! > gcc -g block_store.c statdisk.c cachedisk.c layer.c >./a.out > less disk2.dev 10

Example Layers block_store_t *statdisk_init(block_store_t *below); // counts all reads and writes block_store_t *debugdisk_init(block_store_t *below, char *descr); // prints all reads and writes block_store_t *checkdisk_init(block_store_t *below); // checks that what s read is what was written block_store_t *disk_init(char *filename, int nblocks) // simulated disk stored on a Linux file // (could also use real disk using /dev/*disk devices) block_store_t *ramdisk_init(block_t *blocks, nblocks) // a simulated disk in memory, fast but volatile 11

How to write a layer struct statdisk_state { block_store_t *below; unsigned int nread, nwrite; }; // block store below // stats layer-specific data block_store_t *statdisk_init(block_store_t *below){ struct statdisk_state *sds = calloc(1, sizeof(*sds)); sds->below = below; block_store_t *this_bs = calloc(1, sizeof(*this_bs)); this_bs->state = sds; this_bs->nblocks = statdisk_nblocks; this_bs->setsize = statdisk_setsize; this_bs->read = statdisk_read; this_bs->write = statdisk_write; this_bs->destroy = statdisk_destroy; return this_bs; } 12

statdisk implementation (cont d) int statdisk_read(block_store_t *this_bs, block_no offset, block_t *block){ struct statdisk_state *sds = this_bs->state; sds->nread++; return (*sds->below->read)(sds->below, offset, block); } int statdisk_write(block_store_t *this_bs, block_no offset, block_t *block){ } struct statdisk_state *sds = this_bs->state; sds->nwrite++; return (*sds->below->write)(sds->below, offset, block); records the stats and passes the request to the layer below void statdisk_destroy(block_store_t *this_bs){ free(this_bs->state); free(this_bs); } 13

Another Possible Layer: Treedisk A file system, similar to Unix file systems Initialized to support N virtual block stores (AKA files) Underlying block store (below) partitioned into 3 sections: 1. Superblock: block #0 2. Fixed number of i-node blocks: starts at block #1 Function of N (enough to store N i-nodes) 3. Remaining blocks: starts after i-node blocks data blocks, free blocks, indirect blocks, freelist blocks block number 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 blocks: super block i-node blocks Remaining blocks 14

Types of Blocks in Treedisk union treedisk_block { block_t datablock; struct treedisk_superblock superblock; struct treedisk_inodeblock inodeblock; struct treedisk_freelistblock freelistblock; struct treedisk_indirblock indirblock; }; Superblock: the 0 th block below Freelistblock: list of all unused blocks below I-nodeblock: list of inodes Indirblock: Datablock: list of blocks just data 15

treedisk Superblock block number 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 blocks: superblock inode blocks remaining blocks // one per underlying block store struct treedisk_superblock { }; block_no n_inodeblocks; n_inodeblocks 4 free_list? (some green box) block_no free_list; // 1 st block on free list // 0 means no free blocks Notice: there are no pointers. Everything is a block number. 16

treedisk Free List block number 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 blocks: 4 13 superblock inode blocks remaining blocks struct treedisk_freelistblock { block_no refs[refs_per_block]; }; Suppose REFS_PER_BLOCK = 4 0 6 7 8 5 10 11 12 9 14 15 0 refs[0]: # of another freelistblock or 0 if end of list refs[i]: # of free block for i > 1, 0 if slot empty 17

treedisk free list superblock: n_inodeblocks # free_list freelist block 0 0 0 0 freelist block free block free block free block free block 18

treedisk I-node block block number 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 blocks: superblock inode blocks remaining blocks Suppose REFS_PER_BLOCK = 4 inode[0] inode[1] struct treedisk_inodeblock { struct treedisk_inode inodes[inodes_per_block]; }; 1 15 0 0 struct treedisk_inode { block_no nblocks; // # blocks in virtual block store block_no root; // block # of root node of tree (or 0) }; 19 9 14 0 0 What if the file is bigger than 1 block?

treedisk Indirect block block number 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 blocks: superblock inode blocks remaining blocks Suppose INODES_PER_BLOCK = 2 inode[0] inode[1] 1 15 3 14 nblocks root nblocks root 13 12 11 0 struct treedisk_indirblock { block_no refs[refs_per_block]; }; 20

virtual block store: 3 blocks i-node: nblocks 3 root indirect block data block data block data block What if the file is bigger than 3 blocks? 21

treedisk virtual block store i-node: nblocks #### root (double) indirect block indirect block indirect block data block data block data block How do I know if this is data or a block number? 22

treedisk virtual block store all data blocks at bottom level #levels: ceil(log RPB (#blocks)) + 1 RPB = REFS_PER_BLOCK For example, if rpb = 16: #blocks #levels 0 0 1 1 2-16 2 17-256 3 257-4096 4 REFS_PER_BLOCK more commonly at least 128 or so 23

virtual block store: with hole i-node: nblocks 3 root 0 indirect block data block data block Hole appears as a virtual block filled with null bytes pointer to indirect block can be 0 too virtual block store can be much larger than the physical block store underneath! 24

Putting it all together block number 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 blocks: 4 9 superblock inode[0] inode[1] 1 15 3 14 inode blocks nblocks root nblocks root 0 6 7 8 5 10 0 0 remaining blocks 13 12 11 0 25

A short-lived treedisk file system #define DISK_SIZE 1024 #define MAX_INODES 128 int main(){ block_store_t *disk = disk_init( disk.dev, DISK_SIZE); treedisk_create(disk, MAX_INODES); treedisk_check(disk); // optional: check integrity of file system (*disk->destroy)(cdisk); } return 0; 26

Example code with treedisk block_t cache[cache_size]; int main(){ block_store_t *disk = disk_init( disk.dev, 1024); block_store_t *cdisk = cachedisk_init(disk, cache, CACHE_SIZE); treedisk_create(disk, MAX_INODES); block_store_t *file0 = treedisk_init(cdisk, 0); block_store_t *file1 = treedisk_init(cdisk, 1); block_t block; (*file0->read)(file0, 4, &block); (*file1->read)(file1, 4, &block); (*file0->destroy)(file0); (*file1->destroy)(file1); (*cdisk->destroy)(cdisk); (*disk->destroy)(cdisk); } return 0; 27

Layering on top of treedisk block_store_t *treedisk_init(block_store_t *below, unsigned int inode_no); // creates a new file associated with inode_no inode 0 inode 1... inode TREEDISK TREEDISK... TREEDISK CACHEDISK DISK 28

trace utility TRACEDISK CHECKDISK CHECKDISK... CHECKDISK TREEDISK TREEDISK... TREEDISK CHECKDISK CACHEDISK STATDISK RAMDISK 29

tracedisk ramdisk is bottom-level block store tracedisk is a top-level block store or application-level if you will you can t layer on top of it block_store_t *tracedisk_init( block_store_t *below, char *trace, // trace file name unsigned int n_inodes); 30

Trace file Commands W:0:3 // write inode 0, block 3 If nothing is known about the file associated with inode 0 prior to this line, by writing to block 3, you are implicitly setting the size of the file to 4 blocks W:0:4 // write to inode 0, block 4 by the same logic, you now set the size to 5 since you've written to block 4 N:0:2 // checks if inode 0 is of size 2 this will fail b/c the size should be 5 S:1:0 // set size of inode 1 to 0 R:1:1 // read inode 1, block 1 this will fail b/c you re reading past the end of the file (there is no block 1 for the file associated with inode 1) 31

Example trace file W:0:0 // write inode 0, block 0 N:0:1 // checks if inode 0 is of size 1 W:1:1 // write inode 1, block 1 N:1:2 // checks if inode 1 is of size 2 R:1:1 // read inode 1, block 1 S:1:0 // set size of inode 1 to 0 N:1:0 // checks if inode 0 is of size 0 if N fails, prints!!chksize.. 32

Compiling and Running run make in the release directory this generates an executable called trace run./trace this reads trace file trace.txt you can pass another trace file as argument./trace myowntracefile 33

$ make cc -Wall... cc -Wall Output to be expected -c -o trace.o trace.c -c -o treedisk_chk.o treedisk_chk.c cc -o trace trace.o block_store.o cachedisk.o checkdisk.o debugdisk.o ramdisk.o statdisk.o tracedisk.o treedisk.o treedisk_chk.o $./trace blocksize: 512 refs/block: 128!!TDERR: setsize not yet supported!!error: tracedisk_run: setsize(1, 0) failed!!chksize 10: nblocks 1: 0!= 2!$STAT: #nnblocks: 0!$STAT: #nsetsize: 0!$STAT: #nread: 32 Trace W:0:0 N:0:1 W:0:1 N:0:2 W:1:0 N:1:1 W:1:1 N:1:2 S:1:0 N:1:0!$STAT: #nwrite: 20 34 Cmd:inode:block

A4: Part 1/3 Implement treedisk_setsize(0) currently it generates an error what you need to do: iterate through all the blocks in the inode put them on the free list Useful functions: treedisk_get_snapshot 35

Implement cachedisk A4: Part 2/3 currently it doesn t actually do anything what you need to do: pick a caching algorithm: LRU, MFU, or design your own go wild! implement it within cachedisk.c write-through cache!! consult the web for caching algorithms! 36

A4: Part 3/3 Implement your own trace file that: is at least 10 lines long uses all 4 commands (RWNS) has an edit distance of at least 6 from the trace we gave you is well-formed. For example, it should not try to verify that a file has a size X when the previous command have in fact determined that it should have size Y. You may find the chktrace.c file useful At most: 10,000 commands, 128 inodes, 1<<27 block size Step 1: use it to convince yourself that your cache is working correctly. Optional Step: make a trace that is hard for a caching layer to be effective (random reads/writes) so that it can be used to distinguish good caches from bad ones. 37

What to submit treedisk.c cachedisk.c trace.txt // with treedisk_setsize(0) 38

The Big Red Caching Contest!!! We will run everybody s trace against everybody s treedisk and cachedisk We will run this on top of a statdisk We will count the total number of read operations The winner is whomever ends up doing the fewest read operations to the underlying disk Does not count towards grade of A4, but you may win fame and glory 39