SCSI overview. SCSI domain consists of devices and an SDS


SCSI overview
A SCSI domain consists of devices and an SDS
- Devices: host adapters & SCSI controllers
- Service Delivery Subsystem connects the devices, e.g., a SCSI bus
A SCSI-2 bus (SDS) connects up to 8 devices
- Controllers can have > 1 logical unit (LUN)
- Typically, the controller is built into the disk with 1 LUN per target, but bridge controllers can manage multiple physical devices
Each device can assume the role of initiator or target
- Traditionally, the host adapter was the initiator and the controller the target
- Now controllers also act as initiators (e.g., for the COPY command)
- A typical domain has 1 initiator and 1+ targets

SCSI requests
A request is a command from initiator to target
- Once it is transmitted, the target has control of the bus
- The target may disconnect from the bus and later reconnect (very important with multiple targets, or even for multitasking)
Commands contain the following:
- Task identifier: initiator ID, target ID, LUN, tag
- Command descriptor block: e.g., "read 10 blocks at position N"
- Optional task attribute: SIMPLE, ORDERED, HEAD OF QUEUE
- Optional: output/input buffer, sense data
- Status byte: GOOD, CHECK CONDITION, INTERMEDIATE, ...
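The "read 10 blocks at position N" example above is a READ(10) command. A minimal sketch of building its 10-byte command descriptor block, with the field layout from the SCSI-2 spec (the struct and helper names are ours):

```c
#include <stdint.h>
#include <string.h>

/* READ(10) command descriptor block: opcode, big-endian LBA,
 * big-endian transfer length.  Layout per SCSI-2. */
struct read10_cdb {
    uint8_t opcode;      /* 0x28 = READ(10) */
    uint8_t flags;
    uint8_t lba[4];      /* logical block address, big-endian */
    uint8_t group;
    uint8_t length[2];   /* transfer length in blocks, big-endian */
    uint8_t control;
};

/* Build "read nblocks blocks at position lba". */
static void make_read10(struct read10_cdb *c, uint32_t lba, uint16_t nblocks)
{
    memset(c, 0, sizeof *c);
    c->opcode = 0x28;
    c->lba[0] = (uint8_t)(lba >> 24);
    c->lba[1] = (uint8_t)(lba >> 16);
    c->lba[2] = (uint8_t)(lba >> 8);
    c->lba[3] = (uint8_t)lba;
    c->length[0] = (uint8_t)(nblocks >> 8);
    c->length[1] = (uint8_t)nblocks;
}
```

The initiator sends these 10 bytes to the target, which then takes over the bus as described above.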

Executing SCSI commands
Each LUN maintains a queue of tasks
- Each task is DORMANT, BLOCKED, ENABLED, or ENDED
- SIMPLE tasks stay dormant while any ORDERED/HEAD OF QUEUE tasks are queued
- ORDERED tasks stay dormant while any HEAD OF QUEUE or more recent ORDERED tasks are queued
- HEAD OF QUEUE tasks begin in the ENABLED state
Task management commands are available to the initiator
- Abort/terminate task, reset target, etc.
Linked commands
- The initiator can link commands, so no other tasks intervene
- E.g., could be used to implement an atomic read-modify-write
- Intermediate commands return the status byte INTERMEDIATE
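One reading of the queueing rules above, as a small predicate over a LUN's task queue (the enum and function names are ours, not from the SCSI spec):

```c
#include <stddef.h>

enum task_attr  { SIMPLE, ORDERED, HEAD_OF_QUEUE };
enum task_state { DORMANT, BLOCKED, ENABLED, ENDED };

/* May the task at queue position i leave DORMANT?
 * - HEAD OF QUEUE tasks begin enabled.
 * - SIMPLE tasks wait while any ORDERED/HEAD OF QUEUE task is queued.
 * - ORDERED tasks wait while any HEAD OF QUEUE task, or any ORDERED
 *   task queued ahead of them, remains. */
static int can_enable(const enum task_attr *q, size_t n, size_t i)
{
    if (q[i] == HEAD_OF_QUEUE)
        return 1;
    for (size_t j = 0; j < n; j++) {
        if (j == i)
            continue;
        if (q[j] == HEAD_OF_QUEUE)
            return 0;
        if (q[i] == SIMPLE && q[j] == ORDERED)
            return 0;
        if (q[i] == ORDERED && q[j] == ORDERED && j < i)
            return 0;
    }
    return 1;
}
```

This is only a sketch of the ordering constraints; the real state machine also involves BLOCKED (after an error) and the disconnect/reconnect protocol.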

SCSI exceptions and errors
After an error, the target stops executing most SCSI commands
- Target returns with CHECK CONDITION status
- Initiator will eventually notice the error
- Must read the specifics with REQUEST SENSE
This prevents unwanted commands from executing
- E.g., the initiator may not want to execute the 2nd write if the 1st fails
It also simplifies device implementation
- The device need not remember more than one error condition
The same mechanism is used to signal media changes
- I.e., ejected tape, changed CD-ROM

But back in the 80s...
Disks spun at 3,600 RPM
- 17 ms/rotation (vs. 4 ms on the fastest disks today)
Fixed # of sectors per track (no zoning)
Head switching free (?)
Requests issued one at a time
- No caching in disks
- The head must pass over the next sector while the OS is still handling a completed read
- By the time the OS issues the next request, it is too late for that sector
Slower CPUs and memory
- Noticeable cost for block allocation algorithms

Original Unix file system
Each FS breaks its partition into three regions:
- Superblock (parameters of the file system, free pointer)
- Inodes: type/mode/size + pointers to data blocks
- File and directory data blocks
All data blocks are 512 bytes
Free blocks are kept in a linked list

Inodes
[Figure: a directory entry maps a file name to an i-number; the inode holds metadata plus direct data pointers, an indirect pointer, and a double-indirect pointer; an indirect block in turn holds pointers to data blocks.]
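The pointer structure in the figure can be sketched as a lookup: given a file-block index, decide whether it lives behind a direct pointer, the single-indirect block, or the double-indirect block. The 12 direct pointers match BSD FFS; `ptrs_per_blk` depends on block size and pointer width:

```c
#include <stdint.h>

#define NDIRECT 12   /* direct pointers per inode (BSD FFS value) */

/* Returns 0 for a direct pointer, 1 for single-indirect, 2 for
 * double-indirect; *slot receives the index at that top level. */
static int ptr_level(uint64_t blkno, uint64_t ptrs_per_blk, uint64_t *slot)
{
    if (blkno < NDIRECT) {                /* direct pointer in the inode */
        *slot = blkno;
        return 0;
    }
    blkno -= NDIRECT;
    if (blkno < ptrs_per_blk) {           /* entry in the indirect block */
        *slot = blkno;
        return 1;
    }
    blkno -= ptrs_per_blk;
    *slot = blkno / ptrs_per_blk;         /* which indirect block under
                                             the double-indirect block */
    return 2;
}
```

With 4K blocks and 4-byte disk addresses, `ptrs_per_blk` is 1024, so the single-indirect block alone covers the next 4 MB of a file.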

Problems with the original FS
The FS never transfers more than 512 bytes per disk access
After a while, allocation is essentially random
- Requires a random seek for every 512 bytes of file data
Inodes are far from both directory data and file data
Within a directory, inodes are not near each other
Usability problems:
- File names limited to 14 characters
- No way to update a file atomically and guarantee its existence after a crash

Fast file system
New block size must be at least 4K
- To avoid wasting space, use fragments for the ends of files
Cylinder groups avoid spreading inodes around the disk
Bitmaps replace the free list
The FS reserves space to improve allocation
- Tunable parameter, default 10%
- Only the superuser can use that space once the FS is over 90% full
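The space saved by fragments is easy to quantify: full blocks for most of a file, with only the tail rounded up to a fragment boundary instead of a whole block. A small sketch (4K blocks with 1K fragments are the common FFS configuration; this ignores indirect blocks and other metadata):

```c
#include <stdint.h>

/* Disk space consumed by a file of the given size when the tail is
 * rounded up to fragments rather than to a full block. */
static uint64_t space_used(uint64_t size, uint64_t blk, uint64_t frag)
{
    uint64_t full = (size / blk) * blk;              /* whole blocks */
    uint64_t tail = size - full;                     /* partial tail */
    return full + ((tail + frag - 1) / frag) * frag; /* tail in frags */
}
```

A 5,000-byte file thus takes one 4K block plus one 1K fragment (5,120 bytes), where block-only allocation would need two full blocks (8,192 bytes).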

FFS superblock
Contains file system parameters
- Disk characteristics, block size, CG info
- Information necessary to find an inode given its i-number
Replicated once per cylinder group
- At shifting offsets, so as to span multiple platters
- Contains a magic number to find the replicas if the 1st superblock dies
Contains non-replicated summary info
- # of blocks, fragments, inodes, directories in the FS
- Flag stating whether the FS was cleanly unmounted

Cylinder groups
Group related inodes and their data
Each contains a number of inodes (set when the FS is created)
- Default: one inode per 2K of data
Contains file and directory blocks
Contains bookkeeping information
- Block map: bit map of available fragments
- Summary info within the CG: # of free inodes, blocks/frags, files, directories
- # of free blocks by rotational position (8 positions)

Inode allocation
Allocate inodes in the same CG as their directory if possible
New directories are put in new cylinder groups
- Consider CGs with a greater than average # of free inodes
- Choose the CG with the smallest # of directories
Within a CG, inodes are allocated "randomly" (next free)
- Would like related inodes as close as possible
- OK, because one CG doesn't have that many inodes
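The new-directory policy above can be sketched as a simple scan (a simplification of the real `ffs_dirpref` logic; the function name is ours):

```c
/* Pick a cylinder group for a new directory: among CGs with at least
 * the average number of free inodes, choose the one with the fewest
 * directories.  Returns -1 if ncg <= 0. */
static int pick_cg(const int *free_inodes, const int *ndirs, int ncg)
{
    long total = 0;
    for (int i = 0; i < ncg; i++)
        total += free_inodes[i];
    long avg = ncg > 0 ? total / ncg : 0;

    int best = -1;
    for (int i = 0; i < ncg; i++) {
        if (free_inodes[i] < avg)          /* below average: skip */
            continue;
        if (best < 0 || ndirs[i] < ndirs[best])
            best = i;
    }
    return best;
}
```

Spreading new directories across CGs this way keeps any one group from filling with inodes while others sit empty.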

Fragment allocation
Allocate space when the user writes beyond the end of a file
Want the last block to be a fragment if it is not full-size
- If it is already a fragment and contains space for the write, done
- Else, must deallocate any existing fragment and allocate a new one
If no appropriate free fragments exist, break up a full block
Problem: slow for many small writes
- (Partial) solution: new stat struct field st_blksize
- Tells applications the file system block size
- The stdio library can buffer this much data
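st_blksize is a real POSIX stat field: the preferred I/O transfer size for the file, which stdio uses to size its buffers so that small writes are batched into block-sized ones. A minimal usage sketch:

```c
#include <sys/stat.h>

/* Preferred I/O size for the file at path, per stat's st_blksize.
 * Returns -1 on error. */
static long preferred_io_size(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_blksize;
}
```

An application doing many small appends can allocate a buffer of this size and flush it once full, avoiding the repeated fragment reallocation described above.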

Block allocation
Try to optimize for sequential access
- If available, use a rotationally close block in the same cylinder
- Otherwise, use a block in the same CG
- If the CG is totally full, find another CG with quadratic hashing
- Otherwise, search all CGs for some free space
Problem: don't want one file filling up a whole CG
- Otherwise other inodes will have their data far away
Solution: break big files over many CGs
- But use large extents in each CG, so sequential access doesn't require many seeks
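The CG search order above can be sketched as quadratic probing with an exhaustive fallback (the real `ffs_hashalloc` differs in detail; this probe sequence is just one common quadratic formulation):

```c
/* Find a cylinder group with free space, starting at the preferred CG.
 * First probe quadratically, then fall back to scanning every CG.
 * Returns -1 if the file system is full. */
static int find_cg(const int *has_space, int ncg, int start)
{
    for (int i = 0; i < ncg; i++) {       /* quadratic probes */
        int cg = (start + i * i) % ncg;
        if (has_space[cg])
            return cg;
    }
    for (int cg = 0; cg < ncg; cg++)      /* exhaustive fallback */
        if (has_space[cg])
            return cg;
    return -1;
}
```

Quadratic probing spreads the search across the disk quickly when nearby CGs are full, while the fallback guarantees any free space is eventually found.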

Directories
Inodes like files, but with different type bits
Contents treated as 512-byte chunks
Each chunk holds direct structures with:
- 32-bit i-number
- 16-bit size of directory entry
- 8-bit file type (NEW)
- 8-bit length of file name
Coalesce when deleting
- If the first direct in a chunk is deleted, set its i-number to 0
Periodically compact directory chunks
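The four fields above match BSD's struct direct; the record-length helper below is our illustration of how the 16-bit entry size is computed (entries are padded to a 4-byte boundary):

```c
#include <stdint.h>

/* Fixed header of an FFS directory entry, per the fields above;
 * the name follows immediately after the header. */
struct direct {
    uint32_t d_ino;      /* i-number; 0 marks a deleted entry */
    uint16_t d_reclen;   /* total size of this entry in bytes */
    uint8_t  d_type;     /* file type (the NEW 8-bit field) */
    uint8_t  d_namlen;   /* length of the file name */
    /* then: d_namlen name bytes + NUL, padded to a 4-byte boundary */
};

/* Minimum record length for a name of the given length. */
static uint16_t direct_reclen(uint8_t namlen)
{
    return (uint16_t)((sizeof(struct direct) + namlen + 1 + 3) & ~3u);
}
```

Coalescing on delete works by growing the previous entry's d_reclen to swallow the freed entry, which is why d_reclen can exceed the minimum computed here.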

Updating FFS for the 90s
No longer want to assume rotational delay
- With disk caches, want data contiguously allocated
Solution: cluster writes
- The FS delays writing a block back in order to accumulate more blocks
- Accumulates blocks into 64K clusters, written all at once
Allocation of clusters is similar to fragments/blocks
- Summary info
- Cluster map has one bit per 64K cluster: set if all of it is free
Also read in 64K chunks when doing read-ahead

Dealing with crashes
Suppose all data were written asynchronously
Delete/truncate a file, append to another file, crash
- The new file may reuse a block from the old one
- The old inode may not have been updated
- Cross-allocation!
- Often the inode with the older mtime is the wrong one, but we can't be sure
Append to a file, allocate an indirect block, crash
- The inode points to the indirect block
- But the indirect block may contain garbage

Ordering of updates
Must be careful about the order of updates
- Write the new inode to disk before the directory entry
- Remove the directory name before deallocating the inode
- Write the cleared inode to disk before updating the CG free map
Solution: make many metadata updates synchronous
- Of course, this hurts performance
- E.g., untar runs much slower than disk bandwidth
Note: cannot update buffers already on the disk queue

Fixing corruption: fsck
Summary info is usually bad after a crash
- Scan to check the free block map and block/inode counts
The system may have corrupt inodes (beyond what a simple crash causes)
- Bad block numbers, cross-allocation, etc.
- Do a sanity check; clear inodes containing garbage
Fields in inodes may be wrong
- Count directory entries to verify the link count; if there are no entries but the count is non-zero, move the file to lost+found
- Make sure size and used-data counts match the blocks
Directories may be bad
- Holes are illegal; "." and ".." must be valid; ...
- All directories must be reachable