Physical Representation of Files

Similar documents
CS370: System Architecture & Software [Fall 2014] Dept. Of Computer Science, Colorado State University

CS510 Operating System Foundations. Jonathan Walpole

CS 537 Fall 2017 Review Session

Chapter 11: File System Implementation. Objectives

CSE380 - Operating Systems. Communicating with Devices

CSE 380 Computer Operating Systems

I/O CANNOT BE IGNORED

CISC 7310X. C11: Mass Storage. Hui Chen Department of Computer & Information Science CUNY Brooklyn College. 4/19/2018 CUNY Brooklyn College

Database Systems II. Secondary Storage

Chapter 13: Mass-Storage Systems. Disk Scheduling. Disk Scheduling (Cont.) Disk Structure FCFS. Moving-Head Disk Mechanism

Chapter 13: Mass-Storage Systems. Disk Structure

Tape pictures. CSE 30341: Operating Systems Principles

CS3600 SYSTEMS AND NETWORKS

Appendix D: Storage Systems

Ch 11: Storage and File Structure

I/O CANNOT BE IGNORED

Physical Storage Media

OPERATING SYSTEMS CS3502 Spring Input/Output System Chapter 9

Disk Scheduling. Chapter 14 Based on the slides supporting the text and B.Ramamurthy s slides from Spring 2001

Module 13: Secondary-Storage

Chapter 14: Mass-Storage Systems

File systems CS 241. May 2, University of Illinois

Lecture 23: Storage Systems. Topics: disk access, bus design, evaluation metrics, RAID (Sections )

Chapter 5 Input/Output

Storage and File Structure. Classification of Physical Storage Media. Physical Storage Media. Physical Storage Media

Database Systems. November 2, 2011 Lecture #7. topobo (mit)

Operating Systems. File Systems. Thomas Ropars.

Reliable Computing I

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 420, York College. November 21, 2006

CSE380 - Operating Systems

CS370: Operating Systems [Fall 2018] Dept. Of Computer Science, Colorado State University

CS370: Operating Systems [Fall 2018] Dept. Of Computer Science, Colorado State University

Today: Secondary Storage! Typical Disk Parameters!

CHAPTER 11: IMPLEMENTING FILE SYSTEMS (COMPACT) By I-Chen Lin Textbook: Operating System Concepts 9th Ed.

Disk Scheduling. Based on the slides supporting the text

CSE 153 Design of Operating Systems

Chapter 10: Mass-Storage Systems

OPERATING SYSTEMS CS3502 Spring Input/Output System Chapter 9

CMSC 424 Database design Lecture 12 Storage. Mihai Pop

Mass-Storage Structure

Operating System 1 (ECS-501)

Chapter 4 File Systems. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved

CS370: Operating Systems [Spring 2017] Dept. Of Computer Science, Colorado State University

Components of the Virtual Memory System

Operating Systems. Lecture File system implementation. Master of Computer Science PUF - Hồ Chí Minh 2016/2017

Outlines. Chapter 2 Storage Structure. Structure of a DBMS (with some simplification) Structure of a DBMS (with some simplification)

Silberschatz, et al. Topics based on Chapter 13

Da-Wei Chang CSIE.NCKU. Professor Hao-Ren Ke, National Chiao Tung University Professor Hsung-Pin Chang, National Chung Hsing University

Introduction to I/O and Disk Management

Introduction to I/O and Disk Management

C13: Files and Directories: System s Perspective

I/O, Disks, and RAID Yi Shi Fall Xi an Jiaotong University

Module 13: Secondary-Storage Structure

BBM371- Data Management. Lecture 2: Storage Devices

Long-term Information Storage Must store large amounts of data Information stored must survive the termination of the process using it Multiple proces

Chapter 14: Mass-Storage Systems. Disk Structure

Announcement. Computer Architecture (CSC-3501) Lecture 23 (17 April 2008) Chapter 7 Objectives. 7.1 Introduction. 7.2 I/O and Performance

MASS-STORAGE STRUCTURE

File. File System Implementation. File Metadata. File System Implementation. Direct Memory Access Cont. Hardware background: Direct Memory Access

Chapter 11 I/O Management and Disk Scheduling

Part IV I/O System. Chapter 12: Mass Storage Structure

Storage Systems. Storage Systems

Disks, Memories & Buffer Management

CS5460: Operating Systems Lecture 20: File System Reliability

1. Introduction. Traditionally, a high bandwidth file system comprises a supercomputer with disks connected

CS 554: Advanced Database System

I/O Management and Disk Scheduling. Chapter 11

Operating Systems 2010/2011

V. Mass Storage Systems

EI 338: Computer Systems Engineering (Operating Systems & Computer Architecture)

CPSC 421 Database Management Systems. Lecture 11: Storage and File Organization

CS510 Operating System Foundations. Jonathan Walpole

Operating Systems. Operating Systems Professor Sina Meraji U of T

Storing Data: Disks and Files. Storing and Retrieving Data. Why Not Store Everything in Main Memory? Database Management Systems need to:

Chapter 9: Peripheral Devices: Magnetic Disks

Lecture 15 - Chapter 10 Storage and File Structure

ICS Principles of Operating Systems

Storing and Retrieving Data. Storing Data: Disks and Files. Solution 1: Techniques for making disks faster. Disks. Why Not Store Everything in Tapes?

Lecture 21: Reliable, High Performance Storage. CSC 469H1F Fall 2006 Angela Demke Brown

Chapter 12: Mass-Storage Systems. Operating System Concepts 8 th Edition,

Storing and Retrieving Data. Storing Data: Disks and Files. Solution 1: Techniques for making disks faster. Disks. Why Not Store Everything in Tapes?

Magnetic Disk. Optical. Magnetic Tape. RAID Removable. CD-ROM CD-Recordable (CD-R) CD-R/W DVD

Chapter 12: File System Implementation

CS370 Operating Systems

HP Designing and Implementing HP Enterprise Backup Solutions. Download Full Version :

Administrivia. CMSC 411 Computer Systems Architecture Lecture 19 Storage Systems, cont. Disks (cont.) Disks - review

Chapter 14 Mass-Storage Structure

Chapter 11. I/O Management and Disk Scheduling

Lecture: Storage, GPUs. Topics: disks, RAID, reliability, GPUs (Appendix D, Ch 4)

L9: Storage Manager Physical Data Organization

CS3600 SYSTEMS AND NETWORKS

V. File System. SGG9: chapter 11. Files, directories, sharing FS layers, partitions, allocations, free space. TDIU11: Operating Systems

William Stallings Computer Organization and Architecture 6 th Edition. Chapter 6 External Memory

Storing Data: Disks and Files

Chapter 12: File System Implementation. Operating System Concepts 9 th Edition

Virtual File System -Uniform interface for the OS to see different file systems.

Chapter 10: Mass-Storage Systems

Chapter 12: File System Implementation

Chapter 10: Mass-Storage Systems

Chapter 10: Mass-Storage Systems. Operating System Concepts 9 th Edition

Transcription:

Physical Representation of Files A disk drive consists of a disk pack containing one or more platters stacked like phonograph records. Information is stored on both sides of the platter. Each platter is divided into concentric rings called tracks, and each track is divided into sectors. All transfers to and from the disk are performed at the sector level. For example, to modify a single byte, the system reads the sector containing the byte off the disk, modifies the byte, and rewrites the entire sector. The disk uses a read/write head to transfer data to and from the platter. The operation of moving the head from one track to another is called seeking, and the heads generally move together as a unit. Because the disk heads move together as unit, it is useful to group the tracks at the same head position into logical units called cylinders. Sectors in the same cylinder can be read without an intervening seek. CS 4513 1 week2-disk.tex

Formatting Before a disk can be used, the raw disk must be formatted. Formatting creates sectors of the size appropriate for the target operating system. Some systems allow disks to be formatted by user programs; usually, the manufacturer formats a disk before sending it to a customer. One or more disk drives connects to a disk controller, which handles the details of moving the heads, etc. The controller communicates with the CPU through a host interface. Moreover, through direct memory access (DMA), the controller can access main memory directly, transferring entire sectors without interrupting the CPU. Look at picture. Fig 5-3 from Tanenbaum. Disk Latency Three factors influence the delay, or latency, that occurs in transferring data to/from a disk: 1. seek latency time needed to move the head to the desired track 2. rotational delay time needed to wait for the sector of a track to move under the head 3. transfer delay the amount of time required to read/write the sector CS 4513 2 week2-disk.tex

File Implementation An important (experimental) observation is that: 1. the majority of files are small 2. a few files are large We want to handle both file types efficiently. The operating system may choose to use a larger block size than the sector size of the physical disk. Each block consists of consecutive sectors. Motivation: a larger block size increases the transfer efficiency might be more convenient to have block size match the machine s page size Disk Block The size of transfer convenient for operating system is a disk block. It may be the same size as a sector or larger. Generally moving to larger block size. NTFS uses 4K block size for disks larger than 2GB. FAT-32 uses 4K up to 8GB, 8K up to 16GB, 16K up to 32GB and 32K above 32GB. Ideally want to locate blocks of a file in near proximity on disk to avoid excessive head movement. Can lead to fragmentation with too many small block runs because the size of a file may not be known a priori. Will look at some specific policies later. CS 4513 3 week2-disk.tex

Free-Space Block Management Approaches for keeping track of free blocks: 1. Bit map devote a single bit to each block 2. Linked List (in groups) each element of the list contains the number of a block. This block contains a group of free blocks. Thus for 16-bit block numbers (FAT-16) and 1K block, can fit 512 block numbers in the block. First 511 are really free and the last points to the next block of free blocks. Also can groups set of consecutive free blocks using an address/count approach. Can look at space/time tradeoffs for each approach. Caching Caching is crucial to improve performance. Why? many file blocks will be accessed again soon consider the cycle of editing, compiling, and running a program consider frequently used commands such as ls, vi, etc. Using memory-mapped files integrates the cache for files with the virtual memory system. Otherwise, must maintain a separate file system cache in Operating System. Unfortunately, caching may cause data in memory to become out of step with data on disk. This is known as the cache coherency problem. CS 4513 4 week2-disk.tex

Log-Structured File System Traditional files systems maintain a number of data structures on disk (directory structures, free-block pointers, indoes...). These structures may be cached in memory, but... A crash in the middle of changes to files can leave these structures in an inconsistent state. One solution is to run a consistency check on system reboot to ensure that file system structures are in a consistent state time consuming! Becoming more common to maintain a log of file system metadata changes. Changes are committed as a transaction once written to the log. Completed transactions are removed from log. Incomplete transactions can be replayed on system reboot. Also faster performance for file requests as only log file needs to be written before committing transaction versus waiting for all data structure updates to be completed in critical path of the request. CS 4513 5 week2-disk.tex

File System Reliability File Recovery To guard against complete file loss, most systems recommend that operators dump the file system to archival storage (floppy disks, Zip drives, tape, etc). Thus, some or all files could be restored after a disk failure. Because of the expense of dumping the entire file system, most dumps are incremental, meaning that only files modified (or created) within the last N days are saved. Once a week (month) a complete dump of the file system is performed. Operators cycle through a sequence of dump tapes, reusing a tape only after the files stored on the tape have been archived to a another tape. For instance, daily dump tapes might be recycled only after the weekend dump has saved all files modified since the last weekly dump. Thus, it is not always possible to restore every file from a dump, but chances increase the longer the file has been in existence. For additional safety, backups are often stored off-site. A back might store backups in a different region for increased reliability. CS 4513 6 week2-disk.tex

Redundant Array of Independent Disks (RAID) Section 12.7 of SGG Improve reliability by introducing file system redundancy. There are several levels of RAID disks. Simplest is to use mirroring (RAID level 1) where each disk is shadowed by a copy. Requires double the disk space, but reads can be faster by reading from both disks. Another level is block interleaved parity (RAID level 4). Fewer redundant disks. Assume nine disk blocks. Use one parity block for every eight data blocks. This is called disk striping or interleaving. If one of the disks fails then the data can be recovered using the remaining eight disks and the parity computation. Reads are quick because the data is spread amongst eight disks. Writes are expensive because parity must be recomputed. Problems with RAID RAID protects against physical media errors, but not necessarily software and hardware bugs. Other file systems also introduce checksums for all blocks to verify the stored data is consistent with checksum. Used by Google File System for example. CS 4513 7 week2-disk.tex