secondary storage: Secondary Stg Any modern computer system will incorporate (at least) two levels of storage:

Similar documents
1. What is the difference between primary storage and secondary storage?

Course Outline. Overview 1. I. Introduction II. Performance Evaluation III. Processor Design and Analysis. IV. Memory Design and Analysis

Course Outline. Introduction. Intro Computer Organization. Computer Science Dept Va Tech January McQuain & Ribbens

Secondary Storage Devices: Magnetic Disks Optical Disks Floppy Disks Magnetic Tapes CENG 351 1

Module 1: Basics and Background Lecture 4: Memory and Disk Accesses. The Lecture Contains: Memory organisation. Memory hierarchy. Disks.

CS 201 The Memory Hierarchy. Gerson Robboy Portland State University

CSE 153 Design of Operating Systems Fall 2018

Storage and File Structure. Classification of Physical Storage Media. Physical Storage Media. Physical Storage Media

UNIT 2 Data Center Environment

File system internals Tanenbaum, Chapter 4. COMP3231 Operating Systems

Organization of a Surface

Ch 11: Storage and File Structure

1.1 Bits and Bit Patterns. Boolean Operations. Figure 2.1 CPU and main memory connected via a bus. CS11102 Introduction to Computer Science

Disk Scheduling COMPSCI 386

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto

File system internals Tanenbaum, Chapter 4. COMP3231 Operating Systems

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto

EXTERNAL MEMORY (Part 1)

CS 405G: Introduction to Database Systems. Storage

Disks and RAID. CS 4410 Operating Systems. [R. Agarwal, L. Alvisi, A. Bracy, E. Sirer, R. Van Renesse]

CSE 153 Design of Operating Systems

Department of Computer Engineering University of California at Santa Cruz. File Systems. Hai Tao

Hard facts. Hard disk drives

CS420: Operating Systems. Mass Storage Structure

Data Storage and Query Answering. Data Storage and Disk Structure (2)

CMSC 424 Database design Lecture 12 Storage. Mihai Pop

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2017 Lecture 13

Introduction to I/O and Disk Management

CS370 Operating Systems

Introduction to I/O and Disk Management

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 420, York College. November 21, 2006

Components of the Virtual Memory System

The Server-Storage Performance Gap

File Systems Part 1. Operating Systems In Depth XIV 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.

BBM371- Data Management. Lecture 2: Storage Devices

OPERATING SYSTEMS CS3502 Spring Input/Output System Chapter 9

Database Systems. November 2, 2011 Lecture #7. topobo (mit)

Contents. Memory System Overview Cache Memory. Internal Memory. Virtual Memory. Memory Hierarchy. Registers In CPU Internal or Main memory

L7: Performance. Frans Kaashoek Spring 2013

CS370 Operating Systems

Chapter 12: File System Implementation

CS370 Operating Systems

Secondary Storage Management Rab Nawaz Jadoon

I/O 1. Devices and I/O. key concepts device registers, device drivers, program-controlled I/O, DMA, polling, disk drives, disk head scheduling

Principles of Operating Systems CS 446/646

Chapter 11. I/O Management and Disk Scheduling

Disks & Files. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

CSCI-GA Database Systems Lecture 8: Physical Schema: Storage

Database Systems II. Secondary Storage

I/O 1. Devices and I/O. key concepts device registers, device drivers, program-controlled I/O, DMA, polling, disk drives, disk head scheduling

Chapter 6 - External Memory

Introduction To Computer Hardware

Parser. Select R.text from Report R, Weather W where W.image.rain() and W.city = R.city and W.date = R.date and R.text.

STORING DATA: DISK AND FILES

I/O, Disks, and RAID Yi Shi Fall Xi an Jiaotong University

Physical Data Organization. Introduction to Databases CompSci 316 Fall 2018

Principles of Data Management. Lecture #2 (Storing Data: Disks and Files)

CS 471 Operating Systems. Yue Cheng. George Mason University Fall 2017

CSE 451: Operating Systems Spring Module 12 Secondary Storage. Steve Gribble

ECE331: Hardware Organization and Design

Secondary storage. CS 537 Lecture 11 Secondary Storage. Disk trends. Another trip down memory lane

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

CSE 451: Operating Systems Spring Module 12 Secondary Storage

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University

CPSC 421 Database Management Systems. Lecture 11: Storage and File Organization

OPERATING SYSTEMS CS3502 Spring Input/Output System Chapter 9

Storing Data: Disks and Files. Storing and Retrieving Data. Why Not Store Everything in Main Memory? Database Management Systems need to:

COS 318: Operating Systems. Storage Devices. Jaswinder Pal Singh Computer Science Department Princeton University

STORAGE SYSTEMS. Operating Systems 2015 Spring by Euiseong Seo

The DASD and Evolution of Storage Devices

Storing and Retrieving Data. Storing Data: Disks and Files. Solution 1: Techniques for making disks faster. Disks. Why Not Store Everything in Tapes?

Storage Technologies and the Memory Hierarchy

Storing Data: Disks and Files. Storing and Retrieving Data. Why Not Store Everything in Main Memory? Chapter 7

Data Storage - I: Memory Hierarchies & Disks. Contains slides from: Naci Akkök, Pål Halvorsen, Hector Garcia-Molina, Ketil Lund, Vera Goebel

Introduction To Computers: Hardware

Mass-Storage Structure

Operating Systems. File Systems. Thomas Ropars.

Initial Bootloader. On power-up, when a computer is turned on, the following operations are performed:

Storing and Retrieving Data. Storing Data: Disks and Files. Solution 1: Techniques for making disks faster. Disks. Why Not Store Everything in Tapes?

CSE 190D Database System Implementation

File System Management

Storage. CS 3410 Computer System Organization & Programming

OPERATING SYSTEM. PREPARED BY : DHAVAL R. PATEL Page 1. Q.1 Explain Memory

Chapter 8: Virtual Memory. Operating System Concepts Essentials 2 nd Edition

Storage and File Structure

I/O CANNOT BE IGNORED

Computer Memory. Data Structures and Algorithms CSE 373 SP 18 - KASEY CHAMPION 1

Introduction To Computers: Hardware and Software

CS370 Operating Systems

CSE 451: Operating Systems Winter Lecture 12 Secondary Storage. Steve Gribble 323B Sieg Hall.

I/O CANNOT BE IGNORED

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING LECTURE 26, SPRING 2013

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Computer Architecture ECE 568

Section 10: Device Drivers, FAT, Queuing Theory, Memory Mapped Files

Princeton University. Computer Science 217: Introduction to Programming Systems. The Memory/Storage Hierarchy and Virtual Memory

Chapter 14: File-System Implementation

Disks and I/O Hakan Uraz - File Organization 1

Chapter 6 Part 1 Understanding Hardware

u Covered: l Management of CPU & concurrency l Management of main memory & virtual memory u Currently --- Management of I/O devices

Transcription:

Secondary Storage 1 Any modern computer system will incorporate (at least) two levels of storage: primary storage: random access memory (RAM) typical capacity 256MB to 4GB cost per MB $0.10 typical access time 5ns to 60ns secondary storage: typical capacity cost per MB typical access time magnetic disk/optical devices/tape systems 20GB to 400GB for fixed media; for removable $0.001 for fixed media, more for removable 8ms to 12ms for fixed media, larger for removable Note: all statistics here are guaranteed invalid by Oct 15, 2005.

Units of Measurement Spatial units: 2 byte (B) 8 bits byte (B) 8 bits kibibyte (KiB) 1024 or 2 10 bytes kilobyte (KB) 1024 or 2 10 bytes mebibyte (MiB) 1024 kibibytes or 2 20 bytes megabyte (MB) 1024 kilobytes or 2 20 bytes gibibyte (GiB) 1024 mebibytes or 2 30 bytes IEC standard gigabyte (GB) byte (B) kilobyte (KB) megabyte (MB) 1024 megabytes or 2 30 bytes traditional 8 bits 1000 or 10 3 bytes 1000 kilobytes or 10 6 bytes Time units: nanosecond (ns) microsecond (s) millisecond (ms) gigabyte (GB) one-billionth (10-9 ) of a second one-millionth (10-6 ) of a second one-thousandth (10-3 ) of a second 1000 megabytes or 10 9 bytes alt. industry

Primary versus Secondary Storage 3 While the particular values given earlier are volatile, the relative performances suggested are actually quite stable: Primary storage: costs several hundred times as much per unit as secondary storage. has access times that are 250,000 to 1,000,000 times faster than secondary storage has transfer rates that are?? times faster than secondary storage Why do WE care (in a data structures class)? Often data must be first read from disk into memory for processing, and then results must be written back to disk after processing. In many cases, data sets are too large to store in memory at once.

File Storage 4 In many file systems, space is allocated in fixed-size chunks called clusters. Typical cluster sizes range from 0.5KB up to 32KB, usually equaling a power of 2. When a file is stored on disk, an integer number of clusters are allocated for the file; since files sizes are typically not a multiple of the cluster size, this means that a certain amount of space is wasted to internal fragmentation. On average, ½ of a cluster is wasted per file stored. Obviously that adds up but that s not really our concern in this course. The clusters are also not usually stored contiguously on the disk. This external fragmentation can cause a serious degradation of performance when a file is being read from or written to disk. cluster

Hard Disk Performance Factors To illustrate the issues, we will consider the physical organization and behavior of a typical hard disk design. Note that the presentation here is an over-simplification and that contemporary hard drive designs incorporate control sophistication not discussed here. Despite advances, the basic performance issues remain the same. 5 We will consider three primary factors that affect the time required to read requested data from the disk into memory: seek time latency transfer time time for the appropriate I/O head to reach the desired track time for the appropriate sector to rotate to the I/O head time for the data to be read to move past the I/O head First we must have a clear picture of the hardware

Platters and I/O Heads A typical hard drive contains a number of circular platters, attached to a rotating spindle. Each platter holds data on one or both of its surfaces. The data is read and written by I/O heads, typically one per data surface. These I/O heads are mounted in a rack and all move in and out in unison. Typically, only one I/O head can be active (reading or writing data) at any given time. 6 rack platter spindle head

Tracks and Sectors Each data surface is organized into concentric rings of data, called tracks. Each track is divided into a number of segments, called sectors. On modern drives, outer tracks contain more sectors than inner tracks. We will simplify things by considering each track to hold the average number of sectors. Each sector contains the same amount of data. 7 track sector The basic unit of disk I/O is the sector. We will see shortly that reading an entire sector from disk takes only slightly longer than reading a single byte. inter-sector gap

Reading a Sector When a disk read is requested the following actions must be carried out: 8 1. Determine what sector(s) must be read the disk controller is responsible for this. 3. Wait for the beginning of the first sector to rotate to the head. 2. Move the read head to the track containing the targeted sector(s). 4. Read the data as it rotates beneath the head.

Seek Time seek time the time required to move the head to the appropriate track 9 The seek time depends only on the speed with which the head rack moves, and the number of tracks that the head must move across to reach its target. Given the following (which are constant for a particular disk): H s H T = the time for the I/O head to start moving = the time for the I/O head to move from one track to the next then the time for the head to move n tracks is: Seek( n) H s H T n QTP: On average, the head will move 1/3 of the way across the platter. Why?

Rotational Latency Time 10 latency the time required for the appropriate sector to rotate to the position of the I/O head The rotational latency time depends only on the speed at which the spindle rotates, and the angle through which the track must rotate to reach the I/O head. Given the following: R = the rotational speed of the spindle (in rotations per minute) = the number of radians through which the track must rotate then the rotational latency (in ms) is: Latency θ θ 2π 60000 R On average, the latency time will be the time required for the platter to complete 1/2 of a full rotation.

Data Transfer Time transfer time 11 the time required for the appropriate sector(s) to rotate under the I/O head (for contiguous sectors on the same track) The transfer time depends only on the speed at which the spindle rotates, and the number of sectors that must be read. Given: S T = the total number of sectors per track the transfer time for n contiguous sectors on the same track is: Transfer n n S T 60000 R For sectors on different tracks, each track must be analyzed separately, allowing for seek and latency.

Total Read/Write Time 12 The total time to read/write data is from/to disk is then the sum of the seek time, the rotational latency time, and the transfer time. This ignores the time for controller logic, delays due to multitasking queues, etc.; that is fair because those times are normally orders of magnitude less that the times considered. One note: we have assumed in the analysis of the data transfer time that the I/O heads are capable of reading/writing data as fast as the sector moves beneath the head. On older disk systems that was often not the case, and interleaving was necessary to optimize read/write performance. On most contemporary disk systems, the I/O heads are fast enough that such tricks are unnecessary.

Example: Contiguous File Read Assume a disk system with the following parameters: 13 # of tracks per surface 8096 avg# of sectors per track 2048 sector size 0.5 KB cluster size 4 KB spindle speed 10000 RPM head start time 1 ms track to track seek time 0.002 ms Question:how much time would be required, on average, to read a file of size 2.5 MB, assuming the file is stored in contiguous sectors on adjacent tracks (since one track won t hold the whole file in this case)?

Example: Contiguous File Read We must compute some values from the given disk parameters: 14 rotations per second: 166.67 clusters per track: 256 capacity of one track: 1 MB sectors per cluster: 8 So an 2.5 MB file would occupy two full tracks plus 128 clusters on an adjacent track. For reading the first full track, assuming average values: - seek time = 1 + (8096/3)*0.002 = 6.40 ms - latency time = (1/2)*(60000/10000) = 3.00 ms - transfer time = (2048/2048)*(60000/10000) = 6.00 ms total: 15.40 ms

Example: Contiguous File Read 15 For reading the second full track, assuming average values: - seek time = 1 + 1*0.002 = 1.002 ms - latency time = (1/2)*(60000/10000) = 3.00 ms - transfer time = (2048/2048)*(60000/10000) = 6.00 ms total: 10.00 ms For reading the relevant sectors from the second track, assuming average values: - seek time = 1 + (1)*0.002 = 1.00 ms - latency time = (1/2)*(60000/10000) = 3.00 ms - transfer time = (1024/2048)*(60000/10000) = 3.00 ms total: 7.00 ms So the total time to read the file into memory* would be about 32.40 ms or about 0.038 seconds. * Actually this is the time to read the file into the disk buffer memory.

Example: Fragmented File Read 16 What if the file is scattered all over the disk? Then it will take a lot longer for each cluster: - seek time = 1 + (8096/3)*0.002 = 6.40 ms - latency time = (1/2)*(60000/10000) = 3.00 ms - transfer time = (8/2048)*(60000/10000) = 0.02 ms total: 9.42 ms Since the file occupies (part or all of) 640 clusters, the total read time for a completely fragmented copy of the file would be about 6028.9 ms or about 6 seconds. So this would take almost 160 times as long as if the file were stored contiguously.

Sector Read vs Single Byte Read 17 The programs you have implemented simply perform read/write actions, usually on single variables which may store from 1 to a few hundred bytes of data. Why do typical disk controllers perform read/write operations in sector-sized chunks? Performance given the disk system described earlier: time to read one full track: 15.40 ms time to read one full cluster: 9.42 ms 6.40 ms + 3.00 ms + (1/256)*(60000/10000) time to read one byte: 9.40 ms 6.40 ms + 3.00 ms + (1/512)*(1/2048)*(60000/10000) The time to read one sector is only slightly more than the time to read a single byte, largely because the seek and latency times dominate the transfer time. What s the moral??