Monday, May 4, 2015
Discs; RAID: Introduction; Error detection and correction; Error detection: Simple parity; Error correction: Hamming Codes


Monday, May 4, 2015

Topics for today
  Secondary memory
    Discs
    RAID: Introduction
  Error detection and correction
    Error detection: Simple parity
    Error correction: Hamming Codes

Storage management (Chapter 9)
Discs, RAID (9.3-9.5)

Discs (see Fig. 9.17)

The basic model of a hard disc drive (HDD) is a set of parallel rotating platters on which data can be recorded (usually on both sides, apart possibly from the topmost or bottommost platter in the stack). Read/write heads move as a unit into one of a finite number of positions. Consider one of these positions: as the disc rotates, the projection of the read/write head on the recording surface defines a track on which data can be stored.

A track is normally divided into sectors, and a sector is usually the smallest addressable unit of the disc. An address is thus a three-part object:

  surface-number, track-number, sector-number

The set of tracks (one per surface) having the same radius is called a cylinder (because of its shape).

The capacity of a disc is nominally the product

  recording-surfaces * tracks-per-surface * sectors-per-track * bytes-per-sector

The actual capacity is less than this because some bytes are used for formatting information (e.g., track numbers) and error-correcting codes, and are not available to users.

To read/write a sector, we need to

  move the read/write heads to the appropriate track (note that the average movement is N/3 tracks if there are N tracks in total)

Comp 162 Notes Page 1 of 7 May 4, 2015
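The nominal-capacity product and the N/3 average seek distance can be sketched in a few lines. The drive geometry figures below are made-up illustrative values, not from the notes:

```python
# Hypothetical drive geometry (illustrative values only).
surfaces = 8          # recording surfaces
tracks = 65_536       # tracks per surface
sectors = 256         # sectors per track
sector_bytes = 512    # bytes per sector

# Nominal capacity = surfaces * tracks-per-surface
#                    * sectors-per-track * bytes-per-sector
capacity = surfaces * tracks * sectors * sector_bytes
print(f"nominal capacity: {capacity / 10**9:.1f} GB")

# Average head movement over random requests is about N/3 tracks.
avg_seek_tracks = tracks / 3
print(f"average seek distance: about {avg_seek_tracks:.0f} tracks")
```

The actual usable capacity would be somewhat smaller, as the notes point out, because of formatting and error-correcting overhead.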

  wait for the sector to rotate round (latency time is, on average, half a rotation; spinning the discs faster, e.g. 15,000 rpm vs. 3,600 rpm, reduces latency)

  do the read/write

An Operating System will typically allocate space in clusters of sectors (e.g. an allocation unit might be 4 sectors) to reduce overheads. It is also likely to allocate data cylinder by cylinder rather than surface by surface.

Trends (from Computer Systems by Bryant and O'Hallaron)

  Year   Disk $/GB   Seek time (ms)   Typical size (MB)   Main memory effective cycle time (ns)
  1980   500,000     87               1                   1000
  1985   100,000     75               10                  166
  1990   8,000       28               160                 50
  1995   300         10               1,000               6
  2000   10          8                20,000              1.6
  2005   5           5                160,000             0.25
  2010   0.3         3                1,500,000           0.10

  2010 vs. 1980: discs are 1.5 million times cheaper, 29 times faster, and 1.5 million times larger; main memory is 10,000 times faster.

4K advanced format

A sector size of 512 bytes had been standard since DOS. However, space is wasted between sectors, and the small sector size left little room for error-correcting information. Since 2011, all new drives use 4096-byte sectors, which reduce the wasted space and allow drives to use twice as much space per sector for error-correcting codes. The change in sector size also reduces by 3 the number of bits required to specify a sector address.

Recent advances

(1) Helium-filled drives

Helium is much less dense than air, reducing the power needed to spin the platters. The platters can also be placed closer to each other. The advantages of helium have been known for some time; manufacturing advances are now enabling commercial products.
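The latency and sector-address arithmetic above can be checked directly; `avg_latency_ms` is an illustrative helper name, not from the notes:

```python
import math

# Average rotational latency is half a rotation.
def avg_latency_ms(rpm):
    # milliseconds per rotation, halved
    return (60_000 / rpm) / 2

print(avg_latency_ms(3600))    # about 8.33 ms
print(avg_latency_ms(15000))   # 2.0 ms

# Moving from 512-byte to 4096-byte sectors means 8x fewer sectors,
# so a sector address needs log2(8) = 3 fewer bits.
print(int(math.log2(4096 // 512)))  # 3
```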

(2) Heat-assisted magnetic recording

Heat from a laser is used during writing to increase areal density. The density can be increased by a factor of between 10 and 100.

Solid State Drives (SSD)

SSDs provide an alternative to conventional hard disc drives.

Advantages of SSD
  Faster start-up
  Shorter access time
  Relatively constant access time to all data
  Generally fast transfer rates

Disadvantages of SSD
  Lower capacity
  Higher cost per GB
  Tendency to fail without warning
  Need for software to move contents around to even out wear

Best of both worlds? Use an SSD as a cache memory in front of an HDD: hold copies of frequently accessed data on the SSD. As long as changed data is written back to the HDD, failure of the SSD is not a problem.

What about failures of the HDD? The uncorrectable read rate is low, in the region of 1 in 10^14, but it is non-zero. RAID systems try to reduce the effective failure rate even further.

RAID (9.5)

Introduction

Discs are getting larger. Currently, 3TB and 4TB drives are commonplace, and 10TB drives will be common soon (see above). However, a large disc can be a single point of failure in a system. A RAID system (Redundant Array of Inexpensive Discs [1]) is an alternative to a SLED (Single Large Expensive Disc).

[1] Originally, RAID stood for Redundant Array of Inexpensive Discs. However, inexpensive discs tended to be unreliable, so the acronym was changed to Redundant Array of Independent Discs. Some people, looking for more generality, use Redundant Array of Independent Devices.
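To see why a 1-in-10^14 rate is "non-zero" in practice, here is a rough back-of-the-envelope sketch (my own extension of the notes' figure, assuming independent bit errors, which is a simplification):

```python
import math

# Probability of at least one uncorrectable read error when reading an
# entire drive, assuming independent errors at 1 per 10^14 bits read.
error_rate = 1e-14            # errors per bit read (figure from the notes)
drive_bits = 4e12 * 8         # a 4 TB drive, expressed in bits

# P(at least one error) = 1 - (1 - p)^n, well approximated by
# 1 - exp(-p * n) when p is tiny.
p_error = 1 - math.exp(-error_rate * drive_bits)
print(f"{p_error:.0%}")       # roughly a 1-in-4 chance per full read
```

Even a low per-bit rate becomes a substantial per-drive risk at multi-terabyte scale, which is part of the motivation for RAID.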

The idea of RAID is to make an array of discs appear as a single disc to the rest of the system, but with improved reliability and/or performance compared with a real single large disc.

RAID systems that are easy for home users to set up are available. See, for example, those made by Drobo ("disk robot") (http://droboworks.com).

To appreciate how some aspects of RAID work, we first need to consider some simple aspects of error detection and correction.

Error detection and correction (Warford 9.4)

Consider writing information to a device and then, later, reading it back. How do we know that what we read is what was written? Writing to a potentially unreliable storage device is the same problem as transmitting over a noisy channel. We need to introduce redundancy into the system in order to be able to detect errors, and even more redundancy in order to be able to correct errors.

Definitions

The Hamming distance between two code words (bit patterns) of the same length is the number of bit positions at which they differ. For example:

  Codeword 1        101110   1100010   101   001110101
  Codeword 2        111101   1000011   010   001001100
  Hamming distance  3        2         3     4
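The definition can be written directly as a function; checking it against the four example pairs above reproduces the distances in the table:

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of bit positions at which two equal-length codewords differ."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))

# The four example pairs from the notes:
print(hamming_distance("101110", "111101"))        # 3
print(hamming_distance("1100010", "1000011"))      # 2
print(hamming_distance("101", "010"))              # 3
print(hamming_distance("001110101", "001001100"))  # 4
```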

The code distance of a set of codewords is the smallest Hamming distance between any two members of the set.

  Set of four codewords       Code distance
  { 110 101 011 000 }         2
  { 1100 0011 1001 0001 }     1

Error detection: Simple parity

If all bit patterns are legal in a particular context, that is, there is no redundancy, we cannot detect errors. We can introduce redundancy in the form of a simple parity bit, for example, chosen to make the total number of 1 bits even. Thus if the data is

  1 0 0 1 1 1 0 1

we store

  1 0 0 1 1 1 0 1 1

Only half of the 9-bit patterns have even parity. If an odd number of bits is changed while the item is in storage, we can detect it when we read it back (the number of 1 bits will be odd), but we cannot correct the error because we do not have enough information to locate it.

In order to be able to detect single-bit errors, the code distance needs to be > 1. This makes sense: if the distance is 1, then a single-bit modification can change one valid codeword into another. For the same reason, in order to detect n-bit errors, the code distance needs to be > n.

Error correction: Hamming codes

In order to be able to correct errors, we add multiple parity bits to a data object in such a way as to enable single-bit errors to be located (and therefore corrected). Clearly, the code distance needs to be larger: in order to correct an error, the bad codeword needs to be closer to the appropriate correct codeword than to any other correct codeword.

To detect and correct single-bit errors, the code distance must be greater than 2. We can see why by looking at two codewords with distance 2:

  Valid 1 = 011110
  Valid 2 = 101110

Which code was intended if we receive 111110?
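Both definitions above, code distance and the even-parity bit, are easy to check in code; the function names are mine, and the examples are the ones from the notes:

```python
from itertools import combinations

def hamming_distance(a, b):
    return sum(x != y for x, y in zip(a, b))

def code_distance(codewords):
    """Smallest Hamming distance between any two codewords in the set."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))

print(code_distance(["110", "101", "011", "000"]))      # 2
print(code_distance(["1100", "0011", "1001", "0001"]))  # 1

def even_parity_bit(bits):
    """Parity bit chosen so the total number of 1 bits (data + parity) is even."""
    return bits.count("1") % 2

data = "10011101"
stored = data + str(even_parity_bit(data))
print(stored)                                           # 100111011
```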

In general, in order to detect and correct d-bit errors, the code distance must be > 2d. So, in order to detect and correct single-bit errors, the code distance must be at least 3.

Here is a possible scheme for arranging this. If there are 8 data bits, we add 4 parity bits (see Fig. 9.24). Number the 12 bits (conceptually) 1 through 12 and place the parity bits in positions 1, 2, 4 and 8. In simple parity we had one group of bits (1..9) and chose the parity bit to make the total number of 1s even. In our extended scheme we have 4 groups:

  Bits 1, 3, 5, 7, 9, 11
  Bits 2, 3, 6, 7, 10, 11
  Bits 4, 5, 6, 7, 12
  Bits 8, 9, 10, 11, 12

Each group has a parity bit; the remaining bits are data bits. The parity bit is chosen so that the number of 1 bits in each group is even.

You can see how the groups are determined by considering the following table showing the binary representations of the numbers 1 through 12. The first group contains those numbers having a 1 in the last column, the second group contains those numbers having a 1 in the next-to-last column, and so on.

   1   0 0 0 1
   2   0 0 1 0
   3   0 0 1 1
   4   0 1 0 0
   5   0 1 0 1
   6   0 1 1 0
   7   0 1 1 1
   8   1 0 0 0
   9   1 0 0 1
  10   1 0 1 0
  11   1 0 1 1
  12   1 1 0 0

Example. The data is

  1 0 0 1 0 1 1 1

Adding parity locations (P):

  P P 1 P 0 0 1 P 0 1 1 1

  The first group is  P 1 0 1 0 1, so P1 is set to 1
  The second group is P 1 0 1 1 1, so P2 is set to 0
  The third group is  P 0 0 1 1,   so P4 is set to 0
  The fourth group is P 0 1 1 1,   so P8 is set to 1

So we store

  1 0 1 0 0 0 1 1 0 1 1 1
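The encoding scheme just described can be sketched as follows; the positions and groups follow the notes (group for parity bit p = every position whose binary representation has the p bit set), while the function name is an illustrative choice:

```python
def hamming_encode(data_bits):
    """Place 8 data bits in positions 3,5,6,7,9,10,11,12 of a 12-bit word
    and set even-parity bits at positions 1, 2, 4 and 8."""
    assert len(data_bits) == 8
    word = [0] * 13                      # index 0 unused; positions 1..12
    data_positions = [3, 5, 6, 7, 9, 10, 11, 12]
    for pos, bit in zip(data_positions, data_bits):
        word[pos] = int(bit)
    for p in (1, 2, 4, 8):
        # The group for parity bit p: positions whose binary
        # representation has the p bit set.
        group = [i for i in range(1, 13) if i & p]
        # Choose the parity bit so the group has an even number of 1s.
        word[p] = sum(word[i] for i in group if i != p) % 2
    return "".join(map(str, word[1:]))

# The worked example from the notes:
print(hamming_encode("10010111"))   # 101000110111
```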

If any single bit is changed, a unique combination of parity bits will be affected, and we can pinpoint the error. For example, if bit 6 is changed,

  1 0 1 0 0 1 1 1 0 1 1 1

parity bits P2 and P4 will be wrong, because 6 is in those two groups. These are the only two groups containing 6, so parity bits P1 and P8 will still be correct. We add the indexes of the failed parities (2 and 4) to get the index of the bad bit.

If bit 10 had been changed, parity bits P2 and P8 would be wrong, because 10 is in those two groups (and only those groups). Because of the way we positioned the parity bits and grouped the data bits, we add up the indexes of the failed check bits to get the index of the bad bit, which we can then correct.

Next we will look at how these ideas are used in RAID systems.

Reading

Section 9.1 has a page or two on disc drives. Section 9.4 discusses error detecting and correcting codes.
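The error-location rule (add the indexes of the failing parity checks to find the bad bit) can be sketched as a decoder for the 12-bit words above; again the function name is an illustrative choice:

```python
def hamming_correct(word):
    """Recheck the four parity groups of a 12-bit stored word; the sum of
    the failing parity positions is the index of the flipped bit (0 if none)."""
    bits = [0] + [int(b) for b in word]   # 1-indexed positions 1..12
    syndrome = 0
    for p in (1, 2, 4, 8):
        group = [i for i in range(1, 13) if i & p]
        if sum(bits[i] for i in group) % 2 != 0:
            syndrome += p                 # this parity check failed
    if syndrome:
        bits[syndrome] ^= 1               # correct the single-bit error
    return syndrome, "".join(map(str, bits[1:]))

# The stored word from the notes with bit 6 flipped:
print(hamming_correct("101001110111"))    # (6, '101000110111')
```

Only the checks for P2 (group 2,3,6,7,10,11) and P4 (group 4,5,6,7,12) fail, and 2 + 4 = 6 locates the flipped bit, exactly as the notes describe.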