Wednesday, April 25, Discs RAID: Introduction Error detection and correction Error detection: Simple parity Error correction: Hamming Codes

Wednesday, April 25, 2018

Topics for today

  Secondary memory
  Discs
  RAID: Introduction
  Error detection and correction
  Error detection: Simple parity
  Error correction: Hamming Codes

Storage management (Chapter 9)

Discs (see Fig. 9.17)

The basic model of a hard disc drive (HDD) is a set of parallel rotating platters on which data can be recorded (usually on both sides, except possibly the topmost or bottommost surface of the stack). The read/write heads move as a unit into one of a finite number of positions. Consider one of these positions. As the disc rotates, the projection of the read/write head on the recording surface defines a track on which data can be stored. A track is normally divided into sectors, and a sector is usually the smallest addressable unit of the disc. An address is thus a three-part object:

  surface-number, track-number, sector-number

The set of tracks (one per surface) having the same radius is called a cylinder (because of its shape).

The capacity of a disc is nominally the product

  recording-surfaces * tracks-per-surface * sectors-per-track * bytes-per-sector

The actual capacity is less than this because some bytes are used for formatting information (e.g., track numbers) and error correction and are not available to users.

To read/write a sector, we need to

  - move the read/write heads to the appropriate track (note that the average movement is N/3 tracks if there are N tracks in total)
  - wait for the sector to rotate round (latency time is, on average, half a rotation). Spinning the discs faster, e.g. 15,000 rpm vs 3600 rpm, reduces latency.
  - do the read/write

Comp 162 Notes Page 1 of 9 April 25, 2018
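The nominal capacity product and the half-rotation average latency described above can be sketched in Python. This is only an illustration: the function names and the example geometry (8 surfaces, 50,000 tracks per surface, 500 sectors per track, 512-byte sectors) are made-up assumptions, not figures from the notes.

```python
def disc_capacity_bytes(surfaces, tracks_per_surface, sectors_per_track, bytes_per_sector):
    """Nominal capacity: the product of the four geometry parameters."""
    return surfaces * tracks_per_surface * sectors_per_track * bytes_per_sector


def average_latency_ms(rpm):
    """Average rotational latency: half of one rotation, in milliseconds."""
    ms_per_rotation = 60_000 / rpm  # 60,000 ms in a minute
    return ms_per_rotation / 2


# Hypothetical geometry, chosen only to exercise the formula.
capacity = disc_capacity_bytes(8, 50_000, 500, 512)
print(capacity)

# The two rotation speeds mentioned in the notes.
print(average_latency_ms(3600))
print(average_latency_ms(15000))
```

The faster spindle cuts the average wait in proportion to the rotation speed, which is the point made in the notes about 15,000 rpm versus 3600 rpm.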

An operating system will typically allocate space in clusters of sectors (e.g., an allocation unit might be 4 sectors) to reduce head-movement overheads. It is also likely to allocate data cylinder by cylinder rather than surface by surface.

Trends (from Computer Systems by Bryant and O'Hallaron)

              Disk                                              Main memory
  Year        $/GB          Seek time (ms)   Typical size (MB)  Effective cycle time (ns)
  1980        500,000       87               1                  1000
  1985        100,000       75               10                 166
  1990        8,000         28               160                50
  1995        300           10               1000               6
  2000        10            8                20,000             1.6
  2005        5             5                160,000            0.25
  2010        0.3           3                1,500,000          0.10
  2010:1980   1.5m times    29 times         1.5m times         10,000 times
              cheaper       faster           larger             faster

Recent advances

(1) Helium-filled drives

Helium is much less dense than air, which reduces the power needed to spin the platters. The platters can also be placed closer to each other. The advantages of helium have been known for some time; manufacturing advances are now enabling commercial products. Sealed helium drives work better than conventional drives in locations where the air is thin, for example mountain-top observatories.

(2) Heat-assisted magnetic recording

Heat from a laser is used during writing to increase areal density. The density can be increased by a factor of between 10 and 100.

Optical drives

Optical drives are typically write-once. They are good for applications that require unmodifiable logs or other permanent records. They are also good for movie libraries.

Solid State Drives (SSD)

SSDs provide an alternative to conventional hard disc drives.

Advantages of SSD

  - Faster start-up
  - Shorter access time
  - Relatively constant access time to all data
  - Generally fast transfer rates

Disadvantages of SSD

  - Lower capacity, but this is changing rapidly
  - High cost per GB
  - Tendency to fail without warning
  - Need for software to move contents around to even out wear

Best of both worlds?

Use an SSD as a cache memory in front of an HDD. Hold copies of frequently accessed data on the SSD. As long as changed data is written back to the HDD, failure of the SSD is not a problem.

What about failures of the HDD? The uncorrectable read error rate is low, in the region of 1 in 10^14, but it is non-zero. RAID systems try to reduce the effective failure rate even further.

RAID (9.5)

Introduction

Discs are getting larger. Currently, 3 TB and 4 TB hard disc drives are commonplace, and Samsung announced a 16 TB SSD in August 2015 (PM1633a). However, a large disc can be a single point of failure in a system.

A RAID system (Redundant Array of Inexpensive Discs [1]) is an alternative to a SLED (Single Large Expensive Disc). The idea of RAID is to make an array of discs appear as a single disc to the rest of the system, but with improved reliability and/or performance compared with a real single large disc.

[Figure: RAID array with LOGIC block]

RAID systems that are easy for home users are available. See, for example, those made by Drobo (disk robot) (http://droboworks.com).

To appreciate how some aspects of RAID work, we first need to consider some simple aspects of error detection and correction.

Error detection and correction (Warford 9.4)

Consider writing information to a device and then, later, reading it back. How do we know that what we read is what was written? Writing to a potentially unreliable storage device is the same problem as transmitting over a noisy channel. We need to introduce redundancy into the system in order to be able to detect errors, and even more redundancy in order to be able to correct errors.

[1] Originally, RAID was Redundant Array of Inexpensive Discs. However, inexpensive discs tended to be unreliable, so the acronym was changed to Redundant Array of Independent Discs. Some people, looking for more generality, use Redundant Array of Independent Devices.

Definitions

The Hamming distance between two code words (bit patterns) of the same length is the number of bit positions at which they differ. For example:

  Codeword 1         101110   1100010   101   001110101
  Codeword 2         111101   1000011   010   001001100
  Hamming distance   3        2         3     4

The code distance of a set of codewords is the smallest Hamming distance between any two members of the set.

  Set of four codewords        Code distance
  { 110 101 011 000 }          2
  { 1100 0011 1001 0001 }      1

Error Detection: Simple Parity

If all bit patterns are legal in a particular context, that is, there is no redundancy, we cannot detect errors. We can introduce redundancy in the form of a simple parity bit, chosen, for example, to make the total number of 1 bits even. Thus if the data is

  1 0 0 1 1 1 0 1

we store

  1 0 0 1 1 1 0 1 1

Only half of the 9-bit patterns have even parity. If an odd number of bits is changed while the item is in storage, we can detect the change when we read it back (the number of 1 bits will be odd), but we cannot correct the error because we do not have enough information to locate it.

In order to be able to detect single-bit errors, the code distance needs to be > 1. This makes sense: if the distance is 1, then a single-bit modification can change one valid codeword into another. For the same reason, in order to detect n-bit errors the code distance needs to be > n.
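The definitions above translate almost directly into code. The following is a minimal Python sketch; the function names are illustrative choices, and codewords are represented as strings of '0' and '1' characters purely for convenience.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Number of bit positions at which two equal-length codewords differ."""
    if len(a) != len(b):
        raise ValueError("codewords must have the same length")
    return sum(x != y for x, y in zip(a, b))


def code_distance(codewords):
    """Smallest Hamming distance between any two members of the set."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))


def add_even_parity(data):
    """Append one parity bit so that the total number of 1 bits is even."""
    return data + str(data.count("1") % 2)


def parity_ok(word):
    """True if the word (data plus parity bit) has an even number of 1 bits."""
    return word.count("1") % 2 == 0


print(hamming_distance("101110", "111101"))
print(code_distance(["110", "101", "011", "000"]))
print(code_distance(["1100", "0011", "1001", "0001"]))

stored = add_even_parity("10011101")
print(stored, parity_ok(stored))

# Flipping one bit is detected; flipping two bits is not.
print(parity_ok("100111010"), parity_ok("010111011"))
```

The last line illustrates the limitation noted above: simple parity catches an odd number of changed bits but is blind to an even number.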

Error correction: Hamming codes

In order to be able to correct errors, we add multiple parity bits to a data object in such a way as to enable single-bit errors to be located (and therefore corrected). Clearly, the code distance needs to be larger. In order to correct an error, the bad codeword needs to be closer to the appropriate correct codeword than to any other correct codeword.

To detect and correct single-bit errors, the code distance must be greater than 2. We can see why by looking at two codewords with distance = 2:

  If   Valid 1 = 011110
  and  Valid 2 = 101110

which code was intended if we receive 111110? The received word is at distance 1 from both, so we cannot tell.

In general, in order to detect and correct d-bit errors, the code distance must be > 2d. So, in order to detect and correct single-bit errors, the code distance must be at least 3.

Here is a possible scheme for arranging this. If there are 8 data bits, we add 4 parity bits (see Fig. 9.24). Number the 12 bits (conceptually) 1 through 12 and place the parity bits in positions 1, 2, 4 and 8.

In simple parity we had one group of bits (1..9) and chose the parity bit to make the total number of 1s even. In our extended scheme we have 4 groups:

  Bits 1, 3, 5, 7, 9, 11
  Bits 2, 3, 6, 7, 10, 11
  Bits 4, 5, 6, 7, 12
  Bits 8, 9, 10, 11, 12

Each group has one parity bit; the remaining bits are data bits. The parity bit is chosen so that the number of 1 bits in each group is even.

You can see how the groups are determined by considering the following table showing the binary representations of the numbers 1 through 12. The first group contains those numbers having a 1 in the last column, the second group contains those numbers having a 1 in the next-to-last column, and so on.

   1   0 0 0 1
   2   0 0 1 0
   3   0 0 1 1
   4   0 1 0 0
   5   0 1 0 1
   6   0 1 1 0
   7   0 1 1 1
   8   1 0 0 0
   9   1 0 0 1
  10   1 0 1 0
  11   1 0 1 1
  12   1 1 0 0
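The rule for forming the groups from the binary representations can be sketched in a few lines of Python. The function name parity_groups is an illustrative assumption; the idea is simply that position i belongs to the group of parity bit p when the binary representation of i has a 1 in the column for p.

```python
def parity_groups(total_bits):
    """Map each parity position (1, 2, 4, 8, ...) to the bit positions it checks."""
    groups = {}
    p = 1
    while p <= total_bits:
        # Position i is in group p when bit p is set in the binary form of i.
        groups[p] = [i for i in range(1, total_bits + 1) if i & p]
        p <<= 1
    return groups


for parity_position, members in parity_groups(12).items():
    print(parity_position, members)
```

For 12 bits this reproduces the four groups listed above, with each group led by its own parity position.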

Example.

Data is                   1 0 0 1 0 1 1 1
Adding parity locations   P P 1 P 0 0 1 P 0 1 1 1

  First group is    P 1 0 1 0 1   so P1 is set to 1
  Second group is   P 1 0 1 1 1   so P2 is set to 0
  Third group is    P 0 0 1 1     so P4 is set to 0
  Fourth group is   P 0 1 1 1     so P8 is set to 1

So we store   1 0 1 0 0 0 1 1 0 1 1 1

If any single bit is changed, a unique combination of parity bits will be affected and we can pinpoint the error. Here is a template showing the 4 groups of bits:

  Bit        1  2  3  4  5  6  7  8  9 10 11 12
  P1         x     x     x     x     x     x
  P2            x  x        x  x        x  x
  P4                  x  x  x  x              x
  P8                              x  x  x  x  x

Suppose we receive the 12 bits above, with one bit changed: 101001110111. We enter the bits into the template:

  Bit        1  2  3  4  5  6  7  8  9 10 11 12
  P1         1     1     0     1     0     1
  P2            0  1        1  1        1  1
  P4                  0  0  1  1              1
  P8                              1  0  1  1  1

Checking the four rows reveals that the groups checked by bits 2 and 4 are bad (odd number of 1s), so bit 6 (2 + 4) must be the culprit.

If bit 10 had been changed, parity bits P8 and P2 would be wrong because 10 is in those two groups (and only those groups).

Because of the way we positioned the parity bits and grouped the data bits, we add up the indexes of the failed check bits to get the index of the bad bit, which we can then correct.
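The worked example above can be replayed in Python. This is a minimal sketch for the 8-data-bit, 4-parity-bit layout only; the names encode, locate_error and correct are illustrative, and words are held as strings of '0' and '1' with position 1 at the left, matching the numbering in the notes.

```python
PARITY_POSITIONS = (1, 2, 4, 8)
WORD_LENGTH = 12


def encode(data):
    """Place 8 data bits in the non-parity positions and fill in even parity."""
    if len(data) != 8:
        raise ValueError("expected 8 data bits")
    word = [0] * (WORD_LENGTH + 1)  # index 0 unused so indexes match positions
    bits = iter(int(b) for b in data)
    for pos in range(1, WORD_LENGTH + 1):
        if pos not in PARITY_POSITIONS:
            word[pos] = next(bits)
    for p in PARITY_POSITIONS:
        # word[p] is still 0 here, so this sums only the data bits in group p.
        word[p] = sum(word[i] for i in range(1, WORD_LENGTH + 1) if i & p) % 2
    return "".join(str(b) for b in word[1:])


def locate_error(word):
    """Sum of the failed parity positions: 0 means no error was detected."""
    bits = [0] + [int(b) for b in word]
    bad = 0
    for p in PARITY_POSITIONS:
        if sum(bits[i] for i in range(1, WORD_LENGTH + 1) if i & p) % 2:
            bad += p
    return bad


def correct(word):
    """Flip the single bad bit, if any, assuming at most one bit changed."""
    pos = locate_error(word)
    if pos == 0:
        return word
    if pos > WORD_LENGTH:
        raise ValueError("more than one bit appears to be in error")
    flipped = "1" if word[pos - 1] == "0" else "0"
    return word[:pos - 1] + flipped + word[pos:]


stored = encode("10010111")
print(stored)
print(locate_error("101001110111"))
print(correct("101001110111") == stored)
```

The sketch assumes at most one changed bit, as the notes do; with two or more changed bits the computed position can point at the wrong bit.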

Next we will look at how these ideas are used in RAID systems.

Reading

Section 9.1 has a page or two on disc drives. Section 9.4 discusses error detecting and correcting codes.

Review Questions

1. If a disc spins at 7200 rpm, what is (a) the maximum latency, (b) the minimum latency, (c) the average latency?

2. Why might an operating system allocate space cylinder by cylinder rather than surface by surface?

3. Why is the average head movement N/3 tracks and not N/2 tracks?

4. For each of the following sets of strings, determine the code distance.
   (a) { 010 000 110 }
   (b) { 0111 1001 1100 1111 }
   (c) { 10111 01101 10101 11000 }
   (d) { 100100 110011 010101 000000 }
   (e) { 100100 111011 000011 001100 }

5. For each of these 12-bit Hamming codes received, determine if there was an error and, if so, which bit was in error. Even parity is assumed.
   (a) 010001111110
   (b) 111101001010
   (c) 000011011101

Review Answers

1. (a) 1/120 sec (b) 0 sec (c) 1/240 sec.

2. To reduce access times: sectors in a particular cylinder can be accessed without moving the read/write heads.

3. Because there are, for example, N-1 ways to move 1 track and only one way to move N-1 tracks. The average is

     ( (N-1)*1 + (N-2)*2 + ... + 1*(N-1) ) / ( 1 + 2 + ... + (N-1) )

   which approximates to N/3.

4. (a) 1 (b) 1 (c) 1 (d) 2 (e) 2

5. (a) Bad bit 3 (the checks for parity bits 1 and 2 fail) (b) Good (c) Bad bit 4
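The weighted average in answer 3 can be checked numerically. The Python sketch below is illustrative only: the function name is an assumption, and it follows the answer's counting, in which a move of d tracks can happen in N-d ways and moves of zero tracks are ignored.

```python
from fractions import Fraction


def average_seek_distance(n_tracks):
    """Average move in tracks, weighting each distance d by its N-d ways."""
    total = sum(d * (n_tracks - d) for d in range(1, n_tracks))
    ways = sum(n_tracks - d for d in range(1, n_tracks))
    return Fraction(total, ways)


for n in (10, 100, 1000):
    avg = average_seek_distance(n)
    print(n, avg, float(avg) / n)
```

Under this counting the exact value works out to (N+1)/3, so the ratio printed in the last column approaches 1/3 as N grows.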