CS2410: Computer Architecture. Storage systems. Sangyeun Cho. Computer Science Department, University of Pittsburgh


CS2410: Computer Architecture: Storage systems. Sangyeun Cho, Computer Science Department. (Some slides borrowed from D. Patterson's lecture slides.)

Case for storage
- Shift in focus from computation to communication & storage of information
  - E.g., Cray Research/Thinking Machines vs. Google/Yahoo!
  - The Computing Revolution (1960s to 1980s) vs. the Information Age (1990s to today)
- Storage emphasizes reliability and scalability as well as cost-performance
- What is the "software king" that determines which hardware features are actually used?
  - OS for storage; compiler for processor
- Storage also has its own performance theory, queuing theory, which balances throughput vs. latency (see the sketch below)
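As a concrete illustration of the throughput vs. latency balance that queuing theory captures, here is a minimal sketch (not from the slides) using the M/M/1 mean response time formula W = 1/(mu - lambda); the service and arrival rates are hypothetical:

```python
def mm1_response_time(arrival_rate, service_rate):
    """Mean time in system (queuing + service) for an M/M/1 queue, in seconds."""
    assert arrival_rate < service_rate, "queue is unstable at utilization >= 1"
    return 1.0 / (service_rate - arrival_rate)

# Hypothetical disk that services 100 I/Os per second: pushing throughput
# toward the service rate makes latency blow up.
for iops in (50, 80, 90, 99):
    print(f"{iops} I/Os/s -> mean response {1000 * mm1_response_time(iops, 100):.0f} ms")
```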

Outline
- Magnetic disks
- RAID
- Dependability/reliability/availability
- I/O benchmarks, performance, and dependability

Magnetic disks
- Mechanics: spindle, platters, head, head arm, head arm actuator
- Tracks and sectors
  - Tracks: concentric circles
  - Sectors: the unit of data access
- Standard interfaces
  - ATA: parallel (old), serial (SATA, new)
  - SCSI: parallel (old), serial (SAS, new)
- Performance determined by
  - Seek time (mechanical) and rotation speed (mechanical)
  - Queuing/buffering and the interface (see the sketch below)
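A minimal sketch of how those mechanical factors combine into an average access time; the parameter values below are hypothetical, not from the slides:

```python
def avg_access_ms(seek_ms=4.0, rpm=7200, xfer_mb_s=150.0, block_kb=4, ctrl_ms=0.1):
    """Average access time = seek + rotational latency + transfer + controller overhead."""
    rotation_ms = 0.5 * 60_000 / rpm          # on average, wait half a revolution
    transfer_ms = block_kb / 1024 / xfer_mb_s * 1000
    return seek_ms + rotation_ms + transfer_ms + ctrl_ms

print(f"{avg_access_ms():.2f} ms")            # ~8.3 ms, dominated by seek + rotation
```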

Disk figure of merit: areal density
- Bits recorded along a track: metric is bits per inch (BPI)
- Number of tracks per surface: metric is tracks per inch (TPI)
- Disk designs brag about bit density per unit area: metric is bits per square inch (areal density) = BPI x TPI

Specific design ideas
- Zoned recording (see the sketch below)
  - Outer tracks are longer than inner tracks; why not allocate more sectors to outer tracks?
  - Drives usually have 5 to 25 zones
- Defect management in zones
- Serpentine ordering of tracks and skewed ordering of sectors: minimize arm movement!
- Command queuing
- Security support (data protection)
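A small sketch of why zoned recording pays off: at a fixed linear bit density (BPI), an outer track simply has room for more sectors than an inner one. The BPI figure and radii below are made-up numbers for illustration:

```python
import math

BPI = 1_500_000           # bits per inch along a track (hypothetical)
SECTOR_BITS = 512 * 8     # 512-byte sectors, ignoring ECC and inter-sector gaps

def sectors_per_track(radius_inches):
    """Raw sectors that fit on one circular track at the given radius."""
    track_bits = 2 * math.pi * radius_inches * BPI
    return int(track_bits // SECTOR_BITS)

print(sectors_per_track(0.8))   # inner zone
print(sectors_per_track(1.7))   # outer zone: roughly 2x as many sectors
```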

Historical perspective
- 1956 IBM RAMAC to early 1970s Winchester
  - Developed for mainframe computers, proprietary interfaces
  - Steady shrink in form factor: 27 inches to 14 inches

IBM RAMAC (photo slide)

Historical perspective (continued)
- 1956 IBM RAMAC to early 1970s Winchester
  - Developed for mainframe computers, proprietary interfaces
  - Steady shrink in form factor: 27 inches to 14 inches
  - Form factor and capacity drive the market more than performance
- 1970s developments
  - 5.25-inch floppy disk form factor
  - Emergence of industry-standard disk interfaces
- Early 1980s: PCs and first-generation workstations
- Mid 1980s: Client-server computing
  - Centralized storage on file servers accelerates disk downsizing: 8 inch to 5.25 inch
  - Mass-market disk drives become a reality: 5.25 inch to 3.5 inch drives for PCs, end of proprietary interfaces
- 1990s: Laptops, 2.5-inch drives
- 2000s: Continued areal density improvement; new solid-state devices entering hard disk designs

Solid-state drive (photo slides; no text content)


Past and current design trends
- Fewer platters, smaller diameter (thanks to high areal density)
  - ~3 platters, 2.5-inch form factor
  - Slower rotational speeds
- More intelligence in the drive
  - Access scheduling (with help from native command queuing)
  - Integrated defect management
- Larger sector size (512 B to 4 kB)

Future disk sizes and performance
- Continued advance in capacity (60%/year) and bandwidth (40%/year)
- Slow improvement in seek and rotation (8%/year)
- Time to read a whole disk (see the sketch below):

  Year   Sequentially    Randomly (1 sector per seek)
  1990   4 minutes       6 hours
  2000   12 minutes      1 week
  2006   56 minutes      3 weeks  (SCSI)
  2006   171 minutes     7 weeks  (SATA)
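The table's sequential and random figures follow directly from capacity, bandwidth, and per-access latency. A sketch with hypothetical mid-2000s parameters that lands close to the 2006 SCSI row:

```python
capacity_gb = 500.0        # hypothetical drive capacity
seq_bw_mb_s = 150.0        # sustained sequential bandwidth
access_ms = 12.0           # average seek + rotational latency per random access
sector_kb = 4.0

seq_minutes = capacity_gb * 1024 / seq_bw_mb_s / 60
random_weeks = (capacity_gb * 1024 * 1024 / sector_kb) * (access_ms / 1000) / 86_400 / 7

print(f"sequential: {seq_minutes:.0f} minutes")   # ~57 minutes
print(f"random:     {random_weeks:.1f} weeks")    # ~2.6 weeks
```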

Use arrays of small disks?
- Katz and Patterson of Berkeley asked in 1987: can smaller disks be used to close the gap in performance between disks and CPUs?
- Conventional: 4 disk designs (3.5", 5.25", 10", 14"), spanning low end to high end
- Disk array: 1 disk design (3.5")

Advantages of small form-factor disk drives
- Low cost/MB
- High MB/volume
- High MB/watt
- Low cost/actuator
- Cost and environmental efficiencies

Replace a small number of large disks with a large number of small disks! (1988 disks)

               IBM 3390K     IBM 3.5" 0061    x70
  Capacity     20 GBytes     320 MBytes       23 GBytes
  Volume       97 cu ft      0.1 cu ft        11 cu ft    (9X)
  Power        3 KW          11 W             1 KW        (3X)
  Data rate    15 MB/s       1.5 MB/s         120 MB/s    (8X)
  I/O rate     600 I/Os/s    55 I/Os/s        3900 I/Os/s (6X)
  MTTF         250 KHrs      50 KHrs          ??? Hrs
  Cost         $250K         $2K              $150K

Disk arrays have the potential for large data and I/O rates, high MB per cu ft, and high MB per KW, but what about reliability?

Array reliability
- Reliability of N disks = (reliability of 1 disk) / N
  - 50,000 hours / 70 disks = 700 hours
  - Disk system MTTF drops from 6 years to 1 month! (see the sketch below)
- Arrays (without redundancy) are too unreliable to be useful!
- Hot spares support reconstruction in parallel with access: very high media availability can be achieved
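The arithmetic behind that MTTF drop, assuming independent disk failures so that the array MTTF is the single-disk MTTF divided by the number of disks:

```python
disk_mttf_hours = 50_000
n_disks = 70

# Any one of the 70 disks failing takes the (non-redundant) array down.
array_mttf_hours = disk_mttf_hours / n_disks
print(f"{array_mttf_hours:.0f} hours")                 # ~714 hours
print(f"{array_mttf_hours / (24 * 30):.1f} months")    # about 1 month
```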

Redundant array of (inexpensive) disks (RAID)
- Files are striped across multiple disks
- Redundancy yields high data availability
  - Availability: service is still provided to the user even if some components have failed
- Disks will still fail
- Contents are reconstructed from data redundantly stored in the array
  - Capacity penalty to store redundant information
  - Bandwidth penalty to update redundant information

RAID 1: Disk mirroring/shadowing
- Each disk is fully duplicated onto its mirror (recovery group)
- High availability can be achieved
- Bandwidth sacrifice on write: 1 logical write = 2 physical writes
- Most expensive solution: 100% capacity overhead

RAID 3: Parity disk
- Striped physical records
- P contains the sum of the other disks per stripe, mod 2 ("parity")
- If a disk fails, subtract P from the sum of the other disks to find the missing information (see the sketch below)

RAID 3 (continued)
- The sum is computed across the recovery group to protect against hard disk failures, and stored in the P disk
- Logically a single high-capacity, high-transfer-rate disk: good for large transfers
- Wide arrays reduce capacity costs but decrease availability
- 33% capacity cost for parity with 3 data disks and 1 parity disk
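A minimal sketch of that parity computation: the parity block is the bytewise XOR (sum mod 2) of the data blocks, and XOR-ing the parity with the surviving blocks regenerates a lost one. Block contents here are made up:

```python
from functools import reduce

def parity(blocks):
    """Bytewise XOR (sum mod 2) of equal-sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]        # three data disks in the stripe
p = parity(data)                          # stored on the parity disk

# Disk 1 fails: "subtract" P from the sum of the survivors to rebuild it.
rebuilt = parity([data[0], data[2], p])
assert rebuilt == data[1]
```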

Inspiration for RAID 4
- RAID 3 relies on the parity disk to discover errors on a read
- But every sector has its own error detection/correction field
- To catch errors on a read, rely on the per-sector error detection/correction field rather than the parity disk
- This allows independent reads to different disks simultaneously

RAID 4: High I/O rate parity
- (The original slide shows the insides of 5 disks: data blocks D0..D23 striped four per stripe across four data disks, a dedicated parity disk P holding one parity block per stripe, and logical disk addresses increasing across the disk columns)
- Example: small reads of D0 & D5; large write of D12-D15

Inspiration for RAID 5
- RAID 4 works well for small reads
- Small writes (write to one disk):
  - Option 1: read the other data disks, create the new sum, and write it to the parity disk
  - Option 2: since P has the old sum, compare old data to new data and add the difference to P
- Small writes are limited by the parity disk: writes to D0 and D5 must both also write to the parity disk

RAID 5: High I/O rate, interleaved parity
- Independent writes are possible because parity is interleaved across the disks
- Example: writes to D0 and D5 use disks 0, 1, 3, and 4
- (The original slide shows blocks D0..D23 laid out across five disk columns with the parity block P rotating from stripe to stripe and logical disk addresses increasing; a sketch of one such mapping follows)
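A sketch of one possible rotated-parity mapping that reproduces the layout on the slide (D0-D3 with parity on disk 4, D4-D7 with parity on disk 3, and so on). Real arrays choose among several standard layouts, so this is illustrative only:

```python
def raid5_map(lba, n_disks=5):
    """Map a logical block number to (disk, stripe) with parity rotating per stripe."""
    stripe = lba // (n_disks - 1)
    parity_disk = (n_disks - 1) - stripe % n_disks   # parity on disk 4, 3, 2, 1, 0, 4, ...
    idx = lba % (n_disks - 1)
    disk = idx if idx < parity_disk else idx + 1     # skip over the parity disk
    return disk, stripe

# Writes to D0 and D5 touch disks 0 and 1 for data...
print(raid5_map(0), raid5_map(5))                    # (0, 0) (1, 1)
# ...and their stripes' parity lives on disks 4 and 3: four disks in total.
```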

Small writes
RAID 5: Small-write algorithm
- 1 logical write = 2 physical reads + 2 physical writes:
  1. Read the old data block
  2. Read the old parity block
  3. Write the new data block
  4. Write the new parity block, where new parity = new data XOR old data XOR old parity
- (See the sketch below)

RAID 6: Recovering from 2 failures
- Why recover from more than 1 failure?
  - An operator may accidentally replace the wrong disk during a failure
  - Since disk bandwidth is growing more slowly than disk capacity, the mean time to repair a disk in a RAID system is increasing; this raises the chances of a 2nd failure during repair, since repair takes longer
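A sketch of the small-write update at the byte level; because XOR is its own inverse, the new parity can be computed from just the old data, the old parity, and the new data, without reading the other disks. Block contents are made up:

```python
def raid5_small_write(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Return the new parity block: new data XOR old data XOR old parity."""
    return bytes(n ^ o ^ p for n, o, p in zip(new_data, old_data, old_parity))

# Stripe of three data blocks plus parity.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
p = bytes(a ^ b ^ c for a, b, c in zip(d0, d1, d2))

new_d1 = b"XXXX"
new_p = raid5_small_write(d1, new_d1, p)   # 2 reads (d1, p) + 2 writes (new_d1, new_p)
assert new_p == bytes(a ^ b ^ c for a, b, c in zip(d0, new_d1, d2))
```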

RAID 6: Recovering from 2 failures (continued)
- NetApp's row-diagonal parity, or RAID-DP
- Like the standard RAID schemes, it uses redundant space based on a parity calculation per stripe
- Since it protects against a double failure, it adds two check blocks per stripe of data
  - With p+1 disks total, p-1 disks hold data
  - The row parity disk is just like in RAID 4
  - Each block of the diagonal parity disk contains the even parity of the blocks in the same diagonal

Example with p = 5
- Row-diagonal parity starts by recovering one of the 4 blocks on a failed disk using diagonal parity
- Since each diagonal misses one disk, and all diagonals miss a different disk, 2 diagonals are missing only 1 block
- Once the data for those blocks is recovered, the standard RAID recovery scheme can be used to recover two more blocks in the standard RAID 4 stripes
- The process continues until both failed disks are restored (see the sketch below)
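A compact sketch of row-diagonal parity recovery under the layout just described (p = 5: four data disks, one row parity disk, one diagonal parity disk, four rows per stripe). It recovers two failed data disks by "peeling": repair any stored diagonal that is missing exactly one block, then use row parity for the other block in that row, and repeat. Block values are random test data, and the details (e.g., which diagonal is left unstored) follow one common description of the scheme:

```python
import random
from functools import reduce

P, ROWS = 5, 4                     # p = 5 (prime); p-1 = 4 rows and 4 data disks
xor = lambda vals: reduce(lambda a, b: a ^ b, vals, 0)

random.seed(0)
grid = [[random.randrange(256) for _ in range(P - 1)] for _ in range(ROWS)]
for r in range(ROWS):                                   # column P-1 holds row parity
    grid[r].append(xor(grid[r][:P - 1]))
# Block (r, c) lies on diagonal (r+c) mod p; the diagonal parity disk stores
# diagonals 0..p-2, and diagonal p-1 is deliberately left unstored.
diag = [xor(grid[r][c] for r in range(ROWS) for c in range(P) if (r + c) % P == d)
        for d in range(P - 1)]

original = [row[:] for row in grid]
lost = {(r, c) for r in range(ROWS) for c in (0, 2)}    # two data disks fail
for r, c in lost:
    grid[r][c] = 0                                      # their contents are gone

while lost:
    for d in range(P - 1):         # diagonal pass: stored diagonals missing 1 block
        cells = [(r, c) for r in range(ROWS) for c in range(P) if (r + c) % P == d]
        gone = [cell for cell in cells if cell in lost]
        if len(gone) == 1:
            (r, c), = gone
            grid[r][c] = xor(grid[rr][cc] for rr, cc in cells
                             if (rr, cc) != (r, c)) ^ diag[d]
            lost.remove((r, c))
    for r in range(ROWS):          # row pass: rows missing 1 block
        gone = [c for c in range(P) if (r, c) in lost]
        if len(gone) == 1:
            c, = gone
            grid[r][c] = xor(grid[r][cc] for cc in range(P) if cc != c)
            lost.remove((r, c))

assert grid == original            # both failed disks fully restored
```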

Berkeley RAID-I
- RAID-I (1989)
  - Sun 4/280 with 128 MB of DRAM
  - Four dual-string SCSI controllers
  - 28 5.25-inch SCSI disks and specialized disk striping software
- Today, RAID is a $24B industry; 80% of non-PC disks are sold in RAIDs

Summary: RAID techniques
- Disk mirroring (RAID 1)
  - Each disk is fully duplicated onto its mirror
  - 1 logical write = 2 physical writes
  - 100% capacity overhead
- Parity data bandwidth array (RAID 3)
  - Parity computed horizontally
  - Logically a single high-data-bandwidth disk
- High I/O rate parity array (RAID 5)
  - Interleaved parity blocks
  - Independent reads and writes
  - 1 logical write = 2 reads + 2 writes