Lecture 23: I/O Redundant Arrays of Inexpensive Disks Professor Randy H. Katz Computer Science 252 Spring 1996

Review: Storage System Issues
  Historical Context of Storage I/O
  Storage I/O Performance Measures
  Secondary and Tertiary Storage Devices
  A Little Queuing Theory
  Processor Interface Issues
  I/O & Memory Buses
  RAID
  ABCs of UNIX File Systems
  I/O Benchmarks
  Comparing UNIX File System Performance

Review: Busses
  Bus: a shared communication link between subsystems
  Disadvantage: a communication bottleneck, possibly limiting the maximum I/O throughput
  Bus speed is limited by physical factors
  Two generic types of busses: I/O and CPU
  Bus transaction: sending an address and receiving or sending data

Review: Bus Options

Option               High performance                        Low cost
Bus width            Separate address & data lines           Multiplex address & data lines
Data width           Wider is faster (e.g., 32 bits)         Narrower is cheaper (e.g., 8 bits)
Transfer size        Multiple words has less bus overhead    Single-word transfer is simpler
Bus masters          Multiple (requires arbitration)         Single master (no arbitration)
Split transaction?   Yes: separate Request and Reply         No: continuous connection is cheaper
                     packets get higher bandwidth            and has lower latency
                     (needs multiple masters)
Clocking             Synchronous                             Asynchronous

Review: 1990 Bus Survey

                       VME         FutureBus    Multibus II      IPI           SCSI
Signals                128         96           96               16            8
Addr/Data mux          no          yes          yes              n/a           n/a
Data width             16-32       32           32               16            8
Masters                multi       multi        multi            single        multi
Clocking               Async       Async        Sync             Async         either
MB/s (0 ns, word)      25          37           20               25            1.5 (asyn), 5 (sync)
MB/s (150 ns, word)    12.9        15.5         10               =             =
MB/s (0 ns, block)     27.9        95.2         40               =             =
MB/s (150 ns, block)   13.6        20.8         13.3             =             =
Max devices            21          20           21               8             7
Max meters             0.5         0.5          0.5              50            25
Standard               IEEE 1014   IEEE 896.1   ANSI/IEEE 1296   ANSI X3.129   ANSI X3.131

Review: 1993 I/O Bus Survey

Bus                  SBus       TurboChannel   MicroChannel     PCI
Originator           Sun        DEC            IBM              Intel
Clock Rate (MHz)     16-25      12.5-25        async            33
Addressing           Virtual    Physical       Physical         Physical
Data Sizes (bits)    8,16,32    8,16,24,32     8,16,24,32,64    8,16,24,32,64
Master               Multi      Single         Multi            Multi
Arbitration          Central    Central        Central          Central
32 bit read (MB/s)   33         25             20               33
Peak (MB/s)          89         84             75               111 (222)
Max Power (W)        16         26             13               25

1993 MP Server Memory Bus Survey

Bus                  Summit       Challenge     XDBus
Originator           HP           SGI           Sun
Clock Rate (MHz)     60           48            66
Split transaction?   Yes          Yes           Yes?
Address lines        48           40            ??
Data lines           128          256           144 (parity)
Data Sizes (bits)    512          1024          512
Clocks/transfer      4            5             4?
Peak (MB/s)          960          1200          1056
Master               Multi        Multi         Multi
Arbitration          Central      Central       Central
Addressing           Physical     Physical      Physical
Slots                16           9             10
Busses/system        1            1             2
Length               13 inches    12? inches    17 inches

Review: Improving Bandwidth of Secondary Storage
  Processor performance growth phenomenal. I/O?
  "I/O certainly has been lagging in the last decade" (Seymour Cray, Public Lecture, 1976)
  "Also, I/O needs a lot of work" (David Kuck, Keynote Address, 1988)

Network Attached Storage
  Decreasing Disk Diameters: 14" » 10" » 8" » 5.25" » 3.5" » 2.5" » 1.8" » 1.3" »
    high bandwidth disk systems based on arrays of disks
  Network provides well defined physical and logical interfaces: separate CPU and storage system!
  High Performance Storage Service on a High Speed Network
  Network File Services: OS structures supporting remote file access
  Increasing Network Bandwidth: 3 Mb/s » 10 Mb/s » 50 Mb/s » 100 Mb/s » 1 Gb/s » 10 Gb/s
    networks capable of sustaining high bandwidth transfers

Manufacturing Advantages of Disk Arrays
  Disk Product Families
  Conventional: 4 disk designs (3.5", 5.25", 10", 14"), from Low End to High End
  Disk Array: 1 disk design (3.5")

Replace Small # of Large Disks with Large # of Small Disks!

                 IBM 3390 (K)   IBM 3.5" 0061   x70
Data Capacity    20 GBytes      320 MBytes      23 GBytes
Volume           97 cu. ft.     0.1 cu. ft.     11 cu. ft.
Power            3 KW           11 W            1 KW
Data Rate        15 MB/s        1.5 MB/s        120 MB/s
I/O Rate         600 I/Os/s     55 I/Os/s       3900 I/Os/s
MTTF             250 KHrs       50 KHrs         ??? Hrs
Cost             $250K          $2K             $150K

  Disk Arrays have potential for large data and I/O rates, high MB per cu. ft., high MB per KW
  But: awful reliability

Redundant Arrays of Disks
  Files are "striped" across multiple spindles
  Redundancy yields high data availability
    Disks will fail
    Contents reconstructed from data redundantly stored in the array
      Capacity penalty to store it
      Bandwidth penalty to update
  Techniques:
    Mirroring/Shadowing (high capacity cost)
    Horizontal Hamming Codes (overkill)
    Parity & Reed-Solomon Codes
    Failure Prediction (no capacity overhead!): VaxSimPlus; technique is controversial

Array Reliability
  Reliability of N disks = Reliability of 1 Disk ÷ N
    50,000 Hours ÷ 70 disks = about 700 hours
    Disk system MTTF: drops from 6 years to 1 month!
  Arrays without redundancy too unreliable to be useful!
  Hot spares support reconstruction in parallel with access: very high media availability can be achieved
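The arithmetic on this slide can be checked with a minimal sketch. It assumes independent disk failures at a constant rate, so that an array without redundancy fails as soon as any one disk fails. The function name `array_mttf_hours` is illustrative, and the 50,000-hour and 70-disk figures are read from the slide.

```python
def array_mttf_hours(disk_mttf_hours: float, n_disks: int) -> float:
    """MTTF of an array with no redundancy.

    Assumes independent failures at a constant rate, so the first
    failure among n_disks arrives n_disks times sooner on average.
    """
    if n_disks < 1:
        raise ValueError("n_disks must be at least 1")
    return disk_mttf_hours / n_disks


HOURS_PER_YEAR = 24 * 365
HOURS_PER_MONTH = 24 * 30

single_disk = 50_000                      # per-disk MTTF in hours (from the slide)
array = array_mttf_hours(single_disk, 70)

print(f"single disk:   {single_disk / HOURS_PER_YEAR:.1f} years")
print(f"70-disk array: {array:.0f} hours, about {array / HOURS_PER_MONTH:.1f} months")
```

The result is roughly 714 hours, which the slide rounds to 700 hours, or about one month.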

Redundant Arrays of Disks (RAID) Techniques
  Disk Mirroring, Shadowing
    Each disk is fully duplicated onto its "shadow"
    Logical write = two physical writes
    100% capacity overhead
  Parity Data Bandwidth Array
    Parity computed horizontally
    Logically a single high data bw disk
  High I/O Rate Parity Array
    Interleaved parity blocks
    Independent reads and writes
    Logical write = 2 reads + 2 writes
    Parity + Reed-Solomon codes

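Problems of Disk Arrays: Small Writes
  RAID-5: Small Write Algorithm
  1 Logical Write = 2 Physical Reads + 2 Physical Writes

  Before:  D0   D1  D2  D3  P      (new data: D0')
    (1. Read)  old data D0
    (2. Read)  old parity P
    XOR new data D0' with old data D0, then XOR the result with old parity P to get P'
    (3. Write) new data D0'
    (4. Write) new parity P'
  After:   D0'  D1  D2  D3  P'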
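The small-write steps above can be sketched in a few lines of Python. This is a minimal illustration, not a controller implementation: the stripe is modeled as an in-memory list of byte blocks, and the names `xor_blocks` and `small_write` are hypothetical.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(stripe: list, data_idx: int, parity_idx: int, new_data: bytes) -> int:
    """Update one data block in place; return the number of physical I/Os."""
    old_data = stripe[data_idx]          # 1. read old data
    old_parity = stripe[parity_idx]      # 2. read old parity
    # New parity = old parity XOR old data XOR new data.
    new_parity = xor_blocks(old_parity, xor_blocks(old_data, new_data))
    stripe[data_idx] = new_data          # 3. write new data
    stripe[parity_idx] = new_parity      # 4. write new parity
    return 4


# Four data blocks plus one parity block, as in the slide's D0..D3, P.
data = [bytes([i] * 4) for i in (1, 2, 3, 4)]
stripe = data + [reduce(xor_blocks, data)]

ios = small_write(stripe, data_idx=0, parity_idx=4, new_data=b"\xaa\xbb\xcc\xdd")

# The incrementally updated parity matches parity recomputed from all data.
assert stripe[4] == reduce(xor_blocks, stripe[:4])
```

The point of the incremental update is that the other data disks (D1 to D3) are never touched, at the cost of four physical I/Os for one logical write.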

Redundant Arrays of Disks RAID 1: Disk Mirroring/Shadowing
  Each disk is fully duplicated onto its "shadow"; a disk and its shadow form a recovery group
  Very high availability can be achieved
  Bandwidth sacrifice on write: Logical write = two physical writes
  Reads may be optimized
  Most expensive solution: 100% capacity overhead
  Targeted for high I/O rate, high availability environments

Redundant Arrays of Disks RAID 3: Parity Disk
  Figure: a logical record is striped into physical records across the data disks, with parity disk P
  Parity computed across recovery group to protect against hard disk failures
    33% capacity cost for parity in this configuration
    Wider arrays reduce capacity costs, decrease expected availability, increase reconstruction time
  Arms logically synchronized, spindles rotationally synchronized
    Logically a single high capacity, high transfer rate disk
  Targeted for high bandwidth applications: Scientific, Image Processing
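How parity across a recovery group protects against a single disk failure can be sketched as follows. The example assumes XOR parity over three data disks and one parity disk (which gives the 33% capacity cost mentioned above) and works on whole blocks rather than the bit- or byte-interleaved records a real RAID 3 array would use. All function names are illustrative.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def parity_of(blocks: list) -> bytes:
    """Parity block for a recovery group: XOR of all its blocks."""
    return reduce(xor_blocks, blocks)


def reconstruct(stripe: list, failed_idx: int) -> bytes:
    """Rebuild the block on a failed disk from all surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != failed_idx]
    return reduce(xor_blocks, survivors)


def parity_overhead(n_data_disks: int) -> float:
    """Parity capacity as a fraction of data capacity."""
    return 1 / n_data_disks


data = [b"scie", b"ntif", b"ic!!"]        # three data disks
stripe = data + [parity_of(data)]         # plus one parity disk

# Any single lost block, data or parity, can be recovered.
for failed in range(len(stripe)):
    assert reconstruct(stripe, failed) == stripe[failed]

print(f"parity overhead with 3 data disks: {parity_overhead(3):.0%}")
```

Widening the group lowers `parity_overhead`, but reconstruction must then read more surviving disks, which is the trade-off the slide notes.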

Redundant Arrays of Disks RAID 5+: High I/O Rate Parity
  A logical write becomes four physical I/Os
  Independent writes possible because of interleaved parity
  Reed-Solomon Codes ("Q") for protection during reconstruction
  Targeted for mixed applications

  Layout (disk columns left to right; logical disk addresses increase downward; each row is a stripe, each entry a stripe unit):
    D0   D1   D2   D3   P
    D4   D5   D6   P    D7
    D8   D9   P    D10  D11
    D12  P    D13  D14  D15
    P    D16  D17  D18  D19
    D20  D21  D22  D23  P
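The interleaved-parity layout on this slide can be expressed as a small address-mapping function. This sketch assumes five disk columns, stripe units numbered D0 to D23, and parity rotating one column to the left on each successive stripe, which is how the slide's diagram reads. The names `locate` and `layout_row` are hypothetical.

```python
def locate(block: int, n_disks: int = 5) -> tuple:
    """Map a logical stripe unit to (stripe, data disk, parity disk).

    Assumes parity starts in the last column and moves one column
    left per stripe, wrapping around after n_disks stripes.
    """
    data_per_stripe = n_disks - 1
    stripe, idx = divmod(block, data_per_stripe)
    parity_disk = (n_disks - 1) - (stripe % n_disks)
    # Data units fill the columns left to right, skipping the parity column.
    disk = idx if idx < parity_disk else idx + 1
    return stripe, disk, parity_disk


def layout_row(stripe: int, n_disks: int = 5) -> list:
    """Labels for one stripe, in disk-column order."""
    row = ["P"] * n_disks
    first = stripe * (n_disks - 1)
    for block in range(first, first + n_disks - 1):
        _, disk, _ = locate(block, n_disks)
        row[disk] = f"D{block}"
    return row


for s in range(6):
    print("  ".join(f"{label:<4}" for label in layout_row(s)))
```

Because consecutive stripes place parity on different disks, two small writes to different stripes usually touch disjoint disk pairs and can proceed independently, which is the property the slide highlights.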

RA90: Ave Seek: 18.5 ms, Rotation: 16.7 ms, Xfer Rate: 2.8 MB/s, Capacity: 1200 MB
IBM Small Disks: Ave Seek: 12.5 ms, Rotation: 14 ms, Xfer Rate: 2.4 MB/s, Capacity: 300 MB
Figure labels: Normal Operating Range of Most Existing Systems; 4+2 Array Group; Mirrored RA90's; horizontal axis from all writes to all reads

Subsystem Organization
  Figure: host, host adapter, array controller, and four single board disk controllers
  Host adapter: manages interface to host, DMA
  Array controller: control, buffering, parity logic
  Single board disk controller: physical device control; often piggy-backed in small format devices
  Striping software off-loaded from host to array controller
    no applications modifications
    no reduction of host performance

System Availability: Orthogonal RAIDs
  Figure: one Array Controller connected to six String Controllers
  Data Recovery Group: unit of data redundancy
  Redundant Support Components: fans, power supplies, controller, cables
  End to End Data Integrity: internal parity protected data paths

System-Level Availability
  Figure: two hosts, two I/O Controllers, and two Array Controllers (fully dual redundant), serving a Recovery Group
  Goal: No Single Points of Failure
  With duplicated paths, higher performance can be obtained when there are no failures