Storage System COSC UCB


Storage System COSC4201 1 1999 UCB

I/O and Disks
Over the years, much less attention has been paid to I/O than to CPU design.
As frustrating as a CPU crash is, a disk crash is a lot worse.
Disks are mechanical devices and therefore a bottleneck; by Amdahl's law, slow I/O limits the benefit of progress in CPUs.
Types: hard disks, optical disks, and tapes.

I/O system
[Figure: the processor and its cache share a memory-I/O bus with main memory and three I/O controllers, which serve two disks, graphics, and a network. The controllers signal the processor through interrupts.]

I/O System
[Figure]

Magnetic Disks
[Figure labels: arm, head, sector, inner track, outer track, actuator, platter]
Information is recorded on several platters, usually on both sides.
Bits are recorded sequentially on tracks, and tracks are divided into sectors.
Each head is attached to an arm, and an actuator moves the arm to position the head over the required track.
Heads can be fixed (one per track) or movable.
A cylinder is the set of all tracks under the heads at a given arm position.
Areal density: bits per square inch.

Magnetic Disks
Typical values:
Rotation speed: 3,600-15,000 RPM.
Number of platters: 1-12.
Diameter: 1-3.5 inches.
Tracks per surface: 5,000-30,000.
Sector size: typically 512 bytes.
Disk latency = seek time + rotation time + transfer time + controller overhead.

Disk Performance
Rotation delay: on average the desired sector is halfway around the disk. At 10,000 RPM, the average rotational latency is 60 * 0.5 / 10,000 s = 3.0 ms.
Seek time: the time for the arm to reach the target track, averaged over the number of tracks it has to cross. Typically 8 ms (it can overlap with rotation time).
Transfer rate: 3-60 MB per second; the transfer time is the amount of data divided by this rate.
Read-ahead caches nearby sectors in the disk cache (0.1-4 MB).
With constant bit density (in practice not quite constant), outer tracks can hold more sectors than inner tracks. Zone-bit recording is one way to exploit this.
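The rotational-delay arithmetic can be checked with a short Python sketch. The function name is an illustrative choice, and the sample speeds are simply values inside the 3,600-15,000 RPM range quoted earlier.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational delay in ms: the time for half a revolution."""
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution * 1000.0


if __name__ == "__main__":
    # Sample speeds chosen for illustration only.
    for rpm in (3600, 5400, 7200, 10_000, 15_000):
        print(f"{rpm:>6} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
```

Faster spindles shorten only this term of the disk latency; seek and transfer time are unaffected.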

Optical Disks
Compact disks: CD-ROM, DVD-ROM (Digital Versatile Disk), CD-RW, and WORM (Write Once, Read Many).
Magneto-optical disks use a laser to enhance the capabilities of a magnetic disk system.
Reading is optical: the direction of magnetization is detected with polarized laser light.
The disk is coated with a material whose magnetic polarity can be altered only at high temperature.
Writing heats a tiny spot with the laser and then applies a magnetic field.

Magnetic Tapes
Same idea as magnetic disks, but access is sequential.
Helical-scan tapes: the head spins much faster than the tape moves, and data is recorded on a diagonal across the tape (one limit on tape drives is how fast the tape can be spun without jamming).
Tapes wear out.
Automated tape library: a robot loads and changes the tapes.

RAID
Redundant Arrays of Inexpensive Disks.
Used to improve system performance and reliability.
The idea is to distribute the data across more than one physical disk.

RAID Level 0
The name is a misnomer: there is no redundancy.
Strips are distributed among many disks; a strip is a block or a sector.
Access is fast because many strips can be read at the same time.
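One way to picture the distribution of strips is a round-robin mapping from logical strip number to disk. The slide does not specify the placement rule, so the round-robin layout and the name `locate_strip` are assumptions made for this sketch.

```python
def locate_strip(strip_index, num_disks):
    """Map a logical strip number to (disk, strip offset on that disk).

    Assumes round-robin placement, which is one common layout and not
    something the slide states.
    """
    if num_disks < 1:
        raise ValueError("need at least one disk")
    return strip_index % num_disks, strip_index // num_disks


if __name__ == "__main__":
    for strip in range(8):
        disk, offset = locate_strip(strip, num_disks=4)
        print(f"strip {strip} -> disk {disk}, offset {offset}")
```

Under this layout, any run of consecutive strips no longer than the number of disks lands on distinct disks, which is what allows them to be read at the same time.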

RAID Level 1
Mirroring: each disk is fully duplicated onto its mirror.
Expensive: 100% capacity overhead.
Every write must be done twice (but the two writes proceed in parallel).

RAID Level 2
RAID level 2 performs striping with a strip size of 1 bit or 1 byte.
It needs extra disks to store error-correcting codes.
No commercial product was released for RAID level 2.

RAID Level 3
Data are striped in small units.
One extra parity disk.
Spindles are synchronized (the heads are over the same sector on every disk).
Works well for large sequential accesses.
Only one I/O request can be served at a time, which is a poor fit for transaction-based environments.

RAID Level 4
Similar to level 3, but data are distributed across the disks in blocks instead of bits.
One disk is dedicated to parity (a potential bottleneck).
Small writes pay a penalty.
For large writes, the parity is calculated from the written data alone; for small writes, it is calculated from the difference between the old and new data.
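As a minimal sketch of the large-write case, the parity strip can be computed as the bytewise XOR of the data strips, and the same operation rebuilds one lost strip from the survivors. XOR parity is the standard choice for these RAID levels; the function names and the two-byte strips are hypothetical.

```python
from functools import reduce
from operator import xor


def parity_of(strips):
    """Bytewise XOR of equal-length strips (bytes objects)."""
    return bytes(reduce(xor, column) for column in zip(*strips))


def rebuild_missing(surviving_strips, parity):
    """Recover the one missing data strip from the survivors plus parity."""
    return parity_of(list(surviving_strips) + [parity])


if __name__ == "__main__":
    data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]  # hypothetical strips
    parity = parity_of(data)
    # Pretend the disk holding data[1] failed and rebuild its contents.
    print(rebuild_missing([data[0], data[2]], parity) == data[1])
```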

RAID Level 5
Parity is distributed over all disks, avoiding the parity-disk bottleneck.
To overwrite strip $S_0$ with new data $S_0'$:
1. Read $S_0$ and $P_0$.
2. Compute $P_0' = P_0 \oplus (S_0 \oplus S_0')$.
3. Write $P_0'$ in place of $P_0$.
4. Write $S_0'$.
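The small-write sequence can be sketched in Python as follows. It XORs the old strip with the new one and folds the difference into the old parity, then checks the result against a full recomputation. The helper names and byte values are hypothetical.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length bytes objects."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(old_strip, old_parity, new_strip):
    """New parity = old parity XOR (old strip XOR new strip)."""
    return xor_bytes(old_parity, xor_bytes(old_strip, new_strip))


if __name__ == "__main__":
    s0, s1, s2 = b"\x01\x02", b"\x10\x20", b"\xa0\x0b"  # hypothetical strips
    p0 = xor_bytes(xor_bytes(s0, s1), s2)
    new_s0 = b"\xff\x00"
    new_p0 = small_write_parity(s0, p0, new_s0)
    # The shortcut must agree with recomputing parity over the whole stripe.
    assert new_p0 == xor_bytes(xor_bytes(new_s0, s1), s2)
    print("parity after small write:", new_p0.hex())
```

The shortcut touches only two disks (two reads and two writes), regardless of how many data disks are in the stripe.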

RAID Level 6
Uses 2 extra disks.
Parity is distributed over all disks.
Two different parity codes are used: one is the XOR parity, the other is an independent code.
Three disks must fail within the repair window (MTTR) for data to be lost.

Reliability
A fault creates one or more errors.
Errors start out latent; a latent error becomes effective once it is activated.
If an error affects the delivered service, a component failure occurs.
Module reliability is a measure of continuous service (MTTF).
$$\text{Availability} = \frac{\text{MTTF}}{\text{MTTF} + \text{MTTR}}$$
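A quick numeric check of the availability ratio, as a sketch. The inputs (a 1,000,000-hour MTTF and a 24-hour MTTR) are illustrative values borrowed from the examples later in these slides, not figures for any particular device.

```python
def availability(mttf_hours, mttr_hours):
    """Fraction of time a module delivers service: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


if __name__ == "__main__":
    # Illustrative inputs only.
    print(f"{availability(1_000_000, 24):.6f}")
```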

Reliability
If the modules in a collection have exponentially distributed lifetimes, the overall failure rate is the sum of the failure rates of the individual components.
The failure rate is the reciprocal of the MTTF.

Reliability
Assume the following:
10 disks, each with an MTTF of 1,000,000 hours
1 SCSI controller, MTTF of 500,000 hours
1 power supply, MTTF of 200,000 hours
1 fan, MTTF of 200,000 hours
1 SCSI cable, MTTF of 1,000,000 hours
$$\text{Failure rate}_{system} = 10 \times \frac{1}{10^6} + \frac{1}{500{,}000} + \frac{1}{200{,}000} + \frac{1}{200{,}000} + \frac{1}{10^6} = \frac{23}{1{,}000{,}000}\ \text{per hour}$$
$$\text{MTTF}_{system} = \frac{1{,}000{,}000}{23} \approx 43{,}500\ \text{hours} \approx 5\ \text{years}$$
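The same sum of failure rates can be sketched in Python. The power supply is entered with a 200,000-hour MTTF, which is the value the 23/1,000,000 total implies; the names `system_mttf_hours` and `COMPONENTS` are illustrative.

```python
HOURS_PER_YEAR = 24 * 365


def system_mttf_hours(components):
    """components: iterable of (count, mttf_hours) pairs.

    Assumes exponentially distributed lifetimes, so failure rates add.
    """
    failure_rate = sum(count / mttf for count, mttf in components)
    return 1.0 / failure_rate


COMPONENTS = [
    (10, 1_000_000),  # disks
    (1, 500_000),     # SCSI controller
    (1, 200_000),     # power supply
    (1, 200_000),     # fan
    (1, 1_000_000),   # SCSI cable
]

if __name__ == "__main__":
    mttf = system_mttf_hours(COMPONENTS)
    print(f"{mttf:,.0f} hours, about {mttf / HOURS_PER_YEAR:.1f} years")
```

The power supply and fan together contribute 10 of the 23 failures per million hours, as much as all ten disks combined.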

Example
CPU: 2,500 MIPS, $20,000.
Memory: 16 bytes wide, interleaved, 10 ns cycle.
I/O bus: 1,000 MB/sec, with room for 20 Ultra3 SCSI buses and controllers.
Wide Ultra SCSI bus: 160 MB/sec, supporting up to 15 disks per bus (these are called SCSI strings).
Ultra3 SCSI controller: $500, adds 0.3 ms of delay per I/O.
OS: uses 50,000 instructions per I/O.
Disks: large (80 GB) or small (40 GB), $10 per GB; 15,000 RPM, 5 ms seek, 40 MB/sec transfer.
Enclosure: $1,500, supplies power and cooling to 8 large disks or 12 small disks.
Requirement: 1,920 GB of storage, with 32 KB per I/O.

Example
[Figure: the CPU and memory attach to the 1,000 MB/sec I/O bus, which feeds the SCSI controllers; each controller drives a 160 MB/sec SCSI bus.]

Example
Find the cost per IOPS (I/Os per second) for both the small and the large disks, assuming 100% utilization.
The system is a chain of components: CPU, OS, memory, the buses, the controllers, and the disks.
Its performance is limited by the weakest link in the chain.

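Example
CPU: 50,000 IOPS.
$$\text{Maximum IOPS} = \frac{2500\ \text{MIPS}}{50{,}000\ \text{instructions per I/O}} = 50{,}000\ \text{IOPS}$$
Memory: 50,000 IOPS.
$$\text{Maximum IOPS} = \frac{16\ \text{bytes} / 10\ \text{ns}}{32\ \text{KB per I/O}} = 50{,}000\ \text{IOPS}$$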
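A sketch of the CPU and memory limits in Python. It assumes decimal units (32 KB = 32,000 bytes), which is what the 50,000 IOPS results imply; the function names are illustrative.

```python
IO_BYTES = 32_000  # 32 KB per I/O, decimal units assumed


def cpu_max_iops(mips, instructions_per_io):
    """I/Os per second the CPU can issue if it does nothing else."""
    return mips * 1_000_000 / instructions_per_io


def memory_max_iops(width_bytes, cycle_ns, io_bytes=IO_BYTES):
    """I/Os per second the memory system can absorb."""
    bandwidth_bytes_per_s = width_bytes * 1_000_000_000 / cycle_ns
    return bandwidth_bytes_per_s / io_bytes


if __name__ == "__main__":
    print("CPU   :", round(cpu_max_iops(2500, 50_000)), "IOPS")
    print("Memory:", round(memory_max_iops(16, 10)), "IOPS")
```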

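Example
I/O bus: 31,250 IOPS.
$$\text{Maximum IOPS} = \frac{1000\ \text{MB/sec}}{32\ \text{KB per I/O}} = 31{,}250\ \text{IOPS}$$
SCSI controller: 2,000 IOPS (this is per controller; the system may have more than one).
$$\text{Time to transfer a block} = \frac{32\ \text{KB}}{160\ \text{MB/sec}} = 0.2\ \text{ms}$$
$$\text{Total time per block} = 0.2 + 0.3 = 0.5\ \text{ms}$$
$$\text{Maximum IOPS} = \frac{1}{0.5\ \text{ms}} = 2000\ \text{IOPS}$$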
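The bus and controller limits can be reproduced with a short sketch, using the example's 1,000 MB/sec I/O bus, 160 MB/sec SCSI bus, and 0.3 ms controller delay. Decimal units are assumed (1 MB = 1,000,000 bytes), and the function names are illustrative.

```python
MB = 1_000_000
IO_BYTES = 32_000  # 32 KB per I/O, decimal units assumed


def bus_max_iops(bandwidth_bytes_per_s, io_bytes=IO_BYTES):
    """I/Os per second a bus can carry at full bandwidth."""
    return bandwidth_bytes_per_s / io_bytes


def controller_max_iops(bus_bytes_per_s, overhead_s, io_bytes=IO_BYTES):
    """I/Os per second through one controller: transfer time plus overhead."""
    time_per_io_s = io_bytes / bus_bytes_per_s + overhead_s
    return 1.0 / time_per_io_s


if __name__ == "__main__":
    print("I/O bus        :", round(bus_max_iops(1000 * MB)), "IOPS")
    print("SCSI controller:", round(controller_max_iops(160 * MB, 0.3e-3)), "IOPS")
```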

Example
Now, the disks themselves.
$$\text{I/O time} = 5\ \text{ms} + \frac{0.5}{15{,}000\ \text{RPM}} + \frac{32\ \text{KB}}{40\ \text{MB/sec}} = 5 + 2.0 + 0.8 = 7.8\ \text{ms}$$
The maximum IOPS per disk is 1 / 7.8 ms, about 128, again assuming 100% utilization.
How many disks do we need? The total capacity is 1,920 GB, so we need 24 large disks or 48 small disks.
We also have to check that there are enough SCSI strings.
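The per-disk I/O time is the sum of seek, average rotational delay, and transfer time. A minimal sketch with the example's figures, assuming decimal units and using an illustrative function name:

```python
def disk_io_time_ms(seek_ms, rpm, io_bytes, transfer_bytes_per_s):
    """Seek + half a revolution + transfer time, all in milliseconds."""
    rotation_ms = 0.5 * 60_000 / rpm
    transfer_ms = io_bytes * 1_000 / transfer_bytes_per_s
    return seek_ms + rotation_ms + transfer_ms


if __name__ == "__main__":
    t = disk_io_time_ms(seek_ms=5, rpm=15_000, io_bytes=32_000,
                        transfer_bytes_per_s=40_000_000)
    print(f"I/O time {t:.1f} ms, about {int(1000 / t)} IOPS per disk")
```

Seek time is the largest of the three terms here, which is why per-disk IOPS is so far below the controller and bus limits.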

Example
With 24 disks, the maximum is 24 x 128 = 3,072 IOPS. With 48 disks, the maximum is 48 x 128 = 6,144 IOPS.
Choice 1 (24 large disks): 24/15, rounded up, gives 2 SCSI strings.
Choice 2 (48 small disks): 48/15, rounded up, gives 4 SCSI strings.
Both are O.K. However, the large disks need three enclosures, and splitting 2 SCSI controllers across three enclosures may not be the best way to go, so increase them to 3 SCSI controllers.
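The sizing step is a series of ceiling divisions. The sketch below uses the example's limits (15 disks per string, 8 large or 12 small disks per enclosure, 128 IOPS per disk); the function name and dictionary layout are illustrative.

```python
import math

DISK_IOPS = 128
MAX_DISKS_PER_STRING = 15


def size_system(capacity_gb, disk_gb, disks_per_enclosure):
    """Minimum disk, string, and enclosure counts for a capacity target."""
    disks = math.ceil(capacity_gb / disk_gb)
    return {
        "disks": disks,
        "strings": math.ceil(disks / MAX_DISKS_PER_STRING),
        "enclosures": math.ceil(disks / disks_per_enclosure),
        "disk_iops": disks * DISK_IOPS,
    }


if __name__ == "__main__":
    print("large:", size_system(1920, 80, 8))
    print("small:", size_system(1920, 40, 12))
```

These are minimum counts; the example then raises the large-disk system from 2 to 3 controllers so that each enclosure has its own.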

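Example
The limit is 3,072 IOPS for the large disks and 6,144 IOPS for the small disks. Now for the cost:
$$\text{Cost}_{large} = \$20{,}000 + 3 \times \$500 + 24 \times (80 \times \$10) + 3 \times \$1{,}500 = \$45{,}200$$
$$\text{Cost}_{small} = \$20{,}000 + 4 \times \$500 + 48 \times (40 \times \$10) + 4 \times \$1{,}500 = \$47{,}200$$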
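A sketch of the cost arithmetic, extended to the cost per IOPS that the example set out to find. The $47,200 total for the small disks corresponds to four controllers, one per string. The cost-per-IOPS figures are derived here from the totals and limits above, and the function name is illustrative.

```python
def system_cost(controllers, disks, disk_gb, enclosures,
                cpu=20_000, controller=500, per_gb=10, enclosure=1_500):
    """Total price in dollars using the example's unit prices."""
    return (cpu + controllers * controller
            + disks * disk_gb * per_gb
            + enclosures * enclosure)


if __name__ == "__main__":
    large = system_cost(controllers=3, disks=24, disk_gb=80, enclosures=3)
    small = system_cost(controllers=4, disks=48, disk_gb=40, enclosures=4)
    print(f"large: ${large:,}, ${large / 3072:.2f} per IOPS")
    print(f"small: ${small:,}, ${small / 6144:.2f} per IOPS")
```

The small-disk system costs slightly more in total but roughly half as much per IOPS, because it has twice as many spindles.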

Example
Calculate the reliability, assuming the following MTTF values:
Component         MTTF (hours)
CPU/Memory        1,000,000
Disk              1,000,000
SCSI controller   500,000
Power supply      200,000
SCSI cable        1,000,000
Enclosure         1,000,000
Fan               200,000

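Example
Consider the failure rate. The terms are, in order: CPU, disks, controllers, enclosures, power supplies, fans, cables.
Big disks:
$$\text{Failure rate}_{big} = \frac{1}{1{,}000{,}000} + \frac{24}{1{,}000{,}000} + \frac{3}{500{,}000} + \frac{3}{1{,}000{,}000} + \frac{3}{200{,}000} + \frac{3}{200{,}000} + \frac{3}{1{,}000{,}000} = \frac{67}{1{,}000{,}000}$$
$$\text{MTTF}_{big} = \frac{1{,}000{,}000}{67} \approx 14{,}925\ \text{hours}$$
Small disks:
$$\text{Failure rate}_{small} = \frac{1}{1{,}000{,}000} + \frac{48}{1{,}000{,}000} + \frac{4}{500{,}000} + \frac{4}{1{,}000{,}000} + \frac{4}{200{,}000} + \frac{4}{200{,}000} + \frac{4}{1{,}000{,}000} = \frac{105}{1{,}000{,}000}$$
$$\text{MTTF}_{small} = \frac{1{,}000{,}000}{105} \approx 9{,}524\ \text{hours}$$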
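To see which components dominate, the failure rate can be broken down per component. In this sketch the component counts follow the two configurations (3 controllers, enclosures, power supplies, fans, and cables for the big disks; 4 of each for the small disks), and the dictionary names are illustrative.

```python
PER_MILLION = 1_000_000


def failures_per_million_hours(parts):
    """parts maps name -> (count, mttf_hours); returns name -> rate per 10^6 h."""
    return {name: count * PER_MILLION / mttf
            for name, (count, mttf) in parts.items()}


def config(disks, units):
    """Build a parts table; `units` is the count of each per-string/enclosure part."""
    return {
        "cpu/memory": (1, 1_000_000),
        "disks": (disks, 1_000_000),
        "controllers": (units, 500_000),
        "enclosures": (units, 1_000_000),
        "power supplies": (units, 200_000),
        "fans": (units, 200_000),
        "cables": (units, 1_000_000),
    }


if __name__ == "__main__":
    for label, parts in (("big", config(24, 3)), ("small", config(48, 4))):
        rates = failures_per_million_hours(parts)
        total = sum(rates.values())
        print(label, rates, f"MTTF = {PER_MILLION / total:,.0f} hours")
```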

Example -- Availability
The configuration of the previous example changes once limits on the utilization of the different parts are taken into account. We did not cover that analysis, but here is the final configuration used for the next example (availability):
80 GB disks: 4 strings, 4 enclosures, 24 disks.
40 GB disks: 8 strings, 4 enclosures, 48 disks.

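Example -- Availability
What about availability?
If we have n components, each with the same MTTF, the MTTF of the collection is
$$\text{MTTF}_n = \frac{\text{MTTF}}{n}$$
In a RAID, data is lost if a second disk in the same group fails before the first failed disk is repaired. The probability of that is approximately $\text{MTTR}_{disk} / (\text{MTTF}_{disk} / (G - 1))$, so the mean time to data loss is
$$\text{MTDL} = \frac{\text{MTTF}_{disk} / N}{\text{MTTR}_{disk} / (\text{MTTF}_{disk} / (G - 1))} = \frac{\text{MTTF}_{disk}^2}{N \times (G - 1) \times \text{MTTR}_{disk}}$$
G is the number of disks in a group protected by one parity, and N is the number of disks in the system.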
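The mean-time-to-data-loss estimate can be written as two steps: the expected time until the first failure anywhere in the system, divided by the chance that a second disk of the same group fails during the repair. This is a sketch of that approximation only; the function name and the sample inputs (24 disks in groups of 6, 24-hour repair) are hypothetical.

```python
def mean_time_to_data_loss(mttf_disk, mttr_disk, group_size, num_disks):
    """Approximate MTDL in hours for single-parity groups.

    group_size is G (disks per parity group); num_disks is N.
    """
    time_to_first_failure = mttf_disk / num_disks
    # Chance that one of the other G-1 group members fails before repair.
    p_second_failure = mttr_disk / (mttf_disk / (group_size - 1))
    return time_to_first_failure / p_second_failure


if __name__ == "__main__":
    # Hypothetical inputs for illustration.
    mtdl = mean_time_to_data_loss(1_000_000, 24, group_size=6, num_disks=24)
    print(f"{mtdl:,.0f} hours")
```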

Orthogonal RAID
[Figure: an array controller above a row of string controllers, each driving a string of disks, with parity groups drawn in two arrangements.]
If each parity group takes one disk from every string (the orthogonal arrangement), the system stays available when a string controller fails.
If a parity group lies along a single string, a string controller failure takes out all the disks in the group and data is lost.

Example -- Availability
So far:
Large disks: 4 strings, 24 disks, 4 enclosures.
Small disks: 8 strings, 48 disks, 4 enclosures.

Example -- Availability
[Figure: two configurations on the I/O bus. Large disks: 4 SCSI controllers, each with a string of 6 disks. Small disks: 8 SCSI controllers, each with a string of 6 disks.]

Example -- Availability
Large disks: 4 enclosures, each with 1 SCSI controller and 6 disks. Add 1 enclosure with 1 controller and 6 disks.
Small disks: 4 enclosures, each with 2 controllers and 12 disks. Add 1 enclosure with 2 controllers and 12 disks.

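Example -- Availability
Now calculate the MTTF per enclosure. The terms are, in order: disks, SCSI controllers, power supply, fan, cables, enclosure.
$$\text{Enclosure failure rate}_{big} = \frac{6}{1{,}000{,}000} + \frac{1}{500{,}000} + \frac{1}{200{,}000} + \frac{1}{200{,}000} + \frac{1}{1{,}000{,}000} + \frac{1}{1{,}000{,}000} = \frac{20}{1{,}000{,}000}$$
$$\text{Enclosure failure rate}_{small} = \frac{12}{1{,}000{,}000} + \frac{2}{500{,}000} + \frac{1}{200{,}000} + \frac{1}{200{,}000} + \frac{2}{1{,}000{,}000} + \frac{1}{1{,}000{,}000} = \frac{29}{1{,}000{,}000}$$
$$\text{MTTF}_{big} = 50{,}000\ \text{hours} \qquad \text{MTTF}_{small} \approx 34{,}500\ \text{hours}$$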

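Example -- Availability
Now consider the MTDL, with the enclosure as the unit of failure. Note that G = N = 5 and MTTR = 24 hours.
$$\text{MTDL}_{big} = \frac{50{,}000^2}{5 \times 4 \times 24} \approx 5{,}200{,}000\ \text{hours}$$
$$\text{MTDL}_{small} = \frac{34{,}500^2}{5 \times 4 \times 24} \approx 2{,}500{,}000\ \text{hours}$$
$$\text{Cost}_{big} = \$20{,}000 + 5 \times \$500 + 30 \times (80 \times \$10) + 5 \times \$1{,}500 = \$54{,}000$$
$$\text{Cost}_{small} = \$20{,}000 + 10 \times \$500 + 60 \times (40 \times \$10) + 5 \times \$1{,}500 = \$56{,}500$$
The big-disk system costs about $10 per 1,000 hours of operation before data loss (cost divided by MTDL).
The small-disk system costs about $23 per 1,000 hours.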
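The final comparison can be reproduced with a short sketch that uses the closed form MTTF^2 / (N x (G - 1) x MTTR) with the rounded enclosure MTTFs of 50,000 and 34,500 hours. The function names are illustrative.

```python
def mtdl_hours(mttf, mttr, n, g):
    """Mean time to data loss: MTTF^2 / (N * (G - 1) * MTTR)."""
    return mttf ** 2 / (n * (g - 1) * mttr)


def cost_per_1000_hours(cost_dollars, mtdl):
    """Dollars spent per 1,000 hours of expected time before data loss."""
    return cost_dollars / mtdl * 1000


if __name__ == "__main__":
    for label, mttf, cost in (("big", 50_000, 54_000), ("small", 34_500, 56_500)):
        mtdl = mtdl_hours(mttf, mttr=24, n=5, g=5)
        print(f"{label}: MTDL {mtdl:,.0f} hours, "
              f"${cost_per_1000_hours(cost, mtdl):.0f} per 1,000 hours")
```

Because MTDL grows with the square of the enclosure MTTF, the modest MTTF gap between the two enclosure types roughly doubles the cost per 1,000 hours.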