Storage: HDD, SSD and RAID

Similar documents
Why? Storage: HDD, SSD and RAID. Computer architecture. Computer architecture. 10 µs - 10 ms. Johan Montelius

Storage: HDD, SSD and RAID

Introduction to I/O and Disk Management

Introduction to I/O and Disk Management

I/O CANNOT BE IGNORED

COS 318: Operating Systems. Storage Devices. Vivek Pai Computer Science Department Princeton University

CSE 451: Operating Systems Spring Module 12 Secondary Storage

CSE 153 Design of Operating Systems

CSCI-GA Database Systems Lecture 8: Physical Schema: Storage

Storage Systems. Storage Systems

CSE 451: Operating Systems Spring Module 12 Secondary Storage. Steve Gribble

Key Points. Rotational delay vs seek delay Disks are slow. Techniques for making disks faster. Flash and SSDs

Hard Disk Drives (HDDs) Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Virtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili

COS 318: Operating Systems. Storage Devices. Jaswinder Pal Singh Computer Science Department Princeton University

Hard Disk Drives (HDDs)

COMP283-Lecture 3 Applied Database Management

Chapter 6. Storage and Other I/O Topics

Secondary storage. CS 537 Lecture 11 Secondary Storage. Disk trends. Another trip down memory lane

Storage Technologies and the Memory Hierarchy

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University

Storage. CS 3410 Computer System Organization & Programming

CSE 120. Operating Systems. March 27, 2014 Lecture 17. Mass Storage. Instructor: Neil Rhodes. Wednesday, March 26, 14

I/O CANNOT BE IGNORED

I/O & Storage. Jin-Soo Kim ( Computer Systems Laboratory Sungkyunkwan University

CS143: Disks and Files

CSE 153 Design of Operating Systems Fall 2018

Storage Update and Storage Best Practices for Microsoft Server Applications. Dennis Martin President, Demartek January 2009 Copyright 2009 Demartek

Computer Architecture 计算机体系结构. Lecture 6. Data Storage and I/O 第六讲 数据存储和输入输出. Chao Li, PhD. 李超博士

High-Performance Storage Systems

Chapter 10: Mass-Storage Systems

Chapter 10: Mass-Storage Systems. Operating System Concepts 9 th Edition

Input/Output. Today. Next. Principles of I/O hardware & software I/O software layers Disks. Protection & Security

Disks. Storage Technology. Vera Goebel Thomas Plagemann. Department of Informatics University of Oslo

Lecture 29. Friday, March 23 CS 470 Operating Systems - Lecture 29 1

u Covered: l Management of CPU & concurrency l Management of main memory & virtual memory u Currently --- Management of I/O devices

Storage. Hwansoo Han

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

The Long-Term Future of Solid State Storage Jim Handy Objective Analysis

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 420, York College. November 21, 2006

Mass-Storage. ICS332 - Fall 2017 Operating Systems. Henri Casanova

Chapter 6. Storage & Other I/O

Storage Systems : Disks and SSDs. Manu Awasthi CASS 2018

CS 261 Fall Mike Lam, Professor. Memory

CIT 668: System Architecture. Computer Systems Architecture

STORAGE SYSTEMS. Operating Systems 2015 Spring by Euiseong Seo

V. Mass Storage Systems

Mass-Storage Structure

Mass-Storage Structure

CS3600 SYSTEMS AND NETWORKS

Chapter 10: Mass-Storage Systems

Advanced Parallel Architecture Lesson 4 bis. Annalisa Massini /2015

Chapter 6 Storage and Other I/O Topics

PC-based data acquisition II

CSE 451: Operating Systems Winter Lecture 12 Secondary Storage. Steve Gribble 323B Sieg Hall.

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Today: Secondary Storage! Typical Disk Parameters!

Original Processor Frequency: AMD FX(tm)-6100 Six-Core Processor. CPU Thermal Design Power (TDP): Microcode Update Revision:

I/O. Disclaimer: some slides are adopted from book authors slides with permission 1

Serial ATA (SATA) Interface. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Ref: Chap 12. Secondary Storage and I/O Systems. Applied Operating System Concepts 12.1

CSE 451: Operating Systems Winter Secondary Storage. Steve Gribble. Secondary storage

How Hard Drives Work

Introduction I/O 1. I/O devices can be characterized by Behavior: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec

Administrivia. CMSC 411 Computer Systems Architecture Lecture 19 Storage Systems, cont. Disks (cont.) Disks - review

Mass-Storage Systems. Mass-Storage Systems. Disk Attachment. Disk Attachment

Flash memory talk Felton Linux Group 27 August 2016 Jim Warner

CS429: Computer Organization and Architecture

The Memory Hierarchy /18-213/15-513: Introduction to Computer Systems 11 th Lecture, October 3, Today s Instructor: Phil Gibbons

CMSC 424 Database design Lecture 12 Storage. Mihai Pop

UNIT 2 Data Center Environment

CS429: Computer Organization and Architecture

CS 554: Advanced Database System

CIS Operating Systems I/O Systems & Secondary Storage. Professor Qiang Zeng Fall 2017

Foundations of Computer Systems

File System Management

Linux Device Drivers: Case Study of a Storage Controller. Manolis Marazakis FORTH-ICS (CARV)

CISC 7310X. C11: Mass Storage. Hui Chen Department of Computer & Information Science CUNY Brooklyn College. 4/19/2018 CUNY Brooklyn College

CS370 Operating Systems

QuickSpecs. PCIe Solid State Drives for HP Workstations

A+ Guide to Hardware: Managing, Maintaining, and Troubleshooting, 5e. Chapter 6 Supporting Hard Drives

I/O, Disks, and RAID Yi Shi Fall Xi an Jiaotong University

Virtual memory - Paging

The process. one problem

Concepts Introduced. I/O Cannot Be Ignored. Typical Collection of I/O Devices. I/O Issues

Solid State Drives (SSDs) Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

A+ Guide to Hardware: Managing, Maintaining, and Troubleshooting, 5e. Chapter 6 Supporting Hard Drives

Cisco UCS C240 M4 I/O Characterization

Opportunities from our Compute, Network, and Storage Inflection Points

Chapter 12: Mass-Storage

Lecture 23. Finish-up buses Storage

Data rate - The data rate is the number of bytes per second that the drive can deliver to the CPU.

FUNCTIONS OF COMPONENTS OF A PERSONAL COMPUTER

Computer Science 61C Spring Friedland and Weaver. Input/Output

Computer Architecture Computer Science & Engineering. Chapter 6. Storage and Other I/O Topics BK TP.HCM

Chapter 8. A Typical collection of I/O devices. Interrupts. Processor. Cache. Memory I/O bus. I/O controller I/O I/O. Main memory.

ECE Enterprise Storage Architecture. Fall 2016

NAND Flash-based Storage. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

High Performance SSD & Benefit for Server Application

I/O Devices & SSD. Dongkun Shin, SKKU

Transcription:

Storage: HDD, SSD and RAID Johan Montelius KTH 2017 1 / 33

Why? Give me two reasons why we would like to have secondary storage? 2 / 33

Computer architecture Gigabyte Z170 Gaming 2 4 2 6 4 6 2 1 1 4 PCIe x16/x4 PCIe x1 USB 3.1 USB 3.0 USB 2.0 SATA-III SATA Express M.2 gigabit Ethernet DDR4 SDRAM 3 / 33

Computer architecture GPU < 1 ns SDRAM < 10 ns CPU PCIe x16 up to 128Gb/s memory bus up to 160 Gb/s USB up to 10Gb/s PCIe x4 Control Hub SATA up to 6Gb/s SAS up to 12Gb/s 10 µs - 10 ms BIOS keybord audio network 4 / 33

System architecture I/O library application POSIX API user space OS generic block interface kernel space driver driver driver device HDD SSD 70 percent of the code of an operating system is code for device drivers. 5 / 33

how to interact with a device driver I/O status command data device A register to read the status of the device. A register to instruct the device to read or write. A register that holds the data. I/O-bus could be separate from memory bus (or the same). The driver will use either special I/O instructions or regular load/store instructions. 6 / 33

if you have the time char read_from_device () { while ( STATUS == BUSY ) {} // do nothing, just wait COMMAND = READ ; while ( STATUS == BUSY ) {} // do nothing, just wait return DATA ; } 7 / 33

asynchronous I/O and interrupts int read_request ( int pid, char * buffer ) { while ( STATUS == BUSY ) {} COMMAND = READ ; interrupt - > process = pid ; interrupt - > buffer = buffer ; block_process ( pid ); } scheduler (); 8 / 33

asynchronous I/O and interrupts int int errupt_handl er () { int pid = interrupt - > pid ; *( interrupt -> buffer ) = DATA ; } ready_process ( pid ); This is very schematic, more complicated in real life. 9 / 33

process state scheduled exit start ready running exit I/O completed timeout blocked I/O initiate The kernel is interrupt driven. 10 / 33

Direct Memory Access Allow devices to read and write to buffers in physical memory. int write_request ( int pid, char * string, int size ) { while ( STATUS == BUSY ) {} memcpy ( string, buffer, size ) COMMAND = WRITE ; blocked -> pid = pid ; block_process ( pid ); scheduler (); 11 / 33

the device driver Each physical device is controlled by a device driver that provides the abstraction of a character device or block device. Block devices used as interface to disk drives that provide persistent storage. All though all storage devices are presented using the same abstraction, they have very different characteristics. To understand the challenges and options of the operating system, you should know the basics of how storage devices work. 12 / 33

Anatomy of a HDD track/cylinder sectors per track varies sector size: 4K or 512 bytes platters: 1 to 6 heads: one side or two sides Only one head at a time is used (no parallel read). 13 / 33

Sector addressing Historically sectors address by cylinder-head-sector (CHS), due to incompatibe standars the limitation was: cylinder: 1024 (10-bits) heads: 16 (4-bits) sectors per cylinder: 63 (6-bits) number of sectors: 1 Mi largest disk assuming 512 Byte sectors: 512 MiByte Today, sectors are addresses linearly 0.. n, Linear Block Addressing (LBA): 28-bit or 48-bit address up to 256 Ti sectors largest disk assuming 4 KiByte sectors: 1 PiByte > sudo hdparm -I /dev/sda > dmesg grep ata2 14 / 33

HDD - Hard Disk Drive Seagate Desktop total capacity: 2 TiByte form factor: 3.5" rotational speed: 7.200 rpm connection: SATA III cache size: 64 MiByte read throughput: 156 MByte/s ST2000DM001, aprx price, October 2016, 900:- 15 / 33

HDD - Hard Disk Drive Seagate Cheetah 15K total capacity: 600 GiByte form factor: 3.5" rotational speed: 15.000 rpm connection: SAS-3 cache size: 16 MiByte read throughput: 204 MByte/s ST3300657SS, aprx price, October 2016, 2.200:- 16 / 33

access time seek time: time to move arm to the right cylinder rotation time: time to rotate the disk read time: read one or more sectors 17 / 33

HDD - shoot out Seagate Desktop rotation speed: 7200 rpm average seek time: < 10 ms average rotation time: 4 ms average time to read a sector: < 14ms capacity: 2 TiByte aprx. price: 900:- cost capacity: 0.44 SEK/GiByte Seagate Cheeta 15K rotation speed: 15000 rpm average seek time: < 4 ms average rotation time: 2 ms average time to read a sector: < 6ms capacity: 600 GiByte aprx. price: 2.200:- cost capacity: 3.70 SEK/GiByte 18 / 33

read/write performance If a sector is 512 bytes, it takes 10ms to find and read a sector, and we want to reaad 512 MiBytes then...? Time to find first sector is less relevant. If sectors that belong to the same file are close to each other we minimize movement of arm. Rotational speed should be high. The density i.e. how many sectors in each track is important. The communication with the drive should be fast. Typical read and write performance is between 150 MiByte/s to 250 MiByte/s. 19 / 33

who s in control Historically, the Operating System was in complete control: it knew the layout cylinder-head-sector (CHS), could order data in segments that were close to each other and, would schedule disk operations to minimize arm movement. Today, the drive can often make a better decission: it knows, but might not reveal, the layout. The operating system can help in grouping operations togheter, allowing the drive to decide in what order they should be done (Native Command Queuing). There is a reason why MS-DOS is called MS-DOS. 20 / 33

SSD - Solid State Drive Samsung 850 EVO total capacity: 250 GiByte form factor: 2.5" connection: SATA III cache size: 64 MiByte random access: 30 µ s read throughput: 540 MiByte/s MZ-75E250B/EU, aprx price, October 2016, 1000:- 21 / 33

SD cards - flash memory SanDisk Ultra SDXC form factor: SDXC capacity: 64 GiByte read performance: 80 MiByte/s aprx price, October 2016, 300:- 22 / 33

NAND - flash storage memory bank erase blocks ~256 KiByte pages ~4KiByte You have constant time access to any page. You can only write to (or program) an erased page. You can only erase a block. 23 / 33

price performance Drive Capacity Price SEK/GiByte HDD Desktop 2 TiByte 900:- 44 öre HDD Performance 600 GiByte 2.200:- 3.70:- SSD Desktop 250 GiByte 1000:- 4:- 24 / 33

Bus limitations SATA-III - 6 Gb/s, most internal HDD and SSD today SAS-3-12 Gb/s, enterprise RAID HDD USB3.1-10 Gb/s, everything PCI Express 3.0 x16-128 Gb/s, what is it used for? An SSD has a read througput of 500 MiByte/s which is a... b/s? 25 / 33

SSD on the PCIe bus Intel SSD 750 Series total capacity: 400 GiByte connection: PCI Express 3.0 x4 read performance: 2200 MByte/s write performance: 900 MByte/s aprx price, October 2016, 4.599:- 26 / 33

The M.2 connector Samsung 960 EVO 500GB total capacity: 500 GiByte form factor: M.2- connection: PCI Express 3.0 x4 read performance: 3.200 MByte/s write performance: 1.800 MByte/s aprx price, November 2017, 2.400:- 27 / 33

SSD on the memory bus HP NVDIMM 8GB regular DRAM backued up by Flash total capacity: 8 GiByte form factor: DDR4 SDIM bus speed: 2133 MHz aprx price, October 2016,??? 28 / 33

Next year? Intel Optane - 3D XPoint NVDIMM in the pipe line total capacity: 512 GiByte 29 / 33

Increase capcity, performance and/or reliability Redundant Array of Independet Disks RAID Multiple disks that can provide: capacity: looks like a 20 TiByte disk but is actually 10 2TiByte disks performance: spread a file across ten drives, read and write in parallell reliability: write the same file to several disks, if one crashes - not a problem 30 / 33

the abstraction layer Alternatives: The cabinet that holds the disks present itself as one drive. A device driver in the kernel knows that we have several disks but the kernel presents it as one disk to the application layer. The application layer knows that we have several disks but provides a API to other applications that looks a single drive. 31 / 33

RAID levels RAID 0: stripe files across several drives. RAID 1: keep a complete mirror copy of each file. RAID 2-6: spread a file plus parity information across several drives. 32 / 33

Summary application layer, simple to understand system calls: open, read, write, lseek... all devices have a generic API device drivers that know what they are doing now it s a bit structured I/O and memory buses, protocols suchs as SATA, SCSI, USB etc hardware - a complete mess 33 / 33