Why? Storage: HDD, SSD and RAID. Computer architecture. Computer architecture. 10 µs - 10 ms. Johan Montelius

Similar documents
Storage: HDD, SSD and RAID

Storage: HDD, SSD and RAID

I/O CANNOT BE IGNORED

Introduction to I/O and Disk Management

Introduction to I/O and Disk Management

CSE 153 Design of Operating Systems

COS 318: Operating Systems. Storage Devices. Vivek Pai Computer Science Department Princeton University

CSE 451: Operating Systems Spring Module 12 Secondary Storage

Storage Systems. Storage Systems

CSCI-GA Database Systems Lecture 8: Physical Schema: Storage

Key Points. Rotational delay vs seek delay Disks are slow. Techniques for making disks faster. Flash and SSDs

Hard Disk Drives (HDDs) Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Hard Disk Drives (HDDs)

CSE 451: Operating Systems Spring Module 12 Secondary Storage. Steve Gribble

Secondary storage. CS 537 Lecture 11 Secondary Storage. Disk trends. Another trip down memory lane

Storage Technologies and the Memory Hierarchy

Virtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili

COS 318: Operating Systems. Storage Devices. Jaswinder Pal Singh Computer Science Department Princeton University

I/O CANNOT BE IGNORED

COMP283-Lecture 3 Applied Database Management

CSE 153 Design of Operating Systems Fall 2018

Chapter 6. Storage and Other I/O Topics

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University

Storage. CS 3410 Computer System Organization & Programming

CSE 120. Operating Systems. March 27, 2014 Lecture 17. Mass Storage. Instructor: Neil Rhodes. Wednesday, March 26, 14

I/O & Storage. Jin-Soo Kim ( Computer Systems Laboratory Sungkyunkwan University

CS143: Disks and Files

Input/Output. Today. Next. Principles of I/O hardware & software I/O software layers Disks. Protection & Security

Storage Update and Storage Best Practices for Microsoft Server Applications. Dennis Martin President, Demartek January 2009 Copyright 2009 Demartek

Storage. Hwansoo Han

Computer Architecture 计算机体系结构. Lecture 6. Data Storage and I/O 第六讲 数据存储和输入输出. Chao Li, PhD. 李超博士

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 420, York College. November 21, 2006

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

CS 261 Fall Mike Lam, Professor. Memory

High-Performance Storage Systems

Chapter 10: Mass-Storage Systems

Chapter 10: Mass-Storage Systems. Operating System Concepts 9 th Edition

Chapter 6. Storage & Other I/O

Disks. Storage Technology. Vera Goebel Thomas Plagemann. Department of Informatics University of Oslo

Lecture 29. Friday, March 23 CS 470 Operating Systems - Lecture 29 1

Advanced Parallel Architecture Lesson 4 bis. Annalisa Massini /2015

u Covered: l Management of CPU & concurrency l Management of main memory & virtual memory u Currently --- Management of I/O devices

Chapter 6 Storage and Other I/O Topics

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Original Processor Frequency: AMD FX(tm)-6100 Six-Core Processor. CPU Thermal Design Power (TDP): Microcode Update Revision:

I/O. Disclaimer: some slides are adopted from book authors slides with permission 1

Serial ATA (SATA) Interface. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

The Long-Term Future of Solid State Storage Jim Handy Objective Analysis

Mass-Storage. ICS332 - Fall 2017 Operating Systems. Henri Casanova

Introduction I/O 1. I/O devices can be characterized by Behavior: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec

Storage Systems : Disks and SSDs. Manu Awasthi CASS 2018

Ref: Chap 12. Secondary Storage and I/O Systems. Applied Operating System Concepts 12.1

CIT 668: System Architecture. Computer Systems Architecture

STORAGE SYSTEMS. Operating Systems 2015 Spring by Euiseong Seo

V. Mass Storage Systems

Administrivia. CMSC 411 Computer Systems Architecture Lecture 19 Storage Systems, cont. Disks (cont.) Disks - review

Mass-Storage Structure

Mass-Storage Structure

CS3600 SYSTEMS AND NETWORKS

Chapter 10: Mass-Storage Systems

CS429: Computer Organization and Architecture

Linux Device Drivers: Case Study of a Storage Controller. Manolis Marazakis FORTH-ICS (CARV)

PC-based data acquisition II

CS429: Computer Organization and Architecture

CSE 451: Operating Systems Winter Lecture 12 Secondary Storage. Steve Gribble 323B Sieg Hall.

CIS Operating Systems I/O Systems & Secondary Storage. Professor Qiang Zeng Fall 2017

Today: Secondary Storage! Typical Disk Parameters!

The Memory Hierarchy /18-213/15-513: Introduction to Computer Systems 11 th Lecture, October 3, Today s Instructor: Phil Gibbons

Foundations of Computer Systems

A+ Guide to Hardware: Managing, Maintaining, and Troubleshooting, 5e. Chapter 6 Supporting Hard Drives

CSE 451: Operating Systems Winter Secondary Storage. Steve Gribble. Secondary storage

How Hard Drives Work

Mass-Storage Systems. Mass-Storage Systems. Disk Attachment. Disk Attachment

Chapter 8. A Typical collection of I/O devices. Interrupts. Processor. Cache. Memory I/O bus. I/O controller I/O I/O. Main memory.

Controller PCI Card MODEL MAN UM

Concepts Introduced. I/O Cannot Be Ignored. Typical Collection of I/O Devices. I/O Issues

A+ Guide to Hardware: Managing, Maintaining, and Troubleshooting, 5e. Chapter 6 Supporting Hard Drives

Computer Science 61C Spring Friedland and Weaver. Input/Output

Data rate - The data rate is the number of bytes per second that the drive can deliver to the CPU.

Flash memory talk Felton Linux Group 27 August 2016 Jim Warner

Computer Architecture Computer Science & Engineering. Chapter 6. Storage and Other I/O Topics BK TP.HCM

I/O Devices & SSD. Dongkun Shin, SKKU

CMSC 424 Database design Lecture 12 Storage. Mihai Pop

UNIT 2 Data Center Environment

UC Santa Barbara. Operating Systems. Christopher Kruegel Department of Computer Science UC Santa Barbara

CS 554: Advanced Database System

ECE Enterprise Storage Architecture. Fall 2016

Chapter 6. Storage and Other I/O Topics

Storage systems. Computer Systems Architecture CMSC 411 Unit 6 Storage Systems. (Hard) Disks. Disk and Tape Technologies. Disks (cont.

File System Management

Readings. Storage Hierarchy III: I/O System. I/O (Disk) Performance. I/O Device Characteristics. often boring, but still quite important

CS370 Operating Systems

CISC 7310X. C11: Mass Storage. Hui Chen Department of Computer & Information Science CUNY Brooklyn College. 4/19/2018 CUNY Brooklyn College

QuickSpecs. PCIe Solid State Drives for HP Workstations

Chapter 6. Storage and Other I/O Topics. ICE3003: Computer Architecture Spring 2014 Euiseong Seo

I/O, Disks, and RAID Yi Shi Fall Xi an Jiaotong University

Virtual memory - Paging

The process. one problem

I/O Device Controllers. I/O Systems. I/O Ports & Memory-Mapped I/O. Direct Memory Access (DMA) Operating Systems 10/20/2010. CSC 256/456 Fall

Solid State Drives (SSDs) Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Transcription:

Why? Storage: HDD, SSD and RAID Johan Montelius Give me two reasons why we would like to have secondary storage? KTH 2017 1 / 33 Computer architecture 2 4 2 6 4 6 2 1 1 4 Computer architecture GPU Gigabyte Z170 Gaming PCIe x16/x4 PCIe x1 USB 3.1 USB 3.0 USB 2.0 SATA-III SATA Express M.2 gigabit Ethernet DDR4 SDRAM 2 / 33 < 1 ns SDRAM < 10 ns CPU PCIe x16 up to 128Gb/s memory bus up to 160 Gb/s USB up to 10Gb/s PCIe x4 3 / 33 BIOS Control Hub keybord audio SATA up to 6Gb/s SAS up to 12Gb/s network 10 µs - 10 ms 4 / 33

System architecture how to interact with a device application POSIX API user space driver A register to read the status of the device. I/O library OS I/O A register to instruct the device to read or write. A register that holds the data. generic block interface driver driver driver device HDD SSD kernel space status command data device I/O-bus could be separate from memory bus (or the same). The driver will use either special I/O instructions or regular load/store instructions. 70 percent of the code of an operating system is code for device drivers. 5 / 33 6 / 33 if you have the time asynchronous I/O and interrupts char read_from_device () { int read_ request ( int pid, char * buffer ) { while ( STATUS == BUSY ) {} // do nothing, just wait while ( STATUS == BUSY ) {} COMMAND = READ ; COMMAND = READ ; while ( STATUS == BUSY ) {} // do nothing, just wait interrupt - > process = pid ; interrupt - > buffer = buffer ; return DATA ; block_process ( pid ); } } scheduler (); 7 / 33 8 / 33

asynchronous I/O and interrupts process state int int errupt_handl er () { } int pid = interrupt - > pid ; *( interrupt -> buffer ) = DATA ; ready_process ( pid ); scheduled exit start ready running timeout I/O completed I/O initiate blocked exit This is very schematic, more complicated in real life. The kernel is interrupt driven. Direct Memory Access 9 / 33 the device driver 10 / 33 Allow devices to read and write to buffers in physical memory. int write_ request ( int pid, char * string, int size ) { Each physical device is controlled by a device driver that provides the abstraction of a character device or block device. while ( STATUS == BUSY ) {} Block devices used as interface to disk drives that provide persistent storage. memcpy ( string, buffer, size ) COMMAND = WRITE ; All though all storage devices are presented using the same abstraction, they have very different characteristics. blocked -> pid = pid ; block_process ( pid ); To understand the challenges and options of the operating system, you should know the basics of how storage devices work. } scheduler (); 11 / 33 12 / 33 DMA often limited to lower memory addresses.

Anatomy of a HDD Sector addressing track/cylinder sectors per track varies sector size: 4K or 512 bytes platters: 1 to 6 heads: one side or two sides Historically sectors address by cylinder-head-sector (CHS), due to incompatibe standars the limitation was: cylinder: 1024 (10-bits) heads: 16 (4-bits) sectors per cylinder: 63 (6-bits) number of sectors: 1 Mi largest disk assuming 512 Byte sectors: 512 MiByte Today, sectors are addresses linearly 0.. n, Linear Block Addressing (LBA): 28-bit or 48-bit address up to 256 Ti sectors largest disk assuming 4 KiByte sectors: 1 PiByte Only one head at a time is used (no parallel read). 13 / 33 HDD - Hard Disk Drive Seagate Desktop > sudo hdparm -I /dev/sda > dmesg grep ata2 HDD - Hard Disk Drive Seagate Cheetah 15K 14 / 33 total capacity: 2 TiByte form factor: 3.5" rotational speed: 7.200 rpm connection: SATA III cache size: 64 MiByte read throughput: 156 MByte/s total capacity: 600 GiByte form factor: 3.5" rotational speed: 15.000 rpm connection: SAS-3 cache size: 16 MiByte read throughput: 204 MByte/s ST2000DM001, aprx price, October 2016, 900:- 15 / 33 ST3300657SS, aprx price, October 2016, 2.200:- 16 / 33

access time HDD - shoot out seek time: time to move arm to the right cylinder rotation time: time to rotate the disk read time: read one or more sectors Seagate Desktop rotation speed: 7200 rpm average seek time: < 10 ms average rotation time: 4 ms average time to read a sector: < 14ms capacity: 2 TiByte aprx. price: 900:- cost capacity: 0.44 SEK/GiByte Seagate Cheeta 15K rotation speed: 15000 rpm average seek time: < 4 ms average rotation time: 2 ms average time to read a sector: < 6ms capacity: 600 GiByte aprx. price: 2.200:- cost capacity: 3.70 SEK/GiByte read/write performance 17 / 33 who s in control 18 / 33 If a sector is 512 bytes, it takes 10ms to find and read a sector, and we want to reaad 512 MiBytes then...? Time to find first sector is less relevant. If sectors that belong to the same file are close to each other we minimize movement of arm. Rotational speed should be high. The density i.e. how many sectors in each track is important. The communication with the drive should be fast. Typical read and write performance is between 150 MiByte/s to 250 MiByte/s. Historically, the Operating System was in complete control: it knew the layout cylinder-head-sector (CHS), could order data in segments that were close to each other and, would schedule disk operations to minimize arm movement. There is a reason why MS-DOS is called MS-DOS. Today, the drive can often make a better decission: it knows, but might not reveal, the layout. The operating system can help in grouping operations togheter, allowing the drive to decide in what order they should be done (Native Command Queuing). 19 / 33 20 / 33

SSD - Solid State Drive Samsung 850 EVO SD cards - flash memory SanDisk Ultra SDXC total capacity: 250 GiByte form factor: 2.5" connection: SATA III cache size: 64 MiByte random access: 30 µ s read throughput: 540 MiByte/s form factor: SDXC capacity: 64 GiByte read performance: 80 MiByte/s MZ-75E250B/EU, aprx price, October 2016, 1000:- NAND - flash storage 21 / 33 aprx price, October 2016, 300:- price performance 22 / 33 memory bank erase blocks ~256 KiByte Drive Capacity Price SEK/GiByte HDD Desktop 2 TiByte 900:- 44 öre HDD Performance 600 GiByte 2.200:- 3.70:- SSD Desktop 250 GiByte 1000:- 4:- pages ~4KiByte You have constant time access to any page. You can only write to (or program) an erased page. You can only erase a block. 23 / 33 24 / 33

Bus limitations SSD on the PCIe bus Intel SSD 750 Series SATA-III - 6 Gb/s, most internal HDD and SSD today SAS-3-12 Gb/s, enterprise RAID HDD USB3.1-10 Gb/s, everything PCI Express 3.0 x16-128 Gb/s, what is it used for? total capacity: 400 GiByte connection: PCI Express 3.0 x4 read performance: 2200 MByte/s write performance: 900 MByte/s An SSD has a read througput of 500 MiByte/s which is a... b/s? aprx price, October 2016, 4.599:- The M.2 connector Samsung 960 EVO 500GB 25 / 33 SSD on the memory bus HP NVDIMM 8GB 26 / 33 total capacity: 500 GiByte form factor: M.2- connection: PCI Express 3.0 x4 read performance: 3.200 MByte/s write performance: 1.800 MByte/s regular DRAM backued up by Flash total capacity: 8 GiByte form factor: DDR4 SDIM bus speed: 2133 MHz aprx price, November 2017, 2.400:- 27 / 33 aprx price, October 2016,??? 28 / 33

Next year? Increase capcity, performance and/or reliability Intel Optane - 3D XPoint NVDIMM in the pipe line total capacity: 512 GiByte Redundant Array of Independet Disks RAID Multiple disks that can provide: capacity: looks like a 20 TiByte disk but is actually 10 2TiByte disks performance: spread a file across ten drives, read and write in parallell reliability: write the same file to several disks, if one crashes - not a problem the abstraction layer 29 / 33 RAID levels 30 / 33 Alternatives: The cabinet that holds the disks present itself as one drive. A device driver in the kernel knows that we have several disks but the kernel presents it as one disk to the application layer. The application layer knows that we have several disks but provides a API to other applications that looks a single drive. RAID 0: stripe files across several drives. RAID 1: keep a complete mirror copy of each file. RAID 2-6: spread a file plus parity information across several drives. 31 / 33 32 / 33

Summary application layer, simple to understand system calls: open, read, write, lseek... all devices have a generic API device drivers that know what they are doing now it s a bit structured I/O and memory buses, protocols suchs as SATA, SCSI, USB etc hardware - a complete mess 33 / 33