CS143: Disks and Files

Similar documents
CSCI-GA Database Systems Lecture 8: Physical Schema: Storage

Storage. CS 3410 Computer System Organization & Programming

CS 554: Advanced Database System

COS 318: Operating Systems. Storage Devices. Vivek Pai Computer Science Department Princeton University

COMP283-Lecture 3 Applied Database Management

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University

Disks, Memories & Buffer Management

Advanced Database Systems

COS 318: Operating Systems. Storage Devices. Jaswinder Pal Singh Computer Science Department Princeton University

STORING DATA: DISK AND FILES

Chapter 6. Storage and Other I/O Topics

Data Storage and Query Answering. Data Storage and Disk Structure (2)

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Computer Architecture ECE 568

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Computer Architecture ECE 568

Advanced Database Systems

Database Architecture 2 & Storage. Instructor: Matei Zaharia cs245.stanford.edu

u Covered: l Management of CPU & concurrency l Management of main memory & virtual memory u Currently --- Management of I/O devices

Storage Systems. Storage Systems

Principles of Data Management. Lecture #2 (Storing Data: Disks and Files)

I/O CANNOT BE IGNORED

Secondary storage. CS 537 Lecture 11 Secondary Storage. Disk trends. Another trip down memory lane

CSE 153 Design of Operating Systems

Disks and Files. Storage Structures Introduction Chapter 8 (3 rd edition) Why Not Store Everything in Main Memory?

Storage and File Structure

CS370 Operating Systems

I/O CANNOT BE IGNORED

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto

Introduction to I/O and Disk Management

CS429: Computer Organization and Architecture

Introduction to I/O and Disk Management

CS3600 SYSTEMS AND NETWORKS

CS429: Computer Organization and Architecture

Database Systems II. Secondary Storage

Monday, May 4, Discs RAID: Introduction Error detection and correction Error detection: Simple parity Error correction: Hamming Codes

Today: Secondary Storage! Typical Disk Parameters!

Storage Devices for Database Systems

CSE 120. Operating Systems. March 27, 2014 Lecture 17. Mass Storage. Instructor: Neil Rhodes. Wednesday, March 26, 14

Outlines. Chapter 2 Storage Structure. Structure of a DBMS (with some simplification) Structure of a DBMS (with some simplification)

Thomas Polzer Institut für Technische Informatik

Readings. Storage Hierarchy III: I/O System. I/O (Disk) Performance. I/O Device Characteristics. often boring, but still quite important

File. File System Implementation. File Metadata. File System Implementation. Direct Memory Access Cont. Hardware background: Direct Memory Access

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Computer Architecture ECE 568

Chapter 6. Storage & Other I/O

Storing Data: Disks and Files

Mass-Storage Structure

CSCI-GA Operating Systems. I/O : Disk Scheduling and RAID. Hubertus Franke

Chapter 1 Disk Storage, Basic File Structures, and Hashing.

V. Mass Storage Systems

Contents. Memory System Overview Cache Memory. Internal Memory. Virtual Memory. Memory Hierarchy. Registers In CPU Internal or Main memory

Che-Wei Chang Department of Computer Science and Information Engineering, Chang Gung University

Administrivia. CMSC 411 Computer Systems Architecture Lecture 19 Storage Systems, cont. Disks (cont.) Disks - review

Mass-Storage Structure

Random-Access Memory (RAM) CS429: Computer Organization and Architecture. SRAM and DRAM. Flash / RAM Summary. Storage Technologies

Lecture 25: Interconnection Networks, Disks. Topics: flow control, router microarchitecture, RAID

Database Design and Programming

Data Storage and Disk Structure

Storage Technologies and the Memory Hierarchy

Storing Data: Disks and Files. Storing and Retrieving Data. Why Not Store Everything in Main Memory? Database Management Systems need to:

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring 2002

Storing Data: Disks and Files. Storing and Retrieving Data. Why Not Store Everything in Main Memory? Chapter 7

Storing and Retrieving Data. Storing Data: Disks and Files. Solution 1: Techniques for making disks faster. Disks. Why Not Store Everything in Tapes?

Virtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Storing and Retrieving Data. Storing Data: Disks and Files. Solution 1: Techniques for making disks faster. Disks. Why Not Store Everything in Tapes?

Lecture 23: Storage Systems. Topics: disk access, bus design, evaluation metrics, RAID (Sections )

CMSC 424 Database design Lecture 12 Storage. Mihai Pop

Foundations of Computer Systems

I/O, Disks, and RAID Yi Shi Fall Xi an Jiaotong University

Chapter 13 Disk Storage, Basic File Structures, and Hashing.

Adapted from instructor s supplementary material from Computer. Patterson & Hennessy, 2008, MK]

Chapter 12: Mass-Storage

Chapter 12: Mass-Storage

Appendix D: Storage Systems

Storing Data: Disks and Files

Storage. Hwansoo Han

Module 1: Basics and Background Lecture 4: Memory and Disk Accesses. The Lecture Contains: Memory organisation. Memory hierarchy. Disks.

Lecture: Storage, GPUs. Topics: disks, RAID, reliability, GPUs (Appendix D, Ch 4)

Classifying Physical Storage Media. Chapter 11: Storage and File Structure. Storage Hierarchy (Cont.) Storage Hierarchy. Magnetic Hard Disk Mechanism

Classifying Physical Storage Media. Chapter 11: Storage and File Structure. Storage Hierarchy. Storage Hierarchy (Cont.) Speed

L9: Storage Manager Physical Data Organization

Chapter 12: Mass-Storage

Computer Architecture Computer Science & Engineering. Chapter 6. Storage and Other I/O Topics BK TP.HCM

CSE 451: Operating Systems Spring Module 12 Secondary Storage

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

Chapter 10: Mass-Storage Systems

Storage systems. Computer Systems Architecture CMSC 411 Unit 6 Storage Systems. (Hard) Disks. Disk and Tape Technologies. Disks (cont.

Chapter 10: Mass-Storage Systems. Operating System Concepts 9 th Edition

Computer Systems. Memory Hierarchy. Han, Hwansoo

High Performance Computing Course Notes High Performance Storage

Database Systems. November 2, 2011 Lecture #7. topobo (mit)

Chapter 6. Storage and Other I/O Topics

Overview of Storage & Indexing (i)

A track on a magnetic disk is a concentric rings where data is stored.

Storage Technologies - 3

Computer Architecture 计算机体系结构. Lecture 6. Data Storage and I/O 第六讲 数据存储和输入输出. Chao Li, PhD. 李超博士

File Structures and Indexing

CS 33. Memory Hierarchy I. CS33 Intro to Computer Systems XVI 1 Copyright 2016 Thomas W. Doeppner. All rights reserved.

CMSC424: Database Design. Instructor: Amol Deshpande

Mass-Storage Systems. Mass-Storage Systems. Disk Attachment. Disk Attachment

Transcription:

CS143: Disks and Files 1

System Architecture CPU Word (1B 64B) ~ x GB/sec Main Memory System Bus Disk Controller... Block (512B 50KB) ~ x MB/sec Disk 2

Magnetic disk vs SSD Magnetic Disk Stores data on a magnetic disk Typical capacity: 100GB 10TB Solid State Drive Stores data in NAND flash memory Typical capacity: 100GB 1TB Much faster and more reliable than magnetic disk But, x10 more expensive and limited write cycles (~2000) 3

4

5

Structure of a Platter Track, cylinder, sector (=block, page) 6

Typical Magnetic Disk Platter diameter: 1-5 in Platters: 1 20 Tracks: 100 5000 Sectors per track: 200 5000 Sector size: 512 50K Rotation speed: 1000 15000 rpm Overall capacity: 100G 10TB Q: 2 platters, 2 surfaces/platter, 5000 tracks/surface, 1000 sectors/track, 1KB/sector. What is the overall capacity? 7

Capacity of Magnetic Disk - Capacity keeps increasing, but what about speed? 8

Access Time CPU Memory page Disk Q: How long does it take to read a page of a disk to memory? Q: What needs to be done to read a page? 9

Access Time Access time = (seek time) + (rotational delay) + (transfer time) 10

Seek Time Time to move a disk head between tracks 5-20x Time x 1 N Track to track ~ 1ms Average ~ 10 ms Full stroke ~ 20 ms Cylinders Traveled 11

Rotational Delay Head Here Block I Want Typical disk: 1000 rpm 15000 rpm Q: For 6000 RPM, average rotational delay? 12

Transfer Rate Read blocks as the platter rotates Disk Head 6000 RPM, 1000 sectors/track, 1KB/sector Q: How long to read one sector? Q: What is the transfer rate (bytes/sec)? 13

(Burst) Transfer Rate (Burst) Transfer rate = (RPM / 60) * (sectors/track) * (bytes/sector) 14

Sequential vs. Random I/O Q: How long to read 3 sequential sectors? q 6000 RPM q 1000 sectors/track q Assume the head is above the first sector 15

Sequential vs. Random I/O Q: How long to read 3 random sectors? q 6000 RPM q 1000 sectors/track q 10ms seek time q Assume the head is above the first sector 16

Random I/O For magnetic disks: Random I/O is VERY expensive compared to sequential I/O Avoid random I/O as much as we can 17

Magnetic Disk vs SSD Magnetic SSD Random IO ~100 IOs/sec ~100K IOs/sec Transfer rate ~ 100MB/sec ~500MB/sec Capacity/$ ~1TB/$100 (in 2014) ~100GB/$100 (in 2014) SSD speed gain is mainly from high random IO rate 18

RAID Redundant Array of Independent Disks Create a large-capacity disk volumes from an array of many disks Q: Possible advantages and disadvantages? 19

RAID Pros and Cons Potentially high throughput Read from multiple disks concurrently Potential reliability issues One disk failure may lead to the entire disk volume failure How should we store data into disks? Q: How should we organize the disks and store data to maximize benefit and minimize risks? 20

RAID Levels RAID 0: striping only (no redundancy) RAID 1: striping + mirroring RAID 5: striping + parity block 21

Data Modification Byte-level modification not allowed Can be modified by blocks Q: How can we modify only a part of a block? 22

Abstraction by OS (head, cylinder, sector) Sequential blocks No need to worry about head, cylinder, sector Access to non-adjacent blocks Random I/O Access to adjacent blocks Sequential I/O 1 2 3 4... 23

Buffers, Buffer pool Temporary main-memory cache for disk blocks Avoid future read Hide disk latency Most DBMS let users change buffer pool size 24

Reference Storage review disk guide http://www.storagereview.com/guide2000/ref/ hdd/index.html 25

Files: Main Problem How to store tables into disks? Jane CS 3.7 Susan ME 1.8 June EE 2.6 Tony CS 3.1 26

Spanned vs Unspanned Q: 512Byte block. 80Byte tuple. How to store? 27

Spanned vs Unspanned Unspanned T3 Spanned T1 T2 T6 T4 T5 T3 T1 T2 T4 T4 T7 T5 T6 Q: Maximum space waste for unspanned? T8 28

Variable-Length Tuples How do we store them? T2 T1 T4 T3 29

Reserved Space Reserve the maximum space for each tuple T5 T1 T2 T3 T6 Q: Any problem? T4 30

Variable-Length Space R1 R3 R2 R4 R6 R5 Pack tuples tightly Q: How do we know the end of a record? Q: What to do for delete/update? Q: How can we point to to a tuple? 31

Slotted Page Header Block Free space R4 R3 R1 R2 Q: How can we point to a tuple? 32

Long Tuples ProductReview( pid INT, reviewer VARCHAR(50), date DATE, rating INT, comments VARCHAR(1000)) Block size 512B How should we store it? 33

Long Tuples Spanning Splitting tuples Block with short attributes. T1 (a) T2 (a) Block with long attrs. T1 (b) T2 (b) T2 (b) This block may also have fixed-length slots. 34

Sequential File Tuples are ordered by certain attribute(s) (search key) Elaine CS James ME John EE Peter EE Susan CS Tony EE Search key: Name 3.7 2.8 1.8 3.9 1.0 2.4 35

Sequencing Tuples Inserting a new tuple Easy case T1 T3 T6 T8? T7 36

Sequencing Tuples Two options 1) Rearrange T1 T3 T6 T7 T8 2) Linked list T1 T7 T3 T6 T8 37

Sequencing Tuples Inserting a new tuple Difficult case T1 T4 T5 T8 T9? T7 38

Sequencing Tuples Overflow page header T1 T2 T4 T6 T7 Overflow page T8 T9 Reserving free space to avoid overflow PCTFREE in DBMS CREATE TABLE R(a int) PCTFREE 40 39

Things to Remember Disk Platter, track, cylinder, sector, block Seek time, rotational delay, transfer time Random I/O vs Sequential I/O Files Spanned/unspanned tuples Variable-length tuples (slotted page) Long tuples Sequential file and search key Problems with insertion (overflow page) PCTFREE 40