STORING DATA: DISKS AND FILES

CS 564 - Spring 2018. ACKs: Dan Suciu, Jignesh Patel, AnHai Doan

WHAT IS THIS LECTURE ABOUT? How does a DBMS store data? On disk, on SSD, and in main memory. The buffer manager controls how data moves between main memory and disk, using various replacement policies (LRU, Clock).

ARCHITECTURE OF A DBMS A query enters Query Execution, which issues data access requests to the Storage Manager, which in turn performs I/O accesses against the disk.

ARCHITECTURE OF STORAGE MANAGER The Concurrency Control Manager and the Recovery Manager sit alongside the Access Methods (sorted file, heap file, hash index, B+-tree index); the Access Methods operate on top of the Buffer Manager, which in turn uses the I/O Manager for disk access.

DATA STORAGE How does a DBMS store and access data? In main memory (fast, temporary) and on disk (slow, permanent). How do we move data from disk to main memory? Via the buffer manager. How do we organize relational data into files?

MEMORY HIERARCHY Going down the hierarchy from CPU cache to main memory to flash storage to magnetic hard disk drive, price and access speed decrease while capacity increases.

WHY NOT MAIN MEMORY? Relatively high cost, and main memory is not persistent! Typical storage hierarchy: primary storage: main memory (RAM), for currently used data; secondary storage: disk, for the main database; tertiary storage: tapes, for archiving older versions of the data.

DISK

DISKS The secondary storage device of choice. Data is stored and retrieved in units called disk blocks. The time to retrieve a disk block varies depending upon its location on disk (unlike RAM). The placement of blocks on disk has a major impact on DBMS performance!

COMPONENTS OF DISKS platter: circular hard surface on which data is stored by inducing magnetic changes. spindle: axis responsible for rotating the platters, typically at 7,200 to 15,000 RPM (rotations per minute). disk head: mechanism to read or write data. disk arm: moves to position a head on a desired track of the platter.

COMPONENTS OF DISKS Data is encoded in concentric circles of sectors called tracks. block size: a multiple of the (fixed) sector size. At any time, exactly one head can read or write a block.

ACCESSING THE DISK The unit of read or write is a block; once in memory, we refer to it as a page. The size is typically 4 KB, 8 KB, or 16 KB. access time = rotational delay + seek time + transfer time

ACCESSING THE DISK (1) access time = rotational delay + seek time + transfer time. rotational delay: time to wait for the sector to rotate under the disk head. Typical delay: 0-10 ms; maximum delay = 1 full rotation; average delay ~ half a rotation.
RPM | Average delay (ms)
5,400 | 5.56
7,200 | 4.17
10,000 | 3.00
15,000 | 2.00
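The table above can be checked directly: one rotation takes 60,000/RPM milliseconds, and the average delay is half of that. A minimal Python sketch (the function name is ours, not from the slides):

```python
def avg_rotational_delay_ms(rpm: int) -> float:
    """Average rotational delay = half of one full rotation, in milliseconds."""
    full_rotation_ms = 60_000 / rpm   # 60,000 ms per minute / rotations per minute
    return full_rotation_ms / 2

for rpm in (5_400, 7_200, 10_000, 15_000):
    print(f"{rpm:>6} RPM -> {avg_rotational_delay_ms(rpm):.2f} ms")
# prints 5.56, 4.17, 3.00, 2.00 -- matching the table above
```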

ACCESSING THE DISK (2) access time = rotational delay + seek time + transfer time. seek time: time to move the arm to position the disk head on the right track. Typical seek time: ~9 ms (~4 ms for high-end disks).

ACCESSING THE DISK (3) access time = rotational delay + seek time + transfer time. data transfer time: time to move the data to/from the disk surface. Typical rates: ~100 MB/s. The access time is dominated by the seek time and rotational delay!

EXAMPLE: SPECS Seagate HDD:
Capacity | 3 TB
RPM | 7,200
Average Seek Time | 9 ms
Max Transfer Rate | 210 MB/s
# Platters | 3
What are the I/O rates for a block size of 4 KB under a random workload (~0.3 MB/s) and a sequential workload (~210 MB/s)?

EXAMPLE: RANDOM WORKLOAD Same Seagate HDD (3 TB, 7,200 RPM, 9 ms average seek, 210 MB/s max transfer, 3 platters). For a 4 KB block:
rotational delay = 4.17 ms
seek time = 9 ms
transfer time = (4 KB) / (210 MB/s) ~ 0.019 ms
total time per block ~ 13.2 ms
I/O rate = (4 KB) / (13.2 ms) ~ 0.3 MB/s
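This arithmetic is easy to reproduce; here is a small Python sketch (variable names ours) that also shows why a sequential workload approaches the maximum transfer rate:

```python
# Seagate HDD from the slide: 7,200 RPM, 9 ms average seek, 210 MB/s max transfer.
rot_delay_ms = 0.5 * 60_000 / 7_200     # ~4.17 ms average rotational delay
seek_ms = 9.0
block_mb = 4 / 1024                     # a 4 KB block, in MB

transfer_ms = block_mb / 210.0 * 1000               # ~0.019 ms
random_ms = rot_delay_ms + seek_ms + transfer_ms    # ~13.2 ms per random block
print(f"random:     {block_mb / (random_ms / 1000):.2f} MB/s")    # ~0.30 MB/s
# Sequential access pays the seek and rotational delay only once, so the
# steady-state rate is bounded by the transfer rate alone:
print(f"sequential: {block_mb / (transfer_ms / 1000):.0f} MB/s")  # ~210 MB/s
```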

ACCESSING THE DISK Blocks in a file should be arranged sequentially on disk to minimize seek and rotational delay! The "next block" concept: blocks on the same track, followed by blocks on the same cylinder, followed by blocks on an adjacent cylinder.

MANAGING DISK SPACE Disk space is organized into files. Files are made up of pages. Pages contain records. Data is allocated/deallocated in increments of pages. Logically close pages should be physically nearby on the disk.

SSD (SOLID STATE DRIVE) SSDs use flash memory. No moving parts (no rotate/seek motors): this eliminates seek time and rotational delay, and SSDs are very low power and lightweight. Data transfer rates: 300-600 MB/s. SSDs can read data (sequential or random) very fast!

SSDS Smaller capacity (0.1-0.5x of HDD) and more expensive (~20x of HDD). Writes are much more expensive than reads (~10x). Limited lifetime: 1-10K writes per page; the average time to failure is about 6 years.

BUFFER MANAGEMENT

BUFFER MANAGER Data must be in RAM for the DBMS to operate on it, but not all pages may fit into main memory at once. The buffer manager is responsible for bringing pages from disk into main memory as needed. Pages brought into main memory reside in the buffer pool, which is partitioned into frames: slots for holding disk pages.

BUFFER MANAGER On a page request, the buffer manager returns a page held in a frame of the buffer pool, reading it from disk if necessary.

BUFFER MANAGER: REQUESTS Read(page): read a page from disk and add it to the buffer pool (if not already in the buffer). Flush(page): evict a page from the buffer pool and write it to disk. Release(page): evict a page from the buffer pool without writing it to disk.

BOOKKEEPING Bookkeeping per frame: pin count: the number of current users of the page (pinning: increment the pin count; unpinning: decrement the pin count). dirty bit: indicates whether the page has been modified; bit = 1 means that the changes to the page must be propagated to disk.

PAGE REQUEST If the page is in the buffer pool: return the address of the frame and increment the pin count. If the page is not in the buffer pool: choose a frame for replacement (one with pin count = 0); if that frame is dirty, write its page to disk; read the requested page into the chosen frame; pin the page and return the address.
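The bookkeeping and request protocol of the last two slides fit in a short sketch. This is a minimal illustration, not a real buffer manager: all names are ours, the "disk" is a dict, and the victim choice is deliberately naive (any frame with pin count 0) where a real DBMS would apply a replacement policy such as LRU or Clock:

```python
class Frame:
    """Per-frame bookkeeping: the page held, a dirty bit, and a pin count."""
    def __init__(self):
        self.page_id, self.dirty, self.pin_count = None, False, 0

class BufferManager:
    def __init__(self, num_frames, disk):
        self.frames = [Frame() for _ in range(num_frames)]
        self.disk = disk                    # simulated disk: page_id -> bytes
        self.page_table = {}                # page_id -> frame index

    def request_page(self, page_id):
        if page_id in self.page_table:      # already in the pool: pin and return
            frame = self.frames[self.page_table[page_id]]
            frame.pin_count += 1
            return frame
        for i, frame in enumerate(self.frames):
            if frame.pin_count == 0:        # candidate frame for replacement
                if frame.dirty:             # write the old page back first
                    self.disk[frame.page_id] = b"...modified page bytes..."
                if frame.page_id is not None:
                    del self.page_table[frame.page_id]
                _ = self.disk[page_id]      # read the requested page from "disk"
                frame.page_id, frame.dirty, frame.pin_count = page_id, False, 1
                self.page_table[page_id] = i
                return frame
        raise RuntimeError("all frames pinned: the request must wait")

    def unpin_page(self, page_id, modified=False):
        frame = self.frames[self.page_table[page_id]]
        frame.pin_count -= 1                     # unpin
        frame.dirty = frame.dirty or modified    # changes must reach disk later

# usage: seed a fake five-page disk, then request a page
bm = BufferManager(3, {pid: b"" for pid in "ABCDE"})
frame = bm.request_page("A")   # one "I/O" to read page A; its pin count is now 1
```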

BUFFER REPLACEMENT POLICY How do we choose a frame for replacement? LRU (Least Recently Used), Clock, MRU (Most Recently Used), FIFO, random, ... The replacement policy has a big impact on the number of I/Os (it depends on the access pattern).

LRU LRU (Least Recently Used) uses a queue of pointers to frames that have pin count = 0. A page request uses frames only from the head of the queue. When the pin count of a frame goes to 0, the frame is added to the end of the queue.
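Under those rules, the LRU bookkeeping can be sketched with a double-ended queue of unpinned frames (class and method names are ours; the slides do not spell out how a re-pinned frame leaves the queue, so that part is our assumption):

```python
from collections import deque

class LRUReplacer:
    """Queue of frame ids whose pin count has dropped to 0, oldest at the head."""
    def __init__(self):
        self.queue = deque()

    def frame_unpinned(self, frame_id):
        self.queue.append(frame_id)     # pin count hit 0: join the tail

    def frame_pinned(self, frame_id):
        if frame_id in self.queue:      # re-pinned: no longer a candidate
            self.queue.remove(frame_id)

    def choose_victim(self):
        # page requests take frames only from the head (least recently used)
        return self.queue.popleft() if self.queue else None
```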

LRU EXAMPLE Initial state: all frames empty.
frame | page | dirty | pin count
1 | - | 0 | 0
2 | - | 0 | 0
3 | - | 0 | 0
LRU queue: 1, 2, 3
Sequence of requests: request A, modify A, request B, request B, release A, request C, release B, request D, modify D, release B, request A, request E

LRU EXAMPLE Step 1, request A: one I/O to read the page.
frame | page | dirty | pin count
1 | A | 0 | 1
2 | - | 0 | 0
3 | - | 0 | 0
LRU queue: 2, 3

LRU EXAMPLE Step 2, modify A: no I/O here!
frame | page | dirty | pin count
1 | A | 1 | 1
2 | - | 0 | 0
3 | - | 0 | 0
LRU queue: 2, 3

LRU EXAMPLE Step 3, request B: one I/O to read the page.
frame | page | dirty | pin count
1 | A | 1 | 1
2 | B | 0 | 1
3 | - | 0 | 0
LRU queue: 3

LRU EXAMPLE Step 4, request B again: no I/O here; the pin count increases!
frame | page | dirty | pin count
1 | A | 1 | 1
2 | B | 0 | 2
3 | - | 0 | 0
LRU queue: 3

LRU EXAMPLE Step 5, release A: no I/O yet! Frame 1 joins the end of the queue.
frame | page | dirty | pin count
1 | A | 1 | 0
2 | B | 0 | 2
3 | - | 0 | 0
LRU queue: 3, 1

LRU EXAMPLE Step 6, request C: one I/O to read the page.
frame | page | dirty | pin count
1 | A | 1 | 0
2 | B | 0 | 2
3 | C | 0 | 1
LRU queue: 1

LRU EXAMPLE Step 7, release B: the pin count decreases.
frame | page | dirty | pin count
1 | A | 1 | 0
2 | B | 0 | 1
3 | C | 0 | 1
LRU queue: 1

LRU EXAMPLE Step 8, request D: two I/Os, one to write A to disk (it was dirty) and one to read D.
frame | page | dirty | pin count
1 | D | 0 | 1
2 | B | 0 | 1
3 | C | 0 | 1
LRU queue: (empty)

LRU EXAMPLE Step 9, modify D: no I/O here.
frame | page | dirty | pin count
1 | D | 1 | 1
2 | B | 0 | 1
3 | C | 0 | 1
LRU queue: (empty)

LRU EXAMPLE Step 10, release B: no I/O. Frame 2 joins the queue.
frame | page | dirty | pin count
1 | D | 1 | 1
2 | B | 0 | 0
3 | C | 0 | 1
LRU queue: 2

LRU EXAMPLE Step 11, request A: one I/O to read A (B was clean, so there is no write-back).
frame | page | dirty | pin count
1 | D | 1 | 1
2 | A | 0 | 1
3 | C | 0 | 1
LRU queue: (empty)

LRU EXAMPLE Step 12, request E: the buffer pool is full (no frame with pin count 0); the request must wait!
frame | page | dirty | pin count
1 | D | 1 | 1
2 | A | 0 | 1
3 | C | 0 | 1
LRU queue: (empty)

CLOCK A variant of LRU with lower memory overhead. The N frames are organized into a cycle. Each frame has a referenced bit that is set to 1 when its pin count becomes 0. A current variable (the clock hand) points to a frame. When a frame is considered: if pin count > 0, advance current; if referenced = 1, set it to 0 and advance; if referenced = 0 and pin count = 0, choose that frame.
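A minimal sketch of that sweep (structure and names ours), assuming each frame exposes a pin count and a referenced bit:

```python
def clock_choose_victim(frames, hand):
    """Advance the clock hand until an unpinned, unreferenced frame is found.

    frames: list of objects with .pin_count and .referenced attributes
    hand:   current index of the clock hand
    Returns (victim_index, new_hand_position).
    """
    # Two full cycles always suffice: the first pass clears referenced bits,
    # so the second pass must reach any unpinned frame with referenced = 0.
    for _ in range(2 * len(frames)):
        frame = frames[hand]
        if frame.pin_count > 0:            # in use: skip it
            hand = (hand + 1) % len(frames)
        elif frame.referenced:             # second chance: clear bit, move on
            frame.referenced = False
            hand = (hand + 1) % len(frames)
        else:                              # unpinned and unreferenced: victim
            return hand, (hand + 1) % len(frames)
    raise RuntimeError("all frames pinned: the request must wait")
```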

CLOCK EXAMPLE Initial state: all frames empty, all referenced bits 0.
frame | page | dirty | pin count | ref bit
1 | - | 0 | 0 | 0
2 | - | 0 | 0 | 0
3 | - | 0 | 0 | 0
Sequence of requests: request A, modify A, request B, request B, release A, request C, release B, request D, modify D, release B, request A, request E

CLOCK EXAMPLE Step 1, request A.
frame | page | dirty | pin count | ref bit
1 | A | 0 | 1 | 0
2 | - | 0 | 0 | 0
3 | - | 0 | 0 | 0

CLOCK EXAMPLE Step 2, modify A.
frame | page | dirty | pin count | ref bit
1 | A | 1 | 1 | 0
2 | - | 0 | 0 | 0
3 | - | 0 | 0 | 0

CLOCK EXAMPLE Step 3, request B.
frame | page | dirty | pin count | ref bit
1 | A | 1 | 1 | 0
2 | B | 0 | 1 | 0
3 | - | 0 | 0 | 0

CLOCK EXAMPLE Step 4, request B again: the pin count increases.
frame | page | dirty | pin count | ref bit
1 | A | 1 | 1 | 0
2 | B | 0 | 2 | 0
3 | - | 0 | 0 | 0

CLOCK EXAMPLE Step 5, release A: frame 1's pin count drops to 0, so its referenced bit is set to 1.
frame | page | dirty | pin count | ref bit
1 | A | 1 | 0 | 1
2 | B | 0 | 2 | 0
3 | - | 0 | 0 | 0

CLOCK EXAMPLE Step 6, request C.
frame | page | dirty | pin count | ref bit
1 | A | 1 | 0 | 1
2 | B | 0 | 2 | 0
3 | C | 0 | 1 | 0

CLOCK EXAMPLE Step 7, release B: the pin count decreases.
frame | page | dirty | pin count | ref bit
1 | A | 1 | 0 | 1
2 | B | 0 | 1 | 0
3 | C | 0 | 1 | 0

CLOCK EXAMPLE Step 8, request D: the clock hand does a full cycle! It clears frame 1's referenced bit on the first pass and chooses frame 1 on the second; A is written back (it was dirty) and D is read in.
frame | page | dirty | pin count | ref bit
1 | D | 0 | 1 | 0
2 | B | 0 | 1 | 0
3 | C | 0 | 1 | 0

CLOCK EXAMPLE Step 9, modify D.
frame | page | dirty | pin count | ref bit
1 | D | 1 | 1 | 0
2 | B | 0 | 1 | 0
3 | C | 0 | 1 | 0

CLOCK EXAMPLE Step 10, release B: frame 2's pin count drops to 0, so its referenced bit is set to 1.
frame | page | dirty | pin count | ref bit
1 | D | 1 | 1 | 0
2 | B | 0 | 0 | 1
3 | C | 0 | 1 | 0

CLOCK EXAMPLE Step 11, request A: the clock hand does a full cycle! It clears frame 2's referenced bit on the first pass and chooses frame 2 on the second; B is evicted (no write-back, it was clean) and A is read in.
frame | page | dirty | pin count | ref bit
1 | D | 1 | 1 | 0
2 | A | 0 | 1 | 0
3 | C | 0 | 1 | 0

CLOCK EXAMPLE Step 12, request E: all frames are pinned; the request has to wait!
frame | page | dirty | pin count | ref bit
1 | D | 1 | 1 | 0
2 | A | 0 | 1 | 0
3 | C | 0 | 1 | 0

SEQUENTIAL FLOODING: EXAMPLE 3 frames in the buffer pool; request sequence: A, B, C, D, A, B, C, D, A, B, C, D, ... With the LRU policy, every page access needs an I/O!

SEQUENTIAL FLOODING Sequential flooding: a nasty situation caused by the LRU policy combined with repeated sequential scans when # buffer frames < # pages in the file: each page request causes an I/O! MRU is much better in this situation.
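A quick simulation (ours, not from the slides) makes the contrast concrete: with 3 frames and repeated scans over 4 pages, LRU misses on every single access, while MRU misses only occasionally after the initial cold scan:

```python
from collections import OrderedDict

def count_misses(policy, requests, num_frames=3):
    """Replay a request trace against a tiny buffer pool; return the miss count."""
    pool = OrderedDict()     # page -> None, ordered least- to most-recently used
    misses = 0
    for page in requests:
        if page in pool:
            pool.move_to_end(page)                    # hit: now most recent
        else:
            misses += 1
            if len(pool) == num_frames:               # evict per the policy:
                pool.popitem(last=(policy == "MRU"))  # MRU drops the newest,
            pool[page] = None                         # LRU drops the oldest
    return misses

trace = list("ABCD") * 10    # 10 sequential scans of pages A, B, C, D
for policy in ("LRU", "MRU"):
    print(policy, count_misses(policy, trace), "misses out of", len(trace))
# LRU misses all 40 times; MRU misses far less often
```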

DBMS VS OS FILE SYSTEM Why not let the OS handle disk management? The DBMS is better at predicting reference patterns. Buffer management in a DBMS requires the ability to: pin a page in the buffer pool; force a page to disk (for recovery & concurrency); adjust the replacement policy; and pre-fetch pages based on predictable access patterns. The DBMS can also better control the overlap of I/O with computation and can leverage multiple disks more effectively.