Database Architecture 2 & Storage. Instructor: Matei Zaharia cs245.stanford.edu

Similar documents
Principles of Data Management. Lecture #2 (Storing Data: Disks and Files)

CS 554: Advanced Database System

Database Systems II. Secondary Storage

Advanced Database Systems

Data Storage and Query Answering. Data Storage and Disk Structure (2)

Storing Data: Disks and Files

CSCI-GA Database Systems Lecture 8: Physical Schema: Storage

Disks, Memories & Buffer Management

Professor: Pete Keleher! Closures, candidate keys, canonical covers etc! Armstrong axioms!

CS143: Disks and Files

CS 405G: Introduction to Database Systems. Storage

STORING DATA: DISK AND FILES

I/O CANNOT BE IGNORED

Parser. Select R.text from Report R, Weather W where W.image.rain() and W.city = R.city and W.date = R.date and R.text.

CS 261 Fall Mike Lam, Professor. Memory

CS122A: Introduction to Data Management. Lecture #14: Indexing. Instructor: Chen Li

Module 1: Basics and Background Lecture 4: Memory and Disk Accesses. The Lecture Contains: Memory organisation. Memory hierarchy. Disks.

Database Systems. November 2, 2011 Lecture #7. topobo (mit)

CSE 451: Operating Systems Spring Module 12 Secondary Storage

CSE 190D Database System Implementation

Virtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili

CS 537 Fall 2017 Review Session

Physical Data Organization. Introduction to Databases CompSci 316 Fall 2018

Storage. CS 3410 Computer System Organization & Programming

I/O CANNOT BE IGNORED

Storage Systems. Storage Systems

Storage Devices for Database Systems

Storage and File Structure

Agenda. AWS Database Services Traditional vs AWS Data services model Amazon RDS Redshift DynamoDB ElastiCache

Warehouse- Scale Computing and the BDAS Stack

Introduction to I/O and Disk Management

Introduction to I/O and Disk Management

COS 318: Operating Systems. Storage Devices. Jaswinder Pal Singh Computer Science Department Princeton University

Storage systems. Computer Systems Architecture CMSC 411 Unit 6 Storage Systems. (Hard) Disks. Disk and Tape Technologies. Disks (cont.

Administrivia. CMSC 411 Computer Systems Architecture Lecture 19 Storage Systems, cont. Disks (cont.) Disks - review

COS 318: Operating Systems. Storage Devices. Vivek Pai Computer Science Department Princeton University

u Covered: l Management of CPU & concurrency l Management of main memory & virtual memory u Currently --- Management of I/O devices

Distributed File Systems II

File Structures and Indexing

Outlines. Chapter 2 Storage Structure. Structure of a DBMS (with some simplification) Structure of a DBMS (with some simplification)

L9: Storage Manager Physical Data Organization

Disks and Files. Jim Gray s Storage Latency Analogy: How Far Away is the Data? Components of a Disk. Disks

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University

COMP283-Lecture 3 Applied Database Management

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Computer Architecture ECE 568

Disks & Files. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Introduction to Data Management. Lecture 14 (Storage and Indexing)

Contents. Memory System Overview Cache Memory. Internal Memory. Virtual Memory. Memory Hierarchy. Registers In CPU Internal or Main memory

Storing Data: Disks and Files. Storing and Retrieving Data. Why Not Store Everything in Main Memory? Database Management Systems need to:

CPSC 421 Database Management Systems. Lecture 11: Storage and File Organization

Introduction to Data Management. Lecture #13 (Indexing)

Chapter 12: Mass-Storage

CS 261 Fall Mike Lam, Professor. Memory

Disks. Storage Technology. Vera Goebel Thomas Plagemann. Department of Informatics University of Oslo

Mass-Storage Structure

Storing and Retrieving Data. Storing Data: Disks and Files. Solution 1: Techniques for making disks faster. Disks. Why Not Store Everything in Tapes?

Disks and RAID. CS 4410 Operating Systems. [R. Agarwal, L. Alvisi, A. Bracy, E. Sirer, R. Van Renesse]

Storing Data: Disks and Files. Storing and Retrieving Data. Why Not Store Everything in Main Memory? Chapter 7

BBM371- Data Management. Lecture 2: Storage Devices

Storing and Retrieving Data. Storing Data: Disks and Files. Solution 1: Techniques for making disks faster. Disks. Why Not Store Everything in Tapes?

EECS 482 Introduction to Operating Systems

Chapter 6 - External Memory

Introduction to Database Services

Monday, May 4, Discs RAID: Introduction Error detection and correction Error detection: Simple parity Error correction: Hamming Codes

Improving throughput for small disk requests with proximal I/O

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Computer Architecture ECE 568

I/O, Disks, and RAID Yi Shi Fall Xi an Jiaotong University

OS and Hardware Tuning

Wednesday, April 25, Discs RAID: Introduction Error detection and correction Error detection: Simple parity Error correction: Hamming Codes

Lecture 18: Memory Systems. Spring 2018 Jason Tang

Overview and Current Topics in Solid State Storage

OS and HW Tuning Considerations!

CS 471 Operating Systems. Yue Cheng. George Mason University Fall 2017

Storage Technologies - 3

The Role of Database Aware Flash Technologies in Accelerating Mission- Critical Databases

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 420, York College. November 21, 2006

Hewlett Packard Enterprise HPE GEN10 PERSISTENT MEMORY PERFORMANCE THROUGH PERSISTENCE

Data Storage - I: Memory Hierarchies & Disks. Contains slides from: Naci Akkök, Pål Halvorsen, Hector Garcia-Molina, Ketil Lund, Vera Goebel

CS2410: Computer Architecture. Storage systems. Sangyeun Cho. Computer Science Department University of Pittsburgh

Advanced Database Systems

CMSC 424 Database design Lecture 12 Storage. Mihai Pop

CS122 Lecture 1 Winter Term,

Readings. Storage Hierarchy III: I/O System. I/O (Disk) Performance. I/O Device Characteristics. often boring, but still quite important

Next-Generation Cloud Platform

Decentralized Distributed Storage System for Big Data

CSE 232A Graduate Database Systems

Cold Storage: The Road to Enterprise Ilya Kuznetsov YADRO

CS3600 SYSTEMS AND NETWORKS

Chapter 10: Mass-Storage Systems

Magnetic Disks. Tore Brox-Larsen Including material developed by Pål Halvorsen, University of Oslo (primarily) Kai Li, Princeton University

Adapted from instructor s supplementary material from Computer. Patterson & Hennessy, 2008, MK]

Chapter 12: Mass-Storage

Chapter 10: Mass-Storage Systems. Operating System Concepts 9 th Edition

Overview and Current Topics in Solid State Storage

Chapter 12: Mass-Storage

CSCI-UA.0201 Computer Systems Organization Memory Hierarchy

User Perspective. Module III: System Perspective. Module III: Topics Covered. Module III Overview of Storage Structures, QP, and TM

Name: Instructions. Problem 1 : Short answer. [48 points] CMU / Storage Systems 23 Feb 2011 Spring 2012 Exam 1

Physical Representation of Files

V. Mass Storage Systems

Transcription:

Database Architecture 2 & Storage Instructor: Matei Zaharia cs245.stanford.edu

Summary from Last Time System R mostly matched the architecture of a modern RDBMS» SQL» Many storage & access methods» Cost-based optimizer» Lock manager» Recovery» View-based access control CS 245 2

A Note on Recovery Methods Jim Gray, The Recovery Manager of the System R Database Manager, 1981 CS 245 3

Outline System R discussion Relational DBMS architecture Alternative architectures & tradeoffs Storage hardware CS 245 4

Typical RDBMS Architecture Query Planner User Transaction Query Parser Transaction Manager User Concurrency Control Buffer Manager Recovery Manager Lock Table File Manager Mem.Mgr. Buffers Log Indexes Data Statistics User Data System Data CS 245 5

Boundaries Some of the components have clear boundaries and interfaces for modularity» SQL language» Query plan representation (relational algebra)» Pages and buffers Other components can interact closely» Recovery + buffers + files + indexes» Transactions + indexes & other data structures» Data statistics + query optimizer CS 245 6

Differentiating by Workload Two big classes of commercial RDBMS today Transactional DBMS: focus on concurrent, small, low-latency transactions (e.g. MySQL, Postgres, Oracle, DB2) real-time apps Analytical DBMS: focus on large, parallel but mostly read-only analytics (e.g. Teradata, Redshift, Vertica) data warehouses CS 245 7

How To Design Components for Transactional vs Analytical DBMS? Component Data storage Transactional DBMS Analytical DBMS Locking Recovery CS 245 8

How To Design Components for Transactional vs Analytical DBMS? Component Data storage Locking Transactional DBMS B-trees, row oriented storage Analytical DBMS Column-oriented storage Recovery CS 245 9

How To Design Components for Transactional vs Analytical DBMS? Component Data storage Locking Recovery Transactional DBMS B-trees, row oriented storage Fine-grained, very optimized Analytical DBMS Column-oriented storage Coarse-grained (few writes) CS 245 10

How To Design Components for Transactional vs Analytical DBMS? Component Data storage Locking Recovery Transactional DBMS B-trees, row oriented storage Fine-grained, very optimized Log data writes, minimize latency Analytical DBMS Column-oriented storage Coarse-grained (few writes) Log queries CS 245 11

Outline System R discussion Relational DBMS architecture Alternative architectures & tradeoffs Storage hardware CS 245 12

How Can We Change the DBMS Architecture? CS 245 13

Decouple Query Processing from Storage Management Example: big data ecosystem (Hadoop, GFS, etc) MapReduce Processing engines File formats & metadata GFS Large-scale file systems or blob stores CS 245 Data lake architecture 14

Decouple Query Processing from Storage Management Pros:» Can scale compute independently of storage (e.g. in datacenter or public cloud)» Let different orgs develop different engines» Your data is open by default to new tech Cons:» Harder to guarantee isolation, reliability, etc» Harder to co-optimize compute and storage» Can t optimize across many compute engines» Harder to manage if too many engines! CS 245 15

Change the Data Model Key-value stores: data is just key-value pairs, don t worry about record internals Message queues: data is only accessed in a specific FIFO order; limited operations ML frameworks: data is tensors, models, etc CS 245 16

Change the Compute Model Stream processing: Apps run continuously and system can manage upgrades, scaleup, recovery, etc Eventual consistency: handle it at app level CS 245 17

Different Hardware Setting Distributed databases: need to distribute your lock manager, storage manager, etc, or find system designs that eliminate them Public cloud: serverless databases that can scale compute independently of storage (e.g. AWS Aurora, Google BigQuery) CS 245 18

AWS Aurora Serverless CS 245 19

Outline System R discussion Relational DBMS architecture Alternative architectures & tradeoffs Storage hardware CS 245 21

Typical Server CPU CPU DRAM I/O Controller Network Card Storage Devices CS 245 22...

Storage Performance Metrics latency (s) throughput (bytes/s) CPU storage capacity (bytes, bytes/$) CS 245 23

CS 245 24

Storage Latency Andromeda 10 9 Tape /Optical Robot 10 6 Disk 2,000 Years Pluto Sacramento 150 10 2 1 CS 245 Memory L2 Cache L1 Cache Registers 2 Years 2 hr This Campus 10 min This Room 1 min My Head 25

Max Attainable Throughput Varies significantly by device» 100 GB/s for RAM» 2 GB/s for NVMe SSD» 130 MB/s for hard disk Assumes large reads ( 1 block)! CS 245 26

Storage Cost $1000 at NewEgg today buys:» 0.2 TB of RAM» 9 TB of NVMe SSD» 33 TB of magnetic disk CS 245 27

Hardware Trends over Time Capacity/$ grows exponentially at a fast rate (e.g. double every 2 years) Throughput grows at a slower rate (e.g. 5% per year), but new interconnects help Latency does not improve much over time CS 245 28

Most Common Permanent Storage: Hard Disks Terms: Platter, Head, Actuator Cylinder, Track Sector (physical), Block (logical), Gap CS 245 30

Top View CS 245 31

Disk Access Time I want block X block x in memory? CS 245 32

Disk Access Time Time = Seek Time + Rotational Delay + Transfer Time + Other CS 245 33

Seek Time 3-5X Time X 1 N Cylinders Traveled CS 245 34

Typical Seek Time Ranges from» 4 ms for high end drives» 15 ms for mobile devices In contrast, SSD access time ranges from» 0.02 ms: NVMe» 0.16 ms: SATA CS 245 35

Rotational Delay Head Here Block I Want CS 245 36

Average Rotational Delay R = 1/2 revolution R=0 for SSDs Typical HDD figures HDD Spindle [rpm] Average rotational latency [ms] 4,200 7.14 5,400 5.56 7,200 4.17 10,000 3.00 15,000 2.00 Source: Wikipedia, "Hard disk drive performance characteristics" CS 245 37

Transfer Rate Transfer rate T is around 50-130 MB/s Transfer time: size / T for contiguous read Block size: usually 512-4096 bytes CS 245 38

So Far: Random Block Access What about reading the next block? CS 245 39

If we do things right (i.e., Double Buffer, Stagger Blocks ) Time to get = block size / t + negligible Potential slowdowns:» Skip gap» Next track» Discontinuous block placement Sequential access generally much faster than random access CS 245 40

Cost of Writing: Similar to Reading. unless we want to verify! need to add (full) rotation + block size / t CS 245 41

Cost To Modify a Block? To Modify Block: (a) Read Block (b) Modify in Memory (c) Write Block [(d) Verify?] CS 245 42

Performance of DRAM The same basic issues with lookup time vs throughput apply to DRAM Min read from DRAM is a cache line (64 bytes) Even 64-byte random reads may not be as fast as sequential ones due to prefetching, page table, controllers, etc Place co-accessed data together! CS 245 43

Example Suppose we re accessing 8-byte records in a DRAM with 64-byte cache line sizes How much slower is random vs sequential? In the random case, we are reading 64 bytes for every 8 bytes we need, so we expect to max out the throughput at least 8x sooner. CS 245 44

Storage Hierarchy Typically want to cache frequently accessed data at a high level of the storage hierarchy to improve performance CPU CPU Cache (KBs-MBs) DRAM (GBs) Disk (TBs) CS 245 45

Sizing Storage Tiers How much high-tier storage should we have? Can determine based on workload & cost The 5 Minute Rule for Trading Memory Accesses for Disc Accesses Jim Gray & Franco Putzolu May 1985 CS 245 46

The Five Minute Rule Say a page is accessed every X seconds Assume a disk costs D dollars and can do I operations/sec; cost of keeping this page on disk is C disk = C iop / X = D / (I X) Assume 1 MB of RAM costs M dollars and holds P pages; then the cost of keeping it in DRAM is: C mem = M / P CS 245 47

Five Minute Rule This tells us that the page is worth caching when C mem < C disk, i.e. X < Source: The Five-minute Rule Thirty Years Later and its Impact on the Storage Hierarchy CS 245 48

Disk Arrays Many flavors of RAID : striping, mirroring, etc to increase performance and reliability logically one disk CS 245 49

Common RAID Levels Striping across 2 disks: adds performance but not reliability Mirroring across 2 disks: adds reliability but not performance Striping + 1 parity disk: adds performance and reliability at lower storage cost CS 245 Image source: Wikipedia 50

Coping with Disk Failures Detection» E.g. checksum Correction» Requires redundancy CS 245 51

At What Level Do We Cope? Single Disk» E.g., error-correcting codes on read Disk Array Logical Physical CS 245 52

Operating System E.g., network-replicated storage Logical Block Copy A Copy B CS 245 53

Database System E.g., Log Current DB Last week s DB CS 245 54

Summary Storage devices offer various tradeoffs in terms of latency, throughput and cost In all cases, data layout and access pattern matter because random sequential access Most systems will combine multiple devices CS 245 55

Assignment 1 Explores the effect of data layout for a simple in-memory database» Fixed set of supported queries» Implement a row store, column store, indexed store, and your own custom store! Will be posted soon on website! CS 245 56