Advanced Database Systems


Lecture II: Storage Layer
Kyumars Sheykh Esmaili

Course Syllabus
Core topics: Storage Layer; Query Processing and Optimization; Transaction Management and Recovery.
Advanced topics: Cloud Computing and Web Databases; Parallel Databases and MapReduce; Distributed Databases; Data Stream Management Systems; Security in Databases.

Outline
DBMS Architecture
Storage Systems
Storage Management

DBMS Architecture

Database's Main Job
What: the input is an SQL statement; the output is a set of tuples, {tuples}.
How:
1. Translate SQL into a set of get/put requests to backend storage.
2. Extract, process, and transform tuples from blocks.

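End-to-End Query Processing
[Figure: an SQL statement enters the Compiler (Parser -> QGM -> Rewrite -> QGM -> Optimizer -> QGM++ -> CodeGen); the resulting Plan is executed by the Interpreter in the Runtime System, which returns {tuples}.]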

Parser
Generates a relational algebra (RA) tree for each sub-query.
Constructs a graph of trees, the Query Graph Model (QGM): nodes are subqueries; edges represent relationships between subqueries.
Uses an extended RA because SQL is more than RA: GROUP BY, ORDER BY, DISTINCT.
The parser needs schema information. Why?

SQL => RA - Example
SQL: select Title from Professor, Lecture where Name = 'Popper' and Date = 1979
RA: π_Title(σ_{Name = 'Popper' and Date = 1979}(Professor × Lecture))
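
The RA expression above can be evaluated bottom-up: cross product first, then selection, then projection. The following is a minimal sketch of that evaluation order, not how a real DBMS executes the plan. The sample rows, the second professor, and the lecture titles are hypothetical; only the attribute names Title, Name, and Date come from the slide.

```python
from itertools import product

# Hypothetical sample relations (illustration only).
professor = [{"Name": "Popper"}, {"Name": "Russell"}]
lecture = [
    {"Title": "Lecture A", "Date": 1979},
    {"Title": "Lecture B", "Date": 1980},
]

def cross(r, s):
    """Cross product: pair every row of r with every row of s."""
    return [{**a, **b} for a, b in product(r, s)]

def select(rows, predicate):
    """Selection (sigma): keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]

def project(rows, attrs):
    """Projection (pi): keep only attrs and drop duplicate tuples."""
    result = []
    for row in rows:
        t = tuple(row[a] for a in attrs)
        if t not in result:
            result.append(t)
    return result

# pi_Title(sigma_{Name = 'Popper' and Date = 1979}(Professor x Lecture))
result = project(
    select(cross(professor, lecture),
           lambda t: t["Name"] == "Popper" and t["Date"] == 1979),
    ["Title"],
)
print(result)
```

Note that the query has no join predicate, so the cross product pairs every professor with every lecture; this is exactly the kind of plan that query rewrite and optimization try to improve.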

Query Rewrite
There are many equivalent query plans.
Finding the right plan can dramatically impact performance.

Query Optimization
Mainly based on statistics.
Many, many techniques exist.
Will be discussed in a separate lecture.

Query Execution
Once the final query plan is identified, execution is rather straightforward:
Code is generated based on the plan.
The interpreter generates the output.

Components of a DB System
[Figure: naive users, expert users, application developers, and DB admins reach the DBMS through applications, ad-hoc queries, the compiler, and management tools. The DBMS contains the DML compiler, DDL compiler, query processor/optimizer, transaction (TA) management, recovery, runtime, and storage manager. The storage system holds the schema, logs, indexes, and DB catalogue.]

Storage Systems

Memory Hierarchy
Fast, but expensive and small, memory sits close to the CPU.
Larger, slower memory sits at the periphery.
We'll try to hide latency by using the fast memory as a cache.

Magnetic Disks
A stepper motor positions an array of disk heads on the requested track.
Platters (disks) rotate steadily.
Disks are managed in blocks: the system reads/writes data one block at a time.

Access Time
A magnetic disk's design has implications for the access time to read/write a given block:
1. Move the disk arm to the desired track (seek time t_s).
2. Wait for the desired block to rotate under the disk head (rotational delay t_r).
3. Read/write the data (transfer time t_tr).
Access time: t = t_s + t_r + t_tr

Access Time - Example
Notebook drive Hitachi Travelstar 7K200:
rotational speed: 7200 rpm
average seek time: 10 ms
transfer rate: 50 MB/s
512 bytes per sector
63 sectors per track
track-to-track seek time: 1 ms
What is the access time to read an 8 KB data block?
What about 1000 blocks of size 8 KB, read randomly vs. sequentially?

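Disks: Sequential vs. Random I/O
Per block: t_r = 4.17 ms (half a rotation at 7200 rpm) and t_tr = 0.16 ms (8 KB at 50 MB/s).
Random access:
t_rnd = 1000 * t = 1000 * (t_s + t_r + t_tr) = 1000 * (10 + 4.17 + 0.16) ms = 1000 * 14.33 ms = 14330 ms
Sequential access:
t_seq = t_s + t_r + 1000 * t_tr + N * t_track-to-track
      = 10 ms + 4.17 ms + 1000 * 0.16 ms + (16 * 1000)/63 * 1 ms
      = 10 ms + 4.17 ms + 160 ms + 254 ms ≈ 428 ms
Here N is the number of tracks spanned: 1000 blocks of 16 sectors each, at 63 sectors per track.
We need to consider this gap in algorithms!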
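
The arithmetic for the Travelstar example can be reproduced with a short script. This is a sketch under stated assumptions: 1 MB is taken as 10^6 bytes and 8 KB as 8192 bytes (16 sectors), and intermediate values are not rounded, so the sequential total comes out a few milliseconds above the rounded 428 ms. The function names are illustrative.

```python
# Drive parameters from the example (Hitachi Travelstar 7K200).
RPM = 7200
AVG_SEEK_MS = 10.0
TRANSFER_BYTES_PER_S = 50 * 10**6   # assumption: 1 MB = 10^6 bytes
SECTOR_BYTES = 512
SECTORS_PER_TRACK = 63
TRACK_TO_TRACK_MS = 1.0
BLOCK_BYTES = 8 * 1024              # assumption: 8 KB = 8192 bytes = 16 sectors

def rotational_delay_ms():
    """Average rotational delay t_r: half of one full rotation."""
    return 0.5 * 60_000 / RPM

def transfer_ms(n_bytes=BLOCK_BYTES):
    """Transfer time t_tr for n_bytes at the sustained transfer rate."""
    return n_bytes / TRANSFER_BYTES_PER_S * 1000

def block_access_ms():
    """t = t_s + t_r + t_tr for a single block."""
    return AVG_SEEK_MS + rotational_delay_ms() + transfer_ms()

def random_read_ms(n_blocks):
    """Every block pays the full seek, rotation, and transfer cost."""
    return n_blocks * block_access_ms()

def sequential_read_ms(n_blocks):
    """One seek and one rotational delay, then transfers plus track switches."""
    tracks = n_blocks * (BLOCK_BYTES / SECTOR_BYTES) / SECTORS_PER_TRACK
    return (AVG_SEEK_MS + rotational_delay_ms()
            + n_blocks * transfer_ms() + tracks * TRACK_TO_TRACK_MS)

print(f"one block:        {block_access_ms():.2f} ms")
print(f"1000 random:      {random_read_ms(1000):.0f} ms")
print(f"1000 sequential:  {sequential_read_ms(1000):.0f} ms")
```

Random access to the same data is more than 30 times slower than a sequential scan in this model, which is the gap algorithms need to account for.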

Performance Tricks
Track skewing: offset sector 0 of each track relative to the previous track, so that a sequential scan does not incur a rotational delay when it moves to the next track.
Request scheduling: choose the request that requires the smallest arm movement.
Zoning: divide outer tracks into more sectors than inner ones.

Evolution of Hard Disk Technology
Disk latencies have only marginally improved over the last years (10% per year).
But: throughput (i.e., transfer rates) improves by 50% per year, and hard disk capacity grows by 50% every year.
Therefore: random access cost hurts even more as time progresses.
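
To see how quickly the gap widens, the yearly rates can be compounded. This is a rough sketch that assumes "improves by x% per year" means a constant multiplicative factor each year; the 10-year horizon is a hypothetical choice for illustration.

```python
YEARS = 10  # hypothetical horizon, not from the slides

# Assumption: each rate compounds yearly.
latency_gain = 1.10 ** YEARS      # latency improves by 10% per year
throughput_gain = 1.50 ** YEARS   # throughput improves by 50% per year
gap = throughput_gain / latency_gain

print(f"latency improvement after {YEARS} years:    {latency_gain:.1f}x")
print(f"throughput improvement after {YEARS} years: {throughput_gain:.1f}x")
print(f"relative penalty of random access grows by: {gap:.1f}x")
```

Under these assumptions throughput outpaces latency by more than an order of magnitude within a decade, so each random access costs relatively more sequential bandwidth over time.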

Ways to Improve I/O Performance
The latency penalty is hard to avoid.
But: throughput can be increased rather easily by exploiting parallelism.
Idea: use multiple disks and access them in parallel.
RAID: Redundant Array of Inexpensive Disks

Disk Mirroring
Replicate data onto multiple disks.
I/O parallelism only for reads.
This is also known as RAID 1 (mirroring without parity).
Failure risk?

Disk Striping
Distribute data over disks.
Full I/O parallelism.
Also known as RAID 0 (striping without parity).
Failure risk?

Disk Striping with Parity
Distribute data and parity information over disks.
High I/O parallelism.
Also known as RAID 5 (striping with distributed parity).
Failure risk?
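
The parity idea behind RAID 5 can be illustrated with XOR: the parity block is the bytewise XOR of the data blocks in a stripe, and any single lost block is the XOR of the surviving blocks and the parity. The sketch below shows only this reconstruction step for one stripe; it does not model how parity is rotated across disks, and the block contents are made up.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe spread over three data disks (hypothetical contents).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)          # stored on a fourth disk

# Suppose the disk holding data[1] fails: rebuild its block from the rest.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered)
```

With a single parity block per stripe, one failed disk can be tolerated; a second simultaneous failure loses data.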

Solid-State Disks
Solid-state disks (SSDs) have emerged as an alternative to conventional hard disks.
Faster random reads.
Slower random writes: pages have to be erased before they can be rewritten.
Once erased, sequential writes are almost as fast as reads.
Adapting databases to these characteristics is a current research topic.

Network-Based Storage
The network is not a bottleneck any more.
Disk bandwidths: hard disk: 50 to 100 MB/s; Serial ATA: 375 MB/s.
Network bandwidths: 10 gigabit Ethernet: 1,250 MB/s; InfiniBand QDR: 12,000 MB/s.
Why not use the network for database storage?

Grid or Cloud Storage
Some big enterprises (e.g., Google, Amazon) employ clusters with thousands of commodity PCs:
Spare CPU cycles and disk space can be sold as a service.
They use massive replication for data storage.
Amazon's Elastic Compute Cloud (EC2): use Amazon's compute cluster by the hour ( 10 /hour).
Amazon's Simple Storage Service (S3): an infinite store for objects between 1 byte and 5 GB in size.


Storage Manager
Interface to the storage system.
Buffer management: handles the storage hierarchy.
Data management (files and blocks): outsmarts the OS; Oracle, Google, etc. implement their own file systems.
Keeps track of recovery logs.

Buffer Manager
The buffer manager mediates between external storage and main memory.
It manages a designated main memory area, the buffer pool, for this task.
Disk pages are brought into memory as needed.
A replacement policy decides which page to evict when the buffer is full.

Replacement Policies
The effectiveness of the buffer manager's caching functionality can depend on the replacement policy it uses, e.g.:
Least Recently Used (LRU)
LRU-k
Most Recently Used (MRU)
Random
What could be the rationales behind each of these strategies?
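
A buffer pool with LRU replacement can be sketched in a few lines. This is a simplified illustration, not a production design: it omits pin counts, dirty-page write-back, and concurrency, and the class name, method names, and the read_page callback are hypothetical.

```python
from collections import OrderedDict

class BufferPool:
    """Fixed number of frames; evicts the least recently used page when full."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity
        self.read_page = read_page      # callable: page number -> page contents
        self.frames = OrderedDict()     # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get_page(self, page_no):
        if page_no in self.frames:
            self.hits += 1
            self.frames.move_to_end(page_no)     # mark as most recently used
            return self.frames[page_no]
        self.misses += 1
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)      # evict the LRU page
        page = self.read_page(page_no)           # simulated disk read
        self.frames[page_no] = page
        return page

# Usage: a pool with two frames and a fake disk.
pool = BufferPool(2, lambda n: f"page-{n}".encode())
for p in [1, 2, 1, 3, 2]:
    pool.get_page(p)
print(pool.hits, pool.misses, list(pool.frames))
```

Swapping the eviction line for a different choice (most recently used, or a random frame) gives the other policies named above, which makes it easy to compare them on a given access trace.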

Data Manager
Maps records to pages: implements the record identifier (RID).
Implementation of indexes: B+ trees, etc.
Free space management: various schemes.

Database = { files }
A file = variable-sized sequence of blocks.
A block is the unit of transfer to disk.
A page = fixed-sized sequence of blocks.
A page contains records or index entries.
Typical page size: 8 KB.
A page is the logical unit of transfer and the unit of buffering.
Blocks of the same page are prefetched and stored on the same track on disk.

Heap Files
The most important type of file in a database.
A linked list of pages stores records in no particular order (in line with, e.g., SQL).
Problems?

Heap Files
Directory of pages: used as a space map with information about free pages.

Free Space Management
Task: find a page for a new record.
Many different heuristics are conceivable, all based on a list of pages with free space.
Append only: try to insert into the last page of the free space list; if there is no room in the last page, create a new page.
Best fit: scan through the list and find the page with the least free space that still fits the record.
First fit, next fit: scan through the list and find the first / next page that fits.
Advantages and disadvantages?
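
The heuristics differ only in how they pick a page from the free space list. The sketch below models that list as a Python list of free byte counts indexed by page number; the representation, the function names, and the sample numbers are assumptions for illustration. Next fit is left out for brevity.

```python
def first_fit(free_space, need):
    """Return the first page with at least `need` free bytes, or None."""
    for page_no, free in enumerate(free_space):
        if free >= need:
            return page_no
    return None

def best_fit(free_space, need):
    """Return the page with the least free space that still fits, or None."""
    candidates = [(free, page_no)
                  for page_no, free in enumerate(free_space) if free >= need]
    return min(candidates)[1] if candidates else None

def append_only(free_space, need, page_size=8192):
    """Use the last page if the record fits; otherwise create a new page."""
    if free_space and free_space[-1] >= need:
        return len(free_space) - 1
    free_space.append(page_size)
    return len(free_space) - 1

free = [100, 40, 300, 60]          # hypothetical free bytes per page
print(first_fit(free, 50))         # scans from the front
print(best_fit(free, 50))          # tightest page that fits
print(append_only(free, 50))       # only looks at the last page
```

First fit stops early but tends to fragment the pages at the front of the list; best fit packs pages tightly at the cost of a full scan; append only is cheapest but never reuses holes in earlier pages.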

Inside a Page
Record identifier (rid): <pageno, slotno>.
Indexes use rids to reference records.
Record position (in page): slotno × bytes per slot.

Inside a Page: Variable-sized Fields
Variable-sized fields are moved to the end of each record.
The slot directory points to the start of each record.
Create a forward address if the record won't fit on the page.
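
A page with a slot directory can be sketched as follows. This is a minimal illustration under simplifying assumptions: the slot directory is kept as a Python list next to the byte array (real systems store it inside the page), deletion and forward addresses are not handled, and all names are hypothetical.

```python
class SlottedPage:
    """Stores variable-sized records; a rid is the pair (pageno, slotno)."""

    def __init__(self, page_no, size=8192):
        self.page_no = page_no
        self.data = bytearray(size)
        self.slots = []          # slot directory: (offset, length) per record
        self.free_start = 0      # first unused byte in the page

    def insert(self, record):
        if self.free_start + len(record) > len(self.data):
            raise ValueError("record does not fit on this page")
        offset = self.free_start
        self.data[offset:offset + len(record)] = record
        self.slots.append((offset, len(record)))
        self.free_start += len(record)
        return (self.page_no, len(self.slots) - 1)

    def read(self, rid):
        page_no, slot_no = rid
        if page_no != self.page_no:
            raise ValueError("rid refers to a different page")
        offset, length = self.slots[slot_no]
        return bytes(self.data[offset:offset + length])

page = SlottedPage(page_no=7)
rid_a = page.insert(b"alice")
rid_b = page.insert(b"bob")
print(rid_b, page.read(rid_b))
```

Because an index stores only the rid, a record can be moved within its page by updating the slot directory entry, without touching any index.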

DBMS vs. OS
Buffer management and data management look very much like virtual memory and file management in operating systems.
But:
A DBMS may be much more aware of the access patterns of certain operators (-> prefetching).
Concurrency control often calls for a defined order of write operations.
Technical reasons may make OS tools unsuitable for a database (e.g., file size limitations, platform independence).

Access Patterns of Databases
Sequential (table scans): P1, P2, P3, P4, P5, ...
Hierarchical (index navigation): P1, P4, P11, P1, P4, P12, P1, P3, P8, P1, P2, P7, P1, P3, P9, ...
Random (index lookup): P13, P27, P3, P43, P15, ...
Cyclic (nested-loops join): P1, P2, P3, P4, P5, P1, P2, P3, P4, P5, P1, P2, P3, P4, P5, ...
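
The cyclic pattern shows why the replacement policy matters. The sketch below replays the nested-loops trace P1..P5 three times against a buffer with four frames (a hypothetical size chosen to be one frame too small) and counts hits under LRU and MRU. The function name and the list-based frame model are illustrative.

```python
def count_hits(accesses, capacity, policy):
    """Replay a page trace; policy is 'LRU' or 'MRU'. Returns the hit count."""
    frames = []   # ordered from least to most recently used
    hits = 0
    for page in accesses:
        if page in frames:
            hits += 1
            frames.remove(page)            # re-appended below as most recent
        elif len(frames) >= capacity:
            # Buffer full: evict the oldest (LRU) or the newest (MRU) page.
            frames.pop(0 if policy == "LRU" else -1)
        frames.append(page)
    return hits

cyclic = [1, 2, 3, 4, 5] * 3               # P1..P5 repeated, as in a nested-loops join
lru_hits = count_hits(cyclic, 4, "LRU")
mru_hits = count_hits(cyclic, 4, "MRU")
print("LRU hits:", lru_hits, "MRU hits:", mru_hits)
```

With LRU, each page is evicted just before it is needed again, so the cyclic scan never hits the buffer; MRU keeps most of the loop resident. A DBMS that knows the operator is a nested-loops join can pick the policy accordingly, which a general-purpose OS cache cannot.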

DBMS vs. OS
In fact, databases and operating systems sometimes interfere with each other:
The operating system and the buffer manager effectively buffer the same data twice.
Things get really bad if parts of the DBMS buffer get swapped out to disk by the OS VM manager.
Therefore, databases try to turn off OS functionality as much as possible, e.g., by using raw disk access instead of OS files.