
CSCI-GA.2433-001 Database Systems
Lecture 8: Physical Schema: Storage
Mohamed Zahran (aka Z)
mzahran@cs.nyu.edu
http://www.mzahran.com

View 1, View 2, View 3 / Conceptual Schema / Physical Schema / Disk. 1. Create a model of the enterprise (e.g., using ER). 2. Create a logical implementation (using a relational model and normalization). This lecture: what happens under the hood, at the disk level?

Database design steps: Requirements Analysis, Conceptual Design, Logical Design, Schema Refinement, Physical Design, Application & Security Design. Physical design uses a file system to store the relations and requires knowledge of hardware and operating-system characteristics.

First, Let's Look at a Typical Hierarchy

Level          Access time              Typical size
Registers      "instantaneous"          under 1 KB
Level 1 Cache  1-3 ns                   64 KB per core
Level 2 Cache  3-10 ns                  256 KB per core
Level 3 Cache  10-20 ns                 2-20 MB per chip
Main Memory    30-60 ns                 4-32 GB per system
Hard Disk      3,000,000-10,000,000 ns  over 1 TB

In a DBMS, we care mostly about the last two levels: main memory and disk.
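To make the gap concrete, here is a quick back-of-the-envelope calculation using the typical figures from the table above (the exact numbers vary by system):

```python
# Back-of-the-envelope: how many main-memory accesses fit in one disk access?
mem_ns = 60            # main memory access time, ns (typical figure above)
disk_ns = 10_000_000   # hard disk access time, ns (typical figure above)

print(disk_ns // mem_ns)  # about 166,000 memory accesses per disk access
```

That five-orders-of-magnitude gap is why minimizing disk accesses dominates database physical design.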

Physical Design: file structures and indexes. Criteria: storage media and performance. Storage media: hard-disk drives and solid-state disks.

Disk Drives: a drive contains several platters; each platter surface is divided into tracks, and each track into sectors. To access data: seek time (position the head over the proper track), rotational latency (wait for the desired sector to rotate under the head; determined by the RPM), transfer time (grab the data, one or more sectors).
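The three components can be combined into a simple estimator. A minimal sketch, where the function name and the 100 MB/s sustained transfer rate are illustrative assumptions:

```python
def access_time_ms(avg_seek_ms, rpm, nbytes, transfer_mb_per_s):
    """Estimated time to read nbytes: seek + rotational latency + transfer."""
    rotational_ms = 0.5 * (60_000 / rpm)            # on average, half a revolution
    transfer_ms = nbytes / (transfer_mb_per_s * 1e6) * 1e3
    return avg_seek_ms + rotational_ms + transfer_ms

# A 5,400 RPM drive with a 12 ms average seek, reading one 4 KB block
# at an assumed 100 MB/s sustained transfer rate:
print(round(access_time_ms(12, 5400, 4096, 100), 1))  # -> 17.6 (ms)
```

Note that seek and rotational latency dwarf the transfer time for a single block, which is why sequential access is so much cheaper per block than random access.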

Hard Disks: a spinning platter of special material; a mechanical arm with a read/write head that must be close to the platter to read or write; data is stored magnetically. Disks are random access, meaning data can be read or written anywhere on the disk.

A Conventional Hard Disk Structure

Hard Disk Architecture: Surface = group of tracks. Track = group of sectors. Sector = group of bytes. Cylinder = the set of corresponding tracks across all surfaces.

Disk Sectors and Access Each sector records Sector ID Data (512 bytes, 4096 bytes proposed) Error correcting code (ECC) Used to hide defects and recording errors Synchronization fields and gaps Access to a sector involves Queuing delay if other accesses are pending Seek: move the heads Rotational latency Data transfer Controller overhead

Example of a Real Disk: MKxx59GSM from Toshiba. Size: 1 TB. 5,400 RPM, 2.5-inch form factor. Number of platters: 3. Number of data heads: 6. Interface: SATA. Transfer rate to host: 3 Gbit/sec. Average seek time: 12 ms. Track-to-track seek: 2 ms.

Disks: Other Issues Average seek and rotation times are helped by locality. Disk performance improves about 10%/year Capacity increases about 60%/year Common disk interfaces/controllers: SCSI, IDE, SATA

The Disk: A View from the Top. A disk can be seen as a sequence of blocks. The physical unit of access is always a block; all blocks are of the same size (a block actually consists of one or more sectors). A file: logically, a series of records of similar or different sizes; physically, a series of not necessarily contiguous blocks. The file system helps us to: find the first block, find the last block, find the next block, find the previous block.

Relation = File: each tuple is a record in the file. Logical view: records. Physical view: blocks (on top of sectors). Assumptions: there can be several records in a block; no record spans more than one block. There is one extra layer that we will discuss shortly!

Example Relation

E#  Salary
1   1200
3   2100
4   1800
2   1200
6   2300
9   1400
8   1900

Each tuple becomes a record, and records are packed into blocks, two per block: the first block of the file holds (2, 1200) and (4, 1800), the next (1, 1200) and (3, 2100), then (8, 1900) and (9, 1400), and finally (6, 2300) with left-over space.
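The packing step can be sketched as follows; the record and block sizes are assumptions chosen so that exactly two records fit per block:

```python
# A minimal sketch of packing fixed-size records into fixed-size blocks.
RECORD_SIZE = 8    # bytes per (E#, Salary) record -- illustrative
BLOCK_SIZE = 16    # bytes per block -> two records per block

records = [(1, 1200), (3, 2100), (4, 1800), (2, 1200),
           (6, 2300), (9, 1400), (8, 1900)]

per_block = BLOCK_SIZE // RECORD_SIZE
blocks = [records[i:i + per_block] for i in range(0, len(records), per_block)]

print(len(blocks))  # -> 4 blocks; the last holds one record plus left-over space
```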

Processing a Query: What Happens Under the Hood? SELECT E# FROM R WHERE SALARY > 1500; Read all relevant blocks from disk into RAM, get the relevant information from those blocks, then, with some additional processing, produce the answer. What is the cost of this?

A Simple Cost Model. Assumptions: reading or writing a block costs one time unit; processing is free. Justification: accessing the disk is much more expensive than any reasonable CPU processing of queries. Implication: the goal is to minimize the number of block accesses. Good heuristic: organize the physical database so that you make as much use as possible of any block you read or write.
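Under this cost model, the full-table scan for the query above can be sketched like this (the function and variable names are illustrative, not any DBMS API):

```python
def scan_cost(blocks, predicate):
    """Full scan: read every block; return (matching E#s, cost in block reads)."""
    cost = 0
    results = []
    for block in blocks:            # each block read costs one time unit
        cost += 1
        for e, salary in block:     # in-memory processing is free
            if predicate(salary):
                results.append(e)
    return results, cost

blocks = [[(2, 1200), (4, 1800)], [(1, 1200), (3, 2100)],
          [(8, 1900), (9, 1400)], [(6, 2300)]]

# SELECT E# FROM R WHERE SALARY > 1500
print(scan_cost(blocks, lambda s: s > 1500))  # -> ([4, 3, 8, 6], 4)
```

A full scan always costs as many time units as the file has blocks; indexes and file organization (discussed later) exist to avoid reading blocks that contain no useful records.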

Why Not Store Everything in Main Memory? Main memory is much more expensive per byte than disk, and it is volatile: its contents are lost when power is lost. Disks provide cheap, non-volatile storage.

Example. Array in RAM: (2, 1200) (4, 1800) (1, 1200) (3, 2100) (8, 1900) (9, 1400) (6, 2300). Blocks on disk: (9, 1400)(6, 2300) | (2, 1200)(4, 1800) | (1, 1200)(3, 2100) | (8, 1900). What is the cost of accessing E# 2 and E# 9? They sit in different blocks, so two block reads. What is the cost of accessing E# 2 and E# 4? They share a block, so one block read.

What Is The Best Place for the Next Block?

RAID Disk Array: Arrangement of several disks that gives abstraction of a single, large disk. Goals: Increase performance and reliability. Two main techniques: Data striping: Data is partitioned; size of a partition is called the striping unit. Partitions are distributed over several disks. Redundancy: More disks => more failures. Redundant information allows reconstruction of data if a disk fails.

RAID Levels. Level 0: no redundancy. Level 1: mirrored (two identical copies); each disk has a mirror image. Parallel reads, but a write involves two disks. Maximum transfer rate = transfer rate of one disk. Better reliability, but no protection against data corruption or viruses.

RAID Levels. Level 0+1: striping and mirroring. Parallel reads; a write involves two disks. Maximum transfer rate = aggregate bandwidth. Level 1+0 is the same idea with the order of mirroring and striping reversed.

RAID Levels. Level 2: bit-interleaved, with Hamming codes for error correction; can recover data from single-bit corruption. Out of fashion!

RAID Levels Level 3: Byte-Interleaved with Parity Striping Unit: One byte. One check disk. Each read and write request involves all disks; disk array can process one request at a time.

RAID Levels. Level 4: block-interleaved parity. Striping unit: one disk block. One check disk. Parallel reads are possible for small requests; large requests can utilize the full bandwidth. Every write involves the modified block and the check disk, so the check disk can become a bottleneck.

RAID Levels. Level 5: block-interleaved distributed parity. Similar to RAID level 4, but the parity blocks are distributed over all disks. Level 6 is similar to level 5 but with an extra parity block, so it can tolerate two disk failures.
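The parity used by levels 3 through 6 is plain byte-wise XOR, which is what makes reconstruction after a single disk failure possible. A minimal sketch, with made-up stripe contents and a 3-data-disk + 1-parity-disk layout assumed for illustration:

```python
def parity(blocks):
    """Parity block = byte-wise XOR of the blocks in a stripe."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

# One stripe across three data disks (4-byte blocks, illustrative sizes):
stripe = [b"\x0f\x00\xff\x10", b"\xf0\x01\x0f\x20", b"\x00\x02\xf0\x40"]
p = parity(stripe)   # stored on the check disk (RAID 4) or rotated (RAID 5)

# If disk 1 fails, its block is the XOR of the surviving blocks and the parity:
recovered = parity([stripe[0], stripe[2], p])
assert recovered == stripe[1]
```

Because XOR is its own inverse, XOR-ing the survivors with the parity block always yields the missing block, whichever single disk fails.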

What Is This Story of Solid-State Disks (SSDs)? No moving parts, hence the name "solid state". Reads and writes go to a medium called NAND flash memory. Faster startup: nothing has to spin up. Extremely low read latency. Deterministic: performance does not depend on the location of the data.

BUT: Much more expensive than hard disks (~$3/GB vs. ~$0.15/GB). Limited number of write/erase cycles. Slower write speeds. High-capacity SSDs may have significantly higher power requirements. An SSD can get slower as it ages.

SSD Organization. An SSD has its own page size, and SSD pages are getting larger: 8 KB, 16 KB, 32 KB. SSDs are sector-addressable; most use 4 KB sectors.

Comparison of typical hard drives and SSDs. A more up-to-date SSD, the Samsung SSD 840 Pro Series (256 GB): read transfer rate ~510 MB/s, write transfer rate ~500 MB/s.

Layers of a DBMS, top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; then the OS, the disk controller, and the disk itself, serving the DB application. Important: there are two flavors of DBMS: one relies on the OS file system, the other uses its own storage layer (or extends the OS). Either way, data must be in RAM for the DBMS to operate on it!

Disk Space Manager. Abstractly, it deals with pages as the unit of data (read, write, allocate, deallocate); the page size is usually chosen to match the disk block size. It keeps track of which blocks are in use (e.g., with a free list or a bitmap) and which pages live on which blocks. It hides the details of the hardware and OS and lets higher-level layers think in terms of pages.
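Tracking free blocks with a bitmap can be sketched as follows (the class and method names are illustrative, not any real DBMS API):

```python
# A minimal sketch of bitmap-based free-block tracking.
class BitmapAllocator:
    def __init__(self, num_blocks):
        self.bitmap = [0] * num_blocks   # 0 = free, 1 = in use

    def allocate(self):
        """Return the number of the first free block, marking it in use."""
        for i, used in enumerate(self.bitmap):
            if not used:
                self.bitmap[i] = 1
                return i
        raise RuntimeError("disk full")

    def deallocate(self, block_no):
        self.bitmap[block_no] = 0        # mark the block free again

alloc = BitmapAllocator(4)
a = alloc.allocate()    # block 0
b = alloc.allocate()    # block 1
alloc.deallocate(a)
print(alloc.allocate())  # -> 0: the freed block is reused
```

A real disk space manager would persist the bitmap itself on disk and pack it into bits rather than a Python list, but the bookkeeping idea is the same.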

Buffer Manager. Memory may not be able to hold all the pages a query needs. The buffer manager is the software layer responsible for bringing pages from disk into main memory as needed; it implements the replacement policy. Higher levels of the DBMS code can be written without worrying about whether data pages are in memory or not.

Buffer Manager. It does its job as follows: it partitions main memory into frames (one frame holds one page); the collection of all frames is called the buffer pool. Page requests come from higher levels; on a miss, the page is read from disk into a free frame of the buffer pool, and the choice of which frame to reuse is dictated by the replacement policy.
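The frames/buffer-pool idea can be sketched with an LRU replacement policy as one possible choice; the disk is faked with a function, and all names are illustrative:

```python
# A minimal sketch of a buffer manager with LRU replacement (illustrative).
from collections import OrderedDict

class BufferManager:
    def __init__(self, num_frames, read_block):
        self.pool = OrderedDict()      # page_id -> page data, in LRU order
        self.num_frames = num_frames
        self.read_block = read_block   # stands in for a disk read
        self.disk_reads = 0

    def get_page(self, page_id):
        if page_id in self.pool:                 # hit: mark most recently used
            self.pool.move_to_end(page_id)
            return self.pool[page_id]
        if len(self.pool) >= self.num_frames:    # pool full: evict LRU frame
            self.pool.popitem(last=False)
        self.disk_reads += 1                     # miss: read page from "disk"
        self.pool[page_id] = self.read_block(page_id)
        return self.pool[page_id]

bm = BufferManager(2, read_block=lambda pid: f"page-{pid}")
bm.get_page(1); bm.get_page(2)
bm.get_page(1)              # hit: no disk read
bm.get_page(3)              # pool full: evicts page 2, the LRU page
print(bm.disk_reads)        # -> 3
```

A real buffer manager also tracks pin counts and dirty bits so that pages in use are never evicted and modified pages are written back; those are omitted here to keep the sketch short.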

Hierarchy of Data: tuples/records make up relations; relations are stored as files; files consist of pages; pages map to blocks; blocks consist of sectors.

We can try to minimize the number of block accesses for frequent queries. Indexing (oversimplification): tells us where the blocks containing useful records are. File organization (oversimplification): ensures that when you read a block, you get many useful records.

Conclusions From the application to the disk, each layer sees the data differently Disks provide cheap, non-volatile storage. Performance and cost are the main issues