What the Hekaton? In-memory OLTP Overview. Kalen Delaney


www.sqlserverinternals.com
2016-05-10

Kalen Delaney
- Background: MS in Computer Science from UC Berkeley
- Working exclusively with SQL Server for 28 years
- SQL Server In-Memory OLTP Internals (RedGate 2014)
- Primary Author: SQL Server 2012 Internals (MS Press/O'Reilly, 2013)
- Author: SQL Server Concurrency (RedGate 2010)
- Primary Author: SQL Server 2008 Internals (MS Press, 2009)
- Primary Author: Inside SQL Server 2005: Query Tuning and Optimization (MS Press, 2007)
- Author: Inside SQL Server 2005: The Storage Engine (MS Press, 2006)
- SQL Server Magazine columnist and contributing editor
- Website: www.sqlserverinternals.com
- Twitter: @sqlqueen
- Blog: www.sqlblog.com

In-Memory OLTP Overview
- SQL Server 2014 adds in-memory technology to boost performance of OLTP workloads
- Memory-optimized table and index structures
- Native compilation of business logic in stored procedures
- Latch- and lock-free data structures
- Fully integrated into SQL Server
- Stream-based storage
- Multi-versioning built into table structures
- Familiar management tools: SSMS/SMO/DMVs

Agenda
- In-memory Data Structures
- Managing Memory and Garbage Collection
- Persistence and Storage Management
- Logging and Recovery

In-Memory Data Structures
- Rows
  - New row format: structure of the row is optimized for in-memory residency and access
  - One copy of row: indexes point to rows, they do not duplicate them
- Indexes
  - Hash index for equality search
  - Memory-optimized B-tree for range and equality search
  - Do not exist on disk; recreated during recovery
- Every table must have at least one index!

In-memory Table: Row Format
Row layout: row header, followed by payload (table columns)
Row header fields:
  Begin Ts       8 bytes
  End Ts         8 bytes
  StmtId         4 bytes
  IdxLinkCount   2 bytes
  Index links    8 bytes * (IdxLinkCount - 1)
Key Points:
- Begin/End timestamp determines row's validity
- There is no data or index page, just rows
- Row size limited to 8060 bytes
  - Allows data to be moved to disk-based tables
- Not every SQL table schema is supported
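The key point that the Begin/End timestamps determine a row's validity can be sketched in a few lines of Python. This is an illustration only, not the engine's code: the class name `RowVersion`, the function `visible_payloads`, and the exact rule (a version is visible when Begin Ts <= read timestamp < End Ts) are assumptions made for the example, and infinity stands in for "not yet deleted".

```python
from dataclasses import dataclass

INFINITY = float("inf")  # assumed marker for a version that has not been deleted


@dataclass
class RowVersion:
    begin_ts: float   # timestamp at which this version became valid
    end_ts: float     # timestamp at which it stopped being valid, or INFINITY
    payload: tuple    # the table columns

    def visible_at(self, read_ts: float) -> bool:
        # Assumed rule: valid from begin_ts (inclusive) to end_ts (exclusive).
        return self.begin_ts <= read_ts < self.end_ts


def visible_payloads(versions, read_ts):
    """Return the payloads a reader at read_ts is allowed to see."""
    return [v.payload for v in versions if v.visible_at(read_ts)]


# Hypothetical sample data for the sketch.
ROWS = [
    RowVersion(50, INFINITY, ("Jane", "Prague")),
    RowVersion(100, INFINITY, ("John", "Prague")),
    RowVersion(90, 150, ("Susan", "Bogota")),
]
```

With this data, a reader at timestamp 120 sees all three rows, while a reader at timestamp 160 no longer sees the Susan row because its End Ts of 150 has passed.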

Indexes On Memory-Optimized Tables
- Indexes are what combine rows into a table
- Hash Indexes
  - Defined as NONCLUSTERED HASH
  - Predefined number of buckets (fixed memory size)
  - Good for point lookups
- Range Indexes
  - Defined as NONCLUSTERED
  - Stored as Bw-Tree
  - Good for range searches and ordered scans
- Indexes ONLY exist in memory

Hash Indexes
[Figure: a hash index on Name and a hash index on City, each pointing into chains of the same rows. Every row holds timestamps, chain pointers, Name, and City.]
  Timestamps   Name    City
  50, ∞        Jane    Prague
  100, ∞       John    Prague
  90, 150      Susan   Bogota
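A rough way to picture the hash index slides: a fixed array of buckets, each holding a chain of references to rows, with several indexes pointing at the same single copy of each row. The sketch below is a simplified model under those assumptions. The class `HashIndex` and its methods are hypothetical names, Python lists stand in for the pointer chains stored in the row headers, and Python's built-in `hash` stands in for the engine's hash function.

```python
class HashIndex:
    """Fixed-size bucket array; each bucket is a chain of row references."""

    def __init__(self, bucket_count, key_of):
        # The bucket count is fixed at creation time, as on the slide.
        self.buckets = [[] for _ in range(bucket_count)]
        self.key_of = key_of  # function extracting the index key from a row

    def _chain(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, row):
        # Store a reference to the row, never a copy of it.
        self._chain(self.key_of(row)).append(row)

    def lookup(self, key):
        # Different keys can share a bucket, so filter the chain by key.
        return [row for row in self._chain(key) if self.key_of(row) == key]


# Two indexes over the same rows (hypothetical sample data).
by_name = HashIndex(4, key_of=lambda r: r["name"])
by_city = HashIndex(4, key_of=lambda r: r["city"])

jane = {"name": "Jane", "city": "Prague"}
john = {"name": "John", "city": "Prague"}
susan = {"name": "Susan", "city": "Bogota"}
for row in (jane, john, susan):
    by_name.insert(row)
    by_city.insert(row)
```

A lookup of "Prague" in `by_city` returns the very same objects that `by_name` returns for "Jane" and "John". With too few buckets, unrelated keys pile up in the same chain and every point lookup has to walk past them, which is why the bucket count matters.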

Range Index Core Characteristics
- Resizable: grows and shrinks with utilization
- Unidirectional
- Good performance for point lookup, excellent performance for range and table scan
- Lock free
- Bw-tree based
- Leaf pages used to store index keys and pointers to chains of rows
- Value chains have same characteristics as bucket chains for hash index

Physical Bw-Tree
- Page size: up to 8K
- Logical pointers
  - Indirect physical pointers through Page Mapping table
  - Page Mapping table grows (doubles) as table grows
- Sibling pages linked in one direction
  - Require two indexes for ASC/DESC
- No in-place updates on index pages
  - Handled through delta pages or building new pages
[Figure: a Page Mapping Table whose slots (0, 1, 2, 3, ... 14, 15) point to pages; a root page (PageID-0) with keys 10, 20, 28; non-leaf pages (PageID-3, PageID-2, PageID-14) with keys 5, 8, 10, 11, 15, 18, 21, 24, 27; leaf pages with keys 1, 2, 4, 6, 7, 8, ... 25, 26, 27; each leaf key points to a chain of data rows.]
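The combination of logical page IDs, a Page Mapping table, and "no in-place updates" can be modeled as follows. This is a single-threaded sketch with hypothetical names (`PageMappingTable`, `BasePage`, `Delta`, `apply_update`, `read_keys`): an update never touches the existing page, it creates a delta record that points at the old page and then swings the mapping table slot to the delta. The `compare_and_swap` method here only imitates an atomic compare-and-swap; a real lock-free structure would rely on a hardware atomic instruction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasePage:
    keys: tuple          # immutable page contents


@dataclass(frozen=True)
class Delta:
    op: str              # "insert" or "delete"
    key: int
    next: object         # the older delta or base page this one sits on top of


class PageMappingTable:
    def __init__(self):
        self.slots = []  # logical page ID -> current physical page object

    def allocate(self, page):
        self.slots.append(page)
        return len(self.slots) - 1

    def get(self, page_id):
        return self.slots[page_id]

    def compare_and_swap(self, page_id, expected, new):
        # Simulated CAS: succeed only if nobody replaced the page meanwhile.
        if self.slots[page_id] is expected:
            self.slots[page_id] = new
            return True
        return False


def apply_update(table, page_id, op, key):
    """Prepend a delta record instead of modifying the page in place."""
    while True:
        current = table.get(page_id)
        if table.compare_and_swap(page_id, current, Delta(op, key, current)):
            return


def read_keys(table, page_id):
    """Walk the delta chain down to the base page and replay it oldest first."""
    node = table.get(page_id)
    deltas = []
    while isinstance(node, Delta):
        deltas.append(node)
        node = node.next
    keys = set(node.keys)
    for delta in reversed(deltas):
        if delta.op == "insert":
            keys.add(delta.key)
        else:
            keys.discard(delta.key)
    return sorted(keys)
```

Because parent pages refer to children by logical page ID, replacing a page only requires changing one slot in the mapping table; nothing else in the tree has to be rewritten.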

Point Lookups and Range Scans
- Point lookups similar to B-Trees
- Range scans
  - Search for starting point
  - Follow keys, duplicate chains and right page pointers until end key is reached
  - Uni-directional because pages linked in only one direction

Limitations on Tables in SQL 2014
- Optimized for in-memory
  - Rows are at most 8060 bytes; no off-row data
  - No Large Object (LOB) types like varchar(max)
- Scoping limitations
  - No FOREIGN KEY and no CHECK constraints
  - IDENTITY only (1,1)
  - No schema changes (ALTER TABLE); need to drop/recreate table
  - No add/remove index; need to drop/recreate table
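The range scan steps from the Point Lookups and Range Scans slide (find the starting point, then follow keys, duplicate chains, and right page pointers until the end key) can be sketched over a list of leaf pages. The names `LeafPage` and `range_scan` are hypothetical, and for brevity the starting leaf is found by walking the leaf level from the left rather than descending from the root as a real tree would.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeafPage:
    entries: list                       # sorted list of (key, [rows with that key])
    right: Optional["LeafPage"] = None  # pages are linked in one direction only


def range_scan(first_leaf, low, high):
    """Return all rows whose key lies in [low, high], in key order."""
    page = first_leaf
    # Search for the starting point: skip pages that end before `low`.
    while page is not None and page.entries and page.entries[-1][0] < low:
        page = page.right

    rows = []
    while page is not None:
        for key, chain in page.entries:
            if key < low:
                continue
            if key > high:
                return rows             # end key reached
            rows.extend(chain)          # follow the chain of duplicates
        page = page.right               # follow the right page pointer
    return rows


# Hypothetical leaf level with three linked pages.
leaf3 = LeafPage([(25, ["r25"]), (26, ["r26"]), (27, ["r27"])])
leaf2 = LeafPage([(6, ["r6"]), (7, ["r7a", "r7b"]), (8, ["r8"])], right=leaf3)
leaf1 = LeafPage([(1, ["r1"]), (2, ["r2"]), (4, ["r4"])], right=leaf2)
```

There is no left pointer to follow, so a descending scan cannot reuse this structure, which is the reason the slides call for two indexes when both ASC and DESC order are needed.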

Memory Management
- Table data resides in memory at all times; no paging
- Must configure SQL box with sufficient memory to store memory-optimized tables; max supported 512GB
- Failure to allocate memory will fail transactional workload at runtime
- Integrated with SQL Server memory manager and reacts to memory pressure where possible
- Integration with Resource Governor
  - Bind a database to a resource pool
  - Ensures memory consumption from recovery is accounted for
  - Hard limit (80% of physical memory) to ensure system remains stable under low-memory situations

Garbage Collection
- Stale Row Versions
  - Updates, deletes, and aborted insert operations create row versions that (eventually) are no longer visible to any transaction
  - Slows down scans of index structures
  - Creates unused memory that needs to be reclaimed (i.e. garbage collected)
- Garbage Collection (GC)
  - Analogous to version store cleanup task for disk-based tables to support Read Committed Snapshot (RCSI)
  - System maintains oldest active transaction information
- Design Goals: Non-blocking, Cooperative, Efficient, Responsive, Scalable
  - Active transactions work cooperatively and pick up parts of GC work
  - A dedicated system thread to do GC
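The role of the "oldest active transaction" in garbage collection can be shown with a small sketch. It assumes a row version is a `(begin_ts, end_ts, payload)` tuple that is visible to a reader at timestamp T when `begin_ts <= T < end_ts`; under that assumption, any version whose end timestamp is not later than the oldest active transaction's timestamp can never be seen again. The function name `collect_garbage` and the tuple layout are hypothetical.

```python
INFINITY = float("inf")  # assumed marker for a version that is still current


def collect_garbage(versions, oldest_active_ts):
    """Split versions into (live, garbage) relative to the oldest active reader.

    A version with end_ts <= oldest_active_ts is invisible to the oldest
    active transaction and to every later one, so it is safe to reclaim.
    """
    live, garbage = [], []
    for version in versions:
        begin_ts, end_ts, payload = version
        if end_ts <= oldest_active_ts:
            garbage.append(version)
        else:
            live.append(version)
    return live, garbage


# Hypothetical versions for the sketch.
VERSIONS = [
    (50, INFINITY, "Jane"),
    (90, 150, "Susan"),
    (100, INFINITY, "John"),
]
```

While a transaction that started at timestamp 120 is still active, the Susan version must be kept; once the oldest active transaction is at 150 or later, it becomes garbage. In the cooperative design described on the slide, this kind of check could be run both by the dedicated thread and by user transactions that encounter stale versions during their own scans.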

Durability
- Memory-optimized tables can be durable or non-durable
  - Default is durable
  - Non-durable tables are useful for transient data
- Durable tables are persisted in a single memory-optimized filegroup
  - Storage used for memory-optimized has a different access pattern than for disk tables
  - Filegroup can have multiple containers (volumes)
  - Additional containers aid in parallel recovery; recovery happens at the speed of I/O

On-disk Storage
- Filestream is the underlying storage mechanism
  - Checksums and single-bit correcting ECC on files
- Data files
  - ~128MB in size (unless machine has <16GB memory), write 256KB chunks at a time
  - Stores only the inserted rows (i.e. table content)
  - Chronologically organized streams of row versions
- Delta files
  - File size is not constant, write 4KB chunks at a time
  - Stores IDs of deleted rows

Storage: Data and Delta Files
[Figure: a Checkpoint File Pair covering the timestamp range 0 to 100, made up of a Data File and a Delta File.]
- Data File entries: TS (ins), RowId, TableId, row payload
- Delta File entries: TS (ins), RowId, TS (del)

Populating Data/Delta files
[Figure: an offline checkpoint thread reads the SQL transaction log (log on disk) and writes to the checkpoint files in the memory-optimized table filegroup. The file pairs cover the timestamp ranges 100-199, 200-299, 300-399, 400-499, and 500- (new inserts).]
- Insert into Hekaton table T1: new inserts go to the data file of the current range (500-)
- Del Tran1 (row TS 150): delete recorded for TS 150, in range 100-199
- Del Tran2 (row TS 450): delete recorded for TS 450, in range 400-499
- Del Tran3 (row TS 250): delete recorded for TS 250, in range 200-299
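One way to read the Populating Data/Delta files figure: each checkpoint file pair owns a range of insert timestamps, inserted rows are appended to the data file of the range their timestamp falls in, and a delete is appended to the delta file of the pair that holds the deleted row. The sketch below models that routing in memory. All names (`CheckpointFilePair`, `find_pair`, `log_insert`, `log_delete`) are hypothetical, lists stand in for files, and the routing rule is an interpretation of the figure rather than a statement of the engine's exact algorithm.

```python
INFINITY = float("inf")


class CheckpointFilePair:
    def __init__(self, low_ts, high_ts):
        self.low_ts = low_ts
        self.high_ts = high_ts
        self.data = []    # append-only: (insert_ts, row_id, table_id, payload)
        self.delta = []   # append-only: (insert_ts, row_id, delete_ts)

    def covers(self, ts):
        return self.low_ts <= ts <= self.high_ts


def find_pair(pairs, insert_ts):
    """Locate the file pair whose timestamp range contains insert_ts."""
    for pair in pairs:
        if pair.covers(insert_ts):
            return pair
    raise ValueError(f"no checkpoint file pair covers timestamp {insert_ts}")


def log_insert(pairs, insert_ts, row_id, table_id, payload):
    # Inserted row versions are only ever appended to a data file.
    find_pair(pairs, insert_ts).data.append((insert_ts, row_id, table_id, payload))


def log_delete(pairs, insert_ts, row_id, delete_ts):
    # A delete never rewrites the data file; it is recorded in the delta
    # file of the pair that contains the row's original insert.
    find_pair(pairs, insert_ts).delta.append((insert_ts, row_id, delete_ts))


def make_pairs():
    """Ranges taken from the figure: 100-199 ... 400-499, then 500 and up."""
    return [
        CheckpointFilePair(100, 199),
        CheckpointFilePair(200, 299),
        CheckpointFilePair(300, 399),
        CheckpointFilePair(400, 499),
        CheckpointFilePair(500, INFINITY),
    ]
```

Since both files are append-only, the checkpoint thread writes sequentially, which fits the streaming access pattern the Durability slide describes.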

Logging for Memory-Optimized Tables
- Uses SQL transaction log to store content
  - Each HK log record contains a log record header followed by opaque memory-optimized-specific log content
- All logging for memory-optimized tables is logical
  - No log records for physical structure modifications
  - No index-specific / index-maintenance log records
  - No UNDO information is logged

Backup for Memory-Optimized Tables
- Memory-optimized filegroup is backed up as part of the SQL database backup

Recovery for Memory-Optimized Tables
- Analysis phase
  - Finds the last completed checkpoint in transaction log
- Data load
  - Load from set of data/delta files from the last completed checkpoint
  - Parallel load by reading data/delta files using 1 thread / file
- Redo phase to apply tail of the log
  - Apply the transaction log from last checkpoint
  - Concurrent with REDO on disk-based tables
- No UNDO phase for memory-optimized tables
  - Only committed transactions are logged

Recovery: Parallel load
[Figure: Data File1/Delta File1, Data File2/Delta File2, and Data File3/Delta File3 are spread across Memory Optimized Container 1 and Memory Optimized Container 2. Each pair has its own Recovery Data Loader, which builds a delta map from the delta file and uses it as a filter on the data file before loading rows into the memory-optimized tables.]
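The data load and redo phases can be sketched as below. The code assumes each data file is a list of `(insert_ts, row_id, payload)` tuples and each delta file a list of `(insert_ts, row_id, delete_ts)` tuples; the function names `load_pair` and `recover`, the log tail format, and the use of a Python thread pool in place of the recovery data loaders are all assumptions made for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def load_pair(data_rows, delta_rows):
    """Load one data/delta file pair, filtering out deleted rows."""
    # The "delta map": identities of rows that the delta file marks as deleted.
    deleted = {(insert_ts, row_id) for insert_ts, row_id, _delete_ts in delta_rows}
    return [row for row in data_rows if (row[0], row[1]) not in deleted]


def recover(file_pairs, log_tail):
    """Rebuild table contents from checkpoint files, then redo the log tail.

    file_pairs: list of (data_rows, delta_rows) tuples
    log_tail:   list of (op, insert_ts, row_id, payload) for committed
                transactions logged after the last completed checkpoint
    """
    # Data load: one worker per file pair, mirroring "1 thread / file".
    with ThreadPoolExecutor(max_workers=max(1, len(file_pairs))) as pool:
        loaded = list(pool.map(lambda pair: load_pair(*pair), file_pairs))

    table = {}
    for rows in loaded:
        for insert_ts, row_id, payload in rows:
            table[(insert_ts, row_id)] = payload

    # Redo phase only: the log holds committed work, so there is nothing to undo.
    for op, insert_ts, row_id, payload in log_tail:
        if op == "insert":
            table[(insert_ts, row_id)] = payload
        elif op == "delete":
            table.pop((insert_ts, row_id), None)
    return table
```

Indexes do not appear in the files at all in this model; consistent with the earlier slides, they would be rebuilt in memory as rows are loaded.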

Summary
- In-memory Data Structures
- Managing Memory and Garbage Collection
- Persistence and Storage Management
- Logging and Recovery