Scalable Transaction Processing on Multicores

1 Scalable Transaction Processing on Multicores [Shore-MT & DORA]
Ippokratis Pandis, Ryan Johnson, Nikos Hardavellas, Anastasia Ailamaki
CMU & EPFL
HPTS, 26 Oct 2009

2 Multicore: cluster on a chip
[Figure: "Then" shows a small machine with one CPU and a big machine with many CPUs; "Now" shows a multicore machine with many cores on one chip]
- Parallelism of yesterday's big machine on one chip

3 Database Engine Scalability
[Figure: normalized throughput vs. # HW contexts for Ideal, Postgres, MySql, Shore, and BDB; Sun Niagara (32 contexts), insert-only microbenchmark]
- Best scalability just 30% of ideal

4 Shared Everything vs. Nothing
- Shared Everything
  - Hard to scale
- Shared Nothing
  - Multiple processes, physically separated data
  - Explicit contention control
  - Perfectly partitionable workload
  - Memory pressure: redundant data/structures
- The two approaches are complementary
- Focus on scalability of a single (shared-everything) instance

5 Shore-MT
- Multithreaded version of Shore [EDBT09] [DaMoN08]
- Why Shore?
  - State-of-the-art DBMS features
  - Two-phase row-level locking
  - ARIES-style logging/recovery
  - Shore is similar at the instruction level to commercial DBMSs
[Figure: execution time (seconds) vs. # HW contexts for MySql, Postgres, DBMS "X", BDB, and Shore-MT]
- High-performing, scalable conventional engine
- Available at:

6 Scalability on Even Higher Parallelism
[Figure: throughput per HW context vs. # HW contexts, and time spent (cpu secs) vs. # HW contexts, broken down into BPOOL, LOCK-MGR, LOG-MGR, DATA MOVEMENT, and OTHER; Sun Niagara II, TPC-C Payment]
- Lock manager overhead dominant
- Typical scenario: contention for compatible locks

7 Data-oriented Transaction Execution [PVLDB10]
- It is not the transaction which dictates what data the transaction-executing thread will access
- Break each transaction into smaller actions
  - Depending on the data they touch
- Execute actions by data-owning threads
- Distribute and privatize locking and data accesses across the chip
- New data-oriented execution model
- Reduce overhead of locking and data accesses
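Splitting a transaction into actions and handing each action to the thread that owns the data can be sketched as follows. This is a minimal, hypothetical illustration in Python, not Shore-MT or DORA code: the `RoutingRule` class, the key-range boundaries, and the `split_into_actions` helper are all assumptions made for the example, since the slides do not say how keys are mapped to threads.

```python
import bisect

class RoutingRule:
    """Hypothetical rule: maps a key of one table to the index of its owning thread."""
    def __init__(self, boundaries):
        # Sorted split points (assumed range partitioning of the key space):
        # keys below boundaries[0] belong to thread 0, and so on.
        self.boundaries = sorted(boundaries)

    def owner(self, key):
        return bisect.bisect_right(self.boundaries, key)

def split_into_actions(accesses, rules):
    """Group a transaction's (table, key, op) accesses by (table, owning thread)."""
    actions = {}
    for table, key, op in accesses:
        owner = rules[table].owner(key)
        actions.setdefault((table, owner), []).append((op, key))
    return actions

# Example: two tables, each with its own (made-up) routing rule.
rules = {"WH": RoutingRule([50]), "CUST": RoutingRule([1000, 2000])}
payment = [("WH", 7, "update"), ("CUST", 1500, "update")]
routed = split_into_actions(payment, rules)
```

Here `routed` holds one action per (table, owner) pair, so each action can be enqueued to exactly one data-owning thread.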

8 DORA vs. Conventional
[Figure: throughput (tps) vs. # of clients for DORA and BASELINE, with 2x and 20% annotations; TPC-C Payment, Sun Niagara II, 64 HW contexts]
- Avoid expensive (centralized) lock manager operations
- Intra-xaction parallelism on light loads
- Immune to the centralized lock manager
- Higher performance in the entire load spectrum

9 DORA vs. Conventional at 100% CPU
[Figure: time breakdown (%) for Baseline and DORA on TM1-Mix and TPC-C OrderStatus; components: Dora, Lock Manager Cont, Lock Manager, Other Cont, Work]
- Eliminate contention on the centralized lock manager
- Significantly reduced work (lightweight locks)

10 Roadmap
- Introduction
- Conventional execution
- Data-oriented transaction execution
- Evaluation
- Conclusions

11 Typical Lock Manager
[Figure: a lock hash table whose lock heads (L1, L2) each hold a queue of lock requests (e.g. T1, EX); each xct also keeps its own list of lock requests]
- The higher the HW parallelism:
  - Longer queues of requests
  - Longer critical sections (CSs)
  - Higher contention
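The structure on this slide (a shared hash table of lock heads, each with a queue of requests) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Shore-MT's lock manager: the class names, the two lock modes (SH and EX), and the FIFO grant policy are choices made for the example.

```python
import threading
from collections import deque

# Assumed compatibility matrix: only shared (SH) locks coexist.
COMPATIBLE = {("SH", "SH")}

class LockHead:
    """One lock's state: a latch guarding the granted list and the wait queue."""
    def __init__(self):
        self.latch = threading.Lock()   # the critical section every request enters
        self.granted = []               # (xct_id, mode) pairs currently holding the lock
        self.waiting = deque()          # queued (xct_id, mode) requests, FIFO

class CentralLockManager:
    def __init__(self):
        self.table_latch = threading.Lock()  # protects the shared hash table
        self.heads = {}                      # lock id -> LockHead

    def _head(self, key):
        with self.table_latch:
            return self.heads.setdefault(key, LockHead())

    def acquire(self, xct_id, key, mode):
        """Return True if granted immediately, False if the request was queued."""
        head = self._head(key)
        with head.latch:
            ok = not head.waiting and all(
                (held, mode) in COMPATIBLE for _, held in head.granted)
            (head.granted if ok else head.waiting).append((xct_id, mode))
            return ok

    def release(self, xct_id, key):
        """Drop xct_id's lock and return the ids of waiters that are now granted."""
        head = self._head(key)
        with head.latch:
            head.granted = [(x, m) for x, m in head.granted if x != xct_id]
            woken = []
            while head.waiting:
                x, m = head.waiting[0]
                if not all((held, m) in COMPATIBLE for _, held in head.granted):
                    break
                head.waiting.popleft()
                head.granted.append((x, m))
                woken.append(x)
            return woken
```

Note that even two compatible SH requests must both pass through `head.latch`, so more hardware contexts mean more threads serializing on the same critical sections.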

12 Conventional - Example
[Figure: transactions issuing u(wh), u(cust), u(ord) on CPU-0, CPU-1, CPU-2 over the WH, CUST, ORD tables; memory hierarchy CPU, L1, L2, MEM, I/O; I = Instruction, D = Data]

13 Conventional - Access Pattern
- Unpredictable access pattern
- Source of contention

14 Roadmap
- Introduction
- Conventional execution
- Data-oriented transaction execution
- Evaluation
- Conclusions

15 Dora - Access Pattern
- Predictable access patterns
- Optimizations possible (e.g. no centralized locks)

16 Transaction Flow Graph
- Each transaction input is a graph of Actions & RVPs
- Actions
  - Identified by: the table/index it is accessing and a subset of the primary key
- Rendezvous Points (RVPs)
  - Decision points (commit/abort)
  - Separate different phases
  - Counter of the # of actions to report
  - Last to report initiates next phase
  - Enqueue the actions of the next phase
[Figure: TPC-C Payment flow graph; Phase 1: Upd(WH), Upd(DI), Upd(CU); Phase 2: Ins(HI)]
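The rendezvous-point mechanism described on this slide can be sketched as a counter that releases the next phase when it reaches zero. The Python below is a single-threaded illustration only: the `Action` and `RVP` classes, the table names, and the key choices in `payment_flow` are assumptions for the example, and a real multithreaded engine would need an atomic decrement in `report`.

```python
from dataclasses import dataclass, field

@dataclass
class Action:
    name: str     # e.g. "Upd(WH)"
    table: str    # table/index the action accesses
    key: tuple    # subset of the primary key that identifies the action's data

@dataclass
class RVP:
    """Rendezvous point: counts reporting actions and releases the next phase."""
    remaining: int
    next_actions: list = field(default_factory=list)
    terminal: bool = False   # a terminal RVP is where commit/abort is decided

    def report(self):
        """Called once per finished action; the last reporter gets the next phase."""
        self.remaining -= 1
        if self.remaining > 0:
            return []
        return list(self.next_actions)

def payment_flow(wh, di, cu):
    """Two-phase flow graph shaped like the TPC-C Payment figure (keys are assumed)."""
    phase1 = [
        Action("Upd(WH)", "WAREHOUSE", (wh,)),
        Action("Upd(DI)", "DISTRICT", (wh, di)),
        Action("Upd(CU)", "CUSTOMER", (wh, di, cu)),
    ]
    phase2 = [Action("Ins(HI)", "HISTORY", (wh,))]
    rvp1 = RVP(remaining=len(phase1), next_actions=phase2)
    rvp2 = RVP(remaining=len(phase2), terminal=True)
    return phase1, rvp1, rvp2
```

In this sketch the three phase-1 actions can run on different threads, and whichever one reports last to `rvp1` receives the phase-2 action to enqueue.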

17 Partitions & Executors
- Partitions at each table
- Local lock table
  - Map {partof(key), LockMode}
  - List of blocked actions
- Input queue
  - New actions
- Completed queue
  - On xct commit/abort
  - Remove from local lock table
- Executor thread
  - Loop completed/input queue
- Asynchronous communication / event-based
[Figure: an executor in the DORA storage engine with its completed queue, input queue, and local lock table (columns Pref, LM, Own, Wait)]
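The executor loop above can be sketched as follows. This is a single-threaded simulation written for illustration, not the DORA implementation: the `Executor` class, its field names, and the two lock modes (SH and EX) are assumptions, and real executors would run as separate threads exchanging messages through their queues.

```python
from collections import deque

class Executor:
    """Owns one partition; only this executor touches its local lock table."""
    def __init__(self, name):
        self.name = name
        self.local_locks = {}     # key -> (mode, set of owning xct ids)
        self.blocked = deque()    # actions waiting for a local lock
        self.input = deque()      # new actions: (xct_id, key, mode)
        self.completed = deque()  # ids of committed/aborted xcts
        self.log = []             # (xct_id, key) of executed actions, for the demo

    def _try_lock(self, xct, key, mode):
        held = self.local_locks.get(key)
        if held is None:
            self.local_locks[key] = (mode, {xct})
            return True
        held_mode, owners = held
        if owners == {xct}:       # sole owner: keep or upgrade its own lock
            stronger = "EX" if "EX" in (held_mode, mode) else "SH"
            self.local_locks[key] = (stronger, owners)
            return True
        if held_mode == "SH" and mode == "SH":
            owners.add(xct)
            return True
        return False

    def step(self):
        """One loop iteration: serve the completed queue, then the input queue."""
        while self.completed:
            xct = self.completed.popleft()
            for key in [k for k, (_, o) in self.local_locks.items() if xct in o]:
                owners = self.local_locks[key][1]
                owners.discard(xct)
                if not owners:
                    del self.local_locks[key]
            # Retry blocked actions ahead of newer input, preserving their order.
            self.input.extendleft(reversed(self.blocked))
            self.blocked.clear()
        while self.input:
            xct, key, mode = self.input.popleft()
            if self._try_lock(xct, key, mode):
                self.log.append((xct, key))   # stand-in for doing the real work
            else:
                self.blocked.append((xct, key, mode))
```

Because the lock table is private to one executor, no latch is needed around it; conflicts are resolved by queueing the action locally instead of going through a centralized lock manager.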

18 Dora - Example
[Figure: a transaction's actions u(wh), u(cust), u(ord) executing on CPU-0, CPU-1, CPU-2 over the WH, CUST, ORD tables; memory hierarchy CPU, L1, L2, MEM, I/O]
- Centralized lock free
- Improved data reuse

19 Dora vs. Shared-nothing
- No physical partition of data
  - No duplicated data structures
  - Smaller memory footprint
- A single log manager
  - No need for distributed transactions
  - No need for 2PC
- Dora is NOT a shared-nothing system
  - Combines benefits of both

20 Roadmap
- Introduction
- Conventional execution
- Data-oriented transaction execution
- Evaluation
- Conclusions

21 Experimental Setup
- Hardware
  - Sun Niagara II processor
  - 8 cores with 8 HW contexts per core (64 HW ctxs)
  - 32 GB main memory
- Workloads: update-intensive, short-running transactions
  - TPC-C: 100 warehouses (13GB)
  - TM1: 1M subscribers (1.5GB)

22 Eliminating contention on the lock mgr
[Figure: throughput vs. real CPU load (%) for DORA, BASE-DLOG, and BASELINE; time breakdown (cpu secs) vs. # HW ctxs, one panel for Baseline and one for DORA, split into Useful, LockMgr Cont, LogMgr Cont, and Other Cont; Sun Niagara II, TM1]
- Eliminates contention on the lock manager
- Linear scalability to 64 HW ctxs
- Immune to oversaturation

23 Response time for single client
[Figure: normalized response time, Baseline vs. DORA]
- Exploits intra-xct parallelism
- Lower response times on low load

24 Peak Performance
[Figure: normalized peak throughput, Baseline vs. DORA, with per-bar percentage labels]
- Higher peak performance
- Always close to 100% CPU utilization

25 Roadmap
- Introduction
- Conventional execution
- Data-oriented transaction execution
- Evaluation
- Conclusions

26 Summary
- Large numbers of active threads stress the scalability of database systems
- Data-oriented transaction execution
  - Benefits of shared-nothing without physical data partitioning
  - Small modifications to a conventional storage engine
  - Higher performance across the entire load spectrum
