Scalable Low-Latency Indexes for a Key-Value Store Ankita Kejriwal


Scalable Low-Latency Indexes for a Key-Value Store Ankita Kejriwal With Arjun Gopalan, Ashish Gupta, Zhihao Jia, Stephen Yang and John Ousterhout

Conjecture Can a key value store support strongly consistent secondary indexes while operating at low latency and large scale? SLIK Slide 2

Scalable Low-latency Indexes for a Key-value Store: SLIK
- Enables multiple secondary keys for each object
- Allows lookups and range queries on these keys
Key design features:
- Scalability using independent partitioning
- Strong consistency using an ordered write approach
Implemented in RAMCloud:
- Low-latency, DRAM-based, distributed key-value store
Performance: Summary of Results
- Scalability: linear throughput increase with increasing number of partitions
- Low latency: 11-13 µs indexed reads, 29-37 µs durable writes/overwrites
- Latency approximately 2x non-indexed reads and writes
SLIK Slide 3

Talk Outline
- Motivation
- Design
- Performance
- Related Work
- Summary
SLIK Slide 4

Motivation
Traditional RDBMSs (e.g., MySQL)
NoSQL systems, relative to traditional RDBMSs: + scalability, - data models, - consistency
Systems shown on the diagram: H-Base, Espresso, RAMCloud, Yesquel, Spanner, Megastore, HyperDex, MongoDB, H-Store, Memcached, PNUTS, Tao
Diagram labels: + consistency, + data models, + low latency
SLIK Slides 5-8

Design
- Data model
- Scalability
- Strong consistency
- Storage
- Durability
- Availability
SLIK Slide 10

Scalability
- Nearly constant low latency irrespective of the server span
- Linear increase in throughput with the server span
SLIK Slide 13

Index Partitioning: Colocation
- Colocate index entries and objects
- One of the keys is used to partition both the objects and the indexes
- No association between index partitions and index key ranges

Server 1
  Indexlet (index key → primary key): a → 2, n → 3, q → 1
  Tablet (primary key, index key, value): (1, q, rose), (2, a, tulip), (3, n, violet)
Server 2
  Indexlet (index key → primary key): e → 4, g → 6, v → 5
  Tablet (primary key, index key, value): (4, e, clover), (5, v, daisy), (6, g, iris)
Server 3
  Indexlet (index key → primary key): b → 8, m → 7
  Tablet (primary key, index key, value): (7, m, lily), (8, b, dahlia)

Metadata:
- tablet & indexlet w/ pk 1 to 3: S1
- tablet & indexlet w/ pk 4 to 6: S2
- tablet & indexlet w/ pk >= 7: S3

Client query: objects with index key between m and q
Because index partitions are not tied to index key ranges, matching entries can be on any server (here m → 7 on Server 3, n → 3 and q → 1 on Server 1), so the query has to be sent to all of them.
Not scalable!
SLIK Slides 14-19
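The range query on the colocation slides can be sketched in a few lines. This is only an illustrative in-memory model, not RAMCloud or SLIK code: the server names, the dictionary layout, and the function `colocated_range_lookup` are assumptions made for the example, while the index entries are the ones shown on the slide.

```python
# Hypothetical in-memory model of the colocation layout on the slide.
# Each server holds the index entries (index key -> primary key) for
# exactly the objects stored in its own tablet.
COLOCATED_SERVERS = {
    "S1": {"a": 2, "n": 3, "q": 1},
    "S2": {"e": 4, "g": 6, "v": 5},
    "S3": {"b": 8, "m": 7},
}


def colocated_range_lookup(servers, first_key, last_key):
    """Return (sorted matches, servers contacted) for an inclusive key range."""
    contacted = []
    matches = []
    for name, indexlet in servers.items():
        # No server can be skipped: entries are placed by primary key,
        # so any server may hold index keys inside the requested range.
        contacted.append(name)
        for index_key, pk in indexlet.items():
            if first_key <= index_key <= last_key:
                matches.append((index_key, pk))
    return sorted(matches), contacted


matches, contacted = colocated_range_lookup(COLOCATED_SERVERS, "m", "q")
print(matches)    # [('m', 7), ('n', 3), ('q', 1)]
print(contacted)  # ['S1', 'S2', 'S3']
```

In this model the number of servers contacted grows with the size of the cluster even though only three entries match, which is the behaviour the slide labels as not scalable.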

Index Partitioning: Independent
- Partition each index and table independently
- Partition each index according to the sort order for that index

Server 4
  Indexlet (index key → primary key): a → 2, b → 8, e → 4, g → 6
Server 5
  Indexlet (index key → primary key): m → 7, n → 3, q → 1, v → 5
Server 1
  Tablet (primary key, index key, value): (1, q, rose), (2, a, tulip), (3, n, violet)
Server 2
  Tablet (primary key, index key, value): (4, e, clover), (5, v, daisy), (6, g, iris)
Server 3
  Tablet (primary key, index key, value): (7, m, lily), (8, b, dahlia)

Metadata:
- tablet w/ pk 1 to 3: S1
- tablet w/ pk 4 to 6: S2
- tablet w/ pk >= 7: S3
- indexlet w/ sk a to g: S4
- indexlet w/ sk >= h: S5

Client query: objects with index key between m and q
The metadata maps the range m to q to the indexlet on Server 5, which returns primary keys 7, 3 and 1; the objects are then read from the tablets that own those primary keys (Server 3 and Server 1).
Scalable!
SLIK Slides 20-25
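The same query under independent partitioning can be sketched as a two-step lookup: find the indexlet that owns the key range, then fetch the objects by primary key. Again this is a toy model under stated assumptions: the constant names, the start-key representation of the metadata, and the functions `owner` and `independent_range_lookup` are hypothetical, and only the data and range boundaries come from the slide.

```python
import bisect

# Hypothetical metadata mirroring the slide: indexlets are split by
# secondary (index) key range, tablets by primary key range.
INDEXLET_STARTS = ["a", "h"]   # S4 holds sk a..g, S5 holds sk >= h
INDEXLET_SERVERS = ["S4", "S5"]
TABLET_STARTS = [1, 4, 7]      # S1: pk 1-3, S2: pk 4-6, S3: pk >= 7
TABLET_SERVERS = ["S1", "S2", "S3"]

INDEXLETS = {
    "S4": [("a", 2), ("b", 8), ("e", 4), ("g", 6)],
    "S5": [("m", 7), ("n", 3), ("q", 1), ("v", 5)],
}
TABLETS = {
    "S1": {1: ("q", "rose"), 2: ("a", "tulip"), 3: ("n", "violet")},
    "S2": {4: ("e", "clover"), 5: ("v", "daisy"), 6: ("g", "iris")},
    "S3": {7: ("m", "lily"), 8: ("b", "dahlia")},
}


def owner(starts, servers, key):
    """Return the server whose range (identified by its start key) holds key."""
    return servers[max(bisect.bisect_right(starts, key) - 1, 0)]


def independent_range_lookup(first_key, last_key):
    """Return (results, index servers contacted) for an inclusive key range."""
    first = max(bisect.bisect_right(INDEXLET_STARTS, first_key) - 1, 0)
    last = max(bisect.bisect_right(INDEXLET_STARTS, last_key) - 1, 0)
    index_servers = INDEXLET_SERVERS[first:last + 1]
    results = []
    for server in index_servers:
        for sk, pk in INDEXLETS[server]:
            if first_key <= sk <= last_key:
                # Second step: read the object from the tablet owning pk.
                tablet = owner(TABLET_STARTS, TABLET_SERVERS, pk)
                results.append((sk, pk, TABLETS[tablet][pk][1]))
    return results, index_servers


results, index_servers = independent_range_lookup("m", "q")
print(index_servers)  # ['S5']
print(results)        # [('m', 7, 'lily'), ('n', 3, 'violet'), ('q', 1, 'rose')]
```

Only the indexlets whose key range overlaps the query are contacted, so adding servers does not widen the fan-out of a fixed-size range query.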

Scalability
- Nearly constant low latency irrespective of the server span
- Linear increase in throughput with the server span
Solution: Use independent partitioning
But: indexed object writes become distributed operations
- Potential consistency issues between indexes and objects
SLIK Slides 26-27

Consistency Properties
1. If an object contains a given secondary key, then an index lookup with that key will return the object
2. If an object is returned by an index lookup, then this object contains a secondary key for that index within the specified range

Example: students Bob, Peggy, Frank, Alice, Carol, Trent
Query: students with name between a and d?
- Alice, Carol? (Bob is missing, which would violate property 1)
- Alice, Bob, Carol, Peggy? (Peggy is outside the range, which would violate property 2)
- Alice, Bob, Carol (the consistent answer)
SLIK Slides 30-36
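The two properties can be phrased as a check on the result of a range lookup. The sketch below is an illustration only: the function `check_lookup` and the choice to compare lower-cased names are assumptions, and reading the slide's example answers as one violation of each property is an interpretation of the flattened diagram.

```python
def check_lookup(objects, first_key, last_key, returned):
    """Compare an index lookup result against the objects themselves.

    objects:  dict mapping object id -> secondary key
    returned: ids returned by the index lookup for [first_key, last_key]
    Returns (missing, extraneous) sets of object ids.
    """
    expected = {oid for oid, sk in objects.items()
                if first_key <= sk <= last_key}
    missing = expected - set(returned)      # violates property 1
    extraneous = set(returned) - expected   # violates property 2
    return missing, extraneous


# Students from the slide; the secondary key is the lower-cased name
# (an assumption, so that the range "a".."d" compares as intended).
students = {name: name.lower()
            for name in ["Bob", "Peggy", "Frank", "Alice", "Carol", "Trent"]}

print(check_lookup(students, "a", "d", {"Alice", "Carol"}))
# ({'Bob'}, set())
print(check_lookup(students, "a", "d", {"Alice", "Bob", "Carol", "Peggy"}))
# (set(), {'Peggy'})
print(check_lookup(students, "a", "d", {"Alice", "Bob", "Carol"}))
# (set(), set())
```

A lookup is consistent in the sense of the slide exactly when both returned sets are empty.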

Consistency
Consistency properties:
1. If an object contains a given secondary key, then an index lookup with that key will return the object
2. If an object is returned by an index lookup, then this object contains a secondary key for that index within the specified range
Solution:
- Longer index lifespan (via ordered writes)
- Object data is ground truth and index entries serve as hints

Example: object Foo changes its secondary key from Bob to Sam
- Initially: index entry Bob → Foo; object Foo: Bob ...
- 1. Add new index entry: index entries Bob → Foo and Sam → Foo; object Foo: Bob ...
- 2. Modify object: index entries Bob → Foo and Sam → Foo; object Foo: Sam ...
- 3. Remove old index entry: index entry Sam → Foo; object Foo: Sam ...

Timeline: write object, modify object, remove object. Each change to the object is a commit point, and each index entry (Bob → Foo, Sam → Foo) exists for longer than the object version it points to (Foo: Bob, Foo: Sam).
Slides 37-45
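The ordered write steps and the "index entries as hints" rule can be sketched with a toy, single-process store. This is a minimal sketch under stated assumptions, not the RAMCloud implementation: the class `IndexedStore`, its methods, and the `crash_after_step` parameter (used only to freeze the write between steps) are hypothetical names introduced for illustration.

```python
class IndexedStore:
    """Toy model: one table with one secondary index, all in one process."""

    def __init__(self):
        self.objects = {}   # primary key -> secondary key (ground truth)
        self.index = set()  # (secondary key, primary key) entries (hints)

    def write(self, pk, new_sk, crash_after_step=None):
        old_sk = self.objects.get(pk)
        # Step 1: add the new index entry before touching the object.
        self.index.add((new_sk, pk))
        if crash_after_step == 1:
            return
        # Step 2: modify the object; this is the commit point.
        self.objects[pk] = new_sk
        if crash_after_step == 2:
            return
        # Step 3: remove the old index entry, which is now a stale hint.
        if old_sk is not None and old_sk != new_sk:
            self.index.discard((old_sk, pk))

    def lookup(self, first_sk, last_sk):
        result = []
        for sk, pk in sorted(self.index):
            if first_sk <= sk <= last_sk:
                # The entry is only a hint: keep it if the object agrees.
                if self.objects.get(pk) == sk:
                    result.append(pk)
        return result


store = IndexedStore()
store.write("Foo", "Bob")

store.write("Foo", "Sam", crash_after_step=1)   # stopped before the commit point
print(store.lookup("Bob", "Bob"), store.lookup("Sam", "Sam"))  # ['Foo'] []

store.write("Foo", "Sam", crash_after_step=2)   # stopped after the commit point
print(store.lookup("Bob", "Bob"), store.lookup("Sam", "Sam"))  # [] ['Foo']
```

Because every index entry outlives the object version it refers to, a lookup at any intermediate point sees either the old or the new secondary key, and the check against the object filters out whichever entry is not current.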

Performance: Questions
- Does SLIK provide low latency?
- Does SLIK provide scalability?
- How does the performance of indexing with SLIK compare to other state-of-the-art systems?
SLIK Slide 47

Performance: Systems for Comparison
H-Store:
- Main memory database
- Data (and indexes) partitioned based on specified attribute
- Many parameters for tuning
  - Got assistance from developers to tune for each test
  - Examples: txn_incoming_delay, partitioning column
HyperDex:
- Spaces containing objects
- Data (and indexes) partitioned using hyperspace hashing
- Each index contains all object data
- Designed to use disk for storage
SLIK Slide 48

Hardware SLIK Slide 49

Latency
Experiments:
1. Lookups: table with single secondary index
2. Overwrites: table with single secondary index
3. Overwrites: varying number of secondary indexes
Configuration:
- Single client
- Single partition for table and (each) index
- Object: 30 B pk, 30 B sk, 100 B value
- SLIK: Three-way replication to durable backups
- H-Store: No replication, durability disabled, single server
Slide 50

Lookup Latency
[Figure: lookup latency (µs) versus size of index (# objects, 10^0 to 10^6) for H-Store, SLIK TCP, and SLIK. Data labels fall in three bands: 132.00-152.37 µs, 42.7-54.5 µs, and 10.2-13.1 µs.]
Slides 51-52

Overwrite Latency
[Figure: overwrite latency (µs) versus size of index (# objects, 10^0 to 10^6) for H-Store, SLIK TCP, and SLIK. Data labels for the two upper curves span 133.54-153.28 µs and 123.8-135.8 µs; the lowest curve spans 31.4-37.0 µs.]
Slide 53

Multiple Secondary Indexes
[Figure: overwrite latency (µs, log scale) versus number of indexes (0 to 10) for H-Store via PK, H-Store via SK, SLIK TCP, and SLIK. The lowest band of data labels rises from 33.0 to 51.2 µs and the next from 138.5 to 184.6 µs; the remaining labels range from 152.97 to 1756.99 µs.]
Slides 54-56

Scalability
Compare:
(a) partitioning approaches
(b) systems
Experiments:
1. Lookup throughput with increasing number of partitions
2. Lookup latency with increasing number of partitions
Configuration:
- Single table with one secondary index
- Table and index partitioned across servers
- Object: 30 B pk, 30 B sk, 100 B value
- Throughput experiments: Loaded system
- Latency experiments: Unloaded system
Slide 57

Scalability: Throughput
[Figure: throughput (10^3 lookups/sec) versus number of indexlets (axis 0 to 10) for independent partitioning and colocation. One series of data labels rises steadily from 435 to 3092; the other stays between 335 and 463.]
Slide 58
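A quick calculation on the data labels of this plot illustrates what a linear throughput increase looks like. It rests on two assumptions that the flattened figure does not confirm: that the ten rising labels (435 to 3092) form a single series, and that they correspond to 1 through 10 indexlets in ascending order. The variable names are introduced only for this sketch.

```python
# Assumption: these ten data labels belong to the rising series in the
# throughput plot and correspond to 1..10 indexlets, in ascending order.
# Units are 10^3 lookups/sec.
throughput_k = [435, 634, 940, 1249, 1558, 1859, 2184, 2478, 2782, 3092]

# Throughput gained by each additional indexlet.
increments = [b - a for a, b in zip(throughput_k, throughput_k[1:])]

# Average gain per added indexlet over the whole range.
avg_increment = (throughput_k[-1] - throughput_k[0]) / (len(throughput_k) - 1)

print(increments)                # [199, 306, 309, 309, 301, 325, 294, 304, 310]
print(round(avg_increment, 1))   # 295.2
```

Under these assumptions every indexlet after the second adds roughly 300 thousand lookups per second, which is what a straight line on the plot would imply.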

Scalability: Throughput
[Figure: throughput (10^3 lookups/sec) versus number of servers (axis 0 to 20) for H-Store, SLIK TCP, and SLIK. Data labels range from 96.90 to 5069.]
Slide 59

Scalability: Latency
[Figure: lookup latency (µs) versus number of servers for colocation (size 1 and size 10) and independent partitioning (size 1 and size 10). Data labels range from 8.3 to 89.7 µs.]
Slide 60

Scalability: Latency
[Figure: average latency per lookup (µs) versus number of indexlets (axis 0 to 10) for H-Store, SLIK TCP, and SLIK. Data labels range from 13.1 to 269.8 µs; the lowest band spans 13.1-14.7 µs.]
Slide 61

Related Work
Data storage system:
- Data model (spectrum from key-value to relational)
- Consistency (spectrum from eventual to strong)
- Performance: latency and/or throughput
SLIK Slide 63

Current Web Scale Datastores
[Figure: systems placed by consistency level (vertical axis: Eventual; Causal, SI, define your own; Strong; higher is better) and approximate read/write latency (horizontal axis: 100s, 10s, 1s, 100ms, 10ms, 1ms, 100µs, 10µs; further right is better).]
- Strong consistency: MongoDB, H-Base, Spanner, H-Store, HyperDex, SLIK
- Weaker consistency levels: Cassandra, Espresso, PNUTS, CouchDB, Tao
SLIK Slides 64-68

Conjecture Can a key value store support strongly consistent secondary indexes while operating at low latency and large scale? SLIK Slide 70

Summary
A key-value store can support strongly consistent secondary indexes while operating at low latency and large scale.
- Secondary indexes: lookups and range queries on secondary keys
- Strongly consistent: by using ordered writes and treating indexes as hints
- Low latency: by using approaches that have minimal overheads we get 11-13 µs lookups and 30-37 µs (over)writes
- Large scale: by using independent partitioning we get a linear throughput increase and minimal impact on latency as the scale increases
SLIK Slides 71-75

Thank you!
Code available free and open source: github.com/platformlab/ramcloud
My papers and other information at: stanford.edu/~ankitak
I can be reached at: ankitak@cs.stanford.edu