SCALING COUCHDB WITH BIGCOUCH. Adam Kocoloski Cloudant

SCALING COUCHDB WITH BIGCOUCH
Adam Kocoloski, Cloudant
Erlang Factory SF Bay Area 2011

OUTLINE
Introductions
Brief intro to CouchDB
BigCouch Usage Overview
BigCouch Internals
Reports from the Trenches

INTRODUCTIONS
Adam Kocoloski
  Cloudant CTO
  CouchDB committer since 2008
  PhD physics, MIT, 2010
BigCouch
  Putting the "C" back in CouchDB
  Open Core: 2 years of development
Contact
  adam@cloudant.com
  kocolosk in #cloudant / #couchdb / #erlang
  @kocolosk

COUCHDB IN A SLIDE
Document database management system
  JSON over HTTP
  Append-only MVCC storage
Views: custom, persistent representations of your data
  Incremental MapReduce with results persisted to disk
  Fast querying by primary key (views stored in a B-tree)
Bi-directional replication
  Master-slave and multi-master topologies supported
  Optional filters to replicate a subset of the data
  Edge devices (mobile phones, sensors, etc.)

WHY BIGCOUCH?
CouchDB is awesome... but somewhat incomplete
"Cluster Of Untrusted Commodity Hardware"
"CouchDB is not a distributed database" (J. Ellis)
"Without the Clustering, it's just OuchDB"

BIGCOUCH = HA CLUSTERED COUCH
Horizontal scalability
  Easily add storage capacity by adding more servers
  Computing power (views, compaction, etc.) scales with more servers
No SPOF
  Any node can handle any request
  Individual nodes can come and go
Transparent to the application
  All clustering operations take place "behind the curtain"
  Looks like a single server instance of Couch, just with more awesome
  Asterisks and caveats discussed later

GRAPHICAL REPRESENTATION
Clustering in a ring (a la Dynamo)
Any node can handle a request
O(1) lookup
Quorum system (N, R, W)
Views distributed like documents
Distributed Erlang
Masterless
[Diagram: a load balancer forwards PUT http://kocolosk.cloudant.com/dbname/blah?w=2 into a ring of nodes, with N=3, W=2, R=2 and hash(blah) = E. Node labels: Node 1: A B C D; Node 2: B C D E; Node 3: C D E F; Node 4: D E F G; Node 24: X Y Z A.]

BUILDING YOUR FIRST CLUSTER
Shopping list
  3 networked computers
  Usual CouchDB dependencies
  BigCouch code: http://github.com/cloudant/bigcouch

BUILDING YOUR FIRST CLUSTER
Build and start BigCouch
  foo.example.com
  bar.example.com
  baz.example.com
Pick one node and add the others to the local nodes DB
Make sure they all agree on the magic cookie (rel/etc/vm.args)

QUORUM: IT'S YOUR FRIEND
BigCouch databases are governed by 4 parameters
  Q: Number of shards
  N: Number of redundant copies of each shard
  R: Read quorum constant
  W: Write quorum constant
(NB: Also consider the number of nodes in a cluster)
For the next few examples, consider a 5 node cluster
[Diagram: ring of five nodes, numbered 1 to 5]

Q
Q: The number of shards over which a DB will be spread
  Consistent hashing space divided into Q pieces
  Specified at DB creation time
  Possible for more than one shard to live on a node
  Documents deterministically mapped to a shard
[Diagram: five-node ring shown for Q=1 and for Q=4]
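
Sketch (not from the talk): one way to picture "hashing space divided into Q pieces" is below. The choice of CRC32 as the hash, the 32-bit space, and the names shard_ranges and shard_for are assumptions made for illustration; they are not presented as BigCouch's actual code.

```python
import zlib

HASH_SPACE = 2**32  # assumption: a 32-bit hash space, matching CRC32 output


def shard_ranges(q):
    """Split [0, HASH_SPACE) into q contiguous (start, end) ranges."""
    step = HASH_SPACE // q
    ranges = []
    for i in range(q):
        start = i * step
        # The last range absorbs any remainder left by integer division.
        end = HASH_SPACE - 1 if i == q - 1 else start + step - 1
        ranges.append((start, end))
    return ranges


def shard_for(doc_id, q):
    """Deterministically map a document ID to a shard index in [0, q)."""
    h = zlib.crc32(doc_id.encode("utf-8"))
    step = HASH_SPACE // q
    return min(h // step, q - 1)
```

With Q=1 every document lands in shard 0. With Q=4 the same document ID always lands in the same one of four ranges, which is what lets any node locate a document without a directory lookup.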

N
N: The number of redundant copies of each document
  Choose N>1 for a fault-tolerant cluster
  Specified at DB creation
  Each shard is copied N times
  Recommend N>2
[Diagram: five-node ring with N=3]
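
Sketch (not from the talk): the helper below illustrates "each shard is copied N times" by placing the copies of a shard on N consecutive nodes around the ring. The consecutive-node rule and the name replica_nodes are assumptions for illustration; the real placement logic may differ.

```python
def replica_nodes(shard, n, nodes):
    """Return the n nodes holding copies of a shard (hypothetical placement rule)."""
    if not 1 <= n <= len(nodes):
        raise ValueError("n must be between 1 and the number of nodes")
    start = shard % len(nodes)
    # Walk clockwise around the ring, wrapping past the last node.
    return [nodes[(start + k) % len(nodes)] for k in range(n)]


cluster = ["node1", "node2", "node3", "node4", "node5"]
copies = replica_nodes(3, 3, cluster)  # shard 3 wraps around the ring
```

Under this rule, with N=3 on five nodes, any single node can fail and every shard still has two live copies.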

W
W: The number of document copies that must be saved before a document is "written"
  W must be less than or equal to N
  W=1, maximize throughput
  W=N, maximize consistency
  Allow for "201 Created" response
  Can be specified at write time
[Diagram: five-node ring, W=2, responding "201 Created"]
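
Sketch (not from the talk): the write-quorum rule reduces to counting acknowledgements. The function name write_quorum_met and the list-of-booleans input are assumptions for illustration only.

```python
def write_quorum_met(acks, w, n):
    """Return True once at least w of the n shard copies report a successful save.

    acks holds one boolean per replica that has answered so far.
    """
    if not 1 <= w <= n:
        raise ValueError("W must satisfy 1 <= W <= N")
    if len(acks) > n:
        raise ValueError("more acknowledgements than replicas")
    return sum(1 for ok in acks if ok) >= w
```

With N=3 and W=2, two successful saves are enough to treat the document as written even if the third copy is slow or unavailable.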

R
R: The number of identical document copies that must be read before a read request is "ok"
  R must be less than or equal to N
  R=1, minimize latency
  R=N, maximize consistency
  Can be specified at query time
  Inconsistencies are automatically repaired
[Diagram: five-node ring with R=2]
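
Sketch (not from the talk): a read quorum can be modeled as waiting until R replicas return the same revision, then noting which earlier replies disagreed so they can be repaired. The name quorum_read, the (node, revision) reply format, and the repair bookkeeping are assumptions for illustration, not a description of the actual repair mechanism.

```python
from collections import Counter


def quorum_read(replies, r):
    """Scan replies in arrival order until r of them agree on a revision.

    replies is a list of (node, revision) pairs. Returns (revision, disagreeing),
    where disagreeing lists the nodes seen so far that returned something else.
    Returns (None, []) if no revision reaches r matching replies.
    """
    counts = Counter()
    seen = []
    for node, rev in replies:
        seen.append((node, rev))
        counts[rev] += 1
        if counts[rev] == r:
            disagreeing = [n for n, v in seen if v != rev]
            return rev, disagreeing
    return None, []


rev, to_repair = quorum_read(
    [("node1", "2-abc"), ("node2", "1-old"), ("node3", "2-abc")], r=2
)
```

With R=1 the first reply wins and latency is lowest. With R=N every copy must agree before the read returns.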

VIEWS
So far, so good, but what about secondary indexes?
  Views are built locally on each node, for each DB shard
  Mergesort at query time using exactly one copy of each shard
  Run a final rereduce on each row if the view has a reduce
_changes feed works similarly, but has no global ordering
  Sequence numbers converted to strings to encode more information
[Diagram: ring of five nodes, numbered 1 to 5]
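
Sketch (not from the talk): the merge-then-rereduce step can be illustrated with already-sorted row lists, one per shard. The (key, value) row format, the function names, and the use of Python's default ordering and sum as the rereduce are assumptions for illustration; CouchDB's own view collation rules are not modeled here.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_view_rows(shard_rows):
    """Merge per-shard row lists, each already sorted by key, into one sorted list."""
    return list(heapq.merge(*shard_rows, key=itemgetter(0)))


def rereduce(merged_rows, reduce_fn=sum):
    """Combine the partial reduce values that different shards produced for a key."""
    return [
        (key, reduce_fn(value for _, value in group))
        for key, group in groupby(merged_rows, key=itemgetter(0))
    ]


shard_a = [("apple", 2), ("cherry", 1)]
shard_b = [("apple", 3), ("banana", 5)]
merged = merge_view_rows([shard_a, shard_b])
totals = rereduce(merged)
```

Because each shard's rows arrive sorted, the coordinator only needs a streaming merge rather than a full sort of the combined result.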

BIGCOUCH STACK
CHTTPD
Fabric
Rexi
Mem3
Embedded CouchDB
Mochiweb, Spidermonkey, etc.

MEM3
Maintains the shard mapping for each clustered database in a node-local CouchDB database
Changes in the node registration and shard mapping databases are automatically replicated to all cluster nodes
Shard copies are eagerly synchronized

REXI
BigCouch makes a large number of parallel RPCs
Erlang RPC library not designed for heavy parallelism
  Promiscuous spawning of processes
  Responses directed back through a single process on the remote node
  Requests block until the remote rex process is monitored
Rexi removes some of the safeguards in exchange for lower latencies
  No middlemen on the local node
  Remote process responds directly to the client
  Remote process monitoring occurs out-of-band

FABRIC / CHTTPD
Fabric
  OTP library application (no processes) responsible for clustered versions of CouchDB core API calls
  Quorum logic, view merging, etc.
  Provides a clean Erlang interface to BigCouch
  No HTTP awareness
Chttpd
  Cut-n-paste of couch_httpd, but using fabric for all data access

REPORTS FROM THE TRENCHES
code_change and supervision trees
Remote execution of fun expressions == recipe for badfun
Blocking!

http://github.com/cloudant/bigcouch