Couchbase Architecture 2015 Couchbase Inc. 1
$whoami
  Laurent Doguin, Couchbase Developer Advocate
  @ldoguin
  laurent.doguin@couchbase.com
Big Data = Operational + Analytic (NoSQL + Hadoop)
  Operational (velocity): real-time, interactive databases
    Online Web/Mobile/IoT apps
    Millions of customers/consumers
  Analytical (volume): batch-oriented analytic databases
    Offline, batch-oriented analytics apps
    Hundreds of business analysts
Key Capabilities
  Combines the flexibility of JSON, the power of SQL and the scale of NoSQL
  Develop with Agility
    Multiple data models
    N1QL: SQL-like query language
    Multiple indexes
    Languages, ODBC/JDBC drivers and frameworks you already know
  Operate at Any Scale
    Push-button scalability
    Consistent high performance
    Always on 24x7 with HA and DR
    Easy administration with Web UI, REST API and CLI
Couchbase provides a complete data management solution
  General-purpose capabilities support a broad range of apps and use cases
    Highly available cache
    Key-value store
    Document database
    Embedded database
    Sync management
Enterprises use Couchbase to enable key objectives
  Profile Management
  Personalization
  360 Degree Customer View
  Internet of Things
  Mobile Applications
  Content Management
  Catalog
  Real Time Big Data
  Digital Communication
  Fraud Detection
Develop with Agility
What does a JSON document look like?
  { "ID": 1, "FIRST": "Dipti", "LAST": "Borkar", "ZIP": 94040, "CITY": "MV", "STATE": "CA" }
  All data in a single document
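As a minimal sketch of what "all data in a single document" means for application code, the snippet below parses the sample document with Python's standard json module. The quoting of the string values is an assumption, since the slide text lost its quotation marks.

```python
import json

# The sample document from the slide, written as a JSON string.
# Quoting of the text values is assumed; the slide text shows none.
raw = '{"ID": 1, "FIRST": "Dipti", "LAST": "Borkar", "ZIP": 94040, "CITY": "MV", "STATE": "CA"}'

doc = json.loads(raw)  # JSON text -> Python dict

# Every attribute of the record is available from the one document.
full_name = f'{doc["FIRST"]} {doc["LAST"]}'
location = f'{doc["CITY"]}, {doc["STATE"]} {doc["ZIP"]}'

# Round trip: the whole record serializes back to a single JSON value.
encoded = json.dumps(doc, sort_keys=True)
```

No join or second lookup is needed to assemble the record, which is the point the slide makes.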
Storing and retrieving documents
  Clients read and write documents (user/application data)
  Documents are read from / written to data buckets
  Data buckets live on server nodes, based on hash partitioning
  Server nodes form a Couchbase cluster, which is dynamically scalable
Accessing Data in Couchbase: Multiple Access Paths
  Cluster
    Functional: Manage the connections to the bucket within the cluster for different services. Provide a core layer where IO can be managed and optimized. Provide a way to manage buckets.
    API reference (cluster management API): openbucket(), info(), disconnect(), insertdesigndocument(), flush(), listdesigndocuments()
  CRUD (Data Service)
    Functional: Hold on to cluster information such as the topology. Give the application developer a concurrent API for basic (k-v) document management.
    API reference: get(), insert(), upsert(), remove()
  View Query (Data Service)
    Functional: Allow for view querying, building of queries and reasonable error handling from the cluster.
    API reference: abucket.newviewquery().limit().stale()
  N1QL Query (Query & Index Services)
    Functional: Allow for querying, execution of other directives such as defining indexes and checking on index state.
    API reference: abucket.newn1qlquery( SELECT * FROM default LIMIT 5 ).consistency(gocouchbase.requestplus);
Couchbase SDKs and Connectors
Operate at Any Scale
Couchbase Architecture: Single Node
  Data Service builds and maintains distributed secondary indexes (MapReduce Views)
  Indexing Engine builds and maintains Global Secondary Indexes
  Query Engine plans, coordinates, and executes queries against either Global or Distributed indexes
  Cluster Manager: configuration, heartbeat, statistics, RESTful management interface
  Diagram labels (Couchbase Server Node): Data Service, Index Service, Query Service, Cluster Manager (Management REST API, Web UI, Node Manager, Node / Cluster Orchestration, Erlang / OTP), View Engine, Indexing Engine, Query Engine, Managed Cache, Storage
Data Service: Write Operation
  Single-node type means easier administration and scaling
  Writes are async by default
  Application gets acknowledgement when the document is successfully in RAM, and can trade off waiting for replication or persistence per write
  Replication to 1, 2 or 3 other nodes
  Replication is RAM-based, so extremely fast
  Off-node replication is the primary level of HA
  Disk is written to as fast as possible, no waiting
  Diagram labels: APPLICATION SERVER, DOC 1, MANAGED CACHE, REPLICATION/XDCR/CONNECTORS/VIEWS/INDEXING queue, DISK QUEUE, DISK
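The write path above can be sketched as a toy model: the write is acknowledged once the document is in the managed cache, while persistence and replication drain from queues afterwards. The class and method names (DataNode, flush_disk, replicate_to) are hypothetical and are not Couchbase APIs; this is an illustration under those assumptions, not how the server is implemented.

```python
from collections import deque


class DataNode:
    """Toy model of one Data Service node: managed cache plus two async queues."""

    def __init__(self):
        self.cache = {}                   # managed cache (RAM)
        self.disk = {}                    # persisted documents
        self.disk_queue = deque()         # keys waiting to be persisted
        self.replication_queue = deque()  # keys waiting to be replicated

    def write(self, key, doc):
        # The write is acknowledged as soon as the document is in RAM.
        self.cache[key] = doc
        self.disk_queue.append(key)
        self.replication_queue.append(key)
        return "ack"

    def flush_disk(self):
        # Background persistence: drain the disk queue.
        while self.disk_queue:
            key = self.disk_queue.popleft()
            self.disk[key] = self.cache[key]

    def replicate_to(self, replica):
        # Background replication: copy queued documents into a replica's RAM.
        while self.replication_queue:
            key = self.replication_queue.popleft()
            replica.cache[key] = self.cache[key]


active, replica = DataNode(), DataNode()
status = active.write("doc1", {"name": "DOC 1"})
acked_before_persist = "doc1" not in active.disk  # the ack did not wait for disk
active.replicate_to(replica)
active.flush_disk()
```

An application that needs stronger guarantees for a given write would wait for the replication or persistence step before treating the write as done, which is the per-write trade-off the slide mentions.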
Data Service: Read Operation
  Single-node type means easier administration and scaling
  Reads out of cache are extremely fast
  No other process/system to communicate with
  Data connection is a binary protocol over TCP
  Diagram labels: APPLICATION SERVER, GET DOC 1, MANAGED CACHE, DOC 1, REPLICATION/XDCR/CONNECTORS/VIEWS/INDEXING queue, DISK QUEUE, DISK
Data Service: Cache Miss
  Single-node type means easier administration and scaling
  Layer consolidation means one single interface for the app to talk to and get its data back as fast as possible
  Separation of cache and disk allows for fastest access out of RAM while pulling data from disk in parallel
  Diagram labels: APPLICATION SERVER, GET DOC 1, MANAGED CACHE, REPLICATION/XDCR/CONNECTORS/VIEWS/INDEXING queue, DISK QUEUE, DISK holding DOC 1 to DOC 5
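A cache miss can be sketched as a read-through lookup: the application talks to one interface, which serves from RAM when it can and loads from disk when it cannot. ManagedCache and its fields are hypothetical names for this sketch, and eviction is deliberately left out.

```python
class ManagedCache:
    """Toy read path: serve from RAM, fall back to disk on a cache miss."""

    def __init__(self, disk):
        self.disk = disk   # dict standing in for on-disk storage
        self.ram = {}      # documents currently resident in memory
        self.misses = 0

    def get(self, key):
        if key in self.ram:      # cache hit: no disk access
            return self.ram[key]
        self.misses += 1         # cache miss: load from disk
        doc = self.disk[key]
        self.ram[key] = doc      # keep it resident for later reads
        return doc


disk = {f"doc{i}": {"id": i} for i in range(1, 6)}  # DOC 1 .. DOC 5
cache = ManagedCache(disk)
first = cache.get("doc1")   # miss, fetched from disk
second = cache.get("doc1")  # hit, served from RAM
```

The caller uses the same get() in both cases, which mirrors the single-interface point on the slide.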
Couchbase Views
  Local index
    Distributed indexing and scatter-gather querying
  Incremental Map-Reduce
    Distributed, simple real-time analytics
    Only considers changes due to updated data
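The two ideas on this slide, incremental map-reduce and scatter-gather querying, can be illustrated with a toy index. The map function, the LocalViewIndex class and the sample documents are all hypothetical; real views are defined in JavaScript and maintained by the server, so treat this only as a sketch of the technique.

```python
from collections import Counter


def map_fn(doc):
    # Hypothetical map function: emit (city, 1) for every document with a city.
    if "city" in doc:
        yield doc["city"], 1


class LocalViewIndex:
    """Toy per-node view index that is updated incrementally."""

    def __init__(self):
        self.rows_by_doc = {}     # doc id -> rows emitted by map_fn
        self.reduced = Counter()  # key -> running sum (the reduce result)

    def update(self, doc_id, doc):
        # Undo the contribution of the previous version of this document only.
        for key, value in self.rows_by_doc.pop(doc_id, []):
            self.reduced[key] -= value
        rows = list(map_fn(doc))
        for key, value in rows:
            self.reduced[key] += value
        self.rows_by_doc[doc_id] = rows


def scatter_gather(indexes):
    # Query every node's local index and merge the partial results.
    total = Counter()
    for index in indexes:
        total.update(index.reduced)
    return {key: value for key, value in total.items() if value}


node_a, node_b = LocalViewIndex(), LocalViewIndex()
node_a.update("u1", {"city": "MV"})
node_a.update("u2", {"city": "SF"})
node_b.update("u3", {"city": "MV"})
node_a.update("u2", {"city": "MV"})  # only u2's rows are recomputed
result = scatter_gather([node_a, node_b])
```

Updating u2 touches neither u1 nor u3, which is what "only considers changes due to updated data" amounts to in this model.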
Index Service
Couchbase Global Indexing Service
  Global Secondary Index Service (Index#1, Index#2, Index#3, Index#4)
    New in 4.0
    Indexes partitioned independently from data
    Each index receives only its own mutations
  Supervisor
    Index maintenance and scan coordinator
  Managed caching layer
  ForestDB storage engine
    B+ Trie optimized for very large data volumes
    Optimized for SSDs
Query Service
Query Execution Flow
  Query: SELECT c_id, c_first, c_last, c_max FROM CUSTOMER WHERE c_id = 49165;
  1. Clients: submit the query over REST API
  2. Query Service: parse, analyze, create plan
  3. Query Service to Index Service: scan request; index filters
  4. Index Service: get qualified doc keys
  5. Query Service to Data Service: fetch request, doc keys
  6. Data Service: fetch the documents
  7. Query Service: evaluate documents to results
  8. Clients: query result
  Result: { "c_first": "Joe", "c_id": 49165, "c_last": "Montana", "c_max": 50000 }
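The eight steps can be traced in a toy model where plain dictionaries stand in for the Index and Data Services. Everything here is hypothetical (the document keys, the second customer, the function names); it only shows the order of operations, not the real query engine.

```python
# Toy stand-ins for the Data and Index Services; all names are hypothetical.
DATA_SERVICE = {  # document key -> document
    "cust::49165": {"c_id": 49165, "c_first": "Joe", "c_last": "Montana",
                    "c_max": 50000, "c_city": "SF"},
    # Second document is made up so the index filter has something to exclude.
    "cust::10001": {"c_id": 10001, "c_first": "Ann", "c_last": "Lee",
                    "c_max": 20000, "c_city": "NY"},
}
INDEX_SERVICE = {49165: ["cust::49165"], 10001: ["cust::10001"]}  # c_id -> doc keys


def index_scan(c_id):
    # Steps 3-4: scan request with the index filter, qualified doc keys come back.
    return INDEX_SERVICE.get(c_id, [])


def fetch(keys):
    # Steps 5-6: fetch request with doc keys, documents come back.
    return [DATA_SERVICE[key] for key in keys]


def run_query(fields, c_id):
    # Step 2: the "plan" in this sketch is fixed: index scan, fetch, project.
    keys = index_scan(c_id)
    docs = fetch(keys)
    # Step 7: evaluate documents into result rows (projection only).
    return [{field: doc[field] for field in fields} for doc in docs]


# Steps 1 and 8: the client submits the query and receives the result.
rows = run_query(["c_id", "c_first", "c_last", "c_max"], 49165)
```

The index returns only keys; the documents themselves always come from the Data Service, which is why steps 3-4 and 5-6 are separate round trips.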
Couchbase Clustering Architecture
Auto sharding: buckets and vBuckets
  Data buckets
  Active virtual buckets: vb 1 .. 1024
  Replica virtual buckets: vb 1 .. 1024
Cluster Map
  Couchbase SDK -> CRC32 hashing algorithm -> CLUSTER MAP -> Couchbase Cluster
  CLUSTER MAP entries: vbucket1, vbucket2, vbucket3, vbucket4, vbucket5, vbucket6, vbucket7, ... vbucket1024
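The lookup the SDK performs can be sketched as two steps: hash the key with CRC32 to pick one of the 1024 vBuckets, then look that vBucket up in the cluster map to find the node. This is a simplified sketch; the plain modulo reduction, the node names and the round-robin map are assumptions, and real SDKs may derive the vBucket ID from the checksum differently.

```python
import zlib

NUM_VBUCKETS = 1024


def vbucket_for(key: str) -> int:
    # Simplified assumption: CRC32 of the key, reduced modulo the vBucket count.
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


# Hypothetical cluster map: vBucket ID -> node that holds the active copy.
nodes = ["server1", "server2", "server3"]
cluster_map = [nodes[vb % len(nodes)] for vb in range(NUM_VBUCKETS)]


def node_for(key: str) -> str:
    # The client resolves the node locally, without asking the cluster.
    return cluster_map[vbucket_for(key)]


target = node_for("user::1")
```

Because the hash is deterministic, every client with the same cluster map sends a given key to the same node, with no central lookup service in between.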
Data Services: Sharding and Replication
  Application has a single logical connection to the cluster (client object)
  Multiple nodes added or removed at once
  One-click operation
  Incremental movement of active and replica vBuckets and data
  Client library updated via cluster map
  Fully online operation, no downtime or loss of performance
  Diagram: READ/WRITE/UPDATE traffic to Couchbase Server 1 to 5, each holding ACTIVE and REPLICA vBuckets
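To make "incremental movement of vBuckets" concrete, the sketch below rebalances a toy cluster map when two nodes are added, moving only the vBuckets that have to move. The rebalance function and its quota scheme are hypothetical, handle active vBuckets only, and are not the algorithm Couchbase Server uses.

```python
from collections import Counter

NUM_VBUCKETS = 1024


def rebalance(cluster_map, nodes):
    """Return a new map that spreads vBuckets evenly, moving as few as possible."""
    base, extra = divmod(len(cluster_map), len(nodes))
    quota = {node: base + (1 if i < extra else 0) for i, node in enumerate(nodes)}
    new_map, pending = list(cluster_map), []
    kept = Counter()
    for vb, owner in enumerate(cluster_map):
        if owner in quota and kept[owner] < quota[owner]:
            kept[owner] += 1    # vBucket stays where it is
        else:
            pending.append(vb)  # owner is over quota or was removed
    # Hand the pending vBuckets to nodes that are still below their quota.
    for node in nodes:
        while kept[node] < quota[node]:
            new_map[pending.pop()] = node
            kept[node] += 1
    return new_map


old_nodes = ["server1", "server2", "server3"]
old_map = [old_nodes[vb % 3] for vb in range(NUM_VBUCKETS)]
new_map = rebalance(old_map, old_nodes + ["server4", "server5"])
moved = sum(1 for before, after in zip(old_map, new_map) if before != after)
```

Clients keep working during such a change because they only need the updated cluster map to find the new owner of a moved vBucket.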
What is Multi-Dimensional Scaling?
  MDS is the architecture that enables independent scaling of data, query and indexing workloads while being managed as one cluster
  Diagram: Couchbase Cluster, node1 .. node8, running Index Service, Query Service and Data Service
Modern Architecture: Independent Scalability for Best Computational Capacity per Service
  Heavier indexing (index more fields): scale up index service nodes
  More RAM for query processing: scale up query service nodes
  Diagram: Couchbase Cluster, node1, node8, node9, running Query Service, Index Service and Data Service
Cross Data Center Replication
Market leading memory-to-memory replication
  NYC Server Cluster (New York): Couchbase Server 1 to 4, each with MEMORY and DISK
  SF Server Cluster (San Francisco): Couchbase Server 1 to 3, each with MEMORY and DISK
In summary: the best of both worlds
  Develop with Agility
    Multiple data models
    N1QL: SQL-like query language
    Multiple indexes
    Languages, ODBC/JDBC drivers and frameworks you already know
  Operate at Any Scale
    Push-button scalability
    Consistent high performance
    Always on 24x7 with HA and DR
    Easy administration with Web UI, REST API and CLI
Thanks!