BARNS: Backup and Recovery for NoSQL Databases
Atish Kathpal, Priya Sehgal
Advanced Technology Group, NetApp
Why Backup/Restore NoSQL DBs?
- Customers are ingesting data directly into NoSQL databases.
- Security breaches are on the rise, e.g., ransomware attacks on MongoDB [1] and the recent WannaCrypt exploits.
- Fat-finger errors eventually propagate to replicas.
- Sandbox deployments for test/dev: bring up shadow clusters of different cardinality from production cluster snapshots.
- Compliance and regulatory requirements: a 2016 IDC report [2] lists data protection and retention among the top infrastructure requirements for NoSQL.
[1] http://thehackernews.com/2017/01/secure-mongodb-database.html
[2] Nadkarni A., "Polyglot Persistence: Insights on NoSQL Adoption and the Resulting Impact on Infrastructure." IDC, Feb. 2016.
NoSQL Database Classes, from a Backup/Restore Service Perspective
- Master-slave
  - The authoritative copy of each partition lives on the master node, which we can back up.
  - Loss of a primary node leads to shard/partition unavailability until a new leader is elected.
  - Examples: MongoDB, Redis, Oracle NoSQL, MarkLogic
- Master-less
  - Data is scattered across nodes using consistent hashing (sketched below); no single node holds all data for a given partition.
  - Eventual consistency: unavailability of a destination node does not cause write failure; data is eventually replicated.
  - Examples: Cassandra, Couchbase
Diagram credits: https://www.slideshare.net/mongodb/sharding-v-final, https://blog.imaginea.com/consistent-hashing-in-cassandra/
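A minimal sketch (illustrative only, not from the paper) of how a consistent-hash ring places a key on multiple nodes; node names and the replication factor are placeholders:

    # Each node owns a token; a key lands on the first node whose token is >=
    # the key's hash, and the next RF-1 nodes on the ring hold the replicas.
    import hashlib
    from bisect import bisect_left

    def token(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    class Ring:
        def __init__(self, nodes, rf=2):
            self.rf = rf
            self.ring = sorted((token(n), n) for n in nodes)

        def replicas(self, key: str):
            tokens = [t for t, _ in self.ring]
            i = bisect_left(tokens, token(key)) % len(self.ring)
            return [self.ring[(i + k) % len(self.ring)][1] for k in range(self.rf)]

    ring = Ring(["N1", "N2", "N3"], rf=2)
    print(ring.replicas("K1"))  # e.g. ['N2', 'N3']: two nodes hold copies of K1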
NoSQL DBs Hosted on Shared Storage
High-level, conceptual deployment architecture:
- Each NoSQL DB node (Node1..Node3) runs on its own filesystem (DB journal, logs, data files), backed by a dedicated LUN (LUN-1..LUN-3) on a shared storage array.
- The array provides snapshots, cloning, compression, deduplication, encryption, and cloud integrations.
- Backup: leverage storage snapshots.
- Restore: leverage cloning.
Backup/Restore Challenges
1. Cluster consistency at scale
   - Cluster/application quiesce significantly hampers application performance, and cross-node consistency is still not guaranteed.
   - Approach: take crash-consistent snapshots, then post-process them (in a sandbox) using the NoSQL DB stack itself to reach a cluster-consistent state.
2. Space efficiency
   - Replica data copies do not block-deduplicate: row sizes are small and scattered across nodes (Cassandra), storage engines add unique ids (MongoDB), and the DB performs its own compression and encryption.
   - Approach: remove replicas logically (application-aware backup).
3. Topology changes
   - Clusters run on commodity nodes at scales of 10s-100s of nodes; e.g., a MongoDB primary might be unreachable while a backup is taken.
   - Storage snapshots have no context about cluster topology, and use cases may require restore to a test/dev cluster of different cardinality.
   - Approach: save the cluster topology and storage mapping as part of the backup.
Example of insignificant block-based deduplication (Cassandra, replication factor = 2): N1 writes (K1, V1), (K3, V3) to LUN1; N2 writes (K1, V1), (K2, V2) to LUN2; N3 writes (K2, V2), (K3, V3) to LUN3 on the shared storage cluster; the duplicate rows land in different block layouts.
Existing open-source utilities such as mongodump and Cassandra snapshots suffer from the above challenges.
NoSQL Data Protection Challenges: Master-Less Databases
Example: a client updates K1 at consistency level ONE (CL.ONE). The write (K1, V11, Tnew) is acknowledged once replica N1 persists it, before replica N2 has applied it. On the shared storage cluster, LUN1 holds (K1, V1, Told) and (K1, V11, Tnew), but LUN2 still holds only (K1, V1, Told): a snapshot of LUN2 will point to stale data.
Challenges:
- Fault tolerance: a backup may capture stale data due to eventual consistency.
- Higher restore times, since Cassandra will perform repairs during restore.
See http://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsrepairnodesreadrepair.html and https://wiki.apache.org/cassandra/hintedhandoff
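For concreteness, this is how such a CL.ONE write looks with the DataStax Python driver (hostnames and schema are hypothetical; the staleness window is the point, not the code):

    from cassandra.cluster import Cluster
    from cassandra import ConsistencyLevel
    from cassandra.query import SimpleStatement

    session = Cluster(["n1.example.com"]).connect("ks")
    stmt = SimpleStatement(
        "UPDATE kv SET v = %s WHERE k = %s",
        consistency_level=ConsistencyLevel.ONE,  # ack after a single replica
    )
    session.execute(stmt, ("V11", "K1"))
    # A snapshot of N2's LUN taken now may still contain (K1, V1, Told);
    # read repair / hinted handoff reconciles the replicas only later.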
BARNS Architecture
Addresses the challenges of:
1. Taking cluster-consistent backups at scale
2. Taking storage-efficient backups (through replica removal)
3. Enabling recovery/cloning to different cluster topologies
Components: a Light-Weight Backup process (LWB), a Post-Process step (PP), and a Restore process sit on top of two pluggable libraries (sketched below). Lib_DB plugins (Lib_MongoDB, Lib_Cassandra) talk to the MongoDB/Cassandra clusters through their native APIs and expose calls such as Stop_balancer(), Get_topology(), Check_cluster_health(), Prepare_Backup_Topo(), etc. Lib_Storage plugins (Lib_Store1, Lib_Store2) talk to the backing stores (Store1, Store2) and expose Get_lun_mappings(), Take_snapshots(), Lun_clone(), Vault_backup(), etc.
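A sketch of what these pluggable interfaces might look like; the method names come from the architecture diagram, while the signatures and types are assumptions:

    from abc import ABC, abstractmethod

    class LibDB(ABC):
        """DB-specific plugin (e.g., Lib_MongoDB, Lib_Cassandra)."""
        @abstractmethod
        def stop_balancer(self): ...
        @abstractmethod
        def get_topology(self) -> dict: ...
        @abstractmethod
        def check_cluster_health(self) -> bool: ...
        @abstractmethod
        def prepare_backup_topo(self, topology: dict) -> dict: ...

    class LibStorage(ABC):
        """Storage-specific plugin (e.g., Lib_Store1, Lib_Store2)."""
        @abstractmethod
        def get_lun_mappings(self, nodes: list) -> dict: ...
        @abstractmethod
        def take_snapshots(self, luns: list) -> list: ...
        @abstractmethod
        def lun_clone(self, snapshot: str) -> str: ...
        @abstractmethod
        def vault_backup(self, snapshots: list, target: str): ...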
BARNS Solution: Cassandra (Master-Less Distributed Database)
Phase 1: Light-Weight Backup (LWB) Phase
Production cluster: nodes C1..C4 own Token1..Token4 and write their commitlogs and data files (CL1/D1..CL4/D4) to LUNs L1..L4.
BARNS LWB steps:
1. Capture the token assignment of each node.
2. Store the mapping of LUNs to tokens.
3. Take snapshots of L1 to L4 (sn1..sn4).
Backup metadata example: CL1/D1 -> sn1, CL2/D2 -> sn2, CL3/D3 -> sn3, CL4/D4 -> sn4, together with each node's tokens.
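A hedged sketch of the LWB phase: record each node's token(s) and LUN, snapshot the LUNs, and persist the mapping as backup metadata. Here `storage` stands in for a storage-array client (hypothetical API), the node-to-LUN map is assumed known to the backup service, and the nodetool output parsing is simplified:

    import json, subprocess

    NODES = {"C1": "L1", "C2": "L2", "C3": "L3", "C4": "L4"}  # node -> LUN

    def node_tokens(host: str) -> list:
        # `nodetool info -T` prints the node's token(s); parsing simplified.
        out = subprocess.run(["nodetool", "-h", host, "info", "-T"],
                             capture_output=True, text=True, check=True).stdout
        return [line.split()[-1] for line in out.splitlines()
                if line.startswith("Token")]

    def lightweight_backup(storage):
        metadata = {}
        for node, lun in NODES.items():
            snap = storage.take_snapshot(lun)           # e.g. sn1 for L1
            metadata[node] = {"lun": lun, "snapshot": snap,
                              "tokens": node_tokens(node)}
        with open("backup_metadata.json", "w") as f:    # tokens -> LUNs -> snaps
            json.dump(metadata, f, indent=2)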
Phase 2: Post-Process (PP) Phase, Part 1: Flush Commitlogs
BARNS PP steps, run on separate PP node(s) against clones CL_sn1..CL_sn4:
1. Clone the LWB snapshot LUNs.
2. Mount the clones on different PP processes or different nodes.
3. Start Cassandra processes P1 to P4 on the clones (each holding its commitlog and data files, CL1/D1..CL4/D4).
4. Flush the commitlogs, e.g., yielding data such as (K1, V1, T1), (K2, V2, T1), (K1, V11, T2) in SSTables.
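A sketch of these steps, assuming `storage` clones and mounts LUNs (hypothetical API) and Cassandra is installed on the PP node. Starting Cassandra on a cloned LUN replays its commitlog; `nodetool flush` then persists the replayed writes into SSTables. The paper runs P1..P4 as separate processes/nodes; this sketch handles the clones sequentially for simplicity:

    import subprocess, time

    def flush_commitlogs(storage, metadata):
        for node, info in metadata.items():
            clone = storage.lun_clone(info["snapshot"])    # writable clone of snX
            storage.mount(clone, "/var/lib/cassandra")     # commitlog + data dirs
            proc = subprocess.Popen(["cassandra", "-f"])   # replays the commitlog
            time.sleep(60)  # crude wait; poll the CQL port in practice
            subprocess.run(["nodetool", "flush"], check=True)  # memtables -> SSTables
            subprocess.run(["nodetool", "drain"], check=True)  # quiesce the node
            proc.terminate()
            storage.unmount("/var/lib/cassandra")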
Phase 2: Post-Process (PP) Phase, Part 2: Compaction
BARNS PP steps, run on a single PP node:
1. Using UnionFS, mount all snapshot clones as read-only under one directory (e.g., /cassandra).
2. Create a new volume and LUN for the full backup (Full_bk_lun).
3. Mount it read-write through UnionFS as the top layer.
4. Give the PP node all tokens of the production cluster.
5. Start Cassandra.
6. Start the compaction process.
7. The final compacted files are stored on the full-backup LUN.
Example: the clones hold overlapping replicas, e.g., Data1: (K1, V1, T1), (K2, V2, T1), (K1, V11, T2); Data2: (K1, V1, T1), (K1, V11, T2), (K3, V3, T3); Data3: (K1, V1, T1), (K2, V2, T1); Data4: (K3, V3, T3), (K2, V22, T4). Compaction keeps only the latest version of each key on Full_bk_lun: (K1, V11, T2), (K2, V22, T4), (K3, V3, T3).
Keeping a single copy of data in the backup yields roughly a 66% reduction in backup storage requirements.
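A sketch of how this overlay could be realized with unionfs-fuse: the cloned data directories become read-only lower branches, and the full-backup LUN becomes the writable top layer, so a single Cassandra instance owning all production tokens writes the deduplicated SSTables to Full_bk_lun during major compaction. Mount points and branch paths are assumptions:

    import subprocess

    branches = "/mnt/full_bk=RW:/mnt/sn1=RO:/mnt/sn2=RO:/mnt/sn3=RO:/mnt/sn4=RO"
    subprocess.run(["unionfs-fuse", "-o", "cow", branches, "/cassandra/data"],
                   check=True)
    # Point data_file_directories in cassandra.yaml at /cassandra/data and set
    # initial_token to all tokens from the backup metadata, then:
    proc = subprocess.Popen(["cassandra", "-f"])  # wait for startup in practice
    subprocess.run(["nodetool", "compact"], check=True)  # major compaction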
Cassandra Restore
The post-process step enables cloning to different restore/clone topologies. For example, the full backup (Full_bk_lun, holding (K1, V11, T2), (K2, V22, T4), (K3, V3, T3)) can be cloned to a two-node cluster where N1 owns Token1/Token2 and N2 owns Token3/Token4.
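A hedged sketch of planning such a restore: divide the saved production tokens among the new nodes (each node would set them as its initial_token in cassandra.yaml and receive a clone of Full_bk_lun). The metadata layout matches the earlier LWB sketch; config handling is simplified:

    import json

    def restore_plan(metadata_path: str, new_nodes: list) -> dict:
        with open(metadata_path) as f:
            tokens = sorted(t for info in json.load(f).values()
                            for t in info["tokens"])
        # Round-robin the production tokens across the (fewer or more) new nodes.
        plan = {n: [] for n in new_nodes}
        for i, tok in enumerate(tokens):
            plan[new_nodes[i % len(new_nodes)]].append(tok)
        return plan  # node -> tokens; each node also gets a clone of Full_bk_lun

    print(restore_plan("backup_metadata.json", ["N1", "N2"]))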
Evaluation - Cassandra Backup and Restore
Setup:
- Production cluster: 4 nodes, 4 iSCSI LUNs; commitlog and SSTables for a node on the same LUN; Cassandra 4.0
- Post-process node: 2 CPUs, 8 GB RAM
- YCSB used to ingest data
Full backup results:
- LWB: <10 secs
- PP-flush: ~40 secs
- PP-compact: time increases by 35-40%, motivating incremental backups
- Restore time: less than ~2 mins (irrespective of cluster size and data set size)
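For context, a YCSB ingest for such an evaluation typically looks like the following (hosts, record count, and workload are placeholders, not the parameters used here):

    import subprocess

    subprocess.run(["bin/ycsb", "load", "cassandra-cql",
                    "-p", "hosts=c1,c2,c3,c4",
                    "-p", "recordcount=1000000",
                    "-P", "workloads/workloada"], check=True)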
BARNS Solution: MongoDB (Master-Slave Distributed Database)
- Check the paper or just attend the poster session :-)
Summary
- Tracking replicas and cluster topologies is important for taking backups and performing flexible-topology restores.
- Existing open-source solutions have several inefficiencies: the need for repairs after restore, lack of storage efficiency in backups, and poor integration with shared storage.
- Opportunity: provide efficient backup and restore through light-weight snapshots and clones.
Thank You.