MongoDB Backup and Recovery Field Guide
Tim Vaillancourt, Sr. Technical Operations Architect, Percona
`whoami`
{ name: tim, lastname: vaillancourt, employer: percona, techs: [ mongodb, mysql, cassandra, redis, rabbitmq, solr, python, golang ] }
Agenda
- History
- Methods
  - Logical
  - Binary: Cold, LVM, Hot Backup
- Integrity / Consistency
- mongodb_consistent_backup
- Architecture
- Restore and Validation
History
- 3000-4000 BC: Culturally significant data backed up in a universal format
- 1400: The Printing Press
- 1600-1800: Chapultepec Aqueduct
- 1990s: Floppy and Zip Disks
- 2000s: No more Floppy/Zip Disks
- Present: All my data is on Google Drive and I have 7 days of hourly Time Machine backups!
- Future: ?
Replication != Backup
- Replication is not a backup! Replication is High Availability
- Including:
  - Binary/Statement-based Replication of any type
  - Delayed Replication***
  - RAID Arrays
<EOF>
Backup Methods
Logical Backups
- Tools
  - mongodump
    - Uses find() queries with $snapshot to back up all collections
    - Supports Gzip and Threading in 3.2+
    - Outputs a directory containing BSON files in various subdirectories
  - Custom Queries
    - The client API could be used similarly to mongodump to perform logical backups
- Benefits
  - Reduced storage footprint
  - Replication awareness
  - Compatibility
- Drawbacks
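A logical backup with mongodump could be scripted roughly as below. This is a minimal sketch: the helper name `build_mongodump_cmd`, the host and the output path are illustrative assumptions, while `--host`, `--out`, `--gzip` and `--numParallelCollections` are standard mongodump options (the last two need 3.2+).

```python
def build_mongodump_cmd(host, out_dir, gzip=True, parallel_collections=4):
    """Build (but do not run) a mongodump command line as a list of arguments."""
    cmd = ["mongodump", "--host", host, "--out", out_dir]
    if gzip:
        # Compress each BSON file as it is written (mongodump 3.2+)
        cmd.append("--gzip")
    # Number of collections dumped in parallel (mongodump 3.2+)
    cmd += ["--numParallelCollections", str(parallel_collections)]
    return cmd
```

To actually run it, pass the list to `subprocess.run(cmd, check=True)` on a host where mongodump is installed.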
Binary Backups: Cold Backup
- Very simple process
- Causes full outage to MongoDB instance!
- Process
  1. Stop mongod
  2. Copy and archive dbpath
  3. Start mongod
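The stop / archive / start process could be sketched as follows. The function name and the stop/start commands are assumptions (how mongod is stopped and started depends on the host, for example a service manager command); the `run` parameter exists only so the sketch can be exercised without a real mongod.

```python
import subprocess
import tarfile
from pathlib import Path

def cold_backup(dbpath, archive_path, stop_cmd, start_cmd, run=subprocess.check_call):
    """Stop mongod, archive the dbpath, then always try to start mongod again."""
    run(stop_cmd)  # full outage begins here
    try:
        with tarfile.open(archive_path, "w:gz") as tar:
            # Store files under the dbpath's own directory name
            tar.add(dbpath, arcname=Path(dbpath).name)
    finally:
        run(start_cmd)  # restart even if archiving failed
    return archive_path
```

The `finally` block keeps a failed archive step from leaving the instance down longer than necessary. The archive should be written outside the dbpath.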
Binary Backups: LVM / Filer / Cloud Disk
- Process
  1. If Non-Journalled: db.fsyncLock() (keep the session open)
  2. Create block-device snapshot
  3. Unlock the database: db.fsyncUnlock()
  4. Copy or archive the snapshot directory
  5. Remove block device snapshot (as quickly as possible!)
- LVM Snapshots have been demonstrated to cause up to 30%* write latency impact to disk due to COW
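The ordering of the snapshot steps could be sketched like this for LVM. The function and parameter names are hypothetical, and `lock`, `unlock` and `copy_snapshot` are callables supplied by the caller (for a journalled instance the lock and unlock callables can be no-ops). `lvcreate --snapshot` and `lvremove --force` are standard LVM commands; the snapshot size is an assumed value.

```python
import subprocess

def snapshot_backup(lock, unlock, copy_snapshot, origin_lv, snap_name,
                    snap_size="10G", run=subprocess.check_call):
    """Lock, snapshot, unlock, copy, then remove the snapshot."""
    # The snapshot device lives in the same volume group as the origin volume
    snap_dev = origin_lv.rsplit("/", 1)[0] + "/" + snap_name
    lock()  # e.g. db.fsyncLock(), only needed when non-journalled
    try:
        run(["lvcreate", "--snapshot", "--size", snap_size,
             "--name", snap_name, origin_lv])
    finally:
        unlock()  # e.g. db.fsyncUnlock(); never leave the database locked
    try:
        copy_snapshot(snap_dev)  # mount and copy/archive the snapshot
    finally:
        # Remove the snapshot quickly to limit the COW write penalty
        run(["lvremove", "--force", snap_dev])
```

The database is unlocked as soon as the snapshot exists, so writes are blocked only for the duration of `lvcreate`, not for the copy.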
Binary Backups: Hot Backup
- PSMDB or MongoDB Enterprise
  - Pay $$$ for MongoDB Enterprise or download PSMDB for free(!)
- Process
  1. db.adminCommand({ createBackup: 1, backupDir: "/data/mongodb/backup" })
  2. Copy/archive the output path
  3. Delete the backup output path
- NOTE:
  - RocksDB-based createBackup creates filesystem hardlinks whenever possible!
  - Delete RocksDB backupDir as soon as possible to reduce bloom filter overhead!
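The hot backup steps could be wrapped as below. This is a sketch under assumptions: `hot_backup` is a hypothetical helper, `admin_command` is any callable that sends a command document to the admin database (with PyMongo this would presumably be `client.admin.command`), and the script is assumed to run on the database host, since the backup directory is written there.

```python
import shutil

def hot_backup(admin_command, backup_dir, archive_base):
    """Run createBackup, archive the output path, then delete the output path."""
    # Same command document as the shell example; the command name must come first
    admin_command({"createBackup": 1, "backupDir": backup_dir})
    try:
        # Writes <archive_base>.tar.gz and returns its path
        return shutil.make_archive(archive_base, "gztar", root_dir=backup_dir)
    finally:
        # Delete the backup output path as soon as it has been archived
        shutil.rmtree(backup_dir)
```

`archive_base` should point outside `backup_dir`, otherwise the archive would be written into the directory it is archiving.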
Backup Integrity / Consistency
The Distributed Cluster Backup Problem
- mongodump is consistent for a single node only!
  - Common to most or all database techs in a sharded environment
- Problems:
  - Backup tools consider single-instance integrity only
  - Backups of different shards may complete at different times
  - Changes replicate asynchronously
  - Data may be balancing / moving in the cluster
- Risks:
  - Orphaned documents / references
  - Holes in data
Backups: mongodb_consistent_backup
- Python project by Percona-Lab for consistent backups
  - URL: https://github.com/percona-lab/mongodb_consistent_backup
  - Best-effort support, not a Percona Product
- Created to solve limitations in MongoDB backup tools:
  - Replica Set and Sharded Cluster awareness
  - Cluster-wide Point-in-time consistency
  - In-line Oplog backup (vs post-backup)
  - Notifications of success / failure
- Extra Features
  - Remote Upload (AWS S3, Google Cloud Storage and Rsync)
  - Archiving (Tar or ZBackup deduplication and optional AES-at-rest)
  - CentOS/RHEL7 RPMs and Docker-based releases (.deb soon!)
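To illustrate what cluster-wide point-in-time consistency means, the sketch below picks one cutoff timestamp that every shard's captured oplog has reached and discards anything later. It is a simplified illustration, not the tool's actual code: the function names are hypothetical, timestamps are plain integers rather than BSON Timestamps, and every shard is assumed to have at least one captured oplog entry.

```python
def consistent_cutoff(shard_oplogs):
    """Latest timestamp that every shard's captured oplog has reached."""
    return min(max(entry["ts"] for entry in entries)
               for entries in shard_oplogs.values())

def trim_to_cutoff(shard_oplogs, cutoff):
    """Keep only the oplog entries at or before the common cutoff."""
    return {shard: [e for e in entries if e["ts"] <= cutoff]
            for shard, entries in shard_oplogs.items()}
```

Replaying each shard's trimmed oplog on top of its dump would leave all shards at the same point in time, even though the dumps themselves finished at different times.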
Backups: mongodb_consistent_backup
1.2.0 / Future
- Multi-threaded Rsync Upload
- Replica Set Tags support
- Support for MongoDB SSL / TLS connections and client auth
- Rotation / Expiry of old backups (locally-stored only)
- Incremental Backups
- Binary-level Backups (Hot Backup, Cold Backup, LVM, Cloud-based, etc)
- More Notification Methods (PagerDuty, Email, etc)
- Restore Helper Tool
- Instrumentation / Metrics
- <YOUR AWESOME IDEA HERE>: we take GitHub PRs (and it's Python)!
Backup Architecture
Architecture: Simple Example
- Method
  - Run mongodump (with --oplog) using a plain secondary
  - Store backups with on-site remote storage (filer, rsync, etc)
- Potential Issues
  - Application Impact
    - I/O and CPU impact due to backups may affect application
    - Storage-engine and FS caches will become dirty
  - Primary Failure
    - A failure of the Primary may cause the Secondary backing-up to become Primary
    - This can be avoided by using a Read Preference of secondary (supported in recent mongodump versions)
  - No Disaster Recovery
Architecture: Tag-Based Example
- Replica Set Tags
  - Allow selection of MongoDB nodes using key/value pairs
  - Represented as a single JSON document
  - Multiple key/value pairs are possible
- Example: Backup from "west" only
  - Tag a single node with a tag such as { location: west }
  - Use a Read Preference Tag in mongodump/mongodb_consistent_backup to target that specific node
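Targeting a tagged secondary with mongodump could look roughly like the sketch below. The helper names and the `location: west` tag are illustrative. `--readPreference` and `--oplog` are real mongodump options; passing the preference as a JSON document with `mode` and `tagSets` follows the mongodump documentation for recent versions, so check it against the version in use. The host is assumed to be given in replica-set form (for example `rs0/host1:27017,host2:27017`) so that the read preference can take effect.

```python
import json

def read_preference_arg(mode, tags=None):
    """Serialise a read preference document for mongodump's --readPreference."""
    pref = {"mode": mode}
    if tags:
        # A tag set is one document of key/value pairs
        pref["tagSets"] = [tags]
    return json.dumps(pref)

def tagged_mongodump_cmd(host, out_dir, tags):
    """Build a mongodump command that reads from a secondary matching the tags."""
    return ["mongodump", "--host", host, "--oplog", "--out", out_dir,
            "--readPreference", read_preference_arg("secondary", tags)]
```

With `tags={"location": "west"}`, only secondaries carrying that tag are eligible to serve the dump.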
Architecture: Offsite Backup Example
- Example
  - Create backup within local datacenter
  - Upload completed backups to other datacenter, cloud, etc
  - mongodb_consistent_backup supports Amazon S3, Google Cloud Storage and Rsync for remote upload!
- Benefits
  - Fast backup time due to in-datacenter latency
- Drawbacks
  - A full backup is uploaded for each backup job
Architecture: Disaster Recovery Example
- Example
  - Place a SECONDARY node in another location
    - Dedicated node is recommended to reduce impact
    - hidden:true recommended
  - Run backup from off-site SECONDARY member
  - Optionally upload to Cloud Storage
- Benefits
  - Only changes (replication) replicated to offsite location
  - Potentially faster uploads to Cloud Storage
- Drawbacks
  - Bootstrap / Initial Sync may use high bandwidth (if not seeded by backup)
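The replica set member document for such an off-site node could be shaped as below. The helper name, host and `location` tag are illustrative assumptions. Hidden members must have priority 0 in MongoDB, which is why both fields are set together.

```python
def dr_member(member_id, host, location):
    """Build a replica set member document for a hidden, off-site backup node."""
    return {
        "_id": member_id,
        "host": host,
        "priority": 0,   # required for hidden members; never becomes Primary
        "hidden": True,  # invisible to normal application reads
        "tags": {"location": location},
    }
```

The resulting document would be added to the `members` array of the replica set configuration before reconfiguring the set.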
Restore and Validation
"It's not a backup system, it's a restore system" ~ Raymond Blum, Google SRE
Restoring and Validation
- Methodology
  - Optimise restore time, not backup run time
  - Users and business care how fast their data is back, not how long it takes to backup
  - Binary-level backups are much faster to restore in MongoDB
- Validation
  - This is very application specific
  - Random sample restored data and validate
  - Example: Compare to Production
    - Compare real Production item, user, article, etc to backup
    - Ensure backup age doesn't cause false alarms, ie: test data older than backup
  - Example: Integration Test / QA
    - Run code integration tests or QA on restored data
  - Example: Production Backup as Test Data
    - Copy Production Data to Test periodically using backups
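The "Compare to Production" check could be sketched as follows, using plain dictionaries keyed by `_id` in place of real collections. The function name and the `updated_at` field are assumptions; any field that records when a document last changed would serve the same purpose of avoiding false alarms from data newer than the backup.

```python
import random

def validate_sample(production, restored, backup_time, sample_size, seed=None):
    """Return the sampled _ids whose restored copy is missing or differs."""
    rng = random.Random(seed)
    # Only documents unchanged since the backup can be compared fairly
    eligible = sorted(_id for _id, doc in production.items()
                      if doc["updated_at"] <= backup_time)
    sample = rng.sample(eligible, min(sample_size, len(eligible)))
    return [_id for _id in sample if restored.get(_id) != production[_id]]
```

An empty result means every sampled document matched; documents modified after `backup_time` are never sampled, so they cannot raise a false alarm.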
Thank You Sponsors!
SAVE THE DATE! April 23-25, 2018, Santa Clara Convention Center. CALL FOR PAPERS OPENING SOON! www.perconalive.com
Questions?