Mike Kania Engineer @ Truss http://truss.works/
MongoDB on AWS With Minimal Suffering
Topics
- Provisioning MongoDB Replica Sets on AWS
- Choosing storage and a storage engine
- Backups
- Monitoring
- Capacity Planning
Why Shouldn't You Listen to Me?
MongoDB is a jack of all trades, and there are certain features that I haven't touched.
- Sharding: built a custom way to shard data
- Aggregation/Map Reduce: didn't touch this at all
Why Should You Listen to Me?
- Ran one of the most complex MongoDB infrastructures (in the world?)
- Started using MongoDB in v1.8
- Ran MongoDB on a platform that supported hundreds of thousands of developers
- Did crazy shit with MongoDB
What happens when you manage MongoDB for 2 years
Do You Need to Run Your Own MongoDB?
Hosted MongoDB services do exist and they aren't terrible, but there is a cost.
- mLab (formerly MongoLab)
- ObjectRocket
- Compose.io
- Parse (RIP)
I thought we were going to talk about MongoDB on AWS
Getting Your AWS House in Order
- Use VPC(s)
  - EC2 Classic is full of problems
  - VPC is better in every way except for IPv6
- Run all your servers using hardware virtual machine (HVM) AMIs
- Use Enhanced Networking on the instance types that support it
  - Way better packets-per-second limits
  - Less packet loss
Starter Replica Set
- 1 primary and 2 secondaries
- In at least 3 Availability Zones, because redundancy
- Each node has one vote (default)
- 1 secondary is a dedicated backup node and should never become primary
  - hidden = true
  - priority = 0
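The starter layout above can be written down as a replica set configuration document. This is a minimal sketch: the set name `rs0` and all host names are hypothetical, and the field names (`_id`, `members`, `host`, `priority`, `hidden`, `votes`) follow MongoDB's replica set configuration format.

```python
def starter_replica_set_config(set_name, hosts, backup_host):
    """Build a replica set config with one hidden, priority-0 backup member.

    `hosts` are the electable members; `backup_host` is the dedicated backup
    secondary. Every host name passed in here is hypothetical.
    """
    members = []
    for i, host in enumerate(hosts):
        members.append({"_id": i, "host": host, "priority": 1, "votes": 1})
    # The backup node still votes, but it is hidden from clients and can
    # never be elected primary.
    members.append({
        "_id": len(hosts),
        "host": backup_host,
        "priority": 0,
        "hidden": True,
        "votes": 1,
    })
    return {"_id": set_name, "members": members}


config = starter_replica_set_config(
    "rs0",
    ["db1.us-east-1a.example:27017", "db2.us-east-1b.example:27017"],
    "backup1.us-east-1c.example:27017",
)
```

With a driver such as pymongo, a document like this could be passed to the `replSetInitiate` admin command on the first node; that step is left out so the sketch runs without a database.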
Replica Set
SERVER-17882 - Update with key too large to index crashes WiredTiger/RockDB secondary
Replica Set
Arbiters
- mongod processes that do nothing but vote
- Highly reliable and mostly stateless
- Easy to run multiple arbiter instances on a single host
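Since an arbiter only contributes a vote, the arithmetic behind it is worth spelling out: electing a primary takes a strict majority of voting members. A small sketch of that calculation (the function names are made up for illustration):

```python
def majority(voting_members):
    """Votes needed to elect a primary: a strict majority of all voters."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members):
    """How many voters can be lost while a primary can still be elected."""
    return voting_members - majority(voting_members)


for n in (2, 3, 4, 5):
    print(n, majority(n), fault_tolerance(n))
```

Two data-bearing nodes alone tolerate zero failures; adding one arbiter brings the voter count to three and the tolerance to one, without storing a third copy of the data.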
Arbiters
Storage Engines
- Choosing a storage engine will influence your options around instance types and disk storage
- It also depends on how you use MongoDB
  - Read-heavy: MMAP, WiredTiger
  - Write-heavy: RocksDB
- If you have an existing workload, use Flashback to benchmark each storage engine
MMAP
- Only storage engine that existed before v3.0
- Most stable of all the storage engines
- It is what it says: memory-mapped files
- Uses B-trees as the data structure
- No performance tuning; trusts that the underlying OS will do the right thing
- No compression
- Collection-level locking
WiredTiger
- The next-generation storage engine, and the default in 3.2
- B-trees in 3.0 and 3.2
- Up to 10x compression over MMAP
- Document-level locks
- Good for mixed read/write workloads
RocksDB
- RocksDB is an embedded database written and deployed at Facebook
- Uses log-structured merge (LSM) trees as the underlying data structure
- Designed for fast writes
- Document-level locks
- Up to 10x compression over MMAP
- Incremental backups with Strata
- Supported by Percona
Disk Storage
- EBS
  - Built-in block-level snapshots and restores
  - Up to 20,000 IOPS per volume and overall max throughput of 800MB/sec
  - No longer need to RAID unless you have a dataset > 16TB!
  - Restoring from snapshot requires an expensive pre-warm step
- Ephemeral Storage
  - Up to 315,000 write IOPS and 365,000 read IOPS
  - Lives and dies with the EC2 instance
  - Low network latencies
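The pre-warm step amounts to reading every block of the restored volume once, so that later reads do not pay the first-touch penalty. A minimal sketch of such a sequential read in Python, assuming the caller has permission to read the device; the device path in the docstring is hypothetical.

```python
def prewarm(path, chunk_size=1024 * 1024):
    """Read `path` sequentially from start to end, discarding the data.

    Returns the number of bytes read. On a restored volume `path` would be
    a block device such as /dev/xvdf (a hypothetical name).
    """
    total = 0
    # buffering=0 gives unbuffered raw reads, one chunk per read call.
    with open(path, "rb", buffering=0) as device:
        while True:
            chunk = device.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

A single-threaded read like this is slow on large volumes, which is where the expense of the step comes from.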
EBS
- Standard
  - Never use these
  - Spinning drives are so 2012
- GP2
  - Max 10,000 IOPS per EBS volume
  - Max 160MB/sec per volume
  - SSD backed but not as performant as PIOPS
- Provisioned IOPS
  - Max 20,000 IOPS per EBS volume
  - Max 320MB/sec per volume
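To relate these ceilings to volume size, here is a rough sizing sketch for GP2. The 10,000 IOPS cap is the figure from the slide; the baseline of 3 IOPS per GiB and the 100 IOPS floor are assumptions taken from AWS's published GP2 behavior at the time, not from this talk.

```python
GP2_MAX_IOPS = 10_000     # per-volume cap, from the slide
GP2_IOPS_PER_GIB = 3      # assumption: GP2 baseline ratio, not from the slides
GP2_MIN_IOPS = 100        # assumption: GP2 baseline floor, not from the slides


def gp2_baseline_iops(size_gib):
    """Baseline IOPS for a GP2 volume of the given size."""
    return max(GP2_MIN_IOPS, min(GP2_MAX_IOPS, size_gib * GP2_IOPS_PER_GIB))


def gp2_size_for_iops(target_iops):
    """Smallest GP2 volume (GiB) whose baseline meets `target_iops`."""
    if target_iops > GP2_MAX_IOPS:
        raise ValueError("target exceeds the GP2 per-volume cap")
    # Ceiling division without floating point.
    return max(1, -(-target_iops // GP2_IOPS_PER_GIB))
```

Under these assumptions, a workload that needs more than 10,000 IOPS from one volume is past what GP2 can deliver and falls into Provisioned IOPS territory.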
Ephemeral Storage
- Use i2 instance class
- Use only with RocksDB and Strata for backups
- No need to pre-warm
- No reliance on network for storage
- Low latency and high throughput
- Goes POOF with the EC2 instance
Backups
- EBS Snapshots
  - Incremental backups at the block level
  - Works on all storage engines
  - Impacts performance
- Strata
  - Open source backup tool used by Parse
  - Incremental backups to S3
  - Supports RocksDB only
EBS Snapshots
- Avoid RAIDing EBS volumes
  - 16TB is more than enough for anyone
- Dedicated secondary for backups
  - Set priority = 0
  - Set hidden = true
- Lock mongo with db.fsyncLock(), or use xfs_freeze if using XFS
- Restoring from snapshot comes with a pre-warming cost
  - Use dd or fio to read all the blocks
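The lock, snapshot, unlock sequence is easy to get wrong if the snapshot call fails halfway. A sketch of the control flow, with the actual MongoDB and AWS calls injected as plain callables so the ordering can be tested without either service; all function and parameter names here are hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def writes_locked(lock, unlock):
    """Hold the mongod write lock for the duration of the `with` block."""
    lock()
    try:
        yield
    finally:
        # Always unlock, even if the snapshot call raises; a node left
        # locked cannot apply replicated writes and falls behind.
        unlock()


def snapshot_backup_node(lock, unlock, snapshot_volume, volume_ids):
    """Lock the backup secondary, start a snapshot per volume, then unlock."""
    with writes_locked(lock, unlock):
        return [snapshot_volume(volume_id) for volume_id in volume_ids]
```

With pymongo and boto3, and assuming MongoDB 3.2 or newer, `lock` might be `lambda: client.admin.command("fsync", lock=True)`, `unlock` might be `lambda: client.admin.command("fsyncUnlock")`, and `snapshot_volume` might be `lambda v: ec2.create_snapshot(VolumeId=v)`. Those bindings are illustrative and untested here.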
Strata
- No need for a dedicated backup node
- Use with ephemeral storage
- Turn on Cross-Region Replication for the S3 bucket
- Running a strata backup to S3
    strata backup -r=mydata-db1 -b=s3-mongo-backups
- Prune metadata for backups older than a certain age
    strata delete -r=mydata-db1 -b=s3-mongo-backups -a=720h
- Delete data files that are orphaned by strata delete
    strata gc -r=mydata-db1 -b=s3-mongo-backups
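The three commands above form a natural scheduled sequence: back up, prune old metadata, then garbage-collect orphaned files. A sketch that builds those command lines for a scheduler to run; the flags are copied from the slide as-is and have not been checked against the Strata CLI, and the helper names are hypothetical.

```python
def strata_command(action, replica_id, bucket, max_age=None):
    """Build the argv for one strata invocation, using the slide's flags."""
    if action not in ("backup", "delete", "gc"):
        raise ValueError("unknown strata action: " + action)
    argv = ["strata", action, "-r=" + replica_id, "-b=" + bucket]
    if max_age is not None:
        if action != "delete":
            raise ValueError("max_age only applies to 'delete'")
        argv.append("-a=" + max_age)
    return argv


def backup_cycle(replica_id, bucket, keep="720h"):
    """Backup, then prune old metadata, then collect orphaned data files."""
    return [
        strata_command("backup", replica_id, bucket),
        strata_command("delete", replica_id, bucket, max_age=keep),
        strata_command("gc", replica_id, bucket),
    ]
```

Each argv list could be handed to `subprocess.run(argv, check=True)` in order, stopping at the first failure so that a failed backup is never followed by a prune.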
Planning for More Mongo
- Try to keep replica sets to 2-3TB in size
  - Allows a reasonable time for initial syncs and restoring from backup
- If running MMAP, keep an eye on lock % as the metric for adding more capacity
  - > 50% means it's time to add more
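These two rules of thumb are simple enough to encode as an alerting check. A sketch using the thresholds from the slide (3TB as the upper size bound, 50% lock); the function name and the message wording are made up for illustration.

```python
MAX_REPLICA_SET_TB = 3.0      # upper end of the 2-3TB guideline
MMAP_LOCK_PCT_LIMIT = 50.0    # lock % above which to add capacity


def capacity_warnings(data_size_tb, mmap_lock_pct=None):
    """Return a list of reasons this replica set needs more capacity.

    `mmap_lock_pct` is only meaningful on MMAP; pass None for other engines.
    """
    warnings = []
    if data_size_tb > MAX_REPLICA_SET_TB:
        warnings.append("data size above 3TB: initial sync and restore will be slow")
    if mmap_lock_pct is not None and mmap_lock_pct > MMAP_LOCK_PCT_LIMIT:
        warnings.append("MMAP lock % above 50: time to add capacity")
    return warnings
```

An empty list means neither threshold is crossed.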
Configuration
- Chef/Ansible/Salt
  - Pick your poison... as long as it's not CFEngine
  - Parse chose Chef, but eventually grew out of it
- Most important lesson learned was limiting scope to managing system-level configs and services
  - ulimits
  - cronjobs
  - sysctl
Managing Running Replica Sets
- Mongoctl
  - Keeps state in a JSON file or database
  - Allows for centralizing commands like rsyncsecondary or configure-cluster
- Write your own
  - Open source it, because more tools like this need to exist
CloudFormation
- Suited for managing stateless services or services that don't change often
  - Auto Scaling Groups
  - VPCs/security groups/NAT gateways
- Orchestration becomes difficult once you start dealing with stateful EBS volumes
- Also doesn't really fit the need for managing cluster state
  - Adding/removing nodes
  - Changing priorities
  - Rolling upgrades
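Cluster-state operations such as a rolling upgrade are ordered procedures rather than declarative state, which is part of why they sit awkwardly in a template. A sketch of just the ordering step, assuming the usual MongoDB practice of upgrading secondaries one at a time and stepping the primary down last; the member dictionaries and the plan format are hypothetical.

```python
def rolling_upgrade_plan(members):
    """Order upgrade steps: secondaries first, primary stepped down last.

    Each member is a dict with "host", "state" ("PRIMARY", "SECONDARY" or
    "ARBITER") and an optional "priority". Arbiters are left out of this
    sketch.
    """
    primaries = [m for m in members if m["state"] == "PRIMARY"]
    if len(primaries) != 1:
        raise ValueError("expected exactly one primary")
    secondaries = [m for m in members if m["state"] == "SECONDARY"]
    # Lowest priority first, so hidden/backup nodes are upgraded before
    # the nodes most likely to take over as primary.
    secondaries.sort(key=lambda m: m.get("priority", 1))
    plan = [("upgrade", m["host"]) for m in secondaries]
    plan.append(("step_down", primaries[0]["host"]))
    plan.append(("upgrade", primaries[0]["host"]))
    return plan
```

A real tool would also wait for each node to catch up on replication before moving to the next step; that check is omitted here.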
Monitoring
- Log all queries to your favorite centralized logging service
- MMAP
  - Lock % is the biggest bottleneck
- RocksDB
  - Tombstones
  - CPU and disk I/O
Logging Pipeline
- Log all queries, not just the slow ones
- Open source logtailer written in Go that can output Mongo logs as consistent JSON
  - https://github.com/parseplatform/logtailer
- Fire into your favorite centralized logging service
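To illustrate what "consistent JSON" means for a Mongo log line, here is a minimal parser sketch. It is not how logtailer works internally; it assumes the MongoDB 3.x plain-text layout of timestamp, severity, component, bracketed context, and message, and the sample line is invented.

```python
import json
import re

LINE_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<severity>[FEWID])\s+(?P<component>\S+)\s+"
    r"\[(?P<context>[^\]]+)\]\s+(?P<message>.*)$"
)
DURATION_RE = re.compile(r"\s(\d+)ms$")


def parse_log_line(line):
    """Turn one mongod log line into a dict, or None if it does not match."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    record = match.groupdict()
    # Operation log lines end with the elapsed time, e.g. "... 105ms".
    duration = DURATION_RE.search(record["message"])
    record["duration_ms"] = int(duration.group(1)) if duration else None
    return record


sample = (
    '2016-03-01T12:00:00.123+0000 I COMMAND  [conn42] '
    'command test.users command: find { find: "users" } keysExamined:0 105ms'
)
print(json.dumps(parse_log_line(sample), sort_keys=True))
```

Having the duration as a numeric field is what makes it practical to log every query and filter by latency downstream, rather than relying on the slow-query threshold alone.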
Qs and As
Links
- RocksDB - http://rocksdb.org/
- Compose - compose.io
- ObjectRocket - http://objectrocket.com/
- mLab - https://mlab.com/
- Strata - https://github.com/facebookgo/rocks-strata
- Mongo Logtailer - https://github.com/parseplatform/logtailer
- Mongoctl - https://github.com/mongolab/mongoctl
- Terraform - https://www.terraform.io/
- CloudFormation - https://aws.amazon.com/documentation/cloudformation/