To Shard or Not to Shard: That Is the Question! Peter Zaitsev, April 21, 2016
Story: Let's start with the story.
First things to decide: Before you decide how to shard, you'd best understand whether or not you really need to shard.
Modern Technology: Can go much further without sharding.
Single MySQL Can Do: 100K+ queries per second; 100K+ rows inserted/updated/deleted per second; 5M+ rows scanned per second; 10K+ concurrent connections; 10TB+ data size.
MySQL 5.7 Performance Improvements: Sysbench benchmark. Starts with 8 threads. What about 2-4 threads? (Information from Oracle OpenWorld presentation by Geir Hoydalsvik.)
Let's do some math: 3M daily active users; 30 interactions per user per day; 10 queries per interaction; 3x peak versus average use.
How many QPS would it be? 31,250 queries/sec at peak.
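The arithmetic behind that figure can be checked with a few lines of Python. This is only a sketch of the calculation on the slide; the function and parameter names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def peak_qps(daily_active_users, interactions_per_user,
             queries_per_interaction, peak_factor):
    """Estimate peak queries per second from daily usage figures."""
    queries_per_day = (daily_active_users
                       * interactions_per_user
                       * queries_per_interaction)
    # Scale the daily total by the peak factor, then spread it over one day.
    return queries_per_day * peak_factor / SECONDS_PER_DAY

# 3M users x 30 interactions x 10 queries = 900M queries per day;
# at 3x peak that is 2.7 billion / 86,400 seconds.
print(peak_qps(3_000_000, 30, 10, 3))  # 31250.0
```

Compared with the 100K+ queries per second quoted earlier for a single MySQL instance, this workload leaves a lot of headroom.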
Avoided Sharding: Enterprise with 200K+ employees, internal Drupal installation; e-commerce merchant with $10M+ sales per month.
Sharding = Pain: It is painful! Though you may have gotten used to it.
Sharding Pains: developer complexity; operational complexity; technology complexity; complex failures; complex performance profile.
MySQL Sharding: Especially painful.
Can't Avoid? Delay!
Strategies to Delay Sharding: architecture; functional partitioning; replication; caching; queueing; supplemental technologies.
Architecture: Building up from small blocks, each owning its data. Microservices.
Functional Partitioning: Keep separate data separate.
Replication: Scale reads. Beware that replicas are asynchronous. Consider Percona XtraDB Cluster.
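One common way to live with asynchronous replicas is to send a session's reads to the primary for a short time after it writes, and to the replicas otherwise. The sketch below illustrates that idea only; the class, its methods, and the `pin_seconds` setting are hypothetical and not part of any MySQL tool, and a real deployment would measure actual replication lag rather than rely on a fixed window.

```python
import itertools
import time

class ReadWriteRouter:
    """Route writes to the primary and reads to replicas (hypothetical sketch)."""

    def __init__(self, primary, replicas, pin_seconds, clock=time.monotonic):
        if not replicas:
            raise ValueError("at least one replica is required")
        self._primary = primary
        self._replicas = itertools.cycle(replicas)  # simple round-robin
        self._pin_seconds = pin_seconds
        self._clock = clock
        self._last_write = {}  # session id -> time of that session's last write

    def for_write(self, session_id):
        # Remember when this session wrote so its next reads avoid stale replicas.
        self._last_write[session_id] = self._clock()
        return self._primary

    def for_read(self, session_id):
        last = self._last_write.get(session_id)
        if last is not None and self._clock() - last < self._pin_seconds:
            return self._primary  # recent writer: read your own writes
        return next(self._replicas)

router = ReadWriteRouter("primary", ["replica1", "replica2"], pin_seconds=5)
print(router.for_read("alice"))   # a replica
print(router.for_write("alice"))  # primary
print(router.for_read("alice"))   # primary, because alice just wrote
```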
Caching: Scale reads. Query cache; application server cache; Memcache/Redis; summary tables; HTTP cache.
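As an illustration of an application-side cache, here is a minimal cache-aside sketch with a time-to-live. An in-process dictionary stands in for Memcache or Redis, and `load_user_from_db` is a hypothetical placeholder for a real query.

```python
import time

class TTLCache:
    """Cache-aside helper: return a cached value or load and store it."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expiry time, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]           # cache hit: the database is not touched
        value = loader(key)           # cache miss or expired: go to the database
        self._store[key] = (now + self._ttl, value)
        return value

def load_user_from_db(user_id):
    # Hypothetical stand-in for a SELECT against MySQL.
    return {"id": user_id, "name": f"user-{user_id}"}

cache = TTLCache(ttl_seconds=60)
first = cache.get_or_load(42, load_user_from_db)   # loads from the "database"
second = cache.get_or_load(42, load_user_from_db)  # served from the cache
print(first is second)  # True
```

The trade-off is staleness: readers may see data up to `ttl_seconds` old unless writes also invalidate the cached key.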
Queueing: Scale writes. Balance demand spikes. Batch work. Redis, RabbitMQ, ActiveMQ, Kafka.
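The "batch work" idea can be sketched as a buffer that collects individual writes and hands them to the database in groups. This is a simplified in-process illustration; in practice one of the queueing systems listed above would sit between producers and the writer, and the `flush` callback here is a hypothetical stand-in for a multi-row INSERT.

```python
class WriteBuffer:
    """Collect rows and flush them in batches (hypothetical sketch)."""

    def __init__(self, flush, batch_size):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._flush = flush          # callable that receives a list of rows
        self._batch_size = batch_size
        self._pending = []

    def add(self, row):
        self._pending.append(row)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self):
        # Send whatever is pending as one batch; do nothing if empty.
        if self._pending:
            batch, self._pending = self._pending, []
            self._flush(batch)

batches = []
buffer = WriteBuffer(flush=batches.append, batch_size=3)
for event_id in range(7):
    buffer.add({"event_id": event_id})
buffer.flush()  # push the final partial batch
print([len(b) for b in batches])  # [3, 3, 1]
```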
Beyond MySQL: Analytics (Hadoop, Vertica, Spark); full text search (ElasticSearch, Sphinx, Solr); document store (MongoDB, CouchBase, RethinkDB).
Optimize! Do simple optimizations before you decide to shard.
Hardware: fast CPUs; plenty of memory; fast flash storage; good network (keep it close).
Environment: Linux is the most common OS. New MySQL versions scale better. Use a recent GA version (MySQL 5.7). Consider Percona Server and PXC.
Configuration: Configure MySQL Server properly (http://bit.ly/1j8ljad). What storage engine is right for you? Consider TokuDB for high compression.
Sharding When? Too early: waste resources. Too late: run into the wall.
Architectural Runway: Sharding is an architecture consideration. Make it part of your architecture runway planning. How long would it take you to implement sharding?
Capacity Planning: Know where your wall is! Be conservative in your estimates! Do not plan for linear scalability!
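A rough way to turn "know where your wall is" into a number is to project load growth against a deliberately discounted capacity limit. The sketch below assumes compound monthly growth and a safety factor; both the model and the names are illustrative assumptions, not something taken from the talk.

```python
def months_until_wall(current_load, wall, monthly_growth, safety_factor):
    """Count whole months until load reaches the usable part of the wall.

    current_load and wall use the same unit (for example peak QPS).
    monthly_growth is a fraction, e.g. 0.10 for 10% per month.
    safety_factor discounts the wall to keep the estimate conservative.
    """
    if monthly_growth <= 0:
        raise ValueError("monthly_growth must be positive")
    usable_wall = wall * safety_factor
    months = 0
    load = current_load
    while load < usable_wall:
        load *= 1 + monthly_growth
        months += 1
    return months

# Hypothetical figures: 10K peak QPS today, a measured wall of 100K QPS,
# load doubling every month, and only 80% of the wall treated as usable.
print(months_until_wall(10_000, 100_000, 1.0, 0.8))  # 3
```

If the result is shorter than the time it would take to implement sharding, the project is already late.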
Benefits of Sharding: Yes, there are!
Ultimate Scalability: The only way to scale to Facebook scale.
Avoid other complexities: complex caching layer; asynchronous replication for scaling.
Isolation: security; compliance; keeping data close to the user.
Costs: Can use lower-power systems. Especially important in the cloud.
When to Shard, Summary: It is easy in your case (think development and operations). Scaling up is impossible or too expensive. Your application's growth makes sharding imminent. Enterprise? Cloud? Other optimizations give too short a runway to care.
Sharding Questions: sharding level; sharding key; sharding unit; sharding HA; sharding technology.
Sharding Level: Database level? Deployment unit level?
Sharding Key(s): Most small accesses go to a single shard. No shard is too large in terms of data or load. May double-store data with different sharding keys if needed.
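As an illustration of routing by sharding key, the sketch below maps a key such as a user ID to a shard number with a stable hash. Hash-based placement is just one possible scheme (range or directory lookups are others), and the function name and shard count are hypothetical.

```python
import hashlib
from collections import Counter

def shard_for(sharding_key, num_shards):
    """Map a sharding key to a shard number in [0, num_shards)."""
    # Use a stable hash: Python's built-in hash() of a string changes
    # between processes, which would send the same key to different shards.
    digest = hashlib.sha256(str(sharding_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# All rows for one user land on one shard, so small per-user accesses
# touch a single shard.
print(shard_for("user:1001", 8) == shard_for("user:1001", 8))  # True

# Spread of 10,000 hypothetical user IDs over 8 shards.
spread = Counter(shard_for(f"user:{i}", 8) for i in range(10_000))
print(sorted(spread.items()))
```

A plain modulo like this makes changing `num_shards` expensive, because most keys move; that is one reason to route through a larger fixed number of logical shards instead.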
Sharding Unit: shard = physical MySQL instance; shard = schema; multiple shards per schema/table.
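The "shard = schema" option can be pictured as many logical shards, each its own schema, placed on a smaller number of MySQL instances. The sketch below shows such a placement table; host names and the schema naming pattern are invented for illustration.

```python
LOGICAL_SHARDS = 8

# Hypothetical placement: 8 schema-level shards spread over 2 instances.
placement = {
    shard: f"db{shard % 2 + 1}.example.internal"
    for shard in range(LOGICAL_SHARDS)
}

def locate(logical_shard, placement):
    """Return the (host, schema) pair that holds a logical shard."""
    host = placement[logical_shard]
    schema = f"shard_{logical_shard:03d}"
    return host, schema

def move_shard(placement, logical_shard, new_host):
    """Return a new placement with one shard reassigned to another host."""
    updated = dict(placement)
    updated[logical_shard] = new_host
    return updated

print(locate(3, placement))  # ('db2.example.internal', 'shard_003')

# Adding a third instance means copying one schema and updating one entry;
# the schema name, and so the application's view of the shard, is unchanged.
placement = move_shard(placement, 3, "db3.example.internal")
print(locate(3, placement))  # ('db3.example.internal', 'shard_003')
```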
Sharding HA: Many servers = higher chance of failure. Sharding increases the need for HA. Sharding over master-slave clusters; sharding over PXC clusters.
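The point that more servers mean a higher chance of failure can be made concrete with a small probability calculation. The sketch assumes servers fail independently and with the same probability, which is a simplification, and the 1% figure is purely hypothetical.

```python
def chance_of_any_failure(per_server_failure_probability, num_servers):
    """Probability that at least one of num_servers fails in a given period."""
    # All servers survive with probability (1 - p) ** n; anything else
    # means at least one failure.
    return 1 - (1 - per_server_failure_probability) ** num_servers

# Hypothetical: each server has a 1% chance of failing in a given month.
for servers in (1, 10, 50):
    print(servers, round(chance_of_any_failure(0.01, servers), 3))
```

With one server a failure in the period is unlikely; with 50 shards some failure becomes close to a 40% event, which is why each shard usually needs its own HA setup.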
Sharding Technology: roll-your-own; Vitess; Jetpants; Shard-Query; Clustrix; MySQL Cluster.
Sharding Technology: MySQL Fabric (official solution from the MySQL team at Oracle, open source); Tesora Database Virtualization Engine (automated, open source); ScaleArc (rule based, commercial); MariaDB MaxScale (basic routing).
In Summary: There are multiple technologies for sharding. There is no standard solution used across the board.
Thank you! pz@percona.com https://www.linkedin.com/in/peterzaitsev https://twitter.com/peterzaitsev