Building High Performance Apps using NoSQL. Swami Sivasubramanian General Manager, AWS NoSQL

Building high performance apps
- There is a lot to building high performance apps:
  - Scalability
  - Performance at high percentiles
  - Availability
- Database choice has a disproportionate impact on these

History of Databases in Amazon
- Amazon.com is a big startup
  - Composed of thousands of services
  - Each service is developed and operated independently
  - No central mandate or coordination (it slows down execution)
  - A vast ecosystem: survival of the fittest!
- We have gone through multiple iterations on the following question:
  - What is the right database architecture for Amazon apps?

Relational Era
- An Amazon.com page is composed of responses from 1000s of independent services
- Query patterns differ from service to service
  - Catalog service is usually heavy key-value
  - Ordering service is very write intensive (key-value)
  - Catalog search has a different pattern for querying
- However: usually, a relational database is the default database choice!

Relational Era (contd.)
- How did it go?
  - Poor availability
  - Poor scalability (Q4/Christmas was a big project)
  - Exorbitantly high costs for hardware, software and administration

Lessons
- A relational database is used even when it is not the right tool!
  - Didn't need all the query capabilities an RDBMS provides
- Need a database that can provide:
  - Extreme availability
  - Seamless scalability: no more re-architecture for planning
- Embrace failures and make them part of normal operations
- (Hey, this was the early 2000s, when these were not obvious)

Distributed Systems Era: Amazon Dynamo
- Replicated DHT with consistency management
  - Consistent hashing
  - Optimistic replication
  - Sloppy quorum
  - Anti-entropy mechanisms
  - Object versioning
- Specialist tool:
  - Limited query capabilities
  - Simpler consistency
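
The consistent hashing item above can be illustrated with a minimal sketch. This is not Dynamo's implementation: the `HashRing` class, the use of MD5, the virtual-node count and the node names are all assumptions chosen for illustration. It shows the one property that matters here: removing a node only moves the keys that node owned.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (a 128-bit integer from MD5)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """A consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        self._vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> physical node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        """Place `vnodes` points for this node on the ring."""
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node):
        """Drop every ring point owned by this node."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def preference_list(self, key, replicas=3):
        """Walk clockwise from the key and collect distinct physical nodes."""
        start = bisect.bisect(self._points, _hash(key))
        chosen = []
        for offset in range(len(self._points)):
            point = self._points[(start + offset) % len(self._points)]
            node = self._owner[point]
            if node not in chosen:
                chosen.append(node)
            if len(chosen) == replicas:
                break
        return chosen
```

Usage: `HashRing(["a", "b", "c", "d"]).preference_list("user-42")` returns three distinct node names; after `remove("d")`, keys whose first owner was not `"d"` keep the same first owner.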

Amazon Dynamo Usage
- Higher availability
- Incremental scalability
- Lower costs
- However: less query capability
- Adopted by services for which scale and availability are most important
- Dynamo inspired many other internal variants for distributed caching, messaging, etc.
- Services that needed complex queries used RDBMS

Amazon Dynamo: Lessons learned
- What could have been better?
  - Lack of strong consistency
    - Forced a model which may not fit every app
  - We forced every engineer to learn distributed systems
    - Version clocks, sloppy quorum, anti-entropy, cluster balancing
  - Operational complexity
    - Required each service to carry pagers
    - Manage their fleet
    - Deal with performance tuning
- In the end, Amazon developers wanted Dynamo as a service, not a product!

Cloud Era
- Time when AWS was just starting
- Developers loved:
  - Amazon S3 for storage
  - Amazon EC2 for compute
- Why?
  - Lets them focus on their app
  - Not deal with operations
- They wanted the equivalent of S3 for databases
  - Seamless scalability
  - No operational overhead

Cloud Era: Amazon DynamoDB
- Non-Relational
- Fast & Predictable Performance
- Seamless Scalability
- Easy Administration

We built DynamoDB to make developers' lives easier

Where is Amazon.com right now?
- NoSQL (DynamoDB) has been a huge central piece for Amazon
  - Most of the online workloads are using DynamoDB
  - No other solution meets our scale, availability and cost needs
- We use other cloud databases too!
  - RDS for relational workloads
  - ElastiCache for caching
  - Redshift for warehouse applications
  - EMR for analytics

So much for Amazon.com, what about AWS and its customers?

State of NoSQL in AWS: Brief Recap

DynamoDB: Looking ahead
- We will continue to invest in making sure DynamoDB continues to be:
  - Secure
  - Extremely reliable
    - Three datacenter replication
    - Synchronous replication
    - Extremely well tested replication pipeline
    - No compromise on reliability for costs or performance
  - Seamlessly scalable
  - Cost effective
    - Launched in April: 75% price drop for storage + 35% drop for throughput
    - Reserved capacity options: 1-year = 53% discount; 3-year = 76% discount
    - 4KB read capacity units
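
The "4KB read capacity units" item can be turned into a small sizing calculation. The sketch below rests on assumptions the slide does not spell out: one read capacity unit covers one strongly consistent read per second of an item up to 4 KB (taken here as 4096 bytes), larger items round up to the next 4 KB, and eventually consistent reads cost half as much. The function name `read_units_needed` is hypothetical.

```python
import math

READ_UNIT_BYTES = 4096  # assumption: "4KB" means 4096 bytes


def read_units_needed(item_bytes, reads_per_second, strongly_consistent=True):
    """Estimate the read capacity units to provision for a steady read rate."""
    # Each read is billed in whole 4 KB blocks, rounded up.
    blocks_per_read = math.ceil(item_bytes / READ_UNIT_BYTES)
    # Assumption: an eventually consistent read costs half a unit per block.
    cost_per_block = 1.0 if strongly_consistent else 0.5
    return math.ceil(reads_per_second * blocks_per_read * cost_per_block)
```

Under these assumptions, a 3,000-byte item read 100 times per second needs 100 units, while a 5,000-byte item at the same rate needs 200 units because each read spans two 4 KB blocks.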

DynamoDB: Looking ahead (contd.)
- We will continue to improve query capabilities
  - Launched Local Secondary Indexes (April 2013)
  - Launched Parallel Scan API (May 2013)
  - Launched geospatial indexing library today!
  - Lot more to come
- We will continue to reduce your operational overhead
  - Example: Dynamic DynamoDB, autoscale-dynamodb, etc.
- We will continue to integrate with other AWS services seamlessly
  - EMR integration
  - One-click copy to Redshift (Feb 2013)
  - Data Pipeline template to backup/restore (Mar 2013)
  - More to come

ElastiCache
- Managed caching service
- Offered memcached as a service
- Added Redis support yesterday!
- Look out for more caching features here!

Run your own database on EC2
- There is a rich ecosystem of NoSQL solutions in EC2
  - MongoDB
  - Cassandra
  - Riak
  - Graph databases
- Pick the right solutions based on your needs

Getting back to original question

How do I choose the right database for my app?

So many choices, what to pick? Choose the right tool for each job.

Redux
- Decision point #1: Optimize query patterns
- Decision point #2: Plan for (business) success
- Decision point #3: Plan for (infrastructure) failures
- Decision point #4: What is the operational expense for my pick?

Decision #1: Choosing the right query patterns
- Understand your app's query patterns carefully
- Identify which queries need to scale linearly with growth in user base
  - For those queries, pick a database architecture that scales linearly
  - Perf should be the same for a 10MB table, a 10GB table or a 10TB table
- If your db does not grow with your business growth
  - You are signing up for operational hell
  - Don't think about sharding as an afterthought

Decision #1: Choosing the right query patterns (contd.)
- Separate query patterns carefully
  - The interactive part of your app needs to perform well and scale
  - Avoid non-scalable queries in the interactive user workflow
- Good real-time query
  - Example: load user preferences, set user preferences
- Bad real-time query
  - Example: compute all friends of friends for user A who are interested in X
- For complex queries, pre-compute the results and store them in a cache
  - Example: compute recommendations for user A and store them in a cache
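
The split between an expensive offline query and a cheap interactive lookup might be sketched as follows. Everything here is a stand-in: plain dictionaries play the role of the social graph, the interest data and the cache, and the function names are hypothetical.

```python
def friends_of_friends_interested(user, topic, friends, interests):
    """Expensive fan-out query: friends of friends of `user` who like `topic`."""
    direct = friends.get(user, set())
    result = set()
    for friend in direct:
        for candidate in friends.get(friend, set()):
            if candidate == user or candidate in direct:
                continue  # skip the user and people they already know
            if topic in interests.get(candidate, set()):
                result.add(candidate)
    return result


def precompute(cache, friends, interests, topic):
    """Batch job: run the expensive query for every user and cache the result."""
    for user in friends:
        matches = friends_of_friends_interested(user, topic, friends, interests)
        cache[(user, topic)] = sorted(matches)


def lookup(cache, user, topic):
    """Interactive path: one key lookup, no graph traversal."""
    return cache.get((user, topic), [])
```

The interactive workflow only ever calls `lookup`, so its cost stays that of a single key-value read however large the graph grows; the price is that results are only as fresh as the last `precompute` run.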

Optimize Query Patterns
- For time series data
  - Separate cold data from hot data
  - Enables you to separate read-heavy workload from write-heavy workload
- Example: an ordering application is a great example of time series data
  - The past few days' orders are hot
  - 6-month-old orders are cold
- Recommendation:
  - Create an ordering table every week
  - Store recent orders in this week's table
  - Archive the old tables or dial down their read throughput
  - You can query across tables
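
One way to realize the table-per-week recommendation is to derive the table name from the order date. The naming scheme (`orders_<ISO year>_W<ISO week>`) and both helper functions below are assumptions made for illustration, not something the slides prescribe.

```python
from datetime import date, timedelta


def table_for(day: date, prefix: str = "orders") -> str:
    """Return the weekly table that holds orders placed on `day`."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"{prefix}_{iso_year}_W{iso_week:02d}"


def tables_between(start: date, end: date, prefix: str = "orders"):
    """List, in order, the weekly tables a query over [start, end] must touch."""
    names = []
    day = start
    while day <= end:
        name = table_for(day, prefix)
        if not names or names[-1] != name:
            names.append(name)
        day += timedelta(days=1)
    return names
```

Writes go to `table_for(today)`, while a query over a date range fans out across the names returned by `tables_between`. Older names in that list are the candidates for archiving or reduced read throughput.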

Decision #2: Plan for success
- Understand scale needs
  - Talk to your CFO/product visionary/business owner
  - What does success look like?
- Don't postpone tough decisions until you are successful
  - Re-architecting while dealing with growth is a pain
  - Pick query flexibility vs. scalability carefully
  - Don't take shortcuts
- Plan to sleep well for the other 51 weekends

Decision #2: Plan for success (contd.)
- Test for scale
  - You will find strange bottlenecks in these tests
    - Connection timeouts
    - Cluster reconfiguration issues
    - Load balancing
- Test how the system scales
  - More throughput capacity (for DynamoDB)
  - More cache nodes (for ElastiCache)
  - More EC2 instances (for run-your-own database)

Decision #3: Plan for failure
- Do not treat failure as a special case
- Replication and redundancy are key!
- Pick replication technology carefully
  - Synchronous vs. asynchronous
    - Hint: if you care about your data, pick synchronous replication
  - Multi-AZ vs. single-AZ
    - Hint: if you care about availability, pick Multi-AZ replication
- Pick replication factor carefully
  - Two is a terrible number in distributed systems
  - Three is better (and is not a crowd)
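
The slide does not say why two replicas are a poor choice. One common reading, assumed here, is majority-quorum arithmetic: with two replicas a majority needs both of them, so losing either one blocks progress, whereas three replicas survive one failure. The helper names are hypothetical.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Replicas that can fail while a majority is still reachable."""
    return replicas - majority(replicas)


def quorums_overlap(replicas: int, read_quorum: int, write_quorum: int) -> bool:
    """True when every read quorum shares at least one replica with every write quorum."""
    return read_quorum + write_quorum > replicas


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, "replicas: majority", majority(n), "tolerates", tolerated_failures(n))
```

Under this assumption, going from one replica to two adds cost without adding any failure tolerance, while going from two to three adds tolerance for one failure.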

Decision #3: Test for failures
- Plans are only good intentions
- In DynamoDB, we test for failures
  - Unit tests
  - Mock tests
  - Cluster tests
  - Performance tests
  - Datacenter failure tests
  - Network degradation tests
  - Dependency failure tests
- Also, we use a strong theoretical foundation when necessary
- Fault injection testing is key!

Decision #4: What is the operational overhead?
- Understand the operational costs of your app
- Don't underestimate the cost of:
  - Managing hardware
  - Maintaining and patching software
  - Configuring and keeping Multi-AZ replication
- Plan for repeated game days and hardware upgrades
- Plan for optimizing costs
- Plan for operations staff
- If a cloud service works for you and meets your needs (#1 to #3), great!
- If not, run it on your own, but plan accordingly

Simple rule of thumb
- When you need seamless scale and super availability: DynamoDB
- Complex query workloads that need relational capabilities: choose Amazon RDS
  - Usually MySQL is a good choice
- Caching:
  - ElastiCache memcached for scaling key-value
  - ElastiCache Redis for advanced data structures
- For data warehousing: choose Amazon Redshift
- Cases where these services are not the right fit: build your own on EC2!

Thank you! swami@amazon.com