Trade-Offs in Cloud Storage Architecture
Stefan Tai
Cloud computing is about providing and consuming resources as services.
There are five essential characteristics of cloud services [NIST].
[NIST]: http://csrc.nist.gov/groups/sns/cloud-computing/
1. Elastic Scalability
2. On-demand Self-Service
3. Ubiquitous Network Access
4. Resource Pooling
5. Measured Service
Essential characteristics
[Figure: diagram relating the following qualities: Security, High Availability, Reliability, Rapid Elasticity, Data Consistency, Measured Service, On-demand Self-service, Demand-side Aggregation, Multi-Channel Access, Multi-Tenancy, Efficiency, Resource Pooling, Supply-side Savings, Broad Network Access, Client Latency]
...and diverse Trade-offs
[Figure: diagram of the same qualities, additionally marked with a cost symbol ($): Security, High Availability, Reliability, Rapid Elasticity, Data Consistency, Measured Service, Demand-side Aggregation, On-demand Self-service, Multi-Channel Access, Multi-Tenancy, Efficiency, Resource Pooling, Supply-side Savings, Broad Network Access, Client Latency]
Trade-offs are unavoidable
Trade-off decisions cannot (always) be made at design time; rather, the factors determining trade-off decisions are likely to change dynamically at runtime, e.g.:
Workload patterns might change because
- user location changes
- times of (peak) usage change
- (partly user-controlled) applications & features change
Failure patterns might change because
- hardware characteristics change
- software characteristics change
- network reliability changes
Key Challenges
Measurable qualities and trade-offs
- Consumer-observable vs. provider-side
- Metrics: $ vs. msec
Tunable qualities and trade-offs
- At runtime, not (only) at design/deployment time
- Continuous evaluation
Example: Cloud Storage Service
A simple Web API to start with
[Figure: a Storage Consumer calls write(file, key) and read(file, key) on a Storage Provider. The consumer side is the Programming Model; the provider side is the Storage Model.]
Let's have a closer look at write()
[Figure: a Storage Consumer calls write(file, key) on a Storage Provider. The consumer side is the Programming Model; the provider side is the Storage Model.]
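(N,R,W) Quorum System
[Figure: the Storage Consumer calls write(file, key) with the key "the girl with the dragon tattoo" (Programming Model). The MD5 hash of the key, 030849cd38, places the write on a hash ring (Storage Model) running from 000 to fff, with nodes A, B, C, D, E and ring markers 333, 666, 999, bbb. Legend: Coordinator Node, Replica Node.]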
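The placement step in the figure can be sketched roughly as follows. This is a minimal illustration, not the provider's actual implementation: the token assigned to each node (A at 000, B at 333, C at 666, D at 999, E at bbb), the three-hex-digit ring size, and the names `ring_position` and `preference_list` are all assumptions made for the example.

```python
import hashlib
from bisect import bisect_left

# Assumed token positions on a ring of 3 hex digits (000..fff).
# The node-to-token mapping is a guess loosely based on the figure.
TOKENS = [0x000, 0x333, 0x666, 0x999, 0xBBB]
NODES = ["A", "B", "C", "D", "E"]


def ring_position(key: str) -> int:
    """Map a key onto the ring using the first 3 hex digits of its MD5 hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest[:3], 16)


def preference_list(position: int, n: int = 3) -> list[str]:
    """Return the coordinator (first node at or after the position,
    wrapping around the ring) followed by the next n-1 nodes clockwise."""
    start = bisect_left(TOKENS, position) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(n)]


if __name__ == "__main__":
    pos = ring_position("the girl with the dragon tattoo")
    print(f"{pos:03x}", preference_list(pos, n=3))
```

Under these assumed tokens, a hash beginning with 030 falls between 000 and 333, so the first entry of the preference list would be node B, with the next two nodes clockwise acting as replicas for N = 3.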
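(N,R,W) Quorum System with Eventual Consistency
[Figure: the Storage Consumer calls write(file, key) with the key "the girl with the dragon tattoo" (Programming Model). The MD5 hash of the key, 030849cd38, places the write on a hash ring (Storage Model) running from 000 to fff, with nodes A, B, C, D, E and ring markers 333, 666, 999, bbb. Legend: Coordinator Node, Replica Node, Hinted Handoff Node.]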
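The role of the hinted handoff node can be illustrated with a small in-memory sketch. It assumes a Dynamo-style scheme in which a healthy node outside the preference list temporarily accepts a write on behalf of an unreachable replica and forwards it later; whether a hinted write counts toward W is also an assumption here. The `Node` class and the `write` and `hand_off` functions are hypothetical names for this example only.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)   # key -> value
    hints: list = field(default_factory=list)  # (intended node name, key, value)


def write(ring, start, key, value, n=3, w=2):
    """Write to the n nodes starting at ring[start]. For each replica that is
    down, store the value with a hint on the next healthy node outside the
    preference list. Returns True if at least w writes were acknowledged
    (assumption: hinted writes count as acknowledgements)."""
    preferred = [ring[(start + i) % len(ring)] for i in range(n)]
    fallbacks = [ring[(start + i) % len(ring)] for i in range(n, len(ring))]
    acks = 0
    for replica in preferred:
        if replica.up:
            replica.data[key] = value
            acks += 1
            continue
        stand_in = next((node for node in fallbacks if node.up), None)
        if stand_in is not None:
            fallbacks.remove(stand_in)
            stand_in.hints.append((replica.name, key, value))
            acks += 1
    return acks >= w


def hand_off(ring):
    """Deliver stored hints to their intended nodes once those are up again."""
    by_name = {node.name: node for node in ring}
    for node in ring:
        remaining = []
        for target, key, value in node.hints:
            if by_name[target].up:
                by_name[target].data[key] = value
            else:
                remaining.append((target, key, value))
        node.hints = remaining
```

Until `hand_off` runs, the intended replica is missing the update, which is one way a store of this kind ends up eventually rather than immediately consistent.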
A well-known Trade-Off: CAP Theorem
[Figure: triangle of Consistency, Availability, and Tolerance to Network Partitions]
Either C, or A, but not both (Network Partitions assumed in a Cloud context)
Consistency vs. Availability
ACID: Atomic, Consistent, Isolated, Durable (Traditional DBMS)
BASE: Basically Available, Soft-State, Eventually Consistent (NoSQL Cloud Storage)
Measuring client-observable Eventual Consistency
[Figure: timeline. At t0 the client sends an update request to the cloud storage provider (replica #1). At t1 the client receives the response. Update propagation to replicas #2 ... #n is complete at t2.]
How soon (how late) is eventual? Client-observable time window (t1, t2)
Bermbach/Tai 2011
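One simple way to estimate the window (t1, t2) from the client side is to poll the store after the write returns and record the last time a stale version was still observed. The sketch below only illustrates that idea and is not the measurement procedure used in the cited study; the function name, the read log format of (timestamp, version) pairs, and the use of integer versions and millisecond timestamps are assumptions.

```python
def inconsistency_window(t1, new_version, reads):
    """Estimate the client-observable inconsistency window.

    t1          -- time at which the write response was received
    new_version -- version number written by the update
    reads       -- iterable of (timestamp, observed_version) pairs,
                   collected by one or more polling readers

    Returns the time between t1 and the last stale read observed at or
    after t1, or 0 if no stale read was observed. This is a lower bound
    on t2 - t1, since staleness between two polls goes unnoticed.
    """
    stale_times = [ts for ts, version in reads
                   if ts >= t1 and version < new_version]
    return max(stale_times) - t1 if stale_times else 0


# Example with millisecond timestamps: the last stale read is at 1020 ms.
log = [(1000, 1), (1010, 2), (1020, 1), (1050, 2)]
window_ms = inconsistency_window(t1=1000, new_version=2, reads=log)
```

Because readers and writers run on different machines, comparing timestamps this way presupposes synchronized clocks.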
Setup
[Figure labels: Cloud; Clock Sync. Protocol]
Experimental Findings for Amazon S3
[Figure labels: LOW, SAW]
[Bermbach2011]: David Bermbach and Stefan Tai: Eventual Consistency: How soon is eventual? An Evaluation of Amazon S3's Consistency Behavior. Middleware for Services Computing Workshop, to appear December 2011, ACM.
Length of S3 LOW/SAW Periods
[Figure label: Avg.]
Bermbach/Tai 2011
Experimental Findings for Apache Cassandra Bermbach/Tai 2011
Comparing S3 and Cassandra

| Amazon S3 (1) | Apache Cassandra (2) |
|---|---|
| Two different periodicities | Geometric distribution |
| 12% violations of monotonic read consistency | 0.0006% violations of monotonic read consistency |
| One availability zone out of three usually lags behind (in 50% of all tests) | No influence of geographic distribution |
| > 99% of all LOW writes create consistent data after 175 ms | > 99% of all writes create consistent data after 35 ms |
| Read availability > 8 nines | Read availability 100% |

(1) Setup: high redundancy with replication over 3 availability zones; test duration 7 days
(2) Setup: deployed on 3 large EC2 instances in different availability zones; consistency level ONE; 3 replicas; test duration 24 h
Bermbach/Tai 2011
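A monotonic read violation occurs when a client that has already seen a newer version later reads an older one. The following sketch counts such violations in a sequence of versions read by a single client. It illustrates the concept only; how the percentages in the comparison were computed is not stated here, and the function name and integer versions are assumptions.

```python
def monotonic_read_violations(versions):
    """Count reads that return an older version than the newest version
    already observed earlier in the same sequence of reads."""
    violations = 0
    newest_seen = None
    for version in versions:
        if newest_seen is not None and version < newest_seen:
            violations += 1
        else:
            newest_seen = version
    return violations


# Versions observed by one client, in read order.
observed = [1, 2, 1, 2, 3, 2]
count = monotonic_read_violations(observed)
rate = count / len(observed)
```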
Further S3 Measurements
Two files, same bucket, different behavior
Bermbach/Tai 2011
Client-observable measurable qualities: as defined, as agreed?
Objective (1.): Understanding the C-A Spectrum for use in Applications
[Figure: spectrum between A (BASE) and C (ACID); further label: t]
(2.) Tuning Knobs to dynamically manage trade-offs
[Figure: the Storage Consumer calls write(file, key) with the key "the girl with the dragon tattoo" (Programming Model) and sets config_ring (N,R,W) = (3,2,2) (Config Model). The MD5 hash of the key, 030849cd38, places the write on a hash ring (Storage Model) running from 000 to fff, with nodes A, B, C, D, E and ring markers 333, 666, 999, bbb. Legend: Coordinator Node, Replica Node.]
N: Durability
R: Read-Availability
W: Write-Availability
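How the three knobs interact can be made concrete with a small sketch. It relies on the standard property of strict quorum systems that read and write quorums overlap when R + W > N; the `RingConfig` class and its property names are hypothetical and do not describe the API of any particular store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RingConfig:
    """Hypothetical (N, R, W) setting: N replicas per key, R replicas that
    must answer a read, W replicas that must acknowledge a write."""
    n: int
    r: int
    w: int

    def __post_init__(self):
        if not (1 <= self.r <= self.n and 1 <= self.w <= self.n):
            raise ValueError("require 1 <= R <= N and 1 <= W <= N")

    @property
    def quorums_overlap(self) -> bool:
        """True if every read quorum shares at least one replica with every
        write quorum (strict quorums only)."""
        return self.r + self.w > self.n

    @property
    def read_failures_tolerated(self) -> int:
        """Replicas that may be unreachable while reads still succeed."""
        return self.n - self.r

    @property
    def write_failures_tolerated(self) -> int:
        """Replicas that may be unreachable while writes still succeed."""
        return self.n - self.w


balanced = RingConfig(n=3, r=2, w=2)   # the setting shown on the slide
fast = RingConfig(n=3, r=1, w=1)       # more available, quorums may miss each other
```

With (3,2,2), one replica may be down for reads and for writes while quorums still overlap; lowering R and W to 1 tolerates two unreachable replicas for each operation but gives up the overlap.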
(3.) Reliable, client-observable behavior
Our Agenda
- Understanding relevant qualities and critical trade-off decisions that need to be addressed on a per-application, per-provider, and per-client/tenant basis
- Providing tuning knobs that translate into novel programming / configuration / deployment models for the Cloud
- Continuously monitoring and evaluating runtime data (e.g., number of transactions, load and utilization) within the trade-off model and automating trade-off adjustments
Our Vision: The Harmonic Cloud Architecture Trade-Off Equalizer
Thank You Stefan Tai tai@kit.edu / tai@fzi.de
Acknowledgments David Bermbach (david.bermbach@kit.edu) Markus Klems (markus.klems@kit.edu)