Amazon Aurora Deep Dive

Similar documents
Amazon Aurora Deep Dive

BERLIN. 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved

Amazon Aurora Deep Dive

Amazon Aurora Relational databases reimagined.

Aurora, RDS, or On-Prem, Which is right for you

Deep Dive on Amazon Aurora

Introduction to Database Services

Which technology to choose in AWS?

Agenda. AWS Database Services Traditional vs AWS Data services model Amazon RDS Redshift DynamoDB ElastiCache

Highly Available Database Architectures in AWS. Santa Clara, California April 23th 25th, 2018 Mike Benshoof, Technical Account Manager, Percona

White Paper Amazon Aurora A Fast, Affordable and Powerful RDBMS

Overview of AWS Security - Database Services

AWS_SOA-C00 Exam. Volume: 758 Questions

Migrating to Aurora MySQL and Monitoring with PMM. Percona Technical Webinars August 1, 2018

Running MySQL on AWS. Michael Coburn Wednesday, April 15th, 2015

Deep Dive on MySQL Databases on Amazon RDS. Chayan Biswas Sr. Product Manager Amazon RDS

Oracle WebLogic Server 12c on AWS. December 2018

Advanced Architectures for Oracle Database on Amazon EC2

Amazon AWS-Solution-Architect-Associate Exam


AWS Solutions Architect Associate (SAA-C01) Sample Exam Questions

High Noon at AWS. ~ Amazon MySQL RDS versus Tungsten Clustering running MySQL on AWS EC2

AWS Database Migration Service

Deep Dive Amazon Kinesis. Ian Meyers, Principal Solution Architect - Amazon Web Services

ARCHITECTING WEB APPLICATIONS FOR THE CLOUD: DESIGN PRINCIPLES AND PRACTICAL GUIDANCE FOR AWS

Designing Fault-Tolerant Applications

Scaling Massive Content Stores in the Cloud. CloudExpo New York June Alfresco Founder & CTO

Using SQL Server on Amazon Web Services

Document Sub Title. Yotpo. Technical Overview 07/18/ Yotpo

NVMFS: A New File System Designed Specifically to Take Advantage of Nonvolatile Memory

Oracle DBA workshop I

Modernize Your Backup and DR Using Actifio in AWS

AWS Storage Gateway. Not your father s hybrid storage. University of Arizona IT Summit October 23, Jay Vagalatos, AWS Solutions Architect

Performance Test Results for ScaleArc for MySQL on Aurora RDS Nov ScaleArc. All Rights Reserved. 1

Amazon Web Services (AWS) Solutions Architect Intermediate Level Course Content

AWS Well Architected Framework

Deep Dive on Amazon Relational Database Service

Amazon AWS and RDS, moving towards it. Dimitri Vanoverbeke Solution Percona

Datacenter replication solution with quasardb

POSTGRESQL ON AWS: TIPS & TRICKS (AND HORROR STORIES) ALEXANDER KUKUSHKIN. PostgresConf US

SAA-C01. AWS Solutions Architect Associate. Exam Summary Syllabus Questions

Nutanix Tech Note. Virtualizing Microsoft Applications on Web-Scale Infrastructure

Amazon Web Services (AWS) Training Course Content

Principal Solutions Architect. Architecting in the Cloud

Amazon Web Services Training. Training Topics:

Cloud Computing /AWS Course Content

ApsaraDB for Redis. Product Introduction

Enroll Now to Take online Course Contact: Demo video By Chandra sir

Percona XtraDB Cluster 5.7 Enhancements Performance, Security, and More

Oracle Exadata: Strategy and Roadmap

AWS Storage Gateway. Amazon S3. Amazon EFS. Amazon Glacier. Amazon EBS. Amazon EC2 Instance. storage. File Block Object. Hybrid integrated.

The Right Choice for DR: Data Guard, Stretch Clusters, or Remote Mirroring. Ashish Ray Group Product Manager Oracle Corporation

Oracle IaaS, a modern felhő infrastruktúra

PrepAwayExam. High-efficient Exam Materials are the best high pass-rate Exam Dumps

Client Success in an Open Source World. Udi Shamay Head of Client Strategy, Magento

Exploring Amazon RDS MySQL Second Tier Read Replica

AWS Solution Architect Associate

Amazon. Exam Questions AWS-Certified-Solutions-Architect- Professional. AWS-Certified-Solutions-Architect-Professional.

Performance comparisons and trade-offs for various MySQL replication schemes

Managing IoT and Time Series Data with Amazon ElastiCache for Redis

Pass4test Certification IT garanti, The Easy Way!

We are ready to serve Latest IT Trends, Are you ready to learn? New Batches Info

AWS Solution Architecture Patterns

NGF0502 AWS Student Slides

MySQL In the Cloud. Migration, Best Practices, High Availability, Scaling. Peter Zaitsev CEO Los Angeles MySQL Meetup June 12 th, 2017.

BERLIN. 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved

PolarDB. Cloud Native Alibaba. Lixun Peng Inaam Rana Alibaba Cloud Team

AWS Solutions Architect Exam Tips

Cloud Analytics and Business Intelligence on AWS

Design Patterns for the Cloud. MCSN - N. Tonellotto - Distributed Enabling Platforms 68

AWS: Basic Architecture Session SUNEY SHARMA Solutions Architect: AWS

Scaling on AWS. From 1 to 10 Million Users. Matthias Jung, Solutions Architect

Doubling Performance in Amazon Web Services Cloud Using InfoScale Enterprise

How can you implement this through a script that a scheduling daemon runs daily on the application servers?

Percona XtraDB Cluster

Virtualizing Oracle on VMware

HPE Digital Learner AWS Certified SysOps Administrator (Intermediate) Content Pack

Accelerate Applications Using EqualLogic Arrays with directcache

THE ZADARA CLOUD. An overview of the Zadara Storage Cloud and VPSA Storage Array technology WHITE PAPER

Amazon Web Services. Block 402, 4 th Floor, Saptagiri Towers, Above Pantaloons, Begumpet Main Road, Hyderabad Telangana India

CIT 668: System Architecture. Amazon Web Services

4 Myths about in-memory databases busted

Oracle Exadata X7. Uwe Kirchhoff Oracle ACS - Delivery Senior Principal Service Delivery Engineer

Training on Amazon AWS Cloud Computing. Course Content

Oracle Autonomous Database

AWS Administration. Suggested Pre-requisites Basic IT Knowledge

Azure Webinar. Resilient Solutions March Sander van den Hoven Principal Technical Evangelist Microsoft

Amazon Elastic File System

CogniFit Technical Security Details

Fault-Tolerant Computer System Design ECE 695/CS 590. Putting it All Together

EBOOK. FROM DISASTER RECOVERY TO ACTIVE-ACTIVE: NuoDB AND MULTI-DATA CENTER DEPLOYMENTS

Move Amazon RDS MySQL Databases to Amazon VPC using Amazon EC2 ClassicLink and Read Replicas

Oracle Database 12c: RAC Administration Ed 1

Exploring Amazon RDS MySQL Second Tier Read Replica

Postgres in Amazon RDS. Denish Patel Lead Database Architect

Oracle Database 18c and Autonomous Database

Highway to Hell or Stairway to Cloud?

Migrating Oracle Databases to Amazon Aurora

ArcGIS 10.3 Server on Amazon Web Services

Deep Dive on Amazon Elastic File System

Transcription:

Amazon Aurora Deep Dive Enterprise-class database for the cloud Damián Arregui, Solutions Architect, AWS October 27 th, 2016 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Enterprise customer wish list A database that. Stays up, even when components fail. Performs consistently at enterprise scale Doesn t need an army of experts to manage Doesn t cost a fortune; no licenses to handle

Enterprise-class database for the cloud Speed and availability of high-end commercial databases Simplicity and cost-effectiveness of open source databases Drop-in compatibility with MySQL Simple pay as you go pricing Delivered as a managed service

Relational databases were not designed for cloud SQL Transactions Caching Multiple layers of functionality all in a monolithic stack Logging

Not much has changed in the last 30 years Application Application Application SQL Transactions Caching Logging SQL Transactions Caching Logging SQL Transactions Caching Logging SQL Transactions Caching Logging SQL Transactions Caching Logging SQL Transactions Caching Logging Storage Even when you scale it out, you re still replicating the same stack

Reimagining the relational database What if you were inventing the database today? You wouldn t design it the way we did in 1970 You d build something that can scale out that is self-healing that leverages existing AWS services

SOA applied to databases 1 Moved the logging and storage layer into a multitenant, scale-out databaseoptimized storage service Data Plane SQL Transactions Control Plane Integrated with other AWS services like Caching Amazon DynamoDB 2 Amazon EC2, Amazon VPC, Amazon DynamoDB, Amazon SWF, and Amazon Logging + Storage Route 53 for control plane operations Amazon SWF 3 Integrated with Amazon S3 for continuous backup with 99.999999999% durability Amazon S3 Amazon Route 53

Performance

SQL benchmark results Using MySQL SysBench with Amazon Aurora R3.8XL with 32 cores and 244 GB RAM WRITE PERFORMANCE READ PERFORMANCE 4 client machines with 1,000 connections each Single client machine with 1,600 connections

Beyond benchmarks If only real-world applications saw benchmark performance POSSIBLE DISTORTIONS Real-world requests contend with each other Real-world metadata rarely fits in the data dictionary cache Real-world data rarely fits in the buffer cache Real-world production databases need to run at high availability

Scaling user connections Connections Amazon Aurora Amazon RDS MySQL 30 K IOPS (single AZ) 50 40,000 10,000 500 71,000 21,000 8x U P T O FA S T E R 5,000 110,000 13,000 SysBench OLTP workload 250 tables

Scaling table count Number of write operations per second Tables Amazon Aurora MySQL I2.8XL local SSD MySQL I2.8XL RAM disk RDS MySQL 30 K IOPS (single AZ) 10 60,000 18,000 22,000 25,000 100 66,000 19,000 24,000 23,000 U P T O 11x FA S T E R 1,000 64,000 7,000 18,000 8,000 10,000 54,000 4,000 8,000 5,000 SysBench write-only workload 1,000 connections, default settings

Scaling dataset size SYSBENCH WRITE-ONLY CLOUDHARMONY TPC-C DB Size Amazon Aurora RDS MySQL 30 K IOPS (single AZ) DB Size Amazon Aurora RDS MySQL 30K IOPS (single AZ) 1 GB 107,000 8,400 10 GB 107,000 2,400 80 GB 12,582 585 800 GB 9,406 69 100 GB 101,000 1,500 1 TB 26,000 1,200 U P T O 67x FA S T E R U P T O 136x FA S T E R

Running with read replicas Updates per second Amazon Aurora RDS MySQL 30 K IOPS (single AZ) 1,000 2.62 ms 0 s 2,000 3.42 ms 1 s 5,000 3.94 ms 60 s 500x U P T O L O W E R L A G 10,000 5.38 ms 300 s SysBench write-only workload 250 tables

How do we achieve these results? DO LESS WORK Do fewer I/Os Minimize network packets Cache prior results Offload the database engine BE MORE EFFICIENT Process asynchronously Reduce latency path Use lock-free data structures Batch operations together DATABASES ARE ALL ABOUT I/O NETWORK-ATTACHED STORAGE IS ALL ABOUT PACKETS/SECOND HIGH-THROUGHPUT PROCESSING DOES NOT ALLOW CONTEXT SWITCHES

Aurora cluster AZ 1 AZ 2 AZ 3 Aurora primary instance Cluster volume spans 3 AZs Amazon S3

I/O traffic in RDS MySQL MYSQL WITH STANDBY AZ 1 AZ 2 Primary Instance 3 Standby Instance I/O FLOW Issue write to Amazon EBS EBS issues to mirror, acknowledge when both done Stages write to standby instance using storage level replication Issues write to EBS on standby instance OBSERVATIONS 1 Amazon Elastic Block Store (EBS) EBS 4 Steps 1, 3, 5 are sequential and synchronous This amplifies both latency and jitter Many types of write operations for each user operation Have to write data blocks twice to avoid torn write operations 2 5 PERFORMANCE Amazon S3 EBS mirror EBS mirror 780 K transactions 7,388 K I/Os per million transactions (excludes mirroring, standby) Average 7.4 I/Os per transaction 30 minute SysBench write-only workload, 100 GB dataset, RDS Single AZ, 30 K PIOPS T YPE OF WRIT E LOG BINLOG DATA DOUBLE-WRITE FRM FILES

I/O traffic in Aurora (database) AMAZON AURORA AZ 1 AZ 2 AZ 3 Primary Instance Replica Instance I/O FLOW Boxcar redo log records fully ordered by LSN Shuffle to appropriate segments partially ordered Boxcar to storage nodes and issue write operations ASYNC 4/6 QUORUM Amazon S3 DISTRIBUTED WRITES OBSERVATIONS Only write redo log records; all steps asynchronous No data block writes (checkpoint, cache replacement) 6x more log writes, but 9x less network traffic Tolerant of network and storage outlier latency PERFORMANCE 27,378 K transactions 35x MORE 950K I/Os per 1M transactions (6x amplification) 7.7x LESS 30 minute SysBench writeonly workload, 100GB dataset T YPE OF WRIT ES LOG BINLOG DATA DOUBLE-WRITE FRM FILES

I/O traffic in Aurora (storage node) STORAGE NODE I/O FLOW Primary Instance Peer Storage Nodes LOG RECORDS 4 ACK INCOMING QUEUE 2 PEER TO PEER GOSSIP SORT GROUP UPDATE QUEUE HOT LOG 3 1 COALESCE 5 POINT IN TIME SNAPSHOT 6 S3 BACKUP GC DATA BLOCKS 7 SCRUB 8 1 Receive record and add to in-memory queue 2 Persist record and acknowledge 3 Organize records and identify gaps in log 4 Gossip with peers to fill in holes 5 Coalesce log records into new data block versions 6 Periodically stage log and new block versions to S3 7 Periodically garbage-collect old versions 8 Periodically validate CRC codes on blocks OBSERVATIONS All steps are asynchronous Only steps 1 and 2 are in the foreground latency path Input queue is 46x less than MySQL (unamplified, per node) Favors latency-sensitive operations Use disk space to buffer against spikes in activity

CLIENT CONNECTION CLIENT CONNECTION epoll() Adaptive thread pool MYSQL THREAD MODEL AURORA THREAD MODEL LATCH FREE TASK QUEUE Standard MySQL one thread per connection Doesn t scale with connection count MySQL EE connections assigned to thread group Requires careful stall threshold tuning Re-entrant connections multiplexed to active threads Kernel-space epoll() inserts into latch-free event queue Dynamically size threads pool Gracefully handles 5000+ concurrent client sessions on r3.8xl

I/O traffic in Aurora (read replica) MYSQL READ SCALING AMAZON AURORA READ SCALING MySQL Master 70% Write SINGLE-THREADED BINLOG APPLY MySQL Replica 70% Write Aurora Master 70% Write PAGE CACHE UPDATE Aurora Replica 100% New Reads 30% Read 30% New Reads 30% Read Data Volume Data Volume Shared Multi-AZ Storage Logical: Ship SQL statements to replica Write workload similar on both instances Independent storage Can result in data drift between master and replica Physical: Ship redo from master to replica Replica shares storage; no writes performed Cached pages have redo applied Advance read view when all commits seen

Availability

Storage node availability Quorum system for read/write; latency tolerant Peer-to-peer gossip replication to fill in holes AZ 1 AZ 2 AZ 3 Continuous backup to S3 Continuous scrubbing of data blocks Continuous monitoring of nodes and disks for repair 10 GB segments as unit of repair or hotspot rebalance to quickly rebalance load Amazon S3 Quorum membership changes do not stall write operations

Self-healing, fault-tolerant Lose two copies or an Availability Zone failure without read or write availability impact Lose three copies without read availability impact Automatic detection, replication, and repair AZ 1 AZ 2 AZ 3 SQL Transaction Caching AZ 1 AZ 2 AZ 3 SQL Transaction Caching Read availability Read and write availability

Instant crash recovery Traditional databases Have to replay logs since the last checkpoint Amazon Aurora Underlying storage replays redo records on demand as part of a disk read Typically 5 minutes between checkpoints Parallel, distributed, asynchronous Single-threaded in MySQL; requires a large number of disk accesses Crash at T 0 requires a reapplication of the SQL in the redo log since last checkpoint No replay for startup Crash at T 0 will result in redo logs being applied to each segment on demand, in parallel, asynchronously Checkpointed data Redo log T 0 T 0

Survivable caches We moved the cache out of the database process Cache remains warm in the event of a database restart Lets you resume fully loaded operations much faster Caching process is outside the DB process and remains warm across a database restart SQL Transactions Caching SQL Transactions Caching SQL Transactions Caching Instant crash recovery + survivable cache = quick and easy recovery from DB failures

Faster, more predictable failover MYSQL DB Failure Failure Detection DNS Propagation App Running Recovery Recovery AURORA WITH MARIADB DRIVER DB Failure Failure Detection DNS Propagation 15 2 0 s e c. Recovery 3 2 0 s e c. App Running

High availability with Aurora Replicas db.r3.8xlarge Primary db.r3.2xlarge Priority: tier-1 db.r3.8xlarge Priority: tier-0 AZ 1 AZ 2 AZ 3 Aurora primary instance Aurora Replica Aurora Replica Cluster volume spans 3 AZs Amazon S3

High availability with Aurora Replicas db.r3.8xlarge db.r3.2xlarge Priority: tier-1 db.r3.8xlarge Primary AZ 1 AZ 2 AZ 3 Aurora Primary instance Aurora Replica Aurora primary Instance Cluster volume spans 3 AZs Amazon S3

Simulate failures using SQL To cause the failure of a component at the database node: ALTER SYSTEM CRASH [{INSTANCE DISPATCHER NODE}] To simulate the failure of disks: ALTER SYSTEM SIMULATE percent_failure DISK failure_type IN [DISK index NODE index] FOR INTERVAL interval To simulate the failure of networking: ALTER SYSTEM SIMULATE percent_failure NETWORK failure_type [TO {ALL read_replica availability_zone}] FOR INTERVAL interval

Simplicity

Amazon Relational Database Service (Amazon RDS) No infrastructure management Instant provisioning Application compatibility Cost-effective Scale up/down

Data security Encryption to secure data at rest AES-256; hardware accelerated All blocks on disk and in Amazon S3 are encrypted Key management by using AWS KMS SSL to secure data in transit Network isolation: Amazon VPC by default No direct access to nodes Application SQL Transactions Caching Storage Supports industry standard certifications Amazon S3

Storage management up to 64 TB Up to 64 TB of storage autoincremented in 10 GB units Automatic storage scaling up to 64 TB no performance impact Continuous, incremental backups to Amazon S3 Instantly create user snapshots no performance impact Automatic restriping, mirror repair, hot spot management, encryption

Monitoring with AWS Management Console Amazon CloudWatch metrics for RDS CPU utilization Storage Memory 50+ system/os metrics 1 60 second granularity DB connections Selects per second Latency (read and write) Cache hit ratio Replica lag CloudWatch alarms Similar to on-premises custom monitoring tools

Migration from MySQL Source database on RDS Snapshot migration: One-click migration from RDS MySQL 5.6 to Aurora DB snapshot One-click migrate Source database external or on EC2 Use native MySQL migration tools Back up to S3 using Percona XtraBackup, restore from S3 RDS MySQL master/slave RDS snapshot migration New Aurora cluster

Start your first migration in 10 minutes or less AWS Database Migration Service Keep your apps running during the migration Replicate within, to, or from Amazon EC2 or RDS Move data to the same or different database engine

Migrate from Oracle and SQL Server Move your tables, views, stored procedures, and data manipulation language (DML) to MySQL, MariaDB, and Amazon Aurora AWS Schema Conversion Tool Know exactly where manual edits are needed Download at aws.amazon.com/dms

Perfect fit for enterprise Performance and scale Up to 500 K/sec read and 100 K/sec write 15 low latency (10 ms) Read Replicas Up to 64 TB DB optimized storage volume Enterprise class availability 6-way replication across 3 data centers Failover in less than 30 secs Near instant crash recovery Fully managed service Instant provisioning and deployment Automated patching and software upgrade Backup and point-in-time recovery Compute and storage scaling

http://aws.amazon.com/rds/aurora