Deep Dive on Amazon Relational Database Service Toby Knight - Manager, Solutions Architecture, AWS 28 June 2017 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
What to expect Amazon RDS overview Security Customer story Migrating to RDS Metrics and monitoring Scaling on RDS Backups and snapshots High availability
Amazon Relational Database Service (Amazon RDS) No infrastructure management Instant provisioning Application compatibility Cost-effective Scale up/down
Amazon RDS engines Commercial Open source Amazon Aurora
Amazon Aurora vs. MySQL Feature RDS Aurora RDS MySQL Number of replicas Up to 15 Up to 5 Replication type Asynchronous (milliseconds) Asynchronous (seconds) Replication performance impact on primary Low Replica can act as failover target Yes (no data loss) Yes (potentially minutes of loss) Storage Up to 64 TB, auto growth Up to 6 TB, specify storage limit Automated failover Yes, to replica Yes, to standby User-defined replication delay No Yes Replica support for different data or schema vs. primary Cross-region replication No Yes Data cache survives Yes No No High Yes
Trade-offs with a managed service Fully managed host and OS No access to the database host operating system Limited ability to modify configuration that is managed on the host operating system No functions that rely on configuration from the host OS Fully managed storage Max storage limits SQL Server 4 TB MySQL, MariaDB, PostgreSQL, Oracle 6 TB Aurora 64 TB Growing your database is a process
Selected Amazon RDS customers
Security
Amazon Virtual Private Cloud (Amazon VPC) Securely control network configuration 10.1.0.0/16 Manage connectivity 10.1.1.0/24 AWS Direct Connect VPN Connection VPC Peering Routing Rules Internet Gateway Availability Zone AWS Region
Security groups Database IP firewall protection Corporate address admins Protocol Port Range Source TCP 3306 172.31.0.0/16 TCP 3306 Application security group Application tier
Compliance Singapore MTCS 27001/9001 27017/27018
Compliance MySQL, Oracle, PostgreSQL SOC 1, 2, and 3 ISO 27001/9001 ISO 27017/27018 PCI DSS FedRamp HIPAA BAA UK government programs Singapore MTCS Germany C5 SQL Server SOC 1, 2, and 3 ISO 27001/9001 ISO 27017/27018 PCI DSS UK government programs Singapore MTCS Germany C5 Aurora SOC 1, 2, and 3 ISO 27001/9001 ISO 27017/27018 PCI DSS Germany C5
SSL Using SSL to encrypt a connection to a DB instance Available for all six engines mysql -h myinstance.c123xyz.rds-eu-west-1.amazonaws \ --ssl-ca=rds-combined-ca-bundle.pem --ssl-verify-server-cert
At-rest encryption DB instance storage Automated backups Read Replicas Snapshots Available for all six engines No additional cost Support compliance requirements
AWS KMS RDS standard encryption Two-tiered key hierarchy using envelope encryption Unique data key encrypts customer data AWS KMS master keys encrypt data keys Customer Master Key(s) Benefits: Limits risk of compromised data key Better performance for encrypting large data Easier to manage small number of master keys than millions of data keys Centralized access and audit of key activity Data Key 1 Data Key 2 Data Key 3 Data Key 4 Amazon S3 Object Amazon EBS Volume Amazon Redshift Cluster Custom Application
Enabling encryption AWS Command Line Interface (AWS CLI) aws rds create-db-instance --region eu-west-1 --db-instanceidentifier sg-cli-test \ --allocated-storage 20 --storage-encrypted \ --db-instance-class db.m4.large --engine mysql \ --master-username myawsuser --master-user-password myawsuser aws rds create-db-instance --region eu-west-1 --db-instanceidentifier sg-cli-test1 \ --allocated-storage 20 \ --storage-encrypted \ --kms-key-id xxxxxxxxxxxxxxxxxx \ --db-instance-class db.m4.large --engine mysql \ --master-username myawsuser --master-user-password myawsuser
Amazon RDS + AWS KMS useful hints You can only encrypt on new database creation Encryption cannot be removed Master and Read Replica must be encrypted Unencrypted snapshots cannot be restored to encrypted DB Cannot restore MySQL to Aurora or Aurora to MySQL You can now copy encrypted or unencrypted snapshots across regions
IAM managed access You can use AWS Identity and Access Management (IAM) to control who can perform actions on RDS Controlled with database grants Controlled with IAM Applications Users and DBA Applications DBA and Ops Your database RDS
IAM policies for RDS Policies Read Only "Action": [ "rds:describe*", "rds:listtagsforresource", "ec2:describeaccountattributes", "ec2:describeavailabilityzones", "ec2:describesecuritygroups", "ec2:describevpcs, "cloudwatch:getmetricstatistics", "logs:describelogstreams", "logs:getlogevents" ], "Effect": "Allow", "Resource": "*" Full Access "Action": [ "rds:*", "cloudwatch:describealarms", "cloudwatch:getmetricstatistics", "ec2:describeaccountattributes", "ec2:describeavailabilityzones", "ec2:describesecuritygroups", "ec2:describesubnets", "ec2:describevpcs", "sns:listsubscriptions", "sns:listtopics", "logs:describelogstreams", "logs:getlogevents" ], "Effect": "Allow", "Resource": "*"
NEW: IAM DB auth for MySQL and Aurora You can now also use AWS Identity and Access Management (IAM) to control access to the database Controlled with IAM Applications Users and DBA DBA and Ops Your database RDS
IAM DB Auth for MySQL and Amazon Aurora 1. Create RDS DB instance with IAM DB auth enabled 2. Create the user in the DB 3. Attach an IAM policy to the IAM user or role 4. Get an authentication token 5. Connect to DB using IAM DB auth For more details: http://docs.aws.amazon.com/amazonrds/latest/userguide/usingwithrds.ia MDBAuth.html
RDS Deep Dive Steve Blake - CTO steve.blake@sportpursuit.co.uk www.sportpursuit.com
SportPursuit business overview Founded in 2011 Flash sale business Unbeatable deals on sports and outdoor gear from the world s leading sports brands Mission Give access to and inspire sports enthusiasts to discover gear that they ll fall in love with 3.5m+ Members the UK s largest private shopping club for sports & outdoor enthusiasts 1000+ of the world s best sports brands / 40% of which are on-uk 7 Languages / 8 Currencies / Shipping to 40+ countries 40% YoY growth
2-way proposition: brands & customers - Marketing to huge audience of sports enthusiasts - Channel for clearing excess stock, without compromising brand identity - Access to the best discounts on sports and outdoor gear from the world s leading sports brands
Technical architecture 2011 e-commerce Platform MySQL Database EC2 EC2
Technical architecture 2017 Cloudfront Varnish e-commerce Platform ERP Platform Memcache / Redis ELB Application Server Redis RDS - PostgreSQL ELB API Elasticsearch RDS - MySQL Redshift RDS MySQL Analytics Platform
RDS estate statistics ~ 25 RDS instances (incl. replicas) Volume of RDS data Production environment: ~2TB (incl. replicas) Entire estate: ~7TB Throughput of data e-commerce platform Peak: ~100 MB/S Average: ~30 MB/S
Challenges before RDS Manual database administration Creating slave replicas Solving replication errors Version upgrades Backups Time consuming as amount of data increase Refreshing staging environments Time consuming: export -> anonymise -> create -> import
RDS positives Reduction in manual database administration Create many read replicas with ease Painless version upgrades Multi-AZ for production Daily automated backups out of the box Weekly refresh of staging environments Automated snapshot -> anonymise -> restore Feature request: anonymisation as a service - serverless
Lessons learned Fix replication errors on slaves/replicas ASAP Binary log disk usage Don t use MyISAM tables Restrictions in sub-accounts Can t use automated snapshots Database creation: automated -> manual -> restore
Sticking points Adjusting to cloud philosophy Destroy & re-create failing component: Pets / Cattle Mindset change Health of the overall platform, not on a single element Time taken to make changes / upgrades Reboot required for option group changes
Upcoming projects Aurora Plan to benchmark against Aurora vs RDS MySQL Test to ensure compatibility with e-commerce platform BI / reporting RedShift for enterprise data warehouse Data Pipeline for data ingestion
BI / Reporting Data warehouse Redshift Board BI tool Python Django Data Pipeline RDS > Redshift Python Django API integration to 3 rd parties
Thank you! Steve Blake - CTO steve.blake@sportpursuit.co.uk www.sportpursuit.com
Migrating to RDS
Historically, Migration = Cost, Time Commercial data migration and replication software Complex to setup and manage Legacy schema objects, PL/SQL or T-SQL code Application downtime
Database Migration 2 Steps
Step 1: Schema Conversion Tool Overview
ü Move data to the same or different database engine AWS Database Migration Service ü Keep your apps running during the migration ü Start your first migration in 10 minutes or less ü Replicate within, to, or from Amazon EC2 or RDS
Keep your apps running during the migration Customer premises VPN AWS Internet Start a replication instance Connect to source and target database Select tables, schemas, or databases Application Users Let the AWS Database Migration Service create tables, load data, and keep them in sync Switch applications over to the target at your convenience
Flexible migration approach Target Multiple targets Source Replication instance Target Target Source Multiple sources Source Replication instance Target Source Source Target Selective Replication instance instance L
Metrics and monitoring
Accessing Amazon RDS metrics
Amazon RDS standard metrics Change Time Period 45 Metrics Dive Deeper Create Alarms
Amazon RDS Enhanced Monitoring Access to over 50 metrics in 7 categories: Memory, I/O, CPU, File system, Load, Swap Processes
Amazon RDS Event Notifications Get notified when events occur on your database instances 17 different event categories (availability, backup, configuration change and so on) Uses Amazon Simple Notification Service (Amazon SNS)
Scaling on RDS
Scale out with Read Replicas Relieve pressure on your master node for supporting reads and writes. Bring data close to your customer s applications in different regions Promote a Read Replica to a master for faster recovery in the event of disaster Replicas within and crossregion MySQL, MariaDB, PostgreSQL Aurora Engines Needing Other Tools Oracle Microsoft SQL Server
Creating and promoting Read Replica Read Replica creation and promotion are accessed from the Instance Actions button in the RDS console
Creating and promoting Read Replicas with CLI create-db-instance-read-replica--db-instance-identifier <value> --source-db-instance-identifier <value>
Creating and promoting Read Replicas With CLI create-db-instance-read-replica--db-instance-identifier <value> --source-db-instance-identifier <value> [--db-instance-class <value>] [--availability-zone <value>] [--port <value>] [--auto-minor-version-upgrade --no-auto-minor-version-upgrade] [--iops <value>][--option-group-name <value>] [--publicly-accessible --no-publicly-accessible] [--tags <value>][--db-subnet-group-name <value>] [--storage-type <value>] [--copy-tags-to-snapshot --no-copy-tags-to-snapshot] [--monitoring-interval <value>] [--monitoring-role-arn <value>] [--kms-key-id <value>] [--pre-signed-url <value>] [--enable-iam-database-authentication --no-enable-iam-databaseauthentication] [--source-region <value>] [--cli-input-json <value>] [--generate-cli-skeleton <value>]
Scaling up and down Handle higher load or lower usage Control costs
Scaling Up and Down Console
NEW Stop & Start DB Instances
Stop your RDS database instance aws rds stop-db-instance \ --db-instance-identifier mydbinstance
and start it again aws rds start-db-instance \ --db-instance-identifier mydbinstance
Backups and snapshots
RDS backups MySQL, PostgreSQL, MariaDB, Oracle, SQL Server Scheduled daily backup of entire instance Archive database change logs Up to 35 day retention for backups I/O suspension as backup is initiated (but not with multi-az deployment) Multiple copies in each AZ where you have instances for a deployment Aurora Automatic, continuous, incremental backups Point-in-time restore No impact on database performance 35 day retention
RDS restore Restoring creates an entire new database instance You define all the instance configuration just like a new instance
Snapshots Full copies of your Amazon RDS database that are different from your scheduled backups Backed by Amazon S3 Typical use cases Resolve production issues Nonproduction environments Point-in-time restore Final copy before terminating a database Disaster recovery Cross-region copy Copy between accounts
High availability
Minimal deployment: Single AZ 10.1.0.0/16 10.1.1.0/24 Amazon Elastic Block Store Volume Availability Zone AWS Region
High availability: Multi-AZ 10.1.0.0/16 10.1.1.0/24 Same instance type as master 10.1.2.0/24 Replicated storage Availability Zone A Availability Zone B AWS Region
High availability Multi-AZ to DNS dbinstancename.1234567890.us-west-2.rds.amazonaws.com:3006
Aurora high availability Aurora cluster contains primary node and up to 15 secondary nodes Failing database nodes are automatically detected and replaced Failing database processes are automatically detected and recycled Secondary nodes automatically promoted on persistent outage, no single point of failure Customer application can scale out read traffic across secondary nodes AZ 1 AZ 2 AZ 3 Primary Primary Node Primary Node Node Primary Primary Secondary Node Node Node Primary Primary Secondary Node Node Node Amazon S3
Aurora-DNS Failover MYSQL DB Failure Failure Detection DNS Propagation App Running 15-30 sec Recovery Recovery AURORA WITH MARIADB DRIVER Driver benefits DB Failure Failure Detection DNS Propagation 15-30 sec Recovery 5-20 sec App Running
Thank you! Toby Knight Manager, Solutions Architecture Amazon Web Services tobyk@amazon.co.uk @tobywknight 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.