Lessons from database failures

Similar documents
The MySQL Ecosystem - understanding it, not running away from it!

Choosing a MySQL High Availability Solution. Marcos Albe, Percona Inc. Live Webinar June 2017

Percona XtraDB Cluster MySQL Scaling and High Availability with PXC 5.7 Tibor Korocz

Choosing a MySQL HA Solution Today

MySQL High Availability

Using MySQL for Distributed Database Architectures

Which technology to choose in AWS?

MySQL High Availability. Michael Messina Senior Managing Consultant, Rolta-AdvizeX /

Percona Software & Services Update

Percona XtraDB Cluster 5.7 Enhancements Performance, Security, and More

Percona XtraDB Cluster powered by Galera. Peter Zaitsev CEO, Percona Slide Credits: Vadim Tkachenko Percona University, Washington,DC Sep 12,2013

Percona XtraDB Cluster

whoami MySQL MongoDB 100% open source PostgreSQL To champion unbiased open source database solutions

MySQL usage of web applications from 1 user to 100 million. Peter Boros RAMP conference 2013

MySQL in the Hosted Cloud

Architecture and Design of MySQL Powered Applications. Peter Zaitsev CEO, Percona Highload Moscow, Russia 31 Oct 2014

MySQL Backup Best Practices and Case Study:.IE Continuous Restore Process

Amazon AWS and RDS, moving towards it. Dimitri Vanoverbeke Solution Percona

Choosing a MySQL HA Solution Today. Choosing the best solution among a myriad of options

Accelerate MySQL for Demanding OLAP and OLTP Use Case with Apache Ignite December 7, 2016

Percona Software & Services Update

What s New in MySQL and MongoDB Ecosystem Year 2017

Aurora, RDS, or On-Prem, Which is right for you

Databases in the Hosted Cloud

Backup & Restore. Maximiliano Bubenick Sr Remote DBA

MySQL HA Solutions Selecting the best approach to protect access to your data

Percona Server for MySQL 8.0 Walkthrough

MySQL Replication Options. Peter Zaitsev, CEO, Percona Moscow MySQL User Meetup Moscow,Russia

MySQL High Availability Solutions. Alex Poritskiy Percona

MariaDB MaxScale 2.0, basis for a Two-speed IT architecture

Highly Available Database Architectures in AWS. Santa Clara, California April 23th 25th, 2018 Mike Benshoof, Technical Account Manager, Percona

HA solution with PXC-5.7 with ProxySQL. Ramesh Sivaraman Krunal Bauskar

MySQL In the Cloud. Migration, Best Practices, High Availability, Scaling. Peter Zaitsev CEO Los Angeles MySQL Meetup June 12 th, 2017.

MySQL HA Solutions. Keeping it simple, kinda! By: Chris Schneider MySQL Architect Ning.com

MyRocks deployment at Facebook and Roadmaps. Yoshinori Matsunobu Production Engineer / MySQL Tech Lead, Facebook Feb/2018, #FOSDEM #mysqldevroom

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 8

Migrating to Aurora MySQL and Monitoring with PMM. Percona Technical Webinars August 1, 2018

MySQL Replication. Rick Golba and Stephane Combaudon April 15, 2015

Percona Software & Services Update

How Percona Contributes to Open Source Database Ecosystem. Peter Zaitsev 5 October 2016

Consistent Reads Using ProxySQL and GTID. Santa Clara, California April 23th 25th, 2018

Migrating To MySQL The Live Database Upgrade Guide

Database Backup and Recovery Best Practices. Manjot Singh, Data & Infrastrustructure Architect

BERLIN. 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved

Amazon Aurora Relational databases reimagined.

Design Patterns for Large- Scale Data Management. Robert Hodges OSCON 2013

Mysql Cluster Global Schema Lock

High availability with MariaDB TX: The definitive guide

Why Choose Percona Server For MySQL? Tyler Duzan

Percona XtraDB Cluster ProxySQL. For your high availability and clustering needs

Creating a Best-in-Class Backup and Recovery System for Your MySQL Environment. Akshay Suryawanshi DBA Team Manager,

High Availability Solutions for the MySQL Database

MongoDB Backup and Recovery Field Guide. Tim Vaillancourt Sr Technical Operations Architect, Percona

MySQL Replication Advanced Features In 20 minutes

MariaDB 5.5 and what comes next

MySQL Backup solutions. Liz van Dijk Zarafa Summer Camp - June 2012

ProxySQL - GTID Consistent Reads. Adaptive query routing based on GTID tracking

MariaDB 10.3 vs MySQL 8.0. Tyler Duzan, Product Manager Percona

Various MySQL High Availability (HA) Solutions

Running MySQL on AWS. Michael Coburn Wednesday, April 15th, 2015

How to Backup at Petabyte Scale When Every Transaction Counts

Performance comparisons and trade-offs for various MySQL replication schemes

How to evaluate which MySQL High Availability solution best suits you

The history and future of the MySQL Ecosystem

Enterprise Open Source Databases

MySQL Cluster An Introduction

Deploying PXC in Kubernetes / Openshift. Alexander Rubin, Percona

A Brief Introduction of TiDB. Dongxu (Edward) Huang CTO, PingCAP

MySQL Replication Update

High Noon at AWS. ~ Amazon MySQL RDS versus Tungsten Clustering running MySQL on AWS EC2

ShardProxy Replication-Manager. DataOps - Juin 2018 Kentoku SHIBA - Stephane VAROQUI

Mix n Match Async and Group Replication for Advanced Replication Setups. Pedro Gomes Software Engineer

MongoDB and Mysql: Which one is a better fit for me? Room 204-2:20PM-3:10PM

Lessons learned while automating MySQL in the AWS cloud. Stephane Combaudon DB Engineer - Slice

Migrating to XtraDB Cluster 2014 Edition

MariaDB a peek at 10.2

Migrating to Vitess at (Slack) Scale. Michael Demmer Percona Live Europe 2017

What s new in Percona Xtradb Cluster 5.6. Jay Janssen Lead Consultant February 5th, 2014

Why we re excited about MySQL 8

Accelerate MySQL for Demanding OLAP and OLTP Use Cases with Apache Ignite. Peter Zaitsev, Denis Magda Santa Clara, California April 25th, 2017

Amazon Aurora Deep Dive

Beyond MySQL 5.1 What is happening in MySQL Space. Feb 16, 2011 Percona Live San Francisco,CA by Peter Zaitsev, Percona Inc

Switching to Innodb from MyISAM. Matt Yonkovit Percona

Percona Live Europe Amsterdam, Netherlands October 3 5, 2016

Migrating to Vitess at (Slack) Scale. Michael Demmer Percona Live - April 2018

Use Cases for Partitioning. Bill Karwin Percona, Inc

Kenny Gryp. Ramesh Sivaraman. MySQL Practice Manager. QA Engineer 2 / 60

MySQL Group Replication. Bogdan Kecman MySQL Principal Technical Engineer

Understanding Percona XtraDB Cluster 5.7 Operation and Key Algorithms. Krunal Bauskar PXC Product Lead (Percona Inc.)

MongoDB Backup & Recovery Field Guide

MySQL Multi-Site/Multi-Master Done Right

Amazon s Database Migration Service, a magical wand for moving from closed source solutions? Dimitri Vanoverbeke Solution Percona

Percona Live Europe 2016 Use ProxySQL to Improve Your MySQL High Availability Solution Marco Tusa Manager Consulting Amsterdam, Netherlands October 3

Introduction to MySQL InnoDB Cluster

MySQL: Scaling & High Availability

Become a MongoDB Replica Set Expert in Under 5 Minutes:

Scale out Read Only Workload by sharing data files of InnoDB. Zhai weixiang Alibaba Cloud

Future-Proofing MySQL for the Worldwide Data Revolution

MySQL Replication : advanced features in all flavours. Giuseppe Maxia Quality Assurance Architect at

Replication features of 2011

Transcription:

Lessons from database failures Colin Charles, Chief Evangelist, Percona Inc. colin.charles@percona.com / byte@bytebot.net http://www.bytebot.net/blog/ @bytebot on Twitter Percona Webminar 18 January 2017

whoami Chief Evangelist (in the CTO office), Percona Inc Focusing on the MySQL ecosystem (MySQL, Percona Server, MariaDB Server), as well as the MongoDB ecosystem (Percona Server for MongoDB) + 100% open source tools from Percona like Percona Monitoring & Management, Percona xtrabackup, Percona Toolkit, etc. Founding team of MariaDB Server (2009-2016), previously at Monty Program Ab, merged with SkySQL Ab, now MariaDB Corporation Formerly MySQL AB (exit: Sun Microsystems) Past lives include Fedora Project (FESCO), OpenOffice.org MySQL Community Contributor of the Year Award winner 2014

Agenda Backups (and verification) Replication (and failover) Security (and encryption)

ma.gnolia.com

ma.gnolia.com s failure January 30 2009: complete outage February 17 2009: data corruption in the UDB, essentially dead What happened? Ruby on Rails on four self-hosted Mac Mini s, a couple of XServe s, 500GB+ MySQL 5 DB Filesystem corruption, corrupted database backup No versioning, didn t check if the backups worked, made use of rsync to backup the database over Firewire network

ma.gnolia.com today? EC2 for the app with EBS snapshots, RDS with snapshots, Multi-AZ deployment Self-hosted? xtrabackup START TRANSACTION WITH CONSISTENT SNAPSHOT + mysqldump single-transaction master-data Backup a replica Replication event checksums

Couchsurfing, 2006

Couchsurfing problems 1. major, avoidable hard drive crash 2. incremental backups weren t executed in the correct manner, and twelve of our most important data files didn t survive

Time-delayed replication MySQL 5.6+ has time-delayed replication. Stop replication when you know a mistake has happened before it propagates to all the slaves. Feature suggestion since 2001! Bug reported August 2006 (mysql#21639). Pushed June 2010 (WL#344). GA February 2013.

Why replicate? Scale out [automatic] (master) failover Geographical redundancy across multiple data centres Online schema changes

Replication Asynchronous (default) (Enhanced loss-less) Semi-synchronous (plugin) Synchronous (Galera, group replication, NDBCLUSTER) DRBD

Frameworks MySQL-MMM Severalnines ClusterControl Orchestrator MySQL MHA Tungsten Replicator 5.6+ utilities: mysqlfailover, mysqlrpladmin Percona Replication Manager (https://github.com/percona/ percona-pacemaker-agents/) Replication Manager (github.com/tanji/replicationmanager)

GitHub

GitHub

GitHub

GitHub https://github.com/blog/1261-github-availability-this-week

Fully automated failover a good idea? False alarms Repeated failover Overloaded master? MHA doesn t allow a failover within 8h, unless last_failover_min=n is set Data loss id=103 latest, relay logs at id=101 => loss group commit in the binary log Split brain

Proxies MariaDB MaxScale Popular use: load balancing Galera clusters MySQL Router + MySQL Fabric ProxySQL Used alongside Galera clusters too Included with Percona XtraDB Cluster 5.7

Sharding SPIDER Tungsten Replicator Tumblr JetPants

Vitess Servers & tools to scale MySQL for web written in Go Has MariaDB support too (*) Python client interface DML annotation, connection pooling, shard management, workflow management, zero downtime restarts Become super easy to use: http://vitess.io/ (with the help of Kubernetes)

Failwhales Twitter started on MySQL, and is still MySQL - you just need to evolve Gizzard (sharding), Mesos + Apache Cotton Digg started on MySQL, migrated to Cassandra, and came back to MySQL

Security Philippines voter data leave 55m at risk: 338GB MySQL dump Ashley Madison: 6.9GB compressed dump, 36m email addresses leaked, 9.6m credit card transactions Patreon: 13.7GB MySQL dump, 99 tables

Mossack Fonseca: Panama Papers

Prevent SQL injections MariaDB MaxScale database firewall filter Configurable filter actions on rule match (Allow the query, block the query or ignore the match), Logging of matching and/or nonmatching queries MySQL Enterprise firewall ProxySQL

Encryption at rest MariaDB Server 10.1: table or tablespace encryption design goal: Encrypt all user data that may touch the disk InnoDB data, InnoDB logs, binary logs, temporary tables, temporary files key management on the filesystem? [no key rotation] Amazon KMS? caveats: mysqlbinlog needs work with encrypted binlogs; Galera Cluster gcache isn t encrypted MySQL 5.7: only encrypts InnoDB tablespaces (innodb_file_per_table; logs unencrypted)

In conclusion Use semi-sync replication with a failover solution that ensures you don t failover too often Make good backups. Test them. Save them. You ll most definitely need to shard your data, use proven frameworks and get a proxy involved. Complete backups with multisource replication when needed. Use mysqldump and xtrabackup together (and mydumper for parallel backup/restore; mysqlpump) Security is key: prevent SQL injections, encrypt your data at rest

It s 2016, you don t want this

Percona Monitoring and Management (PMM) http://pmmdemo.percona.com/

Join Us at Percona Live When: April 24-27, 2017 Where: Santa Clara, CA USA Percona Live is a very popular conference: last year s Percona Live Europe sold out, and we re looking to do the same for Percona Live 2017. Don t miss your chance to get your ticket at its most affordable price. https://www.percona.com/live/17/registration Percona Live 2017 sponsorship opportunities are available now. https://www.percona.com/live/17/sponsors

Thank you. Q&A? colin.charles@percona.com / byte@bytebot.net @bytebot on Twitter http://www.bytebot.net/blog/ slides: slideshare.net/bytebot