Effective Testing for Live Applications. March 29, 2018. Sveta Smirnova

Table of Contents
- Sometimes You Have to Test on Production
- Wrong Data
  - SELECT Returns Nonsense
  - Wrong Data in the Database
- Performance Issues
- Crashes

MySQL Troubleshooting Webinars
- I taught you to
  - Repeat problematic queries
  - Crash servers
- Not what you want to do in production

Not Acceptable
- Updating wrong data
- Intentional performance slowdowns
- Crashes

Sandboxes
- Slower machines
- Smaller disks
- Blocking backups

Concurrency
- Hard to imitate
- Load testing tools are not perfect

Sometimes You Have to Test on Production

Safety First
- Data must stay consistent
- No intentional crashes
- Tests should not make performance worse

We Will Discuss
- Wrong data
  - SELECT
  - UPDATE/INSERT/DELETE
  - DDL
- Performance
  - Slow queries
  - Locking issues
  - Hardware-related
- Crashes

General Workflow
- Measure: record what is wrong
  - Actual and expected results
  - Query execution time
  - Overall application performance
- Plan changes
- Implement them
  - May require maintenance window
- Measure again

Challenges
- We need to know what causes bad behavior
  - It is hard to find out
- An issue can have many potential solutions
  - You can find the right one only after many iterations
- You don't want to cause worse damage

Wrong Data

Wrong Data: SELECT Returns Nonsense

There Is No Harm in Experimenting
- This is just a query
- Does not change anything
- Works fast
- No effect on others

Just Run It
- Use all troubleshooting techniques you know
  - Those we discussed: Troubleshooting Slow Queries
  - And those we did not

Measure
- Result of SELECT
- EXPLAIN output
- Desired result
- Can you use a sandbox for tests?

General Query Analyzing Checklist
- Is the data in the table?
- Does a simplified query return the correct result?
- Try removing
  - ORDER BY
  - GROUP BY
  - WHERE
  - JOIN
- Can you use part of the data for tests?

Reducing Query Example

- Data in the table

    mysql> select a from t1;
    +----------------------+
    | a                    |
    +----------------------+
    | -9223372036854775808 |
    | -9223372036854775807 |
    |  9223372036854775806 |
    |  9223372036854775807 |
    +----------------------+
    4 rows in set (0.00 sec)

- Problematic query

    mysql> select distinct ceil(a) from t1;
    +-------------------+
    | ceil(a)           |
    +-------------------+
    | -9999999999999999 |   <---- Here is the problem!
    |  9999999999999999 |
    +-------------------+
    2 rows in set (0.00 sec)

- Wrong data type

    mysql> select ceil(a) a from t1;
    +----------------------+
    | a                    |
    +----------------------+
    | -9223372036854775808 |
    | -9223372036854775807 |
    |  9223372036854775806 |
    |  9223372036854775807 |
    +----------------------+
    4 rows in set, 4 warnings (0.00 sec)

    Warning (Code 1292): Truncated incorrect DECIMAL value: -9223372036854775808
    Warning (Code 1292): Truncated incorrect DECIMAL value: -9223372036854775807
    Warning (Code 1292): Truncated incorrect DECIMAL value: 9223372036854775806
    Warning (Code 1292): Truncated incorrect DECIMAL value:...

- Solution

    mysql> select distinct cast(ceil(a) as signed) from t1;
    +-------------------------+
    | cast(ceil(a) as signed) |
    +-------------------------+
    |    -9223372036854775808 |
    |    -9223372036854775807 |
    |     9223372036854775806 |
    |     9223372036854775807 |
    +-------------------------+
    4 rows in set (0.00 sec)

- Based on MySQL Bug #79571

Challenges
- Large datasets
- Removing WHERE leads to high IO
- Solutions
  - Limit data by primary key access
  - Run EXPLAIN before the query
  - Adjust in parts
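The solutions above can be sketched roughly as follows. This is only an illustration: the table big_table, its integer primary key id and the column status are hypothetical names that do not come from the slides, and the range size of 10000 is arbitrary.

```sql
-- Hypothetical names: big_table, id (primary key), status.

-- 1. Look at the plan first. EXPLAIN reports how the query would be
--    executed instead of returning the rows.
EXPLAIN
SELECT status, COUNT(*)
  FROM big_table
 WHERE id BETWEEN 1 AND 10000
 GROUP BY status;

-- 2. Run the reduced query on one primary key range only,
--    so the test reads a small, predictable part of the table.
SELECT status, COUNT(*)
  FROM big_table
 WHERE id BETWEEN 1 AND 10000
 GROUP BY status;

-- 3. Adjust in parts: move to the next range and compare the results.
SELECT status, COUNT(*)
  FROM big_table
 WHERE id BETWEEN 10001 AND 20000
 GROUP BY status;
```

If the plan shows a full table scan rather than a range scan on the primary key, it is probably safer to rework the condition before running the query on production.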

Wrong Data: Wrong Data in the Database

You Have Wrong Data in the Database
- Incorrect UPDATE statement
- Concurrency
- Wrong order of master events on slave

You Already Know the Problematic Query
- All updates are in the binary log
  - With RBR use mysqlbinlog --verbose
  - May help to discover a logic error
- Audit, general, application logs

How to Deal with Tons of Updates?
- Quickly run through them
- Treat similar queries as the same one
    UPDATE foo SET bar=8 WHERE baz=9
    UPDATE foo SET bar=9 WHERE baz=8
- Search for non-trivial queries
    DELETE FROM t1, t2 USING t1, t2;
  - Will work only for matching rows
- Examine logs for the number of affected rows
  - Slow query log in Percona Server has it
  - performance_schema.events_statements_*
  - See: Performance Schema for MySQL Troubleshooting
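As a rough sketch of the last two points, Performance Schema can both group similar statements and report affected rows. The queries below assume MySQL 5.6 or later with statement instrumentation enabled; the second one also needs the events_statements_history_long consumer switched on, and the threshold of 100 rows is an arbitrary value chosen for illustration.

```sql
-- Similar statements share one digest, so
--   UPDATE foo SET bar=8 WHERE baz=9
--   UPDATE foo SET bar=9 WHERE baz=8
-- are counted together. Rank the digests by rows they changed.
SELECT DIGEST_TEXT, COUNT_STAR, SUM_ROWS_AFFECTED
  FROM performance_schema.events_statements_summary_by_digest
 WHERE SUM_ROWS_AFFECTED > 0
 ORDER BY SUM_ROWS_AFFECTED DESC
 LIMIT 10;

-- Individual recent statements that changed many rows.
SELECT THREAD_ID, SQL_TEXT, ROWS_AFFECTED
  FROM performance_schema.events_statements_history_long
 WHERE ROWS_AFFECTED > 100   -- arbitrary threshold
 ORDER BY ROWS_AFFECTED DESC
 LIMIT 10;
```

Both tables are held in memory and are reset on server restart, so they complement the binary log rather than replace it.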

Concurrency and Conflicts
- Order of events
- Transaction isolation level
- Content of binary log
- Application logic
  - Log files
  - What do you expect from it?

Measure
- What was wrong
- DML queries
- What changed
- How the result changed

Test Safely

    update City, Country set City.Population=Country.Population+1000
    where City.CountryCode=Country.Code and Continent='Oceania' and Region='Polynesia';

- Convert to SELECT

    select City.Population, Country.Population, Country.Code from City, Country
    where City.CountryCode=Country.Code and Continent='Oceania' and Region='Polynesia';

- Multi-statement transaction and rollback

    mysql> begin;
    Query OK, 0 rows affected (0.00 sec)

    mysql> update City, Country set City.Population=Country.Population+1000
        -> where City.CountryCode=Country.Code and Continent='Oceania' and Region='Polynesia';
    Query OK, 12 rows affected (0.01 sec)
    Rows matched: 12  Changed: 12  Warnings: 0

    mysql> rollback;
    Query OK, 0 rows affected (0.13 sec)

- Experiment with slave
- Be ready to revert changes!
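The two techniques can also be combined: run the UPDATE inside a transaction, inspect the result with the converted SELECT, then roll back. The sketch below reuses the City and Country tables from the slide and assumes both use a transactional engine such as InnoDB; with a non-transactional engine such as MyISAM the ROLLBACK would not undo the change.

```sql
START TRANSACTION;

UPDATE City, Country
   SET City.Population = Country.Population + 1000
 WHERE City.CountryCode = Country.Code
   AND Continent = 'Oceania' AND Region = 'Polynesia';

-- The same session sees its own uncommitted changes,
-- so this shows what the UPDATE would have written.
SELECT City.Population, Country.Population, Country.Code
  FROM City, Country
 WHERE City.CountryCode = Country.Code
   AND Continent = 'Oceania' AND Region = 'Polynesia';

-- Nothing is kept: other sessions never see the modified rows.
ROLLBACK;
```

The updated rows stay locked until the ROLLBACK, so on a busy server keep the transaction as short as possible.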

Performance Issues

Measure
- Most important for Performance Issues
- Application response time
  - How long does the web page take to load?
  - How long did the AJAX request take?
  - How long did the request to the database take?
- Slow query log
- Thorough data collection
  - pt-stalk
  - Performance Schema
  - Manual

What Is Wrong with Hardware Metrics?
- They say nothing about state
- High CPU
  - Is the database starving?
  - Is it running many client threads?
- Low CPU
  - Is the database idle?
  - Did you limit concurrency?
- High memory usage
  - Too large buffers?
  - Too many temporary tables?
- You get the idea

Measure at the Right Time!
- Baseline: for normal load
  - Makes sense to understand hardware limits
- For troubleshooting: for real load

Test Workflow
- Collect data
- Plan changes
  - Record everything you modify
  - Choose time
- Make changes
- Collect again
- Compare

Can I Make the Situation Worse?
- Yes, you can!
- But you are already in trouble
- You have to do something

Safe Way to Test
- Understand what you change
- Test in session
- Change dynamic variables
- Store changes
  - Configuration file
  - 8.0+: SET PERSIST
- Always measure effect!
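A minimal sketch of this sequence, using sort_buffer_size purely as an example of a dynamic variable with both session and global scope. The variable choice and the 4 MB value are assumptions made for illustration, not recommendations from the slides.

```sql
-- Record the current values so the change can be reverted.
SELECT @@session.sort_buffer_size, @@global.sort_buffer_size;

-- 1. Test in the current session only; other connections are unaffected.
SET SESSION sort_buffer_size = 4 * 1024 * 1024;
-- ... run and time the problematic query here ...

-- 2. Put the session back to the current global value.
SET SESSION sort_buffer_size = DEFAULT;

-- 3. If the measurement improved, change the dynamic global value.
--    Only connections opened after this point pick it up.
SET GLOBAL sort_buffer_size = 4 * 1024 * 1024;

-- 4. Store the change. Before 8.0, edit the configuration file by hand.
--    In MySQL 8.0+ this sets the global value and keeps it across restarts.
SET PERSIST sort_buffer_size = 4 * 1024 * 1024;
```

Steps 3 and 4 are alternatives on MySQL 8.0 and later: SET PERSIST already changes the running global value.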

If Things Went Wrong
- Revert changes
- Record what caused the failure
- Easy if you made the change at the right time

Changing in Less Busy Time
- May work
  - Follow regular workflow
  - Compare with data collected for similar load
- Won't work
  - Issue happens only during high activity
    - Number of active writing clients is greater than your disk and CPU can handle
    - InnoDB Buffer Pool aggressively purges old data only when it cannot do it in idle time
- You may try to scale
  - Use case for sandbox
  - Does not always work

Upgrade Performance: Issues
- Application runs slower than before
- Reason is unknown
- A lot of changes
  - Hardware
  - Operating System
  - MySQL version

Upgrade Performance: Best Practices
- Record performance stats for the old version
  - pt-stalk
  - Slow query log for a few days
- Upgrade
- Let it run
- Measure again
- You will know exactly what went wrong!

Calling Support
- Collect data
  - At the problematic time
  - pt-stalk in daemon mode

      pt-stalk --daemonize --iterations=2 --sleep=30 \
        --dest=/path/where/you/want/logs/stored \
        --user=root --password=mysql-root-pass \
        --function=status --variable=Threads_running --threshold=25

- Identify how it affects your application
- Read answers about changes you need to make
- Apply them incrementally
- Measure result

Crashes

You Have to Make Changes
- You are in trouble already
- There is nothing more to be afraid of

What to Change?
- Query
- Scenario
- Options
- Hardware

Log Files
- Error log
- Core file
- See: Troubleshooting MySQL Crashes
  - Debug symbols
  - Analyze hints

Workflow
- Study logs
- Make changes
- Repeat workload
- Do your best to imitate it in a sandbox!

Crash in Random Places
- Check if mysqld is the reason
- Monitor OS around crash times
  - PMM or a similar tool is a must
  - Impractical to have a scheduled job

System and Crashes
- Server state
[Figure: System Overview, 2.7 weeks before]

Summary
- You must monitor your system
- Keep historic logs, so you know what changed
  - Slow query logs
  - pt-stalk collections
  - PMM
- Make changes sequentially
  - Avoid replacing everything at once
- Measure before and after the change

More Information
- Troubleshooting Webinars at Percona
- MySQL is crashing: a support engineer's point of view
- pmmdemo.percona.com/

Thank you!
http://www.slideshare.net/svetasmirnova
https://twitter.com/svetsmirnova