Patient A SQL Critical Care Part 1: Health Triage Findings

Background
PatientA got in touch because they were having performance pain with $VENDOR's applications. PatientA wasn't sure if the problem was hardware, their configuration, or something in $VENDOR's code. $VENDOR wasn't sure either, but they were pretty sure it wasn't their code. PatientA also wanted help with maintenance and best practices. Check out what happened next!

Patient A SQL Critical Care Part 1: Health Triage Findings
AKA: Are we going to lose data?
2016 Brent Ozar Unlimited. All Rights Reserved. For details: http://www.brentozar.com/go/samples

We're Brent Ozar Unlimited.
RICHIE RUMP, ANGELA WALKER, DOUG LANE, TARA KIZER, BRENT OZAR, JESSICA CONNORS, ERIK DARLING

What instance did we look at?
- Instance: [REDACTED]
- Applications involved: [REDACTED]
- Memory size: 144 GB
- Number of logical cores: 8 vCPUs
- SQL Server version and edition: SQL Server 2012 SP1 (with fix for initial SP1 release), Enterprise Edition
- Virtualized? Yes. [REDACTED DETAILS]

RPO and RTO
- Recovery Point Objective (RPO): How much data can you lose in a worst-case scenario?
- Recovery Time Objective (RTO): How long can you be offline in a worst-case scenario?

Scenario             Data loss allowed (RPO)   Downtime allowed (RTO)
Server offline       4 hours                   4 hours
Corrupt data         4 hours                   4 hours
"Oops" deletes       4 hours                   4 hours
Datacenter offline   4 hours                   4 hours

This presentation focuses on avoiding data loss
We're not solving all your pain points here. We're talking about:
1. Making sure you don't lose more data than is acceptable, even if multiple things go wrong
2. Detecting corruption if it happens
3. Getting data back to a point in time if it's incorrect or corrupt
This can be done with pretty simple steps; there are just a few of them.

Your Steps to Prevent Data Loss

Take regular log backups
- Determine their frequency this way: if we lost the SAN, how much data are we willing to lose?
- For RedactedImportantDB, this means backing up the transaction log every 5 minutes or every 1 minute (just answer the question above)
- http://www.brentozar.com/archive/2014/02/back-transaction-logs-every-minute-yes-really/
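As a rough sketch of what a single scheduled log backup could look like: the fileshare `\\BackupServer\SQLBackups` and the file naming pattern are assumptions for illustration, not values from the original deck, and the database is assumed to use the FULL (or BULK_LOGGED) recovery model.

```sql
-- Log backups require FULL or BULK_LOGGED recovery; check before scheduling.
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = N'RedactedImportantDB';

-- Build a timestamped file name so every log backup lands in its own file.
-- The UNC path is a hypothetical placeholder.
DECLARE @FileName nvarchar(400) =
    N'\\BackupServer\SQLBackups\RedactedImportantDB_LOG_'
    + CONVERT(nvarchar(8), GETDATE(), 112) + N'_'                  -- yyyymmdd
    + REPLACE(CONVERT(nvarchar(8), GETDATE(), 108), N':', N'')     -- hhmiss
    + N'.trn';

BACKUP LOG [RedactedImportantDB]
TO DISK = @FileName
WITH CHECKSUM;
```

Scheduling this as a SQL Agent job every 1 or 5 minutes caps the worst-case data loss from a storage failure at roughly that interval, provided the backup files live on separate storage.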

Regular log backups can help avoid this
[REDACTED: screenshot showing 2-3 minute log growths]
Every time these log growths happen, everyone doing a write has to wait a very long time! The problem is multiple things:
- The log file was shrunk
- Log backups aren't running frequently enough to let it reuse space in the file
- Log growths are set to a percentage, not a fixed unit
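To illustrate the last point, here is a hedged sketch for finding log files that grow by a percentage and switching one to a fixed increment. The logical file name `RedactedImportantDB_log` and the 512 MB increment are assumptions; pick values that suit your own storage.

```sql
-- List transaction log files whose autogrowth is set as a percentage.
SELECT DB_NAME(database_id)               AS database_name,
       name                               AS logical_file_name,
       CAST(size AS bigint) * 8 / 1024    AS size_mb,   -- size is stored in 8 KB pages
       growth                             AS growth_percent
FROM sys.master_files
WHERE type_desc = N'LOG'
  AND is_percent_growth = 1;

-- Change one log file to grow in fixed steps instead.
-- The logical file name and the increment are hypothetical.
ALTER DATABASE [RedactedImportantDB]
MODIFY FILE (NAME = N'RedactedImportantDB_log', FILEGROWTH = 512MB);
```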

Regular log backups are good for performance, too!
- Your existing maintenance did one log backup each day, at night:
  - One log backup
  - Shrink the log
- All throughout the day, the log was periodically having to grow
- This led to big delays
- You're much better off doing frequent log backups and never shrinking the log file!

Don't back up to local storage
- You're currently taking full backups to a drive on the production VM, using the same SAN
- This is risky:
  - What if the SAN failed?
  - What if the VM couldn't start up? How long would it take to get to the files?
- A better option: run all the backups to a UNC path for a fileshare using separate storage
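A minimal sketch of a full backup written straight to a fileshare follows. The share name is a hypothetical placeholder, and the SQL Server service account is assumed to have write permission on it.

```sql
-- Full backup to a UNC path on separate storage (path is a placeholder).
-- INIT overwrites any existing backup sets in the target file.
BACKUP DATABASE [RedactedImportantDB]
TO DISK = N'\\BackupServer\SQLBackups\RedactedImportantDB_FULL.bak'
WITH CHECKSUM, INIT, STATS = 10;
```

If the SAN or the VM is lost, the backup file is still reachable from any other server that can see the share.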

Speeding up backups: short term
There are a few tricks to this:
- For large databases, writing the backup out to multiple files can improve efficiency
- With Ola Hallengren's backup job, the @NumberOfFiles parameter can do this easily
- Try @NumberOfFiles=4 first on your RedactedImportantDB database
- Test restoring the database too, so you know how the commands work
- Third-party tools like $SQLVENDORTOOL1 make tuning backup speed, file counts, and restores easier
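As a sketch of the striped backup described above, assuming Ola Hallengren's `DatabaseBackup` procedure is installed in the current database and using a hypothetical fileshare:

```sql
-- Stripe the full backup across four files.
-- The directory is a placeholder, not a value from the original deck.
EXECUTE dbo.DatabaseBackup
    @Databases     = 'RedactedImportantDB',
    @Directory     = N'\\BackupServer\SQLBackups',
    @BackupType    = 'FULL',
    @CheckSum      = 'Y',
    @NumberOfFiles = 4;
```

A striped backup can only be restored when every file in the set is available, so the restore command has to list all four files. That is one more reason to rehearse the restore before relying on this setting.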

Test your backups: can you meet your RTO?
- Test restoring database full and log backups to a non-production server
  - Can you restore many log files quickly?
  - Can you restore to a single point in time?
  - Find a PowerShell script or tool (but beware of scripts that require xp_cmdshell; it shouldn't be needed for this task)
- Document and/or automate the process
  - Can someone else do this quickly if you're not available?
  - Have you completed this within your RTO of 4 hours?
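A point-in-time restore test might look roughly like the sketch below. Every file path, logical file name, the target database name, and the STOPAT timestamp are hypothetical; `RESTORE FILELISTONLY` against the backup file shows the real logical names to use in the MOVE clauses.

```sql
-- 1. Restore the full backup but leave the database restoring,
--    so that log backups can still be applied.
RESTORE DATABASE [RedactedImportantDB_RestoreTest]
FROM DISK = N'\\BackupServer\SQLBackups\RedactedImportantDB_FULL.bak'
WITH MOVE N'RedactedImportantDB'     TO N'D:\Data\RedactedImportantDB_RestoreTest.mdf',
     MOVE N'RedactedImportantDB_log' TO N'L:\Logs\RedactedImportantDB_RestoreTest.ldf',
     NORECOVERY, STATS = 10;

-- 2. Apply each log backup in sequence, still without recovering.
RESTORE LOG [RedactedImportantDB_RestoreTest]
FROM DISK = N'\\BackupServer\SQLBackups\RedactedImportantDB_LOG_1.trn'
WITH NORECOVERY;

-- 3. On the log backup that covers the target time, stop there
--    and bring the database online.
RESTORE LOG [RedactedImportantDB_RestoreTest]
FROM DISK = N'\\BackupServer\SQLBackups\RedactedImportantDB_LOG_2.trn'
WITH STOPAT = N'2016-01-15T14:30:00', RECOVERY;
```

Timing this sequence end to end, including the time spent locating the files, gives a concrete number to compare against the 4-hour RTO.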

Manage backup history and messages
- Set up a job to purge backup history from MSDB once a week using sp_delete_backuphistory
  - Ola Hallengren's maintenance includes a job for this: Ola.Hallengren.com
  - You provide the @oldest_date parameter
- Implement trace flag 3226 so that successful backups don't get written to the SQL Server error log
  - Failures will still be written
  - All backups will still be recorded in MSDB
  - http://msdn.microsoft.com/en-us/library/ms188396.aspx
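Both steps can be sketched in a few lines. The 60-day retention is an assumption for illustration; the procedure's date parameter is named `@oldest_date`.

```sql
-- Purge backup history older than a cutoff (60 days is a placeholder).
DECLARE @CutoffDate datetime = DATEADD(DAY, -60, GETDATE());

EXECUTE msdb.dbo.sp_delete_backuphistory @oldest_date = @CutoffDate;

-- Suppress "backup succeeded" messages in the error log, instance-wide.
-- This lasts only until the next restart; adding -T3226 as a startup
-- parameter makes it permanent.
DBCC TRACEON (3226, -1);
```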

Speeding up backups: long term
- $SANVENDOR$ $SANTOOL$ can help speed up backups
- This does require more licensing at the SAN level
- May require reconfiguring drive assignments for data files
- But it can also replicate the backups to other locations, which is useful for DR
- Danger: currently your SAN snapshots do not talk to the VSS provider, so your databases may be corrupt and unusable upon restore!

Create and test SQL Agent alerts
- Create alerts for high severity errors and corruption
- Set the alerts to notify you through an operator and Database Mail
- Script to create them: http://brentozar.com/go/alert
- To test that it all works, run: RAISERROR('TESTALERT',18,1) WITH LOG;
- Make sure you get the email!
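For readers who want to see the shape of such alerts, here is a minimal sketch of two of them: one severity-based and one for a corruption-related error number. It is not the script linked above. The alert names and the operator name `DBA Team` are assumptions, and the operator and Database Mail are assumed to exist already.

```sql
-- Alert on any severity 18 error (repeat for the other severities you care about).
EXECUTE msdb.dbo.sp_add_alert
    @name                         = N'Severity 018',
    @message_id                   = 0,
    @severity                     = 18,
    @enabled                      = 1,
    @delay_between_responses      = 60,   -- seconds between repeat notifications
    @include_event_description_in = 1;    -- include the error text in the email

EXECUTE msdb.dbo.sp_add_notification
    @alert_name          = N'Severity 018',
    @operator_name       = N'DBA Team',   -- hypothetical operator
    @notification_method = 1;             -- 1 = email

-- Alert on error 823, an I/O error that can indicate corruption
-- (824 and 825 follow the same pattern).
EXECUTE msdb.dbo.sp_add_alert
    @name                         = N'Error 823',
    @message_id                   = 823,
    @severity                     = 0,
    @enabled                      = 1,
    @delay_between_responses      = 60,
    @include_event_description_in = 1;

EXECUTE msdb.dbo.sp_add_notification
    @alert_name          = N'Error 823',
    @operator_name       = N'DBA Team',
    @notification_method = 1;
```

The RAISERROR test at severity 18 only produces an email if an alert covers severity 18, which is why that level is used in this sketch.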

Set Agent Jobs to notify on failure
- Lots of jobs currently won't let an operator know if they fail (too many to list here; just run our sp_blitz script anytime to get a full list)
- Set the jobs to notify your operator when they fail
- At minimum, do this for all database and log backup jobs
- To test: set one of the jobs to notify on completion and make sure you get the email. Then switch it back to notify on failure.
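A hedged sketch of the same steps in T-SQL follows. The job name and the operator name `DBA Team` are placeholders for illustration.

```sql
-- Find enabled jobs that never email anyone.
SELECT name
FROM msdb.dbo.sysjobs
WHERE enabled = 1
  AND notify_level_email = 0;

-- Tell one job to email the operator when it fails.
-- @notify_level_email: 1 = on success, 2 = on failure, 3 = on completion.
EXECUTE msdb.dbo.sp_update_job
    @job_name                   = N'DatabaseBackup - USER_DATABASES - LOG',
    @notify_level_email         = 2,
    @notify_email_operator_name = N'DBA Team';
```

For the test described above, run the same call with `@notify_level_email = 3`, confirm the email arrives, then set it back to 2.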

Set a failsafe operator
- Tell your SQL Server Agent who it can notify in case of an emergency
- Set this on SQL Agent properties
- Note: When you first set up the mail profile, you will need to restart the SQL Server Agent service.

You need CHECKDB, but you need to be careful
- CHECKDB reads pages from disk and looks for corruption
- This should be run against all system and user databases
- One place to be cautious: this is performance intensive, and your RedactedImportantDB database doesn't have a lot of free time in the nightly maintenance cycle

Automate CHECKDB for all databases
Make sure you find out about corruption as soon as you can
1. Add CHECKDB for all databases except RedactedImportantDB on a nightly basis
   - Exclude RedactedImportantDB from Ola Hallengren's job and schedule it nightly
2. Add CHECKDB for the RedactedImportantDB database on a weekly basis
   - This will require a copy of the job which only does that database
   - Set up the job on both replicas
   - Make sure that this does not overlap with full backups or index maintenance, ever!
   - Watch the job closely on the first run and make sure it doesn't cause a failover or those 15-second messages in the SQL Server log
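The two schedules above could be sketched as two job steps, assuming Ola Hallengren's `DatabaseIntegrityCheck` procedure is installed in the current database. The split into two separate Agent jobs and their schedules are up to you.

```sql
-- Nightly job step: every database except the big one.
EXECUTE dbo.DatabaseIntegrityCheck
    @Databases     = 'ALL_DATABASES, -RedactedImportantDB',
    @CheckCommands = 'CHECKDB';

-- Weekly job step: only the big database, scheduled so it never
-- overlaps full backups or index maintenance.
EXECUTE dbo.DatabaseIntegrityCheck
    @Databases     = 'RedactedImportantDB',
    @CheckCommands = 'CHECKDB';
```

Without the maintenance solution, the plain command for one database is `DBCC CHECKDB (N'RedactedImportantDB') WITH NO_INFOMSGS, ALL_ERRORMSGS;`.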

Don't delete backups until you run CHECKDB
- You can end up with data corruption in production and in all your backups
- Rule of thumb: only delete a full backup after you get reasonable assurance that it isn't corrupt
- You can do this by:
  - Restoring the backup and running CHECKDB
  - Running CHECKDB against the production database after the full backup completes
- Make sure:
  - You have the backups on separate storage
  - You're retaining enough copies to answer "What was the data like N days ago?"
- Some customers keep multiple copies of backup files if restoring to a historical point in time can be important to their business.

Enable the Remote DAC
And practice using it!
- This is the Dedicated Admin Connection
- One sysadmin can use it at a time
- SQL Server reserves dedicated resources (its own scheduler) for it, even if you don't have remote access enabled
- This comes in very handy in a performance crisis
- Learn how to enable it and practice using it at BrentOzar.com/go/DAC
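Enabling remote access to the DAC is a single configuration change, sketched here. The server name in the comments is a placeholder.

```sql
-- Allow the Dedicated Admin Connection from remote machines.
EXECUTE sp_configure 'remote admin connections', 1;
RECONFIGURE;

-- Confirm the setting is in effect.
SELECT name, value_in_use
FROM sys.configurations
WHERE name = N'remote admin connections';

-- To practice connecting (run from a command prompt, not as T-SQL):
--   sqlcmd -S YourServerName -A
-- or, in a Management Studio query window, connect to:
--   ADMIN:YourServerName
```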

Questions?

Now let's get you in to see one of our performance specialists. Not quite like this guy.