New England Data Camp v2.0
It is all about the data!
Caregroup Healthcare System
Ayad Shammout
Lead Technical DBA
ashammou@caregroup.harvard.edu
About Caregroup
SQL Server Database Mirroring
  Selected SQL Server 2008 enhancements & Demos
SQL Server Failover Clustering
  Selected SQL Server 2008 enhancements
Best Practices
Summary and Q&A
Four hospitals located in Boston
16,000 employees
146 mission-critical clinical applications
3.5 million patient medical records
Annual revenue: $2 billion
HA/DR requirements for clinical applications:
  RTO: zero downtime
  RPO: no data loss
All mission-critical SQL Servers are clustered and mirrored
Minimize or avoid service downtime
  Whether planned or unplanned
When components fail, service interruption is brief or non-existent
  Automatic failover
Eliminate single points of failure (as affordable)
  Redundant components
  Fault-tolerant servers
Permitted downtime (planned vs. unplanned?)

Uptime SLA   Downtime per day   Downtime per month   Downtime per year
99.999%      00:00.9            0:00:26              0:05:16
99.99%       0:00:09            0:04:23              0:52:36
99.90%       0:01:26            0:43:50              8:45:57
99%          0:14:24            7:18:17              87:39:30

Define Recovery Point Objective (RPO)
Define Recovery Time Objective (RTO)
Application response times
Note: Database uptime is not equivalent to application availability
  Failures of other application services
  Network outages
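The downtime figures in the SLA table follow from simple arithmetic: the permitted downtime is the unavailable fraction multiplied by the length of the period. The sketch below is illustrative only. The function names are hypothetical, and it assumes a 365.25-day year and a month of one twelfth of a year, so results can differ from the table by a few seconds depending on rounding.

```python
# Minimal sketch: permitted downtime for a given uptime SLA.
# Assumptions (not from the slides): 365.25-day year, month = year / 12.
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25

PERIOD_SECONDS = {
    "day": SECONDS_PER_DAY,
    "month": SECONDS_PER_DAY * DAYS_PER_YEAR / 12,
    "year": SECONDS_PER_DAY * DAYS_PER_YEAR,
}


def permitted_downtime_seconds(sla_percent: float, period: str) -> float:
    """Return the downtime (in seconds) allowed by an uptime SLA over a period."""
    unavailable_fraction = (100.0 - sla_percent) / 100.0
    return PERIOD_SECONDS[period] * unavailable_fraction


def format_hms(seconds: float) -> str:
    """Format seconds as h:mm:ss, rounded to the nearest second."""
    total = int(round(seconds))
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    for sla in (99.999, 99.99, 99.9, 99.0):
        row = [format_hms(permitted_downtime_seconds(sla, p))
               for p in ("day", "month", "year")]
        print(sla, row)
```

Running the script prints one row per SLA level, which can be compared against the table on the slide.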
Physical Infrastructure Failures
  Storage subsystem
  Disk
  Controller
  Network
  Server
  Power
Logical Data Failures
  Operator errors
  DBMS interruption
  Drops / deletes
  Application defects
  DBMS defects
  Data corruption
Standby Mode    Failover Behavior                                          SQL Server Feature
Cold standby    Manual intervention required to restore offline data copy  Backup and restore
Warm standby    Data copy online and ready; manual failover required       Transaction log shipping, database mirroring
Hot standby     Automatic failover                                         Database mirroring, failover clustering
Database Mirroring                                       Failover Clustering
Scope: user DB                                           Scope: DBMS instance
Standard hardware                                        Certified hardware
One SQL license (unless querying snapshots on mirror)    One SQL license (only one node can access database)
Very fast failover (seconds)                             Automatic failover (up to minutes)
OS flexible (e.g. 32/64)                                 Enterprise OS
Independent storage                                      Shared storage
Independent services                                     Clustered services
Reporting on mirror                                      Standby not available
Geographic separation OK                                 Servers are usually co-located
Database Mirroring
Instant standby
  Conceptually a fault-tolerant server
  Building block for complex topologies
Database failover
  Very fast (seconds)
  Zero data loss
  Automatic or manual failover
  Automatic re-sync after failover
  Automatic, transparent client redirect
High-Availability Mode
  Safety is FULL, with a witness
  Database is available whenever a quorum exists
High-Protection Mode
  Safety is FULL, but there is no witness
  If the principal loses quorum, it stops servicing the database
  Ensures high protection; the database is never in an exposed state
  Manual failover only, no automatic failover
High-Performance Mode
  Safety is OFF and the witness is irrelevant
  Manual failover: force service
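The three operating modes above are fully determined by two settings: the safety level and whether a witness is present. As a rough illustration, the mapping can be written as a small lookup. The function names are hypothetical and only encode what the slide states.

```python
# Minimal sketch: derive the mirroring operating mode from the two settings
# named on the slide. Function names are illustrative, not a SQL Server API.

def mirroring_mode(safety: str, witness: bool) -> str:
    """Map (safety, witness) to the operating mode described on the slide."""
    level = safety.upper()
    if level == "FULL":
        return "High-Availability" if witness else "High-Protection"
    if level == "OFF":
        # With safety OFF the witness is irrelevant.
        return "High-Performance"
    raise ValueError(f"unknown safety level: {safety!r}")


def supports_automatic_failover(safety: str, witness: bool) -> bool:
    """Only the mode with safety FULL plus a witness fails over automatically."""
    return mirroring_mode(safety, witness) == "High-Availability"
```

For example, `mirroring_mode("FULL", False)` returns `"High-Protection"`, where failover is manual only.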
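[Diagram: commit flow in database mirroring, in numbered steps 1 to 5, between the Application, the Principal SQL Server (log and data), the Mirror SQL Server (log and data), and the Witness. The mirror is always redoing the log, so it remains current.]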
Demo
Automatic recovery from page corruption
Log stream compression: helps in both synchronous and asynchronous modes
  For example, for one SAP deployment:
    Log Bytes Sent/sec = 300 K
    Log Compressed Bytes Sent/sec = 110 K
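To put the two counter values from the SAP example in perspective, the saving from log stream compression can be computed directly. This is a small arithmetic sketch; the function name is hypothetical and the inputs are the figures quoted on the slide.

```python
# Minimal sketch: bandwidth saved by log stream compression,
# using the two performance counter values quoted on the slide.

def compression_savings(raw_bytes_per_sec: float, compressed_bytes_per_sec: float) -> float:
    """Return the fraction of log traffic saved by compression."""
    return 1.0 - compressed_bytes_per_sec / raw_bytes_per_sec


if __name__ == "__main__":
    # Log Bytes Sent/sec = 300 K, Log Compressed Bytes Sent/sec = 110 K
    saved = compression_savings(300_000, 110_000)
    print(f"Log traffic reduced by about {saved:.0%}")
```

With the slide's numbers, roughly 63% of the log traffic is saved.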
Backup compression helps in mirroring
  Reduced backup, file copy & restore time
  Some users have reported the following (your mileage will vary):
    25 to 85% space saving
    35 to 50% faster backup & restore time
Great feature for database migration: 2005 to 2005, 2005 to 2008
Failover of multiple databases
  Set up alerts
Replication stops if DBM is paused
  To prevent the Subscriber getting ahead of the mirror
  To allow replication to continue with DBM paused:
    SQL 2008: Globally enable trace flag 1448
    SQL 2005: Apply the hotfix for SQL 2005 SP2 and then use the trace flag
      http://support.microsoft.com/kb/937041
Cannot restore from a database snapshot on the principal
If an application uses multiple databases, synchronous with witness mode is not recommended
  Failover granularity is the database
Application connection using aliases
  Helps in uninterrupted replacement of the mirror
  Only the alias definition needs to be altered
Connection string (PRINCIPAL = Alias1, MIRROR = Alias2):
  Data Source=Alias1;Failover Partner=Alias2;Initial Catalog=DBMTest;Integrated Security=True
To replace the mirror: BUILD NEW MIRROR & REDEFINE ALIAS
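Because applications only ever see the two alias names, the connection string stays constant when the mirror is replaced. A minimal sketch of assembling such a string follows. The helper name is hypothetical, and the keywords are the ones shown on the slide, written in the spaced form (`Initial Catalog`).

```python
# Minimal sketch: assemble a mirroring-aware connection string from aliases.
# build_connection_string is an illustrative helper, not part of any driver.

def build_connection_string(principal_alias: str, mirror_alias: str, database: str) -> str:
    """Join keyword=value pairs in the order used on the slide."""
    parts = [
        ("Data Source", principal_alias),      # alias of the current principal
        ("Failover Partner", mirror_alias),    # alias of the current mirror
        ("Initial Catalog", database),
        ("Integrated Security", "True"),
    ]
    return ";".join(f"{key}={value}" for key, value in parts)


if __name__ == "__main__":
    print(build_connection_string("Alias1", "Alias2", "DBMTest"))
```

Since the string references only `Alias1` and `Alias2`, rebuilding the mirror on a new server requires redefining the alias, not redeploying the application configuration.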
Long Disconnects
  Mirror unavailable: session state is DISCONNECTED
  Mirroring session suspended: session state is SUSPENDED
  Log records keep accumulating at the principal
  The transaction log cannot be truncated, even if you back up the transaction log
  The transaction log may eventually fill up, and the database comes to a halt
  Look at the LOG_REUSE_WAIT_DESC column in sys.databases
  RESUME the mirroring session, or break it
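The check on LOG_REUSE_WAIT_DESC can be sketched as follows. The query reads the two relevant columns of sys.databases, and the helper picks out databases whose log truncation is being held up by mirroring. The helper name and the sample rows are hypothetical; running the query against a server is left to whatever client library is in use.

```python
# Minimal sketch: find databases whose log cannot be truncated because of mirroring.
# The sample rows below are made up for illustration.

LOG_REUSE_QUERY = "SELECT name, log_reuse_wait_desc FROM sys.databases;"

# Value reported in log_reuse_wait_desc when mirroring is holding up log reuse.
MIRRORING_WAIT = "DATABASE_MIRRORING"


def blocked_by_mirroring(rows):
    """Return database names whose log reuse is waiting on database mirroring.

    rows: iterable of (name, log_reuse_wait_desc) tuples, e.g. the result
    of executing LOG_REUSE_QUERY.
    """
    return [name for name, wait_desc in rows if wait_desc == MIRRORING_WAIT]


if __name__ == "__main__":
    sample_rows = [
        ("DBMTest", "DATABASE_MIRRORING"),
        ("master", "NOTHING"),
        ("Sales", "LOG_BACKUP"),
    ]
    print(blocked_by_mirroring(sample_rows))
```

For each database returned, the remedy from the slide applies: resume the session (ALTER DATABASE ... SET PARTNER RESUME) or break mirroring (ALTER DATABASE ... SET PARTNER OFF).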
Failover Cluster Validation tool
DHCP support
IPv6 support
Service SIDs
Up to 16-node support
SQL Server 2008 does NOT support the new OR dependency feature of Windows Server 2008
Understand that Windows Server 2008 clustering is different from Windows Server 2003 clustering
Migration as an upgrade option
  No major change in behavior from previous versions
  Install side-by-side on the same hardware (or separate hardware)
  Then detach / attach the database
Rolling upgrade enables minimal service unavailability
  Failover clustering alone or with database mirroring
Be aware of subtle differences between Windows Server 2008 and Windows Server 2003 clustering
Demo: New SQL Server failover instance installation
In-Place Upgrade SQL 2005 to SQL 2008
[Diagram: starting configuration, a two-node Active / Passive cluster on EMC]
  Windows Server 2003 R2 EE SP2, 64-bit
  SQL Server 2005 EE SP2, 64-bit
Step #1: Install prerequisites, then REBOOT
Step #2: Install prerequisites, then REBOOT
Prerequisites (the same list for both steps):
  1. .NET Framework 3.5 SP1
  2. Windows Installer 4.5
  3. Windows QFE (KB937444)
  4. SQL Server 2008 Setup support files
[Diagram labels: Active SQL instance, Passive, Manual Failover]
Step #3: Upgrade to SQL Server 2008 on the passive node
  Removed from the cluster group's preferred owners
  No client connection for 1-2 minutes while the database is being upgraded to 2008 on the left node
Step #4: Upgrade to SQL Server 2008 on the active node
Step #5: SQL instance automatic failover
[Diagram labels: SQL 2008, Passive, Active]
[Diagram: upgrade to SQL Server 2008 of a SQL Server cluster (Active / Passive, principal) with a mirrored instance. Recoverable labels: Step #1: Upgrade to SQL Server 2008 on the mirrored instance; Step #2 and Step #4: Manual failover to the database mirroring partner for each database; Step #3: SQL Server cluster; mirroring suspended, then resumed.]
[Diagram: applications connect through an alias to the principal, with a mirror server at the DR site]
Alias name = Green, served by Cisco Global Site Selector (GSS) DNS
  IPs: 100.10.56.30 (active), 100.85.3.10
Applications connect to: Green\SQL1
  1. SharePoint
  2. SSRS
  3. BlackBerry
  4. Citrix Server
  5. VMware VC
Principal server (SQL Server cluster): SQLNetworkNameA\SQL1, Active, IP: 100.10.56.30
Mirror server (DR site): SQLHostNameB\SQL1, Passive, IP: 100.85.3.10
Useful pointers
SQL Server 2008 Failover Clustering whitepaper
  http://www.sqlcat.com
Rolling in-place cluster upgrade process
  http://msdn.microsoft.com/en-us/library/ms191295.aspx
How to create a single-node SQL Server 2008 failover cluster
  http://msdn.microsoft.com/en-us/library/ms179530.aspx
How to add a node to a SQL Server 2008 failover cluster
  http://msdn.microsoft.com/en-us/library/ms191545.aspx
An advanced cluster installation option, which prepares cluster nodes first and then completes the cluster across prepared nodes
  http://msdn.microsoft.com/en-us/library/ms144259.aspx