Demystifying Storage Area Networks
Michael Wells
Microsoft Application Solutions Specialist, EMC Corporation
About Me
- DBA for 7+ years
- Developer for 10+ years
- MCSE: Data Platform
- MCSE: SQL Server 2012
- MCITP: Database Administrator 2008
- MCITP: Database Developer 2008
- MCTS: SQL Server 2005
Agenda
- SAN Protocols
- SAN Storage Types
- The parts of a SAN
- Different SAN Architectures
- Understanding RAID
- Understanding Multipathing
SAN Protocols
- Fibre Channel (FC): high-performance, dedicated fiber network that requires special hardware
- iSCSI: sends SCSI commands over traditional Ethernet interfaces (uses TCP/IP)
- Fibre Channel over Ethernet (FCoE): carries Fibre Channel traffic over a lossless Ethernet network and requires special hardware
SAN Storage Types
- Block Storage: volumes must be mounted to a host
- File Storage: configured at the array with a file system and can be addressed by a UNC path
- Unified Storage: an array capable of supporting both storage types
The Main Parts of a SAN (FC)
- LUN (Logical)
- HBA (Physical)
- Fiber Optic Cable (Physical)
- Fibre Channel Switch (Physical)
- Storage Processor/Engine (Physical)
- Cache (Physical)
- Disk Array Enclosure, DAE (Physical)
- Storage Pool (Logical)
- RAID Set (Logical)
- Drives (Physical)
LUN: Logical Unit Number
- Volume mounted to one or more servers that appears in Windows as a drive
- Can be mounted using a drive letter or a mount point
HBA: Host Bus Adapter
Physical card that connects the server to the SAN Fabric

Rating | Net Throughput | Efficiency
1Gb    | 98.44 MB/s     | 77.7%
2Gb    | 196.9 MB/s     | 77.7%
4Gb    | 393.8 MB/s     | 77.7%
8Gb    | 787.6 MB/s     | 77.7%
10Gb   | 1,181 MB/s     | 94.2%
16Gb   | 1,575 MB/s     | 94.2%
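The net throughput figures above translate directly into how long a bulk transfer (a backup or restore, for example) takes over one HBA port. The following is a minimal sketch, assuming the rates are sustained best-case values and that 1 GB = 1000 MB; the function name `transfer_seconds` and the `ports` parameter are hypothetical, introduced only for illustration.

```python
# Net throughput per HBA rating in MB/s, copied from the slide's table.
NET_THROUGHPUT_MBS = {
    "1Gb": 98.44,
    "2Gb": 196.9,
    "4Gb": 393.8,
    "8Gb": 787.6,
    "10Gb": 1181.0,
    "16Gb": 1575.0,
}


def transfer_seconds(size_gb: float, rating: str, ports: int = 1) -> float:
    """Best-case seconds to move size_gb (decimal GB) across `ports` HBA ports.

    Assumes the load spreads evenly over all ports and that nothing else
    (disks, switch, storage processor) is the bottleneck.
    """
    if ports < 1:
        raise ValueError("ports must be at least 1")
    size_mb = size_gb * 1000.0
    return size_mb / (NET_THROUGHPUT_MBS[rating] * ports)
```

In practice the drives or the storage processors usually limit throughput before the HBA does, so treat the result as a lower bound on transfer time.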
Fiber Optic Cable
- Physical medium used for data transmission between components connected to the SAN Fabric
Fibre Channel Switch
- Networking device that handles traffic between components; the heart of the SAN Fabric
Storage Processor/Engine
- The brains of the SAN
- There are at least two, and they can be Active/Passive or Active/Active depending on the system
Cache
- Split into Read Cache and Write Cache
- Read Cache avoids re-reading data from spinning disk
- Write Cache allows a write operation to be acknowledged before the data is committed to disk (should have battery backup)
- Can be memory in the Storage Processor/Engine or drives in the array (usually SSD)
Disk Array Enclosure (DAE)
- Hardware that holds hard drives and is connected to the storage processors
Storage Pool
- One or more RAID sets grouped into a single pool from which LUNs are allocated
- Usually built using drives of the same size and speed
- Can be a mixed pool when storage tiering technology is used
RAID Set
Redundant Array of Inexpensive Disks: a method of grouping disks together for performance, redundancy, or both

Level    | Technique                   | Trade-off
RAID 0   | Striping                    | Performance with no redundancy
RAID 1   | Mirroring                   | Redundancy, but not space efficient
RAID 5   | Striping with Parity        | Redundancy, with better space efficiency
RAID 6   | Striping with Double Parity | Redundancy, with better fault tolerance than RAID 5
RAID 1/0 | Mirroring + Striping        | Performance with redundancy
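The space-efficiency trade-off between RAID levels can be made concrete with a little arithmetic. The sketch below assumes identical drives, two-way mirrors for RAID 1 and RAID 1/0, one drive's worth of parity for RAID 5, and two for RAID 6; the function name `usable_capacity_tb` is hypothetical, and real arrays also reserve space for hot spares and metadata that this ignores.

```python
def usable_capacity_tb(level: str, drives: int, drive_tb: float) -> float:
    """Usable capacity of one RAID set built from identical drives."""
    if level == "0":
        data_drives = drives              # striping only, nothing reserved
    elif level in ("1", "1/0"):
        if drives < 2 or drives % 2:
            raise ValueError("mirroring needs an even number of drives")
        data_drives = drives // 2         # every block is stored twice
    elif level == "5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        data_drives = drives - 1          # one drive's worth of parity
    elif level == "6":
        if drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        data_drives = drives - 2          # two drives' worth of parity
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    return data_drives * drive_tb
```

With eight 2 TB drives, this gives 16 TB for RAID 0, 14 TB for RAID 5, 12 TB for RAID 6, and 8 TB for RAID 1/0.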
Drives
The physical drive that stores the data blocks
- Enterprise Flash Drives (EFD): 3500 IOPS
  - Single-Level Cell (SLC): low density with the best durability
  - Multi-Level Cell (MLC): higher density at the cost of long-term durability
- SAS Drives
  - 15k RPM: 160 IOPS
  - 10k RPM: 140 IOPS
- Near-Line (NL) SAS Drives
  - 7.2k RPM: 90 IOPS
- SATA Drives
  - SATA2 7.2k: 80 IOPS
  - SATA 7.2k: 60 IOPS
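The per-drive IOPS figures above are what make drive counts predictable. As a rough sketch, assuming those figures are sustainable per-drive rates and ignoring any help from cache, the number of drives needed for a given back-end load is a simple ceiling division. The name `drives_needed` and the dictionary keys are hypothetical labels for the drive types listed on the slide.

```python
import math

# Per-drive IOPS figures taken from the slide above.
DRIVE_IOPS = {
    "EFD": 3500,
    "SAS 15k": 160,
    "SAS 10k": 140,
    "NL-SAS 7.2k": 90,
    "SATA2 7.2k": 80,
    "SATA 7.2k": 60,
}


def drives_needed(backend_iops: float, drive_type: str) -> int:
    """Minimum number of drives of one type to sustain backend_iops."""
    return math.ceil(backend_iops / DRIVE_IOPS[drive_type])
```

For example, a 5,000 IOPS back-end load needs 32 SAS 15k drives by this estimate, but only 2 EFDs.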
The Four Storage Architectures
- TYPE 1, Clustered Scale Up & Down: General Purpose Storage; Balance of Perf/Cost/RAS; Integrated & Unified; Efficient & Simple; Transactional Commits
- TYPE 2, Tightly Coupled Scale Out: Brains Share Memory; Distributed Data; Data Available to All Brains; Shared Meta-Data; Transactional Commits
- TYPE 3, Loosely Coupled Scale Out: Independent Brains; Inter-Brain Communication; Distributed Data; Data Available to All Brains; Transactional Commits
- TYPE 4, Distributed Shared Nothing: Independent Brains; Direct Attach Storage; Lazy / Forced Data Dist.; Distributed Commits; Non-Transactional Commits
Key Performance Metrics
- IOPS: I/O operations per second
  - Front-End IOPS: IO traffic generated at the host and sent to the Storage Processor/Engine
  - Back-End IOPS: IO traffic generated at the Storage Processor and sent to the disks
- Bandwidth: the amount of data being sent/received per unit of time
- Latency: the time it takes for an IO request to complete
What Size IO Does SQL Server Use?

File type | Operation                        | READ pattern       | WRITE pattern       | Threads used                                            | I/O type
Data File | Normal Activity                  | 8 KiB up to 128 KiB | 8 KiB up to 128 KiB | Based on MaxDOP                                         | Random
Data File | Checkpoint                       | N/A                | 64 KiB up to 128 KiB | # of Sockets in Computer                                | Random
Data File | LazyWriter                       | N/A                | 64 KiB up to 128 KiB | 1 per NUMA Node                                         | Random
Data File | Bulk Insert                      | N/A                | 8 KiB up to 128 KiB | Based on MaxDOP                                         | Sequential
Data File | Backup                           | 1 MB               | 1 MB                | Based on MaxDOP                                         | Sequential
Data File | Restore                          | 64 KiB             | 64 KiB              | Based on MaxDOP                                         | Sequential
Data File | DBCC Checkdb w/ no repair option | 8 KiB up to 64 KiB | N/A                 | Based on MaxDOP                                         | Sequential
Data File | Rebuild Index                    | See Read Ahead     | 8 KiB up to 128 KiB | Based on MaxDOP                                         | Sequential
Data File | ReadAhead                        | Up to 512 KiB      | N/A                 | Based on MaxDOP                                         | Sequential
Log File  | Normal Activity                  | 512 bytes - 64 KiB | 512 bytes - 64 KiB  | One log writer thread per soft NUMA node with a cap of 4 | Sequential
Understanding RAID
RAID protection comes at a cost. For small-block random writes, 1 application write IO becomes:
- RAID 1/0: 2 back-end write IO
- RAID 5: 4 back-end IO (2 read IO + 2 write IO)
- RAID 6: 6 back-end IO (3 read IO + 3 write IO)
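The write penalties above let you estimate back-end IOPS from a front-end workload. This is a minimal sketch, assuming small-block random I/O, that every front-end read costs exactly one back-end read, and that cache absorbs nothing; the function name `backend_iops` and the `write_fraction` parameter are hypothetical.

```python
# Back-end IOs generated per application write, from the slide above.
WRITE_PENALTY = {"1/0": 2, "5": 4, "6": 6}


def backend_iops(frontend_iops: float, write_fraction: float, raid_level: str) -> float:
    """Estimate back-end IOPS for a front-end load with the given write mix."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    reads = frontend_iops * (1.0 - write_fraction)   # one back-end IO each
    writes = frontend_iops * write_fraction          # multiplied by the penalty
    return reads + writes * WRITE_PENALTY[raid_level]
```

A 1,000 IOPS workload with 30% writes comes out at roughly 1,300 back-end IOPS on RAID 1/0, 1,900 on RAID 5, and 2,500 on RAID 6.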
RAID Recommendations for SQL
- Always place log files on RAID 1/0 or RAID 1
  - Better protection from failure
  - Better write performance (log activity is almost all write)
- Consider using RAID 5 or RAID 6 for data files
  - Data files are less write intensive than log files and can benefit from the less expensive RAID level
- Consider using RAID 1/0 for TEMPDB
  - Faster writes will improve performance when temporary objects are used and/or operations spill over to TEMPDB
https://technet.microsoft.com/en-us/library/cc966534.aspx
Understanding Multipathing The primary purpose of multipathing is redundancy
Multipathing - Redundancy Storage Processor A Storage Processor B
Understanding Multipathing
- The secondary purpose of multipathing is performance and load balancing
Multipathing - Performance Storage Processor A Storage Processor B
Microsoft Multipath I/O (MPIO)
- Uses more than one path for read and write functions to your storage device
- Provides redundant failover and load-balancing support for disks or LUNs
- Supports bandwidth aggregation
- Distributes I/O transactions across multiple adapters
- Windows Server feature
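The two goals of multipathing, failover and load balancing, can be illustrated with a toy path selector. This is only a conceptual sketch of a round-robin policy with failover, not how MPIO is implemented; the class `MultipathDevice`, its methods, and the path names are all hypothetical.

```python
class MultipathDevice:
    """Toy model of one LUN reachable over several paths."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.failed = set()
        self._next = 0  # index of the path to try first on the next IO

    def fail(self, path):
        """Mark a path as down (for example, a pulled cable)."""
        self.failed.add(path)

    def restore(self, path):
        """Mark a path as healthy again."""
        self.failed.discard(path)

    def pick_path(self):
        """Return the next healthy path in round-robin order."""
        for _ in range(len(self.paths)):
            path = self.paths[self._next]
            self._next = (self._next + 1) % len(self.paths)
            if path not in self.failed:
                return path
        raise RuntimeError("all paths to the device are down")
```

While both paths are healthy, IOs alternate between them (load balancing). When one path fails, every IO goes down the survivor (redundancy), and the device becomes unreachable only when all paths are down.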
The Importance of a Baseline
- A query takes 30 seconds to run. Is that too long?
- What about a query that runs in 3 seconds?
- To identify abnormal behavior, you must first understand normal
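One simple way to turn a baseline into a yes/no answer is to measure how far a new observation sits from the historical mean. The sketch below flags anything more than three standard deviations away; that threshold, the function name `is_abnormal`, and the sample durations are assumptions chosen for illustration, not a recommended standard.

```python
from statistics import mean, stdev


def is_abnormal(baseline, current, threshold=3.0):
    """True if `current` is more than `threshold` std devs from the baseline mean.

    `baseline` needs at least two historical measurements.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > threshold


# Hypothetical history: this query normally runs for about 30 seconds.
history_seconds = [28, 30, 31, 29, 32]
```

Against this history, `is_abnormal(history_seconds, 30)` is false while `is_abnormal(history_seconds, 3)` is true: for this query, the 3-second run is the one that deserves investigation.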
SQL Monitoring
- Dynamic Management Views (DMVs)
- SQL Profiler/Trace
- Extended Events
- Performance Dashboard
- Management Data Warehouse (MDW)
Dynamic Management Views: sys.dm_io_virtual_file_stats
SELECT
    [ReadLatency] =
        CASE WHEN [num_of_reads] = 0
            THEN 0 ELSE ([io_stall_read_ms] / [num_of_reads]) END,
    [WriteLatency] =
        CASE WHEN [num_of_writes] = 0
            THEN 0 ELSE ([io_stall_write_ms] / [num_of_writes]) END,
    [Latency] =
        CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)
            THEN 0 ELSE ([io_stall] / ([num_of_reads] + [num_of_writes])) END,
    [AvgBPerRead] =
        CASE WHEN [num_of_reads] = 0
            THEN 0 ELSE ([num_of_bytes_read] / [num_of_reads]) END,
    [AvgBPerWrite] =
        CASE WHEN [num_of_writes] = 0
            THEN 0 ELSE ([num_of_bytes_written] / [num_of_writes]) END,
    [AvgBPerTransfer] =
        CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)
            THEN 0 ELSE
                (([num_of_bytes_read] + [num_of_bytes_written]) /
                ([num_of_reads] + [num_of_writes])) END,
    LEFT ([mf].[physical_name], 2) AS [Drive],
    DB_NAME ([vfs].[database_id]) AS [DB],
    [mf].[physical_name]
FROM sys.dm_io_virtual_file_stats (NULL, NULL) AS [vfs]
JOIN sys.master_files AS [mf]
    ON [vfs].[database_id] = [mf].[database_id]
    AND [vfs].[file_id] = [mf].[file_id]
-- WHERE [vfs].[file_id] = 2 -- log files
-- ORDER BY [Latency] DESC
-- ORDER BY [ReadLatency] DESC
ORDER BY [WriteLatency] DESC;
GO
Query credit: Paul Randal and Jimmy May
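The counters in sys.dm_io_virtual_file_stats are cumulative since the instance last started, so the query above reports lifetime averages. To see latency over a specific window, take two snapshots and divide the differences. The following is a minimal sketch of that arithmetic, assuming you have already captured the four counters for one file at two points in time; the `FileStats` class and `interval_latency_ms` function are hypothetical helpers, not part of any SQL Server API.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """One snapshot of the DMV counters for a single database file."""
    num_of_reads: int
    io_stall_read_ms: int
    num_of_writes: int
    io_stall_write_ms: int


def interval_latency_ms(before: FileStats, after: FileStats):
    """Average (read, write) latency in ms between two snapshots."""
    reads = after.num_of_reads - before.num_of_reads
    writes = after.num_of_writes - before.num_of_writes
    read_stall = after.io_stall_read_ms - before.io_stall_read_ms
    write_stall = after.io_stall_write_ms - before.io_stall_write_ms
    # Guard against divide-by-zero when a file saw no IO in the window.
    read_ms = read_stall / reads if reads else 0.0
    write_ms = write_stall / writes if writes else 0.0
    return read_ms, write_ms
```

Snapshots taken across an instance restart will produce negative deltas because the counters reset, so a real collector would need to detect that case and discard the interval.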
Dynamic Management Views
SQL Profiler
- Warning: SQL Trace was deprecated in SQL Server 2012
- No new events or enhancements!
- Look at ClearTrace for aggregating trace results: http://www.scalesql.com/cleartrace/
Extended Events
- Replaces the functionality of SQL Trace
- The only way to monitor new SQL Server events, such as those related to Availability Group replication
- Jonathan Kehayias has written a converter to migrate trace definitions to Extended Events: https://www.sqlskills.com/blogs/jonathan/converting-sql-traceto-extended-events-in-sql-server-2012/
Performance Dashboard
Management Data Warehouse
Windows Monitoring
- Windows Performance Counters
- Performance Monitor (PerfMon)
- System Center Operations Manager (SCOM)
Windows Performance Counters
Performance Monitor: LogicalDisk
- Latency: Avg. Disk sec/Transfer, Avg. Disk sec/Read, Avg. Disk sec/Write
- IOPS: Disk Transfers/sec, Disk Reads/sec, Disk Writes/sec
- Throughput: Disk Bytes/sec, Disk Read Bytes/sec, Disk Write Bytes/sec
- Transfer Size: Avg. Disk Bytes/Transfer, Avg. Disk Bytes/Read, Avg. Disk Bytes/Write
- Disk Queue Length: Current Disk Queue Length, Avg. Disk Queue Length, Avg. Disk Read Queue Length, Avg. Disk Write Queue Length
- Capacity: % Free Space, Free Megabytes
For details on these counters: http://blogs.technet.com/b/askcore/archive/2012/03/16/windows-performance-monitor-disk-counters-explained.aspx
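These counter groups are related to one another, which is useful for sanity-checking a capture. Throughput is IOPS multiplied by average transfer size, and by Little's Law the average number of outstanding IOs is approximately IOPS multiplied by average latency. The sketch below shows both relationships; the function names are hypothetical, and the queue estimate holds only for a workload in steady state.

```python
def throughput_bytes_per_sec(transfers_per_sec: float, avg_bytes_per_transfer: float) -> float:
    """Disk Bytes/sec implied by Disk Transfers/sec and Avg. Disk Bytes/Transfer."""
    return transfers_per_sec * avg_bytes_per_transfer


def expected_queue_length(transfers_per_sec: float, avg_sec_per_transfer: float) -> float:
    """Average outstanding IOs implied by Little's Law (IOPS x latency in seconds)."""
    return transfers_per_sec * avg_sec_per_transfer
```

For example, 2,000 transfers/sec at 64 KiB each is about 131 MB/s, and the same 2,000 transfers/sec at 5 ms average latency implies roughly 10 IOs in flight at any moment.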
Performance Monitor (PerfMon)
System Center Operations Manager (SCOM)
When I Find an Issue
Don't immediately blame the I/O subsystem. Using your baseline, look for things like:
- Query plan changes
- Additional indexes or index changes
- Access pattern or key changes
- Adding Change Data Capture (CDC) or triggers
- Enabling Snapshot Isolation
- Decreased server memory
- Increased session counts
Escalating Issues
- Provide as much relevant information as possible
- Show normal/expected performance metrics
  - These must already exist and be easily accessible
- Show how the current workload compares to normal/expected
  - If the current workload is different, what changed and is it permanent?
- Focus on solving the problem
  - Don't get bogged down in the blame game