Administração e Optimização de Bases de Dados 2012/2013
Hardware and OS Tuning
Bruno Martins, DEI@Técnico and DMIR@INESC-ID

OS and HW Tuning Considerations

OS
  - Threads: thread switching, priorities
  - Virtual memory: DB buffer size
  - File system: disk layout and access
Hardware
  - Storage subsystem: configuring the disk array, using the controller cache
  - Component upgrades
  - Multiprocessor architectures

Threads

Switching control from one thread to another is expensive.
Number of threads:
  - Fewer active threads mean less thread switching (and higher overall throughput), but longer waiting times.
Further tuning:
  - Adjust the number of threads in the OS or in the DBMS.
  - If possible, configure the system to use longer time slices, so that task switches happen less often.
  - The DBMS should not run at a lower priority than other applications.

Threads: Priority Inversion

Three transactions, T1, T2 and T3, in priority order (1 = high, 3 = low). Each transaction is either running or waiting.
  1. T3 obtains the lock on X and is preempted.
  2. T1 blocks on the lock on X, so it is descheduled.
  3. T2 does not access X and runs for a long time, preventing T3 from running and therefore from releasing the lock.
Net effect: T1 waits for T2, even though T2 has lower priority.

Pay attention to priority: avoid priority inversion.
  - Give all transactions the same priority (recommended by Oracle). This can lead to undesirable fairness, because no transaction can be favoured over another.
  - Dynamic priorities: the holder of a lock inherits the priority of the highest-priority transaction waiting for that lock (SQL Server).
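
The priority inversion scenario and the lock-holder inheritance rule can be replayed with a small simulation. This is only a sketch under stated assumptions: the one-tick scheduler, the arrival times and the amounts of work per transaction are hypothetical values chosen for illustration, and the code does not reproduce how SQL Server or any other DBMS schedules transactions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Txn:
    name: str
    priority: int        # 1 = highest priority
    arrival: int         # tick at which the transaction becomes ready (assumed)
    work: int            # CPU ticks needed to finish (assumed)
    needs_lock: bool     # True if it holds lock X for its whole run
    done: int = 0
    finished_at: Optional[int] = None


def simulate(inherit: bool) -> Dict[str, int]:
    """Run T1, T2, T3 on one CPU and return the tick at which each finishes."""
    txns = [
        Txn("T1", 1, arrival=1, work=2, needs_lock=True),
        Txn("T2", 2, arrival=1, work=5, needs_lock=False),
        Txn("T3", 3, arrival=0, work=3, needs_lock=True),
    ]
    holder: Optional[Txn] = None   # current owner of lock X
    tick = 0
    while any(t.finished_at is None for t in txns):
        ready = [t for t in txns if t.arrival <= tick and t.finished_at is None]
        # A transaction waits if it needs X and someone else holds it.
        waiters = [t for t in ready
                   if t.needs_lock and holder is not None and holder is not t]
        waiting_names = {t.name for t in waiters}
        runnable = [t for t in ready if t.name not in waiting_names]
        if not runnable:
            tick += 1
            continue
        effective = {t.name: t.priority for t in runnable}
        if inherit and holder is not None and waiters:
            # Lock holder inherits the priority of its highest-priority waiter.
            effective[holder.name] = min(holder.priority,
                                         min(w.priority for w in waiters))
        current = min(runnable, key=lambda t: effective[t.name])
        if current.needs_lock and holder is None:
            holder = current       # acquire lock X
        current.done += 1
        tick += 1
        if current.done == current.work:
            current.finished_at = tick
            if holder is current:
                holder = None      # release lock X
    return {t.name: t.finished_at for t in txns}


if __name__ == "__main__":
    print("static priorities:   ", simulate(inherit=False))
    print("priority inheritance:", simulate(inherit=True))
```

With these assumed workloads, T1 finishes at tick 10 under static priorities (it waits for all of T2) and at tick 5 when the lock holder inherits its priority.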

Impact of the Buffer

Goal of the buffer: reduce the number of physical accesses to secondary memory (usually disks).
The impact of the buffer on the number of physical accesses depends on three parameters:
  - Logical reads and writes: pages that the DBMS accesses via system read and write commands. Some of these will be in the buffer; the others will be translated to physical reads and writes (how many depends on the buffer size).
  - DBMS page replacements: physical writes to disk that occur when a page must be brought into the buffer, there are no free pages, and the occupied pages are dirty.
  - OS paging: physical accesses to disk that occur when part of the buffer space lies outside RAM. This should never happen.

Database Buffer

[Figure: database processes; unstable memory (RAM holding the database buffer and the log buffer, plus the paging disk); stable memory (log and data)]

Hit ratio = 1 - (PR / LR), where PR is the number of physical reads and LR the number of logical reads.
  - If the buffer is too small, the hit ratio is too low.
  - If the buffer is too large, there is a risk of paging.
Recommended strategy: monitor the hit ratio and increase the buffer size until the hit ratio flattens out. If there is still paging, then buy memory.
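
The hit ratio formula and the "increase until it flattens out" strategy can be tried out on a toy buffer. The sketch below is an illustration only: it assumes an LRU buffer, and the function names (`hit_ratio`, `simulate_lru`, `sweep`) and the synthetic skewed trace are hypothetical, not part of any DBMS.

```python
import random
from collections import OrderedDict


def hit_ratio(logical_reads, physical_reads):
    """Hit ratio = 1 - (PR / LR)."""
    if logical_reads <= 0:
        raise ValueError("need at least one logical read")
    return 1 - physical_reads / logical_reads


def simulate_lru(trace, buffer_pages):
    """Replay a page-access trace against an LRU buffer; return the hit ratio."""
    buffer = OrderedDict()          # least recently used page comes first
    physical_reads = 0
    for page in trace:
        if page in buffer:
            buffer.move_to_end(page)            # logical read served from RAM
        else:
            physical_reads += 1                 # page must be fetched from disk
            if len(buffer) >= buffer_pages:
                buffer.popitem(last=False)      # evict the LRU page
            buffer[page] = True
    return hit_ratio(len(trace), physical_reads)


def sweep(trace, sizes):
    """Hit ratio for each candidate buffer size, to spot where it flattens."""
    return [(size, simulate_lru(trace, size)) for size in sizes]


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed workload: 80% of accesses go to 100 hot pages, 20% to 10,000 cold ones.
    trace = [rng.randrange(100) if rng.random() < 0.8 else rng.randrange(100, 10100)
             for _ in range(50000)]
    for size, ratio in sweep(trace, [10, 50, 100, 200, 400, 800]):
        print(f"{size:4d} pages -> hit ratio {ratio:.3f}")
```

On a real system the two counters would come from the DBMS monitoring facilities rather than from a replayed trace.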

Database Buffer Size

Experiment: SQL Server 7 on Windows 2000; 630 MB relation; warm buffer (the table is scanned once before each run).
  - Scan query: either the entire relation is accessed in RAM, or the entire relation is accessed on disk. This is because of the LRU replacement policy.
  - Multipoint query: throughput increases linearly with buffer size, up to the point where all data is accessed from RAM.
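
The all-or-nothing behaviour of the scan query follows from how LRU treats a repeated sequential scan: if the buffer is even one page smaller than the relation, each page is evicted just before it is needed again. A minimal sketch, assuming a pure LRU buffer and a hypothetical helper `scan_hit_ratio` (page counts are illustrative, not the 630 MB relation of the experiment):

```python
from collections import OrderedDict


def scan_hit_ratio(relation_pages, buffer_pages, measured_scans=3):
    """Hit ratio of repeated full scans over an LRU buffer, after one warm-up scan."""
    buffer = OrderedDict()          # least recently used page comes first

    def access(page):
        hit = page in buffer
        if hit:
            buffer.move_to_end(page)
        else:
            if len(buffer) >= buffer_pages:
                buffer.popitem(last=False)      # evict the LRU page
            buffer[page] = True
        return hit

    for page in range(relation_pages):          # warm-up scan, not measured
        access(page)

    hits = total = 0
    for _ in range(measured_scans):
        for page in range(relation_pages):
            hits += access(page)
            total += 1
    return hits / total


if __name__ == "__main__":
    for size in (500, 999, 1000, 2000):
        print(size, scan_hit_ratio(relation_pages=1000, buffer_pages=size))
```

With a 1000-page relation, buffers of 500 or 999 pages give a hit ratio of 0.0, while 1000 or more pages give 1.0.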

Parameters for File Systems

Size of the disk chunks allocated at one time:
  - Allocate long sequential slices of disk to files that tend to be scanned, e.g., history table files, log files and other scan-intensive files.
Usage factor on disk pages: the percentage of a page that can be utilized while still permitting a further insertion.
  - The right value depends on the scan/update ratio.
  - High utilization helps scans, because fewer pages need to be scanned (provided there are no overflows).
  - Low utilization reduces the likelihood of overflows when updates change the size of a record (e.g., string fields).
Number of pages that may be prefetched:
  - Prefetching: a strategy used to speed up table/index scans by physically reading ahead more pages than a query requests at a specific point, in the hope that future requests can be fulfilled logically (from the buffer).
  - Useful for queries that scan files.

Usage Factor

Experiment: DB2 UDB v7.1 on Windows 2000; table scan (aggregation).
Throughput increases significantly with the usage factor.
  - The bigger the usage factor, the fuller the pages are.
  - The fuller the pages are, the fewer pages have to be read.
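
The link between usage factor and scan cost is simple arithmetic: a lower usage factor means fewer records per page and therefore more pages per scan. A rough sketch, assuming fixed-size records and ignoring page headers and overflow pages (the function name and the example sizes are hypothetical):

```python
def pages_to_scan(rows, row_bytes, page_bytes, usage_percent):
    """Number of pages a full scan reads for a given usage factor (in percent)."""
    usable_bytes = page_bytes * usage_percent // 100
    rows_per_page = usable_bytes // row_bytes
    if rows_per_page == 0:
        raise ValueError("a record does not fit in the usable part of a page")
    return -(-rows // rows_per_page)    # ceiling division


if __name__ == "__main__":
    for usage in (50, 70, 90, 100):
        pages = pages_to_scan(rows=1_000_000, row_bytes=100,
                              page_bytes=4000, usage_percent=usage)
        print(f"usage factor {usage:3d}% -> {pages} pages")
```

Halving the usage factor from 100% to 50% doubles the number of pages read in this model, which is consistent with throughput rising with the usage factor for scans.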

Prefetching

Experiment: DB2 UDB v7.1 on Windows 2000; table scan (aggregation).
Throughput increases with the prefetching size, but only up to a certain point.

RAID Levels (recap)

RAID Level 0: block striping; non-redundant.
  - Reads and writes use large block sizes.
  - Used in high-performance applications where data loss is not critical.
  - Stripe size should be adjusted to the DBMS page size.
RAID Level 1: mirrored disks.
  - Writes are synchronous and therefore must wait for the slowest disk.
  - Slower than RAID 0, but safe.
  - Popular for applications such as storing log files in a database system.
RAID Level 5: rotated parity striping.
  - Partitions data and parity among all N + 1 disks, rather than storing data in N disks and parity in 1 disk.
  - E.g., with 5 disks, the parity block for the n-th set of blocks is stored on disk (n mod 5) + 1, with the data blocks stored on the other 4 disks.
RAID Level 10: striping with mirroring.
  - Offers the best write performance.
  - Popular for applications such as storing log files in a database system.
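
The rotated parity placement of RAID 5 can be written down directly from the rule above. The sketch below also shows XOR parity, which is the standard RAID 5 parity computation but is not spelled out in these slides; the function names are hypothetical and all blocks are assumed to have the same length.

```python
from functools import reduce


def parity_disk(n, disks=5):
    """Disk (numbered from 1) holding the parity block of the n-th set of blocks."""
    return (n % disks) + 1


def data_disks(n, disks=5):
    """Disks holding the data blocks of the n-th set (all but the parity disk)."""
    return [d for d in range(1, disks + 1) if d != parity_disk(n, disks)]


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks; used for parity and for rebuilds."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    for n in range(6):
        print(f"set {n}: parity on disk {parity_disk(n)}, data on {data_disks(n)}")

    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00", b"\x0f\xf0"]
    parity = xor_blocks(data)
    # If the disk holding data[2] fails, its block is the XOR of the survivors.
    survivors = data[:2] + data[3:] + [parity]
    print(xor_blocks(survivors) == data[2])
```

Because the parity disk rotates with n, parity writes are spread over all disks instead of concentrating on a single one.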

RAID Levels

Log file:
  - RAID 1 is appropriate: it provides fault tolerance; writes are synchronous and sequential, so there is no benefit in striping.
Temporary files:
  - RAID 0 is appropriate: no fault tolerance, but high throughput.
Data and index files:
  - RAID 5 provides fault tolerance and suits workloads where reading dominates writing.
  - RAID 10 is best suited for write-intensive applications.

RAID Levels

Read-intensive:
  - Using multiple disks (RAID 0, RAID 10, RAID 5) increases throughput significantly.
Write-intensive:
  - The negative impact on performance is obvious with software RAID 5.
  - With hardware RAID 5, the controller manages to hide the poor RAID 5 write performance using its cache.

Hardware Configuration

There are 3 ways to add hardware to our system:
Add memory
  - Then increase the buffer size without increasing paging.
Add disks
  - Put the log on a separate disk (very important).
  - Mirror frequently read files.
  - Partition large files.
Add processors (in a shared-nothing environment)
  - Off-load non-database applications onto other CPUs.
  - Off-load data mining applications to an old database copy.
  - Increase throughput to shared data.

Disk Controller

Interfaces between the computer system and the disk drive hardware:
  - Accepts high-level commands to read or write a sector.
  - Initiates actions such as moving the disk arm to the right track and actually reading or writing the data.
  - Computes and attaches checksums to each sector to verify that data is read back correctly. If data is corrupted, then with very high probability the stored checksum won't match the recomputed checksum.
  - Ensures successful writing by reading back the sector after writing it.
  - Performs remapping of bad sectors.

Hardware Configuration

[Figure: multiprocessor architectures. Site #1: shared everything; Site #2: shared disks; Site #3: shared nothing (cluster)]
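
The checksum and read-back checks performed by the disk controller can be sketched in a few lines. This is a toy model under loud assumptions: real controllers implement these checks in firmware with their own error-detecting codes, while here CRC-32 from Python's `zlib` stands in for the checksum and the `ToyDisk` class is purely hypothetical.

```python
import zlib


class ToyDisk:
    """Stores each sector together with a checksum computed at write time."""

    def __init__(self):
        self.sectors = {}   # sector number -> (data, stored checksum)

    def write(self, number, data):
        # Compute the checksum and attach it to the sector.
        self.sectors[number] = (data, zlib.crc32(data))

    def read(self, number):
        data, stored = self.sectors[number]
        # Recompute the checksum and compare it with the stored one.
        if zlib.crc32(data) != stored:
            raise IOError(f"checksum mismatch on sector {number}")
        return data

    def write_verified(self, number, data):
        """Write a sector, then read it back to confirm the write succeeded."""
        self.write(number, data)
        return self.read(number) == data


if __name__ == "__main__":
    disk = ToyDisk()
    print(disk.write_verified(0, b"hello"))        # True
    # Simulate corruption of the stored data (checksum left untouched).
    disk.sectors[0] = (b"hellp", disk.sectors[0][1])
    try:
        disk.read(0)
    except IOError as error:
        print("detected:", error)
```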

Controller Cache

Disk and array controllers contain memory that can be used as read cache or as write cache.
Read-ahead:
  - Prefetching at the disk controller level.
  - The controller has no information on the access pattern.
  - Not recommended.
Write-back vs. write-through:
  - Write-back: the transfer terminates as soon as data is written to the cache. Batteries guarantee the write-back in case of power failure. Fast cache flushing is a priority.
  - Write-through: the transfer terminates as soon as data is written to disk (no caching).

Controller Cache

Experiment: SQL Server 7 on Windows 2000; Adaptec ServerRaid controller with 80 MB RAM, in write-back mode.
The controller cache increases throughput whether the operation is cache friendly (the volume of updates is slightly larger than the controller cache) or not (the volume of updates is 10 times larger than the controller cache).
  - This controller implements an efficient replacement policy.
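
The difference between write-back and write-through is when the host sees the write as complete. The toy model below makes that visible; it is a sketch with assumed numbers (100 microseconds for a cache write, 10,000 for a disk write) and a naive oldest-first flush, not a description of the ServerRaid controller or of its replacement policy.

```python
CACHE_US = 100      # assumed latency of a write to controller cache (microseconds)
DISK_US = 10_000    # assumed latency of a write to disk (microseconds)


class ToyController:
    def __init__(self, write_back, cache_slots):
        self.write_back = write_back
        self.cache_slots = cache_slots
        self.cache = {}     # dirty sectors waiting to be flushed (insertion order)
        self.disk = {}

    def write(self, sector, data):
        """Store data and return the latency seen by the host, in microseconds."""
        if not self.write_back:
            self.disk[sector] = data        # write-through: wait for the disk
            return DISK_US
        latency = CACHE_US                  # write-back: done once in the cache
        if sector not in self.cache and len(self.cache) >= self.cache_slots:
            latency += self._flush_oldest() # cache full: make room first
        self.cache[sector] = data
        return latency

    def _flush_oldest(self):
        sector = next(iter(self.cache))
        self.disk[sector] = self.cache.pop(sector)
        return DISK_US

    def flush_all(self):
        """Write every dirty sector to disk (what the batteries must guarantee)."""
        while self.cache:
            self._flush_oldest()


if __name__ == "__main__":
    for mode in (False, True):
        controller = ToyController(write_back=mode, cache_slots=8)
        total = sum(controller.write(s, b"x") for s in range(8))
        print("write-back" if mode else "write-through", total, "us")
```

In this model write-back stays fast only while the cache has free slots; once it is full, every new sector pays for a flush, which is why fast cache flushing matters.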