WHITE PAPER
VERITAS Database Edition 1.0 for DB2
PERFORMANCE BRIEF: OLTP COMPARISON, AIX 5L
September 2002
TABLE OF CONTENTS
Introduction
Test Configuration
Results and Analysis
Conclusions
Appendix A
Introduction

This document describes the performance of VERITAS Database Edition 1.0 for DB2 on the 64-bit AIX 5.1 operating system, as measured by an Online Transaction Processing (OLTP) workload. The purpose of this brief is to illustrate the impact of different I/O and memory configurations on database performance. The benchmark used for this comparison was derived from the well-known TPC-C benchmark, which comprises a mixture of read-only and update-intensive transactions that simulate a warehouse supplier environment. (Details on this benchmark can be obtained from the Transaction Processing Performance Council's web page at http://www.tpc.org.)

Database Edition 1.0 for DB2 comprises the following VERITAS components:

VERITAS Database Edition for DB2 on AIX 1.0.0.0
VERITAS Volume Manager (VxVM) 3.2.0.1
VERITAS File System (VxFS) 3.4.2.1 (including Quick I/O and Cached Quick I/O)

Test Configuration

The OLTP benchmark tests were conducted on an IBM RS/6000 p660 server with eight processors and 8GB of memory. The p660 system was attached to sixteen IBM 2105 disk bricks via four fibre channel adapters. Each 2105 brick contains eight 36GB drives that were exported to the host as a single VLUN. Each VLUN was configured as a RAID-5 device that used six disks for data, one for parity, and one for hot spare; as a result, each VLUN had roughly 200GB of capacity. Using the IBM AIX Subsystem Device Driver (SDD), each brick was enabled for four different paths, one through each fibre channel adapter. Twelve of the sixteen VLUNs were used to store database tables; a separate VLUN was used for DB2 log files. The remaining three VLUNs were used for administrative purposes such as storing a backup of the database, the VxVM rootdg, and so on. The IBM SDD driver was enabled during the LVM and JFS test runs. For VxVM and VxFS testing, the default VxVM Dynamic Multi-Pathing (DMP) feature was used instead and the SDD driver was disabled.
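As an illustration of the two multi-path setups, the path state on each side of the comparison could be inspected with commands along the following lines. This is a sketch only: the device and controller names are hypothetical, and exact flags vary by SDD and VxVM release.

```shell
# SDD runs (LVM/JFS): list the vpath pseudo-devices and the four
# paths behind each 2105 VLUN
datapath query device

# DMP runs (VxVM/VxFS): SDD is disabled and VxVM Dynamic Multi-Pathing
# takes over automatically when multiple physical connections exist
vxdmpadm listctlr all                     # one line per fibre channel controller
vxdmpadm getsubpaths dmpnodename=hdisk4   # paths behind one DMP node (hypothetical name)
```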
The following software releases were used in the tests:

DB2 Universal Database 7.2 (version 7, release 2, 32-bit, with FixPak 6)
IBM AIX 5.1D
IBM Subsystem Device Driver for AIX 5.1, version 1.3.1.3
IBM 2105 Disk Device Driver, version 32.6.100.9

The size of the database was roughly 78GB, which is equivalent to a fully scaled TPC-C database with a scale factor of 500 warehouses. The database consisted of five DB2 Tablespaces, each of which contained DB2 Containers. The size and number of Containers in each Tablespace determined the size of the Tablespace. A DB2 Container can be either a raw volume or a file. The volume layout and the number of DB2 Containers per Tablespace have a significant impact on database performance.

The benchmark tests were conducted with three DB2 Bufferpool sizes (500MB, 1.0GB, and 1.5GB) and five different I/O configurations:

VxVM_RAW - uses VERITAS Volume Manager raw volumes directly
QIO - uses the Quick I/O feature of VERITAS File System
CQIO - uses the Cached Quick I/O feature of VERITAS File System
BIO - uses the default VERITAS File System buffered I/O mode
DIO - uses the VERITAS File System direct I/O mode (enabled with the mount option -o convosync=direct)

For each I/O configuration, experiments were conducted to determine the disk layout that yielded the best throughput for that configuration. As a result of these experiments, volumes for all configurations were created with a RAID-0 stripe across the 12 data VLUNs. Each of these volumes was assigned to one of the Tablespaces as a DB2 Container.
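As a rough sketch, the five configurations differ mainly in how the striped volume is consumed. The commands below illustrate the idea with hypothetical disk group, volume, file, and mount point names; flag spellings may vary by release.

```shell
# One RAID-0 volume striped across the 12 data VLUNs (64KB stripe unit)
vxassist -g db2dg make datavol 400g layout=stripe ncol=12 stripeunit=64k

# VxVM_RAW: hand /dev/vx/rdsk/db2dg/datavol to DB2 directly as a raw Container

# BIO: VxFS with default buffered I/O
mkfs -V vxfs /dev/vx/rdsk/db2dg/datavol
mount -V vxfs /dev/vx/dsk/db2dg/datavol /db2data

# DIO: the same file system mounted with the direct I/O conversion option
mount -V vxfs -o convosync=direct /dev/vx/dsk/db2dg/datavol /db2data

# QIO: preallocate a Quick I/O file that DB2 then sees as a raw device
qiomkfile -s 100g /db2data/ts_stock

# CQIO: additionally allow the OS page cache to act as a second-level cache
vxtunefs -s -o qio_cache_enable=1 /db2data
```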
All I/O configurations except VxVM_RAW, QIO, and CQIO (that is, BIO and DIO) used 60 equal-sized Containers to store the database. The number of Containers assigned to each Tablespace depended on the aggregate size of the tables residing in that Tablespace. For the VxVM_RAW, QIO, and CQIO test runs, a single-Container-per-Tablespace configuration was used. The reason for this variation is explained below. A more detailed description of the disk layout for each of the five I/O configurations follows:

1. Volumes for buffered I/O and direct I/O (BIO and DIO) were created with the default stripe unit size of 64KB. For these two I/O configurations, a single volume and file system were created to store the database. During database creation, the file system was carved into 60 equal-sized DB2 Containers; each Container was then assigned to one of the five DB2 Tablespaces, resulting in multiple Containers per Tablespace. Multiple Containers per Tablespace were necessary to minimize the overhead of the single-writer lock that is common to UNIX file system I/O.

2. Volumes for VxVM_RAW, QIO, and CQIO were created using VxVM striping. Five raw volumes were created for VxVM_RAW, one for each Tablespace. For QIO and CQIO, one large striped volume and file system were created, out of which five Quick I/O files were created, one for each of the five Tablespaces. Each raw volume or Quick I/O file was presented to DB2 as a single raw device, resulting in a single-Container-per-Tablespace layout. This layout was practical for this group of I/O configurations for three reasons. First, the single-writer lock overhead does not apply to raw or Quick I/O devices, so it was not necessary to create multiple Containers to avoid the locking effect. Second, with multiple Containers per Tablespace, DB2 would stripe data across the Containers. Since the volumes were already striped at the volume manager layer, having multiple Containers meant DB2 striping on top of VxVM striping.
This resulted in a slight degradation of performance due to increased I/O activity, as illustrated by the I/O statistics collected during the tests. Finally, from a system administration point of view, a single Container per Tablespace is much easier to manage, especially when there is a large number of Tablespaces.

Note that all I/O configurations used the same 4KB DB2 data block size. The DB2 parameters used in the benchmark tests are provided in Appendix A.

Results and Analysis

The primary performance metric used in this brief is a throughput metric that measures the number of transactions completed per minute (TPM). The transaction mix in this OLTP benchmark represents the processing of an order as it is entered, paid for, checked, and delivered, following the model of a complete business activity. The TPM metric is, therefore, considered a measure of business throughput. Table 1 lists the database throughput of the benchmark tests with 160 concurrent TPC-C clients for the five I/O configurations and three DB2 bufferpool sizes. The throughput of each configuration relative to VxVM raw I/O is plotted in Figure 1.

Table 1. Database throughput (transactions per minute, TPM) by I/O configuration and DB2 Bufferpool size.

I/O Configuration     500MB    1.0GB    1.5GB
VxVM RAW I/O         10,467   10,840   11,314
Quick I/O            10,429   10,921   11,289
Cached Quick I/O     12,398   13,789   14,928
VxFS buffered I/O     7,523    8,163    8,827
VxFS direct I/O       7,290    8,047    8,728
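The relative ratios plotted in Figure 1 follow directly from Table 1. The short shell sketch below (values transcribed from Table 1) normalizes each configuration's throughput against the VxVM RAW baseline at the 1.5GB Bufferpool size; the helper name is our own, not part of any benchmark tooling.

```shell
# Divide a configuration's TPM by the baseline TPM, printed to two decimals
compute_ratio() {
    awk -v tpm="$1" -v base="$2" 'BEGIN { printf "%.2f\n", tpm / base }'
}

# 1.5GB Bufferpool column of Table 1: baseline is VxVM RAW I/O at 11,314 TPM
compute_ratio 11289 11314   # Quick I/O
compute_ratio 14928 11314   # Cached Quick I/O
compute_ratio 8827  11314   # VxFS buffered I/O
compute_ratio 8728  11314   # VxFS direct I/O
```

This reproduces the roughly 1.32 ratio for Cached Quick I/O discussed in the Conclusions, with Quick I/O essentially at parity with the raw baseline.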
Figure 1. Relative throughput comparison: throughput ratio of each I/O configuration with VxVM RAW as the base (vertical axis, 0 to 1.4), plotted for the 500MB, 1.0GB, and 1.5GB DB2 Bufferpool sizes across VxVM RAW I/O, Quick I/O, Cached Quick I/O, VxFS buffered I/O, and VxFS direct I/O.

Throughput with Quick I/O closely matches that of the VxVM raw I/O configuration. This confirms that Quick I/O achieves raw I/O-equivalent performance while still providing file system manageability for DB2 databases. A large performance gain for Cached Quick I/O can be observed in Figure 1. This is because Cached Quick I/O is able to use the large amount of operating system memory outside the DB2 bufferpool as a second-level cache. This performance gap, however, will decrease as the DB2 bufferpool grows larger, since more data blocks can then be cached in the bufferpool itself, increasing the DB2 bufferpool cache hit ratio and decreasing the benefit of the operating system page cache.

Table 2 lists the average CPU utilization of the benchmark tests for the VxVM_RAW, QIO, VxFS BIO, and VxFS DIO configurations with 160 concurrent users. The table shows that Quick I/O initially had a slightly higher utilization than the other configurations; however, its CPU utilization dropped quickly as the bufferpool size was increased.

Table 2. Average CPU utilization by I/O configuration and DB2 Bufferpool size.

I/O Configuration    500MB    1.0GB    1.5GB
VxVM Raw I/O         43.6%    40.0%    38.3%
Quick I/O            47.0%    42.2%    40.5%
VxFS BIO             42.0%    41.4%    42.3%
VxFS DIO             42.0%    41.6%    43.2%
Conclusions

The OLTP benchmark used in this study is commonly used to evaluate the database performance of specific hardware and software configurations. By normalizing the system configuration and varying the file system I/O configuration, it was possible to study the impact of various storage layouts on database performance. Prior to the final benchmark runs, a significant amount of effort was devoted to tuning the system configuration to ensure optimal performance for each I/O configuration. The tuning process revealed strengths and weaknesses of the different I/O configurations involved in this study. These observations are summarized below.

Simplifying the multi-path setup process: In the case of LVM, enabling multiple paths to the IBM 2105 disk device is a non-trivial task. It requires installation of the SDD (Subsystem Device Driver), followed by tedious configuration of the specific hardware cables and the disk device. VxVM, on the other hand, automatically enables Dynamic Multi-Pathing whenever there are multiple physical connections to the disk device, dramatically simplifying the multi-path setup process.

Better manageability without sacrificing performance: Having multiple DB2 Containers per Tablespace bypasses the UNIX single-writer lock limitation, thereby improving buffered I/O performance. However, managing multiple Containers in multiple Tablespaces can become a non-trivial task for database administrators, especially when a large number of Tablespaces is defined. VxFS Quick I/O provides the best of both worlds by enabling VERITAS Database Edition for DB2 to deliver performance comparable to raw configurations while simplifying the task of managing multiple Tablespaces. Quick I/O does this by providing a raw device interface that eliminates the single-writer lock limitation, and with it the need to create multiple Containers per Tablespace.
In fact, the QIO, CQIO, and VxVM_RAW numbers in Table 1 were achieved using a single Container per Tablespace. In addition, qiostat, the Quick I/O utility that enables the administrator to monitor each Container separately, can be very useful for performance monitoring and tuning, especially when multiple Containers are used.

Improving performance by disabling DB2 striping: When multiple Containers exist in a Tablespace, DB2 automatically stripes data blocks across them. If each Container resides on a separate physical device, this striping helps improve I/O throughput by load balancing across multiple disk devices. In the QIO, CQIO, and VxVM_RAW runs, however, the volumes were already striped with VxVM, so assigning multiple Containers to a Tablespace would have caused DB2 to stripe on top of VxVM striping. This double striping generated more I/O activity and lower throughput. It can be avoided by disabling DB2 striping, i.e., by limiting each Tablespace to a single DB2 Container.

When 32-bit DB2 and a 4KB data block size are used, the largest single Bufferpool that can be allocated for DB2 is 2GB. For systems with plenty of memory, the memory beyond the 2GB DB2 Bufferpool limit can be used as a second-level cache with VERITAS Cached Quick I/O. The second-level cache improves DB2 performance, as data blocks that could not previously be cached in the DB2 Bufferpool can now be cached in the operating system cache. In the 64-bit AIX 5.1 and 32-bit DB2 environment, Cached Quick I/O achieved up to a 32% performance gain over the VxVM RAW configuration (14,928 versus 11,314 TPM at the 1.5GB Bufferpool size). The benefit of the second-level cache may diminish when 64-bit DB2 is used, or when multiple Bufferpools are created to allow more than 2GB of memory to be allocated to DB2 Bufferpools. Careful tuning of memory usage may be necessary to achieve optimal performance.
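As a usage illustration of the qiostat utility mentioned above (the file names are hypothetical, and the exact option set on a given release should be checked against the qiostat manual page):

```shell
# Report per-Container Quick I/O read/write statistics every 5 seconds,
# one line per monitored Quick I/O file
qiostat -i 5 /db2data/ts_stock /db2data/ts_order
```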
In summary, Database Edition for DB2 provides customers with a simplified installation process and dramatically improved manageability, without sacrificing underlying database performance.
Appendix A

The following DB2 database configuration was used in the benchmark tests. The BUFFPAGE parameter was changed for the different DB2 Bufferpool sizes.

Database Configuration for Database TPCC

Database configuration release level = 0x0900
Database release level = 0x0900
Database territory = US
Database code page = 819
Database code set = ISO8859-1
Database country code = 1
Dynamic SQL Query management (DYN_QUERY_MGMT) = DISABLE
Directory object name (DIR_OBJ_NAME) =
Discovery support for this database (DISCOVER_DB) = ENABLE
Default query optimization class (DFT_QUERYOPT) = 5
Degree of parallelism (DFT_DEGREE) = 1
Continue upon arithmetic exceptions (DFT_SQLMATHWARN)
Default refresh age (DFT_REFRESH_AGE) = 0
Number of frequent values retained (NUM_FREQVALUES) = 10
Number of quantiles retained (NUM_QUANTILES) = 20
Backup pending
Database is consistent = YES
Rollforward pending
Restore pending
Multi-page file allocation enabled
Log retain for recovery status
User exit for logging status
Data Links Token Expiry Interval (sec) (DL_EXPINT) = 60
Data Links Number of Copies (DL_NUM_COPIES) = 1
Data Links Time after Drop (days) (DL_TIME_DROP) = 1
Data Links Token in Uppercase (DL_UPPER)
Data Links Token Algorithm (DL_TOKEN) = MAC0
Database heap (4KB) (DBHEAP) = 512
Catalog cache size (4KB) (CATALOGCACHE_SZ) = 64
Log buffer size (4KB) (LOGBUFSZ) = 32
Utilities heap size (4KB) (UTIL_HEAP_SZ) = 5000
Buffer pool size (pages) (BUFFPAGE) = 500000
Extended storage segments size (4KB) (ESTORE_SEG_SZ) = 16000
Number of extended storage segments (NUM_ESTORE_SEGS) = 0
Max storage for lock list (4KB) (LOCKLIST) = 150
Max appl. control heap size (4KB) (APP_CTL_HEAP_SZ) = 128
Sort list heap (4KB) (SORTHEAP) = 16
SQL statement heap (4KB) (STMTHEAP) = 2048
Default application heap (4KB) (APPLHEAPSZ) = 328
Package cache size (4KB) (PCKCACHESZ) = 40
Statistics heap size (4KB) (STAT_HEAP_SZ) = 4384
Interval for checking deadlock (ms) (DLCHKTIME) = 3000
Percent of lock lists per application (MAXLOCKS) = 20
Lock timeout (sec) (LOCKTIMEOUT) = -1
Changed pages threshold (CHNGPGS_THRESH) = 40
Number of asynchronous page cleaners (NUM_IOCLEANERS) = 4
Number of I/O servers (NUM_IOSERVERS) = 1
Index sort flag (INDEXSORT) = YES
Sequential detect flag (SEQDETECT)
Default prefetch size (pages) (DFT_PREFETCH_SZ) = 32
Track modified pages (TRACKMOD) = OFF
Default number of containers = 1
Default tablespace extentsize (pages) (DFT_EXTENT_SZ) = 32
Max number of active applications (MAXAPPLS) = 128
Average number of active applications (AVG_APPLS) = 1
Max DB files open per application (MAXFILOP) = 800
Log file size (4KB) (LOGFILSIZ) = 4000
Number of primary log files (LOGPRIMARY) = 10
Number of secondary log files (LOGSECOND) = 20
Changed path to log files (NEWLOGPATH) =
Path to log files = /db2log/logdir/
First active log file =
Group commit count (MINCOMMIT) = 1
Percent log file reclaimed before soft chckpt (SOFTMAX) = 250
Log retain for recovery enabled (LOGRETAIN) = OFF
User exit for logging enabled (USEREXIT) = OFF
Auto restart enabled (AUTORESTART) = ON
Index re-creation time (INDEXREC) = SYSTEM (RESTART)
Default number of loadrec sessions (DFT_LOADREC_SES) = 1
Number of database backups to retain (NUM_DB_BACKUPS) = 12
Recovery history retention (days) (REC_HIS_RETENTN) = 366
TSM management class (TSM_MGMTCLASS) =
TSM node name (TSM_NODENAME) =
TSM owner (TSM_OWNER) =
TSM password (TSM_PASSWORD) =

VERITAS Software Corporation
Corporate Headquarters
350 Ellis Street, Mountain View, CA 94043
650-527-8000 or 866-837-4827
For additional information about VERITAS Software, its products, or the location of an office near you, please call our corporate headquarters or visit our Web site at www.veritas.com.