Optimizing Insert Performance - Part 1
John Campbell, Distinguished Engineer
DB2 for z/OS Development
CAMPBELJ@uk.ibm.com

Disclaimer/Trademarks

The information contained in this document has not been submitted to any formal IBM test and is distributed AS IS. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. While IBM may have reviewed each item for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Anyone attempting to adapt these techniques to their own environments do so at their own risk.

Any performance data contained in this document were determined in various controlled laboratory environments and are for reference purposes only. Customers should not adapt these performance numbers to their own environments as system performance standards. The results that may be obtained in other operating environments may vary significantly. Users of this document should verify the applicable data for their specific environment.

Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.

Trademarks

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.

Objectives
To understand:
- Typical performance bottlenecks
- How to design and optimize for high performance
- How to tune for optimum performance
- New features of DB2 9 and DB2 10
- How to best apply and use new features

Agenda
- Insert algorithm
- Table space definition
- Space search concepts
- Table space attributes
- APPEND option
- DB2 V10 ALTER DDL enhancements
- Summary

Insert Algorithm

Key Physical Design Questions
- Design for maximum performance throughput or for space reuse?
- Random key insert or sequential key insert?
- Store rows in clustering sequence or insert at the end?
- Are input records sorted into clustering key sequence?
- What are the indexing requirements, and are they justified?

Type of Table Space
- Segmented
  - Classic segmented table space
- Non-segmented
  - Classic partitioned table space
    - MEMBER CLUSTER attribute
- Universal table space (introduced in V9)
  - Partition-by-Growth (PBG)
  - Partition-by-Range (PBR)
  - MEMBER CLUSTER attribute (improvement in V10)
  - Same insert space search algorithm as segmented table space

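Simple Insert Flow with Clustering Key
[Figure: flow of a single INSERT from the application (AP) and back. Stages labeled in the figure: use the clustering index (get index pages); find the data page, from the index or, via the space map page, from space search; update pages; write logs; return to the AP. Costs and wait points labeled in the figure: CPU time, read I/O wait (index or data page), page latch or P-lock wait, deferred write I/O wait.]
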
Space Search Steps (Classic Partitioned/UTS in V10)
[Figure: table space from the first page with free space to the end, with the candidate page marked. Numbered steps in the figure:]
1. Candidate page
2. Search adjacent pages
3. Search the end, without physical extend
4. Search from free space to the end, with physical extend
5. Search the end, with extend
6. Exhaustive search

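The fallback order in the figure can be sketched as a toy model. This is only an illustration under stated assumptions, not DB2's implementation: the list of per-page free bytes, the function name `find_page`, the `adjacent` window of 2 pages searched on both sides, and the collapsing of steps 4 and 5 into a single "extend" step are all hypothetical simplifications.

```python
PAGE_BYTES = 4096  # assumed page size for newly added pages


def find_page(free_bytes, candidate, row_len, adjacent=2, can_extend=True):
    """Return (step_name, page_index) for the first step that finds space.

    free_bytes: list of free bytes per page (hypothetical table space model).
    candidate:  index of the page chosen via the clustering index.
    """
    n = len(free_bytes)
    # Step 1: the candidate page itself.
    if free_bytes[candidate] >= row_len:
        return "candidate", candidate
    # Step 2: pages near the candidate (window size is an assumption).
    lo, hi = max(0, candidate - adjacent), min(n, candidate + adjacent + 1)
    for page in range(lo, hi):
        if free_bytes[page] >= row_len:
            return "adjacent", page
    # Step 3: the end of the table space, without extending it.
    if free_bytes[-1] >= row_len:
        return "end", n - 1
    # Steps 4 and 5 (collapsed here): add a new page at the end.
    if can_extend:
        free_bytes.append(PAGE_BYTES - row_len)
        return "extend", n
    # Step 6: exhaustive search from the beginning.
    for page, free in enumerate(free_bytes):
        if free >= row_len:
            return "exhaustive", page
    return "none", None
```

The point of the sketch is the ordering: the cheap, clustering-preserving options are tried first, and the exhaustive search is the last resort once extending is no longer possible.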
Space Search Steps (Classic Segmented/UTS in V9)
[Figure: space map pages with free space, the candidate page, and the end of the table space. Step labels in the figure: candidate page; search adjacent pages within the segment; search the end, without physical extend; search the space map page that contains the lowest segment that has free space, to the end of that space map page; search from the lowest segment that has free space; search the end, with extend; if the extend failed, exhaustive search from the beginning of the table space.]

PBG Search Algorithm for Entire Table Space
[Figure: a PBG table space with partitions 1 to 4 and four concurrent threads (T1 to T4); three of the threads insert into Part 3 and one inserts into Part 2.]

Factors That Can Affect Insert Performance
- Insert patterns (application design)
  - Sequential vs. random index key
  - Multi-row insert
  - Delete and insert
- Concurrency
- Data sharing
- Buffer pool tuning
- Log and DB I/O performance
- Table space type
  - Segmented
  - Partitioned
  - Universal (PBG/PBR)
- Table space definitions
  - PRIQTY, SECQTY
  - Page size
  - LOCKSIZE
  - PCTFREE, FREEPAGE
  - TRACKMOD
  - SEGSIZE
  - MAXROWS
  - MEMBER CLUSTER
- Table definitions
  - Variable (or compression)
  - Row size
  - APPEND
- Index definitions
  - Number of indexes
  - Clustering
  - Index size / number of keys

Segmented Table Space
- A segmented table space provides a more efficient space search when inserting fixed-length compressed rows and true variable-length rows
- The space map contains more information on available space, so that only a data page with guaranteed available space is accessed
  - 2 bits per data page in a non-segmented table space (2**2 = 4 different conditions)
  - 4 bits per data page in a segmented table space (2**4 = 16 different conditions)
- Possibly more space map page updates
  - Possible performance penalty with data sharing

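A small sketch of why the extra bits matter. The state counts follow directly from the slide; the `bucket_width` helper is an assumption for illustration only, because it pretends the free-space conditions are spread uniformly over the page, which is not necessarily how DB2 encodes them.

```python
def space_map_states(bits_per_page: int) -> int:
    """Number of distinct free-space conditions a space map entry can record."""
    return 2 ** bits_per_page


def bucket_width(page_bytes: int, bits_per_page: int) -> float:
    """Bytes of free space per condition, ASSUMING a uniform encoding.

    The uniform split is a hypothetical simplification used only to show
    how much coarser a 2-bit entry is than a 4-bit entry.
    """
    return page_bytes / space_map_states(bits_per_page)


non_segmented = space_map_states(2)  # 2 bits per data page
segmented = space_map_states(4)      # 4 bits per data page
```

Under that uniform assumption, a 4 KB page is described in steps of 1024 bytes with 2 bits but in steps of 256 bytes with 4 bits, so the finer entry is less likely to send an insert to a page that turns out not to have room.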
Partitioned Table Space
- Use page range partitioning by dividing the table space into partitions by key range
- Spread the insert workload across partitions
- Can reduce logical and physical contention to improve concurrency and reduce cost
  - Separate index B-tree for each index partition of a partitioned index (good for concurrency)
  - Only one index B-tree for a non-partitioned index (bad for concurrency)
- Over-wide partitioning has the potential to reduce the number of index levels and so reduce performance cost
  - Reduces the number of levels in each index part

Partition by Growth
- Insert uses the clustering index of the entire table space
- The search algorithm applies at partition level and at table space level
- Partition-level utilities are supported
  - Design point is to keep pieces smaller in support of database management
  - Evaluate the usage of, and the need for, partition-level utilities
- Only use partition-level REORG if there are many deletes, to free up space and help clustered insert
  - Partition-level REORG applies the free space attributes, so rows may not fit back
    - New zparm REORG_IGNORE_FREESPACE
  - RTS statistics can help to determine the free space on each partition
- Table space level REORG can restore the clustering of the data rows for the whole table

Data Page Size
- Assuming a very insert-intensive workload, use a large data page size for sequential inserts to:
  - Reduce the number of getpages
  - Reduce the number of lock requests
  - Reduce the number of CF requests
  - Get better space use

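As a rough illustration of the effect, the sketch below counts how many data pages a run of sequential inserts fills at different page sizes. It is a back-of-the-envelope model with assumptions: page header and row overhead are ignored, rows are fixed length, the helper names are hypothetical, and the cap of 255 rows per page is the MAXROWS default quoted later in this deck.

```python
import math

MAX_ROWS_PER_PAGE = 255  # MAXROWS default quoted in this deck


def rows_per_page(page_bytes: int, row_bytes: int) -> int:
    """Rows that fit on one page, ignoring page and row overhead (assumption)."""
    return max(1, min(MAX_ROWS_PER_PAGE, page_bytes // row_bytes))


def pages_filled(n_rows: int, page_bytes: int, row_bytes: int) -> int:
    """Data pages filled by n_rows sequential inserts of fixed-length rows."""
    return math.ceil(n_rows / rows_per_page(page_bytes, row_bytes))


# Example: one million 100-byte rows on 4 KB pages versus 32 KB pages.
small = pages_filled(1_000_000, 4 * 1024, 100)
large = pages_filled(1_000_000, 32 * 1024, 100)
```

Fewer pages filled means fewer page transitions for a sequential insert stream, which is where the savings in page-level requests come from. Note that with short rows the 255-row cap, not the page size, becomes the limit on large pages.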
Table Space Definition - LOCKSIZE
- Lock size type: page lock, row lock, or ANY (same as page lock)
- Sequential key order
  - Page lock (PAGE/ANY) is best practice
  - Effective if inserting many rows per page
  - Default for partitioned and segmented table spaces
  - Default for implicit PBG is ROW
  - Explicit UTS uses ANY
- Random insert
  - No difference in lock requests between ROW and PAGE
  - Possibly better concurrency with ROW
  - Additional data page P-locks with row-level locking in a data sharing environment
  - MAXROWS 1 with LOCKSIZE PAGE can avoid the additional data page P-locks in data sharing; however, it may cause space map page contention with highly concurrent inserts

Table Space Definition - Distributed Free Space
- FREEPAGE n
  - DB2 leaves one page free after every n pages during LOAD or REORG
- PCTFREE n
  - DB2 leaves n percent of free space on each page during LOAD or REORG
  - n can be 0 to 99
  - The default is PCTFREE 5 for data pages and PCTFREE 10 for index pages
- Space is reserved during LOAD, REORG, and REBUILD INDEX
[Figure: a table space with a free page.]

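The arithmetic behind these two attributes can be sketched as follows. This is a simplified model with assumptions: fixed-length rows, no page header or row overhead, and hypothetical helper names. It only shows how PCTFREE lowers the number of rows placed on each page and how FREEPAGE adds empty pages at LOAD or REORG time.

```python
import math


def rows_loaded_per_page(page_bytes: int, row_bytes: int, pctfree: int) -> int:
    """Rows placed on a page when pctfree percent of the page is left free."""
    usable = page_bytes * (100 - pctfree) // 100
    return max(1, usable // row_bytes)


def pages_after_load(n_rows: int, page_bytes: int, row_bytes: int,
                     pctfree: int = 5, freepage: int = 0) -> int:
    """Total pages (data pages plus free pages) after loading n_rows."""
    per_page = rows_loaded_per_page(page_bytes, row_bytes, pctfree)
    data_pages = math.ceil(n_rows / per_page)
    # FREEPAGE n: one free page after every n data pages; 0 means none.
    free_pages = data_pages // freepage if freepage > 0 else 0
    return data_pages + free_pages
```

For example, with 4 KB pages and 100-byte rows, this model places 40 rows per page at PCTFREE 0 and 38 rows per page at the default PCTFREE 5, so the same data occupies more pages in exchange for room to insert in clustering sequence later.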
Distributed Free Space
- Use distributed free space (PCTFREE and/or FREEPAGE)
  - For efficient sequential read of the index
  - For efficient sequential read of data via the clustering index
  - To minimize index splits
- Carefully calculate the settings
  - Use the default PCTFREE and FREEPAGE unless you know better
- Default distributed free space
  - FREEPAGE 0
  - PCTFREE 5% within a data page
  - PCTFREE 10% within an index page

Best Practice - Free Space
- Sequential key order
  - Use PCTFREE 0
  - Default FREEPAGE (0)
- Random key order
  - PCTFREE > 0 can reduce the chance of a space search
  - FREEPAGE > 0
    - For a segmented table space, use a number smaller than SEGSIZE
- For index pages
  - PCTFREE (0-99) for leaf pages: use 0% for sequential; use 10% (default) or higher for random, depending on the frequency of leaf page splits vs. REORG
  - PCTFREE (0-10) for non-leaf pages

Table Space Definition - TRACKMOD
- Applies to both sequential and random insert
- DB2 keeps track of changed pages in the space map page
  - Used by incremental COPY to efficiently determine which pages to copy, i.e., to avoid scanning every page
- TRACKMOD YES
  - DB2 keeps track of updated data pages
  - Is the default prior to V10
  - The dirty bit on the space map page is updated when the data page is changed
- TRACKMOD NO is recommended if incremental COPY is not required
  - DB2 does not keep track of updated pages
  - Fewer space map page updates, which improves performance
  - Less data sharing overhead
- Can be altered via ALTER TABLESPACE DDL
- The default can be controlled by zparm (IMPTKMOD) in V10

Table Space Definition - MAXROWS n
- Applies to both sequential and random insert
- Take care when using MAXROWS n with fixed-length compressed or variable-length rows to avoid wasted space searches
  - When MAXROWS n is reached, the page is marked full
  - Can avoid "false lead" visits to data pages with insufficient space
  - Carefully estimate the average row size and how many average-size rows fit in a single data page
- Things to monitor when MAXROWS is used
  - A new column added to the table can change the average row size
  - Record size may change
- Default is 255
- Be careful with MAXROWS 1
  - More frequent space map updates may drive more lock/latch contention
  - Excessive space usage

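The "marked full" behavior can be illustrated with a toy page model. Everything here is a sketch under assumptions: the `DataPage` class and `estimate_maxrows` helper are hypothetical names, page and row overhead are ignored, and the real space map bookkeeping is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class DataPage:
    """Toy data page: tracks row lengths and a MAXROWS limit (illustrative)."""
    size_bytes: int = 4096
    maxrows: int = 255  # default quoted in the slide
    rows: list = field(default_factory=list)

    @property
    def free_bytes(self) -> int:
        return self.size_bytes - sum(self.rows)

    @property
    def marked_full(self) -> bool:
        # Once MAXROWS is reached the page counts as full, whatever bytes remain.
        return len(self.rows) >= self.maxrows

    def try_insert(self, row_bytes: int) -> bool:
        if self.marked_full or row_bytes > self.free_bytes:
            return False
        self.rows.append(row_bytes)
        return True


def estimate_maxrows(page_bytes: int, avg_row_bytes: int) -> int:
    """Average-size rows that fit on one page, capped at 255 (overhead ignored)."""
    return max(1, min(255, page_bytes // avg_row_bytes))
```

A page created with `DataPage(maxrows=2)` rejects a third row even though most of its bytes are still free, which is the space cost the slide warns about when MAXROWS is set too low or the average row size later changes.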
Table Space Definition - SEGSIZE
- Recommendation is to use a larger SEGSIZE value for a single table in the table space
  - Typical SEGSIZE value is 32 or 64
  - Default is SEGSIZE 4 prior to V10
  - Default is SEGSIZE 32 in V10
- Large SEGSIZE
  - Provides a better opportunity to find space near the candidate page and therefore maintain clustering
  - Better chance of avoiding a longer space search
- Small SEGSIZE
  - More space map pages, which can reduce space map page contention
  - But less chance of hitting the search threshold, looking for space at the end, and kicking off a more extensive space search
- Also applies to universal table spaces

MEMBER CLUSTER
- INSERT into "hot spots"
  - Results in excessive page P-lock and page latch contention on space map and data pages
  - Can occur when
    - Inserts run concurrently on different members in a data sharing environment
    - The table space has the row-level locking attribute
    - The clustering key of the table is in ascending sequence
- MEMBER CLUSTER
  - Member-private space map and corresponding data pages

MEMBER CLUSTER Concept
[Figure: Members A, B, and C each own separate space map pages (SMAP); the data pages that follow each SMAP correspond to that SMAP.]

MEMBER CLUSTER
- Data rows inserted by INSERT SQL are not clustered by the clustering index
  - Instead, rows are stored in available space in the member-private area
- Can reduce space map contention
  - Used for highly concurrent insert in a data sharing environment
  - Can reduce space map sharing between data sharing members
  - In a non-data sharing environment, the smaller space map page can reduce contention between threads for the same space map page
- Data clustering can be restored via REORG
- May want to use LOCKSIZE ROW and a larger data page size when using MEMBER CLUSTER
  - Better space use
  - Reduced working set of buffer pool pages

MEMBER CLUSTER Structure
- Classic partitioned table space
  - Member 1: space map 1, covering data pages 1 to 199
  - Member 2: space map 2, covering data pages 1 to 199
- Universal table space in V10
  - Member 1: space map 1, covering segments 1 to 10
  - Member 2: space map 2, covering segments 1 to 10

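The member-private layout for the classic partitioned case can be sketched as a toy allocator. This is an illustration, not DB2's algorithm: the class and method names are hypothetical, each call is assumed to consume one whole data page, and only the figure of 199 data pages per space map is taken from the slide.

```python
PAGES_PER_SPACE_MAP = 199  # classic partitioned case, as shown on the slide


class MemberClusterModel:
    """Toy model: every space map page (and its data pages) has one owner."""

    def __init__(self):
        self.space_map_owner = []  # index = space map number, value = member
        self.current = {}          # member -> (space map number, pages used)

    def next_page(self, member):
        """Return (space_map_no, data_page_no) in the member's private area."""
        space_map, used = self.current.get(member, (None, PAGES_PER_SPACE_MAP))
        if used >= PAGES_PER_SPACE_MAP:
            # The member's current area is exhausted (or it has none yet):
            # give it a new space map page of its own.
            self.space_map_owner.append(member)
            space_map, used = len(self.space_map_owner) - 1, 0
        self.current[member] = (space_map, used + 1)
        return space_map, used + 1
```

Because members never share a space map page in this model, concurrent inserts from members A and B update different space map pages, which is the contention relief the option is designed for. The cost is that pages are chosen by member rather than by clustering key.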
MEMBER CLUSTER
- Option is not available on segmented table spaces
- Only use when data clustering is not essential
- V10 improvements
  - A new catalog column (SYSTABLESPACE.MEMBER_CLUSTER)
  - Option is available on universal table spaces
  - The MEMBER CLUSTER attribute can be changed
    - By using an ALTER TABLESPACE DDL statement
    - It is a pending ALTER; a REORG is needed to materialize the change
  - A classic partitioned table space with the MEMBER CLUSTER attribute can be converted to UTS with MEMBER CLUSTER
    - Use an ALTER TABLESPACE DDL statement to convert the table space to UTS
    - Use an ALTER TABLESPACE DDL statement to change the MEMBER CLUSTER attribute

MEMBER CLUSTER (PCTFREE = 0 and FREEPAGE = 0)
[Figure: Members A, B, and C with their space map pages (SMAP) and the pages covered by each SMAP; the last space map page is marked.]

MEMBER CLUSTER (PCTFREE = 0 and FREEPAGE = 0)
- Introduced in V7 by APAR PQ87381
- Re-established in V9 with APAR PK81470
- Only applies to classic partitioned table spaces and simple table spaces
- Insert goes to the end of the table space without looking at the free space
- Reduces long chains of space map page searches as the table space gets bigger
- Good for table spaces where deletes or updates that create free space are rare
- Exhaustive search through the space map pages belonging to the member happens before physical extend
  - Success depends on deletes and inserts being spread across members

APPEND Option on Table
- New APPEND option is provided for INSERT in V9 NFM
  - CREATE/ALTER TABLE ... APPEND YES
- Can relieve high getpage counts during space search
- APPEND searches at the end of the table space quickly
  - Does not go looking for deleted space
  - Table space size will tend to grow
- With a high number of concurrent inserts, APPEND could cause a bottleneck on the last space map page
  - Use the MEMBER CLUSTER option together with APPEND to relieve the contention at the end
- Could lead to more frequent REORG to restore data clustering
- V9 APAR PK81471
  - Favors physical extend rather than space search on the deleted space

V10 Online Schema Enhancements
- Attributes that can be altered
  - Page size (not XML), by altering the buffer pool
    - ALTER TABLESPACE ... BUFFERPOOL BP8K0
  - DSSIZE
    - ALTER TABLESPACE ... DSSIZE n G
  - SEGSIZE
    - ALTER TABLESPACE ... SEGSIZE
  - MEMBER CLUSTER
    - ALTER TABLESPACE ... MEMBER CLUSTER YES/NO
  - ALTER INDEX page size (BUFFERPOOL)
    - In V9 this was immediate, with RBDP set
- Convert classic table space types to UTS (PBR/PBG)

Details on Executing an ALTER Statement
- Specify the ALTER statement
- Statement is validated
  - Semantic checking against the effective catalog definition
- Assuming all checks out OK:
  - Statement is put on the pending list
  - Table space is placed in AREOR (non-restrictive)
  - Statement completes with SQLCODE +610 to advertise the advisory state

SYSIBM.SYSPENDINGDDL (example row)
  DBNAME          "DB1"
  TSNAME          "TS1"
  OBJSCHEMA       "DB1"
  OBJNAME         "TS1"
  OBJTYPE         'S'
  OPTION_SEQNO    1
  OPTION_KEYWORD  "BUFFERPOOL"
  OPTION_VALUE    "BP8K0"
  CREATEDTS       2008-10-04-07.14.20.204010
  STATEMENT_TEXT  ALTER TABLESPACE DB1.TS1 BUFFERPOOL BP8K0 MAXPARTITIONS 20;

Details on Executing an ALTER Statement
- Cached changes are materialized by the next REORG with SHRLEVEL REFERENCE or CHANGE
- Undo of DDL changes if not materialized
  - ALTER TABLESPACE ... DROP PENDING CHANGES
  - All pending changes are removed
- Undo of DDL changes if materialized
  - Perform a compensating ALTER and schedule a REORG
  - Assumes no dependencies on the prior ALTER have evolved
- Pending ALTER cannot be used for catalog objects, the work file database, or a table space that has a clone table (drop the clone first)

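The pending-change life cycle described on these two slides can be sketched as a small state model. It is a toy illustration with hypothetical class and method names, not a DB2 interface: an ALTER is queued and returns the +610 warning, the next REORG materializes everything queued, and dropping pending changes empties the queue without touching the effective definition. The AREOR advisory state is deliberately not modeled.

```python
class PendingDdlModel:
    """Toy model of pending ALTERs on one table space (illustrative only)."""

    SQLCODE_PENDING = 610  # warning code quoted in the slides

    def __init__(self, **attributes):
        self.effective = dict(attributes)  # definition currently in effect
        self.pending = []                  # queued (keyword, value) changes

    def alter(self, keyword, value):
        """Queue a change; nothing takes effect until reorg()."""
        self.pending.append((keyword, value))
        return self.SQLCODE_PENDING

    def drop_pending_changes(self):
        """Undo changes that have not been materialized yet."""
        self.pending.clear()

    def reorg(self):
        """Materialize all queued changes, in the order they were issued."""
        for keyword, value in self.pending:
            self.effective[keyword] = value
        self.pending.clear()
```

A short usage example: after `ts = PendingDdlModel(BUFFERPOOL="BP0")` and `ts.alter("BUFFERPOOL", "BP8K0")`, the effective buffer pool is still BP0 until `ts.reorg()` runs. Once it has run, the only way back in this model is another `alter` followed by another `reorg`, which mirrors the compensating ALTER on the slide.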
ALTER MEMBER CLUSTER with RECOVER to PIT
- RECOVER to a point in time (PIT) earlier than the REORG that materialized the pending ALTER is not allowed
[Figure: timeline showing the pending ALTER, an image copy, DML activities, and the REORG that converts the pending ALTER with an inline copy, followed by a RECOVER to PIT.]

ALTER MEMBER CLUSTER with RECOVER to Currency
- RECOVER will use the image copy taken by the REORG that materialized the pending ALTER
- An image copy taken before the REORG that materialized the pending ALTER cannot be used
[Figure: timeline showing the pending ALTER, an image copy, DML activities, and the REORG that converts the pending ALTER with an inline copy. RECOVER to currency uses the inline copy plus log records; RECOVER to currency from the earlier image copy is marked as not allowed.]

Table Space Type Conversion
- Single-table simple table space -> ALTER TABLESPACE ... MAXPARTITIONS n -> UTS PBG
- Single-table segmented table space -> ALTER TABLESPACE ... MAXPARTITIONS n -> UTS PBG
- Classic partitioned table space -> ALTER TABLESPACE ... SEGSIZE n -> UTS PBR
Note: The MEMBER CLUSTER attribute is inherited through the conversion

Summary
- How does DB2 look for space to insert?
  - The same algorithm applies to all threads
- Higher performance means less space reuse
- Better space reuse means lower performance
- It is hard to balance space against performance
- Delete or update activity, or REORG frequency, can impact insert performance
- If efficient space usage is more important
  - Use of APPEND and MEMBER CLUSTER may not be a good choice
    - Tends to increase the space used
  - Use segmented TS or UTS
    - More granular space status with the segmented/UTS space map
    - Tends to have more space map page contention under high concurrency

Summary - Best Practices
- Insert with sequential key order
  - PCTFREE 0, FREEPAGE 0
  - Multi-row insert
  - TRACKMOD NO
  - MEMBER CLUSTER
  - Larger index page size
  - Active and archive log striping
- Insert with random key order
  - PCTFREE 5 (default) or higher for table
  - PCTFREE 10 (default) or higher for index leaf
  - Non-zero FREEPAGE
  - Cluster index
  - TRACKMOD NO
  - Active and archive log striping
- Factors that could impact performance significantly
  - Table space type
  - MEMBER CLUSTER
  - APPEND
  - MAXROWS
  - Large index page size
  - PRIQTY, SECQTY