DB2 for z/OS Best Practices: Optimizing Insert Performance - Part 1


John J. Campbell
IBM Distinguished Engineer, DB2 for z/OS Development
CampbelJ@uk.ibm.com
2011 IBM Corporation

Transcript of webcast

Slide 01 (00:00)
So hello, this is John Campbell here, from DB2 development. What I'd like to do here is give a Web lecture about optimizing INSERT performance. This is part 1 of a two-part presentation. INSERT is one of the most important SQL statements, both for online and especially batch programming. It is often critical to be able to capture data and store it in the database as efficiently and as quickly as possible. Often, particularly in the case of batch processing, it's important to insert the data at a high enough rate so that you can meet technical and business deadlines.

Slide 02

Slide 03 (00:41)
So, we now go to slide three here, and go through the objectives. The objectives over this two-part presentation are as follows: The first is to understand typical performance bottlenecks. Then, to understand how to design and optimize for high performance. Then, how to tune for optimal performance. There are many new features to help insert performance in both DB2 Version 9 and DB2 Version 10, and I will introduce and discuss those features, and how best to apply and use them. Now let's turn to the next slide.

Slide 04 (01:15)
In part one of this two-part web lecture, what I'd like to cover here is: first of all, the INSERT algorithm; then the table space definitions; and the space search concepts and algorithms across the different types of table spaces. There are many attributes on the table spaces and indexes that can have either a positive or negative impact on INSERT performance. So I'll introduce and describe the critical table space attributes.

In DB2 Version 9, we have a new option called APPEND for inserting at the end of a table space or a table space partition, and I'll introduce and discuss that. In the final part of this presentation I will talk about the new DDL enhancements in DB2 Version 10 related to inserts, and finally summarize part 1 of this Web lecture. Now, let's turn to the next slide.

Slide 05 (02:11)
So, what I want to do in the first part of this web lecture here, is to talk about the INSERT algorithm.

Slide 06 (02:22)
Now, slide 6 here is probably the most important slide in this Web lecture. There are some key physical design questions that you as a customer or designer need to understand when designing for performance of inserts. The first key question here is, what is your goal? Are you designing for maximum INSERT throughput? Or are you trying to design for maximum space reuse? You need to have an answer to that question, because in reality you cannot have both. It's a tradeoff: if you want maximum performance throughput, invariably you'll get less space reuse. Conversely, if you want best space reuse, then there will be a negative impact on performance throughput. That's a very important question. The next question is, in terms of the new rows being inserted into the table, are these going to arrive in the sequential key order of the clustering index, or are those keys going to come in a random order? Now clustering is often very important for query performance, later. Another question is, is it important that new rows being inserted are stored in the clustering sequence of the clustering index, or are you quite happy to have the new rows inserted at the end of the table space or the table space partition? Now for batch processing, another key question: is it possible to sort the input records into the clustering key sequence of the clustering index? That way we can get efficient skip-sequential processing through the table space or partition.
And the final question here, again a very important one, is about the indexing requirements: are they justified? Every index that you add on a table will aggravate the performance of both insert and delete, OK. And so indexing requirements will have to be justified, and sometimes challenged, to make sure we're not over-indexing the table, with the consequent negative impact on insert and delete performance. Now, let's turn to slide 7.

Slide 07 (04:23)
So, I want to give a quick recap here on the different types of table spaces. First of all we have the classic segmented table space that's been around since Version 2, which supports both single-table and multi-table table spaces. We then have the classic partitioned table spaces, which support key-range partitioning. And ever since the introduction of data sharing, we have supported the MEMBER CLUSTER attribute. Later on we are going to talk in some detail about how much MEMBER CLUSTER helps, and the consequences of using it. Starting in Version 9, we introduced a new table space type called the universal table space; it's the convergence of the best features of the classic segmented and partitioned table spaces. There are two types of universal table spaces: one is partition-by-growth, and the other is partition-by-range. Now, starting in Version 10 the MEMBER CLUSTER attribute will be supported for the universal table space. And the space search algorithm for the universal table space in Version 9 is very similar to the space search algorithm for the segmented table space. Now, let's turn to the next slide.

Slide 08 (05:37)
On this slide here, first of all, I'm just showing a very simple insert flow with a clustering key. What you're seeing here is that when a row is inserted, the DB2 index manager, using the clustering index, will identify a candidate RID, in other words a page, for the new row to be inserted. As part of that process index manager will access the index, and those page accesses may or may not result in I/O, depending on whether the index pages are buffer pool resident. DB2 will then have to go to the space map pages, either one or many inside the table space, to see whether there is space in that page, and if not to look for alternatives. And then finally it goes to the data page, to see whether there is actually enough room in that page for the new row to be inserted.
And the last part of the process is to write the log records and actually commit the pages. Now, throughout this very simple flow here, there may be potential bottlenecks. So, what you can now see overlaid on this slide is the potential bottlenecks. First of all, in the top right hand corner there, there may be read I/O waits for both index and data pages if the respective pages are not in the buffer pool. Then there are the space map pages. Look at the classic partitioned space map: for partitioned table spaces, one space map can cover ten thousand data pages. For a segmented or universal table space, it can cover five thousand pages.

So if there is very heavy concurrent insert processing, especially across multiple members in a data sharing environment, this can result in high page latch and/or p-lock contention. When it comes to actually finding the data page, it may well be that that data page is already locked. Or we may get a false lead, whereby the space map says there's space in the data page, but when we actually get the data page in our hand, there's no space left. Therefore we may have to do more searching through the space maps, and go to the appropriate data pages to see if there is room for the new row to be inserted. There may also be bottlenecks in terms of the writing to the active log, or in terms of the deferred write I/O. So, now what I'd like to do, on the next slide...

Slide 09 (07:56)
...in fact on the next few slides, is to go through the space search algorithms. To begin with, I'm going to talk about the space search algorithm for the classic partitioned table space. And note carefully on the slides here that, with the introduction of DB2 Version 10, the space search algorithm of the universal table space is very similar to that of classic partitioned. So, to start here at step 1, index manager will identify what's called the candidate page. And that's a recommendation from index manager to data manager to say, try and put the new row in that candidate page. Now, a couple of things can happen here. First of all the page might be locked by another process, and in insert processing, DB2 uses a mechanism called conditional locking. What happens with conditional locking is that, if the page is already locked by another process, control will return to that thread in DB2, and it will try to look for an alternative page nearby. The other possibility of course is that there is no space left in the page. The page may be marked not full, however there may not be enough space left in that page to absorb the new row.
Now, at step 2, what DB2 will do is search forwards and backwards, plus and minus 16 pages, to see if there is space in a nearby page. So, if that candidate page hasn't got enough space, or it's locked, then DB2 will search plus and minus 16 pages using the space map, to see if an adjacent or nearby page has enough space. So, what happens if we can't find space in those pages? Well, the next thing we do is go to the end of the table space or table space partition, and see if there is space at the end, first. At this point, if we can't find space, we do not physically extend or bump up the high-used RBA. So what happens next, if that fails? Well, then what we'll do is search for free space at the end with a physical extend. So, we'll look in the space map to see if there is some space in a nearby page covered by the space map, and failing that, we'll then go to the end of the table space, and this time we may actually have to extend the page set partition in order to absorb the new row.
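Steps 1 and 2 described above (try the candidate page, then look plus and minus 16 pages around it) can be sketched in a few lines of Python. This is only an illustration of the idea, not DB2's actual code: the function name `find_nearby_page`, the `is_usable` callback (standing in for "the conditional lock was obtained and the row fits"), and the order in which neighbouring pages are probed are all assumptions made for the sketch.

```python
from typing import Callable, Optional

SEARCH_RADIUS = 16  # pages searched on each side of the candidate, as stated in the talk


def find_nearby_page(candidate: int, last_page: int,
                     is_usable: Callable[[int], bool]) -> Optional[int]:
    """Return the candidate page if usable, otherwise the closest usable page
    within +/- SEARCH_RADIUS pages, otherwise None.

    is_usable(page) is a hypothetical stand-in for "not locked by another
    process and enough room for the new row".
    """
    if is_usable(candidate):
        return candidate
    # Probe outwards from the candidate; the probing order is an assumption.
    for distance in range(1, SEARCH_RADIUS + 1):
        for page in (candidate + distance, candidate - distance):
            if 0 <= page <= last_page and is_usable(page):
                return page
    return None  # the caller would move on to the later steps of the search


# Pages 99, 100 and 101 are full or locked, so the row lands on page 102.
full_or_locked = {99, 100, 101}
print(find_nearby_page(100, 500, lambda p: p not in full_or_locked))
```

When the function returns `None`, the search falls through to the more expensive steps (looking at the end of the partition, extending, and finally the exhaustive search), which is why keeping usable space close to the candidate page matters.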

And the final step, step 6 here: if we cannot extend the page set partition because the primary quantity or the secondary quantity is reached, what DB2 will then do is an exhaustive search from the front to the back of the page set partition, using the space maps. And obviously trying to reduce that exhaustive space search is an important factor. So, one of the most important rules you'll come to learn very early about insert performance is that, if you have high-volume concurrent insert, you need to design for a high primary and secondary quantity value to avoid these exhaustive searches.

Slide 10 (11:06)
So, on the next slide now, I'd like to talk about the space search algorithm for the classic segmented table spaces. And in fact, this is the same space search algorithm used by the universal table space in Version 9. Again, as in the previous example of classic partitioned, index manager will identify the candidate page. That candidate page may be locked by another process, or there may not be enough space left in it to absorb the new row, so that we cannot insert the row in the candidate page. So, what DB2 will do now is search forwards and backwards inside the segment. Notice the difference compared to classic partitioned: there we go plus and minus 16 pages, but in the case of the segmented table space, we search forwards and backwards inside the same segment. So, clearly, having a large SEGSIZE is a good thing if you want to maintain clustering, because there is a better chance to store the rows near the clustering order. On the other hand, if you have a small SEGSIZE, there is a reduced chance of maintaining rows in the clustering sequence. So, what happens if that search fails?
Well, then what happens is, there's a space map which covers the subject segment. And DB2 will search through that space map page and look for the lowest segment that has free space, and try to find another page covered by that space map where the new row can be inserted. What happens next? If you can't insert the row in step 3, we then go to step 4, where we search the end of the page set and see if there is space there. But we will not extend the space. So what happens if step 4 fails? Then what we'll do is go to the space map which has the lowest segment from the front of the data set that has free space, and then work our way through the space maps from the front to the back of the page set. And then finally, at the end of step 5, we'll search right at the end, and we'll extend the page set if needed, in order to be able to insert the new row.
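The effect of SEGSIZE on the second step of the segmented search (looking only inside the candidate page's own segment) can be illustrated with a small Python sketch. It is a simplification under stated assumptions: segments are modelled as aligned runs of `segsize` data pages numbered from zero, header and space map pages are ignored, and the helper names `segment_bounds` and `find_in_segment` are hypothetical.

```python
from typing import Callable, Optional, Tuple


def segment_bounds(page: int, segsize: int) -> Tuple[int, int]:
    """First and last page of the segment containing `page` (simplified model)."""
    first = (page // segsize) * segsize
    return first, first + segsize - 1


def find_in_segment(candidate: int, segsize: int,
                    is_usable: Callable[[int], bool]) -> Optional[int]:
    """Search forwards and backwards inside the candidate's segment only.

    is_usable(page) is a hypothetical stand-in for "not locked and has room".
    Pages closest to the candidate are tried first; that order is an assumption.
    """
    first, last = segment_bounds(candidate, segsize)
    for page in sorted(range(first, last + 1),
                       key=lambda p: (abs(p - candidate), p)):
        if is_usable(page):
            return page
    return None  # fall through to the space map search described next


# Only page 41 has room. With SEGSIZE 4 the segment is pages 36-39, so the
# nearby space is missed; with SEGSIZE 64 the segment is pages 0-63 and it is found.
has_room = lambda p: p == 41
print(find_in_segment(37, 4, has_room))
print(find_in_segment(37, 64, has_room))
```

A larger segment gives the search more nearby pages to try before it has to fall back on the space map, which is the clustering argument made in the talk.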

And then finally, if the extend fails because we've reached the primary quantity or the secondary quantity, then DB2 again will do an exhaustive space search, beginning from the front and going through to the back of the table space, using the information in the space maps. So one thing you'll probably see here is some of the differences in the space search algorithms for segmented and UTS versus the classic partitioned table space. You'll see that the search algorithm of classic segmented and UTS tries to focus on better space reuse, whereas the previous example with classic partitioned is more focused on performance and throughput.

Slide 11 (14:16)
Now on the next slide here, I'd like to talk about the space search algorithm for a UTS PBG, partitioned by growth. So, in this example here, we have an existing PBG table space with three parts: parts 1, 2, and 3. And what we have here is four separate threads, with time going left to right. We're going to go through the steps as rows are inserted by these four threads, across the four time periods. So, to start with, at time 1 thread 1 does an insert into part 3. Why is it part 3? It's because again, index manager has identified a candidate page based on the information, the next highest key, in the index. So, what happens here at time 1 is that a row is inserted into partition 3. At time 2, with thread 2, there's another insert into part 3. So again, index manager identifies the candidate page, and that candidate page happens to be in part 3. Now what happens is that the table space has no more space left in part 3. So what's going to happen? The space search algorithm is going to look for space in an alternative partition. And here, there is some space in part 2, and the new row is inserted into part 2. Now at time 3, from a different thread, thread 3, there is an insert into partition 2.
The insert into partition 2, like in the previous step, is all driven by index manager identifying a candidate RID, and that RID identified a page in partition 2. Of course, what happens here is that there is no space left in part 2. And then the DB2 data manager has to look for space in an alternative partition. So the first thing it does here is go to partition 3, to see if there is space. Unfortunately there is no space in partition 3, so it then goes to look in part 2 again, because by now some space may have been freed up. But in this example here, there's still no space left in part 2. So, finally it will go to partition 1, where it actually finds some space and inserts the new row. So now what happens at time 4? At time 4, with thread 4, we're now going to try and insert another row into partition 3. Again, the candidate page has been identified by index

manager, identifying a page in partition 3. As you can see from this picture here, there's no space in part 3, there's no space in part 2, and there's no space in part 1. In other words, there's no space in the existing partitions. So, what happens? DB2 will then allocate a new partition for this PBG table space and insert the new row into partition 4.

Slide 12 (17:21)
Now, the next slide is a summary, a high-level summary at that, of the many features, functions and options that can affect insert performance. What I'm going to do on the subsequent slides is go through the most important features. Next slide...

Slide 13 (17:36)
So, the first thing I'd like to talk about is the segmented table space, and the advantages and disadvantages that come with the space map pages. When these table spaces were introduced back in Version 2 Release 1, one of the objectives of the segmented table space was to provide more accurate information in the space map, which would be useful for varying-length rows, and also fixed-length compressed rows. So, the advantage of segmented table spaces is that there are four bits of information per data page in the space map. With the classic partitioned and simple linear table spaces, there are only two bits per data page in the space map. So, these four bits instead of two provide more granular information, and can reduce the number of situations where you get a false lead. A false lead is when the space map page indicates there is space in the data page, and then when DB2 goes to the respective data page, there's actually insufficient space to absorb the new row. But there are pros and cons to this. On one hand, having four bits per data page means that there's a better chance of avoiding these false leads. The bad news is that clearly there are more updates to be done on the space map pages when you have high-volume insert processing, or high-volume update processing.

Slide 14 (19:05)
Now, let's talk about the partitioned table space. Basically, the partitioned table space, and the same thing applies here to the universal table space partitioned by range, gives you the opportunity to divide a large table space into multiple partitions based on key ranges. You end up with multiple index partitions and multiple data partitions. And this has a

number of advantages when it comes to insert processing. First of all, it provides the opportunity to spread the insert workload across partitions. And this provides the benefit of reducing logical and physical contention, improving concurrency and reducing processing costs. The key, of course, is to make sure that the keys being inserted are spread across the key range. One of the advantages comes from the fact that there is a separate index B-tree for each index partition of a partitioned table space. If you don't have a partitioned table space, or you have a non-partitioned index, there's just one index B-tree for the index. So, when you have an index leaf page split because the leaf pages become full, we have what's called an index tree modification. And only one of these is allowed at a time. And although we try to allow concurrent read processing, if we're both reading the index B-tree and splitting it at the same time, and they collide on the same branch of the tree, then we have to serialize them. So, using partitioned table spaces and partitioned indexes can have a big benefit here in terms of introducing multiple index B-trees and allowing more index modifications to be done concurrently. Another option, to be applied with some care, is to consider over-partitioning the table space. Partitioning the table space, in some cases over-partitioning it, can reduce the number of levels in the index B-tree, and hence reduce performance costs. Not just for INSERT, but for SELECT processing as well.

Slide 15 (21:07)
Partition by growth: the universal table space partitioned by growth was introduced in Version 9. And one of the benefits was to allow a table space to grow beyond 64 gigabytes.
Whereas, if you had a classic segmented table space, converting it over to partitioned to be able to go beyond 64 gigabytes meant unloading the data, dropping and recreating the table space, and then reloading the data. So, one of the aims of PBG was to allow a table space to gracefully extend beyond 64 gigabytes. It also addresses the situation of a very large table where there is no natural partitioning key; PBG, as the name suggests, partitions based on data set size. So, when the data set size is reached, it will spin off a new partition for the table space. The space search algorithm applies at both the partition and the table space level. And partition-level utilities are also supported. However, you need to be careful here in terms of the management of the table space, especially in the area of partition-level REORG. When you run a partition-level REORG against a PBG table space, you've got to make sure that there have been sufficient deletes to free up space. That's because a part-level REORG against UTS PBG will by default try to reestablish the distributed free space, and the danger is that, if enough deletes have not happened, the online REORG at the partition level may not be able to load the rows back into the partition.

So, you do need definite rules of thumb and changes to your procedures in order to support part-level REORG for UTS PBG. To help with this situation, there's a new system parameter identified on the slide here called REORG_IGNORE_FREESPACE, and what this basically means is that the REORG utility ignores the reestablishment of free space when running REORG at the part level. So there are some recommendations here to use real-time statistics to help determine whether or not there is sufficient free space in each partition. And, generally speaking, if you want to restore the clustering of the data rows for the whole table, then you should be running REORG at the table space level.

Slide 16 (23:28)
What I'd like to do now is talk about data page size. And this is a recommendation, which is really optimal, and we'll need to be careful here. We're talking here about a workload which is very insert-intensive, with little or no update and delete activity. So, one of the options you've got here with an insert-intensive workload is to use a large data page size. This should be helpful for sequential inserts, either at the end, or at some hot spots in the range. The advantage of using a large data page size is to reduce the number of getpages, the number of lock requests, and the number of CF requests, and, in conjunction with MEMBER CLUSTER, to get better space reuse.

Slide 17 (24:13)
The next topic I'd like to talk about is the LOCKSIZE. Are you going to use page-level or row-level locking? So, in terms of LOCKSIZE, we can have PAGE, or ROW, or ANY, which is basically the same as PAGE. What I'd like to do here is offer some recommended best practices. First of all for sequential key order, and in a minute I am going to talk about random key order.
So, if you are doing sequential processing, so that the keys are being inserted in sequence and the data rows are going to the same data page, then it's much more efficient to use page-level locking. As it says here, this is effective when inserting many rows to the same data page. And this is the default for both partitioned and segmented table spaces. But note carefully that the default for the implicitly created UTS PBG is row-level locking.
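The reason page-level locking is cheaper for sequential inserts can be made concrete with a little arithmetic. The Python sketch below is a deliberately simplified model, not a description of DB2's lock manager: it assumes one lock request per page touched under LOCKSIZE PAGE and one per row under LOCKSIZE ROW, and it ignores lock avoidance, commit frequency, lock escalation, and data sharing p-locks. The function name `lock_requests` and the row counts are hypothetical.

```python
import math


def lock_requests(rows: int, rows_per_page: int, locksize: str) -> int:
    """Rough count of lock requests for inserting `rows` rows.

    rows_per_page is how many of the inserted rows land on each data page:
    many for sequential inserts, about 1 for truly random inserts.
    """
    if locksize == "PAGE":
        return math.ceil(rows / rows_per_page)  # one lock per page touched
    if locksize == "ROW":
        return rows                             # one lock per row inserted
    raise ValueError("locksize must be 'PAGE' or 'ROW'")


# Sequential inserts: 1,000 rows filling pages at 25 rows per page.
print(lock_requests(1000, 25, "PAGE"), lock_requests(1000, 25, "ROW"))

# Truly random inserts: one row per page, so the two counts are the same.
print(lock_requests(1000, 1, "PAGE"), lock_requests(1000, 1, "ROW"))
```

Under these assumptions the sequential case needs 40 page locks against 1,000 row locks, while the random case needs the same number of requests either way.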

So what about random key order? Well, if it's truly random here, and we're just inserting one row per data page, then there's really no difference between one row lock and one page lock. There's the potential to get better concurrency with row-level locking. But in a data sharing environment, it should be recognized that there is additional locking protocol. So, for example, when using row-level locking, there will be a page p-lock on the data page. OK? Sometimes there may be concurrency issues. So one of the tuning options that sometimes applies here is to use MAXROWS 1 with LOCKSIZE PAGE to force one row per page, to avoid or reduce the overhead of data page p-locking.

Slide 18 (25:59)
Now, let's turn to distributed free space. Distributed free space is one of the most important decisions for the database administrator. There are two options. The first one is FREEPAGE, which basically says leave a free page every n pages. So, if you say FREEPAGE 10, this means that after every 10 pages we will leave a completely empty page. PCTFREE n basically says we are going to leave a certain percentage of free space in each and every index or data page. So, notice here that the default PCTFREE is five percent of the data page and ten percent of the index page. And this space is reserved or established during LOAD, REORG, and REBUILD INDEX. Next slide.

Slide 19 (26:45)
So, there are some important tradeoffs here, OK? So, why do we even care about distributed free space? Well, the idea of leaving distributed free space is to absorb the new index entries and the new data rows. From the index perspective, we are trying to leave distributed free space so that we can maintain efficient sequential read of the index. When it comes to the data, we're trying to leave sufficient free space to absorb the new rows, so we can still maintain efficient sequential read of the data when going through the clustering index. Another thing here is to minimize index page splits.
Index page splits can be painful, especially in a data sharing environment. So, by providing sufficient PCTFREE and FREEPAGE we can actually reduce the number of index splits, and also make sure that the new index leaf pages are near the original leaf page. So, the recommendation is to carefully calculate the settings. If you're not prepared to do that, then it's best to run with the defaults for PCTFREE and FREEPAGE. So, the defaults

are basically: 0 for FREEPAGE for both index and data. The PCTFREE is five percent of the data page and ten percent of the index page. Now the default of FREEPAGE 0 is a very bad choice for index pages. What it means basically is that we haven't left any of those free pages sprinkled over the key range. So when we get an index leaf page split, the new page where we're going to populate some of the index entries is going to be at the end of the index partition or the index space. So in heavily inserted table spaces, please give some consideration here to having a nonzero value for FREEPAGE.

Slide 20 (28:38)
So, let me try and summarize recommended best practice here on distributed free space. Basically, if the keys and data rows are coming in sequential key order, it makes sense to use a PCTFREE of zero and the default of FREEPAGE, because we're always inserting in sequential order. On the other hand, if the new rows are coming in a random key order, then it's a good idea, and recommended here, to have healthy nonzero values for PCTFREE to reduce the chance of a space search, and also to have a nonzero value for FREEPAGE. When it comes to index pages, it really depends whether the new index keys coming in are sequential or random. Clearly, if it's a sequential key that's coming in, then the idea here would be to set PCTFREE to zero. There's no point in reserving distributed free space if the new keys are always being inserted at the end. On the other hand, if it's random, then again it's recommended to have a healthy nonzero value to absorb the new keys. And most importantly, line up the frequency of the REORG with the amount of free space. This applies in fact to both the table space and the index space. There needs to be a balance here between the amount of free space and the REORG cycle frequency. Now, let's go on to the next slide.

Slide 21 (29:57)
I now want to talk about a table space option called TRACKMOD.
This is an option that was introduced after the introduction of data sharing. The aim of this feature is to reduce the amount of contention on the space map pages, especially in data sharing. This applies to both sequential and random insert behavior. What happens is that DB2, by default, will maintain information in the space map page about whether or not a page has been changed. So when a page is changed, there is a bit

in the space map for each page to indicate that it's been changed. This information is used by the incremental copy utility to determine which pages have changed. If DB2 didn't maintain this information, it would have to scan through the whole of the table space or the table space partition to identify the changed pages. So, by default DB2 maintains this information in the space map pages, and this makes the incremental copy utility very efficient. You do have the option at the DDL level on the object to specify TRACKMOD NO. And this is strongly recommended if in your installation, or against this table space, you never use incremental copy. On the other hand, it may still be useful if incremental copy is rarely used. However, it will lead to a degradation of incremental copy performance: when TRACKMOD is set to NO, the incremental copy utility will have to run through the whole of the table space or table space partition to identify the changed pages. Now if you are using incremental copy, but you still decide to use TRACKMOD NO, then you need to re-evaluate the break-even point at which you want to use full copy as opposed to incremental copy. To help many customer installations, there's a new system parameter in Version 10 called IMPTKMOD, and this gives you the ability to set a system-wide default as to whether you want to have TRACKMOD YES or TRACKMOD NO.

Slide 22 (32:06)
Now the next feature is the ability to specify MAXROWS. This can either lead to very good performance, or can actually lead to very bad performance. So this needs to be very carefully applied, and once a decision has been made, it needs to be maintained going forward. Again, this applies to both sequential and random inserts.
And the objective here, by setting MAXROWS to the expected value, is to avoid the situation where DB2 has what we called these false leads, where the space map search indicates there's space, but when we get to the data page we find that there's not enough space left in it. So, this is a tradeoff here. We're basically saying we want to trade a little bit of disk space in order to get efficiency. So, the general idea here is to estimate the average row size and the number of rows that will fit in the data page, and then back it off a little bit. So, for example, if we think we can store 25 rows on average in a data page, why not back it off to 24 or 23 and set MAXROWS to this value? The advantage of doing that is this: let's just say that we set MAXROWS to 24. When the 24th row is inserted, DB2 will mark that page full in the space map, and take it out of the space search.
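The "estimate, then back off a little" rule of thumb for MAXROWS is simple arithmetic, sketched here in Python. The helper name `suggest_maxrows` is hypothetical, and the usable space per page (4,000 bytes) and average row length (160 bytes) are made-up illustration values rather than DB2 figures; the only DB2 rule relied on is that MAXROWS accepts values from 1 to 255.

```python
def suggest_maxrows(usable_page_bytes: int, avg_row_bytes: int,
                    backoff: int = 1) -> int:
    """Estimate rows per page from the average row size, then back off.

    usable_page_bytes and avg_row_bytes are inputs the DBA would measure or
    estimate; the result is clamped to the valid MAXROWS range of 1..255.
    """
    rows_that_fit = usable_page_bytes // avg_row_bytes
    return max(1, min(255, rows_that_fit - backoff))


# About 25 rows fit on average, so back off to 24 (or 23 with backoff=2),
# matching the example in the talk.
print(suggest_maxrows(4000, 160))
print(suggest_maxrows(4000, 160, backoff=2))
```

Because the estimate depends on the average row size, it has to be recalculated whenever that average changes, for example after adding columns or switching compression on or off.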

So, it's a tradeoff here in terms of giving away a little extra DASD space in order to get that page marked full, to take it out of the space search, and to reduce these wasteful false leads. But you need to be careful here, because if, for example, the table changes, in that more columns are added to it, or you switch between compressed and not compressed or vice versa, you then need to re-evaluate the decision. Now, sometimes page locking with MAXROWS 1 is used to try and simulate row-level locking. Just bear in mind however that, at very high concurrency levels, this might drive up the amount of space map contention, because more modifications are required to the space maps. And also, in this example as well, we're trading more disk space in order to provide efficiency. Now, let's go to the next slide.

Slide 23 (34:25)
SEGSIZE: When I went through the space search algorithm earlier for the segmented table space, I talked about the pros and cons of SEGSIZE. Basically, the general recommendation here is to make the SEGSIZE as large as possible, consistent with the size of the table space. So, by definition, for large tables with high-volume concurrent inserts, the SEGSIZE should be either 32 or 64. Prior to Version 10, the default SEGSIZE is too low at 4 and may need to be increased. There is a change in Version 10 to make the default SEGSIZE 32, and this can also be tuned in Version 10 through a system parameter. So, the tradeoff is simple here: having a large SEGSIZE provides a better opportunity to find space near the candidate page and maintain data row clustering, and a better chance to avoid a longer space search. On the other hand, having a small SEGSIZE will lead to more space map pages, and could possibly have the benefit of reducing space map page contention.
But, there's less chance of hitting the searching threshold, and looking for space at the end of the table space, and thereby kicking off more expensive space searches. And the pros and cons here would also apply to the universal table space. Now, let's turn to the next slide. Slide 24 (35:51) MEMBER CLUSTER: This is a very important feature that was introduced into DB2 for data sharing in Version 4. And basically, it's aimed at reducing contention on both space maps and possibly data pages. It's often the case with insert workloads that there are hot spots, either inserting at the end of the key range, or certain hot spots within the key range. And this can drive up excessive space map contention, in the form of page p-lock and page latch contention. But also, with row-level locking, it may cause contention on data pages. Again, both page p-lock and page latch contention. So, the idea here of using MEMBER CLUSTER is to provide member-private space maps, along with the associated or corresponding data pages. Slide 25 (36:47) So, what I'd like to do on the follow-on slide here is look at some examples. So what I'm going to do on this chart is talk about the MEMBER CLUSTER concept. So, by default in a classic partitioned table space, a simple linear table space, there's a space map for every ten thousand pages. And one of the things we're trying to do with MEMBER CLUSTER is, first of all, to introduce more space map pages, and the second thing is to provide a loose affinity or ownership between a space map and its data pages and a particular member. So, the first thing about MEMBER CLUSTER is we now get a space map for every 199 pages, as opposed to ten thousand. So, there are many more space map pages. So, in this example here, we have three-way data sharing, with member A, member B, and member C. So here, member A comes in and does an insert, and basically goes to the first space map, and at that point there is an affinity between that member, in terms of insert processing, that space map, and the associated data pages. Then member C does some insert processing, and goes after a different space map, and therefore has a loose affinity between itself and that space map, and the associated data pages. Member B then comes along, again on a different space map, and there's a loose affinity between member B and that space map, and the associated data pages.
So, one thing you see here now is that basically you'll lose data row clustering because of this technique. So, what happens when member B fills up all the space on those data pages covered by that space map? What it will then do is use a different space map that is not being used by other members of the data sharing group. It's a similar story with member A. Member A went after the first space map and the associated data pages, then he exhausted the space covered by that space map, and then went to a different space map later on in the page set partition.
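The walkthrough above can be mimicked with a toy model. This is a sketch under heavy assumptions, not DB2's actual algorithm: the class and attribute names are invented for illustration, every page is assumed to hold a fixed number of rows, and a member that exhausts its space map always claims a brand-new one at the end rather than reusing an existing unowned one. Only the figure of 199 data pages per space map comes from the talk.

```python
PAGES_PER_SPACE_MAP = 199  # figure quoted in the talk for MEMBER CLUSTER


class MemberClusterSketch:
    """Toy model of loose member-to-space-map affinity (hypothetical, simplified)."""

    def __init__(self, rows_per_page: int):
        self.rows_per_page = rows_per_page
        self.owner = []    # owner[i] = member that claimed space map i
        self.used = []     # used[i]  = rows stored under space map i
        self.current = {}  # member   -> space map it is currently inserting into

    def _capacity(self) -> int:
        return PAGES_PER_SPACE_MAP * self.rows_per_page

    def insert(self, member: str) -> int:
        """Insert one row for `member`; return the index of the space map used."""
        idx = self.current.get(member)
        if idx is None or self.used[idx] >= self._capacity():
            # No affinity yet, or the owned area is full: claim a space map
            # that no other member is using (here, always a new one).
            self.owner.append(member)
            self.used.append(0)
            idx = len(self.owner) - 1
            self.current[member] = idx
        self.used[idx] += 1
        return idx


ts = MemberClusterSketch(rows_per_page=1)
print([ts.insert(m) for m in ("A", "C", "B")])  # each member gets its own space map
```

Because each member inserts only into its own area, rows arrive in member order rather than clustering-key order, which is the loss of clustering the talk points out.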

Slide 26 (38:58) So, now let's summarize a bit about MEMBER CLUSTER. Once you use MEMBER CLUSTER, the data rows, as inserted, may not be placed by the clustering index, and are probably not clustered. Instead, rows are stored in the available space in the member-private area. MEMBER CLUSTER has the ability to reduce space map contention, especially when you have high-volume concurrent inserts spread around multiple members in a data sharing environment. In a non-data sharing environment, it has the benefit of having smaller space map pages, or should I say, space maps covering fewer pages. Again, this can reduce the amount of contention on space map pages. There is the option of restoring the data clustering by running a REORG, or frequent REORGs, against the table space. One optimization that's worth considering is that, having used MEMBER CLUSTER, you may want to then consider the use of LOCKSIZE ROW and a larger data page size, to get better space reuse and also to reduce the working set of buffer pool pages. Slide 27 (40:00) This picture here tries to give you another perspective on this. On the top half of this slide is a picture of a classic partitioned table space. And what you see here is that member 1 has an affinity with space map 1, and therefore that ownership or relationship covers data page 1 through to data page 199. Member 2 owns space map 2, has that relationship, and covers the associated data pages 1 to 199 covered by space map 2. Now, starting in Version 10 for the first time, a universal table space can also support MEMBER CLUSTER. However, the organization is slightly different. Basically, with MEMBER CLUSTER on a universal table space, each space map covers just 10 segments. So, the number of pages covered by a space map in a UTS is going to be a function of the SEGSIZE. Now, let's turn to the next slide. Slide 28 (40:59) So, more information on MEMBER CLUSTER.
First of all the MEMBER CLUSTER option was never available on a segmented table space, and it's not available on a segmented table space in Version 10 either. OK? It should really be used only when data clustering is not essential.
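To put the space map coverage figures quoted in this talk side by side (roughly ten thousand pages per space map by default, 199 with MEMBER CLUSTER on a classic partitioned table space, and 10 segments with MEMBER CLUSTER on a universal table space), here is a small arithmetic sketch. The function names and the one-million-page table space are assumptions chosen purely for illustration.

```python
def space_maps_needed(data_pages: int, pages_per_space_map: int) -> int:
    """Number of space map pages needed to cover data_pages (ceiling division)."""
    return (data_pages + pages_per_space_map - 1) // pages_per_space_map


def uts_member_cluster_coverage(segsize: int) -> int:
    """Pages covered by one space map when it covers 10 segments of SEGSIZE pages."""
    return 10 * segsize


data_pages = 1_000_000  # illustrative size only
print(space_maps_needed(data_pages, 10_000))  # default coverage quoted in the talk
print(space_maps_needed(data_pages, 199))     # classic partitioned with MEMBER CLUSTER
print(space_maps_needed(data_pages, uts_member_cluster_coverage(32)))  # UTS, SEGSIZE 32
```

More space map pages means more pages to spread contention across, at the cost of more pages to visit when a search has to walk the space maps.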

There are a number of improvements in Version 10 to help out. There's a new catalog column to indicate whether the table space has MEMBER CLUSTER. The MEMBER CLUSTER option is now introduced for the universal table space for the first time. If you are using a universal table space, the MEMBER CLUSTER attribute can now be changed through what I call a deferred-alter mechanism: you do an ALTER against the table space and the change will go pending. And then you can REORG later, with an online REORG, to materialize the change. But the ability to change the MEMBER CLUSTER attribute is only available on the universal table space. Another option you have in Version 10 is to convert a classic partitioned table space with MEMBER CLUSTER over to a universal table space partitioned by range, and retain the MEMBER CLUSTER behavior. And having converted over to UTS, obviously then, you can change the MEMBER CLUSTER behavior on or off, and it will be implemented by the materializing REORG. Now, let's turn to the next slide. Slide 29 (42:23) So, what I've got on this slide here is a feature called MC00. MC00 is short for MEMBER CLUSTER with PCTFREE 0 and FREEPAGE 0. The significance of PCTFREE 0 and FREEPAGE 0 is that they provide a hint when used in combination with MEMBER CLUSTER. Now, the objective of this optimization, in an application where you have both inserts and deletes, is to provide what's called a pseudo-append type of function. We're going to try to insert the new rows at the end. We are going to ignore clustering and insert the rows at the end, but before extending, we're going to try to fill in the holes that may be left behind by delete processing, such as the archive. So, we are switching between append at the end and insert processing. So, again, we have a space map covering a certain number of data pages.
And what we have in this example here is that member A goes off to the first space map; member C goes off to the second space map; and member B goes off to the third space map. However, in this example, if you look at member B here, he's basically gone after the third space map and the associated data pages, and when that space is exhausted he is going to go to the very last space map at the end of the page set. It's a similar story here with member A. Once he's exhausted all the space in the data pages covered by the space map, he's then going to go to the last space map page for the page set. Slide 30 (44:07)

So, a little bit more detail on the next slide here about MC00. This is introduced for the first time in Version 7 with the APAR PQ When Version 9 went generally available, this function was missing. And it was reintroduced, or re-established, in Version 9 with the APAR PK It only applies to the classic partitioned table space, and the simple table space. It does not apply to the universal table space. And its real objective here is to provide a highly efficient insert at the end of the table space or table space partition, without looking at free space. And it will choose longer chain of space maps search as a table space gets bigger and bigger. It's good for table spaces where we seldom have deletes or updates that create free space, and as I said, its key objective here is to avoid exhaustive space search. But the success here does depend on the deletes and inserts being spread around the members of the data sharing group. Now, let's turn to the next slide. Slide 31 (45:11) In Version 9, we've introduced a new option on the DDL called APPEND, and APPEND does exactly what it should be doing. It is always going to append the new rows at the end of the table space. It will never try to fill in the holes left by deletes or updates. So if there is a lot of space freed by deletes, or by updates causing rows to be relocated in different parts, REORG has to be run in order to reclaim the space freed up. So, APPEND is exactly as advertised: inserting at the end. And one of the key objectives here is to relieve high get page rates during space search. So, it just inserts at the end without searching through for free space. Now, one of the strong recommendations here with APPEND is to always use it with the MEMBER CLUSTER option and also with a high FREE SPACE. APPEND was introduced after general availability. And the PTF for PK81471 for the first time introduced the APPEND option under Version 9.
To summarize, it favors physical extent over trying to use space search to fill in the holes left by rows that have been deleted. Slide 32 (46:37) In the last part of this web lecture, I'd like to talk about the changes for online schema in Version 10. So, one of the objectives of this enhancement in Version 10 was to provide a migration path for customers from the classic table space types (simple linear, segmented, and partitioned) to UTS. Prior to Version 10, the only way to migrate to UTS was to unload the data, drop the table space, recreate it as a universal table space, and then reload the data. And it is usually a major to-do for customers migrating to UTS.

A longer-term strategy in DB2 is that all table spaces will converge on UTS, and eventually we will deprecate the classic table space types. This will not happen for some time. So, the first thing about this option here was to give a migration path to UTS. Having gone to UTS, another potential opens up: you can alter the table space to change the buffer pool, the DSSIZE, the SEGSIZE, MEMBER CLUSTER, and so on. The ALTER will actually go pending and be stored in the catalog. Then sometime later, at user discretion, when you run a REORG, including the online REORG, the changes will actually be materialized. So that, on a very high level, is the goal of the online schema changes in Version 10. Please recognize that the conversion from the classic table space types to UTS is a one-way ticket. You can't go back to the old classic table space types, and in order to get the new features, to change things like buffer pool size or MEMBER CLUSTER on and off, you have to first convert to UTS. Now, let's go on to the next slide. Slide 33 (48:22) So, as part of this deferred ALTER mechanism, the first thing is to put the ALTER in to change the index or table space attributes. The statement is validated, and if the checks are OK, the statement, or the request, is stored in the catalog for later materialization with REORG, and the table space is marked as ADVISORY REORG PENDING NON-RESTRICTIVE. And basically, the statement itself will complete with SQLCODE +610 to advertise that the change has happened, or happened in the sense that it is stored in the catalog, ready for the materializing REORG. And there is an example in this chart of a pending change stored in the catalog. Now, let's turn to the next slide. Slide 34 (49:08) So, to repeat myself and to reinforce this, the pending changes are materialized by the next REORG. It can be an online REORG, either SHRLEVEL REFERENCE or SHRLEVEL CHANGE.
Any change that is pending can actually be dropped, so you can say ALTER TABLESPACE with DROP PENDING CHANGES. If you actually want to undo a change you've already materialized, you can then put in another ALTER, a compensating ALTER, and then do another REORG to roll the changes back. But to repeat myself, having converted to UTS, you cannot go back to any of the classic table space types.
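The life cycle just described (ALTER goes pending, REORG materializes, pending changes can be dropped, a compensating ALTER plus another REORG rolls back) can be pictured as a small state model. This is a conceptual sketch only: the class, its method names, and the attribute keys are invented for illustration and do not correspond to any DB2 interface. The +610 return value mirrors the SQLCODE mentioned in the talk.

```python
class TableSpaceSketch:
    """Toy model of the deferred-ALTER life cycle (hypothetical names throughout)."""

    def __init__(self, **attributes):
        self.attributes = dict(attributes)  # materialized definition
        self.pending = {}                   # changes waiting for a REORG

    def alter(self, **changes) -> int:
        """Record a pending change; nothing is materialized yet."""
        self.pending.update(changes)
        return 610  # the talk says the ALTER completes with SQLCODE +610

    def drop_pending_changes(self) -> None:
        """Discard all pending changes before they are materialized."""
        self.pending.clear()

    def reorg(self) -> None:
        """The materializing REORG applies every pending change."""
        self.attributes.update(self.pending)
        self.pending.clear()

    @property
    def advisory_reorg_pending(self) -> bool:
        return bool(self.pending)


ts = TableSpaceSketch(MEMBER_CLUSTER=False)
ts.alter(MEMBER_CLUSTER=True)      # change goes pending
ts.reorg()                         # now materialized
ts.alter(MEMBER_CLUSTER=False)     # compensating ALTER ...
ts.reorg()                         # ... rolled back by another REORG
print(ts.attributes)
```

The model deliberately leaves out validation and the restrictions covered next, such as catalog objects and clone relationships.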

The pending ALTER cannot be used for catalog objects, the work file database, or any database or table space that is involved in a clone table relationship. In the case of a clone table relationship, you must drop the clone first. Now, on the next two slides here... Slide 35 (50:04)...I'd like to talk about a particular issue, to make sure that you're well informed. So, in this picture here, you can see that at time 1, with time going left to right, we have a pending ALTER, and that ALTER is valid and it's stored away in the catalog waiting to be materialized later at the next REORG. At time 2, the image copy is taken. Then at time 3 a REORG is run, which takes the pending ALTER and actually implements the change and takes an in-line copy. And after that we have, obviously, some SQL DML statements inserting, updating, and deleting, or reading that particular table. So, using that particular picture here, if you want to do a point-in-time recovery to a point in time after the REORG, then everything will be perfectly OK, and there will be no problem with recovering to currency. But what I'd like to do here is point out that if you want to do a point-in-time recovery to a point in time before the materializing REORG, that will not be possible. So, to try to express it a different way, and to reinforce the message here: once you do the pending ALTER and you take the image copy, if you want to do a point-in-time recovery prior to running this materializing REORG, then there's no problem. But having done the materializing REORG, if you want to do a point-in-time recovery to a point before that materializing REORG, then it will not be possible. And any recovery to a point after the materializing REORG will be perfectly OK.
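The recovery rule just stated can be written down as a tiny predicate. This is a sketch of the rule as described in the talk, not of how the RECOVER utility decides anything internally: the function name is invented, times are abstract integers matching the slide's time 1, 2, 3 labels, and treating a target exactly at the REORG time as not allowed is a conservative assumption of this example.

```python
from typing import Optional


def pit_recovery_allowed(target_time: int, materializing_reorg_time: Optional[int]) -> bool:
    """Return True if point-in-time recovery to target_time fits the rule in the talk.

    materializing_reorg_time is None while the ALTER is still pending, that is,
    no materializing REORG has run yet (hypothetical helper for illustration).
    """
    if materializing_reorg_time is None:
        # Pending change not yet materialized: no restriction described.
        return True
    # After the materializing REORG, only points after it are recoverable.
    # The boundary case (equal times) is treated as not allowed by assumption.
    return target_time > materializing_reorg_time


# Timeline from the slide: ALTER at time 1, image copy at time 2, REORG at time 3.
print(pit_recovery_allowed(target_time=2, materializing_reorg_time=None))  # before the REORG runs
print(pit_recovery_allowed(target_time=2, materializing_reorg_time=3))     # REORG done, target before it
print(pit_recovery_allowed(target_time=4, materializing_reorg_time=3))     # REORG done, target after it
```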
Slide 36 (51:37) So, the next chart here tries to summarize this: if you're recovering to currency, or to a point in time after the materializing REORG, then DB2 will use the in-line copy taken with the REORG, apply the log records, and come forward to the respective point in time, or to currency. On the other hand, if you want to do a point-in-time recovery to a point before that REORG, then this will not be possible, and it will not be allowed. Slide 37 (52:06)

So, on the next slide here, I'd like to talk about table space type conversion. So, what it's showing on this chart here is the migration path, using this deferred ALTER mechanism, to get to the universal table space. So, basically, we can migrate a single-table simple table space, or a single-table segmented table space, to UTS PBG. We cannot migrate a multi-table table space to UTS PBG, because UTS PBG right now only supports a single-table table space. So you can ALTER the table space, specifying MAXPARTITIONS n, and then basically the materializing REORG will convert to UTS PBG. We can convert a classic partitioned table space to a UTS PBR by saying ALTER TABLESPACE SEGSIZE n. And if there's a MEMBER CLUSTER attribute that already exists on the single-table table space or the classic partitioned table space, that will be inherited by the new universal table space. Slide 38 (53:51) So, what I'd like to do now is try to summarize, in the next few charts here, what we've learned in web lecture part 1. First of all, what I hope we've understood here is how DB2 looks for space on insert, and I walked through the space search algorithms for both a classic partitioned and a segmented table space. We talked about the tradeoffs between performance and space reuse. You cannot have both. If you design for high performance, then you get less space reuse. If you want better space reuse, then you get less performance. So, for the very best performance, the recommendation is to use a classic partitioned table space, and if you want better space reuse, you want to use the segmented or universal table space. So, you have to balance space reuse against performance. And also recognize that deletes or updates or REORG frequency can impact performance. And there needs to be a balance here between the distributed free space, the amount you inject to absorb the new inserts, and the REORG frequency to re-establish the free space.
We also talked about efficiencies or optimizations here to speed up inserts. Things like APPEND and MEMBER CLUSTER are a very good choice here to speed up performance and reduce contention. But there is a tradeoff here, in terms of losing clustering and using more space. Use of a segmented table space or UTS provides more granular space status in the space map page. This may be useful from a space use viewpoint, but it may lead to more space map contention.
