DB2 for z/OS Utilities Best Practices Part 2. Haakon Roberts, DB2 for z/OS Development. 2011 IBM Corporation. Transcript of webcast.

Slide 1 (00:00) My name is Haakon Roberts. I work for the DB2 Silicon Valley Lab in California, and I am going to be presenting DB2 for z/OS Utilities Best Practices. If we take a look at the agenda, we'll start with general recommendations for utilities. Then we'll take a look at a set of primary utilities within the DB2 for z/OS utilities suite. We'll take a look at COPY, and include in that COPY's use of data set level FlashCopy that was introduced in version 10 of DB2. Then we'll look at RECOVER, including the QUIESCE and MODIFY RECOVERY utilities. Slide 3 (00:18) If we take a look at the agenda, I'll start by going through some general recommendations regarding utilities use for DB2 for z/OS. Then we'll look at some specific utility areas such as COPY, and COPY's use of data set level FlashCopy that was introduced in version 10 of DB2. Then we'll look at RECOVER, including QUIESCE and MODIFY. We'll look at LOAD and UNLOAD processing. We'll spend quite some time

discussing REORG, and then RUNSTATS, CHECK, and finally we'll have a look at DSN1COPY and use of the DSN1COPY utility. The general recommendations, COPY, and FlashCopy are covered in the first part of this presentation; RECOVER, QUIESCE, MODIFY, LOAD, UNLOAD, REORG, RUNSTATS, CHECK, and DSN1COPY are covered in this second part. Slide 13 (01:25) Moving on to RECOVER, QUIESCE, and MODIFY. Slide 14 (01:31) The first thing to note about the RECOVER utility is that it typically consists of two phases: one is restoring the recovery base, and the other is the log apply. Any time we are talking about RECOVER, we're talking about data and application availability, and anything that can be done to reduce the recovery time is going to improve availability for applications and for businesses. Therefore, if we take a look at what we're doing for RECOVER, we either need to speed up the restore of data sets or recovery bases, or we need to reduce the log apply time. So let's take a look at our recommendations in this area. The first recommendation is to maximize exploitation of parallel RESTORE and fast log apply. Our recommendation is to recover multiple objects in a single RECOVER statement, because the recovery bases are going to be restored in parallel and, perhaps more importantly, we will perform one

scan of the log for that RECOVER utility, and we will be able to take full advantage of fast log apply, which was introduced in version 6 of DB2, to get parallel log apply across multiple objects in a single RECOVER. Ideally our recommendation would be to specify fewer than 100 objects in a recover list, but we can support many more than that. If multiple recover jobs are being run, then avoid running more than 10 recover jobs in a single DB2 subsystem. The reason for that is that fast log apply will use up to 100 megabytes of DBM1 address space storage, but each RECOVER job itself will only use a maximum of 10 megabytes. So if you run more than 10 recover jobs, the first 10 recover jobs will be able to acquire fast log apply storage, but the 11th is unlikely to get any fast log apply storage; it will then run doing slow log apply, and you will not take advantage of the performance of fast log apply. Another recommendation is to image copy indexes and include those indexes in the recovery list. That's particularly important for point-in-time recovery, because if you wish to recover data to a prior point in time and you don't include the indexes in that RECOVER statement, then the indexes will be put in REBUILD-pending, and the index rebuilds can take a considerable length of time. By image copying the indexes and including them in the recovery list, you avoid the need to rebuild the indexes. Next, if you have multiple objects that need to be recovered, and you know that some of the objects are read only and have no updates to them while other objects do have updates, consider splitting off the page sets that don't have any updates and recovering them in a separate RECOVER statement. The reason for that is that if they don't

have any updates that require log apply, then the RECOVER utility will restore the image copies and then determine from SYSLGRNX that there is no log apply to be done. The recovery is then complete and the objects are available at that point. If those objects are included in a list with other page sets that do have log records that need to be applied, then none of the objects in the list are going to be made available until the entire recovery is completed, and so the objects that could have been made available after the restore of the recovery base are not going to be available until the entire log apply is complete. Next, for point-in-time recovery, include the entire referential integrity set in the same RECOVER statement to avoid objects being put into CHECK-pending, and also include base and aux objects in the same RECOVER statement for the same reason. In fact, in version 10 of DB2 we provide an option on the RECOVER statement that allows you to either enforce or not enforce these particular rules. If you are using system-level backups, i.e., if you are using the RESTORE SYSTEM utility delivered in version 8 of DB2, and you want to perform object-level recovery from a system-level backup, then the Recovery Expert tool has that capability; in version 9 of DB2, the RECOVER utility itself has that capability. Finally, for point-in-time recovery in version 10 of DB2, consider using the BACKOUT YES option of the RECOVER utility. It is not the utility's job to determine whether it makes more sense to roll back to a prior point in time or to restore an image copy and roll forward through the log; that is a decision which you will have to make yourself, or the Recovery Expert tool will provide the necessary recommendations.
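
To make the first recommendation concrete, here is a minimal sketch of a single RECOVER statement that restores several related objects together, plus a point-in-time variant using BACKOUT YES. The database, tablespace, and index names are placeholders, the log point is an arbitrary example value, and the PARALLEL keyword (which limits how many objects are restored concurrently) is shown with an assumed degree of 4.

    RECOVER TABLESPACE MYDB.TSDATA
            TABLESPACE MYDB.TSHIST
            INDEX MYUSER.IXDATA1
            PARALLEL(4)

    RECOVER TABLESPACE MYDB.TSDATA
            INDEX MYUSER.IXDATA1
            TOLOGPOINT X'00001A2B3C4D'
            BACKOUT YES

Recovering the objects in one list is what allows a single scan of the log with fast log apply across all of them.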

Slide 15 (07:36) Moving on to slide 15. For the QUIESCE utility: quiesces are typically run periodically to ensure that there is a consistent point to recover back to in case point-in-time recovery is required. In version 9 of DB2, the RECOVER utility was enhanced so that any point-in-time recovery to an RBA or to an LRSN value will ensure that at the end of the recover the data set is transactionally consistent. What I mean by that is that if you recover to a particular RBA in version 9 of DB2, the RECOVER utility will recover to that point in the log, determine what units of work were uncommitted at that point, and then back out those uncommitted units of work. So at the end of the RECOVER, what you have is a consistent data set. For that reason, you might want to consider whether it is really necessary to continue taking quiesce points on a regular basis, since running QUIESCE has an application impact because it drains off update (write) claims against the object. So in version 9 of DB2, because of the enhancement to the RECOVER utility, it may no longer be necessary to take periodic quiesce points. On the other hand, if what you want is just to take a quiesce point so that you have a mark on the log, so that you know what the RBA was at mid-day yesterday in case you want to recover back to mid-day yesterday, then instead of running QUIESCE against the object that you want to recover, or may want to recover, one thing you can consider doing is running the QUIESCE utility against DSNDB06.SYSEBCDC. That tablespace just contains SYSIBM.SYSDUMMY1, and taking a quiesce point of that is not going to impact your applications, and you will still end up

with a quiesce point logged in SYSCOPY, so that you can then have a look at the quiesce in SYSCOPY and see what the RBA or the LRSN was at that particular point in time. That RBA or LRSN can then be used for point-in-time recovery of your real data. The other point to note about QUIESCE is that our recommendation is to use WRITE NO unless you absolutely do have to have pages forced out to disk. With respect to MODIFY RECOVERY, you should ensure that you base your MODIFY strategy on your backup strategy and not vice versa. You do not want objects going into COPY-pending because the MODIFY RECOVERY utility that you ran removed your last recovery point, your last recovery base. You need to make sure you understand what your backup strategy is, set your backup strategy based on your recovery time objective, and then, once you've set your backup strategy, set your MODIFY RECOVERY strategy to ensure you no longer keep obsolete recovery information lying around that you would never use for recovery purposes. So consider running MODIFY RECOVERY every time a backup is taken, or at least weekly. In addition to that, in order to ensure that MODIFY RECOVERY runs optimally, consider reorging the SYSLGRNX tablespace on a regular basis to ensure optimal performance of MODIFY RECOVERY and ensure it has no impact on the system when it runs. Take advantage of the new features that were delivered in version 9 of DB2 for MODIFY RECOVERY to say not what you want to delete but what you actually want to keep; for example, keep recovery information for the last three image copies.
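
As a concrete illustration of these two recommendations, here is a minimal sketch of the corresponding control statements; the tablespace name in the MODIFY RECOVERY statement is a placeholder.

    QUIESCE TABLESPACE DSNDB06.SYSEBCDC WRITE NO

    MODIFY RECOVERY TABLESPACE MYDB.TSDATA RETAIN LAST(3)

The first statement records an RBA or LRSN in SYSCOPY without draining your application objects; the second keeps recovery information back to the last three image copies and deletes the older SYSCOPY and SYSLGRNX records.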

Also bear in mind that MODIFY RECOVERY will not clean up orphan entries in SYSLGRNX; by orphan entries, I mean SYSLGRNX entries for objects that have been dropped. Finally, run the MODIFY RECOVERY utility to delete recovery information from prior to a REORG that materializes row alterations, such as a table where new columns have been added and REORG has been run to materialize those added columns. Eventually you would want to run the MODIFY RECOVERY utility to have the old recovery information removed, because that will make subsequent REORGs and other utilities more efficient. Slide 16 (13:05) Moving on now to the LOAD and UNLOAD utilities. Slide 17 (13:10) On slide 17, as you would imagine, running the LOAD utility without logging, with reuse of existing data sets, and without the need to build new compression dictionaries is going to make the LOAD utility run more efficiently. In addition to that, make sure that inline image copy data sets are allocated to disk. And if loading multiple partitions, split the input data set up and drive LOAD partition parallelism in a single LOAD. Use SORTNUM elimination, as was discussed earlier in this talk. And in version 9 of DB2, in the maintenance stream, we introduced a new parameter called NUMRECS, which is a table-level replacement for the SORTKEYS parameter. Our recommendation would be

to take a look at NUMRECS and use NUMRECS rather than SORTKEYS; it is simpler to use and it is more robust. If loading a partitioned table with a single input data set, then consider presorting the data in clustering key order. The LOAD utility does not sort data, but by presorting the data outside of the LOAD utility, in partitioning key order, we have found that this can significantly improve performance of the LOAD utility when loading multiple partitions from a single input data set. And bear in mind that the utility enhancement tool has a presort option which allows the tool to automatically presort the data before invoking the LOAD, if you wish to purchase the utility enhancement tool or you already have that tool available to you today. In addition to this, for improved performance and reduced CPU consumption, consider using the new FORMAT INTERNAL option which was delivered this year in the maintenance stream for the LOAD and UNLOAD utilities. The idea here is that if you are unloading from table A and loading the data into table B, and table A and table B have the same table definitions, then avoid converting the data into external format only to convert it from external format back into internal format to load it into table B. That is what FORMAT INTERNAL does for you: it unloads the data in internal format and avoids all of the row conversion and field conversion that otherwise would need to be done by the UNLOAD and LOAD utilities.
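
Pulling several of these recommendations together, here is a minimal sketch of an UNLOAD and LOAD pair that uses FORMAT INTERNAL, avoids logging, reuses the existing data sets, takes an inline image copy, and supplies NUMRECS. All object names, ddnames, and the NUMRECS value are placeholders, the data sets behind the ddnames are assumed to be allocated in the job, and exact keyword placement should follow the syntax diagrams for your DB2 level.

    UNLOAD TABLESPACE MYDB.TSSRC
           FORMAT INTERNAL
           UNLDDN SYSREC

    LOAD DATA INDDN SYSREC FORMAT INTERNAL
         REPLACE LOG NO REUSE
         COPYDDN(SYSCOPY)
         INTO TABLE MYUSER.TB_TARGET NUMRECS 50000000

Because an inline copy is taken via COPYDDN, LOG NO does not leave the tablespace in COPY-pending at the end of the LOAD.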

Consider taking a look at USS named pipe support; I refer you to the APARs on this particular slide for details. The idea here is that with USS named pipes, it is possible to unload to a virtual file in memory, and also to populate a virtual file in memory from an application, and then have the LOAD utility pull the data from that virtual file and load it into a DB2 table without landing the data on disk on the z/OS system. Finally, in version 10 of DB2, we introduced hash tables. Hashed access provides very fast access to data for applications, but it means that the tablespace structure is slightly different from normal tablespaces. It also means that the LOAD utility cannot load data in the order in which it resides in the SYSREC data set: each row that is loaded has to go to its specific hash position. As a result, one should not expect that loading data into a hash table is going to perform as well as loading into a non-hash table. However, the utility enhancement tool has been enhanced so that the presort option will sort the data in hash order, and that provides a significant performance improvement when loading into hash tables in version 10 of DB2. Slide 18 (17:57) Moving on to slide 18. Consider whether you want to use the UNLOAD utility or the High Performance Unload tool. The UNLOAD utility is part of the utilities suite; High Performance Unload is a separately chargeable tool. They often have comparable elapsed times, although HPU often uses less CPU. HPU also has a full SQL interface, and it permits unload from page sets on disk. Next, a quick word about file reference variable processing for LOAD and UNLOAD. If you have LOB data or XML data of any size, and that data needs to be unloaded or loaded, chances are you are using file reference

variables. File reference variable performance in version 8 was improved in version 9 with APAR PK75216. Even so, for file reference variables you have a choice of using members of a PDS or using HFS files, and even though we improved the performance of PDS file reference variables in version 9 of DB2, HFS still performs better in terms of elapsed time. In addition to this, there is a limit on the number of members that can be created in a PDSE, which could limit the number of records that are unloaded using PDS FRVs. Null LOBs are handled better than zero-length LOBs, but in version 9 of DB2 that issue is resolved in the maintenance stream. In version 9, as I mentioned, the performance of FRVs tends to be better than the performance in version 8, but the true performance improvement comes in version 10 of DB2, where in UNLOAD and LOAD we now support VBS (variable blocked spanned) format for the SYSREC data set. That allows the LOB and XML documents to be put inline with the base row in the SYSREC data set and avoids the use of FRVs altogether, and it can be used via the new SPANNED parameter in version 10 of DB2.
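
As a sketch of that last point, an UNLOAD statement along the following lines writes LOB and XML values inline in a spanned SYSREC rather than through file reference variables; the object name and ddname are placeholders, and any field-specification requirements for SPANNED YES at your DB2 level apply.

    UNLOAD TABLESPACE MYDB.TSBASE
           SPANNED YES
           UNLDDN SYSREC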

Slide 19 (20:47) Moving on to the REORG utility. Slide 20 (20:51) If we look at slide 20, obviously our recommendation is to run REORG SHRLEVEL CHANGE for maximum availability. If you are reorging a partition of a partitioned tablespace simply in order to compress the data in that partition, then, if your table is partitioned by date, consider using LOAD COPYDICTIONARY to copy your compression dictionary from one partition into a new partition and avoid having to run REORG in the first place. The REORG utility was changed in version 9 of DB2 to remove the BUILD2 phase when reorging a subset of partitions and non-partitioned secondary indexes exist. The way that was done was by shadowing the entire NPI. So REORG of a small subset of partitions with NPIs can actually take longer in version 9 than it took in version 8 of DB2, and the performance can be worse still if the NPI is disorganized, since keys for non-reorged partitions are unloaded in index order. The performance is improved significantly in version 10 of DB2 with index leaf page list prefetch. In addition to that, further performance improvements are planned to help address this particular issue, and the expectation is that changes will be put in the maintenance stream for version 9 and version 10 of DB2 to provide additional performance improvements for reorging subsets of partitions when non-partitioned secondary indexes exist. The other issue surrounding the removal of the BUILD2 phase is that if you have an NPI, then in version 9 of DB2, and in version 10 also, concurrent reorgs of partitions in the same tablespace are not permitted. The reason for that is that both reorgs would attempt to shadow the same NPI, and you cannot have two shadows of the same page set at the same time. So in PK87762 we retrofitted, from version 10 back into version 9, the ability to specify multiple partition ranges in a single REORG statement. So now in a single REORG you can specify, as per this example, that you want to reorg partition 1, partition 10, partitions 50 through 71, and partitions 500 through 900 in a single REORG.
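
In utility control statement terms, that example might look like the following sketch. The database, tablespace, and mapping table names are placeholders; the mapping table is assumed to have been created beforehand, as SHRLEVEL CHANGE requires at these DB2 levels, and PARALLEL YES is shown explicitly even though it is the default once PM25525 is applied.

    REORG TABLESPACE MYDB.TSPART
          PART(1,10,50:71,500:900)
          SHRLEVEL CHANGE
          MAPPINGTABLE MYUSER.MAP_TB
          PARALLEL YES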

In version 9 of DB2 we will unload those partitions in parallel and reload them in parallel, and we have fast log apply for the log phase as well. It's much more efficient, and you only get a single processing of the NPIs, so it is much better than splitting those partitions up and running them in separate REORGs, which is what would have had to occur prior to PK87762. However, reorging all those partitions in a single REORG statement requires shadowing of all of those partitions, and therefore it can use more disk space. As a result of this, in order to allow customers to determine whether they want to run these in parallel or not, a new PARALLEL keyword was introduced with PM25525, and the default for that is PARALLEL YES. In addition to that, we're introducing a new ZPARM that will govern the parallelism for reorg of subsets of partitions when using a LISTDEF for the REORG utility and that LISTDEF specifies a list of partitions. The new ZPARM introduced in PM37293 will govern the PARALLEL behavior so that the PARALLEL keyword does not have to be specified; the PARALLEL keyword and that parallelism processing govern whether we process partitions in parallel when the REORG utility has as its input a LISTDEF specifying multiple partitions for a particular partitioned tablespace. So in summary, the partition parallelism in version 9 in UNLOAD, RELOAD, and log apply means that multiple-partition reorg is much more efficient: it is faster, the log phase is better at keeping up with the logging rates, and we only process NPIs once. So if you have the DASD space, our

recommendation is to reorg multiple partitions in a single REORG statement. Slide 21 (26:33) If we now take a look at slide 21, this slide discusses the main recommendations for REORG SHRLEVEL CHANGE. First of all, our recommendation is to use DRAIN ALL rather than DRAIN WRITERS to minimize application impact. Secondly, use TIMEOUT TERM so that objects get freed up if we hit a timeout against a drain. Next, we take a look at DRAIN_WAIT and MAXRO. REORG SHRLEVEL CHANGE needs to apply log records, so it does a log scan, and it does that in the log phase. At some point, the REORG utility is going to determine that it is close enough to the end of the log that it should drain off writers and then catch up the last little bit of the log. What governs when we try to get the drain is the MAXRO parameter. If MAXRO is set to 30, that says that the REORG utility should try to get the drain when we think that the last bit of log is going to take us less than 30 seconds to process. Now, the thing to note here is that if we think that the last bit of log is only going to take us 30 seconds to process, we will attempt to get the drain at that point. It may be that it takes us 20 seconds to drain off the claimers. While those claimers are being drained, further updates could be processed against the log. So by the time we've actually acquired the drain, we are now past MAXRO. So even though at the time we decided to get the drain we thought it was only going to take us 30 seconds, by the time we have succeeded in getting the drain we actually have more log to apply than originally existed at the

time we decided to get the drain. That is why, if you want to minimize application impact from the drain processing, from the last log iteration, and from the switch phase of REORG, you should take DRAIN_WAIT and MAXRO, add those values together, and set them to be something less than your IRLM lock timeout interval. MAXRO says how much log we are going to be processing before we attempt to get the drain; then we need to wait for the drain to occur; and then, once we've got the drain, we've got to catch up on the log, and we still have our switch phase processing that needs to be done as well before we can allow applications back in to access the reorganized data. So if the idea is to minimize application impact, then our recommendation is to set DRAIN_WAIT plus MAXRO to be something less than the IRLM lock timeout interval. The difficulty here is to ensure that you do not set MAXRO too low, because if MAXRO is set too low you could potentially end up in a situation where the REORG utility determines that it can never actually catch up on the log, and therefore we will not attempt to acquire the drain. And then potentially you could hit the timeouts, the timeout interval where you have set a deadline threshold for how long this whole process should take. If we are unable to acquire the drain, then we will release the drain attempt, allow applications back in, wait for a period of time, and then try again. That is governed by the RETRY parameter and the RETRY_DELAY parameter, and our recommendation is to use the default values of RETRY 6 and a RETRY_DELAY of DRAIN_WAIT times the number of retries.
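
As a minimal sketch of these recommendations, assuming an IRLM lock timeout interval of 60 seconds, the example below sets DRAIN_WAIT plus MAXRO to 50 seconds and spells out the retry defaults explicitly; the object and mapping table names are placeholders, again assuming a pre-created mapping table for SHRLEVEL CHANGE.

    REORG TABLESPACE MYDB.TSDATA
          SHRLEVEL CHANGE
          MAPPINGTABLE MYUSER.MAP_TB
          DRAIN ALL
          TIMEOUT TERM
          DRAIN_WAIT 30
          MAXRO 20
          RETRY 6
          RETRY_DELAY 180

Here RETRY_DELAY 180 is simply DRAIN_WAIT (30) multiplied by RETRY (6), matching the default recommended above.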

Another option is to consider using MAXRO DEFER. If you have a 30-minute window in which you have to get the data reorganized, and yet you are reorganizing a 5 billion row table, that REORG is not going to complete from start to end in 30 minutes. Therefore it is important to start the REORG earlier; what typically matters is whether it can complete within the 30-minute window. So with MAXRO DEFER, it is possible to start the REORG some time earlier, hours or even days earlier, and then have the REORG utility run along in the background in the log phase, keeping up on the log. Then, when you hit the 30-minute window that you have available to you, at that point use the -ALTER UTILITY command to alter the utility and have it try to complete and get the drain in the window you have made available. Next, regarding REORG of LOB page sets. LOB tablespaces came along in version 6 of DB2, but REORG of LOB tablespaces had some drawbacks in versions 6, 7, and 8. The primary recommendation is to get to version 9 of DB2 and use the new SHRLEVEL REFERENCE capability that is available in version 9 conversion mode, and then in version 10 of DB2 use REORG SHRLEVEL CHANGE for LOB tablespaces. Bear in mind that in version 10 new function mode, with REORG SHRLEVEL NONE for LOB tablespaces, even though it is technically still supported, no REORG will actually take place: the REORG will complete with return code zero, but no REORG will actually be done. Therefore the strong recommendation is to convert REORGs of LOB tablespaces to either SHRLEVEL REFERENCE or SHRLEVEL CHANGE before moving to version 10 new function mode. Another point to note is that when reorging a PBG that has LOB columns in version 10 of DB2, and the

PBG grows in size, growing new partitions during the REORG, the corresponding newly grown LOB tablespace may be left in COPY-pending. Next, if using REORG DISCARD on variable-length records, REORG DISCARD performs better with the NOPAD option. And finally, our recommendation is to use inline statistics to gather statistics against objects when running REORG rather than running a separate RUNSTATS. However, bear in mind that in version 10 of DB2 there is an availability improvement for REORG with inline stats, because the catalog update to update the statistics columns in the catalog is done in version 10 after we allow applications back in to access the data. In version 9 of DB2, the catalog update for inline statistics by REORG is done prior to releasing the drain. So if the REORG is reorging many objects, i.e., hundreds or thousands of partitions, then the catalog statistics update could have a significant impact on the duration of the application unavailability for REORG. Slide 22 (35:14) Moving on to slide 22. A quick word about REORG INDEX versus REBUILD INDEX. REBUILD INDEX SHRLEVEL CHANGE is provided in version 9 of DB2 and is very good for creation of new non-unique indexes and for indexes that are already broken or are already in REBUILD-pending. It doesn't operate against shadow page sets, so it will set REBUILD-pending if it isn't already set. REORG INDEX, however, does operate against a shadow. Rebuilding indexes could be faster than reorging them, particularly if the index is disorganized, but REORG INDEX performance has

improved in version 10 due to index leaf page list prefetch. Slide 23 (36:10) Slide 24 (36:13) Moving on to slide 24, if we now take a quick look at RUNSTATS. A key thing to note about RUNSTATS is that one should not gather unnecessary statistics. It is important to take a look and see what statistics really need to be gathered, and only gather those statistics that are necessary. Also, do not use RUNSTATS to gather space statistics; you should really be relying on real-time statistics information for that instead. And use sampling for RUNSTATS. In version 10 of DB2, we provide page sampling rather than row sampling, and our recommendation would be to use page-level sampling with automatic sampling rates through specification of the new TABLESAMPLE SYSTEM AUTO parameter for RUNSTATS. In addition, in version 10 of DB2, we provide statistics profiles for tables to simplify RUNSTATS processing, and our recommendation would be to use that in version 10. However, rather than running RUNSTATS, it is more efficient to gather statistics through inline statistics, for example with the REORG utility. And finally, our recommendation is to specify KEYCARD when gathering statistics through RUNSTATS. The index cardinality statistics that this gathers are cheap to collect, and they are heavily used by the DB2 optimizer.
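
As a minimal sketch of those recommendations for version 10, the statement below combines page-level sampling with automatic rates and KEYCARD; the database and tablespace names are placeholders.

    RUNSTATS TABLESPACE MYDB.TSDATA
             TABLE(ALL) TABLESAMPLE SYSTEM AUTO
             INDEX(ALL) KEYCARD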

Slide 25 (37:54) Slide 26 (37:57) Moving on to slide 26 and the CHECK utilities. If all that is required is just a consistency check, then our recommendation would be to run the CHECK utilities SHRLEVEL CHANGE. Another option, of course, rather than running CHECK utilities, would be to use SQL, for example SELECTs with isolation level UR, which don't acquire any locks. One thing to be aware of, if you are running a CHECK utility and are not running SHRLEVEL CHANGE, is what happens when the CHECK utility detects an inconsistency. For example, when running CHECK LOB against a LOB tablespace, if it finds an inconsistency then it will put the entire LOB tablespace in CHECK-pending, even though that one broken LOB may have had no application impact to you prior to that point. For that reason, consider either running CHECK SHRLEVEL CHANGE, which does not set restrictive states nor reset them, or, if running a CHECK utility such as CHECK DATA or CHECK LOB, consider having a REPAIR utility ready to reset any CHECK-pending or ACHKP states. Bear in mind that CHECK INDEX never sets restrictive states if it finds any inconsistencies. In version 10 of DB2 this has changed, so now no CHECK utility will set CHECK-pending or ACHKP any more; instead they will reset them if they find that no inconsistencies exist and CHECK-pending or ACHKP is currently set. If running CHECK SHRLEVEL CHANGE, then as I said, the utility will not set CHECK-pending or ACHKP, but nor will it reset them. Also bear in mind that the CHECK utilities, running SHRLEVEL CHANGE, will exploit data set level FlashCopy, so make sure that page sets are on FlashCopy-enabled devices. And in the maintenance stream, we are going to be delivering a ZPARM that will

ensure that any CHECK run SHRLEVEL CHANGE can be made to fail if the data set is not on a FlashCopy-enabled device. The alternative is to allow the CHECK utility to continue, but instead of a FlashCopy a slow copy will occur, and even though you are running CHECK SHRLEVEL CHANGE, the data could remain in read-only mode for an extended period of time. In addition to this, because the CHECK utilities running SHRLEVEL CHANGE depend on data set level FlashCopy, there could be an impact on DASD mirroring or on the BACKUP SYSTEM utility, which is also using volume-level FlashCopy. In order to avoid contention and impact on BACKUP SYSTEM or DASD-level mirroring, a new ZPARM, UTIL_TEMP_STORCLAS, was introduced, which allows the target volume for the data set level FlashCopy to be directed outside of the DASD mirroring group or outside of the storage group used by BACKUP SYSTEM. Slide 27 (42:01) Moving on to slide 27. This slide gives a visual depiction of the order in which we would recommend data integrity checking be performed for LOB data. First of all, our recommendation would be to run CHECK LOB against the LOB tablespace. CHECK LOB simply ensures the consistency of the LOB tablespace in isolation. Once CHECK LOB runs clean and the LOB tablespace is consistent, then run CHECK INDEX against the aux index to ensure that the aux index matches the LOB tablespace. Once that is consistent, then run CHECK DATA with SCOPE AUXONLY and AUXERROR INVALIDATE to determine whether the base data rows match the aux entries.
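
In control statement form, that checking order might look like the following sketch; the database and tablespace names are placeholders, and SHRLEVEL CHANGE is shown in line with the recommendation above, assuming a DB2 level that supports it for all three utilities.

    CHECK LOB TABLESPACE MYDB.TSLOB SHRLEVEL CHANGE

    CHECK INDEX (ALL) TABLESPACE MYDB.TSLOB SHRLEVEL CHANGE

    CHECK DATA TABLESPACE MYDB.TSBASE
          SCOPE AUXONLY AUXERROR INVALIDATE
          SHRLEVEL CHANGE

Here MYDB.TSLOB is the LOB tablespace, so CHECK INDEX (ALL) against it checks the aux index, and MYDB.TSBASE is the base tablespace.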

There is no other way to validate base data rows against LOBs: there is no direct access from the base row to the LOB data itself; all access to the LOB data is via the aux index. Therefore, by ensuring that the LOB data is consistent with the aux index and that the base row is consistent with the aux index, we can then ensure that the base rows are consistent with the LOB data itself. Slide 28 (43:40) Moving on to slide 28. This is the equivalent for XML objects. The recommendation here would be to run CHECK INDEX against the DOCID index on the base table, then run CHECK INDEX against the node ID index, which will ensure consistency between the node ID index and the XML data in the XML tablespace. Once they are consistent, then run CHECK DATA to validate the base rows against the node ID index. Bear in mind that in version 10 of DB2, CHECK DATA has been enhanced to provide additional XML data integrity checking. Slide 29 (44:36) Slide 30 (44:39) Moving on to slide 30. I want to spend a few minutes on DSN1COPY. DSN1COPY is an essential part of the utilities portfolio and is used by a large number of customers. DSN1COPY is a stand-alone utility; as such, it cannot access any control information in the DB2 catalog or anywhere else in DB2. Therefore, there is no policing by DSN1COPY of anything that would be considered a user error, such as DSN1COPYing from a segmented tablespace into a simple

tablespace, or DSN1COPYing from a segmented tablespace with SEGSIZE 8 into a segmented tablespace with SEGSIZE 32, for example. That is not policed and cannot be policed by DSN1COPY, and one can expect various abends or other errors to occur when that particular tablespace and that particular page set are subsequently accessed by applications. All target data sets have to be pre-allocated for multi-piece tablespaces for DSN1COPY. Now, areas to watch out for are BRF-RRF mismatches. If you have a mismatch between basic row format and reordered row format, that is tolerated by SQL but not by a number of utilities, primarily the REORG utility. As an example, it may be that your source data set is basic row format and your target data set is defined as reordered row format. If you DSN1COPY from the source to the target, the target data set is now in fact in BRF format, but the catalog and directory information say that it is RRF. SQL can handle that inconsistency, because we have control information in the page set that tells us that the data is actually BRF regardless of what the catalog and directory might say. The utilities, however, expect the catalog and directory to reflect the true state of the data, and since they do not, the REORG utility may fail or a number of various errors or abends may occur. The expectation is that we will enhance utilities to handle BRF-RRF mismatches in the future. In the meantime, it is wise to ensure that when DSN1COPYing you not only ensure that the actual page set definition is the same, but also that the row format matches and that you are DSN1COPYing RRF to RRF or BRF to BRF. And if there is a mismatch between

BRF and RRF between the source and the target, it's always possible to use the REORG utility to convert one of them from BRF to RRF, or from RRF back to BRF, before running the DSN1COPY. If that isn't possible, for example if the image copy is in BRF format, then you can unload from the BRF image copy and load into the RRF target page set instead of using DSN1COPY. Another thing to be careful about is table metadata changes, for example adding new columns. The general recommendation is to REORG at the source before taking a DSN1COPY, particularly if no updates have occurred to that particular page set since the first alter took place. If the first alter has been done, then the first alter will create a new version of the table, but if no data has been inserted or updated in the page set, then no system page information will exist in that page set. And then, if you DSN1COPY to the target, we may not have the system page information available in the page set to allow us to interpret the versioned rows in that page set. Therefore the general recommendation is to run a REORG before you run a DSN1COPY. Another option would be, after DSN1COPYing to the target, to use REPAIR VERSIONS in order to fix the version information at the target site. And a new APAR, PM27940, enhances REPAIR VERSIONS so that we will extract system page information from any and all partitions of a partitioned tablespace and preserve that for use by data in other partitions.
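
As a minimal sketch of that option, the control statement below is what you would run at the target after the DSN1COPY; the tablespace name is a placeholder.

    REPAIR VERSIONS TABLESPACE MYDB.TSDATA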

Slide 31 (53:02) Moving on to slide 31. With respect to XML and DSN1COPY, care needs to be taken with respect to the DOCID, because the DOCID is a sequence number that is generated by DB2, and it is possible that by DSN1COPYing to a new target where the DOCID sequence is actually lower, you could get a -803 SQLCODE on insert, because DB2 will generate a new DOCID but that DOCID already exists in the table. Another concern regarding XML is that one cannot DSN1COPY an XML tablespace from one DB2 system to a different DB2 environment. The reason for that is that the XML data in the XML tablespace is not self-defining; the XML data also requires information in the XMLSTRINGS catalog table to allow its interpretation. So DSN1COPYing XML data to a completely different DB2 system that has a different DB2 catalog will not work, because the XMLSTRINGS catalog table will not have the necessary information to allow interpretation of the XML documents. DSN1COPYing XML data within a single DB2 subsystem or within a single DB2 data sharing group will work fine. Slide 32 (54:49) Moving on to slide 32. In summary, our recommendation is to stay reasonably current on DB2 versions and maintenance. Understand what this gives you in terms of utility capability, and then revisit existing utility jobs to see how you can benefit from the new capabilities that we have provided, both in the maintenance stream and in new versions of DB2, in terms of taking advantage of enhancements for availability, performance, and CPU reduction. Thank you.

(55:36)