Platform: DB2 UDB for z/OS
Reorganization Strategies in Depth
Peter Plevka, Software Consultant, BMC Software
Session: B7
Tuesday, May 24, 2005, 3:30 pm

With the number and size of database objects constantly growing, and availability requirements getting tougher every month, reorgs of tables and indexes need to be planned well. This presentation looks at different availability options and automation considerations, with a focus on online reorganization and reorg avoidance. And yes, you still need to reorg in DB2 V8. This session covers generic, vendor-independent topics.
Agenda

- General Reorg Considerations
- Reorg Availability Options
- Reorg Automation and Avoidance
- DB2 UDB for z/OS V8 and Reorg
- Summary
What this session is all about

- General concepts
- Reorg automation
- Avoidance of unnecessary reorgs
- Vendor neutral

I want to provide you with all the input you need to decide on a reorg automation strategy that fits your needs.

This session is NOT about
- Reorg JCL or SYSIN cards
- Error conditions and restart
- Vendor-specific features

The focus is on automation and, through it, the avoidance of unnecessary reorgs. You should understand the automation approach in general, and you should be able to apply it to the current situation in your shop. If you already automate reorgs to some degree, you should get additional input on how to take that automation further. I do not want you to sit in your office or at home and submit reorgs manually. This should be history by now, even if you have a nice and cosy maintenance window to work with.
General DBA Trends

- Amount of Data
- Performance
- SLAs
- IT as a Service
- Availability

I know and visit a lot of DB2 users. Besides other IT trends, one is THE talk of the town these days: IT as a service for the business. More and more people in management positions ask what IT delivers and how much it costs. Also, more and more businesses outsource their IT and pay for the service of their provider. These agreements contain SLAs (Service Level Agreements), for example an average response time per transaction. You as a DBA are part of IT, which delivers the service to the business. For your part, this service is directly affected by the amount of data in the database, the availability of the database, and the performance of the SQL accessing the database. Performance and availability are directly connected today. I would even say Performance = Availability, because a badly performing SQL statement still returns a result, but it directly influences the availability of the overall business process. It's not black and white anymore (black = database not available, white = database available); today a badly performing batch job may interrupt or delay an important business process.
Overall Tuning Potential

- SQL Performance
- Database Performance
- System Performance

Look at this as a plan, a package or a single SQL statement; it doesn't matter, the distribution of tuning potential is always the same. The biggest potential, 75%, lies within the application SQL, including index design. The second biggest part is the physical health of the database objects, and that's the area we discuss in this session: around 10-12% tuning potential comes from having database objects well reorganized and sized. The smallest part, 3-5%, lies in the area of DB2 system resources, like buffer pools and other storage areas, and ZPARM settings. Still, all three together have to be seen as the complete picture of tuning potential for a particular application, package, or single SQL statement.
The Main Message

- Reorg is NOT an art
- Automate once, and forget about it
- Well reorganized objects = less IO, more availability

Running reorg jobs should not be a time-consuming art for the DBA. There are much more important things to do, which cannot be automated by any kind of software. On the other hand, reorgs are still important. They help to minimize IO and increase the availability of the application.
Reorg Reasons - Indexspace

Affects IO:
- Leaf distribution: LEAFDIST, LEAFNEAR, LEAFFAR from SYSIBM.SYSINDEXPART!
- Decrease index levels

Affects availability:
- Pseudo-deleted rows
- Freespace low (PCTFREE/FREEPAGE)
- Secondary extents

Healthy indexes are much more important than healthy tablespaces: they need to be reorganized to perform well for your application SQL. Freespace is very important for indexspaces, as the index is always kept in perfect sort order by DB2. Most of all, you don't want to have leaf pages physically out of order within your indexspace dataset. The LEAFDIST value gives you an average measure of leaf pages that are not in perfect order. The LEAFNEAR and LEAFFAR values indicate how many leaf pages are either near to or far from their successive leaf pages. The logical order between leaf pages is always maintained through pointers. Whenever your application SQL needs to scan multiple leaf pages, you want LEAFDIST, LEAFNEAR and LEAFFAR to be at zero; then DB2 will do sequential prefetch, which optimizes physical IO.
Reorg Reasons - Tablespace

Affects IO:
- Poor clusterratio
  - Check if the clustering index is really used by SQL
- Freespace low
- Row relocation
  - FARINDREF/NEARINDREF from SYSIBM.SYSTABLEPART!
- Dead space
  - Deleted records

One of the main reasons to reorganize a tablespace with its associated indexes is to restore the clustering order of the table data that resides in the tablespace. But before you start reorganizing your tablespaces blindly, you should ask yourself whether the clustering index is really used by application SQL. If the EXPLAIN output shows the usage of the clustering index AND sequential prefetch is used by DB2 to retrieve the table data, THEN the clustering index is used, and a REORG TABLESPACE to restore a bad clusterratio makes full sense. But if sequential prefetch is not used, I would question the need for this particular clustering index. The FREEPAGE and PCTFREE parameters of your tablespace are also directly connected to the need for a clustering index. If your SQL does not need a clustering index to do sequential prefetch, it doesn't matter in which order your table data is stored physically, and therefore you don't need freespace on your data pages. That saves space and, most of all, IO, and it removes the need to reorg this object because of low freespace.
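Row relocation can be turned into a single number by relating the NEARINDREF and FARINDREF counters to the number of rows in the partition. The sketch below assumes the three values have already been read from SYSIBM.SYSTABLEPART; the function names and the 10% default are illustrative assumptions, not values given in this session.

```python
def indirect_ref_pct(nearindref: int, farindref: int, card: int) -> float:
    """Percentage of rows that were relocated and are reached via a pointer."""
    if card <= 0:
        return 0.0  # empty or unknown partition: nothing to evaluate
    return 100.0 * (nearindref + farindref) / card


def relocation_exceeds(nearindref: int, farindref: int, card: int,
                       threshold_pct: float = 10.0) -> bool:
    # threshold_pct is a hypothetical default; tune it for your own shop
    return indirect_ref_pct(nearindref, farindref, card) > threshold_pct
```

For example, 50 near and 150 far relocated rows in a 1,000-row partition give 20%, which would exceed the assumed 10% limit.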
Reorg Reasons - Tablespace

Affects availability:
- Secondary extents
  - ALTER PRIQTY/SECQTY before reorg
  - Minor path length increase = CPU time increase
- Resolve REORP after ALTER of partitioning index key
- Resolve AREO (advisory) for optimal performance
  - Add COL to table and index
- Purge/archive through REORG

Here are more reasons to reorganize a tablespace, but this time the focus is on increasing the availability of the application accessing the tablespace. By eliminating secondary extents you avoid the "extent failed" condition for a growing object. When pending conditions exist on tablespaces or tablespace partitions, you need to reorganize the object to avoid access failures for DML.
SHRLEVEL Overview

Indicates the access allowed during the RELOAD phase.

SHRLEVEL NONE
- UNLOAD: READ
- RELOAD, SORT, BUILD: NO ACCESS

SHRLEVEL REFERENCE (online, read!)
- UNLOAD: READ
- RELOAD, SORT, BUILD: READ to original copy of TS/IX
- FASTSWITCH: NO ACCESS

SHRLEVEL CHANGE (online, read/write!)
- UNLOAD: READ/WRITE
- RELOAD, SORT, BUILD: READ/WRITE to original copy of TS/IX
- LOGFINAL, FASTSWITCH: NO ACCESS

Before we can take a look at the detailed reorg process, we need to understand the concept of SHRLEVEL. These are the IBM standard SHRLEVEL options as of DB2 for z/OS V8. As you can see, even the SHRLEVEL CHANGE online reorg causes a more or less short outage for your applications. IBM and BMC use the CLAIM/DRAIN methodology to stop database changes before doing the FASTSWITCH. In environments with CICS thread reuse and constant transactions, the drain might not be successful, and the reorg will not finish. In such cases you will have to fall back to SHRLEVEL NONE or REFERENCE and run the reorg during a maintenance window of your application.
The Process: TS SHRLEVEL CHANGE

[Figure: online reorg with SHRLEVEL CHANGE. Phases: UNLOAD, SORT/IX, LOG APPLY, LOGFINAL (STOP or DRAIN/CLAIM), FASTSWITCH. Database changes continue against the old TS/IX datasets (I0001) while the new TS/IX datasets (J0001) are built; at FASTSWITCH the IPREFIX in the DB2 catalog (SYSTABLEPART) is updated.]

This chart shows the general process for an online reorg with SHRLEVEL CHANGE. For most of the phases, the RW availability of your applications is guaranteed. Still, the performance of the unload and log apply phases is critical, because a fast unload phase gets you to the log apply phase more quickly, which then has fewer log records to apply. This in turn gives you more flexibility in scheduling and planning your online reorgs during your workday, as you are not dependent on quiet times with only a few or no transactions going on. Since DB2 V7, IBM offers the FASTSWITCH option, which does not rename the reorganized datasets at the end. Most of the vendors also support FASTSWITCH in their reorg products.
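Why a fast unload shortens the log apply phase can be seen in a deliberately simplified model. The sketch below is not how any reorg product works internally; it assumes a constant change rate, a constant apply rate, and a hypothetical limit below which the final, drained log pass would start. All names and numbers are illustrative.

```python
def log_apply_backlog(unload_seconds, change_rate, apply_rate,
                      final_limit, max_iter=50):
    """Return (iterations, remaining_records) before the final log pass.

    change_rate and apply_rate are log records per second (assumed constant).
    """
    if apply_rate <= change_rate:
        raise ValueError("log apply can never catch up with the change rate")
    backlog = unload_seconds * change_rate   # changes made during the unload
    iterations = 0
    while backlog > final_limit and iterations < max_iter:
        seconds = backlog / apply_rate       # time to apply the current backlog
        backlog = seconds * change_rate      # changes that arrived meanwhile
        iterations += 1
    return iterations, backlog
```

With 100 changes per second and 1,000 applied records per second, a 600-second unload needs two log apply iterations to get under a limit of 1,000 records, while a 60-second unload needs only one.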
To Reorg, or not to Reorg, that's the Question

1. Collect Statistics
2. Object Selection/Exclusion
3. Threshold Analysis
4. JCL Generation
5. Execution

(Diagram annotation: "1 Processing Step".)

This is a general step-by-step approach to automating the reorganization of DB2 tablespaces and indexspaces. It is an abstract view and does not represent the workflow of a particular IBM or vendor product. The process is the result of many customer experiences and implementations of self-written or standard software products. Since the database by itself does not yet support self-reorganization of its tables and indexes, customers around the world came up with slightly different flavours of automation and avoidance, but it all came down to these 5 main steps. It is not necessary to implement all 5 steps; you can achieve a great deal of automation just by automating the threshold analysis and the JCL generation. It's the vendor's job to make this whole process easy to use, implement and maintain for the user.
First Step: Collect Statistics
Collect Statistics

- IBM RUNSTATS
  - HIST catalog tables provide data over time
  - Schedule RUNSTATS for optimizer needs
  - Includes IBM STOSPACE data: space in KB
  - RUNSTATS TABLESPACE UPDATE(ALL/SPACE)
- Real-Time Statistics (RTS) tables (since V7)
  - Very useful
  - "Since utility" values (counters since the last utility run)
  - Complex to implement via SP

Before you can submit your reorg jobs, you need to collect statistical data on your DB2 objects. This data is used to filter out those objects which need to be reorged. There are several utilities available, which collect different statistical information on DB2 objects and the underlying VSAM datasets. Once you start using Real-Time Statistics for your reorg automation, I would suggest that you use RUNSTATS only for your optimizer needs, thus supporting the quality of your SQL access paths.
Collect Statistics

- VSAM LISTCAT
  - It's the truth!
- ISV tools
  - Do not rely on RUNSTATS/STOSPACE
  - Optionally utilize RTS, LISTCAT and VTOC data
  - Enhanced stats database and reports (trending, forecasting)

Use the output of the VSAM LISTCAT command only if you need to analyze data which is not available through RUNSTATS, RTS or STOSPACE, or if that data is outdated. The main advantage of LISTCAT is that it always provides up-to-date information about the dataset. But LISTCAT output is also more difficult to work with, as it cannot be analyzed by SQL. You need to parse the output to find the relevant information.
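Parsing LISTCAT output usually means picking `NAME----value` pairs out of a printed report. The sketch below is a minimal illustration, assuming the listing has already been captured as text; the sample is abbreviated and invented for the example, and real listings contain many more fields and may repeat fields per volume (here only the first occurrence of each field is kept).

```python
import re

# A field is an upper-case name (single hyphens allowed inside the name),
# followed by a run of two or more filler hyphens and then the value.
FIELD = re.compile(r"([A-Z][A-Z/]*(?:-[A-Z/]+)*)-{2,}(\S+)")

# Abbreviated, illustrative excerpt; not taken from a real system.
SAMPLE = """\
  VOLUME
    VOLSER------------VOL001     HI-A-RBA---------1474560     EXTENT-NUMBER----------3
    DEVTYPE------X'3010200F'     HI-U-RBA----------737280     EXTENT-TYPE--------X'40'
"""


def parse_listcat(text: str) -> dict:
    """Collect NAME -> value pairs, keeping the first occurrence of each name."""
    fields = {}
    for key, value in FIELD.findall(text):
        fields.setdefault(key, value)
    return fields


def space_summary(text: str) -> dict:
    """Extent count and percentage of allocated space in use."""
    f = parse_listcat(text)
    allocated = int(f["HI-A-RBA"])   # high allocated RBA
    used = int(f["HI-U-RBA"])        # high used RBA
    return {"extents": int(f["EXTENT-NUMBER"]),
            "used_pct": round(100.0 * used / allocated, 1)}
```

Calling `space_summary(SAMPLE)` yields the extent count and the used percentage, which can then feed the same threshold checks as the catalog statistics.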
Second Step: Object Selection/Exclusion
Object Selection/Exclusion

- Use IBM LISTDEF or query the DB2 catalog individually
  - Can also be used for stats collection with RUNSTATS
- Exclude particular objects if needed
  - Very large/small objects
  - Hot tables
  - LOB tablespaces

An important part of the reorg automation process is the concept of object selection and exclusion. You might house thousands of tablespaces and indexspaces in your subsystem. Not all of them qualify for reorg automation. There might be different applications using different tables with different needs for reorg. You might need to reorg some objects on a weekly basis, and others twice a week or only once a month. The simplest way to select objects for a reorg is to use the IBM LISTDEF utility command. It provides a basic way to select and exclude objects for a certain reorg run. More sophisticated methods might be needed and can be accomplished by either your own program or ISV tools.
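If you select objects with your own program instead of LISTDEF, the include/exclude idea can be sketched in a few lines. The example below assumes object names of the form `DATABASE.TABLESPACE` and uses Python's shell-style wildcards; it mirrors the concept only and is not LISTDEF syntax. All object names are made up.

```python
from fnmatch import fnmatchcase


def select_objects(names, include, exclude=()):
    """Keep names matching any include pattern and no exclude pattern."""
    selected = []
    for name in names:
        included = any(fnmatchcase(name, p) for p in include)
        excluded = any(fnmatchcase(name, p) for p in exclude)
        if included and not excluded:
            selected.append(name)
    return selected


# Hypothetical objects: everything in PAYDB except its LOB tablespaces
candidates = ["PAYDB.TSEMP", "PAYDB.TSLOB01", "HRDB.TSDEPT"]
weekly_run = select_objects(candidates, ["PAYDB.*"], ["PAYDB.TSLOB*"])
```

Keeping one include/exclude list per schedule (weekly, twice a week, monthly) is one simple way to express the different reorg needs of different applications.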
Object Selection/Exclusion

- Consider
  - DEFINE NO objects
  - Indexes
  - RI relationships
- Grouping of objects by application
  - Collection ID
  - Package/plan name list
  - Use SYSPACKDEP or SYSPLANDEP to find objects

Additional considerations relate to certain exceptions like DEFINE NO objects, which exist as an entry in the DB2 catalog but have not yet materialized into an existing VSAM dataset (this happens at the first INSERT or LOAD). You might need to reorganize all objects related to a particular application or part of an application, without knowing which objects are used by certain packages or plans. With the information in SYSIBM.SYSPACKDEP or SYSIBM.SYSPLANDEP you can find all objects which are used by certain plans or packages.
Third Step: Threshold Analysis
Threshold Analysis

- Find out which objects qualify for reorg
  - Via SQL: SELECT COUNT(*)... WHERE EXISTS...
  - Consider ANDing and/or ORing conditions
- Physical extents and growth rate over time
- Trigger reorgs based on historical trends
  - E.g. more than 25% increase of rows since last stats
  - Reorg to avoid exceptions in the future
- Stats data helps with capacity planning too

This is the main part of the process: evaluating each selected object against a threshold for one or more statistical values. When the statistical value of the selected object crosses the threshold (above or below, depending on the particular statistic), the condition is true and the object is eligible for reorg. Certain situations call for a combination of conditions with AND/OR, and if you have statistical data over time, you can compare current with older values to qualify objects for reorgs.
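The threshold evaluation itself can be sketched as follows, assuming the statistics have already been collected into one record per object. The record layout, the reason labels and every default threshold except the 25% row growth mentioned on the slide are hypothetical; they only illustrate how AND and OR combinations and a historical comparison fit together.

```python
from dataclasses import dataclass


@dataclass
class ObjectStats:
    name: str
    cluster_ratio: float   # percent, 0-100
    extents: int           # physical extents of the dataset
    rows: int              # row count from the current stats run
    rows_prev: int         # row count from the previous stats run


def growth_pct(s: ObjectStats) -> float:
    """Row growth since the previous stats run, in percent."""
    if s.rows_prev <= 0:
        return 0.0
    return 100.0 * (s.rows - s.rows_prev) / s.rows_prev


def reorg_reasons(s: ObjectStats, min_cluster_ratio=90.0, min_rows=1000,
                  max_extents=50, max_growth_pct=25.0) -> list:
    reasons = []
    # AND: poor clustering only counts for tables large enough to matter
    if s.cluster_ratio < min_cluster_ratio and s.rows >= min_rows:
        reasons.append("clusterratio")
    if s.extents > max_extents:
        reasons.append("extents")
    if growth_pct(s) > max_growth_pct:
        reasons.append("growth")
    return reasons   # OR: any non-empty list qualifies the object
```

Returning the list of reasons instead of a plain yes/no keeps the information needed later to trigger different actions per condition type.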
Threshold Analysis

- Trigger different actions by condition type
  - SHRLEVEL REFERENCE/NONE REORG
- Test it: find threshold values that fit your needs
  - 80% or 90% clusterratio?

If you have never applied conditions against your objects, you will find it difficult to set the specific threshold values. A certain value might trigger all objects or none. This needs to be tested, and the value changed if necessary. For example, if you set the clusterratio threshold to 99% (reorg everything which has a clusterratio lower than 99), too many objects will qualify for reorg. This is not the idea of reorg avoidance. If you set it to 20%, you might miss objects which need a reorg.
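One way to test threshold values before using them is to count how many objects each candidate value would qualify. The sketch below assumes you already have the current clusterratio of every selected object; the sample values are invented.

```python
def qualifying_counts(cluster_ratios, thresholds):
    """For each threshold, count the objects whose clusterratio is below it."""
    return {t: sum(1 for r in cluster_ratios if r < t) for t in thresholds}


# Hypothetical clusterratios of eight tablespaces
ratios = [100, 98, 97, 95, 91, 88, 72, 40]
counts = qualifying_counts(ratios, (20, 80, 90, 99))
# counts == {20: 0, 80: 2, 90: 3, 99: 7}
```

In this made-up sample a threshold of 99 would reorg almost everything, 20 would reorg nothing, and 80 or 90 select a small, plausible subset.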
Step Four: JCL Generation
JCL Generation

- Avoid index reorg if tablespace needs reorg too
- Automatically resize (ALTER) objects before reorg
  - Calculate PRIQTY (SECQTY?)
- Calculate work dataset size
  - SORTWK, SYSUT1, SYSCOPY DDs etc.

Once you've got your list of eligible objects, it's time to generate some JCL and SYSIN cards for your favourite reorg utility. Fully automating this can be a complex task. For example, you want to avoid reorganizing an index which qualified for reorg in step 3 if the parent tablespace, which houses the table for which this index was created, also qualified in step 3: when you reorg a tablespace, all indexes defined on tables within that tablespace are reorged as well. Adjust the physical allocation of datasets prior to a reorg if you find out that the object is too small or that secondary extents grow over time. Depending on how you code your reorg JCL and SYSIN, you might need additional work datasets, which should be sized according to the actual size of the object being reorged.
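Dropping index reorgs that are already covered by a tablespace reorg is a simple set operation once the parent tablespace of each index is known. The sketch below assumes that mapping has been looked up beforehand (for example from the catalog); the object names are hypothetical.

```python
def drop_redundant_index_reorgs(tablespaces, index_parents):
    """Return the indexes that still need their own REORG INDEX.

    tablespaces:   names of tablespaces that qualified for reorg
    index_parents: qualified index name -> name of its parent tablespace
    """
    covered = set(tablespaces)
    return sorted(ix for ix, parent in index_parents.items()
                  if parent not in covered)


# Hypothetical result of the threshold analysis
ts_list = ["PAYDB.TSEMP"]
ix_list = {"PAY.XEMP1": "PAYDB.TSEMP",   # covered by the tablespace reorg
           "HR.XDEPT1": "HRDB.TSDEPT"}   # parent tablespace did not qualify
remaining = drop_redundant_index_reorgs(ts_list, ix_list)
```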
JCL Generation

- IBM Reorg uses mapping table and index
  - Create TS/TB/IX in step before reorg
  - Drop TS after reorg
  - Sizing of mapping table and index depends on table size (110%)
  - Run MODIFY on mapping table TS to reduce DBD size
- Assign priorities by object and exception
  - Do worst organized and largest object first

IBM Online Reorg uses mapping tables and indexes to track database changes during the reorg phase. Other reorg products do not use DB2 tables to track those changes, but rather sequential files or other methods. When you use IBM's Reorg you will need those mapping tables. The mapping table and index always look the same, regardless of the object being reorganized, but you should size them according to the size of the reorganized object. Some users have a predefined set of mapping tables and indexes. This is a valid solution, but you have to consider that you cannot run 2 reorgs of 2 different objects against the same mapping table and index at the same time. Another solution is to create the mapping table and index in the same job step as the reorg and drop those objects after the reorg has completed, all in the same SYSIN DD. If you have to deal with many objects to reorg, and at the same time have to finish those reorgs within a particular time frame or window, consider prioritizing the work.
JCL Generation

- Partition-level reorgs
- ISPF skeleton technology useful
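The skeleton idea is plain template substitution: a fixed text with variables that are filled in per object or partition. The sketch below imitates it with Python's `string.Template` rather than ISPF file tailoring, and generates only a bare REORG control statement per partition. The skeleton text and object names are hypothetical, and a real job needs further keywords and DD statements.

```python
from string import Template

# Hypothetical one-line skeleton for a partition-level reorg statement
SYSIN_SKELETON = Template(
    "REORG TABLESPACE $db.$ts PART $part SHRLEVEL $shrlevel")


def build_sysin(db, ts, parts, shrlevel="REFERENCE"):
    """Generate one control statement per partition that needs a reorg."""
    return [SYSIN_SKELETON.substitute(db=db, ts=ts, part=p, shrlevel=shrlevel)
            for p in parts]


# Only partitions 3 and 7 exceeded their thresholds in this made-up example
cards = build_sysin("PAYDB", "TSEMP", [3, 7])
```

Keeping the skeleton separate from the selection logic means a change to the utility options touches one template instead of every generated job.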
Step Five: Execution
Execution

- Talk to your job scheduling person
- Know your application data usage or use a tool
  - Read vs. update
  - Frequent delete jobs
  - LOAD RESUME: clusterratio!
  - Day vs. night vs. 24/7

If you want to automate the execution of your reorgs, you should first talk to your job scheduling person. You should provide information on how often you want reorgs done in general. I would start on a weekly basis: steps 1-5 should run once a week in sequence. You don't want to run step 1 on Wednesday and the rest of the steps on Sunday, because the stats would be old and you might reorg the wrong objects. You can delay steps 4 and 5 of course, but plan this according to the input of your job scheduling person. The more you know about your applications and how they use DB2 data, the better, faster and less intrusively your reorgs will execute.
Execution

- Do REORGs during off-peak time
  - Even SHRLEVEL CHANGE
- Use regular maintenance window for hot tables
- Consider workload balancing
  - Submit largest and worst organized object first
  - Use stats about past executions to schedule reorgs better
- What if reorg fails?
  - Online Reorg relieves

If you have a really hot table to deal with (permanent transactions changing data in a CICS thread reuse environment), you might even run into trouble getting an online reorg to finish. For those objects, I would suggest using the regular maintenance window available for system maintenance, shutdowns etc., even if this is only once a month. Still better than nothing. If you need to execute many reorgs at once, you can enhance your automation to set up a balancing of objects.
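A simple way to balance many reorgs over several parallel job streams is to take the largest object first and always give it to the stream with the least work so far. The sketch below shows this greedy approach; the size figures stand in for whatever cost estimate you have (object size, or elapsed time from past executions), and all names are hypothetical.

```python
import heapq


def balance(objects, streams):
    """Distribute (name, size) pairs over job streams, largest first.

    Returns (plan, totals): the object names per stream and the summed
    size per stream.
    """
    heap = [(0, i) for i in range(streams)]   # (work so far, stream index)
    heapq.heapify(heap)
    plan = [[] for _ in range(streams)]
    totals = [0] * streams
    for name, size in sorted(objects, key=lambda o: o[1], reverse=True):
        total, i = heapq.heappop(heap)        # least loaded stream
        plan[i].append(name)
        totals[i] = total + size
        heapq.heappush(heap, (totals[i], i))
    return plan, totals


# Hypothetical objects with sizes in GB, spread over two job streams
plan, totals = balance(
    [("A", 50), ("B", 30), ("C", 20), ("D", 20), ("E", 10)], 2)
# plan == [["A", "D"], ["B", "C", "E"]], totals == [70, 60]
```

This greedy rule does not guarantee the best possible split, but it is easy to implement and keeps the streams reasonably even.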
Online Reorg Execution

- Synchronization with transactions is the issue!
- 2 choices for the LOGFINAL phase:
  - Transactions must complete, OR
  - REORG must complete
- Acceptance that there may be some application failures
- Use the DRAIN_WAIT int keyword (for IBM Reorg)
  - Tells REORG how long to wait for transactions that do not commit
  - RETRY keyword multiplies DRAIN_WAIT

Most online reorg products on the market today need to synchronize the reorganized tablespace or indexspace (basically DRAIN all updaters) in order to switch over from the old, not reorganized dataset to the reorganized one. This needs to happen for all datasets which participate in the reorg. For a tablespace reorg this also includes the indexspace datasets for all indexes defined on all tables within the tablespace being reorganized.
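How RETRY multiplies DRAIN_WAIT can be checked with simple arithmetic. The sketch below assumes one initial drain attempt plus RETRY further attempts, with an optional pause between attempts (IBM's REORG has a RETRY_DELAY keyword for that). Treat the formula as a reading of these keywords and verify it against the utility guide for your release.

```python
def worst_case_drain_seconds(drain_wait: int, retry: int,
                             retry_delay: int = 0) -> int:
    """Upper bound on the time REORG spends trying to drain, in seconds.

    Assumes (retry + 1) drain attempts of drain_wait seconds each,
    separated by retry_delay seconds.
    """
    return drain_wait * (retry + 1) + retry_delay * retry
```

For example, DRAIN_WAIT 30 with RETRY 4 and a 60-second delay allows up to 5 x 30 + 4 x 60 = 390 seconds before the reorg gives up on the drain.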
DB2 UDB for z/OS V8 and Reorg

- Type of enhancement depends on vendor
- IBM related
  - DPSI: no BUILD2 phase for NPIs for IBM Online Reorg
  - SCOPE PENDING keyword for REORP/AREO status
Summary

- Reorg is not an art
- Use online reorgs
- Automate as much as possible
- Get it off your desk
Reorganization Strategies in Depth
Session: B7
Peter Plevka, BMC Software
Peter_plevka@bmc.com