DB2 for z/OS Best Practices Recommendations from DB2 Health Check Studies: Operations


DB2 for z/OS Best Practices: Recommendations from DB2 Health Check Studies, Operations. John Campbell & Florence Dubois, DB2 for z/OS User Technology. 2012 IBM Corporation. Transcript of webcast.

Slide 1 (00:00) Hello, this is John Campbell, a distinguished engineer from DB2 for z/OS development. Today's web lecture is from the series on DB2 for z/OS best practices. Specifically, today's web lecture is part 4, called Operations, from the series on recommendations from DB2 health check studies.

Slide 2 (00:24) On slide 2 is a disclaimer and a list of trademarks related to this presentation. Now let's turn to slide 3.

Slide 3 (00:34) On slide 3 are the overall objectives of this series on DB2 best practices, which is intended to introduce and discuss the issues most commonly found, share the experience from customer DB2 health check studies, share the experience of live customer production incidents, provide proven global recommended best practices, and finally encourage proactive behavior as opposed to regret analysis. Now let's turn to slide 4.

Slide 4 (01:05) On slide 4 is the first part of the agenda, which describes part 1 and part 2, which are separate modules in this series of web lectures. Now let's turn to slide 5.

Slide 5 (01:17) On slide 5 of the agenda are parts 3 and 4. Today's web lecture is about

operations. We're going to cover three specific topics: preventative software maintenance, DB2 virtual and real storage management, and finally performance- and exception-based monitoring. Now let's turn to slide 6.

Slide 6 (01:41) Here on slide 6, I'd like to introduce and discuss the most common problems associated with preventative software maintenance. First of all, too many customers are very back level on preventative service. So much so that they experience problems that could have been avoided if they had applied the preventative service (in other words, the missing HIPER or resolved PE). Some of these problems were already experienced by other customers, and a PTF was readily available. In some cases, the very same customer may already have experienced that very same problem but have taken no action and continue to run into the same problem again and again. Many customers apply no HIPERs or PE fixes since the last preventative service upgrade. I've also experienced a number of high-profile production incidents that could have been readily avoided by applying a missing HIPER. Too many organizations have a fix-on-failure culture. The result is that they don't apply any preventative service until they actually experience a problem. The problem with this fix-on-failure culture is that when you do run into some problem there may be a very long prerequisite PTF chain to be applied in order to get that corrective fix applied. Many customers have deployed z/OS Parallel Sysplex and DB2 data sharing technologies to avoid planned outages and to remove dependencies on change windows. However, there are way too many customers here who are not getting the full benefits of DB2 data sharing technology, for example, because they have application affinities where an application becomes a single point of failure. Or there may be other single points of failure around the data sharing group. But being back level on preventative service leads to a delay in exploiting new availability functions, and it also means a delay in applying DB2 serviceability enhancements to prevent outages. Another trend with large organizations and with outsourcers is that they have a one-size-fits-all approach to maintenance, and the same maintenance is being applied across different application environments. The problem with this is that no two application environments are typically the same. And with a one-size-fits-all maintenance approach that's not complemented by a

one-size-fits-all testing approach, this can lead to escalating maintenance costs whereby the organization tries to roll out some maintenance with some success, but then runs into failures in particular application environments, and the maintenance has to be upgraded. Now let's turn to slide 7.

Slide 7 (04:21) So on slide 7, let me answer the question: why install preventative software maintenance? Achieving the highest level of availability depends on having an adaptive preventative maintenance process, basically learning from experience. There is no one-size-fits-all. So, for example, if an organization is running into defects where they are the first customer in the world to be experiencing that defect, and this is a recurring theme, this indicates that the customer is too aggressive about applying service. On the other hand, if a customer runs into problems where the fixes were readily available and not applied, this means that the customer's maintenance process is too conservative and they need to be more aggressive about applying preventative service. Secondly, applying preventative maintenance can and will avoid outages. From our own analysis in the DB2 for z/OS lab, up to twenty percent of multi-system outages could have been avoided by regularly installing critical PTFs, in other words, HIPERs and PE fixes. And finally, executing a preventative maintenance process requires a deep understanding of the trade-offs in achieving high systems availability. Now let's turn to slide 8.

Slide 8 (05:40) On slide 8 is a chart that tries to illustrate the trade-offs in maintenance. If you look at the graph, on the Y axis is the percentage chance, and on the X axis are the months: one, two, three, etcetera. When it shows three months here, this means three months after a PTF becomes available. What the graph shows, first of all looking at the yellow line, is that as time goes on and the maintenance falls further and further back level, there's an increased chance that a customer will run into old bugs, in other words, a bug where the fix was readily available. On the other hand, when you look at the blue line on this chart, this indicates the chance of hitting a PTF in error. So, I'm proposing here that the sweet spot is probably three or four months after PTF availability, which is a reasonable balance between avoiding problems where the fix was readily available and at the same time avoiding the excessive

chance of running into PEs. So what a customer must do here is balance severity versus risk. In other words, balance the risk of problems encountered versus problems avoided, factoring in the potential for PTFs in error, and also factoring in application workload type. Some workloads, such as traditional legacy workloads, are much more stable and use a very limited amount of DB2 function, or at least function that's well stabilized inside DB2. On the other hand, there may be new application workload types using very new features of DB2 where you need to be much more aggressive about applying preventative service. And also, for any customer, you need to factor in the available windows for change control to install the actual service. But most importantly, every customer needs an adaptive service strategy that is adjusted based on prevailing experience, looking at experience over the previous 12 to 18 months. You also need to factor in the organization's attitude toward risk, toward changing the environment, and toward exploiting new DB2 releases, associated products, and new feature function. For example, if the customer is very aggressive about using the latest features and functions of new DB2 releases, they need to be more aggressive about applying preventative service, applying it more often and not staying too far behind. On the other hand, if the customer is very conservative in terms of the usage of new features and functions and slow to adopt new DB2 releases, they can afford to apply preventative service less often and can afford to stay further behind. The last part of this equation is to factor in what's happening in terms of DB2 product and service plans. Now let's turn to slide 9.

Slide 9 (08:35) On slide 9, I want to talk about Consolidated Service Test. The goal of Consolidated Service Test is to enhance the way that IBM service for the z/OS software products is tested and delivered, and the intent is to provide a single, coordinated service recommendation. CST testing provides cross-product testing for the participating products, like DB2, CICS, z/OS, and the other products in the z/OS stack. So this is testing over and above what the respective development groups do. The list of products included in CST is continually expanding. The testing performed in CST, as I said, is in addition to that performed in the existing testing programs and does not replace any current testing performed by the individual program products. The end goal is to standardize on the maintenance recommendation for the z/OS software stack platform. The results of the CST are published quarterly on the

CST website. On slide 9 is the web page that will take you to the latest available quarterly report. IBM also publishes a monthly addendum with an update on tested HIPERs and PE fixes. After service has passed the CST testing, it is then packaged and marked with what's called RSU. RSU stands for Recommended Service Upgrade. And this is then made available for customers to order online. Now let's turn to slide 10.

Slide 10 (10:14) On slide 10 is an example (pictured here) that compares the CST RSU process versus what I would call the PUT calendar, because the PUT calendar is an alternative way for a customer to pull and apply maintenance. Let's go to the top half of the chart, which talks about the PUT calendar. We have the calendar going January, February, March, April, and so on. For each month there is a corresponding PUT from the previous month. So the PUT that's available in January 2012 is, in fact, PUT1112, which is from December 2011. When you look at PUT1112, which is available in January 2012, you see that the base code, in terms of maintenance, is December 2011, and the HIPERs and PEs are also current up to December 2011. And as each month goes forward on the PUT calendar, you see that both the base and the HIPERs and PEs come forward by one month. So basically, when you have a PUT calendar and you order a PUT tape, once you have the ordered tape in your hands, you're one month behind on the base, the HIPERs, and the PEs. The bottom half of this chart shows the CST RSU calendar. In the bottom left-hand corner, we have the CST testing in fourth quarter 2011, and ultimately that results in RSU1112. When you look inside the package, you've got all service through the end of September 2011 that was not already marked RSU, followed by HIPERs and PEs through the end of November 2011. That RSU (RSU1112) is orderable in January, so what you get in January is the base at September 2011 and HIPERs and PEs through to November 2011. The first thing that is obvious about the RSU process is that your base is further back in time to protect you against PEs related to non-HIPER maintenance, and you've got HIPERs and PEs that have been through extra testing through to the end of November. So this gives you more protection in terms of maintenance being stable. Now, as the RSU goes forward month by month, and we look at RSU1201, that's January 2012, that's orderable in February 2012. The base is still at September 2011, but the HIPERs and PEs have moved forward one extra month through

December 2011. Finally, every time a quarterly RSU becomes due, that's the point at which the base, in terms of non-HIPER maintenance, moves forward. So, when we get to RSU1203, which is orderable in April, the base moves forward from September 2011 to December 2011, and the HIPERs and PEs have moved forward to February 2012. Now let's turn to slide 11.

Slide 11 (13:25) On slide 11, I'd like to introduce and discuss enhanced HOLDDATA. This is a critical ingredient of a preventative service strategy. At the top of chart 11 is a link to information about what enhanced HOLDDATA is and how to use it. As I said, it's a key element of the CST RSU best practices process. The goal is to simplify service management. It can and should be used to identify missing PE fixes and HIPER PTFs using the SMP/E REPORT ERRSYSMODS command. What you're able to do is produce a summary report that includes the fixing PTF number when the PTF is available. It also includes the HIPER reason flags, such as DAL for data loss, FUL for major function loss, or PRF for performance. It identifies whether any fixing PTFs are in RECEIVE status, in other words, available for installation, and whether the chain of PTFs to fix the error has any outstanding PEs. The enhanced HOLDDATA is updated by IBM on a daily basis, and a single set of HOLDDATA is both cumulative and complete. Up to three years of history is available. Now let's turn to slide 12.

Slide 12 (14:52) Another thing that we've done in DB2 development is to exploit what's called fix category HOLDDATA, often referred to as FIXCAT for short. Again, at the top of slide 12 there is a web page to click on with information that describes what fix category HOLDDATA is and the categories that are used by DB2. The advantage of fix categories is that they can be used to identify a group of fixes that are required to support a particular hardware device or to provide a particular software function. FIXCAT is supplied in the form of SMP/E FIXCAT HOLDDATA statements. On the second half of this chart is a current list of FIXCAT HOLDs for the DB2 for z/OS product. I'll pick a few of them out here. For example, DB2STGLK flags fixes for DB2 storage leak problems, and DB2INCORR flags fixes for DB2 SQL incorrect output problems. This is a way of filtering through the many HIPERs to pull out the ones that really matter. I've picked out the ones that I've mentioned already. Things like storage leaks, storage overlays, and incorrect SQL output are obviously very critical problems and are of paramount importance. Therefore, it's essential that the fixes for these sorts of problems are put on as soon as possible.
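As a minimal sketch of the check just described (the CSI data set name and target zone name are assumptions; substitute your own DB2 global CSI and zone), an SMP/E batch job running REPORT ERRSYSMODS against the received enhanced HOLDDATA might look like this:

```
//HIPERCHK JOB (ACCT),'HOLDDATA CHECK',CLASS=A,MSGCLASS=X
//* Sketch only: DSN and zone names below are illustrative.
//* Lists missing HIPER and PE fixes for the DB2 target zone,
//* using the enhanced HOLDDATA already RECEIVEd into GLOBAL.
//REPORT   EXEC PGM=GIMSMP
//SMPCSI   DD DISP=SHR,DSN=DB2.V10.GLOBAL.CSI
//SMPOUT   DD SYSOUT=*
//SMPRPT   DD SYSOUT=*
//SMPCNTL  DD *
  SET BOUNDARY(GLOBAL).
  REPORT ERRSYSMODS ZONES(DB2TGT).
/*
```

Any HIPER or PE exposure the report flags can then be handled according to the three categories described below.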

Now let's turn to slide 13.

Slide 13 (16:21) On slide 13, I want to provide some recommendations for preventative software maintenance. One very important point to touch on to start with is the change management process in an individual customer environment. In too many customer environments the change management process is very risk averse and strangles all types of change, so it actually limits the amount of change going on. I'd like to encourage customers to also assess the impact of no change. Clearly there's a risk in making a change, but there's also risk in not making a change, and it's important to get the balance correct. The basis for the recommendation is to apply preventative maintenance every three months and to use the RSU calendar instead of the PUT calendar so that you're less aggressive in applying non-HIPER maintenance. The sample strategy is based on two major and two minor packages per calendar year. The major package is in fact a refresh of the base every six months, based on the latest available quarterly RSU. Each base upgrade based on this quarterly RSU should be RSU-only service; you should specify SOURCEID=RSU* in the supplied APPLY and ACCEPT jobs. In between those two major packages, the idea is to have what's called a mini-package or minor package: in between two successive major software upgrades, you roll up all the missing HIPERs and PE fixes that are available, package them into a mini-package, and then roll that into production. Having said this, this strategy is based on a conservative customer who is not aggressive about using new feature function. On the other hand, if you're a customer who is very aggressive about migrating to new releases, in other words an early adopter, or aggressive about using new functions, you need to be more aggressive than the strategy I've just outlined.
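As an illustration of the RSU-only base refresh just described (CSI and zone names are again assumptions), the APPLY step might select service by RSU SOURCEID like this; an ACCEPT with the same SOURCEID would follow once the level has proven stable:

```
//RSUAPPLY JOB (ACCT),'RSU BASE REFRESH',CLASS=A,MSGCLASS=X
//* Sketch only: run with CHECK first, review the reports,
//* then rerun without CHECK. DSN and zone names are illustrative.
//APPLY    EXEC PGM=GIMSMP
//SMPCSI   DD DISP=SHR,DSN=DB2.V10.GLOBAL.CSI
//SMPOUT   DD SYSOUT=*
//SMPRPT   DD SYSOUT=*
//SMPCNTL  DD *
  SET BOUNDARY(DB2TGT).
  APPLY CHECK
        SOURCEID(RSU*)
        GROUPEXTEND
        BYPASS(HOLDSYSTEM).
/*
```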

For all customers, it's very important to review the enhanced HOLDDATA on a weekly basis. Customers should pull the enhanced HOLDDATA, bring all the missing fixes onsite, and then analyze the missing HIPERs and PEs. Typically, they fall into three categories. First, there are those HIPERs and PEs that don't apply to the usage of DB2 at a particular installation. For example, there may be a HIPER that is related to query parallelism, and you don't use query parallelism in your installation, in which case it can be ignored. The second category is related to something like a storage overlay, a storage leak, a bad recovery, or DB2 crashes. These types of HIPERs are critical and need to be expedited into production after one to two weeks in test to actually prove the fix. The third category is HIPERs that do apply but where the exposure is very small and the impact is very small. Therefore, you could defer the application of that HIPER or PE fix until the next major or minor upgrade. Now let's turn to slide 14.

Slide 14 (19:38) Continuing with my recommendations about preventative software maintenance, it's important for customers who demand the highest levels of availability to develop the processes, procedures, and technical changes needed to implement rolling maintenance outside of heavily constrained change windows. Basically, they need to exploit the z/OS Parallel Sysplex and DB2 data sharing technologies. A few recommendations are to have a separate SDSNLOAD or SDSNLOAD alias per DB2 member, and a separate ICF user catalog alias per DB2 member. The benefits of rolling maintenance are: only one DB2 member at a time is stopped; the DB2 data is continuously available via the N minus one remaining members; and fallback to the prior level is fully supported if necessary. However, I want to make one very important point here: if you have applications with affinities that run in only one place, then clearly it's nearly impossible to implement rolling maintenance; if you do, you're going to incur application service outages. The second point on chart 14 is to aim for company-wide certification of new releases of maintenance. This particularly applies to many worldwide organizations that have many different application environments, and also to outsourcing companies, again where they have many different application environments and a lot of diversity across those different application development environments. The whole idea is to complement a company-wide concept of building a maintenance package with company-wide certification through testing of the new maintenance. So if you were to implement a shared test environment with

collaborative use by systems programming staff, DBAs, and application teams as a way of proving new maintenance, you can think of it a bit like a company-wide IVT to validate the software. This provides an additional insurance policy before starting to roll out new DB2 maintenance packages across multiple application environments. Now let's turn to slide 15.

Slide 15 (21:48) Continuing on this theme of large organizations with multiple different application environments, we also think it's important to have separated DB2 maintenance environments. This means having a single master SMP/E environment as the base and then one additional SMP/E environment to match each separate DB2 application environment. The certification testing that I talked about on slide 14 will be based on the master SMP/E environment before starting to promote the new maintenance to the various test and production environments. So what are the benefits of having separated DB2 maintenance environments, each with a separate SMP/E environment to support it? First of all, when you have multiple application environments and there's a lot of diversity in terms of SQL functionality and feature and function usage in DB2, there's no one-size-fits-all in terms of maintenance. It enables you to be more aggressive in applying maintenance to some particular environments while at the same time being conservative on other application environments in order to protect their stability. This gives you a very flexible approach that meets application development requirements to be aggressive about using new feature function or adopting new DB2 releases. At the same time, it supports the migration to new DB2 releases well, as the DB2 application environments can be treated independently. Now let's turn to slide 16.

Slide 16 (23:23) So on slide 16, I want to switch to the second major topic, which is DB2 virtual and real storage management. In a nutshell, the common problem here is that the CTHREAD and MAXDBAT system parameters are set too high. CTHREAD describes the total number of allied threads allowed into DB2, and MAXDBAT describes the total number of database access threads used by distributed applications that can run in DB2. If these two system parameters, in combination, are set too high, this inflates the size of the storage cushion. This is a storage cushion in the DBM1 address space below the bar, and it represents the storage cushion for 31-bit storage. If the storage cushion is inflated because

CTHREAD and MAXDBAT are set too high, full system storage contraction will occur more often, and this has two disadvantages. First, it will cause CPU burn in the DBM1 address space in TCB mode; second, it will put stress on the system because DB2 acquires the LPVT latch. The second aspect of this is the desire to have no denial of service. This applies to both version 9 and version 10, although what drives the problems is very different. First of all, you know that an EDM 31-bit pool full condition is pretty serious. Even in version 9 we can still have a 31-bit pool full condition, and when that condition is reached, applications will get a SQLCODE -904, which means the application failed, and it amounts to a denial of service. On the other hand, if all the work floods in because these parameters are set too high, then we will drive DBM1 full system storage contraction, and this will potentially degrade performance. Also note that if full system storage contraction cannot free up enough storage, then the DBM1 address space will go storage critical. This means that individual DB2 threads that are not marked as must-complete will end up abending with reason codes beginning with 00E200. Ultimately, DB2 can actually crash out, and that can result in a loss of business application services if there are affinities in those applications for a particular member, or even in non-data sharing. Particularly in version 10, over-commitment can lead to excessive paging to auxiliary storage. If both the available real storage and the auxiliary storage are overcommitted, then the LPAR can crash out, causing DB2 to terminate along with any other subsystems running on that particular LPAR. All of this can be aggravated by: a lack of workload balance across CICS and WAS versus DB2; workload failover conditions (when subsystems or LPARs fail); abnormal slowdowns due to, for example, application locking considerations, degraded I/O performance, etcetera; or people moving application workloads from one particular LPAR or member to another. Now let's turn to slide 17.

Slide 17 (26:40) Continuing on the theme of problems, let's talk about shortage of real storage. This was always important prior to version 10, but certainly with the advent of version 10, it needs to be re-emphasized. Shortage of real storage can lead to excessive paging to auxiliary storage and severe performance problems. The DB2 environment should be designed and provisioned so that there is no paging to auxiliary storage. If you think about it from a

buffer pool perspective, if you don't have enough real storage to back the buffer pool, then when DB2 has to steal a page using the LRU (least recently used) algorithm, we don't want to find that the LRU buffer is paged out to auxiliary storage, because if that happens, you'll get two I/Os: the MVS page-in I/O from auxiliary storage that you didn't want, followed by the real I/O, which is to bring the page out of your application page set into the DB2 buffer pool. Ultimately, if things get out of control and you over-commit all the available real memory and all of the available auxiliary storage, this can take the LPAR out. Once all of the auxiliary storage is consumed, the LPAR will actually go into a wait state. Shortage of real storage can also lead to long dump processing times and cause major disruptions not just to that LPAR, but across the data sharing group. A dump should complete in a small number of seconds (less than ten seconds) to make sure that no performance problems ensue on the LPAR and that we don't get any sympathy sickness around the data sharing group. Once paging begins, it's possible to have dump processing take tens of seconds, even a few minutes, with a high risk of system-wide or even sysplex-wide slowdowns. As I've said to several customers, you could be running just one dump away from disaster. Ultimately, if you don't have enough real storage to provision your system for both normal operation and abnormal events like dump processing, this can lead to wasted opportunities for CPU reduction. Today, cost reduction is of paramount importance to almost every customer that I know. This leads to a reluctance to use bigger or more buffer pools, a reluctance to use buffer pool long-term page fix, which was introduced in version 8, and also an inability to use the many performance opportunities opened up in DB2 version 10, which require additional real storage. Now let's turn to slide 18.

Slide 18 (29:12) Now I'll provide some recommendations. First of all, IFCID 225 provides comprehensive information about DB2 virtual storage usage and, now in V10, real storage usage. You should collect IFCID 225 data from DB2 start all the way through DB2 shutdown. These days, IFCID 225 is collected as part of statistics trace class 1 and should be written out to SMF. Prior to version 10, OMEGAMON PE (OMPE) provided a feature called the SPREADSHEETDD subcommand, which was available in the batch processor to post-process the SMF data and write it out as a comma-delimited file that could easily be loaded into something like Microsoft Excel. However, the SPREADSHEETDD support in OMPE has not been enhanced to support DB2

version 10. In the meantime, what we've done in DB2 development is to enhance the sample REXX programs MEMU2 and MEMUSAGE to be able to pull the data. There's a separate version of MEMU2 to support version 9 and a separate one to support version 10. Both versions are available in the DB2 for z/OS Exchange community on the IBM My developerWorks website. There's a web page on slide 18 that you can click on which takes you to the DB2 for z/OS technical exchange. Generally, what you need to do here is plan on keeping a basic storage cushion free. This is additional free storage over and above the storage cushion, to avoid the possibility of driving full system storage contraction. Generally speaking, this basic storage cushion needs to be about 100 megabytes to allow for some growth and some margin of error. Having done that, the idea is to project how many active threads can be supported, and then set CTHREAD and MAXDBAT to realistic values that are in line with the values you projected. The final piece of the jigsaw here is to balance the design across the CICS AORs connecting to DB2 with the number of threads that can be supported by DB2. So you need both a bottom-up and a top-down approach to make sure that the DB2 subsystem can support the defined number of threads without getting into trouble, and that the number of connections across all the CICS AORs connecting to that DB2 subsystem can be supported. Now let's turn to slide 19.

Slide 19 (31:51) Starting with version 9, DB2 introduced an additional function, a DB2 internal monitor, that issues messages when DB2 starts getting short on DBM1 31-bit storage. This DB2 internal monitor runs inside the master address space and automatically issues console message DSNV508I when DBM1 31-bit virtual storage crosses particular thresholds. The message is generated when usage goes past a threshold and also when it comes back down below that threshold. Here on the chart, I talk about increasing or decreasing with respect to the thresholds. When the actual storage gets depleted and goes over the threshold, the DSNV508I message is generated immediately. On the other hand, when the storage usage goes back below the threshold, there's a three-minute delay before the message is issued. It's most important to have the PTF for APAR PM38435 applied. The reason is that when this internal monitor was initially implemented, it did not take into account the storage cushion when applying the percentages. Here's an example based on the picture: here you see a DSNV508I message, and it's telling you

that the storage notification indicates that 77 percent is consumed, and 76 percent is consumed by DB2, leaving 352 MB left. Now, the first threshold that is used by DB2 to generate the DSNV508I message is actually 88 percent. That 88 percent is not 88 percent of the 31-bit region; it's 88 percent of the 31-bit region less the storage cushion. So this message, issued at the 88 percent threshold, represents 77 percent consumed of the total region size. The message also identifies the agents that consume the most storage. As a customer, you can get the status at any time by using the DISPLAY THREAD(*) TYPE(SYSTEM) command. You have an example of the message that's generated at the bottom of the chart. It tells you how much storage is used in the whole region, how many times threads were delayed holding a latch or were given a boost, and it also indicates the health of the system. If the DBM1 address space is short of available virtual storage, then the value of the health indicator will go below 100 percent. This will encourage sysplex workload balancing for DDF workloads to move work away from that member. Now let's turn to slide 20.

Slide 20 (34:49) Further recommendations here. One of the things that I've already given you is that it is absolutely important, of paramount importance, to configure sufficient real storage to get the best performance and to quickly capture diagnostics. It's not a good practice to configure the available real storage based on normal operating conditions only, and then rely on DASD paging space to absorb the peaks. You need to configure enough real memory to deal with the normal operating conditions plus the peaks and be able to take a dump. So, to add a bit more definition to this, you need to provision sufficient real storage to cover both the normal DB2 working set size and the MAXSPACE requirement. MAXSPACE typically needs to be up to eight or nine gigabytes in version 9 to get a full dump, and for version 10 this value is somewhere between twelve and sixteen gigabytes. If you undersize MAXSPACE, this will result in partial dumps, and this will seriously compromise problem determination and problem source identification. In order to protect the availability of production services on the same LPAR and on other LPARs in the data sharing group, dumps should be taken very quickly, i.e., in less than ten seconds, almost without anybody noticing, and with little or no disruption to the subject LPAR and to the rest of the Parallel Sysplex. You may also want to consider automation to kill dumps taking longer than ten seconds.
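As a small illustration of the MAXSPACE point (16000M is only an example value in the twelve-to-sixteen-gigabyte range quoted above for version 10; size it for your own dump requirements), the current setting can be displayed and raised with standard MVS console commands:

```
D D,O
CD SET,SDUMP,MAXSPACE=16000M
```

The first command displays the SVC dump options currently in effect, including MAXSPACE; the second raises MAXSPACE for the life of the IPL. To make the value permanent, the same CD SET command is typically added to a system command parmlib member so that it is issued automatically at each IPL.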

Now let's turn to slide 21.

Slide 21 (36:27) Thread storage contraction helps protect the availability of the system, and it applies to the 31-bit virtual storage in the DBM1 address space. So there's a strong recommendation, particularly in version 9, but this also applies to version 10, to run with CONTSTOR=YES. With YES, DB2 will compress out part of the agent local non-system storage based on the number of commits and the thread size. The overhead is fairly low: it's a maximum of one compress for every five commits, so it's very cheap to implement. What I'd like to point out is that thread storage contraction is ineffective for long-running persistent threads with use of RELEASE(DEALLOCATE). Once you've actually migrated to DB2 version 10 and rebound your static SQL packages and plans, you can turn off CONTSTOR thread storage contraction, so you actually save some CPU. In DB2 version 10, there's a new piece of function called the real storage monitor, which enables DISCARD mode to free up unused frames back to the operating system. This has the effect of contracting storage and protecting the system against excessive paging and use of auxiliary storage. This was introduced in DB2 APAR PM24723, and it has a prerequisite z/OS APAR (OA…). It's controlled by a new system parameter, or zparm, called REALSTORAGE_MANAGEMENT, and it has three particular values. The default is AUTO, and AUTO is strongly recommended. With AUTO, DB2 detects if excessive paging is imminent and tries to reduce the frame count to avoid system paging. It basically toggles between on and off, a bit like the thermostat on your air conditioner or central heating system. It will cut in when the system is in trouble and go into DISCARD mode to free up frames to the operating system, and it will go back into off mode once the condition is relieved. One slight problem that we ran into was that some customers reported a CPU increase when they were running multiple DB2 version 10 subsystems on the same LPAR. This is because the underlying RSM service that DB2 is using takes a CPU spin lock. So there's a new z/OS APAR, OA37821, that provides a new option on the RSM service that is used by DB2, and DB2 APAR PM49816 uses that new option to avoid this CPU burn.
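For reference, a minimal sketch of how the two parameters just discussed might appear among the DSN6SPRM keywords in the DSNTIJUZ subsystem parameter job (an excerpt only; the exact column and continuation formatting must follow your existing member, and the keyword spellings shown are based on the names given above):

```
*  Excerpt of DSN6SPRM keywords (illustrative only)
         CONTSTOR=YES,
         REALSTORAGE_MANAGEMENT=AUTO,
```

After reassembling and relinking the parameter module, the new values take effect at the next restart of DB2, or via the -SET SYSPARM command where the parameter is online changeable.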

Now let's turn to slide 22.

Slide 22 (39:21) Real storage needs to be monitored as much as virtual storage. Important subsystems like DB2 should not be paging to auxiliary storage in a production environment. The emphasis is on production. It's quite common in a pre-production development environment to over-commit available resources, but in a real production environment, where performance and availability are of paramount importance, we need to avoid paging to auxiliary storage. The recommendations here are to keep the page-in rates near zero, actively monitoring using RMF Monitor III to basically avoid paging, and to monitor the DB2 page-ins for reads and writes and also make sure the output log buffer is not being paged. As previously discussed, you want to collect IFCID 225 data from DB2 start time through to DB2 shutdown time. Any high use of auxiliary storage needs to be investigated. In other words, what time did it happen, and what triggering events, either inside or, as is often the case, outside of DB2, were driving DB2 to be pushed out to auxiliary storage? At the bottom of slide 22 is an extract from an OMPE report for DB2 version 9, which tells you how much real storage is being used and how much auxiliary storage is being used. In this example, there is no auxiliary storage being used by DB2. Now let's turn to slide 23.

Slide 23 (40:50) On slide 23, we have an OMPE batch report for version 10. As you can see straight away, we have much more comprehensive reporting of real and auxiliary storage starting with version 10. At the top left-hand side, we have the real and auxiliary storage for the DBM1 address space; on the top right-hand side, we have the real and auxiliary storage for the distributed address space, DDF; and the middle section shows the real and auxiliary storage for the shared private storage. We also give it to you at the LPAR level, and finally, at the bottom left-hand side, we also give it for the common storage. In all of these sections of the report, we report the 31-bit storage below the bar and also the 64-bit storage above the bar, broken out into both real and auxiliary for each section. Now let's turn to slide 24.

Slide 24 (41:49) How can we limit real storage usage? In version 9, we had a hidden system parameter, or zparm, called SPRMRSMX. I affectionately refer to this as the real storage kill switch. It was originally delivered in APAR PK…. It's been a secret and was not widely broadcast, and only a handful of customers were using it. Why was it introduced? It was introduced to prevent a runaway DB2 subsystem from taking the LPAR down and affecting other DB2 subsystems running on the LPAR and other MVS subsystems. It is only applicable to a customer who runs

multiple DB2 subsystems from the same data sharing group, or even different data sharing groups, on the same LPAR. The aim here is to prevent multiple outages caused by a single DB2 subsystem. In other words, you're prepared to sacrifice one DB2 subsystem in order to protect the availability of the other DB2 subsystems on that same LPAR. The general recommendation is to set the SPRMRSMX value to 1.5 to 2 times the normal DB2 subsystem usage. You need to be careful here, because if you set the value too small, there will be too many false positives where you'll take out the subsystem when you shouldn't have. On the other hand, if you set the value too high, the LPAR will die before the subsystem is sacrificed. So the general recommendation is: if the buffer pools are fairly large, that is, they represent a large amount of the DB2 working set size, then the multiplier should be about 1.5x. On the other hand, if the buffer pools are small and represent a small percentage of the overall working set size, then the value needed tends to be more toward 2x. What will happen is that when the real storage kill switch value is reached, the DB2 subsystem will abend. Now, in DB2 version 10, the real storage kill switch is actually formalized, so the hidden zparm becomes an opaque zparm called REALSTORAGE_MAX. For those customers who want to use this value, not only do you have to factor in the 31-bit storage, you also now need to factor in the 64-bit shared and common usage to establish a new footprint, and you have to increase the size of the real storage kill switch accordingly. Now let's turn to slide 25.

Slide 25 (44:27) On slide 25, I want to cover the last topic, the third topic in this web lecture: performance- and exception-based monitoring. To begin with, I want to talk about common problems. Many organizations are operating mostly in fire-fighting mode, reacting to today's performance problems, with every day being a surprise and having to react. In many cases, there's missing performance data or a lack of granularity therein, which limits the ability to drive a problem to root cause, and you may need to wait for a recurrence of the problem to get the tracing data you need or the granularity that you need. The majority of customer organizations do not have a performance database, and even if they do, they make very limited use of it. Without a performance database, there's no baseline for either DB2 system performance or DB2 application performance, and no basis for doing trend analysis. Most

organizations do not have a near-term history capability with their online monitor, and they haven't implemented any DB2 exception monitoring. The idea of exception monitoring is that you can identify, based on some rules of thumb, whether there are out-of-line conditions and basically filter for those out-of-line conditions. The idea is to get an early warning of problem conditions so that you have time to react or troubleshoot the problem quicker. Without exception monitoring, nobody knows there's an issue until the situation escalates either into a very bad performance situation or a serious availability situation. Ultimately, not having exception monitoring delays problem determination and problem source identification. Lastly, many organizations have an increasing amount of their applications using dynamic SQL from .NET, ODBC, and JDBC applications, and they have limited control over, and understanding of, the performance of those applications. Now let's turn to slide 26.

Slide 26 (46:37) First of all, I have some recommendations for data collection. The first rule is to set SMFSTAT=YES. This is the default, and it collects information for statistics trace classes 1, 3, 4, and 5. Every customer should have that set. There's also an additional trace class, statistics class 8, which gives you data set I/O statistics. I find that particular trace very useful. A lot of customers are concerned about the trace volume, but in actual fact we bucket multiple data sets into each record and only record statistics when there's at least one I/O per second. So one myth about stats class 8 is that it generates lots of SMF data; in actual fact, it does not. If it were my installation, I would collect statistics trace class 8 in addition to stats classes 1, 3, 4, and 5. In version 9, I have a strong recommendation to set the statistics interval STATIME equal to 1. This is highly valuable and essential to study the evolutionary trends which lead to complex system performance problems like slowdowns. Again, a lot of people have raised the myth that setting STATIME to 1 will generate a huge amount of SMF data. This is actually not true: there are only 1,440 minutes in a day, so that's the number of intervals with a STATIME of 1. I would recommend to you that it's very valuable and that all customers should set it to 1. Having said that, in DB2 version 10 the basic statistics records, IFCIDs 2, 202, 217, 225, and 230, are always cut at a one-minute interval. They're no longer

controlled by the STATIME parameter. So the best practice recommendation for version 9 is now implemented and hardwired in version 10. The statistics interval, or STATIME parameter, is still there in version 10, and it controls the frequency of the other IFCIDs: 105, 106, and 199. The other recommendation on this slide is to copy away the SMF 100 records (these are the records for statistics) and to keep them in a separate file. The new file is relatively small, it's much easier to post-process that data, and it's also much easier to send it to the DB2 lab to deal with PMRs. The whole goal here is to improve the elapsed time to post-process the data, both in your environment and at the DB2 lab. Now let's turn to slide 27.

Slide 27 (49:25) Now, when it comes to accounting, we strongly recommend that you collect accounting trace classes 1, 2, 3, 7, and 8. There's also an accounting class 10, but this is relatively expensive and so is typically run only for short periods of time; in other words, accounting class 10 is not run on a permanent basis. There are options to consider if the SMF data volume is an issue. Many customers are now starting to record their SMF data to the z/OS System Logger, which basically streams the SMF data rather than writing it directly to VSAM files. The System Logger improves both performance and throughput. In DB2 version 10, we have now enabled DB2 compression of SMF data, so that any instrumentation records written to the SMF destination are now subject to DB2 compression. Experience has been that the accounting records, which represent the bulk of the SMF data volume, compress to about 70 to 80%. The overhead of the compression is relatively tiny at approximately 1%. We also have accounting rollup, controlled by the DB2 system parameter called ACCUMACC. This applies only to DDF and the RRS attach. It's worth pointing out that there's no effective package-level rollup reporting prior to version 10, so in version 10, for the first time, we get accurate package-level rollup accounting. The one general disadvantage of accounting rollup is that you lose granularity. In many performance problems, the problem is not pervasive for each and every transaction. When you have the odd outlying transaction that is performing badly, by using accounting rollup you have actually lost the data for that outlying transaction; in other words, the impact of that outlying transaction is lost in the rollup. So many people in version 10 may decide to make a trade: turning off accounting rollup, but at the same time enabling SMF compression in order to reduce the volume.
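As a minimal sketch of the slide 26 recommendation to copy the SMF 100 (statistics) records into a separate, smaller file (all data set names and allocation parameters here are assumptions; substitute your installation's SMF dump data sets):

```
//SMFCOPY  JOB (ACCT),'COPY SMF 100',CLASS=A,MSGCLASS=X
//* Extract only the DB2 statistics records (SMF type 100) from the
//* daily SMF dump data set into a small, easy-to-ship file.
//COPY     EXEC PGM=IFASMFDP
//DUMPIN   DD DISP=SHR,DSN=SYS1.SMFDUMP.DAILY
//DUMPOUT  DD DSN=DB2PERF.SMF100.DAILY,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(50,50),RLSE),
//            DCB=(RECFM=VBS,LRECL=32760,BLKSIZE=0)
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  INDD(DUMPIN,OPTIONS(DUMP))
  OUTDD(DUMPOUT,TYPE(100))
/*
```

The same job could add a second OUTDD with TYPE(101) if you also want the accounting records separated out, although, as noted above, they represent the bulk of the SMF volume.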

Now let's turn to slide 28.

Slide 28 (51:40) It's pretty important here to proactively monitor and review DB2 metrics on a regular basis, and to develop an automated process to store away the performance data into DB2 tables, at least the DB2 statistics data, which is relatively low volume, and to do this without any aggregation, and then to invest in a set of canned reports to transform the DB2 statistics data into real information that can be used. What you want to be able to do here is track the evolutionary trends of these key performance indicators at the DB2 system level from startup to shutdown, and then generate either red alerts or amber alerts based on out-of-normal conditions as and when they occur. This also becomes the baseline for further analysis. Now, there are some additional resources to help you. There's a series of web lectures related to optimizing DB2 for z/OS system performance using the DB2 statistics trace, available from the same web site. There's also the one-day seminar written by myself and Florence Dubois. In both cases, this material provides a list of performance metrics and rules of thumb in the areas of buffer pools, group buffer pools, lock/latch contention, etcetera. Those webcasts are also available on the DB2 for z/OS best practices web site using the link at the bottom of slide 28.
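As a minimal sketch of the kind of canned exception report just described, assume the statistics data has been loaded, unaggregated, into a hypothetical table PERFDB.DB2_STATS_INTERVAL; the table, its columns, and the thresholds below are all illustrative, with real rules of thumb coming from the seminar material mentioned above:

```sql
-- Flag one-minute statistics intervals from the last 24 hours where a
-- hypothetical DBM1 31-bit usage column crosses illustrative thresholds.
SELECT INTERVAL_TS,
       MEMBER_NAME,
       DBM1_31BIT_USED_MB,
       CASE
         WHEN DBM1_31BIT_USED_MB > 1400 THEN 'RED'
         WHEN DBM1_31BIT_USED_MB > 1200 THEN 'AMBER'
         ELSE 'OK'
       END AS ALERT_LEVEL
FROM   PERFDB.DB2_STATS_INTERVAL
WHERE  INTERVAL_TS > CURRENT TIMESTAMP - 24 HOURS
  AND  MEMBER_NAME = 'DB2A'
ORDER BY INTERVAL_TS;
```

A small family of such queries, one per key performance indicator, run on a schedule with the results logged, gives the red/amber alert history and baseline described above.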

Now let's turn to slide 29.

Slide 29 (53:16) Another recommendation is to enable near-term history collection in your DB2 online monitor. This provides the ability to retrieve and review DB2 statistics and accounting records for the past few hours of DB2 processing and gives you the ability to intercept adverse changing trends. It's also important to have effective exception monitoring at the DB2 system and application level to track key statistics performance indicators, both in your online monitor, to generate alerts on outlying conditions, and also out of the performance database. The performance metrics to use and the rules of thumb are covered in the performance one-day seminar; that could provide a good starting point. Also, whether the alerts are generated by the online monitor or from the performance database, keep a log and history of the alerts and analyze the trends. The other objectives here, beyond exception monitoring, are to get the virtual support teams engaged sooner, to help avoid, if possible, a performance problem escalating into a serious performance incident, and to help avoid extended recovery times, performance problems, and application availability issues. It also provides a way of narrowing down the scope of the problem to speed up the reaction time during problem determination and problem source identification. It also provides the vehicle to understand weak points and to provide strategic remedial actions. Now let's turn to slide 30.

Slide 30 (54:54) On slide 30 is a list of recommendations about enhancing the capture of diagnostic data. First of all, at one-minute intervals, we believe you should collect, through automation, the output of DISPLAY THREAD with SERVICE(WAIT); so use systems automation to drive the DISPLAY THREAD SERVICE(WAIT) command at a one-minute interval and then save the output away. Similarly, through systems automation, drive the other list of commands shown on the slide at 15-minute intervals and save the outputs away again. The objective here is to detect and correct as fast as possible, and to provide diagnostics so that when you investigate problems you can study this diagnostic information for the time leading up to the problem and go back in time to see what led up to the particular problem.
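A minimal sketch of the commands such automation might issue (DB2A is an assumed command prefix, and the three 15-minute commands are examples only; use the actual list from slide 30):

```
-DB2A DISPLAY THREAD(*) SERVICE(WAIT)
-DB2A DISPLAY DATABASE(*) SPACENAM(*) RESTRICT LIMIT(*)
-DB2A DISPLAY UTILITY(*)
-DB2A DISPLAY GROUP DETAIL
```

The first command is the one driven at the one-minute interval; the others illustrate the kind of commands that might run every 15 minutes, with each response time-stamped and saved so it can be reviewed for the period leading up to a problem.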

Now let's turn to slide 31.

Slide 31 (55:50) In the last part of this web lecture, I want to talk about exploiting the dynamic statement cache for .NET, ODBC, and JDBC applications. One of the trends of the last five to ten years has been the growth in workload coming through DDF (.NET, ODBC, and JDBC), and also growth in the amount of mission-critical workload coming through DDF with dynamic SQL. What I want to do here, as part of this web lecture, is encourage you to exploit the dynamic statement cache, which is a goldmine of information for performance analysis and for shooting performance problems. Here I have a sample procedure which I'd like to introduce and discuss to enable you to explain statements from the DB2 dynamic statement cache. We have five steps here. First, create the table DSN_STATEMENT_CACHE_TABLE and the EXPLAIN tables, including DSN_FUNCTION_TABLE; remember that the DSNTESC member of the DB2 sample library gives you sample DDL for those tables. Secondly, start the collection of the dynamic statement cache performance statistics. This is done by starting a performance trace using user-defined trace class 30 and specifying IFCIDs 316, 317, and 318. The third step is to use the EXPLAIN STMTCACHE ALL statement to extract the SQL statements from the global cache and dump their statistics into the table DSN_STATEMENT_CACHE_TABLE. All the statements in the cache are included if EXPLAIN is executed by SYSADM; otherwise, the statements exposed into the table are only those with matching authorization. Fourth, from the cache, you can generate an individual EXPLAIN for each SQL statement of interest by saying EXPLAIN STMTCACHE STMTID, specifying the statement ID from that table. And the fifth step is to import the contents of the four tables into a spreadsheet for post-processing. A short worked sketch of these five steps follows below. So that completes this web lecture on operational best practices. Thank you for listening. (58:02)
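As referenced above, here is a minimal worked sketch of steps 2 through 4 (the DB2A command prefix, the statement ID 1234, and the selected columns are illustrative; the DDL for step 1 comes from the DSNTESC sample member):

```sql
-- Step 2 (DB2 command, issued from the console or via DSN):
--   -DB2A START TRACE(PERFM) CLASS(30) IFCID(316,317,318) DEST(SMF)

-- Step 3: snapshot the global dynamic statement cache into
--         DSN_STATEMENT_CACHE_TABLE under your authorization ID.
EXPLAIN STMTCACHE ALL;

-- Find the heaviest statements captured by the snapshot.
SELECT STMT_ID, STAT_EXEC, STAT_CPU, STAT_ELAP, STMT_TEXT
FROM   DSN_STATEMENT_CACHE_TABLE
ORDER  BY STAT_CPU DESC
FETCH  FIRST 20 ROWS ONLY;

-- Step 4: explain one statement of interest by its statement ID
--         (1234 is an illustrative value taken from the query above).
EXPLAIN STMTCACHE STMTID 1234;
```

Step 5 is then simply an unload or export of DSN_STATEMENT_CACHE_TABLE and the populated EXPLAIN tables into a spreadsheet for post-processing.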


More information

Microsoft SQL Server Fix Pack 15. Reference IBM

Microsoft SQL Server Fix Pack 15. Reference IBM Microsoft SQL Server 6.3.1 Fix Pack 15 Reference IBM Microsoft SQL Server 6.3.1 Fix Pack 15 Reference IBM Note Before using this information and the product it supports, read the information in Notices

More information

The Dark Arts of MQ SMF Evaluation

The Dark Arts of MQ SMF Evaluation The Dark Arts of MQ SMF Evaluation Lyn Elkins elkinsc@us.ibm.com Session # 13884 August 13, 2013 Code! 1 The witch trial MQ is broken! Agenda Review of SMF 115 and SMF 116 class 3 data Hunting down the

More information

Optimizing Insert Performance - Part 1

Optimizing Insert Performance - Part 1 Optimizing Insert Performance - Part 1 John Campbell Distinguished Engineer DB2 for z/os development CAMPBELJ@uk.ibm.com 2 Disclaimer/Trademarks The information contained in this document has not been

More information

XP: Backup Your Important Files for Safety

XP: Backup Your Important Files for Safety XP: Backup Your Important Files for Safety X 380 / 1 Protect Your Personal Files Against Accidental Loss with XP s Backup Wizard Your computer contains a great many important files, but when it comes to

More information

An A-Z of System Performance for DB2 for z/os

An A-Z of System Performance for DB2 for z/os Phil Grainger, Lead Product Manager BMC Software March, 2016 An A-Z of System Performance for DB2 for z/os The Challenge Simplistically, DB2 will be doing one (and only one) of the following at any one

More information

Collecting Cached SQL Data and Its Related Analytics. Gerald Hodge HLS Technologies, Inc.

Collecting Cached SQL Data and Its Related Analytics. Gerald Hodge HLS Technologies, Inc. Collecting Cached SQL Data and Its Related Analytics Gerald Hodge HLS Technologies, Inc. Agenda Quick Review of SQL Prepare CACHEDYN=YES and KEEPDYNAMIC=YES CACHEDYN=YES and KEEPDYNAMIC=YES with COMMIT

More information

Memory for MIPS: Leveraging Big Memory on System z to Enhance DB2 CPU Efficiency

Memory for MIPS: Leveraging Big Memory on System z to Enhance DB2 CPU Efficiency Robert Catterall, IBM rfcatter@us.ibm.com Memory for MIPS: Leveraging Big Memory on System z to Enhance DB2 CPU Efficiency Midwest DB2 Users Group December 5, 2013 Information Management Agenda The current

More information

The Major CPU Exceptions in EPV Part 2

The Major CPU Exceptions in EPV Part 2 The Major CPU Exceptions in EPV Part 2 Mark Cohen Austrowiek EPV Technologies April 2014 6 System capture ratio The system capture ratio is an inverted measure of the internal system overhead. So the higher

More information

Optimizing Parallel Access to the BaBar Database System Using CORBA Servers

Optimizing Parallel Access to the BaBar Database System Using CORBA Servers SLAC-PUB-9176 September 2001 Optimizing Parallel Access to the BaBar Database System Using CORBA Servers Jacek Becla 1, Igor Gaponenko 2 1 Stanford Linear Accelerator Center Stanford University, Stanford,

More information

Fully Optimize FULLY OPTIMIZE YOUR DBA RESOURCES

Fully Optimize FULLY OPTIMIZE YOUR DBA RESOURCES Fully Optimize FULLY OPTIMIZE YOUR DBA RESOURCES IMPROVE SERVER PERFORMANCE, UPTIME, AND AVAILABILITY WHILE LOWERING COSTS WE LL COVER THESE TOP WAYS TO OPTIMIZE YOUR RESOURCES: 1 Be Smart About Your Wait

More information

Three requirements for reducing performance issues and unplanned downtime in any data center

Three requirements for reducing performance issues and unplanned downtime in any data center Three requirements for reducing performance issues and unplanned downtime in any data center DARRYL FUJITA TECHNICAL SOFTWARE SOLUTIONS SPECIALIST HITACHI DATA SYSTEMS How Big Is The Cost Of Unplanned

More information

Six Sigma in the datacenter drives a zero-defects culture

Six Sigma in the datacenter drives a zero-defects culture Six Sigma in the datacenter drives a zero-defects culture Situation Like many IT organizations, Microsoft IT wants to keep its global infrastructure available at all times. Scope, scale, and an environment

More information

Musewerx support for Application Maintenance in Software AG NATURAL and ADABAS TM environment

Musewerx support for Application Maintenance in Software AG NATURAL and ADABAS TM environment Musewerx support for Application Maintenance in Software AG NATURAL and ADABAS TM environment Musewerx provides Application Maintenance Services for your applications written in NATURAL and ADABAS environment.

More information

vrealize Operations Manager User Guide Modified on 17 AUG 2017 vrealize Operations Manager 6.6

vrealize Operations Manager User Guide Modified on 17 AUG 2017 vrealize Operations Manager 6.6 vrealize Operations Manager User Guide Modified on 17 AUG 2017 vrealize Operations Manager 6.6 vrealize Operations Manager User Guide You can find the most up-to-date technical documentation on the VMware

More information

Craig S. Mullins. A DB2 for z/os Performance Roadmap By Craig S. Mullins. Database Performance Management Return to Home Page.

Craig S. Mullins. A DB2 for z/os Performance Roadmap By Craig S. Mullins. Database Performance Management Return to Home Page. Craig S. Mullins Database Performance Management Return to Home Page December 2002 A DB2 for z/os Performance Roadmap By Craig S. Mullins Assuring optimal performance is one of a database administrator's

More information

Monitor Qlik Sense sites. Qlik Sense Copyright QlikTech International AB. All rights reserved.

Monitor Qlik Sense sites. Qlik Sense Copyright QlikTech International AB. All rights reserved. Monitor Qlik Sense sites Qlik Sense 2.1.2 Copyright 1993-2015 QlikTech International AB. All rights reserved. Copyright 1993-2015 QlikTech International AB. All rights reserved. Qlik, QlikTech, Qlik Sense,

More information

vrealize Operations Manager User Guide

vrealize Operations Manager User Guide vrealize Operations Manager User Guide vrealize Operations Manager 6.2 This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a

More information

Monitoring Tool Made to Measure for SharePoint Admins. By Stacy Simpkins

Monitoring Tool Made to Measure for SharePoint Admins. By Stacy Simpkins Monitoring Tool Made to Measure for SharePoint Admins By Stacy Simpkins Contents About the Author... 3 Introduction... 4 Who s it for and what all can it do?... 4 SysKit Insights Features... 6 Drillable

More information

Introduction to DB2 11 for z/os

Introduction to DB2 11 for z/os Chapter 1 Introduction to DB2 11 for z/os This chapter will address the job responsibilities of the DB2 system administrator, what to expect on the IBM DB2 11 System Administrator for z/os certification

More information

Solution Pack. Managed Services Virtual Private Cloud Managed Database Service Selections and Prerequisites

Solution Pack. Managed Services Virtual Private Cloud Managed Database Service Selections and Prerequisites Solution Pack Managed Services Virtual Private Cloud Managed Database Service Selections and Prerequisites Subject Governing Agreement Term DXC Services Requirements Agreement between DXC and Customer

More information

Oracle Enterprise Manager 12c Sybase ASE Database Plug-in

Oracle Enterprise Manager 12c Sybase ASE Database Plug-in Oracle Enterprise Manager 12c Sybase ASE Database Plug-in May 2015 Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only,

More information

Build a system health check for Db2 using IBM Machine Learning for z/os

Build a system health check for Db2 using IBM Machine Learning for z/os Build a system health check for Db2 using IBM Machine Learning for z/os Jonathan Sloan Senior Analytics Architect, IBM Analytics Agenda A brief machine learning overview The Db2 ITOA model solutions template

More information

End to End Analysis on System z IBM Transaction Analysis Workbench for z/os. James Martin IBM Tools Product SME August 10, 2015

End to End Analysis on System z IBM Transaction Analysis Workbench for z/os. James Martin IBM Tools Product SME August 10, 2015 End to End Analysis on System z IBM Transaction Analysis Workbench for z/os James Martin IBM Tools Product SME August 10, 2015 Please note IBM s statements regarding its plans, directions, and intent are

More information

IBM IMS Database Solution Pack for z/os Version 2 Release 1. Overview and Customization IBM SC

IBM IMS Database Solution Pack for z/os Version 2 Release 1. Overview and Customization IBM SC IBM IMS Database Solution Pack for z/os Version 2 Release 1 Overview and Customization IBM SC19-4007-04 IBM IMS Database Solution Pack for z/os Version 2 Release 1 Overview and Customization IBM SC19-4007-04

More information

IBM Tivoli OMEGAMON XE for Storage on z/os Version Tuning Guide SC

IBM Tivoli OMEGAMON XE for Storage on z/os Version Tuning Guide SC IBM Tivoli OMEGAMON XE for Storage on z/os Version 5.1.0 Tuning Guide SC27-4380-00 IBM Tivoli OMEGAMON XE for Storage on z/os Version 5.1.0 Tuning Guide SC27-4380-00 Note Before using this information

More information

High Availability through Warm-Standby Support in Sybase Replication Server A Whitepaper from Sybase, Inc.

High Availability through Warm-Standby Support in Sybase Replication Server A Whitepaper from Sybase, Inc. High Availability through Warm-Standby Support in Sybase Replication Server A Whitepaper from Sybase, Inc. Table of Contents Section I: The Need for Warm Standby...2 The Business Problem...2 Section II:

More information

ICANN and Technical Work: Really? Yes! Steve Crocker DNS Symposium, Madrid, 13 May 2017

ICANN and Technical Work: Really? Yes! Steve Crocker DNS Symposium, Madrid, 13 May 2017 ICANN and Technical Work: Really? Yes! Steve Crocker DNS Symposium, Madrid, 13 May 2017 Welcome, everyone. I appreciate the invitation to say a few words here. This is an important meeting and I think

More information

High Performance Computer Architecture Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

High Performance Computer Architecture Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur High Performance Computer Architecture Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture - 23 Hierarchical Memory Organization (Contd.) Hello

More information

vrealize Operations Manager User Guide 11 OCT 2018 vrealize Operations Manager 7.0

vrealize Operations Manager User Guide 11 OCT 2018 vrealize Operations Manager 7.0 vrealize Operations Manager User Guide 11 OCT 2018 vrealize Operations Manager 7.0 You can find the most up-to-date technical documentation on the VMware website at: https://docs.vmware.com/ If you have

More information

Optimize Your Databases Using Foglight for Oracle s Performance Investigator

Optimize Your Databases Using Foglight for Oracle s Performance Investigator Optimize Your Databases Using Foglight for Oracle s Performance Investigator Solve performance issues faster with deep SQL workload visibility and lock analytics Abstract Get all the information you need

More information

Sysplex: Key Coupling Facility Measurements Cache Structures. Contact, Copyright, and Trademark Notices

Sysplex: Key Coupling Facility Measurements Cache Structures. Contact, Copyright, and Trademark Notices Sysplex: Key Coupling Facility Measurements Structures Peter Enrico Peter.Enrico@EPStrategies.com 813-435-2297 Enterprise Performance Strategies, Inc (z/os Performance Education and Managed Service Providers)

More information

vrealize Operations Manager User Guide

vrealize Operations Manager User Guide vrealize Operations Manager User Guide vrealize Operations Manager 6.5 This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a

More information

ABSTRACTING CONNECTIVITY FOR IOT WITH A BACKHAUL OPERATOR

ABSTRACTING CONNECTIVITY FOR IOT WITH A BACKHAUL OPERATOR ABSTRACTING CONNECTIVITY FOR IOT WITH A BACKHAUL OPERATOR NIGEL CHADWICK VIDEO TRANSCRIPT Welcome! What s your name and what do you do? Hi, it s Nigel Chadwick. I m one of the founders of Stream Technologies.

More information

DB2 and Memory Exploitation. Fabio Massimo Ottaviani - EPV Technologies. It s important to be aware that DB2 memory exploitation can provide:

DB2 and Memory Exploitation. Fabio Massimo Ottaviani - EPV Technologies. It s important to be aware that DB2 memory exploitation can provide: White Paper DB2 and Memory Exploitation Fabio Massimo Ottaviani - EPV Technologies 1 Introduction For many years, z/os and DB2 system programmers have been fighting for memory: the former to defend the

More information

<Insert Picture Here> Managing Oracle Exadata Database Machine with Oracle Enterprise Manager 11g

<Insert Picture Here> Managing Oracle Exadata Database Machine with Oracle Enterprise Manager 11g Managing Oracle Exadata Database Machine with Oracle Enterprise Manager 11g Exadata Overview Oracle Exadata Database Machine Extreme ROI Platform Fast Predictable Performance Monitor

More information

THE STATE OF IT TRANSFORMATION FOR RETAIL

THE STATE OF IT TRANSFORMATION FOR RETAIL THE STATE OF IT TRANSFORMATION FOR RETAIL An Analysis by Dell EMC and VMware Dell EMC and VMware are helping IT groups at retail organizations transform to business-focused service providers. The State

More information

Three Key Considerations for Your Public Cloud Infrastructure Strategy

Three Key Considerations for Your Public Cloud Infrastructure Strategy GOING PUBLIC: Three Key Considerations for Your Public Cloud Infrastructure Strategy Steve Follin ISG WHITE PAPER 2018 Information Services Group, Inc. All Rights Reserved The Market Reality The race to

More information

Controlling Costs and Driving Agility in the Datacenter

Controlling Costs and Driving Agility in the Datacenter Controlling Costs and Driving Agility in the Datacenter Optimizing Server Infrastructure with Microsoft System Center Microsoft Corporation Published: November 2007 Executive Summary To help control costs,

More information

DB2 Performance Health Check... in just few minutes. DUGI 8-9 April 2014

DB2 Performance Health Check... in just few minutes. DUGI 8-9 April 2014 DB2 Performance Health Check... in just few minutes DUGI 8-9 April 2014 Introduction DB2 is the preferred repository for mission critical data at all z/os sites Performance of z/os and non z/os based applications

More information

CICS insights from IT professionals revealed

CICS insights from IT professionals revealed CICS insights from IT professionals revealed A CICS survey analysis report from: IBM, CICS, and z/os are registered trademarks of International Business Machines Corporation in the United States, other

More information

Oracle Diagnostics Pack For Oracle Database

Oracle Diagnostics Pack For Oracle Database Oracle Diagnostics Pack For Oracle Database ORACLE DIAGNOSTICS PACK FOR ORACLE DATABASE Oracle Enterprise Manager is Oracle s integrated enterprise IT management product line, and provides the industry

More information

The Total Network Volume chart shows the total traffic volume for the group of elements in the report.

The Total Network Volume chart shows the total traffic volume for the group of elements in the report. Tjänst: Network Health Total Network Volume and Total Call Volume Charts Public The Total Network Volume chart shows the total traffic volume for the group of elements in the report. Chart Description

More information

WHY BUILDING SECURITY SYSTEMS NEED CONTINUOUS AVAILABILITY

WHY BUILDING SECURITY SYSTEMS NEED CONTINUOUS AVAILABILITY WHY BUILDING SECURITY SYSTEMS NEED CONTINUOUS AVAILABILITY White Paper 2 Why Building Security Systems Need Continuous Availability Always On Is the Only Option. If All Systems Go Down, How Can You React

More information

How IBM Can Identify z/os Networking Issues without tracing

How IBM Can Identify z/os Networking Issues without tracing How IBM Can Identify z/os Networking Issues without tracing Wed, August 12, 1:45-2:45 Session 17536 Speakers: Ernie Gilman, IBM (egilman@us.ibm.com) Dean Butler, IBM (butlerde@us.ibm.com) Abstract Running

More information

How Microsoft IT Reduced Operating Expenses Using Virtualization

How Microsoft IT Reduced Operating Expenses Using Virtualization How Microsoft IT Reduced Operating Expenses Using Virtualization Published: May 2010 The following content may no longer reflect Microsoft s current position or infrastructure. This content should be viewed

More information

CA OPS/MVS Event Management and Automation

CA OPS/MVS Event Management and Automation CA OPS/MVS Event Management and Automation Best Practices Guide Release 12.1 This Documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred to

More information

IBM InfoSphere Streams v4.0 Performance Best Practices

IBM InfoSphere Streams v4.0 Performance Best Practices Henry May IBM InfoSphere Streams v4.0 Performance Best Practices Abstract Streams v4.0 introduces powerful high availability features. Leveraging these requires careful consideration of performance related

More information

Virtualization. Q&A with an industry leader. Virtualization is rapidly becoming a fact of life for agency executives,

Virtualization. Q&A with an industry leader. Virtualization is rapidly becoming a fact of life for agency executives, Virtualization Q&A with an industry leader Virtualization is rapidly becoming a fact of life for agency executives, as the basis for data center consolidation and cloud computing and, increasingly, as

More information

Reducing Costs and Improving Systems Management with Hyper-V and System Center Operations Manager

Reducing Costs and Improving Systems Management with Hyper-V and System Center Operations Manager Situation The Microsoft Entertainment and Devices (E&D) division was encountering long lead times of up to two months to provision physical hardware for development and test purposes. Delays for production

More information

The Problem with Privileged Users

The Problem with Privileged Users Flash Point Paper Enforce Access Control The Problem with Privileged Users Four Steps to Reducing Breach Risk: What You Don t Know CAN Hurt You Today s users need easy anytime, anywhere access to information

More information

Avoiding the Cost of Confusion: SQL Server Failover Cluster Instances versus Basic Availability Group on Standard Edition

Avoiding the Cost of Confusion: SQL Server Failover Cluster Instances versus Basic Availability Group on Standard Edition One Stop Virtualization Shop Avoiding the Cost of Confusion: SQL Server Failover Cluster Instances versus Basic Availability Group on Standard Edition Written by Edwin M Sarmiento, a Microsoft Data Platform

More information

Diagnostics in Testing and Performance Engineering

Diagnostics in Testing and Performance Engineering Diagnostics in Testing and Performance Engineering This document talks about importance of diagnostics in application testing and performance engineering space. Here are some of the diagnostics best practices

More information

RAIFFEISENBANK BULGARIA

RAIFFEISENBANK BULGARIA RAIFFEISENBANK BULGARIA IT thought leader chooses EMC XtremIO and VMware for groundbreaking VDI project OVERVIEW ESSENTIALS Industry Financial services Company Size Over 3,000 employees, assets of approximately

More information

Total Cost of Ownership: Benefits of the OpenText Cloud

Total Cost of Ownership: Benefits of the OpenText Cloud Total Cost of Ownership: Benefits of the OpenText Cloud OpenText Managed Services in the Cloud delivers on the promise of a digital-first world for businesses of all sizes. This paper examines how organizations

More information

Best Practices for Alert Tuning. This white paper will provide best practices for alert tuning to ensure two related outcomes:

Best Practices for Alert Tuning. This white paper will provide best practices for alert tuning to ensure two related outcomes: This white paper will provide best practices for alert tuning to ensure two related outcomes: 1. Monitoring is in place to catch critical conditions and alert the right people 2. Noise is reduced and people

More information

Avoiding Storage Service Disruptions with Availability Intelligence

Avoiding Storage Service Disruptions with Availability Intelligence Avoiding Storage Service Disruptions with Availability Intelligence Brent Phillips, Managing Director, Americas Brett Allison, Director of Technical Services www.intellimagic.com 1 Today s Agenda 1. Availability

More information

ORACLE ENTERPRISE MANAGER 10g ORACLE DIAGNOSTICS PACK FOR NON-ORACLE MIDDLEWARE

ORACLE ENTERPRISE MANAGER 10g ORACLE DIAGNOSTICS PACK FOR NON-ORACLE MIDDLEWARE ORACLE ENTERPRISE MANAGER 10g ORACLE DIAGNOSTICS PACK FOR NON-ORACLE MIDDLEWARE Most application performance problems surface during peak loads. Often times, these problems are time and resource intensive,

More information

Cheryl s Hot Flashes #21

Cheryl s Hot Flashes #21 Cheryl s Hot Flashes #21 Cheryl Watson Watson & Walker, Inc. March 6, 2009 Session 2509 www.watsonwalker.com home of Cheryl Watson s TUNING Letter, CPU Chart, BoxScore, and GoalTender Agenda Survey Questions

More information

Oracle Rdb Hot Standby Performance Test Results

Oracle Rdb Hot Standby Performance Test Results Oracle Rdb Hot Performance Test Results Bill Gettys (bill.gettys@oracle.com), Principal Engineer, Oracle Corporation August 15, 1999 Introduction With the release of Rdb version 7.0, Oracle offered a powerful

More information

Running SNAP. The SNAP Team February 2012

Running SNAP. The SNAP Team February 2012 Running SNAP The SNAP Team February 2012 1 Introduction SNAP is a tool that is intended to serve as the read aligner in a gene sequencing pipeline. Its theory of operation is described in Faster and More

More information

AlwaysOn Availability Groups: Backups, Restores, and CHECKDB

AlwaysOn Availability Groups: Backups, Restores, and CHECKDB AlwaysOn Availability Groups: Backups, Restores, and CHECKDB www.brentozar.com sp_blitz sp_blitzfirst email newsletter videos SQL Critical Care 2016 Brent Ozar Unlimited. All rights reserved. 1 What I

More information

Key Metrics for DB2 for z/os Subsystem and Application Performance Monitoring (Part 1)

Key Metrics for DB2 for z/os Subsystem and Application Performance Monitoring (Part 1) Robert Catterall, IBM rfcatter@us.ibm.com Key Metrics for DB2 for z/os Subsystem and Application Performance Monitoring (Part 1) New England DB2 Users Group September 17, 2015 Information Management 2015

More information

Copyright 2018, Oracle and/or its affiliates. All rights reserved.

Copyright 2018, Oracle and/or its affiliates. All rights reserved. Beyond SQL Tuning: Insider's Guide to Maximizing SQL Performance Monday, Oct 22 10:30 a.m. - 11:15 a.m. Marriott Marquis (Golden Gate Level) - Golden Gate A Ashish Agrawal Group Product Manager Oracle

More information

#1593: The top 10 things that can go wrong with an IBM Traveler Server

#1593: The top 10 things that can go wrong with an IBM Traveler Server #1593: The top 10 things that can go wrong with an IBM Traveler Server plus how to detect and correct them Alan Forbes Acknowledgements and Disclaimer. Copyright IBM Corporation 2016. All rights reserved.

More information

Key metrics for effective storage performance and capacity reporting

Key metrics for effective storage performance and capacity reporting Key metrics for effective storage performance and capacity reporting Key Metrics for Effective Storage Performance and Capacity Reporting Objectives This white paper will cover the key metrics in storage

More information

Disaster Recovery Is A Business Strategy

Disaster Recovery Is A Business Strategy Disaster Recovery Is A Business Strategy A White Paper By Table of Contents Preface Disaster Recovery Is a Business Strategy Disaster Recovery Is a Business Strategy... 2 Disaster Recovery: The Facts...

More information

Practical Capacity Planning in 2010 zaap and ziip

Practical Capacity Planning in 2010 zaap and ziip Practical Capacity Planning in 2010 zaap and ziip Fabio Massimo Ottaviani EPV Technologies February 2010 1 Introduction When IBM released zaap (2004) and ziip(2006) most companies decided to acquire a

More information

QOS Quality Of Service

QOS Quality Of Service QOS Quality Of Service Michael Schär Seminar in Distributed Computing Outline Definition QOS Attempts and problems in the past (2 Papers) A possible solution for the future: Overlay networks (2 Papers)

More information

Why the Threat of Downtime Should Be Keeping You Up at Night

Why the Threat of Downtime Should Be Keeping You Up at Night Why the Threat of Downtime Should Be Keeping You Up at Night White Paper 2 Your Plan B Just Isn t Good Enough. Learn Why and What to Do About It. Server downtime is an issue that many organizations struggle

More information

BETA DEMO SCENARIO - ATTRITION IBM Corporation

BETA DEMO SCENARIO - ATTRITION IBM Corporation BETA DEMO SCENARIO - ATTRITION 1 Please Note: IBM s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM s sole discretion. Information regarding

More information

Title: Episode 11 - Walking through the Rapid Business Warehouse at TOMS Shoes (Duration: 18:10)

Title: Episode 11 - Walking through the Rapid Business Warehouse at TOMS Shoes (Duration: 18:10) SAP HANA EFFECT Title: Episode 11 - Walking through the Rapid Business Warehouse at (Duration: 18:10) Publish Date: April 6, 2015 Description: Rita Lefler walks us through how has revolutionized their

More information

RAID SEMINAR REPORT /09/2004 Asha.P.M NO: 612 S7 ECE

RAID SEMINAR REPORT /09/2004 Asha.P.M NO: 612 S7 ECE RAID SEMINAR REPORT 2004 Submitted on: Submitted by: 24/09/2004 Asha.P.M NO: 612 S7 ECE CONTENTS 1. Introduction 1 2. The array and RAID controller concept 2 2.1. Mirroring 3 2.2. Parity 5 2.3. Error correcting

More information