Westfield DB2 z/OS System Management
Managing the Pain with Stats, Charts, Graphs, & REXX
NEODBUG, Aug 16, 2012
Mike Smith, Westfield Insurance
Agenda
  - About Westfield
  - DB2 Workload and Environment
  - Tools and resources
  - Building the Infrastructure
  - 5 Examples of pain management
About Westfield
  - Property and casualty insurance for over 150 years
  - One of the largest non-public companies in Ohio
  - Over $3.4 billion in assets / $1.5 billion in written premium
  - Network of more than 1,200 leading independent agencies
  - 2,300 employees in 60 offices servicing 31 states
Westfield DB2 z/OS
  - IBM z196 Hardware
  - zIIP Specialty Engine
  - DB2 V9.1
  - Two Production DB2 Subsystems
DB2 Workloads
  - PeopleSoft Financial
  - PeopleSoft Human Resources (HRMS)
  - Westfield-developed Java Web Applications
  - Westfield-developed Mainframe Applications
  - OLAP
  - PDBS / ADBS
PeopleSoft Financial
  - Application software runs on UNIX and Windows servers
  - General Ledger, Accounts Payable, Asset Management, Purchasing, Financial Reporting
  - OLTP, batch, and reporting
  - 550 GB of data in 52K tables
  - 210K connections per day (prime shift)
  - 1-3 million SQL calls per day
  - Heavy month-end reporting
PeopleSoft HRMS
  - Application software runs on UNIX and Windows servers
  - Payroll, Benefits, Human Resources
  - OLTP, batch, and reporting
  - 40 GB of data in 24K tables
  - 280K connections per day
  - 1-3 million SQL calls per day
Web Applications
  - Westfield-written web-based applications on Linux servers
  - Policy entry, quoting, services
  - OLTP
  - 250 GB of data in 1K tables
  - 3.7 million connections per day
  - 22 million SQL calls per day
  - Peaks of 800+ SQL calls per second
Mainframe Applications
  - Westfield-written COBOL programs
  - Small but growing workload
  - CICS and batch processing
OLAP
  - Reporting and some BI
  - Relatively small, but growing
  - Volatile
  - 4K connections per day
  - 36K SQL calls per day
PDBS / ADBS
  - People Doing Bad Stuff
  - Applications Doing Bad Stuff
  - Many ways for business and IT users to easily generate and run SQL
  - Unpredictable
  - Disruptive
  - Time-consuming
Objectives
  - #1 - No phone calls at night
  - #2 - No phone calls during the day
Why Calls At Night?
  - Availability
  - Errors
  - Poor Performance
Why Calls During the Day?
  - Availability
  - Errors
  - Poor Performance
  - Cost (CPU)
How to Meet Objectives
  - Prevent problems from happening
  - Third-party monitor:
      - Great for real-time monitoring
      - Great for diagnosing a problem as it is happening
      - Canned reports and reporting tool are lacking
  - Need tools for:
      - Long-term performance monitoring
      - Resource utilization
      - Trending
      - Capacity planning
      - Measuring the impact of change
Requirements are Characteristic of Business Intelligence (BI)
  - Easy, flexible reporting
  - Quick turnaround
  - Detail
  - Roll up or drill down
  - Time slicing (by hour, day, month, prime time, etc.)
  - Trending
  - Correlation
  - Ad hoc
  - Discovery
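The roll-up and time-slicing requirements above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the sample records, the names `slice_key` and `roll_up`, and the prime-shift window of 08:00 to 16:59, which the slides do not define. The actual implementation described in this deck uses DB2 tables and Cognos, not Python.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical accounting-style records: (timestamp, CPU seconds).
SAMPLES = [
    ("2012-06-11 07:15", 12.0),
    ("2012-06-11 09:05", 30.5),
    ("2012-06-11 09:40", 19.5),
    ("2012-06-11 14:20", 25.0),
    ("2012-06-11 22:10", 8.0),
    ("2012-06-12 10:30", 40.0),
]

# Assumed prime shift: 08:00 through 16:59 (not defined in the slides).
PRIME_START, PRIME_END = 8, 17


def slice_key(ts, grain):
    """Map a timestamp to the bucket it falls into for the given grain."""
    if grain == "hour":
        return ts.strftime("%Y-%m-%d %H:00")
    if grain == "day":
        return ts.strftime("%Y-%m-%d")
    if grain == "month":
        return ts.strftime("%Y-%m")
    if grain == "shift":
        return "prime" if PRIME_START <= ts.hour < PRIME_END else "off-prime"
    raise ValueError(f"unknown grain: {grain}")


def roll_up(samples, grain):
    """Sum CPU seconds per time slice at the requested grain."""
    totals = defaultdict(float)
    for stamp, cpu in samples:
        ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        totals[slice_key(ts, grain)] += cpu
    return dict(totals)


if __name__ == "__main__":
    for grain in ("hour", "day", "month", "shift"):
        print(grain, roll_up(SAMPLES, grain))
```

The same detail rows feed every grain, which is the point of keeping detail and rolling up on demand rather than storing only pre-summarized data.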
The Evolution of GI
  - BI for Geeks
  - Data from DB2 Accounting, Performance, Stats Traces, Statement Cache, SMS, and more
  - COGNOS for reports and graphs
  - REXX for more complex processing or display
Steps for Building GI
  - Analyze DB2 trace data
      - Hundreds of metrics
  - Design DB2 tables and indexes
  - Build dimension tables and hierarchies
  - Transform and load data
      - Monitor software and MXG can help with the load
      - 8-9 million rows loaded every day
      - More transformation and summarization
  - Create views to simplify use
  - Build MQTs for performance
Steps for Building GI - continued
  - Build Cognos application for standard reports and graphs
  - Build Cognos templates for ad hoc reporting
  - Build SPUFI templates for ad hoc queries and discovery
  - Write REXX where programming is needed
Example #1 CPU Containment
  - High cost of CPU
  - High use by DB2 can constrain other mainframe workloads
  - High use can indicate a DB2 application performance problem
Example #2 Application Performance
  - Distributed application response time components:
      - DB2
      - Linux server
      - WebSphere
      - Java code
      - Service calls
  - How do we find the root of performance problems?
  - DB2 guilty until proven innocent
Example #3 DASD Utilization
  - Make sure enough is available for day-to-day processing
  - Capacity planning
Example #4 DB2 System Constraints
  - DB2 subsystem resources
      - Shared internally by all DB2 applications
      - Shared externally with all z/OS work
  - CPU
  - Memory
  - I/O
  - Constraint can cause DB2 performance degradation or outages
  - Over-allocation can cause mainframe degradation or outages
Example #5 Bad SQL: The Monitor-Resistant Strain
  - Symptoms can include:
      - Increase in CPU
      - Increased lock time
      - Increased I/O
      - Performance degradation
  - Monitor detects no long-running SQL
  - Look for (very) frequently executed but fast SQL
  - Useful technique for testing SQL before production deployment
Remedy: Examine Dynamic Statement Cache
  1. -STA TRACE(PERFM) CLASS(32) IFCID(318) DEST(GTF)
  2. -STO TRACE(PERFM) CLASS(32)
  3. EXPLAIN STMTCACHE ALL
  4. Query table to find which SQL is using the most resources
  5. Run REXX to get the SQL statement and detailed stats
  6. Explain the SQL
Select to get SQL of interest

------+---------+---------+---------+---------+---------+--
STMT_ID  AUTHID    EXEC_CNT  GETPG_TOT  GETPG_AVG
------+---------+---------+---------+---------+---------+--
1798621  APPWACP0    329650    1614458         5.
1815460  APPWPLP0        60     803718     13395.
1798546  APPWACP0     37511     628619        17.
1798391  APPWACP0     14976     612949        41.
1798380  APPWACP0     21609     547612        25.
1798374  APPWACP0     16878     491400        29.
1815846  IMPAISP0        32     484224     15132.
1798316  APPWACP0     27956     428973        15.
1798309  APPWACP0     31263     407709        13.
1798841  APPWCLP0      4382     376852        86.
1798554  APPWACP0     21610     347675        16.
1798371  APPWACP0    106450     324571         3.
1798317  APPWACP0     26678     319705        12.
1798308  APPWPLP0     22262     290368        13.
1798381  APPWACP0     58424     283522         5.
1798514  APPWACP0     54733     273527         5.
1798205  APPWACP0     18897     246516        13.
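The listing ranks statements by total getpages, which surfaces two different kinds of offender: statements that are cheap per execution but run very often, and statements that run rarely but are expensive each time. A minimal Python sketch of that classification follows, using a few rows copied from the listing. The thresholds and the names `classify`, `MIN_EXECS`, `MAX_AVG_GETPAGES`, and `HEAVY_AVG_GETPAGES` are hypothetical and chosen only for illustration; the slides do not state any cutoffs.

```python
# (STMT_ID, AUTHID, EXEC_CNT, GETPG_TOT) copied from rows of the listing.
ROWS = [
    (1798621, "APPWACP0", 329650, 1614458),
    (1815460, "APPWPLP0", 60, 803718),
    (1798546, "APPWACP0", 37511, 628619),
    (1798391, "APPWACP0", 14976, 612949),
    (1815846, "IMPAISP0", 32, 484224),
    (1798371, "APPWACP0", 106450, 324571),
]

# Hypothetical thresholds, not taken from the slides.
MIN_EXECS = 100_000         # "very frequently executed"
MAX_AVG_GETPAGES = 10       # "fast" per execution
HEAVY_AVG_GETPAGES = 1_000  # expensive per execution


def classify(rows):
    """Rank by total getpages and label each statement by its usage pattern."""
    report = []
    for stmt_id, authid, execs, getpages in sorted(
        rows, key=lambda r: r[3], reverse=True
    ):
        avg = getpages / execs
        if execs >= MIN_EXECS and avg <= MAX_AVG_GETPAGES:
            kind = "frequent-but-fast"
        elif avg >= HEAVY_AVG_GETPAGES:
            kind = "expensive-per-exec"
        else:
            kind = "other"
        report.append((stmt_id, authid, execs, getpages, round(avg), kind))
    return report


if __name__ == "__main__":
    for row in classify(ROWS):
        print(row)
```

A monitor that only flags long-running SQL would likely miss the first group, since each execution finishes quickly; ranking by totals is what exposes it.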
Run selected statement IDs through REXX to get the details

-- ==========================================================
-- APPWPLP0  ID: 1815460
-- CACHED: 06/11/2012 10:19 AM   TIME IN CACHE: 00:23:06
--
--                    TOTAL   PER EXEC    PER MIN  PER SEC
--
-- EXECUTED              60                   2.6        0
-- GET PG            803718      13395    34944.2    582.4
-- ELAP TM          34.6491     0.5775     1.5065   0.0251
-- CPU TM           19.3594     0.3227     0.8417   0.0140
-- BUF RD                11          2
-- SYNC I/O TM       0.0043     0.0007
-- ASYNC I/O TM     16.9978     0.2833
-- LOCK TM           0.0038     0.0006
-- SORTS                  0          0
-- IX SCANS               0          0
-- TS SCANS              60          1
-- EXAM ROWS       21153428    3525571
-- PROC ROWS              3          1
-- ------------------------------------------------
SELECT HOM_ID
  FROM PDB2WPLP.HOM
 WHERE RCC_REF_NR = ?
  WITH UR
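The PER EXEC, PER MIN, and PER SEC figures in the report appear to be simple divisions of the cached totals. A minimal Python sketch of that arithmetic follows. It assumes the rates are computed from whole minutes in cache (23, from 00:23:06), an assumption that roughly matches the reported numbers; the function name `rates` is hypothetical, and the original REXX is not shown in the slides.

```python
def rates(total, executions, minutes_in_cache):
    """Return (per execution, per minute, per second) for a cached total."""
    per_exec = total / executions
    per_min = total / minutes_in_cache
    per_sec = per_min / 60
    return per_exec, per_min, per_sec


if __name__ == "__main__":
    EXECUTIONS = 60        # EXECUTED total from the report
    MINUTES_IN_CACHE = 23  # assumption: 00:23:06 truncated to whole minutes

    # Totals taken from the report for statement 1815460.
    for label, total in (("GET PG", 803718), ("ELAP TM", 34.6491), ("CPU TM", 19.3594)):
        per_exec, per_min, per_sec = rates(total, EXECUTIONS, MINUTES_IN_CACHE)
        print(f"{label:<8} {total:>12} {per_exec:>12.4f} {per_min:>12.4f} {per_sec:>10.4f}")
```

Normalizing by execution count and by time in cache makes statements comparable even when they entered the cache at different times.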
Questions? Email - MichaelSmith@WestfieldGrp.com