Hash Access to DB2 Data Faster, Better, Cheaper


1 Hash Access to DB2 Data Faster, Better, Cheaper
Kalpana Shyam, Karelle Cornwell, Developers, DB2 for z/OS, IBM Corp
Session Code: A10
Wednesday, 10 November 2010, 11:00 AM - 12:00 PM
Platform: DB2 10 for z/OS

2 Agenda
- Hash Access Path and existing access paths
- Creating tables for Hash Access
- Altering tables for Hash Access
- Choosing tables suitable for hash
- Process of conversion
- Utilities
- Performance results
- Best practices and caveats

In this session, we will walk through these topics.

3 Objectives
- Faster access method for OLTP: single row fetch using a fully qualified unique key
- Conversion: easily convert existing tables for Hash Access

The objectives of implementing the hash function in DB2 V10 are to provide our customers with a faster access method for OLTP and an easy way to convert existing tables.

4 DB2 Access Paths at a Glance

Table space scan
- How it works: Reads the table space in physical order from start to finish or until the record is found.
- Performance considerations: Good for reading all records. No order necessary.

Index scan
- How it works: Reads the index in physical order from start to finish, or from start of range to end of range, or for an equal predicate.
- Performance considerations: Need order, need a range of key values, need a specific value. Could be index only or index and data access.

Variations such as RID list scan, star join, multi-index join, hybrid join, DPSI, etc.
- How it works: Variations on the basic Data Manager table space scan and index scan.
- Performance considerations: Improved performance when one or more tables are accessed or there are multiple criteria for the same table.

Hash Access (New!)
- How it works: Given the key value, reads the data directly!
- Performance considerations: Best for equal predicates. Equal predicates are used for IN lists. Hash Access is only for equal predicates accessed in a random manner.

Data Manager has two basic access paths: table space scan and index scan. When multiple tables are involved, or there are multiple criteria on the same table, then, depending on the query, we use combinations of these basic methods and different indexes to produce the result efficiently. (This is not a comprehensive list.) In DB2 V10, we introduce a new basic access path that we call Hash Access. In function it is equivalent to an index scan for an equal predicate, but faster. Note that when DB2 processes an IN list, it also uses equal predicate processing.

5 Fastest Available in DB2 V9: Index access
Query: SELECT * WHERE ITEMNO = W
- ITEMNO unique index: 5 levels
- 6 page accesses: 5 index pages, 1 data page
- 2 pages need disk I/O: 1 index leaf page, 1 data page
- Diagram legend: page in buffer pool; page is read from disk

This slide shows how DB2 user data is accessed through an index. To illustrate a typical index access, the diagram shows a 5-level index; it may be more typical in your shop to have only 3-level indexes. For an equal predicate query, 6 pages will be accessed: the root page, the 3 non-leaf pages, 1 leaf page and finally the data page. Most likely the root and non-leaf pages will be in the buffer pool from previous queries, so no I/O will be required. However, the leaf page and the data page are unlikely to be in the buffer pool and will need to be read from disk.

6 Fastest Available in DB2 V10: Hash Access
Query: SELECT * WHERE ITEMNO = W
- The unique key value goes through an execution-time hash calculation
- The key finds the row without an index
- 1 data page access: 1 data page disk I/O (possibly in buffer pool)
- Reduced: page visits, CPU time, elapsed time
- Trade-off: extra space used
- Diagram legend: page in buffer pool; page is read from disk

DB2 10 introduces a new method of accessing user data: Hash Access. For an equal predicate query, 1 page will be accessed, the data page. Most likely the data page will be read from disk, but it may already be in the buffer pool. Compared with the five-level index access, the same query results in fewer page visits and reduced CPU and elapsed time. The trade-off is that hash access tables use slightly more disk space.

7 How do we get from the key value to page?
Inputs to the execution-time hash calculation: the unique key value (e.g. W), the number of fixed hash data pages (n) and the number of anchors per data page. Output: the page number, that is, the physical location of the row in the table space.
1. Every byte of the key value is jumbled to produce a 64-bit hash value
   - Predictably the same hash value each time the calculation is performed
   - Different key values will generate hash values that spread evenly across the range
   - The DB2 hash function is very good and suitable for all data types
2. The 64-bit hash value modulo n gives the relative page number
   - n is a prime number calculated from user-provided information about the size of the table
3. The relative page number is converted to the physical page number in the table space
4. The 64-bit hash value modulo m gives the hash anchor
   - m is a prime number from 17 to 53 calculated from the page size and the average number of rows on the page

The hash function has 3 components to its input: the key value, the number of fixed hash data pages and the number of anchors per page. The hash function generates a hash value, which is then divided modulo the number of fixed hash data pages to yield a relative page number. The relative page number is converted to the physical page number within the table space, accounting for header pages, space map pages, dictionary pages and system pages. The same hash value is divided modulo the number of anchors per page to generate the exact ID for this key value within the page. Together, the physical page number and the ID constitute the RID, which is the physical location of the row that matches the input key value. The hash function is a proven function: no skews regardless of data type. This raises the question: what are fixed hash data pages and anchors per page?
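The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the actual DB2 hash function is not given in this presentation, so BLAKE2b truncated to 64 bits stands in for it, and the key, the page count `n_pages` and the anchor count `m_anchors` are hypothetical values. Step 3 (relative to physical page number) is omitted because it depends on the table space layout.

```python
import hashlib


def hash64(key: bytes) -> int:
    """Stand-in 64-bit hash (assumption): NOT the DB2 hash function."""
    return int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "big")


def locate(key: bytes, n_pages: int, m_anchors: int) -> tuple[int, int]:
    """Return (relative page number, anchor) for a key.

    n_pages   -- number of fixed hash data pages (a prime, per the slide)
    m_anchors -- anchors per page (a prime from 17 to 53, per the slide)
    """
    h = hash64(key)                      # step 1: key bytes -> 64-bit value
    relative_page = h % n_pages          # step 2: hash value modulo n
    anchor = h % m_anchors               # step 4: hash value modulo m
    return relative_page, anchor


# Hypothetical sizing: 1,000,003 pages and 29 anchors per page (both prime).
page, anchor = locate(b"ITEM-0001", n_pages=1_000_003, m_anchors=29)
print(f"relative page {page}, anchor {anchor}")
```

The same key always maps to the same page and anchor, which is what lets a fetch go straight to the home page without an index.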

8 Fixed Hash Space
- Fixed hash space is specified with CREATE TABLE ORGANIZE BY HASH UNIQUE (ITEMNO) HASH SPACE 2G
- DB2 REORG can help manage the fixed hash space size
- Average row size and page size determine the number of anchors reserved per page
- The hash function generates a home page + anchor only in the fixed hash space; this value is used to insert/load/fetch rows
- Collisions are chained together on the page, but we can run out of space on the page or run out of IDs (max 255) on a page
- Diagram: Fixed Hash Space (2G) made up of pages, each with anchors

A DB2 table organized by hash has a fixed area containing a fixed number of data pages. The table space format is like any other UTS table space format in terms of header pages, space map pages, dictionary pages and system pages. The system pages and dictionary pages, however, are pre-allocated at the start of each partition, so that there is a fixed number of data pages per partition. The number of data pages in the fixed area depends on the user-specified hash space on the CREATE TABLE statement; subsequently, if desired, DB2 can manage the size on REORG. Each data page has a fixed number of anchors reserved for the hash algorithm. Collisions on an anchor use a chain of IDs, up to the usual maximum of 255.

9 Hash Overflow Space
- Rows that do not fit in the target page upon insert/load are placed after the end of the fixed hash space, in the hash overflow space
- Chains that overflow carry an indication that they have overflowed
- If the fixed hash space is appropriately sized, the hash overflow space should be small
- How do we find rows that are in the hash overflow area?
- Diagram: Fixed Hash Space (2G) with anchors on each page, followed by Hash Overflow Space pages (allocated as needed)

If we run out of space on the page or exceed the maximum of 255 ID slots, the row is placed in a dynamically allocated hash overflow area. Initially the rows are placed sequentially in the hash overflow area; over time, they follow the normal space search process.

10 Hash Overflow Space and Index
- Rows stored in the hash overflow area have index entries in the hash overflow index
- If the hash space is specified according to the size of the inserted/loaded data:
  - The hash overflow area will contain very few rows
  - Consequently the hash overflow index will have very few entries
- Some overflow rows are expected even on a well-sized table
- What if we want to find rows by a different access path?
- Diagram: Fixed Hash Space (2G) with anchors on each page, Hash Overflow Space pages (allocated as needed) and the Hash Overflow Index

Rows in the hash overflow area have index entries in the hash overflow index. The hash overflow index will be small if the table is sized correctly. Fetching rows from the hash overflow area through the hash access path is less efficient: DB2 visits the page in the fixed area and, not finding the matching row there, has to go to the hash overflow index and then to the matching row. It is therefore strongly recommended that hash organized tables be sized appropriately to maintain peak performance.

11 Other Access Paths on Hash Organized Tables
- SELECT WHERE VENDOR_ID IN ( ) (Index Access, through the Vendor_id index)
- SELECT WHERE ITEMNO = (Hash Access, through the fixed hash space, overflow index and overflow space)
- SELECT WHERE QTY < 10 (Table Space Scan)
- Hash Access does not support range scans
- Range scan queries on secondary indexes do not perform well since the table is not clustered by key

Tables organized by hash support all access paths. They can have other indexes, though those indexes cannot be clustering indexes. Caveat: because the table is not clustered, range scans using an index do not perform well, since DB2 cannot take advantage of page prefetch.

12 Examples Where Hash Access is Used
Simple hash access:
    SELECT * FROM T1 WHERE HASHCOL = :HV
    SELECT * FROM T1 WHERE HASHCOL IN (SELECT C1 FROM T2)
List prefetch/multi-index access with hash access:
    SELECT * FROM T1 WHERE (HASHCOL = :HV1 OR INDEXCOL = :HV2)
Referential integrity:
- Parent key check on INSERT/UPDATE/LOAD/CHECK DATA uses hash access if available
- Cascade Delete/Set Null do not use hash access even if available
Restrictions:
- No parallelism with hash access
- No hash access with star join or hybrid join
New ACCESS_TYPE values in PLAN_TABLE:
- H: full matching hash access
- HN*: IN-list hash access
- HL*: IN-list hash access with in-memory table
- MH: multiple-index access
- ACCESS_NAME: hash overflow index name

13 CREATE TABLE and the New Organization-clause
Syntax elements shown: CREATE TABLE, IN database-name.table-space-name, partitioning-clause, organization-clause
Example:
    CREATE TABLE ... PARTITION BY RANGE ... PARTITION 1 ... PARTITION 5 ... HASH SPACE 1G
    ORGANIZE BY HASH UNIQUE (lastname, firstname) HASH SPACE 2G
- Hash organization requires UTS
- If the table space is not specified with the IN clause, DB2 will create one implicitly

New parameters have been added to the CREATE TABLE statement to tell DB2 that the table is to be organized using the hashing algorithm. Note that the UNIQUE parameter MUST be used for hash tables. The HASH SPACE size will need to be bigger than the current table space size, because some pages may never be used due to a limited key range. It may need to be as much as 50% bigger.

14 Organization-clause
Syntax:
    ORGANIZE BY HASH UNIQUE ( column-name, ... ) HASH SPACE integer K | M | G
    (default: HASH SPACE 64 M)
Example:
    CREATE TABLE ... PARTITION BY RANGE ... PARTITION 1 ... PARTITION 5 ... HASH SPACE 1G
    ORGANIZE BY HASH UNIQUE (lastname, firstname) HASH SPACE 2G
Hash key:
- Enforces uniqueness
- Is not updatable; a Delete/Insert is needed
- Can enforce a referential constraint
Hash space:
- Specifies the size of the fixed hash space

15 Hash Access and Partitioning
- Diagram: a Partition By Growth table (one hash range spanning PBG parts 1-n) beside a Partition By Range table (the key value selects part 1 to part n, each with its own hash range)
- The partitioning key must be a subset of the hash key

A hash table can only exist in a partitioned table space. This table space must be a UTS; it can be either Partitioned By Range or Partitioned By Growth, and it can be explicitly or implicitly created.

PARTITION BY RANGE flavor
- Partitioning key columns and organize-by-hash columns must be the same
- Organize by hash can have extra columns
- DB2 determines how many pages are needed within each part and allocates the hash space at CREATE TABLE
- The part number is calculated based on the partitioning key
- Row location within the part is determined by the hash calculation on the key

PBG flavor
- DB2 determines how many parts and pages are needed and allocates them all at CREATE TABLE
- Rows are inserted/loaded in the part and page according to the hash calculation
- They are scattered all over the table space

16 Hash Organization With Respect to Partitioning
- PBG: one fixed hash area spanning parts 1 to 7, with one hash overflow index
- PBR-UTS: a fixed hash area per part (parts 1 to 4), each part with its own hash overflow index

This picture shows that the fixed hash area in a PBG hash organized table covers multiple parts, whereas the fixed hash area in a PBR-UTS is part of each partition, as is the hash overflow area. The hash overflow index is partitioned.

17 ALTER TABLE Change Hash Organization
Syntax elements shown: ALTER TABLE with ADD (column-definition, add-partition, partitioning-clause, organization-clause, PRIMARY KEY, CLONE), ALTER (column-alteration, partition-alteration, ORGANIZATION SET HASH SPACE integer K | M | G) and DROP (PRIMARY KEY, CLONE, ORGANIZATION)
- Alter the fixed hash space: table is accessible; changed at next REORG
- Drop hash organization: table in REORP; changed at next REORG

A table that is organized by hash may be ALTERed in the following ways:
1. Change the fixed hash space. It is changed at partition level for PBR-UTS and for the whole table in the PBG case. The table remains fully accessible, but the new hash space will be seen only after the next REORG.
2. Drop the hash organization altogether. In this case, the table is placed in REORP and the data is not accessible at all. The next REORG will actually drop the hash organization.

18 ALTER TABLE ADD HASH
Syntax elements shown: ALTER TABLE with ADD (column-definition, add-partition, partitioning-clause, organization-clause), ALTER (column-alteration, partition-alteration, ORGANIZATION SET HASH SPACE integer K | M | G) and DROP (PRIMARY KEY, CLONE, ORGANIZATION)
organization-clause: ORGANIZE BY HASH UNIQUE ( column-name, ... ) HASH SPACE integer K | M | G (default: HASH SPACE 64 M)
- Alter the table to make it organized by hash
- Table is accessible (more later)
- Changed at next REORG

A non-hash table can be converted to hash organization by using ALTER TABLE ADD organization-clause. However, one needs to consider many points before converting to hash organized tables. The conversion happens at the next REORG, but between the ALTER ADD hash and the REORG, the table is in a semi-restrictive state.

19 ALTER TABLE ADD ORGANIZE BY HASH
ALTER is IMMEDIATE to enforce uniqueness
- Table in advisory reorg state
- Overflow index in rebuild pending state
- Inserts not allowed
- Can delete/update
Rebuild Index
- Could give duplicate keys
- Builds a large index
- Allows inserts after rebuild
Reorg table space
- Part level REORG is not allowed
- Could give duplicates
- Changes the organization to hash
- Index is most likely really small
Diagram: non-hash table space (parts 1 to 5), then ALTER ADD HASH (hash overflow index created, but in rebuild pending), then REORG (fixed hash area across parts 1 to 7 plus the hash overflow index)

The list of column names defines the hash key that is used to determine where a row will be placed.
- Each column-name must be an unqualified name that identifies a column of the table (SQLSTATE 42703, SQLCODE -205).
- The same column must not be identified more than once (SQLSTATE 42711, SQLCODE -612).
- The number of identified columns must not exceed 64, and the sum of their length attributes must not exceed 255 minus the number of columns that allow null values.
- An identified column cannot be a LOB, DECFLOAT, or XML data type, or a distinct type that is based on one of these data types (SQLSTATE 42962, SQLCODE -350).
- If the table is defined as partition by range, the list of column names must be identical to the list of column names in the partition-expression for the table, or contain all of the column names in the partition-expression with additional columns, and the column names must be specified in the same order as in the partition-expression for the table.
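The hash key rules quoted above lend themselves to a pre-check before issuing the DDL. The following is a minimal sketch, not a DB2 facility: the `Column` class and `check_hash_key` function are hypothetical helpers, and the type names BLOB, CLOB and DBCLOB are assumed to be the LOB types meant by the rule. Distinct types and the partition-by-range ordering rule are not checked.

```python
from dataclasses import dataclass

# Types that may not appear in a hash key (LOB types assumed to be BLOB/CLOB/DBCLOB).
FORBIDDEN_TYPES = {"BLOB", "CLOB", "DBCLOB", "DECFLOAT", "XML"}


@dataclass
class Column:
    name: str
    data_type: str
    length: int          # length attribute of the column
    nullable: bool = False


def check_hash_key(columns: list[Column]) -> list[str]:
    """Return a list of rule violations; an empty list means the key passes."""
    errors = []
    names = [c.name.upper() for c in columns]
    if len(set(names)) != len(names):
        errors.append("column identified more than once (SQLCODE -612)")
    if len(columns) > 64:
        errors.append("more than 64 columns in the hash key")
    nullable_count = sum(1 for c in columns if c.nullable)
    if sum(c.length for c in columns) > 255 - nullable_count:
        errors.append("sum of lengths exceeds 255 minus number of nullable columns")
    for c in columns:
        if c.data_type.upper() in FORBIDDEN_TYPES:
            errors.append(f"{c.name}: {c.data_type} not allowed (SQLCODE -350)")
    return errors


key = [Column("LASTNAME", "VARCHAR", 40), Column("FIRSTNAME", "VARCHAR", 40)]
print(check_hash_key(key))
```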

20 ALTER TABLE DROP ORGANIZATION
- Removes hash organization
- Table is placed in REORP
- Entire table becomes inaccessible until after REORG
- Implicitly created hash overflow index is dropped immediately
- Table space MUST be REORGed immediately
- There will be no clustering index, so consider creating one
- Part level REORG is not allowed after DROP ORGANIZATION
Diagram: hash table space (fixed hash area across parts 1 to 7 plus hash overflow index), then ALTER DROP ORGANIZATION (index dropped, table in REORP), then REORG (non-hash table space, parts 1 to 5)

After the table is altered to remove hash organization, the entire table space becomes unavailable until a REORG is run. This may be a major outage.

21 ALTER SEQUENCE
- Simple table space (single table) or segmented table space: REORG to PBG table space
- Classic partitioned table space: REORG to PBR-UTS table space
- PBG table space: ADD HASH + REORG gives a hash table space; DROP HASH + REORG goes back
- PBR-UTS table space: ADD HASH + REORG gives a PBR-UTS hash table space; DROP HASH + REORG goes back

ALTER TABLE ADD HASH requires a Universal Table Space format.

22 Detail: Implicitly Created PBG Table Space
(IN clause not used on CREATE TABLE)
    CREATE TABLESPACE generated-tsname-name
      IN generated-database-name
      USING STOGROUP SYSDFLT PRIQTY -1 SECQTY -1
      DSSIZE 4G
      MAXPARTITIONS max(256, calculated)
      NUMPARTS calculated
      FREEPAGE 0
      BUFFERPOOL selected based on row size and optimum rows/page
Notes:
- DB2 will choose a BP larger than needed for the max row.
- DB2 will give a negative SQLCODE if the BP is not available.
- DB2 manuals will give guidance on the expected BP.
- PCTFREE will be the default value; it is used on REORG.

If no table space is specified on CREATE TABLE and no partition-by-range clause is specified, a Partition By Growth (PBG) table space will be implicitly created. The CREATE TABLE process will calculate the number of partitions (NUMPARTS) needed for preallocation, and MAXPARTITIONS, based on the hash space value specified in the organization-clause on CREATE TABLE.

23 Detail: Implicitly Created PBR-UTS Table Space
(IN clause not used on CREATE TABLE)
    CREATE TABLESPACE generated-tsname-name
      IN generated-database-name
      USING STOGROUP SYSDFLT PRIQTY -1 SECQTY -1
      DEFINE YES or depends on IMPDSDEF
      DSSIZE 4G or as specified on the partitioning clause of CREATE TABLE
      NUMPARTS same as the number of partitions
      BUFFERPOOL selected based on row size and optimum rows/page
      FREEPAGE 0
Notes:
- Each partition gets the hash space from the table level unless specified at partition level.
- DEFINE YES/NO: pay attention to the ZPARM value of IMPDSDEF. DEFINE YES is desirable so that there are no unexpected delays on first insert.

If no table space is specified on CREATE TABLE and the partitioning-clause is specified, a Partition By Range (PBR-UTS) table space will be implicitly created. The CREATE TABLE process will calculate the number of pages to preallocate for each partition, based on the HASH SPACE value of the organization-clause on CREATE TABLE or the HASH SPACE value on the partition-expression clause if specified. The DEFINE attribute for an implicit hash access table space takes its value from the IMPDSDEF zparm. Because DB2 needs to ensure that the entire fixed hash space is available before the first access, and because the allocation may take some time, it is strongly recommended that DEFINE YES be used for implicitly created table spaces. A resource-not-available error or a delay on first access is not desirable. If it is not possible to change the IMPDSDEF zparm for other reasons, then it is recommended that explicit table spaces be used for hash access tables.

24 Detail: User Created Table Space
(IN clause is used on CREATE TABLE)
    CREATE TABLESPACE table-space-name
      IN database-name
      USING STOGROUP stogroup-name
      PRIQTY integer SECQTY integer   (DB2 uses its own calculation for allocation)
      DEFINE YES | NO                 (Honored. However, YES is strongly recommended)
      DSSIZE integer                  (Honored; DB2 validates and may generate -SQLCODE for a partitioned by range table)
      MAXPARTITIONS integer           (DB2 validates and may generate -SQLCODE; only valid for PBG)
      NUMPARTS integer                (DB2 adds parts as needed. Honored for range-partitioned tables)
      BUFFERPOOL bpname               (DB2 validates and may generate -SQLCODE)
      LOCKSIZE PAGE | ROW             (Honored)
      MAXROWS integer                 (Honored but not recommended)
      SEGSIZE integer                 (Honored)
      PCTFREE integer                 (Honored)
      FREEPAGE integer                (Not used)
Notes:
- Strongly recommend DEFINE YES
- DSSIZE: -SQLCODE on CREATE TABLE...IN...
- BUFFERPOOL: -SQLCODE on CREATE TABLE...IN...

This is the explicit CREATE TABLESPACE statement. You can choose to create either a Partitioned By Growth or a Partitioned By Range table space. If the explicit table space is not a PBR-UTS or PBG, DB2 will return SQLCODE -628. Only table-based partitioning supports hash. DEFINE: the recommended setting is DEFINE YES, to ensure that the fixed hash space is allocated successfully prior to first access.

25 Detail: Implicitly Created Hash Overflow Index
(Regardless of IN clause on CREATE TABLE)
    CREATE INDEX generated-name
      ON table-name (hash key column list)
      PARTITIONED      (only if range-partitioned table space)
      NOT PADDED
      UNIQUE
      USING            (stogroup from first logical part)
      GBPCACHE         (from first logical part of table space)
      CLOSE            (from table space)
      DEFER YES | NO   (DEFER NO with CREATE TABLE ORGANIZE BY HASH; DEFER YES with ALTER TABLE ADD ORGANIZE BY HASH)
      PIECESIZE 4G
      PCTFREE 0
      FREEPAGE 0
Note:
- The hash overflow index cannot be explicitly created or dropped
- DB2 uses DEFER YES on ALTER TABLE ADD HASH because it is not desirable to build the index at the time of ALTER

You cannot issue an explicit CREATE INDEX for the hash overflow index; DB2 creates it implicitly with the DDL shown. The hash key column list is taken from the list supplied in the organization-clause of the CREATE TABLE statement. Users cannot drop the hash overflow index.

26 Deciding to Use Hash Organization
Before choosing hash organization for your table, consider whether:
- The table has a unique key
- The table is primarily used for index probes
- The lookups are random
  - For an existing table, verify using IFCID 199 that the lookups are truly random (more on this on the next slide)
- The table is relatively static in size
- The table is relatively less volatile
  - Increased insert/update/delete activity may result in more overflow rows, causing performance degradation
- The row sizes are not drastically different
- There are multiple rows per page (ideally 20 and above)
- More often than not, the requested row is found in the table
- Deeper indexes show more marked improvement with hash
- 20% more space is not a problem

Before you choose hash organization for your table, it is important to consider these factors for suitability.

27 Hash Organization and Key Range Scans
Evaluate the access patterns of your current applications.
- Key range scan (SELECT WHERE EMPID > 199 AND EMPID < 300)
  - A non-hash table with a clustering index has the benefit of index look-aside during a key range scan
  - Hash table data is not clustered, therefore the key range scan results in more I/O, which means increased CPU time and elapsed time
- Exact key lookup (SELECT WHERE EMPID = 256)
  - This query is excellent for hash access
  - However, if your applications at times do exact key lookups within the same key range (e.g. specific time slots for student class registrations by student ID ranges), a clustering index will show better performance because pages will likely be in the buffer pool
  - Perhaps a larger buffer pool will help
- IN list is OK, e.g.:
    SELECT WHERE EMPID IN (256, 389, 550)
    SELECT WHERE EMPID IN (SELECT EMPID FROM DEPT_TABLE WHERE DEPTNO = M55 )
- Key range scans may be used internally in join queries
- Inspect the EXPLAIN output of all queries to determine the suitability of hash organization

28 Estimating Hash Space
Diagram: PBG fixed hash area across parts 1 to 7, plus the hash overflow index
User challenge: specify the hash space on CREATE TABLE as close to the final data size as possible
DB2 challenge: choose the following for best performance (small overflow area):
- The number of hash data pages for the fixed hash space
- The number of anchors per page
These are chosen at:
- CREATE TABLE / ALTER TABLE ADD HASH
- REORG AUTOESTSPACE(YES)
Together: we can converge on the best hash distribution, such that we get close to 1.1 page I/Os per FETCH

A large number of overflow rows degrades performance. To read an overflow row, DB2 potentially does three I/Os instead of one, because it searches the hash home page first, then accesses the index, and then the hash overflow page. It is important to make a good estimate for the hash space so that DB2 can allocate the appropriate fixed hash space and keep hash overflow rows to a minimum.
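To see how the overflow rate relates to the 1.1 target, here is a small worked calculation. It is a simplification under stated assumptions: every page visit is counted as one I/O (no buffer pool hits), a row in its home page costs one I/O, and an overflow row costs exactly the three I/Os mentioned above. The function name is hypothetical.

```python
def expected_page_ios(overflow_fraction: float,
                      home_cost: float = 1.0,
                      overflow_cost: float = 3.0) -> float:
    """Average page I/Os per fetch for a given fraction of overflow rows."""
    return (1.0 - overflow_fraction) * home_cost + overflow_fraction * overflow_cost


for pct in (0, 5, 10, 20):
    print(f"{pct:>2}% overflow rows: {expected_page_ios(pct / 100):.2f} I/Os per fetch")
```

Under these assumptions, about 5% of rows in overflow corresponds to the 1.1 figure, and each additional 5% adds another 0.1 I/O per fetch.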

29 Fixed Size Hash Area Sizing
[Table: extra space needed in the Fixed Size Hash Area, by number of rows that fit in a page and by overflow percentage. Source: Bob Lyle's Statistical Analysis Table.]

If 20 rows fit in a page and you want approximately 5% of the rows to be hash overflows, you would need a hash space 1.14 times (14 percent) larger than your data. Note that more space will actually be used, because 5% of the rows would overflow, so you would actually have approximately 19% overhead for using the hash space. These numbers are estimates.
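The arithmetic in that example can be written out as follows. This is a sketch only: the 1.14 factor and the 5% overflow figure are the ones quoted for 20 rows per page, the 10 GiB data size is a hypothetical input, and the function name is made up for illustration.

```python
def hash_space_overhead(data_bytes: float,
                        fixed_factor: float = 1.14,
                        overflow_fraction: float = 0.05):
    """Return (fixed hash space, overflow space, total overhead fraction)."""
    fixed_space = data_bytes * fixed_factor          # HASH SPACE to request
    overflow_space = data_bytes * overflow_fraction  # rows landing in overflow
    overhead = (fixed_space + overflow_space) / data_bytes - 1.0
    return fixed_space, overflow_space, overhead


GIB = 2 ** 30
fixed, overflow, overhead = hash_space_overhead(10 * GIB)
print(f"fixed hash space: {fixed / GIB:.1f} GiB")
print(f"overflow space:   {overflow / GIB:.1f} GiB")
print(f"total overhead:   {overhead:.0%}")
```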

30 Hash Distribution
[Chart: expected results, percentage of pages (0.0% to 10.0%) against the number of rows hashed to a page]
Expected results:
- Room for 20 rows/page
- 20M rows into 1M pages
- Expect 20 rows/page on average
There are statistical variations in the random process of hash distribution. Poisson distribution:

$f(k;\lambda) = \frac{\lambda^{k} e^{-\lambda}}{k!}$, where $\lambda$ = expected number and $k$ = actual number

- Only 9% of pages would get exactly 20
- Over 4% would get 25 (meaning 5 were overflows)
- About 56% of pages would have 20 or fewer entries
- With the above distribution, about 8.9% of rows overflow
- That 8.9% of rows will be stored in the hash overflow area
- REORG AUTOESTSPACE(YES) can reduce overflows

When the table space is properly sized, the distribution of rows per page follows a roughly bell-shaped curve (converging to a Poisson distribution when the number of rows is very high), with the largest percentage of pages containing around 20 rows per page. A small percentage of pages exceed their capacity, resulting in rows in the overflow area. By the same token, a small percentage of pages in the same table may have very few rows.
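The percentages on this slide can be reproduced from the Poisson formula with a short calculation. This is a sketch assuming an ideal uniform hash, an average of 20 rows per page and room for exactly 20 rows per page; the function names are illustrative, and the sums are truncated at 200 rows per page, beyond which the probabilities are negligible.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(exactly k rows hash to a page) when lam rows are expected."""
    # Computed in log space so that large k does not overflow.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def overflow_fraction(capacity: int, lam: float, k_max: int = 200) -> float:
    """Expected fraction of all rows that do not fit in their home page."""
    excess = sum((k - capacity) * poisson_pmf(k, lam)
                 for k in range(capacity + 1, k_max + 1))
    return excess / lam


lam, capacity = 20.0, 20
print(f"pages with exactly 20 rows: {poisson_pmf(20, lam):.1%}")
print(f"pages with exactly 25 rows: {poisson_pmf(25, lam):.1%}")
print(f"pages with 20 or fewer:     {sum(poisson_pmf(k, lam) for k in range(21)):.1%}")
print(f"rows that overflow:         {overflow_fraction(capacity, lam):.1%}")
```

Raising the average capacity above the expected rows per page (that is, allocating extra hash space) is what pushes the overflow fraction down from this baseline.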

31 How Good is the DB2 Hash Distribution?
- The hash function used by DB2 is very good and not prone to noticeable skew
- A more complex function is used to provide good data distribution
- The function is not sensitive to data type or patterns in data
- Compared to theoretical random number distributions, the results from test data map the bell curve very closely
Watson Research's test data:
- Approx 25 million keys, 1.25 million buckets
- Ascending integers (0, 1, 2, ...): 20 million keys, 1 million buckets
- Ascending even integers (0, 2, 4, ...): 20 million keys, 1 million buckets
- Multiples of 64 (0, 64, 128, ...): 20 million keys, 1 million buckets
- 192,718 most common English words, 9,637 buckets: not as close because the sample is too small

32 When the Hash Algorithm is Not so Good
When the page size is too small
- Very few rows fit on a page
- This can happen on user-created table spaces when the average row size is not estimated correctly
- Implicitly created table spaces choose the appropriate page size
- Much larger extra space is needed to perform well
- If this is the case, consider going to a larger page size
Wide range of row sizes
- Normally small rows, but occasionally one or two rows could fill a page, causing many rows to overflow
- Hash works well with somewhat uniformly sized rows

33 REORG: Why REORG?
- To adjust the fixed hash space
- To collapse update overflow rows
- To collapse hash overflow rows
- With pseudo-deleted data rows in V10, overflow rows can increase over time
- Overflow rows progressively degrade the 1.1 access that is our target
- The REORG frequency requirement is no higher than for non-hash tables

Let us say we have a well-sized hash table. Why should we need to REORG? Here are the reasons.

34 REORG New Keyword: AUTOESTSPACE(YES | NO)
AUTOESTSPACE(YES)
- DB2 uses RTS values to determine the new size of the hash space: SYSTABLESPACESTATS.TOTALROWS, SYSTABLESPACESTATS.DATASIZE and SYSINDEXSPACESTATS.TOTALENTRIES
- Default is YES
AUTOESTSPACE(NO)
- DB2 uses the existing catalog values of HASH SPACE to determine the size of the hash space
- If the HASH SPACE size has not been altered, then REORG may NOT do anything
Even after REORG, there will be some overflow rows.

If corrective action is needed, one of two actions can be taken. Either REORG can be run with AUTOESTSPACE(YES) to let DB2 adjust the space, or the ALTER TABLE HASH SPACE statement can be used to change the hash space at table level and at partition level. The ALTER statement should be followed by a REORG with AUTOESTSPACE(NO) to explicitly set the space.

35 What is influencing the hash function? Hashdatapages and Anchors
Diagram summary:
- CREATE TABLE with HASH and ALTER TABLE ADD HASH record HASH SPACE and BPOOL in SYSTABLESPACE / SYSTABLEPART
- LOAD/INSERT/UPDATE/DELETE maintain DATASIZE and TOTALROWS in the real time statistics (SYSTABLESPACESTATS)
- The Statistical Analysis Table comes from DB2 internal studies by Bob Lyle
- When CREATE TABLE continues, or on REORG AUTOESTSPACE(NO) of a table in AREOR (non-hash table), the catalog values and the Statistical Analysis Table yield Hashdatapages and Anchors (hash table)
- On REORG AUTOESTSPACE(YES), which is the default, the RTS values and the Statistical Analysis Table yield Hashdatapages and Anchors (hash table)

On CREATE TABLE with hash organization, DB2 uses the HASH SPACE from the DDL and updates the catalog tables. From the catalog information (SYSTABLES and SYSCOLUMNS are also needed when DATASIZE is not used) and the Statistical Analysis Table, DB2 arrives at suitable values for Hashdatapages and Anchors, which are used in generating the hash values for keys. On ALTER TABLE ADD HASH, the catalog tables are updated, but the table remains non-hash with advisory reorg status. When a table in AREOR (for ALTER ADD HASH) or an already hash organized table is REORGed with AUTOESTSPACE(NO), DB2 uses the catalog values and the Statistical Analysis Table to arrive at suitable values for Hashdatapages and Anchors. When such a table is REORGed with AUTOESTSPACE(YES), DB2 uses the RTS values and the Statistical Analysis Table instead.

36 Converting an Existing Table to a Hash Organized Table
1. Is the table a good candidate? (Important evaluation step)
   - Unique key
   - Random access
   - Multiple rows per page
   - Clustering not required
2. ALTER ADD organization-clause
3. REORG AUTOESTSPACE YES | NO
4. Rebind applications that use fully qualified equal predicates on the hash key, so that they pick hash access
5. Follow-up:
   - Check whether hash access was chosen, using EXPLAIN
   - Check RTS for appropriate space specifications
   - Monitor the index last-used RTS information to see if an index can be dropped
   - Indexes used for range queries cannot be dropped

37 Checking Hash Space Usage
Relevant Real Time Statistics (RTS) values:
1. SYSTABLESPACESTATS.TOTALROWS: actual number of rows in the table
2. SYSTABLESPACESTATS.DATASIZE: total number of bytes used for rows
3. SYSINDEXSPACESTATS.TOTALENTRIES: number of overflow records with keys in the overflow index
4. SYSTABLESPACESTATS.REORGHASHACCESS: number of times data is read using hash access since CREATE or the last REORG
What to look for:
- HASH SPACE (DDL) should be in line with DATASIZE (RTS)
- TOTALENTRIES should be in the range of 0-10% of TOTALROWS; the less the better
- REORGHASHACCESS should be non-zero, indicating that applications are using hash access
What to do:
- ALTER the HASH SPACE and REORG, or
- Let DB2 calculate the space automatically upon REORG
- DROP HASH if not used

During insert, update and delete operations, DB2 RTS maintains hash access related column values in the SYSTABLESPACESTATS and SYSINDEXSPACESTATS catalog tables. These values are used by the DB2 access path selection process to determine whether hash access is suitable. Also RTS field to count update overflow rows. Totalentries update overflow rows = hash overflow rows.
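The "what to look for" checks can be turned into a small routine. This sketch assumes the three RTS values have already been fetched from SYSTABLESPACESTATS and SYSINDEXSPACESTATS by some other means; the function name and the wording of the findings are illustrative, and the 10% threshold is the upper end of the range quoted above.

```python
def assess_hash_table(total_rows: int,
                      total_entries: int,
                      reorg_hash_access: int,
                      max_overflow_pct: float = 10.0):
    """Return (overflow percentage, list of suggested follow-ups)."""
    overflow_pct = 100.0 * total_entries / total_rows if total_rows else 0.0
    findings = []
    if overflow_pct > max_overflow_pct:
        findings.append("overflow above target: ALTER the HASH SPACE and REORG, "
                        "or REORG with AUTOESTSPACE(YES)")
    if reorg_hash_access == 0:
        findings.append("hash access not used since CREATE or last REORG: "
                        "check EXPLAIN and consider DROP ORGANIZATION")
    return overflow_pct, findings


# Hypothetical RTS values for one table space.
pct, findings = assess_hash_table(total_rows=1_000_000,
                                  total_entries=50_000,
                                  reorg_hash_access=1_234)
print(f"overflow rows: {pct:.1f}% of TOTALROWS")
print(findings or "no action suggested")
```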

38 Using Real Time Statistics
[Diagram: table space layout showing the fixed hash space (pages with hash anchors), the overflow space (pages) and the overflow index]
#1 HASH SPACE (from DDL): compare to DATASIZE (from RTS)
#2 TOTALROWS (from RTS) and DATASIZE (from RTS): apply to the entire table space
#3 TOTALENTRIES (from RTS, index): compare to TOTALROWS (from RTS, table space); should be 0-10% of TOTALROWS
Since hash access is performance sensitive, it is important to look at the space usage and keep it tuned by ALTERing the HASH SPACE and REORGing. In addition to the existing space usage indicators, the values in the RTS tables SYSTABLESPACESTATS and SYSINDEXSPACESTATS, in the fields TOTALENTRIES, TOTALROWS, DATASIZE and the new fields REORGHASHACCESS and REORGINDEXACCESS, should be checked frequently so that corrective action can be taken.

39 LOAD Utility and Hash Tables
- Large volumes of unsorted input data result in markedly slow LOAD performance for hash tables
- Sorting in key sequence is not useful, because the rows are loaded based on hash value calculations
- Steps used to calculate the sort key are available upon request
- DB2 is working on improving LOAD performance. Until then, these options are available:
  1. Load to a non-hash table, ALTER to HASH, REORG. REORG performance is good
  2. Generate the hash key value and presort
- Similar results are seen with mass inserts
Caution about slow LOAD performance of unsorted data.
Basic Steps for Calculating Pre-sort Value for Loading into Hash Tables (Preview)
- KEY: Construct the non-padded hash key and calculate its length
- HASH ALGORITHM: Calculate the 8 byte hash value using the detailed algorithm available upon request
- MODULO: Compute the pre-sort value: divide the 8 byte hash value by the HASHDATAPAGES value. The remainder is the pre-sort value. Note the details regarding HASHDATAPAGES for PBR-UTS and PBG:
  - For PBG, the above value is sufficient for pre-sort
  - For PBR-UTS, you need to calculate the partition number and pre-pend it to the above pre-sort value
- SORT: Pre-pend the complete pre-sort value to the data row and sort
Overflow rows are still possible, therefore the rows may not be inserted in a strictly sequential manner.
Follow the steps closely; this is just an overview to show what is involved.
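The KEY, HASH, MODULO and SORT steps above can be sketched as follows. The real 8 byte hash algorithm is not given in the slides (it is "available upon request"), so `stand_in_hash` below is a placeholder built on BLAKE2b and will not reproduce DB2's page assignments. The function names, the `key_of` and `partition_of` callbacks, and the use of a tuple as the pre-pended sort value are all assumptions made for illustration.

```python
import hashlib
from typing import Optional


def stand_in_hash(key: bytes) -> int:
    """Placeholder 8 byte hash. NOT the DB2 algorithm, which is not published here."""
    return int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "big")


def presort_value(key: bytes, hashdatapages: int, partition: Optional[int] = None):
    """MODULO step: the remainder of the hash value divided by HASHDATAPAGES.

    For PBG the remainder alone is the pre-sort value; for PBR-UTS the
    partition number is pre-pended to it.
    """
    page = stand_in_hash(key) % hashdatapages
    return (page,) if partition is None else (partition, page)


def presort_rows(rows, hashdatapages, key_of, partition_of=None):
    """SORT step: order rows by their pre-sort value before loading.

    key_of(row) must return the non-padded hash key as bytes;
    partition_of(row), if given, returns the partition number (PBR-UTS).
    """
    def sort_key(row):
        part = partition_of(row) if partition_of else None
        return presort_value(key_of(row), hashdatapages, part)

    return sorted(rows, key=sort_key)
```

A load job would call `presort_rows(rows, hashdatapages, key_of=...)` with the table's actual HASHDATAPAGES value. As the slide notes, overflow rows are still possible, so even correctly pre-sorted input is not guaranteed to insert strictly sequentially.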

40 Other DB2 Utilities
- CHECK DATA: In addition to the normal check data functions, DB2 check data ensures that the hash chains and the hash overflow indexes are correct.
- CHECK INDEX for hash overflow index: Validates consistency between the hash overflow index and the hash overflow area.
- RECOVER: Supported, with one exception: cannot recover a table space to a point in time before ALTER DROP hash organization.
- REBUILD INDEX for hash overflow index: Scans only the hash overflow area for rows.
Recover
If you alter the organization of your tablespace to hash organization:
- You can recover the tablespace to the current time.
- You can recover the tablespace to a point in time before or after the alter.
- You can recover the tablespace to a point in time before or after the REORG that materialized the hash organization. Recover will place the tablespace in AREOR if the tablespace was recovered to a point before the REORG.
If you alter the size of the hash space in your tablespace:
- You can recover the tablespace to the current time.
- You can recover the tablespace to a point in time before or after the alter.
- You can recover the tablespace to a point in time before or after the REORG that materialized the change in hash space size.
If you drop the hash organization (via ALTER):
- You can recover the tablespace to the current time.
- You can recover the tablespace to a point in time after the alter.
- You cannot recover the tablespace to a point in time before the alter.
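The recovery rules above reduce to a single exception, which a short sketch makes explicit. The `Change` enum and both function names are hypothetical; the logic only restates the three cases listed in the notes and does not model any other RECOVER restriction.

```python
from enum import Enum


class Change(Enum):
    ADD_HASH = "ALTER ADD hash organization"
    ALTER_HASH_SPACE = "ALTER hash space size"
    DROP_HASH = "ALTER DROP hash organization"


def can_recover_to(change: Change, target_before_alter: bool) -> bool:
    """True if point-in-time recovery to the target is allowed per the notes.

    The only case ruled out is a target before an ALTER that dropped
    the hash organization.
    """
    return not (change is Change.DROP_HASH and target_before_alter)


def lands_in_areor(change: Change, target_before_reorg: bool) -> bool:
    """True if RECOVER leaves the tablespace in AREOR per the notes.

    The notes state this only for a tablespace altered to hash
    organization and recovered to a point before the materializing REORG.
    """
    return change is Change.ADD_HASH and target_before_reorg
```

Recovery to the current time is allowed in all three cases, so it is not modelled as a separate input.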

41 DB2 Performance Measurements
Extensive performance measurements have been conducted by the DB2 performance team. Summary:
- Single SELECT shows up to 20-30% CPU and elapsed time improvement, depending on the number of levels of index that are compared
- Single INSERT shows up to 20-30% CPU and elapsed time improvement, depending on the number of levels of index that are compared
- Single UPDATE shows up to 20-30% CPU and elapsed time improvement, depending on the number of levels of index that are compared
- REORG shows up to 50% increase in elapsed time, but it is still in an acceptable range
DB2 strives to improve performance continually.

42 DB2 Hash Performance Concerns Recap
- CREATE TABLE performance can be slow
  - Need to allocate all of the fixed space
  - DB2 looking for improvement
- LOAD performance can be slow
  - Input data is not sorted by hash key
  - DB2 looking for improvement
- Mass insert performance can be slow
  - Input data is not sorted by hash key
  - DB2 looking for improvement
- No clustering available with hash
  - Queries accessing in key sequence will be slower
  - Increase buffer pool perhaps
- Query performance will degrade as space gets over utilized
  - Monitor RTS and REORG as needed
  - Needs monitoring

43 Best Practices for Using Hash Organization
1. Evaluate the applications and workload thoroughly before choosing hash
   - Hash is not for everybody
   - Refer to the Deciding to Use Hash Organization slide earlier
2. Choose the page size carefully
   - Refer to Bob Lyle's Statistical Analysis table
3. Look for an opportunity to drop the primary index
   - If not used for scan
   - Will reduce index maintenance cost
4. Monitor Real Time Statistics regularly to determine REORG necessity
5. Use REORG AUTOESTSPACE(YES)
   - Make use of PCTFREE to influence padding DATASIZE
Deciding to Use Hash Organization:
- The table has a unique key
- The table is primarily used for index probes
- The look ups are random
  - If existing table, verify using IFCID 199 that the look ups are truly random
  - More on this on next page
- The table is relatively static in size
- The table is relatively less volatile
  - Increased insert/update/delete activity may result in more overflow rows, causing performance degradation
  - V10 pseudo deleted rows compound the space usage
- The row sizes are not drastically different
- There are multiple rows per page (ideally


45 Kalpana Shyam / Karelle Cornwell Developer, DB2 for z/os, IBM Corp. kshyam@us.ibm.com / karelle@us.ibm.com Session A10 Hash Access to DB2 Data Faster, Better, Cheaper 45


More information

D B M G Data Base and Data Mining Group of Politecnico di Torino

D B M G Data Base and Data Mining Group of Politecnico di Torino Database Management Data Base and Data Mining Group of tania.cerquitelli@polito.it A.A. 2014-2015 Optimizer operations Operation Evaluation of expressions and conditions Statement transformation Description

More information

DB2 Performance Essentials

DB2 Performance Essentials DB2 Performance Essentials Philip K. Gunning Certified Advanced DB2 Expert Consultant, Lecturer, Author DISCLAIMER This material references numerous hardware and software products by their trade names.

More information

Understanding the Power and Pitfalls of Partitioning In V8, 9 and Beyond

Understanding the Power and Pitfalls of Partitioning In V8, 9 and Beyond Regional Forums The Power and Pitfalls of Partitioning Understanding the Power and Pitfalls of Partitioning In V8, 9 and Beyond Robert Goodman Sr DBA November 10 th, 2008 Session 2 San Ramon, CA Nov 10-11

More information

Indexing. Jan Chomicki University at Buffalo. Jan Chomicki () Indexing 1 / 25

Indexing. Jan Chomicki University at Buffalo. Jan Chomicki () Indexing 1 / 25 Indexing Jan Chomicki University at Buffalo Jan Chomicki () Indexing 1 / 25 Storage hierarchy Cache Main memory Disk Tape Very fast Fast Slower Slow (nanosec) (10 nanosec) (millisec) (sec) Very small Small

More information

Intro to DB CHAPTER 12 INDEXING & HASHING

Intro to DB CHAPTER 12 INDEXING & HASHING Intro to DB CHAPTER 12 INDEXING & HASHING Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing

More information

C Exam code: C Exam name: IBM DB2 11 DBA for z/os. Version 15.0

C Exam code: C Exam name: IBM DB2 11 DBA for z/os. Version 15.0 C2090-312 Number: C2090-312 Passing Score: 800 Time Limit: 120 min File Version: 15.0 http://www.gratisexam.com/ Exam code: C2090-312 Exam name: IBM DB2 11 DBA for z/os Version 15.0 C2090-312 QUESTION

More information

Hash Table and Hashing

Hash Table and Hashing Hash Table and Hashing The tree structures discussed so far assume that we can only work with the input keys by comparing them. No other operation is considered. In practice, it is often true that an input

More information

Evaluation of Relational Operations

Evaluation of Relational Operations Evaluation of Relational Operations Yanlei Diao UMass Amherst March 13 and 15, 2006 Slides Courtesy of R. Ramakrishnan and J. Gehrke 1 Relational Operations We will consider how to implement: Selection

More information

Step 4: Choose file organizations and indexes

Step 4: Choose file organizations and indexes Step 4: Choose file organizations and indexes Asst. Prof. Dr. Kanda Saikaew (krunapon@kku.ac.th) Dept of Computer Engineering Khon Kaen University Overview How to analyze users transactions to determine

More information

Chapter 17 Indexing Structures for Files and Physical Database Design

Chapter 17 Indexing Structures for Files and Physical Database Design Chapter 17 Indexing Structures for Files and Physical Database Design We assume that a file already exists with some primary organization unordered, ordered or hash. The index provides alternate ways to

More information

Hashing. Hashing Procedures

Hashing. Hashing Procedures Hashing Hashing Procedures Let us denote the set of all possible key values (i.e., the universe of keys) used in a dictionary application by U. Suppose an application requires a dictionary in which elements

More information

Chapter 2. DB2 concepts

Chapter 2. DB2 concepts 4960ch02qxd 10/6/2000 7:20 AM Page 37 DB2 concepts Chapter 2 Structured query language 38 DB2 data structures 40 Enforcing business rules 49 DB2 system structures 52 Application processes and transactions

More information

Introducing Hashing. Chapter 21. Copyright 2012 by Pearson Education, Inc. All rights reserved

Introducing Hashing. Chapter 21. Copyright 2012 by Pearson Education, Inc. All rights reserved Introducing Hashing Chapter 21 Contents What Is Hashing? Hash Functions Computing Hash Codes Compressing a Hash Code into an Index for the Hash Table A demo of hashing (after) ARRAY insert hash index =

More information

CSIT5300: Advanced Database Systems

CSIT5300: Advanced Database Systems CSIT5300: Advanced Database Systems L08: B + -trees and Dynamic Hashing Dr. Kenneth LEUNG Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong SAR,

More information

Session: G03 No Magic to Improve DB2 for z/os Application Performance. Marcel Lévy Natixis. May 19, :30 p.m. 02:30 p.m. Platform: DB2 for z/os

Session: G03 No Magic to Improve DB2 for z/os Application Performance. Marcel Lévy Natixis. May 19, :30 p.m. 02:30 p.m. Platform: DB2 for z/os Session: G03 No Magic to Improve DB2 for z/os Application Performance Marcel Lévy Natixis May 19, 2008 01:30 p.m. 02:30 p.m. Platform: DB2 for z/os 1 Agenda Story of a DB2 application migration Measurement

More information

CPS352 Lecture - Indexing

CPS352 Lecture - Indexing Objectives: CPS352 Lecture - Indexing Last revised 2/25/2019 1. To explain motivations and conflicting goals for indexing 2. To explain different types of indexes (ordered versus hashed; clustering versus

More information

Physical Database Design and Performance (Significant Concepts)

Physical Database Design and Performance (Significant Concepts) Physical Database Design and Performance (Significant Concepts) Learning Objectives This topic is intended to introduce the Physical Database Design and Performance. At the end of the topic it is desired

More information

IBM DB2 10 for z/os beta. Reduce costs with improved performance

IBM DB2 10 for z/os beta. Reduce costs with improved performance IBM DB2 10 for z/os beta Reduce costs with improved performance TABLE OF CONTENTS SECTION I INTRODUCTION OF DB2 10 FOR Z/OS... 3 Executive Summary... 3 SECTION II PERFORMANCE AVAILABILITY... 5 Many performance

More information