Partitioning Strategies for Data Vault
  Dani Schnider, Trivadis AG
  DOAG Conference, 23 November 2017
  @dani_schnider DOAG2017
Dani Schnider
  Working for Trivadis in Glattbrugg/Zurich
  Senior Principal Consultant
  Data Warehouse Lead Architect
  Trainer of several courses
  Co-author of the books "Data Warehousing mit Oracle" and "Data Warehouse Blueprints"
  Certified Data Vault Data Modeler
  @dani_schnider, danischnider.wordpress.com
Data Vault Tables
  HUB
    Surrogate Key (PK)
    Business Key(s) (UK)
    Load Date
    Record Source
  SATELLITE
    Foreign Key to Hub (PK)
    Load Date (PK)
    Load End Date (optional)
    Context Attribute 1
    Context Attribute 2
    ...
    Context Attribute n
    Record Source
  LINK
    Surrogate Key (PK)
    Foreign Key Hub 1
    Foreign Key Hub 2
    ...
    Load Date
    Record Source
Source: How to Create a Data Vault Model, https://youtu.be/q1qj_ljeawc
Example Data Vault Model (Subset)
Partitioning by Load Date
Partitioning by Load Date: A Good Strategy?
  SID  LOAD_DATE   LOAD_END_DATE
  1    01.01.2014  13.04.2014
  1    13.04.2014  28.07.2015
  1    28.07.2015  09.02.2017
  1    09.02.2017  31.12.9999
  2    15.03.2014  31.12.9999
  3    26.06.2016  10.03.2017
  3    10.03.2017  13.03.2017
  3    13.03.2017  14.03.2017
  3    14.03.2017  31.12.9999
Partitioning by Load Date: Use Cases
  Master Data (Changing Data)
    Product
    Customer
    Employee
    Beer (general product description)
  Transactional Data (Events)
    Sales Transaction
    Order
    Web Tracking
    Brew (particular brew batch)
Partitioning by Load Date: Overview
  [Diagram of the example model: H_Beer, S_Beer, S_Recipe, L_Beer_Brew, H_Brew, S_Brew_Journal]
Partitioning by Load Date: Hub Example
  CREATE TABLE H_BREW (
      H_Brew_Key     RAW (16)           NOT NULL,
      Brew_No        NUMBER (4)         NOT NULL,
      Load_Date      DATE               NOT NULL,
      Record_Source  VARCHAR2 (4 CHAR)  NOT NULL
  )
  PARTITION BY RANGE (Load_Date)
  INTERVAL (numtoyminterval(1,'MONTH'))
  (PARTITION p_old_data
      VALUES LESS THAN (TO_DATE('01-01-2015','dd-mm-yyyy')));
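To see the interval partitioning at work, the following sketch inserts one row and then lists the partitions of H_BREW. It is only an illustration: the Brew_No, the load date, and the record source 'BREW' are made-up values, and deriving the hub key with STANDARD_HASH (MD5, Oracle 12c or later) is an assumption that the slides do not confirm.

```sql
-- Insert a row with a load date beyond the initial range partition.
-- Assumption: the hub key is an MD5 hash of the business key (16 bytes, fits RAW(16)).
INSERT INTO H_BREW (H_Brew_Key, Brew_No, Load_Date, Record_Source)
VALUES (STANDARD_HASH('42', 'MD5'), 42, DATE '2017-11-23', 'BREW');

-- Oracle creates the monthly partition for November 2017 automatically.
-- INTERVAL = 'YES' marks partitions that were created by the interval clause.
SELECT partition_name, high_value, interval
  FROM user_tab_partitions
 WHERE table_name = 'H_BREW'
 ORDER BY partition_position;
```

The automatically created partition gets a system-generated name (SYS_P...), so later statements are easier to write with PARTITION FOR (date) than with the partition name.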
Partitioning by Load Date: Satellite Example
  CREATE TABLE S_BREW_JOURNAL (
      H_Brew_Key     RAW (16)           NOT NULL,
      Load_Date      DATE               NOT NULL,
      Brew_Date      DATE               NOT NULL,
      Brewer         VARCHAR2 (40),
      ...
      Record_Source  VARCHAR2 (4 CHAR)  NOT NULL
  )
  PARTITION BY RANGE (Load_Date)
  INTERVAL (numtoyminterval(1,'MONTH'))
  (PARTITION p_old_data
      VALUES LESS THAN (TO_DATE('01-01-2015','dd-mm-yyyy')));
Partitioning by Load Date
  Partition Pruning:    yes
  Partition-wise Join:  no
  Rolling History:      yes
  Data Distribution:    yes
  Partition Exchange:   yes
  Restrictions:
    Only for transactional data
    Global indexes should be avoided
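Partition pruning and rolling history from the list above can be sketched with the S_BREW_JOURNAL table from the Satellite example. The date values are arbitrary examples, and the DROP statement assumes that a partition for January 2015 actually exists; otherwise Oracle raises an error.

```sql
-- Partition pruning: the predicate on the partition key Load_Date
-- limits the scan to the monthly partition for March 2017.
SELECT H_Brew_Key, Brew_Date, Brewer
  FROM S_BREW_JOURNAL
 WHERE Load_Date >= DATE '2017-03-01'
   AND Load_Date <  DATE '2017-04-01';

-- Rolling history: drop the partition that contains the given date.
-- PARTITION FOR avoids the system-generated names of interval partitions.
ALTER TABLE S_BREW_JOURNAL DROP PARTITION FOR (DATE '2015-01-15');
```

The initial range partition p_old_data cannot be dropped this way as long as it is the only partition in the range section of an interval-partitioned table.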
Partitioning by Load Date: Global Index Issue
  How to avoid global indexes (PK/UK on Hubs)?

  ALTER TABLE H_BREW ADD CONSTRAINT H_BREW_PK
      PRIMARY KEY (H_Brew_Key)
      RELY DISABLE NOVALIDATE;

  ALTER TABLE H_BREW ADD CONSTRAINT H_BREW_UN
      UNIQUE (Brew_No)
      RELY DISABLE NOVALIDATE;

  ALTER TABLE S_BREW_JOURNAL ADD CONSTRAINT S_BREW_JOURNAL_PK
      PRIMARY KEY (H_Brew_Key, Load_Date)
      RELY DISABLE NOVALIDATE;

  ALTER TABLE S_BREW_JOURNAL ADD CONSTRAINT H_BREW_S_BREW_JOURNAL_FK
      FOREIGN KEY (H_Brew_Key) REFERENCES H_BREW (H_Brew_Key)
      RELY DISABLE NOVALIDATE;
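Because a unique local index must contain the partition key (here Load_Date), a unique index on H_Brew_Key alone would have to be global. With the constraints declared as RELY DISABLE NOVALIDATE, no index is created and uniqueness is not enforced by the database. A possible complement, shown here only as a sketch, is a non-unique local index for key lookups plus a duplicate check in the load process. The index name H_BREW_KEY_IX is hypothetical.

```sql
-- Non-unique local index: one index partition per table partition,
-- so dropping or exchanging a partition does not invalidate it.
CREATE INDEX H_BREW_KEY_IX ON H_BREW (H_Brew_Key) LOCAL;

-- The disabled unique constraint is not checked by Oracle,
-- so the load job has to verify the business key itself.
-- An empty result means that Brew_No is unique.
SELECT Brew_No, COUNT(*) AS cnt
  FROM H_BREW
 GROUP BY Brew_No
HAVING COUNT(*) > 1;
```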
Partitioning by Load End Date
Partitioning by Load End Date: Find Current Versions
  SID  LOAD_DATE   LOAD_END_DATE
  1    01.01.2014  13.04.2014
  1    13.04.2014  28.07.2015
  1    28.07.2015  09.02.2017
  1    09.02.2017  31.12.9999
  2    15.03.2014  31.12.9999
  3    26.06.2016  10.03.2017
  3    10.03.2017  13.03.2017
  3    13.03.2017  14.03.2017
  3    14.03.2017  31.12.9999
Partitioning by Load End Date: Overview
  [Diagram of the example model: H_Beer, L_Beer_Brew, H_Brew, and the Satellites S_Beer, S_Recipe, S_Brew_Journal, each Satellite split into a History Partition and a Current Partition]
Partitioning by Load End Date: Satellite Example
  CREATE TABLE S_RECIPE (
      H_Beer_Key      RAW (16)  NOT NULL,
      Load_Date       DATE      NOT NULL,
      Load_End_Date   DATE      DEFAULT ON NULL TO_DATE('31-12-9999', 'dd-mm-yyyy'),
      Start_Temp      NUMBER (3),
      Mashing_Time_1  NUMBER (3),
      Mashing_Temp_1  NUMBER (3),
      Record_Source   VARCHAR2 (4 CHAR)  NOT NULL
  )
  ENABLE ROW MOVEMENT
  PARTITION BY LIST (Load_End_Date)
  (PARTITION p_current VALUES (TO_DATE('31-12-9999', 'dd-mm-yyyy')),
   PARTITION p_history VALUES (DEFAULT));
Partitioning by Load End Date
  Partition Pruning:    yes
  Partition-wise Join:  no
  Rolling History:      no
  Data Distribution:    yes
  Partition Exchange:   yes
  Restrictions:
    Only for Satellites, requires LOAD_END_DATE
    ENABLE ROW MOVEMENT required
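The two points "Partition Pruning" and "ENABLE ROW MOVEMENT required" can be illustrated with the S_RECIPE table from the Satellite example. This is a minimal sketch; the hub key and the load end date in the UPDATE are made-up example values.

```sql
-- Current versions only: the predicate matches the list value of p_current,
-- so the history partition can be pruned.
SELECT H_Beer_Key, Load_Date, Start_Temp, Mashing_Time_1, Mashing_Temp_1
  FROM S_RECIPE
 WHERE Load_End_Date = TO_DATE('31-12-9999', 'dd-mm-yyyy');

-- End-dating a version changes its partition key, so the row has to move
-- from p_current to p_history. Without ENABLE ROW MOVEMENT this update fails.
UPDATE S_RECIPE
   SET Load_End_Date = DATE '2017-11-23'
 WHERE H_Beer_Key    = HEXTORAW('0123456789ABCDEF0123456789ABCDEF')
   AND Load_End_Date = TO_DATE('31-12-9999', 'dd-mm-yyyy');
```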
Partitioning by Load End Date: Partition Exchange
  Special implementation of Satellite load jobs
  Only useful if most versions are replaced
  Steps (example S_Recipe with Load Table, Current Partition, and History Partition):
    1. Fill the Load Table: insert all new versions and all unchanged versions
    2. Move old versions to the History Partition: insert the replaced rows with a load end date
    3. Exchange the Current Partition with the Load Table
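The three steps above could look roughly like the following statements for S_RECIPE. This is an untested sketch under several assumptions: S_RECIPE_LOAD is a hypothetical load table, STG_RECIPE_NEW is a hypothetical staging table that already contains only the new versions (one row per changed or new H_Beer_Key), the load date is an example literal, and CREATE TABLE ... FOR EXCHANGE requires Oracle 12.2 or later.

```sql
-- Load table with the same column layout as S_RECIPE.
CREATE TABLE S_RECIPE_LOAD FOR EXCHANGE WITH TABLE S_RECIPE;

-- 1a. New versions become current rows (open load end date).
INSERT INTO S_RECIPE_LOAD
      (H_Beer_Key, Load_Date, Load_End_Date,
       Start_Temp, Mashing_Time_1, Mashing_Temp_1, Record_Source)
SELECT n.H_Beer_Key, DATE '2017-11-23', TO_DATE('31-12-9999', 'dd-mm-yyyy'),
       n.Start_Temp, n.Mashing_Time_1, n.Mashing_Temp_1, n.Record_Source
  FROM STG_RECIPE_NEW n;

-- 1b. Unchanged versions: current rows that are not replaced by a new version.
INSERT INTO S_RECIPE_LOAD
      (H_Beer_Key, Load_Date, Load_End_Date,
       Start_Temp, Mashing_Time_1, Mashing_Temp_1, Record_Source)
SELECT cur.H_Beer_Key, cur.Load_Date, cur.Load_End_Date,
       cur.Start_Temp, cur.Mashing_Time_1, cur.Mashing_Temp_1, cur.Record_Source
  FROM S_RECIPE PARTITION (p_current) cur
 WHERE NOT EXISTS (SELECT 1
                     FROM STG_RECIPE_NEW n
                    WHERE n.H_Beer_Key = cur.H_Beer_Key);

-- 2. Replaced versions get a load end date and are inserted
--    directly into the history partition.
INSERT INTO S_RECIPE
      (H_Beer_Key, Load_Date, Load_End_Date,
       Start_Temp, Mashing_Time_1, Mashing_Temp_1, Record_Source)
SELECT cur.H_Beer_Key, cur.Load_Date, DATE '2017-11-23',
       cur.Start_Temp, cur.Mashing_Time_1, cur.Mashing_Temp_1, cur.Record_Source
  FROM S_RECIPE PARTITION (p_current) cur
 WHERE EXISTS (SELECT 1
                 FROM STG_RECIPE_NEW n
                WHERE n.H_Beer_Key = cur.H_Beer_Key);

-- 3. Swap the load table with the current partition (DDL, commits implicitly).
ALTER TABLE S_RECIPE EXCHANGE PARTITION p_current WITH TABLE S_RECIPE_LOAD;
```

After the exchange, S_RECIPE_LOAD holds the previous content of the current partition and can be truncated or dropped. No row is updated in place, which is why the approach pays off mainly when most current versions are replaced.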
Partitioning by Hub Key
Partitioning by Hub Key
  Improve join performance: full partition-wise joins
    Between Hubs and Satellites
    Between Links and Hubs
  Equal distribution with HASH partitioning
  Run extraction queries in parallel
  Partition key:
    Primary key of Hub
    Foreign key of Satellite / Link
  Link: composite HASH-HASH partitioning
  [Diagram: a query coordinator (QC) with parallel slaves 1 to 4; each slave joins partition n of the Hub with partition n of the Satellite]
Partitioning by Hub Key: Overview
  [Diagram of the example model: H_Beer, S_Beer, S_Recipe, L_Beer_Brew, H_Brew, S_Brew_Journal]
Partitioning by Hub Key: Hub Example
  CREATE TABLE H_BEER (
      H_Beer_Key     RAW (16)           NOT NULL,
      Beer_Name      VARCHAR2 (40)      NOT NULL,
      Load_Date      DATE               NOT NULL,
      Record_Source  VARCHAR2 (4 CHAR)  NOT NULL
  )
  PARTITION BY HASH (H_Beer_Key) PARTITIONS 8;
Partitioning by Hub Key: Satellite Example
  CREATE TABLE S_BEER_DESCRIPTION (
      H_Beer_Key     RAW (16)           NOT NULL,
      Load_Date      DATE               NOT NULL,
      Style          VARCHAR2 (40),
      ABV            NUMBER (3,1),
      IBU            NUMBER (3),
      Seasonal       VARCHAR2 (10),
      Label_Color    VARCHAR2 (10),
      Record_Source  VARCHAR2 (4 CHAR)  NOT NULL
  )
  PARTITION BY HASH (H_Beer_Key) PARTITIONS 8;
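Since H_BEER and S_BEER_DESCRIPTION are both hash-partitioned on H_Beer_Key with the same number of partitions, a join on that key is a candidate for a full partition-wise join. The following sketch shows one way to check this in the execution plan. The degree of parallelism 4 is an arbitrary example value.

```sql
-- Join Hub and Satellite on the common partition key.
EXPLAIN PLAN FOR
SELECT /*+ PARALLEL(4) */
       h.Beer_Name, s.Load_Date, s.Style, s.ABV, s.IBU
  FROM H_BEER h
  JOIN S_BEER_DESCRIPTION s
    ON s.H_Beer_Key = h.H_Beer_Key;

-- Display the plan of the statement explained above.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

In a full partition-wise join, the plan typically shows a PX PARTITION HASH ALL operation above the HASH JOIN, which means that each parallel process joins one pair of matching partitions without redistributing rows.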
Partitioning by Hub Key: Link Example
  CREATE TABLE L_BEER_BREW (
      L_Beer_Brew_Key  RAW (16)           NOT NULL,
      H_Beer_Key       RAW (16)           NOT NULL,
      H_Brew_Key       RAW (16)           NOT NULL,
      Load_Date        DATE               NOT NULL,
      Record_Source    VARCHAR2 (4 CHAR)  NOT NULL
  )
  PARTITION BY HASH (H_Beer_Key)
  SUBPARTITION BY HASH (H_Brew_Key) SUBPARTITIONS 8
  PARTITIONS 8;
Partitioning by Hub Key
  Partition Pruning:    no
  Partition-wise Join:  yes
  Rolling History:      no
  Data Distribution:    yes
  Partition Exchange:   no
  Restrictions:
    At most two partition keys per Link
Conclusion
Partitioning Strategies: Benefits
                       Load Date 1)  Load End Date 2)  Hub Key
  Partition Pruning    yes           yes               no
  Partition-wise Join  no            no                yes
  Rolling History      yes           no                no
  Data Distribution    yes           yes               yes
  Partition Exchange   yes           yes               no
  1) Only for transactional data
  2) Only for Satellites
White Paper
  White paper by Dani Schnider (18 pages, dated 30.10.2017), www.trivadis.com
  Download: https://danischnider.wordpress.com/publications/