Data Vault Modeling & Methodology: Technical Side and Introduction. Dan Linstedt, 2010
1 Data Vault Modeling & Methodology: Technical Side and Introduction. Dan Linstedt, 2010
2 Technical Definition
The Data Vault is a detail-oriented, historical-tracking, and uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and star schema. The design is flexible, scalable, consistent, and adaptable to the needs of the enterprise. It is architected specifically to meet the needs of today's enterprise data warehouses.
3 What Does One Look Like?
[Diagram: Customer, Product, and Order Hubs connected through a Link, with Satellites (Sat) attached; labels: Records a history of the interaction, F(x).]
Elements: Hub, Link, Satellite
- Hub = list of unique business keys
- Link = list of relationships, associations
- Satellites = descriptive data
4 Excel As A Source
[Diagram: user grouping structures (Level A, Level B, Level C, each with Items), a flattened structure in a staging table, and the raw source data in the DV: Hub Grouping, Hub Account, Link Acct To Group, Hierarchical Link of Groups, Sat Group Type.]
Do you have a power executive who is technically inclined and who runs the business off a rogue spreadsheet?
5 Data Vault Basic Elements: CORE ARCHITECTURE
6 Data Vault Core Architecture: Hubs, Links, Satellites
- Hubs = unique list of business keys
- Links = unique list of relationships across keys
- Satellites = descriptive data
- Satellites have one and only one parent table
- Satellites cannot be parents to other tables
- Hubs cannot be child tables
Last Seen Dates, Load Dates, Record Sources, and surrogate keys are not part of the core architecture. They exist to help models and key migration.
7 Hub Entity
A Hub is a list of unique business keys.
Hub Structure: Primary Key; <Business Key> (Unique Index / Primary Index); Load DTS; Last Seen DTS; Record Source
Example, Hub Product: Product Sequence ID; Product Number; Product Load DTS; Product Last Seen DTS; Prod Record Source
- A Hub's business key is a unique index.
- A Hub's load date represents the FIRST TIME the EDW saw the data.
- A Hub's record source represents first the Master data source (on collisions); if not available, it holds the origination source of the actual key.
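The Hub structure above can be sketched as a table definition. This is a minimal illustration using Python's built-in sqlite3 module; the table name hub_product, the column names, and the sample values are assumptions adapted from the slide labels, not the author's DDL.

```python
import sqlite3

# Hypothetical Hub Product table; names are adapted from the slide labels.
HUB_PRODUCT_DDL = """
CREATE TABLE hub_product (
    product_seq_id  INTEGER PRIMARY KEY,   -- surrogate key
    product_number  TEXT NOT NULL UNIQUE,  -- business key (unique index)
    load_dts        TEXT NOT NULL,         -- first time the EDW saw the key
    last_seen_dts   TEXT,
    record_source   TEXT NOT NULL
)
"""
INSERT_SQL = (
    "INSERT INTO hub_product (product_number, load_dts, record_source) "
    "VALUES (?, ?, ?)"
)

conn = sqlite3.connect(":memory:")
conn.execute(HUB_PRODUCT_DDL)
conn.execute(INSERT_SQL, ("P-100", "2010-01-01T00:00:00", "ERP"))

# The unique index on the business key rejects a second row for the same key,
# so the Hub stays a list of unique business keys.
try:
    conn.execute(INSERT_SQL, ("P-100", "2010-01-02T00:00:00", "CRM"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

hub_rows = conn.execute(
    "SELECT product_number, load_dts, record_source FROM hub_product"
).fetchall()
```

The surviving row keeps the load date and record source of the first arrival, which matches the slide's statement that the load date records the first time the EDW saw the key.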
8 Link Entity
A Link is an intersection of two or more business keys. It can contain Hub keys and other Link keys.
Link Structure: Primary Key; {Hub/Lnk Surrogate Keys 2..N} (Unique Index / Primary Index); Load DTS; Last Seen DTS; Record Source
Example, Link Line-Item: Link Line Item Sequence ID; Hub Product Sequence ID; Hub Order Sequence ID; **Line Item Number; Load DTS; Last Seen DTS; Record Source
- A Link's business key is a composite unique index.
- A Link may or may not have a **Item Numbering attribute.
- A Link's load date represents the FIRST TIME the EDW saw the data.
- A Link's record source represents first the Master data source (on collisions); if not available, it holds the origination source of the actual key.
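As a rough illustration of the Link structure, the sketch below builds a line-item Link with a composite unique index over the Hub surrogate keys. It uses Python's sqlite3 module; the table and column names are assumptions based on the slide labels, and the optional line item number is made mandatory here only to keep the uniqueness check simple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE link_line_item (
    line_item_seq_id  INTEGER PRIMARY KEY,  -- the Link's own surrogate key
    product_seq_id    INTEGER NOT NULL,     -- Hub Product surrogate key
    order_seq_id      INTEGER NOT NULL,     -- Hub Order surrogate key
    line_item_number  INTEGER NOT NULL,     -- optional on the slide; required in this sketch
    load_dts          TEXT NOT NULL,
    last_seen_dts     TEXT,
    record_source     TEXT NOT NULL,
    UNIQUE (product_seq_id, order_seq_id, line_item_number)  -- composite unique index
)
""")

INSERT_SQL = (
    "INSERT INTO link_line_item "
    "(product_seq_id, order_seq_id, line_item_number, load_dts, record_source) "
    "VALUES (?, ?, ?, ?, ?)"
)
conn.execute(INSERT_SQL, (1, 10, 1, "2010-01-01", "OMS"))
conn.execute(INSERT_SQL, (1, 10, 2, "2010-01-01", "OMS"))  # same keys, new line number

# Repeating an existing key combination violates the composite unique index.
try:
    conn.execute(INSERT_SQL, (1, 10, 1, "2010-01-02", "OMS"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

link_count = conn.execute("SELECT COUNT(*) FROM link_line_item").fetchone()[0]
```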
9 Satellite Entity
A Satellite is a time-dimensional table housing detailed information about the Hub's or Link's business keys.
Satellite Structure: Primary Key; Load DTS; Extract DTS; **Load End Date; Detail Business Data; {Update User}; {Update DTS}; Record Source
Example: Customer #; Load DTS; Extract DTS; **Load End Date; Customer Name; Customer Addr1; Customer Addr2; {Update User}; {Update DTS}; Record Source
Diagram label: Unique Index (Primary Index)
Satellites are defined by TYPE of data and RATE OF CHANGE. Mathematically, this reduces redundancy and decreases storage requirements over time (compared to a star schema).
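A small sketch of a Satellite as a time-dimensional table follows, using Python's sqlite3 module. The key choice (parent surrogate key plus load date) and all names and sample rows are assumptions for illustration; the slide lists the columns but does not spell out the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE sat_customer (
    customer_seq_id  INTEGER NOT NULL,  -- parent Hub's surrogate key
    load_dts         TEXT NOT NULL,     -- when this version arrived
    customer_name    TEXT,
    customer_addr1   TEXT,
    record_source    TEXT NOT NULL,
    PRIMARY KEY (customer_seq_id, load_dts)  -- assumed key: parent + load date
)
""")

# Each change to the descriptive data arrives as a new row (history is kept).
conn.executemany(
    "INSERT INTO sat_customer VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2010-01-01", "Acme", "1 Main St", "CRM"),
        (1, "2010-03-01", "Acme", "9 Oak Ave", "CRM"),
    ],
)

# The current picture is the row with the latest load date for the parent key.
current = conn.execute(
    "SELECT customer_addr1 FROM sat_customer "
    "WHERE customer_seq_id = ? ORDER BY load_dts DESC LIMIT 1",
    (1,),
).fetchone()
history_count = conn.execute(
    "SELECT COUNT(*) FROM sat_customer WHERE customer_seq_id = ?", (1,)
).fetchone()[0]
```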
10 Rules and Standards GOVERN your deployment: THINKING OF BREAKING RULES
11 Some Rules For You
- NO Foreign Keys in the Satellites!
- NO Hub to Hub (parent-child relationships)
- NO enforcement of relationships in the data model
- NO date-time attributes in HUB or LINK primary keys
Why??
- It breaks flexibility
- It breaks auditability / accountability
- It breaks scalability
- It breaks performance
- It introduces decisions in the architecture, which breaks patterns!
Up Next: Links and the Unit Of Work
12 Business Key Definitions
The contracts system is responsible for creating customer account numbers. The EDW will never see other systems creating customer account numbers. (Requirement #101)
Sales is clearly creating customer numbers. How do we detect the issue and alert the business?
Point: Not all business keys are created EQUAL!
13 Link: Unit of Work
[Diagram: Hub Product, Hub Category, and Hub Supplier; Unit Of Work Link: Product by Supplier by Category; Link Prod-Cat (Product by Category) and Link Prod-Supp (Product by Supplier), each with a Sat Effectivity; Link Line Item.]
These links (Product by Category, Product by Supplier) are optional, used for exploration only.
14 What Happens When: We Break the Unit of Work
Source System UOW: Product_ID, Category_ID, Supplier_ID
Model normalization:
- Link Product by Supplier: Product_ID, Supplier_ID
- Link Product by Category: Product_ID, Category_ID
Question: After normalizing, how can you reconstruct the source image EXACTLY as it stands?
15 What Happens When: Trying to Rebuild from Two Links
Source System UOW: Product_ID, Category_ID, Supplier_ID
Model normalization:
- Link Product by Supplier: Product_ID, Supplier_ID
- Link Product by Category: Product_ID, Category_ID
Re-joining the data creates a record that does not exist in the original source system. This is the same problem that BI engines will have when putting together Data Mart results.
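The effect can be reproduced with a few lines of Python. The sketch below uses made-up identifiers (P1, C1, S1, and so on) as a hypothetical source table: it splits the unit of work into the two Links and then joins them back on Product_ID.

```python
# Hypothetical source rows: (Product_ID, Category_ID, Supplier_ID).
source_uow = {
    ("P1", "C1", "S1"),
    ("P1", "C2", "S2"),
}

# Breaking the unit of work into two Links.
link_prod_supp = {(p, s) for p, _c, s in source_uow}
link_prod_cat = {(p, c) for p, c, _s in source_uow}

# Trying to rebuild the source image by joining the two Links on Product_ID.
rejoined = {
    (p, c, s)
    for p, c in link_prod_cat
    for p2, s in link_prod_supp
    if p == p2
}

# Rows that the join produces but that never existed in the source.
spurious = rejoined - source_uow
print(sorted(spurious))
```

With this sample data the join returns four rows for two source rows: the combinations (P1, C1, S2) and (P1, C2, S1) were never in the source. A single Link holding all three keys avoids the problem because it stores the combinations exactly as they arrived.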
16 Link: Unit of Work Kept Together
Source System, Source Table UOW: Product_ID, Category_ID, Supplier_ID
Data Vault Link, Product by Category by Supplier: Product_ID, Category_ID, Supplier_ID
Commutative Property: enable reproduction of the source exactly as it stands. The UOW is properly represented by a single Link in the Data Vault.
17 What keeps you up at night? CURRENT LOADING PAIN
18 Problems with EDW Loads Today
Technical Issues:
- 2am wakeup calls because data won't fit the business rules
- Emergency fixes to production
- Speed, speed, speed (shrinking load window + more data)
- Can't load real-time data (business rules in the way!!)
- Business won't buy better, faster hardware!
Business Issues:
- Maintenance cycles take too long
- Maintenance costs continue to increase
- Fixes to existing mappings break working logic
- Complexity of existing systems becomes unsustainable to business
- IT isn't using 80%+ of the hardware resources given to them (their jobs are running at 40% utilization when they are "full-bore")
19 Solutions!
Technical Solutions:
- All parallel job streams, as much as possible
- 1 target per map, per action (reduces complexity)
- Generate data flows based on patterns (then focus on the real work)
- Get some SLEEP at night!! (no more production modifications)
Business Solutions:
- Decrease turn-around time
- Increase performance
- Handle real-time data!!
- Reduce complexity = reduce costs, reduce time to implement
- Get the power back for decision making, discovering, and building your own marts
20 How?
21 Some standards to follow: BASIC LOADING CONCEPTS
22 Loading: A Golden Rule
100% of the data loaded to the EDW, 100% of the time! It's all about auditability.
23 Load Date / End Date Geology: Batch Load, Real-Time Loading
24 Real Time Loading - DV
Stock trade message: ACCOUNT= TRADE="Buy" STOCK= DAN" SHARES=100.0 CURRENCY="USD" PRICE= DATE="Feb 20, 2002 Comment="Buy Order to Execute"
Inserts only, no updates.
[Diagram: Acct Hub, Stock Hub, and a transactional Trade Link carrying TRADE="Buy" SHARES=100.0 CURRENCY="USD" PRICE= DATE="Feb 20, 2002 Comment="Buy Order to Execute".]
[Chart: # of inserts (10M, 25M, 50M, 75M) against months in production, marked First Data Set Loaded and New Systems Data Added.]
As critical mass of current business keys is reached, the insert rates decrease rapidly. New systems add new keys quickly and efficiently to an existing Hub.
25 Batch Load Date Time Stamp
[Diagram: two Stage Load processes in the Staging Area read CNTRL_DTE / LOAD_DTS and write STAGING TABLEs (Sequence_ID. Load_DTS Record_Source), which feed the EDW Data Vault.]
Load Date is exactly the same for all rows.
26 Parallel Load Architecture - Batch
[Diagram: Staging Loads (Sources, Stage), Data Vault Loads (Hubs, Links, Hub Satellites, Link Satellites), and Data Mart Loads (Dimensions, Facts), separated by major synchronization points.]
Processing:
- All loads are done in parallel
- Sets of processes wait for the previous set to complete
- Processes are run as soon as data is ready
- No other waiting time is required
- Load dependencies are greatly reduced
27 Mathematics of Batch Loading
It's all about SPEED, SPEED, SPEED.
10 million incoming rows:
- 60%-80% inserts (never seen before)
- 10%-20% updates (matched by KEY)
- 5% deletes
EDW: 1 billion rows and growing.
Inserts are the single fastest operation in the database! Updates are the single slowest operation in the database!
Q: Why push 80% of your insert data through the heaviest/slowest transformation logic?
28 Simple Loading Patterns
Rule: 1 target per data flow (map/graph), per action.
[Diagram labels: Source, SQ, LKP, Filter (If Exists), Target, Target Insert; Source (Stage), SQ, Target, Target Insert.]
- Insert View: select ALL rows that do not exist by PK in the target
- Update View: select ALL rows that exist by PK in the target, ONLY those with a DELTA
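The two views can be sketched with Python's sqlite3 module. The tables stage and target, their columns, and the view names v_insert and v_update are hypothetical; the point is only that each view feeds exactly one target action.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE target (pk INTEGER PRIMARY KEY, val TEXT);
CREATE TABLE stage  (pk INTEGER PRIMARY KEY, val TEXT);
INSERT INTO target VALUES (1, 'a'), (2, 'b');
INSERT INTO stage  VALUES (1, 'a'), (2, 'B'), (3, 'c');

-- Insert view: stage rows whose PK does not exist in the target.
CREATE VIEW v_insert AS
SELECT s.pk, s.val FROM stage s
WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.pk = s.pk);

-- Update view: stage rows whose PK exists in the target AND whose data differs.
-- IS NOT is SQLite's NULL-safe inequality.
CREATE VIEW v_update AS
SELECT s.pk, s.val FROM stage s
JOIN target t ON t.pk = s.pk
WHERE s.val IS NOT t.val;
""")

to_insert = conn.execute("SELECT pk, val FROM v_insert").fetchall()
to_update = conn.execute("SELECT pk, val FROM v_update").fetchall()

# Flow 1: insert only. Flow 2: update only. Neither flow needs a lookup or filter.
conn.executemany("INSERT INTO target (pk, val) VALUES (?, ?)", to_insert)
conn.executemany(
    "UPDATE target SET val = ? WHERE pk = ?",
    [(val, pk) for pk, val in to_update],
)

final = conn.execute("SELECT pk, val FROM target ORDER BY pk").fetchall()
```

The unchanged row (pk 1) appears in neither view, so it never passes through any transformation logic.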
29 Results of Pattern Tuning
FROM THIS (no parallelism), 5M rows:
- 600 RPS = 2.31 hrs
- OR: 7k RPS = 11.9 mins
- This map must run at a minimum of 10k RPS to beat the parallel times: 10k RPS = 8.33 mins
TO THIS!
- Pass 1: 33k RPS = 2.52 mins
- Pass 2: 33k RPS = 2.52 mins; 25k RPS = 3.33 mins
- Pass 3: 50k RPS = 1.66 mins; 33k RPS = 2.52 mins; 40k RPS = 2.03 mins; 23k RPS = 3.61 mins
Total Time: = 9.46 mins
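The arithmetic behind these figures can be checked with a short calculation. The sketch below assumes that every stream processes the full 5M rows, that streams within a pass run in parallel (so a pass takes as long as its slowest stream), and that passes run one after another; this reading is inferred from the numbers on the slide rather than stated on it.

```python
ROWS = 5_000_000


def minutes(rows: int, rows_per_second: int) -> float:
    """Elapsed minutes to push `rows` through a flow at a given throughput."""
    return rows / rows_per_second / 60


# Single map, no parallelism.
serial_600_hours = minutes(ROWS, 600) / 60
serial_7k_minutes = minutes(ROWS, 7_000)
serial_10k_minutes = minutes(ROWS, 10_000)

# Parallel streams per pass, in rows per second (figures from the slide).
passes = [
    [33_000],
    [33_000, 25_000],
    [50_000, 33_000, 40_000, 23_000],
]

# A pass finishes when its slowest stream finishes; passes run sequentially.
pass_minutes = [max(minutes(ROWS, rps) for rps in streams) for streams in passes]
total_minutes = sum(pass_minutes)

print(f"serial at 600 rps: {serial_600_hours:.2f} hrs")
print(f"serial at 7k rps:  {serial_7k_minutes:.1f} mins")
print(f"serial at 10k rps: {serial_10k_minutes:.2f} mins")
print(f"parallel total:    {total_minutes:.2f} mins")
```

The computed total lands within a few hundredths of a minute of the slide's 9.46 mins. The 40k and 23k figures come out slightly higher than the slide's 2.03 and 3.61, presumably because the slide's throughput values are rounded.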
30 Patterns Take the Cake! LOADING THE DATA VAULT
31 Loading Templates: Hubs
Flow: Staging Data -> Distinct List of BK Keys -> Exists In Target? -> No: Insert Into Target Hub (gen surrogate); Yes: Drop Row From Feed
Select a Master system, and a hierarchy of importance for sub-systems, to annotate arrival location of data.
Purpose of the loading template:
- Find out if the business key exists in the Hub; if not, insert it
- Use a distinct (unique) list of business keys coming from the staging area
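The Hub template can be sketched as a single set-based insert. The example uses Python's sqlite3 module; the tables stage_orders and hub_product, the function load_hub, and the sample keys are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stage_orders (product_number TEXT, qty INTEGER);
CREATE TABLE hub_product (
    product_seq_id  INTEGER PRIMARY KEY,   -- surrogate generated on insert
    product_number  TEXT NOT NULL UNIQUE,  -- business key
    load_dts        TEXT NOT NULL,
    record_source   TEXT NOT NULL
);
INSERT INTO hub_product (product_number, load_dts, record_source)
VALUES ('P-100', '2010-01-01', 'ERP');
INSERT INTO stage_orders VALUES ('P-100', 1), ('P-200', 5), ('P-200', 2), ('P-300', 7);
""")

LOAD_HUB_SQL = """
INSERT INTO hub_product (product_number, load_dts, record_source)
SELECT DISTINCT s.product_number, :load_dts, :record_source  -- distinct list of keys
FROM stage_orders s
WHERE NOT EXISTS (                                           -- exists in target?
    SELECT 1 FROM hub_product h WHERE h.product_number = s.product_number
)
"""


def load_hub(conn: sqlite3.Connection, load_dts: str, record_source: str) -> int:
    """Insert business keys not yet in the Hub; return how many were added."""
    before = conn.execute("SELECT COUNT(*) FROM hub_product").fetchone()[0]
    conn.execute(LOAD_HUB_SQL, {"load_dts": load_dts, "record_source": record_source})
    after = conn.execute("SELECT COUNT(*) FROM hub_product").fetchone()[0]
    return after - before


first_run = load_hub(conn, "2010-02-01", "OMS")
second_run = load_hub(conn, "2010-02-01", "OMS")  # re-running adds nothing
hub_rows = conn.execute(
    "SELECT product_number, load_dts FROM hub_product ORDER BY product_number"
).fetchall()
```

Every row in the batch gets the same load date, existing keys keep their original load date, and a second run is a no-op, which is what makes the pattern restartable.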
32 Loading Templates: Links
Flow: Staging Data -> Distinct List of Busn Keys -> Lookup EACH Hub's Surrogate Keys -> Exists In Target? -> No: Insert Into Target Link (gen surrogate); Yes: Drop Row From Feed
Select a Master system, and a hierarchy of importance for sub-systems, to annotate arrival location of data.
Purpose of the loading template:
- Find all relationships between business keys; then, if the relationship is not already recorded in the Link, insert it
- Use a distinct list of related business keys
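A minimal sketch of the Link template follows, again with Python's sqlite3 module. It assumes the Hubs are already loaded; the tables hub_order, hub_product, link_order_product, and stage_lines, and all sample values, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_product (product_seq_id INTEGER PRIMARY KEY, product_number TEXT UNIQUE);
CREATE TABLE hub_order   (order_seq_id   INTEGER PRIMARY KEY, order_number   TEXT UNIQUE);
CREATE TABLE link_order_product (
    link_seq_id     INTEGER PRIMARY KEY,  -- surrogate generated on insert
    order_seq_id    INTEGER NOT NULL,
    product_seq_id  INTEGER NOT NULL,
    load_dts        TEXT NOT NULL,
    record_source   TEXT NOT NULL,
    UNIQUE (order_seq_id, product_seq_id)
);
CREATE TABLE stage_lines (order_number TEXT, product_number TEXT);
INSERT INTO hub_product VALUES (1, 'P-100'), (2, 'P-200');
INSERT INTO hub_order   VALUES (10, 'O-1'), (11, 'O-2');
INSERT INTO stage_lines VALUES
    ('O-1', 'P-100'), ('O-1', 'P-100'), ('O-1', 'P-200'), ('O-2', 'P-100');
""")

LOAD_LINK_SQL = """
INSERT INTO link_order_product (order_seq_id, product_seq_id, load_dts, record_source)
SELECT DISTINCT o.order_seq_id, p.product_seq_id, :load_dts, :record_source
FROM stage_lines s
JOIN hub_order   o ON o.order_number   = s.order_number    -- look up each Hub's
JOIN hub_product p ON p.product_number = s.product_number  -- surrogate key
WHERE NOT EXISTS (                                         -- exists in target?
    SELECT 1 FROM link_order_product l
    WHERE l.order_seq_id = o.order_seq_id
      AND l.product_seq_id = p.product_seq_id
)
"""

params = {"load_dts": "2010-02-01", "record_source": "OMS"}
conn.execute(LOAD_LINK_SQL, params)
conn.execute(LOAD_LINK_SQL, params)  # second run finds every relationship already present

link_rows = conn.execute(
    "SELECT order_seq_id, product_seq_id FROM link_order_product "
    "ORDER BY order_seq_id, product_seq_id"
).fetchall()
```

The duplicate staging row for (O-1, P-100) collapses into one relationship, and the second run inserts nothing.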
33 Loading Templates: Satellites
Flow: Staging Data -> Distinct List of Sat Rows -> Lookup EACH Hub's or Link's Surrogate Keys -> Find Latest Sat Row -> All Columns Match? -> No: Insert Into Target Satellite; Yes: Drop Row From Feed
Select a Master system, and a hierarchy of importance for sub-systems, to annotate arrival location of data.
Purpose of the loading template:
- Gather descriptive data and compare it to the most recent copy of the information in the Satellite; if there are any deltas, load; if not, don't load
- Use a distinct list of descriptive fields from the source systems
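The delta check in the Satellite template can be sketched as follows with Python's sqlite3 module. The tables, the function load_sat, and the sample data are hypothetical, and the sketch assumes at most one staged row per business key per batch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (customer_seq_id INTEGER PRIMARY KEY, customer_number TEXT UNIQUE);
CREATE TABLE sat_customer (
    customer_seq_id  INTEGER NOT NULL,
    load_dts         TEXT NOT NULL,
    customer_name    TEXT,
    customer_addr1   TEXT,
    record_source    TEXT NOT NULL,
    PRIMARY KEY (customer_seq_id, load_dts)
);
CREATE TABLE stage_customer (customer_number TEXT, customer_name TEXT, customer_addr1 TEXT);
INSERT INTO hub_customer VALUES (1, 'C-1'), (2, 'C-2');
INSERT INTO sat_customer VALUES (1, '2010-01-01', 'Acme', '1 Main St', 'CRM');
""")

LOAD_SAT_SQL = """
INSERT INTO sat_customer
    (customer_seq_id, load_dts, customer_name, customer_addr1, record_source)
SELECT DISTINCT h.customer_seq_id, :load_dts, s.customer_name, s.customer_addr1, 'CRM'
FROM stage_customer s
JOIN hub_customer h ON h.customer_number = s.customer_number  -- surrogate key lookup
WHERE NOT EXISTS (
    SELECT 1 FROM sat_customer cur
    WHERE cur.customer_seq_id = h.customer_seq_id
      AND cur.load_dts = (SELECT MAX(m.load_dts) FROM sat_customer m   -- latest Sat row
                          WHERE m.customer_seq_id = h.customer_seq_id)
      AND cur.customer_name  IS s.customer_name    -- IS: NULL-safe equality in SQLite
      AND cur.customer_addr1 IS s.customer_addr1
)
"""


def load_sat(conn: sqlite3.Connection, stage_rows: list, load_dts: str) -> None:
    """Stage one batch, then insert only rows that differ from the latest Sat row."""
    conn.execute("DELETE FROM stage_customer")
    conn.executemany("INSERT INTO stage_customer VALUES (?, ?, ?)", stage_rows)
    conn.execute(LOAD_SAT_SQL, {"load_dts": load_dts})


# Batch 1: C-1 unchanged (dropped), C-2 has no Sat row yet (inserted).
load_sat(conn, [("C-1", "Acme", "1 Main St"), ("C-2", "Bolt", "5 Elm Rd")], "2010-02-01")
# Batch 2: C-1 moved (inserted), C-2 unchanged (dropped).
load_sat(conn, [("C-1", "Acme", "9 Oak Ave"), ("C-2", "Bolt", "5 Elm Rd")], "2010-03-01")

sat_rows = conn.execute(
    "SELECT customer_seq_id, load_dts, customer_addr1 FROM sat_customer "
    "ORDER BY customer_seq_id, load_dts"
).fetchall()
```

Only the rows that carry a delta reach the Satellite, so the load stays insert-only.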
34 How to build your Data Vault: GETTING STARTED HOW TO
35 Step 1: Establish Scope (Build Business Case Model)
36 Step 1: Define Business Keys
[Diagram: Hub Invoice, Hub Campaign, Hub Customer, Hub Products.]
37 Step 2: Define Associations
[Diagram: Hub Invoice, Hub Campaign, Hub Customer, Hub Products; Link Campaign by Invoice by Customer, Link Product on Campaign, Link Invoice Line Items.]
38 Step 3: Define Descriptive Data
[Diagram: the Hubs and Links from Step 2 with Satellites attached: Sat Effectiveness Ratings, Sat Effectiveness Dates, Sat Dates and Amounts, Sat Address, Sat Details, Sat Contacts, Sat Availability Dates, Sat Defect Reasons, Sat Descriptions, Sat Stock Quantities, Sat Amounts, Sat Quantities.]
39 Step 4: Build Source Model (PK/FK) (No Pictures, Sorry)
- Ensure the source model (DDL only) has primary and foreign keys defined
- Normalize the source model (if not normalized)
- Capture and integrate all source systems involved (if not already captured)
- Add comments to the DDL (tables and fields)
40 Step 5: Build Cross-Reference
The purpose of such an exercise is not to identify all the elements, but specifically to identify the target Hubs (i.e., the business keys), target Links, and at LEAST a single Satellite for at least 1 source column. The engine (SaaS) will automatically assign all other descriptive elements to the first Satellite identified.

SOURCE TABLE      SOURCE COLUMN     GROUP  TARGET TABLE          TARGET COLUMN
AHLTAT_DIAGNOSIS  DOC_REF           1      SAT_AHLTAT_DIAGNOSIS  DOC_REF
                  DATAID            1      HUB_DIAGNOSIS         DIAGNOSIS_DATAID
                  FACILITYNCID      1      HUB_FACILITY          FAC_ID
                  DIAGNOSISNCID     1      SAT_AHLTAT_DIAGNOSIS  DIAGNOSISNCID
                  ENCOUNTERNUMBER   1      HUB_EVENT             EVNT_ID
                  CLINICIANNCID     1      HUB_CLINICIAN         CLINICIAN_NCID
                  UNIT_NUMBER       1      HUB_UNIT              UNIT_ID
                  MEDCINID          1      HUB_MEDCIN            MEDCIN_ID
                  CREATETIME        1      SAT_AHLTAT_DIAGNOSIS  CREATETIME
                  CREATEUSERNCID    1      SAT_AHLTAT_DIAGNOSIS  CREATEUSERNCID
                  MODIFYUSERNCID    1      SAT_AHLTAT_DIAGNOSIS  MODIFYUSERNCID
                  MODIFYTIME        1      SAT_AHLTAT_DIAGNOSIS  MODIFYTIME
                  PRIORITY          1      SAT_AHLTAT_DIAGNOSIS  PRIORITY
                  DIAGNOSESCOMMENT  1      SAT_AHLTAT_DIAGNOSIS  DIAGNOSESCOMMENT
41 Step 6: Generate Baseline ETL/ELT
[Diagram labels: Source DDL; Cross-Ref Mapping XLS; Target DDL; Generate Code, Reports, Documentation; Data Flows (Mappings / Graphs).]
42 What did we learn? CONCLUSIONS / SUMMARY
43 Data Vault Modeling Is:
- Made up of Hubs, Links, and Satellites
- Easy to create and build
- Hardest thing is to find/locate and define the Business Keys
- Consistent, Scalable, Repeatable, Pattern Based
- RULES BASED / STANDARDS DRIVEN
Loading Is:
- Scalable, Fault-Tolerant, Parallelizable, Pattern Based
- Generatable
- Performance Based
- 100% Restartable
- Set Based
- Devoid of Soft Business Rules!!
44 Still - Lots To Learn
We didn't cover:
- Joins
- Point-in-time tables
- Building marts
- Business logic components
- SQL extraction
- Bridge tables
- What to do when dealing with bad data
- Architecting security, managing governance, handling metadata
Contact me for Workshops (training) and Mentoring.
45 Questions?
Dan Linstedt, President, Empowered Holdings, LLC. Tel:
SERVICES: Consulting Assessments Product Selection Scorecards Architecture / Design Mentoring and Workshops (training)
More information20767B: IMPLEMENTING A SQL DATA WAREHOUSE
ABOUT THIS COURSE This 5-day instructor led course describes how to implement a data warehouse platform to support a BI solution. Students will learn how to create a data warehouse with Microsoft SQL Server
More informationPro Tech protechtraining.com
Course Summary Description This course provides students with the skills necessary to plan, design, build, and run the ETL processes which are needed to build and maintain a data warehouse. It is based
More informationData Vault Modeling and its Evolution DECISION SCIENCES INSTITUTE. Conceptual Data Vault Modeling and its Opportunities for the Future
DECISION SCIENCES INSTITUTE Conceptual Data Vault Modeling and its Opportunities for the Future Aarthi Raman, Active Network, Dallas, TX, 75201 itz.aarthi@gmail.com Teuta Cata, Northern Kentucky University,
More informationAnalytics in the Cloud Mandate or Option?
Analytics in the Cloud Mandate or Option? Rick Lower Sr. Director of Analytics Alliances Teradata 1 The SAS & Teradata Partnership Overview Partnership began in 2007 to improving analytic performance Teradata
More informationOracle Database 12c: Performance Management and Tuning
Oracle University Contact Us: +43 (0)1 33 777 401 Oracle Database 12c: Performance Management and Tuning Duration: 5 Days What you will learn In the Oracle Database 12c: Performance Management and Tuning
More informationEPM Live 2.2 Configuration and Administration Guide v.os1
Installation Configuration Guide EPM Live v2.2 Version.01 April 30, 2009 EPM Live 2.2 Configuration and Administration Guide v.os1 Table of Contents 1 Getting Started... 5 1.1 Document Overview... 5 1.2
More informationEZY Intellect Pte. Ltd., #1 Changi North Street 1, Singapore
Oracle Database 12c: Performance Management and Tuning NEW Duration: 5 Days What you will learn In the Oracle Database 12c: Performance Management and Tuning course, learn about the performance analysis
More informationSAMPLE. Preface xi 1 Introducting Microsoft Analysis Services 1
contents Preface xi 1 Introducting Microsoft Analysis Services 1 1.1 What is Analysis Services 2005? 1 Introducing OLAP 2 Introducing Data Mining 4 Overview of SSAS 5 SSAS and Microsoft Business Intelligence
More informationAnalytics: Server Architect (Siebel 7.7)
Analytics: Server Architect (Siebel 7.7) Student Guide June 2005 Part # 10PO2-ASAS-07710 D44608GC10 Edition 1.0 D44917 Copyright 2005, 2006, Oracle. All rights reserved. Disclaimer This document contains
More informationLyras Shipping - CIO Forum
Lyras Shipping - CIO Forum Data Relationships at the Core of Making Big Data Work Panteleimon Pantelis 2015 Ulysses Systems (UK) Ltd. www.ulysses-systems.com Lyras Shipping and Big or not so Big BUT very
More informationApplication software office packets, databases and data warehouses.
Introduction to Computer Systems (9) Application software office packets, databases and data warehouses. Piotr Mielecki Ph. D. http://www.wssk.wroc.pl/~mielecki piotr.mielecki@pwr.edu.pl pmielecki@gmail.com
More informationMigrate from Netezza Workload Migration
Migrate from Netezza Automated Big Data Open Netezza Source Workload Migration CASE SOLUTION STUDY BRIEF Automated Netezza Workload Migration To achieve greater scalability and tighter integration with
More informationExam /Course 20767B: Implementing a SQL Data Warehouse
Exam 70-767/Course 20767B: Implementing a SQL Data Warehouse Course Outline Module 1: Introduction to Data Warehousing This module describes data warehouse concepts and architecture consideration. Overview
More informationFrom Single Purpose to Multi Purpose Data Lakes. Thomas Niewel Technical Sales Director DACH Denodo Technologies March, 2019
From Single Purpose to Multi Purpose Data Lakes Thomas Niewel Technical Sales Director DACH Denodo Technologies March, 2019 Agenda Data Lakes Multiple Purpose Data Lakes Customer Example Demo Takeaways
More informationThis tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing.
About the Tutorial A data warehouse is constructed by integrating data from multiple heterogeneous sources. It supports analytical reporting, structured and/or ad hoc queries and decision making. This
More informationCIS 330: Web-driven Web Applications. Lecture 2: Introduction to ER Modeling
CIS 330: Web-driven Web Applications Lecture 2: Introduction to ER Modeling 1 Goals of This Lecture Understand ER modeling 2 Last Lecture Why Store Data in a DBMS? Transactions (concurrent data access,
More informationUSERS CONFERENCE Copyright 2016 OSIsoft, LLC
Bridge IT and OT with a process data warehouse Presented by Matt Ziegler, OSIsoft Complexity Problem Complexity Drives the Need for Integrators Disparate assets or interacting one-by-one Monitoring Real-time
More informationDan Vlamis Vlamis Software Solutions, Inc Copyright 2005, Vlamis Software Solutions, Inc.
2UDFOH2/$3 +RZ'RHVLW5HDOO\:RUN",28*/LYH 6HVVLRQ Dan Vlamis dvlamis@vlamis.com Vlamis Software Solutions, Inc. 816-781-2880 http://www.vlamis.com 9ODPLV6RIWZDUH6ROXWLRQV,QF Founded in 1992 in Kansas City,
More informationArchitectural challenges for building a low latency, scalable multi-tenant data warehouse
Architectural challenges for building a low latency, scalable multi-tenant data warehouse Mataprasad Agrawal Solutions Architect, Services CTO 2017 Persistent Systems Ltd. All rights reserved. Our analytics
More informationThis module presents the star schema, an alternative to 3NF schemas intended for analytical databases.
Topic 3.3: Star Schema Design This module presents the star schema, an alternative to 3NF schemas intended for analytical databases. Star Schema Overview The star schema is a simple database architecture
More informationCT75 DATA WAREHOUSING AND DATA MINING DEC 2015
Q.1 a. Briefly explain data granularity with the help of example Data Granularity: The single most important aspect and issue of the design of the data warehouse is the issue of granularity. It refers
More information<Insert Picture Here> Looking at Performance - What s new in MySQL Workbench 6.2
Looking at Performance - What s new in MySQL Workbench 6.2 Mario Beck MySQL Sales Consulting Manager EMEA The following is intended to outline our general product direction. It is
More informationDeccansoft Software Services Microsoft Silver Learning Partner. SSAS Syllabus
Overview: Analysis Services enables you to analyze large quantities of data. With it, you can design, create, and manage multidimensional structures that contain detail and aggregated data from multiple
More informationData Science. Data Analyst. Data Scientist. Data Architect
Data Science Data Analyst Data Analysis in Excel Programming in R Introduction to Python/SQL/Tableau Data Visualization in R / Tableau Exploratory Data Analysis Data Scientist Inferential Statistics &
More informationcollection of data that is used primarily in organizational decision making.
Data Warehousing A data warehouse is a special purpose database. Classic databases are generally used to model some enterprise. Most often they are used to support transactions, a process that is referred
More informationThe DBMS accepts requests for data from the application program and instructs the operating system to transfer the appropriate data.
Managing Data Data storage tool must provide the following features: Data definition (data structuring) Data entry (to add new data) Data editing (to change existing data) Querying (a means of extracting
More informationETL Testing Concepts:
Here are top 4 ETL Testing Tools: Most of the software companies today depend on data flow such as large amount of information made available for access and one can get everything which is needed. This
More informationETL and OLAP Systems
ETL and OLAP Systems Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master studies, first semester
More informationChapter 6. Foundations of Business Intelligence: Databases and Information Management VIDEO CASES
Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:
More information