Data modelling patterns used for integration of operational data stores
A D ONE insight brought to you by Catalin Ehrmann

Abstract

Modern enterprise software development and data processing approaches have followed separate paths over the last decade. Recently, the two communities have realized the benefit of sharing expertise across domains. In this paper, we explain how a clever mix of data warehouse (DWH) and OLTP (Online Transaction Processing) patterns creates a robust operational system, and we discuss the advantages and disadvantages of this approach.

Introduction

There is no shortage of design patterns for data processing or software development, but each pattern has its own set of trade-offs. Developers and database administrators often have to weigh the pros and cons of several options. Before choosing a pattern, it is important to understand the business requirements and the data model. What is the tolerance for failure of an operation? What are the legal requirements? Is the data used globally or only locally? What kind of analysis will be done on the data? We also have to consider connected systems and the hardware our database and software must operate on. However, if we can design a pattern that has only a few minor trade-offs, we can meet more cross-organization business requirements without slowing down the business or increasing error rates.

DWH Patterns

Before we can design an improved enterprise DWH pattern, we must understand two basic patterns important to DWH design and implementation: Slowly Changing Dimensions (SCD) and Change Data Capture (CDC).

SCD Type 1

SCD1 updates data by overwriting existing values (see Figures 1 and 2). It is commonly used because it is easy to implement and use. This is a good approach for error fixes, but compliance laws could be violated since all historical values are lost.
This approach should not be used when a data value is being updated because the information itself has changed, for example, an organization moving to a new location.

D ONE Insight Data modelling patterns used for integration of operational data stores 1/7
Figure 1: SCD1 Sample Code

    -- update a single record in the Vendors table
    UPDATE dbo.vendors
    SET CITY = 'BERLIN'
    WHERE UID = 1234

Figure 2: SCD1 Example

VENDOR TABLE, BEFORE SCD1:
UID | TAX_ID | VENDOR | CITY | COUNTRY
1234 | … | ACME, INC | MUNICH | GERMANY

AFTER SCD1:
UID | TAX_ID | VENDOR | CITY | COUNTRY
1234 | … | ACME, INC | BERLIN | GERMANY

Additionally, analysis cubes and precomputed aggregates must be rebuilt any time a data point is changed using SCD1. If there are distributed copies of the data, the change must be applied to the copies as well, and calculations must be rebuilt on each copy. As compliance requirements grow, SCD1 will likely be used less over time; organizations will be forced to choose another method to stay in good standing with compliance enforcement agencies. SCD Type 2 is a bit more complex than SCD1, but it has some important advantages.

SCD Type 2

In SCD2, the current record is expired and a new row is added to take its place, for example using SQL Server's MERGE functionality (see Figure 3). SCD2 is a bit more difficult to implement, but it has the advantage of preserving historical data. This method is an excellent choice when the law requires history preservation. The disadvantage of SCD2 is that database storage and performance can quickly become a concern, since new rows are added for every update.
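The expire-and-insert logic behind SCD2 can also be sketched outside of SQL Server. The following is a minimal sketch in Python using an in-memory SQLite database; the table layout mirrors Figure 3, while the surrogate key value, tax ID and dates are illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE vendors (
        uid        INTEGER,   -- surrogate key
        tax_id     TEXT,      -- natural key
        vendor     TEXT,
        city       TEXT,
        country    TEXT,
        valid_from TEXT,
        valid_to   TEXT,
        curr_fl    TEXT       -- 'Y' = current record, 'N' = historical
    )
""")
# Illustrative starting row; the tax ID and dates are made up for the sketch.
conn.execute("INSERT INTO vendors VALUES (1234, 'DE-111', 'ACME, INC', "
             "'MUNICH', 'GERMANY', '2015-01-01', '9999-12-31', 'Y')")

def scd2_update(conn, uid, city, change_date):
    """Expire the current record, then insert a replacement row (SCD2)."""
    tax_id, vendor, country = conn.execute(
        "SELECT tax_id, vendor, country FROM vendors "
        "WHERE uid = ? AND curr_fl = 'Y'", (uid,)).fetchone()
    # Step 1: close the validity window of the current row and clear its flag.
    conn.execute("UPDATE vendors SET valid_to = ?, curr_fl = 'N' "
                 "WHERE uid = ? AND curr_fl = 'Y'", (change_date, uid))
    # Step 2: add the new current row with an open-ended validity window.
    conn.execute("INSERT INTO vendors VALUES (?, ?, ?, ?, ?, ?, '9999-12-31', 'Y')",
                 (uid, tax_id, vendor, city, country, change_date))
    conn.commit()

scd2_update(conn, 1234, "BERLIN", "2016-06-01")
```

In SQL Server itself, both steps would typically be issued in a single MERGE (or equivalent) statement so the expire and the insert cannot be separated by a mid-process failure.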
Figure 3: SCD2 Example

VENDOR TABLE, BEFORE:
UID | TAX_ID | VENDOR | CITY | COUNTRY | VALID_FROM | VALID_TO | CURR_FL
… | … | ACME, INC | MUNICH | GERMANY | … | … | Y

AFTER:
UID | TAX_ID | VENDOR | CITY | COUNTRY | VALID_FROM | VALID_TO | CURR_FL
… | … | ACME, INC | MUNICH | GERMANY | … | … | N
… | … | ACME, INC | BERLIN | GERMANY | … | … | Y

When implementing SCD2, it is important to include metadata columns so users are able to determine which record is current and which records are historical (see Figure 3). Administrators should also make end users aware of the metadata columns and their meaning. A current flag is not strictly necessary, but it does make querying for current or historical records easier. It is sometimes useful to include a reason flag or description noting why the data was updated, to distinguish error fixes from information changes. Administrators should also keep in mind that updates to downstream systems may not be made properly when a natural key is updated and no surrogate key is present. It is recommended that a surrogate key always be present in data updated using SCD2 (see Figure 4).

Figure 4: Changing a Natural Key (Tax ID) With No Surrogate Key (UID) Present Is Not Recommended

VENDOR TABLE, BEFORE:
TAX_ID | VENDOR | CITY | COUNTRY | VALID_FROM | VALID_TO | CURR_FL
… | ACME, INC | MUNICH | GERMANY | … | … | Y

AFTER:
TAX_ID | VENDOR | CITY | COUNTRY | VALID_FROM | VALID_TO | CURR_FL
… | ACME, INC | MUNICH | GERMANY | … | … | N
… | ACME, INC | BERLIN | GERMANY | … | … | Y

Change Data Capture

CDC is a method of extracting data for ETL (extract, transform and load) processing. CDC isolates changed data for extraction rather than performing a full refresh. CDC captures all inserts, updates and deletes from all systems that interface with the database, including front-end applications and database processes such as triggers. If metadata is present, CDC can also help satisfy compliance regulations.
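One way CDC can capture every insert, update and delete regardless of which application issued it is to have database triggers write each change to a log table that downstream ETL reads. The sketch below uses Python with SQLite purely for illustration; the table, trigger and column names are assumptions, not an actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vendors (uid INTEGER PRIMARY KEY, city TEXT);

    -- Change log that the ETL process reads instead of re-scanning the table.
    CREATE TABLE vendors_cdc (op TEXT, uid INTEGER, city TEXT);

    CREATE TRIGGER vendors_ins AFTER INSERT ON vendors
    BEGIN INSERT INTO vendors_cdc VALUES ('I', NEW.uid, NEW.city); END;

    CREATE TRIGGER vendors_upd AFTER UPDATE ON vendors
    BEGIN INSERT INTO vendors_cdc VALUES ('U', NEW.uid, NEW.city); END;

    CREATE TRIGGER vendors_del AFTER DELETE ON vendors
    BEGIN INSERT INTO vendors_cdc VALUES ('D', OLD.uid, OLD.city); END;
""")

# Any client touching the table is captured, no matter which application it is.
conn.execute("INSERT INTO vendors VALUES (1234, 'MUNICH')")
conn.execute("UPDATE vendors SET city = 'BERLIN' WHERE uid = 1234")
conn.execute("DELETE FROM vendors WHERE uid = 1234")

changes = conn.execute("SELECT op, uid, city FROM vendors_cdc").fetchall()
# changes now holds one row per insert, update and delete
```

Real SQL Server deployments would more likely use its built-in change capture features than hand-written triggers, but the capture principle is the same.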
Changes can be detected in four different ways: via audit columns, database log scraping, timed extracts, or a full database difference comparison. See Figure 5 for a comparison of CDC methods.

Audit columns can be very easy to use, but they can also be unreliable. If front-end applications modify data or there are null values in the data, audit columns should not be used to detect changes. If the administrator is certain that only database triggers are used to update metadata, audit columns may be a good option.

Database log scraping should be used as a last resort. While this method is slightly more reliable than using audit columns to find changes, it can be very error-prone and tedious to build a system that takes a snapshot of a log, extracts useful information from that log, and finally acts on that information accordingly. Furthermore, log files tend to be the first thing erased when database performance and storage volumes are suffering, resulting in missed change captures.

Figure 5: Comparison of CDC Methods

METHOD | IMPLEMENTATION & SPEED | ACCURACY
AUDIT COLUMNS | Fast, easy implementation | If database triggers are used to modify metadata, highly accurate
DATABASE LOG SCRAPING | Tedious and time-consuming | Highly prone to error due to the nature of log file scraping; an alternative method is needed if the DBA empties log files to ensure database performance
TIMED EXTRACTS | Fast, but manual cleanup is often required and can be time-consuming | Very unreliable; mid-job failures or job skips can cause large amounts of data to be missed
FULL DIFFERENCE COMPARISON | Somewhat easy to implement, but highly resource-intensive | Highly accurate

Timed extracts are notoriously unreliable, but novice DBAs often mistakenly choose this technique. Here, an extract of the data is taken at a specific time, capturing the data changed within a particular timeframe. If the process fails before it completes all steps, duplicate rows can be introduced into the table. A failed or stopped process will cause entire sets of data to be missed.
In the case of failed or skipped processes, an administrator is left with the tedious task of cleaning up duplicate rows and identifying which rows should be included in the next CDC run and which should be excluded.

A full database difference comparison is the only method that is guaranteed to find all changes. Unfortunately, it can be very resource-intensive to run a full diff compare, since snapshots are compared record by record. To improve performance, a checksum can be used to quickly determine whether a record has changed. This method is a good choice for environments where reliability and accuracy are the primary concerns.
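The checksum optimization mentioned above can be sketched as follows: each record is reduced to a single hash, so two snapshots can be compared key by key without inspecting every field. The record contents here are illustrative:

```python
import hashlib

def row_checksum(row):
    """Stable checksum of one record; cheap to compare, avoids field-by-field diffs."""
    return hashlib.md5("|".join(str(v) for v in row.values()).encode()).hexdigest()

def diff_snapshots(before, after):
    """Compare two snapshots keyed by primary key using checksums only."""
    changed = [k for k in before if k in after
               and row_checksum(before[k]) != row_checksum(after[k])]
    inserted = [k for k in after if k not in before]
    deleted = [k for k in before if k not in after]
    return inserted, changed, deleted

# Hypothetical snapshots of the vendor table before and after a load.
before = {1234: {"vendor": "ACME, INC", "city": "MUNICH"}}
after  = {1234: {"vendor": "ACME, INC", "city": "BERLIN"},
          5678: {"vendor": "GLOBEX", "city": "ZURICH"}}

inserted, changed, deleted = diff_snapshots(before, after)
```

In practice the checksums would be computed and stored during each load, so only the hashes, not full record snapshots, need to be kept around for comparison.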
The OLTP Pattern

OLTP (Online Transaction Processing) is a very popular method for processing data. Data input is gathered, the information is processed, and the data is updated accordingly, all in real time. Most front-end applications that allow users to interface with a database use OLTP. If you've ever used a credit card to pay for something, you've used OLTP (see Figure 6). You swiped your card (input), the credit card machine or website sent the data to your card company (information gathering), and your card was charged according to your purchase (data update).

OLTP is an all-or-nothing process: if any step in the process fails, the entire operation must fail. If you swiped your card and funds were verified, but the system failed to actually charge you for your purchase, the vendor would not get their money; therefore the whole process must fail.

Figure 6: Sample OLTP Process

OLTP's design makes it ideal for real-time, mission-critical operations. If a process has zero tolerance for error, OLTP is the pattern of choice. Additionally, OLTP supports concurrent operations: you and other customers can all make purchases from the same vendor at the same time without having to wait for another transaction to finish. This is another reason OLTP is a good choice for front-end applications.

When implementing an OLTP system, it is important to keep queries highly selective and highly optimized. Query times must be kept to a minimum or users will get tired of waiting and abandon the task at hand. To improve performance, the data used by an OLTP system should be highly normalized, and transactions can be distributed across multiple machines or networks if anticipated traffic requires additional processing power and memory.

Traditionally, OLTP updates have followed the SCD1 pattern, and history is erased when an update is made. In the next section, we learn how to preserve historical data and use SCD2 in an OLTP environment.
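The all-or-nothing behavior described above maps directly onto database transactions: if any step fails, the whole unit of work is rolled back. A minimal sketch in Python with SQLite, using an artificial mid-process failure in place of a real card-network error:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (holder TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('customer', 100), ('vendor', 0)")
conn.commit()

def charge(conn, amount):
    """Debit the customer and credit the vendor as one atomic transaction."""
    try:
        conn.execute("UPDATE accounts SET balance = balance - ? "
                     "WHERE holder = 'customer'", (amount,))
        # Simulate the final step failing after funds were verified and debited.
        if amount > 50:
            raise RuntimeError("card network timeout")
        conn.execute("UPDATE accounts SET balance = balance + ? "
                     "WHERE holder = 'vendor'", (amount,))
        conn.commit()
    except Exception:
        conn.rollback()   # all-or-nothing: undo the partial debit

charge(conn, 75)   # fails mid-way; rollback leaves both balances unchanged
charge(conn, 30)   # succeeds; customer is debited and vendor is credited
```

The key point is that the debit never becomes visible on its own: either both updates commit or neither does.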
Business Case: Using SCD and CDC in an OLTP Environment

Our client, a newly formed company, required an infrastructure setup that would support multi-system integrations between its customer and partner systems. They were using a Microsoft stack, so SQL Server 2012 was the database of choice, while SQL Server Integration Services (SSIS), Analysis Services (SSAS) and Reporting Services (SSRS) were chosen as supporting applications. Code was written in T-SQL and C# and managed using Team Foundation Server (TFS).

The Process

A process was designed that would reduce the impact on performance while making historical and current data easily available to customers and partners (see Figure 7). First, the customer sends three files to our client via a secure inbox, containing the deletes, updates and inserts to their database. That data is then imported into an operational data store (ODS). If the customer has not yet configured partner system credentials and integration parameters, they can log in to a customer portal to do so.

Figure 7: SCD1 & SCD2 Mix in OLTP Environment

After data is imported into the ODS, the unmodified data from both the partner systems and the ODS is loaded into a pre-staging environment. The data is then enriched with SCD2 metadata elements, including valid-from and valid-to dates and a current-record flag. The enriched data is imported into a persistent staging environment. Any changes to the data are then made in the core database using SCD1 in SSIS. When changes are completed in the core database, CDC is used to detect those changes, which are then sent to the other connected databases, excluding the database the change originated from. No deletes are made in the ODS; instead, the record is marked as inactive.
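As a rough illustration of two of the steps above, the SCD2 metadata enrichment and the mark-inactive rule for deletes might look like the following. This is a hypothetical Python sketch, not the client's actual SSIS/T-SQL code; the field names follow the earlier figures:

```python
def enrich_with_scd2_metadata(rows, load_date):
    """Add the SCD2 metadata used in the staging environment to raw ODS/partner rows."""
    return [dict(row,
                 valid_from=load_date,
                 valid_to="9999-12-31",   # open-ended until a later load expires it
                 curr_fl="Y")
            for row in rows]

def soft_delete(rows, uid):
    """Deletes are never physical in the ODS: the record is only marked inactive."""
    return [dict(row, curr_fl="N") if row["uid"] == uid else row for row in rows]

# Hypothetical incoming row and load date for the sketch.
staged = enrich_with_scd2_metadata([{"uid": 1234, "vendor": "ACME, INC"}],
                                   "2016-06-01")
staged = soft_delete(staged, 1234)
```

Keeping the soft-deleted row means downstream CDC still sees the change as an ordinary update, and history remains queryable via the metadata columns.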
Advantages

The biggest benefit of this mixed approach is the preservation of historical data without rapidly expanding the size of the production database. As a result, we can comply with the law while keeping the production database stable and responsive. Because history is intact, analysis can be performed on the staging database; in fact, an analysis component is planned for the project discussed above. With analysis computations taking place on the staging database, we are able to preserve resources on the production database for OLTP operations. If any analysis is performed on the production database, queries will perform better, since historical records do not need to be filtered from current records. Additionally, users are clearer about the data they are querying and do not need to decipher what metadata columns such as the current flag and validity dates are used for and how they might impact a query. Lastly, downstream systems and the DWH will integrate changes more smoothly, since each record has its own primary key that will not change. Database triggers will also perform reliably.

Disadvantages

Compared to most patterns, there are few disadvantages to this mixed approach. The main concern is that the multiple steps in the process introduce more potential points of failure, as with any multi-step process. Because there are more steps, troubleshooting failures and errors will be more time-consuming than with a single-pattern approach. If any analysis computations are performed on production data, they will need to be rebuilt any time there is an update; however, using the staging environment for complex analytical functions negates this issue.

Conclusion

In an OLTP environment, reliability and speed are paramount. Combining the SCD1 and SCD2 approaches allows us to benefit from the advantages of both patterns without suffering many of the disadvantages.
When this mixed approach is implemented, it can be an ideal solution for the legal, IT, marketing and analytics departments without sacrificing customer and user workflows.
More informationIntroduction to SSIS. Or you want to take some data, change it, and put it somewhere else? Then boy do I have THE tool for you!
Introduction to SSIS Or you want to take some data, change it, and put it somewhere else? Then boy do I have THE tool for you! Who am I? Ed Watson Data Services Consultant or Ambassador of Mayhem Twitter:
More informationTaking the Integrated Data Warehouse Global:
Taking the Integrated Data Warehouse Global: Part 1 The IDW Architecture 3.16 EB9305 ANALYTICS What happens when the CEO says he wants a global view of his business all in one place, complete with drill
More informationMicrosoft Implementing a Data Warehouse with Microsoft SQL Server 2014
1800 ULEARN (853 276) www.ddls.com.au Microsoft 20463 - Implementing a Data Warehouse with Microsoft SQL Server 2014 Length 5 days Price $4290.00 (inc GST) Version D Overview Please note: Microsoft have
More informationProvide Real-Time Data To Financial Applications
Provide Real-Time Data To Financial Applications DATA SHEET Introduction Companies typically build numerous internal applications and complex APIs for enterprise data access. These APIs are often engineered
More informationChoosing the Right Cloud Computing Model for Data Center Management
Choosing the Right Cloud Computing Model for Data Center Management www.nsi1.com NETWORK SOLUTIONS INCOPORATED NS1.COM UPDATING YOUR NETWORK SOLUTION WITH CISCO DNA CENTER 1 Section One Cloud Computing
More informationFocus On: Oracle Database 11g Release 2
Focus On: Oracle Database 11g Release 2 Focus on: Oracle Database 11g Release 2 Oracle s most recent database version, Oracle Database 11g Release 2 [11g R2] is focused on cost saving, high availability
More informationAbstract. The Challenges. ESG Lab Review InterSystems IRIS Data Platform: A Unified, Efficient Data Platform for Fast Business Insight
ESG Lab Review InterSystems Data Platform: A Unified, Efficient Data Platform for Fast Business Insight Date: April 218 Author: Kerry Dolan, Senior IT Validation Analyst Abstract Enterprise Strategy Group
More informationIntroduction to Customer Data Platforms
Introduction to Customer Data Platforms Introduction to Customer Data Platforms Overview Many marketers are struggling to assemble the unified customer data they need for successful marketing programs.
More informationDATABASE SCALE WITHOUT LIMITS ON AWS
The move to cloud computing is changing the face of the computer industry, and at the heart of this change is elastic computing. Modern applications now have diverse and demanding requirements that leverage
More informationMicrosoft SharePoint Server 2013 Plan, Configure & Manage
Microsoft SharePoint Server 2013 Plan, Configure & Manage Course 20331-20332B 5 Days Instructor-led, Hands on Course Information This five day instructor-led course omits the overlap and redundancy that
More informationMOBIUS + ARKIVY the enterprise solution for MIFID2 record keeping
+ Solution at a Glance IS A ROBUST AND SCALABLE ENTERPRISE CONTENT ARCHIVING AND MANAGEMENT SYSTEM. PAIRED WITH THE DIGITAL CONTENT GATEWAY, YOU GET A UNIFIED CONTENT ARCHIVING AND INFORMATION GOVERNANCE
More informationMOC 20463C: Implementing a Data Warehouse with Microsoft SQL Server
MOC 20463C: Implementing a Data Warehouse with Microsoft SQL Server Course Overview This course provides students with the knowledge and skills to implement a data warehouse with Microsoft SQL Server.
More informationSecuring Amazon Web Services (AWS) EC2 Instances with Dome9. A Whitepaper by Dome9 Security, Ltd.
Securing Amazon Web Services (AWS) EC2 Instances with Dome9 A Whitepaper by Dome9 Security, Ltd. Amazon Web Services (AWS) provides business flexibility for your company as you move to the cloud, but new
More informationColumnstore Technology Improvements in SQL Server 2016
Columnstore Technology Improvements in SQL Server 2016 Subtle Subtitle AlwaysOn Niko Neugebauer Our Sponsors Niko Neugebauer Microsoft Data Platform Professional OH22 (http://www.oh22.net) SQL Server MVP
More informationMcAfee Total Protection for Data Loss Prevention
McAfee Total Protection for Data Loss Prevention Protect data leaks. Stay ahead of threats. Manage with ease. Key Advantages As regulations and corporate standards place increasing demands on IT to ensure
More informationAccelerated SQL Server 2012 Integration Services
1 Accelerated SQL Server 2012 Integration Services 4 Days (BI-ISACL12-301-EN) Description This 4-day instructor led training focuses on developing and managing SSIS 2012 in the enterprise. In this course,
More informationPhire Frequently Asked Questions - FAQs
Phire Frequently Asked Questions - FAQs Phire Company Profile Years in Business How long has Phire been in business? Phire was conceived in early 2003 by a group of experienced PeopleSoft professionals
More informationFederal Agencies and the Transition to IPv6
Federal Agencies and the Transition to IPv6 Introduction Because of the federal mandate to transition from IPv4 to IPv6, IT departments must include IPv6 as a core element of their current and future IT
More informationData Replication Buying Guide
Data Replication Buying Guide 1 How to Choose a Data Replication Solution IT professionals are increasingly turning to heterogenous data replication to modernize data while avoiding the costs and risks
More informationMicrosoft SQL Server More solutions. More value. More reasons to buy.
Microsoft SQL Server 2005 More solutions. More value. More reasons to buy. Microsoft SQL Server 2005 is a nextgeneration data management and analysis solution. A solution that helps organizations deliver
More informationFundamentals of Information Systems, Seventh Edition
Chapter 3 Data Centers, and Business Intelligence 1 Why Learn About Database Systems, Data Centers, and Business Intelligence? Database: A database is an organized collection of data. Databases also help
More informationEnabling Performance & Stress Test throughout the Application Lifecycle
Enabling Performance & Stress Test throughout the Application Lifecycle March 2010 Poor application performance costs companies millions of dollars and their reputation every year. The simple challenge
More informationCourse Number : SEWI ZG514 Course Title : Data Warehousing Type of Exam : Open Book Weightage : 60 % Duration : 180 Minutes
Birla Institute of Technology & Science, Pilani Work Integrated Learning Programmes Division M.S. Systems Engineering at Wipro Info Tech (WIMS) First Semester 2014-2015 (October 2014 to March 2015) Comprehensive
More informationTaking a First Look at Excel s Reporting Tools
CHAPTER 1 Taking a First Look at Excel s Reporting Tools This chapter provides you with an overview of Excel s reporting features. It shows you the principal types of Excel reports and how you can use
More informationWHAT CIOs NEED TO KNOW TO CAPITALIZE ON HYBRID CLOUD
WHAT CIOs NEED TO KNOW TO CAPITALIZE ON HYBRID CLOUD 2 A CONVERSATION WITH DAVID GOULDEN Hybrid clouds are rapidly coming of age as the platforms for managing the extended computing environments of innovative
More informationChimpegration for The Raiser s Edge
Chimpegration for The Raiser s Edge Overview... 3 Chimpegration Versions... 3 Chimpegration Basic... 3 Chimpegration Professional... 3 The Raiser s Edge Versions... 3 Installation... 3 Set up... 4 Activation...
More informationCSC 261/461 Database Systems Lecture 20. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101
CSC 261/461 Database Systems Lecture 20 Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101 Announcements Project 1 Milestone 3: Due tonight Project 2 Part 2 (Optional): Due on: 04/08 Project 3
More informationBuilding a Data Strategy for a Digital World
Building a Data Strategy for a Digital World Jason Hunter, CTO, APAC Data Challenge: Pushing the Limits of What's Possible The Art of the Possible Multiple Government Agencies Data Hub 100 s of Service
More informationALIGNING CYBERSECURITY AND MISSION PLANNING WITH ADVANCED ANALYTICS AND HUMAN INSIGHT
THOUGHT PIECE ALIGNING CYBERSECURITY AND MISSION PLANNING WITH ADVANCED ANALYTICS AND HUMAN INSIGHT Brad Stone Vice President Stone_Brad@bah.com Brian Hogbin Distinguished Technologist Hogbin_Brian@bah.com
More information
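The SCD Type 1 behaviour described above, overwriting the existing dimension value in place so that all history is lost, can be sketched with an in-memory SQLite table. The table and column names here are illustrative assumptions, not taken from the paper.

```python
import sqlite3

# Minimal SCD Type 1 sketch: an update simply overwrites the
# old attribute value, so the dimension keeps no history.
# dim_customer and its columns are hypothetical examples.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE dim_customer (
        customer_key INTEGER PRIMARY KEY,
        customer_id  TEXT,
        city         TEXT
    )
""")
cur.execute("INSERT INTO dim_customer VALUES (1, 'C-100', 'Zurich')")

# The customer moves; SCD1 updates the row in place.
cur.execute(
    "UPDATE dim_customer SET city = ? WHERE customer_id = ?",
    ("Basel", "C-100"),
)
conn.commit()

row = cur.execute(
    "SELECT customer_key, customer_id, city FROM dim_customer"
).fetchone()
print(row)  # -> (1, 'C-100', 'Basel'); the previous city is gone
```

Note that the surrogate key is reused and only the current value survives, which is why SCD1 suits error corrections but can conflict with retention or compliance requirements.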