Data Mining & Data Warehouse

Similar documents
Data Mining. Asso. Profe. Dr. Raed Ibraheem Hamed. University of Human Development, College of Science and Technology Department of CS (1)

Data Mining. Associate Professor Dr. Raed Ibraheem Hamed. University of Human Development, College of Science and Technology

Data Mining & Data Warehouse

Data Mining. Dr. Raed Ibraheem Hamed. University of Human Development, College of Science and Technology Department of Computer Science

STRATEGIC INFORMATION SYSTEMS IV STV401T / B BTIP05 / BTIX05 - BTECH DEPARTMENT OF INFORMATICS. By: Dr. Tendani J. Lavhengwa

INTRODUCTORY INFORMATION TECHNOLOGY ENTERPRISE DATABASES AND DATA WAREHOUSES. Faramarz Hendessi

Information Management course

QM Chapter 1 Database Fundamentals Version 10 th Ed. Prepared by Dr Kamel Rouibah / Dept QM & IS

A Novel Approach of Data Warehouse OLTP and OLAP Technology for Supporting Management prospective

Question Bank. 4) It is the source of information later delivered to data marts.

Overview. Introduction to Data Warehousing and Business Intelligence. BI Is Important. What is Business Intelligence (BI)?

DATAWAREHOUSING AND ETL PROCESSES: An Explanatory Research

Data Warehousing. Ritham Vashisht, Sukhdeep Kaur and Shobti Saini

CHAPTER 3 Implementation of Data warehouse in Data Mining

The Evolution of Data Warehousing. Data Warehousing Concepts. The Evolution of Data Warehousing. The Evolution of Data Warehousing

Chapter 1 Database System Concepts and Architecture. Nguyen Thi Ai Thao

Seminars of Software and Services for the Information Society. Data Warehousing Design Issues

Training 24x7 DBA Support Staffing. MCSA:SQL 2016 Business Intelligence Development. Implementing an SQL Data Warehouse. (40 Hours) Exam

Full file at

Chapter 6. Foundations of Business Intelligence: Databases and Information Management VIDEO CASES

Business Intelligence and Decision Support Systems

Introduction to DWML. Christian Thomsen, Aalborg University. Slides adapted from Torben Bach Pedersen and Man Lung Yiu

Data Warehouse and Data Mining

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda

A New Approach of Extraction Transformation Loading Using Pipelining

DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY

Duration: 5 Days. EZY Intellect Pte. Ltd.,

Implementing a SQL Data Warehouse

Fig 1.2: Relationship between DW, ODS and OLTP Systems

Data Warehouse and Data Mining

Testing Masters Technologies

Implementing a Data Warehouse with Microsoft SQL Server 2014

Data Warehousing Introduction. Toon Calders

DATA WAREHOUSING APPLICATIONS IN AYURVEDA THERAPY

Chapter 3. Foundations of Business Intelligence: Databases and Information Management

Data Warehouse and Data Mining

DATABASE DEVELOPMENT (H4)

Microsoft Implementing a Data Warehouse with Microsoft SQL Server 2014

Information Integration

CS377: Database Systems Data Warehouse and Data Mining. Li Xiong Department of Mathematics and Computer Science Emory University

Fast Innovation requires Fast IT

Fundamentals of Information Systems, Seventh Edition

CT75 DATA WAREHOUSING AND DATA MINING DEC 2015

Data Quality Architecture and Options

Data Mining. Dr. Raed Ibraheem Hamed. University of Human Development, College of Science and Technology Department of Computer Science

Management Information Systems MANAGING THE DIGITAL FIRM, 12 TH EDITION FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT

This tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing.

Chapter 3. Database Architecture and the Web

Exam /Course 20767B: Implementing a SQL Data Warehouse

by Prentice Hall

This tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining.

Sql Fact Constellation Schema In Data Warehouse With Example

Data-Driven Driven Business Intelligence Systems: Parts I. Lecture Outline. Learning Objectives

20767B: IMPLEMENTING A SQL DATA WAREHOUSE

Implementing a Data Warehouse with Microsoft SQL Server 2014 (20463D)

Data Mining and Warehousing

Data Warehouse and Mining

IJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 06, 2016 ISSN (online):

Knowledge Modelling and Management. Part B (9)

Data Integration and Data Warehousing Database Integration Overview

Data Warehousing & Mining. Data integration. OLTP versus OLAP. CPS 116 Introduction to Database Systems

Knowledge/Data Management. MIS 4133 Software Systems

Data Warehousing & Mining Techniques

Outline. Managing Information Resources. Concepts and Definitions. Introduction. Chapter 7

Chapter 6 VIDEO CASES

Figure 1-1a Data in context. Context helps users understand data

Implementing a SQL Data Warehouse

Managing Changes to Schema of Data Sources in a Data Warehouse

Managing a Multi-iierData Warehousing Environment with the SAS/Warehouse Adminlstrator. Darrell Barton, SAS Institute Inc.

1. Inroduction to Data Mininig

IT1105 Information Systems and Technology. BIT 1 ST YEAR SEMESTER 1 University of Colombo School of Computing. Student Manual

High Speed ETL on Low Budget

2. Summary. 2.1 Basic Architecture. 2. Architecture. 2.1 Staging Area. 2.1 Operational Data Store. Last week: Architecture and Data model

Data Warehousing. Seminar report. Submitted in partial fulfillment of the requirement for the award of degree Of Computer Science

Data Warehousing ETL. Esteban Zimányi Slides by Toon Calders

Data Warehousing & Mining Techniques

What s New in SAS/Warehouse Administrator

DATA WAREHOUSE- MODEL QUESTIONS

Aggregating Knowledge in a Data Warehouse and Multidimensional Analysis

OBJECTIVES DEFINITIONS CHAPTER 1: THE DATABASE ENVIRONMENT AND DEVELOPMENT PROCESS. Figure 1-1a Data in context

Databases and Data Warehouses

1/12/2018. APPA Institute Dallas, TX Feb DATA INTEGRATION PURPOSE OF TODAY S PRESENTATION

Managing Data Resources

Data Warehousing and Data Mining. Announcements (December 1) Data integration. CPS 116 Introduction to Database Systems

ETL Interview Question Bank

Decision Guidance. Data Vault in Data Warehousing

Whitepaper. Solving Complex Hierarchical Data Integration Issues. What is Complex Data? Types of Data

Evolving To The Big Data Warehouse

Application software office packets, databases and data warehouses.

After completing this course, participants will be able to:

Data Mining: Approach Towards The Accuracy Using Teradata!

Managing Information Resources

Implement a Data Warehouse with Microsoft SQL Server

Summary of Last Chapter. Course Content. Chapter 2 Objectives. Data Warehouse and OLAP Outline. Incentive for a Data Warehouse

CSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores

20463C-Implementing a Data Warehouse with Microsoft SQL Server. Course Content. Course ID#: W 35 Hrs. Course Description: Audience Profile

VERITAS Storage Foundation for Windows FlashSnap Option

Copyright 2016 Datalynx Pty Ltd. All rights reserved. Datalynx Enterprise Data Management Solution Catalogue

SAS Data Integration Studio 3.3. User s Guide

DKMS Brief No. Five: Is Data Staging Relational? A Comment

Transcription:

Data Mining & Data Warehouse Asso. Profe. Dr. Raed Ibraheem Hamed University of Human Development, College of Science and Technology Department of Information Technology 2016 2017 (1)

Points to Cover Problem: Heterogeneous Information Sources Data integration Extract, Transform, Load (ETL) Data Warehouse With The ETL Process Problem: Data Management in Large Enterprises Goals of Data Integration Current Solutions Three-layer Architecture: Conceptual View Generic Warehouse Architecture Department of IT- DMDW - UHD 1-2

Problem: Heterogeneous Information Sources Heterogeneities are everywhere Personal Databases Scientific Databases Different interfaces Different data representations Duplicate and inconsistent information Digital Libraries World Wide Web Department of IT- DMDW - UHD 3

Data integration Data integration involves combining data residing in different sources and providing users with a unified view of these data. This process becomes significant in a variety of situations, which include both commercial (when two similar companies need to merge their databases) and scientific research results. Department of IT- DMDW - UHD 4

Goal: Unified Access to Data Integration System World Wide Web Digital Libraries Scientific Databases Personal Databases Collects and combines information Provides integrated view, uniform user interface Supports sharing among different applications Department of IT- DMDW - UHD 5

Goals of Data Integration Provide 1) Uniform (same query interface to all sources) 2) Access to (updates the database) 3) Multiple (we want many users at each time) 4) Heterogeneous (data models are different) 5) Distributed (over LAN, WAN, Internet) 6) Data Sources (not only databases). Department of IT- DMDW - UHD 6

Extract, Transform, Load (ETL) Extract, Transform, Load (ETL) refers to a process in database. Data extraction is where data is extracted from heterogeneous data sources; Data transformation where the data is transformed for storing in the proper format or structure. Data loading where the data is loaded into the final target more specifically, an operational data store, data mart, or data warehouse. Department of IT- DMDW - UHD 7

Data Warehouse With The ETL Process A simple schematic for a data warehouse with the ETL process extracts information from the source databases, transforms it and then loads it into the data warehouse. Department of IT- DMDW - UHD 8

Data Management in Large Enterprises Database Management system (DBMS) Result of different applications development of operational systems Examples Sales Planning Suppliers Control system Sales Administration Finance Manufacturing Department of IT- DMDW - UHD 9

Extract all the data into a single data source Data Warehouse Extract all the data into a single data source Query Data Warehouse Clean the data Load Periodically Data Sources Department of IT- DMDW - UHD 10

Data Warehouse Architectures: Conceptual View Single-layer Every data element is stored once only Virtual warehouse: is another term for a data warehouse. Operational systems Informational systems Real-time data Department of IT- DMDW - UHD 11

Data Warehouse Architectures: Conceptual View Two-layer Real-time + derived data Most commonly used approach in industry today Operational systems Informational systems Derived Data Real-time data Department of IT- DMDW - UHD 12

Three-layer Architecture: Conceptual View Transformation of real-time data to derived data really requires reconciliation step Operational systems Derived Data Reconciliation Real-time data Informational systems Reconciliation is the process of ensuring that two sets of records are in agreement. Example:- Reconciliation is used to ensure that the money leaving an account matches the actual money spent. Department of IT- DMDW - UHD 13

Data Warehousing: Two Distinct Issues (1) How to get information into warehouse Data warehousing (2) What to do with data once it s in warehouse Warehouse DBMS Both rich research areas Industry has focused on (2) Department of IT- DMDW - UHD 14

Issues in Data Warehousing Warehouse Design Extraction Integration Cleansing & merging Warehousing specification & Maintenance Optimizations Evolution Department of IT- DMDW - UHD 15

Generic Warehouse Architecture Client Design Phase Maintenance Query & Analysis Warehouse Integration Client Loading Metadata Extractor/ Monitor Extractor/ Monitor Extractor/ Monitor... Department of IT- DMDW - UHD 16

Problems with DW Approach Data has to be cleaned different formats Needs to store all the data from different data sources in single system Data needs to be updated periodically Data sources are independent content can change without notice Expensive because of the large quantities of data and data cleaning costs Department of IT- DMDW - UHD 17

Department of IT- DMDW - UHD Department of IT- DMDW - UHD 18