The University of Iowa Intelligent Systems Laboratory

Data Warehousing

Andrew Kusiak
2139 Seamans Center
Iowa City, IA 52242-1527
andrew-kusiak@uiowa.edu
http://www.icaen.uiowa.edu/~ankusiak
Tel. 319-335 5934

Outline
- Introduction
- Data warehousing concepts
- Relationship to data mining
- Applications

Definition (1/3)
Data warehouse = An enabled database designed to support a large volume of data at a high level of performance, usability, and manageability.

Definition (2/3)
- A data warehouse is a copy of transaction data specifically structured for querying and reporting.
- The form of the stored data has nothing to do with whether something is a data warehouse.
- Data warehousing is not necessarily for the needs of "decision makers" or used in the process of decision making.
Ralph Kimball

Definition (3/3)
- Easy data access
- Quick data access
- Low cost data access
- Accurate data access

Data Warehousing (1/2)
- Data warehousing systems, for the most part, store historical data that have been generated in internal transaction processing systems. This is a small part of the universe of data available to manage a business. Sometimes this part has limited value.
- Data warehousing systems can complicate business processes.
- Data warehousing can have a learning curve that may be too long for impatient firms.

DW Architecture
[Figure: DW architecture with an automotive analogy. Recoverable labels: Trans DB, Legacy Applications, Integration/Transformation Layer, Data Mart(s), Exploration. Analogy labels: individual parts; subassemblies, e.g., gear box; car.]

Data Warehousing (2/2)
- Data warehousing can become an exercise in data for the sake of the data.
- Data warehousing systems can require a great deal of "maintenance" which many organizations cannot or will not support.
- Sometimes the cost to capture data, clean it up, and deliver it in a format and time frame that is useful for the end users is too much to bear.
http://www.dwinfocenter.org/

Successful (1/3)
- From day one establish that warehousing is a joint user/builder project.
- Establish that maintaining data quality will be an ongoing joint user/builder responsibility.
- Train the users one step at a time.
- Train the users about the data stored in the data warehouse.
- Consider doing a high level corporate data model / data warehouse architecture "exercise" in three weeks.
- Implement a user accessible automated directory to information stored in the warehouse.

Successful (2/3)
- Once you know what raw data you want to feed into the data warehouse, request that data.
- Determine a plan to test the integrity of the data in the warehouse.
- From the start get warehouse users in the habit of testing complex queries.
- Coordinate system roll-out with network administration personnel.
- Have a good grasp of desktop databases and spreadsheets.

Successful (3/3)
- Be prepared to support beginning users immediately and at any time.
- Maintain the audit trail to the feeder systems.
- Market and sell your data warehousing systems.

Decision Support System
A decision support system or tool is one specifically designed to allow business end users to perform computer generated analyses of data on their own.

- Designed for performing analytical tasks using a variety of data.
- Supports a relatively small number of users with relatively long interaction loads.
- Its usage is read-intensive.
- Its content is periodically updated, mostly through additions.
- It contains relatively few large tables.
- Each query normally produces a large result set.

Current detail data = Data acquired directly from the transactional database, frequently representing an entire application (e.g., enterprise).
Old detail data = Previously stored current detail data allowing for analysis of trends.
Data mart = An implementation of the data warehouse with a limited scope of data. A data warehouse may be a collection of gradually constructed data marts.
Summarized data = Data aggregated for executive reporting, trend analysis, and decision-making.
Drill-down = A capability of performing data analysis in a top-down fashion. The summary data can be decomposed into current and old detail data.
Metadata = (Data about data) A description of all data items, their location, sources, structure, content, end-user views, and so on.

- Tabular form reporting.
- Information mapping, e.g., mapping spatial data.
- Complex queries and sophisticated criteria search.
- Ranking.
- Multivariable analysis.
- Time series analysis.
- Data visualization, graphing, charting, and pivoting.
- Complex textual search.
- Advanced statistical analysis.
- Trend analysis.
- Pattern and associations discovery.

OLAP tools
Data mining tools
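The relationship between summarized data and drill-down can be illustrated with a minimal Python sketch. The records, the function names `summarize` and `drill_down`, and the choice of market and quarter as grouping keys are all hypothetical and serve only as an illustration; a real warehouse would run such queries in the database engine.

```python
from collections import defaultdict

# Hypothetical detail records: (market, quarter, units). Illustrative only.
DETAIL = [
    ("Chicago", "Q1", 100),
    ("Chicago", "Q2", 150),
    ("Atlanta", "Q1", 80),
    ("Atlanta", "Q2", 120),
]

def summarize(records, key_indexes):
    """Aggregate units over the chosen key columns (a simple group-by sum)."""
    totals = defaultdict(int)
    for rec in records:
        key = tuple(rec[i] for i in key_indexes)
        totals[key] += rec[-1]
    return dict(totals)

def drill_down(records, market):
    """Decompose one market's summary figure into its per-quarter detail."""
    subset = [rec for rec in records if rec[0] == market]
    return summarize(subset, key_indexes=[1])

# Summarized data: one total per market.
summary = summarize(DETAIL, key_indexes=[0])

# Drill-down: the Chicago total broken out by quarter.
chicago_detail = drill_down(DETAIL, "Chicago")
```

The per-quarter figures returned by `drill_down` add up to the corresponding entry in `summary`, which is the property that lets an analyst move from a summary down to the detail behind it.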

OLAP
OLTP (On-line transaction processing) = Processing in traditional databases that are also called transactional databases.
OLAP (On-line analytical processing) = Analysis for maximum data usability.

Data Mining
Data mining = In-depth processing of data leading to discovery of non-obvious relationships.

Data Warehousing
- Quick location of the right information.
- Presentation of information in the needed form.
- Testing of hypotheses.
- Knowledge discovery.
- Sharing the analysis results.

Data Warehousing
- Improved product inventory turnover.
- Improved selection of targeted markets reduces the product introduction cost.
- More effective decision-making.
- More effective business intelligence.
- Enhanced asset and liability management due to the big picture view provided by the data warehouse.

Data Warehousing
- Improved productivity due to the single source of information.
- Reduced redundancy in information processing.
- Enhanced customer relationship.
- Enabler of business process reengineering and breakthrough idea generation by providing useful insights into the processes.

Relational Table

Product  Market   Time  No. of Units
P1       Chicago  Q1    1000
P2       Chicago  Q2    1200
P3       Chicago  Q3    1500
P4       Chicago  Q4    2000
P5       Atlanta  Q1    1400
P6       Atlanta  Q2    1600
P7       Atlanta  Q3    1100
P8       Atlanta  Q4    1900
P9       Paris    Q1    1300
P10      Paris    Q2    1000
P11      Paris    Q3    1900
P12      Paris    Q3    1400
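As a rough illustration of how such a relational table supports analytical queries, the sketch below loads the rows as tuples and performs a roll-up (total units per market) and a slice (only the Paris rows). The function names `roll_up` and `slice_rows` are hypothetical, and treating the quarter values as a `time` column is an assumption about the table's layout.

```python
from collections import defaultdict

# Rows copied from the relational table: (product, market, time, units).
# The name "time" for the quarter column is an assumption.
ROWS = [
    ("P1", "Chicago", "Q1", 1000),
    ("P2", "Chicago", "Q2", 1200),
    ("P3", "Chicago", "Q3", 1500),
    ("P4", "Chicago", "Q4", 2000),
    ("P5", "Atlanta", "Q1", 1400),
    ("P6", "Atlanta", "Q2", 1600),
    ("P7", "Atlanta", "Q3", 1100),
    ("P8", "Atlanta", "Q4", 1900),
    ("P9", "Paris", "Q1", 1300),
    ("P10", "Paris", "Q2", 1000),
    ("P11", "Paris", "Q3", 1900),
    ("P12", "Paris", "Q3", 1400),
]

PRODUCT, MARKET, TIME, UNITS = range(4)  # column positions in each row

def roll_up(rows, dim):
    """Sum the units over every value of one dimension column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dim]] += row[UNITS]
    return dict(totals)

def slice_rows(rows, dim, value):
    """Keep only the rows whose dimension column equals the given value."""
    return [row for row in rows if row[dim] == value]

units_by_market = roll_up(ROWS, MARKET)
paris_rows = slice_rows(ROWS, MARKET, "Paris")
```

Each roll-up removes detail along the dimensions that are summed away, while a slice keeps the detail but restricts it to a single value of one dimension.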

Data Cube
[Figure: data cube with axes Products, Markets, and Time. Labels shown: P1; Paris, Atlanta, Chicago; Q1, Q2, Q3, Q4; cell values 1000, 1200, 1500, 2000.]
Cuboids [1D cuboid, 2D cuboid, 3D cuboid, etc.]
[3D cuboid = data cube]

Lattice of cuboids
[Figure: lattice of cuboids]
0D (apex) cuboid: All
1D cuboids: Market; Product; Time
2D cuboids: Market, Product; Market, Time; Product, Time
3D cuboid: Market, Product, Time
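The lattice of cuboids is the set of all subsets of the cube's dimensions, grouped by how many dimensions each subset keeps. A minimal Python sketch follows; the dimension names (in particular `time` for the Q1 to Q4 axis) and the function name `cuboid_lattice` are assumptions made for illustration.

```python
from itertools import combinations

# Assumed dimension names for the cube; "time" stands for the Q1..Q4 axis.
DIMENSIONS = ("market", "product", "time")

def cuboid_lattice(dimensions):
    """Map each level k to the list of cuboids that keep exactly k dimensions."""
    return {
        k: list(combinations(dimensions, k))
        for k in range(len(dimensions) + 1)
    }

lattice = cuboid_lattice(DIMENSIONS)

for k, cuboids in lattice.items():
    label = "0D (apex)" if k == 0 else f"{k}D"
    print(label, cuboids)
```

For n dimensions the lattice contains 2^n cuboids, so the three-dimensional cube here has eight: one apex cuboid, three 1D cuboids, three 2D cuboids, and the 3D cuboid that is the data cube itself.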