Data Warehouse and Data Mining

Similar documents
Data Warehouse and Data Mining

Data Warehouse and Data Mining

Data Warehouse and Data Mining

Data Warehouse and Data Mining

Data Mining Concepts & Techniques

DATA MINING TRANSACTION

DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY

Data Mining & Data Warehouse

Syllabus. Syllabus. Motivation Decision Support. Syllabus

CHAPTER 3 Implementation of Data warehouse in Data Mining

Data Warehousing (1)

CS614 - Data Warehousing - Midterm Papers Solved MCQ(S) (1 TO 22 Lectures)

A Novel Approach of Data Warehouse OLTP and OLAP Technology for Supporting Management prospective

Data Mining. Associate Professor Dr. Raed Ibraheem Hamed. University of Human Development, College of Science and Technology

CHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI

TIM 50 - Business Information Systems

Data Warehouse. Asst.Prof.Dr. Pattarachai Lalitrojwong

Information Management course

Data Warehouse and Data Mining

UNIT -1 UNIT -II. Q. 4 Why is entity-relationship modeling technique not suitable for the data warehouse? How is dimensional modeling different?

Evolution of Database Systems

An Overview of Data Warehousing and OLAP Technology

This tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing.

DATA MINING AND WAREHOUSING

Data Warehousing and OLAP

OLAP Introduction and Overview

Fig 1.2: Relationship between DW, ODS and OLTP Systems

INTRODUCTORY INFORMATION TECHNOLOGY ENTERPRISE DATABASES AND DATA WAREHOUSES. Faramarz Hendessi

Data warehouse architecture consists of the following interconnected layers:

DATA WAREHOUING UNIT I

1 DATAWAREHOUSING QUESTIONS by Mausami Sawarkar

Database design View Access patterns Need for separate data warehouse:- A multidimensional data model:-

Rocky Mountain Technology Ventures

TIM 50 - Business Information Systems

Data Warehouse and Data Mining

Question Bank. 4) It is the source of information later delivered to data marts.

WKU-MIS-B10 Data Management: Warehousing, Analyzing, Mining, and Visualization. Management Information Systems

Data Analysis and Data Science

Table of Contents. Knowledge Management Data Warehouses and Data Mining. Introduction and Motivation

Knowledge Management Data Warehouses and Data Mining

The Data Organization

Lectures for the course: Data Warehousing and Data Mining (IT 60107)

Chapter 13 Business Intelligence and Data Warehouses The Need for Data Analysis Business Intelligence. Objectives

Managing Data Resources

Data Warehousing ETL. Esteban Zimányi Slides by Toon Calders

IT DATA WAREHOUSING AND DATA MINING UNIT-2 BUSINESS ANALYSIS

Data Warehousing. Adopted from Dr. Sanjay Gunasekaran

Data Modeling and Databases Ch 7: Schemas. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich

Data Warehouses. Yanlei Diao. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Decision Support. Chapter 25. CS 286, UC Berkeley, Spring 2007, R. Ramakrishnan 1

Oracle 1Z0-515 Exam Questions & Answers

collection of data that is used primarily in organizational decision making.

CSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores

Decision Support, Data Warehousing, and OLAP

Warehousing. Data Mining

Data Warehousing. Overview

STRATEGIC INFORMATION SYSTEMS IV STV401T / B BTIP05 / BTIX05 - BTECH DEPARTMENT OF INFORMATICS. By: Dr. Tendani J. Lavhengwa

Deccansoft Software Services Microsoft Silver Learning Partner. SSAS Syllabus

Data Warehouse and Data Mining

Data Warehouse and Mining

Summary of Last Chapter. Course Content. Chapter 2 Objectives. Data Warehouse and OLAP Outline. Incentive for a Data Warehouse

Managing Data Resources

Designing Data Warehouses. Data Warehousing Design. Designing Data Warehouses. Designing Data Warehouses

Data Modelling for Data Warehousing

Oracle Database 11g: Data Warehousing Fundamentals

DSS based on Data Warehouse

CSPP 53017: Data Warehousing Winter 2013! Lecture 7! Svetlozar Nestorov! Class News!

Department of Industrial Engineering. Sharif University of Technology. Operational and enterprises systems. Exciting directions in systems

Data Warehousing and Decision Support

Data Mining and Warehousing

Data Warehouses and OLAP. Database and Information Systems. Data Warehouses and OLAP. Data Warehouses and OLAP

On-Line Application Processing

Databases and Data Warehouses

Data Warehousing and Decision Support. Introduction. Three Complementary Trends. [R&G] Chapter 23, Part A

Managing Information Resources

CHAPTER 8: ONLINE ANALYTICAL PROCESSING(OLAP)

CS 245: Database System Principles. Warehousing. Outline. What is a Warehouse? What is a Warehouse? Notes 13: Data Warehousing

Oracle #1 RDBMS Vendor

Chapter 3: AIS Enhancements Through Information Technology and Networks

Chapter 4, Data Warehouse and OLAP Operations

DHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI

Computers Are Your Future

The University of Iowa Intelligent Systems Laboratory The University of Iowa Intelligent Systems Laboratory

REPORTING AND QUERY TOOLS AND APPLICATIONS

Chapter 3. Databases and Data Warehouses: Building Business Intelligence

Data Warehousing and OLAP Technologies for Decision-Making Process

This tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing.

Data Warehousing & OLAP

After completing this course, participants will be able to:

Database Vs. Data Warehouse

Outline. Managing Information Resources. Concepts and Definitions. Introduction. Chapter 7

Data Warehousing and Decision Support

CS377: Database Systems Data Warehouse and Data Mining. Li Xiong Department of Mathematics and Computer Science Emory University

Chapter 3. The Multidimensional Model: Basic Concepts. Introduction. The multidimensional model. The multidimensional model

Microsoft SQL Server Training Course Catalogue. Learning Solutions

Data Mining and Data Warehousing Introduction to Data Mining

Step-by-step data transformation

The strategic advantage of OLAP and multidimensional analysis

Business Intelligence and Decision Support Systems

DATABASE DEVELOPMENT (H4)

Transcription:

Data Warehouse and Data Mining Lecture No. 07 Terminologies Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro

Database Database is an organized collection of data Databases are created to operate large quantities of information by inputting, storing, retrieving, and managing that information. Collection of data central to some enterprise Essential to operation of enterprise Contains the only record of enterprise activity

Database Management System DBMS is a suite of computer software providing the interface between users and a database or databases. A Database Management System (DBMS) is a program that manages a database: Supports a high-level access language (e.g. SQL) Application describes database accesses using that language DBMS interprets statements of language to perform requested database access

Transaction A transaction is any kind of action involved in conducting business, or an interaction between people A transaction is executed to make change into database state of organization, whenever an event in real world changes the business state

Transaction Processing System Transaction processing (TP) is a style of computing that divides work into individual, indivisible operations, called transactions A Transaction Processing System (TPS) or Transaction Server is a software system, or software/hardware combination, that supports transaction processing A TPS consists of Transaction Processing Monitor (TPM), databases, and transactions

Transaction Processing System

TPS Requirements High Availability: on-line implies that it must be operational while enterprise is functioning High Reliability: correctly tracks state, does not lose data, controlled concurrency High Throughput: many users implies that many transactions per second Low Response Time: on-line implies that users are waiting

TPS Requirements Long Lifetime: complex systems are not easily replaced Must be designed so they can be easily extended as the needs of the enterprise change Security: sensitive information must be carefully protected since system is accessible to many users Authentication, authorization, encryption

Roles in TPS System Analyst - specifies system using input from customer; provides complete description of functionality from customer s and user s point of view Database Designer - specifies structure of data that will be stored in database Application Programmer - implements application programs (transactions) that access data and support enterprise rules

Roles in TPS Database Administrator - maintains database once system is operational: space allocation, performance optimization, database security System Administrator - maintains transaction processing system: monitors interconnection of Hardware and Software modules, deals with failures and congestion

Terminologies On-Line: A process controlled by a computer Analytical Processing needs Analytical Data Analytical Data: Data that involve analysis Analytical Data consist of Business Data Business Data: Time, Customers, Sales, Stores, Products, etc Client Analytical Processing Analytical Data Business Data

On-Line Transaction Processing OLTP a class of information systems that facilitate and manage transaction-oriented applications, typically for data entry and retrieval transaction processing. OLTP is day-to-day handling of transactions that result from enterprise operation It maintains correspondence between database state and enterprise state OLTP has also been used to refer to processing in which the system responds immediately to user requests. An automated teller machine (ATM) for a bank is an example of a commercial transaction processing application.

Enterprise Resource Planning ERP is business management software (a suite of integrated applications) An organization can use to store and manage data from every stage of business, including: Product planning, cost and development Manufacturing Marketing and sales Inventory management Shipping and payment ERP provides an integrated real-time view of core business processes, using common databases maintained by a database management system (DBMS)

Enterprise Resource Planning

Terms and Definitions Bitmapped Indexing - A family of advanced indexing algorithms that optimize RDBMS query performance by maximizing the search capability of the index per unit of memory and per CPU instruction. Properly implemented, bitmapped indices eliminate all table scans in query and join processing. Business Model - An object-oriented model that captures the kinds of things in a business or a business area and the relationships associated with those things (and sometimes associated business rules, too). Note that a business model exists independently of any data or database. A data warehouse should be designed to match the underlying business models or else no tools will fully unlock the data in the warehouse.

Terms and Definitions Corporate Data - All the databases of the company. This includes legacy systems, old and new transaction systems, general business systems, client/server databases, data warehouses and data marts. Data Dictionary - A collection of Meta Data. Many kinds of products in the data warehouse arena use a data dictionary, including database management systems, modeling tools, middleware, and query tools.

Terms and Definitions Data Mart - A subset of a data warehouse that focuses on one or more specific subject areas. The data usually is extracted from the data warehouse and further denormalized and indexed to support intense usage by targeted customers. Data Mining - Techniques for finding patterns and trends in large data sets. Data Model - The road map to the data in a database. This includes the source of tables and columns, the meanings of the keys, and the relationships between the tables.

Terms and Definitions Data Visualization - Techniques for turning data into information by using the high capacity of the human brain to recognize visually recognize patterns and trends. There are many specialized techniques designed to make particular kinds of visualization easy. Data Warehouse - A database built to support information access. Typically a data warehouse is fed from one or more transaction databases. The data needs to be cleaned and restructured to support queries, summaries, and analyses.

Terms and Definitions Decision Support - Data access targeted to provide the information needed by business decision makers. Examples include pricing, purchasing, human resources, management, manufacturing, etc. Decision Support System (DSS) - Database(s), warehouse(s), and/or mart(s) in conjunction with reporting and analysis software optimized to support timely business decision making. Meta Data - Literally, "data about data." More usefully, descriptions of what kind of information is stored where, how it is encoded, how it is related to other information, where it comes from, and how it is related to your business.

Terms and Definitions Methodology - The steps followed to guarantee repeatability of success. A good methodology is built on top of real world experience. Middleware - Hardware and software used to connect clients and servers, to move and structure data, and/or to pre-summarize data for use by queries and reports. Multidimensional database (MDD) - A DBMS optimized to support multidimensional data. The best systems support standard RDBMS functionality and add high-bandwith support for multidimensional data and queries. Users that need a lot of slices and dices might appreciate a multidimensional database.

Terms and Definitions Object Oriented Analysis (OOA) - A process of abstracting a problem by identifying the kinds of entities in the problem domain, the is-a relationships between the kinds (kinds are known as classes, is-a relationships as subtype/supertype, subclass/superclass, or less commonly, specialization/generalization), and the has-a relationships between the classes. Also identified for each class are its attributes (e.g. class Person has attribute Hair Color) and its conventional relationships to other classes(e.g. class Order has a relationship Customer to class Customer.)

Terms and Definitions Object Oriented Design (OOD) - A design methodology that uses Object Oriented Analysis to promote object reusability and interface clarity. OLAP - An acronym for On Line Analytical Processing. A common use of a data warehouse that involves real time access and analysis of multidimensional data such as order information. Performance - Data, summaries, and analyses need to be delivered in a timely fashion. Performance is often a key issue with data warehouses: the right answer isn't worth much if it shows up after the decisions have been made.

Terms and Definitions Query - A specific atomic request for information from a database. Relational On-Line Analytic Processing (ROLAP) - OLAP based on conventional relational databases rather than specialized multidimensional databases. Replication - A standard technique in data warehousing. For performance and reliability several independent copies are often created of each data warehouse. Even data marts can require replication on multiple servers to meet performance and reliability standards.

Terms and Definitions Replicator - Any of a class of product that supports replication. Often these tools use special load and unload database APIs and have scripting languages that support automation. Report - A repeatable, formatted, non-atomic request for information from a database. Usually a report formats and combines several related queries. Security - The right data for the right person. Snowflake Schema - A layering of Star Schema that scales that technique to handle an entire warehouse.

Terms and Definitions Star Schema - A standard technique for designing the summary tables of a data warehouse. "Fact" tables each join to a larger number of independent "dimension" tables. The tables may be partially denormalized for performance, but most queries will still need to join in one or more of the star tables. OLAP refers to querying and accessing on-line data Data Warehouse refers to specific technical architectures for storing and accessing large amounts of data