Penn State Student Chapter of the Association for Computing Machinery

Size: px
Start display at page:

Download "Penn State Student Chapter of the Association for Computing Machinery"

Transcription

1 Penn State Student Chapter of the Association for Computing Machinery We welcome all interested students to our 4th general meeting of the Spring 2005 semester! When: Monday, April 11th, 2005 from 7-8 pm Where: Cybertorium (213 IST) Agenda: Brief overview of our ACM chapter New officer introductions Special topic presentation: No Pain, No Game Presented by IST Professor Brian K. Smith Co-op/Intern presentation: Working at IBM Presented by Rick Osowski Free refreshments will be provided

2 Data Warehousing, Data Mining, IST 210 and Advanced Applications 2

3 Data Rich, but Information Poor Data is stored, not explored : by its volume and complexity it represents a burden, not a support Data overload results in uninformed decisions, contradictory information, higher overhead, wrong decisions, increased costs Data is not designed and is not structured for successful management decision making 3

4 Improving Decision Making Decisions Data Information Warehouse Data 4

5 Data Warehouse Concepts 5

6 What s a Data Warehouse? A data warehouse is a single, integrated source of decision support information formed by collecting data from multiple sources, internal to the organization as well as external, and transforming and summarising this information to enable improved decision making. A data warehouse is designed for easy access by users to large amounts of information, and data access is typically supported by specialized analytical tools and applications. 6

7 Data Warehouse Characteristics Key Characteristics of a Data Warehouse Subject-oriented Integrated Time-variant Non-volatile 7

8 Subject Oriented Example for an insurance company : Applications Area Data Warehouse Commercial Commercial and and Life Life Insurance Insurance Systems Systems Auto Auto and and Fire Fire Policy Policy Processing Processing Systems Systems Customer Customer Policy Policy Data Data Accounting Accounting System System Billing Billing System System Claims Claims Processing Processing System System Losses Losses Premium Premium 8

9 Integrated Data is stored once in a single integrated location (e.g. insurance company) Customer data stored in several databases Auto Auto Policy Policy Processing Processing System System Fire Fire Policy Policy Processing Processing System System FACTS, FACTS, LIFE LIFE Commercial, Commercial, Accounting Accounting Applications Applications Data Warehouse Database Subject = Customer 9

10 Time - Variant Data is stored as a series of snapshots or views which record how it is collected across time. Data Warehouse Data Time Data Key Data is tagged with some element of time - creation date, as of date, etc. Data is available on-line for long periods of time for trend analysis and forecasting. For example, five or more years 10

11 Non-Volatile Existing data in the warehouse is not overwritten or updated. External Sources Production Applications Production Databases Data Warehouse Environment Data Warehouse Database Update Insert Delete Load Read-Only 11

12 Transaction System vs. Data Warehouse 12

13 Transaction-Based Reporting System Day-to-day operations On-line, real time update into disparate systems System Experts Data Manipulation Users Unix VMS MVS Other 13

14 Warehouse-Based Reporting System Unix Executive Reporting and On-Line Analysis VMS MVS Interfaces Data Data Staging, Staging, Transformation Transformation and and Cleansing Cleansing Data Warehouse Summarization Other Environment OLAP BENEFIT: Reduce data processing costs BENEFIT: Integrated, consistent data available for analysis BENEFIT: Improve Network Reporting processes and analytical capabilities 14

15 Transaction - Warehouse Process Day-to-day operations Transaction Based Process On-line, real time update. Detailed Information to operational systems. Decision support for management use. Warehouse Based Process Summarize & Refine Transform Batch Load 15

16 Transaction System vs. Data Warehouse Transaction System Supports day-to-day operational processes Contains raw, detailed data that has not been refined or cleansed Volatile -- data changes from day-to-day, with frequent updates Technical issues drive the data structure and system design Disparate data structures, physical locations, query types, etc. Users rely on technical analysts for reporting needs Operational processes impacted by queries run off of system Data Warehouse Supports management analysis and decisionmaking processes Contains summarized, refined, and cleansed information Non-volatile -- provides a data snapshot ; adjustments are not permitted, or are limited Business analysis requirements drive the data structure and system design Integrated, consistent information on a single technology platform Users have direct, fast access via On-line Analytical Processing tools Minimal impact on operational processes 16

17 Data Warehouse Architecture 17

18 Data Warehouse Architecture Operational System Data Warehouse Ad-hoc Reporting Conversion & Interface OLAP Cubes ODS Staging Area Data Marts Canned Reports 18

19 Data Warehouse Architecture Conversion and Cleansing Activities Conversion & Cleansing Map source data to target Data scrubbing Derive new data Data Extraction Transform / convert data Create / modify metadata 19

20 Data Warehouse Architecture Data Warehouse Components Detailed Data Summary Data Ranges from detailed to summarized data Contains metadata Many views of the data Subject-Oriented Time-variant Metadata 20

21 Requirements Gathering Process Business Measure Definition Standard definition and related business rules and formulas Source data element(s), including quality constraints Data granularity levels (e.g., county detail for state) Data retention (e.g., one month, one quarter, one year, multiple years) Priority of the information (For example, is the information necessary to derive other business measures?) Data load frequency (e.g., monthly, quarterly, etc.) 21

22 Star Join Schema Region_Dimension_Table region _id region _doc Dimension Tables prod_grp_id Product_Dimension_Table prod_id prod_grp_desc Fewer devices Circuit boards Components prod_desc Power supply Motherboard Co-processor NE NW SE SW Northeast Northwest Southeast Southwest account _id account _doc ABC Electronics Midway Electric Victor Components Washburn, Inc. Zerox Account_Dimension_Table month prod_id region_id account_id vend_id net-sales gross_sales SW NE SW ,000 23,000 32,000 50,000 42,000 49,000 Fact Table Monthly_Sales_Summary_Table month mo_in_fiscal_yr month_name January February March Time_Dimension_Table Vendor_Dimension_Table vend_id vendor_desc PowerAge, Inc. Advanced Micro Devices Farad Incorporated 22

23 Multi-Dimensional Analysis Geography Dimension Zip Code County Region State Customer Dimension Class of Trade Client Type Account Store Net Sales by Brand by Region by Client Type Product Family Product Line BrandCategory Product Dimension Group Item Business Measure: Product Net Dimension Sales DW

24 Application Solution Classes Executive information system (EIS) : Present information at the highest level of summarization using corporate business measures. They are designed for extreme ease-ofuse and, in many cases, only a mouse is required. Graphics are usually generously incorporated to provide at-a-glance indications of performance Decision Support Systems (DSS) : They ideally present information in graphical and tabular form, providing the user with the ability to drill down on selected information. Note the increased detail and data manipulation options presented 24

25 Data Mining 25 1

26 Data Mining The process of extracting valid, previously unknown, comprehensible, and actionable information from large databases and using it to make crucial business decisions, (Simoudis,1996). Involves the analysis of data and the use of software techniques for finding hidden and unexpected patterns and relationships in sets of data. 26

27 Data Mining Reveals information that is hidden and unexpected, as little value in finding patterns and relationships that are already intuitive. Patterns and relationships are identified by examining the underlying rules and features in the data. Data mining can provide huge paybacks for companies who have made a significant investment in data warehousing. Relatively new technology, however already used in a number of industries. 27

28 Examples of Applications of Data Mining Retail / Marketing Identifying buying patterns of customers Finding associations among customer demographic characteristics Predicting response to mailing campaigns Market basket analysis Banking Detecting patterns of fraudulent credit card use Identifying loyal customers Predicting customers likely to change their credit card affiliation Determining credit card spending by customer groups 28

29 Examples of Applications of Data Mining Insurance Claims analysis Predicting which customers will buy new policies Medicine Characterizing patient behavior to predict surgery visits Identifying successful medical therapies for different illnesses 29

30 Data Mining Operations and Associated Techniques 30

31 Database Segmentation Aim is to partition a database into an unknown number of segments, or clusters, of similar records. Uses unsupervised learning to discover homogeneous subpopulations in a database to improve the accuracy of the profiles. Less precise than other operations thus less sensitive to redundant and irrelevant features. Sensitivity can be reduced by ignoring a subset of the attributes that describe each instance or by assigning a weighting factor to each variable. Applications of database segmentation include customer profiling, direct marketing, and cross selling. 31

32 Scatterplot 32

33 Visualization 33

34 Data Mining and Data Warehousing Major challenge to exploit data mining is identifying suitable data to mine. Data mining requires single, separate, clean, integrated, and self-consistent source of data. A data warehouse is well equipped for providing data for mining. Data quality and consistency is a pre-requisite for mining to ensure the accuracy of the predictive models. Data warehouses are populated with clean, consistent data. 34

35 Data Mining and Data Warehousing It is advantageous to mine data from multiple sources to discover as many interrelationships as possible. Data warehouses contain data from a number of sources. Selecting the relevant subsets of records and fields for data mining requires the query capabilities of the data warehouse. The results of a data mining study are useful if there is some way to further investigate the uncovered patterns. Data warehouses provide the capability to go back to the data source. 35

36 Advanced Database Topics 36

37 A Little History Prior to the 1980s hierarchical and network databases. Hardware dumb terminals using private networks Database centralized and stored on the disk packs End user terminals simply input/output devices Processing at the mainframe Data text data Networks had to handle text data No access from outside to the organization's private network. 37

38 New Needs Microcomputer enabled workstation processing power. Satellite and network technology provided for very high speed, high traffic, and low cost long distance communications networks. Internet in the late 1990s and the corresponding phenomenal growth in electronic commerce (Ecommerce) necessitated public access to data in people's homes. The volume of data needed to be transmitted increased greatly. 38

39 New Needs Business environment changed during the last two decades Information stored at different locations, on different hardware and operating systems, with different commercial DBMS products, and with different underlying data models had to be combined The centralized database was no longer feasible to handle these new demands 39

40 Distributed Database Scenario There are many advantages to using a distributed database rather than a centralized database. They are: Improved performance, because high traffic data are stored locally. More efficient data management, because the DBA workload is shared. Better network integrity, because the whole system does not stop if one computer goes down. Expansion of the database is facilitated when the organization grows, since new data does not have to be centralized. It can remain and be administered in the original location. Data for the whole organization can still be accessed from any location. 40

41 Distributed Database Data administration is improved (??) In a distributed database system even a simple task like creating a backup copy of the database can take a considerable amount of time. If the database is divided among several locations the time and workload for this task can be shared. 41

42 Replication of Data System failure in one location should not stop processing in other locations Replicate all or parts of the database in more than one location. Database replication improves performance and provides a failsafe option, but it involves considerable complexity Replication of frequently used data improves response time and reduces network traffic If the data changes at one location it must be changed at all locations 42

43 Distributed Systems in an Ideal World C. J. Date established rules for the ideal distributed DBMS system Rules are a goal that distributed systems strive toward, but have not yet reached According to Date's rules: Each site is responsible for its own portion of the distributed database, including security, backup, and recovery. Each site has equal capabilities and does not rely on any other site. The system should work regardless of the computer hardware, operating system, or network installed at any site. 43

44 Date's Rules of Distributed Databases: 1. Local site independence 2. Central site independence 3. Failure independence 4. Location transparency 5. Fragmentation transparency 6. Replication transparency 7. Distributed query processing 8. Distributed transaction processing 9. Hardware independence 10. Operating system independence 11. Network independence 12. Database independence 44

45 Complexities of Distributed Databases There also are many complications involved in the management of distributed database systems. The distributed database must be carefully designed to insure the following: Store data as close as possible to where it is used most often. Make the location of the data transparent to the end user. Make the system easy to expand. Optimize queries to improve response time in the distributed environment. 45

46 Database Design The designer must analyze the organization's needs and business processes to determine the best way to distribute the database. There are several possibilities for storing the data in more than one location: Centralized master database Replication of the entire or part of the database in several locations Horizontal partitions Vertical partitions Mixture of the above 46

47 Fragmentation Horizontal fragmentation of the database means that rows of a table(s) may be stored in different locations Similar to the separation of the customer table in the retailing example above. Vertical fragmentation means that columns of a table ( i.e., attributes or groups of attributes of an entity) are stored in different locations. 47

48 Query Formulation Distributed databases require a considerable amount of network overhead Poorly formulated query it may cause unnecessary data retrieval from the database Query optimization is ideally performed by the distributed database management system 48

49 OODB In traditional relational databases E-R Modeling and normalization focuses on identifying entities, their attributes, and the relationships between entities This works well for most organizational data, especially business data The advent of the microcomputer and processing power on the desktop Computer aided design, CAD, became the norm for engineering work, so it became necessary to store drawings Powerful multimedia PCs with sound cards and color monitors enabled the manipulation of sound and video files Many other applications were developed that required more than just text and numeric processing 49

50 Why?? These new applications were facilitated by the development of Object-Oriented Programming Still evolving development of object-oriented data modeling, object-oriented databases, and object-oriented database management systems OODBMS and O/R DBMS are two types of database management systems that are currently available O/R DBMS uses the basic theory of relational database management systems with object-oriented features added OODBMS is more object-oriented and was developed separately from the relational products OODMBS suffers from a lack of standardization that is available with relational database systems 50

Lecture 18. Business Intelligence and Data Warehousing. 1:M Normalization. M:M Normalization 11/1/2017. Topics Covered

Lecture 18. Business Intelligence and Data Warehousing. 1:M Normalization. M:M Normalization 11/1/2017. Topics Covered Lecture 18 Business Intelligence and Data Warehousing BDIS 6.2 BSAD 141 Dave Novak Topics Covered Test # Review What is Business Intelligence? How can an organization be data rich and information poor?

More information

Question Bank. 4) It is the source of information later delivered to data marts.

Question Bank. 4) It is the source of information later delivered to data marts. Question Bank Year: 2016-2017 Subject Dept: CS Semester: First Subject Name: Data Mining. Q1) What is data warehouse? ANS. A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile

More information

Data warehouse architecture consists of the following interconnected layers:

Data warehouse architecture consists of the following interconnected layers: Architecture, in the Data warehousing world, is the concept and design of the data base and technologies that are used to load the data. A good architecture will enable scalability, high performance and

More information

The Evolution of Data Warehousing. Data Warehousing Concepts. The Evolution of Data Warehousing. The Evolution of Data Warehousing

The Evolution of Data Warehousing. Data Warehousing Concepts. The Evolution of Data Warehousing. The Evolution of Data Warehousing The Evolution of Data Warehousing Data Warehousing Concepts Since 1970s, organizations gained competitive advantage through systems that automate business processes to offer more efficient and cost-effective

More information

Full file at

Full file at Chapter 2 Data Warehousing True-False Questions 1. A real-time, enterprise-level data warehouse combined with a strategy for its use in decision support can leverage data to provide massive financial benefits

More information

DATA MINING AND WAREHOUSING

DATA MINING AND WAREHOUSING DATA MINING AND WAREHOUSING Qno Question Answer 1 Define data warehouse? Data warehouse is a subject oriented, integrated, time-variant, and nonvolatile collection of data that supports management's decision-making

More information

OLAP Introduction and Overview

OLAP Introduction and Overview 1 CHAPTER 1 OLAP Introduction and Overview What Is OLAP? 1 Data Storage and Access 1 Benefits of OLAP 2 What Is a Cube? 2 Understanding the Cube Structure 3 What Is SAS OLAP Server? 3 About Cube Metadata

More information

Topics covered 10/12/2015. Pengantar Teknologi Informasi dan Teknologi Hijau. Suryo Widiantoro, ST, MMSI, M.Com(IS)

Topics covered 10/12/2015. Pengantar Teknologi Informasi dan Teknologi Hijau. Suryo Widiantoro, ST, MMSI, M.Com(IS) Pengantar Teknologi Informasi dan Teknologi Hijau Suryo Widiantoro, ST, MMSI, M.Com(IS) 1 Topics covered 1. Basic concept of managing files 2. Database management system 3. Database models 4. Data mining

More information

This tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing.

This tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing. About the Tutorial A data warehouse is constructed by integrating data from multiple heterogeneous sources. It supports analytical reporting, structured and/or ad hoc queries and decision making. This

More information

CHAPTER 3 Implementation of Data warehouse in Data Mining

CHAPTER 3 Implementation of Data warehouse in Data Mining CHAPTER 3 Implementation of Data warehouse in Data Mining 3.1 Introduction to Data Warehousing A data warehouse is storage of convenient, consistent, complete and consolidated data, which is collected

More information

Managing Data Resources

Managing Data Resources Chapter 7 Managing Data Resources 7.1 2006 by Prentice Hall OBJECTIVES Describe basic file organization concepts and the problems of managing data resources in a traditional file environment Describe how

More information

Dr.G.R.Damodaran College of Science

Dr.G.R.Damodaran College of Science 1 of 20 8/28/2017 2:13 PM Dr.G.R.Damodaran College of Science (Autonomous, affiliated to the Bharathiar University, recognized by the UGC)Reaccredited at the 'A' Grade Level by the NAAC and ISO 9001:2008

More information

5-1McGraw-Hill/Irwin. Copyright 2007 by The McGraw-Hill Companies, Inc. All rights reserved.

5-1McGraw-Hill/Irwin. Copyright 2007 by The McGraw-Hill Companies, Inc. All rights reserved. 5-1McGraw-Hill/Irwin Copyright 2007 by The McGraw-Hill Companies, Inc. All rights reserved. 5 hapter Data Resource Management Data Concepts Database Management Types of Databases McGraw-Hill/Irwin Copyright

More information

DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY

DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY CHARACTERISTICS Data warehouse is a central repository for summarized and integrated data

More information

The Six Principles of BW Data Validation

The Six Principles of BW Data Validation The Problem The Six Principles of BW Data Validation Users do not trust the data in your BW system. The Cause By their nature, data warehouses store large volumes of data. For analytical purposes, the

More information

1. Analytical queries on the dimensionally modeled database can be significantly simpler to create than on the equivalent nondimensional database.

1. Analytical queries on the dimensionally modeled database can be significantly simpler to create than on the equivalent nondimensional database. 1. Creating a data warehouse involves using the functionalities of database management software to implement the data warehouse model as a collection of physically created and mutually connected database

More information

Management Information Systems MANAGING THE DIGITAL FIRM, 12 TH EDITION FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT

Management Information Systems MANAGING THE DIGITAL FIRM, 12 TH EDITION FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT MANAGING THE DIGITAL FIRM, 12 TH EDITION Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT VIDEO CASES Case 1: Maruti Suzuki Business Intelligence and Enterprise Databases

More information

Partner Presentation Faster and Smarter Data Warehouses with Oracle OLAP 11g

Partner Presentation Faster and Smarter Data Warehouses with Oracle OLAP 11g Partner Presentation Faster and Smarter Data Warehouses with Oracle OLAP 11g Vlamis Software Solutions, Inc. Founded in 1992 in Kansas City, Missouri Oracle Partner and reseller since 1995 Specializes

More information

Overview. Introduction to Data Warehousing and Business Intelligence. BI Is Important. What is Business Intelligence (BI)?

Overview. Introduction to Data Warehousing and Business Intelligence. BI Is Important. What is Business Intelligence (BI)? Introduction to Data Warehousing and Business Intelligence Overview Why Business Intelligence? Data analysis problems Data Warehouse (DW) introduction A tour of the coming DW lectures DW Applications Loosely

More information

WKU-MIS-B10 Data Management: Warehousing, Analyzing, Mining, and Visualization. Management Information Systems

WKU-MIS-B10 Data Management: Warehousing, Analyzing, Mining, and Visualization. Management Information Systems Management Information Systems Management Information Systems B10. Data Management: Warehousing, Analyzing, Mining, and Visualization Code: 166137-01+02 Course: Management Information Systems Period: Spring

More information

1 DATAWAREHOUSING QUESTIONS by Mausami Sawarkar

1 DATAWAREHOUSING QUESTIONS by Mausami Sawarkar 1 DATAWAREHOUSING QUESTIONS by Mausami Sawarkar 1) What does the term 'Ad-hoc Analysis' mean? Choice 1 Business analysts use a subset of the data for analysis. Choice 2: Business analysts access the Data

More information

Management Information Systems Review Questions. Chapter 6 Foundations of Business Intelligence: Databases and Information Management

Management Information Systems Review Questions. Chapter 6 Foundations of Business Intelligence: Databases and Information Management Management Information Systems Review Questions Chapter 6 Foundations of Business Intelligence: Databases and Information Management 1) The traditional file environment does not typically have a problem

More information

This tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining.

This tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining. About the Tutorial Data Mining is defined as the procedure of extracting information from huge sets of data. In other words, we can say that data mining is mining knowledge from data. The tutorial starts

More information

IT1105 Information Systems and Technology. BIT 1 ST YEAR SEMESTER 1 University of Colombo School of Computing. Student Manual

IT1105 Information Systems and Technology. BIT 1 ST YEAR SEMESTER 1 University of Colombo School of Computing. Student Manual IT1105 Information Systems and Technology BIT 1 ST YEAR SEMESTER 1 University of Colombo School of Computing Student Manual Lesson 3: Organizing Data and Information (6 Hrs) Instructional Objectives Students

More information

Data Mining Concepts & Techniques

Data Mining Concepts & Techniques Data Mining Concepts & Techniques Lecture No. 01 Databases, Data warehouse Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro

More information

Data Warehousing. Data Warehousing and Mining. Lecture 8. by Hossen Asiful Mustafa

Data Warehousing. Data Warehousing and Mining. Lecture 8. by Hossen Asiful Mustafa Data Warehousing Data Warehousing and Mining Lecture 8 by Hossen Asiful Mustafa Databases Databases are developed on the IDEA that DATA is one of the critical materials of the Information Age Information,

More information

Course Number : SEWI ZG514 Course Title : Data Warehousing Type of Exam : Open Book Weightage : 60 % Duration : 180 Minutes

Course Number : SEWI ZG514 Course Title : Data Warehousing Type of Exam : Open Book Weightage : 60 % Duration : 180 Minutes Birla Institute of Technology & Science, Pilani Work Integrated Learning Programmes Division M.S. Systems Engineering at Wipro Info Tech (WIMS) First Semester 2014-2015 (October 2014 to March 2015) Comprehensive

More information

Chapter 6 VIDEO CASES

Chapter 6 VIDEO CASES Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

More information

DHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI

DHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI DHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI Department of Information Technology IT6702 Data Warehousing & Data Mining Anna University 2 & 16 Mark Questions & Answers Year / Semester: IV / VII Regulation:

More information

REPORTING AND QUERY TOOLS AND APPLICATIONS

REPORTING AND QUERY TOOLS AND APPLICATIONS Tool Categories: REPORTING AND QUERY TOOLS AND APPLICATIONS There are five categories of decision support tools Reporting Managed query Executive information system OLAP Data Mining Reporting Tools Production

More information

CS614 - Data Warehousing - Midterm Papers Solved MCQ(S) (1 TO 22 Lectures)

CS614 - Data Warehousing - Midterm Papers Solved MCQ(S) (1 TO 22 Lectures) CS614- Data Warehousing Solved MCQ(S) From Midterm Papers (1 TO 22 Lectures) BY Arslan Arshad Nov 21,2016 BS110401050 BS110401050@vu.edu.pk Arslan.arshad01@gmail.com AKMP01 CS614 - Data Warehousing - Midterm

More information

TIM 50 - Business Information Systems

TIM 50 - Business Information Systems TIM 50 - Business Information Systems Lecture 15 UC Santa Cruz Nov 10, 2016 Class Announcements n Database Assignment 2 posted n Due 11/22 The Database Approach to Data Management The Final Database Design

More information

Teradata Analyst Pack More Power to Analyze and Tune Your Data Warehouse for Optimal Performance

Teradata Analyst Pack More Power to Analyze and Tune Your Data Warehouse for Optimal Performance Data Warehousing > Tools & Utilities Teradata Analyst Pack More Power to Analyze and Tune Your Data Warehouse for Optimal Performance By: Rod Vandervort, Jeff Shelton, and Louis Burger Table of Contents

More information

Data Warehousing and OLAP Technologies for Decision-Making Process

Data Warehousing and OLAP Technologies for Decision-Making Process Data Warehousing and OLAP Technologies for Decision-Making Process Hiren H Darji Asst. Prof in Anand Institute of Information Science,Anand Abstract Data warehousing and on-line analytical processing (OLAP)

More information

Data Warehousing. Overview

Data Warehousing. Overview Data Warehousing Overview Basic Definitions Normalization Entity Relationship Diagrams (ERDs) Normal Forms Many to Many relationships Warehouse Considerations Dimension Tables Fact Tables Star Schema Snowflake

More information

COMP 465 Special Topics: Data Mining

COMP 465 Special Topics: Data Mining COMP 465 Special Topics: Data Mining Introduction & Course Overview 1 Course Page & Class Schedule http://cs.rhodes.edu/welshc/comp465_s15/ What s there? Course info Course schedule Lecture media (slides,

More information

Data Warehouse and Mining

Data Warehouse and Mining Data Warehouse and Mining 1. is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management decisions. A. Data Mining. B. Data Warehousing. C. Web Mining. D. Text

More information

Chapter 3. The Multidimensional Model: Basic Concepts. Introduction. The multidimensional model. The multidimensional model

Chapter 3. The Multidimensional Model: Basic Concepts. Introduction. The multidimensional model. The multidimensional model Chapter 3 The Multidimensional Model: Basic Concepts Introduction Multidimensional Model Multidimensional concepts Star Schema Representation Conceptual modeling using ER, UML Conceptual modeling using

More information

Data Warehouses Chapter 12. Class 10: Data Warehouses 1

Data Warehouses Chapter 12. Class 10: Data Warehouses 1 Data Warehouses Chapter 12 Class 10: Data Warehouses 1 OLTP vs OLAP Operational Database: a database designed to support the day today transactions of an organization Data Warehouse: historical data is

More information

TIM 50 - Business Information Systems

TIM 50 - Business Information Systems TIM 50 - Business Information Systems Lecture 15 UC Santa Cruz May 20, 2014 Announcements DB 2 Due Tuesday Next Week The Database Approach to Data Management Database: Collection of related files containing

More information

Business Intelligence and Decision Support Systems

Business Intelligence and Decision Support Systems Business Intelligence and Decision Support Systems (9 th Ed., Prentice Hall) Chapter 8: Data Warehousing Learning Objectives Understand the basic definitions and concepts of data warehouses Learn different

More information

Benefits of Automating Data Warehousing

Benefits of Automating Data Warehousing Benefits of Automating Data Warehousing Introduction Data warehousing can be defined as: A copy of data specifically structured for querying and reporting. In most cases, the data is transactional data

More information

Meaning & Concepts of Databases

Meaning & Concepts of Databases 27 th August 2015 Unit 1 Objective Meaning & Concepts of Databases Learning outcome Students will appreciate conceptual development of Databases Section 1: What is a Database & Applications Section 2:

More information

IDU0010 ERP,CRM ja DW süsteemid Loeng 5 DW concepts. Enn Õunapuu

IDU0010 ERP,CRM ja DW süsteemid Loeng 5 DW concepts. Enn Õunapuu IDU0010 ERP,CRM ja DW süsteemid Loeng 5 DW concepts Enn Õunapuu enn.ounapuu@ttu.ee Content Oveall approach Dimensional model Tabular model Overall approach Data modeling is a discipline that has been practiced

More information

Managing Data Resources

Managing Data Resources Chapter 7 OBJECTIVES Describe basic file organization concepts and the problems of managing data resources in a traditional file environment Managing Data Resources Describe how a database management system

More information

by Prentice Hall

by Prentice Hall Chapter 6 Foundations of Business Intelligence: Databases and Information Management 6.1 2010 by Prentice Hall Organizing Data in a Traditional File Environment File organization concepts Computer system

More information

DATABASE DEVELOPMENT (H4)

DATABASE DEVELOPMENT (H4) IMIS HIGHER DIPLOMA QUALIFICATIONS DATABASE DEVELOPMENT (H4) December 2017 10:00hrs 13:00hrs DURATION: 3 HOURS Candidates should answer ALL the questions in Part A and THREE of the five questions in Part

More information

DATA MINING TRANSACTION

DATA MINING TRANSACTION DATA MINING Data Mining is the process of extracting patterns from data. Data mining is seen as an increasingly important tool by modern business to transform data into an informational advantage. It is

More information

1. Inroduction to Data Mininig

1. Inroduction to Data Mininig 1. Inroduction to Data Mininig 1.1 Introduction Universe of Data Information Technology has grown in various directions in the recent years. One natural evolutionary path has been the development of the

More information

Chapter 13 Business Intelligence and Data Warehouses The Need for Data Analysis Business Intelligence. Objectives

Chapter 13 Business Intelligence and Data Warehouses The Need for Data Analysis Business Intelligence. Objectives Chapter 13 Business Intelligence and Data Warehouses Objectives In this chapter, you will learn: How business intelligence is a comprehensive framework to support business decision making How operational

More information

Department of Industrial Engineering. Sharif University of Technology. Operational and enterprises systems. Exciting directions in systems

Department of Industrial Engineering. Sharif University of Technology. Operational and enterprises systems. Exciting directions in systems Department of Industrial Engineering Sharif University of Technology Session# 9 Contents: The role of managers in Information Technology (IT) Organizational Issues Information Technology Operational and

More information

UNIT -1 UNIT -II. Q. 4 Why is entity-relationship modeling technique not suitable for the data warehouse? How is dimensional modeling different?

UNIT -1 UNIT -II. Q. 4 Why is entity-relationship modeling technique not suitable for the data warehouse? How is dimensional modeling different? (Please write your Roll No. immediately) End-Term Examination Fourth Semester [MCA] MAY-JUNE 2006 Roll No. Paper Code: MCA-202 (ID -44202) Subject: Data Warehousing & Data Mining Note: Question no. 1 is

More information

Database Management Systems MIT Lesson 01 - Introduction By S. Sabraz Nawaz

Database Management Systems MIT Lesson 01 - Introduction By S. Sabraz Nawaz Database Management Systems MIT 22033 Lesson 01 - Introduction By S. Sabraz Nawaz Introduction A database management system (DBMS) is a software package designed to create and maintain databases (examples?)

More information

An Overview of Data Warehousing and OLAP Technology

An Overview of Data Warehousing and OLAP Technology An Overview of Data Warehousing and OLAP Technology CMPT 843 Karanjit Singh Tiwana 1 Intro and Architecture 2 What is Data Warehouse? Subject-oriented, integrated, time varying, non-volatile collection

More information

Computers Are Your Future

Computers Are Your Future Computers Are Your Future Twelfth Edition Chapter 12: Databases and Information Systems Copyright 2012 Pearson Education, Inc. Publishing as Prentice Hall 1 Databases and Information Systems Copyright

More information

Designing Data Warehouses. Data Warehousing Design. Designing Data Warehouses. Designing Data Warehouses

Designing Data Warehouses. Data Warehousing Design. Designing Data Warehouses. Designing Data Warehouses Designing Data Warehouses To begin a data warehouse project, need to find answers for questions such as: Data Warehousing Design Which user requirements are most important and which data should be considered

More information

Evolution of Database Systems

Evolution of Database Systems Evolution of Database Systems Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Intelligent Decision Support Systems Master studies, second

More information

INTRODUCTORY INFORMATION TECHNOLOGY ENTERPRISE DATABASES AND DATA WAREHOUSES. Faramarz Hendessi

INTRODUCTORY INFORMATION TECHNOLOGY ENTERPRISE DATABASES AND DATA WAREHOUSES. Faramarz Hendessi INTRODUCTORY INFORMATION TECHNOLOGY ENTERPRISE DATABASES AND DATA WAREHOUSES Faramarz Hendessi INTRODUCTORY INFORMATION TECHNOLOGY Lecture 7 Fall 2010 Isfahan University of technology Dr. Faramarz Hendessi

More information

CT75 (ALCCS) DATA WAREHOUSING AND DATA MINING JUN

CT75 (ALCCS) DATA WAREHOUSING AND DATA MINING JUN Q.1 a. Define a Data warehouse. Compare OLTP and OLAP systems. Data Warehouse: A data warehouse is a subject-oriented, integrated, time-variant, and 2 Non volatile collection of data in support of management

More information

Summary of Last Chapter. Course Content. Chapter 2 Objectives. Data Warehouse and OLAP Outline. Incentive for a Data Warehouse

Summary of Last Chapter. Course Content. Chapter 2 Objectives. Data Warehouse and OLAP Outline. Incentive for a Data Warehouse Principles of Knowledge Discovery in bases Fall 1999 Chapter 2: Warehousing and Dr. Osmar R. Zaïane University of Alberta Dr. Osmar R. Zaïane, 1999 Principles of Knowledge Discovery in bases University

More information

CS377: Database Systems Data Warehouse and Data Mining. Li Xiong Department of Mathematics and Computer Science Emory University

CS377: Database Systems Data Warehouse and Data Mining. Li Xiong Department of Mathematics and Computer Science Emory University CS377: Database Systems Data Warehouse and Data Mining Li Xiong Department of Mathematics and Computer Science Emory University 1 1960s: Evolution of Database Technology Data collection, database creation,

More information

Data Mining and Warehousing

Data Mining and Warehousing Data Mining and Warehousing Sangeetha K V I st MCA Adhiyamaan College of Engineering, Hosur-635109. E-mail:veerasangee1989@gmail.com Rajeshwari P I st MCA Adhiyamaan College of Engineering, Hosur-635109.

More information

CT75 DATA WAREHOUSING AND DATA MINING DEC 2015

CT75 DATA WAREHOUSING AND DATA MINING DEC 2015 Q.1 a. Briefly explain data granularity with the help of example Data Granularity: The single most important aspect and issue of the design of the data warehouse is the issue of granularity. It refers

More information

Data Mining. ❸Chapter 3 Data warehouse, ETL and OLAP. Asso.Prof.Dr. Xiao-dong Zhu. Business School, University of Shanghai for Science & Technology

Data Mining. ❸Chapter 3 Data warehouse, ETL and OLAP. Asso.Prof.Dr. Xiao-dong Zhu. Business School, University of Shanghai for Science & Technology ❸Chapter 3 Data warehouse, and Business School, University of Shanghai for Science & Technology 2016-2017 2nd Semester, Spring2017 Contents of chapter 2 1 KDD Process 2 3 4 5 What is KDD? KDD Process the

More information

Data Warehouse. Asst.Prof.Dr. Pattarachai Lalitrojwong

Data Warehouse. Asst.Prof.Dr. Pattarachai Lalitrojwong Data Warehouse Asst.Prof.Dr. Pattarachai Lalitrojwong Faculty of Information Technology King Mongkut s Institute of Technology Ladkrabang Bangkok 10520 pattarachai@it.kmitl.ac.th The Evolution of Data

More information

The strategic advantage of OLAP and multidimensional analysis

The strategic advantage of OLAP and multidimensional analysis IBM Software Business Analytics Cognos Enterprise The strategic advantage of OLAP and multidimensional analysis 2 The strategic advantage of OLAP and multidimensional analysis Overview Online analytical

More information

CHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI

CHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI CHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS Assist. Prof. Dr. Volkan TUNALI Topics 2 Business Intelligence (BI) Decision Support System (DSS) Data Warehouse Online Analytical Processing (OLAP)

More information

Application software office packets, databases and data warehouses.

Application software office packets, databases and data warehouses. Introduction to Computer Systems (9) Application software office packets, databases and data warehouses. Piotr Mielecki Ph. D. http://www.wssk.wroc.pl/~mielecki piotr.mielecki@pwr.edu.pl pmielecki@gmail.com

More information

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda Agenda Oracle9i Warehouse Review Dulcian, Inc. Oracle9i Server OLAP Server Analytical SQL Mining ETL Infrastructure 9i Warehouse Builder Oracle 9i Server Overview E-Business Intelligence Platform 9i Server:

More information

MIT Database Management Systems Lesson 01: Introduction

MIT Database Management Systems Lesson 01: Introduction MIT 22033 Database Management Systems Lesson 01: Introduction By S. Sabraz Nawaz Senior Lecturer in MIT, FMC, SEUSL Learning Outcomes At the end of the module the student will be able to: Describe the

More information

Decision Support Systems aka Analytical Systems

Decision Support Systems aka Analytical Systems Decision Support Systems aka Analytical Systems Decision Support Systems Systems that are used to transform data into information, to manage the organization: OLAP vs OLTP OLTP vs OLAP Transactions Analysis

More information

Handout 12 Data Warehousing and Analytics.

Handout 12 Data Warehousing and Analytics. Handout 12 CS-605 Spring 17 Page 1 of 6 Handout 12 Data Warehousing and Analytics. Operational (aka transactional) system a system that is used to run a business in real time, based on current data; also

More information

KORA. Business Intelligence An Introduction

KORA. Business Intelligence An Introduction Business Intelligence An Introduction Outline What is Business Intelligence Business Intelligence Market BI Tools & Users What should be understood when someone uses the term Business Intellingence? But

More information

Data Warehousing. Ritham Vashisht, Sukhdeep Kaur and Shobti Saini

Data Warehousing. Ritham Vashisht, Sukhdeep Kaur and Shobti Saini Advance in Electronic and Electric Engineering. ISSN 2231-1297, Volume 3, Number 6 (2013), pp. 669-674 Research India Publications http://www.ripublication.com/aeee.htm Data Warehousing Ritham Vashisht,

More information

TDWI Data Modeling. Data Analysis and Design for BI and Data Warehousing Systems

TDWI Data Modeling. Data Analysis and Design for BI and Data Warehousing Systems Data Analysis and Design for BI and Data Warehousing Systems Previews of TDWI course books offer an opportunity to see the quality of our material and help you to select the courses that best fit your

More information

IT DATA WAREHOUSING AND DATA MINING UNIT-2 BUSINESS ANALYSIS

IT DATA WAREHOUSING AND DATA MINING UNIT-2 BUSINESS ANALYSIS PART A 1. What are production reporting tools? Give examples. (May/June 2013) Production reporting tools will let companies generate regular operational reports or support high-volume batch jobs. Such

More information

Data Warehousing and OLAP

Data Warehousing and OLAP Data Warehousing and OLAP INFO 330 Slides courtesy of Mirek Riedewald Motivation Large retailer Several databases: inventory, personnel, sales etc. High volume of updates Management requirements Efficient

More information

CHAPTER 3 BUILDING ARCHITECTURAL DATA WAREHOUSE FOR CANCER DISEASE

CHAPTER 3 BUILDING ARCHITECTURAL DATA WAREHOUSE FOR CANCER DISEASE 32 CHAPTER 3 BUILDING ARCHITECTURAL DATA WAREHOUSE FOR CANCER DISEASE 3.1 INTRODUCTION Due to advanced technology, increasing number of hospitals are using electronic medical records to accumulate substantial

More information

Data Analytics at Logitech Snowflake + Tableau = #Winning

Data Analytics at Logitech Snowflake + Tableau = #Winning Welcome # T C 1 8 Data Analytics at Logitech Snowflake + Tableau = #Winning Avinash Deshpande I am a futurist, scientist, engineer, designer, data evangelist at heart Find me at Avinash Deshpande Chief

More information

Data Warehouse and Data Mining

Data Warehouse and Data Mining Data Warehouse and Data Mining Lecture No. 03 Architecture of DW Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro Basic

More information

Database design View Access patterns Need for separate data warehouse:- A multidimensional data model:-

Database design View Access patterns Need for separate data warehouse:- A multidimensional data model:- UNIT III: Data Warehouse and OLAP Technology: An Overview : What Is a Data Warehouse? A Multidimensional Data Model, Data Warehouse Architecture, Data Warehouse Implementation, From Data Warehousing to

More information

Introduction Database Concepts

Introduction Database Concepts Introduction Database Concepts CO attained : CO1 Hours Required: 05 Self Study: 08 Prepared and presented by : Ms. Swati Abhang Contents Introduction Characteristics of databases, File system V/s Database

More information

Data Mining and Data Warehousing Introduction to Data Mining

Data Mining and Data Warehousing Introduction to Data Mining Data Mining and Data Warehousing Introduction to Data Mining Quiz Easy Q1. Which of the following is a data warehouse? a. Can be updated by end users. b. Contains numerous naming conventions and formats.

More information

CHAPTER 8: ONLINE ANALYTICAL PROCESSING(OLAP)

CHAPTER 8: ONLINE ANALYTICAL PROCESSING(OLAP) CHAPTER 8: ONLINE ANALYTICAL PROCESSING(OLAP) INTRODUCTION A dimension is an attribute within a multidimensional model consisting of a list of values (called members). A fact is defined by a combination

More information

Computers Are Your Future

Computers Are Your Future Computers Are Your Future Computers Are Your Future Databases and Information Systems Slide 2 What You Will Learn About The potential uses of a database program The basic components of a database The differences

More information

Fig 1.2: Relationship between DW, ODS and OLTP Systems

Fig 1.2: Relationship between DW, ODS and OLTP Systems 1.4 DATA WAREHOUSES Data warehousing is a process for assembling and managing data from various sources for the purpose of gaining a single detailed view of an enterprise. Although there are several definitions

More information

Chapter 3. Database Architecture and the Web

Chapter 3. Database Architecture and the Web Chapter 3 Database Architecture and the Web 1 Chapter 3 - Objectives Software components of a DBMS. Client server architecture and advantages of this type of architecture for a DBMS. Function and uses

More information

Lectures for the course: Data Warehousing and Data Mining (IT 60107)

Lectures for the course: Data Warehousing and Data Mining (IT 60107) Lectures for the course: Data Warehousing and Data Mining (IT 60107) Week 1 Lecture 1 21/07/2011 Introduction to the course Pre-requisite Expectations Evaluation Guideline Term Paper and Term Project Guideline

More information

Data Warehouse and Data Mining

Data Warehouse and Data Mining Data Warehouse and Data Mining Lecture No. 02 Introduction to Data Warehouse Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology

More information

ALTERNATE SCHEMA DIAGRAMMING METHODS DECISION SUPPORT SYSTEMS. CS121: Relational Databases Fall 2017 Lecture 22

ALTERNATE SCHEMA DIAGRAMMING METHODS DECISION SUPPORT SYSTEMS. CS121: Relational Databases Fall 2017 Lecture 22 ALTERNATE SCHEMA DIAGRAMMING METHODS DECISION SUPPORT SYSTEMS CS121: Relational Databases Fall 2017 Lecture 22 E-R Diagramming 2 E-R diagramming techniques used in book are similar to ones used in industry

More information

STRATEGIC INFORMATION SYSTEMS IV STV401T / B BTIP05 / BTIX05 - BTECH DEPARTMENT OF INFORMATICS. By: Dr. Tendani J. Lavhengwa

STRATEGIC INFORMATION SYSTEMS IV STV401T / B BTIP05 / BTIX05 - BTECH DEPARTMENT OF INFORMATICS. By: Dr. Tendani J. Lavhengwa STRATEGIC INFORMATION SYSTEMS IV STV401T / B BTIP05 / BTIX05 - BTECH DEPARTMENT OF INFORMATICS LECTURE: 05 (A) DATA WAREHOUSING (DW) By: Dr. Tendani J. Lavhengwa lavhengwatj@tut.ac.za 1 My personal quote:

More information

Web Mining Evolution & Comparative Study with Data Mining

Web Mining Evolution & Comparative Study with Data Mining Web Mining Evolution & Comparative Study with Data Mining Anu, Assistant Professor (Resource Person) University Institute of Engineering and Technology Mahrishi Dayanand University Rohtak-124001, India

More information

QM Chapter 1 Database Fundamentals Version 10 th Ed. Prepared by Dr Kamel Rouibah / Dept QM & IS

QM Chapter 1 Database Fundamentals Version 10 th Ed. Prepared by Dr Kamel Rouibah / Dept QM & IS QM 433 - Chapter 1 Database Fundamentals Version 10 th Ed Prepared by Dr Kamel Rouibah / Dept QM & IS www.cba.edu.kw/krouibah Dr K. Rouibah / dept QM & IS Chapter 1 (433) Database fundamentals 1 Objectives

More information

MAXIMIZING ROI FROM AKAMAI ION USING BLUE TRIANGLE TECHNOLOGIES FOR NEW AND EXISTING ECOMMERCE CUSTOMERS CONSIDERING ION CONTENTS EXECUTIVE SUMMARY... THE CUSTOMER SITUATION... HOW BLUE TRIANGLE IS UTILIZED

More information

Data Warehousing & Mining Techniques

Data Warehousing & Mining Techniques Data Warehousing & Mining Techniques Wolf-Tilo Balke Kinda El Maarry Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 2. Summary Last week: What is a Data

More information

Data Warehousing. Seminar report. Submitted in partial fulfillment of the requirement for the award of degree Of Computer Science

Data Warehousing. Seminar report.  Submitted in partial fulfillment of the requirement for the award of degree Of Computer Science A Seminar report On Data Warehousing Submitted in partial fulfillment of the requirement for the award of degree Of Computer Science SUBMITTED TO: SUBMITTED BY: www.studymafia.org www.studymafia.org Preface

More information

Chapter 3. Foundations of Business Intelligence: Databases and Information Management

Chapter 3. Foundations of Business Intelligence: Databases and Information Management Chapter 3 Foundations of Business Intelligence: Databases and Information Management THE DATA HIERARCHY TRADITIONAL FILE PROCESSING Organizing Data in a Traditional File Environment Problems with the traditional

More information

Taking a First Look at Excel s Reporting Tools

Taking a First Look at Excel s Reporting Tools CHAPTER 1 Taking a First Look at Excel s Reporting Tools This chapter provides you with an overview of Excel s reporting features. It shows you the principal types of Excel reports and how you can use

More information

International Journal of Computer Engineering and Applications, ICCSTAR-2016, Special Issue, May.16

International Journal of Computer Engineering and Applications, ICCSTAR-2016, Special Issue, May.16 The Survey Of Data Mining And Warehousing Architha.S, A.Kishore Kumar Department of Computer Engineering Department of computer engineering city engineering college VTU Bangalore, India ABSTRACT: Data

More information

Data Warehouse and Data Mining

Data Warehouse and Data Mining Data Warehouse and Data Mining Lecture No. 04-06 Data Warehouse Architecture Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology

More information