Decision Support Systems
- Franklin Charles
Transcription
1 Decision Support Systems 2011/2012 Week 3. Lecture 5
2 Previous Class: Data Pre-Processing. Data quality: accuracy, completeness, consistency, timeliness, believability, interpretability. Data cleaning: handling missing and noisy values. Data integration: Remove redundancies. Data transformation: Normalization. Data reduction: Dimensionality reduction.
3 Today. Data reduction & discretization (conclusion). Data Warehousing: Introduction. Data cubes. Online analytical processing (OLAP).
4 Previously, on DSS
5 Data Reduction (revisited). Obtain a reduced representation of the data that produces similar mining results. Data reduction strategies: 1. Dimensionality reduction: Data is encoded so as to reduce dimension (wavelet transforms, PCA, feature selection). 2. Numerosity reduction: Data is replaced by smaller representations, either parametric (e.g., log-linear models) or non-parametric (e.g., clustering, sampling). 3. Attribute selection: Irrelevant or redundant attributes are removed.
6 1. Dimensionality Reduction. Transforms (wavelets, Fourier) use a specific basis to represent the data. PCA builds a basis from the data (eigenvectors of the covariance matrix). Transformed data can be truncated. [Figure: Data -> New basis (Fourier, wavelet, PC) -> Truncation]
7 2. Numerosity Reduction. Reduce data volume using smaller forms of data representation. Two main classes of methods. Parametric methods: Assume the data fits a model; estimate and store the model parameters; discard the data (except possible outliers). Non-parametric methods: Do not assume models.
8 Parametric Data Reduction. Regression and log-linear models. Linear regression: Data modeled to fit a straight line; often uses the least-squares method to fit the line. Multiple regression: Data modeled as a linear function of a multidimensional feature vector. Log-linear model: Approximates data as a (parameterized) probability distribution; also useful for dimensionality reduction and smoothing.
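As an illustration of parametric reduction, the sketch below fits a straight line by least squares and keeps only the two parameters instead of the data points. The function name `fit_line` and the sample values are made up for this example; they are not from the lecture.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data that happens to lie exactly on y = 2x + 1.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
slope, intercept = fit_line(xs, ys)

# The reduced representation is just the two parameters.
print(slope, intercept)
```

Once the parameters are stored, any value can be re-estimated as `slope * x + intercept`; points that deviate strongly from the line (outliers) would have to be kept separately.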
9 Histogram Analysis. Already met histograms as a tool for data visualization. Histogram approach to numerosity reduction: Divide data into buckets; store the mean for each bucket. Partitioning rules: Equal-width (equal bucket range); equal-frequency (equal number of points).
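The two partitioning rules can be sketched as follows. This is a minimal illustration, assuming a plain list of numbers; the helper names and the `prices` values are hypothetical.

```python
def equal_width_buckets(values, k):
    """Split the value range into k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    buckets = [[] for _ in range(k)]
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything goes to one bucket
        else:
            # The maximum value would land in bucket k, so clamp it to k - 1.
            idx = min(int((v - lo) / width), k - 1)
        buckets[idx].append(v)
    return buckets


def equal_frequency_buckets(values, k):
    """Split the sorted values into k buckets with (nearly) equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]


def bucket_means(buckets):
    """Reduced representation: one mean per bucket (None if empty)."""
    return [sum(b) / len(b) if b else None for b in buckets]


prices = [1, 1, 5, 5, 5, 8, 8, 10, 10, 12, 14, 30]
width_buckets = equal_width_buckets(prices, 3)
freq_buckets = equal_frequency_buckets(prices, 3)

print([len(b) for b in width_buckets])  # bucket sizes differ
print(bucket_means(freq_buckets))       # four points per bucket
```

With skewed data such as this, equal-width buckets end up with very uneven counts, while equal-frequency buckets hold the same number of points but span ranges of different width.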
10 Clustering. Clustering approach to numerosity reduction: Partition the data set into clusters based on similarity; store a cluster representation (e.g., centroid and diameter). Can be very effective if the data is clustered; not if the data is smeared. There are many choices of clustering definitions and clustering algorithms. Cluster analysis will be covered later on.
11 Sampling. Sampling: obtaining a small sample to represent the whole data set. Complexity of mining algorithms potentially sub-linear in the size of the data. Key principle: Choose a representative subset of the data. Sampling approaches: Simple random sampling (may have very poor performance if the data is skewed). Stratified sampling (data divided into strata, each sampled separately). Cluster sampling (data is clustered and then clusters are sampled).
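A minimal sketch of stratified sampling, assuming records are tuples and the stratum is given by a key function. The record layout, the 10% fraction, and the rule of keeping at least one record per stratum are illustrative choices, not part of the lecture.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Sample each stratum separately so small groups stay represented."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        # Keep at least one record per stratum (an assumption of this sketch).
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


# Skewed toy data: 90 "young" customers and 10 "senior" customers.
customers = [("young", i) for i in range(90)] + [("senior", i) for i in range(10)]
sample = stratified_sample(customers, key=lambda r: r[0], fraction=0.1)

young = sum(1 for group, _ in sample if group == "young")
senior = sum(1 for group, _ in sample if group == "senior")
print(young, senior)
```

A simple random sample of 10 records from the same data could easily contain no senior customers at all, which is the skew problem the slide warns about.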
12 3. Attribute Subset Selection (aka Feature Selection). Remove redundant attributes/dimensions. With N attributes there are 2^N possible subsets (exhaustive search is impractical). Common techniques: Stepwise forward selection: start from the empty set, add one feature at a time. Stepwise backward elimination: start from the full set, remove one feature at a time. Combination of forward selection and backward elimination. Heuristics to guide selection/elimination: Statistical tests; information gain.
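Stepwise forward selection can be sketched as a greedy loop around a scoring function. The scoring function is left to the caller (in practice it could be a statistical test or information gain); `toy_score`, the `TOY` table and the attribute names below are invented purely to make the example runnable.

```python
def forward_selection(features, score, min_gain=1e-9):
    """Greedy stepwise forward selection.

    Starts from the empty set and repeatedly adds the feature that
    improves score(subset) the most, stopping when no feature helps.
    """
    selected = []
    best = score(selected)
    remaining = list(features)
    while remaining:
        candidate, cand_score = max(
            ((f, score(selected + [f])) for f in remaining),
            key=lambda pair: pair[1],
        )
        if cand_score - best <= min_gain:
            break  # no remaining feature adds anything
        selected.append(candidate)
        remaining.remove(candidate)
        best = cand_score
    return selected


# Hypothetical attributes: (information group, usefulness). Attributes in
# the same group are redundant with each other.
TOY = {
    "age": ("a", 3),
    "age_in_months": ("a", 3),
    "income": ("b", 2),
    "noise": ("c", 0),
}


def toy_score(subset):
    """Toy score: redundant attributes count only once per group."""
    best_per_group = {}
    for f in subset:
        group, weight = TOY[f]
        best_per_group[group] = max(weight, best_per_group.get(group, 0))
    return sum(best_per_group.values())


chosen = forward_selection(list(TOY), toy_score)
print(chosen)
```

The loop evaluates at most N + (N - 1) + ... + 1 subsets instead of all 2^N, at the price of possibly missing the best subset.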
13 Data Cube Aggregation. Data in a database is aggregated at different granularities. Multidimensional aggregated information is stored in a data cube. [Figure: example data cube over the values dark/pastel/white, skirt/dress/shirt/pants and small/medium/large, with an "all"/TOTAL entry along each dimension]
14 Data Discretization
15 Discretization. Discretization: Divide the range of a continuous attribute into intervals. Interval labels are used as actual data values. Reduces and simplifies the original data. Discretization techniques can: Be supervised (use class information) or unsupervised. Rely on splitting (top-down) or merging (bottom-up). Discretization can be performed recursively on an attribute.
16 Data Discretization Methods. Binning: already discussed for smoothing. Top-down split, unsupervised. Histogram analysis: already discussed in data visualization. Top-down split, unsupervised. Cluster analysis: Values partitioned in clusters from the data distribution. Unsupervised, top-down split or bottom-up merge. Entropy-based analysis: Values partitioned from information measures. Supervised, top-down split. Correlation analysis: Correlation information guides merging with neighbor values. Unsupervised, bottom-up merge.
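One step of entropy-based (supervised, top-down) discretization can be sketched as a search for the cut point that minimizes the weighted class entropy of the two resulting intervals. The function names and the age/label values are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(values, labels):
    """Return (cut_point, weighted_entropy) for the best binary split.

    Candidate cuts are midpoints between consecutive distinct values.
    Returns None when all values are identical.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        cost = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or cost < best[1]:
            best = (cut, cost)
    return best


# Toy data: the class label changes between ages 30 and 41.
ages = [22, 25, 30, 41, 47, 52]
buys = ["no", "no", "no", "yes", "yes", "yes"]
cut, cost = best_split(ages, buys)
print(cut, cost)
```

Applying `best_split` again to each resulting interval gives the recursive, top-down behaviour described on the slides; a stopping rule (for example a minimum interval size) would be needed in practice.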
17 Concept Hierarchies. Concept hierarchies organize concepts hierarchically. Usually associated with dimensions in a data warehouse. Concept hierarchies facilitate viewing data at multiple granularities. Concept hierarchy formation: Recursively reduce the data by collecting and replacing low-level concepts by higher-level concepts. Concept hierarchies can be explicitly specified by domain experts or automatically generated for both numeric and nominal data.
18 Concept Hierarchy Generation (Nominal Data). Specification of a partial/total ordering of attributes. E.g., street < city < state < country. Specification of a hierarchy for a set of values by explicit data grouping. E.g., {Urbana, Champaign, Chicago} < Illinois. Specification of a partial/total order on only a partial set of attributes. E.g., street < city, but not others. Automatic generation of hierarchies by analysis of distinct values. E.g., for the attribute set {street, city, state, country}.
19 Example. Automatic generation of hierarchies by analysis of distinct values. The attribute with the most distinct values is placed at the lowest level of the hierarchy. Country: 15 distinct values. Province or State: 365 distinct values. City: 3567 distinct values. Street: 674,339 distinct values. Careful: There are exceptions!
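The distinct-value heuristic can be sketched in a few lines: count the distinct values of each attribute and order the attributes from fewest (top of the hierarchy) to most (bottom). The rows below are a made-up toy table, not the data behind the counts on the slide.

```python
def hierarchy_from_distinct_counts(rows, attributes):
    """Order attributes top-down by their number of distinct values."""
    counts = {a: len({row[a] for row in rows}) for a in attributes}
    ordering = sorted(attributes, key=lambda a: counts[a])
    return ordering, counts


rows = [
    {"street": "1 Main St", "city": "Urbana", "state": "Illinois", "country": "USA"},
    {"street": "2 Oak Ave", "city": "Urbana", "state": "Illinois", "country": "USA"},
    {"street": "3 Lake Rd", "city": "Chicago", "state": "Illinois", "country": "USA"},
    {"street": "4 King St", "city": "Toronto", "state": "Ontario", "country": "Canada"},
    {"street": "5 Bay St", "city": "Vancouver", "state": "British Columbia", "country": "Canada"},
]

ordering, counts = hierarchy_from_distinct_counts(
    rows, ["street", "city", "state", "country"]
)
print(" < ".join(reversed(ordering)))
```

Because this is only a heuristic, the result should be reviewed by someone who knows the domain: an attribute can have few distinct values without being a higher-level concept.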
20 Data Warehousing
21 Introduction
22 What is a Data Warehouse? Defined in different ways, not rigorously: A decision support database maintained separately from the organization's operational database. Supports information processing by providing a solid platform of consolidated, historical data for analysis. "A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management's decision-making process" (W. H. Inmon). Data warehousing: The process of constructing and using data warehouses.
23 Subject-Oriented. Organized around major subjects, such as customer, product, sales. Focuses on the modeling and analysis of data for decision makers, not on daily operations or transaction processing. Provides a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process.
24 Integrated. Constructed by integrating multiple heterogeneous data sources. E.g., relational databases, flat files, on-line transaction records. Construction requires data cleaning and data integration techniques: Ensure consistency across different data sources in naming conventions, encoding structures, attribute measures, etc. When data is consolidated in the warehouse, it is converted.
25 Time Variant. The time horizon for data warehouses is longer than that of operational DB systems: Operational databases: Current values of data. Data warehouses: Information from a historical perspective. Every key structure in the data warehouse contains elements of time, explicitly or implicitly. Keys in operational data may or may not contain time elements.
26 Non-volatile. A store of transformed data that is physically separated from the operational environment. Operational update of data does not occur in the data warehouse environment: Does not require transaction processing, recovery, and concurrency control mechanisms. Requires only two operations in data accessing: initial loading of data and access of data.
27 OLTP vs. OLAP
Feature | OLTP | OLAP
Users | Clerk, IT professional | Knowledge worker
Function | Day-to-day operations | Decision support
DB design | Application-oriented | Subject-oriented
Data | Current, up-to-date, detailed, flat relational, isolated | Historical, summarized, multidimensional, integrated
Usage | Repetitive | Ad-hoc
Access | Read/write, index/hash on primary keys | Scan
Unit of work | Short, simple transactions | Complex queries
Records accessed | Tens | Millions
Number of users | Thousands | Hundreds
DB size | 10 MB to GB | 100 GB to TB
Metric | Transaction throughput | Query throughput, response
28 Why a Separate Data Warehouse? High performance for both systems. DBMS tuned for OLTP: Access methods, indexing, concurrency control, recovery. Warehouse tuned for OLAP: Complex OLAP queries, multidimensional view, consolidation. Different functions and data: Missing data: Decision support (DS) requires historical data that operational DBs typically do not maintain. Data consolidation: DS requires consolidation (aggregation, summarization) of data from heterogeneous sources. Data quality: Different sources typically use inconsistent data representations, codes and formats, which must be reconciled.
29 Basic Concepts
30 Data Warehouse: A Multi-tiered Architecture. [Figure: four tiers. Data Sources (operational DBs, other sources) -> Extract, Transform, Load, Refresh -> Data Storage (data warehouse, data marts, metadata, monitor & integrator) -> OLAP Engine (OLAP server) -> Front-end Tools (analysis, query, reports, data mining)]
31 Data Warehouse Models. Enterprise warehouse: Collects all of the information about subjects spanning the entire organization. Data mart: A subset of corporate-wide data that is of value to specific groups of users. Its scope is confined to specific selected groups, such as a marketing data mart. Independent vs. dependent (sourced directly from the warehouse) data marts. Virtual warehouse: A set of views over operational databases. Only some of the possible summary views may be materialized.
32 Extraction, Transformation, and Loading (ETL). Data extraction: Get data from multiple heterogeneous external sources. Data cleaning: Detect errors in the data and rectify them when possible. Data transformation: Convert data from legacy or host format to warehouse format. Load: Sort, summarize, consolidate, compute views, check integrity, and build indices and partitions. Refresh: Propagate the updates from the data sources to the warehouse.
33 Metadata Repository. Metadata is the data defining warehouse objects: Description of the structure of the data warehouse (schemas, views, dimensions, hierarchies, derived data definitions, etc.). Operational metadata (data lineage, currency of data, monitoring information). Algorithms used for summarization. Mapping from the operational environment to the data warehouse. Data related to system performance (warehouse schema, view and derived data definitions). Business data (business terms and definitions, ownership of data, charging policies).
34 Data Cubes and OLAP
35 From Tables to Data Cubes. A data warehouse is based on a multidimensional data model which views data in the form of a data cube. A data cube allows data to be modeled and viewed in multiple dimensions. A data cube contains: Dimension tables. A fact table containing measures and keys to each of the related dimension tables. An n-D base cube is called a base cuboid. The top-most 0-D cuboid, which holds the highest level of summarization, is called the apex cuboid. The lattice of cuboids forms a data cube.
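The lattice of cuboids can be enumerated directly: each cuboid corresponds to one subset of the dimensions, from the empty subset (apex cuboid) to the full set (base cuboid). In the sketch below the dimension names are illustrative; `supplier` in particular is a hypothetical fourth dimension added only to obtain a 4-D example.

```python
from itertools import combinations


def cuboids(dimensions):
    """List every cuboid as a tuple of the dimensions it groups by.

    The empty tuple is the 0-D apex cuboid; the tuple with all
    dimensions is the n-D base cuboid.
    """
    return [
        combo
        for k in range(len(dimensions) + 1)
        for combo in combinations(dimensions, k)
    ]


dims = ["time", "item", "location", "supplier"]  # "supplier" is hypothetical
lattice = cuboids(dims)

for k in range(len(dims) + 1):
    level = [c for c in lattice if len(c) == k]
    print(f"{k}-D cuboids: {len(level)}")
```

Without concept hierarchies, n dimensions give 2^n cuboids, which is why only some of them are usually materialized.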
36 Example (2-D view). [Table: cross tabulation (dimensions tabulated across one another) for Location = Vancouver; rows are Time (quarter), columns are Item (type): Home entertainment, Computer, Phone, Security. Cell values not preserved in the transcription.]
37 Example (3-D view). [Table: the same Time (quarter) by Item (Home ent., Comp., Phone, Sec.) tabulation repeated for Location = Chicago, New York, Toronto and Vancouver. Cell values not preserved in the transcription.]
38 Example. [Figure: lattice of cuboids, from the 0-D (apex) cuboid through the 1-D, 2-D and 3-D cuboids down to the 4-D (base) cuboid]
39 A Concept Hierarchy (Location). [Figure: hierarchy levels from top to bottom. all: all. Region: Europe, North America. Country: Germany, Spain, Canada, Mexico. City: Frankfurt, Vancouver, Toronto. Office: L. Chan, M. Wind.]
40 Multidimensional Data. Sales volume as a function of product, month, and region. Dimensions: Product, Location, Time. Hierarchical summarization paths: Product: Type, Category, Item. Location: Country, State, City, Street. Time: Year, Quarter, Month, Day, Week.
41 Typical OLAP Operations. Roll up (drill-up): Summarize data by climbing up a hierarchy or by dimension reduction. Drill down (roll down): Go from less detailed to more detailed data or introduce new dimensions (reverse of roll-up). Slice: Project. Dice: Select. Pivot: Reorient the cube. Other operations: Drill across: Involves summarization/aggregation across multiple fact tables. Drill through: Involves going through the bottom level of the cube to its back-end relational tables.
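Some of these operations can be sketched on a tiny in-memory fact table. The rows, sales figures and function names are invented for illustration, and slice and dice follow the common textbook reading (slice fixes one dimension to a single value, dice restricts several dimensions to sets of values); a real OLAP server would work on precomputed cuboids rather than raw rows.

```python
from collections import defaultdict

# Toy fact table: one row per (city, quarter, item) with a sales measure.
facts = [
    {"city": "Vancouver", "country": "Canada", "quarter": "Q1", "item": "computer", "sales": 10},
    {"city": "Toronto", "country": "Canada", "quarter": "Q1", "item": "phone", "sales": 5},
    {"city": "Vancouver", "country": "Canada", "quarter": "Q2", "item": "computer", "sales": 7},
    {"city": "Chicago", "country": "USA", "quarter": "Q1", "item": "computer", "sales": 4},
]


def roll_up(rows, keep, measure="sales"):
    """Aggregate the measure over the dimensions listed in `keep`.

    Replacing "city" by "country" climbs the location hierarchy;
    dropping a dimension from `keep` is roll-up by dimension reduction.
    """
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in keep)] += row[measure]
    return dict(totals)


def slice_cube(rows, dim, value):
    """Slice: fix a single dimension to one value."""
    return [row for row in rows if row[dim] == value]


def dice_cube(rows, allowed):
    """Dice: restrict several dimensions to sets of allowed values."""
    return [row for row in rows if all(row[d] in vals for d, vals in allowed.items())]


by_country_quarter = roll_up(facts, ["country", "quarter"])
by_country = roll_up(facts, ["country"])
q1_only = slice_cube(facts, "quarter", "Q1")
canada_computers = dice_cube(facts, {"country": {"Canada"}, "item": {"computer"}})

print(by_country_quarter)
print(by_country)
print(len(q1_only), len(canada_computers))
```

Drill-down is the reverse direction: calling `roll_up` with a finer set of dimensions, for example `["city", "quarter"]` instead of `["country", "quarter"]`.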
42 Example. [Figure: roll-up, drill-down, slice, dice and pivot illustrated on a data cube]
43 A Starnet Query Model
44 Browsing a Data Cube. Visualization. OLAP capabilities. Interactive manipulation.
45 Finally
46 Summary. Data reduction. Numerosity reduction: Reduce data volume using smaller forms of data representation. Data discretization: Reduces and simplifies the original data. Data warehousing: A multi-dimensional model of a data warehouse. Data cube: Dimensions & measures. OLAP operations: drilling, rolling, slicing, dicing and pivoting.
47 Next Class (Data Warehouse) Architecture
More informationData Warehousing & On-line Analytical Processing
Data Warehousing & On-line Analytical Processing Erwin M. Bakker & Stefan Manegold https://homepages.cwi.nl/~manegold/dbdm/ http://liacs.leidenuniv.nl/~bakkerem2/dbdm/ Chapter 4: Data Warehousing and On-line
More informationCHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI
CHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS Assist. Prof. Dr. Volkan TUNALI Topics 2 Business Intelligence (BI) Decision Support System (DSS) Data Warehouse Online Analytical Processing (OLAP)
More informationA Multi-Dimensional Data Model
A Multi-Dimensional Data Model A Data Warehouse is based on a Multidimensional data model which views data in the form of a data cube A data cube, such as sales, allows data to be modeled and viewed in
More informationOLAP2 outline. Multi Dimensional Data Model. A Sample Data Cube
OLAP2 outline Multi Dimensional Data Model Need for Multi Dimensional Analysis OLAP Operators Data Cube Demonstration Using SQL Multi Dimensional Data Model Multi dimensional analysis is a popular approach
More informationIntroduc3on to Data Management
ICS 101 Fall 2014 Introduc3on to Data Management Assoc. Prof. Lipyeow Lim Informa3on & Computer Science Department University of Hawaii at Manoa Lipyeow Lim - - University of Hawaii at Manoa 1 The Data
More informationCT75 (ALCCS) DATA WAREHOUSING AND DATA MINING JUN
Q.1 a. Define a Data warehouse. Compare OLTP and OLAP systems. Data Warehouse: A data warehouse is a subject-oriented, integrated, time-variant, and 2 Non volatile collection of data in support of management
More informationCT75 DATA WAREHOUSING AND DATA MINING DEC 2015
Q.1 a. Briefly explain data granularity with the help of example Data Granularity: The single most important aspect and issue of the design of the data warehouse is the issue of granularity. It refers
More informationThe Evolution of Data Warehousing. Data Warehousing Concepts. The Evolution of Data Warehousing. The Evolution of Data Warehousing
The Evolution of Data Warehousing Data Warehousing Concepts Since 1970s, organizations gained competitive advantage through systems that automate business processes to offer more efficient and cost-effective
More informationDATA MINING AND WAREHOUSING
DATA MINING AND WAREHOUSING Qno Question Answer 1 Define data warehouse? Data warehouse is a subject oriented, integrated, time-variant, and nonvolatile collection of data that supports management's decision-making
More informationETL and OLAP Systems
ETL and OLAP Systems Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master studies, first semester
More informationData Warehouse and Data Mining
Data Warehouse and Data Mining Lecture No. 04-06 Data Warehouse Architecture Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology
More informationTribhuvan University Institute of Science and Technology MODEL QUESTION
MODEL QUESTION 1. Suppose that a data warehouse for Big University consists of four dimensions: student, course, semester, and instructor, and two measures count and avg-grade. When at the lowest conceptual
More informationData Warehouses. Yanlei Diao. Slides Courtesy of R. Ramakrishnan and J. Gehrke
Data Warehouses Yanlei Diao Slides Courtesy of R. Ramakrishnan and J. Gehrke Introduction v In the late 80s and early 90s, companies began to use their DBMSs for complex, interactive, exploratory analysis
More informationDATA WAREHOUSING & DATA MINING. by: Prof. Asha Ambhaikar
DATA WAREHOUSING & DATA MINING by: Prof. Asha Ambhaikar 1 UNIT-I Overview and Concepts 2 Contents of Unit-I Need for data warehousing, Basic elements of data warehousing, Trends in data warehousing. Planning
More information1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda
Agenda Oracle9i Warehouse Review Dulcian, Inc. Oracle9i Server OLAP Server Analytical SQL Mining ETL Infrastructure 9i Warehouse Builder Oracle 9i Server Overview E-Business Intelligence Platform 9i Server:
More informationData Warehouse. Concepts and Techniques. Chapter 3. SS Chung. April 5, 2013 Data Mining: Concepts and Techniques 1
Data Warehouse Concepts and Techniques Chapter 3 SS Chung April 5, 2013 Data Mining: Concepts and Techniques 1 Chapter 3: Data Warehousing and OLAP Technology: An Overview What is a data warehouse? A multi-dimensional
More informationData Warehousing & OLAP
CMPUT 391 Database Management Systems Data Warehousing & OLAP Textbook: 17.1 17.5 (first edition: 19.1 19.5) Based on slides by Lewis, Bernstein and Kifer and other sources University of Alberta 1 Why
More informationFull file at
Chapter 2 Data Warehousing True-False Questions 1. A real-time, enterprise-level data warehouse combined with a strategy for its use in decision support can leverage data to provide massive financial benefits
More informationData Preprocessing. Data Mining 1
Data Preprocessing Today s real-world databases are highly susceptible to noisy, missing, and inconsistent data due to their typically huge size and their likely origin from multiple, heterogenous sources.
More information2. Data Preprocessing
2. Data Preprocessing Contents of this Chapter 2.1 Introduction 2.2 Data cleaning 2.3 Data integration 2.4 Data transformation 2.5 Data reduction Reference: [Han and Kamber 2006, Chapter 2] SFU, CMPT 459
More informationThe University of Iowa Intelligent Systems Laboratory The University of Iowa Intelligent Systems Laboratory
Warehousing Outline Andrew Kusiak 2139 Seamans Center Iowa City, IA 52242-1527 andrew-kusiak@uiowa.edu http://www.icaen.uiowa.edu/~ankusiak Tel. 319-335 5934 Introduction warehousing concepts Relationship
More informationDATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY
DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY CHARACTERISTICS Data warehouse is a central repository for summarized and integrated data
More informationGUJARAT TECHNOLOGICAL UNIVERSITY MASTER OF COMPUTER APPLICATIONS (MCA) Semester: IV
GUJARAT TECHNOLOGICAL UNIVERSITY MASTER OF COMPUTER APPLICATIONS (MCA) Semester: IV Subject Name: Elective I Data Warehousing & Data Mining (DWDM) Subject Code: 2640005 Learning Objectives: To understand
More informationData Preprocessing. S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha
Data Preprocessing S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha 1 Why Data Preprocessing? Data in the real world is dirty incomplete: lacking attribute values, lacking
More informationcse634 Data Mining Preprocessing Lecture Notes Chapter 2 Professor Anita Wasilewska
cse634 Data Mining Preprocessing Lecture Notes Chapter 2 Professor Anita Wasilewska Chapter 2: Data Preprocessing (book slide) Why preprocess the data? Descriptive data summarization Data cleaning Data
More informationUNIT 2. DATA PREPROCESSING AND ASSOCIATION RULES
UNIT 2. DATA PREPROCESSING AND ASSOCIATION RULES Data Pre-processing-Data Cleaning, Integration, Transformation, Reduction, Discretization Concept Hierarchies-Concept Description: Data Generalization And
More information