Data Warehouse and Data Mining


1 Data Warehouse and Data Mining Lecture No. 06 Data Modeling Naeem Ahmed Department of Software Engineering Mehran University of Engineering and Technology Jamshoro

2 Data Modeling Conceptual Modeling: DW Modeling, Multidimensional Entity Relationship (ME/R) Model, Multidimensional UML (mUML) Logical Modeling: Cubes, Dimensions, Hierarchies Physical Modeling: Star, Snowflake, Array storage

3 Logical Model Goal of the Logical Model: Confirm the subject areas Create real facts and dimensions from the subjects that have been identified Establish the needed granularity for dimensions Logical structure of the multidimensional model Cubes: Sales, Purchase, Price, Inventory Dimensions: Product, Time, Geography, Client

4 Logical Model

5 Dimensions Dimensions are entities within the data model, chosen for an analysis purpose One dimension can be used to define more than one cube They are hierarchically organized

6 Dimensions Dimension hierarchies are organized in classification levels (e.g., Day, Month, ...) The dependencies between the classification levels are described by the classification schema through functional dependencies An attribute B is functionally dependent on an attribute A, denoted A → B, if for all a ∈ dom(A) there exists exactly one b ∈ dom(B) corresponding to it
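
The definition of a functional dependency above can be checked mechanically on sample data. The following is a minimal sketch, assuming the data is given as (a, b) pairs; the function name and the sample dates are illustrative and not part of the lecture.

```python
def is_functionally_dependent(pairs):
    """Return True if B is functionally dependent on A (A -> B).

    `pairs` is an iterable of (a, b) tuples. The dependency holds when
    every value a is associated with exactly one value b.
    """
    seen = {}
    for a, b in pairs:
        # setdefault stores b the first time a is seen and returns the stored value
        if seen.setdefault(a, b) != b:
            return False
    return True


# Hypothetical sample data: each day belongs to exactly one month.
day_to_month = [
    ("2024-01-15", "2024-01"),
    ("2024-01-16", "2024-01"),
    ("2024-02-01", "2024-02"),
]
# Reversed pairs: a month spans several days, so Month -> Day does not hold.
month_to_day = [(month, day) for day, month in day_to_month]

print(is_functionally_dependent(day_to_month))  # True
print(is_functionally_dependent(month_to_day))  # False
```

Day → Month holds, which is why Day can sit below Month in a classification schema, while the reverse direction fails.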

7 Dimensions Classification schemas The classification schema of a dimension D is a semi-ordered set of classification levels ({D.K_0, ..., D.K_k}, →) with a smallest element D.K_0, i.e. there is no classification level with smaller granularity A fully ordered set of classification levels is called a path If the classification schema of the time dimension is considered, then one has the following paths: T.Day → T.Week and T.Day → T.Month → T.Quarter → T.Year Here T.Day is the smallest element

8 Dimensions Classification hierarchies Let D.K_0 → ... → D.K_k be a path in the classification schema of dimension D A classification hierarchy concerning this path is a balanced tree which has as nodes dom(D.K_0) ∪ ... ∪ dom(D.K_k) ∪ {ALL} and whose edges respect the functional dependencies

9 Dimensions Example: classification hierarchy from the path product dimension

10 Dimensions Store Dimension (top to bottom): Total, Region, District, Stores Product Dimension (top to bottom): Total, Manufacturer, Brand, Products

11 Cubes Cubes consist of data cells with one or more measures A cube schema S(G, M) consists of a granularity G = (D_1.K_1, ..., D_n.K_n) and a set M = (M_1, ..., M_m) representing the measures A cube C is a set of cube cells, C ⊆ dom(G) × dom(M)

12 Cubes The coordinates of a cell are the classification nodes from dom(G) corresponding to the cell E.g., Sales((Article, Day, Store, Client), (Turnover))

13 Cubes 4 dimensions (supplier, city, quarter, product)

14 Cubes One can now imagine n-dimensional cubes The n-D cube is called the base cuboid The topmost cuboid, the 0-D cuboid, which holds the highest level of summarization, is called the apex cuboid The full data cube is formed by the lattice of cuboids
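
To make the lattice of cuboids concrete, here is a small sketch that enumerates every cuboid for the four dimensions of the earlier example. It assumes each dimension has a single classification level, so a cuboid is just a subset of the dimensions; with hierarchies the lattice would be larger. The function name is illustrative.

```python
from itertools import combinations


def cuboids(dimensions):
    """Yield every cuboid as a tuple of dimensions, from apex (0-D) to base (n-D)."""
    for k in range(len(dimensions) + 1):
        for subset in combinations(dimensions, k):
            yield subset


dims = ("supplier", "city", "quarter", "product")
lattice = list(cuboids(dims))

print(len(lattice))  # 16, i.e. 2**4 cuboids
print(lattice[0])    # () is the apex cuboid
print(lattice[-1])   # ('supplier', 'city', 'quarter', 'product') is the base cuboid
```

The count doubles with every added dimension, which is one reason full materialization of the lattice quickly becomes expensive.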

15 Cubes But things can get complicated pretty fast

16 Basic Operations Basic operations of the multidimensional model on the logical level Selection Projection Cube join Sum Aggregation

17 Basic Operations Multidimensional Selection The selection on a cube C((D_1.K_1, ..., D_g.K_g), (M_1, ..., M_m)) through a predicate P is defined as σ_P(C) = {z ∈ C : P(z)}, if all variables in P are either: Classification levels K which functionally depend on a classification level in the granularity of the cube, i.e. D_i.K_i → K Measures from (M_1, ..., M_m) E.g. σ_{P.Prod_group = 'Video'}(Sales)

18 Basic Operations Multidimensional projection The projection of a function F(M) of a measure of cube C is defined as π_{F(M)}(C) = {(g, F(m)) ∈ dom(G) × dom(F(M)) : (g, m) ∈ C} E.g., price projection π_{turnover, sold_items}(Sales)
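
Selection and projection can be sketched on a toy cube. The example below assumes a cube is represented as a Python dict from coordinates to measures; the article names, the product-group mapping and all values are hypothetical sample data.

```python
# A cube as a mapping from coordinates to measures (hypothetical sample data).
# Coordinates: (article, day, store); measures: (turnover, sold_items).
sales = {
    ("VCR-100", "2024-01-15", "S1"): (300.0, 2),
    ("TV-55", "2024-01-15", "S1"): (900.0, 1),
    ("VCR-100", "2024-01-16", "S2"): (150.0, 1),
}

# Classification: Article -> Product group (a functional dependency).
product_group = {"VCR-100": "Video", "TV-55": "TV"}


def select(cube, predicate):
    """Selection: keep the cells whose coordinates and measures satisfy the predicate."""
    return {g: m for g, m in cube.items() if predicate(g, m)}


def project(cube, f):
    """Projection: keep all coordinates and replace the measures m by f(m)."""
    return {g: f(m) for g, m in cube.items()}


# Selection on a level that functionally depends on the Article level.
video_sales = select(sales, lambda g, m: product_group[g[0]] == "Video")
# Projection onto the turnover measure only.
turnover_only = project(sales, lambda m: m[0])

print(len(video_sales))                              # 2
print(turnover_only[("TV-55", "2024-01-15", "S1")])  # 900.0
```

The selection predicate may use Product group only because Article → Product group holds, which mirrors the condition stated for σ.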

19 Basic Operations Join operations between cubes are common E.g. if turnover were not provided, it could be calculated with the help of the unit price from the price cube Two cubes C_1(G_1, M_1) and C_2(G_2, M_2) can only be joined if they have the same granularity (G_1 = G_2 = G) C_1 ⋈ C_2 = C(G, M_1 ∪ M_2)

20 Basic Operations When the granularities are different, but there is still need to join the cubes, aggregation has to be performed Aggregation: A whole formed or calculated by the combination of many separate units or items; a total E.g., Sales ⋈ Inventory: aggregate Sales((Day, Article, Store, Client)) to Sales((Month, Article, Store, Client))

21 Basic Operations Aggregation: most important operation for OLAP operations Aggregation functions Build a single value from a set of values, e.g. in SQL: SUM, AVG, COUNT, MIN, MAX Example: SUM_{(P.Product_group, G.City, T.Month)}(Sales)
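
A minimal sketch of aggregating before joining, assuming cubes are dicts and the granularity is simplified to (Day, Article) and (Month, Article). The data and function names are hypothetical.

```python
from collections import defaultdict


def roll_up_day_to_month(cube):
    """Aggregate a cube keyed by (day, article) to (month, article) using SUM."""
    result = defaultdict(float)
    for (day, article), value in cube.items():
        month = day[:7]  # "YYYY-MM-DD" -> "YYYY-MM"
        result[(month, article)] += value
    return dict(result)


def join(c1, c2):
    """Join two cubes of equal granularity: pair up measures of matching cells."""
    return {g: (c1[g], c2[g]) for g in c1.keys() & c2.keys()}


# Hypothetical data: sales per day, inventory per month.
sales_by_day = {
    ("2024-01-15", "TV-55"): 2.0,
    ("2024-01-20", "TV-55"): 3.0,
    ("2024-02-01", "TV-55"): 1.0,
}
inventory_by_month = {
    ("2024-01", "TV-55"): 40.0,
    ("2024-02", "TV-55"): 35.0,
}

sales_by_month = roll_up_day_to_month(sales_by_day)
joined = join(sales_by_month, inventory_by_month)

print(joined[("2024-01", "TV-55")])  # (5.0, 40.0)
```

Joining the daily cube directly with the monthly one would match no cells, since the coordinates differ; the roll-up brings both cubes to the same granularity first.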

22 Change support Classification schema, cube schema and classification hierarchy are all designed in the building phase and considered fixed Practice has proven otherwise DWs grow old, too Changes are strongly connected to the time factor This leads to the time validity of these concepts Reasons for schema modification: New requirements Modification of the data source

23 Classification Hierarchy E.g. Saturn sells a lot of electronics Let's consider mobile phones They built their DW on ... A classification hierarchy of their data until ... could look like this:

24 Classification Hierarchy After ..., 3G becomes hip and affordable and many phone makers start migrating towards 3G-capable phones Let's say O2 makes its XDA 3G capable

25 Classification Hierarchy After ..., phone makers already develop 4G-capable phones

26 Classification Hierarchy It is important to trace the evolution of the data It can explain which data was available at which moment in time Such a versioning system of the classification hierarchy can be implemented by constructing a validity matrix When is something valid? Use timestamps to mark it!

27 Classification Hierarchy Annotated change data

28 Classification Hierarchy The tree can be stored as dimension metadata The storage form is a validity matrix Rows are parent nodes Columns are child nodes

29 Classification Hierarchy Deleting a node in a classification hierarchy Should be performed only in exceptional cases It can lead to information loss How to solve it? Soon GSM phones will not be produced anymore But one might still have some in warehouses, to be delivered Or one might want to query data from the time when GSM phones were sold Just mark the end validity date of the GSM branch in the validity matrix

30 Classification Hierarchy Query classification Having the validity information, we can support queries like: As is versus as is Regards all the data as if the only valid classification hierarchy is the present one In the case of the O2 XDA, it will be treated as if it had always been a 3G phone

31 Classification Hierarchy As is versus as was Orders the classification hierarchy by the validity matrix information The O2 XDA was a GSM phone until ... and a 3G phone afterwards

32 Classification Hierarchy As was versus as was Past hierarchies can be reproduced E.g., query data with an older classification hierarchy Like versus like Only data whose classification hierarchy remained unmodified is evaluated E.g. the Nokia 3600 and the BlackBerry
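
The validity matrix and the query types above can be sketched as follows. This is only an illustration: the matrix is stored sparsely as a dict keyed by (parent, child), all dates are invented because the slides do not preserve them, and the assumption that the Nokia 3600 sits under GSM is ours.

```python
from datetime import date

# Sparse validity matrix: (parent, child) -> (valid_from, valid_to).
# valid_to = None means "still valid". All dates are hypothetical.
validity = {
    ("GSM", "O2 XDA"): (date(2001, 1, 1), date(2005, 3, 31)),
    ("3G", "O2 XDA"): (date(2005, 4, 1), None),
    ("GSM", "Nokia 3600"): (date(2001, 1, 1), None),
}


def parent_at(child, when):
    """As-was view: return the parent of `child` that was valid on date `when`."""
    for (parent, node), (start, end) in validity.items():
        if node == child and start <= when and (end is None or when <= end):
            return parent
    return None


def parent_as_is(child, today):
    """As-is view: classify all data by the hierarchy valid today."""
    return parent_at(child, today)


def unchanged(child):
    """Like versus like: True if the child has only ever had one parent."""
    return sum(1 for (_, node) in validity if node == child) == 1


print(parent_at("O2 XDA", date(2004, 6, 1)))     # GSM
print(parent_as_is("O2 XDA", date(2006, 1, 1)))  # 3G
print(unchanged("Nokia 3600"), unchanged("O2 XDA"))  # True False
```

Marking an end date instead of deleting the ("GSM", "O2 XDA") entry is what keeps the older hierarchy reproducible for as-was queries.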

33 Schema Modification Improper modification of a schema (deleting a dimension) can lead to Data loss Inconsistencies Data is incorrectly aggregated or adapted Proper schema modification is complex but It brings flexibility for the end user The possibility to ask As Is vs. As Was queries and so on Alternatives Schema evolution Schema versioning

34 Schema Modification Schema evolution Modifications can be performed without data loss It involves schema modification and data adaptation to the new schema This data adaptation process is called Instance adaptation

35 Schema Modification Schema evolution Advantage: Faster to execute queries in a DW with many schema modifications Disadvantages: It limits the end user's flexibility to query based on past schemas Only queries based on the current schema are supported

36 Schema Modification Schema versioning Also no data loss All the data corresponding to all the schemas is always available After a schema modification the data is kept in the schema it belongs to Old data - old schema New data - new schema

37 Schema Modification Schema versioning Advantages: Allows higher flexibility, e.g., As Is vs. As Was queries, etc. Disadvantages: Adaptation of the data to the queried schema is done on the spot This results in longer query run time

38 Physical Model Defining the physical structures Setting up the database environment Performance tuning strategies: Indexing, Partitioning, Materialization Goal: Define the actual storage architecture Decide on how the data is to be accessed and how it is arranged

39 Physical Model Physical implementation of the multidimensional paradigm model can be: Relational: Snowflake schema, Star schema, Fact constellation Multidimensional: Matrixes

40 Physical Model Relational model, goals: As little loss of semantic knowledge as possible, e.g., classification hierarchies The translation from multidimensional queries must be efficient The RDBMS should be able to run the translated queries efficiently The maintenance of the present tables should be easy and fast, e.g., when loading new data

41 Relational Model Going from multidimensional to relational Representations for cubes, dimensions, classification hierarchies and attributes Implementation of cubes without the classification hierarchies is easy A table can be seen as a cube A column of a table can be considered as a dimension mapping A tuple in the table represents a cell in the cube If one interprets only a part of the columns as dimensions, one can use the rest as measures The resulting table is called a fact table

42 Relational Model

43 Relational Model Snowflake-schema Simple idea: use a table for each classification level This table includes the ID of the classification level and other attributes 2 neighbor classification levels are connected by 1:n connections e.g., from n Days to 1 Month The measures of a cube are maintained in a fact table Besides measures, there are also the foreign key IDs for the smallest classification levels

44 Relational Model Snowflake? The facts/measures are in the center The dimensions spread out in each direction and branch out with their granularity

45 Snowflake Example

46 Snowflake Example Advantage: Best performance when queries involve aggregation Disadvantage: Complicated maintenance and metadata, explosion in the number of tables in the database

47 Snowflake Schema Advantages: With a snowflake schema the size of the dimension tables is reduced and queries run faster if a dimension is very sparse (most measures corresponding to the dimension have no data) and/or a dimension has a long list of attributes which may be queried Disadvantages: Fact tables are responsible for 90% of the storage requirements Thus, normalizing the dimensions usually leads to insignificant improvements Normalization of the dimension tables can reduce the performance of the DW because it leads to a large number of tables E.g., when connecting dimensions with coarse granularity, these tables are joined with each other during queries A query which connects Product category with Year and Country is clearly not performant (10 tables need to be connected)

48 Relational Model Star schema Basic idea: use a denormalized schema for all the dimensions A star schema can be obtained from the snowflake schema through the denormalization of the tables belonging to a dimension Database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy and dependency Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships between them Denormalization is the process of attempting to optimize the read performance of a database by adding redundant data or by grouping data
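
The step from snowflake to star can be illustrated with Python's built-in sqlite3 module. This is a sketch under assumptions: the table and column names, and the Product → Brand → Manufacturer sample rows, are hypothetical and chosen to mirror the product dimension shown earlier.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Snowflake: one table per classification level (hypothetical names).
    CREATE TABLE manufacturer (manufacturer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE brand (brand_id INTEGER PRIMARY KEY, name TEXT,
                        manufacturer_id INTEGER REFERENCES manufacturer);
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT,
                          brand_id INTEGER REFERENCES brand);
    INSERT INTO manufacturer VALUES (1, 'Acme');
    INSERT INTO brand VALUES (10, 'AcmeVision', 1);
    INSERT INTO product VALUES (100, 'TV-55', 10), (101, 'TV-65', 10);

    -- Star: the same dimension denormalized into a single table.
    CREATE TABLE product_dim AS
    SELECT p.product_id, p.name AS product, b.name AS brand,
           m.name AS manufacturer
    FROM product p
    JOIN brand b ON b.brand_id = p.brand_id
    JOIN manufacturer m ON m.manufacturer_id = b.manufacturer_id;
""")

star_rows = con.execute(
    "SELECT product, brand, manufacturer FROM product_dim ORDER BY product_id"
).fetchall()
print(star_rows)
# [('TV-55', 'AcmeVision', 'Acme'), ('TV-65', 'AcmeVision', 'Acme')]
```

The brand and manufacturer names are now repeated on every product row: queries need no joins inside the dimension, at the price of redundancy.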

49 Star schema Example Benefits: Easy to understand, easy to define hierarchies, reduces # of physical joins, low maintenance, very simple metadata Drawbacks: Summary data in the fact table yields poorer performance for summary levels, huge dimension tables a problem

50 Star Schema Advantages Improves query performance for often-used data Fewer tables and simple structure Efficient query processing with regard to dimensions Disadvantages In some cases, high overhead of redundant data

51 Star Schema
Store Dimension: STORE KEY, Store Description, City, State, District ID, District Desc., Region_ID, Region Desc., Regional Mgr., Level
Fact Table: STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price
Product Dimension: PRODUCT KEY, Product Desc., Brand, Color, Size, Manufacturer, Level
Time Dimension: PERIOD KEY, Period Desc, Year, Quarter, Month, Day, Current Flag, Resolution, Sequence
Example: SELECT A.STORE_KEY, A.PERIOD_KEY, A.dollars FROM Fact_Table A WHERE A.STORE_KEY IN (SELECT STORE_KEY FROM Store_Dimension B WHERE region = 'North' AND Level = 2)
The biggest drawback: dimension tables must carry a level indicator for every record and every query must use it. In the example, without the level constraint, keys for all stores in the NORTH region, including aggregates for region and district, will be pulled from the fact table, resulting in an error. Level is needed whenever aggregates are stored with detail facts. Solution: FACT CONSTELLATION
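
The effect of forgetting the level constraint can be reproduced with a few rows in sqlite3. This is a sketch with hypothetical data; in particular the level encoding (1 = store detail, 2 = district aggregate) is an assumption, since the slides do not define it.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE store_dimension (store_key INTEGER PRIMARY KEY,
                                  description TEXT, region TEXT, level INTEGER);
    CREATE TABLE fact_table (store_key INTEGER, period_key INTEGER, dollars REAL);
    -- Assumed level encoding: 1 = store detail, 2 = district aggregate.
    INSERT INTO store_dimension VALUES
        (1, 'Store A', 'North', 1),
        (2, 'Store B', 'North', 1),
        (3, 'District 1 total', 'North', 2);
    -- The aggregate row (key 3) is stored next to the detail rows.
    INSERT INTO fact_table VALUES
        (1, 202401, 100.0), (2, 202401, 50.0), (3, 202401, 150.0);
""")

query = """
    SELECT SUM(f.dollars) FROM fact_table f
    WHERE f.store_key IN (SELECT store_key FROM store_dimension
                          WHERE region = 'North' {extra})
"""
without_level = con.execute(query.format(extra="")).fetchone()[0]
with_level = con.execute(query.format(extra="AND level = 1")).fetchone()[0]

print(without_level)  # 300.0, detail and aggregate rows are double counted
print(with_level)     # 150.0, the correct total for the North region
```

Keeping aggregates in separate fact tables, as in a fact constellation, removes the need for the level column altogether.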

52 Fact Constellation Schema The FACT Constellation Schema describes a logical database structure of a Data Warehouse or Data Mart It can be designed as a collection of de-normalized FACT tables and shared, conformed Dimension tables The FACT Constellation Schema is an extended and decomposed STAR Schema In Fact Constellations, aggregate tables are created separately from the detail; therefore, it is impossible to pick up, for example, Store detail when querying the District Fact Table

53 Fact Constellation Schema Fact Constellation is a good alternative to the Star, but when dimensions have very high cardinality, the sub-selects in the dimension tables can be a source of delay An alternative is to normalize the dimension tables by attribute level, with each smaller dimension table pointing to an appropriate aggregated fact table, the Snowflake Schema Advantage: No need for the Level indicator in the dimension tables, since no aggregated data is stored with lower-level detail Disadvantage: Dimension tables are still very large in some cases, which can slow performance; front-end must be able to detect existence of aggregate facts, which requires more extensive metadata

54 Fact Constellation Example
Store Dimension: STORE KEY, Store Description, City, State, District ID, District Desc., Region_ID, Region Desc., Regional Mgr.
Fact Table: STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price
Product Dimension: PRODUCT KEY, Product Desc., Brand, Color, Size, Manufacturer
Time Dimension: PERIOD KEY, Period Desc, Year, Quarter, Month, Day, Current Flag, Sequence
District Fact Table: District_ID, PRODUCT_KEY, PERIOD_KEY, Dollars, Units, Price
Region Fact Table: Region_ID, PRODUCT_KEY, PERIOD_KEY, Dollars, Units, Price

55 Snowflake vs. Star Snowflake: The structure of the classifications is expressed in table schemas The fact table and dimension tables are normalized Star: The entire classification is expressed in just one table The fact table is normalized while in the dimension table the normalization is broken This leads to redundancy of information in the dimension tables

56 Snowflake vs. Star Snowflake Star

57 Snowflake vs. Star
Ease of maintenance / change. Star schema: Has redundant data and hence is less easy to maintain/change. Snowflake schema: No redundancy and hence easier to maintain and change.
Ease of use. Star schema: Less complex queries, easy to understand. Snowflake schema: More complex queries and hence less easy to understand.
Query performance. Star schema: Fewer foreign keys and hence shorter query execution time. Snowflake schema: More foreign keys and hence longer query execution time.
Type of data warehouse. Star schema: Good for data marts with simple relationships (1:1 or 1:many). Snowflake schema: Good to use for the data warehouse core to simplify complex relationships (many:many).
Joins. Star schema: Fewer joins. Snowflake schema: Higher number of joins.
Dimension table. Star schema: Contains only a single dimension table for each dimension. Snowflake schema: May have more than one dimension table for each dimension.
When to use. Star schema: When the dimension table contains a small number of rows. Snowflake schema: When the dimension table is relatively big in size, snowflaking is better as it reduces space.
Normalization / de-normalization. Star schema: Both dimension and fact tables are in de-normalized form. Snowflake schema: Dimension tables are in normalized form but the fact table is still in de-normalized form.
Data model. Star schema: Top-down approach. Snowflake schema: Bottom-up approach.

58 Snowflake to Star When should one go from Snowflake to Star? Heuristics-based decision When typical queries relate to coarser granularity (like product category) When the volume of data in the dimension tables is relatively low compared to the fact table In this case a star schema leads to negligible overhead through redundancy, but performance is improved When modifications on the classifications are rare compared to insertion of fact data In this case these modifications are controlled through the data load process of the ETL, reducing the risk of data anomalies

59 Snowflake or Star? Which one is the winner? It depends on the necessity: fast query processing or efficient space usage However, most of the time a mixed form is used The Starflake schema: some dimensions stay normalized corresponding to the snowflake schema, while others are denormalized according to the star schema The decision on how to deal with the dimensions is influenced by Frequency of the modifications: if the dimensions change often, normalization leads to better results Amount of dimension elements: the bigger the dimension tables, the more space normalization saves Number of classification levels in a dimension: more classification levels introduce more redundancy in the star schema Materialization of aggregates for the dimension levels: if the aggregates are materialized, a normalization of the dimension can bring better response time

60 More Schemas Galaxies In practice we usually have more measures described by different dimensions Thus, more fact tables

61 More Schemas Fact constellations: pre-calculated aggregates Factless fact tables: fact tables that do not have non-key data Can be used for event tracking or to inventory the set of possible occurrences A factless fact table does not have any measures For example, consider a record of student attendance in classes. In this case, the fact table would consist of 3 dimensions: the student dimension, the time dimension, and the class dimension.

62 More Schemas Factless fact tables This factless fact table would look like the following:
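
As a rough sketch of the attendance example, a factless fact table can be modeled as rows that contain only dimension keys; the analysis then consists of counting rows. The keys and dates below are hypothetical.

```python
from collections import Counter

# Each row records one attendance event: (student_id, date, class_id).
# There is no measure column; the existence of the row is the fact.
attendance = [
    (1, "2024-03-01", "DW101"),
    (2, "2024-03-01", "DW101"),
    (1, "2024-03-02", "DM201"),
]

# "How many attendances per class?" is answered by counting rows.
attendance_per_class = Counter(class_id for _, _, class_id in attendance)
# "Which classes did student 1 attend?" is a selection on the student key.
classes_of_student_1 = sorted({c for s, _, c in attendance if s == 1})

print(attendance_per_class["DW101"])  # 2
print(classes_of_student_1)           # ['DM201', 'DW101']
```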

63 Relational Model Relational model disadvantages The representation of the multidimensional data can be implemented relationally with a finite set of transformation steps, however: Multidimensional queries have to be first translated to the relational representation A direct interaction with the relational data model is not fit for the end user

64 Multidimensional Model The basic data structure for multidimensional data storage is the array The elementary data structures are the cubes and the dimensions C = ((D_1, ..., D_n), (M_1, ..., M_m)) The storage is intuitive as arrays of arrays, physically linearized

65 Multidimensional Model Linearization example: 2D cube with |D_1| = 5, |D_2| = 4, cube cells = 20 Query: Jackets sold in March? The measure is stored in cube cell D_1[4], D_2[3] The 2D cube is physically stored as a linear array, so D_1[4], D_2[3] becomes array cell 14: (Index(D_2) − 1) · |D_1| + Index(D_1), i.e. linearized index = 2 · 5 + 4 = 14

66 Linearization Generalization: Given a cube C = ((D_1, D_2, ..., D_n), (M_1:Type_1, M_2:Type_2, ..., M_m:Type_m)), the index of a cube cell z with coordinates (x_1, x_2, ..., x_n) can be linearized as follows: Index(z) = x_1 + (x_2 − 1) · |D_1| + (x_3 − 1) · |D_1| · |D_2| + ... + (x_n − 1) · |D_1| · ... · |D_{n−1}| = 1 + Σ_{i=1}^{n} ((x_i − 1) · Π_{j=1}^{i−1} |D_j|)
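
The linearization formula translates directly into a short function. This is a minimal sketch using 1-based coordinates as on the slides; the function name and the bounds checks are additions for illustration.

```python
from math import prod


def linear_index(coords, sizes):
    """Return the 1-based linear index of a cell.

    coords: 1-based coordinates (x_1, ..., x_n)
    sizes:  dimension sizes (|D_1|, ..., |D_n|)
    """
    if len(coords) != len(sizes):
        raise ValueError("coords and sizes must have the same length")
    index = 1
    for i, x in enumerate(coords):
        if not 1 <= x <= sizes[i]:
            raise ValueError(f"coordinate {x} out of range for dimension {i + 1}")
        # (x_i - 1) times the product of all preceding dimension sizes;
        # prod of an empty slice is 1, which covers the first dimension.
        index += (x - 1) * prod(sizes[:i])
    return index


# The 2D example: |D_1| = 5, |D_2| = 4, cell D_1[4], D_2[3].
print(linear_index((4, 3), (5, 4)))  # 14
```

The first and last cells map to 1 and |D_1| · |D_2| = 20, so the 20 cells fill the linear array without gaps.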

67 Problems in Array-Storage Influence of the order of the dimensions in the cube definition In the cube the cells of D 2 are ordered one under the other e.g., sales of all pants involves a column in the cube After linearization, the information is spread among more data blocks/pages If one considers a data block can hold 5 cells, a query over all products sold in January can be answered with just 1 block read, but a query of all sold pants, involves reading 4 blocks
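
The block counts quoted above can be reproduced with a few lines. The sketch assumes the 5 × 4 cube from the linearization example, blocks of 5 cells, and that January is position 1 of D_2 and pants are position 2 of D_1; those two positions are assumptions, as the slides do not state them.

```python
def blocks_touched(cells, block_size):
    """Number of distinct blocks holding the given 1-based linear cell indexes."""
    return len({(index - 1) // block_size for index in cells})


D1, D2 = 5, 4    # |D_1| = 5 products, |D_2| = 4 months
BLOCK_SIZE = 5   # cells per data block

# All products sold in January: fix D_2 = 1 (assumed position), vary D_1.
january = [(1 - 1) * D1 + x1 for x1 in range(1, D1 + 1)]
# All pants sold: fix D_1 = 2 (assumed position), vary D_2.
pants = [(x2 - 1) * D1 + 2 for x2 in range(1, D2 + 1)]

print(january, blocks_touched(january, BLOCK_SIZE))  # [1, 2, 3, 4, 5] 1
print(pants, blocks_touched(pants, BLOCK_SIZE))      # [2, 7, 12, 17] 4
```

Swapping the order of the dimensions in the cube definition would swap the two results, which is why the order should follow the dominant query pattern.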

68 Problems in Array-Storage Solution: use caching techniques But caching and swapping are also performed by the operating system The MDBMS has to manage its caches such that the OS doesn't perform any damaging swaps Storage of dense cubes If cubes are dense, array storage is more efficient. However, operations suffer due to the large cubes Solution: store dense cubes not linearly but on 2 levels The first contains indexes and the second the data cells stored in blocks Optimization procedures like indexes (trees, bitmaps), physical partitioning, and compression (run-length encoding) can be used

69 Problems in Array-Storage Storage of sparse cubes All the cells of a cube, including empty ones, have to be stored Sparseness leads to data being stored in many physical blocks or pages The query speed is affected by the large number of block accesses on the secondary memory Solution: Do not store empty blocks or pages but adapt the index structure 2 level data structure: upper layer holds all possible combinations of the sparse dimensions, lower layer holds dense dimensions
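
A two-level structure for sparse cubes might look like the sketch below. It is a simplification under stated assumptions: the upper level is a dict that holds only the sparse-dimension combinations that actually occur, the lower level is a dense list over one dense dimension, and the class and key names are hypothetical.

```python
class TwoLevelCube:
    """Upper level: index over sparse-dimension combinations.
    Lower level: one dense block per combination that actually has data."""

    def __init__(self, dense_size):
        self.dense_size = dense_size
        self.blocks = {}  # sparse coordinates -> dense block (list of cells)

    def put(self, sparse_key, dense_index, value):
        # Allocate the dense block only when the combination first gets data.
        block = self.blocks.setdefault(sparse_key, [None] * self.dense_size)
        block[dense_index] = value

    def get(self, sparse_key, dense_index):
        block = self.blocks.get(sparse_key)
        return None if block is None else block[dense_index]


# Sparse dimensions: (client, store); dense dimension: month (0..11).
cube = TwoLevelCube(dense_size=12)
cube.put(("Client 7", "Store A"), 2, 99.0)

print(cube.get(("Client 7", "Store A"), 2))  # 99.0
print(cube.get(("Client 8", "Store A"), 2))  # None, no block was allocated
print(len(cube.blocks))                      # 1
```

Empty combinations cost no storage at the lower level, so the number of blocks grows with the data that exists rather than with the size of the full cross product.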

70 Problems in Array-Storage 2-level cube storage


More information

Data Warehousing and Decision Support

Data Warehousing and Decision Support Data Warehousing and Decision Support Chapter 23, Part A Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1 Introduction Increasingly, organizations are analyzing current and historical

More information

Data Warehousing and Decision Support. Introduction. Three Complementary Trends. [R&G] Chapter 23, Part A

Data Warehousing and Decision Support. Introduction. Three Complementary Trends. [R&G] Chapter 23, Part A Data Warehousing and Decision Support [R&G] Chapter 23, Part A CS 432 1 Introduction Increasingly, organizations are analyzing current and historical data to identify useful patterns and support business

More information

Tribhuvan University Institute of Science and Technology MODEL QUESTION

Tribhuvan University Institute of Science and Technology MODEL QUESTION MODEL QUESTION 1. Suppose that a data warehouse for Big University consists of four dimensions: student, course, semester, and instructor, and two measures count and avg-grade. When at the lowest conceptual

More information

Data warehouse design

Data warehouse design Database and data mining group, Data warehouse design DATA WAREHOUSE: DESIGN - Risk factors Database and data mining group, High user expectation the data warehouse is the solution of the company s problems

More information

Fig 1.2: Relationship between DW, ODS and OLTP Systems

Fig 1.2: Relationship between DW, ODS and OLTP Systems 1.4 DATA WAREHOUSES Data warehousing is a process for assembling and managing data from various sources for the purpose of gaining a single detailed view of an enterprise. Although there are several definitions

More information

Acknowledgment. MTAT Data Mining. Week 7: Online Analytical Processing and Data Warehouses. Typical Data Analysis Process.

Acknowledgment. MTAT Data Mining. Week 7: Online Analytical Processing and Data Warehouses. Typical Data Analysis Process. MTAT.03.183 Data Mining Week 7: Online Analytical Processing and Data Warehouses Marlon Dumas marlon.dumas ät ut. ee Acknowledgment This slide deck is a mashup of the following publicly available slide

More information

Data Warehousing and Decision Support

Data Warehousing and Decision Support Data Warehousing and Decision Support [R&G] Chapter 23, Part A CS 4320 1 Introduction Increasingly, organizations are analyzing current and historical data to identify useful patterns and support business

More information

FROM A RELATIONAL TO A MULTI-DIMENSIONAL DATA BASE

FROM A RELATIONAL TO A MULTI-DIMENSIONAL DATA BASE FROM A RELATIONAL TO A MULTI-DIMENSIONAL DATA BASE David C. Hay Essential Strategies, Inc In the buzzword sweepstakes of 1997, the clear winner has to be Data Warehouse. A host of technologies and techniques

More information

Data Warehousing and Decision Support (mostly using Relational Databases) CS634 Class 20

Data Warehousing and Decision Support (mostly using Relational Databases) CS634 Class 20 Data Warehousing and Decision Support (mostly using Relational Databases) CS634 Class 20 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke, Chapter 25 Introduction Increasingly,

More information

UNIT

UNIT UNIT 3.1 DATAWAREHOUSING UNIT 3 CHAPTER 1 1.Designing the Target Structure: Data warehouse design, Dimensional design, Cube and dimensions, Implementation of a dimensional model in a database, Relational

More information

Overview. Introduction to Data Warehousing and Business Intelligence. BI Is Important. What is Business Intelligence (BI)?

Overview. Introduction to Data Warehousing and Business Intelligence. BI Is Important. What is Business Intelligence (BI)? Introduction to Data Warehousing and Business Intelligence Overview Why Business Intelligence? Data analysis problems Data Warehouse (DW) introduction A tour of the coming DW lectures DW Applications Loosely

More information

Processing of Very Large Data

Processing of Very Large Data Processing of Very Large Data Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master studies, first

More information

Data Warehouse Design Using Row and Column Data Distribution

Data Warehouse Design Using Row and Column Data Distribution Int'l Conf. Information and Knowledge Engineering IKE'15 55 Data Warehouse Design Using Row and Column Data Distribution Behrooz Seyed-Abbassi and Vivekanand Madesi School of Computing, University of North

More information

CS614 - Data Warehousing - Midterm Papers Solved MCQ(S) (1 TO 22 Lectures)

CS614 - Data Warehousing - Midterm Papers Solved MCQ(S) (1 TO 22 Lectures) CS614- Data Warehousing Solved MCQ(S) From Midterm Papers (1 TO 22 Lectures) BY Arslan Arshad Nov 21,2016 BS110401050 BS110401050@vu.edu.pk Arslan.arshad01@gmail.com AKMP01 CS614 - Data Warehousing - Midterm

More information

Star Schema מחסני נתונים. Star Schema Example 1. Star Schema

Star Schema מחסני נתונים. Star Schema Example 1. Star Schema Star Schema In a star schema, each dimension table has a single-part primary key that links to one part of the multipart primary key in the fact table. מחסני נתונים תכנון לוגי של מסד נתונים רב מימדי באמצעות

More information

Summary of Last Chapter. Course Content. Chapter 2 Objectives. Data Warehouse and OLAP Outline. Incentive for a Data Warehouse

Summary of Last Chapter. Course Content. Chapter 2 Objectives. Data Warehouse and OLAP Outline. Incentive for a Data Warehouse Principles of Knowledge Discovery in bases Fall 1999 Chapter 2: Warehousing and Dr. Osmar R. Zaïane University of Alberta Dr. Osmar R. Zaïane, 1999 Principles of Knowledge Discovery in bases University

More information

B.H.GARDI COLLEGE OF MASTER OF COMPUTER APPLICATION. Ch. 1 :- Introduction Database Management System - 1

B.H.GARDI COLLEGE OF MASTER OF COMPUTER APPLICATION. Ch. 1 :- Introduction Database Management System - 1 Basic Concepts :- 1. What is Data? Data is a collection of facts from which conclusion may be drawn. In computer science, data is anything in a form suitable for use with a computer. Data is often distinguished

More information

Seminars of Software and Services for the Information Society. Data Warehousing Design Issues

Seminars of Software and Services for the Information Society. Data Warehousing Design Issues DIPARTIMENTO DI INGEGNERIA INFORMATICA AUTOMATICA E GESTIONALE ANTONIO RUBERTI Master of Science in Engineering in Computer Science (MSE-CS) Seminars in Software and Services for the Information Society

More information

REPORTING AND QUERY TOOLS AND APPLICATIONS

REPORTING AND QUERY TOOLS AND APPLICATIONS Tool Categories: REPORTING AND QUERY TOOLS AND APPLICATIONS There are five categories of decision support tools Reporting Managed query Executive information system OLAP Data Mining Reporting Tools Production

More information

Chapter 13 Business Intelligence and Data Warehouses The Need for Data Analysis Business Intelligence. Objectives

Chapter 13 Business Intelligence and Data Warehouses The Need for Data Analysis Business Intelligence. Objectives Chapter 13 Business Intelligence and Data Warehouses Objectives In this chapter, you will learn: How business intelligence is a comprehensive framework to support business decision making How operational

More information

Data Warehousing and OLAP

Data Warehousing and OLAP Data Warehousing and OLAP INFO 330 Slides courtesy of Mirek Riedewald Motivation Large retailer Several databases: inventory, personnel, sales etc. High volume of updates Management requirements Efficient

More information

1. Attempt any two of the following: 10 a. State and justify the characteristics of a Data Warehouse with suitable examples.

1. Attempt any two of the following: 10 a. State and justify the characteristics of a Data Warehouse with suitable examples. Instructions to the Examiners: 1. May the Examiners not look for exact words from the text book in the Answers. 2. May any valid example be accepted - example may or may not be from the text book 1. Attempt

More information

Introduction to Data Warehousing

Introduction to Data Warehousing ICS 321 Spring 2012 Introduction to Data Warehousing Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 4/23/2012 Lipyeow Lim -- University of Hawaii at Manoa

More information

Logical design DATA WAREHOUSE: DESIGN Logical design. We address the relational model (ROLAP)

Logical design DATA WAREHOUSE: DESIGN Logical design. We address the relational model (ROLAP) atabase and ata Mining Group of atabase and ata Mining Group of B MG ata warehouse design atabase and ata Mining Group of atabase and data mining group, M B G Logical design ATA WAREHOUSE: ESIGN - 37 Logical

More information

DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY

DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY CHARACTERISTICS Data warehouse is a central repository for summarized and integrated data

More information

What is a Data Warehouse?

What is a Data Warehouse? What is a Data Warehouse? COMP 465 Data Mining Data Warehousing Slides Adapted From : Jiawei Han, Micheline Kamber & Jian Pei Data Mining: Concepts and Techniques, 3 rd ed. Defined in many different ways,

More information

A Novel Approach of Data Warehouse OLTP and OLAP Technology for Supporting Management prospective

A Novel Approach of Data Warehouse OLTP and OLAP Technology for Supporting Management prospective A Novel Approach of Data Warehouse OLTP and OLAP Technology for Supporting Management prospective B.Manivannan Research Scholar, Dept. Computer Science, Dravidian University, Kuppam, Andhra Pradesh, India

More information

IT DATA WAREHOUSING AND DATA MINING UNIT-2 BUSINESS ANALYSIS

IT DATA WAREHOUSING AND DATA MINING UNIT-2 BUSINESS ANALYSIS PART A 1. What are production reporting tools? Give examples. (May/June 2013) Production reporting tools will let companies generate regular operational reports or support high-volume batch jobs. Such

More information

Proceedings of the IE 2014 International Conference AGILE DATA MODELS

Proceedings of the IE 2014 International Conference  AGILE DATA MODELS AGILE DATA MODELS Mihaela MUNTEAN Academy of Economic Studies, Bucharest mun61mih@yahoo.co.uk, Mihaela.Muntean@ie.ase.ro Abstract. In last years, one of the most popular subjects related to the field of

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 07 : 06/11/2012 Data Mining: Concepts and Techniques (3 rd ed.) Chapter

More information

Data Warehousing Conclusion. Esteban Zimányi Slides by Toon Calders

Data Warehousing Conclusion. Esteban Zimányi Slides by Toon Calders Data Warehousing Conclusion Esteban Zimányi ezimanyi@ulb.ac.be Slides by Toon Calders Motivation for the Course Database = a piece of software to handle data: Store, maintain, and query Most ideal system

More information

Exam Datawarehousing INFOH419 July 2013

Exam Datawarehousing INFOH419 July 2013 Exam Datawarehousing INFOH419 July 2013 Lecturer: Toon Calders Student name:... The exam is open book, so all books and notes can be used. The use of a basic calculator is allowed. The use of a laptop

More information

MIS2502: Data Analytics Dimensional Data Modeling. Jing Gong

MIS2502: Data Analytics Dimensional Data Modeling. Jing Gong MIS2502: Data Analytics Dimensional Data Modeling Jing Gong gong@temple.edu http://community.mis.temple.edu/gong Where we are Now we re here Data entry Transactional Database Data extraction Analytical

More information

Syllabus. Syllabus. Motivation Decision Support. Syllabus

Syllabus. Syllabus. Motivation Decision Support. Syllabus Presentation: Sophia Discussion: Tianyu Metadata Requirements and Conclusion 3 4 Decision Support Decision Making: Everyday, Everywhere Decision Support System: a class of computerized information systems

More information

OLAP2 outline. Multi Dimensional Data Model. A Sample Data Cube

OLAP2 outline. Multi Dimensional Data Model. A Sample Data Cube OLAP2 outline Multi Dimensional Data Model Need for Multi Dimensional Analysis OLAP Operators Data Cube Demonstration Using SQL Multi Dimensional Data Model Multi dimensional analysis is a popular approach

More information

Advanced Data Management Technologies

Advanced Data Management Technologies ADMT 2018/19 Unit 5 J. Gamper 1/48 Advanced Data Management Technologies Unit 5 Logical Design and DW Applications J. Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Acknowledgements:

More information

COGNOS (R) 8 GUIDELINES FOR MODELING METADATA FRAMEWORK MANAGER. Cognos(R) 8 Business Intelligence Readme Guidelines for Modeling Metadata

COGNOS (R) 8 GUIDELINES FOR MODELING METADATA FRAMEWORK MANAGER. Cognos(R) 8 Business Intelligence Readme Guidelines for Modeling Metadata COGNOS (R) 8 FRAMEWORK MANAGER GUIDELINES FOR MODELING METADATA Cognos(R) 8 Business Intelligence Readme Guidelines for Modeling Metadata GUIDELINES FOR MODELING METADATA THE NEXT LEVEL OF PERFORMANCE

More information

Query Processing with Indexes. Announcements (February 24) Review. CPS 216 Advanced Database Systems

Query Processing with Indexes. Announcements (February 24) Review. CPS 216 Advanced Database Systems Query Processing with Indexes CPS 216 Advanced Database Systems Announcements (February 24) 2 More reading assignment for next week Buffer management (due next Wednesday) Homework #2 due next Thursday

More information

Advanced Multidimensional Reporting

Advanced Multidimensional Reporting Guideline Advanced Multidimensional Reporting Product(s): IBM Cognos 8 Report Studio Area of Interest: Report Design Advanced Multidimensional Reporting 2 Copyright Copyright 2008 Cognos ULC (formerly

More information

Data Warehouse and Data Mining

Data Warehouse and Data Mining Data Warehouse and Data Mining Lecture No. 02 Lifecycle of Data warehouse Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro

More information

Data warehouses Decision support The multidimensional model OLAP queries

Data warehouses Decision support The multidimensional model OLAP queries Data warehouses Decision support The multidimensional model OLAP queries Traditional DBMSs are used by organizations for maintaining data to record day to day operations On-line Transaction Processing

More information

Overview. DW Performance Optimization. Aggregates. Aggregate Use Example

Overview. DW Performance Optimization. Aggregates. Aggregate Use Example Overview DW Performance Optimization Choosing aggregates Maintaining views Bitmapped indices Other optimization issues Original slides were written by Torben Bach Pedersen Aalborg University 07 - DWML

More information

MIS2502: Data Analytics Dimensional Data Modeling. Jing Gong

MIS2502: Data Analytics Dimensional Data Modeling. Jing Gong MIS2502: Data Analytics Dimensional Data Modeling Jing Gong gong@temple.edu http://community.mis.temple.edu/gong Where we are Now we re here Data entry Transactional Database Data extraction Analytical

More information

Data Warehouse and Data Mining

Data Warehouse and Data Mining Data Warehouse and Data Mining Lecture No. 09 Plannning Data Warehouse Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro

More information

Sql Fact Constellation Schema In Data Warehouse With Example

Sql Fact Constellation Schema In Data Warehouse With Example Sql Fact Constellation Schema In Data Warehouse With Example Data Warehouse OLAP - Learn Data Warehouse in simple and easy steps using Multidimensional OLAP (MOLAP), Hybrid OLAP (HOLAP), Specialized SQL

More information

Chapter 4, Data Warehouse and OLAP Operations

Chapter 4, Data Warehouse and OLAP Operations CSI 4352, Introduction to Data Mining Chapter 4, Data Warehouse and OLAP Operations Young-Rae Cho Associate Professor Department of Computer Science Baylor University CSI 4352, Introduction to Data Mining

More information

Data Modeling and Databases Ch 7: Schemas. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich

Data Modeling and Databases Ch 7: Schemas. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich Data Modeling and Databases Ch 7: Schemas Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich Database schema A Database Schema captures: The concepts represented Their attributes

More information

CHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI

CHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI CHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS Assist. Prof. Dr. Volkan TUNALI Topics 2 Business Intelligence (BI) Decision Support System (DSS) Data Warehouse Online Analytical Processing (OLAP)

More information

Decision Support Systems

Decision Support Systems Decision Support Systems 2011/2012 Week 3. Lecture 6 Previous Class Dimensions & Measures Dimensions: Item Time Loca0on Measures: Quan0ty Sales TransID ItemName ItemID Date Store Qty T0001 Computer I23

More information

Management Information Systems MANAGING THE DIGITAL FIRM, 12 TH EDITION FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT

Management Information Systems MANAGING THE DIGITAL FIRM, 12 TH EDITION FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT MANAGING THE DIGITAL FIRM, 12 TH EDITION Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT VIDEO CASES Case 1: Maruti Suzuki Business Intelligence and Enterprise Databases

More information

Data Warehouse Logical Design. Letizia Tanca Politecnico di Milano (with the kind support of Rosalba Rossato)

Data Warehouse Logical Design. Letizia Tanca Politecnico di Milano (with the kind support of Rosalba Rossato) Data Warehouse Logical Design Letizia Tanca Politecnico di Milano (with the kind support of Rosalba Rossato) Data Mart logical models MOLAP (Multidimensional On-Line Analytical Processing) stores data

More information

Data Warehouse Testing. By: Rakesh Kumar Sharma

Data Warehouse Testing. By: Rakesh Kumar Sharma Data Warehouse Testing By: Rakesh Kumar Sharma Index...2 Introduction...3 About Data Warehouse...3 Data Warehouse definition...3 Testing Process for Data warehouse:...3 Requirements Testing :...3 Unit

More information

Data Mining. Part 2. Data Understanding and Preparation. 2.4 Data Transformation. Spring Instructor: Dr. Masoud Yaghini. Data Transformation

Data Mining. Part 2. Data Understanding and Preparation. 2.4 Data Transformation. Spring Instructor: Dr. Masoud Yaghini. Data Transformation Data Mining Part 2. Data Understanding and Preparation 2.4 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Introduction Normalization Attribute Construction Aggregation Attribute Subset Selection Discretization

More information

DATA MINING TRANSACTION

DATA MINING TRANSACTION DATA MINING Data Mining is the process of extracting patterns from data. Data mining is seen as an increasingly important tool by modern business to transform data into an informational advantage. It is

More information

Questions about the contents of the final section of the course of Advanced Databases. Version 0.3 of 28/05/2018.

Questions about the contents of the final section of the course of Advanced Databases. Version 0.3 of 28/05/2018. Questions about the contents of the final section of the course of Advanced Databases. Version 0.3 of 28/05/2018. 12 Decision support systems How would you define a Decision Support System? What do OLTP

More information

IDU0010 ERP,CRM ja DW süsteemid Loeng 5 DW concepts. Enn Õunapuu

IDU0010 ERP,CRM ja DW süsteemid Loeng 5 DW concepts. Enn Õunapuu IDU0010 ERP,CRM ja DW süsteemid Loeng 5 DW concepts Enn Õunapuu enn.ounapuu@ttu.ee Content Oveall approach Dimensional model Tabular model Overall approach Data modeling is a discipline that has been practiced

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 14 : 18/11/2014 Data Mining: Concepts and Techniques (3 rd ed.) Chapter

More information

Data Warehouses and OLAP. Database and Information Systems. Data Warehouses and OLAP. Data Warehouses and OLAP

Data Warehouses and OLAP. Database and Information Systems. Data Warehouses and OLAP. Data Warehouses and OLAP Database and Information Systems 11. Deductive Databases 12. Data Warehouses and OLAP 13. Index Structures for Similarity Queries 14. Data Mining 15. Semi-Structured Data 16. Document Retrieval 17. Web

More information

QUALITY MONITORING AND

QUALITY MONITORING AND BUSINESS INTELLIGENCE FOR CMS DATA QUALITY MONITORING AND DATA CERTIFICATION. Author: Daina Dirmaite Supervisor: Broen van Besien CERN&Vilnius University 2016/08/16 WHAT IS BI? Business intelligence is

More information

Multidimensional Queries

Multidimensional Queries Multidimensional Queries Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master studies, first semester

More information

Aggregating Knowledge in a Data Warehouse and Multidimensional Analysis

Aggregating Knowledge in a Data Warehouse and Multidimensional Analysis Aggregating Knowledge in a Data Warehouse and Multidimensional Analysis Rafal Lukawiecki Strategic Consultant, Project Botticelli Ltd rafal@projectbotticelli.com Objectives Explain the basics of: 1. Data

More information

Deccansoft Software Services Microsoft Silver Learning Partner. SSAS Syllabus

Deccansoft Software Services Microsoft Silver Learning Partner. SSAS Syllabus Overview: Analysis Services enables you to analyze large quantities of data. With it, you can design, create, and manage multidimensional structures that contain detail and aggregated data from multiple

More information

Advanced Data Management Technologies

Advanced Data Management Technologies ADMT 2017/18 Unit 13 J. Gamper 1/42 Advanced Data Management Technologies Unit 13 DW Pre-aggregation and View Maintenance J. Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Acknowledgements:

More information