Data Warehousing. Overview
- Gavin Burns
1 Data Warehousing Overview
- Basic Definitions
- Normalization
  - Entity Relationship Diagrams (ERDs)
  - Normal Forms
  - Many-to-Many relationships
- Warehouse Considerations
  - Dimension Tables
  - Fact Tables
  - Star Schema
  - Snowflake Schema
- Further Warehouse Design Considerations
  - Changing Dimensions
  - Conformed Dimensions
2 Data warehouse: A data warehouse is a copy of transaction data specifically structured for querying and reporting. Put another way, it is a collection of computerized data organized to best support reporting and analysis.
OLTP (On-Line Transaction Processing): OLTP describes a type of processing that databases are designed to support. OLTP applications need to support a high number of transactions per unit of time. A transaction is a set of Insert, Update, and sometimes Delete statements that must succeed or fail as a unit. Transactions typically perform such functions as recording orders and depleting inventory. Electronic banking and order processing are common OLTP applications.
OLAP (On-Line Analytical Processing): In its broadest usage, the term "OLAP" is used as a synonym for "data warehousing". The term "On-Line Analytical Processing" was coined to distinguish data warehousing activities from On-Line Transaction Processing. In a narrower usage, the term OLAP refers to the tools used for multidimensional analysis.
3 Sample Star Schema: When people speak of OLAP, however, they may properly be referring to a schema like this one in a relational database.
4 Database Normalization: Normalization reduces redundant data storage by organizing data efficiently. There are many ways to normalize a database consistently within a set of business requirements. Normalization reduces the potential for anomalies during data manipulation operations. Non-normalized databases are vulnerable to data anomalies because they store data redundantly. If data is stored in two locations but is later updated in only one of them, the data becomes inconsistent; this is referred to as an update anomaly. To avoid data anomalies, non-primary-key data in a normalized database is stored in only one location. If you need a Department's physical location, you should need to look only in the Department table.
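The update anomaly described above can be reproduced in a few lines. The following is a minimal sketch using Python's built-in sqlite3 module; the EmployeeFlat table, the sample names, and the locations are hypothetical and are not taken from the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Unnormalized (hypothetical) table: the department location is repeated
# on every employee row.
cur.execute(
    "CREATE TABLE EmployeeFlat (employeeid INTEGER PRIMARY KEY, "
    "lastname TEXT, deptname TEXT, deptlocation TEXT)"
)
cur.executemany(
    "INSERT INTO EmployeeFlat VALUES (?, ?, ?, ?)",
    [(1, "Adams", "Sales", "Boston"), (2, "Baker", "Sales", "Boston")],
)

# Updating only one of the two copies leaves the data inconsistent.
cur.execute("UPDATE EmployeeFlat SET deptlocation = 'Chicago' WHERE employeeid = 1")
flat_locations = cur.execute(
    "SELECT DISTINCT deptlocation FROM EmployeeFlat "
    "WHERE deptname = 'Sales' ORDER BY deptlocation"
).fetchall()

# Normalized design: the location is stored once, in Department.
cur.execute("CREATE TABLE Department (deptid INTEGER PRIMARY KEY, name TEXT, location TEXT)")
cur.execute(
    "CREATE TABLE Employee (employeeid INTEGER PRIMARY KEY, lastname TEXT, "
    "deptid INTEGER REFERENCES Department(deptid))"
)
cur.execute("INSERT INTO Department VALUES (10, 'Sales', 'Boston')")
cur.executemany("INSERT INTO Employee VALUES (?, ?, ?)", [(1, "Adams", 10), (2, "Baker", 10)])

# A single update changes the location for every employee in the department.
cur.execute("UPDATE Department SET location = 'Chicago' WHERE deptid = 10")
normalized_locations = cur.execute(
    "SELECT DISTINCT d.location FROM Department d "
    "INNER JOIN Employee e ON d.deptid = e.deptid"
).fetchall()

print(flat_locations)        # two conflicting locations for one department
print(normalized_locations)  # exactly one location
```

In the flat table the Sales department ends up with two different locations; in the normalized tables there is only one place where the location can be changed.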
5 Unnormalized Table: We could design a database so that each record about a given type of business object holds all the information we'd typically need about that type of object. But
6 This schema is more typical of a normalized database. We could generate the information on the previous page with this query:
Select e.employeeid, e.lastname, e.firstname, d.deptid, d.name, d.location
From Department d
Inner Join Employee e On d.deptid = e.deptid
7 When we normalize, we're building a logical hierarchy.
8 Entities: classes of objects that are of interest from a business standpoint, about which information needs to be maintained. In the process of modeling, they evolve into database tables. Entities are always nouns in business narratives (but not all nouns in business narratives are entities). Examples: Employee, Department, Project. Entities must have attributes, or properties, that need to be known; these become columns. Employee: Name, Birth Date, Salary. Department: Name, Number, Location. Each entity is representative of a class of objects, and each instance of a well-formed entity will map to a row in a table. Each instance of an entity must be uniquely distinguishable from other instances of the same entity. An attribute or set of attributes that uniquely identifies an instance of an entity is called a Unique Identifier (UID).
9 Relationship: a bi-directional, significant association between two entities, or between an entity and itself. Each (direction of a) relationship has:
- Name
- Optionality: either "must be" or "may be"
- Degree/Cardinality/Ordinality: 1:1 or 1:M (or M:M). Degree = 0 is expressed as "may be".
Examples: Each employee must be assigned to one and only one department. Each department may be responsible for one or more employees.
Our definitions for entities, attributes, and relationships must hold equally for every instance, not only for the normal case. This point is critically important.
10 First Normal Form: 1NF requires that each attribute store only one value. There can be no repeating groups (that is, no multivalued attributes). Each attribute of the table is said to be atomic. For example, each record in the Home table below should have only one owner. Each cell, which is the intersection of a row and a column, can contain only one value. Mention the PK convention. Unnormalized Entity: What if some homes have more than three owners? How would we write stored procedures to read from this table?
11 To support multiple owners we need another entity: This will always be the case when an entity has a 1:M relationship with one of its attributes. Both entities are now in 1NF.
12 Second Normal Form: To be in 2NF, a table must be in 1NF. In addition, each non-key attribute must be dependent on all parts of its primary key. There must be no partial key dependencies. In the previous example:
- The Home entity is not in 2NF. The Mayor attribute doesn't depend on the entire primary key. We need a new entity.
- The Owner entity is not in 2NF. The Price of Tea does not depend on the Owner. We decide not to track this attribute.
In normalizing to 2NF, we attempt to reduce the amount of redundant data in a table by extracting it, placing it in new tables, and creating relationships between tables.
13 Tables are now in 2NF.
14 Third Normal Form To be in 3NF, a table must be in 2NF. Additionally, all attributes that are not wholly dependent upon the primary key must be remodeled. Each table attribute can depend on nothing other than its primary key. 3NF = Every non-key attribute must depend on the key, the whole key, and nothing but the key. In the previous example: Sun sign depends on birth date, so it should be stored in a different table. A general modeling principle we see here is that when an attribute depends on another attribute, a new table will be necessary to model the relationship.
15 Entities are now in 3NF.
16 Modeling the M:M relationship How do we record the owners of individual homes?
17 We need an intermediate table that has a M:1 relationship with each of its parent tables.
18 The query below shows the name of each home's owner(s).
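The query for this slide is not preserved in the transcript. As a hedged illustration of the intermediate-table design described above, here is a minimal sketch using Python's built-in sqlite3 module. The table name HomeOwner, all column names, and the sample rows are assumptions made for the example, not the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical schema: HomeOwner is the intermediate table with an M:1
# relationship to each of its parents, Home and Owner.
cur.executescript("""
CREATE TABLE Home (homeid INTEGER PRIMARY KEY, address TEXT);
CREATE TABLE Owner (ownerid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE HomeOwner (
    homeid INTEGER REFERENCES Home(homeid),
    ownerid INTEGER REFERENCES Owner(ownerid),
    PRIMARY KEY (homeid, ownerid)
);
INSERT INTO Home VALUES (1, '12 Elm St'), (2, '9 Oak Ave');
INSERT INTO Owner VALUES (1, 'Ann'), (2, 'Bob');
-- Home 1 has two owners; Bob owns two homes.
INSERT INTO HomeOwner VALUES (1, 1), (1, 2), (2, 2);
""")

# List each home together with the name of each of its owners.
rows = cur.execute("""
SELECT h.address, o.name
FROM Home h
INNER JOIN HomeOwner ho ON ho.homeid = h.homeid
INNER JOIN Owner o ON o.ownerid = ho.ownerid
ORDER BY h.address, o.name
""").fetchall()

for address, name in rows:
    print(address, name)
```

Because ownership lives in its own table, a home can have any number of owners without adding columns to Home.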
19 General Remarks: The definitions of normal forms provide guidelines for relational database design. Occasionally, it is necessary to stray from them to meet practical business requirements in an OLTP environment. There is no single best way to normalize a database to conform with a specific set of business requirements. Insert, Update, and Delete operations generally run more quickly in a normalized database. Complex Select statements generally run more slowly, because they need more joins.
20 Reasons to denormalize The fundamental reason to denormalize is to improve query performance. Consider the case of City, State, and CityStateZip tables. These tables can be designed to conform to the third normal form. But each time you need to write a query to extract Customer data, you will need to join data from four tables. If no valid business reason exists to divide city, state, and ZIP Code information into separate tables, then it may make sense to denormalize. Dimension tables in a star schema are intentionally denormalized.
21 Normalized database:
- Many narrow tables (i.e. fewer columns)
- Optimized for Insert, Update, and Delete operations
- Slower Select statements because of the need for frequent join operations
- Few indexes
- Necessary for large OLTP applications
Non-normalized database:
- Fewer (but wider) tables
- Faster Select statements because we don't need to join as often
- Transactions are more problematic because of the need to maintain redundant instances of data during Insert, Update, and Delete operations
- Many indexes because data is relatively static
- Necessary for large relational OLAP applications
22 Data Warehouses Data warehouses and data marts are storage mechanisms for read-only, historical, aggregated data. Consider this example: we sell 2 products, dog food and cat food. Each day, we record the sales of each product. Here is some sample OLTP data for a couple of days:
23 Our data warehouse would usually not record this level of detail. Instead, in a warehouse we would summarize, or aggregate, the data to daily totals. Our records in the data warehouse might look something like this: Here we have reduced the number of records by aggregating the individual transaction records into daily records that show the number of each product purchased each day. We can certainly generate this data set from the OLTP system by running a query
24 but if we want to view our data as aggregated numbers broken down along a series of criteria (i.e. so-called "by" conditions), then query performance will improve if we store data in a denormalized format. That's exactly what we do when implementing a star schema. It's important to realize that OLTP is not meant to be the basis of a decision support system. OLTP applications are optimized for activities such as recording (high numbers of) orders. A system optimized for processing transactions is not optimized to perform complex analyses designed to uncover hidden trends. Therefore, rather than tie up our OLTP system by performing expensive queries, we should build a less normalized structure that conforms better to our query needs.
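The sample data tables for the dog food and cat food example are not preserved in this transcript. The following minimal sketch, using Python's built-in sqlite3 module, shows the kind of aggregation query meant above; the SalesTransaction table, its columns, the dates, and the quantities are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical OLTP table: one row per individual sale.
cur.execute(
    "CREATE TABLE SalesTransaction (saleid INTEGER PRIMARY KEY, "
    "saledate TEXT, product TEXT, quantity INTEGER)"
)
cur.executemany(
    "INSERT INTO SalesTransaction (saledate, product, quantity) VALUES (?, ?, ?)",
    [
        ("1999-03-01", "Dog food", 2),
        ("1999-03-01", "Dog food", 1),
        ("1999-03-01", "Cat food", 4),
        ("1999-03-02", "Dog food", 3),
        ("1999-03-02", "Cat food", 1),
        ("1999-03-02", "Cat food", 2),
    ],
)

# Aggregate the individual transactions into one row per product per day,
# which is the level of detail the warehouse would store.
daily_totals = cur.execute(
    "SELECT saledate, product, SUM(quantity) AS quantity "
    "FROM SalesTransaction "
    "GROUP BY saledate, product "
    "ORDER BY saledate, product"
).fetchall()

for row in daily_totals:
    print(row)
```

Six transaction rows collapse into four daily rows. Running this kind of query repeatedly against the OLTP system is the cost that storing the aggregated rows in a warehouse is meant to avoid.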
25 The Warehouse: Typical business questions that drive warehouse design:
- How many units did we sell last week?
- Are overall sales of individual products or product categories higher or lower this year than in previous years?
- On a quarterly or monthly basis, are sales for some products/categories cyclical?
- In what regions are sales down this year? What products/categories in those regions account for the greatest percentage of the decrease?
Some characteristics of warehouse business questions:
- Many concern the element of time.
- Many questions require the aggregation of data; sums and counts are important in an OLAP environment, whereas individual transactions are important in an OLTP environment.
- Each question looks at data in terms of "by" conditions. "On a quarterly and then monthly basis, are Dairy Product sales cyclical?" = We need to see total sales of Dairy Products by quarter and by month.
26 These "by" conditions drive the design of our star schema. Each "by" condition is represented by a Dimension table.
27 Dimension Tables, General Remarks: Product and Geography are common dimensions. Date/Time information is almost always stored in a Dimension table. If our data happen to start on a particular date, do we care what sales have been since that date, or do we care more about how one year's sales compare to other years? Comparing one year to another is a common form of trend analysis accomplished through the use of a star schema.
28 Dimension Table Structure: Dimension tables should have a single-field primary key. This key is often an identity column. The value of the primary key is irrelevant; our information is stored in the other fields in the table. Because the fields are the full descriptions, the dimension tables are often wide, i.e. they contain many large fields. For example, if we have a Product dimension, then we'll have fields in it that contain the description, the category name, the sub-category name, etc. These fields do not contain codes that link us to other tables. Dimension tables are often small in terms of row count relative to Fact tables.
29 Dimensional Hierarchies (Denormalization): In a star schema, the entire hierarchy for a dimension is stored in its corresponding Dimension table in the data warehouse. The product dimension, for example, contains individual products. Products are normally grouped into categories, and these categories may contain sub-categories. For example, a product with a product number of M1652 may be a refrigerator. Thus it belongs in the major appliance category, and in the refrigerator sub-category. We may have more levels of sub-categories to further classify each product. In an OLAP environment, it is preferable to maintain the product hierarchy in a single table, although this hierarchy would certainly be distributed among Product, Category, and SubCategory tables in an OLTP environment. This hierarchy allows us to perform drill-down functions on the data. We can run a query that calculates sums by category. We can then drill down into a category by calculating sums for its subcategories. We can then calculate the sums for the individual products in a particular subcategory. The actual sums we are calculating are based on numbers stored in the fact table.
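The drill-down sequence described above can be sketched as two queries against a denormalized product dimension. This is a minimal sketch using Python's built-in sqlite3 module; apart from product number M1652 and its category and sub-category, the column names, the other products, and the sales figures are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical star-schema tables: the whole product hierarchy
# (category -> subcategory -> product) lives in one dimension table.
cur.executescript("""
CREATE TABLE ProductDimension (
    productid INTEGER PRIMARY KEY,
    productnumber TEXT,
    category TEXT,
    subcategory TEXT
);
CREATE TABLE SalesFact (
    productid INTEGER REFERENCES ProductDimension(productid),
    salesdollars INTEGER
);
INSERT INTO ProductDimension VALUES
    (1, 'M1652', 'Major Appliance', 'Refrigerator'),
    (2, 'F2000', 'Major Appliance', 'Freezer'),
    (3, 'T100',  'Small Appliance', 'Toaster');
INSERT INTO SalesFact VALUES (1, 500), (1, 700), (2, 300), (3, 40);
""")

# Level 1: total sales by category.
by_category = cur.execute("""
SELECT pd.category, SUM(sf.salesdollars)
FROM SalesFact sf
INNER JOIN ProductDimension pd ON pd.productid = sf.productid
GROUP BY pd.category
ORDER BY pd.category
""").fetchall()

# Level 2: drill down into one category by subcategory.
by_subcategory = cur.execute("""
SELECT pd.subcategory, SUM(sf.salesdollars)
FROM SalesFact sf
INNER JOIN ProductDimension pd ON pd.productid = sf.productid
WHERE pd.category = ?
GROUP BY pd.subcategory
ORDER BY pd.subcategory
""", ("Major Appliance",)).fetchall()

print(by_category)
print(by_subcategory)
```

Both levels use the same single join; only the grouping column changes, which is the practical benefit of keeping the hierarchy in one table.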
30 Fact tables: When we talk about the way we want to look at data, we usually want to see some sort of aggregated data. These data are called measures. Measures are numeric values that are measurable and additive. Sales dollars are a very common measure. The Number of Customers we have is also a typical measure. We'd probably track both of these by day. Fact tables are used to store measures, or facts, which are numeric and additive across some or all dimensions. In the following star schema, sales dollars are numeric, and we can examine total sales in terms of product, category, and time period. Fact tables are narrow in the sense that they contain few (and numeric) columns, but they do contain large numbers of rows. Fact tables are responsible for most of the disk space used in a warehouse.
31 Fact Table Granularity Granularity refers to the level of detail in a fact table and is one of the most important design decisions in data warehouse planning. Granularity is often determined by the time dimension. For example, you may elect to store only weekly or monthly totals for sales dollars. Granularity determines how far we can drill down without recourse to the source OLTP data. Many if not most OLAP systems have daily grain in the Time dimension. Selecting a finer grain results in more records in the fact table. Choose data types for fact table columns that keep the table as small as possible.
32 Aggregations: Fact table data consists of aggregations that are based on the fact table's granularity. Frequently we'll want to aggregate to a higher level. We may choose to keep total sales dollars at a quarterly or monthly level. We may be interested in only a particular product or category in this case. A better alternative is to build a cube structure
33 Simple Star Schema: To obtain total sales for all major appliances during March of 1999:
Select Sum(sf.salesdollars) As TotalSales
From SalesFact sf
Inner Join TimeDimension td On td.timeid = sf.timeid
Inner Join ProductDimension pd On pd.productid = sf.productid
Where pd.category = 'Major Appliance' And td.month = 3 And td.year = 1999
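To see the star-schema query above run end to end, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the query on this slide; the column types and every sample row are assumptions, since the schema diagram is not preserved in the transcript.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Assumed table layouts; only the names used in the slide's query are known.
cur.executescript("""
CREATE TABLE TimeDimension (timeid INTEGER PRIMARY KEY, month INTEGER, year INTEGER);
CREATE TABLE ProductDimension (productid INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE SalesFact (
    timeid INTEGER REFERENCES TimeDimension(timeid),
    productid INTEGER REFERENCES ProductDimension(productid),
    salesdollars INTEGER
);
INSERT INTO TimeDimension VALUES (1, 3, 1999), (2, 4, 1999);
INSERT INTO ProductDimension VALUES (1, 'Major Appliance'), (2, 'Small Appliance');
-- Made-up facts: only the first two rows match both filters.
INSERT INTO SalesFact VALUES (1, 1, 1000), (1, 1, 250), (1, 2, 75), (2, 1, 900);
""")

# Same query as on the slide, with the filter values passed as parameters.
total_sales = cur.execute("""
SELECT SUM(sf.salesdollars) AS TotalSales
FROM SalesFact sf
INNER JOIN TimeDimension td ON td.timeid = sf.timeid
INNER JOIN ProductDimension pd ON pd.productid = sf.productid
WHERE pd.category = ? AND td.month = ? AND td.year = ?
""", ("Major Appliance", 3, 1999)).fetchone()[0]

print(total_sales)
```

Each dimension is reached with exactly one join from the fact table, which is what gives the star schema its shape.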
34 Snowflake Schemas: Sometimes dimension tables have hierarchies broken out into separate tables. This results in a different schema type known as a snowflake. This is a more normalized structure, but it tends to lead to more difficult queries and slower response times. It does use less disk space than a star schema that contains the same data.
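The trade-off described above can be made concrete by asking the same question of both layouts. This is a minimal sketch using Python's built-in sqlite3 module; every table name, column name, and row is hypothetical (ProductSnow is simply the snowflaked counterpart of ProductDimension).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
-- Star layout: the hierarchy is repeated as text in one dimension table.
CREATE TABLE ProductDimension (
    productid INTEGER PRIMARY KEY, description TEXT,
    subcategory TEXT, category TEXT
);
-- Snowflake layout: the hierarchy is broken out into separate tables.
CREATE TABLE Category (categoryid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE SubCategory (
    subcategoryid INTEGER PRIMARY KEY, name TEXT,
    categoryid INTEGER REFERENCES Category(categoryid)
);
CREATE TABLE ProductSnow (
    productid INTEGER PRIMARY KEY, description TEXT,
    subcategoryid INTEGER REFERENCES SubCategory(subcategoryid)
);
CREATE TABLE SalesFact (productid INTEGER, salesdollars INTEGER);

INSERT INTO ProductDimension VALUES
    (1, 'M1652', 'Refrigerator', 'Major Appliance'),
    (2, 'T100', 'Toaster', 'Small Appliance');
INSERT INTO Category VALUES (1, 'Major Appliance'), (2, 'Small Appliance');
INSERT INTO SubCategory VALUES (1, 'Refrigerator', 1), (2, 'Toaster', 2);
INSERT INTO ProductSnow VALUES (1, 'M1652', 1), (2, 'T100', 2);
INSERT INTO SalesFact VALUES (1, 900), (1, 600), (2, 40);
""")

# Star: one join reaches the category.
star_totals = cur.execute("""
SELECT pd.category, SUM(sf.salesdollars)
FROM SalesFact sf
INNER JOIN ProductDimension pd ON pd.productid = sf.productid
GROUP BY pd.category ORDER BY pd.category
""").fetchall()

# Snowflake: three joins are needed to reach the same category name.
snowflake_totals = cur.execute("""
SELECT c.name, SUM(sf.salesdollars)
FROM SalesFact sf
INNER JOIN ProductSnow p ON p.productid = sf.productid
INNER JOIN SubCategory sc ON sc.subcategoryid = p.subcategoryid
INNER JOIN Category c ON c.categoryid = sc.categoryid
GROUP BY c.name ORDER BY c.name
""").fetchall()

print(star_totals)
print(snowflake_totals)
```

Both queries return the same totals. The snowflake version stores each category name once but needs two extra joins to reach it.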
35 Graphical comparison of Star and Snowflake schemas (panels: Star Schema; Snowflake Schema)
36 Further Warehouse Design Considerations Changing Dimensions In the schema below, consider a scenario in which we have realigned some of our stores, placing them in different territories and regions.
37 In the StoreDimension table, we have each store in a particular region, territory, and zone. If we simply update the StoreDimension table with new territory/region information, and then examine historical sales for a region, the numbers will no longer be accurate. To address this issue, consider creating new records for affected stores. Each new record contains the store's new region, while the old store records stay intact along with the old regional sales data. This approach, however, prevents us from comparing a store's current sales to its historical sales unless we keep track of its previous StoreID. This may require an extra field called PreviousStoreID or something similar. There are no right and wrong answers. Each case may require a different solution.
38 When building an enterprise warehouse from local data marts: It is necessary to produce a set of conformed dimensions. It will also be necessary to standardize the definitions of facts. A conformed dimension is a dimension that means the same thing with every possible fact table to which it can be joined. Generally, this means that a conformed dimension is identical in each data mart. The conformed Product dimension is the enterprise's agreed-upon master list of products, including all product attributes and all product rollups such as category, subcategory, and department. The conformed Calendar dimension will almost always be a table of individual days, spanning a decade or more. Each day will have many useful attributes drawn from the legal calendars of the various states and countries the enterprise deals with, as well as special fiscal calendar periods and marketing seasons relevant only to internal managers. Most conformed dimensions will naturally be defined at the most granular level possible. The grain of the Customer dimension will be the individual customer.
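One way to picture the new-record approach with a PreviousStoreID field is the following minimal sketch, using Python's built-in sqlite3 module. The StoreDimension and PreviousStoreID names come from the slide; the other columns, the SalesFact layout, and all rows are assumptions, and the lookup shown follows only a single realignment per store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
-- Assumed layout: previousstoreid points at the record this one supersedes.
CREATE TABLE StoreDimension (
    storeid INTEGER PRIMARY KEY, storename TEXT,
    region TEXT, previousstoreid INTEGER
);
CREATE TABLE SalesFact (
    storeid INTEGER REFERENCES StoreDimension(storeid),
    salesdollars INTEGER
);
-- Before the realignment: both stores are in the East region.
INSERT INTO StoreDimension VALUES (1, 'Store A', 'East', NULL), (2, 'Store B', 'East', NULL);
INSERT INTO SalesFact VALUES (1, 100), (2, 50);
-- Realignment: Store A moves to West. Add a new record, keep the old one.
INSERT INTO StoreDimension VALUES (3, 'Store A', 'West', 1);
-- New sales are recorded against the new record.
INSERT INTO SalesFact VALUES (3, 120), (2, 60);
""")

# Historical regional totals stay accurate: old Store A sales remain in East.
region_totals = cur.execute("""
SELECT sd.region, SUM(sf.salesdollars)
FROM SalesFact sf
INNER JOIN StoreDimension sd ON sd.storeid = sf.storeid
GROUP BY sd.region ORDER BY sd.region
""").fetchall()

# Store-level history: combine the current record with its predecessor.
store_a_total = cur.execute("""
SELECT SUM(sf.salesdollars)
FROM SalesFact sf
WHERE sf.storeid = ?
   OR sf.storeid = (SELECT previousstoreid FROM StoreDimension WHERE storeid = ?)
""", (3, 3)).fetchone()[0]

print(region_totals)
print(store_a_total)
```

A store realigned more than once would need the chain of previous IDs to be followed repeatedly, which is one reason each case may call for a different solution.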
39 Simplified Star Schema with Conformed Dimensions
40 Permissible Variations of Conformed Dimensions It is possible to create a subset of a conformed dimension table for certain data marts if you know that the domain of the associated fact table only contains that subset. For example, the Product table for a specific data mart may be restricted so as to include only those products manufactured at that location, if the data mart in question pertains to that location only.
DIPARTIMENTO DI INGEGNERIA INFORMATICA AUTOMATICA E GESTIONALE ANTONIO RUBERTI Master of Science in Engineering in Computer Science (MSE-CS) Seminars in Software and Services for the Information Society
More informationOLAP2 outline. Multi Dimensional Data Model. A Sample Data Cube
OLAP2 outline Multi Dimensional Data Model Need for Multi Dimensional Analysis OLAP Operators Data Cube Demonstration Using SQL Multi Dimensional Data Model Multi dimensional analysis is a popular approach
More informationAdvanced Multidimensional Reporting
Guideline Advanced Multidimensional Reporting Product(s): IBM Cognos 8 Report Studio Area of Interest: Report Design Advanced Multidimensional Reporting 2 Copyright Copyright 2008 Cognos ULC (formerly
More informationUnit 7: Basics in MS Power BI for Excel 2013 M7-5: OLAP
Unit 7: Basics in MS Power BI for Excel M7-5: OLAP Outline: Introduction Learning Objectives Content Exercise What is an OLAP Table Operations: Drill Down Operations: Roll Up Operations: Slice Operations:
More informationDC Area Business Objects Crystal User Group (DCABOCUG) Data Warehouse Architectures for Business Intelligence Reporting.
DC Area Business Objects Crystal User Group (DCABOCUG) Data Warehouse Architectures for Business Intelligence Reporting April 14, 2009 Whitemarsh Information Systems Corporation 2008 Althea Lane Bowie,
More informationDatabase design View Access patterns Need for separate data warehouse:- A multidimensional data model:-
UNIT III: Data Warehouse and OLAP Technology: An Overview : What Is a Data Warehouse? A Multidimensional Data Model, Data Warehouse Architecture, Data Warehouse Implementation, From Data Warehousing to
More informationStar Schema Design (Additonal Material; Partly Covered in Chapter 8) Class 04: Star Schema Design 1
Star Schema Design (Additonal Material; Partly Covered in Chapter 8) Class 04: Star Schema Design 1 Star Schema Overview Star Schema: A simple database architecture used extensively in analytical applications,
More information5-1McGraw-Hill/Irwin. Copyright 2007 by The McGraw-Hill Companies, Inc. All rights reserved.
5-1McGraw-Hill/Irwin Copyright 2007 by The McGraw-Hill Companies, Inc. All rights reserved. 5 hapter Data Resource Management Data Concepts Database Management Types of Databases McGraw-Hill/Irwin Copyright
More informationBest Practices in Data Modeling. Dan English
Best Practices in Data Modeling Dan English Objectives Understand how QlikView is Different from SQL Understand How QlikView works with(out) a Data Warehouse Not Throw Baby out with the Bathwater Adopt
More informationOverview. Introduction to Data Warehousing and Business Intelligence. BI Is Important. What is Business Intelligence (BI)?
Introduction to Data Warehousing and Business Intelligence Overview Why Business Intelligence? Data analysis problems Data Warehouse (DW) introduction A tour of the coming DW lectures DW Applications Loosely
More informationData Warehouse. Asst.Prof.Dr. Pattarachai Lalitrojwong
Data Warehouse Asst.Prof.Dr. Pattarachai Lalitrojwong Faculty of Information Technology King Mongkut s Institute of Technology Ladkrabang Bangkok 10520 pattarachai@it.kmitl.ac.th The Evolution of Data
More informationChapter 6. Foundations of Business Intelligence: Databases and Information Management VIDEO CASES
Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:
More informationProceedings of the IE 2014 International Conference AGILE DATA MODELS
AGILE DATA MODELS Mihaela MUNTEAN Academy of Economic Studies, Bucharest mun61mih@yahoo.co.uk, Mihaela.Muntean@ie.ase.ro Abstract. In last years, one of the most popular subjects related to the field of
More informationCreate Cube From Star Schema Grouping Framework Manager
Create Cube From Star Schema Grouping Framework Manager Create star schema groupings to provide authors with logical groupings of query Connect to an OLAP data source (cube) in a Framework Manager project
More information1. Analytical queries on the dimensionally modeled database can be significantly simpler to create than on the equivalent nondimensional database.
1. Creating a data warehouse involves using the functionalities of database management software to implement the data warehouse model as a collection of physically created and mutually connected database
More informationEvolution of Database Systems
Evolution of Database Systems Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Intelligent Decision Support Systems Master studies, second
More informationSAS Data Integration Studio 3.3. User s Guide
SAS Data Integration Studio 3.3 User s Guide The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2006. SAS Data Integration Studio 3.3: User s Guide. Cary, NC: SAS Institute
More informationData Warehouse and Data Mining
Data Warehouse and Data Mining Lecture No. 02 Introduction to Data Warehouse Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology
More informationDepartment of Industrial Engineering. Sharif University of Technology. Operational and enterprises systems. Exciting directions in systems
Department of Industrial Engineering Sharif University of Technology Session# 9 Contents: The role of managers in Information Technology (IT) Organizational Issues Information Technology Operational and
More informationA Star Schema Has One To Many Relationship Between A Dimension And Fact Table
A Star Schema Has One To Many Relationship Between A Dimension And Fact Table Many organizations implement star and snowflake schema data warehouse The fact table has foreign key relationships to one or
More informationData Warehousing. Data Warehousing and Mining. Lecture 8. by Hossen Asiful Mustafa
Data Warehousing Data Warehousing and Mining Lecture 8 by Hossen Asiful Mustafa Databases Databases are developed on the IDEA that DATA is one of the critical materials of the Information Age Information,
More informationACS-2914 Normalization March 2009 NORMALIZATION 2. Ron McFadyen 1. Normalization 3. De-normalization 3
NORMALIZATION 2 Normalization 3 De-normalization 3 Functional Dependencies 4 Generating functional dependency maps from database design maps 5 Anomalies 8 Partial Functional Dependencies 10 Transitive
More informationETL and OLAP Systems
ETL and OLAP Systems Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master studies, first semester
More informationData warehouses Decision support The multidimensional model OLAP queries
Data warehouses Decision support The multidimensional model OLAP queries Traditional DBMSs are used by organizations for maintaining data to record day to day operations On-line Transaction Processing
More informationCHAPTER 3 Implementation of Data warehouse in Data Mining
CHAPTER 3 Implementation of Data warehouse in Data Mining 3.1 Introduction to Data Warehousing A data warehouse is storage of convenient, consistent, complete and consolidated data, which is collected
More informationAcknowledgment. MTAT Data Mining. Week 7: Online Analytical Processing and Data Warehouses. Typical Data Analysis Process.
MTAT.03.183 Data Mining Week 7: Online Analytical Processing and Data Warehouses Marlon Dumas marlon.dumas ät ut. ee Acknowledgment This slide deck is a mashup of the following publicly available slide
More informationChapter 6 VIDEO CASES
Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:
More informationData Warehouse Logical Design. Letizia Tanca Politecnico di Milano (with the kind support of Rosalba Rossato)
Data Warehouse Logical Design Letizia Tanca Politecnico di Milano (with the kind support of Rosalba Rossato) Data Mart logical models MOLAP (Multidimensional On-Line Analytical Processing) stores data
More informationThis tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing.
About the Tutorial A data warehouse is constructed by integrating data from multiple heterogeneous sources. It supports analytical reporting, structured and/or ad hoc queries and decision making. This
More informationData Warehousing. Adopted from Dr. Sanjay Gunasekaran
Data Warehousing Adopted from Dr. Sanjay Gunasekaran Main Topics Overview of Data Warehouse Concept of Data Conversion Importance of Data conversion and the steps involved Common Industry Methodology Outline
More informationDatabase Vs. Data Warehouse
Database Vs. Data Warehouse Similarities and differences Databases and data warehouses are used to generate different types of information. Information generated by both are used for different purposes.
More informationData Warehousing and OLAP Technology for Primary Industry
Data Warehousing and OLAP Technology for Primary Industry Taehan Kim 1), Sang Chan Park 2) 1) Department of Industrial Engineering, KAIST (taehan@kaist.ac.kr) 2) Department of Industrial Engineering, KAIST
More informationThe Data Organization
C V I T F E P A O TM The Data Organization Best Practices Metadata Dictionary Application Architecture Prepared by Rainer Schoenrank January 2017 Table of Contents 1. INTRODUCTION... 3 1.1 PURPOSE OF THE
More informationDevelopment of an interface that allows MDX based data warehouse queries by less experienced users
Development of an interface that allows MDX based data warehouse queries by less experienced users Mariana Duprat André Monat Escola Superior de Desenho Industrial 400 Introduction Data analysis is a fundamental
More informationHandout 12 Data Warehousing and Analytics.
Handout 12 CS-605 Spring 17 Page 1 of 6 Handout 12 Data Warehousing and Analytics. Operational (aka transactional) system a system that is used to run a business in real time, based on current data; also
More informationIST722 Data Warehousing
IST722 Data Warehousing Dimensional Modeling Michael A. Fudge, Jr. Pop Quiz: T/F 1. The business meaning of a fact table row is known as a dimension. 2. A dimensional data model is optimized for maximum
More informationOracle Database 11g: Data Warehousing Fundamentals
Oracle Database 11g: Data Warehousing Fundamentals Duration: 3 Days What you will learn This Oracle Database 11g: Data Warehousing Fundamentals training will teach you about the basic concepts of a data
More informationChapter 3. Foundations of Business Intelligence: Databases and Information Management
Chapter 3 Foundations of Business Intelligence: Databases and Information Management THE DATA HIERARCHY TRADITIONAL FILE PROCESSING Organizing Data in a Traditional File Environment Problems with the traditional
More informationWKU-MIS-B10 Data Management: Warehousing, Analyzing, Mining, and Visualization. Management Information Systems
Management Information Systems Management Information Systems B10. Data Management: Warehousing, Analyzing, Mining, and Visualization Code: 166137-01+02 Course: Management Information Systems Period: Spring
More informationFull file at
Chapter 2 Data Warehousing True-False Questions 1. A real-time, enterprise-level data warehouse combined with a strategy for its use in decision support can leverage data to provide massive financial benefits
More informationPASS4TEST. IT Certification Guaranteed, The Easy Way! We offer free update service for one year
PASS4TEST IT Certification Guaranteed, The Easy Way! \ http://www.pass4test.com We offer free update service for one year Exam : BI0-130 Title : Cognos 8 BI Modeler Vendors : COGNOS Version : DEMO Get
More informationNormalization in DBMS
Unit 4: Normalization 4.1. Need of Normalization (Consequences of Bad Design-Insert, Update & Delete Anomalies) 4.2. Normalization 4.2.1. First Normal Form 4.2.2. Second Normal Form 4.2.3. Third Normal
More informationCSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores
CSE 544 Principles of Database Management Systems Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores Announcements Shumo office hours change See website for details HW2 due next Thurs
More informationThe COSMIC Functional Size Measurement Method Version 4.0.1
The COSMIC Functional Size Measurement Method Version 4.0.1 Guideline for sizing Data Warehouse Application Software Version 1.1 April 2015 Acknowledgements Reviewers of v1.1 (alphabetical order) Diana
More information