Multidimensional modeling using MIDEA
JOSÉ MARÍA CAVERO 1, MARIO PIATTINI 2, ESPERANZA MARCOS 1
1 Kybele Research Group, Escuela Superior de Ciencias Experimentales y Tecnología, Universidad Rey Juan Carlos, C/ Tulipán s/n, Móstoles (Madrid)
2 Escuela Superior de Informática, Universidad de Castilla-La Mancha, Ronda de Calatrava 5, Ciudad Real
SPAIN

Abstract: - Developing a data warehouse has become a critical success factor for many companies. Specific issues, such as conceptual modeling, scheme translation from operational systems, and physical design, have been widely treated. Unfortunately, there is still no generally accepted, complete methodology for data warehouse design. In this work we present MIDEA, a data warehouse development methodology based on a multidimensional data model, and an application example of its conceptual modeling activity.

Key-Words: - Database design, Multidimensional modeling, Data warehouse design

1 Introduction

A data warehouse is a subject-oriented, integrated, non-volatile, and time-variant collection of data in support of management's decisions [6]. The concept is closely related to OLAP technology, a term first introduced by E. F. Codd in 1993 to characterize the requirements of aggregation, consolidation, view production, formula application and data synthesis in many dimensions [2]. A data warehouse is a repository of information, coming mainly from online transactional processing (OLTP) systems, that provides data for analytical processing and decision support.

The multidimensional view of data is a very old concept: managers observe the evolution of data of interest organized in dimensions such as products, clients, promotions, points of sale and, of course, time. The need to access all the historical information of the operational systems simply and rapidly has pushed companies to look for new ways of structuring and accessing their data, in order to gain an advantage over their competitors.
There is agreement that traditional database systems are not appropriate for multidimensional data analysis. Traditional OLTP systems are optimized to provide high performance when processing many concurrent transactions, each of which usually affects very few records. Multidimensional systems, in contrast, have to answer complex (and sometimes unpredictable) queries that involve a huge number of records [1]. In fact, OLTP is profoundly different from dimensional data warehousing in its users, its data content and structures, its hardware and software, its administration and management, and its daily rhythms [7].

2 Data warehouse design

OLTP and OLAP environments are profoundly different. Therefore, the techniques used for operational database design are inappropriate for data warehouse design [7, 8]. Developing a data warehouse requires integrating data that come mainly from legacy systems. Like any other task that involves integrating preexisting resources, the process is profoundly complex. It is labor-intensive, error-prone, and generally frustrating, leading a number of warehousing projects to be abandoned mid-way through development [13].

In recent years there have been many proposals restricted to particular aspects of the data warehouse design process. However, although many solutions have been developed for interesting sub-problems like handling multidimensional data as typical requirement for data warehouses, view maintenance for aggregated data, data integration etc., combining these partial and often very abstract and formal solutions to an overall design methodology and warehousing strategy is still left over to the practitioners [4]. Despite the obvious importance of having methodological support for the development of
OLAP systems, the design process has received very little attention from the scientific community and from product vendors.

Models commonly used for operational database design (like the E/R model) should not be used without further ado for the design of analytical environments. For purely technical reasons, databases obtained from E/R models are inappropriate for decision support systems, in which query performance and data loading (including incremental loading) are important [7]. The multidimensional paradigm should be used not only in database queries, but also during design and maintenance. To use the multidimensional paradigm during all development phases it is necessary to define dedicated conceptual, logical and physical data models for the paradigm and to develop a sound methodology which gives guidelines how to create and transform these models during the development process [3]. The authors of [14] call for data warehouse design methodologies and tools with appropriate support for aggregation hierarchies, mapping between the multidimensional and the relational models, and cost models for partitioning and aggregation that can be used from the early design stages.

There are a few proposals for data warehouse design [1, 8, 15], most of them incomplete or focused on the relational model from their initial phases. There are also many partial proposals, focused on issues such as model translation, view materialization, indexing, etc. For example, [12] proposes using data mining techniques in the data warehouse design phases (for instance, using data mining algorithms to discover implicit information in data, to resolve conflicts in scheme integration, or to recover lost values and incorrect data).
The problem with all these works is that they propose a new, different methodology for data warehouse design, so organizations must use at least two totally different methodologies: one for OLTP environments and one for OLAP environments. We think it is better to integrate data warehouse design into existing methodologies, modifying activities and adding new ones, so that the training effort and learning curve for data warehouse design are reduced.

3 MIDEA methodology

In this paper we propose a methodology integrated into a preexisting traditional methodology. It has been developed in the EINSTEIN project, a research and development project that applies the experience and knowledge obtained in relational database system development over the last decade (SQL, ER modeling, CASE tools, methodologies...) to MultiDimensional DataBase (MDDB) design. The project is based on the following three points:
- IDEA, a multidimensional conceptual model used to understand and represent the requirements of analytical users, in the same way that the ER model is used to interact with micro-data users [11].
- IDEA-DWCASE, a CASE tool that supports multidimensional modeling using IDEA [10]. IDEA-DWCASE incorporates a graphical interface and allows the translation of a conceptual IDEA scheme into a logical scheme based on a model supported by some multidimensional or relational product. Figure 1 shows a window of the tool prototype, including an IDEA schema, whose graphical notation is based on [5].
- MIDEA, a data warehouse development methodology. In the following subsection, we outline the main characteristics of the methodology.

Figure 1. IDEA-DWCASE

3.1 Methodology overview

The MIDEA methodology uses as its reference framework the Spanish public methodology METRICA version 3 (MV3), which is similar to the British SSADM or the French Merise [9].
The MV3 processes considered are those on which data warehouse development has the most influence, that is, Information System Analysis, Design and Construction (ASI, DSI and CSI). The new processes, modified from the MV3 proposal, have been named ASI-MD (MultiDimensional), DSI-MD and CSI-MD. Of course, considering only these three processes does not mean that the other processes should be ignored in a data warehouse development; rather, we consider that for them the differences with respect to any other information system development should not be significant. Every process of the methodology is divided into activities, and every activity is divided into tasks. The
order of the activities does not imply a necessary sequential order. The activities can be carried out in a different order, or in parallel, overlapping tasks of different activities. However, a process should not be considered finished until every one of its activities has been completed. For every process, a graphic emphasizing its most important activities is included. In the following, we offer a general overview of the three processes of the methodology, graphically outlined in figure 2.

[Figure 2. MIDEA processes (ASI-MD, DSI-MD, CSI-MD). Labels: Expert users (business analysts, specialists, ...); Existing DB (ER conceptual schemes); Conceptual Modeling; Conceptual Scheme (using IDEA model); Translation into a multidimensional model (MOLAP); Pure Multidimensional Logical Scheme; Translation into a relational model (ROLAP); Relational with Multidimensional issues Logical Scheme; Construction; Data Warehouse (MOLAP); Data Warehouse (ROLAP).]

Analysis of the Information System (ASI-MD)

The basic purpose of the ASI-MD process is to obtain a detailed specification of the data warehouse. This specification has to satisfy the information needs of users (business analysts, specialists, ...) and serve as a basis for the design. Information gathering is mainly done in activity ASI-MD 2, Obtaining Detailed Requirements. The General Requirements Catalogue and the high-level schemes obtained during the Feasibility Study are used as starting points in this phase. If the Feasibility Study was not carried out, the Catalogue should be produced in the first activity of this process. The Catalogue consists of a set of generic, user-oriented requirements. These products should be refined with users in work sessions, so that the data warehouse requirements are specified in more detail. In addition, the non-functional requirements of the data warehouse have to be identified, that is, the constraints it has to satisfy regarding performance, security, etc.
The purpose of activity ASI-MD 2 is to define a detailed and validated Requirements Catalogue, which serves as a basis for testing the correctness of the schemes obtained in activity ASI-MD 3, Data Warehouse Conceptual Modeling. The latter activity contains a verification and validation task in which the schema must be reviewed to guarantee that it is complete, complies with the Requirements Catalogue, and meets some predetermined quality criteria. User participation is essential to this process, because it guarantees that the requirements initially identified have been understood and incorporated into the system and, therefore, that the system will be accepted.

As an example, we next outline activity ASI-MD 3, Data Warehouse Conceptual Modeling. We also briefly explain the construction of an example schema, in which we model information about the products sold by a company, the sales obtained, and the average price of each product. The example is graphically represented in figure 1.

The Data Warehouse Conceptual Modeling activity has seven tasks. Due to space restrictions, the input and output artifacts, techniques and participants of each task are not specified in this paper. The purpose of this activity is to obtain, using the IDEA model, the multidimensional conceptual schema of the data warehouse. We use as inputs the requirements catalogue and the existing ER schemes.

The first step towards the multidimensional conceptual scheme is to obtain a preliminary scheme, which is produced during the first and second tasks. The purpose of Task 1, Obtaining Preliminary Sub-cell Structures, is to obtain preliminary sub-cell structures. These preliminary structures represent events occurring dynamically in the enterprise world [5], such as the sales of a company, movements in a bank account, etc. At this moment it is not mandatory to detail the attributes and synthesis functions (sum, average, ...) that make up every sub-cell structure.
We are still only interested in preliminary, generic structures. The information needed for modeling preliminary sub-cell structures can come from different sources. On the one hand, and most importantly, there is the opinion of expert users: they know what their problems are and which data they need for their daily work. These data are usually numeric, continuously valued, and additive [7]. On the other hand, if we have an ER conceptual scheme of a database, these preliminary sub-cell
structures usually correspond to attributes of some of its entities or N:M relationships. We can also use the earlier multidimensional conceptual scheme produced in activity ASI-MD 1.

The preliminary sub-cell structures detected so far represent the variables of interest to the company. The next step (Task 2, Obtaining Preliminary Dimensions) is to detect the dimensions that should take part in them, that is, how the values detected in Task 1 can be aggregated. At this moment, users have to think about dimensions in a very general manner; they do not need to detail the attributes of the dimension hierarchies. For example, users must think of dimensions such as time, space, etc., but not of attributes such as days-months-years or delegation-province-country. This top-down way of working can be complemented by studying the conceptual schemes of the operational databases, observing the attributes of the entities and relationships connected (directly or through others) to those identified as facts in the ER scheme. Those attributes could give us clues about hidden dimensions not detected by users. If we have a general multidimensional scheme as output of activity ASI-MD 1, it can be used as another information source.

At this point we have a preliminary scheme with a set of preliminary sub-cell structures defined over some dimensions. In our example, the preliminary sub-cell structures could be the items sold and the sales (in dollars). The preliminary dimensions are those along which we can define our facts, that is, time, stores, and products: each sale is made at a point in time (or during an interval) in a store and corresponds to some product.

The purpose of Task 3, Obtaining Preliminary Hierarchies, is to identify dimensions and their hierarchies more precisely. We have to identify every dimension, describing its sub-hierarchies (if any) and the aggregations of each sub-hierarchy.
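The preliminary scheme of the example (two sub-cell structures defined over time, stores and products) can be pictured as a small data structure. The sketch below is only an illustration and is not part of IDEA or IDEA-DWCASE: the class names `Dimension` and `SubCell`, the helper `undeclared_dimensions`, and the level names are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Dimension:
    name: str
    # Aggregation levels from finest to coarsest (illustrative names only);
    # the list may stay empty while the hierarchy is still undefined.
    levels: list = field(default_factory=list)


@dataclass
class SubCell:
    name: str
    dimensions: list  # names of the dimensions the sub-cell is defined over


def undeclared_dimensions(sub_cells, dimensions):
    """Return (sub-cell, dimension) pairs that reference an unknown dimension."""
    known = {d.name for d in dimensions}
    return [(s.name, d) for s in sub_cells for d in s.dimensions if d not in known]


# Preliminary scheme of the running example: items sold and sales in dollars,
# both defined over time, store and product.
dimensions = [
    Dimension("time", ["week", "month"]),
    Dimension("store", ["store", "city"]),
    Dimension("product", ["product", "type"]),
]
sub_cells = [
    SubCell("items_sold", ["time", "store", "product"]),
    SubCell("sales_dollars", ["time", "store", "product"]),
]

problems = undeclared_dimensions(sub_cells, dimensions)
```

A check like `undeclared_dimensions` is one simple way to notice, while working with users, that a sub-cell structure mentions a dimension nobody has described yet.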
In Task 3 it is not yet necessary to detail the domains of every aggregation, nor the aggregation functions. New dimensions could be detected at this point. A typical example is time, which is sometimes absent from the elementary databases but is essential in every data warehouse.

The next step, done in Task 4 (Obtaining Detailed Hierarchies), is to refine the hierarchies obtained in the previous step. This refinement consists of a detailed enumeration of the sub-hierarchy attributes of each dimension. New attributes may be detected, useless attributes eliminated, and properties of hierarchy attributes detected and converted into description attributes. For example, telephone or address attributes are usually only properties of dimension attributes (description attributes). For every attribute, a domain must be defined or a previously defined domain assigned. The aggregation hierarchy of each domain must be specified, detailing the aggregation functions. At this point the dimensions and their hierarchies are completely defined.

In our example, with the help of the users and the available ER schemes, we can refine our preliminary dimensions, as shown in figure 3. Each dimension consists of a set of attributes organized in hierarchies. A dimension attribute can have description attributes, whose purpose is to describe some dimension element (for example, the address of a store). Aggregations between dimension attributes must also be specified (for example, the city of every store).

[Figure 3. Dimensions. Labels recoverable from the figure: Week, Month; Product, Type, Size, Manufacturer; Store, City, Address.]

Now we have to study the sub-cell structures in detail. This is done in Task 5, Obtaining Detailed Sub-cell Structures. For each sub-cell structure, its attribute and its synthesis functions must be specified. Every synthesis function should be studied with respect to the dimensions that affect it.
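To make the relation between hierarchies and synthesis functions concrete, here is a minimal sketch that rolls elementary sales up from store to city and from product to type using the sum as synthesis function. The member names, the figures, and the function `roll_up` are hypothetical and chosen only for illustration; they are not taken from the IDEA model or its tool.

```python
from collections import defaultdict

# Hypothetical dimension members: each elementary member maps to its parent
# level in the hierarchy (store -> city, product -> type).
STORE_CITY = {"S1": "Madrid", "S2": "Madrid", "S3": "Ciudad Real"}
PRODUCT_TYPE = {"P1": "Dairy", "P2": "Dairy", "P3": "Bakery"}

# Elementary facts: (week, store, product, quantity sold, sales in dollars).
FACTS = [
    (1, "S1", "P1", 10, 25.0),
    (1, "S2", "P1", 4, 10.0),
    (1, "S3", "P3", 6, 18.0),
    (2, "S1", "P2", 5, 20.0),
]


def roll_up(facts, store_level, product_level):
    """Sum quantity and sales after mapping each member to a coarser level."""
    totals = defaultdict(lambda: [0, 0.0])
    for week, store, product, quantity, sales in facts:
        key = (week, store_level[store], product_level[product])
        totals[key][0] += quantity
        totals[key][1] += sales
    return {key: tuple(value) for key, value in totals.items()}


by_city_and_type = roll_up(FACTS, STORE_CITY, PRODUCT_TYPE)
```

Both sub-cell attributes are additive along every dimension used here, which is why a plain sum is enough; a non-additive attribute would need a different synthesis function for at least some dimensions.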
Some synthesis functions may not be applicable along some dimension (for example, it makes no sense to add temperatures along time). Synthesis functions are usually sums, but average, maximum, minimum, etc. can also be considered.

At this point we have sub-cell structures with associated dimensions. In Task 6, Obtaining Fact Schemes, we have to group these sub-cell structures into cell structures to form fact schemes. Every cell structure belongs to a fact scheme, and every fact scheme has associated dimensions. Therefore, sub-cell structures that could be joined must be detected. This join can be one of the following:
- Joining two sub-cell structures into one, because they both represent the same fact (they are duplicated). Their synthesis attributes and dimensions should be the same. The resulting synthesis functions will be the union of the synthesis functions of both sub-cell structures, with duplicates removed. The applicability of the synthesis functions to dimensions and dimension attributes should be reviewed.
- Aggregation of two sub-cell structures into one, or of one sub-cell structure into one previously detected. In this case their dimensions should be similar. We may be interested in joining sub-cell structures whose dimensions are not the same; in that case, aggregations must be studied with respect to the new dimensions (whether they are aggregable or not).

Of course, we always have the possibility of joining all the sub-cell structures into a single fact scheme, but in that case many of the synthesis attributes might not be aggregable, and the resulting scheme would be unreadable.

After these last tasks, our example is almost complete. Our sub-cell structures are the following: one for the quantity of products sold, and one for the sales made (in dollars). The synthesis function for both of them is the sum, that is, to obtain the aggregated data we have to sum the elementary data from the transactional databases. We do not need any other sub-cell structure to obtain the average price of the products sold, because we can calculate it by means of a formula (we call it a method). Figure 4 shows the complete schema of the example, whose graphical representation corresponds to figure 1.

Figure 4. Final IDEA conceptual schema

Finally, in Task 7, Multidimensional Scheme Verification and Validation, the conceptual multidimensional scheme has to be verified and validated, ensuring that it is complete and that it conforms to the requirements catalogue and to some predefined quality criteria. Some other verifications should be done, such as the availability of data from the elementary DB. If those data are not available, a modification of the elementary DB could perhaps be planned.

Design of the Information System (DSI-MD)

This process describes the activities necessary to obtain the data warehouse design, starting from the Software Requirements Specification obtained in the ASI-MD process. The design process describes how to implement the elements detected in the analysis process. The following tests are designed in this process: query tests, query consistency tests and data warehouse acceptance tests.

Because there is no standard or commonly accepted multidimensional logical model, the data design process is done in one step, from the conceptual model to a specific (that is, product-dependent) logical model. This one-step logical design is carried out in activity DSI-MD 2 in the case of MOLAP systems, or in DSI-MD 3 for ROLAP systems. Beforehand (in activity DSI-MD 1), the appropriate technology (ROLAP or MOLAP) and product must be chosen. The process also has three activities focused on physical design, whose purpose is to carry out and tune the physical design starting from the logical design obtained in the previous activities (DSI-MD 2 and DSI-MD 3).

Construction of the Information System (CSI-MD)

The main purposes of this process are the coding and testing of the data warehouse, starting from the design specification obtained in the DSI-MD process. The tests made during this process focus on queries and query consistency. Acceptance tests will be carried out during system deployment.

4 Conclusion

The development of a data warehouse has turned into a critical success factor for many companies, and there is a need for methodologies that consider the special characteristics of this kind of system. MIDEA, a general multidimensional methodology based on a public Spanish methodology proposal, has been developed. MIDEA integrates OLTP and OLAP database design in a single methodology. A CASE tool (IDEA-DWCASE) supports part of the methodology. A first prototype of the tool is available and was presented in [10]. It allows the creation of IDEA multidimensional conceptual schemes, and their
translation into different logical schemes directly supported by MOLAP or ROLAP products. At this moment, the IDEA-DWCASE tool translates IDEA schemes into EXPRESS and ORACLE.

Acknowledgement: This work is being carried out as part of the MIDAS project. MIDAS is partially financed by the Spanish Government and the European Community (reference number: 2FD ).

References:
[1] L. Cabibbo and R. Torlone. "A Logical Approach to Multidimensional Databases". In Sixth International Conference on Extending Database Technology (EDBT'98), Valencia, Spain, Lecture Notes in Computer Science 1377, Springer-Verlag.
[2] E. F. Codd, S. B. Codd and C. T. Salley. "Providing OLAP (On-Line Analytical Processing) to User-Analysts: An IT Mandate". Technical Report, E. F. Codd and Associates, 1993.
[3] B. Dinter, C. Sapia, M. Blaschka and G. Höfling. "OLAP Market and Research: Initiating the Cooperation". Journal of Computer Science and Information Management, Vol. 2, N. 3.
[4] S. Gatziu, M. A. Jeusfeld, M. Staudt and Y. Vassiliou. "Design and Management of Data Warehouses - Report on the DMDW'99 Workshop". SIGMOD Record 28(4), Dec.
[5] M. Golfarelli, D. Maio and S. Rizzi. "Conceptual design of data warehouses from E/R schemes". In: 31st Hawaii International Conference on System Sciences.
[6] W. H. Inmon. Building the Data Warehouse. John Wiley & Sons, New York, 1993.
[7] R. Kimball. The Data Warehouse Toolkit: Practical techniques for building dimensional data warehouses. John Wiley & Sons, 1996.
[8] R. Kimball, L. Reeves, M. Ross and W. Thornthwaite. The Data Warehouse Lifecycle Toolkit. John Wiley & Sons, Inc., 1998.
[9] A. de Miguel et al. "METRICA Version 3: Planning and Development Methodology of Information Systems. Designing a Methodology: A practical experience". In Proceedings of the CIICC'98, Aguascalientes, Mexico, Nov.
[10] A. de Miguel et al. "IDEA-DWCASE: Modeling multidimensional databases". EDBT 2000 Software Demonstrations track, Konstanz, Germany, March.
[11] A. Sánchez, J.M.
Cavero and A. de Miguel. "IDEA: A conceptual multidimensional data model and some methodological implications". Proceedings of the CIICC'99, Cancún, Mexico.
[12] C. Sapia, G. Höfling, M. Müller, C. Hausdorf, H. Stoyan and U. Grimmer. "On Supporting the Data Warehouse Design by Data Mining Techniques". To appear in GI-Workshop: Data Mining and Data Warehousing, September 1999, Magdeburg, Germany.
[13] J. Srivastava and P-Y. Chen. "Warehouse Creation - A Potential Roadblock to Data Warehousing". IEEE Transactions on Knowledge and Data Engineering, Vol. 11, Num. 1, Jan/Feb 1999.
[14] M.C. Wu and A.P. Buchmann. "Research Issues in Data Warehousing". BTW'97, Ulm, March.
[15] M. Golfarelli and S. Rizzi. "Designing the data warehouse: key steps and crucial issues". Journal of Computer Science and Information Management, Vol. 2, N. 3, 1999.
More informationData Warehousing. Overview
Data Warehousing Overview Basic Definitions Normalization Entity Relationship Diagrams (ERDs) Normal Forms Many to Many relationships Warehouse Considerations Dimension Tables Fact Tables Star Schema Snowflake
More informationData Warehousing and Decision Support
Data Warehousing and Decision Support Chapter 23, Part A Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1 Introduction Increasingly, organizations are analyzing current and historical
More informationDATA MINING TRANSACTION
DATA MINING Data Mining is the process of extracting patterns from data. Data mining is seen as an increasingly important tool by modern business to transform data into an informational advantage. It is
More informationData Mining & Data Warehouse
Data Mining & Data Warehouse Associate Professor Dr. Raed Ibraheem Hamed University of Human Development, College of Science and Technology (1) 2016 2017 1 Points to Cover Why Do We Need Data Warehouses?
More informationA Systems Approach to Dimensional Modeling in Data Marts. Joseph M. Firestone, Ph.D. White Paper No. One. March 12, 1997
1 of 8 5/24/02 4:43 PM A Systems Approach to Dimensional Modeling in Data Marts By Joseph M. Firestone, Ph.D. White Paper No. One March 12, 1997 OLAP s Purposes And Dimensional Data Modeling Dimensional
More information1 DATAWAREHOUSING QUESTIONS by Mausami Sawarkar
1 DATAWAREHOUSING QUESTIONS by Mausami Sawarkar 1) What does the term 'Ad-hoc Analysis' mean? Choice 1 Business analysts use a subset of the data for analysis. Choice 2: Business analysts access the Data
More informationQ1) Describe business intelligence system development phases? (6 marks)
BUISINESS ANALYTICS AND INTELLIGENCE SOLVED QUESTIONS Q1) Describe business intelligence system development phases? (6 marks) The 4 phases of BI system development are as follow: Analysis phase Design
More informationData Warehousing and Decision Support. Introduction. Three Complementary Trends. [R&G] Chapter 23, Part A
Data Warehousing and Decision Support [R&G] Chapter 23, Part A CS 432 1 Introduction Increasingly, organizations are analyzing current and historical data to identify useful patterns and support business
More informationData Warehousing and Decision Support
Data Warehousing and Decision Support [R&G] Chapter 23, Part A CS 4320 1 Introduction Increasingly, organizations are analyzing current and historical data to identify useful patterns and support business
More informationSyllabus. Syllabus. Motivation Decision Support. Syllabus
Presentation: Sophia Discussion: Tianyu Metadata Requirements and Conclusion 3 4 Decision Support Decision Making: Everyday, Everywhere Decision Support System: a class of computerized information systems
More informationData Mining. Data warehousing. Hamid Beigy. Sharif University of Technology. Fall 1394
Data Mining Data warehousing Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1394 1 / 22 Table of contents 1 Introduction 2 Data warehousing
More informationData warehousing in telecom Industry
Data warehousing in telecom Industry Dr. Sanjay Srivastava, Kaushal Srivastava, Avinash Pandey, Akhil Sharma Abstract: Data Warehouse is termed as the storage for the large heterogeneous data collected
More informationWKU-MIS-B10 Data Management: Warehousing, Analyzing, Mining, and Visualization. Management Information Systems
Management Information Systems Management Information Systems B10. Data Management: Warehousing, Analyzing, Mining, and Visualization Code: 166137-01+02 Course: Management Information Systems Period: Spring
More informationDATA WAREHOUSING IN LIBRARIES FOR MANAGING DATABASE
DATA WAREHOUSING IN LIBRARIES FOR MANAGING DATABASE Dr. Kirti Singh, Librarian, SSD Women s Institute of Technology, Bathinda Abstract: Major libraries have large collections and circulation. Managing
More informationEnterprise Informatization LECTURE
Enterprise Informatization LECTURE Piotr Zabawa, PhD. Eng. IBM/Rational Certified Consultant e-mail: pzabawa@pk.edu.pl www: http://www.pk.edu.pl/~pzabawa/en 07.10.2011 Lecture 5 Analytical tools in business
More informationAn Overview of Data Warehousing and OLAP Technology
An Overview of Data Warehousing and OLAP Technology CMPT 843 Karanjit Singh Tiwana 1 Intro and Architecture 2 What is Data Warehouse? Subject-oriented, integrated, time varying, non-volatile collection
More informationCourse Computer Science Academic year 2015/16 Subject Databases II ECTS 6
Course Computer Science Academic year 2015/16 Subject Databases II ECTS 6 Type of course Compulsory Year 3rd Semester 2nd semester Student Workload: Professor(s) José Carlos fonseca Total 168 Contact 75
More informationData Warehouse and Data Mining
Data Warehouse and Data Mining Lecture No. 05 Data Modeling Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro Data Modeling
More informationData Warehouse and Data Mining
Data Warehouse and Data Mining Lecture No. 07 Terminologies Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro Database
More informationA Comprehensive Method for Data Warehouse Design
A Comprehensive Method for Data Warehouse Design Sergio Luján-Mora and Juan Trujillo Department of Software and Computing Systems University of Alicante (Spain) {slujan,jtrujillo}@dlsi.ua.es Abstract.
More informationChapter 6 VIDEO CASES
Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:
More informationCT75 DATA WAREHOUSING AND DATA MINING DEC 2015
Q.1 a. Briefly explain data granularity with the help of example Data Granularity: The single most important aspect and issue of the design of the data warehouse is the issue of granularity. It refers
More informationManaging Data Resources
Chapter 7 Managing Data Resources 7.1 2006 by Prentice Hall OBJECTIVES Describe basic file organization concepts and the problems of managing data resources in a traditional file environment Describe how
More informationInformation Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 05(b) : 23/10/2012 Data Mining: Concepts and Techniques (3 rd ed.) Chapter
More informationChapter 4, Data Warehouse and OLAP Operations
CSI 4352, Introduction to Data Mining Chapter 4, Data Warehouse and OLAP Operations Young-Rae Cho Associate Professor Department of Computer Science Baylor University CSI 4352, Introduction to Data Mining
More informationDATA MINING AND WAREHOUSING
DATA MINING AND WAREHOUSING Qno Question Answer 1 Define data warehouse? Data warehouse is a subject oriented, integrated, time-variant, and nonvolatile collection of data that supports management's decision-making
More informationThis tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining.
About the Tutorial Data Mining is defined as the procedure of extracting information from huge sets of data. In other words, we can say that data mining is mining knowledge from data. The tutorial starts
More informationA method for requirements elicitation of a Data Warehouse: An example
A method for requirements elicitation of a Data Warehouse: An example JORGE OLIVEIRA E SÁ Information Systems Department University of Minho Azurém Campus, 4800-058 Guimarães PORTUGAL jos@dsi.uminho.pt
More informationOn-Line Application Processing
On-Line Application Processing WAREHOUSING DATA CUBES DATA MINING 1 Overview Traditional database systems are tuned to many, small, simple queries. Some new applications use fewer, more time-consuming,
More informationWarehousing. Data Mining
On Line Application Processing Warehousing Data Cubes Data Mining 1 Overview Traditional database systems are tuned to many, small, simple queries. Some new applications use fewer, more timeconsuming,
More informationThe Use of Soft Systems Methodology for the Development of Data Warehouses
The Use of Soft Systems Methodology for the Development of Data Warehouses Roelien Goede School of Information Technology, North-West University Vanderbijlpark, 1900, South Africa ABSTRACT When making
More informationProceedings of the IE 2014 International Conference AGILE DATA MODELS
AGILE DATA MODELS Mihaela MUNTEAN Academy of Economic Studies, Bucharest mun61mih@yahoo.co.uk, Mihaela.Muntean@ie.ase.ro Abstract. In last years, one of the most popular subjects related to the field of
More informationIT DATA WAREHOUSING AND DATA MINING UNIT-2 BUSINESS ANALYSIS
PART A 1. What are production reporting tools? Give examples. (May/June 2013) Production reporting tools will let companies generate regular operational reports or support high-volume batch jobs. Such
More informationModelling Data Warehouses with Multiversion and Temporal Functionality
Modelling Data Warehouses with Multiversion and Temporal Functionality Waqas Ahmed waqas.ahmed@ulb.ac.be Université Libre de Bruxelles Poznan University of Technology July 9, 2015 ITBI DC Outline 1 Introduction
More informationSummary of Last Chapter. Course Content. Chapter 2 Objectives. Data Warehouse and OLAP Outline. Incentive for a Data Warehouse
Principles of Knowledge Discovery in bases Fall 1999 Chapter 2: Warehousing and Dr. Osmar R. Zaïane University of Alberta Dr. Osmar R. Zaïane, 1999 Principles of Knowledge Discovery in bases University
More informationPartner Presentation Faster and Smarter Data Warehouses with Oracle OLAP 11g
Partner Presentation Faster and Smarter Data Warehouses with Oracle OLAP 11g Vlamis Software Solutions, Inc. Founded in 1992 in Kansas City, Missouri Oracle Partner and reseller since 1995 Specializes
More informationData Warehousing & OLAP
CMPUT 391 Database Management Systems Data Warehousing & OLAP Textbook: 17.1 17.5 (first edition: 19.1 19.5) Based on slides by Lewis, Bernstein and Kifer and other sources University of Alberta 1 Why
More informationCSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores
CSE 544 Principles of Database Management Systems Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores Announcements Shumo office hours change See website for details HW2 due next Thurs
More informationOLAP Introduction and Overview
1 CHAPTER 1 OLAP Introduction and Overview What Is OLAP? 1 Data Storage and Access 1 Benefits of OLAP 2 What Is a Cube? 2 Understanding the Cube Structure 3 What Is SAS OLAP Server? 3 About Cube Metadata
More informationIMPLEMENTING STATISTICAL DOMAIN DATABASES IN POLAND. OPPORTUNITIES AND THREATS. Central Statistical Office in Poland
IMPLEMENTING STATISTICAL DOMAIN DATABASES IN POLAND. OPPORTUNITIES AND THREATS. Central Statistical Office in Poland Agenda 2 Background Current state The goal of the SDD Architecture Technologies Data
More informationData Warehouse Logical Design. Letizia Tanca Politecnico di Milano (with the kind support of Rosalba Rossato)
Data Warehouse Logical Design Letizia Tanca Politecnico di Milano (with the kind support of Rosalba Rossato) Data Mart logical models MOLAP (Multidimensional On-Line Analytical Processing) stores data
More informationInternational Journal of Computer Engineering and Applications, REQUIREMENT GATHERING FOR MODEL DRIVEN DESIGN OF DATAWAREHOUSE
International Journal of Computer Engineering and Applications, Volume XII, Issue I, Jan. 18, www.ijcea.com ISSN 2321-3469 REQUIREMENT GATHERING FOR MODEL DRIVEN DESIGN OF DATAWAREHOUSE Kuldeep Deshpande
More informationBUSINESS INTELLIGENCE AND OLAP
Volume 10, No. 3, pp. 68 76, 2018 Pro Universitaria BUSINESS INTELLIGENCE AND OLAP Dimitrie Cantemir Christian University Knowledge Horizons - Economics Volume 10, No. 3, pp. 68-76 P-ISSN: 2069-0932, E-ISSN:
More informationLectures for the course: Data Warehousing and Data Mining (IT 60107)
Lectures for the course: Data Warehousing and Data Mining (IT 60107) Week 1 Lecture 1 21/07/2011 Introduction to the course Pre-requisite Expectations Evaluation Guideline Term Paper and Term Project Guideline
More informationQUALITY ORIENTED FOR PHYSICAL DESIGN DATA WAREHOUSE
QUALITY ORIENTED FOR PHYSICAL DESIGN DATA WAREHOUSE Munawar, Naomie Salim and Roliana Ibrahim Department of Information System, Universiti Teknologi Malaysia, Malaysia E-Mail: an_moenawar@yahoo.com ABSTRACT
More informationBuilding a Data Warehouse step by step
Informatica Economică, nr. 2 (42)/2007 83 Building a Data Warehouse step by step Manole VELICANU, Academy of Economic Studies, Bucharest Gheorghe MATEI, Romanian Commercial Bank Data warehouses have been
More informationManagement Information Systems MANAGING THE DIGITAL FIRM, 12 TH EDITION FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT
MANAGING THE DIGITAL FIRM, 12 TH EDITION Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT VIDEO CASES Case 1: Maruti Suzuki Business Intelligence and Enterprise Databases
More informationDevelopment of an interface that allows MDX based data warehouse queries by less experienced users
Development of an interface that allows MDX based data warehouse queries by less experienced users Mariana Duprat André Monat Escola Superior de Desenho Industrial 400 Introduction Data analysis is a fundamental
More informationMOLAP Data Warehouse of a Software Products Servicing Call Center
MOLAP Data Warehouse of a Software Products Servicing Call Center Z. Kazi, B. Radulovic, D. Radovanovic and Lj. Kazi Technical faculty "Mihajlo Pupin" University of Novi Sad Complete Address: Technical
More informationData Mining. Data warehousing. Hamid Beigy. Sharif University of Technology. Fall 1396
Data Mining Data warehousing Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1396 1 / 31 Table of contents 1 Introduction 2 Data warehousing
More informationDKMS Brief No. Five: Is Data Staging Relational? A Comment
1 of 6 5/24/02 3:39 PM DKMS Brief No. Five: Is Data Staging Relational? A Comment Introduction In the data warehousing process, the data staging area is composed of the data staging server application
More informationUnit 7: Basics in MS Power BI for Excel 2013 M7-5: OLAP
Unit 7: Basics in MS Power BI for Excel M7-5: OLAP Outline: Introduction Learning Objectives Content Exercise What is an OLAP Table Operations: Drill Down Operations: Roll Up Operations: Slice Operations:
More informationData Warehouse Testing. By: Rakesh Kumar Sharma
Data Warehouse Testing By: Rakesh Kumar Sharma Index...2 Introduction...3 About Data Warehouse...3 Data Warehouse definition...3 Testing Process for Data warehouse:...3 Requirements Testing :...3 Unit
More informationDta Mining and Data Warehousing
CSCI6405 Fall 2003 Dta Mining and Data Warehousing Instructor: Qigang Gao, Office: CS219, Tel:494-3356, Email: q.gao@dal.ca Teaching Assistant: Christopher Jordan, Email: cjordan@cs.dal.ca Office Hours:
More informationA MAS Based ETL Approach for Complex Data
A MAS Based ETL Approach for Complex Data O. Boussaid, F. Bentayeb, J. Darmont Abstract : In a data warehousing process, the phase of data integration is crucial. Many methods for data integration have
More informationAutomatically Generating OLAP Schemata from Conceptual Graphical Models
Automatically Generating OLAP Schemata from Conceptual Graphical Models Karl Hahn FORWISS Orleansstr. 34 D-81667 Munich, Germany +49-89-48095225 hahnk@forwiss.de Carsten Sapia FORWISS Orleansstr. 34 D-81667
More informationA Data Warehouse Implementation Using the Star Schema. For an outpatient hospital information system
A Data Warehouse Implementation Using the Star Schema For an outpatient hospital information system GurvinderKaurJosan Master of Computer Application,YMT College of Management Kharghar, Navi Mumbai ---------------------------------------------------------------------***----------------------------------------------------------------
More informationThe Data Organization
C V I T F E P A O TM The Data Organization Best Practices Metadata Dictionary Application Architecture Prepared by Rainer Schoenrank January 2017 Table of Contents 1. INTRODUCTION... 3 1.1 PURPOSE OF THE
More informationUNIT -1 UNIT -II. Q. 4 Why is entity-relationship modeling technique not suitable for the data warehouse? How is dimensional modeling different?
(Please write your Roll No. immediately) End-Term Examination Fourth Semester [MCA] MAY-JUNE 2006 Roll No. Paper Code: MCA-202 (ID -44202) Subject: Data Warehousing & Data Mining Note: Question no. 1 is
More informationData Warehouses. Yanlei Diao. Slides Courtesy of R. Ramakrishnan and J. Gehrke
Data Warehouses Yanlei Diao Slides Courtesy of R. Ramakrishnan and J. Gehrke Introduction v In the late 80s and early 90s, companies began to use their DBMSs for complex, interactive, exploratory analysis
More informationPart I. Introduction. Chapter 1: Introduction to Data Warehousing and SQL Server 2008 Analysis Services
Part I Introduction Chapter 1: Introduction to Data Warehousing and SQL Server 2008 Analysis Services Chapter 2: First Look at Analysis Services 2008 Chapter 3: Introduction to MDX Chapter 4: Working with
More informationChapter 6. Foundations of Business Intelligence: Databases and Information Management VIDEO CASES
Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:
More informationA Step towards Centralized Data Warehousing Process: A Quality Aware Data Warehouse Architecture
A Step towards Centralized Data Warehousing Process: A Quality Aware Data Warehouse Architecture Maqbool-uddin-Shaikh Comsats Institute of Information Technology Islamabad maqboolshaikh@comsats.edu.pk
More informationIntroduction to Data Warehousing
ICS 321 Spring 2012 Introduction to Data Warehousing Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 4/23/2012 Lipyeow Lim -- University of Hawaii at Manoa
More information