INTEGRATING USER'S REQUIREMENTS IN AUTOMATED CONCEPTUAL DATA WAREHOUSE DESIGN. Opim Salim Sitompul and Shahrul Azman Mohd Noah 1)
Abstract

Conceptual design is the most important stage in data warehouse design: it gives a general view of the data that both users and system analysts can understand, and it is the key to the success of the subsequent stages of the design process. In implementation, data warehouse design rests on the multidimensional model, so many efforts have been directed at formulating a concise approach to developing this model. One such approach is the transformation-oriented methodology, by which the entity-relationship (ER) model of a database is progressively transformed into the multidimensional model. A number of research works have shown that this methodology can be implemented in an automated tool. Such a tool, however, can provide only the first cut of the design output, so further refinements are still necessary to accommodate specific user's requirements. In this paper, we propose a refinement approach that facilitates intelligent interactions between such a tool, called the DWDesigner, and the user. Using this approach, the user can interactively refine the measures, temporal dimensions, and dimension hierarchies of the multidimensional constructs.

Keywords: data warehouse design, automated tool, multidimensional model.

1. Introduction

A conceptual model is a preliminary design model that helps both users and system analysts describe the data warehouse in general terms, without theoretical or technical jargon. In this manner, users can participate in the design process using only general terms to propose ideas about the new system, whereas system analysts suggest design considerations using only terms that users recognize.
In addition, the conceptual model is important as the foundation for the logical and physical design stages, since it allows modeling errors to be detected early and the schema to be extended easily [6, 18].

It is generally agreed that data warehouse implementation rests on the multidimensional model; many endeavors have therefore focused on developing a concise methodology to build this model (see e.g. [3], [6], [17], [13], [8]). Most of these methodologies take the ER model as a basis and, following the transformation-oriented methodology, transform it into a multidimensional model.

1) Faculty of Information Science and Technology, National University of Malaysia, UKM Bangi, Selangor, Malaysia
Although a number of methods supporting this approach have been proposed, whether they can be implemented successfully as computer-aided software engineering (CASE) tools largely remains an open question. Many tools developed to support data warehouse design (see e.g. [7, 18], [2], [17], [4, 5], [9]) are passive and incapable of supporting the basic characteristics of data warehouse design [10]. These tools cannot fulfill specific user requirements because they build the data warehouse model only from what is stated explicitly in the design sources, and so produce only the first cut of the model. For example, the ER model commonly used as a design source usually does not explicitly express data warehouse features such as measures and temporal dimensions. User intervention is therefore necessary to revise the initial model into one that suits specific requirements.

In this paper we describe our approach to refining the multidimensional model developed with an automated tool for conceptual data warehouse design called the DWDesigner [15]. The approach uses a command shell to facilitate intelligent interactions between the tool and the user during the refinement process. Using this approach, the user can interactively refine the measures, temporal dimensions, and dimension hierarchies of the multidimensional constructs.

The rest of this paper is structured as follows. Section 2 describes how the ER model is translated into the multidimensional model using the transformation-oriented methodology, to give readers some background on how the transformation is performed. The multidimensional model resulting from this process is the first cut of the conceptual data warehouse design, which should then be refined by the user.
Section 3 describes the process of integrating user's requirements into the resulting multidimensional model, illustrating how each multidimensional construct is refined. Section 4 shows the result of the refinement process and discusses some important concepts related to it. Section 5 provides the conclusions and future research works.

2. The Transformation-Oriented Methodology

Our approach to conceptual data warehouse design is based on the transformation-oriented approach, which transforms the ER model into the multidimensional model in five stages: translation of the ER model into a specification language model, transformation of the specification language model into a problem domain model, expansion of the problem domain model, transformation of the problem domain model into the multidimensional model, and refinement of the multidimensional model.

The first stage translates the ER model into a specification language, so that the application domain is represented in the computer-readable form needed by the transformation process. The specification language resembles a class construct: an entity is represented as a class whose name is the entity name, and entity properties such as attribute, identifier, subclass, aggregation, and relationship are written as class properties. The translation is guided by a set of syntax rules that record the properties and semantic content of the ER model [14]. The collection of entity classes representing all entities of the ER model is recorded in a text file. To illustrate this process, we look at a portion of the ER model for a university database (adapted from [1]), shown in Figure 1.
Figure 1. Portion of ER model for a university domain

The specification language model for the Student entity, formulated from the above ER model, is as follows:

CLASS "STUDENT"
  ATTRIBUTE (("Class": Integer))
  IDENTIFIER NIL
  SUBCLASS ("GRAD_STUDENT")
  AGGREGATION NIL
  RELATIONSHIP (("Minor" "DEPARTMENT" "NIL" "(1 1)" "(1 n)")\
                ("Major" "DEPARTMENT" "NIL" "(1 1)" "(1 n)")\
                ("Registered" "CURRENT_SECTION" "(("Count": Integer))" "(1 n)" "(1 m)")\
                ("Transcript" "SECTION" "(("Grade": Float))" "(1 n)" "(1 m)"))
End-Class

In the second stage, a simple parser transforms each entity and its properties into a compound term representation, which takes the form of a property-entity-value triplet and is stored in a database as the initial problem domain model. For example, the Student entity in the specification language model above is transformed into the following initial problem domain model:

Has-Attribute STUDENT ((Class. Integer))
Has-Subclass STUDENT (GRAD_STUDENT)
Has-Relationship STUDENT
  (((Name. Minor) (Participating-obj. DEPARTMENT) (Rel-Attribute. NIL)
    (First-constraint. (1 1)) (Second-constraint. (1 n)))
   ((Name. Major) (Participating-obj. DEPARTMENT) (Rel-Attribute. NIL)
    (First-constraint. (1 1)) (Second-constraint. (1 n)))
   ((Name. Registered) (Participating-obj. CURRENT_SECTION) (Rel-Attribute. ((Count. Integer)))
    (First-constraint. (1 n)) (Second-constraint. (1 m)))
   ((Name. Transcript) (Participating-obj. SECTION) (Rel-Attribute. ((Grade. Float)))
    (First-constraint. (1 n)) (Second-constraint. (1 m))))
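The second-stage transformation can be sketched in a few lines of Python. This is only an illustrative sketch, not the DWDesigner parser (which is written in Common LISP): the dictionary layout of the class, the function name to_triplets, and the property names Has-Identifier and Has-Aggregation are assumptions made for the example; only Has-Attribute, Has-Subclass, and Has-Relationship appear in the text above.

```python
NIL = None  # stands in for the NIL marker of the specification language

# Hypothetical in-memory form of the STUDENT class shown above.
student_spec = {
    "class": "STUDENT",
    "attribute": [("Class", "Integer")],
    "identifier": NIL,
    "subclass": ["GRAD_STUDENT"],
    "aggregation": NIL,
    "relationship": [
        ("Minor", "DEPARTMENT", NIL, (1, 1), (1, "n")),
        ("Major", "DEPARTMENT", NIL, (1, 1), (1, "n")),
        ("Registered", "CURRENT_SECTION", [("Count", "Integer")], (1, "n"), (1, "m")),
        ("Transcript", "SECTION", [("Grade", "Float")], (1, "n"), (1, "m")),
    ],
}

# Class property -> fact name. Has-Identifier and Has-Aggregation are
# assumed by analogy; they do not appear in the paper's example.
PROPERTY_NAMES = {
    "attribute": "Has-Attribute",
    "identifier": "Has-Identifier",
    "subclass": "Has-Subclass",
    "aggregation": "Has-Aggregation",
    "relationship": "Has-Relationship",
}


def to_triplets(spec):
    """Return (property, entity, value) facts for every non-NIL property."""
    entity = spec["class"]
    facts = []
    for key, fact_name in PROPERTY_NAMES.items():
        value = spec.get(key)
        if value is NIL:
            continue  # NIL properties produce no fact
        if key == "relationship":
            # Label each field of a relationship, as in the problem domain model.
            value = [
                {
                    "Name": name,
                    "Participating-obj": other,
                    "Rel-Attribute": rel_attrs,
                    "First-constraint": first,
                    "Second-constraint": second,
                }
                for (name, other, rel_attrs, first, second) in value
            ]
        facts.append((fact_name, entity, value))
    return facts


if __name__ == "__main__":
    for fact in to_triplets(student_spec):
        print(fact[0], fact[1])
```

Because IDENTIFIER and AGGREGATION are NIL for STUDENT, only three facts are produced, matching the three entries of the initial problem domain model above.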
The initial problem domain model contains only facts created from the non-nil values of the entity's properties. In the subsequent stage, this model is progressively analyzed using a set of synthesis and diagnosis rules, and the resulting new facts are added to the database.

The basic constructs of the multidimensional model are built from the entity attributes, which are categorized into numeric, date/time, and other attributes. Entities with numeric attributes are candidates for fact schemes: the name of the fact scheme is taken from the entity name, and its measures are obtained from the numeric attributes. The date/time and other attributes are the basis for creating the temporal and other dimensions of the fact scheme. The system also considers additional dimensions from other entities connected to the fact entity through relationships, recursively examining the one-to-many relationships between the fact entity and the other entities. Dimensions obtained from those relationships form sub-trees of the dimension hierarchies.

The last stage is the refinement process, in which users integrate specific requirements by adding or modifying multidimensional constructs such as measures, temporal dimensions, and dimension hierarchies. Details of how the refinement is performed are given in the following section.

3. The Refinement Process

After completing the transformation process, the user obtains a list of candidate fact schemes representing the multidimensional model. This model, however, is merely a direct transformation of the ER model, in which multidimensional constructs such as measures, temporal dimensions, and dimension hierarchies are created automatically.
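To make concrete how a first-cut fact scheme can arise from attribute types alone, the following Python sketch applies the categorization described in Section 2 to a single entity. It is a simplified illustration, not the synthesis and diagnosis rules of the DWDesigner: the type names Date and String, the attribute list for STUDENT, the COLLEGE entity, and the function name candidate_fact_scheme are assumptions, and the recursive traversal of relationships is left out.

```python
# Assumed type vocabularies; only Integer and Float appear in the paper's example.
NUMERIC_TYPES = {"Integer", "Float"}
TEMPORAL_TYPES = {"Date", "Time", "DateTime"}


def candidate_fact_scheme(entity, attributes):
    """Build a first-cut fact scheme from (name, type) attribute pairs.

    Returns None when the entity has no numeric attribute, since only
    entities with numeric attributes are candidates for fact schemes.
    """
    measures = [name for name, typ in attributes if typ in NUMERIC_TYPES]
    if not measures:
        return None
    temporal = [name for name, typ in attributes if typ in TEMPORAL_TYPES]
    known = NUMERIC_TYPES | TEMPORAL_TYPES
    others = [name for name, typ in attributes if typ not in known]
    return {
        "fact": entity,
        "measures": measures,
        "temporal_dimensions": temporal,
        "dimensions": others,
    }


# Hypothetical attribute list for the Student entity.
student_attributes = [
    ("Class", "Integer"),
    ("BDate", "Date"),
    ("Name", "String"),
    ("Sex", "String"),
    ("Ssn", "String"),
    ("Address", "String"),
]

student_scheme = candidate_fact_scheme("STUDENT", student_attributes)
no_scheme = candidate_fact_scheme("COLLEGE", [("CollegeName", "String")])
```

Under these assumptions Class becomes the only measure and BDate the only temporal dimension, which is exactly the kind of automatically chosen construct that the refinement process is meant to revise.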
Without user intervention, the tool has no way of knowing which features should be included in the final data warehouse model, so the user should evaluate the resulting model and integrate those requirements. In this section we look at how the refinement process is implemented in the system, using as an example the Student fact scheme generated by the tool from the university database, shown in Figure 2.

Figure 2. A student fact scheme
The Student fact scheme is represented as a tree structure [3] rooted at the Student fact, with one measure called Class. Nodes directly connected to the root are dimensions, which take the form either of simple sub-trees containing one or more leaf nodes or of sub-trees with one or more branches. In the above example there are six dimensions: five are simple sub-trees (BDate, Name, Sex, Ssn, and Address), and one is a sub-tree with a branch stemming from the DeptName node, forming a dimension hierarchy.

3.1. Refining Measures

A measure is the focus of interest in data warehouse design and is described through a set of atomic or derived attributes [16]. The user can assess the applicability of a measure by evaluating it along the available dimensions with one or more aggregation functions, such as count, sum, average, minimum, and maximum. Since a measure is commonly of numeric type, one important consideration is summarizability (additivity). Summarizability ensures that a measure remains correct whenever it is aggregated (summarized) along a specific dimension hierarchy, by avoiding double counting of data and the addition of non-additive data [12]. For example, the Class measure of the Student fact scheme, which corresponds to a student's year of study at the university, can be evaluated using the count function. In this case, we can count the number of students in each class by sex, age, and country of origin for each department or college. Further consideration shows that counting the students in each class leads to the number of students in the university, including undergraduate and graduate students. The user can therefore refine the model by modifying the Class measure.

The user refines the measure construct of the multidimensional model by interacting with the system in a question-and-answer dialog, as illustrated in Figure 3.

Figure 3. Example refinement session for measure construct

The dialog confirms that the user has chosen to modify the measure construct and asks whether he wants to proceed or to see an explanation of the measure construct. If the user chooses [E]xplain, a brief explanation of measures is displayed. After that, the user can continue with the modification by choosing [Y]es or exit from the dialog by choosing [D]one. If the refinement continues, the system asks the user to enter each measure in a specified format, i.e. one or more (measure type) pairs, each enclosed in parentheses. After the new measure is entered, the system asks the user to confirm the input. The options are [O]k and [N]o. If the user chooses [O]k, the system overrides the old measure with the new one; if the user chooses [N]o, the system prompts the user to re-enter the new measure.

3.2. Refining Temporal Dimension

The temporal dimension is a multidimensional construct that deserves particular attention, since time plays a very important role in data warehousing. It can be represented either as a time interval that spans from a shorter to a longer time period or as a specific point in time resembling a snapshot of the data warehouse. Obtaining a temporal dimension from an ER model is not always successful, since this feature is commonly unavailable or not stated explicitly. From the Student fact scheme, for instance, the temporal dimension obtained is BDate, which has an attribute type of date/time. This dimension, however, resembles neither a time interval nor a time snapshot of a university data warehouse. The user should therefore modify this dimension to add a time granularity such as month, semester, and year, to enable appropriate counting of the number of students in each department or college. An example of the refinement session is illustrated in Figure 4.

Figure 4. Example of refinement session for temporal dimension

As in refining measures, the user can ask for an explanation or make the refinement directly. The explanation describes the importance of the temporal dimension construct in data warehouse design for capturing the historical aspect of data, and recommends that the user have one.
For the refinement, the user can choose among the available temporal dimensions or provide a customized temporal dimension that fulfills the design requirements. In the above dialog, the user chooses the month-semester-year interval to count the number of students for each department or college.
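The effect of the refined measure and temporal dimension can be illustrated with a small Python sketch that counts students per department at a chosen time granularity. Everything concrete here is an assumption for illustration: the records, the idea of keying each student by an enrolment date, the definition of a semester as a half-year, and the names time_key and count_students do not come from the DWDesigner.

```python
from collections import Counter
from datetime import date


def time_key(d, granularity):
    """Map a date to a key at the requested granularity.

    Semesters are assumed to be half-years (January-June, July-December).
    """
    if granularity == "year":
        return (d.year,)
    if granularity == "semester":
        return (d.year, 1 if d.month <= 6 else 2)
    if granularity == "month":
        return (d.year, d.month)
    raise ValueError(f"unknown granularity: {granularity}")


def count_students(records, granularity):
    """Count students per (department, time key); each record is counted once."""
    counts = Counter()
    for _student_id, department, enrolled in records:
        counts[(department, *time_key(enrolled, granularity))] += 1
    return counts


# Hypothetical records: (student id, department, enrolment date).
records = [
    ("s1", "CS", date(2004, 1, 15)),
    ("s2", "CS", date(2004, 3, 2)),
    ("s3", "CS", date(2004, 9, 1)),
    ("s4", "Math", date(2004, 9, 1)),
]

by_semester = count_students(records, "semester")
by_year = count_students(records, "year")
```

Because every student contributes exactly one unit at every level, the semester counts add up to the year counts, which is the summarizability property discussed in Section 3.1. A date such as BDate would not give meaningful counts of this kind, which is why the temporal dimension has to be refined by the user.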
3.3. Refining Dimension Hierarchies

A dimension hierarchy is a structuring of dimensions that determines how fact instances may be aggregated and selected for the decision-making process [3]. The aggregation functions applied to dimension hierarchies can be classified into three sets: functions that apply to data that can be added, functions that apply to data that can be used for average calculations, and functions that apply to constant data that can only be counted [11]. These three classes can be used to keep track of which aggregate functions may be applied to specific data.

The automated tool obtains dimension hierarchies from the other attributes of a fact entity and recursively adds more hierarchy levels from the attributes of any entity that has a many-to-one relationship with the fact entity. An attribute tree grows from the attributes of this chain of entity-to-entity relationships, rooted at the identifier of the fact entity.

The dimension hierarchies produced by the tool can be refined through three types of activities: pruning, grafting, and aggregating the attribute tree. Pruning and grafting are intended to remove dimensions that are not relevant to the fact scheme being considered. The relevance of a dimension can be determined by evaluating the relationship between the fact and the dimension, as well as the relationships among all dimensions in a hierarchy. Pruning the attribute tree removes a dimension and all its descendants, thus decreasing the number of available dimensions. Grafting, on the other hand, removes a dimension but keeps all its descendants: the nodes directly connected to the deleted dimension become new dimensions, thus increasing the number of dimensions in the multidimensional construct. Aggregating the attribute tree is a way to add new classification hierarchies to the fact scheme.
The aggregation can be built from the existing attributes or by adding new attributes. The dialog in Figure 5 illustrates the refinement of the dimension hierarchies for the Student fact scheme.

Figure 5. Example of refinement session for dimension hierarchy

In the above example, the user chooses to prune the Name dimension, since this dimension is not important to the count of students being considered. The user then grafts the Address dimension in order to keep the City and State properties. Subsequently, the user can add another hierarchy level by aggregating the City and State dimensions with a new dimension, Country. As in the other refinements, the user can either get a brief explanation of pruning, grafting, and aggregating or go directly to the refinement by choosing one of the three options. Pruning and grafting can be performed on a single node or a set of nodes of the attribute tree, whereas aggregating is performed by combining two or more dimensions into one aggregation level. Where necessary, while aggregating the dimensions the user can add further dimension levels for finer or coarser granularity. The refinement process can be repeated until the specific requirements are fulfilled.

4. Results and Discussion

User intervention is necessary to refine the multidimensional model because the automated tool produces that model only from the entity properties of the ER model. As shown in the previous section, the user should refine the temporal dimension of the Student fact scheme, since the automated tool can only suggest BDate as the temporal dimension: BDate is the only attribute of the Student entity with the date/time field type. To illustrate how the automated tool generates a multidimensional model that fulfills specific user's requirements, Figure 6 shows the output of the DWDesigner for the Student entity before and after the refinement.

Figure 6. Fact scheme of Student: (a) Before refinement (b) After refinement

The steps taken during the refinement process to arrive at the desired multidimensional model shown in Figure 6 are as follows:
a. Refining measures
   - Modifying Class into Number-Of-Student
b. Refining temporal dimension
   - Modifying BDate into Month Semester Year
c. Refining dimension hierarchies
   - Pruning Name
   - Grafting Address
   - Pruning No, Street, AptNo, and Zip
   - Aggregating City, State and adding Country
   - Pruning DeptPhone, Office, CollegeOffice and Dean
   - Aggregating DeptName and CollegeName

The result of the refinement process depends on user preferences. For instance, users should determine what the focus of the analysis will be, how data will be analyzed in terms of time aggregation, which dimensions will be used for analyzing the data, and how those data will be aggregated. The initial multidimensional model produced by the automated tool can be used as the basis for dynamically refining the data model to fulfill those requirements. This benefits both users and system analysts, since they can work together to formulate current as well as new requirements. Output from the DWDesigner illustrating the result of the refinement process described above can be seen in Figure 7.

Figure 7. Output from the DWDesigner for the refinement of the Student fact scheme
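The hierarchy refinements of step c can be replayed on a small attribute tree, as in the following Python sketch. It is a hedged illustration only: the nested-dictionary tree, the function names prune, graft, and aggregate, and the initial tree itself are assumptions reconstructed from the node names mentioned in the text, since Figures 2 and 6 are not reproduced here. In particular, aggregate is interpreted as chaining the listed dimensions into one hierarchy from finest to coarsest level.

```python
def prune(tree, name):
    """Remove the node `name` together with all its descendants."""
    return {k: prune(v, name) for k, v in tree.items() if k != name}


def graft(tree, name):
    """Remove the node `name` but attach its children to its parent."""
    result = {}
    for key, subtree in tree.items():
        grafted = graft(subtree, name)
        if key == name:
            result.update(grafted)  # children move up one level
        else:
            result[key] = grafted
    return result


def aggregate(tree, levels):
    """Chain root-level dimensions into one hierarchy, finest level first.

    Names in `levels` that are not yet in the tree are added as new levels.
    """
    result = {k: v for k, v in tree.items() if k not in levels}
    chain = {}
    for name in reversed(levels):
        chain = {name: {**tree.get(name, {}), **chain}}
    result.update(chain)
    return result


# Assumed first-cut attribute tree of the Student fact (dimensions only).
student_tree = {
    "BDate": {},
    "Name": {},
    "Sex": {},
    "Ssn": {},
    "Address": {"No": {}, "Street": {}, "AptNo": {}, "City": {}, "State": {}, "Zip": {}},
    "DeptName": {
        "DeptPhone": {},
        "Office": {},
        "CollegeName": {"CollegeOffice": {}, "Dean": {}},
    },
}

# Replay step c of the refinement list above.
refined = prune(student_tree, "Name")
refined = graft(refined, "Address")
for node in ("No", "Street", "AptNo", "Zip"):
    refined = prune(refined, node)
refined = aggregate(refined, ["City", "State", "Country"])
for node in ("DeptPhone", "Office", "CollegeOffice", "Dean"):
    refined = prune(refined, node)
refined = aggregate(refined, ["DeptName", "CollegeName"])
```

Grafting Address temporarily raises the number of root-level dimensions, as noted in Section 3.3, before the subsequent pruning reduces it again. Steps a and b (the measure and the temporal dimension) are not modeled here, so BDate is still present in the resulting tree.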
5. Conclusions and Future Research Works

In this paper we have presented an approach to refining the multidimensional model used in conceptual data warehouse design in order to fulfill user's requirements. Users can initiate a dialog with the system through a command shell to refine the multidimensional constructs, such as measures, temporal dimensions, and dimension hierarchies, that have been generated automatically by the system. In modifying the dimension hierarchies, users can prune and graft the attribute tree as well as aggregate two or more dimensions. The automated tool presents the multidimensional model as graphical output to enhance its readability. As an initial step toward the development of a CASE tool for data warehouse design, this research work provides a basis for developing the logical and physical design phases. The prototype of the automated tool has been developed using Allegro Common LISP version 6.2 from Franz Inc.

Some issues that remain open for future research are the following:

o Input to the DWDesigner is still translated manually from existing ER models. This could be improved by acquiring the input automatically from various operational databases, which would require adding a data integration module.
o The simple parser used in translating the specification language into the initial problem domain model could be extended with syntax analysis to ensure that correct input is provided to the DWDesigner.
o Since a data warehouse is concerned with integrating multiple databases, a module that can perform aspects of the integration process at the schema level should be provided. Furthermore, with the introduction of semantic web technology, unstructured sources such as web pages should also be made available for analysis and processing in support of the data warehouse system.
o As a design tool, the system is still a prototype with a simple graphical user interface and direct conversation with users during the refinement session. This interface could be made more user-friendly by giving users point-and-click capability to perform the transformation process and to refine the multidimensional model.

6. References

[1] ELMASRI, R., NAVATHE, S.B., Fundamentals of database systems, 3rd Edition, Reading, Mass.: Addison-Wesley.
[2] FRANCONI, E., NG, G., The i.com Tool for Intelligent Conceptual Modeling, Proceedings of 7th International Workshop on Knowledge Representation Meets Databases (KRDB-2000), Berlin, Germany.
[3] GOLFARELLI, M., MAIO, D., RIZZI, S., Conceptual design of data warehouses from E/R schemes, Proceedings of the 31st Hawaii International Conference on System Sciences, Kohala Coast, HI.
[4] GOLFARELLI, M., RIZZI, S., WAND: A CASE Tool for Data Warehouse Design, Demo Proceedings of 17th International Conference on Data Engineering (ICDE 2001), Heidelberg, Germany.
[5] GOLFARELLI, M., RIZZI, S., SALTARELLI, E., WAND: A CASE Tool for Workload-Based Design of a Data Mart, Proceedings Decimo Convegno Nazionale su Sistemi Evoluti Per Basi Di Dati, Portoferraio, Italy, 2002.
[6] HÜSEMANN, B., LECHTENBÖRGER, J., VOSSEN, G., Conceptual data warehouse design, Proceeding of the International Workshop on Design and Management of Data Warehouse (DMDW 2000), Stockholm, Sweden.
[7] MILLER, L., NILAKANTA, S., Data Warehouse Modeler: a CASE Tool for Data Warehouse Design, Proceeding of 31st Annual Hawaii International Conference on System Sciences, Kona, Hawaii.
[8] MOODY, D., KORTINK, M.A.R., From enterprise models to dimensional models: A methodology for data warehouse and data mart design, Proc. of Int'l Workshop on Design and Management of Data Warehouses, Stockholm, Sweden.
[9] NAGGAR, P., PONTIERI, L., PUPO, M., TERRACINA, G., VIRARDI, E., A Model and a Toolkit for Supporting Incremental Data Warehouse Construction, In Ciccheti, R. et al. (Eds.): DEXA 2002, LNCS 2453, Springer-Verlag Berlin Heidelberg.
[10] NOAH, S.A., WILLIAMS, M., Intelligent object analyser for conceptual database design model, Jurnal Teknologi 39: 27-44.
[11] PEDERSEN, T. B., JENSEN, C. S., Multidimensional Data Modeling for Complex Data, Proceedings of 1st International Conference on Data Engineering (ICDE 99), Sydney, Australia.
[12] PEDERSEN, T. B., JENSEN, C. S., DYRESON, C. E., A Foundation for Capturing and Querying Complex Multidimensional Data, Information Systems, 26.
[13] PHIPPS, C., DAVIS, K. C., Automating data warehouse conceptual schema design and evaluation, Design and Management of Data Warehouses 2002, Proceedings of the 4th Intl. Workshop DMDW'2002, Toronto, Canada.
[14] SITOMPUL, O.S., NOAH, S.A.M., Translation of ER model into multidimensional model for data warehouse: an automated approach, Int. J. of Information Technology 3: 11-32, Bangi, Malaysia.
[15] SITOMPUL, O.S., NOAH, S.A.M., Application of knowledge-based system in automated data warehouse design, Proceedings of the Knowledge Management International Conference and Exhibition (KMICE 2004), Penang, Malaysia.
[16] TRUJILLO, J., PALOMAR, M., GOMEZ, J., SONG, I.-Y., Designing Data Warehouses with OO Conceptual Models, IEEE Computer, 34(12): 66-75.
[17] TRYFONA, N., BUSBORG, F., CHRISTIANSEN, J. G. B., starER, Proceedings of the ACM 2nd International Workshop on Data Warehousing and OLAP, ACM Press, New York.
[18] WU, L., MILLER, L., NILAKANTA, S., Design of Data Warehouse Using Metadata, Information and Software Technology, 43, 2001.
More informationA Novel Approach of Data Warehouse OLTP and OLAP Technology for Supporting Management prospective
A Novel Approach of Data Warehouse OLTP and OLAP Technology for Supporting Management prospective B.Manivannan Research Scholar, Dept. Computer Science, Dravidian University, Kuppam, Andhra Pradesh, India
More informationA grain preservation translation algorithm: From ER diagram to multidimensional model
Information Sciences 177 (2007) 3679 3695 www.elsevier.com/locate/ins A grain preservation translation algorithm: From ER diagram to multidimensional model Yen-Ting Chen a,b, *, Ping-Yu Hsu a a Department
More informationXML-OLAP: A Multidimensional Analysis Framework for XML Warehouses
XML-OLAP: A Multidimensional Analysis Framework for XML Warehouses Byung-Kwon Park 1,HyoilHan 2,andIl-YeolSong 2 1 Dong-A University, Busan, Korea bpark@dau.ac.kr 2 Drexel University, Philadelphia, PA
More informationDatabase Applications (15-415)
Database Applications (15-415) The Entity Relationship Model Lecture 2, January 15, 2014 Mohammad Hammoud Today Last Session: Course overview and a brief introduction on databases and database systems
More informationDatabase Technology Introduction. Heiko Paulheim
Database Technology Introduction Outline The Need for Databases Data Models Relational Databases Database Design Storage Manager Query Processing Transaction Manager Introduction to the Relational Model
More informationBuilding a Data Warehouse step by step
Informatica Economică, nr. 2 (42)/2007 83 Building a Data Warehouse step by step Manole VELICANU, Academy of Economic Studies, Bucharest Gheorghe MATEI, Romanian Commercial Bank Data warehouses have been
More informationExtended TDWI Data Modeling: An In-Depth Tutorial on Data Warehouse Design & Analysis Techniques
: An In-Depth Tutorial on Data Warehouse Design & Analysis Techniques Class Format: The class is an instructor led format using multiple learning techniques including: lecture to present concepts, principles,
More informationSpecific Objectives Contents Teaching Hours 4 the basic concepts 1.1 Concepts of Relational Databases
Course Title: Advanced Database Management System Course No. : ICT. Ed 525 Nature of course: Theoretical + Practical Level: M.Ed. Credit Hour: 3(2T+1P) Semester: Second Teaching Hour: 80(32+8) 1. Course
More informationData Warehousing. Overview
Data Warehousing Overview Basic Definitions Normalization Entity Relationship Diagrams (ERDs) Normal Forms Many to Many relationships Warehouse Considerations Dimension Tables Fact Tables Star Schema Snowflake
More informationTUML: A Method for Modelling Temporal Information Systems
TUML: A Method for Modelling Temporal Information Systems 2 Marianthi Svinterikou 1, Babis Theodoulidis 2 1 Intrasoft, GIS Department, Adrianiou 2, 11525 Athens, Greece MSSvin@tee.gr UMIST, Department
More informationLAB 2 Notes. Conceptual Design ER. Logical DB Design (relational) Schema Refinement. Physical DD
LAB 2 Notes For students that were not present in the first lab TA Web page updated : http://www.cs.ucr.edu/~cs166/ Mailing list Signup: http://www.cs.ucr.edu/mailman/listinfo/cs166 The general idea of
More informationNOTES ON OBJECT-ORIENTED MODELING AND DESIGN
NOTES ON OBJECT-ORIENTED MODELING AND DESIGN Stephen W. Clyde Brigham Young University Provo, UT 86402 Abstract: A review of the Object Modeling Technique (OMT) is presented. OMT is an object-oriented
More informationEnabling Off-Line Business Process Analysis: A Transformation-Based Approach
Enabling Off-Line Business Process Analysis: A Transformation-Based Approach Arnon Sturm Department of Information Systems Engineering Ben-Gurion University of the Negev, Beer Sheva 84105, Israel sturm@bgu.ac.il
More informationCS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #5: Entity/Relational Models---Part 1
CS 4604: Introduction to Database Management Systems B. Aditya Prakash Lecture #5: Entity/Relational Models---Part 1 E/R: NOT IN BOOK! IMPORTANT: Follow only lecture slides for this topic! Differences
More informationSAMSTAR. A Semi-Automated lexical Method for generating STAR schemas using an ER diagram
SAMSTAR A Semi-Automated lexical Method for generating STAR schemas using an ER diagram Il-Yeol Song, Ritu Khare, and Bing Dai The ischool at Drexel College of Information Science and Technology Drexel
More informationRepresenting Temporal Data in Non-Temporal OLAP Systems
Representing Temporal Data in Non-Temporal OLAP Systems Johann Eder University of Klagenfurt Dep. of Informatics-Systems eder@isys.uni-klu.ac.at Christian Koncilia University of Klagenfurt Dep. of Informatics-Systems
More informationE-R Model. Hi! Here in this lecture we are going to discuss about the E-R Model.
E-R Model Hi! Here in this lecture we are going to discuss about the E-R Model. What is Entity-Relationship Model? The entity-relationship model is useful because, as we will soon see, it facilitates communication
More informationDevelopment of an Ontology-Based Portal for Digital Archive Services
Development of an Ontology-Based Portal for Digital Archive Services Ching-Long Yeh Department of Computer Science and Engineering Tatung University 40 Chungshan N. Rd. 3rd Sec. Taipei, 104, Taiwan chingyeh@cse.ttu.edu.tw
More informationExtending Uml for Multidimensional Modeling in Data Warehouse
Available online at www.interscience.in Extending Uml for Multidimensional Modeling in Data Warehouse Bakul Dhawan & Anjana Gosain University School of Information Technology E-mail: bakuldhawan@gmail.com,
More informationFundamentals of. Database Systems. Shamkant B. Navathe. College of Computing Georgia Institute of Technology PEARSON.
Fundamentals of Database Systems 5th Edition Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington Shamkant B. Navathe College of Computing Georgia Institute
More informationQuestion Bank. 4) It is the source of information later delivered to data marts.
Question Bank Year: 2016-2017 Subject Dept: CS Semester: First Subject Name: Data Mining. Q1) What is data warehouse? ANS. A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile
More informationFig 1.2: Relationship between DW, ODS and OLTP Systems
1.4 DATA WAREHOUSES Data warehousing is a process for assembling and managing data from various sources for the purpose of gaining a single detailed view of an enterprise. Although there are several definitions
More informationScienceDirect. STA Data Model for Effective Business Process Modelling
Available online at www.sciencedirect.com ScienceDirect Procedia Technology 11 ( 2013 ) 1218 1222 The 4th International Conference on Electrical Engineering and Informatics (ICEEI 2013) STA Data Model
More informationEIAW: Towards a Business-Friendly Data Warehouse Using Semantic Web Technologies
EIAW: Towards a Business-Friendly Data Warehouse Using Semantic Web Technologies Guotong Xie 1, Yang Yang 1, Shengping Liu 1, Zhaoming Qiu 1, Yue Pan 1, and Xiongzhi Zhou 2 1 IBM China Research Laboratory
More informationThis tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining.
About the Tutorial Data Mining is defined as the procedure of extracting information from huge sets of data. In other words, we can say that data mining is mining knowledge from data. The tutorial starts
More informationAdnan YAZICI Computer Engineering Department
Data Warehouse Adnan YAZICI Computer Engineering Department Middle East Technical University, A.Yazici, 2010 Definition A data warehouse is a subject-oriented integrated time-variant nonvolatile collection
More informationAutomatically Generating OLAP Schemata from Conceptual Graphical Models
Automatically Generating OLAP Schemata from Conceptual Graphical Models Karl Hahn FORWISS Orleansstr. 34 D-81667 Munich, Germany +49-89-48095225 hahnk@forwiss.de Carsten Sapia FORWISS Orleansstr. 34 D-81667
More informationTowards Development of Solution for Business Process-Oriented Data Analysis
Towards Development of Solution for Business Process-Oriented Data Analysis M. Klimavicius Abstract This paper proposes a modeling methodology for the development of data analysis solution. The Author
More informationContents. Database. Information Policy. C03. Entity Relationship Model WKU-IP-C03 Database / Entity Relationship Model
Information Policy Database C03. Entity Relationship Model Code: 164323-03 Course: Information Policy Period: Spring 2013 Professor: Sync Sangwon Lee, Ph. D 1 Contents 01. Overview of Database Design 02.
More informationInformation Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 07 : 06/11/2012 Data Mining: Concepts and Techniques (3 rd ed.) Chapter
More informationData Warehousing and OLAP Technologies for Decision-Making Process
Data Warehousing and OLAP Technologies for Decision-Making Process Hiren H Darji Asst. Prof in Anand Institute of Information Science,Anand Abstract Data warehousing and on-line analytical processing (OLAP)
More informationImplementing Schema Evolution in Data Warehouse through Complex Hierarchy Semantics
International Journal of Scientific & Engineering Research, Volume 3, Issue 7, July-2012 1 Implementing Schema Evolution in Data Warehouse through Complex Hierarchy Semantics Kanika Talwar, Anjana Gosain
More informationData Warehousing ETL. Esteban Zimányi Slides by Toon Calders
Data Warehousing ETL Esteban Zimányi ezimanyi@ulb.ac.be Slides by Toon Calders 1 Overview Picture other sources Metadata Monitor & Integrator OLAP Server Analysis Operational DBs Extract Transform Load
More informationCHAPTER 8: ONLINE ANALYTICAL PROCESSING(OLAP)
CHAPTER 8: ONLINE ANALYTICAL PROCESSING(OLAP) INTRODUCTION A dimension is an attribute within a multidimensional model consisting of a list of values (called members). A fact is defined by a combination
More informationBasics of Dimensional Modeling
Basics of Dimensional Modeling Data warehouse and OLAP tools are based on a dimensional data model. A dimensional model is based on dimensions, facts, cubes, and schemas such as star and snowflake. Dimension
More informationElsevier Editorial System(tm) for Data & Knowledge Engineering Manuscript Draft
Elsevier Editorial System(tm) for Data & Knowledge Engineering Manuscript Draft Manuscript Number: Title: Extending ER Models to Capture Data Mining Transformations Article Type: Full Length Article Keywords:
More informationAn Overview of Data Warehousing and OLAP Technology
An Overview of Data Warehousing and OLAP Technology CMPT 843 Karanjit Singh Tiwana 1 Intro and Architecture 2 What is Data Warehouse? Subject-oriented, integrated, time varying, non-volatile collection
More informationOn the Integration of Autonomous Data Marts
On the Integration of Autonomous Data Marts Luca Cabibbo and Riccardo Torlone Dipartimento di Informatica e Automazione Università di Roma Tre {cabibbo,torlone}@dia.uniroma3.it Abstract We address the
More informationData Modeling Online Training
Data Modeling Online Training IQ Online training facility offers Data Modeling online training by trainers who have expert knowledge in the Data Modeling and proven record of training hundreds of students.
More informationConceptual Design. The Entity-Relationship (ER) Model
Conceptual Design. The Entity-Relationship (ER) Model CS430/630 Lecture 12 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke Database Design Overview Conceptual design The Entity-Relationship
More informationWhat Time is it in the Data Warehouse?
What Time is it in the Data Warehouse? Stefano Rizzi and Matteo Golfarelli DEIS, University of Bologna, Viale Risorgimento 2, 40136 Italy Abstract. Though in most data warehousing applications no relevance
More informationConceptual modeling for ETL
Conceptual modeling for ETL processes Author: Dhananjay Patil Organization: Evaltech, Inc. Evaltech Research Group, Data Warehousing Practice. Date: 08/26/04 Email: erg@evaltech.com Abstract: Due to the
More informationA Standard for Representing Multidimensional Properties: The Common Warehouse Metamodel (CWM)
A Standard for Representing Multidimensional Properties: The Common Warehouse Metamodel (CWM) Enrique Medina and Juan Trujillo Departamento de Lenguajes y Sistemas Informáticos Universidad de Alicante
More informationFUNDAMENTALS OF. Database S wctpmc. Shamkant B. Navathe College of Computing Georgia Institute of Technology. Addison-Wesley
FUNDAMENTALS OF Database S wctpmc SIXTH EDITION Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington Shamkant B. Navathe College of Computing Georgia Institute
More informationUsing DDC to create a visual knowledge map as an aid to online information retrieval
Sudatta Chowdhury and G.G. Chowdhury Department of Computer and Information Sciences University of Strathclyde, Glasgow G1 1XH Using DDC to create a visual knowledge map as an aid to online information
More informationRequirements Analysis Method For Extracting-Transformation-Loading (Etl) In Data Warehouse Systems
Requirements Analysis Method For Extracting-Transformation-Loading (Etl) In Data Warehouse Systems Azman Taa, Mohd Syazwan Abdullah, Norita Md. Norwawi College of Arts and Sciences Universiti Utara Malaysia,
More informationCPS352 Database Systems Syllabus Fall 2012
CPS352 Database Systems Syllabus Fall 2012 Professor: Simon Miner Fall Semester 2012 Contact: Simon.Miner@gordon.edu Thursday 6:00 9:00 pm KOSC 128 978-380- 2626 KOSC 243 Office Hours: Thursday 4:00 6:00
More informationDomain-Driven Development with Ontologies and Aspects
Domain-Driven Development with Ontologies and Aspects Submitted for Domain-Specific Modeling workshop at OOPSLA 2005 Latest version of this paper can be downloaded from http://phruby.com Pavel Hruby Microsoft
More informationMultidimensional Modeling using UML and XML
Departamento de Lenguajes y Sistemas Informáticos Multidimensional Modeling using UML and XML Sergio Luján-Mora Contents Introduction OO Multidimensional Modeling UML Extension for MD Modeling MD Modeling
More informationPESIT- Bangalore South Campus Hosur Road (1km Before Electronic city) Bangalore
Data Warehousing Data Mining (17MCA442) 1. GENERAL INFORMATION: PESIT- Bangalore South Campus Hosur Road (1km Before Electronic city) Bangalore 560 100 Department of MCA COURSE INFORMATION SHEET Academic
More informationCMPT 354 Database Systems I
CMPT 354 Database Systems I Chapter 2 Entity Relationship Data Modeling Data models A data model is the specifications for designing data organization in a system. Specify database schema using a data
More informationModelling of Adaptive Hypermedia Systems
Modelling of Adaptive Hypermedia Systems Martin Balík, Ivan Jelínek Abstract: The amount of information on the web is permanently growing. The orientation within the information is becoming more and more
More informationDIRA : A FRAMEWORK OF DATA INTEGRATION USING DATA QUALITY
DIRA : A FRAMEWORK OF DATA INTEGRATION USING DATA QUALITY Reham I. Abdel Monem 1, Ali H. El-Bastawissy 2 and Mohamed M. Elwakil 3 1 Information Systems Department, Faculty of computers and information,
More informationEnhanced Entity-Relationship (EER) Modeling
CHAPTER 4 Enhanced Entity-Relationship (EER) Modeling Copyright 2017 Ramez Elmasri and Shamkant B. Navathe Slide 1-2 Chapter Outline EER stands for Enhanced ER or Extended ER EER Model Concepts Includes
More informationDatabase Management Systems. Chapter 2 Part 2
Database Management Systems Chapter 2 Part 2 Introduction to Database Design Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Class Hierarchies Classify entities sets into Super-class and
More informationA method for requirements elicitation of a Data Warehouse: An example
A method for requirements elicitation of a Data Warehouse: An example JORGE OLIVEIRA E SÁ Information Systems Department University of Minho Azurém Campus, 4800-058 Guimarães PORTUGAL jos@dsi.uminho.pt
More informationStatic Modeling. SWE 321 Fall2014
Static Modeling SWE 321 Fall2014 Copyright 2014 Hassan Gomaa and Robert Pettit All rights reserved. No part of this document may be reproduced in any form or by any means, without the prior written permission
More informationCapturing Temporal Constraints in Temporal ER Models
Capturing Temporal Constraints in Temporal ER Models Carlo Combi 1, Sara Degani 1,andChristianS.Jensen 2 1 Department of Computer Science - University of Verona Strada le Grazie 15, 37134 Verona, Italy
More informationFormulating XML-IR Queries
Alan Woodley Faculty of Information Technology, Queensland University of Technology PO Box 2434. Brisbane Q 4001, Australia ap.woodley@student.qut.edu.au Abstract: XML information retrieval systems differ
More informationAn OPM-based Method for Transformation of Operational System Model to Data Warehouse Model
An OPM-based Method for Transformation of Operational System Model to Data Warehouse Model Dov Dori Technion Israel Institute of Technology Haifa, 32000, Israel dori@ie.technion.ac.il Roman Feldman Technion
More informationCS 338 The Enhanced Entity-Relationship (EER) Model
CS 338 The Enhanced Entity-Relationship (EER) Model Bojana Bislimovska Spring 2017 Major research Outline EER model overview Subclasses, superclasses and inheritance Specialization and generalization Modeling
More informationDevelopment of an interface that allows MDX based data warehouse queries by less experienced users
Development of an interface that allows MDX based data warehouse queries by less experienced users Mariana Duprat André Monat Escola Superior de Desenho Industrial 400 Introduction Data analysis is a fundamental
More informationIntroduction to Database Design. Dr. Kanda Runapongsa Dept of Computer Engineering Khon Kaen University
Introduction to Database Design Dr. Kanda Runapongsa (krunapon@kku.ac.th) Dept of Computer Engineering Khon Kaen University Overview What are the steps in designing a database? Why is the ER model used
More informationCOSC 304 Introduction to Database Systems. Entity-Relationship Modeling
COSC 304 Introduction to Database Systems Entity-Relationship Modeling Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca Conceptual Database Design Conceptual database design
More informationRECONCILIATION OF KNOWLEDGE - APPLICATIONS IN AUTOMATED DATABASE DESIGN DIGANOSING
107 RECONCILIATION OF KNOWLEDGE - APPLICATIONS IN AUTOMATED DATABASE DESIGN DIGANOSING Shahrul Azman Mohd. Noah Faculty of Information Science and Technology, National University of Malaysia, 43600, Bangi,
More informationDatabase Design with Entity Relationship Model
Database Design with Entity Relationship Model Vijay Kumar SICE, Computer Networking University of Missouri-Kansas City Kansas City, MO kumarv@umkc.edu Database Design Process Database design process integrates
More informationHierarchies in a multidimensional model: From conceptual modeling to logical representation
Data & Knowledge Engineering 59 (2006) 348 377 www.elsevier.com/locate/datak Hierarchies in a multidimensional model: From conceptual modeling to logical representation E. Malinowski *, E. Zimányi Department
More informationDATA MINING AND WAREHOUSING
DATA MINING AND WAREHOUSING Qno Question Answer 1 Define data warehouse? Data warehouse is a subject oriented, integrated, time-variant, and nonvolatile collection of data that supports management's decision-making
More informationSAS IT Resource Management Forecasting. Setup Specification Document. A SAS White Paper
SAS IT Resource Management Forecasting Setup Specification Document A SAS White Paper Table of Contents Introduction to SAS IT Resource Management Forecasting... 1 Getting Started with the SAS Enterprise
More informationMySQL Data Mining: Extending MySQL to support data mining primitives (demo)
MySQL Data Mining: Extending MySQL to support data mining primitives (demo) Alfredo Ferro, Rosalba Giugno, Piera Laura Puglisi, and Alfredo Pulvirenti Dept. of Mathematics and Computer Sciences, University
More informationEVALUATION OF THE USABILITY OF EDUCATIONAL WEB MEDIA: A CASE STUDY OF GROU.PS
EVALUATION OF THE USABILITY OF EDUCATIONAL WEB MEDIA: A CASE STUDY OF GROU.PS Turgay Baş, Hakan Tüzün Hacettepe University (TURKEY) turgaybas@hacettepe.edu.tr, htuzun@hacettepe.edu.tr Abstract In this
More information