Foundations of Data Warehouse Quality (DWQ)


DWQ
Foundations of Data Warehouse Quality (DWQ)
v.1.1

Document Number: DWQ -- INRIA
Project Name: Foundations of Data Warehouse Quality (DWQ)
Project Number: EP
Title: Designing Data Warehouse Refreshment Systems
Authors: M. Bouzeghoub, F. Fabret, F. Llirbat, M. Matulovic, and E. Simon
Workpackage: WP8
Document Type: Report
Classification: Public
Distribution: DWQ Consortium
Status: Draft
Document file: wp8_refreshment.doc (Word)
Version: 1.0
Date: June 17th, 1997
Number of pages:

Document Change Record

Version    Date          Reason for Change
1.0        Oct. 1997     First Draft

1. Introduction

The refreshment of a data warehouse is an incremental data warehouse process that can be decomposed into four logical activities: (i) extract a source change from a data source, characterizing the changes that have occurred in this source since the last extraction; (ii) clean a source change using some predefined data; (iii) integrate the source changes coming from multiple sources; and (iv) determine which views in the data warehouse need to be updated. This is a logical decomposition whose operational implementation receives many different answers when looking at the state of the data warehouse product market. There is a large variety of data warehousing applications with very different requirements in terms of quality and configuration.

An important problem raised by the study of existing tools for data warehouse refreshment is that they offer few customization facilities with respect to the scheduling of the refreshment process, at both the functionality and performance levels. In fact, each tool provides a fixed set of possibilities intended to cover some «frequent» data warehouse applications. Two cases then occur: (i) the tool cannot handle the requirements of a given application, or (ii) the tool is open enough to accommodate specific user-defined solutions. In the latter case, there are two main problems. First, these tools generally offer poor facilities and almost no methodology to help users engineer their specific implementation of the refreshment system. Typically, the system offers the possibility of defining specific events that will trigger the refreshment process, and one can program an ad hoc solution for the transform and integrate steps. The resulting program can be invoked from within the tool. Thus, although an ad hoc solution can be deployed, no facilities are provided to facilitate its engineering. Second, the tools that are flexible enough to enable the customization of a refreshment system are generally quite complex to use. Therefore, there is a real need for tools that enable a fast and customized development of data warehouse refreshment solutions.

We have already stated that active rules are an appropriate means to implement a data warehouse refreshment process. In the following, we highlight some features of the refreshment process which match features of active rules:

- The refreshment process is an event-driven application; active rules provide a convenient way to specify events, and provide an execution monitor to detect their instances and to compute them whenever they are composite. Different events characterize the refreshment process: we can roughly distinguish data changes (update events) and process checkpoints (monitoring events).

- The refreshment process is a complex system which may be composed of asynchronous activities requiring a certain monitoring task, which is itself an event-driven task.

- The refreshment process evolves frequently, following the evolution of the data sources and of the view definitions; active rules provide a modular specification which makes it easy to modify the refreshment activities in order to adapt them to new requirements or infrastructure modifications.

- There is no single refreshment process which is suitable for all data warehouse applications or all data warehouse configurations, so a specific refreshment process frequently needs to be engineered for a specific application or configuration. Using active rules allows one to benefit from generic execution mechanisms and a high-level language which enable rapid development of the refreshment activities.

- Some of the activities of the refreshment process, such as extraction, cleaning and integration, can be performed by commercial products; their integration into the global refreshment process should be done in a transparent way. Active rules allow the activities handled by these tools to be considered as their action part, which can be executed in an atomic way.

The main original contribution of the work presented in this report is to show how a data warehouse refreshment system can be suitably modeled as an active application. We show how this process takes advantage of the modularity of active rules, and we capitalize on recent advances in the formalization of active rule execution models to better understand application semantics. Our contribution is, first, a general methodology to specify a data warehouse refreshment system, which starts from a conceptual specification and progressively transforms it into logical and physical specifications; and second, a generic active monitor which can be adapted to some refreshment activities.

Apart from this introduction, this report is structured as follows. In Section 2, we present our logical view of a data warehouse architecture, of its initial design and of its refreshment. We particularly point out the refreshment tasks that can be modeled by active rules, and we show how this logical view of the data warehouse conforms to the DWQ framework. Section 3 presents a technical summary of the definitions and principles of the generic active monitor we have defined. Section 4 demonstrates, through simple examples, how this active monitor can be adapted and instantiated in order to handle some active tasks of the refreshment process. Finally, Section 5 concludes and presents our future directions of work.

2. Logical data warehouse architecture, initial design and refreshment tasks

This section presents our logical view of a data warehouse architecture, of its initial design and of its successive refreshments. We particularly point out the refreshment tasks that can be modeled by active rules. We also show how this logical view of the data warehouse conforms to the DWQ framework.

2.1. Logical view of a data warehouse architecture

Data warehouse components

The data warehouse can be defined as a hierarchy of data stores (Figure 1) which goes from the source data to highly aggregated data (the data marts). Between these two extremes, different other stores can be found, depending on the requirements of the OLAP applications. One of these stores is the corporate data warehouse (CDW), which groups all the aggregated views that serve to generate the data marts. The corporate data store can be complemented by an operational data store (ODS), which groups the base data collected and integrated from the sources. The ODS contains the common source data from which the aggregated views are derived. This data, although it may contain some aggregation, is considered as a multi-relational view which synthesizes the source data. Within this ODS, we can maintain for a period of time a history, called ODS-history, of all the data collected from the sources at different moments.

With each source can be associated an intermediate store, called Source-delta, which groups the changes extracted from that source at a given instant. After cleaning this change, we generate a Cleaned-delta. Again, we can maintain a history of this source-delta, called Delta-history, for a certain period of time during which changes can be accumulated. Both ODS-history and Delta-history are optional if they are not required by the semantics of the OLAP applications. There is a difference between a Source-history and an ODS-history. A Source-history is defined when the frequency of the extraction step differs from the frequency of the cleaning or integration step. This can occur when the data source does not maintain its own history, or when the volume of extracted data is not relevant with respect to the aggregation needed by the OLAP application. An ODS-history is defined when OLAP applications need to accumulate data, for statistical processing for example.

Obviously, this hierarchy of data stores is a logical way to represent the data flows which go from the sources to the data marts. Concretely, all the intermediate stores between the sources and the data marts can be represented in the same database. This logical view allows a certain traceability of the design and refreshment processes, leading to a better understanding of their construction and scheduling respectively. In the following paragraphs we describe the main phases and steps of the data warehouse design.
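Before doing so, the store hierarchy just described can be made concrete as a small data structure. The following Python sketch is illustrative only; none of these class or store names are prescribed by this report.

    # A minimal sketch of the logical store hierarchy of Figure 1.
    # All names (Store, the per-source chain, the shared stores) are
    # illustrative, not part of the DWQ specification.

    from dataclasses import dataclass, field

    @dataclass
    class Store:
        name: str
        rows: list = field(default_factory=list)   # logical content only

    def source_chain(source_name):
        """Per-source stores produced by the preparation phase."""
        return {
            "source_delta":  Store(f"{source_name}-delta"),    # raw extracted changes
            "cleaned_delta": Store(f"{source_name}-cleaned"),  # after cleaning rules
            "delta_history": Store(f"{source_name}-history"),  # optional accumulation
        }

    # Shared stores fed by the loading and aggregation phases.
    ods         = Store("ODS")          # integrated base data
    ods_history = Store("ODS-history")  # optional history of the ODS
    cdw         = Store("CDW")          # aggregated views feeding the data marts

    chains = {s: source_chain(s) for s in ("S1", "S2", "S3")}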

Figure 1: Logical data warehouse architecture. The figure shows the hierarchy of stores: the sources S1, S2, S3; extraction (Extractor1-3) into the S-deltas; cleaning (Cleaner1-3) into the C-deltas; historisation (Historisation1-3) into the D-histories (the PREPARATION layer); multi-source integration and ODS historisation into the ODS and ODS-history (the INTEGRATION/LOADING layer); and aggregation into the CDW views V1 to V5 (the AGGREGATION/UPDATING layer).

Data warehouse design

The design of a data warehouse consists in the definition of all the meta data describing the data warehouse objects (data stores), and in the definition of the initial creation of the data warehouse stores as well as of their periodic refreshment. This design starts from the OLAP application requirements on the one hand (expressed as conceptual views), and from the set of sources potentially useful for the computation of these views on the other hand. With respect to the DWQ framework, this definitional meta data is structured at three abstraction levels:

- The conceptual level contains the meta data which characterize the usage of the sources (access rights, quality factors, extraction frequency, etc.), the mapping rules between the source models and the relational model, the definition of the cleaning rules, the definition of the integration assertions, indications on histories (time periods, volumes), and the quality factors expected for the views.

- At the logical level, each data warehouse store has its own description. Sources are described in their respective models, while all other data warehouse stores are described in the relational model. The logical schema of the corporate data warehouse is a calculation graph of all the views, defined either on the sources or on other views of the data warehouse.

- The physical level defines the actual implementation of the data warehouse stores.

As stated in the DWQ framework, mappings are of multiple kinds: structural mappings, data mappings, knowledge mappings and requirements mappings. Mappings between the conceptual level and the logical level mainly correspond to quality function deployment, using the house matrix for example. This house matrix transforms quality factors into technical strategies which allow the quality level described by these factors to be achieved. The matrix is refined in several steps and evolves to represent the mappings between the logical level and the physical level. Other mappings, between the sources and the data warehouse perspectives, are represented as queries.

Definition of the refreshment process

The refreshment process aims to propagate changes raised in the data sources to the data warehouse stores. This propagation follows three phases: (i) a preparation phase, (ii) a loading phase and (iii) an aggregation phase. Each phase is composed of several steps handling different tasks (Figure 1). These phases and steps are the same as in the initial construction of the data warehouse, except that the refreshment process is concerned with an incremental management of the updates. Phases and steps are governed by meta data which describes the source and extractor capabilities, the semantics needed by the OLAP applications, and the moments at which the refreshment is relevant to these applications. In the remainder of this section, we describe the different phases of the refreshment process, then we define the different refreshment strategies and the way to plan them.

The preparation phase

The preparation phase is composed of three steps applied to each data source: (1) extraction of data changes from the source, producing the source-delta stores; (2) cleaning of this data, producing the cleaned-deltas; (3) historisation of this data, producing the delta-histories. The second and third steps are optional, depending on the quality and representation of the source data, and on whether or not a history is needed on a data source.

The cleaning and historisation steps can be done in different orders, depending on performance or on the semantics of the cleaning rules. For example, if we have a cleaning rule which discards duplicated data, it makes more sense to apply it after historisation than before. But if we have a cleaning rule which adapts one format to another, it makes sense to apply it after the extraction, or even on the fly during the extraction. This means that at the operational level the cleaning task can be distributed: some cleaning rules are applied before historisation, others after, as the sketch below illustrates.
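The placement of individual cleaning rules can be captured explicitly. The following Python sketch is one possible reading of the example above; the function and tag names are ours, not DWQ vocabulary.

    # Illustrative only: cleaning rules tagged with their placement in the
    # preparation phase, so that format adaptation runs during extraction
    # while duplicate elimination runs after historisation.

    def to_standard_date(record):
        record["date"] = record["date"].replace("/", "-")   # format adaptation
        return record

    def drop_duplicates(records):
        seen, result = set(), []
        for r in records:
            key = tuple(sorted(r.items()))
            if key not in seen:
                seen.add(key)
                result.append(r)
        return result

    CLEANING_PLAN = [
        ("on_extraction",       to_standard_date),  # applied on the fly, per record
        ("after_historisation", drop_duplicates),   # applied once history is built
    ]

    raw = [{"date": "1997/06/17"}, {"date": "1997/06/17"}]
    staged = [to_standard_date(r) for r in raw]
    history = drop_duplicates(staged)   # one record left, date "1997-06-17"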

The integration/loading phase

The loading phase is composed of two steps: (4) multi-source data integration, producing the operational data store (ODS); (5) data warehouse historisation, producing the ODS-history. Step (4) is optional if the data warehouse is built from a single source, or if the sources are independent, i.e., there are no common objects nor common links. Step (5) is optional if no history is needed on the integrated data; this means that no OLAP application is defined on this history.

The integration activity consists roughly in (i) matching the data coming from the different sources, (ii) detecting multi-source inconsistencies with respect to the integration assertions defined at the schema integration level, (iii) transforming and cleaning the data which does not conform to these assertions, and (iv) inserting the integrated data into the resulting relations. The multi-source integration step should be done with respect to a certain scenario depending, for example, on the duration of the preparation phase of each source, on the semantics of the integration rules (apply some rules before others), on performance considerations (perform some intermediate integrations in parallel), etc.

The data warehouse historisation is a necessary activity when there are OLAP applications which operate on statistical samples elaborated over certain periods of time. There might be OLAP applications interested in short-term or long-term prediction, for which both the factual ODS and the ODS-history are of interest.

The aggregation/propagation phase

The aggregation phase can be composed of as many steps as there are intermediate views in the view definition hierarchy. The aggregation phase is nothing other than a recursive view evaluation from a set of operands. At this level, the refreshment process may consist in an incremental evaluation of certain operands, depending on whether these operands are materialized or derived views, and on whether the materialized views are incrementally updated or completely re-evaluated.

2.2. Refreshment tasks that can be modeled by active rules

A refreshment process can be seen as an application program which monitors a set of activities devoted to data extraction, data cleaning, data integration, etc. This refreshment application is composed of a main activity and of the set of activities identified in the previous section (Figure 2). Some of these activities can be implemented using active rules; others will be implemented using classical programming languages. Our current assumption is that data cleaning, data integration and update propagation are typically the activities which can be implemented using active rules. With regard to its monitoring role, the main activity itself can also be implemented using active rules. In this report we restrict our experiment to the three activities of cleaning, integration and propagation, and we show later how to implement them using active rules.
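As a first, deliberately naive illustration of this assumption, an active rule can be reduced to an (event, condition, action) triple, with one rule per active activity. The sketch below uses a hypothetical API; Section 3 defines the actual rule and monitor semantics used in this report.

    # A naive ECA skeleton, only to fix ideas; all names are illustrative.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Rule:
        event: str                       # type of triggering event
        condition: Callable[[dict], bool]
        action: Callable[[dict], None]   # atomic from the monitor's point of view

    rules = [
        Rule("source_delta_ready",  lambda e: True, lambda e: print("clean", e)),
        Rule("cleaned_delta_ready", lambda e: True, lambda e: print("integrate", e)),
        Rule("ods_updated",         lambda e: True, lambda e: print("propagate", e)),
    ]

    def signal(event_type, context):
        """Fire every rule triggered by this event whose condition holds."""
        for r in rules:
            if r.event == event_type and r.condition(context):
                r.action(context)

    signal("source_delta_ready", {"source": "S1"})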

Figure 2: Activities of the refreshment process. A refreshment process is composed of a main activity coordinating the extraction, cleaning, integration and propagation activities.

2.3. Engineering a refreshment process

This section describes an informal methodology which helps in building a refreshment process. It also describes the meta data needed by this process, according to the DWQ framework.

Engineering approach

As shown in the previous section, the definition of a refreshment process is a complex activity which should be organized in several phases and tasks, according to a certain strategy, and using different parameters. Thus, engineering a refreshment process must follow a classical life-cycle for process development (Figure 3), which successively generates a conceptual definition of the refreshment process, its logical specification and its physical implementation. Four activities materialize this engineering process:

- Requirement analysis, which helps to acquire all the knowledge necessary to the definition of the refreshment process.

- Conceptual design, which provides a first definition of the refreshment process as a planning scenario of possible strategies.

- Logical design, which transforms this conceptual scenario into a formal specification in terms of a master algorithm, rules and their execution semantics.

- Physical design, which implements the master algorithm as a master program, and the rules with their semantics as an active monitor driven by events generated by the master program.

The following subsections detail these processes and their corresponding inputs and outputs.

Figure 3: Methodology to define a refreshment process. The engineering process chains requirements analysis (taking as inputs the requirements of the OLAP applications, metadata, source definitions, view definitions, and the quality of sources together with the expected quality), the conceptual definition of the refreshment process, its logical definition (a set of active rules together with their execution semantics) and its physical definition (an operational refreshment process built on the active monitor).

Requirement analysis

This informal activity allows one to:

- identify the views concerned by the refreshment process to be built,
- identify the data sources and their corresponding data extractors,
- identify the quality factors attached to the data sources and those that the data warehouse content must achieve (data quality policy),
- define the quality function deployment, that is, associate with each quality factor the corresponding technical strategy.

Conceptual design

Defining the refreshment process is a complex design task which should consider many parameters related to the refreshment strategy, the source features, the task capabilities and the user needs. The following procedures give an intuition of this definition process.

Refreshment(V1, ..., Vn)
  For each view or set of views to refresh do
    Select the sources on which the views are directly or indirectly defined;
    Choose the refreshment strategy;
    Select the relevant meta data to use;
    Define the refreshment window;
    Plan each refreshment phase;
  end.

PlanningPreparation(S1, ..., Sp)
  For each data source Si do
    Select the corresponding extractor;
    Select the other tasks to perform (i.e., cleaning, historisation);
    Organize these tasks into a sequence;
    Define the events that govern the starting and the progress of these tasks;
  end.

PlanningLoading(S1, ..., Sp)
  Select the integration strategy;
  Identify the synchronization points where the integration starts;
  Define the dynamic planning based on the ends of the source preparations;
  Apply transformations with respect to the integration assertions;
  Possibly perform the historisation task;
end.

PlanningAggregation(V1, ..., Vn)
  Select the materialized views;
  Select the derived views;
  Define the computation dependency graph between views;
  Define a computation strategy for this graph;
end.

Planning a refreshment process means choosing an execution strategy for the component activities and for the whole refreshment process (the main program). Defining a refreshment strategy consists in:

- sequencing the preparation tasks,
- deciding on the level of parallelism between the different source preparations,
- deciding on the history definitions,
- defining the graph of computational dependencies between views,
- deciding on the way to evaluate the views in this graph,
- identifying all the major events that trigger the refreshment activities.

The refreshment process should be planned statically or dynamically, depending on the activities, in order to achieve a certain quality of service, for example a high degree of freshness for the aggregated data, or the propagation of changes in an optimal time (availability). Static planning concerns the selection of the tasks that should be performed on each source or on the data warehouse, and the logical sequencing of these tasks. Dynamic planning concerns the identification of parallel tasks, the definition of the synchronization point before the integration, and the mode of detection of all the events that trigger these tasks. Both static and dynamic planning are necessary to define the refreshment process. This planning is bounded by the data freshness time (i.e., the last data state in which the OLAP application is interested) and the data availability time (the deadline at which the aggregated data is significant to the OLAP application). This interval is called the refreshment window; one possible encoding of such a plan is sketched below.
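As an illustration, the output of the conceptual design (a refreshment window plus static and dynamic planning choices) could be recorded as plain data. All field names in the following Python sketch are assumptions, not DWQ vocabulary.

    # A purely illustrative encoding of a conceptual refreshment plan.

    REFRESHMENT_PLAN = {
        "views": ["V1", "V2"],
        "window": {"freshness_time": "08:00", "availability_time": "09:30"},
        "preparation": {            # static planning: task sequence per source
            "S1": ["extract", "clean", "historise"],
            "S2": ["extract", "historise", "clean"],   # cleaning order may differ
        },
        "loading": {"strategy": "wait_all_sources", "historise_ods": True},
        "aggregation": {"materialized": ["V1"], "evaluation": "incremental"},
        "events": {"start": "daily_08:00", "integrate": "all_deltas_cleaned"},
    }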

Logical design

The logical design of the refreshment process consists in:

- the definition of the main activity algorithm and of the checkpoints where the refreshment activities are invoked,
- the selection of the activities that will be specified using active rules (active activities),
- the definition of the active rules related to each active activity,
- the definition of the semantics of each rule,
- the definition of the algorithms of the non-active activities.

The output of the logical design is thus a set of rules with their semantics (see Section 3 on the expression of this semantics) for the active activities, and a set of algorithmic specifications for the non-active activities. Section 4 shows, through simple examples, how some refreshment activities can be expressed with rules and how the semantics of these rules is specified. The rules are Event-Condition-Action rules. The semantics is the definition of the operational execution of a rule: when and how events are detected and composed, when and how conditions are evaluated, when and how actions are executed, what is the context in which a rule is executed, what is the rule effect, how rules are scheduled, etc. Section 3 gives a more formal definition of these notions. The logical design should provide guidance in the definition of the rules and of their semantics, that is: how to translate the refreshment tasks into active rules, how to derive the semantics of these rules from the strategies defined at the conceptual level, what the events, conditions and actions are, and which event initiates the refreshment process.

Physical design

The physical design of the refreshment process consists in the transformation of the logical rules and their semantics into an operational active program. This can be done using the generic operational active monitor. The components of this generic monitor are instantiated in such a way that they implement the semantics defined for the rules, yielding a dedicated active monitor for data warehouse refreshment. Physical design also consists in the programming of all the non-active activities.

Meta data which governs the refreshment process

The metadata used by the refreshment process concerns the source and data warehouse logical schemas, the source and data warehouse physical schemas, and the corresponding mappings between all of these, including the mappings from the conceptual level. The logical schemas of the sources are uniform representations of the parts of the sources which are relevant to the data warehouse goals (Figure 4).

Besides the source definitions and the data warehouse definition, the refreshment process needs some other meta data, such as the frequency of extraction or integration, the time interval of historisation, and all the time points associated with the activation of the different tasks executed in the different steps. There should be a certain coherence between these parameters, to avoid mismatches between the user needs and the capabilities of the data warehouse system. For example, the integration frequency should be consistent with the extraction frequencies of the multiple sources, as well as with the time intervals of the different histories (source history and data warehouse history).

Figure 4: Position of the refreshment process with respect to the DWQ framework. The figure crosses the process dimension (the DW system architecture, in which the refreshment process is one process among others) with the meta-data dimension (the metabase content: source, enterprise and client models at the conceptual level; source, data warehouse and client schemas at the logical level; source, DW and client data stores at the physical level), together with the source, enterprise and client perspectives.

3. Foundations of an active monitor

As explained before, our goal is to model refreshment activities such as cleaning, integration and updating by means of a set of active rules equipped with their execution model. The first step taken in this section is to formally characterize an execution model with a small number of parameters. These parameters form a hierarchy, as shown by Figure 5. We first present some important assumptions. Second, we describe the types of events supported by a refreshment system. In a third part, the parameters are described in four parts that correspond to the highest classification level in Figure 5. Finally, we give the operational rule model that allows the semantic parameters to be mapped to the behavior of the rules. Our presentation of the semantic parameters and of the operational model is a simplified version of the results presented in [Llir97] and [BFLM97].

3.1. General assumptions

We represent a system supporting active refreshment activities by a master program (master, for short) that generates events for an active monitor, and by a set of rules managed by this active monitor.

The master program

The implementation of the master program depends on the data warehouse application. To illustrate this notion, we present several possible implementations of a master program. A master program can manage a set of alarm clocks that trigger calls to the extraction programs; for instance, one alarm clock can be associated with every source extraction program. Every call to an extraction program returns a source change that is then passed as a primitive event to the active monitor. As another possible implementation, the master can manage persistent queues associated with the sources (one queue per source and one queue for the data warehouse). Every extraction program writes into its queue. The master program can then read the queues in some particular ordering, get a message, and pass it as an event to the active monitor.

An important assumption made by our modeling is that when a master program passes an event to an active monitor, it is interrupted until the processing of this event by the active monitor completes. A second assumption is that an instance of a master program communicates with a single instance of an active monitor. However, a parallel implementation of a master program is also possible. For example, an instance of a master program can be associated with each persistent queue. This means that source changes coming from different sources will be processed independently. Of course, such a solution works only if the application guarantees that source changes can be processed in parallel without causing any conflict.
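The blocking contract between the master and the monitor can be made explicit in code. The following Python sketch is illustrative only (the class and method names are ours); the point is that passing an event is a synchronous call, so the master is suspended until the monitor returns.

    # Sketch of the master/monitor contract stated above.

    class ActiveMonitor:
        def __init__(self, rules):
            self.rules = rules
            self.event_history = []   # record of received events

        def process(self, event):
            """Detect triggered rules and run them; returns only when done."""
            self.event_history.append(event)
            for rule in self.rules:
                rule.react(event)

    class MasterProgram:
        def __init__(self, monitor):
            self.monitor = monitor

        def signal(self, event):
            # The master is interrupted here: the call returns only once
            # the monitor has finished processing the event.
            self.monitor.process(event)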

The rules

The refreshment rules are Event-Condition-Action rules that specify the refreshment activities to undertake when certain situations arise. A situation is described by an event and, possibly, a condition. The action part is reduced to a simple call to an application program whose execution is considered as atomic from the active monitor's point of view. In the following, we describe in more detail the situation part and the context in which the rules are executed.

The events

The events that may trigger the rules are associated with event types and event context types. An event type is described by an identifier (its name) and a possible sequence of formal parameters. An event context type is described by an identifier (its name) and a data structure. An event of type t is specified by an instance of t and an instance of the context type associated with t. Events can be primitive or constructed events. Primitive events correspond to: source changes or data warehouse changes received by the refreshment system, data modification operations generated by the rules within the refreshment system, or data warehouse access operations received by the refreshment system. Primitive events are produced by the master program and by the actions of the rules. The instant of a primitive event e is the time at which e is received by the active monitor. This definition deserves some explanation. Suppose that a data modification occurs at a source. In our modeling, the time at which the event corresponding to this data modification is signaled to the active monitor is the instant of the event. The instant at which this data modification actually occurred in the source can, however, be captured in the context associated with the event (if it is necessary for the refreshment system).

A constructed event type t is defined from a set St of primitive or constructed event types by three components: an identifier (its name), a time interval definition, and a synthesis function. Informally, the time interval is defined over the set E of events of types in St that have been signaled to the active monitor, and returns an interval of time whose bounds correspond to event instants in E. The synthesis function is defined over a time interval I over E, and returns a set of events of type t. All these events have the upper bound of I as their event instant. A constructed event type may have parameters, and it is associated with a context type.

To illustrate the notion of constructed events, consider the following example. Suppose that a region is decomposed into districts. In each district, a database registers local measurements about air pollution and air quality. A global data warehouse is defined to store aggregated data about air quality in the region. Suppose that each source sends its changes every hour to the data warehouse. Then a possible refreshment strategy would be to trigger the refreshment of the data warehouse only when there are at least two districts whose level of pollution (computed on a scale of 5 values) has exceeded level 2 in the last 12 hours. In this case, the time interval function associated with this constructed triggering event is [current_time - 12h; current_time], the synthesis function computes the instances of the constructed event, and a possible context type for this event is the number of districts which have communicated a pollution level greater than or equal to 2.
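The two components of this constructed event type can be rendered directly in code. The following Python sketch is a hedged reading of the example (we count districts whose level strictly exceeds 2); the function and field names are illustrative.

    # The time interval and synthesis function of the pollution example.

    HOUR = 3600

    def time_interval(now):
        """Interval over which the signaled events are considered."""
        return (now - 12 * HOUR, now)

    def synthesis(events, now):
        """Return the constructed events derived from the interval, if any."""
        lo, hi = time_interval(now)
        polluted = {e["district"] for e in events
                    if lo <= e["instant"] <= hi and e["level"] > 2}
        if len(polluted) >= 2:
            # Context: how many districts reported a high pollution level.
            # The event instant is the upper bound of the interval.
            return [{"type": "refresh_region", "instant": hi,
                     "context": {"districts": len(polluted)}}]
        return []

    signaled = [{"district": "D1", "instant": 100, "level": 3},
                {"district": "D2", "instant": 200, "level": 4}]
    print(synthesis(signaled, now=10 * HOUR))   # one constructed event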

Note that the notion of constructed event considered in this report is more general than the notion of composite event, defined from a set of primitive events and a set of logical and temporal operators, as in [CKAK94]. For that particular case, however, one can show how to specify the corresponding time interval function and synthesis function.

3.2. Rule execution semantics

The execution semantics of an active system may be decomposed into a set of dimensions. Each dimension, called a semantic parameter, defines one facet of the rules' behavior (see Figure 5).

Figure 5: Semantic dimensions. The local semantics of a rule comprises its triggering (triggering point, interval, synthesis), its evaluation (evaluation point, evaluation plan) and its execution (execution point, execution plan); the global semantics comprises triggering (triggering point, triggering policy) and selection (selection point, selection policy).

At the first level, we distinguish between the local and the global dimensions. The local dimensions describe the behavior of one rule independently from the other rules. Processing a rule is decomposed into three phases: the triggering phase, where the rule is triggered; the evaluation phase, where the condition of the rule is evaluated; and the execution phase, where the action is executed. Describing the local semantics of a rule r consists in specifying when and how each phase is processed for r. The global dimensions describe the global behavior of a set of rules. Indeed, a refreshment application generally uses several rules; so we have to specify how each phase of the processing of each rule, taken individually, is scheduled with respect to the processing of the other rules. Describing the global semantics of a set of rules consists in specifying when each individual phase has to be processed in the global process. In what follows, we present each component of this description.

The triggering phase

The triggering phase of a rule r produces a set S of events triggering the rule and a set of rule instances of r (one per triggering event in S). Here, we use the standard definition of the notion of rule instance: a rule instance of r associated with an event e is the rule r in which the event part is instantiated with e. In our modeling, we assume that only events having a constructed type are able to trigger the refreshment rules. This assumption allows us to consider all the rules in a uniform manner. It does not jeopardize our semantic model; indeed, any primitive event may easily be seen as a constructed event. The description of the semantic dimensions of the triggering phase of a rule r specifies in what situations the phase may begin and what mechanism is used to produce the triggering events and the rule instances of r.

The situations in which the triggering phase may begin are specified by means of synchronization points called (local) triggering points of r. The local triggering points are points where the master program may be interrupted for executing r; they are produced in message form by the master. As we assume that the refreshment rules are triggered by constructed events, describing the triggering mechanism of r consists in providing the formal specification of the constructed triggering events of the rule. This is achieved by specifying a time interval, called the triggering interval, and a synthesis function. The interval is described by the specification of its bounds. The maximal lower bound of the interval is the time at which the master program began. There are many ways to specify the interval boundaries for a rule r: for example, the time at which a triggering phase of r began, or the moment at which a certain primitive (or constructed) event occurred, are possible interval boundaries. The synthesis function for r considers the events that were received by the master and the active monitor and whose event instant is included in the interval, and derives from them a set of constructed events. The events the function has to consider are specified by their types: they may be primitive events or triggering events of the rule. Every event returned by the function is a triggering event for r that ultimately allows a rule instance of r to be produced.

The semantics of the evaluation and execution phases

The description of the semantic dimensions of the evaluation phase and of the execution phase of a rule r specifies in what situations these phases may begin and what mechanisms are used for evaluating and executing the rule instances of r. The situations in which the evaluation phase may begin are specified by means of local evaluation points of r, and the situations in which the execution phase may begin (provided the condition was satisfied) are specified by means of local execution points of r. The local evaluation and local execution points are synchronization points where the master program may be interrupted for, respectively, evaluating the condition and executing the action of the rule instances of r. These points are produced in message form by the master. The need for specifying evaluation and execution plans for a rule r arises whenever the synthesis function may produce several triggering events for r. The evaluation and execution plans specify how the processing of the corresponding rule instances is monitored; for example, the plans for r may specify that all the instances are processed in parallel. Taken together, these local dimensions can be summarized as a small parameter record, as sketched below.
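The following Python sketch groups the local semantic dimensions of Figure 5 into one record; it describes semantics rather than implementing the monitor, and every field name is an assumption of ours.

    # Illustrative grouping of the local semantic dimensions of one rule.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class LocalSemantics:
        triggering_points: set         # points where triggering may begin
        triggering_interval: Callable  # history -> (lower_bound, upper_bound)
        synthesis: Callable            # events in interval -> constructed events
        evaluation_points: set         # points where conditions may be evaluated
        evaluation_plan: str           # e.g. "parallel" or "sequential" instances
        execution_points: set          # points where actions may start
        execution_plan: str

    # Example instantiation, anticipating the extraction rule of Section 4.
    extraction_rule_semantics = LocalSemantics(
        triggering_points={"end_of_period"},
        triggering_interval=lambda history: (None, None),   # undefined interval
        synthesis=lambda events: [{"type": "TRUE"}],        # exactly one instance
        evaluation_points={"end_of_period"},
        evaluation_plan="parallel",
        execution_points={"end_of_period"},
        execution_plan="parallel",
    )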

The global semantics of a set of rules

The global semantics of a set of rules is described by specifying global synchronization points, a global triggering policy and a global selection policy.

The global synchronization points allow each rule to be synchronized with respect to the other rules. Given a rule r, a global synchronization point for r may be posted at the end of the condition or at the end of the action program of r. Such a point is produced in message form during the processing of the rule instances of r. A global synchronization point p for a rule r is associated with a set of rules, say S. Point p may be a triggering point, an evaluation point, or an execution point; that is, a point where the processing of the rule instances of r may be interrupted for, respectively, triggering, evaluating or executing rules in S. Such a point is described by specifying in what rule the point is posted, what its position in this rule is, and what its set of associated rules is.

It may happen that several rules have the same local triggering point or are attached to the same global triggering point. The triggering policy selects the rules for which the point is actually a triggering point. A triggering policy is described by specifying what mechanism is used to perform the selection.

It may also happen that several rules have the same local evaluation point or the same local execution point, or are attached to the same global evaluation point or the same global execution point. The selection policy has two functions: first, it selects the rules for which the point is actually an evaluation point or an execution point; second, it selects which rule instances of the selected rules have to be evaluated or executed. A selection policy is described by specifying what mechanism is used to perform the selection.

3.3. Operational model of the rules

We now present an operational model of the rules, intended to show how the semantic dimensions described above fully determine the behavior of the rules. In this model, we see the execution of the master program and of the rules it triggers as the execution of a set of tasks governed by a synchronizer. A task corresponds either to the execution of the master program or to the execution of a rule instance. The way the synchronizer composes and schedules the orders sent to the tasks is determined by the semantic dimensions of the refreshment application.

Figure 6: Program and rule execution. Tasks send messages to the synchronizer, and the synchronizer answers with execution orders and task-generation orders.

The interactions between the synchronizer and the tasks are shown in Figure 6. The tasks send messages to the synchronizer and receive orders from the synchronizer. There is no inter-task communication. Every time a task sends a message to the synchronizer, it remains inactive until the synchronizer responds to the message. The synchronizer handles the messages one after another, in the order in which they were sent by the tasks. In reaction to a message, the synchronizer may create new tasks or send orders to inactive tasks; this reaction is determined by the semantic dimensions and by the current state of the event history and the task history. The event history (EH) and the task history (TH) respectively contain all the events received by the active monitor since the beginning of the master program, and all the orders sent by the synchronizer to the tasks since the beginning of the master program.

The tasks: state diagram and messages

We describe a rule instance task T by the state diagram shown in Figure 7.

Figure 7: State diagram of a rule instance. The transitions are: Triggered -(R: begin_rule)-> Evaluating; Evaluating -(S: true)-> Evaluated; Evaluating -(S: false)-> End; Evaluated -(R: begin_action)-> Executing; Executing -(S: end)-> End.

In the diagram, gray ovals, dashed ovals and white ovals respectively represent inactive states (i.e., states where T waits for an order sent by the synchronizer), final states, and active states (i.e., states where T evaluates the condition or executes the action of the associated rule). There are two inactive states (triggered and evaluated), one final state (end), and two active states (evaluating and executing). A transition from a state St to a state St' is noted (St, St'); it is annotated with a label of the form R: m or S: m. Given an arc (St, St'), if St' is an active state, then St is an inactive state and the associated label is of the form R: m, with the following meaning: the task is in state St and expects the order m from the synchronizer; when the task receives this order, its state changes from St to St'. Conversely, if St is an active state, then St' is an inactive (or final) state and the associated label is of the form S: m, with the following meaning: the task is in state St, and enters state St' by sending the message m to the synchronizer.

The triggered state is the initial state of task T: it is the state of every rule instance task when it is created by the synchronizer. From this state, the task waits for the begin_rule order. When it receives this order, T enters the active state evaluating, where the condition is evaluated; T then sends a message to the synchronizer to report the evaluation result. The following state of T depends on this result. If the condition is not satisfied, T signals the end of its execution to the synchronizer (message false) and enters the final state end, where the execution of T is abandoned. If, on the contrary, the condition is satisfied, T signals this result to the synchronizer (message true) and enters the inactive state evaluated, where it waits for the begin_action order from the synchronizer. When T receives this order, it enters the active state executing, where the action is executed. At the end of the action execution, T enters the end state and signals to the synchronizer that the execution is done (message end).

Figure 8: State diagram of the master program. The transitions are: Executing -(S: interrupt)-> Interrupted; Interrupted -(R: continue)-> Executing.

The state diagram of the master program is shown in Figure 8. The program may be executing, or interrupted for executing the rules. At certain points of its execution, the master sends the message interrupt to the synchronizer. This message signals a local synchronization point of the rules. On receipt of the message, the synchronizer monitors the execution of the rules. The master is interrupted until the synchronizer sends the continue order. An important assumption of our operational model is that the master program is never in the executing state when some rule instance task is active.

The event history and the task history

The execution of a master program that triggers active rules may be traced using two histories: the event history and the task history.

The event history (EH) contains all the events produced by the master program and by the rule action programs. For every event, the event instant is the time at which the event was registered in EH. We impose two constraints. First, all the events produced by executing the action program of a rule instance must be registered in the event history before the rule instance signals to the synchronizer that the execution is done (message end). Second, all the events produced since the beginning of the master program must have been registered in the event history before the master program sends the interrupt message to the synchronizer.

The task history (TH) records the messages and the orders reporting all the state changes of the tasks during the execution of the master program and of the triggered rules. It also records the orders given by the synchronizer to create new rule instances. Each message contained in the task history mentions which task sent the message and at what time the synchronizer took the message into account. Each order mentions which task received the order and at what time the synchronizer sent the order. Each creation order mentions which task was created and at what time the synchronizer ordered its creation.

The synchronizer

The synchronizer handles the messages coming from the tasks one after another. The kind of order it sends in response to a given message is defined by applying the semantic dimensions. To do so, it uses the information provided by the histories EH and TH. Indeed, each semantic dimension may be expressed as a set of formulas over EH and TH (see [BFLM97] for more details).

The synchronizer behavior is described by the algorithm shown in Figure 9. When the synchronizer receives a message, it operates in two steps. The first step is dedicated to creating new rule instance tasks, while the second step computes the orders that have to be sent to the existing tasks. To create new tasks, the synchronizer computes in D the rules that have reached a triggering point; to do so, it uses the specification of the local and global triggering points and the current state of the task history. Nevertheless, a rule having reached a triggering point is not necessarily triggered, and the synchronizer selects, among the possibly triggered rules, those that have to be ultimately triggered. To do so, it applies the global triggering policy and puts the result in Rdec. For each selected rule r, the synchronizer computes a set of rule instances of r by executing the triggering phase of the rule as specified in the local semantics of r. Finally, for each rule instance produced, the synchronizer orders the creation of the associated task; this order is registered in the task history. To compute the orders that have to be sent to the tasks, the synchronizer computes in Exec the rules that have reached an evaluation or an execution point; to do so, it uses the specification of the evaluation and execution points. Then the synchronizer uses the selection policy to select tasks among the instances of the rules in Exec, and sends the appropriate order to the selected tasks. If there is neither a selected task nor an active task, the synchronizer sends the continue order to the master program.

Synchronizer algorithm
input: a message m, and T the task that sent m

  let EH and TH denote the current state of the event history and the task history;

  step 1:
    let D be the set of rules having reached a triggering point;
    if D is not empty then
      let Rdec denote the set of rules of D that have to be triggered;
      for each rule r in Rdec do
        compute in Inst_r the set of rule instances by applying the triggering phase of r;
        for each rule instance in Inst_r do
          create a rule instance task;

  step 2:
    let Exec be the set of rules having reached an evaluation point or an execution point;
    let TS be the set of rule instances defined as follows: a task T' is in TS if T' is an
      instance of some rule r contained in Exec, and T' may be selected with respect to both
      the global selection semantics and the evaluation plan of r;
    if TS is empty and there is no task in the evaluating state or in the executing state then
      send the order continue to the master task
    else
      for each task T' in TS do
        send the appropriate order to T' with respect to the current state of T';

Figure 9: Synchronizer algorithm
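The rule-instance state diagram of Figure 7, which underlies the orders and messages exchanged in this algorithm, can also be read as a transition table. The following Python sketch is illustrative only and is not part of the formal model of [BFLM97].

    # The state diagram of Figure 7 as a transition table:
    # (state, stimulus) -> next state. "R:" stimuli are orders received
    # from the synchronizer; "S:" stimuli are messages the task sends.

    TRANSITIONS = {
        ("triggered",  "R:begin_rule"):   "evaluating",
        ("evaluating", "S:true"):         "evaluated",
        ("evaluating", "S:false"):        "end",
        ("evaluated",  "R:begin_action"): "executing",
        ("executing",  "S:end"):          "end",
    }

    INACTIVE = {"triggered", "evaluated"}   # the task waits for an order
    ACTIVE   = {"evaluating", "executing"}  # the task works, then sends a message

    def step(state, stimulus):
        return TRANSITIONS[(state, stimulus)]

    assert step("triggered", "R:begin_rule") == "evaluating"
    assert step("evaluating", "S:false") == "end"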

4. How to instantiate a generic active monitor to get an active application

In this section we present an example of a refreshment strategy and we show how it can be expressed using active rules and the semantic parameters presented in the previous section. As described in Section 2.2, we consider that a refreshment process consists of four main activities: the extraction activity, the cleaning activity, the integration activity and the propagation activity. We thus choose a strategy for each activity and show how these strategies can be expressed using rules and semantic parameters.

Activity 1: an example of an extraction strategy

A very simple strategy consists in periodically extracting data from the sources. We associate an extraction program P_S and a time period T_S with each source S. The extraction program P_S produces a source-delta D_S which contains the changes in the source S since its previous execution. We model such a strategy by associating with each source S an active rule r_S whose condition is always true and whose action executes the extraction program P_S and generates an event E_S with a context equal to D_S. The triggering and execution points of r_S correspond to the end of each time period T_S. The synthesis function returns the predefined event TRUE with no associated context, and the triggering interval is undefined (i.e., the triggering phase always triggers one and only one instance of the rule r_S, and no input parameter is passed to the action of r_S). If a same clock tick corresponds to the end of several time periods T_S1, ..., T_Sn, we need a global triggering policy and a global selection policy to select the rules in {r_S1, ..., r_Sn} that are effectively triggered and executed. Intuitively, since each rule is attached to a different source, all the rules can be triggered at the same time and executed in parallel.

Activity 2: an example of a cleaning strategy

A simple strategy consists in periodically cleaning the data that is extracted from the sources. We associate a cleaning program C_S, a time period Tc_S and a function f_S with each source S. We model the cleaning of a source S by a rule rc_S whose condition is always true and whose action executes the cleaning program C_S. The triggering and execution points of rc_S correspond to the end of each time period Tc_S. The upper and lower bounds of the triggering interval I_S are respectively the end and the beginning of the last time period. The synthesis function is a function of the form f_S ∘ select_S, where select_S is a function that selects in I_S all the source-deltas associated with the source S. The function f_S eliminates from these deltas all the redundant or useless information and computes a data structure that can be used as an input parameter to the cleaning program C_S. C_S produces a set of «cleaned events» that convert this information into a format readable by the integration process. If a same clock tick corresponds to the end of several time periods Tc_S1, ..., Tc_Sn, we need a global triggering policy and a global selection policy to select the rules in {rc_S1, ..., rc_Sn} that are effectively triggered and executed. Intuitively, since each rule is attached to a different source, all the rules can be triggered at the same time and executed in parallel.

Activity 3: an example of an integration strategy

We consider an integration strategy that computes the content of the operational data store (we assume that there is no ODS-history). Let S_ODS be the schema of the ODS. We assume that there is no integrity constraint involving more than one relation. In such a case, each relation of S_ODS can be computed independently from the other relations. Let S_R be the set of sources that allow the relation R to be computed, and let P_R be the integration program that computes the new value of R given a set of cleaned events associated with the sources of S_R and the old value of R. We adopt a strategy that applies P_R each time a sufficient amount of untreated cleaned events has been produced. We model the integration of a relation R by a rule r_R whose condition is always true and whose action executes the integration program P_R.
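To close the loop with the naive skeleton of Section 2.2, the extraction rule r_S of Activity 1 could be written as follows. This Python sketch is illustrative only: extract_S stands for the extraction program P_S, and the event and parameter names are placeholders of ours.

    # A hedged sketch of the extraction rule r_S of Activity 1.

    def extract_S():
        """Return the source-delta D_S since the previous execution."""
        return [{"op": "insert", "row": ("a", 1)}]   # placeholder changes

    def make_extraction_rule(signal, period_end_event="end_of_period_S"):
        def action(ctx):
            delta = extract_S()
            # Generate the event E_S whose context is the source-delta D_S.
            signal("E_S", {"delta": delta})
        return {
            "event": period_end_event,      # triggering point: end of period T_S
            "condition": lambda ctx: True,  # condition always true
            "action": action,
        }

    rule_S = make_extraction_rule(signal=lambda ev, ctx: print(ev, ctx))
    rule_S["action"]({})   # simulate the end of one period T_S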


More information

Recommended Practice for Software Requirements Specifications (IEEE)

Recommended Practice for Software Requirements Specifications (IEEE) Recommended Practice for Software Requirements Specifications (IEEE) Author: John Doe Revision: 29/Dec/11 Abstract: The content and qualities of a good software requirements specification (SRS) are described

More information

is easing the creation of new ontologies by promoting the reuse of existing ones and automating, as much as possible, the entire ontology

is easing the creation of new ontologies by promoting the reuse of existing ones and automating, as much as possible, the entire ontology Preface The idea of improving software quality through reuse is not new. After all, if software works and is needed, just reuse it. What is new and evolving is the idea of relative validation through testing

More information

Enterprise Architect. User Guide Series. Time Aware Models. Author: Sparx Systems. Date: 30/06/2017. Version: 1.0 CREATED WITH

Enterprise Architect. User Guide Series. Time Aware Models. Author: Sparx Systems. Date: 30/06/2017. Version: 1.0 CREATED WITH Enterprise Architect User Guide Series Time Aware Models Author: Sparx Systems Date: 30/06/2017 Version: 1.0 CREATED WITH Table of Contents Time Aware Models 3 Clone Structure as New Version 5 Clone Diagram

More information

CS SOFTWARE ENGINEERING QUESTION BANK SIXTEEN MARKS

CS SOFTWARE ENGINEERING QUESTION BANK SIXTEEN MARKS DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING CS 6403 - SOFTWARE ENGINEERING QUESTION BANK SIXTEEN MARKS 1. Explain iterative waterfall and spiral model for software life cycle and various activities

More information

A PRIMITIVE EXECUTION MODEL FOR HETEROGENEOUS MODELING

A PRIMITIVE EXECUTION MODEL FOR HETEROGENEOUS MODELING A PRIMITIVE EXECUTION MODEL FOR HETEROGENEOUS MODELING Frédéric Boulanger Supélec Département Informatique, 3 rue Joliot-Curie, 91192 Gif-sur-Yvette cedex, France Email: Frederic.Boulanger@supelec.fr Guy

More information

Data Warehousing. Ritham Vashisht, Sukhdeep Kaur and Shobti Saini

Data Warehousing. Ritham Vashisht, Sukhdeep Kaur and Shobti Saini Advance in Electronic and Electric Engineering. ISSN 2231-1297, Volume 3, Number 6 (2013), pp. 669-674 Research India Publications http://www.ripublication.com/aeee.htm Data Warehousing Ritham Vashisht,

More information

Handout 9: Imperative Programs and State

Handout 9: Imperative Programs and State 06-02552 Princ. of Progr. Languages (and Extended ) The University of Birmingham Spring Semester 2016-17 School of Computer Science c Uday Reddy2016-17 Handout 9: Imperative Programs and State Imperative

More information

Data Warehouse. Asst.Prof.Dr. Pattarachai Lalitrojwong

Data Warehouse. Asst.Prof.Dr. Pattarachai Lalitrojwong Data Warehouse Asst.Prof.Dr. Pattarachai Lalitrojwong Faculty of Information Technology King Mongkut s Institute of Technology Ladkrabang Bangkok 10520 pattarachai@it.kmitl.ac.th The Evolution of Data

More information

2.0.3 attributes: A named property of a class that describes the range of values that the class or its instances (i.e., objects) may hold.

2.0.3 attributes: A named property of a class that describes the range of values that the class or its instances (i.e., objects) may hold. T0/04-023 revision 2 Date: September 06, 2005 To: T0 Committee (SCSI) From: George Penokie (IBM/Tivoli) Subject: SAM-4: Converting to UML part Overview The current SCSI architecture follows no particular

More information

Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language

Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language Dong Han and Kilian Stoffel Information Management Institute, University of Neuchâtel Pierre-à-Mazel 7, CH-2000 Neuchâtel,

More information

Minsoo Ryu. College of Information and Communications Hanyang University.

Minsoo Ryu. College of Information and Communications Hanyang University. Software Reuse and Component-Based Software Engineering Minsoo Ryu College of Information and Communications Hanyang University msryu@hanyang.ac.kr Software Reuse Contents Components CBSE (Component-Based

More information

10. Replication. CSEP 545 Transaction Processing Philip A. Bernstein Sameh Elnikety. Copyright 2012 Philip A. Bernstein

10. Replication. CSEP 545 Transaction Processing Philip A. Bernstein Sameh Elnikety. Copyright 2012 Philip A. Bernstein 10. Replication CSEP 545 Transaction Processing Philip A. Bernstein Sameh Elnikety Copyright 2012 Philip A. Bernstein 1 Outline 1. Introduction 2. Primary-Copy Replication 3. Multi-Master Replication 4.

More information

Automatic Reconstruction of the Underlying Interaction Design of Web Applications

Automatic Reconstruction of the Underlying Interaction Design of Web Applications Automatic Reconstruction of the Underlying Interaction Design of Web Applications L.Paganelli, F.Paternò C.N.R., Pisa Via G.Moruzzi 1 {laila.paganelli, fabio.paterno}@cnuce.cnr.it ABSTRACT In this paper

More information

21. Document Component Design

21. Document Component Design Page 1 of 17 1. Plan for Today's Lecture Methods for identifying aggregate components 21. Document Component Design Bob Glushko (glushko@sims.berkeley.edu) Document Engineering (IS 243) - 11 April 2005

More information

Petri Nets. Petri Nets. Petri Net Example. Systems are specified as a directed bipartite graph. The two kinds of nodes in the graph:

Petri Nets. Petri Nets. Petri Net Example. Systems are specified as a directed bipartite graph. The two kinds of nodes in the graph: System Design&Methodologies Fö - 1 System Design&Methodologies Fö - 2 Petri Nets 1. Basic Petri Net Model 2. Properties and Analysis of Petri Nets 3. Extended Petri Net Models Petri Nets Systems are specified

More information

Data Warehousing ETL. Esteban Zimányi Slides by Toon Calders

Data Warehousing ETL. Esteban Zimányi Slides by Toon Calders Data Warehousing ETL Esteban Zimányi ezimanyi@ulb.ac.be Slides by Toon Calders 1 Overview Picture other sources Metadata Monitor & Integrator OLAP Server Analysis Operational DBs Extract Transform Load

More information

Towards Quality-Oriented Data Warehouse Usage and Evolution

Towards Quality-Oriented Data Warehouse Usage and Evolution Towards Quality-Oriented Data Warehouse Usage and Evolution 2 Panos Vassiliadis 1, Mokrane Bouzeghoub 2, Christoph Quix 3 1 National Technical University of Athens, Greece, pvassil@dbnet.ece.ntua.gr University

More information

Data Mining. Asso. Profe. Dr. Raed Ibraheem Hamed. University of Human Development, College of Science and Technology Department of CS (1)

Data Mining. Asso. Profe. Dr. Raed Ibraheem Hamed. University of Human Development, College of Science and Technology Department of CS (1) Data Mining Asso. Profe. Dr. Raed Ibraheem Hamed University of Human Development, College of Science and Technology Department of CS 2016 2017 (1) Points to Cover Problem: Heterogeneous Information Sources

More information

Databases and Database Systems

Databases and Database Systems Page 1 of 6 Databases and Database Systems 9.1 INTRODUCTION: A database can be summarily described as a repository for data. This makes clear that building databases is really a continuation of a human

More information

TIMES A Tool for Modelling and Implementation of Embedded Systems

TIMES A Tool for Modelling and Implementation of Embedded Systems TIMES A Tool for Modelling and Implementation of Embedded Systems Tobias Amnell, Elena Fersman, Leonid Mokrushin, Paul Pettersson, and Wang Yi Uppsala University, Sweden. {tobiasa,elenaf,leom,paupet,yi}@docs.uu.se.

More information

White Paper: VANTIQ Digital Twin Architecture

White Paper: VANTIQ Digital Twin Architecture Vantiq White Paper www.vantiq.com White Paper: VANTIQ Digital Twin Architecture By Paul Butterworth November 2017 TABLE OF CONTENTS Introduction... 3 Digital Twins... 3 Definition... 3 Examples... 5 Logical

More information

Foundation of Contract for Things

Foundation of Contract for Things Foundation of Contract for Things C.Sofronis, O.Ferrante, A.Ferrari, L.Mangeruca ALES S.r.l. Rome The Internet of System Engineering INCOSE-IL Seminar, Herzliya, Israel 15 September, 2011 Software Platform

More information

Collage: A Declarative Programming Model for Compositional Development and Evolution of Cross-Organizational Applications

Collage: A Declarative Programming Model for Compositional Development and Evolution of Cross-Organizational Applications Collage: A Declarative Programming Model for Compositional Development and Evolution of Cross-Organizational Applications Bruce Lucas, IBM T J Watson Research Center (bdlucas@us.ibm.com) Charles F Wiecha,

More information

Estimating the Quality of Databases

Estimating the Quality of Databases Estimating the Quality of Databases Ami Motro Igor Rakov George Mason University May 1998 1 Outline: 1. Introduction 2. Simple quality estimation 3. Refined quality estimation 4. Computing the quality

More information

[MS-WSUSOD]: Windows Server Update Services Protocols Overview. Intellectual Property Rights Notice for Open Specifications Documentation

[MS-WSUSOD]: Windows Server Update Services Protocols Overview. Intellectual Property Rights Notice for Open Specifications Documentation [MS-WSUSOD]: Intellectual Property Rights Notice for Open Specifications Documentation Technical Documentation. Microsoft publishes Open Specifications documentation ( this documentation ) for protocols,

More information

6.001 Notes: Section 4.1

6.001 Notes: Section 4.1 6.001 Notes: Section 4.1 Slide 4.1.1 In this lecture, we are going to take a careful look at the kinds of procedures we can build. We will first go back to look very carefully at the substitution model,

More information

Analysis of BPMN Models

Analysis of BPMN Models Analysis of BPMN Models Addis Gebremichael addisalemayehu.gebremichael@student.uantwerpen.be Abstract The Business Process Modeling Notation (BPMN) is a standard notation for capturing business processes,

More information

Component-Based Software Engineering TIP

Component-Based Software Engineering TIP Component-Based Software Engineering TIP X LIU, School of Computing, Napier University This chapter will present a complete picture of how to develop software systems with components and system integration.

More information

UML-Based Conceptual Modeling of Pattern-Bases

UML-Based Conceptual Modeling of Pattern-Bases UML-Based Conceptual Modeling of Pattern-Bases Stefano Rizzi DEIS - University of Bologna Viale Risorgimento, 2 40136 Bologna - Italy srizzi@deis.unibo.it Abstract. The concept of pattern, meant as an

More information

Duration: 5 Days. EZY Intellect Pte. Ltd.,

Duration: 5 Days. EZY Intellect Pte. Ltd., Implementing a SQL Data Warehouse Duration: 5 Days Course Code: 20767A Course review About this course This 5-day instructor led course describes how to implement a data warehouse platform to support a

More information

Introduction to and Aims of the Project : Infocamere and Data Warehousing

Introduction to and Aims of the Project : Infocamere and Data Warehousing Introduction to and Aims of the Project : Infocamere and Data Warehousing Some Background Information Infocamere is the Italian Chambers of Commerce Consortium for Information Technology and as such it

More information

Overview of Reporting in the Business Information Warehouse

Overview of Reporting in the Business Information Warehouse Overview of Reporting in the Business Information Warehouse Contents What Is the Business Information Warehouse?...2 Business Information Warehouse Architecture: An Overview...2 Business Information Warehouse

More information

The International Intelligent Network (IN)

The International Intelligent Network (IN) The International Intelligent Network (IN) Definition In an intelligent network (IN), the logic for controlling telecommunications services migrates from traditional switching points to computer-based,

More information

Editor. Analyser XML. Scheduler. generator. Code Generator Code. Scheduler. Analyser. Simulator. Controller Synthesizer.

Editor. Analyser XML. Scheduler. generator. Code Generator Code. Scheduler. Analyser. Simulator. Controller Synthesizer. TIMES - A Tool for Modelling and Implementation of Embedded Systems Tobias Amnell, Elena Fersman, Leonid Mokrushin, Paul Pettersson, and Wang Yi? Uppsala University, Sweden Abstract. Times is a new modelling,

More information

Comparative Analysis of Architectural Views Based on UML

Comparative Analysis of Architectural Views Based on UML Electronic Notes in Theoretical Computer Science 65 No. 4 (2002) URL: http://www.elsevier.nl/locate/entcs/volume65.html 12 pages Comparative Analysis of Architectural Views Based on UML Lyrene Fernandes

More information

A Tutorial on Agent Based Software Engineering

A Tutorial on Agent Based Software Engineering A tutorial report for SENG 609.22 Agent Based Software Engineering Course Instructor: Dr. Behrouz H. Far A Tutorial on Agent Based Software Engineering Qun Zhou December, 2002 Abstract Agent oriented software

More information

UML- a Brief Look UML and the Process

UML- a Brief Look UML and the Process UML- a Brief Look UML grew out of great variety of ways Design and develop object-oriented models and designs By mid 1990s Number of credible approaches reduced to three Work further developed and refined

More information

Fig 1.2: Relationship between DW, ODS and OLTP Systems

Fig 1.2: Relationship between DW, ODS and OLTP Systems 1.4 DATA WAREHOUSES Data warehousing is a process for assembling and managing data from various sources for the purpose of gaining a single detailed view of an enterprise. Although there are several definitions

More information

Data Modeling and Databases Ch 14: Data Replication. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich

Data Modeling and Databases Ch 14: Data Replication. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich Data Modeling and Databases Ch 14: Data Replication Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich Database Replication What is database replication The advantages of

More information

BUSINESS REQUIREMENTS SPECIFICATION (BRS) Documentation Template

BUSINESS REQUIREMENTS SPECIFICATION (BRS) Documentation Template BUSINESS REQUIREMENTS SPECIFICATION (BRS) Documentation Template Approved UN/CEFACT Forum Bonn 2004-03-09 Version: 1 Release: 5 Table of Contents 1 REFERENCE DOCUMENTS...3 1.1 CEFACT/TMWG/N090R10 UN/CEFACTS

More information

CSc33200: Operating Systems, CS-CCNY, Fall 2003 Jinzhong Niu December 10, Review

CSc33200: Operating Systems, CS-CCNY, Fall 2003 Jinzhong Niu December 10, Review CSc33200: Operating Systems, CS-CCNY, Fall 2003 Jinzhong Niu December 10, 2003 Review 1 Overview 1.1 The definition, objectives and evolution of operating system An operating system exploits and manages

More information

Uncertain Data Models

Uncertain Data Models Uncertain Data Models Christoph Koch EPFL Dan Olteanu University of Oxford SYNOMYMS data models for incomplete information, probabilistic data models, representation systems DEFINITION An uncertain data

More information

Chapter 4. Capturing the Requirements. 4th Edition. Shari L. Pfleeger Joanne M. Atlee

Chapter 4. Capturing the Requirements. 4th Edition. Shari L. Pfleeger Joanne M. Atlee Chapter 4 Capturing the Requirements Shari L. Pfleeger Joanne M. Atlee 4th Edition It is important to have standard notations for modeling, documenting, and communicating decisions Modeling helps us to

More information

Building a Data Warehouse step by step

Building a Data Warehouse step by step Informatica Economică, nr. 2 (42)/2007 83 Building a Data Warehouse step by step Manole VELICANU, Academy of Economic Studies, Bucharest Gheorghe MATEI, Romanian Commercial Bank Data warehouses have been

More information

Applying Experiences with Declarative Codifications of Software Architectures on COD

Applying Experiences with Declarative Codifications of Software Architectures on COD Applying Experiences with Declarative Codifications of Software Architectures on COD Position Paper Roel Wuyts Stéphane Ducasse Gabriela Arévalo roel.wuyts@iam.unibe.ch ducasse@iam.unibe.ch arevalo@iam.unibe.ch

More information

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda Agenda Oracle9i Warehouse Review Dulcian, Inc. Oracle9i Server OLAP Server Analytical SQL Mining ETL Infrastructure 9i Warehouse Builder Oracle 9i Server Overview E-Business Intelligence Platform 9i Server:

More information

After completing this course, participants will be able to:

After completing this course, participants will be able to: Designing a Business Intelligence Solution by Using Microsoft SQL Server 2008 T h i s f i v e - d a y i n s t r u c t o r - l e d c o u r s e p r o v i d e s i n - d e p t h k n o w l e d g e o n d e s

More information

A Case Study for HRT-UML

A Case Study for HRT-UML A Case Study for HRT-UML Massimo D Alessandro, Silvia Mazzini, Francesco Donati Intecs HRT, Via L. Gereschi 32, I-56127 Pisa, Italy Silvia.Mazzini@pisa.intecs.it Abstract The Hard-Real-Time Unified Modelling

More information

Architectural Design

Architectural Design Architectural Design Topics i. Architectural design decisions ii. Architectural views iii. Architectural patterns iv. Application architectures PART 1 ARCHITECTURAL DESIGN DECISIONS Recap on SDLC Phases

More information

3.7 Denotational Semantics

3.7 Denotational Semantics 3.7 Denotational Semantics Denotational semantics, also known as fixed-point semantics, associates to each programming language construct a well-defined and rigorously understood mathematical object. These

More information

HYBRID PETRI NET MODEL BASED DECISION SUPPORT SYSTEM. Janetta Culita, Simona Caramihai, Calin Munteanu

HYBRID PETRI NET MODEL BASED DECISION SUPPORT SYSTEM. Janetta Culita, Simona Caramihai, Calin Munteanu HYBRID PETRI NET MODEL BASED DECISION SUPPORT SYSTEM Janetta Culita, Simona Caramihai, Calin Munteanu Politehnica University of Bucharest Dept. of Automatic Control and Computer Science E-mail: jculita@yahoo.com,

More information

Grid Computing Systems: A Survey and Taxonomy

Grid Computing Systems: A Survey and Taxonomy Grid Computing Systems: A Survey and Taxonomy Material for this lecture from: A Survey and Taxonomy of Resource Management Systems for Grid Computing Systems, K. Krauter, R. Buyya, M. Maheswaran, CS Technical

More information

Training 24x7 DBA Support Staffing. MCSA:SQL 2016 Business Intelligence Development. Implementing an SQL Data Warehouse. (40 Hours) Exam

Training 24x7 DBA Support Staffing. MCSA:SQL 2016 Business Intelligence Development. Implementing an SQL Data Warehouse. (40 Hours) Exam MCSA:SQL 2016 Business Intelligence Development Implementing an SQL Data Warehouse (40 Hours) Exam 70-767 Prerequisites At least 2 years experience of working with relational databases, including: Designing

More information

TDWI Data Modeling. Data Analysis and Design for BI and Data Warehousing Systems

TDWI Data Modeling. Data Analysis and Design for BI and Data Warehousing Systems Data Analysis and Design for BI and Data Warehousing Systems Previews of TDWI course books offer an opportunity to see the quality of our material and help you to select the courses that best fit your

More information

Conceptual Model for a Software Maintenance Environment

Conceptual Model for a Software Maintenance Environment Conceptual Model for a Software Environment Miriam. A. M. Capretz Software Engineering Lab School of Computer Science & Engineering University of Aizu Aizu-Wakamatsu City Fukushima, 965-80 Japan phone:

More information

junit RV Adding Runtime Verification to junit

junit RV Adding Runtime Verification to junit junit RV Adding Runtime Verification to junit Normann Decker, Martin Leucker, and Daniel Thoma Institute for Software Engineering and Programming Languages Universität zu Lübeck, Germany {decker, leucker,

More information

Web Services Annotation and Reasoning

Web Services Annotation and Reasoning Web Services Annotation and Reasoning, W3C Workshop on Frameworks for Semantics in Web Services Web Services Annotation and Reasoning Peter Graubmann, Evelyn Pfeuffer, Mikhail Roshchin Siemens AG, Corporate

More information

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 1 Database Systems

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 1 Database Systems Database Systems: Design, Implementation, and Management Tenth Edition Chapter 1 Database Systems Objectives In this chapter, you will learn: The difference between data and information What a database

More information

Embedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institute of Technology, Delhi

Embedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institute of Technology, Delhi Embedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institute of Technology, Delhi Lecture - 13 Virtual memory and memory management unit In the last class, we had discussed

More information

Design patterns of database models as storage systems for experimental information in solving research problems

Design patterns of database models as storage systems for experimental information in solving research problems Design patterns of database models as storage systems for experimental information in solving research problems D.E. Yablokov 1 1 Samara National Research University, 34 Moskovskoe Shosse, 443086, Samara,

More information

EXTENDING THE PRIORITY CEILING PROTOCOL USING READ/WRITE AFFECTED SETS MICHAEL A. SQUADRITO A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE

EXTENDING THE PRIORITY CEILING PROTOCOL USING READ/WRITE AFFECTED SETS MICHAEL A. SQUADRITO A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE EXTENDING THE PRIORITY CEILING PROTOCOL USING READ/WRITE AFFECTED SETS BY MICHAEL A. SQUADRITO A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN COMPUTER

More information

11. Architecture of Database Systems

11. Architecture of Database Systems 11. Architecture of Database Systems 11.1 Introduction Software systems generally have an architecture, ie. possessing of a structure (form) and organisation (function). The former describes identifiable

More information

Data Warehousing and OLAP Technologies for Decision-Making Process

Data Warehousing and OLAP Technologies for Decision-Making Process Data Warehousing and OLAP Technologies for Decision-Making Process Hiren H Darji Asst. Prof in Anand Institute of Information Science,Anand Abstract Data warehousing and on-line analytical processing (OLAP)

More information

Configuration Management in the STAR Framework *

Configuration Management in the STAR Framework * 3 Configuration Management in the STAR Framework * Helena G. Ribeiro, Flavio R. Wagner, Lia G. Golendziner Universidade Federal do Rio Grande do SuI, Instituto de Informatica Caixa Postal 15064, 91501-970

More information

5/9/2014. Recall the design process. Lecture 1. Establishing the overall structureof a software system. Topics covered

5/9/2014. Recall the design process. Lecture 1. Establishing the overall structureof a software system. Topics covered Topics covered Chapter 6 Architectural Design Architectural design decisions Architectural views Architectural patterns Application architectures Lecture 1 1 2 Software architecture The design process

More information

1 Executive Overview The Benefits and Objectives of BPDM

1 Executive Overview The Benefits and Objectives of BPDM 1 Executive Overview The Benefits and Objectives of BPDM This is an excerpt from the Final Submission BPDM document posted to OMG members on November 13 th 2006. The full version of the specification will

More information

Course on Database Design Carlo Batini University of Milano Bicocca

Course on Database Design Carlo Batini University of Milano Bicocca Course on Database Design Carlo Batini University of Milano Bicocca 1 Carlo Batini, 2015 This work is licensed under the Creative Commons Attribution NonCommercial NoDerivatives 4.0 International License.

More information

Data Models: The Center of the Business Information Systems Universe

Data Models: The Center of the Business Information Systems Universe Data s: The Center of the Business Information Systems Universe Whitemarsh Information Systems Corporation 2008 Althea Lane Bowie, Maryland 20716 Tele: 301-249-1142 Email: Whitemarsh@wiscorp.com Web: www.wiscorp.com

More information

Data warehouse architecture consists of the following interconnected layers:

Data warehouse architecture consists of the following interconnected layers: Architecture, in the Data warehousing world, is the concept and design of the data base and technologies that are used to load the data. A good architecture will enable scalability, high performance and

More information

INCREMENTAL SOFTWARE CONSTRUCTION WITH REFINEMENT DIAGRAMS

INCREMENTAL SOFTWARE CONSTRUCTION WITH REFINEMENT DIAGRAMS INCREMENTAL SOFTWARE CONSTRUCTION WITH REFINEMENT DIAGRAMS Ralph-Johan Back Abo Akademi University July 6, 2006 Home page: www.abo.fi/~backrj Research / Current research / Incremental Software Construction

More information

Describing the architecture: Creating and Using Architectural Description Languages (ADLs): What are the attributes and R-forms?

Describing the architecture: Creating and Using Architectural Description Languages (ADLs): What are the attributes and R-forms? Describing the architecture: Creating and Using Architectural Description Languages (ADLs): What are the attributes and R-forms? CIS 8690 Enterprise Architectures Duane Truex, 2013 Cognitive Map of 8090

More information

Generalized Document Data Model for Integrating Autonomous Applications

Generalized Document Data Model for Integrating Autonomous Applications 6 th International Conference on Applied Informatics Eger, Hungary, January 27 31, 2004. Generalized Document Data Model for Integrating Autonomous Applications Zsolt Hernáth, Zoltán Vincellér Abstract

More information

Data Mining: Approach Towards The Accuracy Using Teradata!

Data Mining: Approach Towards The Accuracy Using Teradata! Data Mining: Approach Towards The Accuracy Using Teradata! Shubhangi Pharande Department of MCA NBNSSOCS,Sinhgad Institute Simantini Nalawade Department of MCA NBNSSOCS,Sinhgad Institute Ajay Nalawade

More information

Architectural Design

Architectural Design Architectural Design Topics i. Architectural design decisions ii. Architectural views iii. Architectural patterns iv. Application architectures Chapter 6 Architectural design 2 PART 1 ARCHITECTURAL DESIGN

More information

Joint Entity Resolution

Joint Entity Resolution Joint Entity Resolution Steven Euijong Whang, Hector Garcia-Molina Computer Science Department, Stanford University 353 Serra Mall, Stanford, CA 94305, USA {swhang, hector}@cs.stanford.edu No Institute

More information

SQL Server Analysis Services

SQL Server Analysis Services DataBase and Data Mining Group of DataBase and Data Mining Group of Database and data mining group, SQL Server 2005 Analysis Services SQL Server 2005 Analysis Services - 1 Analysis Services Database and

More information

DATA MINING AND WAREHOUSING

DATA MINING AND WAREHOUSING DATA MINING AND WAREHOUSING Qno Question Answer 1 Define data warehouse? Data warehouse is a subject oriented, integrated, time-variant, and nonvolatile collection of data that supports management's decision-making

More information