Hierarchical Materialisation of Methods in Object-Oriented Views: Design, Maintenance, and Experimental Evaluation


Bartosz Bębel
Poznań University of Technology
Institute of Computing Science
Piotrowo 3A, Poznań, Poland

Robert Wrembel
Poznań University of Technology
Institute of Computing Science
Piotrowo 3A, Poznań, Poland

ABSTRACT

The application of materialised object-oriented views in object-relational data warehousing systems is promising. In this paper we propose a novel technique for the materialisation of method results in object-oriented views, called hierarchical materialisation. When an object used to materialise the result of method m is updated, m has to be recomputed. This recomputation can use unaffected intermediate materialised results of methods called from m, thus reducing the recomputation time. The hierarchical materialisation technique was implemented and evaluated in a number of experiments concerning methods without input arguments as well as methods with input arguments. The results showed that hierarchical materialisation reduces method recomputation time. Moreover, materialising methods with input arguments of narrow discrete domains introduces only a small time overhead.

Categories and Subject Descriptors
H.2. [Database Management]: Physical Design - access methods.

General Terms
Algorithms, Performance.

Keywords
object-relational data warehouse, object-oriented view, view materialisation, method materialisation

1. INTRODUCTION

New, dynamically growing branches of industry, including telecommunication, banking, and commerce, process and store large amounts of data. The information collected in an enterprise often differs in data format and complexity (e.g., relational, object-relational, object-oriented, on-line, Web pages, semi-structured, spreadsheets, flat files) and is stored in information systems that usually have different functionality.
As the management of an enterprise requires a comprehensive view of most of its data, one of the important tasks of an information system is to provide integrated access to all data sources within an enterprise. There are two basic approaches to data integration: a virtual approach and a data warehouse approach [13]. A very important mechanism used in both of these approaches is a view. The most important kinds of view applications are: access control, shorthand for queries, data presentation, integrity constraints, database design, and data mining. A view whose data are persistently stored in a database is called a materialised view. Materialised views are required in distributed database systems and data warehouse systems.

Applying data warehousing technology to the integration of complex data implies combining object-oriented technology with the technology of data warehousing and developing object-oriented or object-relational data warehousing systems [5, 6, 7]. In the process of integrating and warehousing complex data, materialised object-oriented views are very promising, but few approaches have been proposed in this field so far. While materialising an object-oriented view one should consider the materialisation of objects' structure as well as objects' methods. The existing approaches to materialised object-oriented views support only the materialisation of objects' structure.

In this paper we propose a framework for the materialisation of method results in object-oriented views, and discuss its implementation and experimental evaluation. The materialisation of a method consists in computing the result of the method once, storing it persistently in a database, and then using the persistent value when the method is invoked, rather than computing it every time the method is invoked. On the one hand, this technique reduces access time to the result of a method.
On the other hand, when a method result is made persistent, it has to be kept up to date when the data used to compute this result change. To this end, we use additional data structures representing links between materialised methods and the objects used to compute these methods. When such an object is updated, the system examines the appropriate data structure in order to find those materialised methods that have to be recomputed.

Method invocations form a graph of dependencies. When method m is materialised, it may be reasonable to materialise also the intermediate results of methods called from m. We call this technique hierarchical materialisation. When an object used to materialise the result of method m is updated, m has to be recomputed. This recomputation can use unaffected intermediate results that have already been materialised, thus reducing the time spent on recomputation. Hierarchical materialisation of methods is suitable for environments where updates and deletions of objects are less frequent than queries, e.g., data warehousing systems.

This paper is organised as follows. Section 2 discusses related approaches to method materialisation in database systems. Section 3 outlines our concept of a materialised object-oriented view. Section 4 introduces the concept of hierarchical materialisation of methods, which we have developed. The maintenance of materialised methods and experimental results concerning hierarchical materialisation are discussed in Sections 5 and 6, respectively. Finally, Section 7 summarises the paper and points out areas for future work.

2. RELATED WORK

Several approaches to object-oriented views have been proposed in scientific publications (see [15] for an overview). As far as materialised object-oriented views are concerned, few approaches proposed so far support their materialisation [3] and maintenance [1, 11, 12]. None of them, however, supports the materialisation of methods. A method can be a very complex program whose computation may take a long time; therefore the efficient execution of a method has a great impact on the query response time. A promising technique, called method precomputation or materialisation, was proposed in [2, 8, 9, 10] in the context of indexing techniques and query optimisation, but not in the context of object-oriented views. The work of [8] sets up an analytical framework for estimating the costs of caching complex objects.
Two data representations are considered, i.e., a procedural representation and an object-identity-based representation.

In the approach of [2], the results of materialised methods are stored in an index structure based on a B-tree, called a method index. A method index on a method M stores in its key values the results of the invocation of M on the instances of the indexed class. The index record in a leaf node contains a key value v and the list of object identifiers of those objects for which the indexed method M returns v. Once method M has been materialised for object o_j, the following additional information is stored along with o_j: the result r_j of executing M on o_j, a validity flag that indicates whether result r_j is valid, and a dependency record that indicates which attributes are used to compute M. After an update of the value of object o_j used to compute the result v of method M, its validity flag is set to False and the appropriate entry is removed from the method index. While executing queries that use M, the system checks the method index for M before executing M. If the appropriate entry is found, the precomputed value is used. Otherwise, M is executed for the object. The application of method materialisation proposed in [2] is limited to methods that: (1) do not have input arguments, (2) use only atomic type attributes to compute their values, and (3) do not modify values of objects. Any other method is left non-materialised.

The concept of [9, 10] uses the so-called Reverse Reference Relation, which contains tuples of the form: [object used to materialise method m, the name of a materialised method, the set of objects passed as arguments of m]. Furthermore, this approach also maintains information about the attributes, called relevant attributes, whose values were used to materialise method m. Method m has to be recomputed only when the value of a relevant attribute of an object used in m was updated.
The entries are inserted into the above structures during the materialisation of a method. Each modification of an object used for computing the value of method M results in the rematerialisation of the materialised method's value. There are two possible rematerialisation strategies, namely lazy and immediate. In the lazy strategy, the validity flag is set to False in the appropriate record and method M is recomputed the next time it is invoked. In the immediate strategy, the invalidated method is recomputed immediately.

Another approach to method precomputation concerns so-called inverse methods [4]. An inverse method is used for transforming a value or object from one representation (type) to another. When such a method is used in a query, it is computed once, instead of computing the appropriate method for each object returned by the query. The result of an inverse method is stored in memory and is accessible only to the current query. When the query ends, the result is removed from memory.

3. THE CONCEPT OF A MATERIALISED OBJECT-ORIENTED VIEW

In our approach, called the View Schema Approach (VSA) [14, 15], an object-oriented view is defined as a view schema of arbitrarily complex structure and behaviour, composed of view classes. Each view class is derived from one or more classes in a database schema. A view class is derived by an OQL-like command defining its structure, behaviour, and set of instances. View classes in a view schema are connected by inheritance and association relationships. Several view schemas may be created in the same ODS, and each of them is uniquely identified by its name.

Example 1. Before presenting the details concerning hierarchical materialisation of methods, let us develop the VS_Computer view schema with six view classes, namely V_Computer composed of V_CDDrive, V_Disk, and V_MainBoard. The V_MainBoard view class is further composed of V_RAM and V_CPU, as shown in Figure 1.
[Figure 1. View schema VS_Computers. Class diagram: V_Computer (cdr : V_CDDrive, mb : V_MainBoard, disk : V_Disk); V_MainBoard (ram : V_RAM, cpu : V_CPU); V_CDDrive, V_Disk, and V_CPU (voltage, cintensity); V_RAM (voltage, cintensity, radu, radi); V_CPU::power_cons(frequency: Integer).]

A view schema is explicitly materialised. Like materialised relational views, a materialised view schema has to be kept up to date with the content of the source database. The following three techniques for keeping a materialised view schema up to date were developed within the View Schema Approach: deferred on-commit incremental refreshing, deferred on-demand incremental refreshing, and deferred on-demand complete refreshing. Refreshing a given view schema VS_i means that all the materialised instances of view classes in VS_i are refreshed. In order to incrementally propagate modifications from base objects to view objects we have developed additional data structures, called Class Mapping Structure, Object Mapping Structure, and Log. Due to space limitations these structures are not described in this paper.

Every view class defined in a view schema may have several methods. For reasons of efficiency, the system should support the materialisation of selected methods in cases where the computation of a method takes a long time.

4. HIERARCHICAL MATERIALISATION OF METHODS

We propose a novel technique of method materialisation, called hierarchical materialisation. When hierarchical materialisation is applied to method m_i, the result of m_i is stored persistently and, additionally, the results of methods called from m_i are also stored persistently. Hierarchical materialisation is useful only for methods that call other methods whose computation is costly.

Once m_i has been marked as materialised, the result of the first invocation of m_i for view object vo_i is stored persistently. Each subsequent invocation of m_i for the same object vo_i uses the already materialised value. The materialisation of methods in a given view schema is allowed only when this schema has previously been materialised.

Methods may have various numbers of input arguments, which can be of various types. Generally, methods that have input arguments are not good candidates for materialisation. However, in the View Schema Approach a method with input arguments can be materialised and maintained within acceptable time provided that: (1) the method has few input arguments and (2) each of the arguments has a narrow, discrete domain.

A given method m_i implemented in view class vc_i can use attributes of vc_i in its body and can call methods in other view classes via association relationships. When the value of an attribute used to compute and materialise the value of m_i is modified, the materialised value becomes invalid. Such an attribute will further be called a sensitive attribute.

4.1 Data Structures

In order to materialise methods in a view class and maintain the materialised results, three additional data structures have been developed. These structures, which are described below, are called View Methods, Materialised Method Results Structure, and Graph of Method Calls. Each of them has an associated set of procedures and functions that operate on its data.

4.1.1 View Methods

View Methods (VM for short) makes available the data dictionary information about all methods implemented in view classes and about their signatures. View Methods is implemented as two object tables, called View_Methods and VM_InputArgs. The structure of the View_Methods object table is as follows.
< m_id, method, view_class, view_schema, ret_type, body, materialised, sensitive_attribs >

m_id is the identifier of a method; method stores the name of a method whose result is to be materialised; view_class stores the name of the view class whose method is to be materialised; view_schema contains the name of the view schema in which the view class has been placed; ret_type contains the type of the value returned by the method; body stores the implementation of the method; materialised is a flag indicating whether the method has been materialised or not; sensitive_attribs stores the set of so-called sensitive attributes for the method. The set of sensitive attributes for materialised method m_i is used to verify whether an update to a view object makes the materialised result of m_i invalid.

The VM_InputArgs object table has the following three attributes:

< m_id, arg_name, arg_type >

m_id is the identifier of the method that comes from View_Methods; arg_name contains the name of an input argument; arg_type contains the type of an input argument.

4.1.2 Materialised Method Results Structure

As the same method can be invoked for different instances of a given view class and with different values of input arguments, the system has to maintain the mappings between: (1) the materialised value of a method, (2) the object for which it was invoked, and (3) the values of input arguments. The mappings are represented in a structure called Materialised Method Results Structure (MMRS for short). MMRS is used by the procedure that maintains the materialised results of methods. When method m_i is invoked for a given view object vo_i and this method has previously been set as materialised, MMRS is searched in order to get the result of m_i invoked for vo_i. If it is not found, m_i is computed for vo_i and its result is stored in MMRS. Otherwise, the materialised result of m_i is read instead of executing m_i. When an object used to compute the materialised value of m_i is updated or deleted, the materialised value becomes invalid. In such a case, the appropriate record is removed from MMRS.

MMRS is implemented as an object table having the following structure:

< m_id, view_oid, inputargs, value >

The m_id attribute is the identifier of the method that comes from View_Methods; view_oid stores the set of those view object identifiers for which the method has been invoked with the same set of input argument values and has returned the same value; inputargs stores the set of records containing the names and values of input arguments; the value attribute stores the value of the method executed for a given view object for a certain set of input arguments. view_oid is implemented as a nested table having one attribute that stores the identifier of a view object. inputargs is implemented as a nested table having the following structure:

< arg_name, arg_value >

where arg_name is the name of an input argument and arg_value is its value.

4.1.3 Graph of Method Calls

A method defined in one view class can invoke other methods defined in other view classes. For example, in order to compute the consumption of power by the instances of V_Computer, the power_cons method (in V_Computer) calls the power_cons methods defined in V_Disk, in V_CDDrive, and in V_MainBoard. The chain of method dependencies, where one method calls another, is called the Graph of Method Calls (GMC for short). GMC is used by the procedure that maintains the materialised results of methods. When materialised method m_j becomes invalid, all the materialised methods that use the value of m_j also become invalid. In order to invalidate those methods the content of GMC is used.
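The MMRS behaviour described above (search for a stored result, compute and store it on a miss, remove it on invalidation) can be illustrated with a minimal Python sketch. It is not the prototype's implementation: the class `MMRS`, its methods `invoke` and `remove`, and the function `cpu_power` with its formula are hypothetical names introduced here, and, unlike the object table described above, the sketch keeps one record per (method, view object, argument values) triple instead of grouping view objects that share a value.

```python
from typing import Callable, Dict, FrozenSet, Tuple

# Key: (method id, view object id, frozen set of (arg_name, arg_value) pairs).
Key = Tuple[int, str, FrozenSet[Tuple[str, object]]]


class MMRS:
    """Toy in-memory stand-in for the Materialised Method Results Structure."""

    def __init__(self) -> None:
        self.records: Dict[Key, object] = {}
        self.executions = 0  # how many times a method body was really run

    def invoke(self, m_id: int, view_oid: str,
               body: Callable[..., object], **inputargs: object) -> object:
        key = (m_id, view_oid, frozenset(inputargs.items()))
        if key not in self.records:       # not materialised yet: compute and store
            self.executions += 1
            self.records[key] = body(view_oid, **inputargs)
        return self.records[key]          # otherwise read the stored value

    def remove(self, m_id: int, view_oid: str) -> None:
        """Drop every record of method m_id for view object view_oid."""
        stale = [k for k in self.records if k[0] == m_id and k[1] == view_oid]
        for k in stale:
            del self.records[k]


def cpu_power(view_oid: str, frequency: int) -> int:
    """Hypothetical method body; the formula is invented for illustration."""
    return frequency // 10


mmrs = MMRS()
for freq in (500, 500, 550):              # the second call is answered from MMRS
    mmrs.invoke(5, "vcpu1", cpu_power, frequency=freq)
```

One record is created per distinct combination of argument values, which is why the number of records stays small only when the arguments have narrow, discrete domains.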
The GMC is implemented as an object table having the following structure:

< calling_m_id, called_m_id >

The calling_m_id attribute stores the identifier of a calling method; this identifier comes from the View_Methods object table. The called_m_id attribute stores the set of identifiers of the methods being called; each identifier in the set also comes from the View_Methods object table. called_m_id is implemented as an object table.

Example 2. Let us consider the view schema presented in Figure 1. Each view class in this view schema has a method, called power_cons, that returns the electric power consumption of a computer component. The power consumption of each instance of V_MainBoard is computed as the sum of the power consumed by its component objects, i.e., instances of V_RAM and V_CPU, and by the instance of V_MainBoard itself. Similarly, the power consumption of each instance of V_Computer is the sum of the power consumed by its component instances of V_CDDrive, V_MainBoard, and V_Disk. An example of the GMC for this view schema is shown in Figure 2. The name of each method is preceded by the name of the class in which it has been defined.

[Figure 2. An example of the Graph of Method Calls: V_Computer::power_cons calls V_CDDrive::power_cons, V_MainBoard::power_cons, and V_Disk::power_cons; V_MainBoard::power_cons calls V_RAM::power_cons and V_CPU::power_cons.]

4.2 Hierarchical Materialisation

When method m_i is materialised, it may be reasonable to materialise also the intermediate results of methods called from m_i. When a view object vo_i, used to materialise the result of method m_i, is updated or deleted, m_i has to be recomputed. This recomputation can use unaffected intermediate materialised results, thus reducing the recomputation time overhead. We call this technique hierarchical materialisation. In order to maintain the chain of method invocations the system uses the Graph of Method Calls. In our framework all the intermediate results down to the leaf nodes are materialised. In order to illustrate the hierarchical materialisation technique and its advantage, let us consider the following example.

Example 3.
Figure 3 presents the materialised instances of the view schema from Figure 1. Let us assume that the instance of V_Computer, namely the view object identified by vcom1, is composed of view objects vcd1 (an instance of view class V_CDDrive), vd2 (an instance of view class V_Disk), and vmb1 (an instance of V_MainBoard), which in turn is composed of vram2, vram21, vram22, and vcpu1. Let us further assume that the power_cons method in V_Computer was materialised. The result of V_Computer::power_cons is materialised for an instance of V_Computer only when this method is invoked for this instance. Furthermore, all the methods called from V_Computer::power_cons are also materialised when they are executed.

Let us assume that the power_cons method was invoked for vcom1. In our example, the hierarchical materialisation mechanism results in materialising the values of the following methods: V_RAM::power_cons for the view objects identified by vram2, vram21, and vram22, V_CPU::power_cons for object vcpu1, V_MainBoard::power_cons for object vmb1, V_Disk::power_cons for object vd2, V_CDDrive::power_cons for object vcd1, and finally V_Computer::power_cons for object vcom1.

Having materialised the methods discussed above, let us assume that the component object vd2 has been replaced with another disk instance, say vd31, with greater power consumption. Thus, the result of V_Computer::power_cons materialised for vcom1 is no longer valid and has to be recomputed. However, during the recomputation of vcom1.power_cons the unaffected materialised results of methods can be reused, i.e., vcd1.power_cons and vmb1.power_cons have not changed and can be used to compute the new value of vcom1.power_cons.
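The reuse of intermediate results in Example 3 can be traced with a small Python sketch. It is only an illustration under stated assumptions: the `Part` class, the `materialised` dictionary, and all power values are invented here, and invalidation is done by hand for the single affected result rather than through the maintenance procedures of the prototype.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Part:
    """A view object: its own power draw plus references to component objects."""
    oid: str
    own_power: int                      # hypothetical value of the sensitive attributes
    components: List["Part"] = field(default_factory=list)


materialised: Dict[str, int] = {}       # oid -> stored result of power_cons
executed: List[str] = []                # log of method bodies actually run


def power_cons(part: Part) -> int:
    """Return the materialised result if present, otherwise compute and store it."""
    if part.oid in materialised:
        return materialised[part.oid]
    executed.append(part.oid)
    total = part.own_power + sum(power_cons(c) for c in part.components)
    materialised[part.oid] = total
    return total


# Objects of Example 3 (power values are made up for illustration).
vram = [Part("vram2", 3), Part("vram21", 3), Part("vram22", 3)]
vcpu1 = Part("vcpu1", 40)
vmb1 = Part("vmb1", 20, vram + [vcpu1])
vcd1 = Part("vcd1", 15)
vd2 = Part("vd2", 10)
vcom1 = Part("vcom1", 0, [vcd1, vmb1, vd2])

first = power_cons(vcom1)               # materialises all eight results

# Replace disk vd2 by vd31: only the result stored for vcom1 becomes invalid.
vd31 = Part("vd31", 25)
vcom1.components = [vcd1, vmb1, vd31]
del materialised["vcom1"]
executed.clear()
second = power_cons(vcom1)              # reuses vcd1 and vmb1
```

In this run the first invocation executes eight method bodies, whereas the recomputation after the disk replacement executes only two (for vcom1 and vd31); the results stored for vcd1 and vmb1, and through vmb1 those of the RAM modules and the CPU, are read back unchanged.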

[Figure 3. An example of materialised view schema instances: vcom1 : V_Computer, vcd1 : V_CDDrive, vd2 : V_Disk, vmb1 : V_MainBoard, vram2, vram21, and vram22 : V_RAM, vcpu1 : V_CPU, with the numbered calls 1 to 4: power_cons( ) and 5: power_cons(integer).]

5. MAINTENANCE OF MATERIALISED METHODS

Like a materialised view object, a materialised method may become out of date when the values used to compute the method change. The materialised value of m_i, defined in view class v_i, becomes obsolete when: (1) m_i uses the values of sensitive attributes belonging to the instance vo_i of view class v_i and the values of these sensitive attributes have been changed, or (2) m_i calls another method, say m_j, and the materialised value of m_j has been changed.

When the materialised value of m_i becomes obsolete, it is removed from MMRS (see footnote 1). The removal of the result of method m_j makes the results of the methods that called m_j invalid as well, so they also have to be removed from MMRS. The removal of materialised results from MMRS is executed recursively up to the root of GMC by the MMRS_Propagate_Remove_Result procedure. To this end the procedure has to identify a pair of values in MMRS records. This pair is composed of the method identifier and the identifier of the view object for which the method has been materialised.

The MMRS_Propagate_Remove_Result procedure traverses the GMC and the aggregation relationships in the inverse direction, i.e., from bottom to top. In order to ease the traversal of the aggregation hierarchy in the inverse direction, the prototype maintains so-called inverse references for each view object. An inverse reference for view object vo_j is a reference from vo_j to the objects that reference vo_j. For example, the inverse reference for view object vcpu1 (cf. Figure 3) contains one object identifier, vmb1, which points to the instance of the V_MainBoard view class.
The removal of a materialised method from MMRS is triggered by the deletion or update of a view object. To this end the Check_if_Removal procedure is used. Its pseudo code is shown in Listing 1.

Listing 1. The pseudo code of the Check_if_Removal procedure

    Check_if_Removal ( view_oid     void,
                       updated_attr SET_Attr )
    begin
      for m in MMRS_Get_Affected_Methods (view_oid, updated_attr) loop
        MMRS_Propagate_Remove_Result (m, view_oid);
      end loop;
    end;

The Check_if_Removal procedure requires two input arguments. The first one, view_oid, is the identifier of the view object being either updated or deleted. The second one, updated_attr, is the set of updated attributes of the view object. The procedure calls the MMRS_Get_Affected_Methods function, which returns the set of identifiers of all materialised methods whose values became invalid after the modification of the view object. MMRS_Get_Affected_Methods requires two input arguments: the identifier of the modified (updated or deleted) view object and the set of updated attributes (this set is empty when a view object is deleted). The loop is executed for each method returned by this function. In the loop, the records are removed from MMRS by the MMRS_Propagate_Remove_Result procedure, whose pseudo code is presented in Listing 2.

Listing 2. The pseudo code of the MMRS_Propagate_Remove_Result procedure

     1: MMRS_Propagate_Remove_Result ( meth_id  number,
     2:                                view_oid void )
     3:   V_callingMeth SET_Method;
     4:   i number;
     5:   j number;
     6: begin
     7:   MMRS_Remove_Result (meth_id, view_oid);
     8:   /* find the set of methods calling the invalidated method */
     9:   V_callingMeth := GMC_Find_Calling_Methods (meth_id);
    10:   if V_callingMeth is not null
    11:      and view_oid.compositerefobject is not null
    12:   then
    13:     for m in V_callingMeth loop
    14:       for v_obj in view_oid.compositerefobject loop
    15:         /* call recursively MMRS_Propagate_Remove_Result */
    16:         MMRS_Propagate_Remove_Result (m, v_obj);
    17:       end loop;
    18:     end loop;
    19:   end if;
    20: end;

The MMRS_Propagate_Remove_Result procedure has two input arguments: meth_id is the identifier of a method and view_oid is the identifier of a view object. The values of the input arguments are set up in the Check_if_Removal procedure (cf. Listing 1). For the pair of values stored in these two input arguments the appropriate records are removed from MMRS (line 7). Then the removal has to be propagated up to the root of GMC. To this end, the set of identifiers of the methods that use the method identified by meth_id is found in line 9. This set is returned by the GMC_Find_Calling_Methods function, which operates on GMC. The code in lines 10 and 11 checks whether: (1) the set of method identifiers returned by GMC_Find_Calling_Methods is not empty, and (2) the set of inverse references from the view object pointed to by view_oid is not empty. The set of inverse references is stored in each view object as the value of its attribute compositerefobjects. If both conditions are fulfilled, the removal of records from MMRS is executed recursively for each method in the set returned by GMC_Find_Calling_Methods and for each view object in the inverse references (lines 13-18).

Footnote 1: This is implemented by using database triggers.
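The recursion of Listing 2 can also be followed in a short Python sketch, assuming in-memory dictionaries in place of the GMC and MMRS object tables. The names `gmc`, `inverse_refs`, `mmrs`, `find_calling_methods`, and `propagate_remove_result` are hypothetical counterparts chosen for this illustration; the data correspond to the objects of Example 3 (with only one RAM module listed for brevity).

```python
from typing import Dict, List, Set, Tuple

# GMC in the calling direction, as described above: calling method -> called methods.
gmc: Dict[str, Set[str]] = {
    "V_Computer::power_cons": {"V_CDDrive::power_cons", "V_Disk::power_cons",
                               "V_MainBoard::power_cons"},
    "V_MainBoard::power_cons": {"V_RAM::power_cons", "V_CPU::power_cons"},
}

# Inverse references: view object -> objects that reference it (its composites).
inverse_refs: Dict[str, List[str]] = {
    "vcpu1": ["vmb1"], "vram2": ["vmb1"], "vmb1": ["vcom1"],
    "vcd1": ["vcom1"], "vd2": ["vcom1"],
}

# MMRS reduced to the set of (method, view object) pairs that hold a stored result.
mmrs: Set[Tuple[str, str]] = {
    ("V_CPU::power_cons", "vcpu1"), ("V_RAM::power_cons", "vram2"),
    ("V_MainBoard::power_cons", "vmb1"), ("V_CDDrive::power_cons", "vcd1"),
    ("V_Disk::power_cons", "vd2"), ("V_Computer::power_cons", "vcom1"),
}


def find_calling_methods(meth_id: str) -> List[str]:
    """Methods whose body calls meth_id (counterpart of GMC_Find_Calling_Methods)."""
    return [caller for caller, called in gmc.items() if meth_id in called]


def propagate_remove_result(meth_id: str, view_oid: str) -> None:
    """Remove the stored result, then recurse towards the root of the GMC."""
    mmrs.discard((meth_id, view_oid))
    for caller in find_calling_methods(meth_id):
        for parent_oid in inverse_refs.get(view_oid, []):
            propagate_remove_result(caller, parent_oid)


# An update of a sensitive attribute of vcpu1 invalidates one branch of the graph.
propagate_remove_result("V_CPU::power_cons", "vcpu1")
```

After the call, the results for vcpu1, vmb1, and vcom1 are gone, while those for vram2, vcd1, and vd2 remain available for reuse during rematerialisation.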

6. EXPERIMENTAL RESULTS

The proposed hierarchical materialisation technique has been implemented within the so-called View Schema Approach Prototype (VSAP). The prototype has been implemented partially as an application written in C/C++ and partially as packages, functions, and procedures stored in the Oracle8i DBMS, using its object-oriented features. All dictionary tables, object tables, and data have been stored in this database.

The experiment evaluating hierarchical materialisation has been performed in the Oracle8i (rel ) database management system (see footnote 2). The Graph of Method Calls is presented in Figure 4. Method m_1 called m_11, m_12, and m_13. The result of m_1 was computed as the sum of the results returned by m_11, m_12, and m_13. Similarly, m_11 called m_111, m_112, and m_113 and summed up their results. m_111, in turn, called m_1111, m_1112, and m_1113 and summed up their results. The same computation pattern was used for the rest of the methods in this GMC. This graph also represents the aggregation hierarchy of objects. A complex object at the root of the hierarchy referenced three objects at the lower level. Each of these lower-level objects referenced a further three objects. As a consequence, each root complex object was composed of 12 component objects. The size of one root complex object, including its components, equalled 14kB. The experiments were performed for 1, 1, 2, and 5 root complex objects, which gave 12, 12, 24, and 6 component objects, respectively. Due to space constraints we present only the results for 5 root complex objects.

[Figure 4. Graph of Method Calls used in the experiment, with nodes arranged by level.]

Footnote 2: Oracle8i was running under the control of Windows NT, on a PC with a Pentium III 55MHz and 128MB of RAM. The size of the database buffer equalled 16MB.

6.1 Methods without input arguments

Chart 1 shows the total time overhead for: (1) the execution of a method without materialisation (Exe), (2) the execution of a method together with the materialisation of its result (E+M), (3) reading the materialised result of a method (RM), (4) the invalidation of a method (Inv), and (5) the rematerialisation of a previously invalidated method (Rem). These five kinds of time overhead were measured for methods without input arguments, for 5 root complex objects. m_1 and m_11 denote the times for processing methods m_1 and m_11, respectively (cf. Figure 4). Average times, computed per one root complex object, are presented in Chart 2. In this experiment the invalidation of m_11 and m_1 was caused by updating the object used to compute the result of method m_1111 (cf. Figure 4); thus one branch of GMC was invalidated from the very bottom method to the very top method.

In order to measure the usefulness of hierarchical materialisation we computed the following time coefficient:

tc = Exe / (Inv + Rem)

Taking into account the following times shown in Chart 2:
- the average method execution time without materialisation (Exe): approximately 11.5 sec for method m_1;
- the average method invalidation time overhead (Inv): approximately 0.3 sec for method m_1;
- the average method rematerialisation time overhead (Rem): approximately 5.4 sec for method m_1;
the value of tc equals approximately 2, meaning that thanks to the hierarchical materialisation technique, method m_1 (the root of the Graph of Method Calls) was executed approximately two times faster than without materialisation.

[Chart 1. Total times [sec] of processing methods m_1 and m_11, for 5 root complex objects (one branch of GMC invalidated); series: Exe, E+M, RM, Inv, Rem.]

[Chart 2. Average times [sec] of processing methods m_1 and m_11 (one branch of GMC invalidated); series: Exe, E+M, RM, Inv, Rem.]

In our tests it is the execution time of method m_1 that is halved. This reduction holds only for a Graph of Method Calls having the pattern shown in Figure 4 and for only one invalidated branch of GMC. For other graphs, having a different height and width, the acceleration coefficient will be different. This coefficient also depends on the number of objects being updated, which in turn determines the number of methods whose results have to be invalidated and rematerialised.

Even though the methods used in the experiments performed simple arithmetical operations, the hierarchical materialisation technique gave better system performance. A higher increase in system performance can be expected when the materialised methods are more costly to compute than those used in the experiments.

The total and average processing times for two invalidated branches of GMC are presented in Charts 3 and 4, respectively.
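As a check of the arithmetic, the time coefficient reported above for one invalidated branch can be recomputed from the quoted average times. The sketch below reads tc as the ratio of Exe to Inv + Rem, which is the reading consistent with the stated value of about 2 and with the "two times faster" interpretation; taking the invalidation time as 0.3 sec is an assumption about the printed figure.

```python
exe = 11.5   # average execution time without materialisation [sec]
inv = 0.3    # average invalidation overhead [sec] (assumed reading of the printed value)
rem = 5.4    # average rematerialisation overhead [sec]

# Speed-up of "invalidate + rematerialise" over executing the method from scratch.
tc = exe / (inv + rem)
print(round(tc, 2))
```

A value of tc above 1 means that maintaining the materialised result is cheaper than recomputing the method from scratch; a value close to 1 means that the two costs are about the same.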

For two updated leaf objects, i.e., two invalidated branches of GMC, the time coefficient tc equals approximately one.

[Chart 3. Total times [sec] of processing methods m_1 and m_11, for 5 root complex objects (two branches of GMC invalidated); series: Exe, E+M, RM, Inv, Rem.]

[Chart 4. Average times [sec] of processing methods m_1 and m_11 (two branches of GMC invalidated); series: Exe, E+M, RM, Inv, Rem.]

6.2 Methods with input arguments

The results of the experiments concerning the materialisation and maintenance of methods having two and four input arguments of atomic types are discussed below. In the experiments, each input argument had only two possible values. Total times of processing methods m_1 and m_11 with input arguments, for 5 root complex objects, are shown in Chart 5. One branch of GMC was invalidated and rematerialised. As we can observe, methods having two input arguments are processed in almost the same time as methods having four input arguments. From the analysis of Charts 1 and 5 it follows that the time overhead for materialising and maintaining methods having few input arguments is very low provided that the domains of the input arguments are narrow. For arguments whose domain is wider than two values, the time overhead will be higher, as the number of records in MMRS will increase faster.

The value of the acceleration coefficient computed for method m_1, for 5 root complex objects, equals approximately .7. This value is approximately the same for methods with two as well as four input arguments. The coefficient computed for methods having input arguments is slightly lower than the coefficient computed for methods without input arguments.

6.3 Storage Space

Additional data structures supporting hierarchical materialisation need storage space.
The space overhead for storing one record in MMRS is computed by the following formula: 64B + nb_arg * domain * 2B, where nb_arg is the number of input arguments of a materialised method and domain is the number of different values an argument can have. For example, in order to store the materialised results of method m 1 and of all methods invoked from m 1 we need 16kB, for two input arguments, each of which can have one of two values. The experiments with 5 root complex objects required additional storage of 4MB for methods without input arguments, 8MB for methods with two input arguments each of which could have one of two possible values, and 16MB for methods with two input arguments each of which could have one of four possible values. This evaluation shows that the size of MMRS grows with the number of input arguments and with the size of their domains. If the domain of an input argument is large, the resulting size of MMRS degrades system performance. The space overhead for storing one record in GMC is constant, equals 6B, and is independent of the number of complex objects being processed. For example, the space for storing the chain of method invocations for m 1 equals 7kB.

7. SUMMARY, CONCLUSIONS AND FUTURE WORK

The support for view materialisation and maintenance is required when applying object oriented views in object relational data warehousing systems. In this paper we presented a framework for object oriented view materialisation and maintenance with respect to methods defined in view classes. To the best of our knowledge, it is the first approach to method materialisation applied to object oriented views. Moreover, we have proposed a novel method materialisation technique, called hierarchical materialisation.
As the experiments showed, this technique reduces the maintenance cost of materialised methods, as unaffected intermediate results do not have to be recomputed and can thus be reused during the recomputation of another affected method. Hierarchical materialisation of methods is suitable for environments where updates and deletions of objects are less frequent than queries, e.g., data warehousing systems. Moreover, this technique is suitable for materialising only those methods that either have no input arguments at all or have a few input arguments, each of which can take only a few different values. The current implementation of hierarchical materialisation has the following drawbacks.
1. The decision whether a method should be materialised or not is explicitly made by a view designer during the system tuning activity. He or she has to carefully select the methods for materialisation and their sensitive attributes.
2. The graph of method calls has to reflect the aggregation relationships. In other words, if method m i (in view class V i ) calls m j (in view class V j ), then an aggregation relationship must exist between V i and V j.

[Chart 5. Total times of processing methods m 1 and m 11 with input arguments, for 5 root complex objects (one branch of GMC invalidated). Vertical axis: total time [sec]; bar labels: Exe, E+M, RM, Inv, Rem.]

3. Only methods that have input arguments of atomic types can be materialised, i.e., methods whose arguments are objects, values of structured types, or other methods cannot be materialised. Moreover, materialised methods have to return values of atomic types.
4. The body of a method m i being materialised may contain only simple arithmetical operations whose arguments are the called methods and/or the attributes of the view class where m i has been defined. SQL commands are not allowed in materialised methods, as they would make it difficult to register the used object identifiers and the values of the method in MMRS. One way to tackle the problem would be to build a special environment where SQL or OQL commands could be executed. In this environment every object touched by a materialised method could be registered together with the value of the method. The second way to overcome the problem would be to dynamically translate an SQL or OQL command into a cursor and execute the command through the cursor. This, however, requires two features of an operational data store, namely the support for cursors and the support for dynamic SQL (OQL).
5. Materialised methods must not modify the content of a data warehouse.

The main issue that needs further investigation is the development of a technique that will allow the right methods for materialisation to be selected automatically or semi-automatically. To this end, a cost model describing the complexity of a method needs to be developed. Further work will also concentrate on providing a mechanism for materialising the results of selected methods at selected levels of the Graph of Method Calls.

8. REFERENCES

[1] Ali M. A., Fernandes A. A. A., Paton N.: Incremental Maintenance of Materialized OQL Views. Proc. of the DOLAP', USA, 2.
[2] Bertino E.: Method precomputation in object oriented databases. SIGOS Bulletin, 12 (2, 3), 1991, pp
[3] Dobrovnik M., Eder J.: Partial Replication of Object Oriented Databases. Proc. of ADBIS'98. Poland, 1998, LNCS No. 1475, pp
[4] Eder J., Frank H., Liebhart W.: Optimization of Object Oriented Queries by Inverse Methods. Proc. of East/West Database Workshop, Austria,
[5] Eder J., Frank H., Morzy T., Wrembel R., Zakrzewicz M.: Designing an Object-Relational Database System: Project ORDAWA. Proc. of challenges of ADBIS-DASFAA 2, Prague, Czech Republic, 2, pp
[6] Gopalkrishnan V., Li Q., Karlapalem K.: Efficient Query Processing with Associated Horizontal Class Partitioning in an Object Relational Data Warehousing Environment. Proc. of DMDW'2, Sweden, 2.
[7] Huynh T.N., Mangisengi O., Tjoa A.M.: Metadata for Object Relational Data Warehouse. Proc. of DMDW'2, Sweden, 2.
[8] Jhingran A.: Precomputation in a Complex Object Environment. Proc. of IEEE Data Engineering Japan, 1991, pp
[9] Kemper A., Kilger C., Moerkotte G.: Function Materialization in Object Bases. Proc. of SIGMOD, 1991, pp
[10] Kemper A., Kilger C., Moerkotte G.: Function Materialization in Object Bases: Design, Realization, and Evaluation. IEEE Transactions on Knowledge and Data Engineering, Vol. 6, No. 4,
[11] Kuno H. A., Rundensteiner E.: Materialised Object-Oriented Views in MultiView. Proc. of the ACM Research Issues in Data Engineering Workshop,
[12] Kuno H. A., Rundensteiner E.: Using Object-Oriented Principles to Optimize Update Propagation to Materialised Views. Proc. of Int. Conf. on Data Engineering, 1996, pp
[13] Widom J.: Research Problems in Data Warehousing. Proc. of the 4th Int. Conference on Information and Knowledge Management (CIKM), 1995, pp
[14] Wrembel R.: On Materialising Object Oriented Views. In Barzdins J., Caplinskas A. (eds.): Databases and Information Systems.
Kluwer Academic Publishers, March 21, ISBN , pp
[15] Wrembel R.: The Construction and Maintenance of Materialised Object Oriented Views in Data Warehousing Systems. PhD thesis, Poznań University of Technology, Institute of Computing Science, Poznań, Poland, March, 21.


Designing and Implementing an Object Relational Data Warehousing System Abstract Bogdan Czejdo 1, Johann Eder 2, Tadeusz Morzy 3, Robert Wrembel 3 1 Department of Mathematics and Computer Science, Loyola