Hierarchical Materialisation of Methods in Object-Oriented Views: Design, Maintenance, and Experimental Evaluation


Bartosz Bębel
Poznań University of Technology, Institute of Computing Science, Piotrowo 3A, Poznań, Poland

Robert Wrembel
Poznań University of Technology, Institute of Computing Science, Piotrowo 3A, Poznań, Poland

ABSTRACT
The application of materialised object-oriented views in object-relational data warehousing systems is promising. In this paper we propose a novel technique for the materialisation of method results in object-oriented views, called hierarchical materialisation. When an object used to materialise the result of method m is updated, m has to be recomputed. This recomputation can reuse the unaffected intermediate materialised results of methods called from m, thus reducing the recomputation time. The hierarchical materialisation technique was implemented and evaluated in a number of experiments concerning methods without input arguments as well as methods with input arguments. The results showed that hierarchical materialisation reduces method recomputation time. Moreover, materialising methods with input arguments of narrow discrete domains introduces only a small time overhead.

Categories and Subject Descriptors
H.2. [Database Management]: Physical Design - access methods.

General Terms
Algorithms, Performance.

Keywords
object-relational data warehouse, object-oriented view, view materialisation, method materialisation

1. INTRODUCTION
New, dynamically growing branches of industry, including telecommunication, banking, and commerce, process and store large amounts of data. The information collected in an enterprise often differs in data format and complexity (e.g., relational, object-relational, object-oriented, on-line, Web pages, semi-structured, spreadsheets, flat files) and is stored in information systems that usually have different functionality.
As the management of an enterprise requires a comprehensive view of most of its data, one of the important tasks of an information system is to provide integrated access to all data sources within an enterprise. There are two basic approaches to data integration: a virtual approach and a data warehouse approach [13]. A very important mechanism used in both of these approaches is a view. The most important applications of views are: access control, shorthand for queries, data presentation, integrity constraints, database design, and data mining. A view whose data are persistently stored in a database is called a materialised view. Materialised views are required in distributed database systems and data warehouse systems.

The application of data warehousing technology to the integration of complex data implies combining object-oriented technology with the technology of data warehousing and developing object-oriented or object-relational data warehousing systems [5, 6, 7]. In the process of integrating and warehousing complex data, materialised object-oriented views are very promising, but few approaches have been proposed in this field so far. While materialising an object-oriented view one should consider the materialisation of objects' structure as well as objects' methods. The existing approaches to materialised object-oriented views support only the materialisation of objects' structure.

In this paper we propose a framework for the materialisation of method results in object-oriented views and discuss its implementation and experimental evaluation. The materialisation of a method consists in computing the result of the method once, storing it persistently in a database, and then using the persistent value when the method is invoked, rather than computing it on every invocation. On the one hand, this technique reduces access time to the result of a method.
On the other hand, when a method result is made persistent it has to be kept up to date when the data used to compute this result change. To this end, we use additional data structures representing links between materialised methods and the objects used to compute these methods. When such an object is updated, the system examines the appropriate data structure in order to find those materialised methods that have to be recomputed.

Method invocations form a graph of dependencies. When method m is materialised, it may be reasonable to materialise also the intermediate results of methods called from m. We call this technique hierarchical materialisation. When an object used to materialise the result of method m is updated, m has to be recomputed. This recomputation can use unaffected intermediate results that have already been materialised, thus reducing the time spent on recomputation. Hierarchical materialisation of methods is suitable for environments where updates and deletions of objects are less frequent than queries, e.g., data warehousing systems.

This paper is organised as follows: Section 2 discusses related approaches to method materialisation in database systems. Section 3 outlines our concept of a materialised object-oriented view. Section 4 introduces the concept of hierarchical materialisation of methods that we have developed. The maintenance of materialised methods and the experimental results concerning hierarchical materialisation are discussed in Sections 5 and 6, respectively. Finally, Section 7 summarises the paper and points out the areas for future work.

2. RELATED WORK
Several approaches to object-oriented views have been proposed in scientific publications (see [15] for an overview). As far as materialised object-oriented views are concerned, few approaches proposed so far support their materialisation [3] and maintenance [1, 11, 12]. None of them, however, supports the materialisation of methods.

A method can be a very complex program whose computation may take a long time; therefore, the efficient execution of a method has a great impact on the query response time. A promising technique, called method precomputation or materialisation, was proposed in [2, 8, 9, 1] in the context of indexing techniques and query optimisation, but not in the context of object-oriented views. The work of [8] sets up the analytical framework for estimating the costs of caching complex objects.
Two data representations are considered, i.e., a procedural representation and an object-identity-based representation.

In the approach of [2], the results of materialised methods are stored in an index structure based on a B-tree, called a method index. A method index on a method M stores in its key values the results of the invocation of M on the instances of the indexed class. The index record in a leaf node contains a key value v and the list of object identifiers of those objects for which the indexed method M returns v. Having materialised method M for object o_j, the following additional information is stored along with object o_j: the result r_j of executing M on o_j, a validity flag that indicates whether result r_j is valid, and a dependency record that indicates which attributes are used to compute M. After updating the value of object o_j used to compute the result v of method M, its validity flag is set to False and the appropriate entry is removed from the method index. While executing queries that use M, the system checks the method index for M before executing M. If the appropriate entry is found, the already precomputed value is used. Otherwise, M is executed for the object. The application of method materialisation proposed in [2] is limited to methods that: (1) do not have input arguments, (2) use only atomic-type attributes to compute their values, and (3) do not modify values of objects. Otherwise, a method is left non-materialised.

The concept of [9, 1] uses the so-called Reverse Reference Relation, which contains tuples of the form: [object used to materialise method m, the name of a materialised method, the set of objects passed as arguments of m]. Furthermore, this approach also maintains information about the attributes, called relevant attributes, whose values were used to materialise method m. Method m has to be recomputed only when the value of a relevant attribute of an object used in m was updated.
The entries are inserted into the above structures during the materialisation of a method. Each modification of an object used for computing the value of method M results in the rematerialisation of the materialised method's value. There are two possible rematerialisation strategies, namely lazy and immediate. In the lazy strategy, the validity flag is set to False in the appropriate record and method M is recomputed the next time it is invoked. In the immediate strategy, the invalidated method is recomputed immediately.

Another approach to method precomputation concerns so-called inverse methods [4]. An inverse method is used for transforming a value or object from one representation (type) to another. When this method is used in a query, it is computed once, instead of computing the appropriate method for each object returned by the query. The result of an inverse method is stored in memory and is accessible only to the current query. When the query ends, the result is removed from memory.

3. THE CONCEPT OF A MATERIALISED OBJECT-ORIENTED VIEW
In our approach, called View Schema Approach (VSA) [14, 15], an object-oriented view is defined as a view schema of an arbitrarily complex structure and behaviour, composed of view classes. Each view class is derived from one or more classes in a database schema. A view class is derived by an OQL-like command defining its structure, behaviour, and set of instances. View classes in a view schema are connected by inheritance and association relationships. Several view schemas may be created in the same ODS and each of them is uniquely identified by its name.

Example 1. Before presenting the details concerning hierarchical materialisation of methods, let us develop the VS_Computer view schema with six view classes, namely V_Computer, which is composed of V_CDDrive, V_Disk, and V_MainBoard; the V_MainBoard view class is further composed of V_RAM and V_CPU, as shown in Figure 1.
A view schema is explicitly materialised. Like materialised relational views, a materialised view schema has to be kept up to date with the content of a source database. The following three techniques for keeping a materialised view schema up to date were developed within the View Schema Approach: deferred on-commit incremental refreshing, deferred on-demand incremental refreshing, and deferred on-demand complete refreshing. Refreshing a given view schema VS_i means that all the materialised instances of view classes in VS_i are refreshed. In order to incrementally propagate the modifications from base to view objects we have developed additional data structures, called Class Mapping Structure, Object Mapping Structure, and Log. Due to space limitations these structures will not be described in this paper.

Every view class defined in a view schema may have several methods defined in it. The system should support the materialisation of selected methods for the sake of efficiency, in cases when the computation of a method takes a long time.

4. HIERARCHICAL MATERIALISATION OF METHODS
We propose a novel technique of method materialisation, called hierarchical materialisation. When hierarchical materialisation is applied to method m_i, the result of m_i is stored persistently and, additionally, the results of methods called from m_i are also stored persistently. Hierarchical materialisation is useful only for those methods that call other methods whose computation is costly.

After m_i is set as materialised, the result of the first invocation of m_i for view object vo_i is stored persistently. Each subsequent invocation of m_i for the same object vo_i uses the already materialised value. The materialisation of methods in a given view schema is allowed only when this schema has previously been materialised.

Methods may have various numbers of input arguments, which can be of various types. Generally, methods that have input arguments are not good candidates for materialisation.
However, in the View Schema Approach a method with input arguments can be materialised and maintained within acceptable time provided that: (1) the method has few input arguments and (2) each of the arguments has a narrow, discrete domain.

A given method m_i implemented in view class vc_i can use in its body the attributes of vc_i and can call other methods in other view classes via association relationships. When the value of an attribute used to compute and materialise the value of m_i is modified, the materialised value becomes invalid. Such an attribute will be further called a sensitive attribute.

[Figure 1. View schema VS_Computers. V_Computer references V_CDDrive (cdr), V_MainBoard (mb), and V_Disk (disk); V_MainBoard references V_RAM (ram) and V_CPU (cpu); the component classes carry the attributes voltage and cintensity; V_CPU defines power_cons(frequency: Integer).]

4.1 Data Structures
In order to materialise methods in a view class and maintain the materialised results, three additional data structures have been developed. These structures, which are described below, are called View Methods, Materialised Method Results Structure, and Graph of Method Calls. Each of them has an associated set of procedures and functions that operate on its data.

4.1.1 View Methods
View Methods (VM for short) makes available the data dictionary information about all methods implemented in view classes and their signatures. View Methods is implemented as two object tables, called View_Methods and VM_InputArgs. The structure of the View_Methods object table is as follows.
< m_id, method, view_class, view_schema, ret_type, body, materialised, sensitive_attribs >

where:
    m_id is the identifier of a method;
    method stores the name of a method whose result is to be materialised;
    view_class stores the name of a view class whose method is to be materialised;
    view_schema contains the name of a view schema where a view class has been placed;
    ret_type contains the type of the value returned by a method;
    body stores the implementation of a method;
    materialised is a flag indicating whether a method has been materialised or not;
    sensitive_attribs stores the set of so-called sensitive attributes for a method.

The set of sensitive attributes for materialised method m_i is used to verify whether an update to a view object makes the materialised result of m_i invalid.

The VM_InputArgs object table has the following three attributes:

< m_id, arg_name, arg_type >

where m_id is the identifier of the method that comes from View_Methods; arg_name contains the name of an input argument; arg_type contains the type of an input argument.

4.1.2 Materialised Method Results Structure
As the same method can be invoked for different instances of a given view class and with different values of input arguments, the system has to maintain the mappings between: (1) the materialised value of a method, (2) the object for which it was invoked, and (3) the values of input arguments. The mappings are represented in the structure called Materialised Method Results Structure (MMRS for short). MMRS is used by the procedure that maintains the materialised results of methods. When method m_i is invoked for a given view object vo_i and this method has previously been set as materialised, MMRS is searched in order to get the result of m_i invoked for vo_i. If the result is not found, m_i is computed for vo_i and the result is stored in MMRS. Otherwise, the materialised result of m_i is read instead of executing m_i. When an object used to compute the materialised value of m_i is updated or deleted, the materialised value becomes invalid. In such a case, the appropriate record is removed from MMRS. MMRS is implemented as an object table having the following structure:

< m_id, view_oid, inputargs, value >

The m_id attribute is the identifier of the method that comes from View_Methods; view_oid stores the set of those view object identifiers for which the method has been invoked with the same set of input argument values and has returned the same value; inputargs stores the set of records containing the names and values of input arguments; the value attribute stores the value of the method executed for a given view object for a certain set of input arguments. view_oid is implemented as a nested table having one attribute that stores the identifier of a view object. inputargs is implemented as a nested table having the following structure:

< arg_name, arg_value >

where arg_name is the name of an input argument and arg_value is its value.

4.1.3 Graph of Method Calls
A method defined in one view class can invoke other methods defined in other view classes. For example, in order to compute the consumption of power by the instances of V_Computer, the power_cons method (in V_Computer) calls the power_cons methods defined in V_Disk, in V_CDDrive, and in V_MainBoard. The chain of method dependencies, where one method calls another, is called Graph of Method Calls (GMC for short). GMC is used by the procedure that maintains the materialised results of methods. When materialised method m_j becomes invalid, all the materialised methods that use the value of m_j also become invalid. In order to invalidate those methods the content of GMC is used.
The GMC is implemented as an object table having the following structure:

< calling_m_id, called_m_id >

The calling_m_id attribute stores the identifier of a calling method; this identifier comes from the View_Methods object table. The called_m_id attribute stores the set of identifiers of the methods being called; each identifier in the set also comes from the View_Methods object table. called_m_id is implemented as an object table.

Example 2. Let us consider the view schema presented in Figure 1. Each view class in this view schema has a method, called power_cons, that returns the consumption of electric power by a computer component. The consumption of power by each instance of V_MainBoard is computed as the sum of power consumed by each component object, i.e., the instances of V_RAM and V_CPU, and by the instance of V_MainBoard itself. Similarly, the power consumption of each instance of V_Computer is the sum of power consumed by the component instances of V_CDDrive, V_MainBoard, and V_Disk. The example of GMC for the view schema is shown in Figure 2. The name of each method is preceded by the name of the class in which it has been defined.

[Figure 2. An example of the Graph of Method Calls: the method of V_Computer calls the methods of V_CDDrive, V_MainBoard, and V_Disk; the method of V_MainBoard calls the methods of V_RAM and V_CPU.]

4.2 Hierarchical Materialisation
When method m_i is materialised, it may be reasonable to materialise also the intermediate results of methods called from m_i. When a view object vo_i, used to materialise the result of method m_i, is updated or deleted, m_i has to be recomputed. This recomputation can use unaffected intermediate materialised results, thus reducing the recomputation time overhead. We call this technique hierarchical materialisation. In order to maintain the chain of method invocations the system uses the Graph of Method Calls. In our framework all the intermediate results down to the leaf nodes are materialised. In order to illustrate the hierarchical materialisation technique and its advantage, let us consider the following example.

Example 3.
Figure 3 presents the materialised instances of the view schema from Figure 1. Let us assume that the instance of V_Computer, namely the view object identified by vcom1, is composed of view objects vcd1 (an instance of view class V_CDDrive), vd2 (an instance of view class V_Disk), and vmb1 (an instance of V_MainBoard), which in turn is composed of vram2, vram21, vram22, and vcpu1. Let us further assume that the power_cons method in V_Computer was materialised. The result of V_Computer::power_cons is materialised for an instance of V_Computer only when this method is invoked for this instance. Furthermore, all the methods called from V_Computer::power_cons are also materialised when they are executed.

Let us assume that the power_cons method was invoked for vcom1. In our example, the hierarchical materialisation mechanism results in materialising the values of the following methods: V_RAM::power_cons for the view objects identified by vram2, vram21, and vram22; V_CPU::power_cons for object vcpu1; V_MainBoard::power_cons for object vmb1; V_Disk::power_cons for object vd2; V_CDDrive::power_cons for object vcd1; and finally V_Computer::power_cons for object vcom1.

Having materialised the methods discussed above, let us assume that the component object vd2 has been replaced with another disk instance, say vd31, with greater consumption of power. Thus, the result of V_Computer::power_cons materialised for vcom1 is no longer valid and has to be recomputed. However, during the recomputation of vcom1.power_cons the unaffected materialised results of methods can be reused, i.e., vcd1.power_cons and vmb1.power_cons have not changed and can be used to compute the new value of vcom1.power_cons.
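
The scenario of Example 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the VSA prototype: the dictionary mmrs stands in for MMRS, the power values are invented, input arguments are ignored, and the names ViewObject, own_power, executed, build_vcom1, and replace_disk are hypothetical helpers introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class ViewObject:
    oid: str
    view_class: str
    own_power: float                 # hypothetical power drawn by the object itself
    components: list = field(default_factory=list)


mmrs = {}        # stand-in for MMRS: (method, view object id) -> materialised value
executed = []    # ids of view objects whose method body really ran


def power_cons(vo):
    """Return the power consumption of vo, materialising every intermediate result."""
    key = (vo.view_class + "::power_cons", vo.oid)
    if key in mmrs:                  # materialised result found: read it
        return mmrs[key]
    executed.append(vo.oid)          # otherwise execute the method body
    value = vo.own_power + sum(power_cons(c) for c in vo.components)
    mmrs[key] = value                # store the result for later invocations
    return value


def build_vcom1():
    """Build the object hierarchy of Example 3 with invented power values."""
    rams = [ViewObject(o, "V_RAM", 2.0) for o in ("vram2", "vram21", "vram22")]
    vcpu1 = ViewObject("vcpu1", "V_CPU", 30.0)
    vmb1 = ViewObject("vmb1", "V_MainBoard", 10.0, rams + [vcpu1])
    vcd1 = ViewObject("vcd1", "V_CDDrive", 5.0)
    vd2 = ViewObject("vd2", "V_Disk", 8.0)
    return ViewObject("vcom1", "V_Computer", 0.0, [vcd1, vmb1, vd2])


def replace_disk(vcom, new_disk):
    """Swap the disk component and drop only the results that depended on it."""
    old = next(c for c in vcom.components if c.view_class == "V_Disk")
    vcom.components[vcom.components.index(old)] = new_disk
    mmrs.pop(("V_Disk::power_cons", old.oid), None)
    mmrs.pop(("V_Computer::power_cons", vcom.oid), None)
```

In this sketch the first call of power_cons on the object returned by build_vcom1() executes all eight method bodies, whereas after replace_disk only the bodies for vcom1 and the new disk are executed; the results for vcd1 and vmb1 are read from mmrs.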

[Figure 3. An example of materialised view schema instances]

5. MAINTENANCE OF MATERIALISED METHODS
Like a materialised view object, a materialised method may become out of date when the values used to compute the method change. The materialised value of m_i, defined in view class v_i, becomes obsolete when: (1) m_i uses the values of sensitive attributes belonging to the instance vo_i of view class v_i and the values of these sensitive attributes have changed, or (2) m_i calls another method, say m_j, and the materialised value of m_j has changed.

When the materialised value of m_i becomes obsolete it is removed from MMRS (see footnote 1). The removal of the result of method m_j means that the results of the methods that called m_j also become invalid and have to be removed from MMRS. The removal of materialised results from MMRS is executed recursively up to the root of GMC by the MMRS_Propagate_Remove_Result procedure. To this end the procedure has to identify a pair of values in MMRS records. This pair is composed of a method identifier and the identifier of the view object for which the method has been materialised.

The MMRS_Propagate_Remove_Result procedure traverses the GMC and the aggregation relationships in the inverse direction, i.e., from bottom to top. In order to ease the traversal of the aggregation hierarchy in the inverse direction, the prototype maintains so-called inverse references for each view object. An inverse reference for view object vo_j is a reference from vo_j to the objects that reference vo_j. For example, the inverse reference for view object vcpu1 (cf. Figure 3) contains one object identifier, vmb1, that points to the instance of the V_MainBoard view class.
The removal of a materialised method from MMRS is triggered by the deletion or update of a view object. To this end the Check_if_Removal procedure is used. Its pseudo code is shown in Listing 1.

Listing 1. The pseudo code of the Check_if_Removal procedure

    Check_if_Removal ( view_oid void, updated_attr SET_Attr )
    begin
      for m in MMRS_Get_Affected_Methods (view_oid, updated_attr) loop
        MMRS_Propagate_Remove_Result (m, view_oid);
      end loop;
    end;

The Check_if_Removal procedure requires two input arguments. The first one, view_oid, is the identifier of a view object being either updated or deleted. The second one, updated_attr, is the set of updated attributes of the view object. The procedure calls the MMRS_Get_Affected_Methods function, which returns the set of identifiers of all materialised methods whose values became invalid after the modification of the view object. MMRS_Get_Affected_Methods requires two input arguments: the identifier of a modified (updated or deleted) view object and the set of updated attributes (this set is empty when a view object is deleted). The loop is executed for each method returned by this function. In the loop, the records are removed from MMRS by the MMRS_Propagate_Remove_Result procedure, whose pseudo code is presented in Listing 2.

Listing 2. The pseudo code of the MMRS_Propagate_Remove_Result procedure

     1: MMRS_Propagate_Remove_Result ( meth_id number,
     2:                                view_oid void ) is
     3:   V_callingMeth SET_Method;
     4:   i number;
     5:   j number;
     6: begin
     7:   MMRS_Remove_Result (meth_id, view_oid);
     8:   /* find the set of methods calling the invalidated method */
     9:   V_callingMeth := GMC_Find_Calling_Methods (meth_id);
    10:   if V_callingMeth is not null
    11:      and view_oid.compositerefobject is not null
    12:   then
    13:     for m in V_callingMeth loop
    14:       for v_obj in view_oid.compositerefobject loop
    15:         /* call recursively MMRS_Propagate_Remove_Result */
    16:         MMRS_Propagate_Remove_Result (m, v_obj);
    17:       end loop;
    18:     end loop;
    19:   end if;
    20: end;

The MMRS_Propagate_Remove_Result procedure has two input arguments: meth_id is the identifier of a method and view_oid is the identifier of a view object. The values of the input arguments are set up in the Check_if_Removal procedure (cf. Listing 1). For the pair of values stored in these two input arguments the appropriate records are removed from MMRS (line 7). Then the removal has to be propagated up to the root of GMC. To this end, the set of identifiers of the methods that use the method identified by the value of meth_id is found in line 9. This set is returned by the GMC_Find_Calling_Methods function, which operates on GMC. The code in lines 10 and 11 checks whether: (1) the set of method identifiers returned by GMC_Find_Calling_Methods is not empty and (2) the set of inverse references from the view object pointed to by view_oid is not empty. The set of inverse references is stored in each view object as the value of its attribute compositerefobjects. If both conditions are fulfilled, the removal of records from MMRS is executed recursively for each method in the set returned by GMC_Find_Calling_Methods and for each view object in the inverse references (lines 13-18).

1 This is implemented using database triggers.
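
The propagation described by Listings 1 and 2 can also be sketched in Python. This is a simplified illustration, not the stored procedures of the prototype: MMRS, GMC, and the inverse references are modelled as plain dictionaries, input arguments are ignored, and the names mmrs, gmc_callers, inverse_refs, sensitive, and the three functions are assumptions made for this sketch.

```python
mmrs = {}          # (method id, view object id) -> materialised value
gmc_callers = {}   # called method id -> set of calling method ids (inverted GMC)
inverse_refs = {}  # view object id -> ids of the objects that reference it
sensitive = {}     # method id -> set of sensitive attribute names


def get_affected_methods(view_oid, updated_attrs):
    """Methods materialised for view_oid whose sensitive attributes were updated.

    An empty updated_attrs set denotes a deletion and affects every method
    materialised for the object.
    """
    return [m for (m, oid) in list(mmrs)
            if oid == view_oid
            and (not updated_attrs or sensitive[m] & updated_attrs)]


def propagate_remove_result(meth_id, view_oid):
    """Remove one materialised result and, recursively, the results built on it."""
    mmrs.pop((meth_id, view_oid), None)
    # Walk GMC and the aggregation hierarchy from bottom to top.
    for caller in gmc_callers.get(meth_id, ()):
        for parent in inverse_refs.get(view_oid, ()):
            propagate_remove_result(caller, parent)


def check_if_removal(view_oid, updated_attrs):
    """Entry point invoked when a view object is updated or deleted."""
    for m in get_affected_methods(view_oid, updated_attrs):
        propagate_remove_result(m, view_oid)
```

With the instances of Figure 3 loaded into these dictionaries, updating a sensitive attribute of vcpu1 would remove the results for vcpu1, vmb1, and vcom1 only, leaving the results for the RAM modules, the disk, and the CD drive available for reuse. The recursion terminates at the root because the root object has no inverse references, assuming the aggregation hierarchy is acyclic.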

6. EXPERIMENTAL RESULTS
The proposed hierarchical materialisation technique has been implemented within the so-called View Schema Approach Prototype (VSAP). The prototype has been implemented partially as an application written in C/C++ and partially as packages, functions, and procedures stored in the Oracle8i DBMS, using its object-oriented features. All dictionary tables, object tables, and data have been stored in this database. The experiment evaluating hierarchical materialisation has been performed in the Oracle8i (rel ) database management system (see footnote 2).

The Graph of Method Calls used in the experiment is presented in Figure 4. Method m_1 called m_11, m_12, and m_13. The result of m_1 was computed as the sum of the results returned by m_11, m_12, and m_13. Similarly, m_11 called m_111, m_112, and m_113 and summed up their results. m_111, in turn, called m_1111, m_1112, and m_1113 and summed up their results. The same computation pattern was used for the rest of the methods in this GMC. This graph also represents the aggregation hierarchy of objects. A complex object at the root of the hierarchy referenced three objects at the lower level. Each of these lower-level objects referenced a further three objects. As a consequence, each root complex object was composed of 12 component objects. The size of one root complex object, including its components, equalled 14kB. The experiments were performed for 1, 1, 2, and 5 root complex objects, that gave 12, 12, 24, and 6 component objects, respectively. Due to space constraints we present only the results for 5 root complex objects.

[Figure 4. Graph of Method Calls used in the experiment]

6.1 Methods without input arguments
Chart 1 shows the total time overhead for: (1) the execution of a method without materialisation (Exe), (2) the execution of a method together with the materialisation of its result (E+M), (3) reading the materialised result of a method (RM), (4) the invalidation of a method (Inv), and (5) the rematerialisation of a previously invalidated method (Rem). These five kinds of time overhead were measured for methods without input arguments, for 5 root complex objects. In the charts, m_1 and m_11 denote the times for processing methods m_1 and m_11, respectively (cf. Figure 4). Average times, computed per one root complex object, are presented in Chart 2. In this experiment the invalidation of m_11 and m_1 was caused by updating the object used to compute the result of method m_1111 (cf. Figure 4); thus one branch of GMC was invalidated from the very bottom method to the very top method.

In order to measure the usefulness of hierarchical materialisation we computed the following time coefficient: tc = Exe / (Inv + Rem). Taking into account the following times shown in Chart 2: the average method execution time without materialisation (Exe), approximately 11.5 sec for method m_1; the average method invalidation time overhead (Inv), approximately .3 sec for method m_1; and the average method rematerialisation time overhead (Rem), approximately 5.4 sec for method m_1; the value of tc equals approximately 2, meaning that thanks to the hierarchical materialisation technique, method m_1 (the root of the Graph of Method Calls) was recomputed approximately two times faster than it was executed without materialisation.

[Chart 1. Total times [sec] of processing methods m_1 and m_11, for 5 root complex objects (one branch of GMC invalidated); series: Exe, E+M, RM, Inv, Rem]

[Chart 2. Average times [sec] of processing methods m_1 and m_11 (one branch of GMC invalidated); series: Exe, E+M, RM, Inv, Rem]

In our tests it is the execution time of method m_1 that is reduced by a factor of two. This reduction is valid only for a Graph of Method Calls having the pattern shown in Figure 4 and for only one invalidated branch of GMC. For other graphs, having different height and width, the acceleration coefficient will be different. This coefficient also depends on the number of objects being updated, which in turn determines the number of methods whose results have to be invalidated and rematerialised. Even though the methods used in the experiments performed simple arithmetical operations, the hierarchical materialisation technique gave better system performance. A higher increase in system performance can be expected when the materialised methods are more costly to compute than those used in the experiments. The total and average processing times for two invalidated branches of GMC are presented in Charts 3 and 4, respectively.

2 Oracle8i was running under the control of Windows NT, on a PC with Pentium III 55MHz, with 128MB of RAM. The size of the database buffer equalled 16MB.

For two updated leaf objects, which invalidated two branches of GMC, the time coefficient tc equals approximately one.

[Chart 3. Total times [sec] of processing methods m_1 and m_11, for 5 root complex objects (two branches of GMC invalidated); series: Exe, E+M, RM, Inv, Rem]

[Chart 4. Average times [sec] of processing methods m_1 and m_11 (two branches of GMC invalidated); series: Exe, E+M, RM, Inv, Rem]

6.2 Methods with input arguments
The results of the experiments concerning the materialisation and maintenance of methods having two and four input arguments of atomic types are discussed below. In the experiments, each input argument had only two possible values. Total times of processing methods m_1 and m_11 with input arguments, for 5 root complex objects, are shown in Chart 5. One branch of GMC was invalidated and rematerialised. As we can observe, methods having two input arguments are processed in almost the same time as methods having four input arguments. From the analysis of Charts 1 and 5 it follows that the time overhead for materialising and maintaining methods having few input arguments is very low provided that the domains of the input arguments are narrow. For arguments whose domain is wider than two values the time overhead will be higher, as the number of records in MMRS will increase faster. The value of the acceleration coefficient computed for method m_1, for 5 root complex objects, equals approximately .7. This value is approximately the same for methods with two as well as four input arguments. The coefficient computed for methods having input arguments is slightly lower than the coefficient computed for methods without input arguments.

6.3 Storage Space
The additional data structures supporting hierarchical materialisation need storage space.
The space overhead for storing one record in MMRS is computed by the following formula: 64B + nb_arg * domain * 2B, where nb_arg is the number of input arguments of a materialised method and domain is the number of different values an argument can have. For example, storing the materialised results of method m 1 and of all methods invoked from m 1 requires 16kB for two input arguments, each of which can take one of two values. The experiments with 5 root complex objects required additional storage of 4MB for methods without input arguments, 8MB for methods with two input arguments each of which could take one of two possible values, and 16MB for methods with two input arguments each of which could take one of four possible values. This evaluation shows that the size of MMRS grows with the number of input arguments and with the size of their domains. If the domain of an input argument is large, the resulting size of MMRS degrades system performance. The space overhead for storing one record in GMC is constant, equal to 6B, and independent of the number of complex objects being processed. For example, the space for storing the chain of method invocations for m 1 equals 7kB.

7. SUMMARY, CONCLUSIONS AND FUTURE WORK

Support for view materialisation and maintenance is required when applying object oriented views in object relational data warehousing systems. In this paper we presented a framework for object oriented view materialisation and maintenance with respect to methods defined in view classes. To the best of our knowledge, it is the first approach to method materialisation applied to object oriented views. Moreover, we have proposed a novel method materialisation technique, called hierarchical materialisation.
As the experiments showed, this technique reduces the maintenance cost of materialised methods, because unaffected intermediate results do not have to be recomputed and can be reused when recomputing an affected method. Hierarchical materialisation of methods is suitable for environments where updates and deletions of objects are less frequent than queries, e.g., data warehousing systems. Moreover, the technique is suitable only for materialising methods that either have no input arguments or have a few input arguments, each of which can take few different values.

The current implementation of hierarchical materialisation has the following drawbacks.

1. The decision whether a method should be materialised is made explicitly by a view designer during system tuning. He or she has to carefully select the methods for materialisation and their sensitive attributes.
2. The graph of method calls has to reflect the aggregation relationships. In other words, if method m i (in view class V i ) calls m j (in view class V j ), then an aggregation relationship must exist between V i and V j.

3. Only methods whose input arguments are of atomic types can be materialised, i.e., methods whose arguments are objects, values of structured types, or other methods cannot be materialised. Moreover, materialised methods have to return values of atomic types.
4. The body of a method m i being materialised may contain only simple arithmetical operations whose arguments are called methods and/or attributes of the view class in which m i is defined. SQL commands are not allowed in materialised methods, as they would make it difficult to register the used object identifiers and method values in MMRS. One way to tackle the problem would be to build a special environment in which SQL or OQL commands could be executed. In this environment, every object touched by a materialised method could be registered together with the value of the method. A second way would be to dynamically translate an SQL or OQL command into a cursor and execute the command through the cursor. This, however, requires two features of an operational data store, namely support for cursors and support for dynamic SQL (OQL).
5. Materialised methods must not modify the content of a data warehouse.

The main issue that needs further investigation is the development of a technique for selecting, automatically or semi-automatically, the right methods for materialisation. To this end, a cost model describing the complexity of a method needs to be developed. Further work will also concentrate on providing a mechanism for materialising the results of selected methods at selected levels of the Graph of Method Calls.

[Chart 5. Total times of processing methods m 1 and m 11 with input arguments, for 5 root complex objects (one branch of GMC invalidated). Vertical axis: total time [sec]; series: Exe, E+M, RM, Inv, Rem.]


More information

Full-Text and Structural XML Indexing on B + -Tree

Full-Text and Structural XML Indexing on B + -Tree Full-Text and Structural XML Indexing on B + -Tree Toshiyuki Shimizu 1 and Masatoshi Yoshikawa 2 1 Graduate School of Information Science, Nagoya University shimizu@dl.itc.nagoya-u.ac.jp 2 Information

More information

IBPS SO Examination 2013 IT Officer Professional Knowledge Question Paper

IBPS SO Examination 2013 IT Officer Professional Knowledge Question Paper IBPS SO Examination 2013 IT Officer Professional Knowledge Question Paper 1. The tracks on a disk which can be accused without repositioning the R/W heads is (A) Surface (B) Cylinder (C) Cluster 2. Which

More information

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved.

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more

More information

Two-Phase Optimization for Selecting Materialized Views in a Data Warehouse

Two-Phase Optimization for Selecting Materialized Views in a Data Warehouse Two-Phase Optimization for Selecting Materialized Views in a Data Warehouse Jiratta Phuboon-ob, and Raweewan Auepanwiriyakul Abstract A data warehouse (DW) is a system which has value and role for decision-making

More information

Oracle Endeca Information Discovery

Oracle Endeca Information Discovery Oracle Endeca Information Discovery Glossary Version 2.4.0 November 2012 Copyright and disclaimer Copyright 2003, 2013, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered

More information

CPS104 Computer Organization and Programming Lecture 16: Virtual Memory. Robert Wagner

CPS104 Computer Organization and Programming Lecture 16: Virtual Memory. Robert Wagner CPS104 Computer Organization and Programming Lecture 16: Virtual Memory Robert Wagner cps 104 VM.1 RW Fall 2000 Outline of Today s Lecture Virtual Memory. Paged virtual memory. Virtual to Physical translation:

More information

CT75 DATA WAREHOUSING AND DATA MINING DEC 2015

CT75 DATA WAREHOUSING AND DATA MINING DEC 2015 Q.1 a. Briefly explain data granularity with the help of example Data Granularity: The single most important aspect and issue of the design of the data warehouse is the issue of granularity. It refers

More information

Efficient subset and superset queries

Efficient subset and superset queries Efficient subset and superset queries Iztok SAVNIK Faculty of Mathematics, Natural Sciences and Information Technologies, University of Primorska, Glagoljaška 8, 5000 Koper, Slovenia Abstract. The paper

More information

Maintenance of the Prelarge Trees for Record Deletion

Maintenance of the Prelarge Trees for Record Deletion 12th WSEAS Int. Conf. on APPLIED MATHEMATICS, Cairo, Egypt, December 29-31, 2007 105 Maintenance of the Prelarge Trees for Record Deletion Chun-Wei Lin, Tzung-Pei Hong, and Wen-Hsiang Lu Department of

More information

Data Partitioning and MapReduce

Data Partitioning and MapReduce Data Partitioning and MapReduce Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Intelligent Decision Support Systems Master studies,

More information

Selection Queries. to answer a selection query (ssn=10) needs to traverse a full path.

Selection Queries. to answer a selection query (ssn=10) needs to traverse a full path. Hashing B+-tree is perfect, but... Selection Queries to answer a selection query (ssn=) needs to traverse a full path. In practice, 3-4 block accesses (depending on the height of the tree, buffering) Any

More information

Database Management System 9

Database Management System 9 Database Management System 9 School of Computer Engineering, KIIT University 9.1 Relational data model is the primary data model for commercial data- processing applications A relational database consists

More information

Optimized Query Plan Algorithm for the Nested Query

Optimized Query Plan Algorithm for the Nested Query Optimized Query Plan Algorithm for the Nested Query Chittaranjan Pradhan School of Computer Engineering, KIIT University, Bhubaneswar, India Sushree Sangita Jena School of Computer Engineering, KIIT University,

More information

Data Warehousing and Decision Support

Data Warehousing and Decision Support Data Warehousing and Decision Support Chapter 23, Part A Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1 Introduction Increasingly, organizations are analyzing current and historical

More information

Super-Key Classes for Updating. Materialized Derived Classes in Object Bases

Super-Key Classes for Updating. Materialized Derived Classes in Object Bases Super-Key Classes for Updating Materialized Derived Classes in Object Bases Shin'ichi KONOMI 1, Tetsuya FURUKAWA 1 and Yahiko KAMBAYASHI 2 1 Comper Center, Kyushu University, Higashi, Fukuoka 812, Japan

More information

A Translation Framework for Automatic Translation of Annotated LLVM IR into OpenCL Kernel Function

A Translation Framework for Automatic Translation of Annotated LLVM IR into OpenCL Kernel Function A Translation Framework for Automatic Translation of Annotated LLVM IR into OpenCL Kernel Function Chen-Ting Chang, Yu-Sheng Chen, I-Wei Wu, and Jyh-Jiun Shann Dept. of Computer Science, National Chiao

More information

Data Warehousing 11g Essentials

Data Warehousing 11g Essentials Oracle 1z0-515 Data Warehousing 11g Essentials Version: 6.0 QUESTION NO: 1 Indentify the true statement about REF partitions. A. REF partitions have no impact on partition-wise joins. B. Changes to partitioning

More information

Concept as a Generalization of Class and Principles of the Concept-Oriented Programming

Concept as a Generalization of Class and Principles of the Concept-Oriented Programming Computer Science Journal of Moldova, vol.13, no.3(39), 2005 Concept as a Generalization of Class and Principles of the Concept-Oriented Programming Alexandr Savinov Abstract In the paper we describe a

More information

MaanavaN.Com DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING QUESTION BANK

MaanavaN.Com DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING QUESTION BANK CS1301 DATABASE MANAGEMENT SYSTEM DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING QUESTION BANK Sub code / Subject: CS1301 / DBMS Year/Sem : III / V UNIT I INTRODUCTION AND CONCEPTUAL MODELLING 1. Define

More information

Improving the Performance of OLAP Queries Using Families of Statistics Trees

Improving the Performance of OLAP Queries Using Families of Statistics Trees Improving the Performance of OLAP Queries Using Families of Statistics Trees Joachim Hammer Dept. of Computer and Information Science University of Florida Lixin Fu Dept. of Mathematical Sciences University

More information