Horizontal Fragmentation in Object DBMS: New Issues and Performance Evaluation


Fernanda Baião, Marta Mattoso, Gerson Zaverucha
Department of Computer Science - COPPE/UFRJ, Federal University of Rio de Janeiro - Brazil
Department of Computer Science, University of Wisconsin - Madison

Abstract

Horizontal fragmentation may improve the performance of database systems. Defining primary and derived horizontal fragmentation along the classes in a database schema is an important and complex issue, not yet discussed in the literature, which must be considered when horizontally fragmenting a database. In this paper, we focus on an analysis that helps the designer decide between primary and derived horizontal fragmentation. This analysis considers performance results of distributed databases using horizontal fragmentation to evaluate the potential benefits and drawbacks of the primary and derived techniques. Based on it, this work presents a horizontal fragmentation algorithm that chooses the most adequate strategy (primary or derived) based on class relationships, single-class access, and query access frequencies.

1 Introduction

Horizontal fragmentation is often used to achieve better performance in database systems: it reduces the disk access required to execute an application by minimizing the number of irrelevant objects accessed, and it reduces the data transfer among sites [9]. In the object-oriented (OO) and relational-object (RO) models, applications may involve both set operations (search over class extensions) and navigation (traversal through a class path). For this reason, horizontal fragmentation is usually subdivided into primary and derived fragmentation. Primary fragmentation basically optimizes set operations over a class extension, first by reducing the amount of irrelevant data accessed and, second, by permitting applications to be executed concurrently, thus achieving a high degree of parallelism.
On the other hand, derived fragmentation can be viewed as an approach to clustering objects of distinct classes on disk [8], therefore clearly addressing the relationships between classes and improving the performance of applications with navigational accesses.

Note: The authors are partially supported by CNPq.

Defining primary and derived horizontal fragmentation along the classes in a database schema is an important and complex issue, which must be considered when fragmenting a database. Our previous work [1] proposed a complete fragmentation methodology for object Database Management Systems to assist distribution designers in combining the horizontal and vertical fragmentation techniques. The emphasis of that methodology is on an Analysis Phase, which helps the designer decide whether a class should be horizontally fragmented, vertically fragmented, or both. In this paper, we focus on an analysis, performed along our horizontal fragmentation algorithm, that helps the designer decide between primary and derived horizontal fragmentation. This analysis considers performance results of distributed databases using horizontal fragmentation to evaluate the potential benefits and drawbacks of the primary and derived techniques.

We define the (owner, member) relation that will drive the horizontal fragmentation process by selecting classes from the database schema to be primary or derived horizontally fragmented, taking both qualitative and quantitative information into account. The (owner, member) relation reflects the structure of the navigation paths accessed by the most frequent operations. Therefore, by taking the (owner, member) relation into consideration when designing a fragmentation schema, the execution of those operations is likely to be optimized, thus improving the overall system performance.

This work is organized as follows. Section 2 presents a definition of the main concepts involved in the horizontal fragmentation task.
Section 3 discusses related work in this area and presents a summary table comparing its relevant characteristics. Section 4 presents a horizontal fragmentation approach and discusses issues related to primary and derived fragmentation. Performance results are presented in Section 5. Finally, Section 6 concludes this paper.

2 The distribution design of databases

The distribution design task involves making decisions on the fragmentation and placement of data across the sites of a computer network. In a top-down approach, the distribution design has two phases: fragmentation and allocation. The fragmentation phase is the process of clustering into fragments the information accessed simultaneously by applications, while the allocation phase is the process of distributing the generated fragments over the database system sites.

To fragment a class, it is possible to use two basic techniques: vertical fragmentation and horizontal fragmentation. It is also possible to perform mixed (or hybrid) fragmentation on a class, combining both techniques. The importance of mixed fragmentation was already noted by Navathe et al. [13] and Ozsu and Valduriez [14], and was addressed in Baião et al. [1]. In the object model, vertical fragmentation breaks the logical structure of the class (its attributes and methods) and distributes it across the fragments, which will logically contain the same objects, but with different structures. On the other hand, horizontal fragmentation distributes class instances across the fragments, which will have exactly the same structure but different contents. Thus, a horizontal fragment of a class contains a subset of the whole class extension.

Horizontal fragmentation is usually subdivided into primary and derived fragmentation to address the relationships between classes, thus benefiting navigational access from applications. Primary horizontal fragmentation is applied to owner classes, while derived fragmentation is applied to member classes according to the fragmentation of their owner. Defining owner and member classes in a database schema is an important and complex issue, not yet discussed in the literature for the object model, which must be considered when horizontally fragmenting a database. The definition of owner and member classes is presented in Section 4.1.
In the horizontal fragmentation approach, there is also the "path partitioning" technique [14], which we understand to be a special situation in the database schema (the "part-of" relationship) in which derived horizontal fragmentation must be performed. Therefore, we consider the path partitioning technique a special case of derived horizontal fragmentation, rather than an alternative approach in the distribution design.

The design of distributed object databases is a complex task. First, the semantic differences between the relational and object models inhibit a straightforward migration of existing relational distribution design algorithms to the object model. Second, the design has to consider the existence of class methods and complex relationships (such as the "is-a" and "part-of" relationships), in addition to addressing application access to complex objects and multiple relationships between classes. Third, access patterns differ: while relational operations are only set oriented, object operations are pointer based, and therefore may have a dual nature involving both set operations (search over class extensions) and navigation (traversals).

3 Comparing related work in the area

Many researchers have addressed distribution design in the relational model, including Ceri and Navathe [6], Navathe and Ra [12], Ozsu and Valduriez [14], Molina and Hsu [11] and Navathe et al. [13]. In the object context, there are also many works showing the importance of distribution design for improving the performance of applications that manipulate large volumes of data in object DBMS.

Karlapalem et al. [9] describe different aspects of a distributed object DBMS that are critical to the distribution design process, namely the data model, method invocation, types of location transparency, and transaction management. The authors develop some preliminary ideas for designing fragmentation algorithms in the object context.
Some works in the literature focus on horizontal fragmentation in object DBMS. Savonnet et al. [15] propose a methodology for the horizontal fragmentation of all classes in a database schema. The choice between primary and derived horizontal fragmentation for each class considers its relationships, which are defined by analyzing only the method calls between classes in the schema. The work does not present an algorithm to support the methodology.

Bellatreche et al. [3] present a horizontal fragmentation method and an analytical cost model to evaluate query execution time in horizontally fragmented databases. The fragmentation schema with the best performance according to the cost model is obtained through a hill-climbing algorithm, by selecting a subset of classes for primary horizontal fragmentation.

The work of Ezeife and Barker [7] presents a set of algorithms for the horizontal fragmentation of all classes in a database schema. It takes relationships between classes into account to propose primary and derived horizontal fragmentation. However, this approach works at the instance level: the class instances must already exist in the database for the fragmentation process to proceed. It also assumes a storage structure for the objects in the database class hierarchy in which an instance of a subclass physically contains a pointer to the instance of its superclass that is logically part of it. This assumption leads to considering inheritance links in the horizontal fragmentation process.

Table 1: Related works on the distribution design of object databases

Baião et al. [1] propose a complete fragmentation methodology for object DBMS, which is divided into three phases. First, there is an Analysis Phase to assist distribution designers in defining the most adequate fragmentation technique (horizontal, vertical, or both) to be applied to each class of the database schema. The Analysis Phase also considers the case in which not fragmenting a class is the best alternative. Second, they present an algorithm to perform Vertical Fragmentation on a class. Finally, the authors present an algorithm to perform Horizontal Fragmentation on the whole class or on a vertical fragment of a class, which may result in mixed fragmentation.

The main characteristics of all the fragmentation works mentioned above are summarized in Table 1. For more details on the allocation phase, the reader may refer to [2].

4 A horizontal fragmentation approach to object DBMS

Traditionally, most works in the literature base the choice between primary and derived horizontal fragmentation on the owner and member classification. This makes the classification one of the crucial aspects to be considered while designing a distributed database. The owner-member classification was first defined for the relational model in [14]. In the object model, however, the owner and member classification is just as important but not as simple as its relational counterpart. Many issues in the object model must be taken into account in this classification, such as the existence of complex objects, part-of relationships between classes, method calls, classes with no instances (pseudo-classes) and n x m relationships that may lead to object sharing. Important information such as the application access frequency to each relationship of a class must also be considered.
This section discusses the definition of owner and member classes in a database schema, taking both qualitative and quantitative information into account. Although this is a relevant issue when designing a distributed database with horizontal fragmentation, none of the works in the literature considers it at this level of detail.

4.1 The (owner, member) relation

To define owner and member classes, we first define the (owner, member) relation. The (owner, member) relation is a set of pairs of the form (X, Y), where X and Y must be classes in the database schema. Each pair (X, Y) in the relation is called an instance of it. An instance (X, Y) denotes that class Y will be selected for derived horizontal fragmentation according to class X (class X is called the owner of Y). If, at the end of the definition of the (owner, member) relation, there is no other instance (A, X) in it (that is, if class X does not play the member role in any of the relation instances), then class X will be selected for primary horizontal fragmentation. It is also possible to define an instance of the form (X, null), denoting that class X will be selected for primary horizontal fragmentation, even though there may not be any other class selected for derived fragmentation according to X.

Some restrictions apply to the definition of the (owner, member) relation: it is not possible to have an instance (X, X), since it is useless to define derived horizontal fragmentation on a class according to itself; it is not possible to have a pair of instances (X, A), (Y, A), since it is not possible to define derived horizontal fragmentation on a class A according to more than one primary class (X and Y); and it is not possible to have a pair of instances (X, A), (A, X), since it is not possible to define both primary and derived horizontal fragmentation on a pair of classes according to each other.
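The three restrictions above can be checked mechanically. The following Python sketch is only an illustration under our own assumptions: the relation is represented as a list of (owner, member) tuples, `None` stands for null, and the function names `check_relation` and `primary_classes` are hypothetical and not part of the paper.

```python
def check_relation(relation):
    """Return the restriction violations found in an (owner, member) relation.

    `relation` is a list of (owner, member) tuples; member is None for (X, null).
    """
    violations = set()
    owner_of = {}          # member class -> first owner seen for it
    pairs = set(relation)
    for owner, member in relation:
        if member is None:
            continue
        if owner == member:
            # Restriction 1: no instance (X, X).
            violations.add(f"({owner}, {member}): a class cannot be derived from itself")
            continue
        if owner_of.setdefault(member, owner) != owner:
            # Restriction 2: no pair (X, A), (Y, A).
            violations.add(f"{member}: derived from more than one owner")
        if (member, owner) in pairs:
            # Restriction 3: no pair (X, A), (A, X); report each pair once.
            a, b = sorted((owner, member))
            violations.add(f"({a}, {b}): classes derived from each other")
    return sorted(violations)


def primary_classes(relation):
    """Classes that appear as owner and never play the member role."""
    members = {m for _, m in relation if m is not None}
    owners = {o for o, _ in relation}
    return owners - members


# The relation reported for Strategy 1 in Section 5.
strategy_1 = [
    ("CompositePart", None),
    ("CompositePart", "Document"),
    ("CompositePart", "AtomicPart"),
    ("AtomicPart", "Connection"),
]
print(check_relation(strategy_1))   # no violations expected
print(primary_classes(strategy_1))
```

For the Strategy 1 relation, only CompositePart never plays the member role, so it is the only class selected for primary fragmentation.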

The OO7 Benchmark data model [4], illustrated in Figure 1, will be used to exemplify the following discussion. The OO7 Benchmark has been applied to many object DBMS in order to evaluate their performance. In Section 5, we show some performance results on an alternative fragmentation schema supporting our proposed horizontal fragmentation approach.

Figure 1: The OO7 benchmark database schema

Given a database schema to be horizontally fragmented and a set of operations (queries and transactions) on it, the (owner, member) relation can be defined. Operations are sorted in descending order of execution frequency (thus priority is given to the most frequent operations) and are then analyzed one at a time with regard to the classes they access. When analyzing an operation Oi, one of the two following situations may occur:

i. Oi accesses only one class extension, named X: in this case, an instance (X, null) is included in the (owner, member) relation. This is a very frequent situation in real applications, for example when performing a selection over a class extension or scanning a whole class extension without navigating to other classes:

E.g.1: select x from x in AtomicPart where x.builddate < 10/11/96
       (owner, member) = (AtomicPart, null)
E.g.2: select x.type, x.builddate from x in CompositePart
       (owner, member) = (CompositePart, null)

ii. Oi navigates through a class path, named X1->X2->...->Xn: in this case, for each pair of classes (Xi, Xi+1) in the class path, 1 <= i <= n-1, an instance (Xi, Xi+1) is included in the (owner, member) relation if at least one of the following conditions occurs:

ii.a. Existence of a "1 x 1" or "n x 1" relationship: if Oi navigates from Xi to Xi+1 by way of a pointer representing a "1 x 1" or an "n x 1" relationship. In these cases, the relationship will be translated into a complex attribute of class Xi (named the composite class) with its domain on class Xi+1 (named the containing class).
The relationships documentation (from CompositePart to Document) and to (from AtomicPart to Connection) are examples of "1 x 1" and "n x 1" relationships, respectively. Such cases are likely to occur in real applications that navigate through the database schema according to the defined relationships between classes.

E.g.3: select c.builddate from c in CompositePart where c->documentation.title = "Algorithm"
       (owner, member) = (CompositePart, Document)
E.g.4: select a->to from a in AtomicPart where a.builddate < 10/11/96
       (owner, member) = (AtomicPart, Connection)

An important thing to notice is that the member class is always defined at the "1" side of the relationship. If necessary in "n x 1" relationships, we replace the instance (Xi, Xi+1) with the instance (Xi+1, Xi) in the (owner, member) relation to make sure that the member class is at the "1" side. This prevents a member object from having more than one owner (a connection is related to one and only one atomic part); thus there is no object sharing in the member class, which eliminates the overhead of replicating member objects in many owner fragments. For the same reason, we do not create an instance in the (owner, member) relation in the case of "n x m" relationships. We believe that derived horizontal fragmentation does not contribute to the performance improvement of an application when there is object sharing in both classes involved in the relationship;

ii.b. Existence of a "part-of" relationship: this may be considered a special case of the above situation (since "part-of" relationships typically have an "n x 1" cardinality). However, the semantics of this type of relationship make it a strong candidate for defining derived fragmentation of the "part" class according to the "whole" class. Therefore, it is important to stress that the instance (Xi, Xi+1) must be included in the (owner, member) relation if there is a "part-of" relationship from Xi (the "whole" class) to Xi+1 (the "part" class).
This may be illustrated by the relationship parts from CompositePart to AtomicPart in Figure 1.

E.g.5: select c->parts from c in CompositePart
       (owner, member) = (CompositePart, AtomicPart)

ii.c. Existence of a method invocation sequence: if Oi navigates from Xi to Xi+1 by way of a method call. In this case, class Xi has a complex method, that is, a method accessing objects from class Xi+1. Such cases are likely to occur in real applications that navigate through the database schema

according to the relationships between classes defined in the method body. For example, we may define a method length() in class AtomicPart as the sum of the attribute length of all its to Connections.

E.g.6: select x from x in AtomicPart where x.length() > 10
       (owner, member) = (AtomicPart, Connection)

Differently from [7], we do not take inheritance relationships (such as the ISA relationship) into account when defining member classes in the database schema. Most object DBMS products do not implement a storage structure for the objects in the database class hierarchy in which an instance of a subclass physically contains a pointer to the instance of its superclass that is logically part of it. Therefore, considering inheritance links to drive the derived horizontal fragmentation process would add useless overhead to the fragmentation algorithm and lead to an unnecessary derived fragmentation of a superclass according to its subclass, since the inheritance links will not exist. This would surely impact the performance of the distributed database. The algorithm defining the (owner, member) relation is illustrated in Figure 2.

function BuildOwnerMemberRelation (O: set of operations)
returns (own, mem): set of pairs (owner, member) of classes
begin
  sort O in descending order according to the operation frequency
  for each Oi that is in O do
    if Oi accesses only 1 class C then
      (own, mem) += (C, null);
    else if Oi navigates through a class path C1->C2->...->Cn then
      for each pair of classes (X, Y) that is in the path do
        set card = cardinality of the relationship between X and Y
        if (card = "1:1") or (card = "N:1") then
          if Y is not a member in (own, mem) then
            (own, mem) += (X, Y);
        else if card = "1:N" then
          if X is not a member in (own, mem) then
            (own, mem) += (Y, X);
  return (own, mem)
end

Figure 2: Defining the (owner, member) relation

4.2 Fragmenting the database

Once all the database operations are analyzed, the (owner, member) relation is completely instantiated.
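The procedure in Figure 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the `Operation` record, the representation of a navigation as a list of (X, Y, cardinality) hops, and the query frequencies are all hypothetical. Beyond the checks shown in Figure 2, the sketch also skips an instance when the candidate member already has a (X, null) instance created by a more frequent operation, and skips (X, null) when X is already a member, which is our reading of the frequency rule discussed in Section 4.2; it also enforces the "(X, A), (A, X)" restriction of Section 4.1.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    name: str
    frequency: int
    root: str                                   # class whose extension is accessed
    hops: list = field(default_factory=list)    # (X, Y, cardinality) navigations


def build_owner_member_relation(operations):
    relation = []  # (owner, member) tuples; member is None for (X, null)

    def is_member(cls):
        return any(member == cls for _, member in relation)

    # Most frequent operations are analyzed first, so they take priority.
    for op in sorted(operations, key=lambda o: o.frequency, reverse=True):
        if not op.hops:
            # Situation i: the operation accesses a single class extension.
            if not is_member(op.root) and (op.root, None) not in relation:
                relation.append((op.root, None))
            continue
        # Situation ii: the operation navigates through a class path.
        for x, y, card in op.hops:
            if card in ("1:1", "N:1"):
                owner, member = x, y
            elif card == "1:N":
                owner, member = y, x
            else:
                continue  # "n x m" relationships create no instance
            blocked = (
                is_member(member)                  # already derived from some owner
                or (member, None) in relation      # more frequent single-class access
                or (member, owner) in relation     # would make the pair mutual
            )
            if not blocked:
                relation.append((owner, member))
    return relation


# Queries of Section 5 with hypothetical frequencies; cardinalities follow
# the notation used in Section 4.1 for documentation, to and parts.
q1 = Operation("Query 1", 30, "AtomicPart")
q2 = Operation("Query 2", 20, "AtomicPart", [("AtomicPart", "Connection", "N:1")])
q3 = Operation("Query 3", 10, "CompositePart",
               [("CompositePart", "AtomicPart", "N:1"),
                ("CompositePart", "Document", "1:1")])
print(build_owner_member_relation([q1, q2, q3]))
```

With these frequencies AtomicPart is kept for primary fragmentation, as in Strategy 2 of Section 5; if Query 3 were the most frequent operation, AtomicPart would instead become a member of CompositePart, as in Strategy 1.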
Classes from the database schema may then proceed to the fragmentation phase. Owner classes that do not play the member role in any of the (owner, member) relation instances are selected for primary horizontal fragmentation, while member classes are selected for derived horizontal fragmentation according to their owner. If a class X appears in the (owner, member) relation in both the (X, null) and (Y, X) forms, we must choose the fragmentation technique to be applied to class X: primary, according to instance (X, null), or derived, according to instance (Y, X). This choice is made by considering the operations that were responsible for creating each instance. The instance created by the operation with the lower frequency is eliminated from the (owner, member) relation. Note that this may break the class path that is accessed by a navigation application (N), thus reducing its performance. However, this will only occur when there is a more frequent operation (E) accessing the extension of a class in the middle of the N path. Selecting this class for primary fragmentation (instead of derived) will improve the performance of E, thus benefiting the most frequent operation. This situation is clearly shown in the example in Section 5. The algorithm in Figure 2 prevents this situation from happening by inserting an instance (X, null) in the relation only if X is not already a member.

Primary horizontal fragmentation. The algorithm used for primary horizontal fragmentation is an extension of the one in [13]. The algorithm takes as input information on the applications accessing the class to be primary fragmented, such as their predicates and execution frequencies, in order to identify groups of objects with similar characteristics that are likely to be accessed simultaneously by applications. These groups of objects will represent the class fragments.
The fragmentation process is performed in two steps: first, it builds a predicate affinity matrix between the simple predicates used in the applications, and then it builds a predicate affinity graph with cycles representing class fragments. To build the predicate affinity matrix, predicates are extracted from the applications and represent the matrix dimensions (rows and columns). Each value (pi, pj) in the predicate affinity matrix represents the sum of the frequencies of the applications that access predicates pi and pj simultaneously. Logical relationships between predicates (such as logical implication) are also maintained in the predicate affinity matrix in order to reduce the number of class fragments defined. To build the predicate affinity graph, a graph-based algorithm is applied in order to group predicates into sets of predicates. Each predicate represents a graph node, and graph links between the nodes are inserted one at a time by selecting the highest value in the predicate affinity matrix that has not been considered yet. Eventually, the inclusion of a graph link may form a cycle in the graph. After building a connected graph with all the predicates, each graph cycle will represent a class fragment. A class fragment is defined by a boolean combination of the predicates in the cycle using the logical connectives AND and OR. An additional ELSE fragment is defined, which is the negation of the conjunction of all predicate definitions, to gather objects of the class that do not fall in any of the previously defined fragments. Also, the result of predicate partitioning is adjusted, if necessary, in order to generate non-overlapping fragments only.
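As an illustration of the first step only, the sketch below builds a predicate affinity matrix and lists the candidate graph links in the order in which they would be considered. It does not implement cycle detection, logical implications between predicates, or the ELSE fragment. The predicate labels, application frequencies and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def predicate_affinity(applications):
    """Sum, for each pair of predicates, the frequencies of the
    applications that use both predicates.

    `applications` is an iterable of (frequency, predicates) pairs.
    """
    affinity = defaultdict(int)
    for frequency, predicates in applications:
        # Each unordered pair of distinct predicates is counted once.
        for p, q in combinations(sorted(set(predicates)), 2):
            affinity[(p, q)] += frequency
    return dict(affinity)


def links_by_affinity(affinity):
    """Candidate graph links, highest affinity first (ties broken by name)."""
    return sorted(affinity, key=lambda pair: (-affinity[pair], pair))


# Hypothetical workload: three applications over predicates p1, p2, p3.
applications = [
    (30, ["p1", "p2"]),
    (20, ["p1", "p2", "p3"]),
    (10, ["p2", "p3"]),
]
matrix = predicate_affinity(applications)
print(matrix)
print(links_by_affinity(matrix))
```

Here the pair (p1, p2) is used together by applications with frequencies 30 and 20, so its affinity is 50 and its link is considered first.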

Derived horizontal fragmentation. The definition of derived horizontal fragments is straightforward, since it uses the previously defined (owner, member) relation as a guideline. In order to group in one horizontal fragment the objects from different classes referenced by the same navigation operation, the distribution designer must define derived horizontal fragments of each member class (i.e., the class playing the member role) in the (owner, member) relation according to its owner.

5 Experimental results

To analyze the behavior of horizontal fragmentation, we present some experimental results involving the OO7 benchmark. Experiments were made with the ParGoa system [10], a parallel object server. ParGoa is ODMG compliant [5], with ODL and OQL as interface languages. The ParGoa server is responsible for the parallel processing of object-oriented queries. The parallel processing relies on data fragmentation; thus distribution design plays an important role, particularly in distributed memory environments. The ParGoa tests were performed on a cluster of IBM RS/6000 (PowerPC) stations connected by Ethernet. Each workstation had 32MB of main memory. The IBM stations in the cluster were not isolated, and the PVM software was used to interconnect the ParGoa modules.

We present performance results derived from [10]. While the experimental study in that work aimed at presenting speed-up results, here we focus on analyzing the performance results obtained with two different horizontal fragmentation designs. The main objective of this analysis is to evaluate the performance impact of two distribution design decisions involving the (owner, member) relation. In particular, we show distributed query results involving classes that can play the role of either owner or member. This is a situation discussed in Section 4, where most algorithms leave this choice to the designer. Thus, the following results have helped the design of our algorithm.
For the medium-sized OO7 database, the four chosen classes (from Figure 1), Document, CompositePart, AtomicPart and Connection (DC, CP, AP and CN), had their extensions fragmented according to two horizontal design strategies, hereinafter called Strategy 1 and Strategy 2. Strategy 1 favors derived fragmentation by applying derived fragmentation to class AtomicPart according to CompositePart. Strategy 2 puts more emphasis on primary fragmentation, since it applies primary fragmentation to class AtomicPart on the builddate attribute. The resulting fragmentations present the following (owner, member) relations:

Strategy 1: { (CP, null), (CP, DC), (CP, AP), (AP, CN) }
Strategy 2: { (CP, null), (CP, DC), (AP, null), (AP, CN) }

Since all these relationships are either 1 x 1 or 1 x n, in Strategy 1 related objects are kept in the same site, while in Strategy 2 the links between CP and AP objects may cross the boundaries between nodes. In both strategies, six fragments were generated for each of the four classes, so that we would have the same number of fragments and nodes. For each strategy, the following 3 queries were executed.

Query 1: select x from x in AtomicPart where x.builddate < 10/11/96
         (owner, member) = (AP, null)
Query 2: select a->to from a in AtomicPart where a.builddate < 10/11/96
         (owner, member) = (AP, CN)
Query 3: select c->parts from c in CompositePart where c->documentation.title = "DBMS"
         (owner, member) = { (CP, AP), (CP, DC) }

Each of these 3 queries was evaluated for the centralized database and for the two fragmented databases (Strategy 1 and Strategy 2). The results in Figure 3 correspond to the elapsed processing time in seconds, considering the time interval between the reception of the query from the master node and the delivery of results from all six nodes. The results show the performance for cold situations, where the cache was empty and remote access was required. Hot results are not shown here, since the memory cache masks the communication and transfer time.
To reduce the interference effects due to not having isolated workstations, we re-ran each query 20 times.

Figure 3: Performance results (Query 1, Query 2, Query 3)

All distributed query executions show performance improvements when compared to the sequential and centralized database. Queries 1 and 2 contain the same predicate that was used in Strategy 2 for the primary fragmentation of AtomicPart. Thus, in Strategy 1 all AtomicPart fragments have to be scanned in these queries, whereas in Strategy 2 the execution can be directed to one specific node. Therefore, the results of Strategy 2 are at least two times faster than in Strategy 1. Particularly in

Query 2, Strategy 2 performed five times faster than the centralized execution. This is a very significant result, since this query is the most time consuming among the three queries. On the other hand, Query 3 has a class path involving the relationship link between CompositePart and AtomicPart, which caused Strategy 1 to outperform Strategy 2. This can be explained by the adequacy of Strategy 1 for this kind of query. The execution of this query in Strategy 2 hardly improved on the centralized performance, because it forced many remote accesses to follow the relationship links. This was not the case in any of the Strategy 1 executions.

Therefore, the results in Figure 3 show that there is a tradeoff in choosing the fragmentation strategy. Most algorithms would direct the fragmentation of the AtomicPart class to be derived, as in Strategy 1. However, the improvements obtained with the primary fragmentation of AtomicPart in Strategy 2 were quite significant. Therefore, we believe that this choice should rely on the access frequencies of the queries. We can also see that derived fragmentation is a good idea, as it was in the relational model. However, maintaining long chains of derivation may incur data skew. This was not the case in the OO7 database, where the parallel processing in Strategy 1 was quite uniform.

6 Conclusions

This work shows performance improvements obtained by applying distribution design and parallel processing on top of an object-oriented DBMS. The performance of a distributed object DBMS can be improved by minimizing the number of irrelevant objects accessed by the applications, as happens with primary horizontal fragmentation. It can also be improved by reducing the data transfer among sites, as happens with derived horizontal fragmentation. Therefore, the combination of these two objectives depends on the decision on which classes will be primary or derived fragmented.
Our performance results show improvements for both primary and derived fragmentation, and they reveal a conflicting situation for classes that may be either owner (primary) or member (derived). Previous distribution design algorithms usually ignore this issue, treating the choice between owner and member as trivial. Therefore, we present a definition of owner and member classes and an algorithm that carefully examines the role of each class to be fragmented, considering its relationships, cardinalities and the access frequencies of the queries. An important characteristic of the presented definition of owner and member classes is that it reflects the structure of the navigation paths accessed by the most frequent applications. For applications navigating through a class path, most algorithms would suggest a fully derived fragmentation. However, our experimental results show that when there is a frequent query accessing a member class individually, primary fragmentation of this class should be considered, despite breaking the relationship link in the class path.
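The cost of breaking a relationship link in a class path can be illustrated with a toy placement. This is a minimal sketch under stated assumptions: the object identifiers, the two sites, the build-date predicate and all helper names are hypothetical, and the schema is reduced to a single owner-to-member link (CompositePart to AtomicPart).

```python
from typing import Callable, Dict, Iterable

def derive_fragments(owner_site: Dict[str, int],
                     member_owner: Dict[str, str]) -> Dict[str, int]:
    """Derived fragmentation: each member object is stored with its owner."""
    return {m: owner_site[o] for m, o in member_owner.items()}

def primary_fragments(members: Iterable[str],
                      site_of: Callable[[str], int]) -> Dict[str, int]:
    """Primary fragmentation: each member is placed by a predicate on itself."""
    return {m: site_of(m) for m in members}

def remote_links(owner_site: Dict[str, int],
                 member_owner: Dict[str, str],
                 member_site: Dict[str, int]) -> int:
    """Count owner-to-member links whose two endpoints sit on different sites."""
    return sum(1 for m, o in member_owner.items()
               if member_site[m] != owner_site[o])

# Hypothetical data: two composite parts on two sites, four atomic parts.
owner_site = {"cp1": 0, "cp2": 1}
member_owner = {"ap1": "cp1", "ap2": "cp1", "ap3": "cp2", "ap4": "cp2"}
build_date = {"ap1": 10, "ap2": 90, "ap3": 20, "ap4": 80}

derived = derive_fragments(owner_site, member_owner)
# Assumed predicate: older parts on site 0, newer parts on site 1.
primary = primary_fragments(member_owner,
                            lambda m: 0 if build_date[m] < 50 else 1)

print(remote_links(owner_site, member_owner, derived))
print(remote_links(owner_site, member_owner, primary))
```

In this toy data, derived placement keeps every link local, while the predicate-based placement of the member class sends half of the links across sites. A navigation query pays for those remote links, whereas a query that filters the member class on the same predicate can touch a single site.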


More information

Rekayasa Perangkat Lunak 2 (IN043): Pertemuan 8. Data Management Layer Design

Rekayasa Perangkat Lunak 2 (IN043): Pertemuan 8. Data Management Layer Design Rekayasa Perangkat Lunak 2 (IN043): Pertemuan 8 Data Management Layer Design Data Management Layer Focus on how to manage data are stored that can be handled by the programs that run the system, including:

More information

Outline. Distributed DBMS Page 5. 1

Outline. Distributed DBMS Page 5. 1 Outline Introduction Background Distributed DBMS Architecture Distributed Database Design Fragmentation Data Location Semantic Data Control Distributed Query Processing Distributed Transaction Management

More information

Outline. File Systems. Page 1

Outline. File Systems. Page 1 Outline Introduction What is a distributed DBMS Problems Current state-of-affairs Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing

More information

Design A Web Based Service to Promite Telemedicine Management System

Design A Web Based Service to Promite Telemedicine Management System Design A Web Based Service to Promite Telemedicine Management System Mr.S.RamanaReddy, Mrs.B.Swapna ABSTRACT: Many web computing systems are running real time database services where their information

More information

A Performance Study of Hashing Functions for. M. V. Ramakrishna, E. Fu and E. Bahcekapili. Michigan State University. East Lansing, MI

A Performance Study of Hashing Functions for. M. V. Ramakrishna, E. Fu and E. Bahcekapili. Michigan State University. East Lansing, MI A Performance Study of Hashing Functions for Hardware Applications M. V. Ramakrishna, E. Fu and E. Bahcekapili Department of Computer Science Michigan State University East Lansing, MI 48824 frama, fue,

More information

Efficient integration of data mining techniques in DBMSs

Efficient integration of data mining techniques in DBMSs Efficient integration of data mining techniques in DBMSs Fadila Bentayeb Jérôme Darmont Cédric Udréa ERIC, University of Lyon 2 5 avenue Pierre Mendès-France 69676 Bron Cedex, FRANCE {bentayeb jdarmont

More information

Teaching Scheme Business Information Technology/Software Engineering Management Advanced Databases

Teaching Scheme Business Information Technology/Software Engineering Management Advanced Databases Teaching Scheme Business Information Technology/Software Engineering Management Advanced Databases Level : 4 Year : 200 2002 Jim Craven (jcraven@bournemouth.ac.uk) Stephen Mc Kearney (smckearn@bournemouth.ac.uk)

More information

Extendible Chained Bucket Hashing for Main Memory Databases. Abstract

Extendible Chained Bucket Hashing for Main Memory Databases. Abstract Extendible Chained Bucket Hashing for Main Memory Databases Pyung-Chul Kim *, Kee-Wook Rim, Jin-Pyo Hong Electronics and Telecommunications Research Institute (ETRI) P.O. Box 106, Yusong, Taejon, 305-600,

More information

INTEGRATED MANAGEMENT OF LARGE SATELLITE-TERRESTRIAL NETWORKS' ABSTRACT

INTEGRATED MANAGEMENT OF LARGE SATELLITE-TERRESTRIAL NETWORKS' ABSTRACT INTEGRATED MANAGEMENT OF LARGE SATELLITE-TERRESTRIAL NETWORKS' J. S. Baras, M. Ball, N. Roussopoulos, K. Jang, K. Stathatos, J. Valluri Center for Satellite and Hybrid Communication Networks Institute

More information

Data parallel algorithms 1

Data parallel algorithms 1 Data parallel algorithms (Guy Steele): The data-parallel programming style is an approach to organizing programs suitable for execution on massively parallel computers. In this lecture, we will characterize

More information