Servicing Range Queries on Multidimensional Datasets with Partial Replicas


Li Weng, Umit Catalyurek, Tahsin Kurc, Gagan Agrawal, Joel Saltz
Department of Computer Science and Engineering
Department of Biomedical Informatics
Ohio State University, Columbus, OH

Abstract

Partial replication is one type of optimization to speed up execution of queries submitted to large datasets. In partial replication, a portion of the dataset is extracted, re-organized, and re-distributed across the storage system. The objective is to reduce the volume of I/O and increase I/O parallelism for different types of queries and for the portions of the dataset that are likely to be accessed frequently. When multiple partial replicas of a dataset exist, a query execution plan should be generated so as to use the best combination of subsets of partial replicas (and possibly the original dataset) to minimize query execution time. In this paper, we present a compiler and runtime approach for range queries submitted against distributed scientific datasets. A heuristic algorithm is proposed to choose the set of replicas that reduces query execution time. We show the efficiency of the proposed method using datasets and queries from oil reservoir simulation studies on a cluster machine.

1 Introduction

A number of challenges must be overcome to carry out on-demand, interactive analysis of scientific datasets. The very large size of data in many scientific applications is a major problem; in many situations, large dataset sizes necessitate the use of distributed storage. Another challenge is to efficiently extract subsets of data for processing and data analysis. Data subsets can be specified by spatio-temporal range queries or by queries on other attributes such as pressure, permeability, and speed. Several optimizations can be employed to efficiently search for and extract the data of interest for analysis.
It is a common optimization to partition a dataset into chunks, each of which holds a subset of data elements, and use chunks as the unit of I/O and data management. When data is stored and retrieved in chunks, the number of disk seeks, and hence I/O latency, is reduced. In addition, managing a dataset as a set of chunks decreases the cost of metadata management, indexing, and data declustering. Indexing is another optimization technique to speed up query execution [11]. It enables rapid searching and extraction of metadata for data elements that intersect a given query. Efficient access to data also depends on how well the data has been distributed across multiple storage nodes. The goal of declustering [9, 14] is to distribute the data across as many storage units as possible so that data elements that satisfy a query can be retrieved from many sources in parallel. Caching is yet another optimization that targets multiple-query workloads [1, 10, 19, 21]. Caching attempts to reduce the execution time of queries through data and computation reuse by maintaining the results of queries previously executed in the system. The performance of these techniques depends on the data access and processing patterns of an application. If there is only one type of query, the dataset can be declustered and indexed to optimize the execution of that type of query. However, when there are multiple types of queries, a single indexing and declustering scheme may not give good performance for all query types. Partial replication is an optimization technique that can be employed when multiple queries and/or multiple types of queries are expected.

(This research was supported by the National Science Foundation under Grants #ACI (UC Subcontract # ), #EIA , #EIA , #ANI , #ACI , #ACI , #CCF , #CNS , #CNS , Lawrence Livermore National Laboratory under Grant #B and #B (UC Subcontract # ), NIH NIBIB BISTI #P20EB000591, and Ohio Board of Regents BRTTC #BRTT.)
The objective is to decrease the volume of I/O and increase I/O parallelism by re-organizing and re-distributing a portion of the dataset across the storage system. For example, if the elements of a dataset are organized into chunks and indexed based on a single attribute or a subset of attributes (e.g., atomic species or 3D coordinates), queries that involve only those attributes can be processed efficiently. Queries that involve another attribute, on the other hand, may not be answered efficiently. If a secondary index is created for that attribute, a full scan of the dataset can be avoided. However, this may still yield inefficient execution due to random seeks to different locations in data files to access tuples that match the query criteria. When partial replication optimizations are employed, a query can be answered from data subsets and replicas of datasets. Since partial replication can reorganize and redistribute a selected portion of a dataset, there may not be a one-to-one mapping between the chunks in the original dataset and those in the replica. In addition, it is possible that no single replica can fully answer the query. Thus, it becomes challenging to figure out how to piece together the requested data from multiple replicas and possibly from the original dataset. In this paper, we present a compiler and runtime approach for efficient execution of multidimensional range queries when partial replicas of a dataset exist. We present a compile-time query planning strategy to select the best combination of replicas in order to minimize query execution time in a distributed environment. The efficiency of the proposed strategy is demonstrated on a parallel machine using queries for analysis of data from oil reservoir simulations.

2 Related Work

Several research projects have looked at improving I/O performance using different declustering techniques [9, 14]. Parallel file systems and I/O libraries have also been widely studied, and many such systems and libraries have been developed [3, 7, 13]. These systems mainly focus on supporting regular strided access to uniformly distributed datasets, such as images, maps, and dense multi-dimensional arrays. In the context of replication, most of the previous work targets issues like data availability during disk and/or network failures, as well as speeding up I/O performance by creating exact replicas of datasets [4, 5, 6, 16, 18, 20]. File-level and dataset-level replication and replica management have been studied [6, 16, 12]. In the area of partial replication [4, 5, 20], the focus has been on creating exact replicas of portions of a dataset to achieve better I/O performance.
Our goal is different in the sense that we investigate strategies for evaluating queries when (partial) replicas are stored. In the data caching context, Dahlin et al. [8] have evaluated algorithms that make use of remote memory in a cluster of workstations. Franklin et al. [10] have studied ways of augmenting server memory by utilizing client resources. Shrira and Yoder [19] have proposed a technique for managing cooperative caches in untrusted environments. Venkataraman et al. [21] have proposed a new buffer management policy, which approximates a global LRU page replacement policy. Andrade et al. [1] implement an active semantic cache framework for data reuse among multiple data analysis queries submitted against scientific multi-dimensional datasets. While these approaches can be employed for the management and replacement of replicas, our work focuses on improving query performance by re-organizing portions of input datasets.

3 Overview

3.1 Motivating Application

In simulation-based oil reservoir management studies [17], the goal is to investigate changes in reservoir characteristics and assess how injection and production wells should be placed to maximize oil production and at the same time minimize effects on the environment. An oil reservoir is a 3-dimensional subsurface volume composed of different types of rocks, soil, holes, and underground waterways. A complex numerical model simulates 17 different variables (such as gas pressure, oil saturation, and oil velocity) at each point of the reservoir mesh over many time steps. Due to computational and storage requirements, large-scale simulation runs require the use of distributed computers and storage systems. In order to improve I/O performance, a simulation dataset can be partitioned into a set of chunks, i.e., data elements are organized into application-specific groups and stored on disk.
A chunk is the unit of I/O, i.e., data is retrieved from disk in chunks, even if only a subset of the data elements in a chunk is needed. One possible grouping of data elements divides each grid along the x, y, and z dimensions as well as the time dimension into chunks (i.e., 4-dimensional rectangles). The chunks are stored on disk in data files. Each chunk contains a subset of grid points and time steps and the values of the variables computed at those grid points and time steps. The metadata associated with a chunk is the minimum bounding box of the chunk, the name of the file that contains the chunk, the offset in the file, and the size of the chunk. An R-Tree index [11] can be built on the space and time dimensions to store the bounding boxes of chunks. The index enables rapid search of chunks that intersect a given query. On a system with multiple storage devices, the chunks can be distributed across storage devices to achieve parallelism when a query accesses them.

3.2 Runtime Middleware

We employed a middleware framework, called STORM, to implement the runtime support for execution of queries. STORM [15] is a suite of services that collectively provide basic database support for 1) selection of the data of interest from scientific datasets stored in files and 2) transfer of selected data from storage nodes to compute nodes for processing. The data of interest can be selected based on attribute values, ranges of values, and user-defined filters. STORM supports user-defined distribution of data elements across destination nodes and transfer of the selected data subset from distributed source nodes to destination nodes according to that distribution. Scientific datasets are usually stored in a set of flat files with application-specific file formats. In order to provide common support for different applications, STORM employs virtual tables and select query abstractions, which are based on the object-relational database model.

STORM has been developed using a component-based framework, called DataCutter [2], which enables execution of application data processing components in a distributed environment. Using the DataCutter runtime system, STORM implements several optimizations to reduce the execution time of queries. Distributed Execution of Filtering Operations: Both data and task parallelism can be employed to execute filtering operations in a distributed manner. If a select expression contains multiple (user-defined) filters, a network of filters can be formed and executed on a distributed collection of machines. Parallel Data Transfer: Data can be transferred from multiple data sources to multiple destination processors by STORM data mover components. Data movers can be instantiated on multiple storage units and destination processors to achieve parallelism during data transfer.

4 Query Execution with Partial Replicas

In this section we present an algorithm for efficient execution of range queries when partial replicas of a dataset exist. We first describe the type of partial replicas we consider and give an overview of our system.

4.1 Partial Replicas Considered

Our main focus is on answering queries in the presence of partial replicas. Our system assumes that partial replicas have been created and that information about them is stored in a replica information file. A variety of approaches could be taken for selecting partial replicas. One possible approach is to use a group of representative queries to identify the portions of the dataset to be replicated. We assume that a hot region in a dataset is defined by a multi-dimensional bounding box. A partial replica of the dataset is then created from the data elements whose coordinates in the underlying multidimensional space fall into this bounding box. The contents of this bounding box are partitioned into chunks and these chunks are distributed across the system.
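To make the chunking of a hot region concrete, the sketch below enumerates the chunk bounding boxes of a region for a given chunk extent per dimension, traversing them in a chosen dimension order. This is an illustrative sketch under our own assumptions, not the system's actual replica-creation code; all names are invented for illustration.

```python
from itertools import product

def make_chunks(region, chunk_shape, dim_order):
    """Partition a hot region into chunk bounding boxes.

    region: dict mapping dimension name -> (low, high), inclusive.
    chunk_shape: dict mapping dimension name -> chunk extent.
    dim_order: traversal order; the last dimension varies fastest,
    which determines the layout order of the chunks.
    """
    per_dim = {}
    for d, (lo, hi) in region.items():
        step = chunk_shape[d]
        # split [lo, hi] into consecutive intervals of length `step`
        per_dim[d] = [(s, min(s + step - 1, hi)) for s in range(lo, hi + 1, step)]
    return [dict(zip(dim_order, combo))
            for combo in product(*(per_dim[d] for d in dim_order))]
```

For example, an 8x8 region with 4x4 chunks yields four chunk boxes, and changing `dim_order` changes only the order in which they are laid out, not the set of chunks.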
We allow flexible chunking of partial replicas, i.e., different replicas can have different chunk shapes and sizes, even if they cover the same multi-dimensional region. Moreover, the layout of chunks on disk can follow a different dimension order. This is important, as the dimension order influences the file seek time for retrieving a set of chunks. Once a partial replica is chunked and a dimension order is chosen, chunk layout is done by traversing the chunks in that dimension order. To maximize I/O parallelism when retrieving data, we assume that each chunk of a replica is partitioned across all available data source nodes. Note that if the replicas have been created with different chunk shapes and sizes, there will not be a one-to-one mapping between data chunks in the original dataset and those in the replicas.

4.2 System Overview

Our support for servicing range queries in the presence of partial replicas has been built on top of our prior work on supporting SQL Select queries on multidimensional datasets in a cluster environment [22]. A high-level overview of our system is shown in Figure 1. The underlying runtime system we use, STORM, was briefly described in the previous section. The STORM system requires the user to provide an index function and an extractor function. The index function chooses the file chunks that need to be processed for a given query. The extractor is responsible for reading the file(s) and creating rows of the virtual table. In our previous work, we described how a code generation tool can use metadata about the layout of the datasets to generate indexing and extractor functions [22]. The information about the replicas available in the system is stored in the replica information file. This file captures the multi-dimensional range covered by each (partial) replica, the size and shape of the chunks, and the dimension order for the layout of the chunks.
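The sketch below shows the kind of information the replica information file captures per replica, and how a query range could be matched against it. Field names and values are assumptions for illustration, not the system's actual on-disk format.

```python
# One entry per partial replica: the covered multi-dimensional range,
# the chunk shape, and the dimension order of the chunk layout.
# All values here are illustrative, not rows from the paper's Table 1.
replica_info = [
    {"id": 1,
     "range": {"time": (0, 999), "x": (0, 15), "y": (0, 63), "z": (0, 63)},
     "chunk_shape": {"time": 10, "x": 4, "y": 4, "z": 4},
     "dim_order": ["time", "x", "y", "z"]},
    {"id": 2,
     "range": {"time": (500, 1499), "x": (0, 7), "y": (0, 31), "z": (0, 31)},
     "chunk_shape": {"time": 10, "x": 8, "y": 8, "z": 8},
     "dim_order": ["time", "z", "y", "x"]},
]

def overlaps(a, b):
    """True if two per-dimension (low, high) range dicts intersect."""
    return all(a[d][0] <= b[d][1] and b[d][0] <= a[d][1] for d in a)

def candidate_replicas(query_range, info=replica_info):
    """IDs of replicas whose covered range intersects the query range."""
    return [r["id"] for r in info if overlaps(query_range, r["range"])]
```

A real system would then consult the per-replica index to find the individual intersecting chunks; this sketch stops at the replica level.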
Given a range query, the replica selection algorithm uses the replica information file and indices to determine which partial replicas, and which chunks within them, can be used to answer the query. Based on this information, it chooses a strategy, or execution plan, for answering the query. This module then interacts with our code generation modules. It is possible that no single replica can fully answer the query. In that case, it is necessary to generate subqueries for each selected replica that partially answers the query. A subquery corresponds to the use of one replica (or the original dataset) for answering a part of the query. The code generation modules generate indexing and extraction services for each subquery.

4.3 Query Execution

4.3.1 Computing Goodness Value

To execute a range query efficiently, we should minimize the total cost associated with executing the query on the required data. To that end, we need to establish a goodness value for replicas so that we can choose the ones that achieve the best query execution performance. Because the chunk is the basic unit of data access from disk, the goodness value is calculated on chunks. We should note that 1) the I/O cost we pay to retrieve a chunk might be too high when the chunk is only partially intersected by the query bounding box, and 2) the cost of a chunk is proportional to its size. Thus, we need to take into account both the cost per chunk and the amount of useful data the chunk contains to answer the query. Our goodness value is defined as the total amount of useful data in a chunk normalized by the estimated retrieval cost of the chunk:

goodness = useful_data_per_chunk / cost_per_chunk

The retrieval cost of a chunk is computed as:

[Figure 1. Overview of Our System]

[Figure 2. Using Partial Replicas for Answering a Query: (a) query and intersecting datasets, (b) 4 fragments from 2 replicas, (c) chunks retrieved.]

cost = k1 * C_read + k2 * C_seek

We measure the cost of the read operation, C_read, as the number of pages fetched from disk when answering a query; k1 is the average read time for a page. The cost of the seek operation, C_seek, is measured as the number of seeks used to locate the start point of a chunk during query execution; k2 is the average seek time for one read operation. This cost metric is relatively simplistic. However, the experiments presented in Section 5 demonstrate that it captures the dominant cost factor in query execution. As illustrated in Figure 2(a), Replica 1 has two kinds of chunks that are needed to answer the query. The chunks whose boundaries intersect the query boundary are called partial chunks, since they contain redundant tuples, which are not used to answer the query and which incur extra filtering operations. The chunks whose bounding boxes fall completely within the query bounding box are called full chunks. In order to compute the contribution of full and partial chunks when estimating the goodness value, we introduce an intermediate unit, called a fragment, between a replica and its chunks. A fragment is defined as a group of partial or full chunks in a replica having the same retrieval goodness value. Obviously, all full chunks of one replica are grouped as one fragment. In Figure 2(b), Replica 1 has three fragments and Replica 2 has only one fragment. In our implementation, each fragment has a goodness value associated with it:

goodness = useful_data_per_fragment / cost_per_fragment

Using this cost model, we estimate the execution cost for different combinations of replicas to answer the query and choose the best, i.e., the least expensive, combination of replicas.
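The cost and goodness formulas above can be sketched directly in code. The constants below are illustrative placeholders for k1 and k2, not measured values from the paper.

```python
def fragment_cost(pages_read, seeks, k1=0.0005, k2=0.009):
    """Estimated retrieval cost of a fragment: k1 is the average time
    to read one page, k2 the average seek time (illustrative values)."""
    return k1 * pages_read + k2 * seeks

def fragment_goodness(useful_bytes, pages_read, seeks):
    """Amount of useful data normalized by estimated retrieval cost,
    mirroring the per-fragment goodness formula above."""
    return useful_bytes / fragment_cost(pages_read, seeks)
```

A fragment of full chunks has all of its bytes useful to the query; a fragment of partial chunks pays the same retrieval cost for fewer useful bytes, so it scores lower and is chosen later (or not at all).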
A simple exhaustive search algorithm needs to explore a large search space to find the best combination of replicas. Suppose we are given r replicas with n chunks in total. In the worst case, the exhaustive algorithm needs to check up to 2^n potential combinations to find the most efficient way to answer an issued query. In order to avoid the exponential search cost and find an efficient combination quickly, we use a heuristic algorithm to recommend a set of candidate replicas.

4.3.2 Replica Selection Algorithm

Our approach has two main steps: a greedy heuristic (step 1) and an optimization extension (step 2). In the first step we obtain a set of candidate replicas, i.e., a set of candidate fragments, which are chosen based on their goodness values in a greedy manner. If the union of the fragments does not cover the query range completely, a remainder query range is generated to be directed to the original dataset as well. The candidate replicas, with or without
the original dataset, are our initial solution. This initial solution may contain chunks (from different replicas) that intersect each other in the underlying multi-dimensional space of the dataset. This results in redundant I/O volume, as two or more chunks may contain the same data element. In the second step, we attempt to eliminate redundant I/O by checking chunks in one candidate replica against the others. When multiple partial replicas exist, our algorithm finds all the replicas that can be used to answer the query, selects a combination of replicas with the minimum cost value, and performs an efficient set union operation among the data elements selected from these replicas to answer the query. The set union operation is needed since two replicas may contain overlapping sets of data elements selected by the query.

INPUT: Query Q; a given set of partial replicas R; the original dataset D.
OUTPUT: a list of candidate replicas (fragments), with or without the original dataset D, for answering the issued query.
// Let F be the set of all fragments intersecting the query boundary
 1. F <- {}
 2. Foreach R_i in R with R_i intersecting Q
 3.     Classify the chunks that intersect the query into fragments
 4.     Calculate the goodness value of each fragment
 5.     Insert the fragments of R_i into F
 6. End Foreach
// Let S be the ordered list of the candidate fragments in
// decreasing order of their goodness values
 7. S <- empty list
 8. While F is not empty
 9.     Find the fragment F_i with the maximum goodness value
10.     Remove F_i from F
11.     Append F_i to S
12.     Q <- Q - (F_i intersect Q)
13.     Foreach F_j in F
14.         If F_j intersects F_i
15.             Subtract the region of overlap from F_j
16.             Re-compute the goodness value of F_j
17.     End Foreach
18. End While
19. If Q is not empty
20.     Use D to answer the subquery Q
21.     Append D, for the range Q, to S

Figure 3. Greedy Algorithm

The first step of the algorithm is shown in Figure 3.
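The greedy first step of Figure 3 can be rendered as a runnable sketch. To keep the range subtraction of steps 12 and 15 simple, a fragment's coverage is modeled here as a set of discrete cell ids rather than a multi-dimensional box; the names and data layout are our own assumptions, not the actual implementation.

```python
def greedy_select(fragments, query_cells):
    """fragments: list of (name, cells, cost) tuples, cost > 0.
    Returns the chosen fragment names in selection order, plus the
    remainder cells that must be answered from the original dataset."""
    remaining = set(query_cells)
    # each candidate tracks which of its cells are still useful
    pool = [{"name": n, "cells": set(c) & remaining, "cost": cost}
            for n, c, cost in fragments]
    chosen = []
    while pool:
        # pick the fragment with the best useful-data/cost ratio
        best = max(pool, key=lambda f: len(f["cells"]) / f["cost"])
        pool.remove(best)
        if not best["cells"]:
            continue  # nothing useful left in this fragment
        chosen.append(best["name"])
        remaining -= best["cells"]
        # subtract the chosen region from the other candidates; their
        # goodness is implicitly re-computed on the next iteration
        for f in pool:
            f["cells"] -= best["cells"]
    return chosen, remaining  # remainder goes to the original dataset
```

For instance, a cheap fragment covering many query cells is taken first, overlapping cells are removed from the remaining candidates, and any cells no fragment covers are returned as the remainder subquery.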
Chunks in each replica that intersect the query are categorized as partial or full chunks and grouped into fragments, and the respective goodness values of the fragments are calculated (steps 2-6). For a given query Q, let us denote the set of all fragments as F and the list of all chosen fragments, in decreasing order of goodness value, as S. We apply our greedy search over F (the while loop of steps 8-18). We choose the fragment with the largest goodness value, move it from F to S, and modify Q by subtracting the range covered by this fragment. Note that for the other fragments in F, if they overlap with the chosen fragment, the area of overlap is subtracted and their goodness values are re-computed. To simplify the implementation, we re-compute the goodness value using the per-fragment formula instead of the per-chunk one. As shown in Figure 2(a), if we choose the only fragment of Replica 2 in the beginning, we need to subtract the portion of Fragment 2 in Replica 1 that overlaps with the fragment of Replica 2 and modify its goodness value. The while loop iterates until F is empty. At this point, the chosen fragment list S is the initial candidate solution. We may need to include the original dataset in the initial solution and direct a subquery to it, if the union of the ranges in S cannot completely answer the query (steps 19-21).

INPUT: a list of candidate replicas (fragments) S in decreasing order of goodness values.
OUTPUT: a set of replicas, with or without the original dataset D, for answering the issued query.
// Let Intersect be the variable indicating whether we can
// improve the candidate solution
 1. Intersect <- 0
 2. Foreach F_i in S
 3.     Foreach F_j in S with F_j != F_i
 4.         If F_i intersects F_j
 5.             Intersect <- 1 and break
 6.     End Foreach
 7. End Foreach
 8. If Intersect = 1
 9.     Foreach F_i in S, from the head of S
10.         Foreach chunk C in F_i
                // Let r be the union range contained by the
                // filtered areas of all other fragments
11.             If chunk C is completely within r
12.                 Drop C from F_i
13.             Foreach F_k after F_i in list S
14.                 If C intersects F_k
15.                     Modify F_k to retrieve the intersection
16.                     Subtract the intersection from C
17.                     If C is empty, break
18.             End Foreach
19.         End Foreach
20.     End Foreach

Figure 4. Extension to the Greedy Algorithm

The second step, shown in Figure 4, attempts to improve the initial solution obtained in the first step. Consider the example in Figure 2(a). Suppose that after the first step, the 4 candidate fragments in Figure 2(b), together with the original dataset, are the initial solution. Fragment 2 in Replica 1 requires additional filtering operations because the data in the regions of overlap can be retrieved from Fragment 4 in Replica 2. Clearly, the better solution is to delete the two chunks that fall in the region of overlap from Fragment 4 and use Fragment 2 for that region, as illustrated in
Figure 2(c). This solution results in fewer I/O operations and less filtering computation. In the second step, a check is done chunk by chunk for each fragment in S against the other fragments. If the data contained in a chunk is also contained in the data area marked to be filtered by other fragments, the chunk is deleted from the fragment. The running time of categorizing chunks and calculating the goodness values is O(n), where n is the number of chunks intersecting the query bounding box. The first step loops O(m) times, where m is the total number of fragments containing those n chunks. Within the while loop, the running time is O(m) for finding the fragment with the largest goodness value and O(m) for modifying the other fragments if they overlap with the chosen fragment. Thus, the total running time of the first step is O(m^2). The second step executes in O(n^2). Overall, the algorithm takes O(m^2 + n^2) to compute the set of replicas to be used. Since generally m << n, the runtime complexity of the algorithm is O(n^2).

5 Experimental Results

We carried out an experimental performance evaluation of the proposed algorithm using a large oil reservoir simulation dataset [17] generated using an application emulator to control its size and chunking. Table 1 displays the properties of the original dataset and its replicas that were used in the experiments. The columns TIME (the time dimension) and X, Y, and Z (the three spatial dimensions) in the table contain the time and space ranges in each dimension and the length of each division along that dimension. The value s:e:l denotes the start value (s), the end value (e), and the length of division (l) in the corresponding dimension. Each grid point of the reservoir mesh is represented with a tuple containing 17 floats and 4 integers; hence each tuple is 84 bytes. The approximate chunk size of each dataset is also displayed in the table.
Both the original and replica datasets are declustered across the disks of 8 nodes. In the experiments, we used three groups of replicas (along with the original dataset). These groups are denoted as Set #1, Set #2, and Set #3 in the table. The four queries used in the experiments are shown in Table 2. All of the experiments were carried out on a Linux cluster in which each node has a PIII 933MHz CPU, 512 MB of main memory, and three 100GB IDE disks. The nodes are inter-connected via Switched Fast Ethernet. The first set of experiments demonstrates the scalability of our method with increasing data size. Query #1 from Table 2 was executed with TIMEVAL set to 1199, 1399, 1599, and 1799 against the datasets in Set #1 of Table 1. Query execution time and the amount of data processed are displayed in Figure 5. Out of the 6 replicas in Set #1, the algorithm chose to use 4: replicas 0, 1, 2, and 4. When the query is executed using the original dataset only, it filters out more than 83% of the data retrieved from disk. While the replicas do not eliminate filtering completely, they reduce it drastically, thus improving performance: about 25% of the data retrieved from disk is filtered when the query uses replicas.

[Figure 5. Query execution time and amount of data processed with and without replicas.]

[Figure 6. Query execution time as the number of nodes is varied.]

The second set of experiments demonstrates the performance of the proposed method when the number of nodes is varied. Query #2 from Table 2 was executed using the datasets in Set #1 of Table 1. Again, out of the 6 replicas, the algorithm chose to use only 4: replicas 0, 1, 2, and 4. Query execution time, as shown in Figure 6, scales almost linearly up to 4 nodes. Although the use of 8 nodes provides an improvement over 4 nodes, execution time is not reduced by half. We attribute this to the partitioning of replica chunks.
When replicas are created, each chunk is partitioned across all storage nodes in order to provide better load balance. For the replicas displayed in Table 1, chunk sizes are not large enough to result in good-sized chunk parts when they are partitioned across 8 nodes. Hence, seek time starts dominating the I/O overhead, resulting in poor I/O performance.

Table 1. Properties of the original dataset and its replicas. In the TIME, X, Y, Z columns, s:e:l denotes the start value, end value, and the length of division along each dimension.

Replica     TIME        X         Y         Z          Chunk Size
orig        0:1999:1    0:16:17   0:64:65   0:64:65    6MB
Set #1:
  0         :1799:10    0:7:4     0:31:4    0:31:4     54KB
  1         :1799:10    0:15:4    0:15:4    0:63:16    215KB
  2         :1799:10    0:15:4    0:63:16   0:15:4     215KB
  3         :1799:10    0:15:16   0:7:4     0:7:4      215KB
  4         :1799:10    4:11:8    16:47:8   16:47:8    430KB
  5         :1799:10    4:15:6    34:63:6   34:63:6    181KB
Set #2:
  0         :1799:10    0:15:4    0:31:4    0:63:4     54KB
  1         :1799:10    1:16:8    33:64:8   1:64:8     430KB
Set #3:
  0         :1199:200   0:16:1    0:64:1    0:64:1     17KB
  1         :1199:10    0:16:17   0:64:65   0:64:1     928KB
  2         :1199:10    0:16:17   0:64:1    33:64:32   457KB

Query No.  Query Expression
1          select * from IPARS where TIME>=1000 and TIME<=TIMEVAL and X>=0 and X<=11 and Y>=0 and Y<=28 and Z>=0 and Z<=28;
2          select * from IPARS where TIME>=1000 and TIME<=1599 and X>=0 and X<=11 and Y>=0 and Y<=31 and Z>=0 and Z<=31;
3          select * from IPARS where TIME>=1000 and TIME<=TIMEVAL and X>=0 and X<=15 and Y>=0 and Y<=63 and Z>=0 and Z<=63;
4          select * from IPARS where TIME>=1000 and TIME<=1199;

Table 2. Queries used in the experiments.

The goal of the third set of experiments is to show the robustness of the proposed algorithm. If we were to execute Query #3 of Table 2 against the datasets in Set #1 of Table 1 for TIMEVAL values of 1199, 1399, 1599, and 1799, we would not be able to answer the query using only the replicas. Moreover, the data chunks to be retrieved from the original dataset would cover all of the data that could be retrieved from the replicas. As a result, the use of replicas is not beneficial in this case. The second step of the proposed algorithm (see Section 4.3.2) is designed to detect such cases and avoid using replica chunks. As seen in Figure 7, if the replicas are used, we actually get worse performance than just using the original dataset.
In the last set of experiments, we execute Query #4 of Table 2 against the original dataset, Set #2, and Set #3. The results are shown in Table 3. If the cost function took into account only the amount of data to be processed, the algorithm would favor the original dataset or Set #2 over Set #3, because with the original dataset and Set #2 slightly less data needs to be processed. However, Set #3 requires fewer chunks to produce the output, and its execution time is dominated by the volume of I/O. The others require processing of more chunks, in which case disk seeks start to dominate the I/O overhead. This experiment shows that a more accurate modeling of execution time is necessary for better estimates.

                           original   Set #2   Set #3
  execution time (seconds)
  data processed (MB)
  number of seeks

Table 3. Results from the execution of Query #4.

In our target applications, datasets are stored in a set of files. A partial replication strategy could be devised based on replication of subsets of files, and we could use methods developed for file-based replica selection. Such a strategy aims to improve performance by placing files on faster storage systems and distributing them across more nodes. We expect that our strategy will achieve better performance than file-replication-only strategies, since it not only takes advantage of faster storage and parallel I/O but also decreases the amount of data retrieved by reorganizing and repartitioning the data. We plan to carry out experiments in future work to quantify the performance improvements of our strategy over file-replication-based strategies.

6 Conclusions

In this paper we have investigated a compiler-runtime approach for execution of range queries on distributed machines when partial replica optimization is employed. A challenging issue is that when multiple partial replicas of a
[Figure 7. Query execution time and amount of data processed with and without replicas. The algorithm uses only the original dataset to answer the query.]

dataset exist, a query execution plan needs to be devised to efficiently compute the results of the query using a subset of replicas. We have proposed a cost metric and an algorithm to select the set of replicas that minimizes the execution time of a given query. Our experimental results show that 1) the algorithm scales well when the size of the query is increased, 2) it not only decreases the volume of I/O, but also increases the ratio of useful data to the total amount of data retrieved from disk, and 3) it is robust in that it can revert to the original dataset if using partial replicas is not beneficial.

References

[1] H. Andrade, T. Kurc, A. Sussman, and J. Saltz. Active Proxy-G: Optimizing the query execution process in the Grid. In Proceedings of the 2002 ACM/IEEE SC02 Conference. ACM Press, Nov.
[2] M. D. Beynon, T. Kurc, U. Catalyurek, C. Chang, A. Sussman, and J. Saltz. Distributed processing of very large datasets with DataCutter. Parallel Computing, 27(11), Oct.
[3] P. H. Carns, W. B. Ligon, R. B. Ross, and R. Thakur. PVFS: A parallel file system for Linux clusters. In Proceedings of the 4th Annual Linux Showcase and Conference, Oct.
[4] C.-M. Chen and C. T. Cheng. Replication and retrieval strategies of multidimensional data on parallel disks. In Proceedings of the Twelfth International Conference on Information and Knowledge Management. ACM Press.
[5] P. M. Chen, E. K. Lee, G. A. Gibson, R. H. Katz, and D. A. Patterson. RAID: High-performance, reliable secondary storage. ACM Computing Surveys, 26(2), June.
[6] A. Chervenak, E. Deelman, I. Foster, L. Guy, W. Hoschek, A. Iamnitchi, C. Kesselman, P. Kunszt, M. Ripeanu, B. Schwartzkopf, H. Stockinger, K. Stockinger, and B. Tierney. Giggle: a framework for constructing scalable replica location services.
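As an illustration of why charging for seeks matters in the cost estimate discussed above, the following Python sketch compares candidate plans under a simple model: one seek per chunk plus sequential transfer time. All names, constants, and the chunk/byte figures are assumptions made for the example, not measurements or code from the paper.

```python
# Seek-aware I/O cost sketch for choosing among partial replicas
# (illustrative only; constants are assumed, not measured).

SEEK_TIME_S = 0.008      # assumed average disk seek time (seconds)
BANDWIDTH_BPS = 50e6     # assumed sequential read bandwidth (bytes/s)

def estimate_cost(num_chunks, bytes_read,
                  seek_time=SEEK_TIME_S, bandwidth=BANDWIDTH_BPS):
    """Estimated I/O time: one seek per chunk plus sequential transfer."""
    return num_chunks * seek_time + bytes_read / bandwidth

def pick_plan(candidates):
    """candidates: {name: (num_chunks, bytes_read)}; return the cheapest plan."""
    return min(candidates, key=lambda name: estimate_cost(*candidates[name]))

# Hypothetical plans: the original dataset and Set #2 read slightly less
# data, but Set #3 touches far fewer chunks.
plans = {
    "original": (4000, 480e6),
    "set2":     (3500, 480e6),
    "set3":     (300,  500e6),
}
print(pick_plan(plans))  # -> set3
```

A volume-only metric would prefer the original dataset or Set #2 (fewer bytes), but once each chunk is charged a seek, Set #3 wins, mirroring the behavior observed for Query #4.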


More information

Workloads Programmierung Paralleler und Verteilter Systeme (PPV)

Workloads Programmierung Paralleler und Verteilter Systeme (PPV) Workloads Programmierung Paralleler und Verteilter Systeme (PPV) Sommer 2015 Frank Feinbube, M.Sc., Felix Eberhardt, M.Sc., Prof. Dr. Andreas Polze Workloads 2 Hardware / software execution environment

More information

Design and Implementation of A P2P Cooperative Proxy Cache System

Design and Implementation of A P2P Cooperative Proxy Cache System Design and Implementation of A PP Cooperative Proxy Cache System James Z. Wang Vipul Bhulawala Department of Computer Science Clemson University, Box 40974 Clemson, SC 94-0974, USA +1-84--778 {jzwang,

More information

Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors

Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors G. Chen 1, M. Kandemir 1, I. Kolcu 2, and A. Choudhary 3 1 Pennsylvania State University, PA 16802, USA 2 UMIST,

More information

Bitmap Indices for Fast End-User Physics Analysis in ROOT

Bitmap Indices for Fast End-User Physics Analysis in ROOT Bitmap Indices for Fast End-User Physics Analysis in ROOT Kurt Stockinger a, Kesheng Wu a, Rene Brun b, Philippe Canal c a Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA b European Organization

More information

Efficient Execution of Multi-Query Data Analysis Batches Using Compiler Optimization Strategies

Efficient Execution of Multi-Query Data Analysis Batches Using Compiler Optimization Strategies Efficient Execution of Multi-Query Data Analysis Batches Using Compiler Optimization Strategies Henrique Andrade Ý Þ, Suresh Aryangat Ý, Tahsin Kurc Þ, Joel Saltz Þ, Alan Sussman Ý Ý Dept. of Computer

More information

Efficient Retrieval of Multidimensional Datasets Through Parallel I/O

Efficient Retrieval of Multidimensional Datasets Through Parallel I/O Efficient Retrieval of Multidimensional Datasets Through Parallel I/O Sunil Prabhakar Computer Sciences Purdue University W. Lafayette, IN 47907 sunil@cs.purdue.edu Khaled Abdel-Ghaffar Elect. & Computer

More information

Parallel Computing. Slides credit: M. Quinn book (chapter 3 slides), A Grama book (chapter 3 slides)

Parallel Computing. Slides credit: M. Quinn book (chapter 3 slides), A Grama book (chapter 3 slides) Parallel Computing 2012 Slides credit: M. Quinn book (chapter 3 slides), A Grama book (chapter 3 slides) Parallel Algorithm Design Outline Computational Model Design Methodology Partitioning Communication

More information

Tertiary Storage Organization for Large Multidimensional Datasets

Tertiary Storage Organization for Large Multidimensional Datasets Tertiary Storage Organization for Large Multidimensional Datasets Sachin More, Alok Choudhary Department of Electrical and Computer Engineering Northwestern University Evanston, IL 60028 fssmore,choudharg@ece.nwu.edu

More information

PSON: A Parallelized SON Algorithm with MapReduce for Mining Frequent Sets

PSON: A Parallelized SON Algorithm with MapReduce for Mining Frequent Sets 2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming PSON: A Parallelized SON Algorithm with MapReduce for Mining Frequent Sets Tao Xiao Chunfeng Yuan Yihua Huang Department

More information

Distributed File Systems II

Distributed File Systems II Distributed File Systems II To do q Very-large scale: Google FS, Hadoop FS, BigTable q Next time: Naming things GFS A radically new environment NFS, etc. Independence Small Scale Variety of workloads Cooperation

More information

On Multiple Query Optimization in Data Mining

On Multiple Query Optimization in Data Mining On Multiple Query Optimization in Data Mining Marek Wojciechowski, Maciej Zakrzewicz Poznan University of Technology Institute of Computing Science ul. Piotrowo 3a, 60-965 Poznan, Poland {marek,mzakrz}@cs.put.poznan.pl

More information

Chapter 12: Indexing and Hashing. Basic Concepts

Chapter 12: Indexing and Hashing. Basic Concepts Chapter 12: Indexing and Hashing! Basic Concepts! Ordered Indices! B+-Tree Index Files! B-Tree Index Files! Static Hashing! Dynamic Hashing! Comparison of Ordered Indexing and Hashing! Index Definition

More information

The Fusion Distributed File System

The Fusion Distributed File System Slide 1 / 44 The Fusion Distributed File System Dongfang Zhao February 2015 Slide 2 / 44 Outline Introduction FusionFS System Architecture Metadata Management Data Movement Implementation Details Unique

More information

Framework for replica selection in fault-tolerant distributed systems

Framework for replica selection in fault-tolerant distributed systems Framework for replica selection in fault-tolerant distributed systems Daniel Popescu Computer Science Department University of Southern California Los Angeles, CA 90089-0781 {dpopescu}@usc.edu Abstract.

More information

Improving Range Query Performance on Historic Web Page Data

Improving Range Query Performance on Historic Web Page Data Improving Range Query Performance on Historic Web Page Data Geng LI Lab of Computer Networks and Distributed Systems, Peking University Beijing, China ligeng@net.pku.edu.cn Bo Peng Lab of Computer Networks

More information