Replication Routing Indexes for XML Documents

Size: px
Start display at page:

Download "Replication Routing Indexes for XML Documents"

Transcription

1 Replication Routing Indexes for XML Documents Panos Skyvalidas, Evaggelia Pitoura, and V. V. Dimakopoulos Computer Science Department, University of Ioannina, GR4511 Ioannina, Greece Abstract. Peer-to-peer systems have attracted considerable attention as a means of sharing content among large and dynamic communities of nodes. In this paper, we consider sharing of XML documents. We propose a simple yet efficient replication method that is based on replication routing indexes. A replication routing index for a node is a path-based XML index that summarizes access information for each of the links of the node. The replication routing indexes are used for determining both the granularity of the XML fragments to be replicated as well as the appropriate sites for placing the fragments. We also present experimental results of the deployment of our indexes in a dynamic unstructured peer-to-peer system. 1 Introduction Peer-to-peer (p2p) systems have attracted a lot of attention as a means of data sharing among a large and dynamic population of nodes. Nodes in a p2p system connect with (know about) a small number of other nodes, thus forming an overlay network. A central issue is locating the nodes that hold data of interest. There have been various proposals towards building overlays for efficient content location. Such proposals vary from constructing rigid topologies and placing data on specific nodes to unstructured networks with no correlation between the node content and its position in the overlay. In all overlay types, content replication results in reducing the latency of lookups. Motivated by the fact that XML is increasingly being used in data intensive applications, in this paper, we study replication in p2p systems with nodes that share content stored in XML. Various replication techniques have been proposed that can be roughly categorized as either passive or proactive. With passive replication, items are replicated after being successfully located. A commonly used passive replication scheme is path replication that caches data items along the lookup path. It has been shown that the number of copies produced by path replication is close to optimal with respect to the average search size [6,9], that is, the expected number of steps to reach a locatable item. With proactive replication, the creation of replicas is initiated by holders of data items not necessarily after a successful lookup of the item. Path replication is simple and requires minimum bookkeeping. However, it tends to cluster copies along lookup paths. Further, passive replication considers the average search cost over all items and does not allow for fine tuning the search efficiency for popular and unpopular items. We consider XML replication for both passive and proactive protocols. XML documents have a hierarchical structure and thus, different fragments of an XML Work supported in part by AEOLUS project, IST 15964

2 document may have different access frequencies. We show that replicating at the fragment level is preferable to replicating whole documents. For proactive replication, we introduce a new data structure which we call replication routing index. A Replication Routing Index (RepRI) for a peer has one entry for each of its adjacent edges with statistics about the requests it has received through each one of them. Our replication strategy uses these indexes for deciding whether to maintain a copy locally or forward it as well as for selecting a direction for doing so. A Replication Routing Index for XML (RepRIX) maintains statistics for fragments. RepRIX allows us to fine-tune the unit of replication, so that fragments of the same document can have different numbers of replicas. Further, it allows us to push specific fragments closer to their requesters. RepRIX entries are also used as hints during lookup to direct nodes towards paths that most probably hold replicas of the requested items. We compare experimentally both proactive and passive variations of fragment replication. Both types of fragment replication outperform whole document replication resulting in both increasing the percentage of items located and reducing the required steps for doing so. Proactive replication with hints is shown to work better than passive replication. Furthermore, proactive replication allows for fine tuning the search efficiency for popular and unpopular items at the additional overhead of maintaining the replication indexes. The remainder of this paper is structured as follows. Section 2 introduces proactive replication using replication indexes. Section 3 extends replication indexes for XML documents, while Section 4 reports our experimental results. Section 5 presents related research. Finally, Section 6 concludes the paper. 2 Proactive Replication in Unstructured p2p Systems We consider first the case where peers store simple data files. Search is based on keyword-based queries that refer to file names. Replication in Unstructured p2p Systems We consider unstructured overlay networks where there is no control over the network topology or data placement. Unstructured p2p systems offer decentralization and simplicity of system construction and maintenance. However, searching for an item may be expensive, since each peer requesting an item needs to propagate the search message through the overlay using some variation of flooding or random walks. With flooding, each peer contacts its neighbors in the overlay, which in turn contact their own neighbors, until the requested item is located. With random walks, the peer initiates one or more depth-first traversals of the overlay. In both cases, to avoid overwhelming the network, the search query is propagated for up to a maximum of TTL (Time-To-Live) hops. To increase data availability and improve search performance, several replication techniques have been proposed. The most well known replication techniques used in unstructured p2p systems are owner and path replication, both of which are passive schemes. With owner replication, the requester peer locally stores a file after it has been successfully located. Owner replication creates at most one replica for each successful search. With path replication, after a successful search, the file

3 is replicated at all peers on the path from the provider peer to the requester peer. Previous work on replication [6] has shown that, under certain conditions, path replication results in creating close to optimal numbers of replicas in terms of minimizing the average search size for locatable items. However, path replication tends to replicate files on peers that are topologically along the same path. Proactive replication allows for a more random distribution of replicas. We propose a simple replication strategy that creates and places replicas along the peers of an unstructured p2p network in a proactive manner. Replication Routing Indexes To support proactive replication, we introduce a data structure that we call Replication Routing Index (RepRI). The RepRI(p) of a peer p maintains information about file requests that the peer has received through its neighbors. There are entries corresponding to each of its links (neighbors) summarizing the requests received through each of these links. For simplicity, we maintain entries for individual files. Other methods, for instance maintaining information for group of requests, are possible for reducing the size of the structure. The entry RepRI(p)(f), for a peer p with k neighbors and file f is of the form (LReq, Req 1, Hop 1,... Req k, Hop k, Ownership), where LReq is the number of queries for file f initiated by peer p itself, Req i and Hop i with i = 1,...,k, are respectively the number of queries that have been forwarded to p by its neighbor i and the average number of hops required for the corresponding queries to reach p through i. Ownership is set to 1 and depending on whether file f is stored locally at p or not. In Figure 1, we show the RepRI maintained by peer 1, which has three neighbors, 2, 3 and 4. Peer 1 has processed queries for two local files, Talk.mp3 and Precious.mp3 and for one file Sun.jpg stored at some other peer Filename LReq Req1 Hop1 Req2 Hop2 Req3 Hop3 Ownership 6 Talk_mp Sun_jpg Precious_mp Fig.1. The replication routing index of peer 1 Our replication strategy uses these indexes to decide whether a peer should replicate some of its files and forward them to certain directions. Our goal is to create and move replicas towards the direction at which they are actually needed. The decision which file to replicate and towards which direction is based on both the popularity of the file and the cost for locating it. In particular, each peer p with k neighbors calculates the replication utility, ru(f) of a file f as follows: ru(f) = α pop factor(f) + (1 α) hop factor(f) where pop factor(f) = k i=1 Req i(f) / Max f ( k i=1 Req i(f )) and hop factor(f) = ( k i=1 Hop i(f)/k)/ttl. Req i (f) and Hop i (f) denote respectively the number of queries for file f and the corresponding number of hops for these queries that p has received from its neighbor i. The weight α is a tuning parameter that determines how much each of these two factors, namely popularity and search size, affects the replication decision. The larger the value of α, the more efficient the search for

4 popular items. Each peer p also maintains the average replication utility (aru) for all files in its index. Periodically, a peer p creates replicas of all locally stored files that have replication utility greater than or equal to its average replication utility. For each such file f, peer p sends a replication message to its neighbor i that has the maximum corresponding Req i (f). After completing this replication phase, the entries for all files in the RepRI of the initiating peer are reset to. Each peer that receives a replication message for a file f checks the value of the corresponding LReq field. If this is greater than, the peer stores the replica. Else, the peer compares ru(f) with its aru and if ru(f) is greater than or equal to aru, it stores the file. Finally, the peer checks whether it should further forward the replication message to any of its neighbors based on the fields Req i (f). If at least one of them is greater than, the replication message is forwarded further to the neighbor i having the maximum corresponding Req i (f). Otherwise, the replication procedure stops. The duration of the period of the replication procedure depends on the query workload. A large value leads to making more informed decisions based on sufficiently large samples of requests and ignores popularity fluctuations that may be caused by random variations in the query workload. Furthermore, it reduces the associated network overhead. On the other hand, the system adapts to workload changes less promptly. Also, note that resetting the RepRI entries of the peer that initiated the replication procedure provides a simple yet effective aging scheme. REpRI-Based Replacement and Routing When peers have reached their maximum storage capacity for storing replicas, RepRI is used for selecting which of the replicas to replace. To this end, the file f with the smallest replication utility ru(f) is replaced. RepRI can also be used to further improve the routing process by maintaining the direction of the provider of a file. When a peer forwards a replication message to one of its neighbors without storing the replica, it sets the corresponding counter to a hint value indicating that the file has been copied along this direction. Thus, the next time the peer receives a query for that file, the peer knows towards which of its neighbors to forward the request. 3 Replication Routing Indexes for XML In most p2p systems, users and applications employ various formats and schemas to describe their data. In this section, we consider replication in the case of p2p systems for sharing XML data. Data and Query Model An XML document is represented by a rooted labeled tree, where nodes correspond to XML elements labeled with the corresponding tag and edges represent direct element-subelement relationships. Figure 2 depicts an XML description of a song collection provided by a node and the corresponding XML tree. To support querying the structure of documents, in addition to their content, query languages for XML employ path patterns. We focus on simple path queries of the form p 1 e 1 p 2 e 2... p n e n where each e i is an element name or the wildcard operator * and each p i is either / or // denoting respectively parent-child and ancestor-descendant traversal.

5 <song_collection> <pop> <song>.. </song> <song>,,, </song> </pop> <rock> <song> </song> <song></song> </rock> </song_collection> song_collection pop rock song song song song Fig.2. (left) Example of an XML document (right) The corresponding tree A path query q is evaluated at a node u of an XML tree T and its result is the set of nodes of T reachable from u via q. For a query q and a document d, we say that d matches q, if the path expression forming the query exists in the document. A peer that stores documents that match the query is called a matching peer. Using path queries, peers receive as results fragments of the information included in an XML document. An XML fragment F i is a subtree of T rooted at some node n of T. Each fragment is represented by a simple path expression starting from the root node r of T and leading to n. In Figure 2 (shaded part), we see an example of an XML fragment represented by the path /song collection/rock/*. Different fragments of a document may have different access frequencies. We exploit this fact by replicating at the fragment level. Fragment Replication We allow peers to replicate appropriate fragments of documents depending on the queries they have processed. We assume that the documents stored initially at peers are not fragmented. Document fragmentation is the result of the replication strategy. We assume that a unique identifier is associated with each document. This is also useful for handling possible updates. For brevity, in our examples, we have omitted most of these ids. The replicas that are created correspond to fragments defined by path queries. We propose two techniques for replica creation, called skeleton replication and subtree replication. <song_collection> <pop> <song>.. </song> <song>,,, </song> </pop> <house> <song> </song> <song></song> </house> <rock> <song> </song> <song></song> </rock> </song_collection> <song_collection> <pop> <externallink> </pop> <house> <externallink> </house> <rock> <song> </song> <song></song> </rock> </song_collection> <rockpath="song_collection/rock"parentnode="id" hassibling="yes"> <song> </song> <song></song> </rock> Fig. 3. (left) Example of an XML document (middle) Skeleton replication and (right) Subtree replication for the fragment defined by the path /song collection/rock/*.

6 Skeleton Replication. With skeleton replication, replicas have similar structure with the original documents. The only difference is that only part of the data of the original document is replicated. For creating a replica for a path expression /a 1 /a 2... /a n, we copy the skeleton (element hierarchy) of the original document from the root node a 1 to the fragment-root node a n. We also copy the data contained in the subtree defined by the path query. Data that corresponds to elements that are not included in the related path is not replicated. Instead, it is replaced by an external link. External links are special elements identified by the tag name externallink used to indicate where the data of the element, to which they belong, is located. To ensure the correctness and the completeness of the query evaluation process, we enforce the following constraint on the structure of replicas Replica Creation Constraint. For a path expression, /a 1 /a 2... /a n, if an element node a i of the original document has siblings, then the corresponding replica contains, along with the sibling elements in the skeleton, an external link to the original document. External links can be viewed as an intentional description of this missing data and provide the means to obtain it, if needed. From the XML tree perspective, external links play the role of some form of external nodes. In Figure 3, we see an example of an XML document and a replica corresponding to the path query /song collection/rock/*. With skeleton replication (Fig 3(middle)), only the data contained at the rock element is replicated. Data contained at the sibling elements pop and house is not replicated but replaced by an external link to the original document. When a peer receives a path query, first, it checks whether it has matching documents. If this is not the case, the query is further forwarded to some neighboring peer. If there is a match and the fragment-root node defined by the path query is a leaf node, then the query is satisfied by this node and the corresponding data is returned to the requester peer. Else, if the fragment-root node defines a subtree, its child nodes are checked. If none of them contains an external link, due to the Replica Creation Constraint, all the requested data must be present at the current document. Data is collected, returned to the requester peer and processing completes. Else, if one or more child nodes contain an external link, the peer uses the link to forward the query to the peer that stores the document containing all the requested data. Subtree Replication. With subtree replication, the replicas that are created contain only the fragment defined by the path query and not the whole skeleton, that is, the replica just consists of the subtree (of the original document) rooted at the fragment-root node. In addition, for evaluating a path query over a replica, three attributes are added to its root element: the path, the parent node and the hassiblings attributes. The path attribute contains the sequence of element names from the root node of the original document to the root node of the replica fragment. Since all documents have a unique id, the parent node attribute contains the parent node s id for achieving cohesion between the original document and the replica. Finally, the hassiblings attribute takes the values yes or no indicating whether the

7 replica s root node has any sibling nodes which have not been replicated. Figure 3(right) shows the replica corresponding to the path query /song collection/rock/* when using subtree replication. The query processing procedure for subtree replication is similar to that of skeleton replication. Replication Using RepRIX Instead of maintaining statistics for whole documents, a Replication Routing Indexes for XML (RepRIX) maintains entries that correspond to paths, representing fragments that the peer has processed queries for. Peers make decisions regarding replicating fragments of documents based on the replication utilities of the corresponding paths. An important point is that the replication utility of a path is calculated by taking into account the replication utility of all paths that are subsumed by it. A path p 1 is subsumed by a path p 2, if whenever a document d matches p 2, it also matches p 1. Checking XML query subsumption is a well-studied problem (for instance, see [12] for a survey). In this paper, we use a simple test: we check whether p 2 is a prefix of p 1. Path subsumption is not taken into account when updating the access frequency of fragments, instead, for each path entry is updated separately. However, path subsumption is considered for taking replication decisions. In particular, the replication utility for each path p is calculated as follows: ru(p) = m i=1 ru(p i), for each p i that is subsumed by p. The replication message refers to the fragment that subsumes the others, that is, the fragment to be created and replicated is represented by the path that corresponds to the largest subtree. For example consider the case where a peer has processed queries for paths /a/b/c and /a/b. Since the path /a/b/c is subsumed by the path /a/b, the fragment that is finally replicated is the one defined by the path /a/b. Note that handling external links requires an extension of the content of RepRIX. In particular, during a lookup, an external link may be followed. A peer that uses an external link to forward a query may not be a neighbor of the peer that receives it. RepRI does not cover such cases, since RepRI maintains access statistics only for its neighbors. To handle such cases, whenever a peer receives a query from a direction outside its neighboring list, an extra counter is added to RepRIX. Now, peers cache the location of peers holding data that a part of the network is interested in but it is not able to reach, unless the external link is used. During the replication procedure, these extra counters are also taken into account for creating and propagating replicas via external links towards the direction they are considered popular. Replacement Policy Each time a peer decides to store a new replica, the following check takes place to avoid data redundancy. If the fragment to be stored contains a fragment that is already stored there, the latter is replaced by the former. We say that a fragment f 1 is contained in a fragment f 2, if the path expression representing f 1 is subsumed by the path expression representing f 2. For example, if a peer that already has a replica of a fragment /a/b/c decides to store a fragment /a/b, the replica /a/b/c is replaced. Replacement takes also place, when the space allocated to replicas gets full at some peer. In this case, the peer uses RepRIX to find the path with the smallest

8 replication utility and evicts the corresponding fragment. In particular, the data corresponding to that fragment is discarded and replaced by an external link pointing to the document from which the replica originated. This is done to maintain consistency among external links and ensure the correctness of the search process. When a peer follows an external link, during a lookup, it may reach a replica that does not contain the requested data, as a result of the replacement strategy. However, another external link will be present to direct the lookup towards a document that might contain it. In the worst case, the requested fragment may be get evicted everywhere. In this case, the lookup operation will end when we follow all encountered external links, finally reaching the original document. Note that since replicas have different sizes, a peer may have to replace more than one fragment to make sufficient space for the new one. The fragments that are discarded are the ones with the lowest replication utilities. 4 Experimental Evaluation In this section, we present an experimental evaluation of the various XML replication procedures. Simulation Environment All replication techniques are implemented as components of the Peersim simulator [1]. Each simulation starts with an initialization phase. During this phase, the topology of the network is formed and the documents are distributed randomly at the nodes of the network. The documents used for the experiments are generated using the ToXgene [14] XML generator. The queries are simple path expressions extracted by the data set. The search mechanism is random walks with TTL. The networks follow power-law topologies, with sizes ranging from 1 to 1 peers. In all experiments, we consider that the system poses a constraint on the number of replicas that can be stored at each peer. Each peer is the owner of approximately one document and maintains replicas of 2-16 others resulting in about 2-16 replicas per document on average. Table 1 summarizes our experimental parameters. Table 1. Simulation Parameters Parameters Default value Range Network size Random walkers 16 File distribution Random Query distribution Zipf (alpha = 1.) alpha = TTL 7 Avg file size 4KB 3.5KB-4.5KB Avg fragment size 1.3KB.5KB-3KB Storage limit 16KB 8KB-64KB Experimental Results We present two sets of experiments. In the first set, we consider proactive and path fragment replication, while in the second set, we com-

9 pare whole document versus fragment replication for both of them. All reported values were averaged over multiple runs. The search sizes presented are those of successful queries. Note that, in the experiments, we use skeleton replication for both proactive and path replication. Skeleton replication achieves better results than subtree replication, since external links speed up the search process. Skeleton replication answers about 6% more queries than subtree replication and decreases the search size by around 15%. However, subtree replication creates messages having smaller sizes by as much as 25%. Proactive Replication vs Path Replication. In all experiments, both techniques use all the space available for replica storage, creating approximately the same number of replicas. We first consider how the tuning parameter α of the replication utility affects search. In Figure 4, we report the percentage of successful searches for the 2% most popular and the 2% most unpopular fragments. As α increases, the number of replicas of popular fragments increases and thus the percentage of successful queries for popular fragments increases as well, while for unpopular fragments, it drops. Similarly, our results have shown that increasing the value of α reduces the search size for popular fragments (from 4 to around 2 hops) and increases the search size of unpopular ones (from 4 to around 6.5). Thus, α can be set appropriately to fine tune the search efficiency of popular and unpopular items. For example, a small value of α (for instance α =.3) increases the number of unpopular items that can be located and reduces their search size. In the following, we assume α =.6 and report average values. Succesfull queries (%) RepRIX popular Path popular RepRIX unpopular Path unopular Tuning parameter Fig.4. Query hits for popular and unpopular fragments for different values of α. Regarding scalability, we tune the period of replication for proactive replication so that the number of replication messages is about the same with that in the case of passive (path) replication. Figures 5(left) and 5(right) show that RepRIX replication outperforms path replication with respect to both the average search size and the percentage of successful queries. Fragment vs Whole Document Replication. Figures 6(left) and 6(right) present a comparison of fragment replication and whole document XML replication for both path and RepRIX replication. As shown, in both cases, fragment replication manages the available storage much more efficiently storing only the most popular fragments resulting in both smaller search sizes and more successful queries.

10 In Figure 7(left), we show how the replication methods adapt when the query distribution changes. At the middle of the simulation (5% of time), we shift the query distribution, thus changing the popularity of fragments. As shown, RepRIX fragment replication achieves its expected percentage of successful queries fast. Note that the rate of adaptation of RepRIX depends on the period of the replication protocol. In these experiments, we set the period such that the amount of activity (message cost) is approximately the same with that of path replication. In Figure 7(right), we show how the size of the popular fragments affects the performance of fragment replication. When most queries refer to relatively small fragments, both RepRIX and path fragment replication methods show better performance, since more fragments are replicated. The rate of successful queries increases as well. Finally, as expected, all type of replications improve performance in the case of node failures. Our experiments showed that this improvement is comparable. Average Search Size RepRIX Path replication Succesful queries (%) RepRIX Path replication Number of nodes Number of nodes Fig. 5. (left) Average search size for different network sizes (right) Query hits for different network sizes Average search size RepRIX frag Path frag RepRIX whole Path whole Successful queries (%) RepRIX frag Path frag RepRIX whole Path whole Number of nodes Number of nodes Fig. 6. (left) Average search size for fragment and whole document replication (right) Query hits for fragment and whole document replication 5 Related Work Replication in unstructured p2p systems has received considerable attention. The authors of [9] consider the general problem of determining the optimal number of

11 Sucessful queries (%) RepRIX frag Path frag RepRIX whole Path whole Average search size RepRIX frag Path frag RepRIX whole Path whole Simulation progress (%) Size of popular fragments Fig. 7. (left) Query hits after changing popularity of fragments (right) Average search size for different sizes of popular fragments replicas in unstructured p2p systems given that the total amount of storage in the network is fixed. They conclude that square-root replication that allocates replicas to items proportionally to the square root of the items query rate produces the optimal number of copies in terms of optimizing the average search time. It was shown that path replication achieves almost square-root replication under certain conditions [6, 9]. In this paper, we compare path replication with a proactive scheme for replica allocation that allows for tuning the search cost of popular and unpopular items. We also extend this work for items with structure, such as XML documents. Beehive [11] describes a proactive replication protocol for structured overlays that achieves O(1) lookup. The basic difference with our approach is that in structured overlays, items are placed at specific peers and thus the search cost can be reduced by carefully placing replicas of an item along its search path. However, in unstructured overlays, searches may follow diverse paths and we deploy routing indexes to estimate them. There is some work on replicating and fragmenting XML documents. In [1], the authors consider dynamic XML documents which are XML documents that contain both materialized and intentional XML data that can be produced by web service calls. Since dynamic documents may contain calls to services on other peers, some form of distribution is inherently part of the model. The focus there is on making decisions on whether to materialize the result of a call or not. The authors of [5] deal with the problem of supporting parallel query processing when parts of an XML document are located at different peers. They focus on boolean XPath queries. The key idea is to send the whole query to each peer which partially evaluates it and sends the results as compact boolean functions to a coordinator which combines them to obtain the result. The work reported in [3] addresses the XML allocation problem, that is given a collection of XML documents how to fragment them and allocate the resulting fragments among the peers of a distributed system to improve performance. The problem we address here is different in that we do not consider partition and allocation but replication and also in that peers have no global knowledge about the location of other fragments. A similar XML fragmentation problem is addressed in [4], where an approach for vectorizing and querying large XML repositories is presented. The idea is based on the decomposition of an XML document into a set of vectors that contain the data values and a compressed skeleton that describes the structure. This is similar to our approach of representing

12 replicas. Again, in this work, we address a different problem, that of dynamically creating replicated fragments. There has also been some work on processing XML documents in both structured (e.g., [2, 7, 13]) and unstructured p2p systems (e.g., [8]). This research is complementary to ours, since in this paper, we address replication. 6 Summary Replication is commonly used in peer-to-peer systems to improve response time of queries. In this paper, we focus on replicating XML documents. A distinctive feature of XML replication is determining an appropriate granularity that would allow different levels of replication for fragments of the same XML document. We have presented both a proactive and a passive replication protocol to achieve such replication at the fragment level. Our proactive replication protocol is based on replication routing indexes that are path-based indexes used to determine the appropriate peers towards which to create copies. Our experimental results show that our fragment-based replication model outperforms whole document replication, while proactive replication works better than passive replication at the cost of maintaining access statistics. References 1. Abiteboul S., Bonifati A.,. Cobena G, Manolescu I., and Milo T.: Dynamic XML Documents with Distribution and Replication. In SIGMOD, Bonifati A., Matrangolo U., Cuzzocrea A., Jain M.: XPath lookup queries in P2P networks. In WIDM Bremer, J. M., Gertz M.: On Distributing XML Repositories. In WebDB, Buneman P., Choi B., Fan W., Hutchison R., Mann R., Viglas S. D.: Vectorizing and Querying Large XML Repositories. In ICDE Buneman P., Cong G., Fan W., Kementsietsidis A.: Using Partial Evaluation in Distributed Query Evaluation. In VLDB Cohen E., Shenker S.: Replication strategies in unstructured peer-to-peer networks. In SIGCOMM Galanis L., Wang Y., Jeffery S., and DeWitt D.: Locating Data Sources in Large Distributed Systems. In VLDB, Koloniari G., Pitoura E.: Content-Based Routing of Path Queries in Peer-to-Peer Systems. In EDBT, Lv Q., Cao P., Cohen E., Li K., Shenker S.: Search and Replication in Unstructured Peer-to-Peer Networks. In ICS Peersim website: Ramasubramanian V., Sirer E. G.: Beehive: O(1) Lookup Performance for Power-Law Query Distributions in Peer-to-Peer Overlays. In NSDI Schwentick T.: XPath Query Containment. In Sigmod Record, 331(1), Skobeltsyn G., Hauswirth M., Aberer K.: Efficient Processing of XPath Queries with Structured Overlay networks. In ODBASE ToXgene website:

IP-FP AEOLUS. Algorithmic Principles for Building Efficient Overlay Computers. Deliverable D Resource discovery with QoS guarantees

IP-FP AEOLUS. Algorithmic Principles for Building Efficient Overlay Computers. Deliverable D Resource discovery with QoS guarantees IP-FP6-015964 AEOLUS Algorithmic Principles for Building Efficient Overlay Computers Deliverable D2.1.2 Resource discovery with QoS guarantees Responsible Partner: University of Ioannina (EL) Report Preparation

More information

Venugopal Ramasubramanian Emin Gün Sirer SIGCOMM 04

Venugopal Ramasubramanian Emin Gün Sirer SIGCOMM 04 The Design and Implementation of a Next Generation Name Service for the Internet Venugopal Ramasubramanian Emin Gün Sirer SIGCOMM 04 Presenter: Saurabh Kadekodi Agenda DNS overview Current DNS Problems

More information

The Design and Implementation of a Next Generation Name Service for the Internet (CoDoNS) Presented By: Kamalakar Kambhatla

The Design and Implementation of a Next Generation Name Service for the Internet (CoDoNS) Presented By: Kamalakar Kambhatla The Design and Implementation of a Next Generation Name Service for the Internet (CoDoNS) Venugopalan Ramasubramanian Emin Gün Sirer Presented By: Kamalakar Kambhatla * Slides adapted from the paper -

More information

Filters for XML-based Service Discovery in Pervasive Computing

Filters for XML-based Service Discovery in Pervasive Computing Filters for XML-based Service Discovery in Pervasive Computing Georgia Koloniari and Evaggelia Pitoura Department of Computer Science, University of Ioannina, GR45110 Ioannina, Greece {kgeorgia, pitoura}@cs.uoi.gr

More information

Load Sharing in Peer-to-Peer Networks using Dynamic Replication

Load Sharing in Peer-to-Peer Networks using Dynamic Replication Load Sharing in Peer-to-Peer Networks using Dynamic Replication S Rajasekhar, B Rong, K Y Lai, I Khalil and Z Tari School of Computer Science and Information Technology RMIT University, Melbourne 3, Australia

More information

Creating and maintaining replicas in unstructured peer-to-peer systems

Creating and maintaining replicas in unstructured peer-to-peer systems Creating and maintaining replicas in unstructured peer-to-peer systems Elias Leontiadis Vassilios V. Dimakopoulos Evaggelia Pitoura Technical Report 2006-1 Department of Computer Science, University of

More information

Scalable overlay Networks

Scalable overlay Networks overlay Networks Dr. Samu Varjonen 1 Lectures MO 15.01. C122 Introduction. Exercises. Motivation. TH 18.01. DK117 Unstructured networks I MO 22.01. C122 Unstructured networks II TH 25.01. DK117 Bittorrent

More information

Content-Based Routing of Path Queries in Peer-to-Peer Systems

Content-Based Routing of Path Queries in Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Department of Computer Science, University of Ioannina, Greece {kgeorgia,pitoura}@cs.uoi.gr Abstract.

More information

EARM: An Efficient and Adaptive File Replication with Consistency Maintenance in P2P Systems.

EARM: An Efficient and Adaptive File Replication with Consistency Maintenance in P2P Systems. : An Efficient and Adaptive File Replication with Consistency Maintenance in P2P Systems. 1 K.V.K.Chaitanya, 2 Smt. S.Vasundra, M,Tech., (Ph.D), 1 M.Tech (Computer Science), 2 Associate Professor, Department

More information

PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data

PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data Enhua Jiao, Tok Wang Ling, Chee-Yong Chan School of Computing, National University of Singapore {jiaoenhu,lingtw,chancy}@comp.nus.edu.sg

More information

Overlay and P2P Networks. Unstructured networks. Prof. Sasu Tarkoma

Overlay and P2P Networks. Unstructured networks. Prof. Sasu Tarkoma Overlay and P2P Networks Unstructured networks Prof. Sasu Tarkoma 20.1.2014 Contents P2P index revisited Unstructured networks Gnutella Bloom filters BitTorrent Freenet Summary of unstructured networks

More information

CHAPTER 3 LITERATURE REVIEW

CHAPTER 3 LITERATURE REVIEW 20 CHAPTER 3 LITERATURE REVIEW This chapter presents query processing with XML documents, indexing techniques and current algorithms for generating labels. Here, each labeling algorithm and its limitations

More information

Today: World Wide Web! Traditional Web-Based Systems!

Today: World Wide Web! Traditional Web-Based Systems! Today: World Wide Web! WWW principles Case Study: web caching as an illustrative example Invalidate versus updates Push versus Pull Cooperation between replicas Lecture 22, page 1 Traditional Web-Based

More information

Cooperative Caching versus Proactive Replication for Location Dependent Request Patterns

Cooperative Caching versus Proactive Replication for Location Dependent Request Patterns Cooperative Caching versus Proactive Replication for Location Dependent Request Patterns Niels Sluijs, Frédéric Iterbeke, Tim Wauters, Filip De Turck, Bart Dhoedt and Piet Demeester Department of Information

More information

Early Measurements of a Cluster-based Architecture for P2P Systems

Early Measurements of a Cluster-based Architecture for P2P Systems Early Measurements of a Cluster-based Architecture for P2P Systems Balachander Krishnamurthy, Jia Wang, Yinglian Xie I. INTRODUCTION Peer-to-peer applications such as Napster [4], Freenet [1], and Gnutella

More information

Last Class: Consistency Models. Today: Implementation Issues

Last Class: Consistency Models. Today: Implementation Issues Last Class: Consistency Models Need for replication Data-centric consistency Strict, linearizable, sequential, causal, FIFO Lecture 15, page 1 Today: Implementation Issues Replica placement Use web caching

More information

Overlay and P2P Networks. Unstructured networks. Prof. Sasu Tarkoma

Overlay and P2P Networks. Unstructured networks. Prof. Sasu Tarkoma Overlay and P2P Networks Unstructured networks Prof. Sasu Tarkoma 19.1.2015 Contents Unstructured networks Last week Napster Skype This week: Gnutella BitTorrent P2P Index It is crucial to be able to find

More information

Evaluating XPath Queries

Evaluating XPath Queries Chapter 8 Evaluating XPath Queries Peter Wood (BBK) XML Data Management 201 / 353 Introduction When XML documents are small and can fit in memory, evaluating XPath expressions can be done efficiently But

More information

Evaluating Unstructured Peer-to-Peer Lookup Overlays

Evaluating Unstructured Peer-to-Peer Lookup Overlays Evaluating Unstructured Peer-to-Peer Lookup Overlays Idit Keidar EE Department, Technion Roie Melamed CS Department, Technion ABSTRACT Unstructured peer-to-peer lookup systems incur small constant overhead

More information

Resilient GIA. Keywords-component; GIA; peer to peer; Resilient; Unstructured; Voting Algorithm

Resilient GIA. Keywords-component; GIA; peer to peer; Resilient; Unstructured; Voting Algorithm Rusheel Jain 1 Computer Science & Information Systems Department BITS Pilani, Hyderabad Campus Hyderabad, A.P. (INDIA) F2008901@bits-hyderabad.ac.in Chittaranjan Hota 2 Computer Science & Information Systems

More information

Bloom-Based Filters for Hierarchical Data

Bloom-Based Filters for Hierarchical Data Bloom-Based Filters for Hierarchical Data Georgia Koloniari and Evaggelia Pitoura Department of Computer Science, University of Ioannina, Greece. {kgeorgia, pitoura}@cs.uoi.gr Abstract In this paper, we

More information

Accelerating XML Structural Matching Using Suffix Bitmaps

Accelerating XML Structural Matching Using Suffix Bitmaps Accelerating XML Structural Matching Using Suffix Bitmaps Feng Shao, Gang Chen, and Jinxiang Dong Dept. of Computer Science, Zhejiang University, Hangzhou, P.R. China microf_shao@msn.com, cg@zju.edu.cn,

More information

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University CS 555: DISTRIBUTED SYSTEMS [P2P SYSTEMS] Shrideep Pallickara Computer Science Colorado State University Frequently asked questions from the previous class survey Byzantine failures vs malicious nodes

More information

Processing Rank-Aware Queries in P2P Systems

Processing Rank-Aware Queries in P2P Systems Processing Rank-Aware Queries in P2P Systems Katja Hose, Marcel Karnstedt, Anke Koch, Kai-Uwe Sattler, and Daniel Zinn Department of Computer Science and Automation, TU Ilmenau P.O. Box 100565, D-98684

More information

Enabling Dynamic Querying over Distributed Hash Tables

Enabling Dynamic Querying over Distributed Hash Tables Enabling Dynamic Querying over Distributed Hash Tables Domenico Talia a,b, Paolo Trunfio,a a DEIS, University of Calabria, Via P. Bucci C, 736 Rende (CS), Italy b ICAR-CNR, Via P. Bucci C, 736 Rende (CS),

More information

A Hybrid Peer-to-Peer Architecture for Global Geospatial Web Service Discovery

A Hybrid Peer-to-Peer Architecture for Global Geospatial Web Service Discovery A Hybrid Peer-to-Peer Architecture for Global Geospatial Web Service Discovery Shawn Chen 1, Steve Liang 2 1 Geomatics, University of Calgary, hschen@ucalgary.ca 2 Geomatics, University of Calgary, steve.liang@ucalgary.ca

More information

Jinho Hwang and Timothy Wood George Washington University

Jinho Hwang and Timothy Wood George Washington University Jinho Hwang and Timothy Wood George Washington University Background: Memory Caching Two orders of magnitude more reads than writes Solution: Deploy memcached hosts to handle the read capacity 6. HTTP

More information

Overlay and P2P Networks. Unstructured networks. PhD. Samu Varjonen

Overlay and P2P Networks. Unstructured networks. PhD. Samu Varjonen Overlay and P2P Networks Unstructured networks PhD. Samu Varjonen 25.1.2016 Contents Unstructured networks Last week Napster Skype This week: Gnutella BitTorrent P2P Index It is crucial to be able to find

More information

Effective File Replication and Consistency Maintenance Mechanism in P2P Systems

Effective File Replication and Consistency Maintenance Mechanism in P2P Systems Global Journal of Computer Science and Technology Volume 11 Issue 16 Version 1.0 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals Inc. (USA) Online ISSN: 0975-4172

More information

A Survey of Peer-to-Peer Systems

A Survey of Peer-to-Peer Systems A Survey of Peer-to-Peer Systems Kostas Stefanidis Department of Computer Science, University of Ioannina, Greece kstef@cs.uoi.gr Abstract Peer-to-Peer systems have become, in a short period of time, one

More information

An Cross Layer Collaborating Cache Scheme to Improve Performance of HTTP Clients in MANETs

An Cross Layer Collaborating Cache Scheme to Improve Performance of HTTP Clients in MANETs An Cross Layer Collaborating Cache Scheme to Improve Performance of HTTP Clients in MANETs Jin Liu 1, Hongmin Ren 1, Jun Wang 2, Jin Wang 2 1 College of Information Engineering, Shanghai Maritime University,

More information

Physical Level of Databases: B+-Trees

Physical Level of Databases: B+-Trees Physical Level of Databases: B+-Trees Adnan YAZICI Computer Engineering Department METU (Fall 2005) 1 B + -Tree Index Files l Disadvantage of indexed-sequential files: performance degrades as file grows,

More information

An XML Routing Synopsis for Unstructured P2P Networks

An XML Routing Synopsis for Unstructured P2P Networks An XML Routing Synopsis for Unstructured P2P Networks Qiang Wang University of Waterloo q6wang@uwaterloo.ca Abhay Kumar Jha IIT, Bombay abhaykj@cse.iitb.ac.in M. Tamer Özsu University of Waterloo tozsu@uwaterloo.ca

More information

On Veracious Search In Unsystematic Networks

On Veracious Search In Unsystematic Networks On Veracious Search In Unsystematic Networks K.Thushara #1, P.Venkata Narayana#2 #1 Student Of M.Tech(S.E) And Department Of Computer Science And Engineering, # 2 Department Of Computer Science And Engineering,

More information

Material You Need to Know

Material You Need to Know Review Quiz 2 Material You Need to Know Normalization Storage and Disk File Layout Indexing B-trees and B+ Trees Extensible Hashing Linear Hashing Decomposition Goals: Lossless Joins, Dependency preservation

More information

Routing XQuery in A P2P Network Using Adaptable Trie-Indexes

Routing XQuery in A P2P Network Using Adaptable Trie-Indexes Routing XQuery in A P2P Network Using Adaptable Trie-Indexes Florin Dragan, Georges Gardarin, Laurent Yeh PRISM laboratory Versailles University & Oxymel, France {firstname.lastname@prism.uvsq.fr} Abstract

More information

Full-Text and Structural XML Indexing on B + -Tree

Full-Text and Structural XML Indexing on B + -Tree Full-Text and Structural XML Indexing on B + -Tree Toshiyuki Shimizu 1 and Masatoshi Yoshikawa 2 1 Graduate School of Information Science, Nagoya University shimizu@dl.itc.nagoya-u.ac.jp 2 Information

More information

Outline. Depth-first Binary Tree Traversal. Gerênciade Dados daweb -DCC922 - XML Query Processing. Motivation 24/03/2014

Outline. Depth-first Binary Tree Traversal. Gerênciade Dados daweb -DCC922 - XML Query Processing. Motivation 24/03/2014 Outline Gerênciade Dados daweb -DCC922 - XML Query Processing ( Apresentação basedaem material do livro-texto [Abiteboul et al., 2012]) 2014 Motivation Deep-first Tree Traversal Naïve Page-based Storage

More information

Adaptive Load Balancing for DHT Lookups

Adaptive Load Balancing for DHT Lookups Adaptive Load Balancing for DHT Lookups Silvia Bianchi, Sabina Serbu, Pascal Felber and Peter Kropf University of Neuchâtel, CH-, Neuchâtel, Switzerland {silvia.bianchi, sabina.serbu, pascal.felber, peter.kropf}@unine.ch

More information

Making Gnutella-like P2P Systems Scalable

Making Gnutella-like P2P Systems Scalable Making Gnutella-like P2P Systems Scalable Y. Chawathe, S. Ratnasamy, L. Breslau, N. Lanham, S. Shenker Presented by: Herman Li Mar 2, 2005 Outline What are peer-to-peer (P2P) systems? Early P2P systems

More information

An Effective and Efficient Approach for Keyword-Based XML Retrieval. Xiaoguang Li, Jian Gong, Daling Wang, and Ge Yu retold by Daryna Bronnykova

An Effective and Efficient Approach for Keyword-Based XML Retrieval. Xiaoguang Li, Jian Gong, Daling Wang, and Ge Yu retold by Daryna Bronnykova An Effective and Efficient Approach for Keyword-Based XML Retrieval Xiaoguang Li, Jian Gong, Daling Wang, and Ge Yu retold by Daryna Bronnykova Search on XML documents 2 Why not use google? Why are traditional

More information

Small-World Overlay P2P Networks: Construction and Handling Dynamic Flash Crowd

Small-World Overlay P2P Networks: Construction and Handling Dynamic Flash Crowd Small-World Overlay P2P Networks: Construction and Handling Dynamic Flash Crowd Ken Y.K. Hui John C. S. Lui David K.Y. Yau Dept. of Computer Science & Engineering Computer Science Department The Chinese

More information

Peer-to-Peer Systems. Network Science: Introduction. P2P History: P2P History: 1999 today

Peer-to-Peer Systems. Network Science: Introduction. P2P History: P2P History: 1999 today Network Science: Peer-to-Peer Systems Ozalp Babaoglu Dipartimento di Informatica Scienza e Ingegneria Università di Bologna www.cs.unibo.it/babaoglu/ Introduction Peer-to-peer (PP) systems have become

More information

Evolutionary Linkage Creation between Information Sources in P2P Networks

Evolutionary Linkage Creation between Information Sources in P2P Networks Noname manuscript No. (will be inserted by the editor) Evolutionary Linkage Creation between Information Sources in P2P Networks Kei Ohnishi Mario Köppen Kaori Yoshida Received: date / Accepted: date Abstract

More information

Exploiting Communities for Enhancing Lookup Performance in Structured P2P Systems

Exploiting Communities for Enhancing Lookup Performance in Structured P2P Systems Exploiting Communities for Enhancing Lookup Performance in Structured P2P Systems H. M. N. Dilum Bandara and Anura P. Jayasumana Colorado State University Anura.Jayasumana@ColoState.edu Contribution Community-aware

More information

Introduction to Peer-to-Peer Systems

Introduction to Peer-to-Peer Systems Introduction Introduction to Peer-to-Peer Systems Peer-to-peer (PP) systems have become extremely popular and contribute to vast amounts of Internet traffic PP basic definition: A PP system is a distributed

More information

CSE 530A. B+ Trees. Washington University Fall 2013

CSE 530A. B+ Trees. Washington University Fall 2013 CSE 530A B+ Trees Washington University Fall 2013 B Trees A B tree is an ordered (non-binary) tree where the internal nodes can have a varying number of child nodes (within some range) B Trees When a key

More information

An Extended Byte Carry Labeling Scheme for Dynamic XML Data

An Extended Byte Carry Labeling Scheme for Dynamic XML Data Available online at www.sciencedirect.com Procedia Engineering 15 (2011) 5488 5492 An Extended Byte Carry Labeling Scheme for Dynamic XML Data YU Sheng a,b WU Minghui a,b, * LIU Lin a,b a School of Computer

More information

Parallelizing Structural Joins to Process Queries over Big XML Data Using MapReduce

Parallelizing Structural Joins to Process Queries over Big XML Data Using MapReduce Parallelizing Structural Joins to Process Queries over Big XML Data Using MapReduce Huayu Wu Institute for Infocomm Research, A*STAR, Singapore huwu@i2r.a-star.edu.sg Abstract. Processing XML queries over

More information

Chapter 8: Data Abstractions

Chapter 8: Data Abstractions Chapter 8: Data Abstractions Computer Science: An Overview Tenth Edition by J. Glenn Brookshear Presentation files modified by Farn Wang Copyright 28 Pearson Education, Inc. Publishing as Pearson Addison-Wesley

More information

B-Trees and External Memory

B-Trees and External Memory Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015 and External Memory 1 1 (2, 4) Trees: Generalization of BSTs Each internal node

More information

SCALING UP VS. SCALING OUT IN A QLIKVIEW ENVIRONMENT

SCALING UP VS. SCALING OUT IN A QLIKVIEW ENVIRONMENT SCALING UP VS. SCALING OUT IN A QLIKVIEW ENVIRONMENT QlikView Technical Brief February 2012 qlikview.com Introduction When it comes to the enterprise Business Discovery environments, the ability of the

More information

Evaluating find a path reachability queries

Evaluating find a path reachability queries Evaluating find a path reachability queries Panagiotis ouros and Theodore Dalamagas and Spiros Skiadopoulos and Timos Sellis Abstract. Graphs are used for modelling complex problems in many areas, such

More information

An Efficient Eigenvalue-based P2P XML Routing Framework

An Efficient Eigenvalue-based P2P XML Routing Framework An Efficient Eigenvalue-based P2P XML Routing Framework Qiang Wang Univ. of Waterloo, Canada q6wang@uwaterloo.ca M. Tamer Özsu Univ. of Waterloo, Canada tozsu@uwaterloo.ca Abstract Many emerging applications

More information

B-Trees and External Memory

B-Trees and External Memory Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015 B-Trees and External Memory 1 (2, 4) Trees: Generalization of BSTs Each internal

More information

On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage

On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage Arijit Khan Nanyang Technological University (NTU), Singapore Gustavo Segovia ETH Zurich, Switzerland Donald Kossmann Microsoft

More information

Fast Contextual Preference Scoring of Database Tuples

Fast Contextual Preference Scoring of Database Tuples Fast Contextual Preference Scoring of Database Tuples Kostas Stefanidis Department of Computer Science, University of Ioannina, Greece Joint work with Evaggelia Pitoura http://dmod.cs.uoi.gr 2 Motivation

More information

Chapter 17 Indexing Structures for Files and Physical Database Design

Chapter 17 Indexing Structures for Files and Physical Database Design Chapter 17 Indexing Structures for Files and Physical Database Design We assume that a file already exists with some primary organization unordered, ordered or hash. The index provides alternate ways to

More information

Update Propagation Through Replica Chain in Decentralized and Unstructured P2P Systems

Update Propagation Through Replica Chain in Decentralized and Unstructured P2P Systems Update Propagation Through Replica Chain in Decentralized and Unstructured PP Systems Zhijun Wang, Sajal K. Das, Mohan Kumar and Huaping Shen Center for Research in Wireless Mobility and Networking (CReWMaN)

More information

A Scalable Content- Addressable Network

A Scalable Content- Addressable Network A Scalable Content- Addressable Network In Proceedings of ACM SIGCOMM 2001 S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker Presented by L.G. Alex Sung 9th March 2005 for CS856 1 Outline CAN basics

More information

A FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS

A FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS A FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS SRIVANI SARIKONDA 1 PG Scholar Department of CSE P.SANDEEP REDDY 2 Associate professor Department of CSE DR.M.V.SIVA PRASAD 3 Principal Abstract:

More information

An Optimal Overlay Topology for Routing Peer-to-Peer Searches

An Optimal Overlay Topology for Routing Peer-to-Peer Searches An Optimal Overlay Topology for Routing Peer-to-Peer Searches Brian F. Cooper Center for Experimental Research in Computer Systems, College of Computing, Georgia Institute of Technology cooperb@cc.gatech.edu

More information

Addressed Issue. P2P What are we looking at? What is Peer-to-Peer? What can databases do for P2P? What can databases do for P2P?

Addressed Issue. P2P What are we looking at? What is Peer-to-Peer? What can databases do for P2P? What can databases do for P2P? Peer-to-Peer Data Management - Part 1- Alex Coman acoman@cs.ualberta.ca Addressed Issue [1] Placement and retrieval of data [2] Server architectures for hybrid P2P [3] Improve search in pure P2P systems

More information

A Data Locating Mechanism for Distributed XML Data over P2P Networks

A Data Locating Mechanism for Distributed XML Data over P2P Networks A Data Locating Mechanism for Distributed XML Data over PP Networks Qiang Wang AND M. Tamer Özsu University of Waterloo School of Computer Science Waterloo, Canada {qwang,tozsu}@uwaterloo.ca Technical

More information

Data-Centric Query in Sensor Networks

Data-Centric Query in Sensor Networks Data-Centric Query in Sensor Networks Jie Gao Computer Science Department Stony Brook University 10/27/05 Jie Gao, CSE590-fall05 1 Papers Chalermek Intanagonwiwat, Ramesh Govindan and Deborah Estrin, Directed

More information

A SOLUTION FOR REPLICA CONSISTENCY MAINTENANCE IN UNSTRUCTURED PEER-TO-PEER NETWORKS

A SOLUTION FOR REPLICA CONSISTENCY MAINTENANCE IN UNSTRUCTURED PEER-TO-PEER NETWORKS INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS ISSN 2320-7345 A SOLUTION FOR REPLICA CONSISTENCY MAINTENANCE IN UNSTRUCTURED PEER-TO-PEER NETWORKS Narjes Nikzad Khasmakhi 1, Shahram

More information

XML Storage and Indexing

XML Storage and Indexing XML Storage and Indexing Web Data Management and Distribution Serge Abiteboul Ioana Manolescu Philippe Rigaux Marie-Christine Rousset Pierre Senellart Web Data Management and Distribution http://webdam.inria.fr/textbook

More information

Query Routing and Processing in Schema-Based P2P Systems

Query Routing and Processing in Schema-Based P2P Systems Query Routing and Processing in Schema-Based PP Systems Marcel Karnstedt Katja Hose Kai-Uwe Sattler Department of Computer Science and Automation, TU Ilmenau P.O. Box 1565, D-98684 Ilmenau, Germany Abstract

More information

Debunking some myths about structured and unstructured overlays

Debunking some myths about structured and unstructured overlays Debunking some myths about structured and unstructured overlays Miguel Castro Manuel Costa Antony Rowstron Microsoft Research, 7 J J Thomson Avenue, Cambridge, UK Abstract We present a comparison of structured

More information

Today: World Wide Web

Today: World Wide Web Today: World Wide Web WWW principles Case Study: web caching as an illustrative example Invalidate versus updates Push versus Pull Cooperation between replicas Lecture 22, page 1 Traditional Web-Based

More information

Advances in Data Management Principles of Database Systems - 2 A.Poulovassilis

Advances in Data Management Principles of Database Systems - 2 A.Poulovassilis 1 Advances in Data Management Principles of Database Systems - 2 A.Poulovassilis 1 Storing data on disk The traditional storage hierarchy for DBMSs is: 1. main memory (primary storage) for data currently

More information

Today: World Wide Web. Traditional Web-Based Systems

Today: World Wide Web. Traditional Web-Based Systems Today: World Wide Web WWW principles Case Study: web caching as an illustrative example Invalidate versus updates Push versus Pull Cooperation between replicas Lecture 22, page 1 Traditional Web-Based

More information

CSIT5300: Advanced Database Systems

CSIT5300: Advanced Database Systems CSIT5300: Advanced Database Systems L08: B + -trees and Dynamic Hashing Dr. Kenneth LEUNG Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong SAR,

More information

CSC2231: DNS with DHTs

CSC2231: DNS with DHTs CSC2231: DNS with DHTs http://www.cs.toronto.edu/~stefan/courses/csc2231/05au Stefan Saroiu Department of Computer Science University of Toronto Administrivia Next lecture: P2P churn Understanding Availability

More information

Improving Information Retrieval Effectiveness in Peer-to-Peer Networks through Query Piggybacking

Improving Information Retrieval Effectiveness in Peer-to-Peer Networks through Query Piggybacking Improving Information Retrieval Effectiveness in Peer-to-Peer Networks through Query Piggybacking Emanuele Di Buccio, Ivano Masiero, and Massimo Melucci Department of Information Engineering, University

More information

CS 640 Introduction to Computer Networks. Today s lecture. What is P2P? Lecture30. Peer to peer applications

CS 640 Introduction to Computer Networks. Today s lecture. What is P2P? Lecture30. Peer to peer applications Introduction to Computer Networks Lecture30 Today s lecture Peer to peer applications Napster Gnutella KaZaA Chord What is P2P? Significant autonomy from central servers Exploits resources at the edges

More information

Stochastic Models of Pull-Based Data Replication in P2P Systems

Stochastic Models of Pull-Based Data Replication in P2P Systems Stochastic Models of Pull-Based Data Replication in P2P Systems Xiaoyong Li and Dmitri Loguinov Presented by Zhongmei Yao Internet Research Lab Department of Computer Science and Engineering Texas A&M

More information

A Service Replication Scheme for Service Oriented Computing in Pervasive Environment

A Service Replication Scheme for Service Oriented Computing in Pervasive Environment A Service Replication Scheme for Service Oriented Computing in Pervasive Environment Sital Dash, Mridulina Nandi & Subhrabrata Choudhury National Institute of Technology, Durgapur E-mail : Sheetal.dash@gmail.com,

More information

Question Bank Subject: Advanced Data Structures Class: SE Computer

Question Bank Subject: Advanced Data Structures Class: SE Computer Question Bank Subject: Advanced Data Structures Class: SE Computer Question1: Write a non recursive pseudo code for post order traversal of binary tree Answer: Pseudo Code: 1. Push root into Stack_One.

More information

Performance Comparison of Caching Strategies for Information-Centric IoT

Performance Comparison of Caching Strategies for Information-Centric IoT Performance Comparison of Caching Strategies for Information-Centric IoT Jakob Pfender, Alvin Valera, Winston Seah School of Engineering and Computer Science Victoria University of Wellington, New Zealand

More information

CS419: Computer Networks. Lecture 6: March 7, 2005 Fast Address Lookup:

CS419: Computer Networks. Lecture 6: March 7, 2005 Fast Address Lookup: : Computer Networks Lecture 6: March 7, 2005 Fast Address Lookup: Forwarding/Routing Revisited Best-match Longest-prefix forwarding table lookup We looked at the semantics of bestmatch longest-prefix address

More information

Amol Deshpande, UC Berkeley. Suman Nath, CMU. Srinivasan Seshan, CMU

Amol Deshpande, UC Berkeley. Suman Nath, CMU. Srinivasan Seshan, CMU Amol Deshpande, UC Berkeley Suman Nath, CMU Phillip Gibbons, Intel Research Pittsburgh Srinivasan Seshan, CMU IrisNet Overview of IrisNet Example application: Parking Space Finder Query processing in IrisNet

More information

MEMORY MANAGEMENT/1 CS 409, FALL 2013

MEMORY MANAGEMENT/1 CS 409, FALL 2013 MEMORY MANAGEMENT Requirements: Relocation (to different memory areas) Protection (run time, usually implemented together with relocation) Sharing (and also protection) Logical organization Physical organization

More information

Chapter 11: Indexing and Hashing

Chapter 11: Indexing and Hashing Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

Ossification of the Internet

Ossification of the Internet Ossification of the Internet The Internet evolved as an experimental packet-switched network Today, many aspects appear to be set in stone - Witness difficulty in getting IP multicast deployed - Major

More information

LCA-based Selection for XML Document Collections

LCA-based Selection for XML Document Collections WWW 2 Full Paper April 26-3 Raleigh NC USA LCA-based Selection for XML Document Collections Georgia Koloniari Department of Computer Science University of Ioannina, Greece kgeorgia@cs.uoi.gr Evaggelia

More information

A Top Catching Scheme Consistency Controlling in Hybrid P2P Network

A Top Catching Scheme Consistency Controlling in Hybrid P2P Network A Top Catching Scheme Consistency Controlling in Hybrid P2P Network V. Asha*1, P Ramesh Babu*2 M.Tech (CSE) Student Department of CSE, Priyadarshini Institute of Technology & Science, Chintalapudi, Guntur(Dist),

More information

Schema-Agnostic Indexing with Azure Document DB

Schema-Agnostic Indexing with Azure Document DB Schema-Agnostic Indexing with Azure Document DB Introduction Azure DocumentDB is Microsoft s multi-tenant distributed database service for managing JSON documents at Internet scale Multi-tenancy is an

More information

Performance Analysis of Restricted Path Flooding Scheme in Distributed P2P Overlay Networks

Performance Analysis of Restricted Path Flooding Scheme in Distributed P2P Overlay Networks 216 IJCSNS International Journal of Computer Science and Network Security, VOL.7 No.12, December 2007 Performance Analysis of Restricted Path Flooding Scheme in Distributed P2P Overlay Networks Hyuncheol

More information

Server Selection Mechanism. Server Selection Policy. Content Distribution Network. Content Distribution Networks. Proactive content replication

Server Selection Mechanism. Server Selection Policy. Content Distribution Network. Content Distribution Networks. Proactive content replication Content Distribution Network Content Distribution Networks COS : Advanced Computer Systems Lecture Mike Freedman Proactive content replication Content provider (e.g., CNN) contracts with a CDN CDN replicates

More information

XML Data Stream Processing: Extensions to YFilter

XML Data Stream Processing: Extensions to YFilter XML Data Stream Processing: Extensions to YFilter Shaolei Feng and Giridhar Kumaran January 31, 2007 Abstract Running XPath queries on XML data steams is a challenge. Current approaches that store the

More information

An Implementation of Tree Pattern Matching Algorithms for Enhancement of Query Processing Operations in Large XML Trees

An Implementation of Tree Pattern Matching Algorithms for Enhancement of Query Processing Operations in Large XML Trees An Implementation of Tree Pattern Matching Algorithms for Enhancement of Query Processing Operations in Large XML Trees N. Murugesan 1 and R.Santhosh 2 1 PG Scholar, 2 Assistant Professor, Department of

More information

FM-WAP Mining: In Search of Frequent Mutating Web Access Patterns from Historical Web Usage Data

FM-WAP Mining: In Search of Frequent Mutating Web Access Patterns from Historical Web Usage Data FM-WAP Mining: In Search of Frequent Mutating Web Access Patterns from Historical Web Usage Data Qiankun Zhao Nanyang Technological University, Singapore and Sourav S. Bhowmick Nanyang Technological University,

More information

Evaluation of Models for Analyzing Unguided Search in Unstructured Networks

Evaluation of Models for Analyzing Unguided Search in Unstructured Networks Evaluation of Models for Analyzing Unguided Search in Unstructured Networks Bin Wu and Ajay D. Kshemkalyani Computer Science Department, Univ. of Illinois at Chicago, Chicago, IL 667, USA {bwu, ajayk}@cs.uic.edu

More information

Author's personal copy

Author's personal copy Journal of Network and Computer Applications 35 (212) 85 96 Contents lists available at ScienceDirect Journal of Network and Computer Applications journal homepage: www.elsevier.com/locate/jnca Proactive

More information

Compression of the Stream Array Data Structure

Compression of the Stream Array Data Structure Compression of the Stream Array Data Structure Radim Bača and Martin Pawlas Department of Computer Science, Technical University of Ostrava Czech Republic {radim.baca,martin.pawlas}@vsb.cz Abstract. In

More information

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Naming WHAT IS NAMING? Name: Entity: Slide 3. Slide 1. Address: Identifier:

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Naming WHAT IS NAMING? Name: Entity: Slide 3. Slide 1. Address: Identifier: BASIC CONCEPTS DISTRIBUTED SYSTEMS [COMP9243] Name: String of bits or characters Refers to an entity Slide 1 Lecture 9a: Naming ➀ Basic Concepts ➁ Naming Services ➂ Attribute-based Naming (aka Directory

More information

A Scalable Mobile Agent Location Mechanism

A Scalable Mobile Agent Location Mechanism A Scalable Mobile Agent Location Mechanism Georgia Kastidou Department of Computer Science University of Ioannina GR 45110, Ioannina, Greece +30 26510 98857 georgia@cs.uoi.gr Evaggelia Pitoura Department

More information

AOTO: Adaptive Overlay Topology Optimization in Unstructured P2P Systems

AOTO: Adaptive Overlay Topology Optimization in Unstructured P2P Systems AOTO: Adaptive Overlay Topology Optimization in Unstructured P2P Systems Yunhao Liu, Zhenyun Zhuang, Li Xiao Department of Computer Science and Engineering Michigan State University East Lansing, MI 48824

More information

Design and Implementation of A P2P Cooperative Proxy Cache System

Design and Implementation of A P2P Cooperative Proxy Cache System Design and Implementation of A PP Cooperative Proxy Cache System James Z. Wang Vipul Bhulawala Department of Computer Science Clemson University, Box 40974 Clemson, SC 94-0974, USA +1-84--778 {jzwang,

More information