Replication Routing Indexes for XML Documents


Panos Skyvalidas, Evaggelia Pitoura, and V. V. Dimakopoulos
Computer Science Department, University of Ioannina, GR4511 Ioannina, Greece

Abstract. Peer-to-peer systems have attracted considerable attention as a means of sharing content among large and dynamic communities of nodes. In this paper, we consider sharing of XML documents. We propose a simple yet efficient replication method that is based on replication routing indexes. A replication routing index for a node is a path-based XML index that summarizes access information for each of the links of the node. The replication routing indexes are used for determining both the granularity of the XML fragments to be replicated as well as the appropriate sites for placing the fragments. We also present experimental results of the deployment of our indexes in a dynamic unstructured peer-to-peer system.

1 Introduction

Peer-to-peer (p2p) systems have attracted a lot of attention as a means of data sharing among a large and dynamic population of nodes. Nodes in a p2p system connect with (know about) a small number of other nodes, thus forming an overlay network. A central issue is locating the nodes that hold data of interest. There have been various proposals towards building overlays for efficient content location. Such proposals vary from constructing rigid topologies and placing data on specific nodes to unstructured networks with no correlation between the node content and its position in the overlay. In all overlay types, content replication results in reducing the latency of lookups.

Motivated by the fact that XML is increasingly being used in data intensive applications, in this paper, we study replication in p2p systems with nodes that share content stored in XML. Various replication techniques have been proposed that can be roughly categorized as either passive or proactive. With passive replication, items are replicated after being successfully located.
A commonly used passive replication scheme is path replication, which caches data items along the lookup path. It has been shown that the number of copies produced by path replication is close to optimal with respect to the average search size [6, 9], that is, the expected number of steps to reach a locatable item. With proactive replication, the creation of replicas is initiated by holders of data items, not necessarily after a successful lookup of the item. Path replication is simple and requires minimum bookkeeping. However, it tends to cluster copies along lookup paths. Further, passive replication considers the average search cost over all items and does not allow for fine-tuning the search efficiency for popular and unpopular items.

We consider XML replication for both passive and proactive protocols. XML documents have a hierarchical structure and thus different fragments of an XML document may have different access frequencies. We show that replicating at the fragment level is preferable to replicating whole documents. For proactive replication, we introduce a new data structure which we call replication routing index. A Replication Routing Index (RepRI) for a peer has one entry for each of its adjacent edges, with statistics about the requests it has received through each one of them. Our replication strategy uses these indexes for deciding whether to maintain a copy locally or forward it, as well as for selecting a direction for doing so. A Replication Routing Index for XML (RepRIX) maintains statistics for fragments. RepRIX allows us to fine-tune the unit of replication, so that fragments of the same document can have different numbers of replicas. Further, it allows us to push specific fragments closer to their requesters. RepRIX entries are also used as hints during lookup to direct nodes towards paths that most probably hold replicas of the requested items.

We compare experimentally both proactive and passive variations of fragment replication. Both types of fragment replication outperform whole-document replication, both increasing the percentage of items located and reducing the number of steps required to locate them. Proactive replication with hints is shown to work better than passive replication. Furthermore, proactive replication allows for fine-tuning the search efficiency for popular and unpopular items, at the additional overhead of maintaining the replication indexes.

The remainder of this paper is structured as follows. Section 2 introduces proactive replication using replication indexes. Section 3 extends replication indexes for XML documents, while Section 4 reports our experimental results. Section 5 presents related research. Finally, Section 6 concludes the paper.

Work supported in part by AEOLUS project, IST 15964.

2 Proactive Replication in Unstructured p2p Systems

We consider first the case where peers store simple data files.
Search is based on keyword-based queries that refer to file names.

Replication in Unstructured p2p Systems

We consider unstructured overlay networks where there is no control over the network topology or data placement. Unstructured p2p systems offer decentralization and simplicity of system construction and maintenance. However, searching for an item may be expensive, since each peer requesting an item needs to propagate the search message through the overlay using some variation of flooding or random walks. With flooding, each peer contacts its neighbors in the overlay, which in turn contact their own neighbors, until the requested item is located. With random walks, the peer initiates one or more depth-first traversals of the overlay. In both cases, to avoid overwhelming the network, the search query is propagated for up to a maximum of TTL (Time-To-Live) hops.

To increase data availability and improve search performance, several replication techniques have been proposed. The best-known replication techniques used in unstructured p2p systems are owner and path replication, both of which are passive schemes. With owner replication, the requester peer locally stores a file after it has been successfully located. Owner replication creates at most one replica for each successful search. With path replication, after a successful search, the file is replicated at all peers on the path from the provider peer to the requester peer. Previous work on replication [6] has shown that, under certain conditions, path replication results in creating close to optimal numbers of replicas in terms of minimizing the average search size for locatable items. However, path replication tends to replicate files on peers that are topologically along the same path. Proactive replication allows for a more random distribution of replicas. We propose a simple replication strategy that creates and places replicas along the peers of an unstructured p2p network in a proactive manner.

Replication Routing Indexes

To support proactive replication, we introduce a data structure that we call Replication Routing Index (RepRI). The RepRI(p) of a peer p maintains information about file requests that the peer has received through its neighbors. There are entries corresponding to each of its links (neighbors), summarizing the requests received through each of these links. For simplicity, we maintain entries for individual files. Other methods, for instance maintaining information for groups of requests, are possible for reducing the size of the structure.

The entry RepRI(p)(f), for a peer p with k neighbors and a file f, is of the form $(LReq, Req_1, Hop_1, \ldots, Req_k, Hop_k, Ownership)$, where $LReq$ is the number of queries for file f initiated by peer p itself, and $Req_i$ and $Hop_i$, with $i = 1, \ldots, k$, are respectively the number of queries that have been forwarded to p by its neighbor i and the average number of hops required for the corresponding queries to reach p through i. $Ownership$ is set to 1 or 0 depending on whether file f is stored locally at p or not. In Figure 1, we show the RepRI maintained by peer 1, which has three neighbors, 2, 3 and 4.
Peer 1 has processed queries for two local files, Talk.mp3 and Precious.mp3, and for one file, Sun.jpg, stored at some other peer.

Fig. 1. The replication routing index of peer 1 (columns: Filename, LReq, Req1, Hop1, Req2, Hop2, Req3, Hop3, Ownership; rows: Talk.mp3, Sun.jpg, Precious.mp3).

Our replication strategy uses these indexes to decide whether a peer should replicate some of its files and forward them in certain directions. Our goal is to create and move replicas towards the direction in which they are actually needed. The decision on which file to replicate and towards which direction is based on both the popularity of the file and the cost of locating it. In particular, each peer p with k neighbors calculates the replication utility, ru(f), of a file f as follows:

\[ ru(f) = \alpha \cdot pop\_factor(f) + (1 - \alpha) \cdot hop\_factor(f) \]

where

\[ pop\_factor(f) = \frac{\sum_{i=1}^{k} Req_i(f)}{\max_{f'} \sum_{i=1}^{k} Req_i(f')} \qquad \text{and} \qquad hop\_factor(f) = \frac{\left( \sum_{i=1}^{k} Hop_i(f) \right) / k}{TTL}. \]

$Req_i(f)$ and $Hop_i(f)$ denote respectively the number of queries for file f and the corresponding number of hops for these queries that p has received from its neighbor i. The weight α is a tuning parameter that determines how much each of these two factors, namely popularity and search size, affects the replication decision. The larger the value of α, the more efficient the search for popular items. Each peer p also maintains the average replication utility (aru) for all files in its index.

Periodically, a peer p creates replicas of all locally stored files that have replication utility greater than or equal to its average replication utility. For each such file f, peer p sends a replication message to its neighbor i that has the maximum corresponding $Req_i(f)$. After completing this replication phase, the entries for all files in the RepRI of the initiating peer are reset to 0.

Each peer that receives a replication message for a file f checks the value of the corresponding $LReq$ field. If this is greater than 0, the peer stores the replica. Otherwise, the peer compares ru(f) with its aru and, if ru(f) is greater than or equal to aru, it stores the file. Finally, the peer checks whether it should further forward the replication message to any of its neighbors, based on the fields $Req_i(f)$. If at least one of them is greater than 0, the replication message is forwarded to the neighbor i having the maximum corresponding $Req_i(f)$. Otherwise, the replication procedure stops.

The duration of the period of the replication procedure depends on the query workload. A long period leads to more informed decisions, based on sufficiently large samples of requests, and ignores popularity fluctuations that may be caused by random variations in the query workload. Furthermore, it reduces the associated network overhead. On the other hand, the system adapts to workload changes less promptly. Also, note that resetting the RepRI entries of the peer that initiated the replication procedure provides a simple yet effective aging scheme.

RepRI-Based Replacement and Routing

When peers have reached their maximum storage capacity for storing replicas, RepRI is used for selecting which of the replicas to replace. To this end, the file f with the smallest replication utility ru(f) is replaced.
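The bookkeeping and the periodic selection rule described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `RepRIEntry`, `replication_utility` and `files_to_replicate` are hypothetical, the sample counters for peer 1 are invented for illustration (the values of Fig. 1 are not available here), and a neighbor with no recorded requests is assumed to contribute zero to each sum.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RepRIEntry:
    """One RepRI row: (LReq, Req_1, Hop_1, ..., Req_k, Hop_k, Ownership)."""
    lreq: int = 0                                         # queries issued by this peer
    req: Dict[int, int] = field(default_factory=dict)     # neighbor id -> forwarded queries
    hop: Dict[int, float] = field(default_factory=dict)   # neighbor id -> average hops
    owned: bool = False                                   # Ownership flag (1 or 0)


def replication_utility(index, fname, neighbors, alpha, ttl):
    """ru(f) = alpha * pop_factor(f) + (1 - alpha) * hop_factor(f)."""
    def total_requests(entry):
        return sum(entry.req.get(n, 0) for n in neighbors)

    max_total = max(total_requests(e) for e in index.values())
    entry = index[fname]
    pop_factor = total_requests(entry) / max_total if max_total else 0.0
    hop_factor = (sum(entry.hop.get(n, 0.0) for n in neighbors) / len(neighbors)) / ttl
    return alpha * pop_factor + (1 - alpha) * hop_factor


def files_to_replicate(index, neighbors, alpha, ttl) -> List[Tuple[str, int]]:
    """Owned files with ru >= aru, each paired with the neighbor of maximum Req_i."""
    if not index:
        return []
    ru = {f: replication_utility(index, f, neighbors, alpha, ttl) for f in index}
    aru = sum(ru.values()) / len(ru)  # average replication utility over the index
    chosen = []
    for fname, entry in index.items():
        if entry.owned and ru[fname] >= aru:
            target = max(neighbors, key=lambda n: entry.req.get(n, 0))
            chosen.append((fname, target))
    return chosen


# Hypothetical counters for peer 1 with neighbors 2, 3 and 4.
neighbors = [2, 3, 4]
index = {
    "Talk.mp3": RepRIEntry(3, {2: 10, 4: 5}, {2: 2.0, 4: 4.0}, True),
    "Sun.jpg": RepRIEntry(0, {2: 1, 3: 2}, {2: 6.0, 3: 5.0}, False),
    "Precious.mp3": RepRIEntry(0, {3: 1, 4: 1}, {3: 3.0, 4: 3.0}, True),
}
plan = files_to_replicate(index, neighbors, alpha=0.6, ttl=7)
print(plan)
```

With these made-up counters only Talk.mp3 reaches the average utility, so a single replication message would be sent towards neighbor 2. The same `ru` values could also drive replacement by evicting the stored replica with the smallest utility.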
RepRI can also be used to further improve the routing process by maintaining the direction of the provider of a file. When a peer forwards a replication message to one of its neighbors without storing the replica, it sets the corresponding counter to a hint value indicating that the file has been copied along this direction. Thus, the next time the peer receives a query for that file, the peer knows towards which of its neighbors to forward the request.

3 Replication Routing Indexes for XML

In most p2p systems, users and applications employ various formats and schemas to describe their data. In this section, we consider replication in the case of p2p systems for sharing XML data.

Data and Query Model

An XML document is represented by a rooted labeled tree, where nodes correspond to XML elements labeled with the corresponding tag and edges represent direct element-subelement relationships. Figure 2 depicts an XML description of a song collection provided by a node and the corresponding XML tree. To support querying the structure of documents, in addition to their content, query languages for XML employ path patterns. We focus on simple path queries of the form $p_1 e_1 p_2 e_2 \ldots p_n e_n$, where each $e_i$ is an element name or the wildcard operator * and each $p_i$ is either / or //, denoting respectively parent-child and ancestor-descendant traversal.
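As an illustration of this query model, the following Python sketch evaluates a simple path query against an XML tree using only the standard library. It is a hedged sketch rather than the evaluator used in the paper: the function names `parse_path` and `evaluate`, the artificial `#document` node placed above the root, and the song texts `s1` to `s4` are assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET


def parse_path(query):
    """Split a query such as '/a//b/*' into (axis, name) steps."""
    return re.findall(r"(//|/)([^/]+)", query)


def evaluate(root, query):
    """Return the elements reached from the document root via the path query."""
    doc = ET.Element("#document")  # artificial parent so that '/' can select the root
    doc.append(root)
    current = [doc]
    for axis, name in parse_path(query):
        selected, seen = [], set()
        for node in current:
            if axis == "/":
                pool = list(node)  # parent-child step
            else:
                pool = [d for d in node.iter() if d is not node]  # ancestor-descendant step
            for cand in pool:
                if (name == "*" or cand.tag == name) and id(cand) not in seen:
                    seen.add(id(cand))
                    selected.append(cand)
        current = selected
    return current


root = ET.fromstring(
    "<song_collection>"
    "<pop><song>s1</song><song>s2</song></pop>"
    "<rock><song>s3</song><song>s4</song></rock>"
    "</song_collection>"
)
rock_songs = [e.text for e in evaluate(root, "/song_collection/rock/*")]
print(rock_songs)
```

In this sketch a document matches a query exactly when `evaluate` returns a non-empty list, which is the test a peer would apply to decide whether it is a matching peer.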

    <song_collection>
      <pop> <song>...</song> <song>...</song> </pop>
      <rock> <song> </song> <song></song> </rock>
    </song_collection>

Fig. 2. (left) Example of an XML document (right) The corresponding tree, with root song_collection, children pop and rock, and two song nodes under each of them.

A path query q is evaluated at a node u of an XML tree T and its result is the set of nodes of T reachable from u via q. For a query q and a document d, we say that d matches q if the path expression forming the query exists in the document. A peer that stores documents that match the query is called a matching peer. Using path queries, peers receive as results fragments of the information included in an XML document. An XML fragment $F_i$ is a subtree of T rooted at some node n of T. Each fragment is represented by a simple path expression starting from the root node r of T and leading to n. In Figure 2 (shaded part), we see an example of an XML fragment represented by the path /song_collection/rock/*. Different fragments of a document may have different access frequencies. We exploit this fact by replicating at the fragment level.

Fragment Replication

We allow peers to replicate appropriate fragments of documents depending on the queries they have processed. We assume that the documents stored initially at peers are not fragmented. Document fragmentation is the result of the replication strategy. We assume that a unique identifier is associated with each document. This is also useful for handling possible updates. For brevity, in our examples, we have omitted most of these ids. The replicas that are created correspond to fragments defined by path queries. We propose two techniques for replica creation, called skeleton replication and subtree replication.

(left)
    <song_collection>
      <pop> <song>...</song> <song>...</song> </pop>
      <house> <song> </song> <song></song> </house>
      <rock> <song> </song> <song></song> </rock>
    </song_collection>

(middle)
    <song_collection>
      <pop> <externallink> </pop>
      <house> <externallink> </house>
      <rock> <song> </song> <song></song> </rock>
    </song_collection>

(right)
    <rock path="song_collection/rock" parentnode="id" hassibling="yes">
      <song> </song> <song></song>
    </rock>

Fig. 3. (left) Example of an XML document (middle) Skeleton replication and (right) Subtree replication for the fragment defined by the path /song_collection/rock/*.
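To make the middle and right panels of Fig. 3 concrete, the sketch below derives both kinds of replica from the document in the left panel. It is only an illustration under assumptions: the function names, the `href` attribute on `externallink`, the origin string `peer-7/doc-1`, the parent id `doc-1` and the song texts are hypothetical, while the attribute names `path`, `parentnode` and `hassibling` are taken from the figure.

```python
import copy
import xml.etree.ElementTree as ET


def skeleton_replica(root, tags, origin):
    """Copy the element hierarchy along `tags`; stub out off-path siblings."""
    if root.tag != tags[0]:
        raise ValueError("path does not start at this element")
    if len(tags) == 1:
        return copy.deepcopy(root)  # fragment root: copy the whole subtree
    replica = ET.Element(root.tag, root.attrib)
    for child in root:
        if child.tag == tags[1]:
            replica.append(skeleton_replica(child, tags[1:], origin))
        else:
            stub = ET.SubElement(replica, child.tag)  # sibling kept in the skeleton
            ET.SubElement(stub, "externallink", {"href": origin})  # data stays at origin
    return replica


def subtree_replica(root, tags, parent_id):
    """Copy only the subtree at the fragment root and annotate its root element."""
    if root.tag != tags[0]:
        raise ValueError("path does not start at this element")
    node, parent = root, None
    for tag in tags[1:]:
        parent = node
        node = next(c for c in parent if c.tag == tag)
    replica = copy.deepcopy(node)
    replica.set("path", "/".join(tags))
    replica.set("parentnode", parent_id)
    has_sibling = parent is not None and len(parent) > 1
    replica.set("hassibling", "yes" if has_sibling else "no")
    return replica


original = ET.fromstring(
    "<song_collection>"
    "<pop><song>p1</song><song>p2</song></pop>"
    "<house><song>h1</song><song>h2</song></house>"
    "<rock><song>r1</song><song>r2</song></rock>"
    "</song_collection>"
)
path = ["song_collection", "rock"]
skeleton = skeleton_replica(original, path, "peer-7/doc-1")
subtree = subtree_replica(original, path, "doc-1")
print(ET.tostring(skeleton, encoding="unicode"))
print(ET.tostring(subtree, encoding="unicode"))
```

The skeleton copy keeps pop and house as empty elements that only carry a link back to the original, whereas the subtree copy drops the surrounding hierarchy and records it in attributes instead.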

Skeleton Replication. With skeleton replication, replicas have a structure similar to that of the original documents. The only difference is that only part of the data of the original document is replicated. For creating a replica for a path expression $/a_1/a_2/\ldots/a_n$, we copy the skeleton (element hierarchy) of the original document from the root node $a_1$ to the fragment-root node $a_n$. We also copy the data contained in the subtree defined by the path query. Data that corresponds to elements that are not included in the related path is not replicated. Instead, it is replaced by an external link. External links are special elements, identified by the tag name externallink, used to indicate where the data of the element to which they belong is located. To ensure the correctness and the completeness of the query evaluation process, we enforce the following constraint on the structure of replicas.

Replica Creation Constraint. For a path expression $/a_1/a_2/\ldots/a_n$, if an element node $a_i$ of the original document has siblings, then the corresponding replica contains, along with the sibling elements in the skeleton, an external link to the original document.

External links can be viewed as an intensional description of this missing data and provide the means to obtain it, if needed. From the XML tree perspective, external links play the role of some form of external nodes. In Figure 3, we see an example of an XML document and a replica corresponding to the path query /song_collection/rock/*. With skeleton replication (Fig. 3(middle)), only the data contained at the rock element is replicated. Data contained at the sibling elements pop and house is not replicated but replaced by an external link to the original document.

When a peer receives a path query, it first checks whether it has matching documents. If this is not the case, the query is further forwarded to some neighboring peer.
If there is a match and the fragment-root node defined by the path query is a leaf node, then the query is satisfied by this node and the corresponding data is returned to the requester peer. Otherwise, if the fragment-root node defines a subtree, its child nodes are checked. If none of them contains an external link, then, due to the Replica Creation Constraint, all the requested data must be present in the current document. Data is collected and returned to the requester peer, and processing completes. Otherwise, if one or more child nodes contain an external link, the peer uses the link to forward the query to the peer that stores the document containing all the requested data.

Subtree Replication. With subtree replication, the replicas that are created contain only the fragment defined by the path query and not the whole skeleton, that is, the replica just consists of the subtree (of the original document) rooted at the fragment-root node. In addition, for evaluating a path query over a replica, three attributes are added to its root element: the path, the parent node and the hassiblings attributes. The path attribute contains the sequence of element names from the root node of the original document to the root node of the replica fragment. Since all documents have a unique id, the parent node attribute contains the parent node's id for achieving cohesion between the original document and the replica. Finally, the hassiblings attribute takes the values yes or no, indicating whether the replica's root node has any sibling nodes which have not been replicated. Figure 3(right) shows the replica corresponding to the path query /song_collection/rock/* when using subtree replication. The query processing procedure for subtree replication is similar to that of skeleton replication.

Replication Using RepRIX

Instead of maintaining statistics for whole documents, a Replication Routing Index for XML (RepRIX) maintains entries that correspond to paths, representing fragments that the peer has processed queries for. Peers make decisions regarding replicating fragments of documents based on the replication utilities of the corresponding paths. An important point is that the replication utility of a path is calculated by taking into account the replication utility of all paths that are subsumed by it. A path $p_1$ is subsumed by a path $p_2$ if, whenever a document d matches $p_2$, it also matches $p_1$. Checking XML query subsumption is a well-studied problem (for instance, see [12] for a survey). In this paper, we use a simple test: we check whether $p_2$ is a prefix of $p_1$. Path subsumption is not taken into account when updating the access frequency of fragments; instead, each path entry is updated separately. However, path subsumption is considered when taking replication decisions. In particular, the replication utility for each path p is calculated as follows:

\[ ru(p) = \sum_{i=1}^{m} ru(p_i), \quad \text{for each } p_i \text{ that is subsumed by } p. \]

The replication message refers to the fragment that subsumes the others, that is, the fragment to be created and replicated is represented by the path that corresponds to the largest subtree. For example, consider the case where a peer has processed queries for paths /a/b/c and /a/b. Since the path /a/b/c is subsumed by the path /a/b, the fragment that is finally replicated is the one defined by the path /a/b.

Note that handling external links requires an extension of the content of RepRIX.
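Returning to the path-level utility and the prefix test described above, a minimal Python sketch is given below. It rests on assumptions that go beyond the text: the function names and the utility values are hypothetical, the prefix test is applied step by step (so that /a/b is not treated as a prefix of /a/bc), a path is taken to subsume itself so that its own utility is included in the sum, and only child-axis (/) paths are handled.

```python
def steps(path):
    """Split a child-axis path such as '/a/b/c' into its element names."""
    return [s for s in path.split("/") if s]


def subsumes(p2, p1):
    """Prefix test: p1 is subsumed by p2 when p2 is a step-wise prefix of p1."""
    a, b = steps(p2), steps(p1)
    return len(a) <= len(b) and b[:len(a)] == a


def aggregated_utility(path, ru):
    """ru(p) as the sum of ru(p_i) over every recorded p_i subsumed by p."""
    return sum(u for q, u in ru.items() if subsumes(path, q))


def fragments_to_replicate(ru):
    """Keep the paths that no other recorded path subsumes (the largest subtrees)."""
    return sorted(p for p in ru if not any(q != p and subsumes(q, p) for q in ru))


# Hypothetical per-path utilities held in a RepRIX.
per_path_ru = {"/a/b": 0.2, "/a/b/c": 0.5, "/a/d": 0.1}
total_ab = aggregated_utility("/a/b", per_path_ru)
candidates = fragments_to_replicate(per_path_ru)
print(total_ab, candidates)
```

Here /a/b/c contributes its utility to /a/b and is not replicated on its own, which mirrors the /a/b/c and /a/b example in the text.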
In particular, during a lookup, an external link may be followed. A peer that uses an external link to forward a query may not be a neighbor of the peer that receives it. RepRI does not cover such cases, since RepRI maintains access statistics only for its neighbors. To handle such cases, whenever a peer receives a query from a direction outside its neighboring list, an extra counter is added to RepRIX. In this way, peers cache the location of peers holding data that a part of the network is interested in but cannot reach unless the external link is used. During the replication procedure, these extra counters are also taken into account for creating and propagating replicas via external links towards the direction in which they are considered popular.

Replacement Policy

Each time a peer decides to store a new replica, the following check takes place to avoid data redundancy. If the fragment to be stored contains a fragment that is already stored there, the latter is replaced by the former. We say that a fragment $f_1$ is contained in a fragment $f_2$ if the path expression representing $f_1$ is subsumed by the path expression representing $f_2$. For example, if a peer that already has a replica of a fragment /a/b/c decides to store a fragment /a/b, the replica /a/b/c is replaced.

Replacement also takes place when the space allocated to replicas gets full at some peer. In this case, the peer uses RepRIX to find the path with the smallest replication utility and evicts the corresponding fragment. In particular, the data corresponding to that fragment is discarded and replaced by an external link pointing to the document from which the replica originated. This is done to maintain consistency among external links and ensure the correctness of the search process. When a peer follows an external link during a lookup, it may reach a replica that does not contain the requested data, as a result of the replacement strategy. However, another external link will be present to direct the lookup towards a document that might contain it. In the worst case, the requested fragment may get evicted everywhere. In this case, the lookup operation will end when we follow all encountered external links, finally reaching the original document. Note that since replicas have different sizes, a peer may have to replace more than one fragment to make sufficient space for the new one. The fragments that are discarded are the ones with the lowest replication utilities.

4 Experimental Evaluation

In this section, we present an experimental evaluation of the various XML replication procedures.

Simulation Environment

All replication techniques are implemented as components of the Peersim simulator [10]. Each simulation starts with an initialization phase. During this phase, the topology of the network is formed and the documents are distributed randomly at the nodes of the network. The documents used for the experiments are generated using the ToXgene [14] XML generator. The queries are simple path expressions extracted from the data set. The search mechanism is random walks with TTL. The networks follow power-law topologies, with sizes ranging from 1 to 1 peers. In all experiments, we consider that the system poses a constraint on the number of replicas that can be stored at each peer.
Each peer is the owner of approximately one document and maintains replicas of 2-16 others, resulting in about 2-16 replicas per document on average. Table 1 summarizes our experimental parameters.

Table 1. Simulation Parameters

    Parameter            Default value        Range
    Network size
    Random walkers       16
    File distribution    Random
    Query distribution   Zipf (alpha = 1.)    alpha =
    TTL                  7
    Avg file size        4KB                  3.5KB-4.5KB
    Avg fragment size    1.3KB                .5KB-3KB
    Storage limit        16KB                 8KB-64KB

Experimental Results

We present two sets of experiments. In the first set, we consider proactive and path fragment replication, while in the second set, we compare whole document versus fragment replication for both of them. All reported values were averaged over multiple runs. The search sizes presented are those of successful queries. Note that, in the experiments, we use skeleton replication for both proactive and path replication. Skeleton replication achieves better results than subtree replication, since external links speed up the search process. Skeleton replication answers about 6% more queries than subtree replication and decreases the search size by around 15%. However, subtree replication creates messages that are smaller by as much as 25%.

Proactive Replication vs Path Replication. In all experiments, both techniques use all the space available for replica storage, creating approximately the same number of replicas. We first consider how the tuning parameter α of the replication utility affects search. In Figure 4, we report the percentage of successful searches for the 2% most popular and the 2% most unpopular fragments. As α increases, the number of replicas of popular fragments increases and thus the percentage of successful queries for popular fragments increases as well, while for unpopular fragments, it drops. Similarly, our results have shown that increasing the value of α reduces the search size for popular fragments (from 4 to around 2 hops) and increases the search size of unpopular ones (from 4 to around 6.5). Thus, α can be set appropriately to fine-tune the search efficiency of popular and unpopular items. For example, a small value of α (for instance α = 0.3) increases the number of unpopular items that can be located and reduces their search size. In the following, we assume α = 0.6 and report average values.

Fig. 4. Query hits for popular and unpopular fragments for different values of α (x-axis: tuning parameter; y-axis: successful queries (%); series: RepRIX popular, Path popular, RepRIX unpopular, Path unpopular).
Regarding scalability, we tune the period of replication for proactive replication so that the number of replication messages is about the same as in the case of passive (path) replication. Figures 5(left) and 5(right) show that RepRIX replication outperforms path replication with respect to both the average search size and the percentage of successful queries.

Fragment vs Whole Document Replication. Figures 6(left) and 6(right) present a comparison of fragment replication and whole document XML replication for both path and RepRIX replication. As shown, in both cases, fragment replication manages the available storage much more efficiently, storing only the most popular fragments, which results in both smaller search sizes and more successful queries.

In Figure 7(left), we show how the replication methods adapt when the query distribution changes. At the middle of the simulation (50% of time), we shift the query distribution, thus changing the popularity of fragments. As shown, RepRIX fragment replication quickly returns to its expected percentage of successful queries. Note that the rate of adaptation of RepRIX depends on the period of the replication protocol. In these experiments, we set the period such that the amount of activity (message cost) is approximately the same as that of path replication.

In Figure 7(right), we show how the size of the popular fragments affects the performance of fragment replication. When most queries refer to relatively small fragments, both RepRIX and path fragment replication methods show better performance, since more fragments are replicated. The rate of successful queries increases as well. Finally, as expected, all types of replication improve performance in the case of node failures. Our experiments showed that this improvement is comparable.

Fig. 5. (left) Average search size for different network sizes (right) Query hits for different network sizes (x-axis: number of nodes; series: RepRIX, Path replication).

Fig. 6. (left) Average search size for fragment and whole document replication (right) Query hits for fragment and whole document replication (x-axis: number of nodes; series: RepRIX frag, Path frag, RepRIX whole, Path whole).

Fig. 7. (left) Query hits after changing popularity of fragments (x-axis: simulation progress (%)) (right) Average search size for different sizes of popular fragments (x-axis: size of popular fragments; series in both panels: RepRIX frag, Path frag, RepRIX whole, Path whole).

5 Related Work

Replication in unstructured p2p systems has received considerable attention. The authors of [9] consider the general problem of determining the optimal number of replicas in unstructured p2p systems given that the total amount of storage in the network is fixed. They conclude that square-root replication, which allocates replicas to items proportionally to the square root of the item's query rate, produces the optimal number of copies in terms of optimizing the average search time. It was shown that path replication achieves almost square-root replication under certain conditions [6, 9]. In this paper, we compare path replication with a proactive scheme for replica allocation that allows for tuning the search cost of popular and unpopular items. We also extend this work for items with structure, such as XML documents.

Beehive [11] describes a proactive replication protocol for structured overlays that achieves O(1) lookup. The basic difference with our approach is that in structured overlays, items are placed at specific peers and thus the search cost can be reduced by carefully placing replicas of an item along its search path. However, in unstructured overlays, searches may follow diverse paths and we deploy routing indexes to estimate them.

There is some work on replicating and fragmenting XML documents. In [1], the authors consider dynamic XML documents, which are XML documents that contain both materialized and intensional XML data that can be produced by web service calls. Since dynamic documents may contain calls to services on other peers, some form of distribution is inherently part of the model. The focus there is on making decisions on whether to materialize the result of a call or not.
The authors of [5] deal with the problem of supporting parallel query processing when parts of an XML document are located at different peers. They focus on boolean XPath queries. The key idea is to send the whole query to each peer, which partially evaluates it and sends the results as compact boolean functions to a coordinator, which combines them to obtain the result. The work reported in [3] addresses the XML allocation problem, that is, given a collection of XML documents, how to fragment them and allocate the resulting fragments among the peers of a distributed system to improve performance. The problem we address here is different in that we do not consider partition and allocation but replication, and also in that peers have no global knowledge about the location of other fragments. A similar XML fragmentation problem is addressed in [4], where an approach for vectorizing and querying large XML repositories is presented. The idea is based on the decomposition of an XML document into a set of vectors that contain the data values and a compressed skeleton that describes the structure. This is similar to our approach of representing replicas. Again, in this work, we address a different problem, that of dynamically creating replicated fragments.

There has also been some work on processing XML documents in both structured (e.g., [2, 7, 13]) and unstructured p2p systems (e.g., [8]). This research is complementary to ours, since in this paper, we address replication.

6 Summary

Replication is commonly used in peer-to-peer systems to improve response time of queries. In this paper, we focus on replicating XML documents. A distinctive feature of XML replication is determining an appropriate granularity that would allow different levels of replication for fragments of the same XML document. We have presented both a proactive and a passive replication protocol to achieve such replication at the fragment level. Our proactive replication protocol is based on replication routing indexes, which are path-based indexes used to determine the appropriate peers towards which to create copies. Our experimental results show that our fragment-based replication model outperforms whole document replication, while proactive replication works better than passive replication at the cost of maintaining access statistics.

References

1. Abiteboul S., Bonifati A., Cobena G., Manolescu I., Milo T.: Dynamic XML Documents with Distribution and Replication. In SIGMOD.
2. Bonifati A., Matrangolo U., Cuzzocrea A., Jain M.: XPath Lookup Queries in P2P Networks. In WIDM.
3. Bremer J. M., Gertz M.: On Distributing XML Repositories. In WebDB.
4. Buneman P., Choi B., Fan W., Hutchison R., Mann R., Viglas S. D.: Vectorizing and Querying Large XML Repositories. In ICDE.
5. Buneman P., Cong G., Fan W., Kementsietsidis A.: Using Partial Evaluation in Distributed Query Evaluation. In VLDB.
6. Cohen E., Shenker S.: Replication Strategies in Unstructured Peer-to-Peer Networks. In SIGCOMM.
7. Galanis L., Wang Y., Jeffery S., DeWitt D.: Locating Data Sources in Large Distributed Systems. In VLDB.
8. Koloniari G., Pitoura E.: Content-Based Routing of Path Queries in Peer-to-Peer Systems. In EDBT.
9. Lv Q., Cao P., Cohen E., Li K., Shenker S.: Search and Replication in Unstructured Peer-to-Peer Networks. In ICS.
10. Peersim website.
11. Ramasubramanian V., Sirer E. G.: Beehive: O(1) Lookup Performance for Power-Law Query Distributions in Peer-to-Peer Overlays. In NSDI.
12. Schwentick T.: XPath Query Containment. In Sigmod Record, 331(1).
13. Skobeltsyn G., Hauswirth M., Aberer K.: Efficient Processing of XPath Queries with Structured Overlay Networks. In ODBASE.
14. ToXgene website.
