
Broccolo, Daniele, Macdonald, Craig, Orlando, Salvatore, Ounis, Iadh, Perego, Raffaele, Silvestri, Fabrizio, and Tonellotto, Nicola (2013) Load-sensitive selective pruning for distributed search. In: CIKM '13: 22nd ACM International Conference on Information and Knowledge Management, 27 Oct - 1 Nov 2013, San Francisco, CA, USA. Copyright 2013 ACM.

A copy can be downloaded for personal non-commercial research or study, without prior permission or charge. Content must not be changed in any way or reproduced in any format or medium without the formal permission of the copyright holder(s). When referring to this work, full bibliographic details must be given.

Deposited on: 13 May 2014

Enlighten - Research publications by members of the University of Glasgow

Load-Sensitive Selective Pruning for Distributed Search

Daniele Broccolo 1,3, Craig Macdonald 2, Salvatore Orlando 1,3, Iadh Ounis 2, Raffaele Perego 1, Fabrizio Silvestri 1,4, Nicola Tonellotto 1
1 National Research Council of Italy, 2 University of Glasgow, 3 Ca' Foscari University of Venice, 4 Yahoo! Research, Barcelona, Spain
{firstname.lastname}@isti.cnr.it 1, {craig.macdonald, iadh.ounis}@glasgow.ac.uk 2, silvestr@yahoo-inc.com 4

ABSTRACT

A search engine infrastructure must be able to provide the same quality of service to all queries received during a day. During normal operating conditions, the demand for resources is considerably lower than under peak conditions, yet an oversized infrastructure would result in an unnecessary waste of computing power. A possible solution adopted in this situation might consist of defining a maximum threshold processing time for each query, and dropping queries for which this threshold elapses, leading to disappointed users. In this paper, we propose and evaluate a different approach, where, given a set of query processing strategies with differing efficiency, each query is considered by a framework that sets a maximum query processing time and selects which processing strategy is the best for that query, such that the processing time for all queries is kept below the threshold. The processing time estimates used by the scheduler are learned from past queries. We experimentally validate our approach on 10,000 queries from a standard TREC dataset with over 50 million documents, and we compare it with several baselines. These experiments encompass testing the system under different query loads and different maximum tolerated query response times. Our results show that, at the cost of a marginal loss in terms of response quality, our search system is able to answer 90% of queries within half a second during times of high query volume.
Categories and Subject Descriptors: H.3.3 [Information Storage & Retrieval]: Information Search & Retrieval

Keywords: Query Efficiency Prediction, Scheduling

1. INTRODUCTION

Commercial Web search engines are expected to process user queries under tight response time constraints while being able to operate under heavy query traffic loads. Queries that cannot be processed within their time constraint experience degraded result quality [5]. Operating under these conditions requires building a very large infrastructure involving thousands of computers and making continuous investments to maintain this infrastructure [3]. Hence, optimising the efficiency of Web search engines is important to reduce their infrastructure costs. The user query volume typically received by a Web search engine is illustrated in Figure 1, showing how the rate of queries received can vary through the course of a day.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. CIKM '13, Oct. 27 - Nov. 1, 2013, San Francisco, CA, USA. Copyright 2013 ACM.

Figure 1: The query distribution over a 24-hour time span covering 1st May 2006, from the MSN query log (hour of the day vs. query traffic rate, in queries/sec, with the peak query load experienced by the search engine marked).
In order to guarantee that each query is processed with sub-second response times, the computing/communication infrastructure has to support the worst-case query volume, which reaches its maximum during the day time (about 13 queries per second from 10:00 to 14:00 in the workload depicted in Figure 1), typically around midday. Hence, the typical approach taken by Web search engines is to deploy a distributed search architecture [8]. In this architecture, the servicing of a search query uses many query servers, each in charge of a partition of the global index. When a query reaches one of these servers, it is processed immediately if the server is idle, otherwise it is placed in a queue waiting for processing. Hence, the completion time of a query at each server includes both a waiting time and a processing time, the latter in turn depending on the processing strategy. We argue that, in order to realise a Web search engine that can answer a query within a given deadline, there are three options for how the system can respond in the presence of a high query volume. Firstly, if the system is not able to reduce the processing time of each query and the scheduling of the queries cannot be modified, queries with long processing times might simply be dropped, resulting in error pages being returned to users, or interrupted, with a partial result list being presented. While this technique does guarantee high-quality results for a (possibly small) subset of queries, it is unfortunately the only possible choice when the query volume cannot be sustained by a given search engine infrastructure. The second option, discussed in recent works [1, 14], is based on query efficiency predictions, along with suitable scheduling algorithms, to re-order the

query queue. The aim of this technique is to reduce the overall queueing time of queries, thus increasing the query throughput. A third, alternative option, proposed in this paper, is to dynamically select a suitable query processing (retrieval) strategy to process the queries so as to satisfy the per-query deadlines, thus reducing the query processing time. In particular, when deciding between processing strategies, our proposed approach considers the time necessary to satisfy the per-query deadlines for all queries queued after the current query. In this way, it can allocate the available resources fairly across the waiting queries. On the other hand, a strategy that reduces the processing time of a query has an obvious drawback: we need to exploit approximate processing strategies, such as dynamic pruning [4, 16, 2], that can reduce the quality of the query results. However, pruning can be applied selectively, on a per-query basis [19], depending on the expected processing time of the query and the status of the search engine. In this work, we go further, as our novel scheduling methodology can selectively adopt dynamic pruning processing strategies only when the system is experiencing high workloads, thereby trading off some effectiveness to ensure efficiency. In summary, this paper argues that the effectiveness of search results can be maintained whilst meeting completion time constraints by choosing an appropriate pruning strategy for each query to be processed by a given server. In particular, our method can examine not just the current query, but also the other queries queued for processing.
Hence, the contributions of this paper are two-fold: We propose a load-sensitive selective pruning framework for bounding the permitted processing time of a query, which considers goals such as meeting a time threshold, effectiveness, and fairness to other queries waiting to be processed. Moreover, to support our proposed framework, we propose an accurate approach for query efficiency prediction of term-at-a-time dynamic pruning strategies. Our experiments show that the proposed framework is able to produce results of quality comparable to that of a search system that does not bound query processing time, while at high query workloads the system can still respond to queries within a predetermined time threshold. For instance, when 4 queries per second are arriving at the search engine, our framework is able to answer 90% of queries within 0.5 seconds, with a 5% drop in results quality compared to an effective processing strategy for which only a small fraction of queries meet the time threshold. The remainder of this paper is structured as follows: In Section 2, we introduce the necessary preliminaries by discussing the context of our work; Section 3 discusses related work in efficient retrieval; In Section 4, we propose our framework for load-sensitive selective pruning; Section 5 discusses the processing strategies we deploy, and how their response times can be accurately predicted; Section 6 defines the experimental setup for the evaluation that follows in Section 7; We provide concluding remarks in Section 8.

2. PRELIMINARIES

Web search engines have to manage huge quantities of documents while achieving the goal of effectively answering users' queries, and doing so efficiently, i.e., within a fraction of a second. To achieve this multi-objective goal despite the large size of the Web, the corpus of documents the search engine must manage is partitioned into sub-collections that are each manageable by a single machine.
This results in several query servers engaged in answering a user's query, each of them storing an index shard [3] built on a subset of the corpus. Without loss of generality, in this work we assume a distributed search engine where data are distributed according to a document partitioning strategy [2]. The index is thus partitioned into shards, each one relative to a particular partition of the documents. To increase query throughput, each index shard is typically replicated several times, and a query received by the search front-end is routed to one of the available replicas. In this work, we assume a multi-node search engine without replicas, because our experimental results are independent of the number of replicas, and hence can be applied directly to each replica independently [5]. Figure 2 depicts our reference architecture for a single replica.

Figure 2: Our reference architecture of a distributed search engine node (based on [5]).

New queries arrive at a front-end machine called the query broker, which broadcasts the query to the query servers of all shards, before collecting and merging the final result sets for presentation to the user. When a query reaches a query server, it is processed immediately if the server is idle. Each query server comprises a query processor, which is responsible for tokenising the query and ranking the documents of its index shard according to a scoring function (in our case, the BM25 scoring function [18]). Strategies such as dynamic pruning [4, 16, 2] can be used to process queries in an efficient manner on each query server. In this work, we consider document-sorted indices, as used by at least one major search engine [8]. Other efficient retrieval techniques such as frequency-sorted [2] or impact-sorted indices [1] are possible, which also support our objective of early termination of long running queries.
However, there is no evidence of such index layouts in common use within commercial search engines [15], perhaps, as suggested by Lester et al. [12], due to their practical disadvantages, such as the difficulty of use for Boolean and phrasal queries. As such, in this work, we focus on the realistic scenario of standard document-sorted index layouts. Finally, we use disjunctive semantics for queries, as supported by Craswell et al. [7], who highlighted that disjunctive semantics does not produce significantly different high-precision effectiveness compared to conjunctive retrieval. If the query server is already busy processing another query, each newly arrived query is placed in a queue, waiting to be selected by a query scheduler for processing. Hence, the time that a query spends at a query server, i.e. its completion time, can be split into two components: a waiting time, spent in the queue, and a processing time, spent being processed. While the latter depends on the particular retrieval strategy (which we call the processing strategy) and the shard's characteristics, the former depends on the specific scheduling algorithm implemented to manage the queue and on the number of queries in the queue itself.
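This decomposition of completion time into waiting plus processing time can be illustrated with a small FIFO simulation. The following is our own minimal sketch (the function name and toy numbers are illustrative, not taken from the paper):

```python
def fifo_completion_times(arrivals, service_times):
    """Simulate a single FIFO query server: each query's completion time
    splits into a waiting time (spent in the queue) plus a processing
    time (spent being served). `arrivals` and `service_times` are
    parallel lists, with arrivals sorted in non-decreasing order."""
    server_free_at = 0.0
    out = []
    for t_arr, t_proc in zip(arrivals, service_times):
        start = max(t_arr, server_free_at)   # wait while the server is busy
        waiting = start - t_arr
        server_free_at = start + t_proc
        out.append({"waiting": waiting,
                    "processing": t_proc,
                    "completion": waiting + t_proc})
    return out
```

For three queries arriving 0.1 s apart, each needing 0.3 s of processing, the waiting component grows with each query even though the processing component is constant, which is exactly why a scheduler must reason about the whole queue.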

Indeed, it has been observed that a query scheduler can make some gains in overall efficiency by re-ordering queries, thereby delaying the execution of expensive queries [14]. However, this approach only considers the cost of executing single queries, and hence cannot respond to surges in query traffic. Instead, in this work, we take a different approach, arguing that the time available to execute a query on a query server whilst meeting the completion time constraints is influenced by the other queries queued on that server. Hence, in this paper, we estimate the target completion time for a query on a server based on the predicted queueing and completion times of the queries scheduled after it in the queue. The utility of query scheduling is particularly evident when queries arrive at a higher rate than the maximum sustainable peak load of the system [11]. Indeed, in our proposed framework, we set the maximum query processing time to a carefully chosen value (see Section 4), such that the system load is kept under control, thereby enabling an optimal management of the peak load at the cost of slightly reduced results quality (see Section 7.2). Our proposed framework exploits novel machine learning models for estimating processing time under different processing strategies.

3. RELATED WORK

Having defined the architecture context of our work, in this section we discuss related work on which various components of our architecture rely, namely dynamic pruning (Section 3.1), query efficiency prediction (Section 3.2) and selective pruning (Section 3.3).

3.1 Dynamic Pruning

The strategies to match documents to a query fall into two categories [16]: in a term-at-a-time (TAAT) strategy, the posting lists of the query terms are processed and scored in sequence, while, in a document-at-a-time (DAAT) strategy, the query terms' posting lists are processed in parallel.
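The two matching strategies can be made concrete with a toy in-memory index. The sketch below is our own illustration (exhaustive scoring, no pruning, made-up postings); both traversals produce identical scores, differing only in the order in which postings are visited:

```python
from collections import defaultdict

# Toy postings: term -> list of (docid, term_score) pairs, docid-sorted.
POSTINGS = {
    "web":    [(1, 0.5), (3, 0.2), (7, 0.9)],
    "search": [(1, 0.4), (2, 0.8), (7, 0.1)],
}

def taat(query, postings):
    """Term-at-a-time: score one whole posting list at a time,
    summing partial scores into per-document accumulators."""
    acc = defaultdict(float)
    for term in query:
        for doc, s in postings.get(term, []):
            acc[doc] += s
    return dict(acc)

def daat(query, postings):
    """Document-at-a-time: advance all posting lists in parallel,
    fully scoring one document before moving to the next."""
    lists = [postings.get(t, []) for t in query]
    pos = [0] * len(lists)
    scores = {}
    while True:
        # The next document to score is the smallest current docid.
        heads = [pl[p][0] for pl, p in zip(lists, pos) if p < len(pl)]
        if not heads:
            return scores
        doc = min(heads)
        total = 0.0
        for i, pl in enumerate(lists):
            if pos[i] < len(pl) and pl[pos[i]][0] == doc:
                total += pl[pos[i]][1]
                pos[i] += 1
        scores[doc] = total
```

Note that DAAT finishes each document's score before moving on (enabling early termination per document), whereas TAAT must hold accumulators for all candidate documents until every list has been processed.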
To attain the typical sub-second response times of Web search engines, various techniques to enhance retrieval efficiency have been proposed (e.g. [4, 16, 2]). In particular, dynamic pruning aims to eliminate the scoring of documents that will not be present in the final list of top results. Most DAAT dynamic pruning strategies [4, 2] exhibit efficiency improvements without negatively impacting effectiveness, but some TAAT dynamic pruning techniques [12, 16], while they enhance efficiency, negatively impact retrieval effectiveness, because some relevant documents can be pruned. In this work, we consider the Continue TAAT dynamic pruning strategy [16], which we denote TAAT-CS. Our choice of the TAAT-CS strategy is motivated by the fact that its overall efficiency is directly proportional to the number of accumulators created in the first phase [16]. Indeed, fine tuning the number of accumulators gives us the flexibility to directly control the efficiency of the pruning strategy.

3.2 Query Efficiency Prediction

The query scheduler component must select the next query to be processed from the queue of waiting queries. To achieve this, it is fundamental to know in advance an estimate of the processing time of the query to be scheduled. Indeed, efficiency predictions estimate the response time of a search engine for a query [14]. Moffat et al. [15] stated that the response time of a query is related to the posting list lengths of its constituent query terms. However, in dynamic pruning strategies (e.g. Wand [4]), the response time of a query is more variable, as not every posting is scored, and many postings can be skipped [16], resulting in reduced retrieval time. As a result, for Wand, the length of the posting lists is insufficient to accurately predict the response time of a query [14]. Query efficiency predictors [14] have been proposed to address the problem of predicting the response time of Wand for an unseen query.
In particular, various term-level statistics are computed for each term offline. When a new query arrives, the term-level features are aggregated into query-level statistics, which are used as input to a learned regression model. In this work, arising from our focus on the TAAT-CS pruning strategy, we propose query efficiency predictions for TAAT-CS, by describing a set of features that can easily be used to estimate the efficiency of a query through a learned approach. These predictions represent our estimates of the query processing time, which we exploit to determine a maximum amount of processing time to allocate to each query.

3.3 Selective Pruning

Dynamic pruning strategies, such as Wand and TAAT-CS, can all be configured to be more aggressive. In doing so, the strategy becomes more efficient, but at a possible loss of effectiveness [4]. For instance, reducing the maximum number of accumulators in the TAAT-CS strategy results in fewer documents being examined before the second stage of the algorithm commences, when no new accumulators can be added. Hence, reducing the number of accumulators increases efficiency, but can result in relevant documents not being identified within the set of accumulators, thereby hindering effectiveness [16]. Typically, the aggressiveness is selected a priori, before any retrieval, independent of the query to be processed and its characteristics. However, in [19], Tonellotto et al. showed how the Wand pruning strategy could be configured to prune more or less aggressively, on a per-query basis, depending on the expected duration of the query. They call this approach selective pruning. Our work makes an important improvement to selective pruning compared to [19], by observing that the appropriate aggressiveness for a query should be determined not just by considering the current query.
Instead, our proposed load-sensitive selective pruning framework also accounts for the other queries waiting to be processed, and their predicted response times, together with their positions in the waiting queue. These are used to select the dynamic pruning aggressiveness, in order to process each query within a fixed time threshold, when possible, or to process it more efficiently when the time constraint cannot be respected.

4. LOAD-DRIVEN SELECTIVE PRUNING

One of the problems that must be addressed to build a large-scale Web search engine is how to provide the service when the received query volume is excessively high. In particular, when the entire system is overloaded, the response time of the queries increases, making it necessary to answer queries more rapidly. A common strategy is to drop queries that have been waiting or executing for a long time, returning an empty results list; alternatively, it is possible to set a time threshold and interrupt the retrieval whenever a query is going to take too much time. Both strategies are suboptimal and have the huge drawback of disappointing the users who submitted the queries that are dropped.

Figure 3: The components of the proposed load-sensitive selective pruning framework (bottom), along with a representation of the variables depicting the queries currently queued (top).

Typically, in search systems, critical situations arise when bursts of queries are submitted (almost) at the same time. See, for instance, the peak load around 12 PM in the query workload plotted in Figure 1. In this section, we discuss a novel load-sensitive framework, based on query efficiency predictors and taking into account other features such as the length of the list of queries waiting to be processed and the duration each query has been queued for. We aim to dynamically adapt the retrieval strategy, by reducing the processing time of queries when the system is heavily loaded. Indeed, during high query load, we propose to adopt aggressive pruning strategies, thus speeding up query processing, while possibly impacting negatively on the effectiveness of the returned results. Let us consider the search engine state depicted in Figure 3, which shows the system at time t. There are n queries q_1, ..., q_n waiting to be processed in the scheduler's queue. Let t_i be the arrival time of query q_i, where t_i ≤ t_j whenever i < j, i.e., t_1 ≤ ... ≤ t_n ≤ t. Query q_1 is the head of the queue, as it has been queued for the longest time. Until time t, the query processor was busy processing the previous queries (not shown in the figure), and at time t it becomes idle. Then, the query scheduler must select the next query to be processed. We assume that scheduling follows a first-in first-out discipline, that is, query q_1, which has been queued for the longest time, is selected for processing next. Furthermore, each query can be processed by several processing strategies σ_1, ..., σ_p, such as TAAT or DAAT with different levels of dynamic pruning aggressiveness.
We assume that strategy σ_1 is the search engine's full processing strategy, such as TAAT or DAAT, while subsequent strategies are increasingly more efficient, such that σ_p is the most efficient processing strategy. Moreover, we assume that, while σ_{k+1} is more efficient than σ_k, the effectiveness of σ_k is, in general, better than the effectiveness of σ_{k+1}. This assumption is well-founded, because efficient processing strategies typically have a negative impact on the corresponding retrieval effectiveness [13, 19, 21]. For query q_1, we associate with each strategy σ_k the processing time e_k(q_1), which the strategy is predicted to take to process query q_1. This means that, for example, e_1(q_1) represents the predicted processing time of query q_1 when the least efficient (but most effective) processing strategy is adopted, while e_p(q_1) represents the predicted processing time of the most efficient yet least effective strategy. A constant time threshold T represents the maximum time budget for the processing of any query: the completion time of any query must not be greater than T, such that its results can be presented to the user in a timely manner. This means that the time elapsed between the arrival of any query and the end of its processing must not exceed T. Note that, since the query has already spent some time in the queue, its available processing time, i.e., the maximum time it is allowed to spend in processing, is not, in general, equal to T, but is decreased by the time it has spent in the queue. Moreover, if there are other subsequent queries queued, it can be considered unfair for the query to take all the available time, while other queries are starved. Hence, we argue that the available processing time for each query is bounded by some time budget depending on various factors, such as the time the query has spent in the queue, and the number of queued queries. The definition of a suitable time budget is central to this paper.
Let f(q_i) be this time budget for query q_i, which has to ensure fairness in query processing: whenever the query workload is close to the maximum allowed, enqueued queries should be assigned reduced time budgets for their processing. Once f(q_i) has been computed, we have to select the processing strategies able to process the query within the time budget, i.e., any strategy σ_k(q_i) such that e_k(q_i) ≤ f(q_i). Finally, among all these strategies, we select the best strategy in terms of effectiveness, i.e., according to our assumptions, the strategy that takes the largest processing time among all admissible strategies. The definition of a suitable time budget function f(q_i) depends on various aspects: the position of the query in the queue, its arrival time, the current time, and the status of the queue. The outline of the proposed selective pruning framework is shown in Algorithm 1. For the queue of queries awaiting processing, q_1, ..., q_n, the expected processing times under all possible processing strategies are estimated. This allows the time budget f(q_1) to be calculated for the next query to be processed. Thereafter, we choose an appropriate query processing strategy, which aims to ensure that the query meets its completion time threshold T, while providing results that are as effective as possible.
Algorithm 1 Load-Sensitive Selective Pruning Framework
Input: The queries q_1, ..., q_n; the completion time threshold T
Output: The selected processing strategy σ for query q_1
1: for all processing strategies σ_k, k = 1, ..., p
2:     for all enqueued queries q_i, i = 1, ..., n
3:         expected processing time e_k(q_i) ← Predict(σ_k, q_i)
4: time budget f(q_1) ← Bound(T, σ_1(q_1), ..., σ_p(q_n))
5: processing strategy σ ← Select(f(q_1), e_1(q_1), ..., e_p(q_1))

In order to select the processing strategy σ, we must implement the following functions within our framework:

Predict(): Defines a mechanism for predicting the processing time of each query in the queue under each of the available dynamic pruning strategies. This mechanism is used to estimate the processing times e_k(q_i) of the available processing strategies, and to identify the pruning strategy that will most likely process the query within the desired time threshold T.

Bound(): Defines a method to compute the time budget f(q_1) for query q_1, depending on the global time threshold T and on the queries waiting to be processed. The time budget defines a bound on the processing time that query q_1 will be permitted.

Select(): Defines a mechanism to select the best processing strategy that is able to process query q_1 within the maximum processing time f(q_1) that q_1 is allowed to take, and that maximises the resulting query effectiveness.

Similar to previous work on selective pruning [19], the processing times of a query can be estimated through the use of query efficiency prediction [14], i.e. Predict(). However, as no such predictors have previously been defined for TAAT strategies such as TAAT-CS, in Section 5 we address query efficiency prediction for TAAT. In the remainder of this section, we propose mechanisms for Bound() (Section 4.1) and Select() (Section 4.2).

4.1 Bound()

We assume a list of queries q_1, q_2, ..., q_n that are currently (at time t) in the queue of the system. Each query q_i is associated with its arrival time t_i. Roughly speaking, the query processing time bound f(q_1) has the following goals:

1. Efficiency: q_1 (the least recently queued query) will have a completion time not greater than T, the global time threshold.
2. Effectiveness: The time available to process q_1 will be as large as possible, such that the most effective processing strategy can be deployed.
3. Fairness: Queries q_2, ..., q_n, received after q_1, are not starved of processing time, and hence are each able to meet T.

Clearly, these three goals can be at odds with each other. In the following, we describe four methods of defining f(q_1) that address some or all of the goals to varying extents.

Most effective. Query q_1 is processed as effectively as possible, i.e. using the least efficient processing strategy:

f(q_1) = max_k {e_k(q_1)} = e_1(q_1).

This method ignores the waiting time spent in the queue, and makes no attempt to prune queries aggressively such that the threshold T can be met, by this query or the other queries in the queue. In other words, it is a method that is neither fair nor efficient. For this reason, we use it as a baseline with maximal effectiveness.

Most efficient. Query q_1 is processed as fast as possible, by using the most efficient, aggressive pruning strategy for all queries:

f(q_1) = min_k {e_k(q_1)} = e_p(q_1).

In this method, we again ignore the waiting time that query q_1 has spent in the queue. Similarly to the previous method, it serves as a baseline that does not explicitly consider the fairness or effectiveness goals. However, in contrast, it consumes the least computing resources, and hence is the fairest method, even if the other queries do not exploit the unused resources.

Selfish. The query q_1, enqueued at time t_1, should be processed by time t_1 + T. Hence, at time t, the amount of remaining time Δ_1 to process the query such that threshold T is met has decreased by t − t_1 seconds, i.e.:

Δ_1 = (t_1 + T) − t

If Δ_1 > 0, the processing time bound is f(q_1) = Δ_1, and depends only on the time q_1 has spent in the queue, without consideration for the processing time needed by the other queued queries. If, instead, the time threshold T for this query has elapsed (Δ_1 ≤ 0), the query is processed as fast as possible, as in the most efficient case:

f(q_1) = Δ_1 if Δ_1 > 0, e_p(q_1) otherwise.

Altruistic. The previous method has the disadvantage that q_1's processing is bound by the maximum amount of time available (given the time spent in the queue), disregarding the queries that are still in the queue. This can penalise the queued queries q_2, ..., q_n that have not yet been processed. In contrast, Altruistic enforces fairness, by firstly computing how much time is left to empty the current queue. This is simply the time at which the last queued query q_n should be completed (t_n + T) minus the current time. Formally, Δ_n, the remaining time to finish processing up to query n, is:

Δ_n = (t_n + T) − t

Then, to compute the maximum time available for q_1, we have to subtract the minimum time necessary to process all the queued queries. This time is simply given by the sum of the estimates e_p(q_i) of the processing times needed by the fastest processing strategy σ_p. Hence, we define the available slack time S_n as:

S_n = Δ_n − Σ_{i=1}^{n} e_p(q_i).

If S_n > 0, we evenly distribute this extra slack time across the queued queries. In doing so, if some time is left beyond the minimum needed to process all enqueued queries, each one might receive a fair amount of extra processing time.¹ Hence, the processing bound for query q_1 becomes e_p(q_1) + S_n/n. However, this quantity can exceed Δ_1, which would result in too much extra budget being assigned to query q_1, beyond the time threshold T. In this case, the processing bound for query q_1 is simply Δ_1. Finally, if S_n ≤ 0, we process the query as fast as possible, as in the most efficient case, i.e.:

f(q_1) = min{Δ_1, e_p(q_1) + S_n/n} if S_n > 0, e_p(q_1) otherwise.

The Altruistic method of computing Bound() is a central contribution of our paper. Once the time budget f(q_1) has been computed, it is used by the query processor to select the most suitable processing strategy among those available to process the query. In the following, we describe Select(), the function used to take these decisions.

4.2 Select()

Given the time budget f(q_1) granted by Bound(), the role of the Select() function is to choose the most effective strategy σ = σ_k ∈ {σ_1, ..., σ_p} to resolve query q_1 within the assigned budget f(q_1). Primarily, the selection of an appropriate processing strategy is based on the estimated query processing times e_1(q_1), ..., e_p(q_1). Assuming the estimates are sorted in descending order of expected processing time, i.e., e_1(q_1) ≥ ... ≥ e_p(q_1), we identify the strategy σ_k where k, 1 ≤ k ≤ p, is the smallest index such that e_k(q_1) ≤ f(q_1). In other words, we select σ_k as the best strategy in terms of effectiveness whose expected completion time is not greater than the budget the query has been granted by Bound().

¹ This is true as long as no additional queries are received.
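The Selfish and Altruistic bounds, together with the Select() rule, can be sketched in a few lines. This is our own minimal sketch (the function names and the toy values in the test are illustrative; in the framework, the predicted times e_k come from the efficiency predictors discussed in Section 5):

```python
def bound_selfish(t_now, t1, T, e_fastest_q1):
    """Selfish: q1 may use all the time left before its own deadline t1 + T;
    if the deadline has already passed, fall back to the fastest strategy."""
    delta1 = (t1 + T) - t_now            # remaining time before q1's deadline
    return delta1 if delta1 > 0 else e_fastest_q1

def bound_altruistic(t_now, arrivals, T, e_fastest):
    """Altruistic: share the slack left after reserving the fastest
    strategy's predicted time for every queued query. arrivals[i] and
    e_fastest[i] refer to queued query q_{i+1}; q1 is at index 0."""
    n = len(arrivals)
    delta1 = (arrivals[0] + T) - t_now   # q1's remaining time
    delta_n = (arrivals[-1] + T) - t_now # time left to drain the whole queue
    slack = delta_n - sum(e_fastest)     # time beyond the bare minimum
    if slack > 0:
        # Evenly distributed slack, capped so q1 never exceeds its deadline.
        return min(delta1, e_fastest[0] + slack / n)
    return e_fastest[0]

def select(budget, est_times):
    """Pick the most effective strategy that fits the budget. est_times is
    sorted from slowest/most effective (sigma_1) to fastest (sigma_p);
    if no strategy fits, fall back to the most aggressive one."""
    for k, e in enumerate(est_times):
        if e <= budget:
            return k
    return len(est_times) - 1
```

For example, with three queued queries that arrived at 0.6, 0.8 and 0.9 s, a current time of 1.0 s, T = 0.5 s and a fastest-strategy estimate of 0.05 s per query, the altruistic budget for q_1 is capped by its own remaining deadline (0.1 s), and Select() picks the fastest strategy whose estimate fits that budget.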

Note that, in the case that no strategy is able to process query q_1 within the computed time budget, we always select the most aggressive processing strategy, i.e., σ_p. As a remark, when the aggressive baseline and Perfectionist methods are used, Select() always picks CS-1 (i.e., σ_p) and DAAT (i.e., σ_1), respectively. The descriptions of both Bound() and Select() have been given using the informal, and implicit, concept of an efficiency predictor. In the next section, we detail more precisely how, inspired by the work in [19], we predict the efficiency of a TAAT-CS strategy before processing commences.

5. PRUNING STRATEGIES & PREDICTORS

The framework described in the previous section relies on the concept of query efficiency predictors. In our definition, given a query and a set of query processing strategies, efficiency predictors return the estimated query processing time for each of the strategies considered. The load-sensitive selective pruning framework proposed in Section 4 is general with respect to the deployed retrieval strategy. However, in this work we focus on two particular strategies, namely DAAT and TAAT-CS. In particular, we adopt document-at-a-time (DAAT) for full processing. Full processing is chosen when, under normal load conditions, processing time is not constrained. On the other hand, when the system is experiencing a high workload, we resort to faster and less precise processing strategies, specifically based on the term-at-a-time Continue strategy (TAAT-CS) [16]. In the remainder of this section, we define the details of TAAT-CS (Section 5.1), before explaining how the processing time of both DAAT and TAAT-CS can be accurately predicted (Section 5.2).

5.1 TAAT-CS Dynamic Pruning

As defined in [16], TAAT-CS works as follows. Given a set of terms to process, sorted in decreasing order of posting list length, an OR phase processes the posting lists one by one until we have K accumulators.
From this point, no new accumulators are created, and an AND phase processes the remaining posting lists by intersecting them with the existing accumulators. The efficiency of the AND phase can benefit from skip pointers [16] within the posting lists, such that the postings of documents that are not in the top K accumulators are not decompressed, leading to I/O benefits. Therefore, smaller values of K correspond to more aggressive pruning, as the AND phase starts earlier and more skipping can occur during this phase. However, smaller K values are likely to lead to result lists with degraded effectiveness. Our implementation of the TAAT-CS dynamic pruning strategy adopts a further heuristic to optimise the initial phase in which new accumulators are created. Given that DAAT processing is faster than TAAT processing [9], we alter the accumulator creation phase as follows. We select the shortest l posting lists, such that the sum of their lengths is greater than or equal to the number of accumulators K. The posting lists for this initial set of terms are processed using a DAAT strategy, instead of TAAT. In doing so, the resulting number of accumulators will never be greater than the number of accumulators we would get after processing the first list with a classic TAAT-CS strategy. After this modified OR phase, the processing strategy proceeds with the AND phase as in TAAT-CS.
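The modified OR/AND phases can be sketched as follows, representing each posting list as a docid-to-score map and ignoring compression and skip pointers (a simplification of ours, not the actual implementation):

```python
def taat_cs(postings, K):
    # Sketch of the modified TAAT-CS ("Continue") strategy described above.
    # OR phase: the shortest lists whose combined length reaches K are merged
    # (a stand-in for DAAT OR processing), creating the accumulator set.
    # AND phase: the remaining lists only update existing accumulators.
    by_len = sorted(postings, key=len)
    or_lists, seen = [], 0
    while by_len and seen < K:
        or_lists.append(by_len.pop(0))
        seen += len(or_lists[-1])
    acc = {}
    for plist in or_lists:                 # OR phase: new accumulators allowed
        for doc, s in plist.items():
            acc[doc] = acc.get(doc, 0.0) + s
    for plist in by_len:                   # AND phase: no new accumulators
        for doc, s in plist.items():
            if doc in acc:
                acc[doc] += s
    return acc
```

With a small K, most of each long posting list contributes nothing and can be skipped in a real implementation, which is where the efficiency gain comes from.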
Using our refined strategy, we may end up with fewer accumulators than using the traditional TAAT-CS. However, in our initial experiments, we found that this happens only for 0.1% of the 10,000 queries used in this paper. Yet, on average, the response time of our DAAT/TAAT-CS strategy exhibits a 2x improvement over the classical TAAT-CS strategy. The adoption of the DAAT/TAAT-CS strategy also motivates the comparison of our selective pruning strategies with DAAT, instead of TAAT. Indeed, in terms of efficiency, out-performing DAAT as a baseline is, in general, more difficult than out-performing TAAT [9]. In the following, we refer to our DAAT/TAAT-CS with K accumulators as CS-K (e.g. CS-1 uses K = 1 accumulators), without further mention of the use of DAAT for the initial phase. As a side note, we are not aware of any previous work studying this small variation on TAAT-CS. Therefore, to the best of our knowledge, this is another new contribution presented by this work.

Query Efficiency Prediction Features

Method independent:
- total number of postings in the query's term lists
- number of terms in the query
- variance of the lengths of the posting lists
- mean of the lengths of the posting lists
- length of the shortest posting list
- length of the longest posting list

Method dependent (for CS):
- number of terms processed in the first phase of CS
- length of the posting lists processed in the first phase of CS
- number of terms processed in the second phase of CS
- length of the posting lists processed in the second phase of CS

Table 1: Features used for predicting processing time: the top features are method independent, the bottom features are method dependent, for CS.

5.2 Query Efficiency Prediction

In the preceding section, we defined the processing strategies used within this paper. In this section, we describe how we obtain query efficiency predictions for these processing strategies. In particular, we are inspired by the query efficiency predictors for DAAT previously defined by Macdonald et al. [14].
However, in this work we also use TAAT-CS for aggressive pruning. Hence, in the following we devise a method for predicting the processing time of CS-K, before retrieval commences, using a linear regression-based technique. First of all, we define a set of features to represent each query. In the case of DAAT, Macdonald et al. [14] show that there is a strong correlation between the distribution of postings among the query terms and the response time of the query itself. Therefore, to predict the response time of DAAT we use the features listed in the top part of Table 1. On the other hand, as discussed above, TAAT-CS strategies do not score all postings in the posting lists of the query terms. Hence, we do not expect that relying only on posting features can lead to good predictions. Instead, given the characteristics of our TAAT-CS strategies (a first phase where we fully evaluate a subset of terms using DAAT, and a second phase where we use the remaining terms to update the accumulators found in the first phase), we build a regression model using the features listed in the bottom part of Table 1, in addition to the method-independent features listed in the top part. It is of note that all of these query efficiency prediction features can be calculated using commonly available statistics, particularly the lengths of the query terms' corresponding posting lists, before retrieval commences, and hence query efficiency predictions can be made with very low overheads, as soon as a query arrives at a query server. In total, our prediction method models the problem using a feature space made up of 10 distinct features. As our reference architecture is a distributed one, each query server might have different response times for the same query. For this reason, we need to build a different model for each server. We adopt a linear regression model to estimate the running time e_j(q_i) of query q_i when scored using method j. In other words, we model e_j(q_i) as a linear combination of the features f_i, each weighted by a real value λ_f. Features and weights differ for each scoring method, thus we write f_{ji} and λ_{jf} to refer to the values for scoring method j. Formally:

e_j(q_i) = λ_{j0} f_{j0} + ... + λ_{j9} f_{j9}

Linear regression is then used to find the values of the various λ_{jf} with the goal of minimising the least square error of the processing time on a training set of queries [14]. In the next section, we define the experimental setup for our experiments. In particular, our experiments demonstrate the accuracy of the proposed efficiency predictors for TAAT-CS, before showing how the selective scheduling framework proposed in Section 4 can increase the ability of a search engine to effectively and efficiently handle different query traffic loads.

6. EXPERIMENTAL SETUP

In the following experiments, we deploy a widely used document collection created as part of TREC, namely the ClueWeb09 (cat. B) collection, which comprises around 50 million English Web documents, and is designed to represent the first-tier index of a commercial Web search engine. We index the document collection using the Terrier search engine [17], removing standard stopwords and applying Porter's English stemmer. The resulting index is document-partitioned into ten separate index shards, while maintaining the original ordering of the collection.
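As a concrete illustration of the fitting step, the following sketch fits the single-feature baseline predictor (time as a linear function of the total number of postings) by closed-form least squares; the paper's actual models combine ten features per strategy and are trained separately per server, but the fitting principle is the same. All names here are our own:

```python
def fit_baseline(total_postings, times):
    # Least-squares fit of time ~ a + b * total_postings, mirroring the
    # single-feature baseline predictor (sum of postings) used in Section 7.1.
    n = len(times)
    mx = sum(total_postings) / n
    my = sum(times) / n
    sxx = sum((x - mx) ** 2 for x in total_postings)
    sxy = sum((x - mx) * (y - my) for x, y in zip(total_postings, times))
    b = sxy / sxx          # per-posting cost
    a = my - b * mx        # fixed per-query overhead
    return a, b

def predict_baseline(a, b, total_postings):
    # Estimated processing time for an unseen query, before retrieval starts.
    return a + b * total_postings
```

Extending this to the full ten-feature model amounts to solving the same least-squares problem over a feature matrix instead of a single column.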
Each inverted index shard, which is stored on disk, also has skipping information embedded, to permit skipping [16] during the Continue phase of TAAT-CS. For the retrieval experiments, we use a distributed C++ search engine, accessing the index produced by Terrier. Our experiments are conducted on a cluster of twelve quad-core machines, where each machine has one Intel Xeon 2.4GHz X3223 CPU and 8GB of RAM, connected using Gigabit Ethernet. Only a single core on each query server is used to serve queries². Two additional nodes are used as follows: one as the query broker, and one as the client application that sends the queries to the system. Finally, each query server has a queue used to hold queries coming from the broker, while the query processor on each query server processes queries one at a time. As query processing strategies, we use DAAT, as well as TAAT-CS with different numbers of accumulators, i.e. CS-1, CS-2, CS-5 and CS-1. Documents are scored using BM25, with parameters at their default settings [18]. We use queries from the TREC Million Query Track 2009 [6], which contains 40,000 queries, some of which have relevance assessments. In our experiments, 30,000 of these queries are used as the training set for learning the λ values in our regression models, while the other 10,000 are used for testing the accuracy of the predictors, and for the retrieval experiments. For measuring the accuracy of our query efficiency predictors, we use the root mean square error (RMSE), while for retrieval effectiveness, we compute NDCG@1 using the 687 queries out of the 10,000 that have relevance assessments from TREC 2009. Efficiency is measured using the mean response time computed over 5 runs for each test.

² While increasing the number of cores on each query server obviously increases throughput, we prefer to use a single-threaded environment to reduce any resource contention that may reduce the reliability of the experimental results.
In our experiments, we do not use query caching, in order to better analyse the impact of our models on the processing performance. Moreover, adding a cache in front of our architecture would only reduce the query arrival rate, not the efficiency and effectiveness of our method.

7. EXPERIMENTS

In the following, we address these research questions:

RQ1. What is the accuracy of the linear regression-based approach for query efficiency prediction for TAAT-CS? (Section 7.1)

RQ2. Do the proposed methods achieve effective and efficient retrieval under different query loads? (Section 7.2)

RQ3. To what extent can an efficient rate of queries per second be serviced for different time thresholds? (Section 7.3)

7.1 Predictor Error Evaluation

Efficiency predictors, which aim to predict the processing time of a query before retrieval commences, are an important component of our work. In this first research question, we aim to ensure that our estimates, particularly for the TAAT-CS pruning strategies, are accurate. We compare the accuracy of the features listed in Table 1 when combined using linear regression. In particular, we compare the set that only includes the six method-independent features with the set that includes, in addition to the previous six, the four method-dependent features proposed for TAAT-CS. Table 2 reports the accuracy of the linear regression models combining the six and ten features, as well as a baseline predictor that uses only the total number of postings for the query terms as a feature. In the table, we report the mean, over the ten query servers, of the query processing time (QPT) for each strategy, as well as the root mean square error (RMSE), and the percentage of queries for which the prediction error is less than 1 ms. The best value in each row for each measure is highlighted.
On analysing Table 2, we note that for DAAT, using the six features improves over the baseline single-feature predictor by 42% in terms of RMSE, with 95% of the queries having a prediction error of less than 1 ms. On the other hand, using only the six features is insufficient for accurate processing time prediction for the CS-K strategies: for instance, for CS-1, only 65% of queries are accurately predicted within 1 ms. However, for the linear regression model that uses the additional 4 method-dependent features (10 features in all)³, the error is one order of magnitude lower, and for the vast majority of queries (95-99%) our linear model is able to predict the correct response time within a 1 ms error. Therefore, in answering research question RQ1, we find that the proposed linear regression model is accurate, with an error smaller than 1 ms in more than 95% of the cases.

³ The 4 method-dependent features do not apply to DAAT.
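The two accuracy measures reported in Table 2 can be computed as follows (a straightforward sketch with our own function names; times are in seconds):

```python
import math

def rmse(predicted, actual):
    # Root mean square error between predicted and observed processing times.
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual))
                     / len(actual))

def within(predicted, actual, tol):
    # Fraction of queries whose absolute prediction error is below tol.
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) < tol)
    return hits / len(actual)
```

RMSE penalises large mispredictions quadratically, while the within-tolerance fraction directly reflects how often the scheduler receives a usable estimate.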

Strategy   QPT
DAAT       0.11 s
CS-1       0.25 s
CS-2       0.3 s
CS-5       0.37 s
CS-1       0.44 s

Table 2: Mean query processing time (QPT, in seconds), as well as prediction accuracy (RMSE and percentage of queries with an error below 1 ms) for the 1-feature (sum of postings), 6-feature (method independent) and 10-feature (including method dependent) predictors, for each processing strategy.

Figure 4: Average query response time in seconds for different methods, T = 0.5.

In particular, the best performing models for predicting the CS-K strategies are those obtained with the full set of ten features described in Section 5, while in the case of DAAT, the six features describing the lengths of the posting lists associated with the query terms perform very well. Therefore, in the following experiments, we use the six features for the prediction of the DAAT processing times and the full set of ten features for the prediction of the TAAT CS-K processing times.

7.2 Efficiency and Effectiveness Analysis

In this section, we experiment to address RQ2, comparing the efficiency and effectiveness of our proposed load-sensitive selective pruning framework. In particular, we compare our methods, Selfish and Altruistic, with three different baselines: Perfectionist and the aggressive baseline, as well as applying CS-1 for all queries. We remark that, by their respective definitions, Perfectionist corresponds to a pure DAAT full processing strategy and the aggressive baseline corresponds to using CS-1. Within this section, we use a maximum threshold time of T = 0.5 seconds, which mandates that the results for each query must be returned, including both queueing and processing, before this time elapses. Later, in Section 7.3, we analyse how T affects the performance of our methods. We analyse our methods in terms of query response time and effectiveness, stressing our search system with different rates of queries, measured in queries per second (q/s).
The query response time corresponds to the time the query spends within the queues plus the time it spends being processed, in other words, the time a user waits for the results to be returned. We evaluate effectiveness using NDCG@1, exploiting the 687 queries that have relevance assessments. Firstly, we experiment to determine the average response time of the various methods by varying the number of queries per second submitted to the search system. As the Million Query Track query set does not have query arrival times, queries are submitted at a uniform query rate, in other words, a submission rate of N q/s corresponds to submitting a query every 1/N seconds. This allows us to measure the behaviour of the various techniques under various load conditions, as shown in Figure 4. As expected, when using the Perfectionist method, the mean response time exceeds the threshold (T = 0.5) for all except very low workloads. CS-1 can sustain slightly higher loads than Perfectionist; however, for loads greater than 2 q/s the response times are well above the threshold.

Figure 5: Average query response time in seconds for different methods (enlargement of Figure 4).

Figure 5 enlarges the curves of Figure 4 for query response times up to the threshold T = 0.5. This allows us to better analyse the behaviour of the various methods for a workload of 4 q/s or less. Clearly, the aggressive baseline attains the smallest response times, as it aggressively prunes all queries. Both the Selfish and Altruistic methods are less efficient than the aggressive baseline, but still meet the threshold up to 4 q/s. To show how the various methods cope with queries of varying efficiency, Figure 6 plots the actual query response times for a subset (one hundred) of all the test queries, for a query workload of 4 q/s.

Figure 6: Query response time for 100 queries, arrival rate 4 q/s, T = 0.5.
In particular, the response times for Perfectionist, CS-1, Selfish, and Altruistic are shown. Spikes in the lines correspond to the effect of expensive queries on later queries. Indeed, expensive queries delay the queries submitted after them, as expected, though Selfish and Altruistic are more uniform than the others. In particular, in the case of Altruistic, the line is also close to the time threshold, indicating a better utilisation of the resources. To determine how the threshold is adhered to for different methods and workloads, Figure 7 shows the percentage of queries whose response times are within the threshold.
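The delaying effect of expensive queries visible in Figure 6 can be reproduced with a minimal single-server FIFO queue model under the uniform arrival scheme of Section 7.2 (our own sketch, not the experimental testbed):

```python
def simulate(process_times, rate):
    # Single-server FIFO queue: query i arrives at i/rate seconds and queries
    # are processed one at a time; response time = queueing wait + processing.
    responses, free_at = [], 0.0
    for i, service in enumerate(process_times):
        arrival = i / rate
        start = max(arrival, free_at)
        free_at = start + service
        responses.append(free_at - arrival)
    return responses
```

For instance, at 10 q/s, a single 0.3 s query followed by two 0.1 s queries pushes all three response times to 0.3 s, illustrating how one expensive query produces a spike affecting its successors.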


DESIGN AND ANALYSIS OF ALGORITHMS. Unit 1 Chapter 4 ITERATIVE ALGORITHM DESIGN ISSUES DESIGN AND ANALYSIS OF ALGORITHMS Unit 1 Chapter 4 ITERATIVE ALGORITHM DESIGN ISSUES http://milanvachhani.blogspot.in USE OF LOOPS As we break down algorithm into sub-algorithms, sooner or later we shall

More information

Chapter 8 Memory Management

Chapter 8 Memory Management 1 Chapter 8 Memory Management The technique we will describe are: 1. Single continuous memory management 2. Partitioned memory management 3. Relocatable partitioned memory management 4. Paged memory management

More information

Chapter 6: CPU Scheduling. Operating System Concepts 9 th Edition

Chapter 6: CPU Scheduling. Operating System Concepts 9 th Edition Chapter 6: CPU Scheduling Silberschatz, Galvin and Gagne 2013 Chapter 6: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Thread Scheduling Multiple-Processor Scheduling Real-Time

More information

Large-Scale Network Simulation Scalability and an FPGA-based Network Simulator

Large-Scale Network Simulation Scalability and an FPGA-based Network Simulator Large-Scale Network Simulation Scalability and an FPGA-based Network Simulator Stanley Bak Abstract Network algorithms are deployed on large networks, and proper algorithm evaluation is necessary to avoid

More information

Operating Systems Unit 6. Memory Management

Operating Systems Unit 6. Memory Management Unit 6 Memory Management Structure 6.1 Introduction Objectives 6.2 Logical versus Physical Address Space 6.3 Swapping 6.4 Contiguous Allocation Single partition Allocation Multiple Partition Allocation

More information

FuxiSort. Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc

FuxiSort. Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc Fuxi Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc {jiamang.wang, yongjun.wyj, hua.caihua, zhipeng.tzp, zhiqiang.lv,

More information

Virtual Memory. Chapter 8

Virtual Memory. Chapter 8 Virtual Memory 1 Chapter 8 Characteristics of Paging and Segmentation Memory references are dynamically translated into physical addresses at run time E.g., process may be swapped in and out of main memory

More information

CHAPTER 3 A FAST K-MODES CLUSTERING ALGORITHM TO WAREHOUSE VERY LARGE HETEROGENEOUS MEDICAL DATABASES

CHAPTER 3 A FAST K-MODES CLUSTERING ALGORITHM TO WAREHOUSE VERY LARGE HETEROGENEOUS MEDICAL DATABASES 70 CHAPTER 3 A FAST K-MODES CLUSTERING ALGORITHM TO WAREHOUSE VERY LARGE HETEROGENEOUS MEDICAL DATABASES 3.1 INTRODUCTION In medical science, effective tools are essential to categorize and systematically

More information

Unit 1 Chapter 4 ITERATIVE ALGORITHM DESIGN ISSUES

Unit 1 Chapter 4 ITERATIVE ALGORITHM DESIGN ISSUES DESIGN AND ANALYSIS OF ALGORITHMS Unit 1 Chapter 4 ITERATIVE ALGORITHM DESIGN ISSUES http://milanvachhani.blogspot.in USE OF LOOPS As we break down algorithm into sub-algorithms, sooner or later we shall

More information

An Oracle White Paper September Oracle Utilities Meter Data Management Demonstrates Extreme Performance on Oracle Exadata/Exalogic

An Oracle White Paper September Oracle Utilities Meter Data Management Demonstrates Extreme Performance on Oracle Exadata/Exalogic An Oracle White Paper September 2011 Oracle Utilities Meter Data Management 2.0.1 Demonstrates Extreme Performance on Oracle Exadata/Exalogic Introduction New utilities technologies are bringing with them

More information

Process- Concept &Process Scheduling OPERATING SYSTEMS

Process- Concept &Process Scheduling OPERATING SYSTEMS OPERATING SYSTEMS Prescribed Text Book Operating System Principles, Seventh Edition By Abraham Silberschatz, Peter Baer Galvin and Greg Gagne PROCESS MANAGEMENT Current day computer systems allow multiple

More information

A Combined Semi-Pipelined Query Processing Architecture For Distributed Full-Text Retrieval

A Combined Semi-Pipelined Query Processing Architecture For Distributed Full-Text Retrieval A Combined Semi-Pipelined Query Processing Architecture For Distributed Full-Text Retrieval Simon Jonassen and Svein Erik Bratsberg Department of Computer and Information Science Norwegian University of

More information

Asynchronous Method Calls White Paper VERSION Copyright 2014 Jade Software Corporation Limited. All rights reserved.

Asynchronous Method Calls White Paper VERSION Copyright 2014 Jade Software Corporation Limited. All rights reserved. VERSION 7.0.10 Copyright 2014 Jade Software Corporation Limited. All rights reserved. Jade Software Corporation Limited cannot accept any financial or other responsibilities that may be the result of your

More information

Modeling and Synthesizing Task Placement Constraints in Google Compute Clusters

Modeling and Synthesizing Task Placement Constraints in Google Compute Clusters Modeling and Synthesizing Task Placement s in Google s Bikash Sharma Pennsylvania State University University Park 1 bikash@cse.psu.edu Rasekh Rifaat Google Inc. Seattle 93 rasekh@google.com Victor Chudnovsky

More information

vsan 6.6 Performance Improvements First Published On: Last Updated On:

vsan 6.6 Performance Improvements First Published On: Last Updated On: vsan 6.6 Performance Improvements First Published On: 07-24-2017 Last Updated On: 07-28-2017 1 Table of Contents 1. Overview 1.1.Executive Summary 1.2.Introduction 2. vsan Testing Configuration and Conditions

More information

Leveraging Transitive Relations for Crowdsourced Joins*

Leveraging Transitive Relations for Crowdsourced Joins* Leveraging Transitive Relations for Crowdsourced Joins* Jiannan Wang #, Guoliang Li #, Tim Kraska, Michael J. Franklin, Jianhua Feng # # Department of Computer Science, Tsinghua University, Brown University,

More information

OPTIMAL MULTI-CHANNEL ASSIGNMENTS IN VEHICULAR AD-HOC NETWORKS

OPTIMAL MULTI-CHANNEL ASSIGNMENTS IN VEHICULAR AD-HOC NETWORKS Chapter 2 OPTIMAL MULTI-CHANNEL ASSIGNMENTS IN VEHICULAR AD-HOC NETWORKS Hanan Luss and Wai Chen Telcordia Technologies, Piscataway, New Jersey 08854 hluss@telcordia.com, wchen@research.telcordia.com Abstract:

More information

Information Retrieval II

Information Retrieval II Information Retrieval II David Hawking 30 Sep 2010 Machine Learning Summer School, ANU Session Outline Ranking documents in response to a query Measuring the quality of such rankings Case Study: Tuning

More information

Optimal Space-time Tradeoffs for Inverted Indexes

Optimal Space-time Tradeoffs for Inverted Indexes Optimal Space-time Tradeoffs for Inverted Indexes Giuseppe Ottaviano 1, Nicola Tonellotto 1, Rossano Venturini 1,2 1 National Research Council of Italy, Pisa, Italy 2 Department of Computer Science, University

More information

Adaptive Parallelism for Web Search

Adaptive Parallelism for Web Search Adaptive Parallelism for Web Search Myeongjae Jeon, Yuxiong He, Sameh Elnikety, Alan L. Cox, Scott Rixner Microsoft Research Rice University Redmond, WA, USA Houston, TX, USA Abstract A web search query

More information

McGill University - Faculty of Engineering Department of Electrical and Computer Engineering

McGill University - Faculty of Engineering Department of Electrical and Computer Engineering McGill University - Faculty of Engineering Department of Electrical and Computer Engineering ECSE 494 Telecommunication Networks Lab Prof. M. Coates Winter 2003 Experiment 5: LAN Operation, Multiple Access

More information

Modeling and Synthesizing Task Placement Constraints in Google Compute Clusters

Modeling and Synthesizing Task Placement Constraints in Google Compute Clusters Modeling and Synthesizing Task Placement s in Google s Bikash Sharma Pennsylvania State University University Park 1 bikash@cse.psu.edu Rasekh Rifaat Google Inc. Seattle 913 rasekh@google.com Victor Chudnovsky

More information

Achieving Distributed Buffering in Multi-path Routing using Fair Allocation

Achieving Distributed Buffering in Multi-path Routing using Fair Allocation Achieving Distributed Buffering in Multi-path Routing using Fair Allocation Ali Al-Dhaher, Tricha Anjali Department of Electrical and Computer Engineering Illinois Institute of Technology Chicago, Illinois

More information

ptop: A Process-level Power Profiling Tool

ptop: A Process-level Power Profiling Tool ptop: A Process-level Power Profiling Tool Thanh Do, Suhib Rawshdeh, and Weisong Shi Wayne State University {thanh, suhib, weisong}@wayne.edu ABSTRACT We solve the problem of estimating the amount of energy

More information

Chapter 11: Indexing and Hashing

Chapter 11: Indexing and Hashing Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

Hashing. Hashing Procedures

Hashing. Hashing Procedures Hashing Hashing Procedures Let us denote the set of all possible key values (i.e., the universe of keys) used in a dictionary application by U. Suppose an application requires a dictionary in which elements

More information

12 PEPA Case Study: Rap Genius on Heroku

12 PEPA Case Study: Rap Genius on Heroku 1 PEPA Case Study: Rap Genius on Heroku As an example of a realistic case study, we consider a Platform as a service (PaaS) system, Heroku, and study its behaviour under different policies for assigning

More information

ECE519 Advanced Operating Systems

ECE519 Advanced Operating Systems IT 540 Operating Systems ECE519 Advanced Operating Systems Prof. Dr. Hasan Hüseyin BALIK (10 th Week) (Advanced) Operating Systems 10. Multiprocessor, Multicore and Real-Time Scheduling 10. Outline Multiprocessor

More information

Quantitative Models for Performance Enhancement of Information Retrieval from Relational Databases

Quantitative Models for Performance Enhancement of Information Retrieval from Relational Databases Quantitative Models for Performance Enhancement of Information Retrieval from Relational Databases Jenna Estep Corvis Corporation, Columbia, MD 21046 Natarajan Gautam Harold and Inge Marcus Department

More information

Using Coherence-based Measures to Predict Query Difficulty

Using Coherence-based Measures to Predict Query Difficulty Using Coherence-based Measures to Predict Query Difficulty Jiyin He, Martha Larson, and Maarten de Rijke ISLA, University of Amsterdam {jiyinhe,larson,mdr}@science.uva.nl Abstract. We investigate the potential

More information

Worst-case Ethernet Network Latency for Shaped Sources

Worst-case Ethernet Network Latency for Shaped Sources Worst-case Ethernet Network Latency for Shaped Sources Max Azarov, SMSC 7th October 2005 Contents For 802.3 ResE study group 1 Worst-case latency theorem 1 1.1 Assumptions.............................

More information

Oracle Database 10g Resource Manager. An Oracle White Paper October 2005

Oracle Database 10g Resource Manager. An Oracle White Paper October 2005 Oracle Database 10g Resource Manager An Oracle White Paper October 2005 Oracle Database 10g Resource Manager INTRODUCTION... 3 SYSTEM AND RESOURCE MANAGEMENT... 3 ESTABLISHING RESOURCE PLANS AND POLICIES...

More information

ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective

ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective Part II: Data Center Software Architecture: Topic 3: Programming Models RCFile: A Fast and Space-efficient Data

More information

Course Syllabus. Operating Systems

Course Syllabus. Operating Systems Course Syllabus. Introduction - History; Views; Concepts; Structure 2. Process Management - Processes; State + Resources; Threads; Unix implementation of Processes 3. Scheduling Paradigms; Unix; Modeling

More information

A Disk Head Scheduling Simulator

A Disk Head Scheduling Simulator A Disk Head Scheduling Simulator Steven Robbins Department of Computer Science University of Texas at San Antonio srobbins@cs.utsa.edu Abstract Disk head scheduling is a standard topic in undergraduate

More information

CPU THREAD PRIORITIZATION USING A DYNAMIC QUANTUM TIME ROUND-ROBIN ALGORITHM

CPU THREAD PRIORITIZATION USING A DYNAMIC QUANTUM TIME ROUND-ROBIN ALGORITHM CPU THREAD PRIORITIZATION USING A DYNAMIC QUANTUM TIME ROUND-ROBIN ALGORITHM Maysoon A. Mohammed 1, 2, Mazlina Abdul Majid 1, Balsam A. Mustafa 1 and Rana Fareed Ghani 3 1 Faculty of Computer System &

More information

OASIS: Self-tuning Storage for Applications

OASIS: Self-tuning Storage for Applications OASIS: Self-tuning Storage for Applications Kostas Magoutis, Prasenjit Sarkar, Gauri Shah 14 th NASA Goddard- 23 rd IEEE Mass Storage Systems Technologies, College Park, MD, May 17, 2006 Outline Motivation

More information

CS Operating Systems

CS Operating Systems CS 4500 - Operating Systems Module 9: Memory Management - Part 1 Stanley Wileman Department of Computer Science University of Nebraska at Omaha Omaha, NE 68182-0500, USA June 9, 2017 In This Module...

More information

CS Operating Systems

CS Operating Systems CS 4500 - Operating Systems Module 9: Memory Management - Part 1 Stanley Wileman Department of Computer Science University of Nebraska at Omaha Omaha, NE 68182-0500, USA June 9, 2017 In This Module...

More information

Last Class: Processes

Last Class: Processes Last Class: Processes A process is the unit of execution. Processes are represented as Process Control Blocks in the OS PCBs contain process state, scheduling and memory management information, etc A process

More information

Efficient Lists Intersection by CPU- GPU Cooperative Computing

Efficient Lists Intersection by CPU- GPU Cooperative Computing Efficient Lists Intersection by CPU- GPU Cooperative Computing Di Wu, Fan Zhang, Naiyong Ao, Gang Wang, Xiaoguang Liu, Jing Liu Nankai-Baidu Joint Lab, Nankai University Outline Introduction Cooperative

More information

Performance of relational database management

Performance of relational database management Building a 3-D DRAM Architecture for Optimum Cost/Performance By Gene Bowles and Duke Lambert As systems increase in performance and power, magnetic disk storage speeds have lagged behind. But using solidstate

More information

Networking Acronym Smorgasbord: , DVMRP, CBT, WFQ

Networking Acronym Smorgasbord: , DVMRP, CBT, WFQ Networking Acronym Smorgasbord: 802.11, DVMRP, CBT, WFQ EE122 Fall 2011 Scott Shenker http://inst.eecs.berkeley.edu/~ee122/ Materials with thanks to Jennifer Rexford, Ion Stoica, Vern Paxson and other

More information

A Hybrid Recursive Multi-Way Number Partitioning Algorithm

A Hybrid Recursive Multi-Way Number Partitioning Algorithm Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence A Hybrid Recursive Multi-Way Number Partitioning Algorithm Richard E. Korf Computer Science Department University

More information

Query Evaluation Strategies

Query Evaluation Strategies Introduction to Search Engine Technology Term-at-a-Time and Document-at-a-Time Evaluation Ronny Lempel Yahoo! Labs (Many of the following slides are courtesy of Aya Soffer and David Carmel, IBM Haifa Research

More information

Leveraging Set Relations in Exact Set Similarity Join

Leveraging Set Relations in Exact Set Similarity Join Leveraging Set Relations in Exact Set Similarity Join Xubo Wang, Lu Qin, Xuemin Lin, Ying Zhang, and Lijun Chang University of New South Wales, Australia University of Technology Sydney, Australia {xwang,lxue,ljchang}@cse.unsw.edu.au,

More information

Design of Parallel Algorithms. Course Introduction

Design of Parallel Algorithms. Course Introduction + Design of Parallel Algorithms Course Introduction + CSE 4163/6163 Parallel Algorithm Analysis & Design! Course Web Site: http://www.cse.msstate.edu/~luke/courses/fl17/cse4163! Instructor: Ed Luke! Office:

More information

Mark Sandstrom ThroughPuter, Inc.

Mark Sandstrom ThroughPuter, Inc. Hardware Implemented Scheduler, Placer, Inter-Task Communications and IO System Functions for Many Processors Dynamically Shared among Multiple Applications Mark Sandstrom ThroughPuter, Inc mark@throughputercom

More information

Why You Should Consider a Hardware Based Protocol Analyzer?

Why You Should Consider a Hardware Based Protocol Analyzer? Why You Should Consider a Hardware Based Protocol Analyzer? Software-only protocol analyzers are limited to accessing network traffic through the utilization of mirroring. While this is the most convenient

More information

Active Adaptation in QoS Architecture Model

Active Adaptation in QoS Architecture Model Active Adaptation in QoS Architecture Model Drago agar and Snjeana Rimac -Drlje Faculty of Electrical Engineering University of Osijek Kneza Trpimira 2b, HR-31000 Osijek, CROATIA Abstract - A new complex

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung December 2003 ACM symposium on Operating systems principles Publisher: ACM Nov. 26, 2008 OUTLINE INTRODUCTION DESIGN OVERVIEW

More information

Inital Starting Point Analysis for K-Means Clustering: A Case Study

Inital Starting Point Analysis for K-Means Clustering: A Case Study lemson University TigerPrints Publications School of omputing 3-26 Inital Starting Point Analysis for K-Means lustering: A ase Study Amy Apon lemson University, aapon@clemson.edu Frank Robinson Vanderbilt

More information

Rapid Bottleneck Identification A Better Way to do Load Testing. An Oracle White Paper June 2008

Rapid Bottleneck Identification A Better Way to do Load Testing. An Oracle White Paper June 2008 Rapid Bottleneck Identification A Better Way to do Load Testing An Oracle White Paper June 2008 Rapid Bottleneck Identification A Better Way to do Load Testing. RBI combines a comprehensive understanding

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung SOSP 2003 presented by Kun Suo Outline GFS Background, Concepts and Key words Example of GFS Operations Some optimizations in

More information