Selective-NRA Algorithms for Top-k Queries


Jing Yuan, Guang-Zhong Sun, Ye Tian, Guoliang Chen, and Zhi Liu

MOE-MS Key Laboratory of Multimedia Computing and Communication, Department of Computer Science and Technology, University of Science and Technology of China, Hefei, P.R. China

Q. Li et al. (Eds.): APWeb/WAIM 2009, LNCS 5446. © Springer-Verlag Berlin Heidelberg 2009

Abstract. Efficient processing of top-k queries has become a classical research area because it has many applications. Fagin et al. proposed the middleware cost model for top-k query algorithms. For databases in which random access is impossible, Fagin et al. proposed the NRA (No Random Access) algorithm. In this paper, we present some key observations about NRA. Based on them, we propose a new algorithm called Selective-NRA (SNRA), which is designed to minimize the useless accesses of a top-k query. We prove that SNRA is not instance optimal in Fagin's sense, and we therefore also propose an instance optimal algorithm, Hybrid-SNRA, based on SNRA. We conducted extensive experiments on both synthetic and real-world data. The experiments show that SNRA and Hybrid-SNRA have lower access cost than NRA. For some instances, SNRA performs 50% fewer accesses than NRA.

1 Introduction

Assume there is a huge number of objects and every object has m attributes. For each attribute the object has a score; these scores can be aggregated into a total score by an aggregation function, and we want to know which k objects have the largest total scores. This scenario is generalized as the top-k query. Top-k queries have become a classical research area since they have many applications, such as information retrieval [5],[7], multimedia databases [2],[9], and data mining [6].

In top-k queries, an object's score can be obtained by random access or by sorted access. A random access returns the score of a given object for a given attribute in one step. A sorted access proceeds through an attribute list sequentially from the top of the list; i.e., if some object is the l-th largest object in the i-th list, we must perform l sorted accesses to the i-th list to obtain the object's score. As stated in [8], if there are s sorted accesses and r random accesses, the total access cost is $sC_S + rC_R$, where $C_S$ is the cost of a single sorted access and $C_R$ is the cost of a single random access.

In some cases, random access is forbidden or restricted [1]. For example, a typical web search engine has no way to return the score of a document of our choice under a query. On the assumption that random access is not supported by the database, the authors of [1] proposed the No Random Access algorithm (NRA).

In this paper we first present some observations about NRA and propose the Selective-NRA (SNRA) algorithm, which performs significantly better than NRA in terms of access cost and bookkeeping; second, we turn SNRA into an instance optimal algorithm, which we call Hybrid-SNRA; third, we carry out extensive experiments comparing SNRA and Hybrid-SNRA with NRA on both synthetic and real-world data sets. The results demonstrate that our algorithms have lower access cost than NRA.

The rest of this paper is organized as follows. In Section 2, we define the problem and review related work. In Section 3, we introduce SNRA and Hybrid-SNRA. In Section 4, we present the experimental results. Finally, in Section 5, we conclude the paper.

2 Problem Definition and Related Work

Our model can be described as follows. Assume there are m lists and n objects, and the aggregation function t has m variables. For a given object R and list i, R has a score $x_i$ with $0 \le x_i \le 1$. R has a total score of $t(x_1, x_2, \ldots, x_m)$. We denote the lists by $L_1, L_2, \ldots, L_m$; they are sorted and cannot be accessed randomly. We refer to $L_i$ as list i. Each entry of $L_i$ is $(R, x_i)$, where $x_i$ is the i-th field of R. We assume every object has an exact value in every list, so the length of each list is n. Since we consider only sorted access, the access cost of an algorithm is $sC_S$ if s sorted accesses are performed.

Several algorithms satisfy the no-random-access assumption. The two best known are the Stream-Combine algorithm of Güntzer et al. [2] and the No Random Access algorithm of Fagin et al. [1].
As mentioned in [3], Stream-Combine considers only upper bounds on the overall grades of objects, and cannot say that an object is in the top k unless that object has been seen in every sorted list. In this sense, NRA is better than Stream-Combine. In [1], the authors proved that NRA is instance optimal with optimality ratio m, and that no deterministic algorithm has a lower optimality ratio. (Instance optimality and the optimality ratio are defined in [1].) In [10], the authors studied NRA and proposed an algorithm they call LARA. In terms of running time, LARA is significantly faster than NRA; in terms of access cost, however, its advantage is sometimes marginal. Theobald et al. presented several probabilistic algorithms [4], which are also variants of NRA.

The basic idea of NRA is to bound an object's exact value from above by its best value $B^{(d)}(R)$ and from below by its worst value $W^{(d)}(R)$. The details of NRA are given below. Unless specified otherwise, we use the same notation as NRA (such as $W^{(d)}(R)$, $T_k^{(d)}$, $M_k^{(d)}$).

Algorithm NRA (by Fagin et al.) [1]

Do sorted access in parallel to $L_1, L_2, \ldots, L_m$. At each depth d (when d objects have been accessed under sorted access in each list) do:

Maintain the bottom values $\underline{x}_1^{(d)}, \underline{x}_2^{(d)}, \ldots, \underline{x}_m^{(d)}$ encountered in each list.

For every object R with discovered fields $S = S^{(d)}(R) = \{i_1, i_2, \ldots, i_l\} \subseteq \{1, \ldots, m\}$ with values $x_{i_1}, x_{i_2}, \ldots, x_{i_l}$, compute

$$W^{(d)}(R) = W_S(R) = t(w_1, w_2, \ldots, w_m), \quad w_i = \begin{cases} x_i & \text{if } i \in S \\ 0 & \text{otherwise} \end{cases}$$

and

$$B^{(d)}(R) = B_S(R) = t(b_1, b_2, \ldots, b_m), \quad b_i = \begin{cases} x_i & \text{if } i \in S \\ \underline{x}_i^{(d)} & \text{otherwise.} \end{cases}$$

Let $T_k^{(d)}$, the current top-k list, contain the k objects with the largest $W^{(d)}$ values seen so far (and their grades); if two objects have the same $W^{(d)}$ value, then ties are broken using the $B^{(d)}$ values, such that the object with the highest $B^{(d)}$ value wins (and arbitrarily among objects that tie for the highest $B^{(d)}$ value). Let $M_k^{(d)}$ be the k-th largest $W^{(d)}$ value in $T_k^{(d)}$.

Call an object R viable if $B^{(d)}(R) > M_k^{(d)}$. Halt when (a) at least k distinct objects have been seen (so that in particular $T_k^{(d)}$ contains k objects) and (b) there are no viable objects left outside $T_k^{(d)}$, that is, when $B^{(d)}(R) \le M_k^{(d)}$ for all $R \notin T_k^{(d)}$. Return the objects in $T_k^{(d)}$.

3 Selective-NRA Algorithms

In this section we propose our Selective-NRA algorithms. We first give an equivalent form of NRA's stopping rule and introduce some lemmas and observations that motivate SNRA; second, we present SNRA and prove its correctness; finally, we propose an instance optimal algorithm based on SNRA.

3.1 Observations of NRA Algorithm

Definition 1. Call an object R the best competitor if R has the largest best value $B^{(d)}(R)$ among all viable objects that are not in the current top-k list (viable objects and best values are defined in the previous section). If more than one object qualifies, choose the one seen earliest (sorted accessed earliest) as the best competitor.

Stopping rule (b) of NRA states that there are no viable objects left outside $T_k^{(d)}$, that is, $B^{(d)}(R) \le M_k^{(d)}$ for all $R \notin T_k^{(d)}$. This is equivalent to: the best competitor's best value is at most $M_k^{(d)}$.

We now give two lemmas that motivate SNRA. In this paper, we suppose $m \ge 2$.
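The NRA procedure and the best-competitor form of stopping rule (b) can be sketched in Python. This is only an illustrative sketch, not the authors' implementation: it assumes summation as the aggregation function, stores scores as integers in tenths so that ties compare exactly, and recomputes all bounds at every depth. The names `nra` and `SAMPLE` are hypothetical; the sample lists are those of Table 2 in Example 1.

```python
from typing import Dict, List, Tuple

Entry = Tuple[str, int]  # (object id, score in tenths: 9 stands for 0.9)

# Assumption: the three sorted lists of Table 2, scores multiplied by 10.
SAMPLE: List[List[Entry]] = [
    [("R1", 9), ("R2", 9), ("R3", 6), ("R4", 5), ("R5", 2)],  # L1
    [("R2", 9), ("R1", 8), ("R3", 7), ("R4", 6), ("R5", 2)],  # L2
    [("R1", 6), ("R4", 6), ("R3", 4), ("R2", 3), ("R5", 3)],  # L3
]


def nra(lists: List[List[Entry]], k: int) -> Tuple[List[str], int]:
    """Return (top-k object ids, number of sorted accesses), using summation."""
    m, n = len(lists), len(lists[0])
    fields: Dict[str, Dict[int, int]] = {}  # object -> {list index: score}
    accesses = 0
    top: List[str] = []
    for depth in range(n):
        # Parallel sorted access: read the entry at this depth in every list.
        bottom = []
        for j in range(m):
            obj, score = lists[j][depth]
            fields.setdefault(obj, {})[j] = score
            bottom.append(score)
            accesses += 1
        # Worst value: missing fields count as 0.
        worst = {o: sum(f.values()) for o, f in fields.items()}
        # Best value: missing fields are replaced by the list's bottom value.
        best = {o: sum(f.get(j, bottom[j]) for j in range(m))
                for o, f in fields.items()}
        ranked = sorted(fields, key=lambda o: (worst[o], best[o]), reverse=True)
        top = ranked[:k]
        if len(fields) < k:
            continue  # rule (a): at least k distinct objects must be seen
        m_k = worst[top[-1]]  # k-th largest worst value
        outside = [best[o] for o in ranked[k:]]
        if len(fields) < n:
            # An unseen object has every field missing: best = sum of bottoms.
            outside.append(sum(bottom))
        if max(outside, default=0) <= m_k:
            break  # rule (b): the best competitor's best value is at most M_k
    return top, accesses


if __name__ == "__main__":
    print(nra(SAMPLE, 1))
```

On this sample the sketch stops at depth 3 of each list, which matches the count of 9 sorted accesses given for NRA in Example 1.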

Lemma 1. If at depth d object R is the best competitor and NRA does not halt at depth d, then R has at least one missing (undiscovered) field.

Proof. Assume that at depth d all fields of R are known. Then $W^{(d)}(R) = B^{(d)}(R)$. Since R has the largest best value, the best value of every viable object outside $T_k^{(d)}$ is at most $B^{(d)}(R)$. Since NRA does not halt at depth d, there is some object $R' \in T_k^{(d)}$ such that $B^{(d)}(R) > W^{(d)}(R')$. It follows that R should be in $T_k^{(d)}$, because $W^{(d)}(R) = B^{(d)}(R) > W^{(d)}(R')$. This is a contradiction, so the lemma follows.

Lemma 2. If at depth d object R is the best competitor and NRA does not halt at depth d, then $B^{(d)}(R)$ decreases at depth d + 1 only if we sorted access one of R's missing fields.

Proof. If we sorted access the i-th list and i is not a missing field of R, R's best value does not decrease, because R's best value is computed by substituting R's real value for each discovered field and the bottom value for each missing field. So R's best value decreases only if we sorted access a list corresponding to one of R's missing fields. (R's best value may not decrease when the value at the next level equals the value at this level; the condition is necessary but not sufficient.)

Another observation: the best competitor changes as the algorithm runs. However, in our experiments we found that the last best competitor (the last one before NRA terminates) occupies the position of best competitor for a long time. We speculate that the results are similar for common data sets. Table 1 gives an experimental result for a synthetic uniform data set with n = 100000, m = 2, 4, ..., 12, and k = 20. The aggregation function is summation.

Suppose that the last best competitor has only one missing field while NRA is running. In this case, NRA still has to sorted access the lists of all the best competitor's known fields.
This incurs many useless sorted accesses if the last best competitor persists for a long time before the top-k objects are obtained. These observations motivate us to propose the Selective-NRA algorithm.

Table 1. Accessed depth for last best competitor vs. total accessed depth (columns: m; depth of the last best competitor; total depth of NRA)

3.2 Selective-NRA Algorithm (SNRA)

We present our algorithm in pseudo-code form. As stated in Section 3.1, NRA performs some sorted accesses that will definitely not reduce the best competitor's best value. Our approach avoids these sorted accesses.

Algorithm SNRA-Initialize
 1: bottom[j] := 1.0, j = 1, 2, ..., m
 2: top := [dummy_1, dummy_2, ..., dummy_k]   // each dummy_i has best := worst := 0, missingfield := ∅
 3: for each R_i, i = 1, 2, ..., n do
 4:   R_i.best := m
 5:   R_i.worst := 0
 6:   R_i.missingfield := [1, 2, ..., m]
 7: end for
 8: candidates := ∅
 9: bestcompetitor := dummy with missingfield = [1, 2, ..., m]
10: best := m
11: minobj := dummy_1 with min := 0   // minobj is an object, min is its worst value

Algorithm Selective-NRA
 1: Call Algorithm SNRA-Initialize
 2: while (min < best) ∨ (fewer than k objects have been accessed) do
 3:   for each j ∈ bestcompetitor.missingfield do
 4:     sorted access L_j, obtain (p, x_j)
 5:     if (p has not been seen before) then
 6:       candidates := candidates ∪ {p}
 7:     end if
 8:     bottom[j] := x_j   // used for updating an object's best value
 9:     p.missingfield := p.missingfield − {j}
10:     update p.worst
11:     min := min{d.worst | d ∈ top}   // our min is M_k^(d)
12:     minobj := argmin_d {d.worst | d ∈ top}
13:     if (p.worst > min) ∧ (p ∈ candidates) then
14:       candidates := candidates − {p}
15:       candidates := candidates ∪ {minobj}
16:       top := top − {minobj}
17:       top := top ∪ {p}
18:     end if
19:   end for
20:   for each R ∈ (candidates ∪ top) do
21:     update R.best
22:     if (R.best ≤ min) then
23:       candidates := candidates − {R}
24:     end if
25:   end for
26:   bestcompetitor := argmax_R {R.best | R ∈ candidates}
27:   best := max_R {R.best | R ∈ candidates}
28: end while

To prove the correctness of SNRA, we need to give a lemma first. This lemma indicates that SNRA will not enter an infinite loop.

Lemma 3. Assume SNRA has sorted accessed to depth $d_j$ in $L_j$ (j = 1, 2, ..., m) and the stopping rule has not been satisfied. Then SNRA will proceed to access the next object in at least one list.

Proof. Let d be the array $[d_1, d_2, \ldots, d_m]$; in the rest of the paper, "depth d" means depth $d_j$ in $L_j$. We only need to prove that at depth d the best competitor has at least one missing field. The remaining part of the proof is the same as that of Lemma 1.

Theorem 1. If the aggregation function is monotone, then SNRA correctly finds the top k objects.

Proof. According to Lemma 3, if SNRA does not halt, it proceeds to access the next level in at least one list until the stopping rule is satisfied. Assume that SNRA halts after $d_j$ sorted accesses to $L_j$ (j = 1, 2, ..., m) and that the objects output by SNRA are $R_1, R_2, \ldots, R_k$. Let R be an object not among $R_1, R_2, \ldots, R_k$. We must show that $t(R) \le t(R_i)$ for each i = 1, 2, ..., k. Let $d = [d_1, d_2, \ldots, d_m]$. Since SNRA halts at depth d, the best competitor's best value is at most $M_k^{(d)}$; hence $B^{(d)}(R) \le B^{(d)}(\text{best competitor}) \le M_k^{(d)}$, and $t(R) \le B^{(d)}(R)$. Also, for each of the k objects $R_i$ we have $M_k^{(d)} \le W^{(d)}(R_i) \le t(R_i)$. Combining these inequalities, we have

$$t(R) \le B^{(d)}(R) \le M_k^{(d)} \le W^{(d)}(R_i) \le t(R_i)$$

for each i, as desired.

Besides accessing fewer objects, our algorithm requires less bookkeeping than NRA. At step 2 of NRA, the worst values and best values of all seen objects are updated. Our algorithm updates an object's worst value only when that object is sorted accessed. This is sufficient because the worst values of the other objects do not change if they are not accessed.

We now give an example to show how SNRA works.

Example 1. Assume m = 3, n = 5, k = 1, the aggregation function is summation, and the lists shown in Table 2 can only be sorted accessed. Tables 3-6 show how SNRA performs sorted accesses at each step on this database. At step (a) SNRA sorted accesses all 3 lists. After that, object $R_1$ becomes the top-1 object and $R_2$ becomes the best competitor. Then at step (b) we sorted access only $L_1$ and $L_3$, since these are $R_2$'s missing fields (only its $L_2$ field has been discovered). Now $R_1$ is the best competitor and $R_2$ is the top-1 object, whose worst value is 1.8. At step (c) $L_2$ is sorted accessed. $R_1$ becomes the top-1 object again and $R_2$ becomes the best competitor. At this time, $R_2$'s missing field is $L_3$, so at step (d) we sorted access $L_3$. Now the best competitor is $R_4$, whose best value is not more than the top-1 object's worst value, and the algorithm terminates. Table 7 shows how the top-1 object and the best competitor are updated at each step. On this database, NRA sorted accesses to depth 3 of each list and performs 9 sorted accesses in total, while SNRA's sorted access cost is 7.
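The steps of Example 1 can be replayed with a short Python sketch of the selective strategy. It is a simplified illustration under stated assumptions, not the authors' C++ implementation: summation is the aggregation function, scores are stored as integers in tenths so that ties compare exactly, bounds are recomputed for all seen objects at each step instead of using the incremental bookkeeping of the pseudo-code, and an extra check for still-unseen objects is added as a safeguard. The names `snra` and `TABLE2` are hypothetical.

```python
from typing import Dict, List, Tuple

Entry = Tuple[str, int]  # (object id, score in tenths: 9 stands for 0.9)

# Table 2 with scores multiplied by 10 (assumption made for exact comparisons).
TABLE2: List[List[Entry]] = [
    [("R1", 9), ("R2", 9), ("R3", 6), ("R4", 5), ("R5", 2)],  # L1
    [("R2", 9), ("R1", 8), ("R3", 7), ("R4", 6), ("R5", 2)],  # L2
    [("R1", 6), ("R4", 6), ("R3", 4), ("R2", 3), ("R5", 3)],  # L3
]


def snra(lists: List[List[Entry]], k: int) -> Tuple[List[str], List[List[int]]]:
    """Return (top-k ids, trace); trace[i] holds the list indices read in step i."""
    m, n = len(lists), len(lists[0])
    depth = [0] * m    # number of entries read from each list
    bottom = [10] * m  # last score read from each list (10 = maximum score 1.0)
    fields: Dict[str, Dict[int, int]] = {}  # insertion order = order first seen
    trace: List[List[int]] = []
    to_read = list(range(m))  # the initial dummy competitor misses every field
    while True:
        step = [j for j in to_read if depth[j] < n]
        for j in step:
            obj, score = lists[j][depth[j]]
            depth[j] += 1
            bottom[j] = score
            fields.setdefault(obj, {})[j] = score
        trace.append(step)
        worst = {o: sum(f.values()) for o, f in fields.items()}
        best = {o: sum(f.get(j, bottom[j]) for j in range(m))
                for o, f in fields.items()}
        top = sorted(fields, key=lambda o: (worst[o], best[o]), reverse=True)[:k]
        if len(fields) < min(k, n):
            to_read = list(range(m))  # fewer than k objects seen so far
            continue
        m_k = worst[top[-1]]  # k-th largest worst value
        # Viable objects outside the top-k, in the order they were first seen.
        viable = [o for o in fields if o not in top and best[o] > m_k]
        if viable:
            # max() returns the first maximum, i.e. the object seen earliest.
            competitor = max(viable, key=lambda o: best[o])
            to_read = [j for j in range(m) if j not in fields[competitor]]
        elif len(fields) < n and sum(bottom) > m_k:
            to_read = list(range(m))  # safeguard: an unseen object could still win
        else:
            return top, trace


if __name__ == "__main__":
    top, trace = snra(TABLE2, 1)
    print(top, trace, sum(len(s) for s in trace))
```

With list indices 0, 1, 2 standing for $L_1$, $L_2$, $L_3$, the trace reads all three lists, then $L_1$ and $L_3$, then $L_2$, then $L_3$, for 7 sorted accesses in total, as in steps (a) to (d).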

Table 2. Sorted lists

L_1          L_2          L_3
(R1, 0.9)    (R2, 0.9)    (R1, 0.6)
(R2, 0.9)    (R1, 0.8)    (R4, 0.6)
(R3, 0.6)    (R3, 0.7)    (R3, 0.4)
(R4, 0.5)    (R4, 0.6)    (R2, 0.3)
(R5, 0.2)    (R5, 0.2)    (R5, 0.3)

Tables 3-6. SNRA steps (a)-(d). Each table repeats the lists of Table 2; following the steps of Example 1, the depths accessed in (L_1, L_2, L_3) are [1, 1, 1] after step (a), [2, 1, 2] after step (b), [2, 2, 2] after step (c), and [2, 2, 3] after step (d).

Table 7. Each step of SNRA

step    top-1 object    worst    best competitor    best
(a)     R1              1.5      R2                 2.4
(b)     R2              1.8      R1                 2.4
(c)     R1              2.3      R2                 2.4
(d)     R1              2.3      R4                 2.3

3.3 Turning SNRA into an Instance Optimal Algorithm

Fagin et al. defined instance optimality for top-k query algorithms and proved the instance optimality of NRA. Unfortunately, SNRA is not instance optimal in their sense. In this section we first give an example demonstrating that SNRA is not instance optimal, and then we modify SNRA to turn it into an instance optimal algorithm.

Example 2. Assume m = 2, k = 1, and the aggregation function is summation. Table 8 shows a database over which NRA performs only 6 sorted accesses (to depth 3) and outputs $R_1$ as the top-1 object, while SNRA sorted accesses the whole list $L_1$ and performs n + 3 sorted accesses in total.
(After depth [1, 1], $R_2$ is the best competitor, so we sorted access to depth [2, 1]. $R_2$ is still the best competitor, so we sorted access to depth [3, 1]. At this depth $R_3$ is the best competitor, so we sorted access to depth [3, 2]. $R_2$ then becomes the best competitor again, since $R_3$'s exact value is less than $R_2$'s worst value, and $R_2$ remains the best competitor until the end of $L_1$, at depth [n − 1, 2]. After depth [n, 2], $R_2$ becomes the top-1 object and $R_1$ becomes the best competitor. We then sorted access $L_2$; at depth [n, 3] $R_1$ becomes the top-1 object and the algorithm terminates.) Since n can be arbitrarily large, SNRA is not an instance optimal algorithm.

Table 8. An example showing that SNRA is not instance optimal

L_1           L_2
(R1, 1.0)     (R2, 0.9)
(R3, 0.5)     (R3, 0.39)
(R4, 0.45)    (R1, 0.38)
(R5, 0.32)    (R5, 0.32)
...           ...
(Rn, 0.12)    (Rn, 0.12)
(R2, 0.11)    (R4, 0.11)

The reason SNRA is not instance optimal is that it selects some of the lists for sorted access instead of all of them. By making this selection, SNRA may perform fewer sorted accesses than NRA on most databases but may miss important information on particular databases such as that of Example 2. Nevertheless, we can make SNRA instance optimal with a small modification. We call the modified algorithm Hybrid-SNRA, and we present it as follows.

Algorithm Hybrid-SNRA
 1: Call Algorithm SNRA-Initialize
 2: step := 0
 3: while (min < best) ∨ (fewer than k objects have been accessed) do
 4:   step++
 5:   if (step mod p = 0) then
 6:     field := [1, 2, ..., m] − {the fields which have been accessed to the bottom}
 7:   else
 8:     field := bestcompetitor.missingfield
 9:   end if
10:   for each j ∈ field do
11:     ...   // the same as SNRA from line 4 to line 18
12:   end for
13:   ...   // the same as SNRA from line 20 to line 27
14: end while

We note that the sorted access cost of Hybrid-SNRA is at most p times that of NRA, where p is a constant. So Hybrid-SNRA is instance optimal. Since the optimality ratio of NRA is m, the optimality ratio of Hybrid-SNRA is pm under some natural assumptions [1]. (In fact, if p = 1, Hybrid-SNRA is equivalent to NRA, and if p = ∞, Hybrid-SNRA becomes SNRA.)

4 Experiment Results

Our algorithms were implemented in C++. We performed our experiments on an AMD 1.9 GHz PC with 2 GB of memory. In our experiments we used summation as the aggregation function, which is the most common one. For Hybrid-SNRA we set p = 11 by default. We used both synthetic and real-world data to evaluate SNRA, Hybrid-SNRA and NRA.

Fig. 1. Access cost over uniform database (k = 20, uniform data; x-axis: m; y-axis: # of sorted accesses; curves: NRA, SNRA, HSNRA)

Fig. 2. Access cost over normal database (k = 20, normal data; x-axis: m; y-axis: # of sorted accesses; curves: NRA, SNRA, HSNRA)

4.1 Evaluation for Synthetic Data

We conducted experiments on three synthetic data sets with different distributions: uniform, normal (Gaussian), and exponential. We set n = , m = 2, 4, ..., 12, and k = 20. Figures 1-3 show that on the uniform, normal and exponential databases SNRA and Hybrid-SNRA perform fewer sorted accesses than NRA, and SNRA performs the fewest sorted accesses of the three algorithms. When m is small, the difference is not so significant, but as m becomes larger, SNRA outperforms NRA more and more significantly in terms of sorted access cost.

4.2 Evaluation for Real-World Data

In addition to synthetic data, we carried out experiments on three different real-world data sets, all downloaded from the UCI KDD Archive. The first real-world data set (CE) is the IPUMS Census Database. This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the year . It contains 88443 objects (the value of n shown in Fig. 4), and we extracted 4 to 10 attributes from it. We normalized this data set with the formula $\frac{x_i - Min}{Max - Min}$, where $x_i$ is an object's value.

The next real-world data set is the KDD Cup 1998 data set (cup98). It contains 95412 different objects (the value of n shown in Fig. 5). This is the data set used for The Second International Knowledge Discovery and Data Mining Tools Competition, which was held in conjunction with KDD-98, The Fourth International Conference on Knowledge Discovery and Data Mining. We extracted 10 attributes to perform our experiment. We normalized this data set with the same formula as the CE data.
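The normalization $\frac{x_i - Min}{Max - Min}$ maps each attribute into [0, 1], as the model of Section 2 requires. A minimal sketch for one attribute column follows; the function name `min_max_normalize` is hypothetical, and mapping a constant column to 0.0 is an assumption, since the paper does not say how that case was handled.

```python
from typing import List


def min_max_normalize(values: List[float]) -> List[float]:
    """Rescale one attribute column to [0, 1] with (x - Min) / (Max - Min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no ranking information,
        # so every value is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


if __name__ == "__main__":
    print(min_max_normalize([2.0, 4.0, 6.0]))
```

Each attribute would be normalized independently before the sorted lists are built, so that summation treats all attributes on the same scale.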

Fig. 3. Access cost over exponential database (k = 20, exponential data; x-axis: m; y-axis: # of sorted accesses; curves: NRA, SNRA, HSNRA)

Fig. 4. Access cost over IPUMS Census database (k = 20, n = 88443, CE data; x-axis: m; y-axis: # of sorted accesses; curves: NRA, SNRA, HSNRA)

The last data set, the El Nino data set, contains oceanographic and surface meteorological readings taken from a series of buoys positioned throughout the equatorial Pacific. The data is expected to aid in the understanding and prediction of El Nino/Southern Oscillation (ENSO) cycles. We removed objects with missing fields from the original data set and normalized the data with the same formula as the CE data. The remaining data contains 93935 objects (the value of n shown in Fig. 6). We chose 7 attributes to test our algorithms.

Figure 4 shows the experimental results of SNRA, Hybrid-SNRA and NRA over the IPUMS Census data set. We measured how many sorted accesses each algorithm performs for m = 4, 5, ..., 10. The results demonstrate that our algorithms are better in terms of access cost, although the degree of advantage depends on the specific database. The figure also shows that Hybrid-SNRA performs slightly more sorted accesses than SNRA under our default setting (p = 11).

Figure 5 shows the experimental results of SNRA, Hybrid-SNRA and NRA over the Cup98 data set. Here too SNRA performs significantly fewer sorted accesses. Furthermore, as m grows, the advantage of SNRA becomes increasingly obvious.

Figure 6 shows the experimental results of SNRA, Hybrid-SNRA and NRA over the El Nino data set. On this data set SNRA and Hybrid-SNRA also perform very well: our algorithm saves 50% of NRA's sorted accesses because of the selection strategy.

4.3 Summary

Our experiments illustrate that SNRA performs fewer sorted accesses than NRA on both synthetic and real-world data. Hybrid-SNRA performs slightly more sorted accesses than SNRA. As m becomes larger, the reduction in sorted accesses becomes more significant. (When k changes, we obtain similar results, which are omitted due to space limitations.)
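The difference between SNRA and Hybrid-SNRA in these results comes only from how each step chooses the lists to read (lines 5 to 9 of Algorithm Hybrid-SNRA). A small sketch of that selection rule follows; the function name `lists_to_read` and its parameters are hypothetical, with lists numbered 1 to m as in the pseudo-code.

```python
from typing import List, Set


def lists_to_read(step: int, p: int, missing: Set[int],
                  exhausted: Set[int], m: int) -> List[int]:
    """Lists read at a given step (counted from 1) of Hybrid-SNRA.

    missing:   the best competitor's missing fields
    exhausted: lists that have already been read to the bottom
    """
    if step % p == 0:
        # Every p-th step behaves like NRA: read all lists not yet exhausted.
        return [j for j in range(1, m + 1) if j not in exhausted]
    # All other steps behave like SNRA: read only the missing fields.
    return sorted(missing)


if __name__ == "__main__":
    # With p = 3, steps 1 and 2 are selective and step 3 reads every open list.
    for step in (1, 2, 3):
        print(step, lists_to_read(step, 3, {2}, set(), 3))
```

With p = 1 every step reads all open lists, which is NRA's parallel access; larger values of p make the selective steps more frequent, which is consistent with Hybrid-SNRA staying close to SNRA under the default p = 11.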

Fig. 5. Access cost over KDD Cup98 database (k = 20, n = 95412, cup98 data; x-axis: m; y-axis: # of sorted accesses; curves: NRA, SNRA, HSNRA)

Fig. 6. Access cost over El Nino database (k = 20, n = 93935, elnino data; x-axis: m; y-axis: # of sorted accesses; curves: NRA, SNRA, HSNRA)

The reason our algorithm performs fewer sorted accesses is that it selects some lists and performs only useful sorted accesses instead of sorted accessing all the lists.

5 Conclusion and Future Work

In this paper, we analyzed the NRA algorithm and gave some observations. We proposed a new algorithm, Selective-NRA, and turned it into an instance optimal algorithm, Hybrid-SNRA. Extensive experimental results on both synthetic and real-world data show that SNRA and Hybrid-SNRA perform significantly better than NRA in terms of sorted access cost. Another interesting result of our experiments is that NRA and Hybrid-SNRA are instance optimal, yet on common data sets they perform more sorted accesses than SNRA, which is not instance optimal. This is an issue for our further research. In the future, we will also consider the running time of SNRA compared with NRA, and we will design techniques to reduce the running time of SNRA.

Acknowledgements. This work is supported by the National Science Foundation of China under the grant No. and No. . This work is also supported by the Science Research Fund of MOE-Microsoft Key Laboratory of Multimedia Computing and Communication (Grant No. ).

References

1. Fagin, R., Lotem, A., Naor, M.: Optimal Aggregation Algorithms for Middleware. In: Proceedings of the 20th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (2001)
2. Güntzer, U., Balke, W.T., Kießling, W.: Towards Efficient Multi-Feature Queries in Heterogeneous Environments. In: Proceedings of the IEEE International Conference on Information Technology: Coding and Computing (2001)
3. Fagin, R.: Combining Fuzzy Information: an Overview. SIGMOD Record 31(2) (2002)
4. Theobald, M., Weikum, G., Schenkel, R.: Top-k Query Evaluation with Probabilistic Guarantees. In: Proceedings of the 30th International Conference on Very Large Data Bases (2004)
5. Salton, G.: Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley, Reading (1989)
6. Getoor, L., Diehl, C.P.: Link Mining: a Survey. SIGKDD Explorations 7(2), 3-12 (2005)
7. Long, X., Suel, T.: Three-Level Caching for Efficient Query Processing in Large Web Search Engines. In: Proceedings of the 14th International Conference on World Wide Web (2005)
8. Fagin, R.: Combining Fuzzy Information from Multiple Systems. J. Comput. Syst. Sci. 58(1) (1999)
9. Nepal, S., Ramakrishna, M.V.: Query Processing Issues in Image (Multimedia) Databases. In: Proceedings of the 15th International Conference on Data Engineering (1999)
10. Mamoulis, N., Yiu, M.H., Cheng, K.H., Cheung, D.W.: Efficient Top-k Aggregation of Ranked Inputs. ACM Transactions on Database Systems (TODS) 32(3), 19 (2007)
