Selective-NRA Algorithms for Top-k Queries
Jing Yuan, Guang-Zhong Sun, Ye Tian, Guoliang Chen, and Zhi Liu

MOE-MS Key Laboratory of Multimedia Computing and Communication, Department of Computer Science and Technology, University of Science and Technology of China, Hefei, P.R. China

Q. Li et al. (Eds.): APWeb/WAIM 2009, LNCS 5446. © Springer-Verlag Berlin Heidelberg 2009

Abstract. Efficient processing of top-k queries has become a classical research area because of its many applications. Fagin et al. proposed the middleware cost model for top-k query algorithms. For databases in which random access is impossible, Fagin et al. proposed the NRA (No Random Access) algorithm. In this paper, we present some key observations about NRA. Based on them, we propose a new algorithm called Selective-NRA (SNRA), which is designed to minimize the useless accesses of a top-k query. However, we prove that SNRA is not instance optimal in Fagin's sense, and we therefore also propose an instance optimal algorithm, Hybrid-SNRA, based on SNRA. We conducted extensive experiments on both synthetic and real-world data. The experiments show that SNRA and Hybrid-SNRA have lower access cost than NRA. For some instances, SNRA performs 50% fewer accesses than NRA.

1 Introduction

Assume there is a huge number of objects and every object has m attributes. For each attribute the object has a score, and these scores can be aggregated into a total score by an aggregation function. We want to know which k objects have the largest total scores. This scenario is generalized as the top-k query. The top-k query has become a classical research area since it has many applications, such as information retrieval [5],[7], multimedia databases [2],[9], and data mining [6].

In top-k queries, an object's score can be obtained by random access or by sorted access. A random access returns the score of a chosen object for a given attribute in one step. A sorted access proceeds through an attribute list sequentially from the top, i.e., if some object is the l-th largest object in the i-th list, we need l sorted accesses to the i-th list to obtain that object's score. As stated in [8], if there are s sorted accesses and r random accesses, the total access cost is $s C_S + r C_R$, where $C_S$ is the cost of a single sorted access and $C_R$ is the cost of a single random access.

In some cases, random access is forbidden or restricted [1]. For example, a typical web search engine has no way to return the score of a document of our choice under a query. On the assumption that random access is not supported by the database, the authors of [1] proposed the No Random Access algorithm (NRA).

In this paper we first present some observations about the NRA algorithm and propose the Selective-NRA (SNRA) algorithm, which performs significantly better than NRA in terms of access cost and bookkeeping. Secondly, we turn SNRA into an instance optimal algorithm, which we call Hybrid-SNRA. Thirdly, we carry out extensive experiments to compare SNRA and Hybrid-SNRA with NRA on both synthetic and real-world data sets. The results demonstrate that our algorithms have lower access cost than NRA.

The rest of this paper is organized as follows. In Section 2, we define the problem and review related work. In Section 3, we introduce the SNRA and Hybrid-SNRA algorithms. In Section 4, we show the experimental results. Finally, in Section 5, we conclude the paper.

2 Problem Definition and Related Work

Our model can be described as follows. Assume there are m lists and n objects, and the aggregation function t has m variables. For a given object R and list i, R has a score $x_i$ with $0 \le x_i \le 1$. R has a total score of $t(x_1, x_2, \ldots, x_m)$. We denote the lists by $L_1, L_2, \ldots, L_m$; they are sorted lists and cannot be accessed randomly. We refer to $L_i$ as list i. Each entry of $L_i$ is $(R, x_i)$, where $x_i$ is the i-th field of R. We assume every object has an exact score in every list, so the length of each list is n. Since we consider only sorted access, the access cost of an algorithm is $r C_S$ if r sorted accesses are performed.

Several algorithms satisfy the assumption of no random access. The two best known are the Stream-Combine algorithm of Güntzer et al. [2] and the No Random Access algorithm of Fagin et al. [1].
As mentioned in [3], the Stream-Combine algorithm considers only upper bounds on the overall grades of objects, and cannot say that an object is in the top-k unless that object has been seen in every sorted list. In this sense, NRA is better than Stream-Combine. In [1], the authors proved that algorithm NRA is instance optimal with optimality ratio m, and that no deterministic algorithm has a lower optimality ratio. (The definitions of instance optimality and optimality ratio appear in [1].) In [10], the authors studied the NRA algorithm and proposed an algorithm which they call LARA. In terms of run time cost, LARA is significantly faster than NRA. However, in terms of access cost, the advantage of LARA is sometimes marginal. Theobald et al. presented several probabilistic algorithms [4] which are also variants of NRA.

The basic idea of NRA is to bound an object's exact value from above (best value $B^{(d)}(R)$) and from below (worst value $W^{(d)}(R)$). The details of algorithm NRA are given below. Unless specified otherwise, we use the same notation as NRA (such as $W^{(d)}(R)$, $T_k^{(d)}$, $M_k^{(d)}$).
Algorithm NRA (by Fagin et al.) [1]

Do sorted access in parallel to $L_1, L_2, \ldots, L_m$. At each depth d (when d objects have been accessed under sorted access in each list) do:

Maintain the bottom values $(x_1^{(d)}, x_2^{(d)}, \ldots, x_m^{(d)})$ encountered in each list.

For every object R with discovered fields $S = S^{(d)}(R) = \{i_1, i_2, \ldots, i_l\} \subseteq \{1, \ldots, m\}$ with values $x_{i_1}, x_{i_2}, \ldots, x_{i_l}$, compute

$$W^{(d)}(R) = W_S(R) = t(w_1, w_2, \ldots, w_m), \quad w_i = \begin{cases} x_i & \text{if } i \in S \\ 0 & \text{otherwise} \end{cases}$$

and

$$B^{(d)}(R) = B_S(R) = t(b_1, b_2, \ldots, b_m), \quad b_i = \begin{cases} x_i & \text{if } i \in S \\ x_i^{(d)} & \text{otherwise.} \end{cases}$$

Let $T_k^{(d)}$, the current top-k list, contain the k objects with the largest $W^{(d)}$ values seen so far (and their grades); if two objects have the same $W^{(d)}$ value, then ties are broken using the $B^{(d)}$ values, such that the object with the highest $B^{(d)}$ value wins (and arbitrarily among objects that tie for the highest $B^{(d)}$ value). Let $M_k^{(d)}$ be the k-th largest $W^{(d)}$ value in $T_k^{(d)}$.

Call an object R viable if $B^{(d)}(R) > M_k^{(d)}$. Halt when (a) at least k distinct objects have been seen (so that in particular $T_k^{(d)}$ contains k objects) and (b) there are no viable objects left outside $T_k^{(d)}$, that is, when $B^{(d)}(R) \le M_k^{(d)}$ for all $R \notin T_k^{(d)}$. Return the objects in $T_k^{(d)}$.

3 Selective-NRA Algorithms

In this section we propose our Selective-NRA algorithms. We first give an equivalent form of NRA's stopping rule and introduce some lemmas and observations which motivate algorithm SNRA. Secondly, we present algorithm SNRA and prove its correctness. Finally, we propose an instance optimal algorithm based on SNRA.

3.1 Observations of NRA Algorithm

Definition 1. Call an object R the best competitor if R has the largest best value ($B^{(d)}(R)$) among all viable objects (see footnote 1) which are not in the current top-k. If there is more than one such object, choose the one that was seen earliest (sorted accessed earliest) as the best competitor.
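The NRA procedure restated above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation (which the paper says was written in C++): the function name `nra`, the dictionary-based bookkeeping, and the explicit bound for never-seen objects are assumptions made here. The data are the three sorted lists of Table 2 from Example 1.

```python
# Sorted lists from Table 2 (Example 1): each list holds (object, score) pairs.
TABLE2 = [
    [("R1", 0.9), ("R2", 0.9), ("R3", 0.6), ("R4", 0.5), ("R5", 0.2)],  # L1
    [("R2", 0.9), ("R1", 0.8), ("R3", 0.7), ("R4", 0.6), ("R5", 0.2)],  # L2
    [("R1", 0.6), ("R4", 0.6), ("R3", 0.4), ("R2", 0.3), ("R5", 0.3)],  # L3
]

def nra(lists, k, agg=sum):
    """Sketch of NRA: returns (top-k objects, number of sorted accesses)."""
    m, n = len(lists), len(lists[0])
    seen = {}                       # object -> {list index: discovered score}
    accesses = 0
    top = []
    for depth in range(n):
        # One sorted access per list at the current depth.
        for j in range(m):
            obj, score = lists[j][depth]
            seen.setdefault(obj, {})[j] = score
            accesses += 1
        bottom = [lists[j][depth][1] for j in range(m)]

        def worst(fields):          # W: missing fields count as 0
            return agg([fields.get(j, 0.0) for j in range(m)])

        def best(fields):           # B: missing fields take the bottom value
            return agg([fields.get(j, bottom[j]) for j in range(m)])

        ranked = sorted(seen, key=lambda o: (worst(seen[o]), best(seen[o])),
                        reverse=True)
        top, rest = ranked[:k], ranked[k:]
        if len(top) < k:
            continue                # rule (a): fewer than k objects seen
        threshold = worst(seen[top[-1]])          # M: k-th largest W
        bounds = [best(seen[o]) for o in rest]
        if len(seen) < n:
            bounds.append(agg(bottom))            # bound for unseen objects
        if all(b <= threshold for b in bounds):   # rule (b): nothing viable
            return top, accesses
    return top, accesses

print(nra(TABLE2, k=1))   # (['R1'], 9)
```

On this database the sketch stops at depth 3 after 9 sorted accesses, which matches the count reported for NRA in Example 1.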
The stopping rule (b) of NRA is: there are no viable objects left outside $T_k^{(d)}$, that is, $B^{(d)}(R) \le M_k^{(d)}$ for all $R \notin T_k^{(d)}$. This is equivalent to: the best competitor's best value is at most $M_k^{(d)}$. We now give two lemmas which motivate algorithm SNRA. Throughout this paper, we assume $m \ge 2$.

Footnote 1: Viable objects and best values are defined in the previous section.
Lemma 1. If at depth d object R is the best competitor and algorithm NRA does not halt at depth d, then R has at least one missing (undiscovered) field.

Proof. Assume that at depth d all fields of R are known. Then $W^{(d)}(R) = B^{(d)}(R)$. Since R has the largest best value, the best value of every viable object outside $T_k^{(d)}$ is at most $B^{(d)}(R)$. Since algorithm NRA does not halt at depth d, we have $B^{(d)}(R) > M_k^{(d)}$, so there is some object $R' \in T_k^{(d)}$ such that $B^{(d)}(R) > W^{(d)}(R')$. It follows that R should be in $T_k^{(d)}$, because $B^{(d)}(R) = W^{(d)}(R) > W^{(d)}(R')$. This is a contradiction, so the conclusion follows, as desired.

Lemma 2. If at depth d object R is the best competitor and algorithm NRA does not halt at depth d, then $B^{(d)}(R)$ decreases at depth d+1 only if we sorted access one of R's missing fields.

Proof. If we sorted access the i-th list and i is not a missing field of R, then R's best value does not decrease, because R's best value is computed by substituting R's real value for each discovered field and the bottom value for each missing field. So R's best value decreases only if we sorted access one of R's missing fields. (R's best value may not decrease if the value at the next level equals the value at this level. The condition is necessary but not sufficient.)

Another Observation: The best competitor changes while the algorithm runs. However, in our experiments we found that the last best competitor (the last one before algorithm NRA terminates) occupies the position of best competitor for a long time. We speculate that the results are similar for common data sets. Table 1 gives an experimental result for a synthetic uniform data set. We set n = 100000, m = 2, 4, ..., 12, and k = 20. The aggregation function is summation. Suppose that the last best competitor has only one missing field while algorithm NRA is running. In this case, NRA still performs sorted accesses on the lists of all of the best competitor's known fields.
This incurs many useless sorted accesses if the last best competitor persists for a long time before the top-k objects are obtained. These observations motivate the Selective-NRA algorithm.

Table 1. Accessed depth for the last best competitor vs. total accessed depth. Columns: m; depth of the last best competitor; total depth of NRA.

3.2 Selective-NRA Algorithm (SNRA)

We give our algorithm in pseudo-code. As stated in Section 3.1, algorithm NRA performs some sorted accesses which will definitely not reduce the best competitor's best value. Our approach avoids these sorted accesses.

Algorithm SNRA-Initialize
 1: bottom[j] := 1.0, j = 1, 2, ..., m
 2: top := [dummy_1, dummy_2, ..., dummy_k], where each dummy_i has best := worst := 0 and missingfield := ∅
 3: for each R_i, i = 1, 2, ..., n do
 4:     R_i.best := m
 5:     R_i.worst := 0
 6:     R_i.missingfield := [1, 2, ..., m]
 7: end for
 8: candidates := ∅
 9: bestcompetitor := dummy with missingfield = [1, 2, ..., m]
10: best := m
11: minobj := dummy_1, with min := 0    // minobj: the object in top whose worst value is min

Algorithm Selective-NRA
 1: Call Algorithm SNRA-Initialize
 2: while (min < best) or (fewer than k objects have been accessed) do
 3:     for each j ∈ bestcompetitor.missingfield do
 4:         sorted access L_j, obtain (p, x_j)
 5:         if (p has not been seen before) then
 6:             candidates := candidates ∪ {p}
 7:         end if
 8:         bottom[j] := x_j    // used for updating an object's best value
 9:         p.missingfield := p.missingfield − {j}
10:         update p.worst
11:         min := min{d.worst | d ∈ top}    // our min is M_k^(d) in NRA
12:         minobj := argmin_d{d.worst | d ∈ top}
13:         if (p.worst > min) and (p ∈ candidates) then
14:             candidates := candidates − {p}
15:             candidates := candidates ∪ {minobj}
16:             top := top − {minobj}
17:             top := top ∪ {p}
18:         end if
19:     end for
20:     for each R ∈ (candidates ∪ top) do
21:         update R.best
22:         if (R.best ≤ min) then
23:             candidates := candidates − {R}
24:         end if
25:     end for
26:     bestcompetitor := argmax_R{R.best | R ∈ candidates}
27:     best := max_R{R.best | R ∈ candidates}
28: end while

To prove the correctness of algorithm SNRA, we first need a lemma. This lemma shows that algorithm SNRA does not enter an infinite loop.
Lemma 3. Assume algorithm SNRA has sorted accessed $L_j$ to depth $d_j$ (j = 1, 2, ..., m) and the stopping rule has not been satisfied. Then algorithm SNRA proceeds to access the next object in at least one list.

Proof. Let d be the array $[d_1, d_2, \ldots, d_m]$. In the rest of the paper, depth d means depth $d_j$ in $L_j$. We only need to prove that at depth d the best competitor has at least one missing field. The remaining part of the proof is the same as that of Lemma 1.

Theorem 1. If the aggregation function is monotone, then algorithm SNRA correctly finds the top-k objects.

Proof. According to Lemma 3, if algorithm SNRA does not halt, it proceeds to access the next level in at least one list until the stopping rule is satisfied. Assume that algorithm SNRA halts after $d_j$ sorted accesses to $L_j$ (j = 1, 2, ..., m) and that the objects output by SNRA are $R_1, R_2, \ldots, R_k$. Let R be an object not among $R_1, R_2, \ldots, R_k$. We must show that $t(R) \le t(R_i)$ for each i = 1, 2, ..., k. Let $d = [d_1, d_2, \ldots, d_m]$. Since algorithm SNRA halts at depth d, the best competitor's best value is at most $M_k^{(d)}$, so $B^{(d)}(R) \le B^{(d)}(\text{best competitor}) \le M_k^{(d)}$, and $t(R) \le B^{(d)}(R)$. Also, for each of the k objects $R_i$ we have $M_k^{(d)} \le W^{(d)}(R_i) \le t(R_i)$. Combining these inequalities, we have $t(R) \le B^{(d)}(R) \le M_k^{(d)} \le W^{(d)}(R_i) \le t(R_i)$ for each i, as desired.

Besides accessing fewer objects, our algorithm requires less bookkeeping than algorithm NRA. In its second step, NRA updates the worst values and best values of all seen objects. Our algorithm updates an object's worst value only when that object is sorted accessed. This is sound because the worst values of the other objects do not change if they are not accessed. We now give an example to show how algorithm SNRA works.

Example 1. Assume m = 3, n = 5, k = 1, the aggregation function is summation, and the lists shown in Table 2 can only be sorted accessed.
Tables 3-6 show how algorithm SNRA performs sorted accesses at each step on this database. At step (a) SNRA sorted accesses all 3 lists. After that, object $R_1$ becomes the top-1 object and $R_2$ becomes the best competitor. At step (b) we sorted access only $L_1$ and $L_3$, since $L_2$ is not one of $R_2$'s missing fields. Now $R_1$ is the best competitor and $R_2$ is the top-1 object, with worst value 1.8. At step (c) $L_2$ is sorted accessed. $R_1$ becomes the top-1 object again and $R_2$ becomes the best competitor. At this time, $R_2$'s missing field is $L_3$, so at step (d) we sorted access $L_3$. Now the best competitor is $R_4$, whose best value is not more than the top-1 object's worst value. The algorithm terminates. Table 7 shows how the top-1 object and the best competitor are updated at each step. On this database, algorithm NRA sorted accesses each list to depth 3 and performs 9 sorted accesses in total, while the sorted access cost of SNRA is 7.
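The walk-through of Example 1 can be replayed with a short Python sketch of the selective strategy. It is a simplified reading of SNRA, not the paper's bookkeeping: it recomputes all bounds at every step instead of maintaining `candidates` incrementally, and it represents any still-unseen object by an empty record with all fields missing. The names `snra`, `rival` and `trace` are hypothetical.

```python
# Sorted lists from Table 2: each list holds (object, score) pairs.
TABLE2 = [
    [("R1", 0.9), ("R2", 0.9), ("R3", 0.6), ("R4", 0.5), ("R5", 0.2)],  # L1
    [("R2", 0.9), ("R1", 0.8), ("R3", 0.7), ("R4", 0.6), ("R5", 0.2)],  # L2
    [("R1", 0.6), ("R4", 0.6), ("R3", 0.4), ("R2", 0.3), ("R5", 0.3)],  # L3
]

def snra(lists, k, agg=sum):
    """Sketch of SNRA: returns (top-k objects, lists read at each step)."""
    m, n = len(lists), len(lists[0])
    pos = [0] * m          # next unread position in each list
    bottom = [1.0] * m     # last score read from each list (scores lie in [0, 1])
    seen = {}              # object -> known fields; dict order = first-seen order
    trace = []             # 1-based list numbers read at each step

    def worst(fields):     # missing fields count as 0
        return agg([fields.get(j, 0.0) for j in range(m)])

    def best(fields):      # missing fields take the current bottom value
        return agg([fields.get(j, bottom[j]) for j in range(m)])

    while True:
        ranked = sorted(seen, key=lambda o: (worst(seen[o]), best(seen[o])),
                        reverse=True)
        top = ranked[:k]
        threshold = worst(seen[top[-1]]) if len(top) == k else float("-inf")
        # Competitors outside the top-k, earliest seen first.
        rivals = [seen[o] for o in seen if o not in top]
        if len(seen) < n:
            rivals.append({})              # stands for any unseen object
        if not rivals:
            return top, trace
        rival = max(rivals, key=best)      # first maximum = earliest seen
        if best(rival) <= threshold:       # stopping rule
            return top, trace
        # Read only the lists in which the best competitor is still unknown.
        step = [j for j in range(m) if j not in rival and pos[j] < n]
        if not step:
            return top, trace
        for j in step:
            obj, score = lists[j][pos[j]]
            pos[j] += 1
            bottom[j] = score
            seen.setdefault(obj, {})[j] = score
        trace.append([j + 1 for j in step])

top, trace = snra(TABLE2, k=1)
print(top, trace)   # ['R1'] [[1, 2, 3], [1, 3], [2], [3]]
```

The trace lists the lists read at steps (a) to (d); its lengths add up to the 7 sorted accesses stated in the example.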
Table 2. Sorted lists

L1           L2           L3
(R1, 0.9)    (R2, 0.9)    (R1, 0.6)
(R2, 0.9)    (R1, 0.8)    (R4, 0.6)
(R3, 0.6)    (R3, 0.7)    (R3, 0.4)
(R4, 0.5)    (R4, 0.6)    (R2, 0.3)
(R5, 0.2)    (R5, 0.2)    (R5, 0.3)

Tables 3-6. SNRA(a)-(d): the entries of Table 2 read after each step, given as the depth reached in each list

Step    L1    L2    L3
(a)     1     1     1
(b)     2     1     2
(c)     2     2     2
(d)     2     2     3

Table 7. Each step of SNRA

Step    top 1 object    worst    best competitor    best
(a)     R1              1.5      R2                 2.4
(b)     R2              1.8      R1                 2.4
(c)     R1              2.3      R2                 2.4
(d)     R1              2.3      R4                 2.3

3.3 Turning SNRA into an Instance Optimal Algorithm

Fagin et al. defined instance optimality for top-k query algorithms and proved that algorithm NRA is instance optimal. Unfortunately, our algorithm SNRA is not instance optimal in their sense. In this section we first give an example which demonstrates that SNRA is not instance optimal, and then modify SNRA to turn it into an instance optimal algorithm.

Example 2. Assume m = 2, k = 1, and the aggregation function is summation. Table 8 shows a database over which algorithm NRA performs only 6 sorted accesses, to depth 3, and outputs $R_1$ as the top-1 object, while algorithm SNRA sorted accesses the whole list $L_1$ and performs n + 3 sorted accesses in total.
(After depth [1, 1], $R_2$ is the best competitor, so we sorted access to depth [2, 1]. $R_2$ is still the best competitor, so we sorted access to depth [3, 1]. At this depth $R_3$ is the best competitor, so we sorted access to depth [3, 2]. $R_2$ becomes the best competitor again, since $R_3$'s exact value is less than $R_2$'s worst value, and $R_2$ remains the best competitor until the end of $L_1$, through depth [n-1, 2]. After depth [n, 2], $R_2$ becomes the top-1 object and $R_1$ becomes the best competitor. We then sorted access $L_2$; at depth [n, 3] $R_1$ becomes the top-1 object and the algorithm terminates.) Since n can be arbitrarily large, algorithm SNRA is not instance optimal.

The reason SNRA is not instance optimal is that SNRA selects some of the lists, instead of all the lists, for sorted access. By making this selection, SNRA may
perform fewer sorted accesses than algorithm NRA on most databases, but it may miss important information on particular databases such as the one in Example 2. Nevertheless, we can turn SNRA into an instance optimal algorithm with a small modification. We call the modified algorithm Hybrid-SNRA and give it below.

Table 8. An example showing that SNRA is not instance optimal

L1           L2
(R1, 1.0)    (R2, 0.9)
(R3, 0.5)    (R3, 0.39)
(R4, 0.45)   (R1, 0.38)
(R5, 0.32)   (R5, 0.32)
...          ...
(Rn, 0.12)   (Rn, 0.12)
(R2, 0.11)   (R4, 0.11)

Algorithm Hybrid-SNRA
 1: Call Algorithm SNRA-Initialize
 2: step := 0
 3: while (min < best) or (fewer than k objects have been accessed) do
 4:     step++
 5:     if (step mod p = 0) then
 6:         field := [1, 2, ..., m] − {the lists which have been accessed to the bottom}
 7:     else
 8:         field := bestcompetitor.missingfield
 9:     end if
10:     for each j ∈ field do
11:         ...    // the same as SNRA from line 4 to line 18
12:     end for
13:     ...    // the same as SNRA from line 20 to line 27
14: end while

We note that the sorted access cost of Hybrid-SNRA is at most p times that of algorithm NRA, where p is a constant. So algorithm Hybrid-SNRA is instance optimal. Since the optimality ratio of algorithm NRA is m, the optimality ratio of algorithm Hybrid-SNRA is pm under some natural assumptions [1]. (In fact, if p = 1, Hybrid-SNRA is equivalent to NRA, and if p is infinite, Hybrid-SNRA becomes SNRA.)

4 Experiment Results

Our algorithms were implemented in C++. We performed our experiments on an AMD 1.9 GHz PC with 2 GB of memory. In our experiments we used summation
as the aggregation function, which is the most common one. For Hybrid-SNRA we set p = 11 by default. We used both synthetic and real-world data to evaluate the SNRA, Hybrid-SNRA and NRA algorithms.

Fig. 1. Access cost over uniform database (k = 20; x-axis: m; y-axis: number of sorted accesses; curves: NRA, SNRA, HSNRA)

Fig. 2. Access cost over normal database (k = 20; x-axis: m; y-axis: number of sorted accesses; curves: NRA, SNRA, HSNRA)

4.1 Evaluation for Synthetic Data

We conducted experiments on three synthetic data sets with different distributions: uniform, normal (Gaussian) and exponential. We fixed n and set m = 2, 4, ..., 12 and k = 20. Figures 1-3 show that on the uniform, normal and exponential databases both SNRA and Hybrid-SNRA perform fewer sorted accesses than algorithm NRA, and that SNRA performs the fewest sorted accesses of the three algorithms. When m is small the difference is modest, but as m becomes larger SNRA outperforms NRA by an increasing margin in terms of sorted access cost.

4.2 Evaluation for Real-World Data

In addition to synthetic data, we carried out experiments on three real-world data sets, all downloaded from the UCI KDD Archive. The first real-world data set (CE) is the IPUMS Census Database. It contains unweighted PUMS census data from the Los Angeles and Long Beach areas. It contains 88443 objects (the value of n given in Fig. 4), and we extracted 4 to 10 attributes from it. We normalized this data set with the formula $(x_i - \mathit{Min}) / (\mathit{Max} - \mathit{Min})$, where $x_i$ is an object's value. The next real-world data set is the KDD Cup 1998 data set (cup98). It contains 95412 different objects (the value of n given in Fig. 5). This is the data set used for The Second International Knowledge Discovery and Data Mining Tools Competition, which was held in conjunction with KDD-98, The Fourth International Conference on Knowledge Discovery and Data Mining.
We extracted 10 attributes to perform our experiment. We normalized this data set with the same formula as the CE data.
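The normalization applied to each attribute, $(x_i - \mathit{Min}) / (\mathit{Max} - \mathit{Min})$, maps raw values into [0, 1] as the model in Section 2 requires. A minimal Python sketch follows; the function name `min_max_normalize` is hypothetical, and mapping a constant column to 0.0 is an assumption made here, since the paper does not say how that case was handled.

```python
def min_max_normalize(values):
    """Rescale one attribute column to [0, 1] with (x - Min) / (Max - Min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no ranking information,
        # so every value is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

print(min_max_normalize([2, 4, 6]))   # [0.0, 0.5, 1.0]
```

After normalization each column can be sorted in descending order to form one of the lists $L_i$.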
Fig. 3. Access cost over exponential database (k = 20; x-axis: m; y-axis: number of sorted accesses; curves: NRA, SNRA, HSNRA)

Fig. 4. Access cost over IPUMS Census database (k = 20, n = 88443; x-axis: m; y-axis: number of sorted accesses; curves: NRA, SNRA, HSNRA)

The last data set, the El Nino data set, contains oceanographic and surface meteorological readings taken from a series of buoys positioned throughout the equatorial Pacific. The data is expected to aid in the understanding and prediction of El Nino/Southern Oscillation (ENSO) cycles. We removed the objects which had missing fields from the original data set. We normalized the data set with the same formula as the CE data. The remaining data contains 93935 objects (the value of n given in Fig. 6). We chose 7 attributes to test our algorithms.

Figure 4 shows the experimental result of SNRA, Hybrid-SNRA and NRA over the IPUMS Census data set. We measured how many sorted accesses SNRA, Hybrid-SNRA and NRA perform for m = 4, 5, ..., 10. The result demonstrates that our algorithms have lower access cost, although the size of the advantage depends on the specific database. In addition, this figure shows that Hybrid-SNRA performs slightly more sorted accesses than SNRA under our default setting (p = 11).

Figure 5 shows the experimental result of SNRA, Hybrid-SNRA and NRA over the cup98 data set. In this figure, too, SNRA clearly performs fewer sorted accesses. Furthermore, as m grows, the advantage of SNRA becomes more pronounced.

Figure 6 shows the experimental result of SNRA, Hybrid-SNRA and NRA over the El Nino data set. On this data set SNRA and Hybrid-SNRA also perform very well. Our algorithm saves 50% of NRA's sorted accesses because of the selection strategy.

4.3 Summary

Our experiments illustrate that algorithm SNRA performs fewer sorted accesses than algorithm NRA, on both synthetic and real-world data.
Algorithm Hybrid-SNRA performs slightly more sorted accesses than algorithm SNRA. As m becomes larger, the reduction in sorted accesses becomes more significant. (When k changes, we obtain similar results, which are omitted due to space limitations.)
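A comparison of this kind can be sketched on a small scale in Python. The code below is an illustrative assumption, not the authors' C++ implementation: `hybrid_snra` follows the schedule of Hybrid-SNRA (every p-th step reads all lists, other steps read only the best competitor's missing lists), `uniform_lists` is a hypothetical generator, and the sizes n = 300, m = 3, k = 5 are chosen only to keep the run short. With p = 1 every step reads all lists, as NRA does; with p = None the sketch reduces to SNRA.

```python
import random

def hybrid_snra(lists, k, p=None, agg=sum):
    """Sketch of Hybrid-SNRA: returns (top-k objects, sorted accesses)."""
    m, n = len(lists), len(lists[0])
    pos, bottom = [0] * m, [1.0] * m
    seen, accesses, step_no = {}, 0, 0

    def worst(fields):     # missing fields count as 0
        return agg([fields.get(j, 0.0) for j in range(m)])

    def best(fields):      # missing fields take the current bottom value
        return agg([fields.get(j, bottom[j]) for j in range(m)])

    while True:
        ranked = sorted(seen, key=lambda o: (worst(seen[o]), best(seen[o])),
                        reverse=True)
        top = ranked[:k]
        threshold = worst(seen[top[-1]]) if len(top) == k else float("-inf")
        in_top = set(top)
        rivals = [seen[o] for o in seen if o not in in_top]
        if len(seen) < n:
            rivals.append({})              # stands for any unseen object
        if not rivals:
            return top, accesses
        rival = max(rivals, key=best)      # best competitor
        if best(rival) <= threshold:       # stopping rule
            return top, accesses
        step_no += 1
        if p is not None and step_no % p == 0:
            fields = [j for j in range(m) if pos[j] < n]      # NRA-style step
        else:
            fields = [j for j in range(m) if j not in rival and pos[j] < n]
        if not fields:
            return top, accesses
        for j in fields:
            obj, score = lists[j][pos[j]]
            pos[j] += 1
            bottom[j] = score
            seen.setdefault(obj, {})[j] = score
            accesses += 1

def uniform_lists(n, m, rng):
    """Hypothetical generator: m sorted lists of n uniform scores in [0, 1)."""
    lists = []
    for _ in range(m):
        column = [(f"R{i}", rng.random()) for i in range(n)]
        lists.append(sorted(column, key=lambda e: e[1], reverse=True))
    return lists

data = uniform_lists(n=300, m=3, rng=random.Random(0))
for p in (1, 11, None):
    top, cost = hybrid_snra(data, k=5, p=p)
    print("p =", p, "sorted accesses =", cost)
```

The printed counts depend on the random data and are not meant to reproduce the figures; the sketch only shows how the parameter p moves the schedule between the NRA-like and SNRA-like extremes.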
Fig. 5. Access cost over KDD Cup98 database (k = 20, n = 95412; x-axis: m; y-axis: number of sorted accesses; curves: NRA, SNRA, HSNRA)

Fig. 6. Access cost over El Nino database (k = 20, n = 93935; x-axis: m; y-axis: number of sorted accesses; curves: NRA, SNRA, HSNRA)

The reason why our algorithm performs fewer sorted accesses is that it selects some of the lists and performs useful sorted accesses, instead of sorted accessing all the lists.

5 Conclusion and Future Work

In this paper, we analyzed algorithm NRA and gave some observations. We proposed a new algorithm, Selective-NRA, and turned it into an instance optimal algorithm, Hybrid-SNRA. Extensive experimental results on both synthetic and real-world data show that our algorithms SNRA and Hybrid-SNRA perform significantly better than NRA in terms of sorted access cost. Another interesting result of our experiments is that although algorithms NRA and Hybrid-SNRA are instance optimal, they perform more sorted accesses than the non-instance-optimal algorithm SNRA on common data sets. This is an issue for our further research. In the future, we will also compare the run time cost of SNRA with that of NRA, and we will design techniques to lower the run time cost of SNRA.

Acknowledgements. This work is supported by the National Science Foundation of China. This work is also supported by the Science Research Fund of MOE-Microsoft Key Laboratory of Multimedia Computing and Communication.

References

1. Fagin, R., Lotem, A., Naor, M.: Optimal Aggregation Algorithms for Middleware. In: Proceedings of the 20th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (2001)
2. Güntzer, U., Balke, W.-T., Kießling, W.: Towards Efficient Multi-Feature Queries in Heterogeneous Environments. In: Proceedings of the IEEE International Conference on Information Technology: Coding and Computing (2001)
3. Fagin, R.: Combining Fuzzy Information: an Overview. SIGMOD Record 31(2) (2002)
4. Theobald, M., Weikum, G., Schenkel, R.: Top-k Query Evaluation with Probabilistic Guarantees. In: Proceedings of the 30th International Conference on Very Large Data Bases (2004)
5. Salton, G.: Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley, Reading (1989)
6. Getoor, L., Diehl, C.P.: Link Mining: a Survey. SIGKDD Explorations 7(2), 3-12 (2005)
7. Long, X., Suel, T.: Three-Level Caching for Efficient Query Processing in Large Web Search Engines. In: Proceedings of the 14th International Conference on World Wide Web (2005)
8. Fagin, R.: Combining Fuzzy Information from Multiple Systems. J. Comput. Syst. Sci. 58(1) (1999)
9. Nepal, S., Ramakrishna, M.V.: Query Processing Issues in Image (Multimedia) Databases. In: Proceedings of the 15th International Conference on Data Engineering (1999)
10. Mamoulis, N., Yiu, M.L., Cheng, K.H., Cheung, D.W.: Efficient Top-k Aggregation of Ranked Inputs. ACM Transactions on Database Systems (TODS) 32(3), 19 (2007)
More informationOn the Security of Stream Cipher CryptMT v3
On the Security of Stream Cipher CryptMT v3 Haina Zhang 1, and Xiaoyun Wang 1,2 1 Key Laboratory of Cryptologic Technology and Information Security, Ministry of Education, Shandong University, Jinan 250100,
More informationEFFICIENT ATTRIBUTE REDUCTION ALGORITHM
EFFICIENT ATTRIBUTE REDUCTION ALGORITHM Zhongzhi Shi, Shaohui Liu, Zheng Zheng Institute Of Computing Technology,Chinese Academy of Sciences, Beijing, China Abstract: Key words: Efficiency of algorithms
More informationA Distribution-Sensitive Dictionary with Low Space Overhead
A Distribution-Sensitive Dictionary with Low Space Overhead Prosenjit Bose, John Howat, and Pat Morin School of Computer Science, Carleton University 1125 Colonel By Dr., Ottawa, Ontario, CANADA, K1S 5B6
More informationOn Multiple Query Optimization in Data Mining
On Multiple Query Optimization in Data Mining Marek Wojciechowski, Maciej Zakrzewicz Poznan University of Technology Institute of Computing Science ul. Piotrowo 3a, 60-965 Poznan, Poland {marek,mzakrz}@cs.put.poznan.pl
More informationIntroduction to Algorithms / Algorithms I Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/18/14
600.363 Introduction to Algorithms / 600.463 Algorithms I Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/18/14 23.1 Introduction We spent last week proving that for certain problems,
More informationAn Efficient Clustering Method for k-anonymization
An Efficient Clustering Method for -Anonymization Jun-Lin Lin Department of Information Management Yuan Ze University Chung-Li, Taiwan jun@saturn.yzu.edu.tw Meng-Cheng Wei Department of Information Management
More informationNDoT: Nearest Neighbor Distance Based Outlier Detection Technique
NDoT: Nearest Neighbor Distance Based Outlier Detection Technique Neminath Hubballi 1, Bidyut Kr. Patra 2, and Sukumar Nandi 1 1 Department of Computer Science & Engineering, Indian Institute of Technology
More informationFormal Model. Figure 1: The target concept T is a subset of the concept S = [0, 1]. The search agent needs to search S for a point in T.
Although this paper analyzes shaping with respect to its benefits on search problems, the reader should recognize that shaping is often intimately related to reinforcement learning. The objective in reinforcement
More informationBest Keyword Cover Search
Vennapusa Mahesh Kumar Reddy Dept of CSE, Benaiah Institute of Technology and Science. Best Keyword Cover Search Sudhakar Babu Pendhurthi Assistant Professor, Benaiah Institute of Technology and Science.
More informationSelecting Topics for Web Resource Discovery: Efficiency Issues in a Database Approach +
Selecting Topics for Web Resource Discovery: Efficiency Issues in a Database Approach + Abdullah Al-Hamdani, Gultekin Ozsoyoglu Electrical Engineering and Computer Science Dept, Case Western Reserve University,
More informationComparison of of parallel and random approach to
Comparison of of parallel and random approach to acandidate candidatelist listininthe themultifeature multifeaturequerying Peter Gurský Peter Gurský Institute of Computer Science, Faculty of Science Institute
More informationClustering-Based Distributed Precomputation for Quality-of-Service Routing*
Clustering-Based Distributed Precomputation for Quality-of-Service Routing* Yong Cui and Jianping Wu Department of Computer Science, Tsinghua University, Beijing, P.R.China, 100084 cy@csnet1.cs.tsinghua.edu.cn,
More informationIncrementally mining high utility patterns based on pre-large concept
Appl Intell (2014) 40:343 357 DOI 10.1007/s10489-013-0467-z Incrementally mining high utility patterns based on pre-large concept Chun-Wei Lin Tzung-Pei Hong Guo-Cheng Lan Jia-Wei Wong Wen-Yang Lin Published
More informationJOB SHOP SCHEDULING WITH UNIT LENGTH TASKS
JOB SHOP SCHEDULING WITH UNIT LENGTH TASKS MEIKE AKVELD AND RAPHAEL BERNHARD Abstract. In this paper, we consider a class of scheduling problems that are among the fundamental optimization problems in
More informationSubspace Discovery for Promotion: A Cell Clustering Approach
Subspace Discovery for Promotion: A Cell Clustering Approach Tianyi Wu and Jiawei Han University of Illinois at Urbana-Champaign, USA {twu5,hanj}@illinois.edu Abstract. The promotion analysis problem has
More informationDIRA : A FRAMEWORK OF DATA INTEGRATION USING DATA QUALITY
DIRA : A FRAMEWORK OF DATA INTEGRATION USING DATA QUALITY Reham I. Abdel Monem 1, Ali H. El-Bastawissy 2 and Mohamed M. Elwakil 3 1 Information Systems Department, Faculty of computers and information,
More information3 No-Wait Job Shops with Variable Processing Times
3 No-Wait Job Shops with Variable Processing Times In this chapter we assume that, on top of the classical no-wait job shop setting, we are given a set of processing times for each operation. We may select
More informationJoint Entity Resolution
Joint Entity Resolution Steven Euijong Whang, Hector Garcia-Molina Computer Science Department, Stanford University 353 Serra Mall, Stanford, CA 94305, USA {swhang, hector}@cs.stanford.edu No Institute
More informationHeuristic Algorithms for Multiconstrained Quality-of-Service Routing
244 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL 10, NO 2, APRIL 2002 Heuristic Algorithms for Multiconstrained Quality-of-Service Routing Xin Yuan, Member, IEEE Abstract Multiconstrained quality-of-service
More informationParameterized graph separation problems
Parameterized graph separation problems Dániel Marx Department of Computer Science and Information Theory, Budapest University of Technology and Economics Budapest, H-1521, Hungary, dmarx@cs.bme.hu Abstract.
More informationOn top-k search with no random access using small memory
On top-k search with no random access using small memory Peter Gurský and Peter Vojtáš 2 University of P.J.Šafárik, Košice, Slovakia 2 Charles University, Prague, Czech Republic peter.gursky@upjs.sk,peter.vojtas@mff.cuni.cz
More informationIntroducing Partial Matching Approach in Association Rules for Better Treatment of Missing Values
Introducing Partial Matching Approach in Association Rules for Better Treatment of Missing Values SHARIQ BASHIR, SAAD RAZZAQ, UMER MAQBOOL, SONYA TAHIR, A. RAUF BAIG Department of Computer Science (Machine
More information. A quick enumeration leads to five possible upper bounds and we are interested in the smallest of them: h(x 1, x 2, x 3) min{x 1
large-scale search engines [14]. These intersection lists, however, take up additional space dictating a cost-benefit trade-off, and careful strategies have been proposed to select the pairs of terms for
More informationIO-Top-k at TREC 2006: Terabyte Track
IO-Top-k at TREC 2006: Terabyte Track Holger Bast Debapriyo Majumdar Ralf Schenkel Martin Theobald Gerhard Weikum Max-Planck-Institut für Informatik, Saarbrücken, Germany {bast,deb,schenkel,mtb,weikum}@mpi-inf.mpg.de
More informationCompetitive Analysis of On-line Algorithms for On-demand Data Broadcast Scheduling
Competitive Analysis of On-line Algorithms for On-demand Data Broadcast Scheduling Weizhen Mao Department of Computer Science The College of William and Mary Williamsburg, VA 23187-8795 USA wm@cs.wm.edu
More informationAlgorithms on Minimizing the Maximum Sensor Movement for Barrier Coverage of a Linear Domain
Algorithms on Minimizing the Maximum Sensor Movement for Barrier Coverage of a Linear Domain Danny Z. Chen 1, Yan Gu 2, Jian Li 3, and Haitao Wang 1 1 Department of Computer Science and Engineering University
More informationTopic: Local Search: Max-Cut, Facility Location Date: 2/13/2007
CS880: Approximations Algorithms Scribe: Chi Man Liu Lecturer: Shuchi Chawla Topic: Local Search: Max-Cut, Facility Location Date: 2/3/2007 In previous lectures we saw how dynamic programming could be
More informationStudy on Personalized Recommendation Model of Internet Advertisement
Study on Personalized Recommendation Model of Internet Advertisement Ning Zhou, Yongyue Chen and Huiping Zhang Center for Studies of Information Resources, Wuhan University, Wuhan 430072 chenyongyue@hotmail.com
More informationPredictive Indexing for Fast Search
Predictive Indexing for Fast Search Sharad Goel Yahoo! Research New York, NY 10018 goel@yahoo-inc.com John Langford Yahoo! Research New York, NY 10018 jl@yahoo-inc.com Alex Strehl Yahoo! Research New York,
More informationCS264: Homework #1. Due by midnight on Thursday, January 19, 2017
CS264: Homework #1 Due by midnight on Thursday, January 19, 2017 Instructions: (1) Form a group of 1-3 students. You should turn in only one write-up for your entire group. See the course site for submission
More informationA Two-Phase Algorithm for Fast Discovery of High Utility Itemsets
A Two-Phase Algorithm for Fast Discovery of High Utility temsets Ying Liu, Wei-keng Liao, and Alok Choudhary Electrical and Computer Engineering Department, Northwestern University, Evanston, L, USA 60208
More informationFeature Selection for Multi-Class Imbalanced Data Sets Based on Genetic Algorithm
Ann. Data. Sci. (2015) 2(3):293 300 DOI 10.1007/s40745-015-0060-x Feature Selection for Multi-Class Imbalanced Data Sets Based on Genetic Algorithm Li-min Du 1,2 Yang Xu 1 Hua Zhu 1 Received: 30 November
More informationMerging Frequent Summaries
Merging Frequent Summaries M. Cafaro, M. Pulimeno University of Salento, Italy {massimo.cafaro, marco.pulimeno}@unisalento.it Abstract. Recently, an algorithm for merging counter-based data summaries which
More information1 Counting triangles and cliques
ITCSC-INC Winter School 2015 26 January 2014 notes by Andrej Bogdanov Today we will talk about randomness and some of the surprising roles it plays in the theory of computing and in coding theory. Let
More informationRandomized Algorithms 2017A - Lecture 10 Metric Embeddings into Random Trees
Randomized Algorithms 2017A - Lecture 10 Metric Embeddings into Random Trees Lior Kamma 1 Introduction Embeddings and Distortion An embedding of a metric space (X, d X ) into a metric space (Y, d Y ) is
More informationThe Encoding Complexity of Network Coding
The Encoding Complexity of Network Coding Michael Langberg Alexander Sprintson Jehoshua Bruck California Institute of Technology Email: mikel,spalex,bruck @caltech.edu Abstract In the multicast network
More informationA 2k-Kernelization Algorithm for Vertex Cover Based on Crown Decomposition
A 2k-Kernelization Algorithm for Vertex Cover Based on Crown Decomposition Wenjun Li a, Binhai Zhu b, a Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha
More informationProbabilistic Graph Summarization
Probabilistic Graph Summarization Nasrin Hassanlou, Maryam Shoaran, and Alex Thomo University of Victoria, Victoria, Canada {hassanlou,maryam,thomo}@cs.uvic.ca 1 Abstract We study group-summarization of
More information8 SortinginLinearTime
8 SortinginLinearTime We have now introduced several algorithms that can sort n numbers in O(n lg n) time. Merge sort and heapsort achieve this upper bound in the worst case; quicksort achieves it on average.
More informationII (Sorting and) Order Statistics
II (Sorting and) Order Statistics Heapsort Quicksort Sorting in Linear Time Medians and Order Statistics 8 Sorting in Linear Time The sorting algorithms introduced thus far are comparison sorts Any comparison
More informationOn the Max Coloring Problem
On the Max Coloring Problem Leah Epstein Asaf Levin May 22, 2010 Abstract We consider max coloring on hereditary graph classes. The problem is defined as follows. Given a graph G = (V, E) and positive
More informationSharp lower bound for the total number of matchings of graphs with given number of cut edges
South Asian Journal of Mathematics 2014, Vol. 4 ( 2 ) : 107 118 www.sajm-online.com ISSN 2251-1512 RESEARCH ARTICLE Sharp lower bound for the total number of matchings of graphs with given number of cut
More informationClustering. (Part 2)
Clustering (Part 2) 1 k-means clustering 2 General Observations on k-means clustering In essence, k-means clustering aims at minimizing cluster variance. It is typically used in Euclidean spaces and works
More informationEffective Pattern Similarity Match for Multidimensional Sequence Data Sets
Effective Pattern Similarity Match for Multidimensional Sequence Data Sets Seo-Lyong Lee, * and Deo-Hwan Kim 2, ** School of Industrial and Information Engineering, Hanu University of Foreign Studies,
More informationUAPRIORI: AN ALGORITHM FOR FINDING SEQUENTIAL PATTERNS IN PROBABILISTIC DATA
UAPRIORI: AN ALGORITHM FOR FINDING SEQUENTIAL PATTERNS IN PROBABILISTIC DATA METANAT HOOSHSADAT, SAMANEH BAYAT, PARISA NAEIMI, MAHDIEH S. MIRIAN, OSMAR R. ZAÏANE Computing Science Department, University
More informationOpen Access Apriori Algorithm Research Based on Map-Reduce in Cloud Computing Environments
Send Orders for Reprints to reprints@benthamscience.ae 368 The Open Automation and Control Systems Journal, 2014, 6, 368-373 Open Access Apriori Algorithm Research Based on Map-Reduce in Cloud Computing
More informationOn Using Machine Learning for Logic BIST
On Using Machine Learning for Logic BIST Christophe FAGOT Patrick GIRARD Christian LANDRAULT Laboratoire d Informatique de Robotique et de Microélectronique de Montpellier, UMR 5506 UNIVERSITE MONTPELLIER
More information2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006
2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006 The Encoding Complexity of Network Coding Michael Langberg, Member, IEEE, Alexander Sprintson, Member, IEEE, and Jehoshua Bruck,
More informationDesigning Views to Answer Queries under Set, Bag,and BagSet Semantics
Designing Views to Answer Queries under Set, Bag,and BagSet Semantics Rada Chirkova Department of Computer Science, North Carolina State University Raleigh, NC 27695-7535 chirkova@csc.ncsu.edu Foto Afrati
More informationOnline algorithms for clustering problems
University of Szeged Department of Computer Algorithms and Artificial Intelligence Online algorithms for clustering problems Summary of the Ph.D. thesis by Gabriella Divéki Supervisor Dr. Csanád Imreh
More informationMining High Average-Utility Itemsets
Proceedings of the 2009 IEEE International Conference on Systems, Man, and Cybernetics San Antonio, TX, USA - October 2009 Mining High Itemsets Tzung-Pei Hong Dept of Computer Science and Information Engineering
More informationEfficient SQL-Querying Method for Data Mining in Large Data Bases
Efficient SQL-Querying Method for Data Mining in Large Data Bases Nguyen Hung Son Institute of Mathematics Warsaw University Banacha 2, 02095, Warsaw, Poland Abstract Data mining can be understood as a
More informationPrivacy Breaches in Privacy-Preserving Data Mining
1 Privacy Breaches in Privacy-Preserving Data Mining Johannes Gehrke Department of Computer Science Cornell University Joint work with Sasha Evfimievski (Cornell), Ramakrishnan Srikant (IBM), and Rakesh
More informationStability of Networks and Protocols in the Adversarial Queueing. Model for Packet Routing. Ashish Goel. December 1, Abstract
Stability of Networks and Protocols in the Adversarial Queueing Model for Packet Routing Ashish Goel University of Southern California December 1, 2000 Abstract The adversarial queueing theory model for
More informationSemantics of Ranking Queries for Probabilistic Data and Expected Ranks
Semantics of Ranking Queries for Probabilistic Data and Expected Ranks Graham Cormode AT&T Labs Research Florham Park, NJ, USA Feifei Li Computer Science Department FSU, Tallahassee, FL, USA Ke Yi Computer
More informationPrivacy-Preserving of Check-in Services in MSNS Based on a Bit Matrix
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 15, No 2 Sofia 2015 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.1515/cait-2015-0032 Privacy-Preserving of Check-in
More informationWorst-case running time for RANDOMIZED-SELECT
Worst-case running time for RANDOMIZED-SELECT is ), even to nd the minimum The algorithm has a linear expected running time, though, and because it is randomized, no particular input elicits the worst-case
More informationParallel Query Processing and Edge Ranking of Graphs
Parallel Query Processing and Edge Ranking of Graphs Dariusz Dereniowski, Marek Kubale Department of Algorithms and System Modeling, Gdańsk University of Technology, Poland, {deren,kubale}@eti.pg.gda.pl
More informationCollaborative Rough Clustering
Collaborative Rough Clustering Sushmita Mitra, Haider Banka, and Witold Pedrycz Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India {sushmita, hbanka r}@isical.ac.in Dept. of Electrical
More informationAN OPTIMIZATION GENETIC ALGORITHM FOR IMAGE DATABASES IN AGRICULTURE
AN OPTIMIZATION GENETIC ALGORITHM FOR IMAGE DATABASES IN AGRICULTURE Changwu Zhu 1, Guanxiang Yan 2, Zhi Liu 3, Li Gao 1,* 1 Department of Computer Science, Hua Zhong Normal University, Wuhan 430079, China
More informationA Polygon Rendering Method with Precomputed Information
A Polygon Rendering Method with Precomputed Information Seunghyun Park #1, Byoung-Woo Oh #2 # Department of Computer Engineering, Kumoh National Institute of Technology, Korea 1 seunghyunpark12@gmail.com
More informationDistributed minimum spanning tree problem
Distributed minimum spanning tree problem Juho-Kustaa Kangas 24th November 2012 Abstract Given a connected weighted undirected graph, the minimum spanning tree problem asks for a spanning subtree with
More informationAn Efficient Algorithm for Computing Non-overlapping Inversion and Transposition Distance
An Efficient Algorithm for Computing Non-overlapping Inversion and Transposition Distance Toan Thang Ta, Cheng-Yao Lin and Chin Lung Lu Department of Computer Science National Tsing Hua University, Hsinchu
More informationWelfare Navigation Using Genetic Algorithm
Welfare Navigation Using Genetic Algorithm David Erukhimovich and Yoel Zeldes Hebrew University of Jerusalem AI course final project Abstract Using standard navigation algorithms and applications (such
More informationHISTORICAL BACKGROUND
VALID-TIME INDEXING Mirella M. Moro Universidade Federal do Rio Grande do Sul Porto Alegre, RS, Brazil http://www.inf.ufrgs.br/~mirella/ Vassilis J. Tsotras University of California, Riverside Riverside,
More informationA TIGHT BOUND ON THE LENGTH OF ODD CYCLES IN THE INCOMPATIBILITY GRAPH OF A NON-C1P MATRIX
A TIGHT BOUND ON THE LENGTH OF ODD CYCLES IN THE INCOMPATIBILITY GRAPH OF A NON-C1P MATRIX MEHRNOUSH MALEKESMAEILI, CEDRIC CHAUVE, AND TAMON STEPHEN Abstract. A binary matrix has the consecutive ones property
More information/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/27/18
601.433/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/27/18 22.1 Introduction We spent the last two lectures proving that for certain problems, we can
More informationLecture 2 The k-means clustering problem
CSE 29: Unsupervised learning Spring 2008 Lecture 2 The -means clustering problem 2. The -means cost function Last time we saw the -center problem, in which the input is a set S of data points and the
More informationLocation Privacy Protection for Preventing Replay Attack under Road-Network Constraints
Location Privacy Protection for Preventing Replay Attack under Road-Network Constraints Lan Sun, Ying-jie Wu, Zhao Luo, Yi-lei Wang College of Mathematics and Computer Science Fuzhou University Fuzhou,
More informationOnline k-taxi Problem
Distributed Computing Online k-taxi Problem Theoretical Patrick Stäuble patricst@ethz.ch Distributed Computing Group Computer Engineering and Networks Laboratory ETH Zürich Supervisors: Georg Bachmeier,
More informationExact and Approximate Generic Multi-criteria Top-k Query Processing
Exact and Approximate Generic Multi-criteria Top-k Query Processing Mehdi Badr, Dan Vodislav To cite this version: Mehdi Badr, Dan Vodislav. Exact and Approximate Generic Multi-criteria Top-k Query Processing.
More informationOutline. Introduction. 2 Proof of Correctness. 3 Final Notes. Precondition P 1 : Inputs include
Outline Computer Science 331 Correctness of Algorithms Mike Jacobson Department of Computer Science University of Calgary Lectures #2-4 1 What is a? Applications 2 Recursive Algorithms 3 Final Notes Additional
More information