Implementations of Partial Document Ranking Using Inverted Files
Wai Yee Peter Wong and Dik Lun Lee


Implementations of Partial Document Ranking Using Inverted Files

Wai Yee Peter Wong and Dik Lun Lee
Department of Computer and Information Science, Ohio State University, 2036 Neil Ave, Columbus, Ohio 43210, U.S.A.

May 1992

Abstract - Most commercial text retrieval systems employ inverted files to improve retrieval speed. This paper concerns the implementation of document ranking based on inverted files. Three heuristic methods for implementing the tf × idf weighting strategy, where tf stands for term frequency and idf stands for inverse document frequency, are studied. The basic idea of the heuristic methods is to process the query terms in an order such that as many top documents as possible can be identified without processing all of the query terms. The first heuristic was proposed by Smeaton and van Rijsbergen (Smeaton & Rijsbergen, 1981), and it serves as the basis for comparison with the other two heuristic methods proposed in this paper. These three heuristics are evaluated and compared by experimental runs based on the number of disk accesses required for partial document ranking, in which the returned documents contain some, but not necessarily all, of the requested number of top documents. The results show that the proposed heuristic methods perform better than that proposed by Smeaton and van Rijsbergen in terms of retrieval accuracy, which is used to indicate the percentage of top documents obtained after a given number of disk accesses. For total document ranking, in which all of the requested number of top documents are guaranteed to be returned, no optimization techniques studied so far can lead to a substantial performance gain. To realize the advantage of the proposed heuristics, two methods for estimating the retrieval accuracy are studied. Their accuracies and processing costs are compared. All the experimental runs are based on four test collections made available with the SMART system.

1 Introduction

Boolean search strategies are used in most commercial text retrieval systems, but their drawbacks are well known (Cooper, 1988; Radecki, 1988; Salton, Fox & Wu, 1983; Salton & McGill, 1983). For example, formulation of useful boolean queries is not easy to learn, and users often have difficulty in controlling the output size. In order to improve retrieval effectiveness, vector processing systems employing similarity measures have been suggested and studied extensively (Buckley & Lewit, 1985; Croft & Savino, 1988; Salton, 1989; Salton & McGill, 1983). In a vector processing system, both query and document terms can be weighted to distinguish terms which are more important for retrieval purposes from those which are less important. Let V be the size of the vocabulary for the document collection. A document D_i is represented as a V-dimensional vector <w_{i,1}, w_{i,2}, ..., w_{i,V}>, where w_{i,j} represents the weight of term j in document i. Likewise, a query Q can be represented as a V-dimensional vector <q_1, q_2, ..., q_V>, where q_i specifies the weight of term i in the query. The similarity between a query and the documents can be computed in order to rank the retrieved documents in decreasing order of the query-document similarity. Many similarity measures have been studied (Salton, 1989). For instance, the cosine coefficient measures the similarity between the query and document vectors based on the cosine of the angle between them in the V-dimensional space:

    \text{similarity}(D_i, Q) = \cos\theta = \frac{\sum_{j=1}^{V} w_{i,j}\, q_j}{\sqrt{\sum_{j=1}^{V} w_{i,j}^2 \; \sum_{j=1}^{V} q_j^2}}

One of the well-known weighting strategies uses so-called tf × idf weights, in which w_{i,j} = tf_{D_i}(t_j) × idf(t_j), for 1 ≤ j ≤ V. The term tf_{D_i}(t) represents the term frequency of term t in document D_i (i.e., the number of occurrences of term t in document D_i). The function idf(t) is called the inverse document frequency of term t and is set to log_2(N / df(t)), where N is the number of documents in the collection and df(t) is the document frequency of term t (i.e., the number of documents in which term t is contained) (Croft & Savino, 1988; Salton, 1986). Thus, a term has a high weight in a document if it occurs frequently in the document but infrequently in the rest of the collection.

The vector processing system allows a query to be expressed as a natural language text describing the user's information need. Thus, the description can be treated as a short document so that the q_i's can be expressed in tf × idf weights as well. Since most query terms are likely to appear only once in a short passage, the tf's can be assumed to be 1 in the query and the weights of the query terms are sometimes represented by the idf's of the terms. To further reduce the computational cost, q_i can often be simplified into a binary value, with "1" signifying the presence of term i in the query and "0" its absence (Stanfill & Kahle, 1986). Conceptually, the tf × idf ranking strategy is very simple; however, it has been shown to give good retrieval effectiveness.

Implementations of document ranking have been studied extensively (Croft & Savino, 1988; Lucarella, 1988; Mohan & Willett, 1985; Murtagh, 1982; Perry & Willett, 1983; Salton, 1968; Shasha & Wang, 1990; Stanfill & Kahle, 1986; Stanfill, Thau & Waltz, 1989; Weiss, 1981; Wong & Lee, 1990); much of this work is based on inverted files (Buckley & Lewit, 1985; Stanfill, Thau & Waltz, 1989). An inverted file consists of two components, namely the index file and the postings file.
Each item in the index file corresponds to a document term in the collection, and it is associated with a postings list in the postings file, which is usually stored on disk. Each posting records the document which contains the term and some other information, such as the corresponding term frequency, depending on the retrieval environment. An inverted file is shown in Figure 1, in which the document frequency of each term is stored in the index file and can be used to compute the idf value. A query will be presented as a list of terms with associated weights. The postings lists corresponding to the query terms are retrieved and, from the postings lists, the document scores can be computed as shown in the pseudo-code below.

Figure 1: Inverted file supporting vector processing. (The figure shows a query Q = term 1, term 2, term 3, ...; the index file, with one entry per index term holding its document frequency df; and the associated postings lists of <D_i, tf_i> postings stored in the postings file.)

    initialize all document scores to zero
    for all q in Q do
        retrieve the postings list for term q
        for each posting <D_i, tf_{i,q}> in the postings list do
            compute score(D_i) = score(D_i) + tf_{i,q} * idf_q
        end {for}
    end {for}
    sort document scores and return top documents

The algorithm is rather straightforward since it essentially sums up, for each document, the weights of the terms specified in the query, taking advantage of the availability of an inverted index. It should be noted that this algorithm is simplified by assuming binary query weights and no normalization, but it can easily be extended to remove these simplifications. Many optimization techniques for inverted file systems have been developed to reduce the I/O cost (Buckley & Lewit, 1985; Perry & Willett, 1983; Smeaton & Rijsbergen, 1981). All of these methods process query terms one by one and accumulate partial scores for the documents, rather than compute the final score of a document completely before proceeding to the next document. These methods are motivated by the fact that the number of query terms can be very large when a thesaurus and relevance feedback are used to expand the original query (Stanfill & Kahle, 1986). For instance, a query of 10 terms could be expanded to more than 100 terms when synonyms and related terms (i.e., narrower terms and broader terms) are included. Thus, the corresponding processing cost becomes significant and needs to be reduced. The basic idea behind the optimization techniques is to process the query terms in an order such that the requested number of top documents [1] can be identified without processing all of the query terms. Most of the previous methods aim at total document ranking (or total ranking), in which all of the requested number of top documents are guaranteed to be returned. In this paper, we focus on partial document ranking (or partial ranking) (Buckley & Lewit, 1985; Perry & Willett, 1983; Weiss, 1981), in which the retrieved documents contain some, but not necessarily all, of the requested number of top documents. The importance of partial ranking is twofold. First, when total ranking is implemented, none of the heuristic methods studied so far can yield any significant performance gain (Buckley & Lewit, 1985; Perry & Willett, 1983). Second, document retrieval itself is inherently imprecise. That is, even when total ranking is implemented, it is known that not every top document returned is actually relevant from the user's perspective. Effectiveness should not suffer greatly if the system can return a major portion of the top documents (Buckley & Lewit, 1985).

[1] The top documents with respect to the query are those which have the highest scores according to some similarity measure. However, they are not guaranteed to be relevant from the user's perspective.
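As a concrete illustration of the ranking loop above, here is a minimal, self-contained Python sketch of term-at-a-time score accumulation over a toy in-memory inverted file with tf × idf weights and binary query weights. The data and helper names are invented for illustration; this is a sketch, not the paper's implementation.

import math
from collections import defaultdict

# Toy inverted file: term -> postings list of (document id, term frequency).
postings = {
    "ranking":  [("D1", 4), ("D2", 1)],
    "inverted": [("D1", 6), ("D3", 2)],
    "file":     [("D1", 3), ("D2", 2), ("D3", 5)],
}
N = 1000  # assumed collection size

def idf(term):
    # Inverse document frequency: log2(N / df(term)).
    return math.log2(N / len(postings[term]))

def rank(query_terms, top=2):
    scores = defaultdict(float)              # all document scores start at zero
    for q in query_terms:                    # process query terms one by one
        for doc, tf in postings.get(q, []):
            scores[doc] += tf * idf(q)       # binary query weights assumed
    # Sort document scores and return the top documents.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top]

print(rank(["ranking", "inverted", "file"]))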

Other techniques, such as relevance feedback, can be used to further improve the degree of relevance. Partial ranking is therefore a profitable approach if it can yield substantial savings in processing costs.

The organization of this paper is as follows. In Section 2, three heuristic methods, called the L method, W method and SW method, are described, and the notion of retrieval accuracy is introduced. Their performance in implementing the tf × idf weighting strategy is evaluated and compared based upon experimental runs on the four test collections made available with the SMART system. In Section 3, two heuristic methods, called the document movement method (or simply Dm method) and the linear regression method (or simply Lr method), are proposed to predict the number of top documents obtained at different points of the retrieval process for the W method and SW method. The accuracies and processing costs of the Dm method and Lr method are compared. The last section summarizes the merits of the proposed methods.

2 Heuristic methods

As shown by previous studies (Buckley & Lewit, 1985; Perry & Willett, 1983) and further supported by our study, total ranking cannot be achieved until almost all query terms are processed. The goal of partial ranking is to maximize the number of top documents obtained after processing only a subset of the query terms. In other words, the goal of a partial ranking technique is to be able to foresee the final document ranking (usually an approximate one) when only a subset of the query terms has been processed. Clearly, the chance of obtaining top documents quickly depends on how fast document scores are incremented as query terms are processed. This in turn is affected by the processing order of the query terms. In Fig. 1, the idf's of term 1, term 2 and term 3 are 4.6, 4.0 and 6.6, respectively. If the query terms are processed in the order presented in the query, the score of D1 after processing each query term will be 18.4, 42.4, and 62.2. If the query terms are ordered by increasing document frequencies, the document score will be 19.8, 43.8, and 62.2. With query terms ordered by decreasing term frequencies, the document score will be 24, 42.4, and 62.2. The rationale is that the faster the scores of the top documents are incremented, the earlier those documents can be identified. In this particular example, either ordering by document frequencies or by term frequencies is better than the original order. In this section, we study three methods which process query terms in different orders based upon different criteria. The first method, called the L method, was proposed by Smeaton and van Rijsbergen (Smeaton & Rijsbergen, 1981); the second and third methods, called the W and SW methods respectively, are proposed by the authors (Lee & Wong, 1991; Wong & Lee, 1991).

2.1 The L method

The L method, which has been used in the upperbound search algorithm (Buckley & Lewit, 1985; Croft & Savino, 1988; Fukunaga & Narendra, 1975; Mohan & Willett, 1985; Perry & Willett, 1983; Smeaton & Rijsbergen, 1981; Weiss, 1981), processes query terms in descending order of their idf values. Since document frequencies correspond to the lengths of the postings lists, query terms are, in other words, processed in order of increasing postings-list length. This method requires the lengths of the postings lists to be kept in the index file, which can be accessed separately from the postings lists. For simplicity, Q is represented by (q_1, q_2, ..., q_k), in which df(q_i) ≤ df(q_j) for i < j. The upperbound search algorithm works as follows.
After a postings list is processed, the partial document scores are sorted in descending order and the current top T documents are obtained. The upperbound of the (T+1)st document is then computed with the assumption that it contains all the remaining unsearched query terms. The retrieval process stops if the upperbound of the (T+1)st document is smaller than the current score of the Tth document.
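A minimal Python sketch of the L-method ordering together with this upperbound stopping test follows. The toy postings, the assumption that an unsearched term contributes at most its idf to the bound, and all helper names are illustrative; this is not the paper's code.

import math

postings = {
    "a": [("D1", 4), ("D2", 2)],
    "b": [("D1", 6), ("D3", 1), ("D4", 2)],
    "c": [("D2", 3), ("D3", 5), ("D4", 1), ("D5", 2)],
}
N = 1000

def idf(term):
    return math.log2(N / len(postings[term]))

def l_method(query, T=2):
    # L method: process terms in descending idf, i.e. shortest postings lists first.
    order = sorted(query, key=idf, reverse=True)
    scores = {}
    for i, term in enumerate(order):
        for doc, tf in postings[term]:
            scores[doc] = scores.get(doc, 0.0) + tf * idf(term)
        ranked = sorted(scores.values(), reverse=True)
        if len(ranked) > T:
            # Upperbound of the (T+1)st document: assume it also contains every
            # remaining unsearched query term (each contributing its idf here,
            # a simplifying illustration of the bound).
            upperbound = ranked[T] + sum(idf(t) for t in order[i + 1:])
            if upperbound < ranked[T - 1]:
                break  # the top T documents can no longer change
    return scores

print(l_method(["a", "b", "c"]))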

Table 1: Characteristics of the document collections CACM, CISI, CRAN and MED. The rows of the table give, for each collection, the number of documents, the number of distinct terms, the average number of terms per document, the maximum document frequency (df), the average df, the standard deviation of df, the maximum tf, and the maximum idf.

If the search can be stopped early, the amount of disk accesses and CPU processing cost can be reduced substantially, since short postings lists are processed first and the unprocessed lists are long. In order to evaluate the usefulness of the upperbound search algorithm with the above sorting method, we take checkpoints during the retrieval process. To facilitate the discussion, three definitions are given:

1. Q_f: The final set of top documents obtained when all the query terms in query Q are processed. The number of top documents returned is determined by the system or by the user.

2. Q_i: The set of top documents obtained after processing i% of the total disk accesses in response to query Q.

3. Ra_i: (|Q_i ∩ Q_f| / |Q_f|) × 100%.

The values of the Ra's indicate the percentages of top documents obtained at different points of the retrieval. Moreover, a sequence of Ra's can show how fast the top documents in Q_f are revealed. For instance, if Ra_10 = .5, Ra_20 = .7 and Ra_30 = .85, then the top documents are revealed quickly. If Ra_10 = .1, Ra_20 = .15 and Ra_30 = .2, then the top documents are revealed rather slowly. For partial ranking, the faster the rate at which Ra increases the better, since the retrieval process can then stop earlier for the same retrieval accuracy.

We study the retrieval accuracy of the L method by an experiment carried out on four test collections, which are made available with the SMART system. Some characteristics of the test collections are given in Table 1. The experiment is performed in the following manner. For each collection, 15 queries are generated for each of the three different query sizes, namely 30, 50 and 100 terms, for a total of 45 queries. For each query size, three groups of queries are generated with 5 queries each. In the first group, each query contains only short postings lists (i.e., only query terms with high idf values). [2] In the second group, each query contains one-half short postings lists and one-half long postings lists. In the third group, each query contains only long postings lists. Query terms in each query are randomly selected from the vocabulary of the respective test collection. With queries of different sizes and corresponding postings lists of different lengths, our experiments do not try to favor any particular operational environment. Ten equally spaced checkpoints are taken and the top documents at each checkpoint are recorded. This means that if a total of P disk pages is required for processing all the query terms, checkpoints are taken after ⌈P/10⌉, ⌈2P/10⌉, ⌈3P/10⌉, ..., and ⌈10P/10⌉ disk pages have been processed. Finally, for each collection, we obtain at each checkpoint the retrieval accuracies for all the queries and compute the average of the retrieval accuracies for queries of the same size.

[2] In this paper, a postings list is short if it contains fewer than 50 document postings, and long otherwise.
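As a small illustration of the retrieval-accuracy measure defined above, the following Python sketch computes Ra at a few checkpoints; the toy document sets are invented for illustration.

def retrieval_accuracy(Q_i, Q_f):
    # Ra_i = |Q_i intersect Q_f| / |Q_f| * 100%
    return 100.0 * len(set(Q_i) & set(Q_f)) / len(Q_f)

Q_f = ["D7", "D2", "D9", "D4", "D1"]          # final top documents
checkpoints = {                                # top documents after 10%, 20%, 30%
    10: ["D7", "D3", "D9", "D8", "D5"],
    20: ["D7", "D2", "D9", "D8", "D5"],
    30: ["D7", "D2", "D9", "D4", "D5"],
}
for i, Q_i in checkpoints.items():
    print(i, retrieval_accuracy(Q_i, Q_f))    # 40.0, 60.0, 80.0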

In our study, each document posting requires 6 bytes: 4 bytes for a document identifier and 2 bytes for a term frequency. In most cases, each disk access retrieves 20 document postings, corresponding to 120 bytes. The reason for this rather small page size is to account for the small document databases used in the experiment; if the page size is too large (e.g., 1K), most postings lists will fit in one page, which is not realistic for large databases. The results are shown in Figs. 2-5 for CACM, CISI, CRAN and MED, respectively. For each collection, the curves for the L method are shown in three diagrams, for 30-term, 50-term and 100-term queries. Most of the curves lie roughly along the 45-degree line. In general, almost all of the query terms must be processed in order to obtain all of the documents in Q_f. This indicates that the stopping criterion specified by the upperbound search algorithm is hard to meet and that no significant performance improvement can be achieved by the L method if total ranking is required. Our result is consistent with the study by Buckley and Lewit (Buckley & Lewit, 1985).

In the tf × idf weighting strategy, document weights are determined by both the tf and idf values. For typical document collections, the range of idf values is rather small, since the values are compressed by the log function. Thus, idf is the secondary component in determining the weight of a term, compared to tf, whose range is larger in general (see Table 1). However, the L method determines the processing order of query terms based on idf values alone, without taking tf values into consideration. This explains why the retrieval accuracy of the L method increases rather slowly. In fact, the upperbound search algorithm is too conservative to yield any significant performance gain. First, it is highly unlikely for the (T+1)st document to contain all the remaining unsearched terms. Second, the upperbound method considers the chance of the (T+1)st document being promoted to a top document while assuming that the weights of the other documents, including the current top documents, do not change. This assumption is unrealistic and makes the stopping criterion too pessimistic.

Two methods have been proposed to improve retrieval efficiency by relaxing the stopping criterion (Buckley & Lewit, 1985; Perry & Willett, 1983; Weiss, 1981). The first one only guarantees a subset of the top documents to be returned (i.e., only partial ranking is performed). The upperbound of the (T+1)st document is compared with the current score of the Sth document, where S < T. With this relaxation, the chance of meeting the stopping criterion is higher because of the larger difference in scores (Buckley & Lewit, 1985). The second one applies probability to the upperbound computation (Perry & Willett, 1983; Weiss, 1981). The process may stop early if the probability of the upperbound of the (T+1)st document exceeding the current score of the Tth document is small. But we find that the probabilistic stopping criterion is still unlikely to be met, because the score difference between the Tth and (T+1)st documents is usually very small. Figure 6 shows the document scores of the top 30 documents of one run on MED. There is no significant gap between two consecutive scores. Even if a gap exists, it may not fall between the Tth and (T+1)st documents to trigger early stopping. Thus, the (T+1)st document has a high probability of becoming a top document until almost all the query terms have been processed.
Moreover, the saving due to this method is still questionable, since the cost of computing probabilities with a large number of terms could be very significant. Thus, the upperbound computation not only fails to produce any performance gain but also induces computational overhead. These problems motivate our study of partial ranking. To overcome the shortcomings of the L method, we propose and investigate two search algorithms in the next two subsections. The idea of both methods is based upon greedy algorithms, and their objective is to improve the Ra's, especially at the initial stage of the retrieval process, i.e., to obtain a large portion of the top documents quickly without a large amount of I/O operations.

Figure 2: Average retrieval accuracies of the L method, the W method, and the SW method (120 bytes/page and 240 bytes/page) for CACM, plotted against the number of disk pages retrieved. (a) 30 query terms per query. (b) 50 query terms per query. (c) 100 query terms per query. Fifteen queries are tested for each query size.

Figure 3: Average retrieval accuracies of the L method, the W method, and the SW method (120 bytes/page and 240 bytes/page) for CISI, plotted against the number of disk pages retrieved. (a) 30 query terms per query. (b) 50 query terms per query. (c) 100 query terms per query. Fifteen queries are tested for each query size.

Figure 4: Average retrieval accuracies of the L method, the W method, and the SW method (120 bytes/page and 240 bytes/page) for CRAN, plotted against the number of disk pages retrieved. (a) 30 query terms per query. (b) 50 query terms per query. (c) 100 query terms per query. Fifteen queries are tested for each query size.

Figure 5: Average retrieval accuracies of the L method, the W method, and the SW method (120 bytes/page and 240 bytes/page) for MED, plotted against the number of disk pages retrieved. (a) 30 query terms per query. (b) 50 query terms per query. (c) 100 query terms per query. Fifteen queries are tested for each query size.

Figure 6: Top 30 document scores (document id and score) of one run on MED.

2.2 The W method

The process of document ranking is to obtain and accumulate the term weights of the documents from the postings lists, and to sort the final document scores in descending order. The W method takes two parameters into consideration, namely the maximum tf in a postings list (denoted by tf_max) and the length of that postings list. The query terms are then processed in descending order of their tf_max × idf values. In this case, the maximum tf value of each postings list is stored in the index file. Since two bytes are sufficient to store a term frequency, the additional storage overhead is not significant. A typical entry in the inverted file is as follows:

    term t_i : df, tf_max  -->  <d_j, tf_x> <d_k, tf_y> ...

Since this method processes postings lists which have a high potential of generating large increments to the document scores, partial scores of the top documents are accumulated faster than with the L method, without using a large amount of I/O. Consequently, the top documents in Q_f will be revealed earlier in the retrieval process. An experiment similar to that used for the L method was carried out to find the retrieval accuracy of the W method. The results are plotted in Figs. 2-5. Like the L method, most of the query terms need to be processed in order to obtain all of the documents in Q_f if total ranking is required, so the W method is still unable to obtain a significant performance gain. However, the retrieval accuracies of the W method are consistently higher than those of the L method in all the test collections, except for the 100-term queries in CACM. This means that if partial ranking is allowed, the W method can lead to a substantial improvement in terms of the number of disk accesses, for a retrieval accuracy of less than 100%. For instance, for the 100-term queries in CISI, after processing 30% of the disk accesses, the average Ra for the L method is less than 10%, while it is about 70% for the W method. As seen in Figs. 2-5, the increments of the Ra's in the W method are fast during the early disk accesses; however, the increments slow down as Ra approaches 100%. That is, the later disk accesses increase the retrieval accuracy very slowly, and the cost-effectiveness of this portion of the I/O operations is low. The major advantage of the W method is the ability to obtain a large number of top documents with a small number of disk accesses, especially at the initial stage of the retrieval process, with only a small additional storage overhead in the index file. However, this sorting method is still not the fastest one to obtain top documents for a given amount of disk accesses. In the following subsection, a more complex method is studied.
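Before turning to that method, here is a minimal Python sketch of the W-method ordering just described. The index-file statistics and helper names are assumptions for illustration, not the paper's code.

import math

N = 1000
# Index file entries: term -> (df, tf_max); the postings themselves stay on disk.
index_file = {
    "t1": (40, 4),
    "t2": (600, 9),
    "t3": (12, 3),
}

def idf(term):
    df, _ = index_file[term]
    return math.log2(N / df)

def w_method_order(query_terms):
    # W method: process postings lists in descending order of tf_max * idf,
    # so lists that can add large increments to document scores come first.
    return sorted(query_terms,
                  key=lambda t: index_file[t][1] * idf(t),
                  reverse=True)

print(w_method_order(["t1", "t2", "t3"]))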

2.3 The SW method

The SW method involves a more complex ordering than the first two methods. In the W method, postings lists are processed in decreasing order of their tf_max × idf values. When a postings list with a high tf_max × idf is retrieved, many disk pages with low term weights are retrieved and processed at the same time, yet they contribute little to finding the top documents. Thus, processing the query Q list by list is not the best way to achieve a high retrieval accuracy. To further exploit the idea of greediness, the SW search method is investigated. In the SW method, the postings of each postings list are first sorted by decreasing tf value. In other words, for a given term, documents with high tf values are put at the beginning of the list, and those with low tf's are put at the end. Moreover, a postings list is no longer viewed as a single item, but rather as a sequence of individual disk pages. This organization allows disk pages of high term weights to be processed before those of low term weights. For instance, the first page of term t_i may be processed first, then the first page of term t_j, then the first page of term t_k, then the second page of term t_i, and so on. The maximum tf of each page is stored in the index file so that disk pages with high tf values can be identified from the index file. A typical entry in the inverted file is shown below:

    term t_i : df, tf_{1,max}, tf_{2,max}, ...  -->  <d_j, tf_x> <d_k, tf_y> ...

The processing and storage overheads of the SW method are higher than those of the first two methods. However, since most postings lists are short according to Zipf's law (Zipf, 1949) and therefore occupy only a small number of disk pages, the overheads of keeping the tf values and maintaining the order of the disk pages are insignificant. This is especially true for environments where updates are done in batch and are infrequent compared to retrieval. Disk pages are processed in an order defined by three parameters, namely the maximum tf of the page, the length of the postings list, and the number of document identifiers in the page:

    tf_max × idf × f(I),

where tf_max is the maximum tf contained in the disk page and I is the number of document identifiers in the page. The function f(I) is included in the formula to account for the number of document identifiers in a disk page, which affects the contribution of the page to finding the top documents, especially when the weights of the top documents are determined by many terms. For example, a page having one identifier with a weight of 31 may not have a higher potential to accumulate weights than a full page of identifiers whose maximum weight is 29. First of all, the difference between the maximum weights of these two pages is small. Moreover, the former only increments the weight of one document, while the latter increments the weights of many documents. In this case, it is reasonable to process the disk page with the maximum weight of 29 first. The function f(I) is thus introduced to account for the degree of fullness of disk pages in determining the order of processing. A number of functions have been investigated; in this study, we use f(I) = I^e, where 0 < e < 1. The effect of I is restricted by e so that the maximum weights of the disk pages are still the primary determining factor. Since this method requires the document identifiers of each postings list to be sorted by decreasing tf value, it is referred to as the SW method. We study the SW method with two different page sizes, 120 bytes and 240 bytes, corresponding to 20 postings and 40 postings per disk page. For the page size of 120 bytes, e is set to .5. In other words, we take the square root of I in the formula.
As the page size is increased to 240 bytes (i.e., the maximum value of I is increased to 40), with the maximum tf in each postings list unchanged, we accordingly reduce e to .4 to restrict the significance of I in the processing order. The optimal value of e for a given page size requires further investigation.
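The page-level ordering can be sketched in Python as follows. This is a toy illustration under the assumptions above; the page contents, e = .5 and all helper names are invented for illustration, not taken from the paper.

import math

N, e = 1000, 0.5

# Each postings list is split into disk pages; postings within a list are
# sorted by decreasing tf, and the index file keeps the maximum tf per page.
# Each entry below: (term, df, page_no, max_tf_in_page, num_ids_in_page).
pages = [
    ("t1", 40,  1, 9, 20), ("t1", 40,  2, 3, 12),
    ("t2", 600, 1, 7, 20), ("t2", 600, 2, 5, 20),
    ("t3", 12,  1, 4,  6),
]

def priority(page):
    term, df, _, max_tf, num_ids = page
    idf = math.log2(N / df)
    # SW method: order pages by tf_max * idf * f(I), with f(I) = I ** e.
    return max_tf * idf * (num_ids ** e)

for page in sorted(pages, key=priority, reverse=True):
    print(page[:3], round(priority(page), 2))

In a full implementation the pages would be kept in a priority queue keyed on this value and fetched from disk one at a time, which is what allows the first page of one term to be processed before the second page of another.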

Once again, an experiment similar to that used for the L method was carried out to test the retrieval accuracy of the SW method. The results are plotted for the page sizes of 120 and 240 bytes in Figs. 2-5. We find that the SW method is better than the W method and the L method in terms of retrieval accuracy in all the test collections for all query sizes, demonstrating the superior performance of this sorting method. In Fig. 4, we find that if 80% of the top documents in CRAN are required, about 90% of the disk accesses are needed for the L method, 60% for the W method, and fewer still for the SW method. In Figs. 2-5, the average retrieval accuracies of the SW method are consistently better than those of the W method by at least about 5%. Judging by the retrieval accuracies at 10-30% of disk accesses for the SW method, there is still plenty of room for improvement, which is left for future investigation. In the following section, two estimation methods for the W method and SW method are proposed to estimate the Ra's at different points of the retrieval process. Note that the retrieval accuracies in the previous experiments are obtained assuming that Q_f is known in advance, which is not the case in reality.

3 Estimations of retrieval accuracies for the W method and SW method

In order to realize the advantage of the W method and SW method, the system must be able to estimate the Ra's at different points of the retrieval process so that it can stop the retrieval process when a desired retrieval accuracy has been reached. The terminating condition could be a threshold determined by the system or specified by the user. If the system fails to estimate the retrieval accuracy, we may have searched more postings lists than necessary, incurring unnecessary disk accesses, or stopped too soon, resulting in an accuracy lower than the user's specification. Estimation of retrieval accuracies can be done in a number of ways. For instance, the values of the Ra_i's can be determined by carefully calibrating the retrieval system. This method is simple and produces good estimations for a stable database, but may not work well when the parameters of the database change frequently. Alternatively, an analytical model can be developed based upon the distribution functions of the tf values, the df values, and the search strategy. An accurate analytical model should be able to give better predictions of the Ra_i's under different system parameters. Unfortunately, such a model is difficult to derive. Instead, we study in this section two heuristic methods to estimate the Ra's at the checkpoints. It should be noted that they are not used to further improve the capability of obtaining better retrieval accuracy; rather, they are used to estimate the retrieval accuracy during a retrieval process. The first method is called the document movement method (or the Dm method). It uses a heuristic to estimate whether a document is likely to be a top document based upon document movements during the retrieval process. The second is called the linear regression method (or the Lr method). In the Lr method, an indicator is first developed, and then a regression line is constructed to relate the retrieval accuracy and the indicator by using linear regression techniques. We use these two estimation methods to estimate the retrieval accuracies of both the W method and the SW method. Their accuracies and processing costs are compared. Since there are many similarities between the W method and the SW method, the explanations are mainly based on the W method, with the features corresponding to the SW method following in parentheses.
3.1 Estimations by the document movement method

As the document weights are sorted during the ranking process, the document identifiers with their weights change their positions in the document-weight array, called doc_wt hereafter. To study the behavior of document movements, the top T positions of the array are called the Candidate Region, and the documents in this region are called candidate documents.

Figure 7: The movement of documents in a ranking process. (The columns of the figure show the ranking after each of the four query terms is processed, ending with the final top five documents.)

Fig. 7 illustrates the document movements of the 10 documents with the highest scores. In this example, there are four query terms, and it is assumed that only the top 5 documents are returned to the user (i.e., T = 5). The arrows indicate the movements of the 5 documents which are eventually in Q_f. Documents 1261, 298 and 56 stay in the Candidate Region during the whole retrieval process. Document 356 stays in the Candidate Region after term 1 is processed, moves out of the Region after term 2 is retrieved, and eventually gets back into the Candidate Region. Document 115 was outside the top 5 after term 1 is processed, but it gets into the Candidate Region after term 2 is processed. Even though the document movements seem complex and random, with the use of the W method and SW method some documents remain in the high ranks consistently, while others move up and/or down in the array.

To study the temporal behavior inside the Candidate Region, Fig. 8 records the document movements for one query on the MED collection, with the use of the W method. During the whole retrieval process, a total of 47 documents entered the Candidate Region. To simplify the explanation, the documents are assigned pseudo identifiers from 1 to 47 and sorted so that top documents are shown on the right of the figure. In this experiment, the top documents are collected and examined after every five query terms have been processed. As can be observed from Fig. 8, among the top documents obtained at the point when the first 5 postings lists have been processed, only 10 remain after the next five query terms have been processed (documents 1 to 10, shown near the lower left corner of the diagram, have been eliminated from the Candidate Region and never appeared in it again). We find that some documents stay in the Candidate Region for a long period of time and some are expelled very soon. To show this behavior pictorially, a continuous vertical line is drawn upwards if a document remains in the Candidate Region at consecutive checkpoints; the vertical line is terminated if the document leaves the Region. The continuous lines at the right part of the graph indicate that many documents stay in the Candidate Region in a stable manner. For instance, if we had stopped after 10 query terms had been processed, 18 out of the final top documents would have been identified, and documents 26 and 27 would be retrieved instead of 28 and 29.

Based on the somewhat regular behavior of the document movements, heuristics can be developed. The following two observations establish the guidelines for our heuristic. First, those documents which have relatively high scores at the beginning of the retrieval and get into the top t ranks of the Candidate Region, where t < T, tend to stay in the Candidate Region for the rest of the processing (the top t ranks of the Candidate Region are called the Stable Region, represented by SR). Therefore, there is a high probability that documents in SR will be retained in Q_f. Second, those documents which are eventually in Q_f tend to stay in the Candidate Region for a long period of time, while those which are not in Q_f tend to move in and out of the Candidate Region more frequently.

Figure 8: The temporal behavior in the Candidate Region for the MED collection. (The horizontal axis gives the document pseudo identifiers; the vertical axis gives the number of postings lists searched.)

Thus, the duration for which a document stays in the Candidate Region can be used to predict the documents in Q_f. Since postings lists (disk pages) are sorted by decreasing tf_max × idf for the W method (tf_max × idf × f(I) for the SW method), the influence of the remaining postings lists (disk pages) on the final ranking will decrease as more lists (pages) are processed. Thus, the duration requirement for potential documents in Q_f should decrease as the search proceeds. To support this feature, we add to each document a counter which is increased by one whenever the document moves into the Candidate Region. The document-weight array doc_wt is sorted after each disk page of postings is processed. A candidate document is considered a top document if it stays in the Candidate Region for a certain period of time. There are 10 checkpoints taken over the whole retrieval process. At each checkpoint, the number of top documents is estimated based upon the following requirements. In our study, we set SR to be 20% of the top T documents, with T equal to 20; the choice of the SR value is experimental. Documents in this sub-region are counted as top documents independently of how long they have stayed in it. For the first 4 checkpoints (i.e., after 10%, 20%, 30% and 40% of disk accesses), we require that a candidate document, to be counted as a top document, stay in the Candidate Region 75% of the time. This means that for 100 disk accesses, after 30% of the disk accesses are processed, the counted documents must have stayed in the Candidate Region at least 22 times. This requirement is decreased as more lists are processed, to reflect the decreasing influence of smaller term weights on the final ranking towards the end of the retrieval process. The requirement is decreased by 5% for every 10% of disk accesses. This means that for 100 disk accesses, at 50% of the disk accesses, the counted documents must have stayed in the Candidate Region at least 35 times. A sketch of this stability-based counting is given below.

The discrepancies for the four test collections based on this estimation method are shown in Figs. 9 and 10. In each diagram, the horizontal axis corresponds to the percentage of the total disk accesses; the vertical axis corresponds to the absolute discrepancy between the actual Ra and the estimated Ra. We use the height of the box to represent the maximum discrepancy at a particular checkpoint, and the horizontal line inside the box to represent the average discrepancy. For each collection, 45 queries are processed.

In Fig. 11, the estimation discrepancies of the W method and SW method based on the Dm method are shown. The results will be discussed after the next method is described.

3.2 Estimations by the linear regression method

The major reason that the W method (SW method) performs better than the L method is that it processes postings lists (disk pages) with heavy weights earlier than those with light weights. Let the total weight of all the postings lists (disk pages) corresponding to a query be w. At a particular checkpoint, if the weight processed thus far is w', it is reasonable to expect that the larger the weight ratio r = w'/w, the more likely it is that the top documents have already been obtained. Initially, w' = 0, so r = 0 and the corresponding Ra = 0%. When all postings lists (disk pages) have been processed, w' = w, so r = 1 and the corresponding Ra = 100%. The weight ratio r can be expected to be more or less proportional to Ra. However, the weight ratio r alone cannot provide accurate estimations. The problem can be seen from the different distributions of the partial scores among the candidate documents when a number of postings lists (disk pages) remain unprocessed. For a given value of w', if most of the weight is concentrated on a few documents, and they are much heavier than the remaining postings lists (disk pages), then they have a high chance of being in Q_f. On the other hand, if the weight is evenly distributed among the documents, and they are not much heavier than the remaining postings lists (disk pages), then they have a low chance of being in Q_f. We find that the differences between the partial scores of the candidate documents and the maximum weights of the unprocessed postings lists (disk pages) are also an important indicator for the Ra's. If h candidate documents have weights higher than the maximum weight of the next heaviest postings list (disk page) (i.e., they are heavier than all the remaining postings lists (disk pages)) and h is very close to T, then it is likely that a large portion of the candidate documents will be in Q_f. However, if h is small, then it is likely that very few candidate documents will be in Q_f. Thus, the value of h is proportional to the value of Ra, and s = h/T is called the safe ratio of candidate documents in the Candidate Region.

The weight ratio r and the safe ratio s are two indicators that can estimate the number of top documents obtained at a particular checkpoint. We expect that if r + s is small at a checkpoint, Ra is small; if it is large, Ra is large. However, as with the Dm method, the query size can affect Ra at the same time. If there is only a small number of postings lists to be processed, the chance for a document to further gain weight is small and the sum of r and s reflects the value of Ra to a greater extent. The situation becomes complex when a large number of postings lists remain to be processed, because there are high chances for a document to gain weight from different postings lists (disk pages). Thus, the size of the query must be taken into consideration as well. To formulate the indicator, a few more definitions are given below (a small sketch of these quantities follows the list):

1. w = Σ_{j=1}^{P} max_wt[j], where P is the total number of disk pages required to process all the query terms and max_wt[j] is the maximum weight of the jth page. It should be noted that the maximum term weight of a page, rather than the sum of the term weights in a page, is used in our computation.

2. w_i = Σ_{j=1}^{⌈iP/100⌉} max_wt[j]. The value depends on which sorting method is used, namely the W method or the SW method.

3. r_i = w_i / w, the weight ratio.
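The Dm estimation can be sketched as follows in Python. This is an illustrative reconstruction under the assumptions stated above (T = 20, SR = 20%, the 75%-then-decreasing stability requirement); the helper names and data layout are invented, and it is not the authors' code.

import math

T = 20                     # number of top documents requested
SR_SIZE = max(1, T // 5)   # Stable Region: top 20% of the Candidate Region

def estimate_top_docs(doc_wt, in_region_count, pages_done, total_pages):
    """Estimate which candidate documents will be in Q_f at a checkpoint.

    doc_wt          -- list of (doc_id, partial score), sorted by descending score
    in_region_count -- doc_id -> number of page-sorts the document has spent
                       in the Candidate Region so far
    """
    pct_done = 100.0 * pages_done / total_pages
    # 75% stability required up to the 40% checkpoint, then 5% less per 10%.
    required_fraction = 0.75 - 0.05 * max(0, math.ceil((pct_done - 40) / 10))
    required_stays = required_fraction * pages_done

    estimated = []
    for rank, (doc, _) in enumerate(doc_wt[:T]):
        if rank < SR_SIZE:                        # Stable Region: always counted
            estimated.append(doc)
        elif in_region_count.get(doc, 0) >= required_stays:
            estimated.append(doc)                 # stayed long enough to count
    return estimated

# Example: at the 50% checkpoint of a 100-page retrieval (toy values).
doc_wt = [(f"D{i}", 100 - i) for i in range(30)]
counts = {f"D{i}": 50 - i for i in range(30)}
print(estimate_top_docs(doc_wt, counts, pages_done=50, total_pages=100))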
4. s_i = h_i / T, the safe ratio, where h_i is the number of candidate documents whose current weights are larger than the maximum weight of the remaining postings lists (disk pages), after i% of the disk pages have been processed.
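A small Python sketch of how these quantities could be computed at a checkpoint (illustrative only; max_wt, doc_wt and the variable names are assumptions for the sketch, not the paper's code):

def weight_ratio(max_wt, pages_done):
    # r_i = w_i / w: processed maximum page weights over the total.
    w = sum(max_wt)
    w_i = sum(max_wt[:pages_done])
    return w_i / w

def safe_ratio(doc_wt, max_wt, pages_done, T):
    # s_i = h_i / T: fraction of the top-T candidates whose partial score
    # already exceeds the maximum weight of every unprocessed page.
    remaining_max = max(max_wt[pages_done:], default=0.0)
    top_scores = sorted(doc_wt.values(), reverse=True)[:T]
    h_i = sum(1 for score in top_scores if score > remaining_max)
    return h_i / T

# Toy example: 10 pages, 40% processed, T = 5 candidates inspected.
max_wt = [30.0, 27.0, 25.0, 22.0, 20.0, 18.0, 15.0, 12.0, 8.0, 5.0]
doc_wt = {"D1": 55.0, "D2": 31.0, "D3": 19.0, "D4": 12.0, "D5": 7.0}
print(weight_ratio(max_wt, 4), safe_ratio(doc_wt, max_wt, 4, T=5))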

Figure 9: The estimation discrepancies of the W method by the Dm method (light gray boxes) and the Lr method (white boxes). Forty-five queries are processed.

Figure 10: The estimation discrepancies of the SW method by the Dm method (dark gray boxes) and the Lr method (gray boxes). Forty-five queries are processed.

Figure 11: The estimation discrepancies of the W method (light gray boxes) and SW method (dark gray boxes) by the Dm method. Forty-five queries are processed.

Figure 12: The estimation discrepancies of the W method (white boxes) and SW method (gray boxes) by the Lr method. Forty-five queries are processed.

After i% of the disk accesses have been processed and the array doc_wt has been sorted, the indicator Ind_i for the W method (SW method) is defined as follows, for each checkpoint i:

    Ind_i = (r_i + s_i) − comb_factor(k, i).

The value of comb_factor(k, i) is the amount subtracted from r + s to account for the effect of the combination of term weights from different postings lists on the final ranking. If the query size k is large, a large amount should be subtracted; on the contrary, if the query size is small, a small amount should be subtracted. However, this amount of subtraction is reduced towards the end of the retrieval process, to reflect the decreasing influence of smaller term weights on the final ranking. Therefore, we use comb_factor(k, i) = c·k·(1 − i/100), where c is a constant. Since the maximum value of r + s is 2 and the values of r + s are generally below .5 at the first few checkpoints, we set c = 1/500 to control the amount of subtraction; for instance, the maximum subtraction is about .2 at the first checkpoint for 100-term queries. At the point when all query terms have been processed, nothing is subtracted, r + s = 2 and Ra = 100%. However, it seems reasonable to bound the subtraction; we require that Ind_i be non-negative. In order to compute the above indicator, in addition to the maximum tf stored in the index file, the sum of the maximum weights of the disk pages in a list, denoted by sum_max_wt, is also stored in the index file and is used to compute w above.

To examine the relationship between the Ind's and the Ra's, the following experiment was conducted. For each collection, 45 queries are run. We take checkpoints at every 10% of disk accesses as before, and the Ind's are computed at the same time. In Table 2, the (Ind, Ra) pairs for two queries in CACM by the W method are shown. These (Ind, Ra) pairs can be viewed as points in a two-dimensional plane, with the x-axis for Ind and the y-axis for Ra. In Figs. 13 and 14, the points corresponding to the W method and SW method are shown in two diagrams for each collection. Since there are 45 queries for each collection, and 11 points for each query (including the 0% and 100% points), 495 points in total are plotted in each diagram. Two of the point symbols in Fig. 13 correspond to sample queries 1 and 2 in Table 2. Based upon the distribution of the points for each collection, we find that, in general, the larger the Ind, the larger the Ra. Statistically, we can establish a linear relationship between the Ind's and the Ra's; regression lines can be drawn for each collection by using the linear regression technique. The regression lines and the corresponding correlation coefficients for CACM, CISI, CRAN and MED are shown in Figs. 13 and 14, respectively. Since the correlation coefficients are about .8, .9, .89, and .89 for the W method (.95, .95, .97 and .95 for the SW method) for CACM, CISI, CRAN and MED, we can conclude that there is a strong correlation between Ind and Ra, especially for the SW method. Once the regression line is drawn, we can estimate the value of Ra whenever Ind is computed. The discrepancies of this method are shown in Figs. 9 and 10. The estimation discrepancies of the W method and SW method based on the Lr method are shown in Fig. 12.
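The following Python sketch illustrates the indicator computation and the use of a fitted regression line. It is a toy illustration under the assumptions above; the calibration data, the constant c = 1/500 and the helper names are assumptions, not the paper's code.

def indicator(r_i, s_i, k, i, c=1.0 / 500):
    # Ind_i = (r_i + s_i) - comb_factor(k, i), bounded below by zero.
    comb_factor = c * k * (1 - i / 100.0)
    return max(0.0, r_i + s_i - comb_factor)

def fit_regression(points):
    # Ordinary least-squares line Ra = a + b * Ind over calibration points.
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    b = sum((x - mx) * (y - my) for x, y in points) / \
        sum((x - mx) ** 2 for x, _ in points)
    return my - b * mx, b

# Calibration (Ind, Ra) pairs gathered from training queries (toy values).
calibration = [(0.1, 10), (0.3, 22), (0.6, 38), (0.9, 55),
               (1.2, 68), (1.5, 81), (1.8, 93), (2.0, 100)]
a, b = fit_regression(calibration)

# At run time: estimate Ra from the current weight ratio and safe ratio.
ind = indicator(r_i=0.45, s_i=0.30, k=100, i=30)
print(round(a + b * ind, 1))

The regression line only needs to be fitted once per collection from calibration runs; at query time the estimate is a single evaluation of a + b·Ind.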
3.3 The comparison between these two estimation methods

A good estimator should have the following two properties: its estimations are close to the actual values, and it is easy to implement. Based upon these criteria, the two estimation methods are compared. Let us first consider the estimations for the W method. Comparing the heights of the light gray boxes with those of the white boxes in Fig. 9, except for 3 cases in CACM after the 60% point and one case in CRAN at the 50% point, the Lr method performs better than the Dm method in terms of maximum discrepancies. As far as the average discrepancies are concerned, the Lr method also performs better than the Dm method in most cases, except for 4 cases in CACM and one case in CRAN. The average discrepancies of either estimation method are mostly in the 10-20% range. In general, both the maximum and average discrepancies decrease as the search proceeds.

Figure 13: The regression lines for the test collections based on the W method. The horizontal axis corresponds to the indicator; the vertical axis corresponds to the retrieval accuracy. The regression line and the correlation coefficient (r) are given on top of each diagram (r = .81 for CACM, .9 for CISI, .89 for CRAN and .89 for MED).

Figure 14: The regression lines for the test collections based on the SW method (CACM: y = 3.8 + 49.17x, r = .95; CISI: y = 3.8 + 48.37x, r = .95; CRAN: y = 4.4 + 48.73x, r = .97; MED: y = 6.3 + 48.13x, r = .95). The horizontal axis corresponds to the indicator; the vertical axis corresponds to the retrieval accuracy.

Table 2: The (Ind, Ra) pairs at the ten checkpoints for two sample queries in CACM by the W method.

Now, consider the estimations for the SW method. Comparing the heights of the dark gray boxes with those of the gray boxes in Fig. 10, the Lr method consistently performs better than the Dm method at all points, in terms of both maximum and average discrepancies. The average discrepancies of the Dm method are mostly in the 10-20% range, while those of the Lr method are mostly in the 5-10% range. Based upon these comparisons, the estimation ability of the Lr method is better than that of the Dm method.

To find out which sorting algorithm is better estimated by the estimation methods, the maximum and average discrepancies of the W method and SW method for each estimation heuristic are compared in Figs. 11 and 12, respectively. For the Dm method, except for 1 case in CACM, 1 case in CISI and 1 case in MED, the maximum discrepancies for the SW method are smaller. Except for 4 cases in CACM, the average discrepancies for the SW method are smaller as well. As far as the Lr method is concerned, the estimations for the SW method are better than those for the W method in almost all cases, in terms of both maximum and average discrepancies. Moreover, the discrepancies of the SW method are much smaller than those of the W method. Based upon these results, the estimation methods can approximate the SW method with higher accuracy. For practical purposes, both estimation methods achieve satisfactory results on average.

Let us now turn to the efficiency of the methods. The algorithm of the Dm method seems easy to implement. It involves counting the durations of documents in the Candidate Region and sorting the document-weight array after each disk page is processed. Based upon the description above, the Lr method seems more complex, because it involves the construction of an indicator and the use of linear regression techniques to generate a regression line for each document base. However, in operating environments, the Lr method performs better than the Dm method. The process of generating a regression line is done once for each document collection (or after the document collection has been extensively updated) and thus it is not costly. To process a query, the document-weight array is sorted only 10 times, at the 10 checkpoints, and the estimation of the Ra's can be done rather inexpensively by mapping the value of Ind onto the regression line. The Dm method, however, has to sort the document-weight array after every disk page is processed. If, say, 300 disk pages are retrieved, the array is sorted 300 times, compared to the 10 times required by the Lr method. One immediate suggestion for the Dm method is to sort doc_wt less frequently. However, the discrepancies of the estimations could get worse when the behavior of document movements in the Candidate Region is not


More information

Learning Robust Locality Preserving Projection via p-order Minimization

Learning Robust Locality Preserving Projection via p-order Minimization Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence Learning Robust Locality Preserving Projection via -Order Minimization Hua Wang, Feiing Nie, Heng Huang Deartment of Electrical

More information

An empirical analysis of loopy belief propagation in three topologies: grids, small-world networks and random graphs

An empirical analysis of loopy belief propagation in three topologies: grids, small-world networks and random graphs An emirical analysis of looy belief roagation in three toologies: grids, small-world networks and random grahs R. Santana, A. Mendiburu and J. A. Lozano Intelligent Systems Grou Deartment of Comuter Science

More information

Figure 8.1: Home age taken from the examle health education site (htt:// Setember 14, 2001). 201

Figure 8.1: Home age taken from the examle health education site (htt://  Setember 14, 2001). 201 200 Chater 8 Alying the Web Interface Profiles: Examle Web Site Assessment 8.1 Introduction This chater describes the use of the rofiles develoed in Chater 6 to assess and imrove the quality of an examle

More information

IEEE Coyright Notice Personal use of this material is ermitted. However, ermission to rerint/reublish this material for advertising or romotional uroses or for creating new collective works for resale

More information

Truth Trees. Truth Tree Fundamentals

Truth Trees. Truth Tree Fundamentals Truth Trees 1 True Tree Fundamentals 2 Testing Grous of Statements for Consistency 3 Testing Arguments in Proositional Logic 4 Proving Invalidity in Predicate Logic Answers to Selected Exercises Truth

More information

Improve Precategorized Collection Retrieval by Using Supervised Term Weighting Schemes Λ

Improve Precategorized Collection Retrieval by Using Supervised Term Weighting Schemes Λ Imrove Precategorized Collection Retrieval by Using Suervised Term Weighting Schemes Λ Ying Zhao and George Karyis University of Minnesota, Deartment of Comuter Science Minneaolis, MN 55455 Abstract The

More information

To appear in IEEE TKDE Title: Efficient Skyline and Top-k Retrieval in Subspaces Keywords: Skyline, Top-k, Subspace, B-tree

To appear in IEEE TKDE Title: Efficient Skyline and Top-k Retrieval in Subspaces Keywords: Skyline, Top-k, Subspace, B-tree To aear in IEEE TKDE Title: Efficient Skyline and To-k Retrieval in Subsaces Keywords: Skyline, To-k, Subsace, B-tree Contact Author: Yufei Tao (taoyf@cse.cuhk.edu.hk) Deartment of Comuter Science and

More information

Grouping of Patches in Progressive Radiosity

Grouping of Patches in Progressive Radiosity Grouing of Patches in Progressive Radiosity Arjan J.F. Kok * Abstract The radiosity method can be imroved by (adatively) grouing small neighboring atches into grous. Comutations normally done for searate

More information

SPITFIRE: Scalable Parallel Algorithms for Test Set Partitioned Fault Simulation

SPITFIRE: Scalable Parallel Algorithms for Test Set Partitioned Fault Simulation To aear in IEEE VLSI Test Symosium, 1997 SITFIRE: Scalable arallel Algorithms for Test Set artitioned Fault Simulation Dili Krishnaswamy y Elizabeth M. Rudnick y Janak H. atel y rithviraj Banerjee z y

More information

Collective Communication: Theory, Practice, and Experience. FLAME Working Note #22

Collective Communication: Theory, Practice, and Experience. FLAME Working Note #22 Collective Communication: Theory, Practice, and Exerience FLAME Working Note # Ernie Chan Marcel Heimlich Avi Purkayastha Robert van de Geijn Setember, 6 Abstract We discuss the design and high-erformance

More information

J. Parallel Distrib. Comput.

J. Parallel Distrib. Comput. J. Parallel Distrib. Comut. 71 (2011) 288 301 Contents lists available at ScienceDirect J. Parallel Distrib. Comut. journal homeage: www.elsevier.com/locate/jdc Quality of security adatation in arallel

More information

Learning Motion Patterns in Crowded Scenes Using Motion Flow Field

Learning Motion Patterns in Crowded Scenes Using Motion Flow Field Learning Motion Patterns in Crowded Scenes Using Motion Flow Field Min Hu, Saad Ali and Mubarak Shah Comuter Vision Lab, University of Central Florida {mhu,sali,shah}@eecs.ucf.edu Abstract Learning tyical

More information

A Novel Iris Segmentation Method for Hand-Held Capture Device

A Novel Iris Segmentation Method for Hand-Held Capture Device A Novel Iris Segmentation Method for Hand-Held Cature Device XiaoFu He and PengFei Shi Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200030, China {xfhe,

More information

Sensitivity of multi-product two-stage economic lotsizing models and their dependency on change-over and product cost ratio s

Sensitivity of multi-product two-stage economic lotsizing models and their dependency on change-over and product cost ratio s Sensitivity two stage EOQ model 1 Sensitivity of multi-roduct two-stage economic lotsizing models and their deendency on change-over and roduct cost ratio s Frank Van den broecke, El-Houssaine Aghezzaf,

More information

SEARCH ENGINE MANAGEMENT

SEARCH ENGINE MANAGEMENT e-issn 2455 1392 Volume 2 Issue 5, May 2016. 254 259 Scientific Journal Imact Factor : 3.468 htt://www.ijcter.com SEARCH ENGINE MANAGEMENT Abhinav Sinha Kalinga Institute of Industrial Technology, Bhubaneswar,

More information

Convex Hulls. Helen Cameron. Helen Cameron Convex Hulls 1/101

Convex Hulls. Helen Cameron. Helen Cameron Convex Hulls 1/101 Convex Hulls Helen Cameron Helen Cameron Convex Hulls 1/101 What Is a Convex Hull? Starting Point: Points in 2D y x Helen Cameron Convex Hulls 3/101 Convex Hull: Informally Imagine that the x, y-lane is

More information

Equality-Based Translation Validator for LLVM

Equality-Based Translation Validator for LLVM Equality-Based Translation Validator for LLVM Michael Ste, Ross Tate, and Sorin Lerner University of California, San Diego {mste,rtate,lerner@cs.ucsd.edu Abstract. We udated our Peggy tool, reviously resented

More information

IMS Network Deployment Cost Optimization Based on Flow-Based Traffic Model

IMS Network Deployment Cost Optimization Based on Flow-Based Traffic Model IMS Network Deloyment Cost Otimization Based on Flow-Based Traffic Model Jie Xiao, Changcheng Huang and James Yan Deartment of Systems and Comuter Engineering, Carleton University, Ottawa, Canada {jiexiao,

More information

Randomized algorithms: Two examples and Yao s Minimax Principle

Randomized algorithms: Two examples and Yao s Minimax Principle Randomized algorithms: Two examles and Yao s Minimax Princile Maximum Satisfiability Consider the roblem Maximum Satisfiability (MAX-SAT). Bring your knowledge u-to-date on the Satisfiability roblem. Maximum

More information

Pivot Selection for Dimension Reduction Using Annealing by Increasing Resampling *

Pivot Selection for Dimension Reduction Using Annealing by Increasing Resampling * ivot Selection for Dimension Reduction Using Annealing by Increasing Resamling * Yasunobu Imamura 1, Naoya Higuchi 1, Tetsuji Kuboyama 2, Kouichi Hirata 1 and Takeshi Shinohara 1 1 Kyushu Institute of

More information

10. Parallel Methods for Data Sorting

10. Parallel Methods for Data Sorting 10. Parallel Methods for Data Sorting 10. Parallel Methods for Data Sorting... 1 10.1. Parallelizing Princiles... 10.. Scaling Parallel Comutations... 10.3. Bubble Sort...3 10.3.1. Sequential Algorithm...3

More information

GEOMETRIC CONSTRAINT SOLVING IN < 2 AND < 3. Department of Computer Sciences, Purdue University. and PAMELA J. VERMEER

GEOMETRIC CONSTRAINT SOLVING IN < 2 AND < 3. Department of Computer Sciences, Purdue University. and PAMELA J. VERMEER GEOMETRIC CONSTRAINT SOLVING IN < AND < 3 CHRISTOPH M. HOFFMANN Deartment of Comuter Sciences, Purdue University West Lafayette, Indiana 47907-1398, USA and PAMELA J. VERMEER Deartment of Comuter Sciences,

More information

Identity-sensitive Points-to Analysis for the Dynamic Behavior of JavaScript Objects

Identity-sensitive Points-to Analysis for the Dynamic Behavior of JavaScript Objects Identity-sensitive Points-to Analysis for the Dynamic Behavior of JavaScrit Objects Shiyi Wei and Barbara G. Ryder Deartment of Comuter Science, Virginia Tech, Blacksburg, VA, USA. {wei,ryder}@cs.vt.edu

More information

An Efficient VLSI Architecture for Adaptive Rank Order Filter for Image Noise Removal

An Efficient VLSI Architecture for Adaptive Rank Order Filter for Image Noise Removal International Journal of Information and Electronics Engineering, Vol. 1, No. 1, July 011 An Efficient VLSI Architecture for Adative Rank Order Filter for Image Noise Removal M. C Hanumantharaju, M. Ravishankar,

More information

A DEA-bases Approach for Multi-objective Design of Attribute Acceptance Sampling Plans

A DEA-bases Approach for Multi-objective Design of Attribute Acceptance Sampling Plans Available online at htt://ijdea.srbiau.ac.ir Int. J. Data Enveloment Analysis (ISSN 2345-458X) Vol.5, No.2, Year 2017 Article ID IJDEA-00422, 12 ages Research Article International Journal of Data Enveloment

More information

A Measurement Study of Internet Bottlenecks

A Measurement Study of Internet Bottlenecks A Measurement Study of Internet Bottlenecks Ningning Hu, Li (Erran) Li y, Zhuoqing Morley Mao z, Peter Steenkiste and Jia Wang x Carnegie Mellon University, Email: fhnn, rsg@cs.cmu.edu y Bell Laboratories,

More information

Stereo Disparity Estimation in Moment Space

Stereo Disparity Estimation in Moment Space Stereo Disarity Estimation in oment Sace Angeline Pang Faculty of Information Technology, ultimedia University, 63 Cyberjaya, alaysia. angeline.ang@mmu.edu.my R. ukundan Deartment of Comuter Science, University

More information

Experiments on Patent Retrieval at NTCIR-4 Workshop

Experiments on Patent Retrieval at NTCIR-4 Workshop Working Notes of NTCIR-4, Tokyo, 2-4 June 2004 Exeriments on Patent Retrieval at NTCIR-4 Worksho Hironori Takeuchi Λ Naohiko Uramoto Λy Koichi Takeda Λ Λ Tokyo Research Laboratory, IBM Research y National

More information

Patterned Wafer Segmentation

Patterned Wafer Segmentation atterned Wafer Segmentation ierrick Bourgeat ab, Fabrice Meriaudeau b, Kenneth W. Tobin a, atrick Gorria b a Oak Ridge National Laboratory,.O.Box 2008, Oak Ridge, TN 37831-6011, USA b Le2i Laboratory Univ.of

More information

TOPP Probing of Network Links with Large Independent Latencies

TOPP Probing of Network Links with Large Independent Latencies TOPP Probing of Network Links with Large Indeendent Latencies M. Hosseinour, M. J. Tunnicliffe Faculty of Comuting, Information ystems and Mathematics, Kingston University, Kingston-on-Thames, urrey, KT1

More information

A 2D Random Walk Mobility Model for Location Management Studies in Wireless Networks Abstract: I. Introduction

A 2D Random Walk Mobility Model for Location Management Studies in Wireless Networks Abstract: I. Introduction A D Random Walk Mobility Model for Location Management Studies in Wireless Networks Kuo Hsing Chiang, RMIT University, Melbourne, Australia Nirmala Shenoy, Information Technology Deartment, RIT, Rochester,

More information

Face Recognition Using Legendre Moments

Face Recognition Using Legendre Moments Face Recognition Using Legendre Moments Dr.S.Annadurai 1 A.Saradha Professor & Head of CSE & IT Research scholar in CSE Government College of Technology, Government College of Technology, Coimbatore, Tamilnadu,

More information

Semi-Supervised Learning Based Object Detection in Aerial Imagery

Semi-Supervised Learning Based Object Detection in Aerial Imagery Semi-Suervised Learning Based Obect Detection in Aerial Imagery Jian Yao Zhongfei (Mark) Zhang Deartment of Comuter Science, State University of ew York at Binghamton, Y 13905, USA yao@binghamton.edu Zhongfei@cs.binghamton.edu

More information

An Indexing Framework for Structured P2P Systems

An Indexing Framework for Structured P2P Systems An Indexing Framework for Structured P2P Systems Adina Crainiceanu Prakash Linga Ashwin Machanavajjhala Johannes Gehrke Carl Lagoze Jayavel Shanmugasundaram Deartment of Comuter Science, Cornell University

More information

Improving Trust Estimates in Planning Domains with Rare Failure Events

Improving Trust Estimates in Planning Domains with Rare Failure Events Imroving Trust Estimates in Planning Domains with Rare Failure Events Colin M. Potts and Kurt D. Krebsbach Det. of Mathematics and Comuter Science Lawrence University Aleton, Wisconsin 54911 USA {colin.m.otts,

More information

Leak Detection Modeling and Simulation for Oil Pipeline with Artificial Intelligence Method

Leak Detection Modeling and Simulation for Oil Pipeline with Artificial Intelligence Method ITB J. Eng. Sci. Vol. 39 B, No. 1, 007, 1-19 1 Leak Detection Modeling and Simulation for Oil Pieline with Artificial Intelligence Method Pudjo Sukarno 1, Kuntjoro Adji Sidarto, Amoranto Trisnobudi 3,

More information

Building Better Nurse Scheduling Algorithms

Building Better Nurse Scheduling Algorithms Building Better Nurse Scheduling Algorithms Annals of Oerations Research, 128, 159-177, 2004. Dr Uwe Aickelin Dr Paul White School of Comuter Science University of the West of England University of Nottingham

More information

I ACCEPT NO RESPONSIBILITY FOR ERRORS ON THIS SHEET. I assume that E = (V ).

I ACCEPT NO RESPONSIBILITY FOR ERRORS ON THIS SHEET. I assume that E = (V ). 1 I ACCEPT NO RESPONSIBILITY FOR ERRORS ON THIS SHEET. I assume that E = (V ). Data structures Sorting Binary heas are imlemented using a hea-ordered balanced binary tree. Binomial heas use a collection

More information

Process and Measurement System Capability Analysis

Process and Measurement System Capability Analysis Process and Measurement System aability Analysis Process caability is the uniformity of the rocess. Variability is a measure of the uniformity of outut. Assume that a rocess involves a quality characteristic

More information

Interactive Image Segmentation

Interactive Image Segmentation Interactive Image Segmentation Fahim Mannan (260 266 294) Abstract This reort resents the roject work done based on Boykov and Jolly s interactive grah cuts based N-D image segmentation algorithm([1]).

More information

3D Surface Simplification Based on Extended Shape Operator

3D Surface Simplification Based on Extended Shape Operator 3D Surface Simlification Based on Extended Shae Oerator JUI-LIG TSEG, YU-HSUA LI Deartment of Comuter Science and Information Engineering, Deartment and Institute of Electrical Engineering Minghsin University

More information

Use of Multivariate Statistical Analysis in the Modelling of Chromatographic Processes

Use of Multivariate Statistical Analysis in the Modelling of Chromatographic Processes Use of Multivariate Statistical Analysis in the Modelling of Chromatograhic Processes Simon Edwards-Parton 1, Nigel itchener-hooker 1, Nina hornhill 2, Daniel Bracewell 1, John Lidell 3 Abstract his aer

More information

Multi-robot SLAM with Unknown Initial Correspondence: The Robot Rendezvous Case

Multi-robot SLAM with Unknown Initial Correspondence: The Robot Rendezvous Case Multi-robot SLAM with Unknown Initial Corresondence: The Robot Rendezvous Case Xun S. Zhou and Stergios I. Roumeliotis Deartment of Comuter Science & Engineering, University of Minnesota, Minneaolis, MN

More information

A CLASS OF STRUCTURED LDPC CODES WITH LARGE GIRTH

A CLASS OF STRUCTURED LDPC CODES WITH LARGE GIRTH A CLASS OF STRUCTURED LDPC CODES WITH LARGE GIRTH Jin Lu, José M. F. Moura, and Urs Niesen Deartment of Electrical and Comuter Engineering Carnegie Mellon University, Pittsburgh, PA 15213 jinlu, moura@ece.cmu.edu

More information

arxiv: v1 [cs.mm] 18 Jan 2016

arxiv: v1 [cs.mm] 18 Jan 2016 Lossless Intra Coding in with 3-ta Filters Saeed R. Alvar a, Fatih Kamisli a a Deartment of Electrical and Electronics Engineering, Middle East Technical University, Turkey arxiv:1601.04473v1 [cs.mm] 18

More information

AN early generation of unstructured P2P systems is

AN early generation of unstructured P2P systems is 1078 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 16, NO. 11, NOVEMBER 2005 Dynamic Layer Management in Suereer Architectures Li Xiao, Member, IEEE, Zhenyun Zhuang, and Yunhao Liu, Member,

More information

Hardware-Accelerated Formal Verification

Hardware-Accelerated Formal Verification Hardare-Accelerated Formal Verification Hiroaki Yoshida, Satoshi Morishita 3 Masahiro Fujita,. VLSI Design and Education Center (VDEC), University of Tokyo. CREST, Jaan Science and Technology Agency 3.

More information

Introduction to Parallel Algorithms

Introduction to Parallel Algorithms CS 1762 Fall, 2011 1 Introduction to Parallel Algorithms Introduction to Parallel Algorithms ECE 1762 Algorithms and Data Structures Fall Semester, 2011 1 Preliminaries Since the early 1990s, there has

More information

Texture Mapping with Vector Graphics: A Nested Mipmapping Solution

Texture Mapping with Vector Graphics: A Nested Mipmapping Solution Texture Maing with Vector Grahics: A Nested Mimaing Solution Wei Zhang Yonggao Yang Song Xing Det. of Comuter Science Det. of Comuter Science Det. of Information Systems Prairie View A&M University Prairie

More information

Autonomic Physical Database Design - From Indexing to Multidimensional Clustering

Autonomic Physical Database Design - From Indexing to Multidimensional Clustering Autonomic Physical Database Design - From Indexing to Multidimensional Clustering Stehan Baumann, Kai-Uwe Sattler Databases and Information Systems Grou Technische Universität Ilmenau, Ilmenau, Germany

More information

An improved algorithm for Hausdorff Voronoi diagram for non-crossing sets

An improved algorithm for Hausdorff Voronoi diagram for non-crossing sets An imroved algorithm for Hausdorff Voronoi diagram for non-crossing sets Frank Dehne, Anil Maheshwari and Ryan Taylor May 26, 2006 Abstract We resent an imroved algorithm for building a Hausdorff Voronoi

More information

APPLICATION OF PARTICLE FILTERS TO MAP-MATCHING ALGORITHM

APPLICATION OF PARTICLE FILTERS TO MAP-MATCHING ALGORITHM APPLICATION OF PARTICLE FILTERS TO MAP-MATCHING ALGORITHM Pavel Davidson 1, Jussi Collin 2, and Jarmo Taala 3 Deartment of Comuter Systems, Tamere University of Technology, Finland e-mail: avel.davidson@tut.fi

More information

[9] J. J. Dongarra, R. Hempel, A. J. G. Hey, and D. W. Walker, \A Proposal for a User-Level,

[9] J. J. Dongarra, R. Hempel, A. J. G. Hey, and D. W. Walker, \A Proposal for a User-Level, [9] J. J. Dongarra, R. Hemel, A. J. G. Hey, and D. W. Walker, \A Proosal for a User-Level, Message Passing Interface in a Distributed-Memory Environment," Tech. Re. TM-3, Oak Ridge National Laboratory,

More information

MATHEMATICAL MODELING OF COMPLEX MULTI-COMPONENT MOVEMENTS AND OPTICAL METHOD OF MEASUREMENT

MATHEMATICAL MODELING OF COMPLEX MULTI-COMPONENT MOVEMENTS AND OPTICAL METHOD OF MEASUREMENT MATHEMATICAL MODELING OF COMPLE MULTI-COMPONENT MOVEMENTS AND OPTICAL METHOD OF MEASUREMENT V.N. Nesterov JSC Samara Electromechanical Plant, Samara, Russia Abstract. The rovisions of the concet of a multi-comonent

More information

CMSC 425: Lecture 16 Motion Planning: Basic Concepts

CMSC 425: Lecture 16 Motion Planning: Basic Concepts : Lecture 16 Motion lanning: Basic Concets eading: Today s material comes from various sources, including AI Game rogramming Wisdom 2 by S. abin and lanning Algorithms by S. M. LaValle (Chats. 4 and 5).

More information

Theoretical Analysis of Graphcut Textures

Theoretical Analysis of Graphcut Textures Theoretical Analysis o Grahcut Textures Xuejie Qin Yee-Hong Yang {xu yang}@cs.ualberta.ca Deartment o omuting Science University o Alberta Abstract Since the aer was ublished in SIGGRAPH 2003 the grahcut

More information

Skip List Based Authenticated Data Structure in DAS Paradigm

Skip List Based Authenticated Data Structure in DAS Paradigm 009 Eighth International Conference on Grid and Cooerative Comuting Ski List Based Authenticated Data Structure in DAS Paradigm Jieing Wang,, Xiaoyong Du,. Key Laboratory of Data Engineering and Knowledge

More information

521493S Computer Graphics Exercise 3 (Chapters 6-8)

521493S Computer Graphics Exercise 3 (Chapters 6-8) 521493S Comuter Grahics Exercise 3 (Chaters 6-8) 1 Most grahics systems and APIs use the simle lighting and reflection models that we introduced for olygon rendering Describe the ways in which each of

More information

Multicast in Wormhole-Switched Torus Networks using Edge-Disjoint Spanning Trees 1

Multicast in Wormhole-Switched Torus Networks using Edge-Disjoint Spanning Trees 1 Multicast in Wormhole-Switched Torus Networks using Edge-Disjoint Sanning Trees 1 Honge Wang y and Douglas M. Blough z y Myricom Inc., 325 N. Santa Anita Ave., Arcadia, CA 916, z School of Electrical and

More information

AN ANALYTICAL MODEL DESCRIBING THE RELATIONSHIPS BETWEEN LOGIC ARCHITECTURE AND FPGA DENSITY

AN ANALYTICAL MODEL DESCRIBING THE RELATIONSHIPS BETWEEN LOGIC ARCHITECTURE AND FPGA DENSITY AN ANALYTICAL MODEL DESCRIBING THE RELATIONSHIPS BETWEEN LOGIC ARCHITECTURE AND FPGA DENSITY Andrew Lam 1, Steven J.E. Wilton 1, Phili Leong 2, Wayne Luk 3 1 Elec. and Com. Engineering 2 Comuter Science

More information

Experimental Comparison of Shortest Path Approaches for Timetable Information

Experimental Comparison of Shortest Path Approaches for Timetable Information Exerimental Comarison of Shortest Path roaches for Timetable Information Evangelia Pyrga Frank Schulz Dorothea Wagner Christos Zaroliagis bstract We consider two aroaches that model timetable information

More information

TD C. Space filling designs in R

TD C. Space filling designs in R TD C Sace filling designs in R 8/05/0 Tyical engineering ractice : One-At-a-Time (OAT design X P P3 P X Main remarks : OAT brings some information, but otentially wrong Eloration is oor : on monotonicity?

More information

Lecture 2: Fixed-Radius Near Neighbors and Geometric Basics

Lecture 2: Fixed-Radius Near Neighbors and Geometric Basics structure arises in many alications of geometry. The dual structure, called a Delaunay triangulation also has many interesting roerties. Figure 3: Voronoi diagram and Delaunay triangulation. Search: Geometric

More information

Face Recognition Based on Wavelet Transform and Adaptive Local Binary Pattern

Face Recognition Based on Wavelet Transform and Adaptive Local Binary Pattern Face Recognition Based on Wavelet Transform and Adative Local Binary Pattern Abdallah Mohamed 1,2, and Roman Yamolskiy 1 1 Comuter Engineering and Comuter Science, University of Louisville, Louisville,

More information

Continuous Visible k Nearest Neighbor Query on Moving Objects

Continuous Visible k Nearest Neighbor Query on Moving Objects Continuous Visible k Nearest Neighbor Query on Moving Objects Yaniu Wang a, Rui Zhang b, Chuanfei Xu a, Jianzhong Qi b, Yu Gu a, Ge Yu a, a Deartment of Comuter Software and Theory, Northeastern University,

More information

MULTIPLE SENSOR TRACKING IN A SENSE & AVOID CONTEXT

MULTIPLE SENSOR TRACKING IN A SENSE & AVOID CONTEXT 7 TH INTERNATIONAL CONGRESS OF THE AERONAUTICAL SCIENCES MULTILE SENSOR TRACKING IN A SENSE & AVOID CONTET M. Rousseau*, L. Ratton*, T. Fournet* *THALES, Defense Mission System Elancourt Keywords: Sense&Avoid,

More information

in Distributed Systems Department of Computer Science, Keio University into four forms according to asynchrony and real-time properties.

in Distributed Systems Department of Computer Science, Keio University into four forms according to asynchrony and real-time properties. Asynchrony and Real-Time in Distributed Systems Mario Tokoro? and Ichiro Satoh?? Deartment of Comuter Science, Keio University 3-14-1, Hiyoshi, Kohoku-ku, Yokohama, 223, Jaan Tel: +81-45-56-115 Fax: +81-45-56-1151

More information

AUTOMATIC 3D SURFACE RECONSTRUCTION BY COMBINING STEREOVISION WITH THE SLIT-SCANNER APPROACH

AUTOMATIC 3D SURFACE RECONSTRUCTION BY COMBINING STEREOVISION WITH THE SLIT-SCANNER APPROACH AUTOMATIC 3D SURFACE RECONSTRUCTION BY COMBINING STEREOVISION WITH THE SLIT-SCANNER APPROACH A. Prokos 1, G. Karras 1, E. Petsa 2 1 Deartment of Surveying, National Technical University of Athens (NTUA),

More information

Detection of Occluded Face Image using Mean Based Weight Matrix and Support Vector Machine

Detection of Occluded Face Image using Mean Based Weight Matrix and Support Vector Machine Journal of Comuter Science 8 (7): 1184-1190, 2012 ISSN 1549-3636 2012 Science Publications Detection of Occluded Face Image using Mean Based Weight Matrix and Suort Vector Machine 1 G. Nirmala Priya and

More information

Robust Motion Estimation for Video Sequences Based on Phase-Only Correlation

Robust Motion Estimation for Video Sequences Based on Phase-Only Correlation Robust Motion Estimation for Video Sequences Based on Phase-Only Correlation Loy Hui Chien and Takafumi Aoki Graduate School of Information Sciences Tohoku University Aoba-yama 5, Sendai, 98-8579, Jaan

More information

A Texture Based Matching Approach for Automated Assembly of Puzzles

A Texture Based Matching Approach for Automated Assembly of Puzzles A Texture Based Matching Aroach for Automated Assembly of Puzzles Mahmut amil Saırolu 1, Aytül Erçil Sabancı University 1 msagiroglu@su.sabanciuniv.edu, aytulercil@sabanciuniv.edu Abstract The uzzle assembly

More information

Submission. Verifying Properties Using Sequential ATPG

Submission. Verifying Properties Using Sequential ATPG Verifying Proerties Using Sequential ATPG Jacob A. Abraham and Vivekananda M. Vedula Comuter Engineering Research Center The University of Texas at Austin Austin, TX 78712 jaa, vivek @cerc.utexas.edu Daniel

More information

12) United States Patent 10) Patent No.: US 6,321,328 B1

12) United States Patent 10) Patent No.: US 6,321,328 B1 USOO6321328B1 12) United States Patent 10) Patent No.: 9 9 Kar et al. (45) Date of Patent: Nov. 20, 2001 (54) PROCESSOR HAVING DATA FOR 5,961,615 10/1999 Zaid... 710/54 SPECULATIVE LOADS 6,006,317 * 12/1999

More information