Query based Site Selection for Distributed Search Engines


Nobuyoshi SATO, Minoru UDAGAWA, Minoru UEHARA, Yoshifumi SAKAI, Hideki MORI
Department of Information and Computer Sciences, Toyo University, JAPAN

Abstract

We have developed a distributed search engine, called Cooperative Search Engine (CSE), in order to retrieve fresh information. In CSE, a local search engine located in each Web server makes an index of local pages, and a meta search server integrates these local search engines to realize a global search engine. With this architecture, communication delay occurs at retrieval time, so we have developed several speedup techniques to realize fast retrieval. However, these techniques cannot be used for first-page retrieval in Next 10 search if the page has not been searched yet. We have therefore proposed Query based Site Selection (QbSS), which is available in all cases. In this paper, we describe QbSS in detail and discuss its features.

1. Introduction

Search engines are very important for Web page retrieval. Typical search engines employ a centralized architecture. In such a centralized search engine, a robot collects Web pages and an indexer makes an index of these pages so that they can be searched fast. We define the update interval as the period during which a page has been published but cannot be searched yet. A centralized architecture has the problem that the update interval is very long. For example, Google takes 4 weeks[1]. So, we have developed a distributed search engine, Cooperative Search Engine (CSE)[2][3], in order to reduce the update interval. In CSE, a local search engine located in each Web server makes an index of local pages. Furthermore, a meta search engine integrates these local search engines in order to realize a global search engine. With such a mechanism, the update interval is reduced, but communication overhead is increased.
As a result, early CSE is suited to intranet information retrieval in small-scale networks that consist of fewer than 100 servers. However, international enterprises often have more than 100 servers in their domains. In order to improve the scalability of CSE, we have developed several techniques such as Score based Site Selection (SbSS)[7] and Persistent Cache[9]. In SbSS, when the second or a later page is retrieved in Next 10 search, a client sends a query to at most the top 10 sites by holding the maximum score of each server. As a result, CSE achieves scalability when retrieving the second or later pages. Persistent Cache keeps valid data after updating, and it achieves scalability when retrieving a first page that has been searched once. However, we still have the problem that these techniques cannot be used if the first page of Next 10 search has not been searched yet.

In this paper, we propose Query based Site Selection (QbSS) in order to reduce the retrieval time even if the first page of Next 10 search has not been searched yet. QbSS is a site selection technique based on the Boolean formula of a query. CSE supports Boolean search based on Boolean formulas, in which the operations and, or, and and-not are available. Let $S_A$ and $S_B$ be the sets of target sites for search queries A and B, respectively. Then, the sets of target sites for the queries A and B, A or B, and A and-not B are $S_A \cap S_B$, $S_A \cup S_B$, and $S_A$, respectively. By this selection of the target sites, the number of messages in the search process is reduced.

The remainder of this paper is organized as follows. We give an overview of CSE and its behavior in Section 2. We describe QbSS in Section 3 and evaluate it in Section 4. In Section 5, we survey related work on distributed information retrieval. Finally, we summarize conclusions and future work.

2. Cooperative Search Engine

First, we explain the basic idea of CSE.
In order to minimize the update interval, every Web site basically makes indices via a local indexer. However, these sites are not cooperative yet. Each site sends the information about what (i.e. which words) it knows to the manager. This information is called Forward Knowledge (FK), and is meta knowledge indicating what each site knows. FK is the same as FI of Ingrid. When searching, the manager tells the client which sites have documents including any word in the query, and then the client sends the query to all of those sites. Since CSE thus needs two-pass communication when searching, the retrieval time of CSE becomes longer than that of a centralized search engine.

CSE consists of the following components (see Figure 1).

Figure 1. The overview of CSE

Location Server (LS): It manages FK exclusively. Using FK, LS performs Query based Site Selection, described later. LS also has a Site selection Cache (SC), which caches results of site selection.

Cache Server (CS): It caches FK and retrieval results. LS can be thought of as the top-level CS. CS realizes Next 10 searches by caching retrieval results. Furthermore, it realizes a parallel search by calling the LMSEs, described below, in parallel.

Local Meta Search Engine (LMSE): It receives queries from a user, sends them to CS (User I/F in Figure 1), and performs the local search process by calling the LSE, described below (Engine I/F in Figure 1). It works as the meta search engine that abstracts the differences between LSEs.

Local Search Engine (LSE): It gathers documents locally (Gatherer in Figure 1), makes a local index (Indexer in Figure 1), and retrieves documents by using the index (Engine in Figure 1). In CSE, Namazu[4] can be used as an LSE. Namazu has been widely used for search services on various Japanese sites.

Next, we explain how the update process is done. In CSE, the Update I/F of LSE carries out the update process periodically. The algorithm for the update process in CSE is as follows.

1. The Gatherer of LSE gathers all the documents (Web pages) in the target Web sites using direct access (i.e. via NFS) if available, using archived access (i.e. via CGI) if it is available but direct access is not, and using HTTP access otherwise. Here, we explain archived access in detail. In archived access, a special CGI that provides mobile agent place functions is used. A mobile agent is sent to that place. The agent archives local files, compresses them, and sends them back to the Gatherer.
2. The Indexer of LSE makes an index for the gathered documents by parallel processing based on the Boss-Worker model.
3. Update phase 1: Each LMSE$_i$ updates as follows.
3.1. The Engine I/F of LMSE$_i$ obtains from the corresponding LSE the total number $N_i$ of all the documents, the set $K_i$ of all the words appearing in some document, and the number $n_{k,i}$ of all the documents including word $k$, and sends all of them to CS together with its own URL.
3.2. CS sends all the contents received from each LMSE$_i$ to the upper-level CS. The transmission of the contents terminates when they reach the top-level CS (namely, LS).
3.3. LS calculates the value of $\mathrm{idf}(k) = \log\left(\sum_i N_i / \sum_i n_{k,i}\right)$ from $n_{k,i}$ and $N_i$ for each word $k$.
4. Update phase 2: Each LMSE$_i$ updates as follows.
4.1. LMSE$_i$ receives from LS the set $Q$ of Boolean queries that have been searched and the set of idf values.
4.2. The Engine I/F of LMSE$_i$ obtains from the corresponding LSE the highest score $\max_{d \in D} S_i(d,q)$ for each $q \in \{Q, K_i\}$, where $S_i(d,k)$ is the score of document $d$ containing $k$ and $D$ is the set of all the documents in the site, and sends all of them to CS together with its own URL.
4.3. CS sends all the contents received from each LMSE$_i$ to the upper-level CS. The transmission of the contents terminates when they reach the top-level CS (namely, LS).

Note that the data transferred between the modules are mainly used for the distributed calculation of the score based on the tf*idf method. We call this method the distributed tf*idf method. The score based on the distributed tf*idf method is calculated during the search process, so we give the details of the score when we explain the search process in CSE. In CSE, the performance of the search process is sacrificed for good performance of the update process.

Here we explain how the search process in CSE is done.

1. When LMSE$_0$ receives a query from a user, it sends the query to CS.
2. CS obtains from LS all the LMSEs expected to have documents satisfying the query.
3. CS sends the query to each of the LMSEs obtained.
4. Each LMSE searches for documents satisfying the query by using its LSE, and returns the result to CS.
5. CS combines all the results received from the LMSEs, and returns the combined result to LMSE$_0$.
6. LMSE$_0$ displays the search result to the user.

Here, we describe the design of the scalable architecture for the distributed search engine, CSE. In CSE, communication delay occurs at search time. This problem is addressed by the following techniques.

Look Ahead Cache (LAC) in Next 10 Search[5][6]: To shorten the delay in the search process, CS prepares the next result for the Next 10 search. That is, the search result is divided into page units, and each page unit is cached in advance by a background process without increasing the response time.

Score based Site Selection (SbSS)[7]: In the Next 10 search, the score of the next ranked document in each site is gathered in advance, and the requests to the sites with low-ranked documents are suppressed. By this suppression, the network traffic does not increase unnecessarily. For example, there are more than 100,000 domain sites in Japan. However, by using this technique, requests to about ten sites are sufficient for each continued search.

Global Shared Cache (GSC)[8]: An LMSE sends a query to the nearest CS. Many CSs may send the same requests to LMSEs. So, in order to globally share cached retrieval results among CSs, we proposed Global Shared Cache (GSC). In this method, LS memorizes the authority CS$_a$ of each query and tells CSs CS$_a$ instead of the LMSEs. CS caches the cached contents of CS$_a$.

Persistent Cache (PC)[9]: There is at least one CS in CSE in order to improve the response time of retrieval. However, the cache soon becomes invalid because the update interval is very short in CSE. The valuable first page is also lost. Therefore, we need a persistent cache, which holds valid cache data before and after updating. In this method, there are two update phases. In the first update phase, each LMSE sends the number of documents including each word to LS, and LS determines the idf of each word. In the second update phase, a preliminary search is performed using the new idf values in order to update the caches.

Query based Site Selection (QbSS)[10]: CSE supports Boolean search based on Boolean formulas, in which the operations and, or, and and-not are available. Let $S_A$ and $S_B$ be the sets of target sites for search queries A and B, respectively. Then, the sets of target sites for the queries A and B, A or B, and A and-not B are $S_A \cap S_B$, $S_A \cup S_B$, and $S_A$, respectively.
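The target-site rule stated for QbSS can be illustrated with a minimal Python sketch. The query encoding (nested tuples), the function name `target_sites`, and the `sites_for_word` mapping standing in for FK are all assumptions made for this illustration; they are not part of CSE itself.

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical query encoding: a keyword string, or (operator, left, right).
Query = Union[str, Tuple[str, "Query", "Query"]]


def target_sites(query: Query, sites_for_word: Dict[str, Set[str]]) -> Set[str]:
    """Return the set of sites that may hold a document satisfying `query`.

    `sites_for_word` plays the role of Forward Knowledge: it maps each
    keyword to the sites whose local index contains that keyword.
    """
    if isinstance(query, str):
        return set(sites_for_word.get(query, set()))
    op, left, right = query
    s_left = target_sites(left, sites_for_word)
    if op == "and-not":
        # The negated operand is ignored: S_A is the target set.
        return s_left
    s_right = target_sites(right, sites_for_word)
    if op == "and":
        return s_left & s_right  # S_A intersected with S_B
    if op == "or":
        return s_left | s_right  # S_A united with S_B
    raise ValueError(f"unknown operator: {op}")


# Usage with made-up site names.
fk = {"cache": {"s1", "s2"}, "search": {"s2", "s3"}}
print(target_sites(("and", "cache", "search"), fk))
print(target_sites(("and-not", "cache", "search"), fk))
```

The and-not case keeps all of $S_A$ because FK records only which words a site knows, not which documents contain them, so a site holding both words cannot be ruled out.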
By this selection of the target sites, the number of messages in the search process is reduced.

These techniques are used as follows:

  if the previous page of Next 10 search has already been searched
    LAC
  else if the query does not contain and or and-not
    SbSS
  else if it has been searched since the index was updated
    GSC
  else if it has been searched once
    PC
  else // query is new
    QbSS
  fi

Among these techniques, only QbSS is not scalable. Therefore, it is important to improve the precision of QbSS.

3. Query based Site Selection

In CSE, when a retrieval query is given, LS commissions LMSEs to search local documents. So, in order to reduce the retrieval time, it is important to select only LMSEs having at least one document satisfying the condition represented by the query. For every LMSE, LS has the set of all keywords appearing in at least one local document. We describe below how LMSEs are appropriately selected in feasible time based on this information.

Let $\{k_1, \ldots, k_n\}$ be the set of all keywords, and for each keyword $k_i$, let $x_i$ be a Boolean variable indicating whether $k_i$ appears in a document. In CSE, a retrieval query is given as a Boolean formula $f$ of variables $x_1, \ldots, x_n$, where the AND, OR, and NOT operators are available in the formula. In the following, we think of a document as an n-dimensional vector whose i-th entry takes the value 1 if keyword $k_i$ appears in the document, and 0 otherwise. Then, for a retrieval query $f$, the target to be found is the set $\{ d \mid f(d) = 1 \}$ of documents.

For an LMSE $L$, let $d(L)$ denote the n-dimensional vector obtained by taking the bitwise-or over all documents in $L$. Then, we can regard LS as having $d(L)$ for every LMSE $L$. However, it is impossible to determine exactly the set of documents maintained in $L$ from $d(L)$. On the other hand, if a document $d$ is in $L$, then it follows from the definition of $d(L)$ that $d \le d(L)$ holds, where $d \le d(L)$ means that, for any $i = 1, \ldots, n$, the i-th entry of $d$ is less than or equal to the i-th entry of $d(L)$.
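The site vector $d(L)$ and the componentwise order $d \le d(L)$ can be made concrete with a short Python sketch. The helper names `site_vector` and `dominated`, and the representation of documents as 0/1 lists, are assumptions for illustration only.

```python
from typing import List, Sequence


def site_vector(documents: Sequence[Sequence[int]]) -> List[int]:
    """Bitwise-or of all document vectors of one site, i.e. d(L)."""
    n = len(documents[0])
    return [int(any(doc[i] for doc in documents)) for i in range(n)]


def dominated(d: Sequence[int], upper: Sequence[int]) -> bool:
    """True if d <= upper holds in every coordinate."""
    return all(a <= b for a, b in zip(d, upper))


# Two documents over three keywords k1, k2, k3.
docs = [[1, 0, 0], [0, 1, 0]]
d_L = site_vector(docs)

# Every stored document is dominated by d(L) ...
print(all(dominated(d, d_L) for d in docs))
# ... but so is [1, 1, 0], which is not a stored document.
print(dominated([1, 1, 0], d_L))
```

The second check shows why $d(L)$ alone cannot recover the exact document set: any vector below $d(L)$ is indistinguishable from a real document.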
Thus, we handle $\{ d \mid d \le d(L) \}$ as the set of documents maintained in $L$. For a given retrieval query $f$ and LMSE $L$, whether there exists a document in $\{ d \mid d \le d(L) \}$ that satisfies $f$ can be determined by testing whether there exists a document $d$ such that both $f(d) = 1$ and $d \le d(L)$ hold. Let $f^{*}$ be the Boolean function such that $f^{*}(d) = 1$ if and only if there exists $d' \le d$ with $f(d') = 1$, which is known as the minimal monotone function of $f$[11]. Then the above condition, that there exists a document $d$ such that both $f(d) = 1$ and $d \le d(L)$ hold, can be replaced by the simple condition $f^{*}(d(L)) = 1$.

Unfortunately, the problem of constructing $f^{*}$ from a given Boolean formula $f$ is NP-hard, and hence no polynomial time algorithm for this problem is known. For example, expanding $f$ into a disjunctive normal form formula, and then eliminating all negative literals (that is, negated variables) from the resulting formula, yields $f^{*}$[11]. However, this algorithm is not feasible since, in general, the length of the disjunctive normal form formula of $f$ is exponentially larger than the length of $f$.

So, in CSE, LMSEs possibly having at least one document satisfying $f$ are selected by using a formula $\tilde{f}$ obtained by simply eliminating all negated subformulas from the retrieval query $f$, instead of using the minimal monotone function $f^{*}$ of $f$. Clearly, $\tilde{f}$ can be obtained from $f$ in linear time with respect to the length of $f$. Furthermore, the following two facts guarantee that all LMSEs selected by $f^{*}$ are also selected by $\tilde{f}$.

Fact 1. Eliminating negated subformulas from $f$ does not change the value that $f$ takes for any document $d$ with $f(d) = 1$.

Fact 2. No NOT operators appear in $\tilde{f}$, and hence, for any document $d$, if there exists $d' \le d$ such that $\tilde{f}(d') = 1$, then $\tilde{f}(d) = 1$ holds.

However, when $\tilde{f}$ is used instead of $f^{*}$, LMSEs that have no document satisfying $f$ may be selected, because $f^{*}(d) = 0$ does not necessarily imply $\tilde{f}(d) = 0$. As an extreme example, if $f = x_1$ AND (NOT $x_1$), then $f^{*} = 0$ (the constant function that always takes the value 0) and $\tilde{f} = x_1$, which implies that all LMSEs with a document in which keyword $k_1$ appears are selected even though we can guarantee that none of these LMSEs has a document satisfying $f$.

In the next section, we experimentally examine the accuracy and efficiency of selecting LMSEs by $\tilde{f}$. The methods we examine are as follows:

1. The method of simply constructing the minimal monotone function (SIMP): constructing the formula $\bigvee_{Y \subseteq X^{\pm}} \left( f|_{Y=1,\, X^{-} \setminus Y = 0} \wedge \bigwedge_{x \in Y} x \right)$, where $X^{-}$ and $X^{\pm}$ are the sets of all variables that are used negatively and both positively and negatively in $f$, respectively, and $f|_{V=1,\, W=0}$ denotes the Boolean function obtained from $f$ by setting every variable in $V$ to 1 and every variable in $W$ to 0.
2. The method of simply constructing the minimal monotone function after pruning subformulas (PRN): constructing the minimal monotone function using SIMP after pruning $f$ of subformulas without changing the minimal monotone function.
3. Query based Site Selection (QbSS): simply eliminating all negated subformulas from $f$.

Note that the methods SIMP and PRN are guaranteed to output $f^{*}$, the minimal monotone function of $f$, in exponential time with respect to the size of $X^{\pm}$ for $f$ and for the formula obtained by pruning $f$ of subformulas, respectively, whereas the method QbSS outputs $\tilde{f}$ in linear time with respect to the length of $f$, as mentioned before.

4. Evaluations

First, we compare query lengths under the 3 kinds of monotonization: SIMP, PRN, and QbSS. Each query length is the number of literals. There are 2 kinds of literals in a query: keywords and Boolean operators. A query is represented as a complete binary tree with depth $N$. Leaf nodes correspond to keywords, and inner nodes correspond to one of the 3 Boolean operators (and, or, and-not). The number of literals in an input query is $2^{N+1}-1$. An input query is converted by monotonization. Figure 2 shows the lengths of randomly generated queries and the lengths of the queries converted by the different monotonization methods. Here, orig, qbss, simp, and prun are the query length of an input query, of QbSS, of SIMP, and of PRN, respectively. The size of the word set is fixed to 100. Although all query lengths grow exponentially, the query length of QbSS is the shortest among these methods.
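The query model of this evaluation and the QbSS conversion can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: queries are nested tuples over the three CSE operators, keywords are integer indices, and `minimal_monotone` evaluates $f^{*}$ by brute force directly from its definition rather than by the SIMP or PRN constructions. All function names are hypothetical.

```python
import random
from itertools import product

OPS = ("and", "or", "and-not")


def random_query(depth, n_words, rng):
    """Complete binary query tree: keyword indices at leaves, operators inside."""
    if depth == 0:
        return rng.randrange(n_words)
    return (rng.choice(OPS),
            random_query(depth - 1, n_words, rng),
            random_query(depth - 1, n_words, rng))


def length(q):
    """Query length: number of keywords plus number of operators."""
    if isinstance(q, int):
        return 1
    return 1 + length(q[1]) + length(q[2])


def qbss(q):
    """Eliminate every negated subformula: 'A and-not B' becomes A."""
    if isinstance(q, int):
        return q
    op, a, b = q
    if op == "and-not":
        return qbss(a)
    return (op, qbss(a), qbss(b))


def evaluate(q, d):
    """Evaluate query q on a 0/1 document vector d."""
    if isinstance(q, int):
        return bool(d[q])
    op, a, b = q
    if op == "and":
        return evaluate(a, d) and evaluate(b, d)
    if op == "or":
        return evaluate(a, d) or evaluate(b, d)
    return evaluate(a, d) and not evaluate(b, d)  # "and-not"


def minimal_monotone(q, d):
    """Brute-force f*(d): is there some d' <= d with f(d') = 1? Exponential."""
    choices = [(0, 1) if bit else (0,) for bit in d]
    return any(evaluate(q, dp) for dp in product(*choices))


rng = random.Random(0)
q = random_query(4, 100, rng)
print(length(q), length(qbss(q)))  # original length is 2**(4+1) - 1

# The extreme example from Section 3: x1 AND (NOT x1).
f = ("and-not", 0, 0)
print(qbss(f), minimal_monotone(f, (1,)))
```

Because `qbss` only drops subtrees, it visits each node once and never lengthens the query, which matches the linear-time claim; the brute-force `minimal_monotone` is usable only for a handful of keywords.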
The query length of min (SIMP) becomes too large if N > 5. The query length of PRN also becomes too large if N > 6. From this result, we conclude that QbSS is suitable for various N.

Figure 2. Query lengths in monotonization methods

Query length depends on the size of the word set. Figure 3 shows the query length of SIMP for various word sets. Here, min W is the series in which the size of the word set is W.

Figure 3. Query lengths in SIMP method

First, we discuss the case in which there are few words in a query (W=10). If the number of literals grows, then the same words appear in a query many times. So, this monotonization has a reducing effect. Next, we discuss the case in which there are many words in a query (W=1000). In this case, the query is rarely monotonized because few words occur more than once in the query. Finally, in the case of W=100, no reduction is expected because the query is frequently monotonized between dependent words. The PRN method behaves the same as the minimum monotonization method. In contrast, QbSS reduces the query length stably over a wide range of W.

Next, we compare the processing times of the 3 monotonization methods (see Figure 4). The processing time of QbSS is O(n) because QbSS is computed by traversing the query tree. The processing time of SIMP is $O(2^n)$. The PRN method is slower than the SIMP method if N < 6. However, the growth of the PRN method is gentler than that of the SIMP method. The PRN method is $O(2^n)$ in the worst case because it uses the SIMP method internally.

Figure 4. Processing times of monotonization methods

Next, we compare the site selection effects of the three monotonization methods. Figure 5 shows the relationship between the size of the word set (#words) and the number of sites (#sites) selected using queries converted by the 3 methods. Here, the number of sites is 100, the number of documents in a site is 10, and the number of words in a document is 10. The number of selected sites depends on the size of the word set. QbSS cannot narrow down the sites if the size of the word set is small. In such a case, the same word is used many times in a query or a document, so there are few words in a query monotonized by QbSS. Such a query matches almost all sites. The number of sites selected by QbSS is larger than with the other methods, and the difference is 20-30%.

Figure 5. #words vs #sites in the case of N=6
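The site selection experiment can be approximated with a small simulation. The following Python sketch assumes uniformly random words, documents modeled as word sets, and queries encoded as nested tuples; the function names and the random model are illustrative assumptions, so the counts it produces are not the figures reported in this paper.

```python
import random


def evaluate(q, words):
    """Evaluate query q against a set of words (a document or a site)."""
    if isinstance(q, str):
        return q in words
    op, a, b = q
    if op == "and":
        return evaluate(a, words) and evaluate(b, words)
    if op == "or":
        return evaluate(a, words) or evaluate(b, words)
    return evaluate(a, words) and not evaluate(b, words)  # "and-not"


def qbss(q):
    """Eliminate every negated subformula: 'A and-not B' becomes A."""
    if isinstance(q, str):
        return q
    op, a, b = q
    return qbss(a) if op == "and-not" else (op, qbss(a), qbss(b))


def random_query(depth, vocab, rng):
    """Complete binary query tree of the given depth."""
    if depth == 0:
        return rng.choice(vocab)
    op = rng.choice(["and", "or", "and-not"])
    return (op, random_query(depth - 1, vocab, rng),
            random_query(depth - 1, vocab, rng))


def simulate(n_words, n_sites=100, docs_per_site=10, words_per_doc=10,
             depth=6, seed=0):
    """Return (sites selected by QbSS, sites that really hold a match)."""
    rng = random.Random(seed)
    vocab = [f"w{i}" for i in range(n_words)]
    # Each document is the set of words drawn for it (repeats collapse).
    sites = [[{rng.choice(vocab) for _ in range(words_per_doc)}
              for _ in range(docs_per_site)]
             for _ in range(n_sites)]
    query = random_query(depth, vocab, rng)
    mono = qbss(query)
    # LS sees only the union of words per site, the analogue of d(L).
    selected = {i for i, docs in enumerate(sites)
                if evaluate(mono, set().union(*docs))}
    relevant = {i for i, docs in enumerate(sites)
                if any(evaluate(query, d) for d in docs)}
    return selected, relevant


for w in (10, 100, 1000):
    selected, relevant = simulate(w)
    print(w, len(selected), len(relevant))
```

Whatever the seed, every site that actually holds a matching document is among the selected sites, since the monotonized query can only over-select; the gap between the two counts is the cost of that over-selection.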
Although QbSS may not select sites as precisely as the other methods, we conclude that QbSS is efficient because it runs in O(n) time. Furthermore, the effect of QbSS is the same as that of the other methods if the size of the word set is large.

Now we consider scaling Figure 5 by a factor of $10^4$. A network on the scale of #words=$10^5$, #sites=$10^6$ is very large; only the Internet is a network of that size. Unfortunately, QbSS is not suitable for such a large-scale network. However, QbSS is useful for a middle-scale network with #words=$10^5$, #sites=$10^5$. Furthermore, QbSS is suitable for small-scale networks such as enterprise intranets because QbSS narrows the targets to 10% of the sites on average.

In the above discussion, we assumed that the distribution of words is uniform. However, the distribution of words is not actually balanced. In such a case, QbSS can reduce the number of selected sites further. In addition, the retrieval time may be reduced by sorting the keywords in a query in the following order: concrete and clauses (e.g. the keyword string is long), abstract and clauses (e.g. the keyword string is short), abstract and-not clauses, and concrete and-not clauses.

5. Related Works

Many researchers have already studied distributed information retrieval, and they have developed systems such as Archie, WAIS, and Whois++. These are not search engines for Web pages. However, Forward Knowledge (FK), which was introduced by Whois++, is a basic idea for distributed information retrieval. Several FK-based distributed Web page retrieval systems, such as Harvest and Ingrid, have been developed.

In Whois++[12], FKs are grouped as a centroid, and each server forwards queries by using FK if it does not know their destinations. This is known as query routing.

The most famous research on distributed information retrieval is probably Harvest[13]. Harvest consists of Gatherers and Brokers. A Gatherer collects documents, summarizes them as SOIF (Summary Object Interchange Format), and transfers them to a Broker.
SOIF is the summary of a document, which consists of the author's name, title, key words, and so on. In practice, a Gatherer needs to send almost the full texts of collected documents to a Broker, because the full text of a document must be included in SOIF for Harvest's full text search. A Broker makes an index internally. A Broker accepts a query and retrieves results by cooperating with other Brokers. In Harvest, both Glimpse and Nebula are employed as search engines, which actually make indexes and search. The index size of Glimpse is very small, and Nebula can search documents very fast. In Harvest, the Gatherer itself can access documents directly. However, because the Gatherer does not make an index, it needs to send the collected contents to a Broker, which makes the index. Therefore, Harvest cannot reduce the update interval as much as CSE.

Ingrid[14] is the information infrastructure developed by NTT, which aims to realize topic-level retrieval. Ingrid links collected resources to each other and makes an original topology. Forward Information (FI) servers manage this topology. The Ingrid navigator communicates with FI servers in order to search for the way to a resource. Ingrid is flexible, but its communication latency is long because the way is searched sequentially. In CSE, only LS searches for the way, so it may become a bottleneck, but its communication latency is short.

6. Conclusions

In this paper, we proposed Query based Site Selection (QbSS) as a site selection method used for new queries. QbSS can narrow the targets to 10% of all sites in a middle-scale network such as an enterprise network. Although a query monotonized by QbSS is not minimum, QbSS is sufficient in practice. The computing time of QbSS, O(n), is shorter than the computing time of minimum monotonization, $O(2^n)$, while the site selection effect of QbSS is the same as that of minimum monotonization when the word set is large. Therefore, QbSS is efficient and increases the scalability of CSE.

Acknowledgement

This research was cooperatively performed as a part of the Mobile Agent based Web Robot project at Toyo University and as a part of Scalable Distributed Search Engine for Fresh Information Retrieval ( ) in a Grant-in-Aid for Scientific Research promoted by the Japan Society for the Promotion of Science (JSPS).
References

[1] Google, Google Information for Webmasters,
[2] Nobuyoshi Sato, Minoru Uehara, Yoshifumi Sakai, Hideki Mori, Distributed Information Retrieval by using Cooperative Meta Search Engines, in Proceedings of The 21st IEEE International Conference on Distributed Computing Systems Workshops (Multimedia Network Systems, MNS2001), pp ,
[3] Nobuyoshi Sato, Minoru Uehara, Yoshifumi Sakai, Hideki Mori, Fresh Information Retrieval using Cooperative Meta Search Engines, in Proceedings of the 16th International Conference on Information Networking (ICOIN-16), Vol.ii, pp.7a-2-1 7,
[4] The Namazu Project, Namazu,
[5] Nobuyoshi Sato, Takashi Yamamoto, Yoshihiro Nishida, Minoru Uehara, Hideki Mori, Look Ahead Cache for Next 10 in Cooperative Search Engine, in Proceedings of DPSWS 2000, IPSJ Symposium Series, Vol.2000, No.15, pp , 2000 (in Japanese).
[6] Nobuyoshi Sato, Minoru Uehara, Yoshifumi Sakai, Hideki Mori, Fresh Information Retrieval in Cooperative Search Engine, in Proceedings of 2nd International Conference on Software Engineering, Artificial Intelligence, Networking & Parallel / Distributed Computing 2001 (SNPD 01), pp , Nagoya Japan,
[7] Nobuyoshi Sato, Minoru Uehara, Yoshifumi Sakai, Hideki Mori, Score Based Site Selection in Cooperative Search Engine, in Proceedings of DICOMO 2001, IPSJ Symposium Series, Vol.2001, No.7, pp , 2001 (in Japanese).
[8] Nobuyoshi Sato, Minoru Uehara, Yoshifumi Sakai, Hideki Mori, Global Shared Cache in Cooperative Search Engine, in Proceedings of DPSWS 2001, IPSJ Symposium Series, Vol.2001, No.13, pp , 2001 (in Japanese).
[9] Nobuyoshi Sato, Minoru Uehara, Yoshifumi Sakai, Hideki Mori, Persistent Cache in Cooperative Search Engine, in Proceedings of the 22nd IEEE International Conference on Distributed Computing Systems Workshops (Multimedia Network Systems and Applications, MNSA 2002), pp ,
[10] Yoshifumi Sakai, Nobuyoshi Sato, Minoru Uehara, Hideki Mori, The Optimal Monotonization for Search Queries in Cooperative Search Engine, in Proceedings of DICOMO2001, IPSJ Symposium Series, Vol.2001, No.7, pp , 2001 (in Japanese).
[11] N. H. Bshouty, Exact learning Boolean functions via the monotone theory, Information and Computation, Vol.123, pp ,
[12] C. Weider, J. Fullton, S. Spero: Architecture of the Whois++ Index Service, RFC1913,
[13] C. Mic Bowman, Peter B. Danzig, Darren R. Hardy, Udi Manber, Michael F. Schwartz: The Harvest Information Discovery and Access System, in Proceedings of the 2nd WWW Conference, earching/schwartz.harvest/schwartz.harvest.html,
[14] Nippon Telegraph and Telephone Corp. Ingrid,


/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/27/18 601.433/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/27/18 22.1 Introduction We spent the last two lectures proving that for certain problems, we can

More information

INTRODUCTION. Chapter GENERAL

INTRODUCTION. Chapter GENERAL Chapter 1 INTRODUCTION 1.1 GENERAL The World Wide Web (WWW) [1] is a system of interlinked hypertext documents accessed via the Internet. It is an interactive world of shared information through which

More information

Why Search Personalization?

Why Search Personalization? INFSCI 2480 Adaptive Information Systems Personalized Web Search Peter Brusilovsky http://www.sis.pitt.edu/~peterb/2480-012 Why Search Personalization? R. Larsen: With the growth of DL even a good query

More information

Discrete Optimization. Lecture Notes 2

Discrete Optimization. Lecture Notes 2 Discrete Optimization. Lecture Notes 2 Disjunctive Constraints Defining variables and formulating linear constraints can be straightforward or more sophisticated, depending on the problem structure. The

More information

INFSCI 2480 Adaptive Information Systems Adaptive [Web] Search. Peter Brusilovsky.

INFSCI 2480 Adaptive Information Systems Adaptive [Web] Search. Peter Brusilovsky. INFSCI 2480 Adaptive Information Systems Adaptive [Web] Search Peter Brusilovsky http://www.sis.pitt.edu/~peterb/ Where we are? Search Navigation Recommendation Content-based Semantics / Metadata Social

More information

Global IP Network System Large-Scale, Guaranteed, Carrier-Grade

Global IP Network System Large-Scale, Guaranteed, Carrier-Grade Global Network System Large-Scale, Guaranteed, Carrier-Grade 192 Global Network System Large-Scale, Guaranteed, Carrier-Grade Takanori Miyamoto Shiro Tanabe Osamu Takada Shinobu Gohara OVERVIEW: traffic

More information

Two-Dimensional Visualization for Internet Resource Discovery. Shih-Hao Li and Peter B. Danzig. University of Southern California

Two-Dimensional Visualization for Internet Resource Discovery. Shih-Hao Li and Peter B. Danzig. University of Southern California Two-Dimensional Visualization for Internet Resource Discovery Shih-Hao Li and Peter B. Danzig Computer Science Department University of Southern California Los Angeles, California 90089-0781 fshli, danzigg@cs.usc.edu

More information

Web site Image database. Web site Video database. Web server. Meta-server Meta-search Agent. Meta-DB. Video query. Text query. Web client.

Web site Image database. Web site Video database. Web server. Meta-server Meta-search Agent. Meta-DB. Video query. Text query. Web client. (Published in WebNet 97: World Conference of the WWW, Internet and Intranet, Toronto, Canada, Octobor, 1997) WebView: A Multimedia Database Resource Integration and Search System over Web Deepak Murthy

More information

Parallel and Distributed Computing

Parallel and Distributed Computing Parallel and Distributed Computing Project Assignment MAX-SAT SOLVER Version 1.0 (07/03/2016) 2015/2016 2nd Semester CONTENTS Contents 1 Introduction 2 2 Problem Description 2 2.1 Illustrative Example...................................

More information

PARALLEL ID3. Jeremy Dominijanni CSE633, Dr. Russ Miller

PARALLEL ID3. Jeremy Dominijanni CSE633, Dr. Russ Miller PARALLEL ID3 Jeremy Dominijanni CSE633, Dr. Russ Miller 1 ID3 and the Sequential Case 2 ID3 Decision tree classifier Works on k-ary categorical data Goal of ID3 is to maximize information gain at each

More information

Kanban Scheduling System

Kanban Scheduling System Kanban Scheduling System Christian Colombo and John Abela Department of Artificial Intelligence, University of Malta Abstract. Nowadays manufacturing plants have adopted a demanddriven production control

More information

Data deduplication for Similar Files

Data deduplication for Similar Files Int'l Conf. Scientific Computing CSC'17 37 Data deduplication for Similar Files Mohamad Zaini Nurshafiqah, Nozomi Miyamoto, Hikari Yoshii, Riichi Kodama, Itaru Koike, Toshiyuki Kinoshita School of Computer

More information

AODV-PA: AODV with Path Accumulation

AODV-PA: AODV with Path Accumulation -PA: with Path Accumulation Sumit Gwalani Elizabeth M. Belding-Royer Department of Computer Science University of California, Santa Barbara fsumitg, ebeldingg@cs.ucsb.edu Charles E. Perkins Communications

More information

* (4.1) A more exact setting will be specified later. The side lengthsj are determined such that

* (4.1) A more exact setting will be specified later. The side lengthsj are determined such that D D Chapter 4 xtensions of the CUB MTOD e present several generalizations of the CUB MTOD In section 41 we analyze the query algorithm GOINGCUB The difference to the CUB MTOD occurs in the case when the

More information

Dr. Amotz Bar-Noy s Compendium of Algorithms Problems. Problems, Hints, and Solutions

Dr. Amotz Bar-Noy s Compendium of Algorithms Problems. Problems, Hints, and Solutions Dr. Amotz Bar-Noy s Compendium of Algorithms Problems Problems, Hints, and Solutions Chapter 1 Searching and Sorting Problems 1 1.1 Array with One Missing 1.1.1 Problem Let A = A[1],..., A[n] be an array

More information

Real-Time Document Image Retrieval for a 10 Million Pages Database with a Memory Efficient and Stability Improved LLAH

Real-Time Document Image Retrieval for a 10 Million Pages Database with a Memory Efficient and Stability Improved LLAH 2011 International Conference on Document Analysis and Recognition Real-Time Document Image Retrieval for a 10 Million Pages Database with a Memory Efficient and Stability Improved LLAH Kazutaka Takeda,

More information

Query Evaluation Strategies

Query Evaluation Strategies Introduction to Search Engine Technology Term-at-a-Time and Document-at-a-Time Evaluation Ronny Lempel Yahoo! Labs (Many of the following slides are courtesy of Aya Soffer and David Carmel, IBM Haifa Research

More information

Chapter 8. NP-complete problems

Chapter 8. NP-complete problems Chapter 8. NP-complete problems Search problems E cient algorithms We have developed algorithms for I I I I I finding shortest paths in graphs, minimum spanning trees in graphs, matchings in bipartite

More information

CMU-Q Lecture 2: Search problems Uninformed search. Teacher: Gianni A. Di Caro

CMU-Q Lecture 2: Search problems Uninformed search. Teacher: Gianni A. Di Caro CMU-Q 15-381 Lecture 2: Search problems Uninformed search Teacher: Gianni A. Di Caro RECAP: ACT RATIONALLY Think like people Think rationally Agent Sensors? Actuators Percepts Actions Environment Act like

More information

A Time-To-Live Based Reservation Algorithm on Fully Decentralized Resource Discovery in Grid Computing

A Time-To-Live Based Reservation Algorithm on Fully Decentralized Resource Discovery in Grid Computing A Time-To-Live Based Reservation Algorithm on Fully Decentralized Resource Discovery in Grid Computing Sanya Tangpongprasit, Takahiro Katagiri, Hiroki Honda, Toshitsugu Yuba Graduate School of Information

More information

Department of Computer Science Admission Test for PhD Program. Part I Time : 30 min Max Marks: 15

Department of Computer Science Admission Test for PhD Program. Part I Time : 30 min Max Marks: 15 Department of Computer Science Admission Test for PhD Program Part I Time : 30 min Max Marks: 15 Each Q carries 1 marks. ¼ mark will be deducted for every wrong answer. Part II of only those candidates

More information

Striped Grid Files: An Alternative for Highdimensional

Striped Grid Files: An Alternative for Highdimensional Striped Grid Files: An Alternative for Highdimensional Indexing Thanet Praneenararat 1, Vorapong Suppakitpaisarn 2, Sunchai Pitakchonlasap 1, and Jaruloj Chongstitvatana 1 Department of Mathematics 1,

More information

GENERATING SUPPLEMENTARY INDEX RECORDS USING MORPHOLOGICAL ANALYSIS FOR HIGH-SPEED PARTIAL MATCHING ABSTRACT

GENERATING SUPPLEMENTARY INDEX RECORDS USING MORPHOLOGICAL ANALYSIS FOR HIGH-SPEED PARTIAL MATCHING ABSTRACT GENERATING SUPPLEMENTARY INDEX RECORDS USING MORPHOLOGICAL ANALYSIS FOR HIGH-SPEED PARTIAL MATCHING Masahiro Oku NTT Affiliated Business Headquarters 20-2 Nishi-shinjuku 3-Chome Shinjuku-ku, Tokyo 163-1419

More information

The Encoding Complexity of Network Coding

The Encoding Complexity of Network Coding The Encoding Complexity of Network Coding Michael Langberg Alexander Sprintson Jehoshua Bruck California Institute of Technology Email: mikel,spalex,bruck @caltech.edu Abstract In the multicast network

More information

International Journal of Current Trends in Engineering & Technology Volume: 02, Issue: 01 (JAN-FAB 2016)

International Journal of Current Trends in Engineering & Technology Volume: 02, Issue: 01 (JAN-FAB 2016) Survey on Ant Colony Optimization Shweta Teckchandani, Prof. Kailash Patidar, Prof. Gajendra Singh Sri Satya Sai Institute of Science & Technology, Sehore Madhya Pradesh, India Abstract Although ant is

More information

OPTIMAL MULTI-CHANNEL ASSIGNMENTS IN VEHICULAR AD-HOC NETWORKS

OPTIMAL MULTI-CHANNEL ASSIGNMENTS IN VEHICULAR AD-HOC NETWORKS Chapter 2 OPTIMAL MULTI-CHANNEL ASSIGNMENTS IN VEHICULAR AD-HOC NETWORKS Hanan Luss and Wai Chen Telcordia Technologies, Piscataway, New Jersey 08854 hluss@telcordia.com, wchen@research.telcordia.com Abstract:

More information

Data gathering using mobile agents for reducing traffic in dense mobile wireless sensor networks

Data gathering using mobile agents for reducing traffic in dense mobile wireless sensor networks Mobile Information Systems 9 (23) 295 34 295 DOI.3233/MIS-364 IOS Press Data gathering using mobile agents for reducing traffic in dense mobile wireless sensor networks Keisuke Goto, Yuya Sasaki, Takahiro

More information

Tree-Based Minimization of TCAM Entries for Packet Classification

Tree-Based Minimization of TCAM Entries for Packet Classification Tree-Based Minimization of TCAM Entries for Packet Classification YanSunandMinSikKim School of Electrical Engineering and Computer Science Washington State University Pullman, Washington 99164-2752, U.S.A.

More information

Early Measurements of a Cluster-based Architecture for P2P Systems

Early Measurements of a Cluster-based Architecture for P2P Systems Early Measurements of a Cluster-based Architecture for P2P Systems Balachander Krishnamurthy, Jia Wang, Yinglian Xie I. INTRODUCTION Peer-to-peer applications such as Napster [4], Freenet [1], and Gnutella

More information

Localization in Graphs. Richardson, TX Azriel Rosenfeld. Center for Automation Research. College Park, MD

Localization in Graphs. Richardson, TX Azriel Rosenfeld. Center for Automation Research. College Park, MD CAR-TR-728 CS-TR-3326 UMIACS-TR-94-92 Samir Khuller Department of Computer Science Institute for Advanced Computer Studies University of Maryland College Park, MD 20742-3255 Localization in Graphs Azriel

More information

Battery Power Management Routing Considering Participation Duration for Mobile Ad Hoc Networks

Battery Power Management Routing Considering Participation Duration for Mobile Ad Hoc Networks Battery Power Management Routing Considering Participation Duration for Mobile Ad Hoc Networks Masaru Yoshimachi and Yoshifumi Manabe movement of the devices. Thus the routing protocols for MANET need

More information

Lamarckian Repair and Darwinian Repair in EMO Algorithms for Multiobjective 0/1 Knapsack Problems

Lamarckian Repair and Darwinian Repair in EMO Algorithms for Multiobjective 0/1 Knapsack Problems Repair and Repair in EMO Algorithms for Multiobjective 0/ Knapsack Problems Shiori Kaige, Kaname Narukawa, and Hisao Ishibuchi Department of Industrial Engineering, Osaka Prefecture University, - Gakuen-cho,

More information

DATA MINING II - 1DL460. Spring 2014"

DATA MINING II - 1DL460. Spring 2014 DATA MINING II - 1DL460 Spring 2014" A second course in data mining http://www.it.uu.se/edu/course/homepage/infoutv2/vt14 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,

More information

2SAT Andreas Klappenecker

2SAT Andreas Klappenecker 2SAT Andreas Klappenecker The Problem Can we make the following boolean formula true? ( x y) ( y z) (z y)! Terminology A boolean variable is a variable that can be assigned the values true (T) or false

More information

FINAL EXAM SOLUTIONS

FINAL EXAM SOLUTIONS COMP/MATH 3804 Design and Analysis of Algorithms I Fall 2015 FINAL EXAM SOLUTIONS Question 1 (12%). Modify Euclid s algorithm as follows. function Newclid(a,b) if a

More information

EECS 219C: Computer-Aided Verification Boolean Satisfiability Solving. Sanjit A. Seshia EECS, UC Berkeley

EECS 219C: Computer-Aided Verification Boolean Satisfiability Solving. Sanjit A. Seshia EECS, UC Berkeley EECS 219C: Computer-Aided Verification Boolean Satisfiability Solving Sanjit A. Seshia EECS, UC Berkeley Project Proposals Due Friday, February 13 on bcourses Will discuss project topics on Monday Instructions

More information

Enhanced Performance of Database by Automated Self-Tuned Systems

Enhanced Performance of Database by Automated Self-Tuned Systems 22 Enhanced Performance of Database by Automated Self-Tuned Systems Ankit Verma Department of Computer Science & Engineering, I.T.M. University, Gurgaon (122017) ankit.verma.aquarius@gmail.com Abstract

More information

ΕΠΛ660. Ανάκτηση µε το µοντέλο διανυσµατικού χώρου

ΕΠΛ660. Ανάκτηση µε το µοντέλο διανυσµατικού χώρου Ανάκτηση µε το µοντέλο διανυσµατικού χώρου Σηµερινό ερώτηµα Typically we want to retrieve the top K docs (in the cosine ranking for the query) not totally order all docs in the corpus can we pick off docs

More information

BEx Front end Performance

BEx Front end Performance BUSINESS INFORMATION WAREHOUSE BEx Front end Performance Performance Analyses of BEx Analyzer and Web Application in the Local and Wide Area Networks Environment Document Version 1.1 March 2002 Page 2

More information

Web Data mining-a Research area in Web usage mining

Web Data mining-a Research area in Web usage mining IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661, p- ISSN: 2278-8727Volume 13, Issue 1 (Jul. - Aug. 2013), PP 22-26 Web Data mining-a Research area in Web usage mining 1 V.S.Thiyagarajan,

More information

modern database systems lecture 4 : information retrieval

modern database systems lecture 4 : information retrieval modern database systems lecture 4 : information retrieval Aristides Gionis Michael Mathioudakis spring 2016 in perspective structured data relational data RDBMS MySQL semi-structured data data-graph representation

More information

Efficient World-Wide-Web Information Gathering. Tian Fanjiang Wang Xidong Wang Dingxing

Efficient World-Wide-Web Information Gathering. Tian Fanjiang Wang Xidong Wang Dingxing Efficient World-Wide-Web Information Gathering Tian Fanjiang Wang Xidong Wang Dingxing (Department of Computer Science and Technology, Tsinghua University, Beijing 100084,tfj@www.cs.tsinghua.edu.cn) Abstract

More information

Different Optimal Solutions in Shared Path Graphs

Different Optimal Solutions in Shared Path Graphs Different Optimal Solutions in Shared Path Graphs Kira Goldner Oberlin College Oberlin, OH 44074 (610) 324-3931 ksgoldner@gmail.com ABSTRACT We examine an expansion upon the basic shortest path in graphs

More information

MID TERM MEGA FILE SOLVED BY VU HELPER Which one of the following statement is NOT correct.

MID TERM MEGA FILE SOLVED BY VU HELPER Which one of the following statement is NOT correct. MID TERM MEGA FILE SOLVED BY VU HELPER Which one of the following statement is NOT correct. In linked list the elements are necessarily to be contiguous In linked list the elements may locate at far positions

More information

Chapter 27 Introduction to Information Retrieval and Web Search

Chapter 27 Introduction to Information Retrieval and Web Search Chapter 27 Introduction to Information Retrieval and Web Search Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 27 Outline Information Retrieval (IR) Concepts Retrieval

More information

APPENDIX 3 Tuning Tips for Applications That Use SAS/SHARE Software

APPENDIX 3 Tuning Tips for Applications That Use SAS/SHARE Software 177 APPENDIX 3 Tuning Tips for Applications That Use SAS/SHARE Software Authors 178 Abstract 178 Overview 178 The SAS Data Library Model 179 How Data Flows When You Use SAS Files 179 SAS Data Files 179

More information

A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup

A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup Yan Sun and Min Sik Kim School of Electrical Engineering and Computer Science Washington State University Pullman, Washington

More information

TESTING A COGNITIVE PACKET CONCEPT ON A LAN

TESTING A COGNITIVE PACKET CONCEPT ON A LAN TESTING A COGNITIVE PACKET CONCEPT ON A LAN X. Hu, A. N. Zincir-Heywood, M. I. Heywood Faculty of Computer Science, Dalhousie University {xhu@cs.dal.ca, zincir@cs.dal.ca, mheywood@cs.dal.ca} Abstract The

More information

An Efficient Algorithm to Test Forciblyconnectedness of Graphical Degree Sequences

An Efficient Algorithm to Test Forciblyconnectedness of Graphical Degree Sequences Theory and Applications of Graphs Volume 5 Issue 2 Article 2 July 2018 An Efficient Algorithm to Test Forciblyconnectedness of Graphical Degree Sequences Kai Wang Georgia Southern University, kwang@georgiasouthern.edu

More information

Flexible Cache Cache for afor Database Management Management Systems Systems Radim Bača and David Bednář

Flexible Cache Cache for afor Database Management Management Systems Systems Radim Bača and David Bednář Flexible Cache Cache for afor Database Management Management Systems Systems Radim Bača and David Bednář Department ofradim Computer Bača Science, and Technical David Bednář University of Ostrava Czech

More information

Pattern Mining in Frequent Dynamic Subgraphs

Pattern Mining in Frequent Dynamic Subgraphs Pattern Mining in Frequent Dynamic Subgraphs Karsten M. Borgwardt, Hans-Peter Kriegel, Peter Wackersreuther Institute of Computer Science Ludwig-Maximilians-Universität Munich, Germany kb kriegel wackersr@dbs.ifi.lmu.de

More information

Boolean Functions (Formulas) and Propositional Logic

Boolean Functions (Formulas) and Propositional Logic EECS 219C: Computer-Aided Verification Boolean Satisfiability Solving Part I: Basics Sanjit A. Seshia EECS, UC Berkeley Boolean Functions (Formulas) and Propositional Logic Variables: x 1, x 2, x 3,, x

More information

On Computing Minimum Size Prime Implicants

On Computing Minimum Size Prime Implicants On Computing Minimum Size Prime Implicants João P. Marques Silva Cadence European Laboratories / IST-INESC Lisbon, Portugal jpms@inesc.pt Abstract In this paper we describe a new model and algorithm for

More information

Core Membership Computation for Succinct Representations of Coalitional Games

Core Membership Computation for Succinct Representations of Coalitional Games Core Membership Computation for Succinct Representations of Coalitional Games Xi Alice Gao May 11, 2009 Abstract In this paper, I compare and contrast two formal results on the computational complexity

More information

Introduction to Parallel & Distributed Computing Parallel Graph Algorithms

Introduction to Parallel & Distributed Computing Parallel Graph Algorithms Introduction to Parallel & Distributed Computing Parallel Graph Algorithms Lecture 16, Spring 2014 Instructor: 罗国杰 gluo@pku.edu.cn In This Lecture Parallel formulations of some important and fundamental

More information

Variable Length and Dynamic Addressing for Mobile Ad Hoc Networks

Variable Length and Dynamic Addressing for Mobile Ad Hoc Networks Variable Length and Dynamic Addressing for Mobile Ad Hoc Networks Som Chandra Neema Venkata Nishanth Lolla {sneema,vlolla}@cs.ucr.edu Computer Science Department University of California, Riverside Abstract

More information

Efficient Cluster Based Data Collection Using Mobile Data Collector for Wireless Sensor Network

Efficient Cluster Based Data Collection Using Mobile Data Collector for Wireless Sensor Network ISSN (e): 2250 3005 Volume, 06 Issue, 06 June 2016 International Journal of Computational Engineering Research (IJCER) Efficient Cluster Based Data Collection Using Mobile Data Collector for Wireless Sensor

More information

Cooperative Watchdog in Wireless Ad-Hoc Networks Norihiro SOTA and Hiroaki HIGAKI *

Cooperative Watchdog in Wireless Ad-Hoc Networks Norihiro SOTA and Hiroaki HIGAKI * 2017 2nd International Conference on Computer, Network Security and Communication Engineering (CNSCE 2017) ISBN: 978-1-60595-439-4 Cooperative Watchdog in Wireless Ad-Hoc Networks Norihiro SOTA and Hiroaki

More information

Introduction to Algorithms / Algorithms I Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/18/14

Introduction to Algorithms / Algorithms I Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/18/14 600.363 Introduction to Algorithms / 600.463 Algorithms I Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/18/14 23.1 Introduction We spent last week proving that for certain problems,

More information

Excerpt from "Art of Problem Solving Volume 1: the Basics" 2014 AoPS Inc.

Excerpt from Art of Problem Solving Volume 1: the Basics 2014 AoPS Inc. Chapter 5 Using the Integers In spite of their being a rather restricted class of numbers, the integers have a lot of interesting properties and uses. Math which involves the properties of integers is

More information

Notes: Notes: Primo Ranking Customization

Notes: Notes: Primo Ranking Customization Primo Ranking Customization Hello, and welcome to today s lesson entitled Ranking Customization in Primo. Like most search engines, Primo aims to present results in descending order of relevance, with

More information

Web Usage Mining: A Research Area in Web Mining

Web Usage Mining: A Research Area in Web Mining Web Usage Mining: A Research Area in Web Mining Rajni Pamnani, Pramila Chawan Department of computer technology, VJTI University, Mumbai Abstract Web usage mining is a main research area in Web mining

More information

Experimental Observations of Construction Methods for Double Array Structures Using Linear Functions

Experimental Observations of Construction Methods for Double Array Structures Using Linear Functions Experimental Observations of Construction Methods for Double Array Structures Using Linear Functions Shunsuke Kanda*, Kazuhiro Morita, Masao Fuketa, Jun-Ichi Aoe Department of Information Science and Intelligent

More information

Predictive Indexing for Fast Search

Predictive Indexing for Fast Search Predictive Indexing for Fast Search Sharad Goel, John Langford and Alex Strehl Yahoo! Research, New York Modern Massive Data Sets (MMDS) June 25, 2008 Goel, Langford & Strehl (Yahoo! Research) Predictive

More information

Reductions. Linear Time Reductions. Desiderata. Reduction. Desiderata. Classify problems according to their computational requirements.

Reductions. Linear Time Reductions. Desiderata. Reduction. Desiderata. Classify problems according to their computational requirements. Desiderata Reductions Desiderata. Classify problems according to their computational requirements. Frustrating news. Huge number of fundamental problems have defied classification for decades. Desiderata'.

More information

Azure database performance Azure performance measurements February 2017

Azure database performance Azure performance measurements February 2017 dbwatch report 1-2017 Azure database performance Azure performance measurements February 2017 Marek Jablonski, CTO dbwatch AS Azure database performance Introduction The popular image of cloud services

More information

USING THE WEB EFFICIENTLY: MOBILE CRAWLERS

USING THE WEB EFFICIENTLY: MOBILE CRAWLERS USING THE WEB EFFICIENTLY: MOBILE CRAWLERS Jan Fiedler and Joachim Hammer University of Florida Gainesville, FL 32611-6125 j.fiedler@intershop.de, jhammer@cise.ufl.edu ABSTRACT Search engines have become

More information

SAT-CNF Is N P-complete

SAT-CNF Is N P-complete SAT-CNF Is N P-complete Rod Howell Kansas State University November 9, 2000 The purpose of this paper is to give a detailed presentation of an N P- completeness proof using the definition of N P given

More information

NP-Completeness of 3SAT, 1-IN-3SAT and MAX 2SAT

NP-Completeness of 3SAT, 1-IN-3SAT and MAX 2SAT NP-Completeness of 3SAT, 1-IN-3SAT and MAX 2SAT 3SAT The 3SAT problem is the following. INSTANCE : Given a boolean expression E in conjunctive normal form (CNF) that is the conjunction of clauses, each

More information

P Is Not Equal to NP. ScholarlyCommons. University of Pennsylvania. Jon Freeman University of Pennsylvania. October 1989

P Is Not Equal to NP. ScholarlyCommons. University of Pennsylvania. Jon Freeman University of Pennsylvania. October 1989 University of Pennsylvania ScholarlyCommons Technical Reports (CIS) Department of Computer & Information Science October 1989 P Is Not Equal to NP Jon Freeman University of Pennsylvania Follow this and

More information

Research Article QOS Based Web Service Ranking Using Fuzzy C-means Clusters

Research Article QOS Based Web Service Ranking Using Fuzzy C-means Clusters Research Journal of Applied Sciences, Engineering and Technology 10(9): 1045-1050, 2015 DOI: 10.19026/rjaset.10.1873 ISSN: 2040-7459; e-issn: 2040-7467 2015 Maxwell Scientific Publication Corp. Submitted:

More information

Weighted Suffix Tree Document Model for Web Documents Clustering

Weighted Suffix Tree Document Model for Web Documents Clustering ISBN 978-952-5726-09-1 (Print) Proceedings of the Second International Symposium on Networking and Network Security (ISNNS 10) Jinggangshan, P. R. China, 2-4, April. 2010, pp. 165-169 Weighted Suffix Tree

More information

Implementation of Near Optimal Algorithm for Integrated Cellular and Ad-Hoc Multicast (ICAM)

Implementation of Near Optimal Algorithm for Integrated Cellular and Ad-Hoc Multicast (ICAM) CS230: DISTRIBUTED SYSTEMS Project Report on Implementation of Near Optimal Algorithm for Integrated Cellular and Ad-Hoc Multicast (ICAM) Prof. Nalini Venkatasubramanian Project Champion: Ngoc Do Vimal

More information

Improving the Performance of the Peer to Peer Network by Introducing an Assortment of Methods

Improving the Performance of the Peer to Peer Network by Introducing an Assortment of Methods Journal of Computer Science 7 (1): 32-38, 2011 ISSN 1549-3636 2011 Science Publications Improving the Performance of the Peer to Peer Network by Introducing an Assortment of Methods 1 M. Sadish Sendil

More information

Incorporation of Scalarizing Fitness Functions into Evolutionary Multiobjective Optimization Algorithms

Incorporation of Scalarizing Fitness Functions into Evolutionary Multiobjective Optimization Algorithms H. Ishibuchi, T. Doi, and Y. Nojima, Incorporation of scalarizing fitness functions into evolutionary multiobjective optimization algorithms, Lecture Notes in Computer Science 4193: Parallel Problem Solving

More information

IN recent years, the amount of traffic has rapidly increased

IN recent years, the amount of traffic has rapidly increased , March 15-17, 2017, Hong Kong Content Download Method with Distributed Cache Management Masamitsu Iio, Kouji Hirata, and Miki Yamamoto Abstract This paper proposes a content download method with distributed

More information

A Vector Space Equalization Scheme for a Concept-based Collaborative Information Retrieval System

A Vector Space Equalization Scheme for a Concept-based Collaborative Information Retrieval System A Vector Space Equalization Scheme for a Concept-based Collaborative Information Retrieval System Takashi Yukawa Nagaoka University of Technology 1603-1 Kamitomioka-cho, Nagaoka-shi Niigata, 940-2188 JAPAN

More information