Recommending Better Queries Based on Click-Through Data

Size: px
Start display at page:

Download "Recommending Better Queries Based on Click-Through Data"

Transcription

1 Recommending Better Queries Based on Click-Through Data Georges Dupret and Marcelo Mendoza Departamento de Ciencias de la Computación Universidad de Chile Abstract. We present a method to help a user redefine a query based on past users experience, namely the click-through data as recorded by a search engine. Unlike most previous works, the method we propose attempts to recommend better queries rather than related queries. It is effective at identifying query specialization or sub-topics because it take into account the co-occurrence of documents in individual query sessions. It is also particularly simple to implement. 1 Introduction There is hardly a paper in the Literature over search engines that does not start reminding readers of the spectacular growth of the Internet and the need to develop new methods to help the user navigate through this immense source of heterogeneous information. Besides the practical difficulties inherent to the manipulation of a large number of pages, the difficulty of ranking documents based on generally very short queries and the fact that query needs are often imprecisely formulated, users are also constrained by the query interface, and often have difficulty articulating a query that could lead to satisfactory results. It happens also that they do not exactly know what they are searching for and select query terms in a trial and error process. Better formulation of vague queries are likely to appear in the query logs because experienced users tend to avoid vague formulation and issue queries specific enough to discard irrelevant documents. Different studies sustain this claim: [11] report that an independent survey of 40,000 web users found that after a failed search, 76% of users try rephrasing their query on the same search engine. Besides, the queries can be identical or articulated differently with different terms but they represent the same information needs as found and reported in [, 14]. We propose in this work a simple and novel query recommendation algorithm. While most methods recommend queries related to the original query, this method aims at improving it. 1.1 Related Work The scientific literature follows essentially three research lines in order to recommend related queries: query clustering algorithms, association rules and query expansion.

2 There is recent related work on query clustering using query log files. Baeza- Yates et al. [] build a term-weight vector for each query. Each term is weighted according to the number of document occurrences and the number of selection of the documents in which the term appears. Unfortunately, the method is limited to queries that appears in the log and is biased by the search engine. Wen et al. [15] propose to cluster similar queries to recommend URLs in answer to frequently asked queries. They use four notions of query distance: (1) based on query keywords or phrases; () based on string matching of keywords; (3) based on common clicked URLs; and (4) based on the distance of the clicked documents in some predefined hierarchy. From a study of the log of a popular search engine, Jansen et al. [] conclude that most queries are short (around terms per query) and imprecise, and that the average number of clicked pages per answer is very low (around 3 clicks per query). Thus, notions (1)-(3) are difficult to deal with in practice, because distance matrices between queries generated by them are very sparse. [16] and [7] nevertheless report satisfactory results. Notion (4) needs documents to be classified in a concept taxonomy. Zaïane and Strilets [18] present a query recommendation method based on seven different notions of query similarity. Three of them are mild variations of notion (1) and (3). The remaining use the content and title of the URLs in the result of a query. Their approach is designed for a meta-search engine and thus none of their similarity measures consider user click-through data. Fonseca et al. [8] present a method to discover related queries based on association rules. Here queries represent items in traditional association rules. The query log is viewed as a set of transactions, where each transaction represent a session in which a single user submits a sequence of related queries within a time frame. The method gives good results, however two problems arise. First, it is difficult to determine sessions of successive queries that belong to the same search process. On the other hand, the most interesting related queries, those submitted by different users, cannot be discovered. This is because the support of a rule increases only if its queries appear in the same query session and thus they must be submitted by the same user. Finally, there are several relevant works based on query augmentation or expansion [1, 5, 6] and query reformulation [1]. The basic idea here is to reformulate the query such that it gets closer to the term-weight vector of the documents the user is looking for. In [4], large query logs are used to construct a surrogate for each document consisting of the queries that were a close match to that document. It is found that the queries that match a document are a fair description of its content. They also investigate whether query associations can play a role in query expansion. In this sense, in [13], a somewhat similar approach of summarizing document with queries is proposed: Query association is based on the notion that a query that is highly similar to a document is a good descriptor of that document. The concept of related search has recently taken a new importance with the development of search engines that place it at the center of the ranking algorithm [16] to improve precision (e.g. AskJeeves,

3 Unlike traditional search engines, these new systems try to understand the users question in order to suggest similar questions that other people have asked and for which the system has the correct answers. These answers have been prepared or checked by human editors in most cases. The assumption behind such a system is that many people are interested in the same questions the Frequently Asked Questions (FAQs). Based on these observations, Wen et al. [16] suggest a method for clustering queries in order to identify FAQs and their set of associated relevant documents. According to [17], Teoma ( also uses a click-through log-based ranking. Given a particular query, it utilizes the previous session logs of the same query to return pages that most users visit. To get a statistically significant result, it is only applied on a small set of popular queries. Because the engine is a commercial one, more details on the method are uneasy to obtain. The method we propose is closely related to the work of Wen et al. [10]. These authors observe that common selections in two or more queries identify common sub-topics and they derive a measure of similarity between queries based on user feedback. They also propose a similarity measure proportional to the shared number of clicked (or selected) documents taken individually and observe a surprising ability to cluster semantically related queries that contain different words. It is worth to quote them further, as the argument is the basis of this work: In addition, this measure is also very useful in distinguishing between queries that happen to be worded similarly but stem from different information needs. For example, if one user asked for law and clicked on the articles about legal problems, and another user asked law and clicked the articles about the order of nature, the two cases can be easily distinguished through user clicks. 1. Contribution Most work on the related topics of query recommendations and augmentations are based on some form of clustering, where the features are the terms appearing in the queries and/or the documents. Recommendations or query augmentation terms are then extracted from the clusters. Because they are related to the original query, they might help the user to refine it. By contrast, the simple method we propose here aims at discovering alternate queries that improve the search engine ranking of documents: We order the documents selected during past sessions of a query according to the ranking of other past queries. If the resulting ranking is better than the original one, we recommend the associated query. This is described in Sect.. We apply this method in Sect. 3 on the query logs of TodoCL, an important Chilean search engine. Finally we conclude and describe future work in Sect. 4. Query Recommendation Algorithm Many identical queries can represent different user information needs. Depending on the topic the user has in mind, he will tend to select a particular sub-group

4 of documents. Consequently, the set of selections in a session reflects a sub-topic of the original query. Consider for example the users submitting the query car. They might be interested in renting, buying, repairing or formula one. Whatever their variety of interests, users initially interested in car rental will make similar selections. We might attempt to assess the existing correlations between the documents selected during sessions of a same query, create clusters and identify queries relevant to each cluster, but we prefer a simpler, more direct method where clustering is done at the level of query sessions. We need first to introduce some definitions. We understand by keyword any unbroken string that describes the query or document content. A query is a set of one or more keywords that represent an information need formulated to the search engine. The same query may be submitted several times. Each submission induces a different query session. Following Wen et al. [15], a query session consists of one query and the URLs the user clicked on, while a click is a Web page selection belonging to a query session. We also define a notion of consistency between a query and a document: Definition 1 (Consistency). A document is consistent with a query if it has been selected a significant number of times during the sessions of the query. Consistency ensures that a query and a document bear a natural relation in the opinion of users and discards documents that have been selected by mistake once or a few time. Similarly, we say that a query and a set of documents are consistent if each document in the set is consistent with the query. We now describe the algorithm. Consider the set D(s q ) of documents selected during a session s q of a query q. If we make the assumption that this set represents the information need of the user who generated the session, we might wonder if alternate queries exist in the logs that 1) are consistent with D(s q ) and ) better rank the documents of D(s q ). If these alternate queries exist, they return the same information D(s q ) as the original query, but with a better ranking and are potential query recommendations. We then repeat the procedure for each session of the original query, select the alternative queries that appear in a significant number of sessions and propose them as recommendations to the user interested in q. To apply the above algorithm, we need a criteria to compare the ranking of a set of documents for two different queries. We first define therank of a document in a query: Definition (Rank of a Documents). The rank of document u in query q, denoted r(u,q), is the position of document u in the listing returned by the search engine. We say that a document has a high or large rank if the search engine estimates that its relevance to the query is comparatively low. We extend this definition to sets of documents: Definition 3 (Rank of a Set of Documents). The rank of a set U of documents in a query q is defined as r(u,q) = max u U r(u,q).

5 q 1 U U U U U U q Fig. 1. Comparison of the ranking of two queries. A session of the original query q 1 contains selections of documents U 3 and U 6 appearing at position 3 and 6 respectively. The rank of this set of document is 6 by virtue of Def. 3. By contrast, query q achieves rank 4 for the same set of documents and therefore qualifies as a candidate recommendation. In other words, the documents with the worse ranking determines the rank of the set. If a set of documents achieves a lower rank in a query q a than in a query q b, then we say that q a ranks the documents better than q b. Definition 4 (Ranking Comparison). A query q a ranks better a set U of documents than a query q b if r(u,q a ) < r(u,q b ). This criteria is illustrated in Fig. 1 for a session containing two documents. One might be reluctant to reject an alternative query that orders well most documents of a session and poorly only one or a few of them. The argument against recommending such a query is the following: If the documents with a large rank in the alternate query appear in a significant number of sessions, they are important to the users. Presenting the alternate query as a recommendation would mislead users into following recommendations that conceal part of their information need. We can formalize the methods as follows: Definition 5 (Recommendation). A query q a is a recommendation for a query q b if a significant number of sessions of q a are consistent with q b and are ranked better by q a than by q b. The recommendation algorithm induces a directed graph between queries. The original query is the root of a tree with the recommendations as leaves. Each branch of the tree represents a different specialization or sub-topic of the original query. The depth between a root and its leaves is always one, because we require the recommendations to improve the associated document set ranking. We could follow another policy and add to the graph the recommendations of the recommendations iteratively. We would obtain a deeper graph, but as experiments show, that it is of little value because the sub-topics of a recommendation are generally not sub-topics of the original query. Finally, we observe that nothing prevents two queries from recommending each other:

6 Definition 6 (Quasi-synonyms). Two queries are quasi-synonyms when they recommend each other. We will see in the numerical experiments of the next section that this definition leads to queries that are indeed semantically very close. 3 Evaluation of the Algorithm The algorithm implied by Def. 5 is easy to implement once the log data are organized in a relational database. It is considerably simpler than most of the algorithms referred to in the Introduction. Candidate queries for the set of documents selected during a session are searched, and if a significant proportion of sessions points to a given query, it is recommended. We identify three methods for evaluating query recommendations in the Literature. The first consists in asking a group of volunteers to test the system. Results are then compiled and conclusion drawn in favor of one method or another [13]. The second method consists in modifying an existing search engine and in proposing the query recommendations to regular users. This solution is adopted in [3] where the author had access to the lycos search engine. Finally, in [18], the authors opted to analyze their results in detail to evaluate whether they make sense. This is the methodology we adopt here. We used the logs of the TodoCL search engine for a period of three months ( TodoCL is a search engine that mainly covers the.cl domain (Chilean web pages) and some pages included in the.net domain that are related with Chilean ISP providers. It indexed over 3,100,000 Web pages and has currently over 50,000 requests per day. Over three months the logs gathered 0,563,567 requests, most of them with no selections: Meta search engines issue queries and re-use the answer of TodoCL but do not return information on user selections. A total of 13,540 distinct queries lead to 8,45 registered selections corresponding to 387,47 different URLs. There are 30,3 query sessions. Thus, in average users clicked,8 pages per query session and 4,17 pages per query. Queries are originally in Spanish and have been translated to English. We intend to illustrate that the recommendation algorithm has the ability to identify sub-topics and suggest query refinement. Fig. shows the recommendation graph for the query valparaiso. The number inside nodes refer to the number of sessions registered in the logs for the query. They are reported only for nodes having children, as they help make sense of the numbers on the edges which count the number of users who would have benefited of the query reformulation. This query requires some contextual explanation. Valparaiso is an important touristic and harbor city, with various universities. It is also the home for the Mercurio, the most important Chilean newspaper. Fig. captures all these informations. It also recommends some queries that are typical of any city of some importance like city hall, municipality and so on. The more potentially beneficial recommendations have a higher link number. For example /33 7% of the users would have had access to a better ranking of the documents they

7 harbour of valparaiso valparaiso university electromecanics in valparaiso municipality valparaiso el mercurio 0 11 mercurio valparaiso Fig.. Query valparaiso and associated recommendations. The node number indicate the number of query session. The edge numbers count the sessions improved by the pointed query. A given session can recommend various queries. selected if they had searched for university instead of valparaiso. This also implicitly suggests to the user the query university valparaiso, although we are maybe presuming the user intentions. The query maps illustrated in Fig. 3 shows how a particularly vague query can be disambiguated. The TodoCL engine click-through data bias naturally the recommendations towards Chilean information. A third example concerns the naval battle in Iquique in May 1, 187 between Peru and Chile. The query graph is represented in Fig. 4. The point we want to illustrate here is the ability of the algorithm to extract from the logs and to suggest to users alternative search strategies. Out of the 47 sessions, sessions were better ranked by armada of Chile and sea of Chile. In turn, out of the 5 sessions for armada of Chile, 3 would have been better answered by sea of Chile. Probably the most interesting recommendations are for biography of Arturo Pratt, who was captain of the Esmeralda, a Chilean ship involved in the battle, for the Ancón treaty that was signed afterward between Peru and Chile and for the Government of José Joaquín Prieto, responsible for the war declaration against the Perú-Bolivia Federation. A more complex recommendation graph is presented in Fig. 5. The user who issued the query fiat is recommended essentially to specify the car model he is interested in, if he wants spare parts, or if he is interesting in selling or buying

8 map of chilean cities maps map antofagasta 1 6 map chile 8 3 map argentina Fig. 3. Query maps and associated recommendations a fiat. Note that such a graph also suggests to a user interested in say the history or the profitability of the company to issue a query more specific to his needs. Finally, we already observed that two queries can recommend each other. We show in Table 1 a list of such query pairs found in the logs. We reported also the number of original query sessions and number of sessions enhanced by the recommendation so as to have an appreciation of the statistical significance of the links. We excluded the mutual recommendation pairs with less than links. For example, in the first row, out of the 13 sessions for ads, 3 would have been better satisfied by advert, while 10 of the 0 sessions for advert would have been better satisfied by ads. Both pairs of queries lawyer tetareli tetareli and lawyer tetarelli tetarelli are quasi-synonyms, but although the four queries clearly refer to the same information need (only a spelling mistake distinguish the two pairs), they are no recommendations from one pair to the other. The reason is that TodoCL returns only documents with the spelling corresponding to the query. On the other hand, all the pairs reported in Table 1 are high quality quasi-synonyms, like van and light truck or genealogy and family name illustrates. The proposed method generates a clustering a posteriori where different sessions consisting of sometimes completely different sets of documents end up recommending a same query. This query can then label this group of session. In some sense, the very search engine has been used to cluster the queries. 4 Conclusion We have proposed a method for query recommendation based on user logs that is simple to implement and has low computational cost. Contrary to methods based on clustering where related queries are proposed to the user, the recommendation method we propose are made only if they are expected to improve. Moreover, the algorithm does not rely on the particular terms appearing in the

9 armada of chile 5 3 sea of chile naval battle in iquique biografy arturo prat 3 goverment of jose joaquin prieto ancon treaty Fig. 4. Query naval battle in Iquique and associated recommendations. query and documents, making it robust to alternative formulations of an identical information need. Our experiments show that query graphs induced by our method identify information needs and relate queries without common terms. Among the limitations of the method, one is inherent to query logs: Only queries present in the logs can be recommended and can give rise to recommendations. Problems might also arise if the search engine presents some weakness or if the logs are not large enough: Some variations in the expression of an information need fail to be clustered. Finally, queries change in time and query recommendation systems must reflect the dynamic nature of the data. If the documents appearing in answer to a query are re-ordered to match their popularity in the past sessions of the same query, an interesting synergy appears with our recommendation algorithm: The document ranking matching better past user selections, the association between selected documents and the query improves. At the same time, the recommendation algorithm matches past query sessions to an improved ranking, leading to recommendations of higher quality.

10 Table 1. Quasi-synonym queries recommend each other. Isapres is a generic name for health insurance companies and Tetareli or Tetarelli is supposed to be a family name. query query 3/13 ads advert 10/0 3/105 cars used cars /41 34/84 chat sports /13 4/ chileporno pornochile /14 /1 classified ads advertisement /0 4/1 code of penal proceedings code of penal procedure / 3/10 courses of english english courses /5 /7 dvd musical dvd /5 /5 family name genealogy /16 3/ hotels in santiago hotels santiago /11 3/4 isapres superintendence of isapres 3/8 /10 lawyer tetareli tetareli /10 4/17 lawyer tetarelli tetarelli /0 5/67 mail in chile mail company of chile /3 7/15 penal code code of penal procedure / 8/43 rent houses houses to rent /14 /58 van light truck 3/5 References 1. R. Baeza-Yates. Query usage mining in search engines. Web Mining: Applications and Techniques, Anthony Scime, editor. Idea Group, R. Baeza-Yates, C. Hurtado, and M. Mendoza. Query recommendation using query logs in search engines. In Current Trends in Database Technology - EDBT 004 Workshops, LNCS 368, pages , March 004. Heraklion, Greece. 3. D. Beeferman and A. Berger. Agglomerative clustering of a search engine query log. In Knowledge Discovery and Data Mining, pages , B. Billerbeck, F. Scholer, H. E. Williams, and J. Zobel. Query expansion using associated queries. In CIKM 03: Proceedings of the twelfth international conference on Information and knowledge management, pages, New York, NY, USA, 003. ACM Press. 5. K. Chakrabarti, M. Ortega-Binderberger, S. Mehrotra, and K. Porkaew. Evaluating refined queries in top-k retrieval systems. IEEE Transactions on Knowledge and Data Engineering, 16():56 70, H. Cui, J. Wen, J. Nie, and W. Ma. Query expansion by mining user logs. IEEE Transaction on Knowledge and Data Engineering, 15(4), July/August L. Fitzpatrick and M. Dent. Automatic feedback using past queries: social searching? In SIGIR 7: Proceedings of the 0th annual international ACM SIGIR conference on Research and development in information retrieval, pages , New York, NY, USA, 17. ACM Press. 8. B. M. Fonseca, P. B. Golgher, E. S. De Moura, and N. Ziviani. Using association rules to discovery search engines related queries. In First Latin American Web Congress (LA-WEB 03), November, 003. Santiago, Chile.. M. Jansen, A. Spink, J. Bateman, and T. Saracevic. Real life information retrieval: a study of user queries on the web. ACM SIGIR Forum, 3(1):5-17, 18.

11 10. J.-Y. N. Ji-Rong Wen and H.-J. Zhang. Clustering user queries of a search engine. inclu, NPD. Search and portal site survey. Published by NPD New Media Services., 000. Reported in 1. K. M. Risvik, T. Miko?ajewski, and P. Boros. Query segmentation for web search. WWW003 Budapest, Hungary. ACM, May F. Scholer and H. E. Williams. Query association for effective retrieval. In CIKM 0: Proceedings of the eleventh international conference on Information and knowledge management, pages , New York, NY, USA, 00. ACM Press. 14. C. Silverstein, M. Henzinger, M. Hannes, and M. Moricz. Analysis of a very large alta vista query log. In SIGIR Forum, volume 33(3), pages 6 1, J. Wen, J. Nie, and H. Zhang. Clustering user queries of a search engine. In Proc. at 10th International World Wide Web Conference. W3C, J.-R. WEN, J.-Y. NIE, and H.-J. ZHANG. Query clustering using user logs. ACM Trans. Inf. Syst., 0(1):5 81, G. Xue, H. Zeng, Z. Chen, W. Ma, and C. Lu. Log mining to improve the performance of site search. In WISE Workshops, volume 33, pages 38 45, O. R. Zaïane and A. Strilets. Finding similar queries to satisfy searches based on query traces. In Proceedings of the International Workshop on Efficient Web-Based Information Systems (EWIS), Montpellier, France, September 00.

12 7 auto nissan centra fiat uno 3 7 automobiles fiat 7 6 tire fiat fiat 3 fiat fiat second hand fiat 7 4 fiat bravo 8 spare pieces fiat fiat sale fiat palio Fig. 5. Query Fiat and associated recommendations.

Modeling User Search Behavior

Modeling User Search Behavior Modeling User Search Behavior Ricardo Baeza-Yates, Carlos Hurtado, Marcelo Mendoza and Georges Dupret Center for Web Research Computer Science Department Universidad de Chile {rbaeza,churtado,mmendoza,gdupret}@dcc.uchile.cl

More information

A New Technique to Optimize User s Browsing Session using Data Mining

A New Technique to Optimize User s Browsing Session using Data Mining Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,

More information

INCORPORATING SYNONYMS INTO SNIPPET BASED QUERY RECOMMENDATION SYSTEM

INCORPORATING SYNONYMS INTO SNIPPET BASED QUERY RECOMMENDATION SYSTEM INCORPORATING SYNONYMS INTO SNIPPET BASED QUERY RECOMMENDATION SYSTEM Megha R. Sisode and Ujwala M. Patil Department of Computer Engineering, R. C. Patel Institute of Technology, Shirpur, Maharashtra,

More information

IMAGE RETRIEVAL SYSTEM: BASED ON USER REQUIREMENT AND INFERRING ANALYSIS TROUGH FEEDBACK

IMAGE RETRIEVAL SYSTEM: BASED ON USER REQUIREMENT AND INFERRING ANALYSIS TROUGH FEEDBACK IMAGE RETRIEVAL SYSTEM: BASED ON USER REQUIREMENT AND INFERRING ANALYSIS TROUGH FEEDBACK 1 Mount Steffi Varish.C, 2 Guru Rama SenthilVel Abstract - Image Mining is a recent trended approach enveloped in

More information

Inferring User Search for Feedback Sessions

Inferring User Search for Feedback Sessions Inferring User Search for Feedback Sessions Sharayu Kakade 1, Prof. Ranjana Barde 2 PG Student, Department of Computer Science, MIT Academy of Engineering, Pune, MH, India 1 Assistant Professor, Department

More information

Query Log Analysis for Enhancing Web Search

Query Log Analysis for Enhancing Web Search Query Log Analysis for Enhancing Web Search Salvatore Orlando, University of Venice, Italy Fabrizio Silvestri, ISTI - CNR, Pisa, Italy From tutorials given at IEEE / WIC / ACM WI/IAT'09 and ECIR 09 Query

More information

Query Modifications Patterns During Web Searching

Query Modifications Patterns During Web Searching Bernard J. Jansen The Pennsylvania State University jjansen@ist.psu.edu Query Modifications Patterns During Web Searching Amanda Spink Queensland University of Technology ah.spink@qut.edu.au Bhuva Narayan

More information

A Novel Query Suggestion Method Based On Sequence Similarity and Transition Probability

A Novel Query Suggestion Method Based On Sequence Similarity and Transition Probability A Novel Query Suggestion Method Based On Sequence Similarity and Transition Probability Bo Shu 1, Zhendong Niu 1, Xiaotian Jiang 1, and Ghulam Mustafa 1 1 School of Computer Science, Beijing Institute

More information

How does the Library Searcher behave?

How does the Library Searcher behave? How does the Library Searcher behave? A contrastive study of library against ad-hoc Suzan Verberne, Max Hinne, Maarten van der Heijden, Eduard Hoenkamp, Wessel Kraaij, Theo van der Weide Information Foraging

More information

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, ISSN:

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, ISSN: IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, 20131 Improve Search Engine Relevance with Filter session Addlin Shinney R 1, Saravana Kumar T

More information

Disambiguating Search by Leveraging a Social Context Based on the Stream of User s Activity

Disambiguating Search by Leveraging a Social Context Based on the Stream of User s Activity Disambiguating Search by Leveraging a Social Context Based on the Stream of User s Activity Tomáš Kramár, Michal Barla and Mária Bieliková Faculty of Informatics and Information Technology Slovak University

More information

ResPubliQA 2010

ResPubliQA 2010 SZTAKI @ ResPubliQA 2010 David Mark Nemeskey Computer and Automation Research Institute, Hungarian Academy of Sciences, Budapest, Hungary (SZTAKI) Abstract. This paper summarizes the results of our first

More information

A Novel Approach for Inferring and Analyzing User Search Goals

A Novel Approach for Inferring and Analyzing User Search Goals A Novel Approach for Inferring and Analyzing User Search Goals Y. Sai Krishna 1, N. Swapna Goud 2 1 MTech Student, Department of CSE, Anurag Group of Institutions, India 2 Associate Professor, Department

More information

A Metric for Inferring User Search Goals in Search Engines

A Metric for Inferring User Search Goals in Search Engines International Journal of Engineering and Technical Research (IJETR) A Metric for Inferring User Search Goals in Search Engines M. Monika, N. Rajesh, K.Rameshbabu Abstract For a broad topic, different users

More information

A Review on Identifying the Main Content From Web Pages

A Review on Identifying the Main Content From Web Pages A Review on Identifying the Main Content From Web Pages Madhura R. Kaddu 1, Dr. R. B. Kulkarni 2 1, 2 Department of Computer Scienece and Engineering, Walchand Institute of Technology, Solapur University,

More information

Query Refinement and Search Result Presentation

Query Refinement and Search Result Presentation Query Refinement and Search Result Presentation (Short) Queries & Information Needs A query can be a poor representation of the information need Short queries are often used in search engines due to the

More information

A Model for Interactive Web Information Retrieval

A Model for Interactive Web Information Retrieval A Model for Interactive Web Information Retrieval Orland Hoeber and Xue Dong Yang University of Regina, Regina, SK S4S 0A2, Canada {hoeber, yang}@uregina.ca Abstract. The interaction model supported by

More information

A Website Mining Model Centered on User Queries

A Website Mining Model Centered on User Queries A Website Mining Model Centered on User Queries Ricardo Baeza-Yates 1, 3, 2 and Barbara Poblete 2, 3 1 ICREA, Barcelona, Catalunya, Spain 2 Center for Web Research, CS Dept., University of Chile 3 Web

More information

Automatic New Topic Identification in Search Engine Transaction Log Using Goal Programming

Automatic New Topic Identification in Search Engine Transaction Log Using Goal Programming Proceedings of the 2012 International Conference on Industrial Engineering and Operations Management Istanbul, Turkey, July 3 6, 2012 Automatic New Topic Identification in Search Engine Transaction Log

More information

second_language research_teaching sla vivian_cook language_department idl

second_language research_teaching sla vivian_cook language_department idl Using Implicit Relevance Feedback in a Web Search Assistant Maria Fasli and Udo Kruschwitz Department of Computer Science, University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, United Kingdom fmfasli

More information

Hierarchical Location and Topic Based Query Expansion

Hierarchical Location and Topic Based Query Expansion Hierarchical Location and Topic Based Query Expansion Shu Huang 1 Qiankun Zhao 2 Prasenjit Mitra 1 C. Lee Giles 1 Information Sciences and Technology 1 AOL Research Lab 2 Pennsylvania State University

More information

Enabling Users to Visually Evaluate the Effectiveness of Different Search Queries or Engines

Enabling Users to Visually Evaluate the Effectiveness of Different Search Queries or Engines Appears in WWW 04 Workshop: Measuring Web Effectiveness: The User Perspective, New York, NY, May 18, 2004 Enabling Users to Visually Evaluate the Effectiveness of Different Search Queries or Engines Anselm

More information

Applications of Web Query Mining

Applications of Web Query Mining Applications of Web Query Mining Ricardo Baeza-Yates Center for Web Research, Dept. of Computer Science, Universidad de Chile, Santiago ICREA Research Professor, Technology Department, Universitat Pompeu

More information

An Axiomatic Approach to IR UIUC TREC 2005 Robust Track Experiments

An Axiomatic Approach to IR UIUC TREC 2005 Robust Track Experiments An Axiomatic Approach to IR UIUC TREC 2005 Robust Track Experiments Hui Fang ChengXiang Zhai Department of Computer Science University of Illinois at Urbana-Champaign Abstract In this paper, we report

More information

Design of Query Suggestion System using Search Logs and Query Semantics

Design of Query Suggestion System using Search Logs and Query Semantics Design of Query Suggestion System using Search Logs and Query Semantics Abstract Query suggestion is an assistive technology mechanism commonly used in search engines to enable a user to formulate their

More information

Using Clusters on the Vivisimo Web Search Engine

Using Clusters on the Vivisimo Web Search Engine Using Clusters on the Vivisimo Web Search Engine Sherry Koshman and Amanda Spink School of Information Sciences University of Pittsburgh 135 N. Bellefield Ave., Pittsburgh, PA 15237 skoshman@sis.pitt.edu,

More information

ImgSeek: Capturing User s Intent For Internet Image Search

ImgSeek: Capturing User s Intent For Internet Image Search ImgSeek: Capturing User s Intent For Internet Image Search Abstract - Internet image search engines (e.g. Bing Image Search) frequently lean on adjacent text features. It is difficult for them to illustrate

More information

Deep Web Content Mining

Deep Web Content Mining Deep Web Content Mining Shohreh Ajoudanian, and Mohammad Davarpanah Jazi Abstract The rapid expansion of the web is causing the constant growth of information, leading to several problems such as increased

More information

Finding Neighbor Communities in the Web using Inter-Site Graph

Finding Neighbor Communities in the Web using Inter-Site Graph Finding Neighbor Communities in the Web using Inter-Site Graph Yasuhito Asano 1, Hiroshi Imai 2, Masashi Toyoda 3, and Masaru Kitsuregawa 3 1 Graduate School of Information Sciences, Tohoku University

More information

WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS

WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS 1 WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS BRUCE CROFT NSF Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts,

More information

5 Choosing keywords Initially choosing keywords Frequent and rare keywords Evaluating the competition rates of search

5 Choosing keywords Initially choosing keywords Frequent and rare keywords Evaluating the competition rates of search Seo tutorial Seo tutorial Introduction to seo... 4 1. General seo information... 5 1.1 History of search engines... 5 1.2 Common search engine principles... 6 2. Internal ranking factors... 8 2.1 Web page

More information

Chapter 6. Queries and Interfaces

Chapter 6. Queries and Interfaces Chapter 6 Queries and Interfaces Keyword Queries Simple, natural language queries were designed to enable everyone to search Current search engines do not perform well (in general) with natural language

More information

The Use of Domain Modelling to Improve Performance Over a Query Session

The Use of Domain Modelling to Improve Performance Over a Query Session The Use of Domain Modelling to Improve Performance Over a Query Session Deirdre Lungley Dyaa Albakour Udo Kruschwitz Apr 18, 2011 Table of contents 1 The AutoAdapt Project 2 3 4 5 The AutoAdapt Project

More information

Automatic Query Type Identification Based on Click Through Information

Automatic Query Type Identification Based on Click Through Information Automatic Query Type Identification Based on Click Through Information Yiqun Liu 1,MinZhang 1,LiyunRu 2, and Shaoping Ma 1 1 State Key Lab of Intelligent Tech. & Sys., Tsinghua University, Beijing, China

More information

International Journal of Advanced Research in. Computer Science and Engineering

International Journal of Advanced Research in. Computer Science and Engineering Volume 4, Issue 1, January 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Design and Architecture

More information

E-R Model. Hi! Here in this lecture we are going to discuss about the E-R Model.

E-R Model. Hi! Here in this lecture we are going to discuss about the E-R Model. E-R Model Hi! Here in this lecture we are going to discuss about the E-R Model. What is Entity-Relationship Model? The entity-relationship model is useful because, as we will soon see, it facilitates communication

More information

Context based Re-ranking of Web Documents (CReWD)

Context based Re-ranking of Web Documents (CReWD) Context based Re-ranking of Web Documents (CReWD) Arijit Banerjee, Jagadish Venkatraman Graduate Students, Department of Computer Science, Stanford University arijitb@stanford.edu, jagadish@stanford.edu}

More information

CHAPTER-26 Mining Text Databases

CHAPTER-26 Mining Text Databases CHAPTER-26 Mining Text Databases 26.1 Introduction 26.2 Text Data Analysis and Information Retrieval 26.3 Basle Measures for Text Retrieval 26.4 Keyword-Based and Similarity-Based Retrieval 26.5 Other

More information

International Journal of Research in Computer and Communication Technology, Vol 3, Issue 11, November

International Journal of Research in Computer and Communication Technology, Vol 3, Issue 11, November Classified Average Precision (CAP) To Evaluate The Performance of Inferring User Search Goals 1H.M.Sameera, 2 N.Rajesh Babu 1,2Dept. of CSE, PYDAH College of Engineering, Patavala,Kakinada, E.g.dt,AP,

More information

MEASURING SEMANTIC SIMILARITY BETWEEN WORDS AND IMPROVING WORD SIMILARITY BY AUGUMENTING PMI

MEASURING SEMANTIC SIMILARITY BETWEEN WORDS AND IMPROVING WORD SIMILARITY BY AUGUMENTING PMI MEASURING SEMANTIC SIMILARITY BETWEEN WORDS AND IMPROVING WORD SIMILARITY BY AUGUMENTING PMI 1 KAMATCHI.M, 2 SUNDARAM.N 1 M.E, CSE, MahaBarathi Engineering College Chinnasalem-606201, 2 Assistant Professor,

More information

WEB PAGE RE-RANKING TECHNIQUE IN SEARCH ENGINE

WEB PAGE RE-RANKING TECHNIQUE IN SEARCH ENGINE WEB PAGE RE-RANKING TECHNIQUE IN SEARCH ENGINE Ms.S.Muthukakshmi 1, R. Surya 2, M. Umira Taj 3 Assistant Professor, Department of Information Technology, Sri Krishna College of Technology, Kovaipudur,

More information

Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task. Junjun Wang 2013/4/22

Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task. Junjun Wang 2013/4/22 Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task Junjun Wang 2013/4/22 Outline Introduction Related Word System Overview Subtopic Candidate Mining Subtopic Ranking Results and Discussion

More information

Applying the KISS Principle for the CLEF- IP 2010 Prior Art Candidate Patent Search Task

Applying the KISS Principle for the CLEF- IP 2010 Prior Art Candidate Patent Search Task Applying the KISS Principle for the CLEF- IP 2010 Prior Art Candidate Patent Search Task Walid Magdy, Gareth J.F. Jones Centre for Next Generation Localisation School of Computing Dublin City University,

More information

A Novel Approach for Restructuring Web Search Results by Feedback Sessions Using Fuzzy clustering

A Novel Approach for Restructuring Web Search Results by Feedback Sessions Using Fuzzy clustering A Novel Approach for Restructuring Web Search Results by Feedback Sessions Using Fuzzy clustering R.Dhivya 1, R.Rajavignesh 2 (M.E CSE), Department of CSE, Arasu Engineering College, kumbakonam 1 Asst.

More information

Implementation of Enhanced Web Crawler for Deep-Web Interfaces

Implementation of Enhanced Web Crawler for Deep-Web Interfaces Implementation of Enhanced Web Crawler for Deep-Web Interfaces Yugandhara Patil 1, Sonal Patil 2 1Student, Department of Computer Science & Engineering, G.H.Raisoni Institute of Engineering & Management,

More information

TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES

TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES Mu. Annalakshmi Research Scholar, Department of Computer Science, Alagappa University, Karaikudi. annalakshmi_mu@yahoo.co.in Dr. A.

More information

Theme Identification in RDF Graphs

Theme Identification in RDF Graphs Theme Identification in RDF Graphs Hanane Ouksili PRiSM, Univ. Versailles St Quentin, UMR CNRS 8144, Versailles France hanane.ouksili@prism.uvsq.fr Abstract. An increasing number of RDF datasets is published

More information

A NEW PERFORMANCE EVALUATION TECHNIQUE FOR WEB INFORMATION RETRIEVAL SYSTEMS

A NEW PERFORMANCE EVALUATION TECHNIQUE FOR WEB INFORMATION RETRIEVAL SYSTEMS A NEW PERFORMANCE EVALUATION TECHNIQUE FOR WEB INFORMATION RETRIEVAL SYSTEMS Fidel Cacheda, Francisco Puentes, Victor Carneiro Department of Information and Communications Technologies, University of A

More information

Domain Specific Search Engine for Students

Domain Specific Search Engine for Students Domain Specific Search Engine for Students Domain Specific Search Engine for Students Wai Yuen Tang The Department of Computer Science City University of Hong Kong, Hong Kong wytang@cs.cityu.edu.hk Lam

More information

Concept-Based Interactive Query Expansion

Concept-Based Interactive Query Expansion Concept-Based Interactive Query Expansion Bruno M. Fonseca 12 maciel@dcc.ufmg.br Paulo Golgher 2 golgher@akwan.com.br Bruno Pôssas 12 bavep@akwan.com.br Berthier Ribeiro-Neto 1 2 berthier@dcc.ufmg.br Nivio

More information

SEMANTIC WEB POWERED PORTAL INFRASTRUCTURE

SEMANTIC WEB POWERED PORTAL INFRASTRUCTURE SEMANTIC WEB POWERED PORTAL INFRASTRUCTURE YING DING 1 Digital Enterprise Research Institute Leopold-Franzens Universität Innsbruck Austria DIETER FENSEL Digital Enterprise Research Institute National

More information

Finding Hubs and authorities using Information scent to improve the Information Retrieval precision

Finding Hubs and authorities using Information scent to improve the Information Retrieval precision Finding Hubs and authorities using Information scent to improve the Information Retrieval precision Suruchi Chawla 1, Dr Punam Bedi 2 1 Department of Computer Science, University of Delhi, Delhi, INDIA

More information

Improving Suffix Tree Clustering Algorithm for Web Documents

Improving Suffix Tree Clustering Algorithm for Web Documents International Conference on Logistics Engineering, Management and Computer Science (LEMCS 2015) Improving Suffix Tree Clustering Algorithm for Web Documents Yan Zhuang Computer Center East China Normal

More information

EFFICIENT INTEGRATION OF SEMANTIC TECHNOLOGIES FOR PROFESSIONAL IMAGE ANNOTATION AND SEARCH

EFFICIENT INTEGRATION OF SEMANTIC TECHNOLOGIES FOR PROFESSIONAL IMAGE ANNOTATION AND SEARCH EFFICIENT INTEGRATION OF SEMANTIC TECHNOLOGIES FOR PROFESSIONAL IMAGE ANNOTATION AND SEARCH Andreas Walter FZI Forschungszentrum Informatik, Haid-und-Neu-Straße 10-14, 76131 Karlsruhe, Germany, awalter@fzi.de

More information

A modified and fast Perceptron learning rule and its use for Tag Recommendations in Social Bookmarking Systems

A modified and fast Perceptron learning rule and its use for Tag Recommendations in Social Bookmarking Systems A modified and fast Perceptron learning rule and its use for Tag Recommendations in Social Bookmarking Systems Anestis Gkanogiannis and Theodore Kalamboukis Department of Informatics Athens University

More information

University of Amsterdam at INEX 2010: Ad hoc and Book Tracks

University of Amsterdam at INEX 2010: Ad hoc and Book Tracks University of Amsterdam at INEX 2010: Ad hoc and Book Tracks Jaap Kamps 1,2 and Marijn Koolen 1 1 Archives and Information Studies, Faculty of Humanities, University of Amsterdam 2 ISLA, Faculty of Science,

More information

An adaptable search system for collection of partially structured documents

An adaptable search system for collection of partially structured documents Samantha Riccadonna An adaptable search system for collection of partially structured documents by Udo Kruschwitz Web Information Retrieval Course A.Y. 2005-2006 Outline Search system overview Few concepts

More information

Designing and Building an Automatic Information Retrieval System for Handling the Arabic Data

Designing and Building an Automatic Information Retrieval System for Handling the Arabic Data American Journal of Applied Sciences (): -, ISSN -99 Science Publications Designing and Building an Automatic Information Retrieval System for Handling the Arabic Data Ibrahiem M.M. El Emary and Ja'far

More information

Finding Relevant Documents using Top Ranking Sentences: An Evaluation of Two Alternative Schemes

Finding Relevant Documents using Top Ranking Sentences: An Evaluation of Two Alternative Schemes Finding Relevant Documents using Top Ranking Sentences: An Evaluation of Two Alternative Schemes Ryen W. White Department of Computing Science University of Glasgow Glasgow. G12 8QQ whiter@dcs.gla.ac.uk

More information

VIDEO SEARCHING AND BROWSING USING VIEWFINDER

VIDEO SEARCHING AND BROWSING USING VIEWFINDER VIDEO SEARCHING AND BROWSING USING VIEWFINDER By Dan E. Albertson Dr. Javed Mostafa John Fieber Ph. D. Student Associate Professor Ph. D. Candidate Information Science Information Science Information Science

More information

Deep Web Crawling and Mining for Building Advanced Search Application

Deep Web Crawling and Mining for Building Advanced Search Application Deep Web Crawling and Mining for Building Advanced Search Application Zhigang Hua, Dan Hou, Yu Liu, Xin Sun, Yanbing Yu {hua, houdan, yuliu, xinsun, yyu}@cc.gatech.edu College of computing, Georgia Tech

More information

INFORMATION RETRIEVAL SYSTEM: CONCEPT AND SCOPE

INFORMATION RETRIEVAL SYSTEM: CONCEPT AND SCOPE 15 : CONCEPT AND SCOPE 15.1 INTRODUCTION Information is communicated or received knowledge concerning a particular fact or circumstance. Retrieval refers to searching through stored information to find

More information

An Approach To Web Content Mining

An Approach To Web Content Mining An Approach To Web Content Mining Nita Patil, Chhaya Das, Shreya Patanakar, Kshitija Pol Department of Computer Engg. Datta Meghe College of Engineering, Airoli, Navi Mumbai Abstract-With the research

More information

Keywords Data alignment, Data annotation, Web database, Search Result Record

Keywords Data alignment, Data annotation, Web database, Search Result Record Volume 5, Issue 8, August 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Annotating Web

More information

Query Sugges*ons. Debapriyo Majumdar Information Retrieval Spring 2015 Indian Statistical Institute Kolkata

Query Sugges*ons. Debapriyo Majumdar Information Retrieval Spring 2015 Indian Statistical Institute Kolkata Query Sugges*ons Debapriyo Majumdar Information Retrieval Spring 2015 Indian Statistical Institute Kolkata Search engines User needs some information search engine tries to bridge this gap ssumption: the

More information

Outline. Possible solutions. The basic problem. How? How? Relevance Feedback, Query Expansion, and Inputs to Ranking Beyond Similarity

Outline. Possible solutions. The basic problem. How? How? Relevance Feedback, Query Expansion, and Inputs to Ranking Beyond Similarity Outline Relevance Feedback, Query Expansion, and Inputs to Ranking Beyond Similarity Lecture 10 CS 410/510 Information Retrieval on the Internet Query reformulation Sources of relevance for feedback Using

More information

Business Rules in the Semantic Web, are there any or are they different?

Business Rules in the Semantic Web, are there any or are they different? Business Rules in the Semantic Web, are there any or are they different? Silvie Spreeuwenberg, Rik Gerrits LibRT, Silodam 364, 1013 AW Amsterdam, Netherlands {silvie@librt.com, Rik@LibRT.com} http://www.librt.com

More information

CSC105, Introduction to Computer Science I. Introduction and Background. search service Web directories search engines Web Directories database

CSC105, Introduction to Computer Science I. Introduction and Background. search service Web directories search engines Web Directories database CSC105, Introduction to Computer Science Lab02: Web Searching and Search Services I. Introduction and Background. The World Wide Web is often likened to a global electronic library of information. Such

More information

Associating Terms with Text Categories

Associating Terms with Text Categories Associating Terms with Text Categories Osmar R. Zaïane Department of Computing Science University of Alberta Edmonton, AB, Canada zaiane@cs.ualberta.ca Maria-Luiza Antonie Department of Computing Science

More information

Obtaining Language Models of Web Collections Using Query-Based Sampling Techniques

Obtaining Language Models of Web Collections Using Query-Based Sampling Techniques -7695-1435-9/2 $17. (c) 22 IEEE 1 Obtaining Language Models of Web Collections Using Query-Based Sampling Techniques Gary A. Monroe James C. French Allison L. Powell Department of Computer Science University

More information

Query suggestion by query search: a new approach to user support in web search

Query suggestion by query search: a new approach to user support in web search Query suggestion by query search: a new approach to user support in web search Shen Jiang Department of Computing Science University of Alberta Edmonton, Alberta, Canada sjiang1@cs.ualberta.ca Sandra Zilles

More information

Mining Association Rules in Temporal Document Collections

Mining Association Rules in Temporal Document Collections Mining Association Rules in Temporal Document Collections Kjetil Nørvåg, Trond Øivind Eriksen, and Kjell-Inge Skogstad Dept. of Computer and Information Science, NTNU 7491 Trondheim, Norway Abstract. In

More information

From Passages into Elements in XML Retrieval

From Passages into Elements in XML Retrieval From Passages into Elements in XML Retrieval Kelly Y. Itakura David R. Cheriton School of Computer Science, University of Waterloo 200 Univ. Ave. W. Waterloo, ON, Canada yitakura@cs.uwaterloo.ca Charles

More information

Improving the Efficiency of Fast Using Semantic Similarity Algorithm

Improving the Efficiency of Fast Using Semantic Similarity Algorithm International Journal of Scientific and Research Publications, Volume 4, Issue 1, January 2014 1 Improving the Efficiency of Fast Using Semantic Similarity Algorithm D.KARTHIKA 1, S. DIVAKAR 2 Final year

More information

A New Measure of the Cluster Hypothesis

A New Measure of the Cluster Hypothesis A New Measure of the Cluster Hypothesis Mark D. Smucker 1 and James Allan 2 1 Department of Management Sciences University of Waterloo 2 Center for Intelligent Information Retrieval Department of Computer

More information

Relevance Feedback and Query Reformulation. Lecture 10 CS 510 Information Retrieval on the Internet Thanks to Susan Price. Outline

Relevance Feedback and Query Reformulation. Lecture 10 CS 510 Information Retrieval on the Internet Thanks to Susan Price. Outline Relevance Feedback and Query Reformulation Lecture 10 CS 510 Information Retrieval on the Internet Thanks to Susan Price IR on the Internet, Spring 2010 1 Outline Query reformulation Sources of relevance

More information

A Granular Approach to Web Search Result Presentation

A Granular Approach to Web Search Result Presentation A Granular Approach to Web Search Result Presentation Ryen W. White 1, Joemon M. Jose 1 & Ian Ruthven 2 1 Department of Computing Science, University of Glasgow, Glasgow. Scotland. 2 Department of Computer

More information

Ovid Technologies, Inc. Databases

Ovid Technologies, Inc. Databases Physical Therapy Workshop. August 10, 2001, 10:00 a.m. 12:30 p.m. Guide No. 1. Search terms: Diabetes Mellitus and Skin. Ovid Technologies, Inc. Databases ACCESS TO THE OVID DATABASES You must first go

More information

MIRACLE at ImageCLEFmed 2008: Evaluating Strategies for Automatic Topic Expansion

MIRACLE at ImageCLEFmed 2008: Evaluating Strategies for Automatic Topic Expansion MIRACLE at ImageCLEFmed 2008: Evaluating Strategies for Automatic Topic Expansion Sara Lana-Serrano 1,3, Julio Villena-Román 2,3, José C. González-Cristóbal 1,3 1 Universidad Politécnica de Madrid 2 Universidad

More information

Research Article Apriori Association Rule Algorithms using VMware Environment

Research Article Apriori Association Rule Algorithms using VMware Environment Research Journal of Applied Sciences, Engineering and Technology 8(2): 16-166, 214 DOI:1.1926/rjaset.8.955 ISSN: 24-7459; e-issn: 24-7467 214 Maxwell Scientific Publication Corp. Submitted: January 2,

More information

Discovery of Actionable Patterns in Databases: The Action Hierarchy Approach

Discovery of Actionable Patterns in Databases: The Action Hierarchy Approach Discovery of Actionable Patterns in Databases: The Action Hierarchy Approach Gediminas Adomavicius Computer Science Department Alexander Tuzhilin Leonard N. Stern School of Business Workinq Paper Series

More information

NOTES ON OBJECT-ORIENTED MODELING AND DESIGN

NOTES ON OBJECT-ORIENTED MODELING AND DESIGN NOTES ON OBJECT-ORIENTED MODELING AND DESIGN Stephen W. Clyde Brigham Young University Provo, UT 86402 Abstract: A review of the Object Modeling Technique (OMT) is presented. OMT is an object-oriented

More information

This session will provide an overview of the research resources and strategies that can be used when conducting business research.

This session will provide an overview of the research resources and strategies that can be used when conducting business research. Welcome! This session will provide an overview of the research resources and strategies that can be used when conducting business research. Many of these research tips will also be applicable to courses

More information

21. Search Models and UIs for IR

21. Search Models and UIs for IR 21. Search Models and UIs for IR INFO 202-10 November 2008 Bob Glushko Plan for Today's Lecture The "Classical" Model of Search and the "Classical" UI for IR Web-based Search Best practices for UIs in

More information

Information Retrieval CSCI

Information Retrieval CSCI Information Retrieval CSCI 4141-6403 My name is Anwar Alhenshiri My email is: anwar@cs.dal.ca I prefer: aalhenshiri@gmail.com The course website is: http://web.cs.dal.ca/~anwar/ir/main.html 5/6/2012 1

More information

Evaluating the Usefulness of Sentiment Information for Focused Crawlers

Evaluating the Usefulness of Sentiment Information for Focused Crawlers Evaluating the Usefulness of Sentiment Information for Focused Crawlers Tianjun Fu 1, Ahmed Abbasi 2, Daniel Zeng 1, Hsinchun Chen 1 University of Arizona 1, University of Wisconsin-Milwaukee 2 futj@email.arizona.edu,

More information

Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005

Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005 Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005 Abstract Deciding on which algorithm to use, in terms of which is the most effective and accurate

More information

Search Scripts Mining from Wisdom of the Crowds

Search Scripts Mining from Wisdom of the Crowds Search Scripts Mining from Wisdom of the Crowds Chieh-Jen Wang and Hsin-Hsi Chen Department of Computer Science and Information Engineering National Taiwan University Taipei 106, Taiwan {d96002, hhchen}@csie.ntu.edu.tw

More information

Enhancing Internet Search Engines to Achieve Concept-based Retrieval

Enhancing Internet Search Engines to Achieve Concept-based Retrieval Enhancing Internet Search Engines to Achieve Concept-based Retrieval Fenghua Lu 1, Thomas Johnsten 2, Vijay Raghavan 1 and Dennis Traylor 3 1 Center for Advanced Computer Studies University of Southwestern

More information

A Frequency Mining-based Algorithm for Re-Ranking Web Search Engine Retrievals

A Frequency Mining-based Algorithm for Re-Ranking Web Search Engine Retrievals A Frequency Mining-based Algorithm for Re-Ranking Web Search Engine Retrievals M. Barouni-Ebrahimi, Ebrahim Bagheri and Ali A. Ghorbani Faculty of Computer Science, University of New Brunswick, Fredericton,

More information

6.001 Notes: Section 8.1

6.001 Notes: Section 8.1 6.001 Notes: Section 8.1 Slide 8.1.1 In this lecture we are going to introduce a new data type, specifically to deal with symbols. This may sound a bit odd, but if you step back, you may realize that everything

More information

International Journal of Scientific & Engineering Research Volume 2, Issue 12, December ISSN Web Search Engine

International Journal of Scientific & Engineering Research Volume 2, Issue 12, December ISSN Web Search Engine International Journal of Scientific & Engineering Research Volume 2, Issue 12, December-2011 1 Web Search Engine G.Hanumantha Rao*, G.NarenderΨ, B.Srinivasa Rao+, M.Srilatha* Abstract This paper explains

More information

Assigning Vocation-Related Information to Person Clusters for Web People Search Results

Assigning Vocation-Related Information to Person Clusters for Web People Search Results Global Congress on Intelligent Systems Assigning Vocation-Related Information to Person Clusters for Web People Search Results Hiroshi Ueda 1) Harumi Murakami 2) Shoji Tatsumi 1) 1) Graduate School of

More information

Quoogle: A Query Expander for Google

Quoogle: A Query Expander for Google Quoogle: A Query Expander for Google Michael Smit Faculty of Computer Science Dalhousie University 6050 University Avenue Halifax, NS B3H 1W5 smit@cs.dal.ca ABSTRACT The query is the fundamental way through

More information

Information Retrieval (Part 1)

Information Retrieval (Part 1) Information Retrieval (Part 1) Fabio Aiolli http://www.math.unipd.it/~aiolli Dipartimento di Matematica Università di Padova Anno Accademico 2008/2009 1 Bibliographic References Copies of slides Selected

More information

THE URBAN COWGIRL PRESENTS KEYWORD RESEARCH

THE URBAN COWGIRL PRESENTS KEYWORD RESEARCH THE URBAN COWGIRL PRESENTS KEYWORD RESEARCH The most valuable keywords you have are the ones you mine from your pay-per-click performance reports. Scaling keywords that have proven to convert to orders

More information

SYSTEMS FOR NON STRUCTURED INFORMATION MANAGEMENT

SYSTEMS FOR NON STRUCTURED INFORMATION MANAGEMENT SYSTEMS FOR NON STRUCTURED INFORMATION MANAGEMENT Prof. Dipartimento di Elettronica e Informazione Politecnico di Milano INFORMATION SEARCH AND RETRIEVAL Inf. retrieval 1 PRESENTATION SCHEMA GOALS AND

More information

Term Frequency Normalisation Tuning for BM25 and DFR Models

Term Frequency Normalisation Tuning for BM25 and DFR Models Term Frequency Normalisation Tuning for BM25 and DFR Models Ben He and Iadh Ounis Department of Computing Science University of Glasgow United Kingdom Abstract. The term frequency normalisation parameter

More information

University of Santiago de Compostela at CLEF-IP09

University of Santiago de Compostela at CLEF-IP09 University of Santiago de Compostela at CLEF-IP9 José Carlos Toucedo, David E. Losada Grupo de Sistemas Inteligentes Dept. Electrónica y Computación Universidad de Santiago de Compostela, Spain {josecarlos.toucedo,david.losada}@usc.es

More information

Processing Structural Constraints

Processing Structural Constraints SYNONYMS None Processing Structural Constraints Andrew Trotman Department of Computer Science University of Otago Dunedin New Zealand DEFINITION When searching unstructured plain-text the user is limited

More information