INFOQUEST- A META SEARCH ENGINE FOR USER FRIENDLY INTELLIGENT INFORMATION RETRIEVAL FROM THE WEB

Sachin Agarwal, Pallavi Agarwal
4th Year students, Indian Institute of Information Technology (IIIT) Allahabad
Deoghat, Jhalwa, Allahabad (U.P.), India
Telephone Number:
{sachinagarwal, pallaviagarwal}@ug.iiita.ac.in

ABSTRACT

This paper describes the method used in the meta search engine InfoQuest to help the user access information on the web conveniently. InfoQuest classifies web documents into predefined classes and presents them to the user according to the user's fields of interest. It tracks the user's web navigation pattern to learn the user's preferences without interrupting the user while browsing. To track the user, InfoQuest uses its own browser, Carnival, at the client end.

Keywords: Information Retrieval, user-tracking, Semi-supervised learning, Search-result personalization

1. INTRODUCTION

The World Wide Web is the most promising source of information. The traditional tool for retrieving documents from such a large reservoir of information is the search engine. Search engines treat the words in the query (excluding stop words) as keywords and present results on the basis of the presence of these keywords in web pages. Meta search engines make use of the capabilities of existing search engines and process their results to present them to the user in a better way. InfoQuest takes the search results from top search engines, classifies them into different groups, and presents them according to the user's preferences among the categories. It uses a TSVM [1] classifier, constructed using the Clustering Based Classification (CBC) algorithm [2], to classify the documents into these predefined groups. InfoQuest uses a novel rule-based method to learn the user's interests while the user browses the web.
Every activity of the user is tracked while browsing the web in order to learn the user's interests, so an intelligent browser, Carnival, was developed to record all user activities. This record is finally used to present the search results according to the user's preferences.

1.1 Motivation

Search engines return results according to the occurrence of the keywords in web pages. Keywords can be ambiguous, so the results form a collection of documents related to different subjects (classes). For example, given Apple as the keyword, the results comprise documents related to Apple (the company), apple (the fruit), Big Apple Circus, and so on. After getting such results, the user has to scan through all the documents to find those of the category he was searching for. This is additional overhead for the user and sometimes leads to frustration while browsing the web. Some additional processing of the results returned by search engines is needed to help the user search the web. Available meta search engines perform clustering on the results returned by the search engines before presenting them to the user. InfoQuest classifies the search results and orders the classes according to the user's choice.

1.2 Organization of the paper

The rest of the paper is organized as follows. Related work in this field is discussed in Section 2. Section 3 describes the InfoQuest architecture, with a brief description of Carnival, the InfoQuest intelligent web browser, and of the InfoQuest server. Section 4 gives a complete description of the working of InfoQuest. User tracking is explained in Section 5. The results are presented and analyzed in Section 6. Concluding remarks, future work and references are given in Sections 7, 8 and 9 respectively.

2. RELATED WORK

MetaCrawler [3] is a meta search engine that searches top search engines in parallel and performs sophisticated pruning on the responses returned. It removes irrelevant, outdated, or unavailable documents. KartOO [4] is a meta search engine with visual display interfaces. KartOO sends the query to a set of search engines, gathers the results, compiles them and represents them in a series of interactive maps through a proprietary algorithm.

Grouper [5] uses the results of HuskySearch (a meta search engine) and partitions them into clusters (using the Suffix Tree Clustering algorithm), that is, groupings of URLs with similar content. By generating high-quality clusters with simple descriptions for novice users, Grouper provided an effective way of organizing search results into collections for ease of browsing. Google Personalized web search [6] gives personalized search results to the user. To get personalized results, the user first has to create a profile; the search engine then sorts the documents, presenting the results that are more relevant to the user first and the less relevant results after them. Google has developed new algorithms that dynamically reorder results by weighting the interests the user enters in the profile. Newt [7] uses a keyword-based filtering algorithm, and the system learns user preferences through relevance feedback and genetic algorithms. InfoQuest differs from these systems: none of the information retrieval systems described above both classifies the documents and tracks the user to learn how the user's interests change over time.

3. INFOQUEST ARCHITECTURE

InfoQuest has its own specialized browser, Carnival, dedicated to tracking the user's web navigation pattern. A brief overview of the browser and the server is given below.

3.1 Carnival- the InfoQuest browser

A normal web browser provides no assistance in personalizing search results. Efficient personalization of the user's query results can be achieved by tracking the user's behavior while browsing web pages. InfoQuest incorporates an intelligent web browser, Carnival, dedicated to this purpose. The browser must be installed on the user's machine in order to use the InfoQuest meta-search facility. Carnival keeps track of user activities, such as which kinds of web pages the user browses and prefers.
On the basis of the user's web browsing pattern, it builds a profile for the user that contains the user's preference information. Carnival sends this preference information to the server (which provides the search results in the prescribed format) along with the query, to get the results ordered according to the user's choice. The preference information includes the classes of web pages and the page characteristics, such as font color and page background, in which the user is interested. The server then returns the results in the user's preference order. Carnival also obtains the class and page characteristics of a web page from the server whenever the user browses it.

3.2 InfoQuest Server

The heart of the InfoQuest intelligent meta-search system is the InfoQuest server. The InfoQuest browser sends queries and the user's preference order to the server, which fetches the search results by meta searching some of the top-rated search engines. The server then merges the results obtained from the search engines and removes URLs that are repeated across them. Next, it classifies the web pages pointed to by the URLs into the predefined categories, ranks the results within each category according to the page characteristics the user prefers, and sends the ranked results (ordered according to the user's preference order) to Carnival running at the client end. To avoid classifying the same documents repeatedly, the server maintains a database that holds the class of each web page and its page characteristics, such as font color, font size, page background color, and the ratio of the number of images to the text size (a measure of the graphics on the page).

4. HOW INFOQUEST WORKS

4.1 Interaction with Search Engines and fetching web pages

On receiving a query along with the user's preference information from the client through Carnival, the InfoQuest server directs the query to the search engines. The result sets obtained from the search engines contain many common URLs, so the result sets are merged and the repeated URLs are removed. The class of a document is determined from the content of the web page hosted at the URL in the search result. Although search engines provide a title, a snippet (optional), and a URL in the results, classification is not based on this information alone, because the snippet often contains incomplete and ambiguous information that may lead to inaccurate classification. Consider, for example, the web result shown below:

    Technorati:Tag:apple... This page shows goodies from the web about apple. To contribute, just make a post to your blog about apple and include the link below. More Info»...

From this information alone it is not easy to detect the class of the document hosted at the URL. Matters become worse when no snippet describing the page content is provided. To handle such situations, the web page hosted at the URL is used to obtain the class of the result. Since the snippet and title may sometimes also prove useful, their text is used along with the text of the web page to make the class distinction. To avoid fetching the same web page repeatedly, the server maintains a database that records the category and the look and feel of each web page.

4.2 Parsing and Document representation

Each web page is parsed to obtain the frequency of occurrence of every word in the document, after ignoring the stop words and stemming the remaining words using Porter's algorithm [8]. The set of these keywords forms the feature set of the document, and the features of all the documents form the feature space in which the documents are represented using the vector space model. A document matrix is then prepared that contains the Term Frequency Inverse Document Frequency (TFIDF) scores. The TFIDF score gives more weight to features that occur often within a document and less weight to features that occur in many documents, as the latter have less discriminating power. The TFIDF score is calculated using the relation:

    TFIDF = TF * IDF

where TF is the term frequency (the frequency of occurrence of a feature in a document) and IDF is the Inverse Document Frequency, defined as

    IDF = log(D/d)

where D is the number of documents and d is the number of documents in which the word occurs at least once. The document matrix is then used to classify the documents.

4.3 Classification of web documents

Classification assigns the predefined categories to the web pages. It is a supervised learning problem, so a classifier has to be trained using a set of documents whose classes are already assigned (a labeled training set). Because labeling documents to build training and testing datasets is a very time-consuming and mechanical job, InfoQuest uses CBC (Clustering Based Classification) [2], a semi-supervised learning method of classification. The Open Directory Project [9] is used to obtain the classes of the web documents.
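The TFIDF weighting of Section 4.2 can be illustrated with a minimal sketch. It is not the InfoQuest implementation: the function name tfidf_matrix, the use of the natural logarithm (the paper does not state the base) and the sample stems are assumptions made for illustration, and the input is assumed to be already stop-word filtered and stemmed.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Build a document matrix of TFIDF scores.

    docs is a list of token lists, assumed to be already stop-word
    filtered and stemmed. Returns (vocab, rows), where rows[i][j] is
    TF * log(D / d) for document i and term vocab[j].
    """
    D = len(docs)  # total number of documents
    # d for each term: number of documents containing it at least once
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))
    vocab = sorted(doc_freq)
    rows = []
    for tokens in docs:
        tf = Counter(tokens)  # raw term frequency within this document
        # Natural log is an assumption; the paper does not give the base.
        rows.append([tf[t] * math.log(D / doc_freq[t]) for t in vocab])
    return vocab, rows


# Illustrative (made-up) stemmed documents for the query "Apple".
docs = [
    ["appl", "comput", "softwar", "appl"],
    ["appl", "fruit", "recip"],
    ["appl", "circu", "show"],
]
vocab, rows = tfidf_matrix(docs)
```

In this toy input the stem "appl" occurs in every document, so its IDF is log(3/3) = 0 and it contributes nothing to any row, while a stem such as "comput", which occurs in only one document, keeps a positive weight. This is the "less discriminating power" effect described above.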
4.4 Result Representation

The classified results, in the user's preference order, are then sent to Carnival at the client end. The web page of ordered results and a pop-up window showing the user's ranked preference classes are presented to the user as the output of the query to InfoQuest.

5. USER TRACKING

The user of a search engine would prefer to get results that match his interests. Normally, search engines order the query results by general criteria such as the frequency with which web pages are accessed: the higher the access frequency, the higher the web page appears in the query results. To learn the user's fields of interest, InfoQuest asks the user to create a profile at the beginning. Every class in the profile has a preference value, which the user may specify when the profile is created. From then on, Carnival, at the user end, watches the user's web browsing patterns to track changes in his interests. InfoQuest follows a set of rules, formulated from the user's behavior while browsing the web, to learn the user's choices. The rules are based on:

1. The time the user spends on a web page.
2. The frequency with which the user visits web pages of a particular class and with other page characteristics.
3. The page characteristics, such as background color, font size, font color and images, that appeal to the user.

The following sections discuss these rules in detail.

5.1 The time user spends on a web page

To track the user, a plug-in in Carnival monitors the user's activities. The user may select a URL on the currently opened web page or may open a URL of his choice. As soon as the selected web page appears in the browser window, the browser starts tracking the time the user gives to that page.
Care is taken that the browser does not count time during which the user sits idle without doing effective work on the page. This is done by keeping track of the upper and lower limits of the user's reading speed for textual content and of the upper and lower limits of the time the user takes to view graphical content. Carnival requests from the server all the information about the web page, such as text size, number of images and web page class. This information is used to estimate the time the user would need to take in all of the page's contents in depth. The upper and lower limits of the user's speed are initialized with values typical of ordinary users. These limits are used to calculate the estimated maximum and minimum time the user can take to read the page. If the user spends much more time on the page than the estimated maximum, it can be safely assumed that he is not reading it: the page has been ignored after being opened and the user is not working on it. If the user spends much less time than the estimated minimum, it can be assumed that he closed the page without reading it thoroughly. When the time taken lies between the estimated lower and upper limits, that time is used to update the limits of the user's reading speed. In this way the reading speed limits keep getting updated and gradually converge towards the user's actual reading speed limits. If the user takes an appropriate amount of time (i.e. within the minimum and maximum limits of the expected time), the characteristics (class and look and feel) of that web page can be assumed to attract the user and match his interests. The preference for such characteristics is increased for the user and stored in Carnival for later ranking of the results of the user's queries.

5.2 Frequency with which a user visits a web page of a particular class and other characteristics

When a user visits a web page and reads it effectively, giving it the expected amount of time, the page is considered relevant to the user. The browser checks whether the page's class is in the profile. If it is not, the profile is updated by giving the class a small preference value. A small value is chosen because the user may not be much interested in a particular class of web documents and may visit them only occasionally. The preference value for a class increases by a substantial amount if the user reads more pages of that class than a certain threshold. In this way, the frequency with which the user visits web pages of a particular class and with other characteristics helps in keeping track of the user's interest in certain classes of web pages.

5.3 The page characteristics such as background color, font size, font color, images that appeal to the user

Page characteristics are an important criterion for determining the look and feel of the web pages that may appeal to the user. These characteristics are specific to a particular page.
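As a rough illustration of the time and frequency rules of Sections 5.1 and 5.2 above, the sketch below checks whether a visit falls inside the estimated reading-time window and, if so, updates the class preference. It is a simplified sketch under stated assumptions, not the rule set used in Carnival. All names (ReaderProfile, expected_window, record_visit) and all numeric values (speed limits, increments, threshold) are hypothetical. Only text is modelled, viewing time for images is ignored, any time outside the window is treated as "much more" or "much less", and the adaptation of the speed limits is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class ReaderProfile:
    # Hypothetical starting limits in words per second; the paper only says
    # the limits are initialized with values typical of ordinary users.
    min_wps: float = 2.0
    max_wps: float = 6.0
    prefs: dict = field(default_factory=dict)  # class name -> preference value
    reads: dict = field(default_factory=dict)  # class name -> effective reads


def expected_window(profile, n_words):
    """Return the estimated (minimum, maximum) seconds to read n_words."""
    return n_words / profile.max_wps, n_words / profile.min_wps


def record_visit(profile, page_class, n_words, seconds,
                 small=0.1, large=1.0, threshold=3):
    """Apply the time rule, then the frequency rule, to one page visit.

    Returns True when the visit counts as an effective read.
    small, large and threshold are illustrative values only.
    """
    shortest, longest = expected_window(profile, n_words)
    if not (shortest <= seconds <= longest):
        return False  # closed too quickly or left idle: visit is ignored
    profile.reads[page_class] = profile.reads.get(page_class, 0) + 1
    if page_class not in profile.prefs:
        # New class: add it to the profile with a small preference value.
        profile.prefs[page_class] = small
    elif profile.reads[page_class] > threshold:
        # Read more pages of this class than the threshold: large increase.
        profile.prefs[page_class] += large
    return True


profile = ReaderProfile()
visits = [200, 10, 1000, 150, 250, 120]  # seconds spent on 600-word pages
outcomes = [record_visit(profile, "Arts/Performing Arts", 600, s) for s in visits]
ranked = sorted(profile.prefs, key=profile.prefs.get, reverse=True)
```

With the assumed limits, a 600-word page has a window of 100 to 300 seconds, so the 10-second and 1000-second visits are discarded. The first effective read adds the class with the small value, and only the fourth effective read, which exceeds the assumed threshold of 3, triggers the larger increase. The ranked list orders classes by preference, as needed when presenting categories to the user.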
Page characteristic preferences are not taken from the user when the profile is first created; as time proceeds, the system learns the user's taste in page characteristics and adds it to the profile. These characteristics can then be used to filter the results in the different classes and rank them in order of the user's preferred page layout and design.

6. RESULTS

To get a rough estimate of the performance of InfoQuest, we conducted a very simple experiment with a number of users of different age groups. The results obtained for a school-going child are included in this paper. A school-going child was chosen because children normally have varied interests and are generally fascinated by the presentation of web pages, which makes them a good test of the system. The user was allowed to use the system for 4 days, and at the end of this period the query Apple was given to the system as a test query. The results are shown in the following snapshots. Figure 1 shows the pop-up window that appears with the search results; it shows the categories along with their rank. Figure 2 shows the search results under the different categories in the user's preference order.

Fig. 1 Classes of the web results for query Apple ranked according to the user's preference order.

Figure 1 shows that the results for the query Apple are categorized into the classes Arts/Performing Arts, Computers/Software and Home/Cooking. On studying the results, we found that the results related to Big Apple Circus were in the Arts/Performing Arts category, the results related to the Apple company were in Computers/Software, and the results about apple as a fruit were in the Home/Cooking category. The categories are arranged in the order of the user's preferences, indicated by the number of stars in front of each category. The user can also click directly on a category to get the results in that category.
On studying the web log of the user maintained in the Carnival browser, we found that the user gave the following queries to the system: Mah Jongg and Pinball in the Computers/Software category; and Learning magic, Lance Burton, magic tricks, magic shops, latest movies + review, Filmfare awards, and dance + disco in the Arts/Performing Arts category. Apart from these queries to InfoQuest, the user visited a large number of web pages that were presented as search results for his queries. From the log we found that, although the user did not give Arts/Performing Arts to the system as a preferred category, he issued queries relating to this category and spent a considerable amount of time on those web pages. From this web navigation pattern the system learned that the user is interested in this category, which is reflected in the results shown in Figure 2.

Fig. 2 A part of the results for query Apple, showing the results of class Arts/Performing Arts on top, as this class is at the top of the user's preferences.

After looking at the pages from the web log, we found that the majority of the time was spent on pages with fascinating colorful images and text. The first and second results in the above snapshot contain only an image. The third result has some text along with the images, which reduces the image/text ratio, and the fourth result has much more text compared to the number of images and thus a very small image/text ratio.

7. CONCLUSION

In this paper we have described InfoQuest, a meta search engine for user friendly intelligent information retrieval from the web. It personalizes the search results by tracking the user while he navigates the web through an intelligent browser, Carnival, at the user end. Carnival studies the user's behavior to learn his web page preferences and uses a novel rule set to capture his interests. The system architecture is modular, so changes and improvements can easily be incorporated in any module. The system has been tested on a number of users. The results for a school-going boy presented in the paper show the effectiveness of the system in personalizing the results using the method of user tracking given in the paper.

8. FUTURE WORK

Every system has scope for further improvements that could enhance its efficiency and capability, and the same is true for InfoQuest. The rule set for user tracking can be further enhanced to provide better personalization of search results.

9. REFERENCES

[1] Joachims, T. Transductive inference for text classification using support vector machines. In Proceedings of 16th International Conference on Machine Learning, 1999, San Francisco: Morgan Kaufmann, (pp )
[2] Hua-Jun Zeng, Xuan-Hui Wang, Zheng Chen, Hongjun Lu and Wei-Ying Ma, CBC: clustering based text classification requiring minimal labeled data, ICDM 2003, Nov. 2003, page(s):
[3] Selberg, E. and Etzioni, O., "The MetaCrawler Architecture for Resource Aggregation on the Web." IEEE Expert, January/February 1997, Volume 12, number 1, pages
[4] KartOO meta search engine.
[5] Zamir, O. and Etzioni, O. "Grouper: a Dynamic Clustering Interface to Web Search Results". WWW-8, 1999
[6] Google personalized search engine.
[7] Beerud Sheth, "A Learning Approach to Personalized Information Filtering," Learning and Common Sense Section T. R , MIT Media Laboratory, ml
[8] M. Porter. An Algorithm for Suffix Stripping. Program Automated Library and Information Systems, 14(3)
[9] Open directory project developed by Netscape.

Letter Pair Similarity Classification and URL Ranking Based on Feedback Approach

Letter Pair Similarity Classification and URL Ranking Based on Feedback Approach Letter Pair Similarity Classification and URL Ranking Based on Feedback Approach P.T.Shijili 1 P.G Student, Department of CSE, Dr.Nallini Institute of Engineering & Technology, Dharapuram, Tamilnadu, India

More information

THE WEB SEARCH ENGINE

THE WEB SEARCH ENGINE International Journal of Computer Science Engineering and Information Technology Research (IJCSEITR) Vol.1, Issue 2 Dec 2011 54-60 TJPRC Pvt. Ltd., THE WEB SEARCH ENGINE Mr.G. HANUMANTHA RAO hanu.abc@gmail.com

More information

A New Technique to Optimize User s Browsing Session using Data Mining

A New Technique to Optimize User s Browsing Session using Data Mining Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,

More information

A Novel Categorized Search Strategy using Distributional Clustering Neenu Joseph. M 1, Sudheep Elayidom 2

A Novel Categorized Search Strategy using Distributional Clustering Neenu Joseph. M 1, Sudheep Elayidom 2 A Novel Categorized Search Strategy using Distributional Clustering Neenu Joseph. M 1, Sudheep Elayidom 2 1 Student, M.E., (Computer science and Engineering) in M.G University, India, 2 Associate Professor

More information

A Novel Approach for Restructuring Web Search Results by Feedback Sessions Using Fuzzy clustering

A Novel Approach for Restructuring Web Search Results by Feedback Sessions Using Fuzzy clustering A Novel Approach for Restructuring Web Search Results by Feedback Sessions Using Fuzzy clustering R.Dhivya 1, R.Rajavignesh 2 (M.E CSE), Department of CSE, Arasu Engineering College, kumbakonam 1 Asst.

More information

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, ISSN:

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, ISSN: IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, 20131 Improve Search Engine Relevance with Filter session Addlin Shinney R 1, Saravana Kumar T

More information

Inferring User Search for Feedback Sessions

Inferring User Search for Feedback Sessions Inferring User Search for Feedback Sessions Sharayu Kakade 1, Prof. Ranjana Barde 2 PG Student, Department of Computer Science, MIT Academy of Engineering, Pune, MH, India 1 Assistant Professor, Department

More information

Feedback Session Based User Search Goal Prediction

Feedback Session Based User Search Goal Prediction Feedback Session Based User Search Goal Prediction Sreerenjini.P.R 1, Prasanth.R.S 2 PG Scholar, Dept of Computer Science, Mohandas College of Engg & technology, Anad, Trivandrum, Kerala, India 1 Assistant

More information

International Journal of Scientific & Engineering Research Volume 2, Issue 12, December ISSN Web Search Engine

International Journal of Scientific & Engineering Research Volume 2, Issue 12, December ISSN Web Search Engine International Journal of Scientific & Engineering Research Volume 2, Issue 12, December-2011 1 Web Search Engine G.Hanumantha Rao*, G.NarenderΨ, B.Srinivasa Rao+, M.Srilatha* Abstract This paper explains

More information

Information Retrieval

Information Retrieval Information Retrieval CSC 375, Fall 2016 An information retrieval system will tend not to be used whenever it is more painful and troublesome for a customer to have information than for him not to have

More information

Chapter 27 Introduction to Information Retrieval and Web Search

Chapter 27 Introduction to Information Retrieval and Web Search Chapter 27 Introduction to Information Retrieval and Web Search Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 27 Outline Information Retrieval (IR) Concepts Retrieval

More information

Domain Specific Search Engine for Students

Domain Specific Search Engine for Students Domain Specific Search Engine for Students Domain Specific Search Engine for Students Wai Yuen Tang The Department of Computer Science City University of Hong Kong, Hong Kong wytang@cs.cityu.edu.hk Lam

More information

Automated Online News Classification with Personalization

Automated Online News Classification with Personalization Automated Online News Classification with Personalization Chee-Hong Chan Aixin Sun Ee-Peng Lim Center for Advanced Information Systems, Nanyang Technological University Nanyang Avenue, Singapore, 639798

More information

In the recent past, the World Wide Web has been witnessing an. explosive growth. All the leading web search engines, namely, Google,

In the recent past, the World Wide Web has been witnessing an. explosive growth. All the leading web search engines, namely, Google, 1 1.1 Introduction In the recent past, the World Wide Web has been witnessing an explosive growth. All the leading web search engines, namely, Google, Yahoo, Askjeeves, etc. are vying with each other to

More information

Learning Ontology-Based User Profiles: A Semantic Approach to Personalized Web Search

Learning Ontology-Based User Profiles: A Semantic Approach to Personalized Web Search 1 / 33 Learning Ontology-Based User Profiles: A Semantic Approach to Personalized Web Search Bernd Wittefeld Supervisor Markus Löckelt 20. July 2012 2 / 33 Teaser - Google Web History http://www.google.com/history

More information

Chapter 6: Information Retrieval and Web Search. An introduction

Chapter 6: Information Retrieval and Web Search. An introduction Chapter 6: Information Retrieval and Web Search An introduction Introduction n Text mining refers to data mining using text documents as data. n Most text mining tasks use Information Retrieval (IR) methods

More information

CS 8803 AIAD Prof Ling Liu. Project Proposal for Automated Classification of Spam Based on Textual Features Gopal Pai

CS 8803 AIAD Prof Ling Liu. Project Proposal for Automated Classification of Spam Based on Textual Features Gopal Pai CS 8803 AIAD Prof Ling Liu Project Proposal for Automated Classification of Spam Based on Textual Features Gopal Pai Under the supervision of Steve Webb Motivations and Objectives Spam, which was until

More information

Introduction to Web Clustering

Introduction to Web Clustering Introduction to Web Clustering D. De Cao R. Basili Corso di Web Mining e Retrieval a.a. 2008-9 June 26, 2009 Outline Introduction to Web Clustering Some Web Clustering engines The KeySRC approach Some

More information

CHAPTER 6 PROPOSED HYBRID MEDICAL IMAGE RETRIEVAL SYSTEM USING SEMANTIC AND VISUAL FEATURES

CHAPTER 6 PROPOSED HYBRID MEDICAL IMAGE RETRIEVAL SYSTEM USING SEMANTIC AND VISUAL FEATURES 188 CHAPTER 6 PROPOSED HYBRID MEDICAL IMAGE RETRIEVAL SYSTEM USING SEMANTIC AND VISUAL FEATURES 6.1 INTRODUCTION Image representation schemes designed for image retrieval systems are categorized into two

More information

Classifying Images with Visual/Textual Cues. By Steven Kappes and Yan Cao

Classifying Images with Visual/Textual Cues. By Steven Kappes and Yan Cao Classifying Images with Visual/Textual Cues By Steven Kappes and Yan Cao Motivation Image search Building large sets of classified images Robotics Background Object recognition is unsolved Deformable shaped

More information

A NEW CLUSTER MERGING ALGORITHM OF SUFFIX TREE CLUSTERING

A NEW CLUSTER MERGING ALGORITHM OF SUFFIX TREE CLUSTERING A NEW CLUSTER MERGING ALGORITHM OF SUFFIX TREE CLUSTERING Jianhua Wang, Ruixu Li Computer Science Department, Yantai University, Yantai, Shandong, China Abstract: Key words: Document clustering methods

More information
