Feature-Guided Automated Collaborative Filtering
Yezdi Lashkari
Abstract

Information filtering systems have traditionally relied on some form of content analysis of documents to represent a profile of user interests. Such content filtering is generally ineffective in domains with diverse media types such as audio, video, and images, because machine analysis of such media is hard. Recently, information filtering systems relying primarily on human evaluations of documents have been built. Such automated collaborative filtering systems work by discovering correlations in evaluations of documents amongst users, and by using these correlations to recommend new documents. However, such systems rely on the implicit assumption that all the features of a document are equally important to a user's evaluation of that document. This assumption breaks down in broad domains (such as all documents in the World Wide Web), where users correlate well only for some features of a document they evaluated similarly. This thesis claims that using a combination of easily extractible features of documents with subjective human evaluations for automated collaborative filtering is a powerful information filtering technique for complex information spaces. To verify this claim I propose building an information filtering system for the World Wide Web that relies primarily on a combination of simple feature extraction and human evaluations of documents to make effective personalized recommendations for documents to users.
1 Introduction

1.1 Motivation

Automated Collaborative Filtering (ACF) [1], also referred to as Social Information Filtering, is a technique for locating items of potential interest to users in almost any domain using human evaluations of items. It relies on a deceptively simple idea: if person A correlates strongly with person B in rating a set of items, then it is possible to predict the rating of a new item for A, given B's rating for that item. Since ACF does not rely on computer analysis of items, it is especially useful for domains where it is either too hard (or too computationally expensive) to analyze items by computer, such as information spaces containing images, movies, audio, text, etc. The World Wide Web (WWW) is such a domain.

The exponential growth of the web has exacerbated the problem of personal information overload faced by most networked computer users. While almost any information a user may wish to find probably exists somewhere on the web, most users find it almost impossible to locate such information on their own, or to keep track of new, related documents. Most current solutions to this problem attempt to create some form of index of web documents, which users may then query. Such solutions possess numerous drawbacks from the viewpoint of an ordinary user. The basic assumption behind all indexing schemes is that users will somehow learn of the existence of the index and can then query it effectively. With the growing number of indices, it is no longer possible, even for expert users, to keep track of all the useful indices. Further, web indices vary widely in quality, indexing mechanisms, and coverage. Hence, simply locating an index isn't enough: a user must know the correct set of keywords to locate relevant documents from that index. (This set may differ from index to index, depending on what information was used to construct the index.) Furthermore, most web indices do not attempt to index non-text web documents.
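The core ACF idea introduced in this section (predicting A's rating of a new item from B's rating when A and B correlate) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Pearson correlation and correlation-weighted deviations from each user's mean, a common formulation that is not necessarily the one this thesis will adopt, and the user ratings and document names are hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation over the items both users rated; 0.0 if undefined."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def predict(target, others, item):
    """Predict target's rating of item from the ratings of correlated users.

    The prediction is the target's mean rating plus the correlation-weighted
    average of how far each other user's rating of item is from that user's mean.
    """
    target_mean = sum(target.values()) / len(target)
    num = den = 0.0
    for other in others:
        if item not in other:
            continue
        weight = pearson(target, other)
        other_mean = sum(other.values()) / len(other)
        num += weight * (other[item] - other_mean)
        den += abs(weight)
    return target_mean if den == 0 else target_mean + num / den

# Hypothetical ratings on a 1-7 scale; A has not yet seen doc4.
ratings_a = {"doc1": 6, "doc2": 2, "doc3": 5}
ratings_b = {"doc1": 7, "doc2": 3, "doc3": 6, "doc4": 7}
print(predict(ratings_a, [ratings_b], "doc4"))
```

Because A and B rank the shared documents identically, B's above-average rating of doc4 pulls the prediction for A above A's own mean.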
Several characteristics of the web suggest ACF as a potential solution to the information overload problem. By leveraging the opinions of millions of users who browse different parts of the web daily, the problem of computer analysis of a continuously growing set of documents with multiple rich media formats is avoided. By using correlations amongst human evaluations, ACF-based recommendations of a document contain an implicit (subjective) evaluation of its quality as perceived by a user; an extremely valuable notion in a domain containing thousands of related documents (in terms of content) of widely varying quality. Another potential benefit of applying ACF to the web is the automatic identification of communities of users with similar interests, who may currently be unaware of each other's existence. A further fascinating possibility is ACF application across different domains: for example, if an ACF system knows that users A and B correlate in their music interests, it may try recommending movies to A based on B's movie evaluations.

To be applied effectively, ACF assumes that the domain of items is sufficiently narrow (for example, only documents about information filtering). When the domain is broad, an item may be given similar ratings by two users because the users correlate in their evaluations of some specific features of the item, rather than for every feature of the item. (That every feature matters equally is the implicit assumption behind correlating at the item level.) In other words, two users may both give a high evaluation to a particular document for completely different reasons: one because it was authored by Marvin Minsky and Seymour Papert, the other because it is about neural networks, Marvin Minsky is one of the authors, and it is published by MIT Press. Such simple feature information is available for most domains, and could be used to make much better recommendations. In the example above, the ACF system could determine that the two users correlate only on documents for which Marvin Minsky is an author.

To apply ACF effectively to the web, I propose using easily extractable features of documents along with human evaluations of documents to allow ACF to be applied relative to a set of features. This will reduce the number
of items the ACF algorithm needs to consider in making a good prediction, as well as yield better recommendations, since only items having features on which a particular user correlates need be considered while calculating a prediction.

This thesis proposes the implementation of a personalized WWW information agent per user. The agent observes its user's browsing patterns and attempts to learn its user's interests in WWW document space. The agent continually attempts to locate new documents similar to documents in which the user has shown interest in the past. The burden of locating interesting or new documents is thus shouldered by the agent, freeing the user to do more important work. The mechanism that I will explore is Feature-Guided Automated Collaborative Filtering (FGACF).

1.2 Related Work

Automated Collaborative Filtering

Information filtering refers to the filtering of a dynamic information stream based on a long-term profile of the user's interests created and maintained by the system. Most personalized filtering systems automatically create and maintain a user interest profile using machine learning techniques [2, 3]. Content filtering refers to the filtering of information based on its content. Keyword-based filtering is an example of content filtering [4, 2]. Collaborative filtering techniques select documents based on correlations between people and on their subjective judgements of documents.

In the Tapestry system [5] users annotate documents by hand, actively decide whose opinions they are interested in, and program arbitrarily complex filters that are then run continuously over a document store. For example, a typical Tapestry filter may be a query of the form: find all articles in the newsgroup comp.unix-wizards with the keywords UNIX and BSD in the subject that John Doe replied to. Tapestry places the burden of identifying users with similar interests and programming the appropriate filters entirely on the user.
Such a solution only works well in a small group of computer literate users.
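To make the burden on the user concrete, the example Tapestry filter quoted above can be written as a hand-coded predicate. This is a minimal sketch only: the `Article` record and its field names are hypothetical, and the code does not reproduce Tapestry's actual query language.

```python
from dataclasses import dataclass, field

@dataclass
class Article:
    """Hypothetical article record; the field names are assumptions."""
    newsgroup: str
    subject: str
    repliers: set = field(default_factory=set)

def tapestry_style_filter(article):
    """True for comp.unix-wizards articles with UNIX and BSD in the subject
    that John Doe replied to (the example filter from the text)."""
    subject_words = set(article.subject.upper().split())
    return (article.newsgroup == "comp.unix-wizards"
            and {"UNIX", "BSD"} <= subject_words
            and "John Doe" in article.repliers)

store = [
    Article("comp.unix-wizards", "BSD vs System V UNIX sockets", {"John Doe"}),
    Article("comp.unix-wizards", "BSD vs System V UNIX sockets", {"Jane Roe"}),
    Article("rec.music.misc", "UNIX BSD mixtape", {"John Doe"}),
]
matches = [a for a in store if tapestry_style_filter(a)]
print(len(matches))
```

Every such filter has to be written and maintained by the user, who must also know in advance that John Doe's replies are worth following; an ACF system tries to infer that relationship from ratings instead.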
An Automated Collaborative Filtering (ACF) system, by contrast, automatically determines correlations amongst users in their evaluations of items, and uses these correlations to recommend interesting items. The GroupLens system [6] uses this approach in the domain of USENET netnews. GroupLens partitions the document space by applying ACF separately within each newsgroup. Users evaluate articles using modified newsreader client programs. GroupLens is still undergoing testing; initial results with a very limited set of users are encouraging. The RINGO system [7] built at the MIT Media Lab uses ACF for making personalized music recommendations. RINGO currently consists of a single central server and does not partition the item space in any way. RINGO has been evaluated on a large user population and currently has a growing population of over 3000 users who can add new items to the database as well as submit reviews of groups and albums.

While systems applying either content or collaborative filtering techniques exist, to date, no system has attempted to effectively combine the two. One system currently being implemented that does combine them is the NewsWeeder system [3] for USENET netnews. However, the emphasis in NewsWeeder is on combining human evaluations with content representations so as to discover new machine representations for text, and on how evaluations can guide the learning of these representations.

WWW Indexing Mechanisms

Most previous attempts to tackle the information overload problem for the WWW have attempted to construct a web index of some sort using a variety of approaches such as individual web-robots [8], collaborative swarms of ants [9], collaboratively-maintained, hierarchical indices [11], generalized resource discovery architectures [12], or meta-indices [13]. A few systems take a more user-centric approach. The Infobot hotlist database collects the contents of various hotlists mailed to it, and periodically forms a list of most popular documents [14].
The SIMON system [15] distributed a package that allowed users to associate keywords with various resources in their WWW browser hotlist and retrieve their documents by simply specifying keywords. In return for using this software, users were requested to send their hotlists (and associated keywords) periodically to a central site, where the data is summarized. The hope is that each user's hotlist can be used to construct local maps of WWW space, which can then all be linked somehow (it is not clear exactly how) into a unified global map of the WWW. The Fish-search robot [16], integrated with Mosaic, allows users to specify keywords to search for in documents reachable from a particular document. A robot is then invoked which searches out from the start document, looking for documents containing certain keywords.

2 System Design

The system consists of two main components: a personalized information agent per user that continually monitors the user's interests, recommends documents of potential interest, and collects and propagates the user's evaluations of documents; and a feature-guided automated collaborative filtering server that collects and collates evaluations from multiple users' agents so as to be able to recommend new documents to these users. User agents and FGACF servers communicate using a simple protocol. Figure 1 shows the various parts of the system as well as the various communication paths. The following sections describe each part in detail.

[Figure 1: A single user's agent interacting with a FGSF server. Components shown: Agent interface; WWW Browser (XMosaic); Automated Feature Guided Collaborative Filtering Server. Functions listed: collect ratings; manage user created category hierarchy; automatically classify documents; recommend new documents; collect feedback. Key to communication paths:
1. Instruct browser to retrieve document. Highlight recommended documents.
2. Retrieve URL of current document (for ratings and analysis).
3. Propagate URLs, ratings, extra features to server.
4. On demand recommendations: recommend docs like this one; how would my user rate this doc?; recommend N docs.
5. Server recommendations (documents, computed ratings, confidence levels).]

2.1 User Agents

User agents help structure one's set of favorite documents. They learn users' interests, collect document evaluations, make document recommendations, and can help locate particular kinds of documents. Figure 2 shows a single user's agent. Note that every agent must interoperate with a WWW browser: we have chosen XMosaic as the browser for our implementation because it is popular and readily available. The agent communicates with the user's browser to retrieve the document pointer of the current document so as to pair an evaluation with that document. In addition, the agent can instruct the browser to go to a specified document, and retrieve the current contents of the user's hotlist.

[Figure 2: Schematic of a single user's agent. Components shown: Agent Interface; WWW Browser (XMosaic); Document Categorization and Hotlist Structuring Module (simple feature extraction); Document Evaluation Interface + Feedback Module; Directed Search Module; Recommendation Selector Module; Document Recommendation Modules 1 to N. Labelled paths: current document; hotlist contents; goto specified document; recommend documents; propagate feedback.]

An agent ideally consists of an extensible collection of document recommendation modules, each of which periodically makes a series of recommendations to a recommendation selector module that decides which documents to propose to the user based on the confidence and past performance of the proposing document recommendation module. (Typically, different document recommendation modules will be good at recommending certain types of documents, and bad at recommending other types. The hotlist structuring facility automatically provides the recommendation selector module with personalized partitions of WWW document space.) The heterogeneous nature of the WWW implies that no single method is going to prove satisfactory in all cases; a modular design allows extendibility as new document discovery methods become available. This thesis will only implement a document recommendation module that combines simple feature analysis (titles, keywords, servers, whether the document is an index, etc.) with user evaluations. Hence the recommendation selector module implemented will be extremely simple.
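The recommendation selector described above chooses among proposals using each module's stated confidence and its past performance. The following is a minimal sketch of one way that could work, not the design this thesis commits to. It assumes a per-module trust score starting at 0.5 and updated by an exponential moving average of user feedback; the class names, module names, URLs, and learning rate are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Proposal:
    module: str        # name of the proposing recommendation module
    url: str           # document being proposed
    confidence: float  # module's own confidence, assumed to lie in [0, 1]

class RecommendationSelector:
    """Ranks proposals by confidence weighted by learned per-module trust."""

    def __init__(self, learning_rate=0.2):
        self.trust = {}  # module name -> trust in [0, 1]; unseen modules get 0.5
        self.learning_rate = learning_rate

    def select(self, proposals, n):
        """Return the n proposals with the highest confidence * trust score."""
        def score(p):
            return p.confidence * self.trust.get(p.module, 0.5)
        return sorted(proposals, key=score, reverse=True)[:n]

    def feedback(self, module, liked):
        """Move a module's trust toward 1 if the user liked its suggestion, else toward 0."""
        old = self.trust.get(module, 0.5)
        target = 1.0 if liked else 0.0
        self.trust[module] = old + self.learning_rate * (target - old)

selector = RecommendationSelector()
proposals = [
    Proposal("acf", "http://example.org/a", 0.6),
    Proposal("keyword", "http://example.org/b", 0.8),
]
print(selector.select(proposals, 1)[0].module)
```

With only one recommendation module implemented, as planned here, the selector reduces to sorting that module's proposals by confidence; the trust table matters once further modules are plugged in.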
2.1.1 Hotlist structuring

Ferret is a Tk interface for hotlist structuring that communicates with XMosaic. Ferret allows a user to structure her browser hotlist hierarchically, and to associate keywords with sets of documents (or specific documents). The user can then retrieve documents either by specifying a set of keywords or by traversing the hierarchy she has created.

Evaluations for Documents

The interface allows the user to enter ratings for the document she is currently reading. The agent interface collects these ratings and propagates them using the server-agent protocol to a FGACF server. Recommended documents may be presented in a distinguished font in the appropriate categories in the user's hotlist structure (either periodically or on demand). Feedback is provided by evaluating a recommended document. User feedback can be used in a variety of ways: with multiple document recommendation modules, the recommendation selector module can adjust its weighting of the suggestions given to it by the various document recommendation modules; analogously, an ACF document recommendation module communicating with multiple ACF servers can adjust the amount of weight it places on each ACF server's recommendations in the future. Feedback may also be propagated back to ACF servers as evaluations so the server can make corrections to its database and parameters.

Agent Document Recommendation Module

The agent's document recommendation module collects its user's ratings and propagates them, along with user-provided information such as special keywords or comments, to a FGACF server. It uses a standard protocol to query FGACF servers for evaluations for certain classes of documents (or documents similar to a given document).
2.2 Feature-Guided Automated Collaborative Filtering Server

A feature-guided automated collaborative filtering (FGACF) server collects evaluations (and any additional information such as keywords, etc.) from user agents. In addition, each FGACF server contains a document processing module that can extract simple features from documents. These features will be used to determine useful partitions of the document space so that the ACF algorithm can be applied effectively. The features I initially plan on using are: document title, keywords (for text documents) as well as user-supplied keywords, document type, the server it originated from, whether it is an index, and author information (if available).

The ACF algorithm can be guided by features of the documents in two ways: either automatic clusters formed by bands of correlations between similar users are analyzed to find commonality between the documents in these clusters, or features of the document are used to partition the space and then the ACF algorithm is applied within the partitions. I suspect both forms of partitioning will be useful: the former to locate important features for certain classes of documents, which may help to reduce computation in the future; the latter to recommend documents "similar" to a given document.

The FGACF server supports two forms of interactions between agents and the server: a subscription-based interaction wherein it periodically sends document recommendations to registered agents; and a demand-based interaction wherein agents can make specific queries to the server. The forms of demand-driven queries supported by a FGACF server (for a particular agent's user) are:

- Recommend new documents similar to a particular document.
- Compute a probable user evaluation for a specific document.
- Recommend the best (in terms of computed probable evaluations) N new documents.
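The second form of feature guidance described in this section (partition the document space by feature, then apply ACF within each partition) can be illustrated with the Minsky example from the introduction. This is a sketch under stated assumptions rather than the FGACF algorithm itself: features are represented as hypothetical (name, value) pairs, Pearson correlation stands in for whatever similarity measure the server ends up using, and all ratings are invented for illustration.

```python
from math import sqrt

def pearson(a, b, docs):
    """Pearson correlation of two users' ratings over the given documents."""
    docs = list(docs)
    if len(docs) < 2:
        return 0.0
    mean_a = sum(a[d] for d in docs) / len(docs)
    mean_b = sum(b[d] for d in docs) / len(docs)
    cov = sum((a[d] - mean_a) * (b[d] - mean_b) for d in docs)
    var_a = sum((a[d] - mean_a) ** 2 for d in docs)
    var_b = sum((b[d] - mean_b) ** 2 for d in docs)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def feature_correlations(ratings_a, ratings_b, features):
    """Correlate two users separately within each feature-defined partition.

    A document with several features belongs to several partitions.
    Partitions with fewer than two co-rated documents are skipped.
    """
    partitions = {}
    for doc in sorted(set(ratings_a) & set(ratings_b)):
        for feat in features.get(doc, ()):
            partitions.setdefault(feat, []).append(doc)
    return {feat: pearson(ratings_a, ratings_b, docs)
            for feat, docs in partitions.items() if len(docs) >= 2}

# Hypothetical extracted features and ratings (1-7 scale).
features = {
    "d1": {("author", "Minsky"), ("topic", "neural networks")},
    "d2": {("author", "Minsky")},
    "d3": {("author", "Minsky")},
    "d4": {("topic", "neural networks")},
    "d5": {("topic", "neural networks")},
}
user_a = {"d1": 7, "d2": 3, "d3": 6, "d4": 2, "d5": 6}
user_b = {"d1": 7, "d2": 2, "d3": 5, "d4": 6, "d5": 2}

overall = pearson(user_a, user_b, set(user_a) & set(user_b))
per_feature = feature_correlations(user_a, user_b, features)
print(overall, per_feature)
```

In this invented data the two users correlate only weakly over all five documents, strongly on documents authored by Minsky, and slightly negatively on neural-network documents, so a server using the per-feature correlations would rely on B's ratings for A only within the Minsky partition.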
This provides individual agents the ability to control the flow of new documents coming in from FGACF servers, as well as the ability to make directed queries. Note that the FGACF server learns of a new document only if some user sees it and evaluates it; the growth of the server's database is thus continuous.

Evaluation Criteria

There is no real way to evaluate the qualitative advantage of a user with a personalized agent over one without. As a rough measure of quantitative advantage, user feedback to recommendations will take the form of actually rating recommended articles. These ratings will be compared against the calculated ratings to generate a measure of effectiveness. A "pure" ACF algorithm (no feature guidance) will be treated as the base case to compare the FGACF algorithm against.

3 Timetable

The research will be carried out in two phases. In Phase I, I will implement the agent interface that allows users to evaluate documents and receive recommendations for documents. A simple ACF and feature-extraction module will be implemented. I will concentrate on setting up one ACF server that does not use any content information. This will serve as a testbed for the protocol as well as the agent module. Further, the performance of this method will serve as a benchmark for evaluation of FGACF later. In Phase II, I will implement the FGACF server using the experience gained with the results of Phase I, and evaluate it against the results from Phase I. This stage will also consist of rigorous testing with users and the deployment of multiple FGACF servers. The table below presents the various milestones along with the expected dates of completion.

Phase      Milestone                                                Expected completion
Phase I    Implement Hotlist Facility                               Oct 30
           Implement Agent Interface                                Nov 20
           Implement simple ACF document recommendation module      Nov 30
           Implement and deploy pure ACF server                     Dec 15
           Begin initial user testing
Phase II   Implement and deploy FGACF server                        March 10
           Improve agent modules and FGACF server                   April 2
           Begin user testing with FGACF server
           Deploy multiple FGACF servers                            April 15
           Correlate results from users                             May 2
           Preparation of Report                                    May 7

4 Contributions

I expect this thesis work to result in the following research contributions:

- Since ACF is domain independent, and simple item features can be extracted in almost any domain, the FGACF techniques developed here should provide a general framework for applying FGACF to any domain.
- Develop an extensible personalized agent architecture for WWW users. As new personalized document recommendation modules are built, it should be easy to simply "plug" them into a user's agent.

5 Deliverables

At the beginning of April I hope to have a personalized agent system that can hook into a WWW browser. In addition I hope to have developed and deployed a feature-guided automated collaborative filtering server, and designed an architecture for distributed coordination amongst multiple such servers.
6 Equipment and Resources Required

The browser used for the implementation will be XMosaic for the UNIX platforms. The agent interface will be implemented in C++ and Tk on a Silicon Graphics UNIX workstation at the Media Lab. The various modules will be implemented either in C or Perl. Modules will communicate via shared files. The FGACF server will be built in C (for speed and portability). I hope to use the generic ACF server being built by Max Metral [10] at the Media Lab as a starting point.

References

[1] Feynman, C., Nearest neighbor and maximum likelihood methods for social information filtering, Internal Document, MIT Media Lab, Fall
[2] Sheth, B. D., A Learning Approach to Personalized Information Filtering, SM Thesis, Department of EECS, MIT, Feb
[3] Lang, K., NewsWeeder: An Adaptive Multi-User Text Filter, Research Summary, Aug
[4] Salton, G., and McGill, M. J., Introduction to Modern Information Retrieval, McGraw-Hill,
[5] Goldberg, D., Nichols, D., Oki, B., and Terry, D., Using Collaborative Filtering to Weave an Information Tapestry, CACM, 35 (12), Dec 1992, pp
[6] Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., and Riedl, J., GroupLens: An Open Architecture for Collaborative Filtering of Netnews, in Proc CSCW-94.
[7] Shardanand, U., Social Information Filtering for Music Recommendation, SM Thesis, Dept of EECS, MIT, Sept
[8] McBryan, O., GENVL and WWWW: Tools for Taming the Web, in Proc of the First Int'l World Wide Web Conference, CERN, Geneva, May
[9] Mauldin, M. L., and Leavitt, J. R., Web Agent Related Research at the Center for Machine Translation, in Proc SIGNIDR-94, Aug 1994, McLean, Virginia.
[10] Metral, M. E., SM Thesis Proposal, MIT Media Laboratory, Oct
[11] YAHOO - A guide to WWW, available online at
[12] Bowman, C. M., Danzig, P. B., Hardy, D. R., Manber, U., and Schwartz, M. F., The Harvest Information Discovery and Access System, in Proc of the Second Int'l World Wide Web Conference, Chicago, IL, Oct
[13] CUI W3 Catalog, available online at
[14] Mueller, P., Infobot Hotlist Database, available online at ftp://ftp.netcom.com/pub/ksedgwic/hotlist/hotlist.html
[15] Johnson, M., SIMON - System of Internet Mapping for Organized Navigation, available online at
[16] DeBra, P., Houben, G-J., and Kornatzky, Y., Navigational Search in the World-Wide Web, available online at
More informationExplaining Recommendations: Satisfaction vs. Promotion
Explaining Recommendations: Satisfaction vs. Promotion Mustafa Bilgic Computer Science Dept. University of Maryland at College Park College Park, MD 20742 mbilgic@cs.umd.edu Raymond J. Mooney Computer
More informationEnhanced Web Usage Mining Using Fuzzy Clustering and Collaborative Filtering Recommendation Algorithms
International Journal of Mathematics and Statistics Invention (IJMSI) E-ISSN: 2321 4767 P-ISSN: 2321-4759 Volume 4 Issue 10 December. 2016 PP-09-13 Enhanced Web Usage Mining Using Fuzzy Clustering and
More informationAutomated Online News Classification with Personalization
Automated Online News Classification with Personalization Chee-Hong Chan Aixin Sun Ee-Peng Lim Center for Advanced Information Systems, Nanyang Technological University Nanyang Avenue, Singapore, 639798
More informationTowards a hybrid approach to Netflix Challenge
Towards a hybrid approach to Netflix Challenge Abhishek Gupta, Abhijeet Mohapatra, Tejaswi Tenneti March 12, 2009 1 Introduction Today Recommendation systems [3] have become indispensible because of the
More informationSemantically Rich Recommendations in Social Networks for Sharing, Exchanging and Ranking Semantic Context
Semantically Rich Recommendations in Social Networks for Sharing, Exchanging and Ranking Semantic Context Stefania Ghita, Wolfgang Nejdl, and Raluca Paiu L3S Research Center, University of Hanover, Deutscher
More informationA Tagging Approach to Ontology Mapping
A Tagging Approach to Ontology Mapping Colm Conroy 1, Declan O'Sullivan 1, Dave Lewis 1 1 Knowledge and Data Engineering Group, Trinity College Dublin {coconroy,declan.osullivan,dave.lewis}@cs.tcd.ie Abstract.
More informationAn Evaluation of Information Retrieval Accuracy. with Simulated OCR Output. K. Taghva z, and J. Borsack z. University of Massachusetts, Amherst
An Evaluation of Information Retrieval Accuracy with Simulated OCR Output W.B. Croft y, S.M. Harding y, K. Taghva z, and J. Borsack z y Computer Science Department University of Massachusetts, Amherst
More informationTDNet Discover User Manual
TDNet Discover User Manual 2014 Introduction Contents 1 Introduction... 3 1.1 TDNet Discover... 3 1.2 TDNet Index... 3 1.3 Personalization... 3 1.4 TDNet Widgets... 4 2 Logging In... 5 2.1 Browsing without
More informationLECTURE 12. Web-Technology
LECTURE 12 Web-Technology Household issues Course evaluation on Caracal o https://caracal.uu.nl o Between 3-4-2018 and 29-4-2018 Assignment 3 deadline extension No lecture/practice on Friday 30/03/18 2
More informationSearching the Deep Web
Searching the Deep Web 1 What is Deep Web? Information accessed only through HTML form pages database queries results embedded in HTML pages Also can included other information on Web can t directly index
More informationDistributed Indexing of the Web Using Migrating Crawlers
Distributed Indexing of the Web Using Migrating Crawlers Odysseas Papapetrou cs98po1@cs.ucy.ac.cy Stavros Papastavrou stavrosp@cs.ucy.ac.cy George Samaras cssamara@cs.ucy.ac.cy ABSTRACT Due to the tremendous
More informationA MPEG-4/7 based Internet Video and Still Image Browsing System
A MPEG-4/7 based Internet Video and Still Image Browsing System Miroslaw Bober 1, Kohtaro Asai 2 and Ajay Divakaran 3 1 Mitsubishi Electric Information Technology Center Europe VIL, Guildford, Surrey,
More information2. PRELIMINARIES MANICURE is specically designed to prepare text collections from printed materials for information retrieval applications. In this ca
The MANICURE Document Processing System Kazem Taghva, Allen Condit, Julie Borsack, John Kilburg, Changshi Wu, and Je Gilbreth Information Science Research Institute University of Nevada, Las Vegas ABSTRACT
More informationJoining Collaborative and Content-based Filtering
Joining Collaborative and Content-based Filtering 1 Patrick Baudisch Integrated Publication and Information Systems Institute IPSI German National Research Center for Information Technology GMD 64293 Darmstadt,
More informationADAPTIVE AND DYNAMIC LOAD BALANCING METHODOLOGIES FOR DISTRIBUTED ENVIRONMENT
ADAPTIVE AND DYNAMIC LOAD BALANCING METHODOLOGIES FOR DISTRIBUTED ENVIRONMENT PhD Summary DOCTORATE OF PHILOSOPHY IN COMPUTER SCIENCE & ENGINEERING By Sandip Kumar Goyal (09-PhD-052) Under the Supervision
More informationDiscovering Paths Traversed by Visitors in Web Server Access Logs
Discovering Paths Traversed by Visitors in Web Server Access Logs Alper Tugay Mızrak Department of Computer Engineering Bilkent University 06533 Ankara, TURKEY E-mail: mizrak@cs.bilkent.edu.tr Abstract
More informationBrowsing in the tsimmis System. Stanford University. into requests the source can execute. The data returned by the source is converted back into the
Information Translation, Mediation, and Mosaic-Based Browsing in the tsimmis System SIGMOD Demo Proposal (nal version) Joachim Hammer, Hector Garcia-Molina, Kelly Ireland, Yannis Papakonstantinou, Jerey
More informationVision Document for Multi-Agent Research Tool (MART)
Vision Document for Multi-Agent Research Tool (MART) Version 2.0 Submitted in partial fulfillment of the requirements for the degree MSE Madhukar Kumar CIS 895 MSE Project Kansas State University 1 1.
More informationContent Bookmarking and Recommendation
Content Bookmarking and Recommendation Ananth V yasara yamut 1 Sat yabrata Behera 1 M anideep At tanti 1 Ganesh Ramakrishnan 1 (1) IIT Bombay, India ananthv@iitb.ac.in, satty@cse.iitb.ac.in, manideep@cse.iitb.ac.in,
More informationNetwork Working Group Request for Comments: 1679 Category: Informational K. O Donoghue NSWC-DD August 1994
Network Working Group Request for Comments: 1679 Category: Informational D. Green P. Irey D. Marlow K. O Donoghue NSWC-DD August 1994 HPN Working Group Input to the IPng Requirements Solicitation Status
More informationCSE 454 Final Report TasteCliq
CSE 454 Final Report TasteCliq Samrach Nouv, Andrew Hau, Soheil Danesh, and John-Paul Simonis Goals Your goals for the project Create an online service which allows people to discover new media based on
More informationMeaning & Concepts of Databases
27 th August 2015 Unit 1 Objective Meaning & Concepts of Databases Learning outcome Students will appreciate conceptual development of Databases Section 1: What is a Database & Applications Section 2:
More informationRecommendation on the Web Search by Using Co-Occurrence
Recommendation on the Web Search by Using Co-Occurrence S.Jayabalaji 1, G.Thilagavathy 2, P.Kubendiran 3, V.D.Srihari 4. UG Scholar, Department of Computer science & Engineering, Sree Shakthi Engineering
More informationSkill Area 209: Use Internet Technology. Software Application (SWA)
Skill Area 209: Use Internet Technology Software Application (SWA) Skill Area 209.1 Use Browser for Research (10hrs) 209.1.1 Familiarise with the Environment of Selected Browser Internet Technology The
More informationComparison of FP tree and Apriori Algorithm
International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 10, Issue 6 (June 2014), PP.78-82 Comparison of FP tree and Apriori Algorithm Prashasti
More informationConstructing Websites toward High Ranking Using Search Engine Optimization SEO
Constructing Websites toward High Ranking Using Search Engine Optimization SEO Pre-Publishing Paper Jasour Obeidat 1 Dr. Raed Hanandeh 2 Master Student CIS PhD in E-Business Middle East University of Jordan
More informationVISO: A Shared, Formal Knowledge Base as a Foundation for Semi-automatic InfoVis Systems
VISO: A Shared, Formal Knowledge Base as a Foundation for Semi-automatic InfoVis Systems Jan Polowinski Martin Voigt Technische Universität DresdenTechnische Universität Dresden 01062 Dresden, Germany
More informationIntroducing MESSIA: A Methodology of Developing Software Architectures Supporting Implementation Independence
Introducing MESSIA: A Methodology of Developing Software Architectures Supporting Implementation Independence Ratko Orlandic Department of Computer Science and Applied Math Illinois Institute of Technology
More informationDesign and Implementation of Search Engine Using Vector Space Model for Personalized Search
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 1, January 2014,
More informationSemantic feedback for hybrid recommendations in Recommendz
Semantic feedback for hybrid recommendations in Recommendz Matthew Garden and Gregory Dudek McGill University Centre For Intelligent Machines 3480 University St, Montréal, Québec, Canada H3A 2A7 {mgarden,
More informationCommunity Central Quick Start Guide
Community Central Quick Start Guide Copyright 2011 Open Solutions Inc. All rights reserved No part of this material may be reproduced in any form without written permission Table of Contents Community
More informationUnderstanding the workplace of the future. Artificial Intelligence series
Understanding the workplace of the future Artificial Intelligence series Konica Minolta Inc. 02 Cognitive Hub and the Semantic Platform Within today s digital workplace, there is a growing need for different
More informationSearching the Deep Web
Searching the Deep Web 1 What is Deep Web? Information accessed only through HTML form pages database queries results embedded in HTML pages Also can included other information on Web can t directly index
More informationSEMANTIC WEB POWERED PORTAL INFRASTRUCTURE
SEMANTIC WEB POWERED PORTAL INFRASTRUCTURE YING DING 1 Digital Enterprise Research Institute Leopold-Franzens Universität Innsbruck Austria DIETER FENSEL Digital Enterprise Research Institute National
More informationMulti-Aspect Tagging for Collaborative Structuring
Multi-Aspect Tagging for Collaborative Structuring Katharina Morik and Michael Wurst University of Dortmund, Department of Computer Science Baroperstr. 301, 44221 Dortmund, Germany morik@ls8.cs.uni-dortmund
More informationApplication of Dimensionality Reduction in Recommender System -- A Case Study
Application of Dimensionality Reduction in Recommender System -- A Case Study Badrul M. Sarwar, George Karypis, Joseph A. Konstan, John T. Riedl Department of Computer Science and Engineering / Army HPC
More informationWSN Routing Protocols
WSN Routing Protocols 1 Routing Challenges and Design Issues in WSNs 2 Overview The design of routing protocols in WSNs is influenced by many challenging factors. These factors must be overcome before
More informationThe Architecture of a System for the Indexing of Images by. Content
The Architecture of a System for the Indexing of s by Content S. Kostomanolakis, M. Lourakis, C. Chronaki, Y. Kavaklis, and S. C. Orphanoudakis Computer Vision and Robotics Laboratory Institute of Computer
More informationOverview of Web Mining Techniques and its Application towards Web
Overview of Web Mining Techniques and its Application towards Web *Prof.Pooja Mehta Abstract The World Wide Web (WWW) acts as an interactive and popular way to transfer information. Due to the enormous
More informationMining Web Data. Lijun Zhang
Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems
More informationInsightConnector Version 1.0
InsightConnector Version 1.0 2002 Bynari Inc. All Rights Reserved Table of Contents Table of Contents... 2 Executive Summary... 3 Examination of the Insight Messaging Solution... 3 Exchange or Outlook?...
More informationFramework for suggesting POPULAR ITEMS to users by Analyzing Randomized Algorithms
Framework for suggesting POPULAR ITEMS to users by Analyzing Randomized Algorithms #1 Y.Maanasa(Mtech) Department of CSE, Avanthi Institute of Engg & Technology, narsipatnam, India. maanasay@gmail.com
More informationUser accesses business site. Recommendations Engine. Recommendations to user 3 Data Mining for Personalization
Personalization and Location-based Technologies for E-Commerce Applications K. V. Ravi Kanth, and Siva Ravada Spatial Technologies, NEDC, Oracle Corporation, Nashua NH 03062. fravi.kothuri, Siva.Ravadag@oracle.com
More informationThe ToCAI Description Scheme for Indexing and Retrieval of Multimedia Documents 1
The ToCAI Description Scheme for Indexing and Retrieval of Multimedia Documents 1 N. Adami, A. Bugatti, A. Corghi, R. Leonardi, P. Migliorati, Lorenzo A. Rossi, C. Saraceno 2 Department of Electronics
More informationijade Reporter An Intelligent Multi-agent Based Context Aware News Reporting System
ijade Reporter An Intelligent Multi-agent Based Context Aware Reporting System Eddie C.L. Chan and Raymond S.T. Lee The Department of Computing, The Hong Kong Polytechnic University, Hung Hong, Kowloon,
More informationData publication and discovery with Globus
Data publication and discovery with Globus Questions and comments to outreach@globus.org The Globus data publication and discovery services make it easy for institutions and projects to establish collections,
More informationWeb Data mining-a Research area in Web usage mining
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661, p- ISSN: 2278-8727Volume 13, Issue 1 (Jul. - Aug. 2013), PP 22-26 Web Data mining-a Research area in Web usage mining 1 V.S.Thiyagarajan,
More informationSupporting World-Wide Web Navigation Through History Mechanisms
Supporting World-Wide Web Navigation Through History Mechanisms Linda Tauscher Computer Science Department, University of Calgary tauscher@cpsc.ucalgary.ca Cite as: Tauscher, L. (1996) Supporting World
More informationEnhancing Cluster Quality by Using User Browsing Time
Enhancing Cluster Quality by Using User Browsing Time Rehab Duwairi Dept. of Computer Information Systems Jordan Univ. of Sc. and Technology Irbid, Jordan rehab@just.edu.jo Khaleifah Al.jada' Dept. of
More informationHybrid Recommender Systems for Electronic Commerce
From: AAAI Technical Report WS-00-04. Compilation copyright 2000, AAAI (www.aaai.org). All rights reserved. Hybrid Recommender Systems for Electronic Commerce Thomas Tran and Robin Cohen Dept. of Computer
More informationUsability Inspection Report of NCSTRL
Usability Inspection Report of NCSTRL (Networked Computer Science Technical Report Library) www.ncstrl.org NSDL Evaluation Project - Related to efforts at Virginia Tech Dr. H. Rex Hartson Priya Shivakumar
More informationWebBeholder: A Revolution in Tracking and Viewing Changes on The Web by Agent Community
WebBeholder: A Revolution in Tracking and Viewing Changes on The Web by Agent Community Santi Saeyor Mitsuru Ishizuka Dept. of Information and Communication Engineering, Faculty of Engineering, University
More informationUsage of LDAP in Globus
Usage of LDAP in Globus Gregor von Laszewski and Ian Foster Mathematics and Computer Science Division Argonne National Laboratory, Argonne, IL 60439 gregor@mcs.anl.gov Abstract: This short note describes
More informationBuilding an Infrastructure for Law Enforcement Information Sharing and Collaboration: Design Issues and Challenges
Submission to the National Conference on Digital Government, 2001 Submission Type: PAPER (SHORT PAPER preferred) Building an Infrastructure for Law Enforcement Information Sharing and Collaboration: Design
More informationPORTAL RESOURCES INFORMATION SYSTEM: THE DESIGN AND DEVELOPMENT OF AN ONLINE DATABASE FOR TRACKING WEB RESOURCES.
PORTAL RESOURCES INFORMATION SYSTEM: THE DESIGN AND DEVELOPMENT OF AN ONLINE DATABASE FOR TRACKING WEB RESOURCES by Richard Spinks A Master s paper submitted to the faculty of the School of Information
More informationTaccumulation of the social network data has raised
International Journal of Advanced Research in Social Sciences, Environmental Studies & Technology Hard Print: 2536-6505 Online: 2536-6513 September, 2016 Vol. 2, No. 1 Review Social Network Analysis and
More informationMichael F. Schwartz. March 12, (Original Date: August 1994 Revised March 1995) Abstract
Harvest: A Scalable, Customizable Discovery and Access System C. Mic Bowman Transarc Corp. Udi Manber University of Arizona Peter B. Danzig University of Southern California Michael F. Schwartz University
More informationWEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS
1 WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS BRUCE CROFT NSF Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts,
More informationMapping the library future: Subject navigation for today's and tomorrow's library catalogs
University of Pennsylvania ScholarlyCommons Scholarship at Penn Libraries Penn Libraries January 2008 Mapping the library future: Subject navigation for today's and tomorrow's library catalogs John Mark
More informationIntroduction to Grid Computing
Milestone 2 Include the names of the papers You only have a page be selective about what you include Be specific; summarize the authors contributions, not just what the paper is about. You might be able
More informationContext-based Navigational Support in Hypermedia
Context-based Navigational Support in Hypermedia Sebastian Stober and Andreas Nürnberger Institut für Wissens- und Sprachverarbeitung, Fakultät für Informatik, Otto-von-Guericke-Universität Magdeburg,
More informationLibrary. Summary Report
Library Summary Report 217-218 Prepared by: Library Staff December 218 Table of Contents Introduction..1 New Books.2 Print Circulation.3 Interlibrary Loan 4 Information Literacy 5 Reference Statistics.6
More informationThe TDAQ Analytics Dashboard: a real-time web application for the ATLAS TDAQ control infrastructure
The TDAQ Analytics Dashboard: a real-time web application for the ATLAS TDAQ control infrastructure Giovanna Lehmann Miotto, Luca Magnoni, John Erik Sloper European Laboratory for Particle Physics (CERN),
More informationFIGURE 3. Two-Level Internet Address Structure. FIGURE 4. Principle Classful IP Address Formats
Classful IP Addressing When IP was first standardized in September 1981, the specification required that each system attached to an IP-based Internet be assigned a unique, 32-bit Internet address value.
More information