INFSCI 2480 Adaptive Information Systems Adaptive [Web] Search. Peter Brusilovsky.

Similar documents
Why Search Personalization?

INFSCI 2140 Information Storage and Retrieval Lecture 6: Taking User into Account. Ad-hoc IR in text-oriented DS

Open-Corpus Adaptive Hypermedia. Peter Brusilovsky School of Information Sciences University of Pittsburgh, USA

Open-Corpus Adaptive Hypermedia. Adaptive Hypermedia

LECTURE 12. Web-Technology

An Implementation and Analysis on the Effectiveness of Social Search

Incorporating Conceptual Matching in Search

Search Engine Architecture. Hongning Wang

INFSCI 2140 Information Storage and Retrieval Lecture 2: Models of Information Retrieval: Boolean model. Final Group Projects

A Two-Level Adaptive Visualization for Information Access to Open-Corpus Educational Resources

Syskill & Webert: Identifying interesting web sites

Ranked Retrieval. Evaluation in IR. One option is to average the precision scores at discrete. points on the ROC curve But which points?

Department of Computer Science and Engineering B.E/B.Tech/M.E/M.Tech : B.E. Regulation: 2013 PG Specialisation : _

INFSCI 2140 Information Storage and Retrieval Lecture 1: Introduction. INFSCI 2140 and your program

In the recent past, the World Wide Web has been witnessing an. explosive growth. All the leading web search engines, namely, Google,

Information Retrieval (Part 1)

Mathematical Logic Prof. Arindama Singh Department of Mathematics Indian Institute of Technology, Madras. Lecture - 37 Resolution Rules

Information Retrieval CSCI

CS 572: Information Retrieval. Lecture 1: Course Overview and Introduction 11 January 2016

CHAPTER THREE INFORMATION RETRIEVAL SYSTEM

THE HISTORY & EVOLUTION OF SEARCH

Department of Electronic Engineering FINAL YEAR PROJECT REPORT

Information Retrieval Spring Web retrieval

Information Retrieval. CS630 Representing and Accessing Digital Information. What is a Retrieval Model? Basic IR Processes

Search Engine Architecture II

WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS

CS6200 Information Retrieval. Jesse Anderton College of Computer and Information Science Northeastern University

Relevance Feedback and Query Reformulation. Lecture 10 CS 510 Information Retrieval on the Internet Thanks to Susan Price. Outline

CHAPTER 5 Querying of the Information Retrieval System

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A.

Query Languages. Berlin Chen Reference: 1. Modern Information Retrieval, chapter 4

Personalized Search on the World Wide Web

Querying Introduction to Information Retrieval INF 141 Donald J. Patterson. Content adapted from Hinrich Schütze

BRIDGING NAVIGATION, SEARCH AND ADAPTATION. Adaptive Hypermedia Models Evolution

Durchblick - A Conference Assistance System for Augmented Reality Devices

Sec. 8.7 RESULTS PRESENTATION

A Natural Language Processing Based Internet Agent

Query Phrase Expansion using Wikipedia for Patent Class Search

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University

EVALUATING SEARCH EFFECTIVENESS OF SOME SELECTED SEARCH ENGINES

Comprehensive Personalized Information Access in an Educational Digital Library

5 Choosing keywords Initially choosing keywords Frequent and rare keywords Evaluating the competition rates of search

second_language research_teaching sla vivian_cook language_department idl

Search Costs vs. User Satisfaction on Mobile

Web Information Retrieval using WordNet

Introduction to Information Retrieval

AutoCAD Utility Design: Bending the Rules. Dan Leighton Principal Consultant DL Consulting

PERSONALIZED MOBILE SEARCH ENGINE BASED ON MULTIPLE PREFERENCE, USER PROFILE AND ANDROID PLATFORM

Word Disambiguation in Web Search

Mobile Offloading. Matti Kemppainen

Information Retrieval

Chapter 6: Information Retrieval and Web Search. An introduction

AN ENCODING TECHNIQUE BASED ON WORD IMPORTANCE FOR THE CLUSTERING OF WEB DOCUMENTS

Semantic Clickstream Mining

Jan Pedersen 22 July 2010

Multimedia Information Extraction and Retrieval Term Frequency Inverse Document Frequency

Telling Experts from Spammers Expertise Ranking in Folksonomies

Part I: Data Mining Foundations

A Unified User Profile Framework for Query Disambiguation and Personalization

VALLIAMMAI ENGINEERING COLLEGE SRM Nagar, Kattankulathur DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING QUESTION BANK VII SEMESTER

Introduction to Information Retrieval. Hongning Wang

Part 11: Collaborative Filtering. Francesco Ricci

Query Refinement and Search Result Presentation

Search Evaluation. Tao Yang CS293S Slides partially based on text book [CMS] [MRS]

Improvement of Web Search Results using Genetic Algorithm on Word Sense Disambiguation

An Entity Name Systems (ENS) for the [Semantic] Web

Top-To-Bottom (And Beyond) On-Page Optimization Guidebook

Weighted Page Rank Algorithm Based on Number of Visits of Links of Web Page

An Axiomatic Approach to IR UIUC TREC 2005 Robust Track Experiments

Definitions. Lecture Objectives. Text Technologies for Data Science INFR Learn about main concepts in IR 9/19/2017. Instructor: Walid Magdy

From Adaptive Hypermedia to the Adaptive Web Systems. WWW: One Size Fits All?

Document Clustering for Mediated Information Access The WebCluster Project

Venice: Content-Based Information Management for Electronic Mail

Compilers. Lecture 2 Overview. (original slides by Sam

Enhanced Retrieval of Web Pages using Improved Page Rank Algorithm

Content-Based Recommendation for Web Personalization

21. Search Models and UIs for IR

Chapter 2. Architecture of a Search Engine

Copyright 2011 please consult the authors

INFORMATION RETRIEVAL SYSTEM USING FUZZY SET THEORY - THE BASIC CONCEPT

Information Retrieval and Knowledge Organisation

COMP6237 Data Mining Searching and Ranking

The Research of A multi-language supporting description-oriented Clustering Algorithm on Meta-Search Engine Result Wuling Ren 1, a and Lijuan Liu 2,b

CS 6320 Natural Language Processing

A Meta Search Engine for User Adaptive Information Retrieval Interfaces for Desktop and Mobile Devices

Open Corpus Adaptive Educational Hypermedia

Keywords APSE: Advanced Preferred Search Engine, Google Android Platform, Search Engine, Click-through data, Location and Content Concepts.

Identifying Web Spam With User Behavior Analysis

Open Locast: Locative Media Platforms for Situated Cultural Experiences

Retrieval Evaluation. Hongning Wang

MSc Advanced Computer Science School of Computer Science The University of Manchester

Secure Conjunctive Keyword Ranked Search over Encrypted Cloud Data

slide courtesy of D. Yarowsky Splitting Words a.k.a. Word Sense Disambiguation Intro to NLP - J. Eisner 1

Everyday Activity. Course Content. Objectives of Lecture 13 Search Engine

Information Retrieval May 15. Web retrieval

An Oracle White Paper October Oracle Social Cloud Platform Text Analytics

Information Retrieval (IR) Introduction to Information Retrieval. Lecture Overview. Why do we need IR? Basics of an IR system.

Towards Summarizing the Web of Entities

A Security Model for Multi-User File System Search. in Multi-User Environments

WebBeholder: A Revolution in Tracking and Viewing Changes on The Web by Agent Community

Transcription:

INFSCI 2480 Adaptive Information Systems Adaptive [Web] Search Peter Brusilovsky http://www.sis.pitt.edu/~peterb/

Where we are? Search Navigation Recommendation Content-based Semantics / Metadata Social

Why Search Personalization? n R. Larsen: With the growth of DL even a good query can return not just tens, but thousands of "relevant" documents 1 n Personalization is an attempt to find most relevant documents using information about user's goals, knowledge, preferences, navigation history, etc. 1 Larsen, R.L. Relaxing Assumptions... Stretching the Vision: A Modest View of Some Technical Issues. D-Lib Magazine, 3, April (1997), available online at http://www.dlib.org/dlib/april97/04larsen.html!

User ProfileWhere we are? n Common term for user models in IR/IF n A user s profile is a collection of information about the user of the system. n This information is used to get the user to more relevant information n Views on user profiles in IR community Classic (Korfhage) - a reference point Modern - simple form of a user model

Core vs. Extended User Profile n Core profile contains information related to the user search goals and interests n Extended profile contains information related to the user as a person in order to understand or model the use that a person will make with the information retrieved

Simple (vector) Core Profiles n Primitive profile (any model) A set of search terms (0-1 vector) n For Boolean model of IR A Boolean query n For vector model of IR (dominated) A set of terms with their weights (vector) An overlay (set of weights) over a simple domain model that is just a list of terms that could be of interest to the users

Advanced Core Profiles n Domain model is a network of terms A subset of the model (simple overlay) A weighted overlay over DM n Domain model is a hierarchy of topics typically, each topic is a vector of terms A subset of the model (simple overlay) A weighted overlay over DM

Group Profiles n A system can maintain a group profile in parallel or instead of user profile n Could resolve the privacy issue (navigation with group profile) n Could be use for new group members at the beginning n Could be used in addition to the user profile to add group wisdom

Extended Profile n Knowledge: about the system and the subject n Goals: local and global n Interests n Background: profession, language, prospect, capabilities n Preferences (types of docs, authors, sources )

Who Maintains the Profile? n Profile is provided and maintained by the user/administrator Sometimes the only choice n The system constructs and updates the profile (automatic personalization) n Collaborative - user and system User creates, system maintains User can influence and edit Does it help or not?

Adaptive Search n Goals: Present documents (pages) that are most suitable for the individual user n Methods: Employ user profiles representing shortterm and/or long-term interests (Korfhage) Rank and present search results taking both user query and user profile into account

Personalized Search: Benefits n Resolving ambiguity The profile provides a context to the query in order to reduce ambiguity. Example: The profile of interests will allow to distinguish what the user asked about Berkeley ( Pirates, Jaguar ) really wants n Revealing hidden treasures The profile allows to bring to surface most relevant documents, which could be hidden beyond top results page Example: Owner of iphone searches for Google Android. Pages referring to both would be most interesting

The Components of Web Search Formulates Processes Analyzes Query Search / Matching Ordered results

Where to Apply Profiles? n The user profile can be applied in several ways: To modify the query itself (pre-processing) To change the usual way of retrieval To process results of a query (postprocessing) To present document snippets Special case: adaptation for meta-search

Where to Apply Profiles?

Examples of Systems n Pre-process with QE - Koutrika, Mobasher, Chirita n Post-process with annotations: Syskill & Webert n Post-process with re-ranking: WIFS, YourNews, TaskSieve n Adaptive Snippets: TaskSieve

Pre-Process: Query Expansion n User profile is applied to add terms to the query Popular terms could be added to introduce context Similar terms could be added to resolve indexer-user mismatch Related terms could be added to resolve ambiguity Works with any IR model or search engine

Simple Context-based Query Expansion: Chirita et al. 2006 User related documents (desktop documents) containing the query Score and extract keywords Top query-dependent, user-biased keywords Extract query expansion or re-ranking terms [ Chirita, Firan, Nejdl. Summarizing local context to personalize global web search. CIKM 2006 ]

Advanced Example: Koutrika & Ioannidis 05 n Advanced relevance network for query expansion n java -> java and programming -> java and (programming or development) A Unified User-Profile Framework for Query Disambiguation and Personalization Georgia Koutrika and Yannis Ioannidis, http://adiret.cs.uni-magdeburg.de/pia2005/proceedings.htm

Pre-Process: Relevance Feedback n In this case the profile is used to move the query vector (vector model only) n Imagine that: the documents, the query the user profile are represented by the same set of weighted index terms

Pre-process: Linear Transformation n The query q=q 1, q 2, q n n The profile p=p 1, p 2, p n n The query modified by the user profile will be something like that: modified q i =Kp i +(1-K)q i i=1,2, n

Pre-process: Linear Transformation modified q i =Kp i +(1-K)q i n In this case we add the terms of the profile to the query ones, weighted by K for K=0 modified q i =q i the query is unmodified for K=1 modified q i =p i the query is substituted by the profile

Piecewise Linear Transformation n if the term appears in the query and in the profile then the linear transformation is applied n if the term appears in the query but not in the profile is left unmodified or diminished slightly n if the term appears in the profile but not in the query it is not introduced, or introduced with a weight lower than in the profile.

Example: SmartGuide n Access to the CIS-like information n User has a long-term interests profile and current queries n Information is searched using a combination of both n Profile is initiated from a stereotype and kept updated n Increased user satisfaction, decreased navigation overhead Gates, K. F., Lawhead, P. B., and Wilkins, D. E. (1998) Toward an adaptive WWW: a case study in customized hypermedia. New Review of Multimedia and Hypermedia 4, 89-113.!

Post-Processing n The user profile is used to organize the results of the retrieval process present to the user the most interesting documents Filter out irrelevant documents n Extended profile can be used effectively n In this case the use of the profile adds an extra step to processing n Similar to classic information filtering problem n Typical way for adaptive Web IR

Post-Process: Annotations n The result could be relevant to the user in several aspects. Fusing this relevance with query relevance is error prone and leads to a loss of data n Results are ranked by the query relevance, but annotated with visual cues reflecting other kinds of relevance User interests - Syskill and Webert, group interests - KnowledgeSea

Example: Syskill and Webert n First example of annotation n Post-filter to Lycos n Hot, cold, lukewarm n Pazzani, M., Muramatsu, J., and Billsus, D. (1996) Syskill & Webert: Identifying interesting Web sites. In: Proceedings of the Thirteen National Conference on Artificial Intelligence, AAAI'96, Portland, OR, August, 1996, AAAI Press /MIT Press, pp. 54-61, also available at http:// www.ics.uci.edu/~pazzani/ Syskill.html.!

Post-Process: Re-Ranking n Re-ranking is a typical approach for post-processing n Each document is rated according to its relevance (similarity) to the user or group profile n This rating is fused with the relevance rating returned by the search engine n The results are ranked by fused rating User model: WIFS, group model: I-Spy

Example: WIFS (Micarelli) n Adaptive post-filter to AltaVista search engine n Maintains an advanced stereotypebased user model (Humos subsystem) n User model is updated by watching the user n The model is used to filter and re-order the links returned by AltaVista

YourNews: Adaptive Search and Filtering with Open User Profile http://amber.exp.sis.pitt.edu/yournews/

TaskSieve: User-Controled Adaptive Search http://amber.exp.sis.pitt.edu/yournews-experiment/

TaskSieve: Adaptive Snippets n The goal of a usual snippet is to show query relevance. TaskSieve applies adaptive snippets to show profile relevance as well n Selects top 3 relevant sentences combining query relevance and task relevance to sentences n Applies color coding by query/profile

Knowledge Sea Adaptive Search: Three Reference Points n Adaptive search in TaskSieve uses linear combinations of two ranks - query-based and profile-based to calculate the final rank n Query and profile are two reference points for ranking n What if there are three reference points? n Knowledge Sea Search allows to do useradaptive ranking with three reference points: Query, task profile, and lecture

Knowledge Sea Adaptive Search: Three Reference Points

Adaptive VIBE: Moving to 2D Query Terms Documents User Profile Terms n n n Presents (1) User profile, (2) Query, (3) Documents Allows to merge query relevance with user profile relevance Reveals relevance to individual terms

Some Ideas for Profile-Adapted Retrieval (Korfhage) n Query and Profile are considered as Separate Reference Points n In this case documents are retrieved if they are near the query or the profile. n For the following slides, let s assume that the similarity is measured by distance D,Q where D is the document and Q is the query

Separate Reference Points n We have different way to integrate query and profile as separate reference points: Disjunctive model of query-profile integration Conjunctive model of query-profile integration Ellipsoidal model Cassini oval model

Disjunctive Model n We will take the document if the following condition is satisfied: min ( D, Q, D, P ) < d n The D document should be near the query Q or the profile P

Conjunctive Model n We will take the document if the following condition is satisfied: max ( D, Q, D, P ) < d n The D document should be near the query Q and the profile P n In this case if the profile and the query have little in common very few documents are retrieved

Ellipsoidal Model n Condition to satisfy D, Q + D, P < d this is the equation of a ellipse. If the profile and query are far apart a lot of documents not relevant are retrieved P Q P Q

Cassini Model n Condition to satisfy D, Q D, P < d P Q P Q P Q P Q