The Role of Information Scent in On-line Browsing:

Similar documents
SNIF-ACT: A Model of Information Foraging on the World Wide Web

Information Scent. Slides adapted from Caitlin Kelleher

Model-based Navigation Support

Information Scent as a Driver of Web Behavior Graphs: Results of a Protocol Analysis Method for Web Usability

Automated Cognitive Walkthrough for the Web (AutoCWW)

Facilitating Exploratory Search by Model-Based Navigational Cues

Facilitating Exploratory Search by Model-Based Navigational Cues

Text Modeling with the Trace Norm

Clustering Lecture 5: Mixture Model

The Effects of Semantic Grouping on Visual Search

Supporting Exploratory Search Through User Modeling

Framework of a Real-Time Adaptive Hypermedia System

A TEXT MINER ANALYSIS TO COMPARE INTERNET AND MEDLINE INFORMATION ABOUT ALLERGY MEDICATIONS Chakib Battioui, University of Louisville, Louisville, KY

Principles of Visual Design

Using User Interaction to Model User Comprehension on the Web Navigation

Interaction Model to Predict Subjective Specificity of Search Results

Information Retrieval and Web Search Engines

Ambiguity. Potential for Personalization. Personalization. Example: NDCG values for a query. Computing potential for personalization

Discount Eye Tracking: The Enhanced Restricted Focus Viewer

Object perception by primates

CURIOUS BROWSERS: Automated Gathering of Implicit Interest Indicators by an Instrumented Browser

Fall 2018 CSE 482 Big Data Analysis: Exam 1 Total: 36 (+3 bonus points)

INF4820 Algorithms for AI and NLP. Evaluating Classifiers Clustering

CS Information Visualization March 4, 2004 John Stasko

8. Visual Analytics. Prof. Tulasi Prasad Sariki SCSE, VIT, Chennai

Information Retrieval and Web Search Engines

MAE 298, Lecture 9 April 30, Web search and decentralized search on small-worlds

Looking Back: Fitts Law

Path Analysis References: Ch.10, Data Mining Techniques By M.Berry, andg.linoff Dr Ahmed Rafea

Using Text Elements by Context to Display Search Results in Information Retrieval Systems Model and Research results

Usability Testing. November 9, 2016

Variability of User Interaction with Multi-Platform News Feeds

Probabilistic Graphical Models Part III: Example Applications

Using Thumbnails to Search the Web

Toward Modeling Contextual Information in Web Navigation

10-701/15-781, Fall 2006, Final

Information Gathering Support Interface by the Overview Presentation of Web Search Results

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A.

Characterizing Web Usage Regularities with Information Foraging Agents

Web Structure & Content (not a theory talk)

Supporting World-Wide Web Navigation Through History Mechanisms

STATISTICS (STAT) Statistics (STAT) 1

Bayesian Approaches to Content-based Image Retrieval

Text Mining. Munawar, PhD. Text Mining - Munawar, PhD

Proximity Prestige using Incremental Iteration in Page Rank Algorithm

A World Wide Web-based HCI-library Designed for Interaction Studies

Preliminary Evidence for Top-down and Bottom-up Processes in Web Search Navigation

Samuel Coolidge, Dan Simon, Dennis Shasha, Technical Report NYU/CIMS/TR

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

Modeling Result-List Searching in the World Wide Web: The Role of Relevance Topologies and Trust Bias

CZWeb: Fish-eye Views for Visualizing the World-Wide Web

Appendix A: A Review of Literature related to Information Foraging Theory (IFT)

BUPT at TREC 2009: Entity Track

10/10/13. Traditional database system. Information Retrieval. Information Retrieval. Information retrieval system? Information Retrieval Issues

Theory of mind and bounded rationality without interpretive overhead

Discovering User Information Goals with Semantic Website Media Modeling

Lecture #3: PageRank Algorithm The Mathematics of Google Search

Exploratory Analysis: Clustering

INF4820 Algorithms for AI and NLP. Evaluating Classifiers Clustering

Slide Copyright 2005 Pearson Education, Inc. SEVENTH EDITION and EXPANDED SEVENTH EDITION. Chapter 13. Statistics Sampling Techniques

Deep Learning. Deep Learning. Practical Application Automatically Adding Sounds To Silent Movies

Clustering CS 550: Machine Learning

ScentTrails: Integrating Browsing and Searching on the World Wide Web

Link Analysis and Web Search

Exploring the Impact of Information Visualization on Medical Information Seeking Using the Web

Formalization, User Strategy and Interaction Design: Users Behaviour with Discourse Tagging Semantics

Learning Statistical Models From Relational Data

User-Centered Analysis & Design

How to Choose the Right UX Methods For Your Project

Clustering using Topic Models

JMP Clinical. Release Notes. Version 5.0

Markov Decision Processes. (Slides from Mausam)

Lecture: Simulation. of Manufacturing Systems. Sivakumar AI. Simulation. SMA6304 M2 ---Factory Planning and scheduling. Simulation - A Predictive Tool

Information Retrieval

Shedding Light on the Graph Schema

Effectiveness of Shallow Hierarchies for Document Stores

Extension Web Publishing 3 Lecture # 1. Chapter 6 Site Types and Architectures

IBL and clustering. Relationship of IBL with CBR

Graphs / Networks. CSE 6242/ CX 4242 Feb 18, Centrality measures, algorithms, interactive applications. Duen Horng (Polo) Chau Georgia Tech

3D Unsharp Masking for Scene Coherent Enhancement Supplemental Material 1: Experimental Validation of the Algorithm

Tennessee. Trade & Industrial Course Web Page Design II - Site Designer Standards. A Guide to Web Development Using Adobe Dreamweaver CS3 2009

Scene Grammars, Factor Graphs, and Belief Propagation

ON HEURISTIC METHODS IN NEXT-GENERATION SEQUENCING DATA ANALYSIS

Mobility Models. Larissa Marinho Eglem de Oliveira. May 26th CMPE 257 Wireless Networks. (UCSC) May / 50

CS 188: Artificial Intelligence Spring Announcements

Detection of skin cancer

Using Information Scent to Model the Dynamic Foraging Behavior of Programmers in Maintenance Tasks

Bootstrapping Methods

OntoGen: Semi-automatic Ontology Editor

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, ISSN:

CSCI 599: Applications of Natural Language Processing Information Retrieval Retrieval Models (Part 1)"

A Patent Retrieval Method Using a Hierarchy of Clusters at TUT

[Slides Extracted From] Visualization Analysis & Design Full-Day Tutorial Session 4

Scene Grammars, Factor Graphs, and Belief Propagation

Better Bioinformatics Through Usability Analysis

Variational Autoencoders. Sargur N. Srihari

Global modelling of air pollution using multiple data sources

Topics in Parsing: Context and Markovization; Dependency Parsing. COMP-599 Oct 17, 2016

Improving Difficult Queries by Leveraging Clusters in Term Graph

Markov Decision Processes and Reinforcement Learning

Transcription:

The Role of Information Scent in On-line Browsing: Extensions of the ACT-R Utility and Concept Formation Mechanisms Peter Pirolli User Interface Research Area Supported in part by Office of Naval Research (ONR) Advanced Research and Development Agency (ARDA)

Information Scent Browsing often requires that we use visual or textual cues to direct our actions. (a) (b) These local (proximal) cues are called information scent. The theory of information scent is a psychological theory Used to develop new user interface designs and new Web site evaluation tools (c) (d)

Overview Why information scent is important SNIF-ACT Model of Navigation Where to go next (navigational choices)? When to stop (leaving a Web site)? InfoCLASS Model of topic category formation What categories of topics are where (learning about the information environment)

Example: Looking for Pirolli s Personal Page on the PARC Site Company Research Document Content Pirolli Sensemaking User Interface People

Hierarchical Navigation Spaces Strong Information Scent = Navigation Cost Linear in Depth Weak Information Scent = Navigation Cost Exponential in Depth

Navigation Costs as Function of Information Scent 150 Number of pages visited 100 50 Probability of choosing wrong link.150.125 Probabilities within range of empirical values in Woodruff et al. (2002).100 0 0 2 4 6 8 10 Depth Notes: Average branching factor = 10 Depth = 10

Phase Transition in Navigation Costs as Function of Information Scent 150 Number of pages visited 100 50 Probability of choosing wrong link.150.125.100 Number of Pages Visited per Level 100 80 60 40 20 0 Linear Exponential 0 0.05 0.1 0.15 0.2 0 0 2 4 6 8 10 Depth Probability of Choosing Wrong Link Notes: Average branching factor = 10 Depth = 10

WWW Studies Representative Task Sample User System Interface Content Survey of user tasks (N = 2188). Develop ecologically valid tasks for study in experiments. Experiments Database Database Instrumentation User Trace Data User Tracing User Simulation Model Develop instrumentation to record user trace data. Analyze data. Develop cognitive model. Validate against user trace data.

WWW Experiment & Model 12 Stanford University students 6 tasks 2 tasks analyzed and modeled for 4 participants Example tasks Find specific movie posters for your new living room Find dates for a performance by a comedy troupe Pirolli, P. & Fu, W. (2003). SNIF-ACT: A model of information foraging on the World Wide Web. Proceedings of the Conference on User Modeling.

Web Behavior Graphs (WBGs) Find a poster for the movie Antz Find the tour schedule of the Second City Comedy Troupe S1 S2 S3 S4 S5 S6 S1 S2 S3 S4 S5 S6 S7 S8 S5 S7 S8 S9 S7 S9 S8 S10 S5 S11 S12 S13 S12 S11 S5 S14 S14 S15 S15 S20 S20 S20 S16 S17 S18 S19 S21 S22 Links providing better information scent yield more direct navigation S5 S14 S11 S11 S11 S23 S24 S25 S26 S27 S28 S29 S30 S31 S32 S27 S26 S33 S34 S35 S36 S35 S37 X People abandon Web sites when information scent diminishes Card, S., Pirolli, P., Van Der Wege, M., Morrison, J., Reeder, R., Schraedley, P., & Boshart, J. (2001). CHI Proceedings

SNIF-ACT Scent-based Navigation and Information Foraging in the ACT theory Declarative knowledge User goal (e.g. Find the poster for Antz ) Perceived aspects of Web page & browser Large spreading activation network representing word associations Procedural knowledge Productions representing basic Web browsing actions Utility: Information Scent Mutual relevance between link text and user goal

Cognitive Model of Information Scent new cell Information Goal medical patient Link Text treatments dose procedures beam Spreading activation Activation reflects likelihood of relevance given past history and current context

Spreading activation Activation of node i A i = B i + ΣW j S ji Base-level activation Activation spread from linked nodes j i bread j butter Base-level reflects log odds of occurrence Pr(i) B i = ln( ψ) Pr(not i) Strength of link spread reflects log likelihood odds of cooccurrance Pr(j i) S ji = ln( ψ) Pr(j not i)

spreading activation networks Document corpus ~ 200 X 200 million sparse word matrix Word statistics Spreading activation network ~ 55 million associations

Information Scent: Random Utility Model based on Activation Set of Alternatives (C) Goal G Specific alternative J Deterministic Utility of alternative J, out of set C, in context of goal G: U J G = VJ G + ε J G Stochastic Deterministic utility is based on activation: V J G = i G A i Summed activation from scent cues to goal Probability of choosing alternative J: J e Pr( J G, C) = µ e K C µ V Q V K Q

Evaluation:Link-following actions User Actions Page A choose link X Page B choose link Y Page C choose link Z Page A Page B SNIF-ACT Link X Link W Link V Link K Link Y Link M Info Scent Rank=1 Info Scent Rank=2 Page C Link Z Link O Link B Info Scent Rank=1

Observed vs Predicted Link Choice Observed frequency that link is chosen by user Rank of link by SNIF-ACT predicted utility

Theory: Decision to Leave Web Site Expected Utility (via Information Scent) Expected utility of continuing at site Expected utility of starting over at a new relevant site Foraging Time Decision to stop & go elsewhere

Data: Sequences of Moves Just Prior to Leaving a Site

Match of SNIF-ACT to User Data SNIF-ACT Predicted utility of next link SNIF-ACT Average utility of new site User Move Before Leaving Site

Forming Concepts about Information Sources Standard Web Site Search Engine Cancer Skin Treatments Statistics

Forming Concepts about Information Sources Scatter/Gather Document Clustering Browser

Browser Study Browsers Scatter-Gather (N = 8) Standard Similarity Search Engine (N = 8) 12 Tasks (TREC) E.g., Find documents on new medical procedures for cancer Pirolli, P., Hearst, M., Schank, P., & Diehl, C. (1996). Scatter-Gather browsing communicates the topic structure of a very large text collection. Proceedings of the CHI 96 Conference.

Inferred Topic Structure Browse documents Typical Topic-Structure Tree Diagram

Coherence of Mental Topics Among Users Hierarchical clustering of users based on similarity of topic trees C Person Person A Person B SG SS

InfoCLASS Information Category Learning by Adaptation to Scent Stimuli Rational Analysis Model of Human Category Learning (Anderson, 1991, 1991) Assumes Topics are mental categories Mental categories are probabilistic collections of concepts Links, thumbnails, citations provide information scent cues that evoke mental concepts

Evoking Categories and Inferences Perceive cues Activate Category Infer cell medical patient Health treatments dose cancer beam Strength between observed UI features (words) and category reflects log odds Strength between category and inferred features reflects log odds

Cues correspond to existing mental topics: Choose most activated (highest odds) cell medical patient Health treatments dose cancer beam Patents

Otherwise create new category New cell patient dose beam

InfoCLASS Simulation Scatter/Gather (SG): expose to cluster summaries from user logs Standard Search (SS) Expose to search result lists from user logs Evaluate ADA (category coherence) for the two groups of users SG Logs Parse Stoplist Items Categorization SS Logs Parse Stoplist Items Categorization Compare

Simulation Performance Items Categories SG SS SG SS 1 2 3 4 5 6 7 8 330 464 470 767 526 1926 579 1481 770 1331 420 565 370 963 507 1486 496 1123 59 28 149 39 142 33 101 28 120 28 152 45 151 16 124 14 125 29 Scatter-Gather (SG) simulations develop more categories than Standard Search (SS). Same trend as Pirolli et al (1996) comparison of SG vs SS users.

Category structure and coherence Scatter/Gather User 1 Similarity Search User 1 120 450 400 100 350 Number of Category Members 80 60 40 Number of category Members 300 250 200 150 100 20 50 0 0 10 20 30 40 50 60 Rank Size of Category 0 0 5 10 15 20 25 30 Rank Size of Category SG SS ADA.949 1.292 x 10-6 bits Scatter-Gather simulations exhibit greater category coherence (less divergence).

Summary Importance Quality of information scent can effect qualitative changes in navigation costs (linear to exponential) Browsers can differ in the how they communicate the topic structure of a collection of information SNIF-ACT model of navigation Navigation & when to stop InfoClass Model of formation of mental categories of topic structure Applications New UIs (ScentTrails, Relevance Enhanced Thumbnails, ) Web Site Evaluation tools (Bloodhound, Lumberjack)

BACKUP SLIDES

Eye Movements: High Information Scent (Pirolli, Card, & Van Der Wege, 2003, TOCHI)

Eye Movements: Low Information Scent (Pirolli, Card, & Van Der Wege, 2003, TOCHI)

Information Scent and the Cost of Navigation (based on Hogg & Huberman, 1987) D = depth of search hierarchy z = average branching factor (1- q) = prob. of false alarm ( Pr[FA] ) µ(q, z) = qz =average no. branches explored A(U, q, z) = average no. nodes explored within distance U = (1- µ(q, z) U +1 )/(1-µ(q, z)) N(D, z, q) = average no. nodes examined before desired goal found [ ] = (z -1)q 2 D-1 Σ A(s -1, q, z) s=1

Prototype Formation If there are no existing categories, then create a new category (k New ) and assign instance P to the category, otherwise Determine the prob that the instance comes from a new category, Pr(k New P), and compare that to the existing category with the highest probability of including the instance, Pr(k Max P) Assign P to k New if Pr(k New P) > Pr(k Max P), else Assign P to k Max

Coupling Parameter (c) Number of items in category Pr( k) = cn ( 1 c) k + cn Number of items experienced k 1 c Pr( ) = New ( 1 c) + cn

Categorization Model Item = [ ] 0 0 1 0 2 w 1 w 2 w N. Series of multinomial trials = i i n n i p Z p p p 1 2 1 2 1 2 1 1 α α α α α α α ),, ( ),,,, Pr( Category = [ ] 1 1 4 1 5.....w 1 w 2 w N Vector of probabilities Dirichlet [ ] [ ] + + = = = = = j j i i i i i i i i i i n n p p p 0 0 0 1 1 α α α α α α α E E Expected value Noninformative prior Posterior distribution

Average Divergence from Average Entropy D( p q) = K x p( x) p( x)log q( w) D( pk p) k = ADA = 1 K p= [ p p p p ] 1 2 3 4 p5 [ p p p p ] [ p p p p ] [ p p p p ] 1 2 3 4 p5 1 2 3 4 p5 1 2 3 4 p5 p k

Category structure and coherence 120 100 Scatter/Gather User 1 Similarity Search User 1 450 400 350 Number of Category Members 80 60 40 Number of category Members 300 250 200 150 100 20 50 0 0 10 20 30 40 50 60 Rank Size of Category 0 0 5 10 15 20 25 30 Rank Size of Category SG SS ADA.949 1.292 x 10-6 bits