CS276A Text Information Retrieval, Mining, and Exploitation. Lecture 9 5 Nov 2002
2 Recap: Relevance Feedback Rocchio Algorithm: Typical weights: alpha = 8, beta = 64, gamma = 64 Tradeoff alpha vs beta/gamma: If we have a lot of judged documents, we want a higher beta/gamma. But we usually don't. 2
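The formula on this slide did not survive transcription. As a rough sketch only, the following assumes the standard Rocchio form (modified query = alpha * original query + beta * centroid of relevant docs - gamma * centroid of nonrelevant docs) with the weights quoted above as defaults. The function name `rocchio` and the dict-based term vectors are illustrative assumptions, not something taken from the lecture.

```python
from collections import defaultdict

def rocchio(query, relevant, nonrelevant, alpha=8.0, beta=64.0, gamma=64.0):
    """Return a modified query vector (term -> weight).

    query, and every document in relevant/nonrelevant, is a dict of
    term -> weight. The default weights are the ones quoted on the slide.
    """
    new_query = defaultdict(float)
    # alpha * original query
    for term, weight in query.items():
        new_query[term] += alpha * weight
    # + beta * centroid(relevant) - gamma * centroid(nonrelevant)
    for docs, coeff in ((relevant, beta), (nonrelevant, -gamma)):
        for doc in docs:
            for term, weight in doc.items():
                new_query[term] += coeff * weight / len(docs)
    # Assumption: terms whose weight ends up non-positive are dropped.
    return {t: w for t, w in new_query.items() if w > 0}

modified = rocchio(
    {"jaguar": 1.0},
    relevant=[{"jaguar": 1.0, "car": 1.0}],
    nonrelevant=[{"jaguar": 1.0, "cat": 1.0}],
)
print(modified)
```

With more judged documents the two centroids become more trustworthy, which is the motivation for raising beta and gamma relative to alpha.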
3 Pseudo Feedback [Diagram: initial query -> retrieve documents -> top k documents -> label top k docs relevant -> relevant documents -> apply relevance feedback -> back to retrieve documents] 3
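The pseudo-feedback loop can be sketched in a few lines: rank, assume the top k documents are relevant, fold them into the query, and rank again. Everything here is a toy assumption (dot-product scoring, dict term vectors, the names `score`, `rank`, `pseudo_feedback`, and the small `alpha`/`beta` defaults); it is not the system evaluated in the lecture.

```python
from collections import defaultdict

def score(query, doc):
    """Dot product between a query vector and a document vector."""
    return sum(w * doc.get(t, 0) for t, w in query.items())

def rank(query, docs):
    """Return doc ids ordered by score, best first."""
    return sorted(docs, key=lambda doc_id: score(query, docs[doc_id]), reverse=True)

def pseudo_feedback(query, docs, k=2, alpha=1.0, beta=0.5):
    """One round of pseudo feedback: treat the top k hits as relevant."""
    top_k = rank(query, docs)[:k]
    expanded = defaultdict(float)
    for t, w in query.items():
        expanded[t] += alpha * w
    # Add the centroid of the top k documents (no negative feedback here).
    for doc_id in top_k:
        for t, w in docs[doc_id].items():
            expanded[t] += beta * w / len(top_k)
    expanded = dict(expanded)
    return expanded, rank(expanded, docs)

docs = {
    "d1": {"jaguar": 2, "car": 1},
    "d2": {"jaguar": 1, "car": 1},
    "d3": {"jaguar": 1, "cat": 1},
    "d4": {"car": 2},
}
expanded, ranking = pseudo_feedback({"jaguar": 1.0}, docs)
print(expanded, ranking)
```

Note that no user judgment is involved, so a poor initial ranking can pull the expanded query toward the wrong topic.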
4 Pseudo-Feedback: Performance 4
5 Today's topics User Interfaces Browsing Visualization 5
6 The User in Information Access [Flowchart: User with an information need -> Find starting point -> Formulate/Reformulate query -> Send to system -> Receive results -> Explore results -> Done? (yes: Stop; no: back to Formulate/Reformulate)] 6
7 The User in Information Access [Same flowchart as slide 6, with the annotation "Focus of most IR!"] 7
8 Information Access in Context [Flowchart labels: User, High-Level Goal, Information Access, Analyze, Synthesize, Done? (yes: Stop; no: continue)] 8
9 The User in Information Access [Flowchart from slide 6, repeated] 9
10 Starting points Source selection: Highwire Press, Lexis-Nexis, Google! Overviews: Directories/hierarchies, Visual maps, Clustering 10
11 Highwire Press Source Selection 11
12 Hierarchical browsing Level 0 Level 1 Level 2 12
13 13
14 Visual Browsing: Themescape 14
15 Browsing [Figure: a starting point, a field of documents marked x, and the answer] Credit: William Arms, Cornell 15
16 Scatter/Gather Scatter/gather allows the user to find a set of documents of interest through browsing. Take the collection and scatter it into n clusters. Pick the clusters of interest and merge them. Iterate 16
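The scatter, pick, merge, iterate cycle can be sketched as below. This is a toy under stated assumptions: the clustering is a naive k-means with cosine similarity seeded by the first n documents, standing in for whatever clustering the real Scatter/Gather system uses, and the names `scatter` and `gather` are chosen here for illustration.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    num = sum(w * b.get(t, 0) for t, w in a.items())
    den = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return num / den if den else 0.0

def scatter(docs, n, rounds=5):
    """Partition doc ids into at most n clusters (toy k-means, cosine similarity)."""
    ids = list(docs)
    centroids = [Counter(docs[i]) for i in ids[:n]]  # naive seeding: first n docs
    clusters = []
    for _ in range(rounds):
        clusters = [[] for _ in centroids]
        for i in ids:
            best = max(range(len(centroids)), key=lambda c: cosine(docs[i], centroids[c]))
            clusters[best].append(i)
        # New centroid = summed term counts of the members (cosine ignores scale).
        centroids = [
            sum((docs[i] for i in members), Counter()) if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
    return [c for c in clusters if c]

def gather(clusters, chosen):
    """Merge the clusters the user picked into one sub-collection of doc ids."""
    return [i for c in chosen for i in clusters[c]]

docs = {
    "a": Counter("star planet orbit".split()),
    "b": Counter("film star actor".split()),
    "c": Counter("planet orbit moon".split()),
    "d": Counter("actor film award".split()),
}
clusters = scatter(docs, 2)          # scatter the collection
picked = gather(clusters, [0])       # user picks the astronomy cluster
sub = {i: docs[i] for i in picked}
print(clusters, scatter(sub, 2))     # iterate on the smaller collection
```

Each iteration works on a smaller collection, so the clusters become progressively more fine-grained.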
17 Scatter/Gather 17
18 Scatter/gather 18
19 How to Label Clusters Show titles of typical documents Titles are easy to scan Authors create them for quick scanning! But you can only show a few titles which may not fully represent cluster Show words/phrases prominent in cluster More likely to fully represent cluster Use distinguishing words/phrases But harder to scan 19
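One hedged way to pick "distinguishing" words is to favor terms that are frequent inside the cluster and rare outside it. The scoring below (cluster count squared divided by collection count) is an assumption made for illustration, as is the name `label_cluster`; the lecture does not specify a formula.

```python
from collections import Counter

def label_cluster(cluster_docs, all_docs, top=3):
    """Pick label terms for a cluster.

    cluster_docs and all_docs are lists of token lists; the cluster's
    documents are assumed to be included in all_docs.
    """
    in_cluster = Counter(t for d in cluster_docs for t in d)
    overall = Counter(t for d in all_docs for t in d)

    def score(term):
        # Count in the cluster, weighted by the share of the term's
        # occurrences that fall inside the cluster.
        return in_cluster[term] * in_cluster[term] / overall[term]

    # Sort by descending score, breaking ties alphabetically.
    return sorted(in_cluster, key=lambda t: (-score(t), t))[:top]

cluster = [["jaguar", "car", "engine"], ["car", "dealer", "engine"]]
collection = cluster + [["jaguar", "cat", "jungle"], ["car", "rental"], ["cat", "food"]]
print(label_cluster(cluster, collection, top=2))
```

Here "engine" outranks "car" because "car" also occurs outside the cluster, which is the difference between a prominent word and a distinguishing one.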
20 Visual Browsing: Hyperbolic Tree 20
21 Visual Browsing: Hyperbolic Tree 21
22 Study of Kohonen Feature Maps H. Chen, A. Houston, R. Sewell, and B. Schatz, JASIS 49(7) Comparison: Kohonen Map and Yahoo Task: Window shop for interesting home page Repeat with other interface Results: Starting with map could repeat in Yahoo (8/11) Starting with Yahoo unable to repeat in map (2/14) Credit: Marti Hearst 22
23 Study (cont.) Participants liked: Correspondence of region size to # documents Overview (but also wanted zoom) Ease of jumping from one topic to another Multiple routes to topics Use of category and subcategory labels Credit: Marti Hearst 23
24 Study (cont.) Participants wanted: hierarchical organization; other ordering of concepts (alphabetical); integration of browsing and search; correspondence of color to meaning; more meaningful labels; labels at same level of abstraction; fit more labels in the given space; combined keyword and category search; multiple category assignment (sports+entertain) Credit: Marti Hearst 24
25 Browsing Effectiveness depends on Starting point Ease of orientation (are similar docs close etc, intuitive organization) How adaptive system is Compare to physical browsing (library, grocery store) 25
26 Searching vs. Browsing Information need dependent Open-ended (find an interesting quote on the virtues of friendship) -> browsing Specific (directions to Pacific Bell Park) -> searching User dependent Some users prefer searching, others browsing (confirmed in many studies: some hate to type) You don't need to know vocabulary for browsing. System dependent (some web sites don't support search) Searching and browsing are often interleaved. 26
27 Searchers vs. Browsers 1/3 of users do not search at all 1/3 rarely search (or urls only) Only 1/3 understand the concept of search (ISP data from 2000) 27
28 Exercise Observe your own information seeking behavior WWW University library Grocery store Are you a searcher or a browser? How do you reformulate your query? Read bad hits, then minus terms Read good hits, then plus terms Try a completely different query 28
29 The User in Information Access [Flowchart from slide 6, repeated] 29
30 Query Specification Recall: Relevance feedback Query expansion Spelling correction Query-log mining based Interaction styles for query specification Queries on the Web Parametric search Term browsing 30
31 Query Specification: Interaction Styles Shneiderman 97 Command Language Form Fillin Menu Selection Direct Manipulation Natural Language Example: How does each apply to Boolean queries? Credit: Marti Hearst 31
32 Command-Based Query Specification Example: find pa shneiderman and tw user# (command: find; attributes: pa, tw; values: shneiderman, user#; connector: and) What are the attribute names? What are the command names? What are allowable values? Credit: Marti Hearst 32
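To make the command / attribute / value / connector structure concrete, here is a minimal parser sketch for the example above. It assumes single-token values and a hypothetical connector set (`and`, `or`, `not`); the real command language's grammar is not given on the slide, and `parse_command` is a name invented here.

```python
CONNECTORS = {"and", "or", "not"}  # assumed connector vocabulary

def parse_command(text):
    """Parse 'command attr value [connector attr value ...]' into a dict."""
    tokens = text.split()
    if not tokens:
        raise ValueError("empty command")
    command, rest = tokens[0], tokens[1:]
    clauses, connector = [], None
    i = 0
    while i < len(rest):
        token = rest[i].lower()
        if token in CONNECTORS:
            if connector is not None or not clauses:
                raise ValueError("unexpected connector: " + token)
            connector = token
            i += 1
            continue
        if i + 1 >= len(rest):
            raise ValueError("attribute without value: " + rest[i])
        if clauses and connector is None:
            raise ValueError("missing connector before: " + rest[i])
        clauses.append({"connector": connector, "attribute": rest[i], "value": rest[i + 1]})
        connector = None
        i += 2
    if connector is not None:
        raise ValueError("dangling connector: " + connector)
    return {"command": command, "clauses": clauses}

print(parse_command("find pa shneiderman and tw user#"))
```

Even this small grammar shows the usability problem raised on the slide: the user has to remember the command names, the attribute abbreviations, and the value syntax.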
33 Form-Based Query Specification (Altavista) Credit: Marti Hearst 33
34 Form-Based Query Specification (Melvyl) Credit: Marti Hearst 34
35 Form-Based Query Specification (Infoseek) Credit: Marti Hearst 35
36 Direct Manipulation Spec.: VQuery (Jones 98) Credit: Marti Hearst 36
37 Menu-based Query Specification (Young & Shneiderman 93) Credit: Marti Hearst 37
38 Query Specification/Reformulation A good user interface makes it easy for the user to reformulate the query Challenge: one user interface is not ideal for all types of information needs 38
39 Types of Information Needs Need answer to question (who won the game?) Re-find a particular document Find a good recipe for tonight's dinner Authoritative summary of information (HIV review) Exploration of new area (browse sites about Baja) 39
40 Queries on the Web Most Frequent on 2002/10/26 40
41 Queries on the Web (2000) 41
42 Intranet Queries (Aug 2000)
Count  Query
3351  bearfacts
3349  telebears
1909  extension
1874  schedule+of+classes
1780  bearlink
1737  bear+facts
1468  decal
1443  infobears
1227  calendar
989  career+center
974  campus+map
920  academic+calendar
840  map
773  bookstore
741  class+pass
738  housing
721  tele-bears
716  directory
667  schedule
627  recipes
602  transcripts
582  tuition
577  seti
563  registrar
550  info+bears
543  class+schedule
470  financial+aid
Source: Ray Larson 42
43 Intranet Queries Summary of sample data from 3 weeks of UCB queries
13.2%  Telebears/BearFacts/InfoBears/BearLink (12297)
6.7%  Schedule of classes or final exams (6222)
5.4%  Summer Session (5041)
3.2%  Extension (2932)
3.1%  Academic Calendar (2846)
2.4%  Directories (2202)
1.7%  Career Center (1588)
1.7%  Housing (1583)
1.5%  Map (1393)
Average query length over last 4 months: 1.8 words
This suggests what is difficult to find from the home page.
Source: Ray Larson 43
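Statistics like these come from simple aggregation over a query log. The sketch below is illustrative only: it uses five of the (count, query) pairs from the Aug 2000 list, treats "+" as the word separator, and merges spelling variants by stripping "+" and "-". The function name `summarize_log` and the merging rule are assumptions; the categories in the summary above were presumably built with more care.

```python
from collections import Counter

def summarize_log(entries):
    """entries: list of (count, query) pairs with '+' between words.

    Returns (average query length in words, counts per merged variant).
    """
    total = sum(count for count, _ in entries)
    words = sum(count * len(query.split("+")) for count, query in entries)
    variants = Counter()
    for count, query in entries:
        # Assumed normalization: bear+facts and bearfacts count as one query.
        variants[query.replace("+", "").replace("-", "")] += count
    return words / total, variants

sample = [
    (3351, "bearfacts"),
    (3349, "telebears"),
    (1874, "schedule+of+classes"),
    (1737, "bear+facts"),
    (721, "tele-bears"),
]
avg_len, merged = summarize_log(sample)
print(round(avg_len, 2), merged.most_common(2))
```

On this small sample the average is about 1.5 words per query; the 1.8 figure above was computed over a different, much larger log.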
44 Query Specification: Feast or Famine Feast Specifying a well targeted query is hard. Bigger problem for Boolean. Famine 44
45 Parametric search Each document has, in addition to text, some meta-data e.g., Language = French Format = pdf Subject = Physics etc. Date = Feb 2000 A parametric search interface allows the user to combine a full-text query with selections on these parameters e.g., language, date range, etc. 45
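A minimal sketch of the idea, assuming each document is a dict with `text` and `meta` fields and that a tuple parameter means an inclusive range. The function `parametric_search`, the field names, and the all-terms-must-match text test are illustrative choices, not a description of any particular product.

```python
from datetime import date

def parametric_search(docs, text_query, **params):
    """Return ids of docs that contain every query term and satisfy every parameter.

    A parameter value that is a (low, high) tuple is treated as an inclusive
    range; any other value must match the metadata field exactly.
    """
    terms = text_query.lower().split()
    hits = []
    for doc in docs:
        words = doc["text"].lower().split()
        if not all(t in words for t in terms):
            continue
        ok = True
        for field, wanted in params.items():
            value = doc["meta"].get(field)
            if isinstance(wanted, tuple):
                low, high = wanted
                if value is None or not (low <= value <= high):
                    ok = False
            elif value != wanted:
                ok = False
        if ok:
            hits.append(doc["id"])
    return hits

docs = [
    {"id": 1, "text": "Quantum physics lecture notes",
     "meta": {"language": "French", "format": "pdf", "date": date(2000, 2, 1)}},
    {"id": 2, "text": "Physics of sound",
     "meta": {"language": "English", "format": "pdf", "date": date(2000, 2, 15)}},
    {"id": 3, "text": "Physics primer",
     "meta": {"language": "French", "format": "html", "date": date(1999, 5, 1)}},
]
print(parametric_search(docs, "physics", language="French", format="pdf"))
print(parametric_search(docs, "physics", date=(date(2000, 1, 1), date(2000, 12, 31))))
```

In an interface, the keyword arguments correspond to the drop-down or range selections shown next to the text box.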
46 Parametric search example 46
47 Parametric search example 47
48 Interfaces for term browsing 48
49 49
50 The User in Information Access [Flowchart from slide 6, repeated] 50
51 Explore Results Determine: Do these results answer my question? Summarization More generally: provide context Hypertext navigation: Can I find the answer by following a link? Browsing and clustering (again) Browse to explore results 51
52 Explore Results: Context We can't present complete documents in the result set: too much information. Present information about each doc Must be concise (so we can show many docs) Must be informative Typical information about each document Summary Context of query words Meta data: date, author, language, file name/url Context of document in collection Information about structure of document 52
53 Context in Collection: Cha-Cha 53
54 Category Labels Advantages: Interpretable Capture summary information Describe multiple facets of content Domain dependent, and so descriptive Disadvantages: Do not scale well (for organizing documents) Domain dependent, so costly to acquire May mismatch users' interests Credit: Marti Hearst 54
55 Evaluate Results Context in Hierarchy: Cat-a-Cone 55
56 Explore Results: Summarization Query-dependent summarization KWIC (keyword in context) lines (à la Google) Query-independent summarization Summary written by author (if available) Exploit genre (news stories) Sentence extraction Natural language generation 56
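A KWIC line is just a window of words around each query-term occurrence. The sketch below assumes whitespace tokenization and a fixed window; the name `kwic` and the `width` parameter are illustrative, and production snippet generators do considerably more (sentence boundaries, merging overlapping windows, highlighting).

```python
def kwic(text, query_terms, width=3):
    """Return one keyword-in-context line per query-term occurrence."""
    words = text.split()
    wanted = {t.lower() for t in query_terms}
    lines = []
    for i, word in enumerate(words):
        # Compare case-insensitively, ignoring surrounding punctuation.
        if word.lower().strip(".,;:!?\"'()") in wanted:
            left = max(0, i - width)
            lines.append(" ".join(words[left:i + width + 1]))
    return lines

text = "The hyperbolic tree lets users browse a large hierarchy by dragging nodes."
print(kwic(text, ["hierarchy"], width=2))
```

Because the windows depend on the query, the same document gets a different summary for each query, which is what makes this query-dependent summarization.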
57 Evaluate Results Structure of document: SeeSoft 57
58 Personalization [Diagram: Outride Personalized Search System. Labels: User, Query, Query Augmentation, Result Processing, Result Set, Intranet Search, Web Search, Interests, Demographics, Click Stream, Search History, Application Usage] 58
59 59
60 How Long to Get an Answer? [Chart comparing Outride, Google, Yahoo!, Excite, and AOL; values not recoverable] 60
61 Table 1. User actions study results. Columns: Search Engine, User Actions, Difference (%). Rows: Outride (11.2), Google, Yahoo!, AOL, Excite, Average; the remaining values are missing from the transcription.
Table 2. Overall timing results (in seconds, with placement in parentheses).
Engine    Expert time (rank)   Novice time (rank)   Average (rank)   % Difference
Outride   32.8 (1)             45.1 (1)             38.9 (1)         0%
AOL       92.3 (5)             87.0 (4)             89.6 (5)         130.2%
Excite    75.7 (3)             91.3 (5)             83.5 (4)         114.5%
Google    72.5 (2)             78.4 (3)             75.4 (2)         93.7%
Yahoo!    85.1 (4)             76.9 (2)             81.0 (3)         107.9%
61
62 [Chart: Outride vs. others, for novice and expert users; values not recoverable] 62
63 Performance of Interactive Retrieval 63
64 Boolean Queries: Interface Issues Boolean logic is difficult for the average user. Much research was done on interfaces facilitating the creation of Boolean queries by non-experts. Much of this research was made obsolete by the web. Current view is that non-expert users are best served with non-Boolean or simple +/- Boolean (pioneered by AltaVista). But Boolean queries are the standard for certain groups of expert users (e.g., lawyers). 64
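The simple +/- style can be illustrated with a small matcher. The semantics assumed here are: "+term" must appear, "-term" must not appear, and bare terms are optional, with at least one required to match only when no "+" terms are given. That rule, and the names `parse_plus_minus` and `matches`, are assumptions for the sketch rather than a specification of AltaVista's behavior.

```python
def parse_plus_minus(query):
    """Split a query into (required, excluded, optional) term lists."""
    required, excluded, optional = [], [], []
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            optional.append(token)
    return required, excluded, optional

def matches(doc_text, query):
    """True if the document satisfies the +/- query under the assumed semantics."""
    required, excluded, optional = parse_plus_minus(query)
    words = set(doc_text.lower().split())
    if any(t not in words for t in required):
        return False
    if any(t in words for t in excluded):
        return False
    # With no required terms, at least one optional term must match (if any exist).
    return bool(required) or not optional or any(t in words for t in optional)

print(matches("jaguar car dealer", "+jaguar -cat car"))
print(matches("jaguar cat jungle", "+jaguar -cat"))
```

Compared with full Boolean syntax there is no nesting and no operator precedence to learn, which is the point made above about non-expert users.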
65 User Interfaces: Other Issues Technical HCI issues How to use screen real estate One monolithic window or many? Undo operator Give access to history Alternative interfaces for novice/expert users Disabilities 65
66 Take-Away Don't ignore the user in information retrieval. Finding matching documents for a query is only part of information access and knowledge work. In addition to core information retrieval, information access interfaces need to support Finding starting points Formulation/reformulation of queries Exploring/evaluating results 66
67 Exercise Current information retrieval user interfaces are designed for typical computer screens. How would you design a user interface for a wallsize screen? 67
68 Resources MIR Ch Donna Harman, Overview of the fourth text retrieval conference (TREC 4), National Institute of Standards and Technology. Cutting, Karger, Pedersen, Tukey. Scatter/Gather. ACM SIGIR. Hearst, Cat-a-cone, an interactive interface for specifying searches and viewing retrieval results using a large category hierarchy, ACM SIGIR. 68
Elementary IR: Scalable Boolean Text Search (Compare with R & G 27.1-3) Information Retrieval: History A research field traditionally separate from Databases Hans P. Luhn, IBM, 1959: Keyword in Context
More informationCLARIT Compound Queries and Constraint-Controlled Feedback in TREC-5 Ad-Hoc Experiments
CLARIT Compound Queries and Constraint-Controlled Feedback in TREC-5 Ad-Hoc Experiments Natasa Milic-Frayling 1, Xiang Tong 2, Chengxiang Zhai 2, David A. Evans 1 1 CLARITECH Corporation 2 Laboratory for
More informationText Analytics (Text Mining)
CSE 6242 / CX 4242 Apr 1, 2014 Text Analytics (Text Mining) Concepts and Algorithms Duen Horng (Polo) Chau Georgia Tech Some lectures are partly based on materials by Professors Guy Lebanon, Jeffrey Heer,
More informationInformation Retrieval CSCI
Information Retrieval CSCI 4141-6403 My name is Anwar Alhenshiri My email is: anwar@cs.dal.ca I prefer: aalhenshiri@gmail.com The course website is: http://web.cs.dal.ca/~anwar/ir/main.html 5/6/2012 1
More informationInformation Behavior in Digital Age (III): Related Research
Information Behavior in Digital Age (III): Related Research Invited Lectures on Information Behaviors 國立政治大學圖書資訊與檔案學研究所 Peiling Wang, Ph.D. November 28, 2013 Use of Digital Information Resources & Internet
More informationDocument Clustering for Mediated Information Access The WebCluster Project
Document Clustering for Mediated Information Access The WebCluster Project School of Communication, Information and Library Sciences Rutgers University The original WebCluster project was conducted at
More informationInformation Retrieval. Lecture 9 - Web search basics
Information Retrieval Lecture 9 - Web search basics Seminar für Sprachwissenschaft International Studies in Computational Linguistics Wintersemester 2007 1/ 30 Introduction Up to now: techniques for general
More informationInformation Networks. Hacettepe University Department of Information Management DOK 422: Information Networks
Information Networks Hacettepe University Department of Information Management DOK 422: Information Networks Search engines Some Slides taken from: Ray Larson Search engines Web Crawling Web Search Engines
More informationEVALUATION OF PROTOTYPES USABILITY TESTING
EVALUATION OF PROTOTYPES USABILITY TESTING CPSC 544 FUNDAMENTALS IN DESIGNING INTERACTIVE COMPUTATIONAL TECHNOLOGY FOR PEOPLE (HUMAN COMPUTER INTERACTION) WEEK 9 CLASS 17 Joanna McGrenere and Leila Aflatoony
More informationINFSCI 2140 Information Storage and Retrieval Lecture 1: Introduction. INFSCI 2140 and your program
INFSCI 2140 Information Storage and Retrieval Lecture 1: Introduction Peter Brusilovsky http://www2.sis.pitt.edu/~peterb/2140-051/ INFSCI 2140 and your program Foundation course One of the key courses
More informationSearch Engine Architecture II
Search Engine Architecture II Primary Goals of Search Engines Effectiveness (quality): to retrieve the most relevant set of documents for a query Process text and store text statistics to improve relevance
More informationAn Introduction to Search Engines and Web Navigation
An Introduction to Search Engines and Web Navigation MARK LEVENE ADDISON-WESLEY Ал imprint of Pearson Education Harlow, England London New York Boston San Francisco Toronto Sydney Tokyo Singapore Hong
More informationStudent Usability Project Recommendations Define Information Architecture for Library Technology
Student Usability Project Recommendations Define Information Architecture for Library Technology Erika Rogers, Director, Honors Program, California Polytechnic State University, San Luis Obispo, CA. erogers@calpoly.edu
More informationQuery Modifications Patterns During Web Searching
Bernard J. Jansen The Pennsylvania State University jjansen@ist.psu.edu Query Modifications Patterns During Web Searching Amanda Spink Queensland University of Technology ah.spink@qut.edu.au Bhuva Narayan
More informationIHS Standards Expert FAQs
IHS Standards Expert FAQs New IHS Standards Expert FAQs based on Customer First surveys How can I find out what is part of my subscription? How do I know what I have access to? There are two easy ways
More information<is web> Information Systems & Semantic Web University of Koblenz Landau, Germany
Information Systems & University of Koblenz Landau, Germany Semantic Search examples: Swoogle and Watson Steffen Staad credit: Tim Finin (swoogle), Mathieu d Aquin (watson) and their groups 2009-07-17
More informationWeb Search. Lecture Objectives. Text Technologies for Data Science INFR Learn about: 11/14/2017. Instructor: Walid Magdy
Text Technologies for Data Science INFR11145 Web Search Instructor: Walid Magdy 14-Nov-2017 Lecture Objectives Learn about: Working with Massive data Link analysis (PageRank) Anchor text 2 1 The Web Document
More informationSearch Engine Architecture. Hongning Wang
Search Engine Architecture Hongning Wang CS@UVa CS@UVa CS4501: Information Retrieval 2 Document Analyzer Classical search engine architecture The Anatomy of a Large-Scale Hypertextual Web Search Engine
More informationSearching for Information
Searching for Information INFO/CSE100, Spring 2006 Fluency in Information Technology http://www.cs.washington.edu/100 Apr-10-06 searching @ university of washington 1 Readings and References Reading Fluency
More informationIntroduction to Information Retrieval
Introduction Inverted index Processing Boolean queries Course overview Introduction to Information Retrieval http://informationretrieval.org IIR 1: Boolean Retrieval Hinrich Schütze Institute for Natural
More informationMining Web Data. Lijun Zhang
Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems
More information