ITERATIVE SEARCHING IN AN ONLINE DATABASE. Susan T. Dumais and Deborah G. Schmitt Cognitive Science Research Group Bellcore Morristown, NJ
ABSTRACT

An experiment examined how people use an online retrieval system. Subjects solved general topical search problems using a database containing the full text of news articles (e.g., find articles about the "Background of the new prime minister of Great Britain"). Time, accuracy and content of the searches were recorded. Of particular interest was the use of two iterative search methods available in the interface - a Lookup function that allowed users to explicitly specify an alternative query; and a LikeThese function that could be used to automatically generate a new query using articles the user marked as relevant. Results showed that subjects could easily use both query reformulation methods. Subjects generated much more effective LikeThese searches than Lookup searches. An analysis of individual subject differences suggests that the LikeThese method is more accessible to a wide range of users.

Figure 1. Example of InfoSearch interface. Response of the system to the search problem "Leaders who figure in discussions of the future of the West German chancellorship" is shown.

Figure 2. Examples of InfoSearch iterative search functions: (a) Lookup search; (b) LikeThese search. Only the List of Documents and Lookup Windows are shown. Panel (a) shows use of the Lookup function to enter the query "new west german chancellor"; Panel (b) shows a LikeThese search using documents 81 and 103.
INTRODUCTION

This paper describes an experiment examining how people use an online retrieval system. The InfoSearch interface (Dumais & Littman, 1990) was used to present a textual database of news articles to users. This interface incorporates features that have been shown to improve retrieval performance in simulations (e.g., Latent Semantic Indexing and iterative query specification). The experiment examines the extent to which these methods are effective in practice. Of particular interest are the strategies people use for modifying their initial requests. Before describing the experiment and results, we briefly review the Latent Semantic Indexing retrieval method and describe the InfoSearch interface.

Latent Semantic Indexing (LSI)

Latent Semantic Indexing (LSI) is a method that can improve people's access to textual information (Deerwester, et al., 1990; Dumais, et al., 1988). Most textual retrieval systems operate by matching words in users' queries with words in database objects. Because of the tremendous variability in the words people use to describe objects or topics of interest, word-matching methods are far from perfect. The fact that the same word can be used to refer to many different things means that irrelevant objects will be retrieved (e.g., the word "mouse" means different things in different contexts). Conversely, the fact that different authors use different words to describe essentially the same idea means that many relevant objects will be missed (e.g., articles about mice, track balls, and pointing devices might also be relevant to someone asking about a mouse). LSI tries to overcome these problems by using statistical methods to model the associations of terms and objects, and to automatically construct a "semantic" space more appropriate for
information retrieval. LSI provides several advantages over standard word-matching methods. First, LSI allows objects which share no words with a user's query to be quite similar to it, resulting in up to 30% improvement in retrieval performance. Second, in response to a query, LSI returns a list of all objects ranked from most similar to least similar, allowing the user to view as many as necessary for a particular task. Finally, since both terms and text objects are represented in the LSI space, any combination of words and objects can be used as a query.

InfoSearch retrieval interface

The InfoSearch Retrieval Interface is a program that allows users to see the results of an LSI search and to interactively specify new queries (Dumais & Littman, 1990; also see METHOD section below). Multiple tiled windows allow users to browse brief titles, to view the full text of selected objects, and to construct queries. Users specify initial queries by typing. A rank-ordered list of objects (based on LSI-matching) is returned. InfoSearch also provides two mechanisms for iterative query formulation. A Lookup function can be used to explicitly specify an alternative query. Essentially, this lets users try again. There is little data on the effectiveness of this method, although it is generally assumed that users can use the results of previous searches to modify subsequent attempts. In addition, a LikeThese function can be used to automatically generate a new query using the full text of objects the user has marked as relevant. If some of the initial responses are on the right track, users mark them and ask the system to find more "like these". Information retrieval simulations and psychological theory suggest that this so-called relevance feedback can improve users' ability to find relevant objects by 60% or more (Oddy, 1977; Salton & Buckley, 1990; Stanfill & Kahle, 1986; Williams, 1984).
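The first LSI advantage described above (a query matching an object with which it shares no words) can be illustrated with a small sketch. This is not the InfoSearch implementation: the toy vocabulary, the four tiny documents, the choice of k = 2, and the helper names `term_doc_matrix`, `lsi_fit`, `fold_in` and `cosine_scores` are all assumptions made for illustration, and the sketch relies on NumPy's `numpy.linalg.svd` for the truncated singular value decomposition. Placing the query at U_k^T q scaled by the inverse singular values is one common convention, not necessarily the one used in the experiment.

```python
import numpy as np

# Hypothetical toy vocabulary and documents (not from the news database).
TERMS = ["mouse", "trackball", "pointing", "chancellor", "germany"]
DOCS = [
    "mouse pointing",
    "trackball pointing",   # shares no word with the query "mouse"
    "chancellor germany",
    "chancellor germany",
]

def term_doc_matrix(docs, terms):
    """Raw term-by-document count matrix (rows = terms, columns = documents)."""
    X = np.zeros((len(terms), len(docs)))
    for j, text in enumerate(docs):
        for word in text.split():
            X[terms.index(word), j] += 1.0
    return X

def lsi_fit(X, k):
    """Keep the k largest singular triplets of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :].T  # rows of the last item are documents

def fold_in(query, terms, U_k, s_k):
    """Place a free-text query in the reduced space as a pseudo-document."""
    q = np.zeros(len(terms))
    for word in query.split():
        if word in terms:
            q[terms.index(word)] += 1.0
    return (U_k.T @ q) / s_k

def cosine_scores(q_hat, doc_coords):
    """Cosine between the query and every document (assumes non-zero vectors)."""
    norms = np.linalg.norm(doc_coords, axis=1) * np.linalg.norm(q_hat)
    return doc_coords @ q_hat / norms

X = term_doc_matrix(DOCS, TERMS)
U_k, s_k, doc_coords = lsi_fit(X, k=2)
sims = cosine_scores(fold_in("mouse", TERMS, U_k, s_k), doc_coords)
ranking = np.argsort(-sims)  # most similar document first
print(np.round(sims, 3), ranking)
```

In this toy collection the query "mouse" is as similar to the "trackball pointing" document as to the document that actually contains the word, because both documents co-occur with "pointing"; plain word matching would give that document a score of zero.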
METHOD

Design

Fifty-seven college students took part in the experiment. The database was a collection of the full text of several hundred international news articles from 1963 often used in information science research. There were three experimental conditions designed to manipulate the search strategies subjects used. In the Lookup condition, subjects were encouraged to use the Lookup function to find additional articles. In the LikeThese condition, subjects were encouraged to use the LikeThese function. And, in the Both condition, both search strategies were given equal emphasis during training. Subjects were free to use either method at any time after training.

Procedure

Subjects were taught to use the InfoSearch interface and practiced on a small collection of information science articles. They were then given ten topical search problems that could be answered using the news database, and asked to find as many articles as they could that were relevant to each question. The questions were general topical searches - e.g., find articles about the "Background of the new prime minister of Great Britain" or find articles about the "Leaders who figure in discussions of the future of the West German chancellorship".
At the beginning of each search problem, the display was initialized to what it would have looked like if subjects had literally typed the search problem as a query. Subjects searched until they thought they had found all relevant articles. The experiment was self-paced, with the average subject completing the experiment in three hours. All keystrokes were collected by the InfoSearch program. Measures of primary interest included problem solving time, accuracy and the content of subjects' searches. On a separate day, demographic and technical aptitude information about the subjects was collected.

Interface

A screen dump of the InfoSearch retrieval interface is presented in Figure 1. This example shows the system's response to the query: "Leaders who figure in discussions of the future of the West German chancellorship". There are four main windows in the experimental system. (1) The List of Documents Window (upper left) displays a list of the titles of articles that best match the query. Articles are ranked from most to least similar to the query. The numbers at the far left (e.g., 0.84) are the LSI-based similarity between the query and each article. These numbers can range from 1.00 (indicating a perfect match between query and article) to The numbers in parentheses (e.g., 266) are article identification numbers. The scroll bar can be used to display the titles of additional articles. (2) The full text of the first article is shown in the
large Page of Text Window (upper right). The full text of other articles can be displayed by pointing to the corresponding article in the List of Documents Window, or by scrolling through the text in the Text Window until the next article appears. (3) Queries are entered in the Lookup Window (bottom left). InfoSearch provides two mechanisms for query formulation - the Lookup and LikeThese buttons at the bottom of the window. When users select the Lookup button, a query window appears and they can enter any query by typing (Figure 2a). Alternatively, users can use the LikeThese function to search for additional articles. If some articles contain relevant information users can mark them and ask the system to find more articles "like these" (Figure 2b). In this case, articles 81 and 103 are marked as relevant. The system automatically constructs a query using the full text of these articles when the LikeThese function is selected. All previous queries are saved in the Lookup Window and users can easily re-execute them. Note that the query #31 is a shorthand for the search problem "Leaders who figure in discussions of the future of the West German chancellorship". (4) The Experimental Control Window (lower right) is used to present search problems to subjects and to collect their responses.
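The difference between the two reformulation mechanisms can be sketched in a few lines. The example below is a simplified illustration and not the InfoSearch code: it scores articles with plain word-count cosine similarity instead of LSI, and the article texts, the extra identifiers 200 and 300, and the function names `lookup` and `like_these` are all hypothetical. Only the idea is taken from the text: a Lookup query is whatever the user types, while a LikeThese query is built from the full text of the marked articles (here 81 and 103).

```python
import math
from collections import Counter

# Hypothetical article texts keyed by article identification number.
ARTICLES = {
    81: "erhard expected to succeed adenauer as west german chancellor",
    103: "adenauer retirement opens bonn leadership contest erhard favored",
    200: "bonn leadership contest erhard gains support",
    300: "new british prime minister named",
}

def vectorize(text):
    """Bag-of-words vector: word -> count."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def lookup(query_text, articles):
    """Rank all articles against a typed query, best match first."""
    q = vectorize(query_text)
    scored = [(doc_id, cosine(q, vectorize(text))) for doc_id, text in articles.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def like_these(marked_ids, articles):
    """Build a query from the full text of the marked articles and rank the rest."""
    query_text = " ".join(articles[doc_id] for doc_id in marked_ids)
    ranked = lookup(query_text, articles)
    return [(doc_id, score) for doc_id, score in ranked if doc_id not in marked_ids]

lookup_scores = dict(lookup("new west german chancellor", ARTICLES))
like_these_ranking = like_these([81, 103], ARTICLES)
print(lookup_scores)
print(like_these_ranking)
```

With these made-up texts, the four-word Lookup query gives article 200 a score of zero because they share no words, whereas the much longer LikeThese query built from articles 81 and 103 ranks article 200 first.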
RESULTS

Search strategies

Subjects in all conditions could easily use both query reformulation methods. On average, subjects tried more than four searches (Lookup or LikeThese) in addition to the original problem statement to answer each question. The experimental manipulation was effective in influencing the search strategies subjects used. The ratio of the number of LikeThese searches to the total number of searches was .62 in the LikeThese condition, .52 in the Both condition, and .26 in the Lookup condition (F(2,54) = 18.2; p < .001).

Effectiveness of searches

Analyses were performed using answers provided by outside judges as target responses for each search problem. Subjects' answers were compared with the judges' "correct" answers. The proportion correct, the number of intrusions, and total time all favored the LikeThese condition, although none of the differences was statistically reliable. It is important to note, however, that since subjects were free to use either search method at any time, this is a very weak comparison.
A more sensitive measure of performance can be obtained by separately examining the quality of Lookup and LikeThese searches independent of condition. Because subjects generated several searches in solving each problem, it is difficult to know which particular searches resulted in which final answers. To examine the effectiveness of each search, we simply calculate the number of relevant articles in the top 10. That is, for each of the 10 search problems, we look at the articles returned in response to each Lookup and each LikeThese search and count the number of relevant articles among the first 10 articles returned. Table 1 summarizes the results of this analysis. On average subjects try more Lookup searches (2.5) than LikeThese searches (1.9). Lookup searches are, however, generally much less effective than LikeThese searches. The average Lookup search results in fewer relevant articles (3.3) than the average LikeThese search (4.4) - F(1,9) = 27.8, p < .001. Similar advantages for LikeThese searches are obtained for the best and worst queries generated by each subject for each search problem - F(1,9) = 9.2, p < .014; F(1,9) = 56.5, p < .001 for the best and worst queries, respectively. The best Lookup search returns the same number of relevant articles as the worst LikeThese search. It is also interesting that only LikeThese searches reliably improve on performance obtained using the original problem statement as a query. The single best LikeThese search, for example, results in a 37% improvement over the original problem statement. Finally, we note that the average number of relevant articles is 6.8, so there is still room for improvement relative to even the best LikeThese searches, which return 4.7 relevant articles among the top 10.
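The effectiveness measure described above (the number of judged-relevant articles among the first 10 returned) is simple to state in code. The sketch below assumes each search is available as a ranked list of article numbers and the judges' answers as a set; the article numbers, the two example searches, and the function names are made up for illustration and are not data from the experiment.

```python
def relevant_in_top_k(ranked_ids, relevant_ids, k=10):
    """Count judged-relevant articles among the first k articles returned."""
    relevant = set(relevant_ids)
    return sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)

def summarize(counts):
    """Average, best and worst count over one subject's searches for a problem."""
    return {"avg": sum(counts) / len(counts), "best": max(counts), "worst": min(counts)}

def percent_improvement(baseline, value):
    """Relative improvement of value over baseline, in percent."""
    return 100.0 * (value - baseline) / baseline

# Hypothetical judged-relevant set and two hypothetical searches for one problem.
JUDGED = {81, 103, 266, 12}
SEARCHES = [
    [266, 5, 81, 7, 8, 9, 10, 11, 13, 14, 103],  # article 103 is at rank 11, so not counted
    [81, 103, 266, 12, 1, 2, 3],
]

counts = [relevant_in_top_k(search, JUDGED) for search in SEARCHES]
summary = summarize(counts)
print(counts, summary, percent_improvement(counts[0], counts[1]))
```

The same `summarize` step, applied per subject and per problem, yields the average, best and worst values that the analysis compares between Lookup and LikeThese searches.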
Table 1. Effectiveness of Lookup vs. LikeThese searches - number of relevant articles in the top 10 articles returned. (Columns: original problem statement; users' iterative "Lookup" searches; users' iterative "LikeThese" searches. Rows: number of searches; number relevant in top 10 - avg, best, worst.)

These results confirm informal observations that subjects find it difficult to generate effective Lookup search queries. This is in spite of the fact that InfoSearch is an interactive retrieval system in which results of previous
search attempts could be used to modify subsequent searches. We believe that part of the problem results from the fact that users typically generate short queries (an average of 3 words per Lookup search). Given the variability in the way different authors describe the same topic, many relevant articles will be missed with such short queries. The LikeThese method, on the other hand, provides users with an easy way to construct what is in effect a very rich query (the system automatically constructs a query using the full text of the selected articles), and this appears to be necessary for success.

Individual differences

There were large and interesting individual differences in performance in the experiment. For most dependent variables, a range of about 4:1 was observed between the best and worst subject. In general, technical aptitudes and background variables did not reliably predict performance, suggesting that success with the InfoSearch interface is not limited to people with high aptitudes or particular kinds of previous experience. (See Egan, 1989, for a review of other retrieval interfaces that require specific technical aptitudes or background characteristics for success.) LikeThese searches were particularly effective for most people, regardless of aptitude. For subjects who used LikeThese searches more frequently than Lookup searches (n = 27), performance was predicted only by reading ability, and this is not surprising since they had to read the articles to answer the search problems. For subjects who preferred Lookup searches (n = 27), performance depended on verbal fluency and spatial ability, as well as reading ability. This pattern suggests that additional verbal and spatial abilities may be required when subjects must explicitly generate alternative queries.
Figure 3 shows the average time per correct response plotted as a function of "associational fluency" for subjects who prefer to use Lookup searches (top curve), and for subjects who prefer to use LikeThese searches (bottom curve). Associational fluency is a composite factor reflecting the ability to quickly generate words that are semantically or phonemically related to target words (as measured by the Associational Fluency and Word Fluency tests from Ekstrom et al., 1976). This factor does not reflect general reading comprehension or vocabulary. The lines depict the regression of time per correct response on
associational fluency. The difference between the simple correlations for these two groups is reliable (z = 1.95; p = .05). For subjects who prefer Lookup searches, performance depends on associational fluency - subjects with low fluency scores take 50% longer to find articles than subjects with high fluency scores. For subjects who prefer LikeThese searches, performance is independent of fluency and is generally better. This pattern suggests that LikeThese searches can be used more effectively by more people than Lookup searches.
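The text does not say which test produced the reported z for the difference between the two groups' correlations. One standard way to compare correlations from two independent groups is Fisher's r-to-z transformation, sketched below as an assumption rather than as the authors' procedure. The correlation values passed to `fisher_z_difference` are hypothetical (the paper does not report them); only the group sizes (n = 27 each) and the reported z = 1.95 come from the text.

```python
import math

def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher r-to-z transform
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / standard_error

def two_tailed_p(z):
    """Two-tailed p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical correlations between fluency and time per correct response.
z_example = fisher_z_difference(r1=-0.5, n1=27, r2=0.0, n2=27)
p_reported = two_tailed_p(1.95)  # p-value implied by the reported z
print(round(z_example, 3), round(p_reported, 4))
```

Under this reading, a z of 1.95 corresponds to a two-tailed p of about .051, which agrees with the reported p = .05.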
Figure 3. Mean time per correct response is plotted as a function of "associational fluency" for subjects who prefer Lookup searches (top curve), and for those subjects who prefer LikeThese searches (bottom curve).

DISCUSSION

The InfoSearch interface to textual databases appears to be easy to use for novice searchers. Subjects use both available iterative retrieval mechanisms (Lookup and LikeThese). They are much more effective using LikeThese to find additional relevant articles than they are at explicitly constructing their own alternative searches using Lookup. These results support previous theoretical and simulation results suggesting that relevance feedback methods should improve performance (e.g., Oddy, 1977; Salton & Buckley, 1990; Stanfill & Kahle, 1986; Williams, 1984). In addition, an analysis of individual subject differences suggests that the LikeThese method may be more accessible to a wider range of users.
REFERENCES

[1] Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., and Harshman, R. A. Indexing by latent semantic analysis. Journal of the American Society for Information Science, 1990, 41(6).
[2] Dumais, S. T. and Littman, M. L. InfoSearch: A program for iterative retrieval using Latent Semantic Indexing. Poster presented at CHI '90.
[3] Dumais, S. T., Furnas, G. W., Landauer, T. K., and Deerwester, S. Using latent semantic analysis to improve information retrieval. In CHI '88 Proceedings, 1988.
[4] Egan, D. E. Individual differences in human-computer interaction. In: M. Helander (Ed.), Handbook of Human-Computer Interaction, Elsevier Science Publishers (North-Holland), 1988.
[5] Ekstrom, R. B., French, J. W., Harman, H. H., and Dermen, D. Manual for Kit of Factor-Referenced Cognitive Tests. Princeton, NJ: Educational Testing Service.
[6] Oddy, R. N. Information retrieval through man-machine dialogue. Journal of Documentation, 1977, 33.
[7] Salton, G. and Buckley, C. Improving retrieval performance by relevance feedback. Journal of the American Society for Information Science, 1990, 41(4).
[8] Stanfill, C. and Kahle, B. Parallel free-text search on the connection machine system. Communications of the ACM, 1986, 29(12).
[9] Williams, M. D. What makes RABBIT run? International Journal of Man-Machine Studies, 1984, 21.
More informationExamining the Authority and Ranking Effects as the result list depth used in data fusion is varied
Information Processing and Management 43 (2007) 1044 1058 www.elsevier.com/locate/infoproman Examining the Authority and Ranking Effects as the result list depth used in data fusion is varied Anselm Spoerri
More informationShedding Light on the Graph Schema
Shedding Light on the Graph Schema Raj M. Ratwani (rratwani@gmu.edu) George Mason University J. Gregory Trafton (trafton@itd.nrl.navy.mil) Naval Research Laboratory Abstract The current theories of graph
More informationDynamic Visualization of Hubs and Authorities during Web Search
Dynamic Visualization of Hubs and Authorities during Web Search Richard H. Fowler 1, David Navarro, Wendy A. Lawrence-Fowler, Xusheng Wang Department of Computer Science University of Texas Pan American
More informationDeveloping a Test Collection for the Evaluation of Integrated Search Lykke, Marianne; Larsen, Birger; Lund, Haakon; Ingwersen, Peter
university of copenhagen Københavns Universitet Developing a Test Collection for the Evaluation of Integrated Search Lykke, Marianne; Larsen, Birger; Lund, Haakon; Ingwersen, Peter Published in: Advances
More informationDetecting and Analyzing Communities in Social Network Graphs for Targeted Marketing
Detecting and Analyzing Communities in Social Network Graphs for Targeted Marketing Gautam Bhat, Rajeev Kumar Singh Department of Computer Science and Engineering Shiv Nadar University Gautam Buddh Nagar,
More informationClustered SVD strategies in latent semantic indexing q
Information Processing and Management 41 (5) 151 163 www.elsevier.com/locate/infoproman Clustered SVD strategies in latent semantic indexing q Jing Gao, Jun Zhang * Laboratory for High Performance Scientific
More informationDetection of Web-Site Usability Problems: Empirical Comparison of Two Testing Methods
Detection of Web-Site Usability Problems: Empirical Comparison of Two Testing Methods Mikael B. Skov and Jan Stage Department of Computer Science Aalborg University Fredrik Bajers Vej 7 9220 Aalborg East,
More informationMultivariate Data & Tables and Graphs. Agenda. Data and its characteristics Tables and graphs Design principles
Topic Notes Multivariate Data & Tables and Graphs CS 7450 - Information Visualization Aug. 27, 2012 John Stasko Agenda Data and its characteristics Tables and graphs Design principles Fall 2012 CS 7450
More informationChapter 6: Information Retrieval and Web Search. An introduction
Chapter 6: Information Retrieval and Web Search An introduction Introduction n Text mining refers to data mining using text documents as data. n Most text mining tasks use Information Retrieval (IR) methods
More informationHeuristic Evaluation of Groupware. How to do Heuristic Evaluation of Groupware. Benefits
Kimberly Tee ketee@ucalgary.ca CPSC 681 Topic Heuristic Evaluation of Groupware Heuristic evaluation [9] is a discount evaluation method for finding usability problems in a singleuser interface design.
More informationA Study for Documents Summarization based on Personal Annotation
A Study for Documents Summarization based on Personal Annotation Haiqin Zhang University of Science and Technology of China face@mail.ustc.edu. cn ZhengChen Wei-yingMa Microsoft Research Asia zhengc@microsoft.com
More informationDeep Character-Level Click-Through Rate Prediction for Sponsored Search
Deep Character-Level Click-Through Rate Prediction for Sponsored Search Bora Edizel - Phd Student UPF Amin Mantrach - Criteo Research Xiao Bai - Oath This work was done at Yahoo and will be presented as
More informationMultivariate Data & Tables and Graphs. Agenda. Data and its characteristics Tables and graphs Design principles
Multivariate Data & Tables and Graphs CS 7450 - Information Visualization Aug. 24, 2015 John Stasko Agenda Data and its characteristics Tables and graphs Design principles Fall 2015 CS 7450 2 1 Data Data
More informationNPTEL Computer Science and Engineering Human-Computer Interaction
M4 L5 Heuristic Evaluation Objective: To understand the process of Heuristic Evaluation.. To employ the ten principles for evaluating an interface. Introduction: Heuristics evaluation is s systematic process
More informationA Knowledge-Based Approach to Organizing Retrieved Documents
A Knowledge-Based Approach to Organizing Retrieved Documents Wanda Pratt Information & Computer Science University of California, Irvine Irvine, CA 92697-3425 pratt@ics.uci.edu From: AAAI-99 Proceedings.
More informationConceptions of Features and Semantic Clusters as Search Mechanisms: A Pilot Study 1
Conceptions of Features and Semantic Clusters as Search Mechanisms: A Pilot Study 1 Barbara M. Wildemuth *, Meng Yang *, Gary Geisler, Tom Tolleson *, Jon Elsas *, Jei Luo *, and Gary Marchionini * * Open
More informationTitle Core TIs Optional TIs Core Labs Optional Labs. 1.1 WANs All None None None. All None None None. All None 2.2.1, 2.2.4, 2.2.
CCNA 2 Plan for Academy Student Success (PASS) CCNA 2 v3.1 Instructional Update # 2006-1 This Instructional Update has been issued to provide guidance on the flexibility that Academy instructors now have
More informationVisualization of Text Document Corpus
Informatica 29 (2005) 497 502 497 Visualization of Text Document Corpus Blaž Fortuna, Marko Grobelnik and Dunja Mladenić Jozef Stefan Institute Jamova 39, 1000 Ljubljana, Slovenia E-mail: {blaz.fortuna,
More informationText Analytics (Text Mining)
CSE 6242 / CX 4242 Apr 1, 2014 Text Analytics (Text Mining) Concepts and Algorithms Duen Horng (Polo) Chau Georgia Tech Some lectures are partly based on materials by Professors Guy Lebanon, Jeffrey Heer,
More informationSheffield University and the TREC 2004 Genomics Track: Query Expansion Using Synonymous Terms
Sheffield University and the TREC 2004 Genomics Track: Query Expansion Using Synonymous Terms Yikun Guo, Henk Harkema, Rob Gaizauskas University of Sheffield, UK {guo, harkema, gaizauskas}@dcs.shef.ac.uk
More informationAn Attempt to Identify Weakest and Strongest Queries
An Attempt to Identify Weakest and Strongest Queries K. L. Kwok Queens College, City University of NY 65-30 Kissena Boulevard Flushing, NY 11367, USA kwok@ir.cs.qc.edu ABSTRACT We explore some term statistics
More informationInternational Journal of Advance Foundation and Research in Science & Engineering (IJAFRSE) Volume 1, Issue 2, July 2014.
A B S T R A C T International Journal of Advance Foundation and Research in Science & Engineering (IJAFRSE) Information Retrieval Models and Searching Methodologies: Survey Balwinder Saini*,Vikram Singh,Satish
More informationWeb document summarisation: a task-oriented evaluation
Web document summarisation: a task-oriented evaluation Ryen White whiter@dcs.gla.ac.uk Ian Ruthven igr@dcs.gla.ac.uk Joemon M. Jose jj@dcs.gla.ac.uk Abstract In this paper we present a query-biased summarisation
More informationMultimodal Information Spaces for Content-based Image Retrieval
Research Proposal Multimodal Information Spaces for Content-based Image Retrieval Abstract Currently, image retrieval by content is a research problem of great interest in academia and the industry, due
More informationA Semi-Discrete Matrix Decomposition for Latent. Semantic Indexing in Information Retrieval. December 5, Abstract
A Semi-Discrete Matrix Decomposition for Latent Semantic Indexing in Information Retrieval Tamara G. Kolda and Dianne P. O'Leary y December 5, 1996 Abstract The vast amount of textual information available
More informationLearning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li
Learning to Match Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li 1. Introduction The main tasks in many applications can be formalized as matching between heterogeneous objects, including search, recommendation,
More informationUsing Clusters on the Vivisimo Web Search Engine
Using Clusters on the Vivisimo Web Search Engine Sherry Koshman and Amanda Spink School of Information Sciences University of Pittsburgh 135 N. Bellefield Ave., Pittsburgh, PA 15237 skoshman@sis.pitt.edu,
More informationData Analyst Nanodegree Syllabus
Data Analyst Nanodegree Syllabus Discover Insights from Data with Python, R, SQL, and Tableau Before You Start Prerequisites : In order to succeed in this program, we recommend having experience working
More informationMonroe Township Middle School Monroe Township, New Jersey
Monroe Township Middle School Monroe Township, New Jersey Middle School 8 th Grade *PREPARATION PACKET* Welcome to 8 th Grade Mathematics! Our 8 th Grade Mathematics Course is a comprehensive survey course
More informationContent-based Dimensionality Reduction for Recommender Systems
Content-based Dimensionality Reduction for Recommender Systems Panagiotis Symeonidis Aristotle University, Department of Informatics, Thessaloniki 54124, Greece symeon@csd.auth.gr Abstract. Recommender
More informationMULTIMEDIA RETRIEVAL
MULTIMEDIA RETRIEVAL Peter L. Stanchev *&**, Krassimira Ivanova ** * Kettering University, Flint, MI, USA 48504, pstanche@kettering.edu ** Institute of Mathematics and Informatics, BAS, Sofia, Bulgaria,
More informationAssessing the Impact of Sparsification on LSI Performance
Accepted for the Grace Hopper Celebration of Women in Computing 2004 Assessing the Impact of Sparsification on LSI Performance April Kontostathis Department of Mathematics and Computer Science Ursinus
More informationUsing Excel for Graphical Analysis of Data
Using Excel for Graphical Analysis of Data Introduction In several upcoming labs, a primary goal will be to determine the mathematical relationship between two variable physical parameters. Graphs are
More informationTREC 2017 Dynamic Domain Track Overview
TREC 2017 Dynamic Domain Track Overview Grace Hui Yang Zhiwen Tang Ian Soboroff Georgetown University Georgetown University NIST huiyang@cs.georgetown.edu zt79@georgetown.edu ian.soboroff@nist.gov 1. Introduction
More informationMethods for closed loop system identification in industry
Available online www.jocpr.com Journal of Chemical and Pharmaceutical Research, 2015, 7(1):892-896 Review Article ISSN : 0975-7384 CODEN(USA) : JCPRC5 Methods for closed loop system identification in industry
More informationUsing Query History to Prune Query Results
Using Query History to Prune Query Results Daniel Waegel Ursinus College Department of Computer Science dawaegel@gmail.com April Kontostathis Ursinus College Department of Computer Science akontostathis@ursinus.edu
More informationHow to use indexing languages in searching
Indexing, searching, and retrieval 6.3.1. How to use indexing languages in searching Overview This module explains how you can become a better searcher by exploiting the power of indexing and indexing
More information99 /
99 / 2 3 : : / 90 ««: : Nbahreyni68@gmailcom ( Mmirzabeigi@gmailcom 2 Sotudeh@shirazuacir 3 8 / 00 : (203 «(2000 2 (998 985 3 5 8 202 7 2008 6 2007 Kinley 2 Wilson 3 Elm, & Woods Mcdonald, & Stevenson
More informationUnderstanding the Relationship between Searchers Queries and Information Goals
Understanding the Relationship between Searchers Queries and Information Goals Doug Downey University of Washington Seattle, WA 9895 ddowney@cs.washington.edu Susan Dumais, Dan Liebling, Eric Horvitz Microsoft
More informationsecond_language research_teaching sla vivian_cook language_department idl
Using Implicit Relevance Feedback in a Web Search Assistant Maria Fasli and Udo Kruschwitz Department of Computer Science, University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, United Kingdom fmfasli
More informationCollaborative Filtering based on User Trends
Collaborative Filtering based on User Trends Panagiotis Symeonidis, Alexandros Nanopoulos, Apostolos Papadopoulos, and Yannis Manolopoulos Aristotle University, Department of Informatics, Thessalonii 54124,
More informationInformation Retrieval. hussein suleman uct cs
Information Management Information Retrieval hussein suleman uct cs 303 2004 Introduction Information retrieval is the process of locating the most relevant information to satisfy a specific information
More informationA modified and fast Perceptron learning rule and its use for Tag Recommendations in Social Bookmarking Systems
A modified and fast Perceptron learning rule and its use for Tag Recommendations in Social Bookmarking Systems Anestis Gkanogiannis and Theodore Kalamboukis Department of Informatics Athens University
More informationESANN'2001 proceedings - European Symposium on Artificial Neural Networks Bruges (Belgium), April 2001, D-Facto public., ISBN ,
An Integrated Neural IR System. Victoria J. Hodge Dept. of Computer Science, University ofyork, UK vicky@cs.york.ac.uk Jim Austin Dept. of Computer Science, University ofyork, UK austin@cs.york.ac.uk Abstract.
More informationChapter 8. Evaluating Search Engine
Chapter 8 Evaluating Search Engine Evaluation Evaluation is key to building effective and efficient search engines Measurement usually carried out in controlled laboratory experiments Online testing can
More informationText Modeling with the Trace Norm
Text Modeling with the Trace Norm Jason D. M. Rennie jrennie@gmail.com April 14, 2006 1 Introduction We have two goals: (1) to find a low-dimensional representation of text that allows generalization to
More information