Modern Information Retrieval
- Melina Holt
- 6 years ago
Transcription
Modern Information Retrieval
The Concepts and Technology behind Search
Ricardo Baeza-Yates
Berthier Ribeiro-Neto
Second edition
Addison-Wesley
Harlow, England; Reading, Massachusetts; Menlo Park, California; New York; Don Mills, Ontario; Amsterdam; Bonn; Sydney; Singapore; Tokyo; Madrid; San Juan; Milan; Mexico City; Seoul; Taipei
To be filled by Pearson
To Helena, Rosa, and our children

Amo los libros exploradores, libros con bosque o nieve, profundidad o cielo
Un libro, un libro lleno de contactos humanos, de camisas, un libro sin soledad, con hombres y herramientas, un libro es la victoria.
de Oda al Libro (I) y (II), en Odas Elementales, Pablo Neruda

I love books that explore, books with a forest or snow, depth or sky
A book, a book full of human contacts, of shirts, a book without solitude, with people and tools, a book is the victory.
from Ode to the Book (I) and (II), in Elemental Odes, Pablo Neruda

território de homens livres que será nosso país e será pátria de todos. Irmãos, cantai esse mundo que não verei, mas virá um dia, dentro de mil anos, talvez mais... não tenho pressa.
de Cidade Prevista no livro A Rosa do Povo, 1945. Carlos Drummond de Andrade

territory of free men that will be our country and will be the nation of all
Brothers, sing that world which I'll not see, but which will come one day, in a thousand years, maybe more... no hurry.
from Prevised City in the book The Rose of the People, Carlos Drummond de Andrade
Contents

Preface to the Second Edition
Preface to the First Edition
Author's Acknowledgements to the Second Edition
Author's Acknowledgements to the First Edition
Publisher's Acknowledgements

1 Introduction
Information Retrieval; Early Developments; Information Retrieval in Libraries and Digital Libraries; IR at the Center of the Stage; The IR Problem; The User's Task; Information versus Data Retrieval; The IR System; Software Architecture of the IR System; The Retrieval and Ranking Processes; The Web; A Brief History; The e-publishing Era; How the Web Changed Search; Practical Issues on the Web; Organization of the Book; Focus of the Book; Book Contents; The Book Web Site: A Teaching Resource; Bibliographic Discussion

2 User Interfaces for Search, by Marti Hearst
2.1 Introduction; How People Search; Information Lookup versus Exploratory Search; Classic versus Dynamic Model of Information Seeking; Navigation versus Search; Observations of the Search Process; Search Interfaces Today; Getting Started; Query Specification; Query Specification Interfaces; Retrieval Results Display; Query Reformulation; Organizing Search Results; Visualization in Search Interfaces; Visualizing Boolean Syntax; Visualizing Query Terms within Retrieval Results; Visualizing Relationships Among Words and Documents; Visualization for Text Mining; Design and Evaluation of Search Interfaces; Trends and Research Issues; Bibliographic Discussion

3 Modeling
IR Models; Modeling and Ranking; Characterization of an IR Model; A Taxonomy of IR Models; Classic Information Retrieval; Basic Concepts; The Boolean Model; Term Weighting; TF-IDF Weights; Document Length Normalization; The Vector Model; The Probabilistic Model; Brief Comparison of Classic Models; Alternative Set Theoretic Models; Set-Based Model; Extended Boolean Model; Fuzzy Set Model; Alternative Algebraic Models; Generalized Vector Space Model; Latent Semantic Indexing Model; Neural Network Model; Alternative Probabilistic Models; BM25; Language Models; Divergence from Randomness; Bayesian Network Models; 3.6 Other Models; The Hypertext Model; Web based Models; Structured Text Retrieval; Multimedia Retrieval; Enterprise and Vertical Search; Trends and Research Issues; Bibliographic Discussion

4 Retrieval Evaluation
Introduction; The Cranfield Paradigm; A Brief History; Reference Collections; Retrieval Metrics; Precision and Recall; Single Value Summaries: P@n, MAP, MRR, F; User-Oriented Measures; DCG: Discounted Cumulated Gain; BPREF: Binary Preferences; Rank Correlation Metrics; Reference Collections; The TREC Collections; Other Reference Collections; Other Small Test Collections; User-Based Evaluation; Human Experimentation in the Lab; Side-by-Side Panels; A/B Testing; Crowdsourcing; Evaluation using Clickthrough Data; Practical Caveats; Trends and Research Issues; Bibliographic Discussion

5 Relevance Feedback and Query Expansion
Introduction; A Framework for Feedback Methods; Explicit Relevance Feedback; Relevance Feedback for the Vector Model: Rocchio Method; Relevance Feedback for the Probabilistic Model; Evaluation of Relevance Feedback; Explicit Feedback Through Clicks; Eye Tracking and Relevance Judgements; User Behavior; Clicks as a Metric of User Preferences; Implicit Feedback Through Local Analysis; Implicit Feedback Through Local Clustering; Implicit Feedback through Local Context Analysis; Implicit Feedback Through Global Analysis; Query Expansion based on a Similarity Thesaurus; Query Expansion based on a Statistical Thesaurus; Trends and Research Issues; Bibliographic Discussion

6 Documents: Languages & Properties, with Gonzalo Navarro and Nivio Ziviani
6.1 Introduction; Metadata; Document Formats; Text; Multimedia; Graphics and Virtual Reality; Markup Languages; SGML; HTML; XML; RDF: Resource Description Framework; HyTime; Text Properties; Information Theory; Modeling Natural Language; Text Similarity; Document Preprocessing; Lexical Analysis of the Text; Elimination of Stopwords; Stemming; Keyword Selection; Thesauri; Organizing Documents; Taxonomies; Folksonomies; Text Compression; Basic Concepts; Statistical Methods; Statistical Methods: Modeling; Statistical Methods: Coding; Dictionary Methods; Preprocessing for Compression; Comparing Text Compression Techniques; Structured Text Compression; Trends and Research Issues; Bibliographical Discussion

7 Queries: Languages & Properties, with Gonzalo Navarro
7.1 Query Languages; Keyword-Based Querying; Beyond Keywords; Structural Queries; Query Protocols; Query Properties; Characterizing Web Queries; User Search Behavior; Query Intent; Query Topic; Query Sessions and Missions; Query Difficulty; Trends and Research Issues; Bibliographical Discussion

8 Text Classification, with Marcos Gonçalves
8.1 Introduction; A Characterization of Text Classification; Machine Learning; The Text Classification Problem; Text Classification Algorithms; Unsupervised Algorithms; Clustering; Naive Text Classification; Supervised Algorithms; Decision Trees; The k-NN Classifier; The Rocchio Classifier; Probabilistic Naive Bayes Document Classification; The SVM Classifier; Ensemble Classifiers; Final Remarks on Supervised Algorithms; Feature Selection or Dimensionality Reduction; Term Class Incidence Table; Term Document Frequency; TF-IDF Weights; Mutual Information; Information Gain; Chi Square; Impact of Feature Selection; Evaluation Metrics; Contingency Table; Accuracy and Error; Precision and Recall; F-measure and F1; Cross-Validation; Standard Collections; Organizing the Classes Building Taxonomies; Trends and Research Issues; Bibliographic Discussion

9 Indexing and Searching, with Gonzalo Navarro
9.1 Introduction; Inverted Indexes; Basic Concepts; Full Inverted Indexes; Searching; Ranking; Construction; Compressed Inverted Indexes; Structural Queries; Signature Files; Suffix Trees and Suffix Arrays; Structure: Tries and Suffix Trees; Searching for Simple Strings; Searching for Complex Patterns; Construction; Compressed Suffix Arrays; Sequential Searching; Simple Strings: Horspool; Complex Patterns: Automata and Bit-Parallelism; Faster Bit-Parallel Algorithms; Regular Expressions; Multiple Patterns; Approximate Searching; Searching Compressed Text; Multi-dimensional Indexing; Trends and Research Issues; Bibliographic Discussion

10 Parallel and Distributed IR, with Eric Brown
10.1 Introduction; A Taxonomy of Distributed IR Systems; Data Partitioning; Collection Partitioning; Collection Selection; Inverted Index Partitioning; Partitioning other Indexes; Parallel IR; Introduction; Parallel IR on MIMD Architectures; Parallel IR on SIMD Architectures; Cluster-based IR; Distributed IR; Introduction; Indexing; Query Processing; Web Issues; Federated Search; Retrieval in Peer-to-Peer Networks; Trends and Research Issues; Bibliographic Discussion

11 Web Retrieval, with Yoelle Maarek
11.1 Introduction; A Challenging Problem; The Web; Characteristics; Structure of the Web Graph; Modeling the Web; Link Analysis; Search Engine Architectures; Basic Architecture; Cluster-based Architecture; Caching; Multiple Indexes; Distributed Architectures; Search Engine Ranking; Ranking Signals; Link-based Ranking; Simple Ranking Functions; Learning to Rank; Learning the Ranking Function; Quality Evaluation; Web Spam; Managing Web Data; Assigning Identifiers to Documents; Metadata; Compressing the Web Graph; Handling Duplicated Data; Search Engine User Interaction; The Search Rectangle Paradigm; The Search Engine Result Page; Educating the User; Browsing; Flat Browsing; Structure Guided Browsing and Web Directories; Beyond Browsing; Hypertext and the Web; Combining Searching with Browsing; Web Query Languages; Dynamic Search; Related Problems; Computational Advertising; Web Mining; Metasearch; Trends and Research Issues; Beyond Static Text Data; Current Challenges; Bibliographical Discussion

12 Web Crawling, with Carlos Castillo
12.1 Introduction; Applications of a Web Crawler; General Web Search; Topical Crawling; Web Characterization; Mirroring; Web Site Analysis; A Taxonomy of Crawlers; Types of Web Pages; Architecture and Implementation; Crawler Architecture; Practical Issues; Parallel Crawling; Scheduling Algorithms; Selection Policy; Revisit Policy; Politeness Policy; Combining Policies; Evaluation; Evaluating Network Usage; Evaluating Long-term Scheduling; Trends and Research Issues; Crawling the Hidden Web; Crawling with the Help of Web Sites; Distributed Crawling; Bibliographic Discussion

13 Structured Text Retrieval, with Mounia Lalmas
13.1 Introduction; Structuring Power; Explicit vs. Implicit Structure; Static vs. Dynamic Structure; Single Hierarchy vs. Multiple Hierarchies; Early Text Retrieval Models; Model Based on Non-Overlapping Lists; Model Based on Proximal Nodes; Ranking Structured Text Results; XML Retrieval; Challenges in XML Retrieval; Indexing Strategies; Ranking Strategies; Removing Overlaps; XML Retrieval Evaluation; Document Collections; Topics; Retrieval Tasks; Relevance Measures; Query Languages; Characteristics; Classification of XML Query Languages; Examples of XML Query Languages; Trends and Research Issues; Bibliographic Discussion

14 Multimedia Information Retrieval, by Dulce Ponceleón and Malcolm Slaney
14.1 Introduction; What is Multimedia?; Multimedia IR; Text IR versus Multimedia IR; The Challenges; The Semantic Gap; Feature Ambiguity; Machine-generated Data; Content-based Image Retrieval; Color-Based Retrieval; Texture; Salient Points; Audio and Music Retrieval; Fingerprinting; Speech Recognition; Speaker Identification; Spoken Document Retrieval; Audio Basics; Retrieving and Browsing Video; Video Abstracts; Static Summaries; Mosaics and Salient Stills; Dynamic Summaries; Interactive Summaries; Visual vs. Audio Browsing; Evaluating Summaries; Fusion Models: Combining it All; Naming Faces; Naming Images; Naming Audio; Combining Audio and Video for AVSR; Combining Audio and Video for Multimedia Segmentation; A Video Segmentation Example; Segmentation Schemes for Video; Video Segmentation with Edges; Speech Segmentation; Segmentation Evaluation; Compression and MPEG Standards; Intensity and Sampling; Color; Lossy Compression; Lossless Compression; Temporal Redundancy; Motion Prediction; MPEG Standards; Trends and Research Issues; Bibliographic Discussion

15 Enterprise Search, by David Hawking
15.1 Introduction; Characteristics and Applications of Enterprise Search; Enterprise Search Software; Workplace Search; Enterprise Search Tasks; Examples of Search-Supported Tasks; Search Types; Studying Enterprise Search; Architecture of Enterprise Search Systems; Gathering; Extracting; Indexing; Indexing Textual Annotations; Query Processing; Presentation of Search Results; Security Models; Federation/Metasearch; Enterprise Search Evaluation; Published Test Collections for Enterprise Search; Internal Enterprise Search Evaluations; Enterprise Search Tuning; What is it Reasonable to Expect?; Potential Reasons for Dissatisfaction; Context and Personalization; Controls and Levers for Contextualization; Contextualization: Local, Enterprise or Global?; Privacy of Profiles; Defining, Creating and Maintaining a Profile; User Modeling; Implicit Measures; Information Filtering; Social Recommender Systems; Trends and Research Issues; Bibliographic Discussion

16 Library Systems, by Edie Rasmussen
16.1 The Information Environment in the Library; Online Public Access Catalogues; OPACs and Bibliographic Records; Information Retrieval from the ILS; Integrating the Hybrid Library; OPACs and End Users; ILS: Vendors and Products; IR Systems and Document Databases; Bibliographic and Full-text Databases; Content of Database Records; The Online Industry: Database Vendors; Information Retrieval from Document Databases; Information Retrieval in Organizations; Trends and Research Issues; Bibliographic Discussion

17 Digital Libraries, by Marcos Gonçalves
17.1 Introduction; Defining Digital Libraries; A General Architecture; Fundamentals; Digital Objects and Collections; Metadata and Catalogs; Repositories/Archives; Services; Social-Economical Issues; Social Issues; Economical Issues; Software Systems; Greenstone; Eprints; DSpace; Fedora; Open Digital Libraries; The 5S Suite; DL Case Studies; The Networked DL of Theses and Dissertations; The National Science Digital Library; The ETANA-DL Archaeological Digital Library; Trends and Research Issues; Evaluation; Integration; Other Research Challenges; Bibliographic Discussion

A Open Source Search Engines, with Christian Middleton
A.1 Introduction; A.2 Search Engines; A.2.1 Preliminary Selection of Search Engines; A.2.2 Features; A.2.3 Evaluation; A.3 Methodology; A.3.1 Document Collections; A.3.2 Evaluation Tests; A.3.3 Experimental Setup; A.4 Experimental Results; A.4.1 Test A Indexing; A.4.2 Test B Incremental Indexing; A.4.3 Test C Search Performance; A.4.4 Global Evaluation; A.5 Conclusions

B Biographies

References
More informationAjloun National University
Study Plan Guide for the Bachelor Degree in Computer Information System First Year hr. 101101 Arabic Language Skills (1) 101099-01110 Introduction to Information Technology - - 01111 Programming Language
More informationEssentials of Database Management
Essentials of Database Management Jeffrey A. Hoffer University of Dayton Heikki Topi Bentley University V. Ramesh Indiana University PEARSON Boston Columbus Indianapolis New York San Francisco Upper Saddle
More informationInformation Retrieval and Web Search
Information Retrieval and Web Search Course overview Instructor: Rada Mihalcea What is this course about? Processing Indexing Retrieving textual data (or audio, video, geo-spatial,, data) Fits in four
More informationComputers as Components Principles of Embedded Computing System Design
Computers as Components Principles of Embedded Computing System Design Third Edition Marilyn Wolf ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY
More informationSummary of Contents LIST OF FIGURES LIST OF TABLES
Summary of Contents LIST OF FIGURES LIST OF TABLES PREFACE xvii xix xxi PART 1 BACKGROUND Chapter 1. Introduction 3 Chapter 2. Standards-Makers 21 Chapter 3. Principles of the S2ESC Collection 45 Chapter
More informationReal-Time Systems and Programming Languages
Real-Time Systems and Programming Languages Ada, Real-Time Java and C/Real-Time POSIX Fourth Edition Alan Burns and Andy Wellings University of York * ADDISON-WESLEY An imprint of Pearson Education Harlow,
More informationTable of Contents 1 Introduction A Declarative Approach to Entity Resolution... 17
Table of Contents 1 Introduction...1 1.1 Common Problem...1 1.2 Data Integration and Data Management...3 1.2.1 Information Quality Overview...3 1.2.2 Customer Data Integration...4 1.2.3 Data Management...8
More informationComplete. The. Reference. Christopher Adamson. Mc Grauu. LlLIJBB. New York Chicago. San Francisco Lisbon London Madrid Mexico City
The Complete Reference Christopher Adamson Mc Grauu LlLIJBB New York Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore Sydney Toronto Contents Acknowledgments
More informationChapter 3 - Text. Management and Retrieval
Prof. Dr.-Ing. Stefan Deßloch AG Heterogene Informationssysteme Geb. 36, Raum 329 Tel. 0631/205 3275 dessloch@informatik.uni-kl.de Chapter 3 - Text Management and Retrieval Literature: Baeza-Yates, R.;
More informationMachine Learning in Action
Machine Learning in Action PETER HARRINGTON Ill MANNING Shelter Island brief contents PART l (~tj\ssification...,... 1 1 Machine learning basics 3 2 Classifying with k-nearest Neighbors 18 3 Splitting
More informationAn Introduction to Parallel Programming
F 'C 3 R'"'C,_,. HO!.-IJJ () An Introduction to Parallel Programming Peter S. Pacheco University of San Francisco ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO
More informationEmpowering People with Knowledge the Next Frontier for Web Search. Wei-Ying Ma Assistant Managing Director Microsoft Research Asia
Empowering People with Knowledge the Next Frontier for Web Search Wei-Ying Ma Assistant Managing Director Microsoft Research Asia Important Trends for Web Search Organizing all information Addressing user
More informationJames Mayfield! The Johns Hopkins University Applied Physics Laboratory The Human Language Technology Center of Excellence!
James Mayfield! The Johns Hopkins University Applied Physics Laboratory The Human Language Technology Center of Excellence! (301) 219-4649 james.mayfield@jhuapl.edu What is Information Retrieval? Evaluation
More informationAutomatic Identification of User Goals in Web Search [WWW 05]
Automatic Identification of User Goals in Web Search [WWW 05] UichinLee @ UCLA ZhenyuLiu @ UCLA JunghooCho @ UCLA Presenter: Emiran Curtmola@ UC San Diego CSE 291 4/29/2008 Need to improve the quality
More informationOutline. Lecture 3: EITN01 Web Intelligence and Information Retrieval. Query languages - aspects. Previous lecture. Anders Ardö.
Outline Lecture 3: EITN01 Web Intelligence and Information Retrieval Anders Ardö EIT Electrical and Information Technology, Lund University February 5, 2013 A. Ardö, EIT Lecture 3: EITN01 Web Intelligence
More informationStructured Parallel Programming Patterns for Efficient Computation
Structured Parallel Programming Patterns for Efficient Computation Michael McCool Arch D. Robison James Reinders ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO
More informationInformation Search and Retrieval System in Libraries
Information Search and Retrieval System in Libraries N Rupsing Naik A Madhava Rao Abstract A digital library comprises diverse collections of digital objects representing text, sound, maps, videos, photos,
More informationChapter 5: Summary and Conclusion CHAPTER 5 SUMMARY AND CONCLUSION. Chapter 1: Introduction
CHAPTER 5 SUMMARY AND CONCLUSION Chapter 1: Introduction Data mining is used to extract the hidden, potential, useful and valuable information from very large amount of data. Data mining tools can handle
More informationContents. Preface xvii Acknowledgments. CHAPTER 1 Introduction to Parallel Computing 1. CHAPTER 2 Parallel Programming Platforms 11
Preface xvii Acknowledgments xix CHAPTER 1 Introduction to Parallel Computing 1 1.1 Motivating Parallelism 2 1.1.1 The Computational Power Argument from Transistors to FLOPS 2 1.1.2 The Memory/Disk Speed
More informationInformation Retrieval CS Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science
Information Retrieval CS 6900 Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu Information Retrieval Information Retrieval (IR) is finding material of an unstructured
More informationCS 6320 Natural Language Processing
CS 6320 Natural Language Processing Information Retrieval Yang Liu Slides modified from Ray Mooney s (http://www.cs.utexas.edu/users/mooney/ir-course/slides/) 1 Introduction of IR System components, basic
More informationMining Web Data. Lijun Zhang
Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems
More informationCOMPUTER AND ROBOT VISION
VOLUME COMPUTER AND ROBOT VISION Robert M. Haralick University of Washington Linda G. Shapiro University of Washington T V ADDISON-WESLEY PUBLISHING COMPANY Reading, Massachusetts Menlo Park, California
More informationDigital Image Processing
Digital Image Processing Third Edition Rafael C. Gonzalez University of Tennessee Richard E. Woods MedData Interactive PEARSON Prentice Hall Pearson Education International Contents Preface xv Acknowledgments
More informationIntroduction to Information Retrieval
Introduction to Information Retrieval (Supplementary Material) Zhou Shuigeng March 23, 2007 Advanced Distributed Computing 1 Text Databases and IR Text databases (document databases) Large collections
More informationMathematics Shape and Space: Polygon Angles
a place of mind F A C U L T Y O F E D U C A T I O N Department of Curriculum and Pedagogy Mathematics Shape and Space: Polygon Angles Science and Mathematics Education Research Group Supported by UBC Teaching
More informationSearch Engines. Information Retrieval in Practice
Search Engines Information Retrieval in Practice All slides Addison Wesley, 2008 Web Crawler Finds and downloads web pages automatically provides the collection for searching Web is huge and constantly
More informationPATTERN CLASSIFICATION AND SCENE ANALYSIS
PATTERN CLASSIFICATION AND SCENE ANALYSIS RICHARD O. DUDA PETER E. HART Stanford Research Institute, Menlo Park, California A WILEY-INTERSCIENCE PUBLICATION JOHN WILEY & SONS New York Chichester Brisbane
More informationWebSci and Learning to Rank for IR
WebSci and Learning to Rank for IR Ernesto Diaz-Aviles L3S Research Center. Hannover, Germany diaz@l3s.de Ernesto Diaz-Aviles www.l3s.de 1/16 Motivation: Information Explosion Ernesto Diaz-Aviles
More informationDATA MINING - 1DL105, 1DL111
1 DATA MINING - 1DL105, 1DL111 Fall 2007 An introductory class in data mining http://user.it.uu.se/~udbl/dut-ht2007/ alt. http://www.it.uu.se/edu/course/homepage/infoutv/ht07 Kjell Orsborn Uppsala Database
More information