Modern Information Retrieval


Modern Information Retrieval
Ricardo Baeza-Yates
Berthier Ribeiro-Neto
ACM Press, New York
Harlow, England · London · New York · Boston · San Francisco · Toronto · Sydney · Singapore · Hong Kong · Tokyo · Seoul · Taipei · New Delhi · Cape Town · Madrid · Mexico City · Amsterdam · Munich · Paris · Milan

Contents

Preface
Acknowledgements
Biographies

1 Introduction
  1.1 Motivation
    Information versus Data Retrieval
    Information Retrieval at the Center of the Stage
    Focus of the Book
  1.2 Basic Concepts
    The User Task
    Logical View of the Documents
  1.3 Past, Present, and Future
    Early Developments
    Information Retrieval in the Library
    The Web and Digital Libraries
    Practical Issues
  1.4 The Retrieval Process
  1.5 Organization of the Book
    Book Topics
    Book Chapters
  1.6 How to Use this Book
    Teaching Suggestions
    The Book's Web Page
  1.7 Bibliographic Discussion

2 Modeling
  2.1 Introduction
  2.2 A Taxonomy of Information Retrieval Models
  2.3 Retrieval: Ad hoc and Filtering
  2.4 A Formal Characterization of IR Models
  2.5 Classic Information Retrieval
    Basic Concepts
    Boolean Model
    Vector Model
    Probabilistic Model
    Brief Comparison of Classic Models
  2.6 Alternative Set Theoretic Models
    Fuzzy Set Model
    Extended Boolean Model
  2.7 Alternative Algebraic Models
    Generalized Vector Space Model
    Latent Semantic Indexing Model
    Neural Network Model
  2.8 Alternative Probabilistic Models
    Bayesian Networks
    Inference Network Model
    Belief Network Model
    Comparison of Bayesian Network Models
    Computational Costs of Bayesian Networks
    The Impact of Bayesian Network Models
  2.9 Structured Text Retrieval Models
    Model Based on Non-Overlapping Lists
    Model Based on Proximal Nodes
  2.10 Models for Browsing
    Flat Browsing
    Structure Guided Browsing
    The Hypertext Model
  2.11 Trends and Research Issues
  2.12 Bibliographic Discussion

3 Retrieval Evaluation
  3.1 Introduction
  3.2 Retrieval Performance Evaluation
    Recall and Precision
    Alternative Measures
  3.3 Reference Collections
    The TREC Collection
    The CACM and ISI Collections
    The Cystic Fibrosis Collection
  3.4 Trends and Research Issues
  3.5 Bibliographic Discussion

4 Query Languages
  4.1 Introduction
  4.2 Keyword-Based Querying
    Single-Word Queries
    Context Queries
    Boolean Queries
    Natural Language
  4.3 Pattern Matching
  4.4 Structural Queries
    Fixed Structure
    Hypertext
    Hierarchical Structure
  4.5 Query Protocols
  4.6 Trends and Research Issues
  4.7 Bibliographic Discussion

5 Query Operations
  5.1 Introduction
  5.2 User Relevance Feedback
    Query Expansion and Term Reweighting for the Vector Model
    Term Reweighting for the Probabilistic Model
    A Variant of Probabilistic Term Reweighting
    Evaluation of Relevance Feedback Strategies
  5.3 Automatic Local Analysis
    Query Expansion Through Local Clustering
    Query Expansion Through Local Context Analysis
  5.4 Automatic Global Analysis
    Query Expansion based on a Similarity Thesaurus
    Query Expansion based on a Statistical Thesaurus
  5.5 Trends and Research Issues
  5.6 Bibliographic Discussion

6 Text and Multimedia Languages and Properties
  6.1 Introduction
  6.2 Metadata
  6.3 Text
    Formats
    Information Theory
    Modeling Natural Language
    Similarity Models
  6.4 Markup Languages
    SGML
    HTML
    XML
  6.5 Multimedia
    Formats
    Textual Images
    Graphics and Virtual Reality
    HyTime
  6.6 Trends and Research Issues
  6.7 Bibliographic Discussion

7 Text Operations
  7.1 Introduction
  7.2 Document Preprocessing
    Lexical Analysis of the Text
    Elimination of Stopwords
    Stemming
    Index Terms Selection
    Thesauri
  7.3 Document Clustering
  7.4 Text Compression
    Motivation
    Basic Concepts
    Statistical Methods
    Dictionary Methods
    Inverted File Compression
  7.5 Comparing Text Compression Techniques
  7.6 Trends and Research Issues
  7.7 Bibliographic Discussion

8 Indexing and Searching
  8.1 Introduction
  8.2 Inverted Files
    Searching
    Construction
  8.3 Other Indices for Text
    Suffix Trees and Suffix Arrays
    Signature Files
  8.4 Boolean Queries
  8.5 Sequential Searching
    Brute Force
    Knuth-Morris-Pratt
    Boyer-Moore Family
    Shift-Or
    Suffix Automaton
    Practical Comparison
    Phrases and Proximity
  8.6 Pattern Matching
    String Matching Allowing Errors
    Regular Expressions and Extended Patterns
    Pattern Matching Using Indices
  8.7 Structural Queries
  8.8 Compression
    Sequential Searching
    Compressed Indices
  8.9 Trends and Research Issues
  8.10 Bibliographic Discussion

9 Parallel and Distributed IR
  9.1 Introduction
    Parallel Computing
    Performance Measures
  9.2 Parallel IR
    Introduction
    MIMD Architectures
    SIMD Architectures
  9.3 Distributed IR
    Introduction
    Collection Partitioning
    Source Selection
    Query Processing
    Web Issues
  9.4 Trends and Research Issues
  9.5 Bibliographic Discussion

10 User Interfaces and Visualization
  10.1 Introduction
  10.2 Human-Computer Interaction
    Design Principles
    The Role of Visualization
    Evaluating Interactive Systems
  10.3 The Information Access Process
    Models of Interaction
    Non-Search Parts of the Information Access Process
    Earlier Interface Studies
  10.4 Starting Points
    Lists of Collections
    Overviews
    Examples, Dialogs, and Wizards
    Automated Source Selection
  10.5 Query Specification
    Boolean Queries
    From Command Lines to Forms and Menus
    Faceted Queries
    Graphical Approaches to Query Specification
    Phrases and Proximity
    Natural Language and Free Text Queries
  10.6 Context
    Document Surrogates
    Query Term Hits Within Document Content
    Query Term Hits Between Documents
    SuperBook: Context via Table of Contents
    Categories for Results Set Context
    Using Hyperlinks to Organize Retrieval Results
    Tables
  10.7 Using Relevance Judgements
    Interfaces for Standard Relevance Feedback
    Studies of User Interaction with Relevance Feedback Systems
    Fetching Relevant Information in the Background
    Group Relevance Judgements
    Pseudo-Relevance Feedback
  10.8 Interface Support for the Search Process
    Interfaces for String Matching
    Window Management
    Example Systems
    Examples of Poor Use of Overlapping Windows
    Retaining Search History
    Integrating Scanning, Selection, and Querying
  10.9 Trends and Research Issues
  10.10 Bibliographic Discussion

11 Multimedia IR: Models and Languages
  11.1 Introduction
  11.2 Data Modeling
    Multimedia Data Support in Commercial DBMSs
    The MULTOS Data Model
  11.3 Query Languages
    Request Specification
    Conditions on Multimedia Data
    Uncertainty, Proximity, and Weights in Query Expressions
    Some Proposals
  11.4 Trends and Research Issues
  11.5 Bibliographic Discussion

12 Multimedia IR: Indexing and Searching
  12.1 Introduction
  12.2 Background - Spatial Access Methods
  12.3 A Generic Multimedia Indexing Approach
  12.4 One-dimensional Time Series
    Distance Function
    Feature Extraction and Lower-bounding
    Experiments
  12.5 Two-dimensional Color Images
    Image Features and Distance Functions
    Lower-bounding
    Experiments
  12.6 Automatic Feature Extraction
  12.7 Trends and Research Issues
  12.8 Bibliographic Discussion

13 Searching the Web
  13.1 Introduction
  13.2 Challenges
  13.3 Characterizing the Web
    Measuring the Web
    Modeling the Web
  13.4 Search Engines
    Centralized Architecture
    Distributed Architecture
    User Interfaces
    Ranking
    Crawling the Web
    Indices
  13.5 Browsing
    Web Directories
    Combining Searching with Browsing
    Helpful Tools
  13.6 Metasearchers
  13.7 Finding the Needle in the Haystack
    User Problems
    Some Examples
    Teaching the User
  13.8 Searching using Hyperlinks
    Web Query Languages
    Dynamic Search and Software Agents
  13.9 Trends and Research Issues
  13.10 Bibliographic Discussion

14 Libraries and Bibliographical Systems
  14.1 Introduction
  14.2 Online IR Systems and Document Databases
    Databases
    Online Retrieval Systems
    IR in Online Retrieval Systems
    'Natural Language' Searching
  14.3 Online Public Access Catalogs (OPACs)
    OPACs and Their Content
    OPACs and End Users
    OPACs: Vendors and Products
    Alternatives to Vendor OPACs
  14.4 Libraries and Digital Library Projects
  14.5 Trends and Research Issues
  14.6 Bibliographic Discussion

15 Digital Libraries
  15.1 Introduction
  15.2 Definitions
  15.3 Architectural Issues
  15.4 Document Models, Representations, and Access
    Multilingual Documents
    Multimedia Documents
    Structured Documents
    Distributed Collections
    Federated Search
    Access
  15.5 Prototypes, Projects, and Interfaces
    International Range of Efforts
    Usability
  15.6 Standards
    Protocols and Federation
    Metadata
  15.7 Trends and Research Issues
  15.8 Bibliographical Discussion

Appendix: Porter's Algorithm
Glossary
References
Index

Glossary. ASCII: Standard binary codes to represent occidental characters in one byte.

Glossary. ASCII: Standard binary codes to represent occidental characters in one byte. Glossary ASCII: Standard binary codes to represent occidental characters in one byte. Ad hoc retrieval: standard retrieval task in which the user specifies his information need through a query which initiates

More information

An Introduction to Search Engines and Web Navigation

An Introduction to Search Engines and Web Navigation An Introduction to Search Engines and Web Navigation MARK LEVENE ADDISON-WESLEY Ал imprint of Pearson Education Harlow, England London New York Boston San Francisco Toronto Sydney Tokyo Singapore Hong

More information

Introduction to Information Retrieval

Introduction to Information Retrieval Introduction to Information Retrieval Mohsen Kamyar چهارمین کارگاه ساالنه آزمایشگاه فناوری و وب بهمن ماه 1391 Outline Outline in classic categorization Information vs. Data Retrieval IR Models Evaluation

More information

Programming. In Ada JOHN BARNES TT ADDISON-WESLEY

Programming. In Ada JOHN BARNES TT ADDISON-WESLEY Programming In Ada 2005 JOHN BARNES... TT ADDISON-WESLEY An imprint of Pearson Education Harlow, England London New York Boston San Francisco Toronto Sydney Tokyo Singapore Hong Kong Seoul Taipei New Delhi

More information

Search Engines Information Retrieval in Practice

Search Engines Information Retrieval in Practice Search Engines Information Retrieval in Practice W. BRUCE CROFT University of Massachusetts, Amherst DONALD METZLER Yahoo! Research TREVOR STROHMAN Google Inc. ----- PEARSON Boston Columbus Indianapolis

More information

Systems:;-'./'--'.; r. Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington

Systems:;-'./'--'.; r. Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington Data base 7\,T"] Systems:;-'./'--'.; r Modelsj Languages, Design, and Application Programming Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington Shamkant

More information

FUNDAMENTALS OF. Database S wctpmc. Shamkant B. Navathe College of Computing Georgia Institute of Technology. Addison-Wesley

FUNDAMENTALS OF. Database S wctpmc. Shamkant B. Navathe College of Computing Georgia Institute of Technology. Addison-Wesley FUNDAMENTALS OF Database S wctpmc SIXTH EDITION Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington Shamkant B. Navathe College of Computing Georgia Institute

More information

Department of Computer Science and Engineering B.E/B.Tech/M.E/M.Tech : B.E. Regulation: 2013 PG Specialisation : _

Department of Computer Science and Engineering B.E/B.Tech/M.E/M.Tech : B.E. Regulation: 2013 PG Specialisation : _ COURSE DELIVERY PLAN - THEORY Page 1 of 6 Department of Computer Science and Engineering B.E/B.Tech/M.E/M.Tech : B.E. Regulation: 2013 PG Specialisation : _ LP: CS6007 Rev. No: 01 Date: 27/06/2017 Sub.

More information

Modern Information Retrieval

Modern Information Retrieval Modern Information Retrieval The Concepts and Technology behind Search Ricardo Baeza-Yates Berthier Ribeiro-Neto Second edition Addison-Wesley Harlow, England Reading, Massachusetts Menlo Park, California

More information

World Wide Web PROGRAMMING THE PEARSON EIGHTH EDITION. University of Colorado at Colorado Springs

World Wide Web PROGRAMMING THE PEARSON EIGHTH EDITION. University of Colorado at Colorado Springs PROGRAMMING THE World Wide Web EIGHTH EDITION ROBERT W. SEBESTA University of Colorado at Colorado Springs PEARSON Boston Columbus Indianapolis New York San Francisco Upper Saddle River Amsterdam Cape

More information

Database Concepts. David M. Kroenke UNIVERSITATSBIBLIOTHEK HANNOVER

Database Concepts. David M. Kroenke UNIVERSITATSBIBLIOTHEK HANNOVER Database Concepts Fifth Edition David M. Kroenke David J. Auer ^111 I ii i.111 111 n.n jiiim^ TECHNISCHE INFORMATIOMSBiBLIOTHEK UNIVERSITATSBIBLIOTHEK HANNOVER j TIB/UB Hannover Prentice Hall Boston Columbus

More information

Representation/Indexing (fig 1.2) IR models - overview (fig 2.1) IR models - vector space. Weighting TF*IDF. U s e r. T a s k s

Representation/Indexing (fig 1.2) IR models - overview (fig 2.1) IR models - vector space. Weighting TF*IDF. U s e r. T a s k s Summary agenda Summary: EITN01 Web Intelligence and Information Retrieval Anders Ardö EIT Electrical and Information Technology, Lund University March 13, 2013 A Ardö, EIT Summary: EITN01 Web Intelligence

More information

Essentials of Database Management

Essentials of Database Management Essentials of Database Management Jeffrey A. Hoffer University of Dayton Heikki Topi Bentley University V. Ramesh Indiana University PEARSON Boston Columbus Indianapolis New York San Francisco Upper Saddle

More information

Part I: Data Mining Foundations

Part I: Data Mining Foundations Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web and the Internet 2 1.3. Web Data Mining 4 1.3.1. What is Data Mining? 6 1.3.2. What is Web Mining?

More information

Introduction to Information Retrieval. Lecture Outline

Introduction to Information Retrieval. Lecture Outline Introduction to Information Retrieval Lecture 1 CS 410/510 Information Retrieval on the Internet Lecture Outline IR systems Overview IR systems vs. DBMS Types, facets of interest User tasks Document representations

More information

MECHATRONICS. William Bolton. Sixth Edition ELECTRONIC CONTROL SYSTEMS ENGINEERING IN MECHANICAL AND ELECTRICAL PEARSON

MECHATRONICS. William Bolton. Sixth Edition ELECTRONIC CONTROL SYSTEMS ENGINEERING IN MECHANICAL AND ELECTRICAL PEARSON MECHATRONICS ELECTRONIC CONTROL SYSTEMS IN MECHANICAL AND ELECTRICAL ENGINEERING Sixth Edition William Bolton PEARSON Harlow, England London New York Boston San Francisco Toronto Sydney Auckland Singapore

More information

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer Bing Liu Web Data Mining Exploring Hyperlinks, Contents, and Usage Data With 177 Figures Springer Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web

More information

INFORMATION RETRIEVAL SYSTEMS: Theory and Implementation

INFORMATION RETRIEVAL SYSTEMS: Theory and Implementation INFORMATION RETRIEVAL SYSTEMS: Theory and Implementation THE KLUWER INTERNATIONAL SERIES ON INFORMATION RETRIEVAL Series Editor W. Bruce Croft University of Massachusetts Amherst, MA 01003 Also in the

More information

Chapter 27 Introduction to Information Retrieval and Web Search

Chapter 27 Introduction to Information Retrieval and Web Search Chapter 27 Introduction to Information Retrieval and Web Search Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 27 Outline Information Retrieval (IR) Concepts Retrieval

More information

Information Retrieval

Information Retrieval Information Retrieval CSC 375, Fall 2016 An information retrieval system will tend not to be used whenever it is more painful and troublesome for a customer to have information than for him not to have

More information

Fundamentals of. Database Systems. Shamkant B. Navathe. College of Computing Georgia Institute of Technology PEARSON.

Fundamentals of. Database Systems. Shamkant B. Navathe. College of Computing Georgia Institute of Technology PEARSON. Fundamentals of Database Systems 5th Edition Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington Shamkant B. Navathe College of Computing Georgia Institute

More information

Real-Time Systems and Programming Languages

Real-Time Systems and Programming Languages Real-Time Systems and Programming Languages Ada, Real-Time Java and C/Real-Time POSIX Fourth Edition Alan Burns and Andy Wellings University of York * ADDISON-WESLEY An imprint of Pearson Education Harlow,

More information

The Unified Modeling Language User Guide

The Unified Modeling Language User Guide The Unified Modeling Language User Guide Grady Booch James Rumbaugh Ivar Jacobson Rational Software Corporation TT ADDISON-WESLEY Boston San Francisco New York Toronto Montreal London Munich Paris Madrid

More information

Introductory logic and sets for Computer scientists

Introductory logic and sets for Computer scientists Introductory logic and sets for Computer scientists Nimal Nissanke University of Reading ADDISON WESLEY LONGMAN Harlow, England II Reading, Massachusetts Menlo Park, California New York Don Mills, Ontario

More information

Access ComprehGnsiwG. Shelley Gaskin, Carolyn McLellan, and. Nancy Graviett. with Microsoft

Access ComprehGnsiwG. Shelley Gaskin, Carolyn McLellan, and. Nancy Graviett. with Microsoft with Microsoft Access 2010 ComprehGnsiwG Shelley Gaskin, Carolyn McLellan, and Nancy Graviett Prentice Hall Boston Columbus Indianapolis New York San Francisco Upper Saddle River Imsterdam Cape Town Dubai

More information

Objects First with Java

Objects First with Java ^ Objects First with Java A Practical Introduction using BlueJ David J. Barnes and Michael Kolling Second edition PEARSON Prentice Hall Harlow, England London New York Boston San Francisco Toronto Sydney

More information

Information Retrieval and Web Search

Information Retrieval and Web Search Information Retrieval and Web Search Introduction to IR models and methods Rada Mihalcea (Some of the slides in this slide set come from IR courses taught at UT Austin and Stanford) Information Retrieval

More information

ony Gaddis Haywood Community College STARTING OUT WITH PEARSON Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montreal Toronto

ony Gaddis Haywood Community College STARTING OUT WITH PEARSON Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montreal Toronto STARTING OUT WITH J^"* 1 Ti * ony Gaddis Haywood Community College PEARSON Boston Columbus Indianapolis New York San Francisco Upper Saddle River Amsterdam Cape Town Dubai London Madrid Milan Munich Paris

More information

Information Retrieval. Information Retrieval and Web Search

Information Retrieval. Information Retrieval and Web Search Information Retrieval and Web Search Introduction to IR models and methods Information Retrieval The indexing and retrieval of textual documents. Searching for pages on the World Wide Web is the most recent

More information

Visual C# Tony Gaddis. Haywood Community College STARTING OUT WITH. Piyali Sengupta. Third Edition. Global Edition contributions by.

Visual C# Tony Gaddis. Haywood Community College STARTING OUT WITH. Piyali Sengupta. Third Edition. Global Edition contributions by. STARTING OUT WITH Visual C# 2012 Third Edition Global Edition Tony Gaddis Haywood Community College Global Edition contributions by Piyali Sengupta PEARSON Boston Columbus Indianapolis New York San Francisco

More information

PROBLEM SOLVING USING JAVA WITH DATA STRUCTURES. A Multimedia Approach. Mark Guzdial and Barbara Ericson PEARSON. College of Computing

PROBLEM SOLVING USING JAVA WITH DATA STRUCTURES. A Multimedia Approach. Mark Guzdial and Barbara Ericson PEARSON. College of Computing PROBLEM SOLVING WITH DATA STRUCTURES USING JAVA A Multimedia Approach Mark Guzdial and Barbara Ericson College of Computing Georgia Institute of Technology PEARSON Boston Columbus Indianapolis New York

More information

modern database systems lecture 4 : information retrieval

modern database systems lecture 4 : information retrieval modern database systems lecture 4 : information retrieval Aristides Gionis Michael Mathioudakis spring 2016 in perspective structured data relational data RDBMS MySQL semi-structured data data-graph representation

More information

SQL Queries. for. Mere Mortals. Third Edition. A Hands-On Guide to Data Manipulation in SQL. John L. Viescas Michael J. Hernandez

SQL Queries. for. Mere Mortals. Third Edition. A Hands-On Guide to Data Manipulation in SQL. John L. Viescas Michael J. Hernandez SQL Queries for Mere Mortals Third Edition A Hands-On Guide to Data Manipulation in SQL John L. Viescas Michael J. Hernandez r A TT TAddison-Wesley Upper Saddle River, NJ Boston Indianapolis San Francisco

More information

THE AVR MICROCONTROLLER AND EMBEDDED SYSTEMS. Using Assembly and С

THE AVR MICROCONTROLLER AND EMBEDDED SYSTEMS. Using Assembly and С THE AVR MICROCONTROLLER AND EMBEDDED SYSTEMS Using Assembly and С Muhammad AH Mazidi Sarmad Naimi Sepehr Naimi Prentice Hall Boston Columbus Indianapolis New York San Francisco Upper Saddle River Amsterdam

More information

Business Driven Data Communications

Business Driven Data Communications Business Driven Data Communications Michael S. Gendron PEARSON Boston Columbus Indianapolis New York San Francisco Upper Saddle River Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montreal

More information

CRYPTOGRAPHY AND NETWORK SECURITY

CRYPTOGRAPHY AND NETWORK SECURITY CRYPTOGRAPHY AND NETWORK SECURITY PRINCIPLES AND PRACTICE FIFTH EDITION William Stallings Prentice Hall Boston Columbus Indianapolis New York San Francisco Upper Saddle River Amsterdam Cape Town Dubai

More information

DATABASE SYSTEM CONCEPTS

DATABASE SYSTEM CONCEPTS DATABASE SYSTEM CONCEPTS HENRY F. KORTH ABRAHAM SILBERSCHATZ University of Texas at Austin McGraw-Hill, Inc. New York St. Louis San Francisco Auckland Bogota Caracas Lisbon London Madrid Mexico Milan Montreal

More information

COMPUTER AND ROBOT VISION

COMPUTER AND ROBOT VISION VOLUME COMPUTER AND ROBOT VISION Robert M. Haralick University of Washington Linda G. Shapiro University of Washington A^ ADDISON-WESLEY PUBLISHING COMPANY Reading, Massachusetts Menlo Park, California

More information

Digital System Design with SystemVerilog

Digital System Design with SystemVerilog Digital System Design with SystemVerilog Mark Zwolinski AAddison-Wesley Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich Paris Madrid Capetown Sydney Tokyo

More information

Complete. The. Reference. Christopher Adamson. Mc Grauu. LlLIJBB. New York Chicago. San Francisco Lisbon London Madrid Mexico City

Complete. The. Reference. Christopher Adamson. Mc Grauu. LlLIJBB. New York Chicago. San Francisco Lisbon London Madrid Mexico City The Complete Reference Christopher Adamson Mc Grauu LlLIJBB New York Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore Sydney Toronto Contents Acknowledgments

More information

Outline. Lecture 3: EITN01 Web Intelligence and Information Retrieval. Query languages - aspects. Previous lecture. Anders Ardö.

Outline. Lecture 3: EITN01 Web Intelligence and Information Retrieval. Query languages - aspects. Previous lecture. Anders Ardö. Outline Lecture 3: EITN01 Web Intelligence and Information Retrieval Anders Ardö EIT Electrical and Information Technology, Lund University February 5, 2013 A. Ardö, EIT Lecture 3: EITN01 Web Intelligence

More information

Computers as Components Principles of Embedded Computing System Design

Computers as Components Principles of Embedded Computing System Design Computers as Components Principles of Embedded Computing System Design Third Edition Marilyn Wolf ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY

More information

The Information Retrieval Series. Series Editor W. Bruce Croft

The Information Retrieval Series. Series Editor W. Bruce Croft The Information Retrieval Series Series Editor W. Bruce Croft Sándor Dominich The Modern Algebra of Information Retrieval 123 Sándor Dominich Computer Science Department University of Pannonia Egyetem

More information

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. 6 What is Web Mining? p. 6 Summary of Chapters p. 8 How

More information

Programming in Python 3

Programming in Python 3 Programming in Python 3 A Complete Introduction to the Python Language Mark Summerfield.4.Addison-Wesley Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich

More information

Data Structures and Abstractions with Java

Data Structures and Abstractions with Java Global edition Data Structures and Abstractions with Java Fourth edition Frank M. Carrano Timothy M. Henry Data Structures and Abstractions with Java TM Fourth Edition Global Edition Frank M. Carrano University

More information

CJT^jL rafting Cm ompiler

CJT^jL rafting Cm ompiler CJT^jL rafting Cm ompiler ij CHARLES N. FISCHER Computer Sciences University of Wisconsin Madison RON K. CYTRON Computer Science and Engineering Washington University RICHARD J. LeBLANC, Jr. Computer Science

More information

Integrated Approach. Operating Systems COMPUTER SYSTEMS. LEAHY, Jr. Georgia Institute of Technology. Umakishore RAMACHANDRAN. William D.

Integrated Approach. Operating Systems COMPUTER SYSTEMS. LEAHY, Jr. Georgia Institute of Technology. Umakishore RAMACHANDRAN. William D. COMPUTER SYSTEMS An and Integrated Approach f Architecture Operating Systems Umakishore RAMACHANDRAN Georgia Institute of Technology William D. LEAHY, Jr. Georgia Institute of Technology PEARSON Boston

More information

60-538: Information Retrieval

60-538: Information Retrieval 60-538: Information Retrieval September 7, 2017 1 / 48 Outline 1 what is IR 2 3 2 / 48 Outline 1 what is IR 2 3 3 / 48 IR not long time ago 4 / 48 5 / 48 now IR is mostly about search engines there are

More information

Structured Parallel Programming Patterns for Efficient Computation

Structured Parallel Programming Patterns for Efficient Computation Structured Parallel Programming Patterns for Efficient Computation Michael McCool Arch D. Robison James Reinders ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO

More information

MariaDB Crash Course. A Addison-Wesley. Ben Forta. Upper Saddle River, NJ Boston. Indianapolis. Singapore Mexico City. Cape Town Sydney.

MariaDB Crash Course. A Addison-Wesley. Ben Forta. Upper Saddle River, NJ Boston. Indianapolis. Singapore Mexico City. Cape Town Sydney. MariaDB Crash Course Ben Forta A Addison-Wesley Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich Paris Madrid Cape Town Sydney Tokyo Singapore Mexico City

More information

Chapter 6: Information Retrieval and Web Search. An introduction

Chapter 6: Information Retrieval and Web Search. An introduction Chapter 6: Information Retrieval and Web Search An introduction Introduction n Text mining refers to data mining using text documents as data. n Most text mining tasks use Information Retrieval (IR) methods

More information

CompTIA" Cloud Essentials Certification Study Guide. (Exam CLO-001) ITpreneurs

CompTIA Cloud Essentials Certification Study Guide. (Exam CLO-001) ITpreneurs CompTIA" Cloud Essentials Certification Study Guide (Exam CLO-001) ITpreneurs JGraw-Hill Education and ITpreneurs are independent entities from CompTIA". Is publication and CD-ROM may be used in assisting

More information

Table of contents for The organization of information / Arlene G. Taylor and Daniel N. Joudrey.

Table of contents for The organization of information / Arlene G. Taylor and Daniel N. Joudrey. Table of contents for The organization of information / Arlene G. Taylor and Daniel N. Joudrey. Chapter 1: Organization of Recorded Information The Need to Organize The Nature of Information Organization

More information

Information Search and Retrieval System in Libraries

Information Search and Retrieval System in Libraries Information Search and Retrieval System in Libraries N Rupsing Naik A Madhava Rao Abstract A digital library comprises diverse collections of digital objects representing text, sound, maps, videos, photos,

More information

EMEDIA EXPERT SYSTEMS

EMEDIA EXPERT SYSTEMS EMEDIA EXPERT SYSTEMS Editors R. Rada University of Liverpool К. Tochtermann University of Dortmund Vfo World Scientific Sinaapore Singapore New Jersey London Hong Kong IX CONTENTS PARTI INTRODUCTION TO

More information

Prelude to Programming

Prelude to Programming GLOBAL EDITION Prelude to Programming Concepts and Design SIXTH EDITION Stewart Venit Elizabeth Drake Prelude toprogramming Sixth Edition Global Edition Concepts and Design Stewart Venit Elizabeth Drake

More information

International Journal of Advance Foundation and Research in Science & Engineering (IJAFRSE) Volume 1, Issue 2, July 2014.

International Journal of Advance Foundation and Research in Science & Engineering (IJAFRSE) Volume 1, Issue 2, July 2014. A B S T R A C T International Journal of Advance Foundation and Research in Science & Engineering (IJAFRSE) Information Retrieval Models and Searching Methodologies: Survey Balwinder Saini*,Vikram Singh,Satish

More information

Preface...xi Coverage of this edition...xi Acknowledgements...xiii

Preface...xi Coverage of this edition...xi Acknowledgements...xiii Contents Preface...xi Coverage of this edition...xi Acknowledgements...xiii 1 Basic concepts of information retrieval systems...1 Introduction...1 Features of an information retrieval system...2 Elements

More information

Data Structures and Abstractions with Java

Data Structures and Abstractions with Java Global edition Data Structures and Abstractions with Java Fourth edition Frank M. Carrano Timothy M. Henry Data Structures and Abstractions with Java TM Fourth Edition Global Edition Frank M. Carrano University

More information

DB2 SQL Tuning Tips for z/os Developers

DB2 SQL Tuning Tips for z/os Developers DB2 SQL Tuning Tips for z/os Developers Tony Andrews IBM Press, Pearson pic Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich Paris Madrid Cape Town Sydney

More information

TEXT MINING APPLICATION PROGRAMMING

TEXT MINING APPLICATION PROGRAMMING TEXT MINING APPLICATION PROGRAMMING MANU KONCHADY CHARLES RIVER MEDIA Boston, Massachusetts Contents Preface Acknowledgments xv xix Introduction 1 Originsof Text Mining 4 Information Retrieval 4 Natural

More information

Information Retrieval. (M&S Ch 15)

Information Retrieval. (M&S Ch 15) Information Retrieval (M&S Ch 15) 1 Retrieval Models A retrieval model specifies the details of: Document representation Query representation Retrieval function Determines a notion of relevance. Notion

More information

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A.

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A. Knowledge Retrieval Franz J. Kurfess Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A. 1 Acknowledgements This lecture series has been sponsored by the European

More information

Designing the User Interface

Designing the User Interface Designing the User Interface Strategies for Effective Human-Computer Interaction Second Edition Ben Shneiderman The University of Maryland Addison-Wesley Publishing Company Reading, Massachusetts Menlo

More information

Anany Levitin 3RD EDITION. Arup Kumar Bhattacharjee. mmmmm Analysis of Algorithms. Soumen Mukherjee. Introduction to TllG DCSISFI &

Anany Levitin 3RD EDITION. Arup Kumar Bhattacharjee. mmmmm Analysis of Algorithms. Soumen Mukherjee. Introduction to TllG DCSISFI & Introduction to TllG DCSISFI & mmmmm Analysis of Algorithms 3RD EDITION Anany Levitin Villa nova University International Edition contributions by Soumen Mukherjee RCC Institute of Information Technology


MODERN DATABASE MANAGEMENT

Twelfth Edition, Global Edition Jeffrey A. Hoffer University of Dayton V. Ramesh Indiana University Heikki Topi Bentley University PEARSON Boston Columbus Indianapolis New York


Structured Parallel Programming

Patterns for Efficient Computation Michael McCool Arch D. Robison James Reinders ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO


Fit for Developing Software

Framework for Integrated Tests Rick Mugridge Ward Cunningham PRENTICE HALL Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich


Core Python Programming. Second Edition. Wesley J. Chun

PRENTICE HALL Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich Paris Madrid Cape Town Sydney


Information Retrieval CS Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science

CS 6900, bunescu@ohio.edu. Information Retrieval (IR) is finding material of an unstructured


CHAPTER 8 Multimedia Information Retrieval

Introduction: Text has been the predominant medium for the communication of information. With the availability of better computing capabilities such as availability


Designing and Building an Automatic Information Retrieval System for Handling the Arabic Data

American Journal of Applied Sciences (): -, ISSN -99 Science Publications Ibrahiem M.M. El Emary and Ja'far


A Model for Information Retrieval Agent System Based on Keywords Distribution

Jae-Woo LEE Dept of Computer Science, Kyungbok College, 3, Sinpyeong-ri, Pocheon-si, 487-77, Gyeonggi-do, Korea It2c@koreaackr


Automatic Text Processing

The Transformation, Analysis, and Retrieval of Information by Computer Gerard Salton Cornell University


CLASSIC DATA STRUCTURES IN JAVA

Timothy Budd Oregon State University Boston San Francisco New York London Toronto Sydney Tokyo Singapore Madrid Mexico City Munich Paris Cape Town Hong Kong Montreal CONTENTS


Information Retrieval

Information system management system Model Processing of queries/updates Queries Answer Access to stored data Patrick Lambrix Department of Computer and Information Science Linköpings


Ajloun National University

Study Plan Guide for the Bachelor Degree in Computer Information System First Year hr. 101101 Arabic Language Skills (1) 101099-01110 Introduction to Information Technology - - 01111 Programming Language


Algorithmic Graph Theory and Perfect Graphs

Second Edition Martin Charles Golumbic Caesarea Rothschild Institute University of Haifa Haifa, Israel 2004 ELSEVIER Amsterdam - Boston - Heidelberg - London


This page intentionally left blank

Database Concepts Seventh Edition David M. Kroenke David J. Auer Western Washington University Boston Columbus Indianapolis New York San Francisco Hoboken


OpenGL SUPERBIBLE. Fifth Edition. Comprehensive Tutorial and Reference. Richard S. Wright, Jr. Nicholas Haemel Graham Sellers Benjamin Lipchak

Addison-Wesley Upper Saddle River, NJ Boston Indianapolis San


Information Management (IM)

Information Management (IM) is primarily concerned with the capture, digitization, representation, organization, transformation, and presentation of information;


Web Development and Design Foundations with HTML5

Seventh Edition, Global Edition Terry Ann Felke-Morris,


DATA AND COMPUTER COMMUNICATIONS

Ninth Edition William Stallings Boston Columbus Indianapolis New York San Francisco Upper Saddle River Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montreal


Chapter 3 - Text. Management and Retrieval

Prof. Dr.-Ing. Stefan Deßloch AG Heterogene Informationssysteme Geb. 36, Raum 329 Tel. 0631/205 3275 dessloch@informatik.uni-kl.de Literature: Baeza-Yates, R.;


CS54701: Information Retrieval

Basic Concepts, 19 January 2016, Prof. Chris Clifton. Text Representation: Process of Indexing. Remove Stopword, Stemming, Phrase Extraction etc. Document Parser Extract useful


The Essential Guide to Video Processing

Second Edition EDITOR Al Bovik Department of Electrical and Computer Engineering The University of Texas at Austin Austin, Texas AMSTERDAM BOSTON HEIDELBERG LONDON


Cloud Computing and SOA Convergence in Your Enterprise

A Step-by-Step Guide David S. Linthicum Addison-Wesley Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London


Win32 Network Programming

Windows 95 and Windows NT Network Programming Using MFC Ralph Davis Addison-Wesley Developers Press Reading, Massachusetts Menlo Park, California New York Don Mills, Ontario


Contents. Foreword to Second Edition. Acknowledgments About the Authors

Contents: Foreword; Foreword to Second Edition; Preface; Acknowledgments; About the Authors; Chapter 1 Introduction; 1.1 Why Data Mining?; 1.1.1 Moving toward the Information Age


FrontPage 98: The Complete Reference

Martin S. Matthews Erik B. Poulsen Osborne McGraw-Hill Berkeley New York St. Louis San Francisco Auckland Bogota Hamburg London Madrid Mexico City Milan Montreal New


CS 6320 Natural Language Processing

Information Retrieval Yang Liu Slides modified from Ray Mooney's (http://www.cs.utexas.edu/users/mooney/ir-course/slides/) Introduction of IR System components, basic


The Power of Events. An Introduction to Complex Event Processing in Distributed Enterprise Systems. David Luckham

Addison-Wesley Boston San Francisco New York Toronto Montreal London Munich Paris Madrid


Foundations of Multidimensional and Metric Data Structures

Hanan Samet University of Maryland, College Park ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE


Simulation Modeling and Analysis

FOURTH EDITION Averill M. Law President Averill M. Law & Associates, Inc. Tucson, Arizona, USA www.averill-law.com Boston Burr Ridge, IL Dubuque, IA New York San Francisco


Microsoft Visual Studio 2010

A Beginner's Guide Joe Mayo McGraw-Hill New York Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore Sydney Toronto Contents ACKNOWLEDGMENTS


Indexing and Searching

Introduction: How to retrieve information? A simple alternative is to search the whole text sequentially. Another option is to build data structures over the text (called indices)


James Mayfield. The Johns Hopkins University Applied Physics Laboratory. The Human Language Technology Center of Excellence.

(301) 219-4649 james.mayfield@jhuapl.edu What is Information Retrieval? Evaluation


FACULTY OF ENGINEERING B.E. 4/4 (CSE) II Semester (Old) Examination, June Subject : Information Retrieval Systems (Elective III) Estelar

B.E. 4/4 (CSE) II Semester (Old) Examination, June 2014 Subject: Information Retrieval Systems Code No. 6306 / O 1 Define Information retrieval systems. 3 2 What is precision and recall? 3 3 List the
