信息检索与搜索引擎 Introduction to Information Retrieval GESC1007


1 信息检索与搜索引擎 Introduction to Information Retrieval GESC1007 Philippe Fournier-Viger Full professor School of Natural Sciences and Humanities Spring

2 Last week We discussed: Hashing ( 散列 ) and search trees ( 搜索树 ) Wildcard queries Spell correction

3 Course schedule ( 日程安排 ) Lectures 1 to 7: Introduction; Boolean retrieval ( 布尔检索模型 ); Term vocabulary and posting lists; Dictionaries and tolerant retrieval; Index construction and compression; Scoring, weighting, and the vector space model; Computing scores in a complete search system; Evaluation in information retrieval; Web search engines, advanced topics, and conclusion

4 PHONETIC ( 语音的 ) CORRECTION Write Right Rite Wright 4

5 Phonetic correction Misspellings are often caused by a user typing a query that sounds like the target term. Phonetic hashing: try to group together all terms that sound similar. 5

6 Soundex algorithm
1. Turn every term to be indexed into a 4-character reduced form, e.g. Hermann → H655.
2. Use these reduced forms to create an inverted index (dictionary 词典 ). This dictionary is called the soundex index.
3. Do the same with query terms.
4. When a new query arrives, search using the soundex index.

7 How to calculate the 4-character codes?
1. Retain the first letter of the term.
2. Change all occurrences of the following letters to 0 (zero): A, E, I, O, U, H, W, Y.
3. Change letters to digits as follows: B, F, P, V to 1. C, G, J, K, Q, S, X, Z to 2. D, T to 3. L to 4. M, N to 5. R to 6.
4. Repeatedly remove one out of each pair of consecutive identical digits.
5. Remove all zeros from the resulting text. Pad the resulting text with trailing zeros and return the first four positions, which will consist of a letter followed by three digits.
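The five steps above can be sketched in a few lines of Python. This is a minimal illustration of the variant described on this slide, not a reference implementation; the function name `soundex` and the `CODES` table are names chosen here for the example.

```python
# Letter-to-digit table from steps 2 and 3 of the slide.
CODES = {}
for letters, digit in [("AEIOUHWY", "0"), ("BFPV", "1"), ("CGJKQSXZ", "2"),
                       ("DT", "3"), ("L", "4"), ("MN", "5"), ("R", "6")]:
    for ch in letters:
        CODES[ch] = digit


def soundex(term):
    """Return the 4-character code (one letter + three digits) of a term."""
    # Keep only the letters A-Z (assumption: other characters are ignored).
    term = "".join(ch for ch in term.upper() if ch in CODES)
    if not term:
        return ""
    first = term[0]                            # step 1: retain the first letter
    digits = [CODES[ch] for ch in term[1:]]    # steps 2-3: letters to digits
    collapsed = []
    for d in digits:                           # step 4: collapse identical neighbours
        if not collapsed or collapsed[-1] != d:
            collapsed.append(d)
    tail = "".join(d for d in collapsed if d != "0")  # step 5: remove zeros
    return (first + tail + "000")[:4]          # pad with zeros, keep 4 positions


print(soundex("Hermann"))  # H655
print(soundex("Herman"))   # H655
```

Both spellings map to the same code, so a query for one would find documents indexed under the other in the soundex index.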

8 Observations about Soundex Vowels ( 元音 ) and vowel-like letters (A, E, I, O, U, H, W, Y) are viewed as interchangeable when transcribing names. Consonants ( 辅音 ) with similar sounds are considered to be the same, e.g. D and T. These rules work for most European languages.

9 CHAPTER 4 INDEX CONSTRUCTION PDF p.104 9

10 Introduction We will talk about how to construct an inverted index. This process is called index construction or indexing ( 索引 ). It is performed by software called an indexer ( 索引器 ). [Figure: a collection of documents → indexer → an inverted index to search for documents]

11 Introduction For a Web search engine like Baidu or Bing, the pages to be indexed are collected by a spider or web crawler ( 网络爬虫 ). A web crawler is software that browses the internet periodically so that the index of webpages can be kept up to date.

12 Types of IR systems There are: small scale Information Retrieval systems (e.g. to search documents in a company) large scale Information Retrieval systems (e.g. to search the Web). In general, we want an IR system to be fast. Thus, characteristics of the computer hardware ( 计算机硬件 ) must be considered. 12

13 Computer memory There are two main types of memory in a computer:
Hard drive ( 硬盘驱动器 ): permanent storage, cheaper, slower
RAM (RAM 芯片 ): temporary storage, more expensive, faster

14 About hardware 1) Data access time ( 访问时间 ) Accessing the data in RAM is faster than accessing the data in a hard drive. To increase the speed of an IR system we should keep as much data as possible in RAM. We may use a computer having several gigabytes (GB) of RAM for an IR system. A technique called caching ( 缓存 ) consists of keeping the most frequently accessed data in RAM memory. 14

15 About hardware 2) How the data is organized is important How the data is organized in memory also influences how fast the data can be read or written. In general, if the data that we read is stored contiguously ( 连续的 ) on the hard drive, then reading the data will be faster than if the data is not stored contiguously. [Figure: data stored contiguously vs. data not stored contiguously]

16 About hardware 3) Data compression ( 数据压缩 ) can reduce the time for reading data on the hard drive Data compression refers to techniques for reducing the size of the data. If the data is smaller, reading it is faster. [Figure: uncompressed data vs. compressed data]

17 Simple approach for index construction Step 1. Each document from the collection is read. For each word, a <term, document ID> pair is created. e.g. the pair <Brutus, 1> indicates that the term Brutus appears in document #1

18 Simple approach for index construction Step 1 (continued). e.g. the pair <Brutus, 2> indicates that the term Brutus also appears in document #2

19 Step 2. All the pairs are sorted alphabetically by term. Thus, all pairs representing the same term now appear consecutively.

20 Step 3. The pairs with same terms are then combined to create the inverted index (dictionary) 20

21 Step 3. The pairs are then used to create the inverted index (dictionary). Each entry contains: the term (e.g. Brutus); the frequency of this term (optional; Brutus appears in 2 documents); the posting list (Brutus appears in documents 1 and 2).
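The three steps can be sketched as follows. This is a minimal in-memory illustration: the function name `build_index`, the whitespace tokenization, the lowercasing, and the two short documents are assumptions made for the example.

```python
from itertools import groupby


def build_index(docs):
    """docs maps a document ID to its text.

    Returns a dictionary: term -> (document frequency, posting list).
    """
    # Step 1: create one <term, document ID> pair per word.
    pairs = [(token.lower(), doc_id)
             for doc_id, text in docs.items()
             for token in text.split()]
    # Step 2: sort the pairs, so that equal terms become consecutive.
    pairs.sort()
    # Step 3: combine the pairs of each term into one posting list.
    index = {}
    for term, group in groupby(pairs, key=lambda pair: pair[0]):
        postings = sorted({doc_id for _, doc_id in group})
        index[term] = (len(postings), postings)
    return index


# Two hypothetical documents used only for illustration.
docs = {1: "Brutus killed Caesar", 2: "The noble Brutus told you"}
index = build_index(docs)
print(index["brutus"])  # (2, [1, 2])
print(index["caesar"])  # (1, [1])
```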

22 Example Reuters-RCV1: a collection of about 800,000 news documents published between August 20, 1996 and August 19, 1997. About 1 GB of text; on average 200 tokens per document; 400,000 terms.

23 Example (cont'd) 100 million tokens. Each token requires 32 bits of memory. Storing the texts takes 0.8 GB. This collection of documents can fit in the memory of a desktop computer. However, for larger document collections, this is not possible.

24 Index construction If a computer does not have enough RAM, the index must be created on the hard drive. At any given moment, only part of the data can be stored in RAM. Thus, the list of <term, document ID> pairs must be stored on the hard drive. It must also be sorted on the hard drive. It is not easy to write a software ( 软件 ) program that does this. This is an advanced topic. For more details, see p.71 of the book.

25 Several variations of indexing There are several other approaches for indexing. One of them:
1. An empty dictionary is created in RAM.
2. Documents are read one by one to fill the dictionary.
3. If the memory is full, the current dictionary is saved to disk and a new dictionary is created in memory.
4. The process continues, filling the new dictionary.
5. Finally, all the dictionaries need to be merged to obtain a single dictionary.
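A rough sketch of this block-by-block approach is given below. It is only an illustration under simplifying assumptions: "memory is full" is simulated by a limit on the number of postings (`max_postings`), blocks are written as JSON lines to temporary files, and the merged result is returned in memory, whereas a real indexer would write it back to disk. All function names are hypothetical.

```python
import heapq
import json
import os
import tempfile
from itertools import groupby


def write_block(dictionary, directory):
    """Save one in-memory dictionary to disk, sorted by term."""
    fd, path = tempfile.mkstemp(dir=directory, suffix=".block")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for term in sorted(dictionary):
            f.write(json.dumps([term, dictionary[term]]) + "\n")
    return path


def read_block(path):
    """Stream (term, postings) entries back from a saved block."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            term, postings = json.loads(line)
            yield term, postings


def build_index_in_blocks(docs, max_postings, directory):
    """docs is a sequence of (document ID, text) pairs."""
    block_paths, dictionary, count = [], {}, 0
    for doc_id, text in docs:
        for term in set(text.lower().split()):
            dictionary.setdefault(term, []).append(doc_id)
            count += 1
        if count >= max_postings:          # "memory is full": flush to disk
            block_paths.append(write_block(dictionary, directory))
            dictionary, count = {}, 0
    if dictionary:                         # flush the last partial block
        block_paths.append(write_block(dictionary, directory))
    # Merge all sorted blocks into a single dictionary.
    streams = [read_block(path) for path in block_paths]
    merged_stream = heapq.merge(*streams, key=lambda entry: entry[0])
    merged = {}
    for term, group in groupby(merged_stream, key=lambda entry: entry[0]):
        postings = []
        for _, block_postings in group:
            postings.extend(block_postings)
        merged[term] = sorted(postings)
    return merged


with tempfile.TemporaryDirectory() as tmp:
    example = build_index_in_blocks(
        [(1, "a b"), (2, "b c"), (3, "a c")], max_postings=2, directory=tmp)
print(example)
```

Because every block is sorted by term, the merge only needs to look at the next entry of each block at any moment, which is what makes it feasible when the blocks do not all fit in RAM.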

26 Distributed indexing 分布式索引 Up to now, we have discussed indexing on a single computer. For large document collections (e.g. the World Wide Web), indexing cannot be done efficiently using a single computer. Solution: create a distributed index ( 分布式索引 ). It is an index that is stored on many computers.

27 Distributed indexing 分布式索引 Distributed index The index is distributed on various computers either according to terms or documents. Here we will discuss indexes where the data is organized according to terms rather than documents. 27


29 Distributed indexing 分布式索引 In practice, distributed indexing is often done in the cloud ( 云计算 ) using technologies such as MapReduce. What is the cloud? Many computers with standard parts (processor, memory, disk) that work together, up to a thousand computers, and that can survive the failure of some computers (multiple copies of the data are kept on multiple computers).
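To give an idea of the MapReduce style without any cloud framework, the sketch below simulates the phases inside a single Python process. It is an illustration only: the function names, the number of reducers, and the simple byte-sum partitioning rule are assumptions, and a real system would run the map and reduce tasks on different computers.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map: emit one (term, document ID) pair per distinct term."""
    return [(term, doc_id) for term in set(text.lower().split())]


def partition(term, num_reducers):
    """Decide which reducer is responsible for a term (term-partitioned index)."""
    return sum(term.encode("utf-8")) % num_reducers


def reduce_phase(pairs):
    """Reduce: build the posting list of every term received."""
    postings = defaultdict(list)
    for term, doc_id in sorted(pairs):
        postings[term].append(doc_id)
    return dict(postings)


def distributed_index(docs, num_reducers=3):
    """Return one index shard per (simulated) reducer."""
    buckets = [[] for _ in range(num_reducers)]
    for doc_id, text in docs.items():
        for term, d in map_phase(doc_id, text):
            buckets[partition(term, num_reducers)].append((term, d))
    return [reduce_phase(bucket) for bucket in buckets]


shards = distributed_index({1: "caesar brutus", 2: "brutus noble"}, num_reducers=2)
for number, shard in enumerate(shards):
    print(number, shard)
```

Each term ends up in exactly one shard, which matches the organization by terms rather than by documents.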

30 We will not talk about the details 30

31 Dynamic indexing ( 动态索引 ) Until now, we have assumed that a document collection is static (it never changes, or rarely changes). But most collections are not static: New terms are added to the dictionary. Documents are added or removed (posting lists need to be updated).

32 How to update a dictionary? Simple approach: rebuild the dictionary periodically from scratch (e.g. every day). This is acceptable if: the number of changes over time is small; the delay in making new documents searchable is acceptable; and enough computer resources are available to construct a new index while the old one is still being used.

33 Dynamic indexing with two indexes If new documents need to be indexed quickly: A main index is created to store documents and their posting lists. An auxiliary index is kept in memory to store new documents and their posting lists.

34 Dynamic indexing with two indexes When searching for documents, the search is done on both indexes and the results are merged. Then, the result is shown to the user. Deletions: a list is used to keep track of documents that have been deleted. Updates: updated documents are removed from the indexes and inserted again. 34

35 Dynamic indexing with two indexes When the auxiliary index becomes too large, it is merged with the main index. This can be done periodically. 35
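The main/auxiliary scheme of the last three slides can be sketched as a small class. This is a simplified illustration: both indexes live in memory here, the class name `DynamicIndex` and the threshold `max_aux_postings` are hypothetical, and posting lists are stored as Python sets.

```python
class DynamicIndex:
    def __init__(self, max_aux_postings=4):
        self.main = {}            # term -> set of document IDs (large index)
        self.aux = {}             # term -> set of document IDs (new documents)
        self.deleted = set()      # IDs of documents that have been deleted
        self.max_aux_postings = max_aux_postings

    def add(self, doc_id, text):
        """New documents go to the auxiliary index."""
        for term in set(text.lower().split()):
            self.aux.setdefault(term, set()).add(doc_id)
        if sum(len(p) for p in self.aux.values()) > self.max_aux_postings:
            self.merge()          # the auxiliary index became too large

    def delete(self, doc_id):
        """Deletions are only recorded; the postings are filtered at search time."""
        self.deleted.add(doc_id)

    def _purge(self, doc_id):
        for index in (self.main, self.aux):
            for term in list(index):
                index[term].discard(doc_id)
                if not index[term]:
                    del index[term]

    def update(self, doc_id, text):
        """Updated documents are removed from the indexes and inserted again."""
        self._purge(doc_id)
        self.deleted.discard(doc_id)
        self.add(doc_id, text)

    def merge(self):
        """Merge the auxiliary index into the main index."""
        for term, postings in self.aux.items():
            self.main.setdefault(term, set()).update(postings)
        self.aux = {}
        for doc_id in self.deleted:
            self._purge(doc_id)
        self.deleted.clear()

    def search(self, term):
        """Search both indexes, merge the results, and hide deleted documents."""
        term = term.lower()
        hits = self.main.get(term, set()) | self.aux.get(term, set())
        return sorted(hits - self.deleted)


idx = DynamicIndex(max_aux_postings=3)
idx.add(1, "caesar brutus")
idx.add(2, "brutus noble")      # exceeds the threshold, triggers a merge
idx.add(3, "brutus")
print(idx.search("brutus"))     # [1, 2, 3]
idx.delete(1)
print(idx.search("brutus"))     # [2, 3]
```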

36 How are indexes stored? To store a dictionary, a file can be created for each term (e.g. Shenzhen, Beijing, Brutus, Automobile), containing its posting list. However, many computers do not handle a large number of files well. A better approach: the dictionary is stored in a single file or a database ( 数据库 ). Other solutions may also be used.

37 Performance Constructing a distributed index is more complicated than constructing an index that is stored on a single computer. But index construction and update can be very fast using a cloud (many computers). In practice, many search engines prefer to reconstruct the index from scratch rather than trying to update it.

38 A main index is used for searching User ( 用户 ) searches for documents while a new index is being constructed. Indexer builds an updated index 38

39 Construction of positional indexes We previously discussed positional indexes. Positional index ( 位置索引 ): a dictionary where the positions of terms in documents are stored.
Dictionary terms: City, Shenzhen, Located, China
City: Book1 (3, 25, 38), Book20 (4, 100, 1000)
Shenzhen: Book1 (2, 24, 35), Book20 (3, 500)
This indicates that «Shenzhen» appears as the 2nd, 24th and 35th word in Book1.

40 Construction of positional indexes Positional indexes are constructed in the same way as regular indexes. The main difference is that the positions of terms in documents are kept and stored in the index.
Dictionary terms: City, Shenzhen, Located, China
City: Book1 (3, 25, 38), Book20 (4, 100, 1000)
Shenzhen: Book1 (2, 24, 35), Book20 (3, 500)
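As a small illustration, a positional index can be built by recording the position of each word while reading a document. The function name and the two short texts below are hypothetical; positions are counted from 1, as in the table.

```python
def build_positional_index(docs):
    """docs maps a document ID to its text.

    Returns: term -> {document ID -> list of positions (1 = first word)}.
    """
    index = {}
    for doc_id, text in docs.items():
        for position, token in enumerate(text.lower().split(), start=1):
            index.setdefault(token, {}).setdefault(doc_id, []).append(position)
    return index


# Hypothetical short documents used only for illustration.
books = {
    "Book1": "the city shenzhen is located in china",
    "Book20": "china shenzhen",
}
positional = build_positional_index(books)
print(positional["shenzhen"])  # {'Book1': [3], 'Book20': [2]}
```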

41 Indexes for ranking Some IR systems rank documents from the most relevant to the least relevant. [Figure: list of results ordered from most relevant to least relevant]

42 Indexes for ranking The most relevant results should be shown first to the user. One approach is to sort each posting list by weight or impact (the highest-weighted documents occur first in the index). This allows a search to stop early (since less important or unpopular documents are listed last).
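For a single query term, the idea can be sketched as follows. The weights, document names and function names are made up for the example; how weights are computed is not covered here.

```python
def impact_order(postings):
    """Sort a posting list of (document ID, weight) pairs, highest weight first."""
    return sorted(postings, key=lambda posting: posting[1], reverse=True)


def top_k(ordered_postings, k):
    """Read an impact-ordered posting list and stop after k results."""
    results = []
    for doc_id, weight in ordered_postings:
        if len(results) == k:
            break                 # the remaining documents have lower weights
        results.append((doc_id, weight))
    return results


# Hypothetical weights for one term.
postings = [("Doc1", 0.2), ("Doc2", 0.9), ("Doc3", 0.5)]
print(top_k(impact_order(postings), 2))  # [('Doc2', 0.9), ('Doc3', 0.5)]
```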

43 Security for IR systems Another important consideration for an IR system is security. For example: Employees can search documents in the enterprise database. But some employees should not be able to access top-secret documents. Moreover, even the existence of a document can be sensitive ( 敏感的文件 ). Hence, the IR system should not show documents that a user cannot open.

44 How to ensure security? A solution: use an access control list ( 存取控制表 ). An access control list is a file that indicates the documents that each user can access. It can be viewed as a table (matrix) where rows are users and columns are documents (Doc1, Doc2, Doc3, Doc4). Each cell is 0 (the user can't read the document) or 1 (the user can read the document).

45 How to ensure security? When a user searches for documents (e.g. user1): 1. A set of documents matching the user's query is found using an inverted index (dictionary), e.g. {Doc1, Doc2, Doc3}. 2. The intersection of these documents and the documents that the user can access is calculated. 3. The result is shown to the user: {Doc1}.
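The intersection step can be sketched with Python sets. The access control list below is hypothetical: the values of the original matrix are not available, so the rights are chosen only so that user1 obtains {Doc1}, as in the example.

```python
# Hypothetical access control list: user -> documents the user can read.
acl = {
    "user1": {"Doc1", "Doc4"},
    "user2": {"Doc2", "Doc3"},
}


def secure_search(user, matching_docs, acl):
    """Keep only the matching documents that the user is allowed to read."""
    allowed = acl.get(user, set())      # unknown users can read nothing
    return sorted(set(matching_docs) & allowed)


matches = {"Doc1", "Doc2", "Doc3"}      # documents found with the inverted index
print(secure_search("user1", matches, acl))  # ['Doc1']
```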


47 CHAPTER 5: INDEX COMPRESSION pdf p122 47

48 Introduction An index or dictionary can be very large if there are many documents. Compression ( 压缩 ): the process of reducing the size of an index. There are several compression techniques. They may reduce the storage space required by up to 75%.

49 Benefits of compression 1) We can save some disk space. 2) More data can fit in memory. Thus, we can increase the use of caching ( 缓存 ) (keeping the most frequently accessed information in RAM memory, for faster access, and reducing the number of disk accesses). 3) Transferring data from disk to memory becomes faster because less data is transmitted (the data is compressed). 49

50 Time needed for compression Using compression requires compressing ( 压缩数据 ) and decompressing ( 解压缩数据 ) the data. This is not a difficult task. It can be done very quickly by a computer. Thus, the cost of compression and decompression is small compared to the benefits obtained by compression.

51 Statistical properties of terms in IR Besides, if we apply preprocessing on a set of documents, the size of the dictionary will be reduced. An example: Reuters-RCV1 collection There are 485,494 terms. 51

52 Eliminating the 150 most common words from indexing cuts 25% to 30% of the non positional postings. 52


54 English vs other languages English: the Oxford English Dictionary has about 600,000 words. But this excludes names, numbers, scientific terms, etc. The reduction achieved by compression is greater for some languages, e.g. French. The reason is that French is a morphologically richer language ( 形态丰富的语言 ) than English.

55 Two types of compression
Lossless compression ( 无损压缩 ): we reduce the space occupied by the data, but we do not lose any information. We will talk about this!
Lossy compression ( 有损压缩 ): we reduce the space, but some data is lost. It can save more space.

56 Heaps' law There is a law for estimating the number of terms in a collection of documents: $\text{NumberOfTerms} = k \times \text{NumberOfTokens}^{b}$ In general: $30 \le k \le 100$ and $b \approx 0.5$. NumberOfTokens: the sum of the number of tokens in all documents.

57 Example: for 1 million tokens, we can expect approximately 38,000 different terms. In Reuters-RCV1, the first million tokens contain 38,365 terms. The parameter k depends a lot on the nature of the documents and how they are processed. Case folding and stemming reduce the growth rate of the vocabulary. Spelling errors and numbers increase the vocabulary growth.
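The estimate can be reproduced with a short calculation. The values k = 44 and b = 0.49 used below are the ones reported in the course textbook (Manning et al.) for Reuters-RCV1; they are not given on the slides, so treat them as an assumption of this sketch.

```python
def heaps_law(num_tokens, k=44, b=0.49):
    """Estimate the number of distinct terms: k * num_tokens ** b."""
    return k * num_tokens ** b


estimate = heaps_law(1_000_020)   # first 1,000,020 tokens of the collection
print(round(estimate))            # roughly 38,000 terms
```

The estimate is close to the 38,365 terms actually observed, which is why the law is useful for predicting dictionary size before building an index.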

58 The relationship between collection size and vocabulary size is often linear in log-log space. [Figure: vocabulary size (vertical axis) vs. collection size (horizontal axis)]

59 Frequency of terms In real-life, few terms are accessed very often, many terms are rarely accessed. We can take advantage of this for dictionary compression 59

60 How to store the dictionary? Fixed-length encoding: each term is stored using the same amount of memory (e.g. 20 bytes for each term). Problem 1: if we use a fixed amount of memory for each term, some memory is wasted because not all terms have the same number of characters!

61 How to store the dictionary? Fixed-length encoding: each term is stored using the same amount of memory (e.g. 20 bytes for each term). Problem 2: if the chosen size for storing a term is too small, some long terms cannot be stored in the dictionary. In this example, terms with more than 20 characters cannot be stored.

62 Variable length encoding: Each term is stored using a variable amount of memory This can save a lot of memory! 62

63 Block encoding: each term is preceded by a number indicating the number of letters in the term. This reduces the number of pointers. This can save a lot of memory!

64 Front-coding If a dictionary is sorted, several consecutive words share the same prefix ( 前缀 ). This information can be used to further compress the dictionary. In this example, we don't need to store automat several times. This saves memory!

65 An illustration of the compression Explanation on next slide 65

66 Explanation of the previous slide We have several words: automata, automate, automatic, automation. We want to compress this data to make it smaller. All these words start with "automat", so we write:
8automat <-- Here 8 is the number of letters in the first word, "automata", whose shared prefix is "automat"
*a <-- The * marks the end of the shared prefix; adding the character "a" gives "automata"
1 e <-- Same prefix "automat" plus 1 character, "e", gives "automate"
2 ic <-- Same prefix "automat" plus 2 characters, "ic", gives "automatic"
3 ion <-- Same prefix "automat" plus 3 characters, "ion", gives "automation"
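The idea of storing the shared prefix only once can be sketched as below. This is a simplified illustration that returns the prefix and the suffixes as Python values instead of producing the exact string layout shown on the slide; the function names are hypothetical.

```python
import os


def front_encode(terms):
    """Split a sorted block of terms into a shared prefix and their suffixes."""
    prefix = os.path.commonprefix(terms)   # longest common leading substring
    suffixes = [term[len(prefix):] for term in terms]
    return prefix, suffixes


def front_decode(prefix, suffixes):
    """Rebuild the original terms."""
    return [prefix + suffix for suffix in suffixes]


block = ["automata", "automate", "automatic", "automation"]
prefix, suffixes = front_encode(block)
print(prefix, suffixes)   # automat ['a', 'e', 'ic', 'ion']

original_size = sum(len(term) for term in block)
encoded_size = len(prefix) + sum(len(suffix) for suffix in suffixes)
print(original_size, encoded_size)   # 35 14
```

In this example, 35 characters shrink to 14 before counting the few extra length markers, which shows where the saving comes from.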

67 How much reduction? 67

68 Compression of posting lists It is also possible to compress posting lists. Normally, in a dictionary, for each term, we store the full list of documents where it appears. Each document is represented by a number (identifier), which uses a fixed amount of memory. To save memory, we can use a variable amount of memory to store the identifiers of documents. There are many approaches. See book p
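One such approach, described in the course textbook, is to store the gaps between consecutive document identifiers and to encode each gap with variable byte encoding (7 data bits per byte, with the highest bit set on the last byte of each number). The sketch below is an illustration of that one technique, not the only option; the function names are chosen here.

```python
def vb_encode_number(n):
    """Encode one non-negative integer as a list of byte values."""
    out = []
    while True:
        out.insert(0, n % 128)
        if n < 128:
            break
        n //= 128
    out[-1] += 128                 # mark the last byte of this number
    return out


def encode_postings(doc_ids):
    """Encode a sorted posting list as gaps, each gap variable-byte encoded."""
    encoded, previous = [], 0
    for doc_id in doc_ids:
        encoded.extend(vb_encode_number(doc_id - previous))
        previous = doc_id
    return bytes(encoded)


def decode_postings(data):
    """Decode the bytes back into the original document identifiers."""
    doc_ids, n, previous = [], 0, 0
    for byte in data:
        if byte < 128:
            n = 128 * n + byte
        else:                      # last byte of the current gap
            n = 128 * n + (byte - 128)
            previous += n
            doc_ids.append(previous)
            n = 0
    return doc_ids


postings_list = [824, 829, 215406]
compressed = encode_postings(postings_list)
print(list(compressed))            # [6, 184, 133, 13, 12, 177]
print(len(compressed))             # 6 bytes instead of 12 with 4-byte identifiers
```

Small gaps, which are common for frequent terms, need only one byte each, so frequent terms benefit the most.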

69 Compression vs Dictionary size 3600 MB for the collection of documents 107 MB for storing the index ( ) 69

70 Conclusion Today, we quickly discussed Chapters 4 and 5. We will continue next week. The PPT slides are on the website.

71 References Manning, C. D., Raghavan, P., Schütze, H. Introduction to information retrieval. Cambridge: Cambridge University Press,


More information

Elementary IR: Scalable Boolean Text Search. (Compare with R & G )

Elementary IR: Scalable Boolean Text Search. (Compare with R & G ) Elementary IR: Scalable Boolean Text Search (Compare with R & G 27.1-3) Information Retrieval: History A research field traditionally separate from Databases Hans P. Luhn, IBM, 1959: Keyword in Context

More information

Index Construction 1

Index Construction 1 Index Construction 1 October, 2009 1 Vorlage: Folien von M. Schütze 1 von 43 Index Construction Hardware basics Many design decisions in information retrieval are based on hardware constraints. We begin

More information

Chapter 11 SHANDONG UNIVERSITY 1

Chapter 11 SHANDONG UNIVERSITY 1 Chapter 11 File System Implementation ti SHANDONG UNIVERSITY 1 Contents File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency and

More information

Lecture 3 Index Construction and Compression. Many thanks to Prabhakar Raghavan for sharing most content from the following slides

Lecture 3 Index Construction and Compression. Many thanks to Prabhakar Raghavan for sharing most content from the following slides Lecture 3 Index Construction and Compression Many thanks to Prabhakar Raghavan for sharing most content from the following slides Recap of the previous lecture Tokenization Term equivalence Skip pointers

More information

Text Analytics. Index-Structures for Information Retrieval. Ulf Leser

Text Analytics. Index-Structures for Information Retrieval. Ulf Leser Text Analytics Index-Structures for Information Retrieval Ulf Leser Content of this Lecture Inverted files Storage structures Phrase and proximity search Building and updating the index Using a RDBMS Ulf

More information

Index Construction. Slides by Manning, Raghavan, Schutze

Index Construction. Slides by Manning, Raghavan, Schutze Introduction to Information Retrieval ΕΠΛ660 Ανάκτηση Πληροφοριών και Μηχανές Αναζήτησης ης Index Construction ti Introduction to Information Retrieval Plan Last lecture: Dictionary data structures Tolerant

More information

Multimedia Information Extraction and Retrieval Term Frequency Inverse Document Frequency

Multimedia Information Extraction and Retrieval Term Frequency Inverse Document Frequency Multimedia Information Extraction and Retrieval Term Frequency Inverse Document Frequency Ralf Moeller Hamburg Univ. of Technology Acknowledgement Slides taken from presentation material for the following

More information

Information Retrieval

Information Retrieval Introduction to CS3245 Lecture 5: Index Construction 5 Last Time Dictionary data structures Tolerant retrieval Wildcards Spelling correction Soundex a-hu hy-m n-z $m mace madden mo among amortize on abandon

More information

数据挖掘 Introduction to Data Mining

数据挖掘 Introduction to Data Mining 数据挖掘 Introduction to Data Mining Philippe Fournier-Viger Full professor School of Natural Sciences and Humanities philfv8@yahoo.com Spring 2019 S8700113C 1 Introduction Last week: Association Analysis

More information

Information Retrieval

Information Retrieval Introduction to CS3245 Lecture 5: Index Construction 5 CS3245 Last Time Dictionary data structures Tolerant retrieval Wildcards Spelling correction Soundex a-hu hy-m n-z $m mace madden mo among amortize

More information

Introduction to Computer Science

Introduction to Computer Science Introduction to Computer Science 郝建业副教授 软件学院 http://www.escience.cn/people/jianye/index.html Lecturer Jianye HAO ( 郝建业 ) Email: jianye.hao@tju.edu.cn Tutor: Li Shuxin ( 李姝昕 ) Email: 957005030@qq.com Outline

More information

Chapter 2. Architecture of a Search Engine

Chapter 2. Architecture of a Search Engine Chapter 2 Architecture of a Search Engine Search Engine Architecture A software architecture consists of software components, the interfaces provided by those components and the relationships between them

More information

Informa(on Retrieval

Informa(on Retrieval Introduc*on to Informa(on Retrieval CS276: Informa*on Retrieval and Web Search Pandu Nayak and Prabhakar Raghavan Lecture 4: Index Construc*on Plan Last lecture: Dic*onary data structures Tolerant retrieval

More information

计算机信息表达. Information Representation 刘志磊天津大学智能与计算学部

计算机信息表达. Information Representation 刘志磊天津大学智能与计算学部 计算机信息表达 刘志磊天津大学智能与计算学部 Bits & Bytes Bytes & Letters More Bytes Bit ( 位 ) the smallest unit of storage Everything in a computer is 0 s and 1 s Bits why? Computer Hardware Chip uses electricity 0/1 states

More information

modern database systems lecture 4 : information retrieval

modern database systems lecture 4 : information retrieval modern database systems lecture 4 : information retrieval Aristides Gionis Michael Mathioudakis spring 2016 in perspective structured data relational data RDBMS MySQL semi-structured data data-graph representation

More information

Introduction to Information Retrieval

Introduction to Information Retrieval Introduction Inverted index Processing Boolean queries Course overview Introduction to Information Retrieval http://informationretrieval.org IIR 1: Boolean Retrieval Hinrich Schütze Institute for Natural

More information

Bi-monthly report. Tianyi Luo

Bi-monthly report. Tianyi Luo Bi-monthly report Tianyi Luo 1 Work done in this week Write a crawler plus based on keywords (Support Chinese and English) Modify a Sina weibo crawler (340M/day) Offline learning to rank module is completed

More information

Outline of the course

Outline of the course Outline of the course Introduction to Digital Libraries (15%) Description of Information (30%) Access to Information (30%) User Services (10%) Additional topics (15%) Buliding of a (small) digital library

More information

CS6200 Information Retrieval. David Smith College of Computer and Information Science Northeastern University

CS6200 Information Retrieval. David Smith College of Computer and Information Science Northeastern University CS6200 Information Retrieval David Smith College of Computer and Information Science Northeastern University Indexing Process!2 Indexes Storing document information for faster queries Indexes Index Compression

More information

CS105 Introduction to Information Retrieval

CS105 Introduction to Information Retrieval CS105 Introduction to Information Retrieval Lecture: Yang Mu UMass Boston Slides are modified from: http://www.stanford.edu/class/cs276/ Information Retrieval Information Retrieval (IR) is finding material

More information

Chapter 6: Information Retrieval and Web Search. An introduction

Chapter 6: Information Retrieval and Web Search. An introduction Chapter 6: Information Retrieval and Web Search An introduction Introduction n Text mining refers to data mining using text documents as data. n Most text mining tasks use Information Retrieval (IR) methods

More information

: Operating System 计算机原理与设计

: Operating System 计算机原理与设计 .. 0117401: Operating System 计算机原理与设计 Chapter 11: File system interface( 文件系统接口 ) 陈香兰 xlanchen@ustc.edu.cn http://staff.ustc.edu.cn/~xlanchen Computer Application Laboratory, CS, USTC @ Hefei Embedded

More information

EECS 395/495 Lecture 3 Scalable Indexing, Searching, and Crawling

EECS 395/495 Lecture 3 Scalable Indexing, Searching, and Crawling EECS 395/495 Lecture 3 Scalable Indexing, Searching, and Crawling Doug Downey Based partially on slides by Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze Announcements Project progress report

More information

Indexing and Searching

Indexing and Searching Indexing and Searching Berlin Chen Department of Computer Science & Information Engineering National Taiwan Normal University References: 1. Modern Information Retrieval, chapter 9 2. Information Retrieval:

More information

The State and Opportunities of HPC Applications in China. Ruibo Wang National University of Defense Technology

The State and Opportunities of HPC Applications in China. Ruibo Wang National University of Defense Technology The State and Opportunities of HPC Applications in China Ruibo Wang National University of Defense Technology Outline Brief introduction to the Sites Applications Fusion Development of HPC, Cloud & Big

More information

第二小题 : 逻辑隔离 (10 分 ) OpenFlow Switch1 (PC-A/Netfpga) OpenFlow Switch2 (PC-B/Netfpga) ServerB PC-2. Switching Hub

第二小题 : 逻辑隔离 (10 分 ) OpenFlow Switch1 (PC-A/Netfpga) OpenFlow Switch2 (PC-B/Netfpga) ServerB PC-2. Switching Hub 第二小题 : 逻辑隔离 (10 分 ) 一 实验背景云平台服务器上的不同虚拟服务器, 分属于不同的用户 用户远程登录自己的虚拟服务器之后, 安全上不允许直接访问同一局域网的其他虚拟服务器 二 实验目的搭建简单网络, 通过逻辑隔离的方法, 实现用户能远程登录局域网内自己的虚拟内服务器, 同时不允许直接访问同一局域网的其他虚拟服务器 三 实验环境搭建如图 1-1 所示, 我们会创建一个基于 OpenFlow

More information

Information Retrieval

Information Retrieval Introduction to Information Retrieval CS276 Information Retrieval and Web Search Christopher Manning and Prabhakar Raghavan Lecture 1: Boolean retrieval Information Retrieval Information Retrieval (IR)

More information

Information Retrieval

Information Retrieval Information Retrieval Dictionaries & Tolerant Retrieval Gintarė Grigonytė gintare@ling.su.se Department of Linguistics and Philology Uppsala University Slides based on previous IR course given by Jörg

More information

Technology: Anti-social Networking 科技 : 反社交网络

Technology: Anti-social Networking 科技 : 反社交网络 Technology: Anti-social Networking 科技 : 反社交网络 1 Technology: Anti-social Networking 科技 : 反社交网络 The Growth of Online Communities 社交网络使用的增长 Read the text below and do the activity that follows. 阅读下面的短文, 然后完成练习

More information

Introduc)on to. CS60092: Informa0on Retrieval

Introduc)on to. CS60092: Informa0on Retrieval Introduc)on to CS60092: Informa0on Retrieval Ch. 4 Index construc)on How do we construct an index? What strategies can we use with limited main memory? Sec. 4.1 Hardware basics Many design decisions in

More information

Indexing and Searching

Indexing and Searching Indexing and Searching Berlin Chen Department of Computer Science & Information Engineering National Taiwan Normal University References: 1. Modern Information Retrieval, chapter 8 2. Information Retrieval:

More information

Index Construction Introduction to Information Retrieval INF 141/ CS 121 Donald J. Patterson

Index Construction Introduction to Information Retrieval INF 141/ CS 121 Donald J. Patterson Index Construction Introduction to Information Retrieval INF 141/ CS 121 Donald J. Patterson Content adapted from Hinrich Schütze http://www.informationretrieval.org Index Construction Overview Introduction

More information

Information Retrieval

Information Retrieval Information Retrieval Natural Language Processing: Lecture 12 30.11.2017 Kairit Sirts Homework 4 things that seemed to work Bidirectional LSTM instead of unidirectional Change LSTM activation to sigmoid

More information

GUJARAT TECHNOLOGICAL UNIVERSITY

GUJARAT TECHNOLOGICAL UNIVERSITY GUJARAT TECHNOLOGICAL UNIVERSITY INFORMATION TECHNOLOGY DATA COMPRESSION AND DATA RETRIVAL SUBJECT CODE: 2161603 B.E. 6 th SEMESTER Type of course: Core Prerequisite: None Rationale: Data compression refers

More information

Digital Libraries: Language Technologies

Digital Libraries: Language Technologies Digital Libraries: Language Technologies RAFFAELLA BERNARDI UNIVERSITÀ DEGLI STUDI DI TRENTO P.ZZA VENEZIA, ROOM: 2.05, E-MAIL: BERNARDI@DISI.UNITN.IT Contents 1 Recall: Inverted Index..........................................

More information

Recap of the previous lecture. This lecture. A naïve dictionary. Introduction to Information Retrieval. Dictionary data structures Tolerant retrieval

Recap of the previous lecture. This lecture. A naïve dictionary. Introduction to Information Retrieval. Dictionary data structures Tolerant retrieval Ch. 2 Recap of the previous lecture Introduction to Information Retrieval Lecture 3: Dictionaries and tolerant retrieval The type/token distinction Terms are normalized types put in the dictionary Tokenization

More information

1 o Semestre 2007/2008

1 o Semestre 2007/2008 Efficient Departamento de Engenharia Informática Instituto Superior Técnico 1 o Semestre 2007/2008 Outline 1 2 3 4 5 6 7 Outline 1 2 3 4 5 6 7 Text es An index is a mechanism to locate a given term in

More information

3-1. Dictionaries and Tolerant Retrieval. Most slides were adapted from Stanford CS 276 course and University of Munich IR course.

3-1. Dictionaries and Tolerant Retrieval. Most slides were adapted from Stanford CS 276 course and University of Munich IR course. 3-1. Dictionaries and Tolerant Retrieval Most slides were adapted from Stanford CS 276 course and University of Munich IR course. 1 Dictionary data structures for inverted indexes Sec. 3.1 The dictionary

More information

Part 2: Boolean Retrieval Francesco Ricci

Part 2: Boolean Retrieval Francesco Ricci Part 2: Boolean Retrieval Francesco Ricci Most of these slides comes from the course: Information Retrieval and Web Search, Christopher Manning and Prabhakar Raghavan Content p Term document matrix p Information

More information

Information Retrieval. Lecture 10 - Web crawling

Information Retrieval. Lecture 10 - Web crawling Information Retrieval Lecture 10 - Web crawling Seminar für Sprachwissenschaft International Studies in Computational Linguistics Wintersemester 2007 1/ 30 Introduction Crawling: gathering pages from the

More information

Color LaserJet Pro MFP M477 入门指南

Color LaserJet Pro MFP M477 入门指南 Color LaserJet Pro MFP M477 入门指南 Getting Started Guide 2 www.hp.com/support/colorljm477mfp www.register.hp.com ZHCN 4. 在控制面板上进行初始设置...2 5. 选择一种连接方式并准备安装软件...2 6. 找到或下载软件安装文件...3 7. 安装软件...3 8. 移动和无线打印

More information

Machine Vision Market Analysis of 2015 Isabel Yang

Machine Vision Market Analysis of 2015 Isabel Yang Machine Vision Market Analysis of 2015 Isabel Yang CHINA Machine Vision Union Content 1 1.Machine Vision Market Analysis of 2015 Revenue of Machine Vision Industry in China 4,000 3,500 2012-2015 (Unit:

More information

Encoding. A thesis submitted to the Graduate School of University of Cincinnati in

Encoding. A thesis submitted to the Graduate School of University of Cincinnati in Lossless Data Compression for Security Purposes Using Huffman Encoding A thesis submitted to the Graduate School of University of Cincinnati in a partial fulfillment of requirements for the degree of Master

More information

Indexing and Searching

Indexing and Searching Indexing and Searching Introduction How to retrieval information? A simple alternative is to search the whole text sequentially Another option is to build data structures over the text (called indices)

More information

HBase 在 hulu 的使用和实践. hulu

HBase 在 hulu 的使用和实践. hulu HBase 在 hulu 的使用和实践 张虔熙 @ hulu qianxi.zhang@hulu.com About hulu About me 张虔熙 ü 软件工程师 @Hulu 大数据平台组 ü 专注于分布式计算和存储技术 ü 热衷于参与开源社区贡献代码 üqianxi.zhang@hulu.com Agenda Overview Audience Platform( 用户画像系统 ) Auto

More information

Overview. Lecture 3: Index Representation and Tolerant Retrieval. Type/token distinction. IR System components

Overview. Lecture 3: Index Representation and Tolerant Retrieval. Type/token distinction. IR System components Overview Lecture 3: Index Representation and Tolerant Retrieval Information Retrieval Computer Science Tripos Part II Ronan Cummins 1 Natural Language and Information Processing (NLIP) Group 1 Recap 2

More information

Information Retrieval and Organisation

Information Retrieval and Organisation Information Retrieval and Organisation Dell Zhang Birkbeck, University of London 2016/17 IR Chapter 01 Boolean Retrieval Example IR Problem Let s look at a simple IR problem Suppose you own a copy of Shakespeare

More information

CSE 562 Database Systems

CSE 562 Database Systems Goal of Indexing CSE 562 Database Systems Indexing Some slides are based or modified from originals by Database Systems: The Complete Book, Pearson Prentice Hall 2 nd Edition 08 Garcia-Molina, Ullman,

More information

OTAD Application Note

OTAD Application Note OTAD Application Note Document Title: OTAD Application Note Version: 1.0 Date: 2011-08-30 Status: Document Control ID: Release _OTAD_Application_Note_CN_V1.0 Copyright Shanghai SIMCom Wireless Solutions

More information

Command Dictionary CUSTOM

Command Dictionary CUSTOM 命令模式 CUSTOM [(filename)] [parameters] Executes a "custom-designed" command which has been provided by special programming using the GHS Programming Interface. 通过 GHS 程序接口, 执行一个 用户设计 的命令, 该命令由其他特殊程序提供 参数说明

More information

Instructor: Stefan Savev

Instructor: Stefan Savev LECTURE 2 What is indexing? Indexing is the process of extracting features (such as word counts) from the documents (in other words: preprocessing the documents). The process ends with putting the information

More information

public static InetAddress getbyname(string host) public static InetAddress getlocalhost() public static InetAddress[] getallbyname(string host)

public static InetAddress getbyname(string host) public static InetAddress getlocalhost() public static InetAddress[] getallbyname(string host) 网络编程 杨亮 网络模型 访问 网络 Socket InetAddress 类 public static InetAddress getbyname(string host) public static InetAddress getlocalhost() public static InetAddress[] getallbyname(string host) public class OreillyByName

More information

Boolean Retrieval. Manning, Raghavan and Schütze, Chapter 1. Daniël de Kok

Boolean Retrieval. Manning, Raghavan and Schütze, Chapter 1. Daniël de Kok Boolean Retrieval Manning, Raghavan and Schütze, Chapter 1 Daniël de Kok Boolean query model Pose a query as a boolean query: Terms Operations: AND, OR, NOT Example: Brutus AND Caesar AND NOT Calpuria

More information

INFO 4300 / CS4300 Information Retrieval. slides adapted from Hinrich Schütze s, linked from

INFO 4300 / CS4300 Information Retrieval. slides adapted from Hinrich Schütze s, linked from INFO 4300 / CS4300 Information Retrieval slides adapted from Hinrich Schütze s, linked from http://informationretrieval.org/ IR 7: Scores in a Complete Search System Paul Ginsparg Cornell University, Ithaca,

More information

2.8 Megapixel industrial camera for extreme environments

2.8 Megapixel industrial camera for extreme environments Prosilica GT 1920 Versatile temperature range for extreme environments PTP PoE P-Iris and DC-Iris lens control 2.8 Megapixel industrial camera for extreme environments Prosilica GT1920 is a 2.8 Megapixel

More information