The Lucene Search Engine

1 The Lucene Search Engine
Kira Radinsky
Based on material from Thomas Paul and Steven J. Owens

2 What is Lucene?
- The name: Doug Cutting's wife's middle name (and her maternal grandmother's first name).
- An open source set of Java classes: search engine / document classifier / indexer.
- Developed by Doug Cutting (1996), whose career spans Xerox, Apple, Excite, Nutch, Yahoo and Cloudera. He is the founder of Hadoop and sits on the board of directors of the Apache Software Foundation.
- An Apache Jakarta product with strong open source community support.
- A high-performance, full-featured text search engine library.
- An easy to use yet powerful API.

3 Use the Source, Luke
- Document: the unit of indexing and search, made up of Fields.
- Field: represents a section of a Document: a name for the section plus the actual data.
- Analyzer: abstract class (it provides the interface) that turns a Document into tokens for later indexing. StandardAnalyzer is one concrete class.
- IndexWriter: creates and maintains indexes.
- IndexSearcher: searches through an index.
- QueryParser: parses a search string into a Query that can be run against an index.
- Query: abstract class that contains the search criteria created by the QueryParser.
- Hits: contains the Document objects that are returned by running the Query object against the index.

4 Indexing a Document

5 Document from an article
    private Document createDocument(String article, String author,
                                    String title, String topic,
                                    String url, Date dateWritten) {
        Document document = new Document();
        // Tokenized, indexed and stored
        document.add(Field.Text("author", author));
        document.add(Field.Text("title", title));
        document.add(Field.Text("topic", topic));
        // Stored only: returned with results but not searchable
        document.add(Field.UnIndexed("url", url));
        // Indexed as a single term, not broken down
        document.add(Field.Keyword("date", dateWritten));
        // Searchable, but the full text is not kept in the index
        document.add(Field.UnStored("article", article));
        return document;
    }

6 The Field Object
Factory method                              Tokenized  Indexed  Stored  Use for
Field.Text(String name, String value)       Yes        Yes      Yes     contents you want stored
Field.Text(String name, Reader value)       Yes        Yes      No      contents you don't want stored
Field.Keyword(String name, String value)    No         Yes      Yes     values you don't want broken down
Field.UnIndexed(String name, String value)  No         No       Yes     values you don't want indexed
Field.UnStored(String name, String value)   Yes        Yes      No      values you don't want stored
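
The table can be read as three independent flags per factory method. The following is a minimal sketch of that reading, assuming a hypothetical enum named FieldKindSketch; it is not a Lucene class and only restates the rows above in code.

```java
// Hypothetical model of the Field table; these names are NOT part of Lucene.
enum FieldKindSketch {
    TEXT_STRING(true, true, true),   // Field.Text(String, String)
    TEXT_READER(true, true, false),  // Field.Text(String, Reader)
    KEYWORD(false, true, true),      // Field.Keyword(String, String)
    UNINDEXED(false, false, true),   // Field.UnIndexed(String, String)
    UNSTORED(true, true, false);     // Field.UnStored(String, String)

    final boolean tokenized;
    final boolean indexed;
    final boolean stored;

    FieldKindSketch(boolean tokenized, boolean indexed, boolean stored) {
        this.tokenized = tokenized;
        this.indexed = indexed;
        this.stored = stored;
    }

    /** True when a query can match on the field's contents. */
    boolean searchable() {
        return indexed;
    }

    /** True when the original value can be read back from a search result. */
    boolean retrievable() {
        return stored;
    }
}

class FieldKindDemo {
    public static void main(String[] args) {
        // Print one line per row of the table.
        for (FieldKindSketch kind : FieldKindSketch.values()) {
            System.out.printf("%-12s tokenized=%b indexed=%b stored=%b%n",
                    kind, kind.tokenized, kind.indexed, kind.stored);
        }
    }
}
```

Read this way, the url field in the earlier slide (UnIndexed) can be shown with a result but never matched by a query, while the article field (UnStored) can be matched but not read back.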

7 Store a Document in the index
    String indexDirectory = "lucene-index";

    private void indexDocument(Document document) throws Exception {
        Analyzer analyzer = new StandardAnalyzer();
        // false: add to the existing index instead of creating a new one
        IndexWriter writer = new IndexWriter(indexDirectory, analyzer, false);
        writer.addDocument(document);
        writer.optimize();
        // Closing the writer releases its lock on the index
        writer.close();
    }

8 Analyzers and Tokenizers
- SimpleAnalyzer: uses a Tokenizer that splits the input into words and converts them to lower case.
- StopAnalyzer: includes the lower-case filter, and also has a filter that drops "stop words", such as articles (a, an, the, etc.), that occur so commonly in English that they are effectively noise for searching purposes. StopAnalyzer comes with a set of stop words, but you can instantiate it with your own array of stop words.
- StandardAnalyzer: does both lower-case and stop-word filtering, and in addition tries to do some basic clean-up of words, for example taking out apostrophes ( ' ) and removing periods from acronyms (e.g. "T.L.A." becomes "TLA").
- Lucene Sandbox: here you can find analyzers for other languages.
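
The three filtering steps described above (lower-casing, stop-word removal, and clean-up of apostrophes and acronym periods) can be imitated in plain Java. This is only a sketch of the described behaviour, not Lucene's tokenizer: the class name AnalyzerSketch, the three-word stop list and the splitting rule are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for an analyzer; it does not use or reproduce Lucene code.
class AnalyzerSketch {
    // Tiny illustrative stop-word list (an assumption, not Lucene's built-in set).
    static final Set<String> STOP_WORDS = Set.of("a", "an", "the");

    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        // Step 1: lower-case the whole input.
        String lower = text.toLowerCase(Locale.ROOT);
        // Split on anything that is not an ASCII letter, digit, period or apostrophe.
        for (String raw : lower.split("[^a-z0-9.']+")) {
            // Step 2: clean-up, drop apostrophes and periods ("t.l.a." -> "tla").
            String cleaned = raw.replace("'", "").replace(".", "");
            // Step 3: drop empty tokens and stop words.
            if (cleaned.isEmpty() || stopWords.contains(cleaned)) {
                continue;
            }
            tokens.add(cleaned);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [tla, isnt, acronym]
        System.out.println(analyze("The T.L.A. isn't an Acronym.", STOP_WORDS));
    }
}
```

Whatever analyzer is chosen, the same one should be used for indexing and for parsing queries, as the slides do with StandardAnalyzer, so that query terms are normalised the same way as indexed terms.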

9 Adding to an Index
    public void indexArticle(String article, String author, String title,
                             String topic, String url, Date dateWritten)
            throws Exception {
        Document document = createDocument(article, author, title,
                                           topic, url, dateWritten);
        indexDocument(document);
    }

10 Searching the Index

11 Searching
    IndexSearcher is = new IndexSearcher(indexDirectory);
    // Use the same analyzer that was used for indexing
    Analyzer analyzer = new StandardAnalyzer();
    // "article" is the default field for terms without a field prefix
    QueryParser parser = new QueryParser("article", analyzer);
    Query query = parser.parse(searchCriteria);
    Hits hits = is.search(query);

12 Extracting Document objects
    for (int i = 0; i < hits.length(); i++) {
        Document doc = hits.doc(i);
        // display the articles that were found to the user
    }
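
The index, search and extract cycle shown in these slides rests on an inverted index: a map from each term to the documents that contain it. The toy class below sketches that idea with JDK collections only. ToyInvertedIndex and its methods are hypothetical names; Lucene's real index format, scoring and Hits handling are far more elaborate.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical toy index; it only illustrates the term -> documents mapping.
class ToyInvertedIndex {
    private final List<String> documents = new ArrayList<>();
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    /** Stores the text and records, for every term, that this document contains it. */
    int addDocument(String text) {
        int docId = documents.size();
        documents.add(text);
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    /** Returns the ids of all documents containing the term, in ascending order. */
    List<Integer> search(String term) {
        TreeSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null ? List.of() : new ArrayList<>(ids);
    }

    /** Returns the stored text of a document, loosely analogous to hits.doc(i). */
    String doc(int docId) {
        return documents.get(docId);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument("Relativity and quantum physics");
        index.addDocument("String theory");
        index.addDocument("Quantum computing");
        for (int id : index.search("quantum")) {
            System.out.println(id + ": " + index.doc(id));
        }
    }
}
```

A lookup touches only the entry for the query term, which is why searching does not require scanning every document.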

13 Search Criteria
Supports several kinds of search: Boolean operators (AND, OR and NOT), fuzzy searches, proximity searches, wildcard searches, and range searches. Examples:
    author:henry relativity AND "quantum physics"
    "string theory" NOT Einstein
    "Galileo Kepler"~5
    author:johnson date:[01/01/2004 TO 01/31/2004]
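
To make the proximity example concrete, the sketch below checks whether two words occur within a given number of word positions of each other, which is the intent behind "Galileo Kepler"~5. It is a simplified reading written for illustration: ProximitySketch and within are hypothetical names, and Lucene's actual slop computation for phrase queries is more involved than a plain position difference.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper; a simplified reading of proximity search, not Lucene's algorithm.
class ProximitySketch {
    /** True if both words occur and some pair of occurrences is at most maxDistance positions apart. */
    static boolean within(String text, String first, String second, int maxDistance) {
        List<String> words = Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\W+"));
        String a = first.toLowerCase(Locale.ROOT);
        String b = second.toLowerCase(Locale.ROOT);
        for (int i = 0; i < words.size(); i++) {
            if (!words.get(i).equals(a)) {
                continue;
            }
            for (int j = 0; j < words.size(); j++) {
                // Compare word positions, ignoring a word matched against itself.
                if (i != j && words.get(j).equals(b) && Math.abs(i - j) <= maxDistance) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "Galileo and later Kepler studied the planets";
        // Positions 0 and 3 are three apart, so this prints true.
        System.out.println(within(text, "Galileo", "Kepler", 5));
    }
}
```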

14 Thread Safety
Indexing and searching are not only thread safe, but process safe. This means that:
- Multiple index searchers can read the Lucene index files at the same time.
- An index writer or reader can edit the Lucene index files while searches are ongoing.
- Multiple index writers or readers can try to edit the Lucene index files at the same time (it's important for the index writer/reader to be closed so that it releases the file lock).
The query parser is not thread safe. The index writer, however, is thread safe.
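
One common way to live with a component that is not thread safe, such as the query parser mentioned above, is to give every thread its own instance. The sketch below shows that pattern with java.lang.ThreadLocal. ParserStandIn is a hypothetical placeholder with some mutable state, not Lucene's QueryParser, and whether this pattern suits a given application is a design choice rather than something the slides prescribe.

```java
// Hypothetical sketch: one parser instance per thread, using only the JDK.
class PerThreadParserSketch {
    /** Stand-in for a parser that keeps mutable state and so must not be shared. */
    static final class ParserStandIn {
        private final StringBuilder scratch = new StringBuilder();

        String parse(String criteria) {
            // Reusing the buffer is what would make sharing across threads unsafe.
            scratch.setLength(0);
            scratch.append(criteria.trim());
            return scratch.toString();
        }
    }

    // Each thread lazily creates, then keeps, its own ParserStandIn.
    private static final ThreadLocal<ParserStandIn> PARSER =
            ThreadLocal.withInitial(ParserStandIn::new);

    static ParserStandIn currentParser() {
        return PARSER.get();
    }

    static String parse(String criteria) {
        return currentParser().parse(criteria);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> System.out.println(
                Thread.currentThread().getName() + " -> " + parse("  author:henry "));
        Thread first = new Thread(task, "searcher-1");
        Thread second = new Thread(task, "searcher-2");
        first.start();
        second.start();
        first.join();
        second.join();
    }
}
```

Objects that are safe to share, such as the index searcher described above, can remain a single shared instance while only the parser is kept per thread.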

EPL660: Information Retrieval and Search Engines Lab 2

EPL660: Information Retrieval and Search Engines Lab 2 EPL660: Information Retrieval and Search Engines Lab 2 Παύλος Αντωνίου Γραφείο: B109, ΘΕΕ01 University of Cyprus Department of Computer Science Apache Lucene Extremely rich and powerful full-text search

More information

Information Retrieval

Information Retrieval Introduction to Information Retrieval Lucene Tutorial Chris Manning and Pandu Nayak Open source IR systems Widely used academic systems Terrier (Java, U. Glasgow) http://terrier.org Indri/Galago/Lemur

More information

LUCENE - FIRST APPLICATION

LUCENE - FIRST APPLICATION LUCENE - FIRST APPLICATION http://www.tutorialspoint.com/lucene/lucene_first_application.htm Copyright tutorialspoint.com Let us start actual programming with Lucene Framework. Before you start writing

More information

COMP Implemen0ng Search using Lucene

COMP Implemen0ng Search using Lucene COMP 4601 Implemen0ng Search using Lucene 1 Luke: Lucene index analyzer WARNING: I HAVE NOT USED THIS 2 Scenario Crawler Crawl Directory containing tokenized content Lucene Lucene index directory 3 Classes

More information

Development of Search Engines using Lucene: An Experience

Development of Search Engines using Lucene: An Experience Available online at www.sciencedirect.com Procedia Social and Behavioral Sciences 18 (2011) 282 286 Kongres Pengajaran dan Pembelajaran UKM, 2010 Development of Search Engines using Lucene: An Experience

More information

SEARCHING AND INDEXING BIG DATA. -By Jagadish Rouniyar

SEARCHING AND INDEXING BIG DATA. -By Jagadish Rouniyar SEARCHING AND INDEXING BIG DATA -By Jagadish Rouniyar WHAT IS IT? Doug Cutting s grandmother s middle name A open source set of Java Classses Search Engine/Document Classifier/Indexer http://lucene.sourceforge.net/talks/pisa/

More information

Introduc)on to Lucene. Debapriyo Majumdar Information Retrieval Spring 2015 Indian Statistical Institute Kolkata

Introduc)on to Lucene. Debapriyo Majumdar Information Retrieval Spring 2015 Indian Statistical Institute Kolkata Introduc)on to Lucene Debapriyo Majumdar Information Retrieval Spring 2015 Indian Statistical Institute Kolkata Open source search engines Academic Terrier (Java, University of Glasgow) Indri, Lemur (C++,

More information

Information Retrieval

Information Retrieval Information Retrieval Assignment 3: Boolean Information Retrieval with Lucene Patrick Schäfer (patrick.schaefer@hu-berlin.de) Marc Bux (buxmarcn@informatik.hu-berlin.de) Lucene Open source, Java-based

More information

VK Multimedia Information Systems

VK Multimedia Information Systems VK Multimedia Information Systems Mathias Lux, mlux@itec.uni-klu.ac.at This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Results Exercise 01 Exercise 02 Retrieval

More information

Applied Databases. Sebastian Maneth. Lecture 11 TFIDF Scoring, Lucene. University of Edinburgh - February 26th, 2017

Applied Databases. Sebastian Maneth. Lecture 11 TFIDF Scoring, Lucene. University of Edinburgh - February 26th, 2017 Applied Databases Lecture 11 TFIDF Scoring, Lucene Sebastian Maneth University of Edinburgh - February 26th, 2017 2 Outline 1. Vector Space Ranking & TFIDF 2. Lucene Next Lecture Assignment 1 marking will

More information

Web Data Management. Text indexing with LUCENE (Nicolas Travers) Philippe Rigaux CNAM Paris & INRIA Saclay

Web Data Management. Text indexing with LUCENE (Nicolas Travers) Philippe Rigaux CNAM Paris & INRIA Saclay http://webdam.inria.fr Web Data Management Text indexing with LUCENE (Nicolas Travers) Serge Abiteboul INRIA Saclay & ENS Cachan Ioana Manolescu INRIA Saclay & Paris-Sud University Philippe Rigaux CNAM

More information

Search Engines Exercise 5: Querying. Dustin Lange & Saeedeh Momtazi 9 June 2011

Search Engines Exercise 5: Querying. Dustin Lange & Saeedeh Momtazi 9 June 2011 Search Engines Exercise 5: Querying Dustin Lange & Saeedeh Momtazi 9 June 2011 Task 1: Indexing with Lucene We want to build a small search engine for movies Index and query the titles of the 100 best

More information

!"#$%&'()*+,-./'*.0'12*)$%-./'34'5# '/"-028'

!#$%&'()*+,-./'*.0'12*)$%-./'34'5# '/-028' !"#$%&()*+,-./*.012*)$%-./345#267+-52/"-028 9:;2$#-#(*+:9:(++;9,(#,*/,-(3%#&(1;=9""2?@A*-/)-*/++B"$",)-"2$/#9,(12,-"

More information

LUCENE - BOOLEANQUERY

LUCENE - BOOLEANQUERY LUCENE - BOOLEANQUERY http://www.tutorialspoint.com/lucene/lucene_booleanquery.htm Copyright tutorialspoint.com Introduction BooleanQuery is used to search documents which are result of multiple queries

More information

Informa(on Retrieval

Informa(on Retrieval Introduc*on to Informa(on Retrieval Lucene Tutorial Chris Manning and Pandu Nayak Open source IR systems Widely used academic systems Terrier (Java, U. Glasgow) hhp://terrier.org Indri/Galago/Lemur (C++

More information

LUCENE - TERMRANGEQUERY

LUCENE - TERMRANGEQUERY LUCENE - TERMRANGEQUERY http://www.tutorialspoint.com/lucene/lucene_termrangequery.htm Copyright tutorialspoint.com Introduction TermRangeQuery is the used when a range of textual terms are to be searched.

More information

Apache Lucene - Overview

Apache Lucene - Overview Table of contents 1 Apache Lucene...2 2 The Apache Software Foundation... 2 3 Lucene News...2 3.1 27 November 2011 - Lucene Core 3.5.0... 2 3.2 26 October 2011 - Java 7u1 fixes index corruption and crash

More information

A short introduction to the development and evaluation of Indexing systems

A short introduction to the development and evaluation of Indexing systems A short introduction to the development and evaluation of Indexing systems Danilo Croce croce@info.uniroma2.it Master of Big Data in Business SMARS LAB 3 June 2016 Outline An introduction to Lucene Main

More information

Lucene Java 2.9: Numeric Search, Per-Segment Search, Near-Real-Time Search, and the new TokenStream API

Lucene Java 2.9: Numeric Search, Per-Segment Search, Near-Real-Time Search, and the new TokenStream API Lucene Java 2.9: Numeric Search, Per-Segment Search, Near-Real-Time Search, and the new TokenStream API Uwe Schindler Lucene Java Committer uschindler@apache.org PANGAEA - Publishing Network for Geoscientific

More information

Indexing and Searching Document Collections using Lucene

Indexing and Searching Document Collections using Lucene University of New Orleans ScholarWorks@UNO University of New Orleans Theses and Dissertations Dissertations and Theses 5-18-2007 Indexing and Searching Document Collections using Lucene Sridevi Addagada

More information

Project Report. Project Title: Evaluation of Standard Information retrieval system related to specific queries

Project Report. Project Title: Evaluation of Standard Information retrieval system related to specific queries Project Report Project Title: Evaluation of Standard Information retrieval system related to specific queries Submitted by: Sindhu Hosamane Thippeswamy Information and Media Technologies Matriculation

More information

Apache Lucene - Query Parser Syntax

Apache Lucene - Query Parser Syntax Peter Carlson Table of contents 1 Overview...2 2 Terms... 2 3 Fields...3 4 Term Modifiers... 3 4.1 Wildcard Searches... 3 4.2 Fuzzy Searches... 4 4.3 Proximity Searches...4 4.4 Range Searches...4 4.5 Boosting

More information

LUCENE - ADD DOCUMENT OPERATION

LUCENE - ADD DOCUMENT OPERATION LUCENE - ADD DOCUMENT OPERATION http://www.tutorialspoint.com/lucene/lucene_adddocument.htm Copyright tutorialspoint.com Add document is one of the core operation as part of indexing process. We add Documents

More information

Lucene. Jianguo Lu. School of Computer Science. University of Windsor

Lucene. Jianguo Lu. School of Computer Science. University of Windsor Lucene Jianguo Lu School of Computer Science University of Windsor 1 A Comparison of Open Source Search Engines for 1.69M Pages 2 lucene Developed by Doug CuHng iniially Java-based. Created in 1999, Donated

More information

Information Retrieval

Information Retrieval Introduction to Information Retrieval ΠΛΕ70: Ανάκτηση Πληροφορίας Διδάσκουσα: Ευαγγελία Πιτουρά Διάλεξη 11: Εισαγωγή στο Lucene. 1 Τι είναι; Open source Java library for IR (indexing and searching) Lets

More information

230 Million Tweets per day

230 Million Tweets per day Tweets per day Queries per day Indexing latency Avg. query response time Earlybird - Realtime Search @twitter Michael Busch @michibusch michael@twitter.com buschmi@apache.org Earlybird - Realtime Search

More information

LAB 7: Search engine: Apache Nutch + Solr + Lucene

LAB 7: Search engine: Apache Nutch + Solr + Lucene LAB 7: Search engine: Apache Nutch + Solr + Lucene Apache Nutch Apache Lucene Apache Solr Crawler + indexer (mainly crawler) indexer + searcher indexer + searcher Lucene vs. Solr? Lucene = library, more

More information

Searching and Analyzing Qualitative Data on Personal Computer

Searching and Analyzing Qualitative Data on Personal Computer IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661, p- ISSN: 2278-8727Volume 10, Issue 2 (Mar. - Apr. 2013), PP 41-45 Searching and Analyzing Qualitative Data on Personal Computer Mohit

More information

XML to Lucene to SRW

XML to Lucene to SRW IMLS Grant Partner Uplift Project XML to Lucene to SRW (Work Area B.2 - B.4) Serhiy Polyakov February 15, 2007 Table of Contents 1. Introduction... 1 2. Parsing XML records into to Lucene index... 1 2.1.

More information

Please post comments or corrections to the Author Online forum at

Please post comments or corrections to the Author Online forum at MEAP Edition Manning Early Access Program Copyright 2008 Manning Publications For more information on this and other Manning titles go to www.manning.com Contents Preface Chapter 1 Meet Lucene Chapter

More information

LUCENE - DELETE DOCUMENT OPERATION

LUCENE - DELETE DOCUMENT OPERATION LUCENE - DELETE DOCUMENT OPERATION http://www.tutorialspoint.com/lucene/lucene_deletedocument.htm Copyright tutorialspoint.com Delete document is another important operation as part of indexing process.this

More information

Termin 6: Web Suche. Übung Netzbasierte Informationssysteme. Arbeitsgruppe. Prof. Dr. Adrian Paschke

Termin 6: Web Suche. Übung Netzbasierte Informationssysteme. Arbeitsgruppe. Prof. Dr. Adrian Paschke Arbeitsgruppe Übung Netzbasierte Informationssysteme Termin 6: Web Suche Prof. Dr. Adrian Paschke Arbeitsgruppe Corporate Semantic Web (AG-CSW) Institut für Informatik, Freie Universität Berlin paschke@inf.fu-berlin.de

More information

Project Report on winter

Project Report on winter Project Report on 01-60-538-winter Yaxin Li, Xiaofeng Liu October 17, 2017 Li, Liu October 17, 2017 1 / 31 Outline Introduction a Basic Search Engine with Improvements Features PageRank Classification

More information

Lucene 4 - Next generation open source search

Lucene 4 - Next generation open source search Lucene 4 - Next generation open source search Simon Willnauer Apache Lucene Core Committer & PMC Chair simonw@apache.org / simon.willnauer@searchworkings.org Who am I? Lucene Core Committer Project Management

More information

Search Evolution von Lucene zu Solr und ElasticSearch. Florian

Search Evolution von Lucene zu Solr und ElasticSearch. Florian Search Evolution von Lucene zu Solr und ElasticSearch Florian Hopf @fhopf http://www.florian-hopf.de Index Indizieren Index Suchen Index Term Document Id Analyzing http://www.flickr.com/photos/quinnanya/5196951914/

More information

Realtime Search with Lucene. Michael

Realtime Search with Lucene. Michael Realtime Search with Lucene Michael Busch @michibusch michael@twitter.com buschmi@apache.org 1 Realtime Search with Lucene Agenda Introduction - Near-realtime Search (NRT) - Searching DocumentsWriter s

More information

Computer Science 572 Exam Prof. Horowitz Tuesday, April 24, 2017, 8:00am 9:00am

Computer Science 572 Exam Prof. Horowitz Tuesday, April 24, 2017, 8:00am 9:00am Computer Science 572 Exam Prof. Horowitz Tuesday, April 24, 2017, 8:00am 9:00am Name: Student Id Number: 1. This is a closed book exam. 2. Please answer all questions. 3. There are a total of 40 questions.

More information

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article Available online www.jocpr.com Journal of Chemical and Pharmaceutical Research, 2014, 6(6):1958-1966 Research Article ISSN : 0975-7384 CODEN(USA) : JCPRC5 Application of textual relevance retrieval in

More information

Please post comments or corrections to the Author Online forum at

Please post comments or corrections to the Author Online forum at MEAP Edition Manning Early Access Program Copyright 2009 Manning Publications For more information on this and other Manning titles go to www.manning.com Contents Preface Chapter 1 Meet Lucene Chapter

More information

Apache Lucene - Scoring

Apache Lucene - Scoring Grant Ingersoll Table of contents 1 Introduction...2 2 Scoring... 2 2.1 Fields and Documents... 2 2.2 Score Boosting...3 2.3 Understanding the Scoring Formula...3 2.4 The Big Picture...3 2.5 Query Classes...

More information

CS11 Java. Fall Lecture 7

CS11 Java. Fall Lecture 7 CS11 Java Fall 2006-2007 Lecture 7 Today s Topics All about Java Threads Some Lab 7 tips Java Threading Recap A program can use multiple threads to do several things at once A thread can have local (non-shared)

More information

Goal of this document: A simple yet effective

Goal of this document: A simple yet effective INTRODUCTION TO ELK STACK Goal of this document: A simple yet effective document for folks who want to learn basics of ELK (Elasticsearch, Logstash and Kibana) without any prior knowledge. Introduction:

More information

Fully Automatic and Precise Detection of Thread Safety Violations

Fully Automatic and Precise Detection of Thread Safety Violations Fully Automatic and Precise Detection of Thread Safety Violations Michael Pradel and Thomas R. Gross Department of Computer Science ETH Zurich 1 Motivation Thread-safe classes: Building blocks for concurrent

More information

Computer Science 572 Exam Prof. Horowitz Monday, November 27, 2017, 8:00am 9:00am

Computer Science 572 Exam Prof. Horowitz Monday, November 27, 2017, 8:00am 9:00am Computer Science 572 Exam Prof. Horowitz Monday, November 27, 2017, 8:00am 9:00am Name: Student Id Number: 1. This is a closed book exam. 2. Please answer all questions. 3. There are a total of 40 questions.

More information

Apache Lucene 4. Robert Muir

Apache Lucene 4. Robert Muir Apache Lucene 4 Robert Muir Agenda Overview of Lucene Conclusion Resources Q & A Download of Lucene: core/ analysis/ queryparser/ highlighter/ suggest/ expressions/ join/ memory/ codecs/... core/ Lucene

More information

Research and implementation of search engine based on Lucene Wan Pu, Wang Lisha

Research and implementation of search engine based on Lucene Wan Pu, Wang Lisha 2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) Research and implementation of search engine based on Lucene Wan Pu, Wang Lisha Physics Institute,

More information

Relevancy Workbench Module. 1.0 Documentation

Relevancy Workbench Module. 1.0 Documentation Relevancy Workbench Module 1.0 Documentation Created: Table of Contents Installing the Relevancy Workbench Module 4 System Requirements 4 Standalone Relevancy Workbench 4 Deploy to a Web Container 4 Relevancy

More information

Last Class: Synchronization Problems. Need to hold multiple resources to perform task. CS377: Operating Systems. Real-world Examples

Last Class: Synchronization Problems. Need to hold multiple resources to perform task. CS377: Operating Systems. Real-world Examples Last Class: Synchronization Problems Reader Writer Multiple readers, single writer In practice, use read-write locks Dining Philosophers Need to hold multiple resources to perform task Lecture 10, page

More information

Building Search Applications

Building Search Applications Building Search Applications Lucene, LingPipe, and Gate Manu Konchady Mustru Publishing, Oakton, Virginia. Contents Preface ix 1 Information Overload 1 1.1 Information Sources 3 1.2 Information Management

More information

Web-based File Upload and Download System

Web-based File Upload and Download System COMP4905 Honor Project Web-based File Upload and Download System Author: Yongmei Liu Student number: 100292721 Supervisor: Dr. Tony White 1 Abstract This project gives solutions of how to upload documents

More information

ER/Studio Enterprise Portal 1.1 New Features Guide

ER/Studio Enterprise Portal 1.1 New Features Guide ER/Studio Enterprise Portal 1.1 New Features Guide 2nd Edition, April 16/2009 Copyright 1994-2009 Embarcadero Technologies, Inc. Embarcadero Technologies, Inc. 100 California Street, 12th Floor San Francisco,

More information

Advanced Indexing Techniques with Lucene

Advanced Indexing Techniques with Lucene Advanced Indexing Techniques with Lucene Michael Busch buschmi@{apache.org, us.ibm.com} 1 1 Advanced Indexing Techniques with Lucene Agenda Introduction - Lucene s data structures 101 - Payloads - Numeric

More information

CS 112 Introduction to Computing II. Wayne Snyder Computer Science Department Boston University

CS 112 Introduction to Computing II. Wayne Snyder Computer Science Department Boston University 9/5/6 CS Introduction to Computing II Wayne Snyder Department Boston University Today: Arrays (D and D) Methods Program structure Fields vs local variables Next time: Program structure continued: Classes

More information

Supporting Constructivist Learning in a Multimedia Presentation System

Supporting Constructivist Learning in a Multimedia Presentation System Supporting Constructivist Learning in a Multimedia Presentation System Dula Kumela 1, Kenneth Watts 2, and W. Richards Adrion 3 Abstract - The Research in Presentation Production for Learning Electronically

More information

AP COMPUTER SCIENCE JAVA CONCEPTS IV: RESERVED WORDS

AP COMPUTER SCIENCE JAVA CONCEPTS IV: RESERVED WORDS AP COMPUTER SCIENCE JAVA CONCEPTS IV: RESERVED WORDS PAUL L. BAILEY Abstract. This documents amalgamates various descriptions found on the internet, mostly from Oracle or Wikipedia. Very little of this

More information

Presentation References

Presentation References Presentation References Knowledge Articles How can I estimate how long a full FTS Reindex will take? - KA000030925 FTS Configuration Options for Recovery Environments - KA000102997 FTS - Reindexing a single

More information

Paradoxes of API Design. Jaroslav Tulach NetBeans Platform Architect

Paradoxes of API Design. Jaroslav Tulach NetBeans Platform Architect Paradoxes of API Design Jaroslav Tulach NetBeans Platform Architect Motto Just like there is a difference between describing a house and describing a Universe, there is a difference between writing a code

More information

IBM Rational Software

IBM Rational Software IBM Rational Software Development Conference 2008 Introduction to the Jazz Technology Platform: Architecture Overview and Extensibility Scott Rich Distinguished Engineer, Jazz Architect IBM Rational SDP21

More information

Covers Apache Lucene 3.0 IN ACTION SECOND EDITION. Michael McCandless Erik Hatcher, Otis Gospodnetic F OREWORD BY D OUG C UTTING MANNING

Covers Apache Lucene 3.0 IN ACTION SECOND EDITION. Michael McCandless Erik Hatcher, Otis Gospodnetic F OREWORD BY D OUG C UTTING MANNING Covers Apache Lucene 3.0 IN ACTION SECOND EDITION Michael McCandless Erik Hatcher, Otis Gospodnetic F OREWORD BY D OUG C UTTING SAMPLE CHAPTER MANNING Lucene in Action, Second Edition by Michael McCandless,

More information

LucidWorks: Searching with curl October 1, 2012

LucidWorks: Searching with curl October 1, 2012 LucidWorks: Searching with curl October 1, 2012 1. Module name: LucidWorks: Searching with curl 2. Scope: Utilizing curl and the Query admin to search documents 3. Learning objectives Students will be

More information

The Dining Philosophers Problem CMSC 330: Organization of Programming Languages

The Dining Philosophers Problem CMSC 330: Organization of Programming Languages The Dining Philosophers Problem CMSC 0: Organization of Programming Languages Threads Classic Concurrency Problems Philosophers either eat or think They must have two forks to eat Can only use forks on

More information

Last Class: Synchronization Problems!

Last Class: Synchronization Problems! Last Class: Synchronization Problems! Reader Writer Multiple readers, single writer In practice, use read-write locks Dining Philosophers Need to hold multiple resources to perform task Lecture 11, page

More information

Yonik Seeley 29 June 2006 Dublin, Ireland

Yonik Seeley 29 June 2006 Dublin, Ireland Apache Solr Yonik Seeley yonik@apache.org 29 June 2006 Dublin, Ireland History Search for a replacement search platform commercial: high license fees open-source: no full solutions CNET grants code to

More information

The Object Cache. Table of contents

The Object Cache. Table of contents by Armin Waibel, Thomas Mahler Table of contents 1 Introduction.2 2 Why a cache and how it works?2 3 How to declare and change the used ObjectCache implementation.. 3 3.1 Priority of Cache Level.. 3 3.2

More information

CMSC 330: Organization of Programming Languages. The Dining Philosophers Problem

CMSC 330: Organization of Programming Languages. The Dining Philosophers Problem CMSC 330: Organization of Programming Languages Threads Classic Concurrency Problems The Dining Philosophers Problem Philosophers either eat or think They must have two forks to eat Can only use forks

More information

CS 241 Honors Concurrent Data Structures

CS 241 Honors Concurrent Data Structures CS 241 Honors Concurrent Data Structures Bhuvan Venkatesh University of Illinois Urbana Champaign March 27, 2018 CS 241 Course Staff (UIUC) Lock Free Data Structures March 27, 2018 1 / 43 What to go over

More information

Servlets. How to use Apache FOP in a Servlet $Revision: $ Table of contents

Servlets. How to use Apache FOP in a Servlet $Revision: $ Table of contents How to use Apache FOP in a Servlet $Revision: 505235 $ Table of contents 1 Overview...2 2 Example Servlets in the FOP distribution...2 3 Create your own Servlet...2 3.1 A minimal Servlet...2 3.2 Adding

More information

Synchronization SPL/2010 SPL/20 1

Synchronization SPL/2010 SPL/20 1 Synchronization 1 Overview synchronization mechanisms in modern RTEs concurrency issues places where synchronization is needed structural ways (design patterns) for exclusive access 2 Overview synchronization

More information

Research on Full-text Retrieval based on Lucene in Enterprise Content Management System Lixin Xu 1, a, XiaoLin Fu 2, b, Chunhua Zhang 1, c

Research on Full-text Retrieval based on Lucene in Enterprise Content Management System Lixin Xu 1, a, XiaoLin Fu 2, b, Chunhua Zhang 1, c Applied Mechanics and Materials Submitted: 2014-07-18 ISSN: 1662-7482, Vols. 644-650, pp 1950-1953 Accepted: 2014-07-21 doi:10.4028/www.scientific.net/amm.644-650.1950 Online: 2014-09-22 2014 Trans Tech

More information

run your own search engine. today: Cablecar

run your own search engine. today: Cablecar run your own search engine. today: Cablecar Robert Kowalski @robinson_k http://github.com/robertkowalski Search nobody uses that, right? Services on the Market Google Bing Yahoo ask Wolfram Alpha Baidu

More information

Lucene Performance Workshop Lucid Imagination, Inc.

Lucene Performance Workshop Lucid Imagination, Inc. Lucene Performance Workshop 1 Intro About the speaker and Lucid Imagination Agenda Lucene and performance Lucid Gaze for Lucene: UI and API Key statistics Examples Q & A session 2 Lucene and performance

More information

Scalable Web Programming. CS193S - Jan Jannink - 2/25/10

Scalable Web Programming. CS193S - Jan Jannink - 2/25/10 Scalable Web Programming CS193S - Jan Jannink - 2/25/10 Weekly Syllabus 1.Scalability: (Jan.) 2.Agile Practices 3.Ecology/Mashups 4.Browser/Client 7.Analytics 8.Cloud/Map-Reduce 9.Published APIs: (Mar.)*

More information

Java Thread Programming By Paul Hyde

Java Thread Programming By Paul Hyde Java Thread Programming By Paul Hyde Buy, download and read Java Thread Programming ebook online in PDF format for iphone, ipad, Android, Computer and Mobile readers. Author: Paul Hyde. ISBN: 9780768662085.

More information

CSP.NET. A Software library for concurrent and distributed programming

CSP.NET. A Software library for concurrent and distributed programming A Software library for concurrent and distributed programming CSP.NET Constructs Process Parallel Channel BufferedChannel BroadcastChannel Choice Barrier Bucket Timer Skip Processes public class P1 : Process

More information

SUPPORTING KEYWORD SEARCH ON SEMANTIC WEB DOCUMENTS RAVI PAVAGADA. (Under the Direction of Dr. Amit P. Sheth) ABSTRACT

SUPPORTING KEYWORD SEARCH ON SEMANTIC WEB DOCUMENTS RAVI PAVAGADA. (Under the Direction of Dr. Amit P. Sheth) ABSTRACT SUPPORTING KEYWORD SEARCH ON SEMANTIC WEB DOCUMENTS by RAVI PAVAGADA (Under the Direction of Dr. Amit P. Sheth) ABSTRACT Most contemporary search engines [8, 17, 41] allow searches on keywords and support

More information

Utilities (Part 3) Implementing static features

Utilities (Part 3) Implementing static features Utilities (Part 3) Implementing static features 1 Goals for Today learn about preconditions versus validation introduction to documentation introduction to testing 2 Yahtzee class so far recall our implementation

More information

Frequently Asked Questions

Frequently Asked Questions Frequently Asked Questions This PowerTools FAQ answers many frequently asked questions regarding the functionality of the various parts of the PowerTools suite. The questions are organized in the following

More information

LUCENE - QUICK GUIDE LUCENE - OVERVIEW

LUCENE - QUICK GUIDE LUCENE - OVERVIEW LUCENE - QUICK GUIDE http://www.tutorialspoint.com/lucene/lucene_quick_guide.htm Copyright tutorialspoint.com LUCENE - OVERVIEW Lucene is simple yet powerful java based search library. It can be used in
