Project GRACE: A grid based search tool for the global digital library
Frank Scholze 1, Glenn Haya 2, Jens Vigen 3, Petra Prazak 4

1 Stuttgart University Library, Postfach , Stuttgart, Germany; frank.scholze@ub.uni.stuttgart.de
2 Stockholm University Library, Stockholm, Sweden; glenn.haya@sub.su.se
3 CERN Library, CH-1211, Geneva 23, Switzerland; jens.vigen@cern.ch
4 Stuttgart University Library, Postfach , Stuttgart, Germany; petra.prazak@ub.uni-stuttgart.de

CERN-OPEN /06/2004

Abstract: GRACE (Grid Search and Categorization Engine) is an ongoing EU project. GRACE is an attempt to apply an innovative Grid-based solution to the challenges of searching a global, heterogeneous collection of documents. The goal of the project is to build a distributed search and categorization engine that will run on the Enabling Grids for E-science in Europe (EGEE) infrastructure, the successor to the European Data Grid (EDG). This paper describes the project and its potential as a framework for a global ETD search tool.

1 Introduction

The goal of Project GRACE [i] is to build a distributed search and categorization tool adapted to the Grid network infrastructure. We are currently developing the first prototype of the GRACE search engine. Testing and evaluation will proceed throughout the summer and fall.

One of the unique aspects of the GRACE toolkit is that it is built on a large distributed-computing network referred to as a grid. The project is one of the first to deal with search and retrieval in the grid environment. In the process of creating the search tool we have identified potential advantages of a grid-based search and categorization engine, as well as limitations for search and retrieval in the current grid environment.

GRACE is a project in the Fifth Framework Program (FP5) of the Information Society Technologies (IST) initiative of the European Union.
The partners in this project are Telecom Italia Lab (project leader and manager), CERN (European Organization for Nuclear Research), Virtual Self, Sheffield Hallam University (School of Computing and Management Sciences), Stockholm University Library and Stuttgart University Library. The project started in September 2002 and will end in February.

2 Benefits of GRACE for searching ETDs

GRACE could be used as a framework for a global ETD search tool by searching existing data providers and service providers as well as other content sources. Below are the aspects of the GRACE search and categorization tool with the most potential benefit for the ETD community.

2.1 Federated search of heterogeneous sources

GRACE can integrate content sources using different protocols such as OAI, http or Z39.50. This means that GRACE can function as a layer on top of service providers such as NDLTD as well as individual repositories. Content sources can be integrated even if they do not have a search interface of their own. For such sources GRACE provides its own indexing service, which is based on Jakarta's Lucene adapted to the GRACE toolkit. [ii] If, for example, an institution has a collection of PDF files but no way to search through them, it could integrate the collection into GRACE. Users could then search this content source along with any other content source (service provider or data provider) that is integrated with GRACE.

2.2 Federated search of subscription versus free material

Some material, notably ProQuest's Digital Dissertations, can be accessed only through subscription accounts. GRACE is built on top of a Grid network in which users are registered as members of virtual organizations; a virtual organization can be a university or a faculty, for example. Virtual organizations can ease the administration of access rights to various sources.
Potentially, this means that a registered user can log in as a member of a virtual organization and GRACE can automatically allow or restrict access to content based on that organization's access levels. This automatic authentication feature will not be included in the first prototype of GRACE but may be added at a later date.
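The access model described above can be illustrated with a minimal sketch. It is not part of GRACE: the class names, the `restricted` flag and the idea of storing subscriptions as a set of source names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ContentSource:
    """A searchable source; `restricted` marks subscription-only material."""
    name: str
    restricted: bool = False


@dataclass
class VirtualOrganization:
    """Hypothetical model of a Grid virtual organization (e.g. a university)."""
    name: str
    subscriptions: set = field(default_factory=set)  # names of subscribed sources

    def can_access(self, source):
        # Free sources are open to everyone; restricted ones need a subscription.
        return not source.restricted or source.name in self.subscriptions


def searchable_sources(vo, sources):
    """Return the sources a member of `vo` may include in a federated search."""
    return [s for s in sources if vo.can_access(s)]


# Example data (hypothetical): one free source and one subscription source.
SOURCES = [
    ContentSource("NDLTD"),
    ContentSource("ProQuest Digital Dissertations", restricted=True),
]
```

A user logged in through an organization that subscribes to the restricted source would see both sources, while a user from an organization without the subscription would see only the free one.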
2.3 Sources organized in knowledge domains

Content sources will be organized into knowledge domains (e.g. subjects such as physics or computer science). Users will therefore not have to know every content source relevant to their specific query. In addition to selecting knowledge domains, they will also be able to select the individual content sources that best match their topic. ETDs, however, are often stored in interdisciplinary sources that will be included under a general knowledge domain. For this reason, content sources will also be described by predominant document type (such as theses or dissertations), so users can easily identify sources that contain ETDs.

2.4 Automatic categorization

GRACE includes a categorization engine that will dynamically integrate and categorize results from various data sources. This partially solves the problem of integrating results from heterogeneous content sources that rank results using different methods. The categorization engine will be based on linguistic algorithms [iii], as opposed to the statistical methods used in other search and categorization engines such as Vivisimo. To begin with, GRACE will be capable of automatically identifying and then categorizing results in the following languages: English, German, Swedish and Italian. Additional languages may be added at a later stage. More details on categorization can be found in section 4.3.

2.5 Use of subject thesauri or classification schemes

To launch a search: GRACE will allow users to search using subject-appropriate thesauri or classification schemes that change depending on the knowledge domain(s) or sources selected. For example, a researcher searching content sources that focus on particle physics can start by selecting terms from the High Energy Physics Index.
When a search uses a classification scheme that is not supported by a content source, GRACE will take the word or phrase from the classification scheme and perform a keyword search on the documents.

To present results: GRACE will also give users the option to view search results categorized by a specific classification scheme. For example, a user can choose to view search results categorized using terms from the High Energy Physics Index. When this option is chosen, GRACE's categorization engine will automatically categorize the documents using the terminology of the classification scheme, based on a linguistic analysis of the entire text of the documents. This feature is not yet developed but is planned for the GRACE prototype that will be available for public testing in September.

2.6 Multilingual functionality

The search tool will be able to search and automatically categorize documents in various languages with the help of lexical databases. The prototype will support English, German, Italian and Swedish, and the feature is extensible to other languages.

3 Comparison of GRACE to existing tools

Below is a table comparing GRACE to other ETD search tools.

Feature              NDLTD                 NCSTRL                      GRACE
Content Searched     Repositories via OAI  Repositories via OAI        Various sources via http, OAI, Z39.50
Free or Restricted?  Free                  Free                        Free and restricted content
Index                Centralized           Decentralized               Decentralized
Query Processing     Centralized           Decentralized over the web  Grid computational resources
Response Time        Immediate, Quick      Immediate, Slower           Delayed, Batch Processing

Table 1: Features of GRACE vs NDLTD and NCSTRL
3.1 Content searched

The table above shows that GRACE can integrate content sources via OAI, http and Z39.50. This allows it to function as a unified search tool for various content sources, including both ETD service providers and data providers as well as other search interfaces.

3.2 Query processing

GRACE processes search results on the computers that make up the Grid network. The system can therefore be powerful and scalable without any single institution having to invest in a large number of servers.

3.3 Response Time

Finally, the table shows that the processing time for a query with GRACE is delayed due to the batch orientation of current Grid technology. The current grid architecture does not provide real-time interaction: every job submitted to the Grid, no matter how simple, carries an overhead of a few minutes. We therefore decided to create a search and categorization tool that delivers results over time via links sent by e-mail, rather than providing immediate results. The GRACE approach is comparable to SDI (selective dissemination of information) or profile searches in online databases like Inspec or Medline. These work as a kind of alerting service, gathering new information into a structure in the background. We believe this to be a potentially powerful paradigm for the future, giving the user control over the structure into which documents or information are fed. A PhD student could, for example, draw up a table of contents for a thesis, which would then be enriched by the student's own work as well as by documents retrieved in the background.

4 Workflow

Below is a diagram of the overall workflow of the GRACE tool.

Figure 1: GRACE workflow
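The stages of this workflow, which the following subsections describe, can be sketched roughly as a single batch job. This is a toy illustration under loud assumptions, not the GRACE implementation: the sources are in-memory lists rather than remote repositories, the stop-word list is tiny, and the categorizer matches cue words instead of applying the linguistic algorithms used by the real engine. All function names are hypothetical.

```python
from collections import defaultdict

STOP_WORDS = {"the", "a", "an", "of", "and", "in"}  # tiny illustrative list


def fetch_documents(sources, query):
    """Stand-in for querying sources and downloading the matching documents."""
    q = query.lower()
    return [doc for docs in sources.values() for doc in docs if q in doc.lower()]


def normalize(text):
    """Lower-case, strip punctuation and drop stop words (stemming omitted)."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w and w not in STOP_WORDS]


def categorize(docs, categories):
    """Toy categorizer: a document joins every category whose cue words it contains."""
    result = defaultdict(list)
    for doc in docs:
        tokens = set(normalize(doc))
        for name, cues in categories.items():
            if tokens & cues:
                result[name].append(doc)
    return dict(result)


def run_search_job(sources, query, categories):
    """One batch job: fetch, then normalize and categorize.

    The returned mapping is what the link in the notification e-mail would show.
    """
    return categorize(fetch_documents(sources, query), categories)
```

Because the whole job runs as one unit, it fits the batch model discussed in section 3.3: the user submits the query, the job runs on the grid, and the categorized result is delivered later.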
4.1 From Query Submission to Downloading

After query formulation and submission (explained in section 5), the content sources selected by the user are queried, the results are parsed and the documents are downloaded. This takes place on the internet. The downloaded documents are then sent to the grid network for normalization and categorization.

4.2 Normalization

During text normalization, the text is put into a uniform format in preparation for categorization. For example, stop words such as articles (e.g. "the" and "a") are removed. Words are grouped together when appropriate (for example, proper names and acronyms). Words are also stripped of prefixes and suffixes at this stage. Many of these operations are language specific, and GRACE can perform them for any supported language. At this stage suitable lexical tables are available for English, German and Italian; Swedish will be added soon.

4.3 Categorization

The entire text of each document retrieved by a search is downloaded, normalized (see 4.2) and sent to the categorization engine, where lexical algorithms are used to categorize the results. An example of the work done in the categorization stage is a process called disambiguation. A word can mean different things, but clues to the meaning of a word or phrase can often be found in the context in which it is used. The categorization engine analyzes the context of words in order to group together those words that have similar contexts. For example, the word "October" (Figure 2) can refer to the month, the submarine or the Russian Revolution. GRACE's categorization engine uses the context to help sort the words into categories and present them to the user.

Figure 2: Disambiguation

4.4 External vs. internal content sources

Figure 1 shows that GRACE will query both external and internal content sources.
An internal content source is a source whose documents have already been parsed and normalized and are stored on the grid, ready to be included in a query and categorized. For example, if a university department stored its theses on a server as PDF files, GRACE could download the documents, process them and store a normalized version of them on the grid. The theses would then be presented to the user as a content source that can be included in the federated search. From the user's perspective, internal sources (normalized and stored on the grid) and external sources (queried through http, for example) are indistinguishable, since the user interface presents both in the same way.
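One way to picture why the two kinds of source look the same to the user is a shared search interface. The sketch below is an assumption about how such an interface could be modelled, not a description of GRACE's code: the class names are invented, the internal store is a plain dictionary, and `fetch` is a hypothetical callable standing in for a real http, OAI or Z39.50 request.

```python
from abc import ABC, abstractmethod


class Source(ABC):
    """Common interface: the caller never needs to know where documents live."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def search(self, query):
        """Return a list of titles matching `query`."""


class InternalSource(Source):
    """Documents already normalized and stored on the grid (here: a dict)."""

    def __init__(self, name, normalized_docs):
        super().__init__(name)
        self._docs = normalized_docs  # title -> normalized (lower-case) text

    def search(self, query):
        q = query.lower()
        return [title for title, text in self._docs.items() if q in text]


class ExternalSource(Source):
    """Source queried remotely at search time via a hypothetical `fetch` callable."""

    def __init__(self, name, fetch):
        super().__init__(name)
        self._fetch = fetch

    def search(self, query):
        return self._fetch(query)


def federated_search(sources, query):
    """Query every source through the same interface and collect hits per source."""
    return {s.name: s.search(query) for s in sources}
```

The federated search loop treats both classes identically, which mirrors the statement that the user interface presents internal and external sources in the same way.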
5 User interface

5.1 Query Input

The query is entered in three stages. First the user selects the content sources. Either individual sources or entire knowledge domains can be selected. The screenshot of the first GRACE prototype (Figure 3) includes web sources and internal documents. In the prototype made available to the public, however, the sources will be divided into knowledge domains such as physics and computer science.

Figure 3: Search Wizard screen #1

On the next screen (Figure 4) the user enters the search term(s). Any term can be typed in, or the user can choose a term from an appropriate classification scheme to launch the search. Searches can also be limited to a specific field. The fields available vary depending on the resources selected.

Figure 4: Search Wizard screen #2

Finally, the user launches the search (Figure 5). If the user is logged in, the user's information is filled in automatically. After the search is launched the user receives a confirmation. After the search is processed, an e-mail is sent out with a link to the categorized results. The user can get updates on a search as often as once a day.
Figure 5: Search Wizard screen #3

5.2 Results

The results (Figure 6) are sent to the user as a link via e-mail. Results are presented in categories that are displayed on the left-hand side of the screen. Each category is linked to relevant related concepts that are listed at the bottom of the page. From the result page, the user can sort or filter the search results.

Figure 6: Results screen (showing the automatically created table of contents and the automatically created list of related concepts for the topic selected in the upper list)
6 Federated Search

Federated search of heterogeneous content sources has traditionally been problematic. Below is a series of problems associated with federated search [iv], each followed by the solution proposed by GRACE.

Problem: US digital library experience suggests that cross searching does not scale.
Solution: The user limits source selection by choosing a specific knowledge domain or by choosing sources that focus on a specific document type such as theses.

Problem: Collection description is difficult and users have trouble knowing which sources to search.
Solution: GRACE allows users to choose sources by subject (knowledge domain) or by the primary document type that the source contains.

Problem: Query language and search attributes can vary across different sources.
Solution: Query syntax is mapped to individual information resources. However, as the number of content sources grows there will be considerable maintenance effort.

Problem: Different sources rank results in different ways. Ranking the combined results of various sources is problematic.
Solution: By presenting results divided into categories, GRACE offers an alternative to traditional ranking of results and provides a partial solution to this problem.

Problem: Performance is limited by the slowest target.
Solution: The project does not provide a direct solution to this problem but instead adopts a batch processing approach that aims to provide high quality results updated over time. The presentation of the search tool clearly explains that the complete results will not be available immediately and that the user will be notified by e-mail when a search is completed.

7 What is Grid Technology?

A Grid is a distributed computer network in which computers share computing power and storage capacity. There are currently several large grid networks; however, there is no global grid. The basics of grid computing are explained well at CERN's gridcafe. [v] As explained on this website, the dream of grid computing is to create a global network of computers, accessible from anywhere, which will function as a practically unlimited computing resource. However, the grids available now and in the immediate future are regional and have e-science as their primary focus.

The concept of a computing grid is often compared to the power grid, where users do not need to worry about which computer processes their request or where the data is stored. Like the power grid, the computing grid would be accessible from anywhere and users would pay for the power they need. There are, however, some key differences between the power grid and the computing grid. For example, power is either on or off; there are no performance issues of the kind that can arise with computer networks such as the Grid. Also, power flows basically only one way, from producer to consumer, which is not true of a grid network, where there is interaction.

Distributed computing, on which grid networks are based, is not a new concept. It developed out of an effort to generate enough processing power to meet workload challenges. In order to boost processing power, institutions aggregated computing resources across locations or across the entire institution. The idea was to match the supply of processing cycles with the demand created by applications. This concept is now a ubiquitous solution practiced by leading organizations around the world. It ensures continuous computing availability despite scheduled maintenance, power outages and unexpected failures.

8 GRACE and the Grid

As mentioned previously, there are several grid networks in use today. GRACE will be integrated with the GILDA grid testbed [vi] (GILDA = Grid INFN Laboratory for Dissemination Activities), which has computing nodes at six sites spread across Italy. GILDA is in turn part of a larger infrastructure called EGEE [vii] (Enabling Grids for E-Science in Europe).
The GRACE project has its own grid nodes, which it has integrated with GILDA. These include 5 CPUs in Turin and 4 CPUs in Milan.

9 Future implications of Grid search and retrieval for the ETD community

In its current stage of development, the grid is well suited to batch processing and to the storage of enormous amounts of data. It is therefore appropriate for massive collections of documents and for researchers who are willing to wait in exchange for high quality categorization of results. In the future, the process of submitting a job to the grid may be streamlined, making it an environment suitable for interactive applications. This demand has been formulated, for example, by the e-learning community utilizing Grid technology [viii], by the GGF working group on Grid information retrieval (GIR) [ix] and by vendors such as IBM, which introduced Masala [x] as an extension to DB2. It is our hope that the lessons learned from the GRACE project will contribute to the grid's development in this direction.

[i] GRACE project, http://
[ii] http://jakarta.apache.org/lucene/; GRACE Deliverable D2.1: GRACE Local Search and Categorization Engine
[iii] Nahum Korda et al.: Unsupervised Taxonomy of Large Document Corpora Utilizing Idiomatic Character of Natural Languages. In: The 2001 International Conference on Artificial Intelligence (IC-AI'2001), June 25-28, 2001, Las Vegas
[iv] The problems were taken from Tutorial 1, CERN Workshop on Innovations in Scholarly Communications, February
[v] http://gridcafe.web.cern.ch/gridcafe/
[vi] http://gilda.ct.infn.it/main.html
[vii]
[viii]
[ix]
[x]
A cocktail approach to the VideoCLEF 09 linking task Stephan Raaijmakers Corné Versloot Joost de Wit TNO Information and Communication Technology Delft, The Netherlands {stephan.raaijmakers,corne.versloot,
More informationA fully-automatic approach to answer geographic queries: GIRSA-WP at GikiP
A fully-automatic approach to answer geographic queries: at GikiP Johannes Leveling Sven Hartrumpf Intelligent Information and Communication Systems (IICS) University of Hagen (FernUniversität in Hagen)
More informationThe EPIKH, GILDA and GISELA Projects
The EPIKH Project (Exchange Programme to advance e-infrastructure Know-How) The EPIKH, GILDA and GISELA Projects Antonio Calanducci INFN Catania (Consorzio COMETA) - UniCT Joint GISELA/EPIKH School for
More informationD6.1. Project website and internal IT communication infrastructure HINT. 36 months FP7/
D6.1 Project website and internal IT communication infrastructure Project number: 317930 Project acronym: Project title: HINT Start date of the project: 1 st October, 2012 Duration: Programme: Holistic
More informationCMS users data management service integration and first experiences with its NoSQL data storage
Journal of Physics: Conference Series OPEN ACCESS CMS users data management service integration and first experiences with its NoSQL data storage To cite this article: H Riahi et al 2014 J. Phys.: Conf.
More informationNavigating the Universe of ETDs: Streamlining for an Efficient and Sustainable Workflow at the University of North Florida Library
University of North Florida From the SelectedWorks of Marielle Veve 2014 Navigating the Universe of ETDs: Streamlining for an Efficient and Sustainable Workflow at the University of North Florida Library
More informationBatch Services at CERN: Status and Future Evolution