Project GRACE: A grid based search tool for the global digital library


Frank Scholze 1, Glenn Haya 2, Jens Vigen 3, Petra Prazak 4

1 Stuttgart University Library, Postfach , Stuttgart, Germany; frank.scholze@ub.uni.stuttgart.de
2 Stockholm University Library, Stockholm, Sweden; glenn.haya@sub.su.se
3 CERN Library, CH-1211, Geneva 23, Switzerland; jens.vigen@cern.ch
4 Stuttgart University Library, Postfach , Stuttgart, Germany; petra.prazak@ub.uni-stuttgart.de

CERN-OPEN /06/2004

Abstract: GRACE (Grid Search and Categorization Engine) is an ongoing EU project. GRACE applies a Grid-based solution to the challenges of searching a global, heterogeneous collection of documents. The goal of the project is to build a distributed search and categorization engine that will run on the Enabling Grids for E-science in Europe (EGEE) infrastructure, the successor to the European Data Grid (EDG). This paper describes the project and its potential as a framework for a global ETD search tool.

1 Introduction

The goal of Project GRACE [i] is to build a distributed search and categorization tool adapted to the Grid network infrastructure. We are currently developing the first prototype of the GRACE search engine. Testing and evaluation will proceed throughout the summer and fall.

One of the unique aspects of the GRACE toolkit is that it is built on a large distributed-computing network referred to as a grid. The project is one of the first to deal with search and retrieval in the grid environment. In the process of creating the search tool we have identified potential advantages of a grid-based search and categorization engine, as well as limitations for search and retrieval in the current grid environment.

GRACE is a project in the Fifth Framework Program (FP5) of the Information Society Technologies (IST) initiative by the European Union.
The partners in this project are Telecom Italia Lab (project leader and manager), CERN (European Organization for Nuclear Research), Virtual Self, Sheffield Hallam University (School of Computing and Management Sciences), Stockholm University Library and Stuttgart University Library. The project started in September 2002 and will end in February.

2 Benefits of GRACE for searching ETDs

GRACE could be used as a framework for a global ETD search tool by searching existing data providers and service providers as well as other content sources. Below are the aspects of the GRACE search and categorization tool with the most potential benefit for the ETD community.

2.1 Federated search of heterogeneous sources

GRACE can integrate content sources using different protocols such as OAI, http or Z39.50. This means that GRACE can function as a layer on top of service providers such as NDLTD as well as individual repositories. Content sources can be integrated even if they do not have a search interface of their own. For such sources GRACE provides its own indexing service, based on Jakarta's Lucene adapted to the GRACE toolkit [ii]. If, for example, an institution has a collection of PDF files but no way to search through them, it could integrate the collection into GRACE. Users could then search this content source along with any other content source (service provider or data provider) that is integrated with GRACE.

2.2 Federated search of subscription versus free material

Some material, notably ProQuest's Digital Dissertations, can be accessed only through subscription accounts. GRACE is built on top of a Grid network in which users are registered as members of virtual organizations, which can be, for example, a university or a faculty. Virtual organizations can ease the administration of access rights to various sources.
Potentially, this means that a registered user can log in as a member of a virtual organization, and GRACE can automatically allow or restrict access to content based on the organization's access levels. This automatic authentication feature will not be included in the first prototype of GRACE but may be added at a later date.
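
The access rule described in section 2.2 can be illustrated with a minimal sketch. It is not taken from the GRACE toolkit: the class `ContentSource`, the function `accessible_sources` and the source and organization names are hypothetical, and the sketch assumes that each restricted source simply lists the virtual organizations that hold a subscription.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentSource:
    """Hypothetical description of a content source (not a GRACE class)."""
    name: str
    restricted: bool                        # True for subscription-only material
    licensed_vos: frozenset = frozenset()   # virtual organizations with a subscription


def accessible_sources(sources, user_vo):
    """Return the sources a member of `user_vo` may search.

    Free sources are always included; restricted sources are included
    only if the user's virtual organization holds a subscription.
    """
    return [
        s for s in sources
        if not s.restricted or user_vo in s.licensed_vos
    ]


# Illustrative data only; the names are made up.
sources = [
    ContentSource("open-etd-repository", restricted=False),
    ContentSource("subscription-dissertations", restricted=True,
                  licensed_vos=frozenset({"university-a"})),
]

print([s.name for s in accessible_sources(sources, "university-a")])
print([s.name for s in accessible_sources(sources, "university-b")])
```

In this sketch a member of the hypothetical organization "university-a" sees both sources, while a member of "university-b" sees only the free one. A real implementation would rely on the Grid middleware's own authentication and authorization services rather than a list kept in memory.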

2.3 Sources organized in knowledge domains

Content sources will be organized into knowledge domains (e.g. subjects such as physics or computer science). Users will therefore not have to know each content source relevant for their specific query. In addition to selecting knowledge domains, they will be able to select the individual content sources that best match their topic. ETDs, however, are often stored in interdisciplinary sources that will be included under a general knowledge domain. For this reason, content sources will also be described by predominant document type (such as theses or dissertations), so users can easily identify sources that contain ETDs.

2.4 Automatic categorization

GRACE includes a categorization engine that will dynamically integrate and categorize results from various data sources. This partially solves the problem of integrating results from heterogeneous content sources that rank results using different methods. The categorization engine will be based on linguistic algorithms [iii], as opposed to the statistical methods used in other search and categorization engines such as Vivisimo. To begin with, GRACE will be capable of automatically identifying and then categorizing results in the following languages: English, German, Swedish and Italian. Additional languages may be added at a later stage. More details on categorization can be found in section 4.3.

2.5 Use of subject thesauri or classification schemes

To launch a search: GRACE will allow users to search using subject-appropriate thesauri or classification schemes that change depending on the knowledge domain(s) or sources selected. For example, a researcher searching content sources that focus on particle physics can start by selecting terms from the High Energy Physics Index.
When a search uses a classification scheme that is not supported by a content source, GRACE will take the word or phrase from the classification scheme and perform a keyword search on the documents.

To present results: GRACE will also give users the option to view search results categorized by a specific classification scheme. For example, a user can choose to view search results categorized using terms from the High Energy Physics Index. When this option is chosen, GRACE's categorization engine will automatically categorize the documents using the terminology from the classification scheme, based on a linguistic analysis of the entire text of the documents. This feature is not currently developed but is planned for the GRACE prototype that will be available for public testing in September.

2.6 Multilingual functionality

The search tool will be able to search and automatically categorize documents in various languages with the help of lexical databases. The prototype will have capabilities in English, German, Italian and Swedish, and this feature is extensible to other languages.

3 Comparison of GRACE to existing tools

Below is a table comparing GRACE to other ETD search tools.

                     NDLTD                 NCSTRL                      GRACE
Content searched     Repositories via OAI  Repositories via OAI        Various sources via http, OAI, Z39.50
Free or restricted?  Free                  Free                        Free and restricted content
Index                Centralized           Decentralized               Decentralized
Query processing     Centralized           Decentralized over the web  Grid computational resources
Response time        Immediate, quick      Immediate, slower           Delayed, batch processing

Table 1: Features of GRACE vs NDLTD and NCSTRL

3.1 Content searched

Table 1 shows that GRACE can integrate content sources via OAI, http and Z39.50. This allows it to function as a unified search tool for various content sources, including both ETD service providers and data providers as well as other search interfaces.

3.2 Query processing

GRACE processes search results on the computers that make up the Grid network. This means that the system is powerful and scalable without any single institution having to invest in a large number of servers.

3.3 Response time

Finally, Table 1 shows that the processing time for a query with GRACE is delayed due to the batch orientation of current Grid technology. The current grid architecture does not provide real-time interaction, since every job submitted to the Grid carries an overhead of a few minutes, no matter how simple it is. As a result, we decided to create a search and categorization tool that delivers results over time via links sent by e-mail, as opposed to providing immediate results.

The GRACE approach is comparable to SDI (selective dissemination of information) or profile searches in online databases like Inspec or Medline. These work as a kind of alerting service that gathers new information into a structure in the background. We believe this to be a potentially powerful paradigm for the future, giving the user control over the structure into which documents or information are fed. A PhD student, for example, could draw up a table of contents for a thesis, which would then be enriched by the student's own work as well as by documents retrieved in the background.

4 Workflow

Below is a diagram of the overall workflow of the GRACE tool.

Figure 1: GRACE workflow

4.1 From query submission to downloading

After query formulation and submission (explained in section 5), the content sources selected by the user are queried, results are parsed and documents are downloaded. This takes place on the internet. The downloaded documents are then sent to the grid network for normalization and categorization.

4.2 Normalization

During text normalization, the text is put into a uniform format in preparation for categorization. For example, stop words such as articles (e.g. "the" and "a") are removed. Words are grouped together when appropriate (for example, proper names and acronyms). Words are also stripped of prefixes and suffixes at this stage. Many of these operations are language specific, and GRACE can perform them for any supported language. At this stage suitable lexical tables are available for English, German and Italian; Swedish will be added soon.

4.3 Categorization

The entire text of each document retrieved from a search is downloaded, normalized (see 4.2) and sent to the categorization engine, where lexical algorithms are used to categorize the results. An example of the type of work done in the categorization stage is a process called disambiguation. Words can be used to mean different things, but clues to the meaning of a word or phrase can often be found in the context in which it is used. The categorization engine analyzes the context of words in order to group together those words that have similar contexts. For example, the word October in Figure 2 can refer to the month, the submarine or the Russian Revolution. GRACE's categorization engine uses the context to help sort the words into categories and present them to the user.

Figure 2: Disambiguation

4.4 External vs. internal content sources

Figure 1 shows that GRACE will query both external and internal content sources. An internal content source is a source in which documents have already been parsed and normalized and are stored on the grid, ready to be included in a query and categorized. For example, if a university department stored its theses on a server as PDF files, GRACE could download the documents, process them and store a normalized version of them on the grid. The theses would then be presented to the user as a content source that can be included in the federated search. From the user's perspective, internal sources (normalized and stored on the grid) and external sources (queried through http, for example) are indistinguishable, since the user interface presents them both in the same way.
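
The normalization step of section 4.2 can be sketched as follows. This is only an illustration under stated assumptions, not the GRACE implementation: the tiny `STOP_WORDS` and `SUFFIXES` tables stand in for the language-specific lexical tables, the function name `normalize` is hypothetical, and prefix stripping and the grouping of proper names and acronyms are left out.

```python
import re

# Tiny illustrative tables; real lexical tables are language specific
# and far larger. The contents here are assumptions for the example.
STOP_WORDS = {"en": {"the", "a", "an", "of", "and", "in", "to"}}
SUFFIXES = {"en": ("ing", "ed", "s")}


def normalize(text, lang="en"):
    """Lower-case, tokenize, drop stop words and strip one common suffix."""
    tokens = re.findall(r"[a-z]+", text.lower())
    result = []
    for token in tokens:
        if token in STOP_WORDS[lang]:
            continue
        for suffix in SUFFIXES[lang]:
            # Strip at most one suffix and keep a stem of at least
            # three characters.
            if token.endswith(suffix) and len(token) - len(suffix) >= 3:
                token = token[: -len(suffix)]
                break
        result.append(token)
    return result


print(normalize("The hunting of the submarines"))
```

For the sample sentence the sketch keeps only the stems "hunt" and "submarine". Adding a language would amount to adding an entry to each table, which mirrors the remark above that the lexical tables are what make normalization language specific.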

5 User interface

5.1 Query input

The query is input in three stages. First, the user selects the content sources. Either individual sources or entire knowledge domains can be selected. The screenshot of the first GRACE prototype (Figure 3) includes web sources and internal documents. In the prototype available to the public, however, the sources will be divided into knowledge domains such as physics and computer science.

Figure 3: Search Wizard screen #1

On the next screen (Figure 4) the user enters the search term(s). Any term can be typed in, or the user can choose a term from an appropriate classification scheme to launch a search. Searches can also be limited to a specific field. The fields available vary depending on the resources selected.

Figure 4: Search Wizard screen #2

Finally, the user launches the search (Figure 5). If the user is logged in, the information is filled in automatically. After the search is launched the user receives a confirmation. After the search is processed, an e-mail is sent out with a link to the categorized results. The user can get updates on a search as often as once a day.

Figure 5: Search Wizard screen #3

5.2 Results

The results (Figure 6) are sent to the user as a link via e-mail. Results are presented in categories that are displayed on the left-hand side of the screen. Each category is linked to related concepts that are listed at the bottom of the page. From the result page, the user can sort or filter the search results.

Figure 6: Results screen (callouts: automatically created table of contents; automatically created related-concepts list for the topic selected in the upper list)
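
The category-based presentation described in section 5.2 can be pictured with a short sketch. It assumes that the categorization engine has already assigned a category label to every retrieved document; the function `group_by_category`, the record layout and the sample titles are hypothetical and serve only as illustration.

```python
from collections import defaultdict


def group_by_category(results):
    """Group (title, source, category) records by category label."""
    grouped = defaultdict(list)
    for title, source, category in results:
        grouped[category].append((title, source))
    # Sort categories alphabetically and the titles within each category.
    return {cat: sorted(items) for cat, items in sorted(grouped.items())}


# Made-up records from two hypothetical content sources.
results = [
    ("Higgs boson searches", "source-a", "particle physics"),
    ("Grid job scheduling", "source-b", "computer science"),
    ("Quark confinement", "source-b", "particle physics"),
]

for category, items in group_by_category(results).items():
    print(category, items)
```

Because documents from different sources end up under the same category heading, no common ranking score across sources is needed to present them together.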

6 Federated search

Federated search of heterogeneous content sources has traditionally been problematic. Below is a series of problems associated with federated search [iv], each followed by the solution proposed by GRACE.

Problem: US Digital Library experience suggests cross searching does not scale.
Solution: The user limits source selection by choosing a specific knowledge domain or by choosing sources that focus on a specific document type such as theses.

Problem: Collection description is difficult and users have trouble knowing which sources to search.
Solution: GRACE allows users to choose sources by subject (knowledge domain) or by the primary document type that the source contains.

Problem: Query language and search attributes can vary across different sources.
Solution: Query syntax is mapped to individual information resources. However, as the number of content sources grows there will be considerable maintenance effort.

Problem: Different sources rank results in different ways. Ranking the combined results of various sources is problematic.
Solution: By presenting results divided into categories, GRACE provides an alternative to traditional ranking of results and a partial solution to this problem.

Problem: Performance is limited by the slowest target.
Solution: The project does not provide a direct solution to this problem but instead adopts a batch processing approach that aims to provide high quality results updated over time. The presentation of the search tool clearly explains that the complete results will not be available immediately, but that the user will be notified by e-mail when a search is completed.

7 What is Grid technology?

A Grid is a distributed computer network in which computers share computing power and storage capacity. There are currently several large grid networks; however, there is no global grid. The basics of grid computing are explained well at CERN's gridcafe [v]. As explained on this website, the dream of grid computing is to create a global network of computers, accessible from anywhere, which will function as a practically unlimited computing resource. However, the grids available now and in the immediate future are regional and have e-science as a primary focus.

The concept of a computing grid is often compared to the power grid: the user does not need to worry about which computer processes a request or where the data is stored. Like the power grid, the computing grid would be accessible from anywhere, and users would pay for the power that they need. There are, however, some key differences between the power grid and the computing grid. For example, power is either on or off; there are no performance issues of the kind that can occur with computer networks such as the Grid. Also, power basically flows only one way, from producer to consumer, which is not true of a grid network, where there is interaction.

Distributed computing, on which grid networks are based, is not a new concept. It developed in an effort to generate processing power for meeting workload challenges. In order to boost processing power, institutions aggregated computing resources across locations or across the entire institution. The idea was to match the supply of processing cycles with the demand created by applications. This concept is now a ubiquitous solution practiced by leading organizations around the world. It ensures continuous computing availability despite scheduled maintenance, power outages and unexpected failures.

8 GRACE and the Grid

As mentioned previously, there are several grid networks in use today. GRACE will be integrated with the GILDA (Grid INFN Laboratory for Dissemination Activities) grid testbed [vi], which has computing nodes at six sites spread across Italy. GILDA is in turn part of a larger infrastructure called EGEE [vii] (Enabling Grids for E-Science in Europe).

The GRACE project has its own grid nodes, which it has integrated with GILDA. These include 5 CPUs in Turin and 4 CPUs in Milan.

9 Future implications of Grid search and retrieval for the ETD community

In its current stage of development, the grid is well suited for batch processing and storage of enormous amounts of data. This makes it appropriate for massive collections of documents and for researchers who are willing to wait for high quality categorization of results. In the future, the process for submitting a job to the grid may be streamlined, making it an environment suitable for interactive applications. This demand is formulated, for example, by the e-learning community utilizing Grid technology [viii], by the GGF working group on Grid information retrieval (GIR) [ix], as well as by vendors such as IBM, which introduced Masala [x] as an extension to DB2. It is our hope that the lessons learned from the GRACE project will contribute to the grid's development in this direction.

[i] GRACE project: http://
[ii] http://jakarta.apache.org/lucene/ ; see also GRACE Deliverable D2.1, GRACE Local Search and Categorization Engine.
[iii] Nahum Korda et al.: Unsupervised Taxonomy of Large Document Corpora Utilizing Idiomatic Character of Natural Languages. In: The 2001 International Conference on Artificial Intelligence (IC-AI'2001), June 25-28, 2001, Las Vegas.
[iv] The problems were taken from Tutorial 1, CERN Workshop on Innovations in Scholarly Communications, February.
[v] http://gridcafe.web.cern.ch/gridcafe/
[vi] http://gilda.ct.infn.it/main.html
[vii]
[viii]
[ix]
[x]


More information

where the Web was born Experience of Adding New Architectures to the LCG Production Environment

where the Web was born Experience of Adding New Architectures to the LCG Production Environment where the Web was born Experience of Adding New Architectures to the LCG Production Environment Andreas Unterkircher, openlab fellow Sverre Jarp, CTO CERN openlab Industrializing the Grid openlab Workshop

More information

Metal Recovery from Low Grade Ores and Wastes Plus

Metal Recovery from Low Grade Ores and Wastes Plus Metal Recovery from Low Grade Ores and Wastes Plus D7.1 Project and public website Public Authors: Marta Macias, Carlos Leyva (IDENER) D7.1 I Page 2 Deliverable Number 7.1 Deliverable Name Project and

More information

Harnessing Grid Resources to Enable the Dynamic Analysis of Large Astronomy Datasets

Harnessing Grid Resources to Enable the Dynamic Analysis of Large Astronomy Datasets Page 1 of 5 1 Year 1 Proposal Harnessing Grid Resources to Enable the Dynamic Analysis of Large Astronomy Datasets Year 1 Progress Report & Year 2 Proposal In order to setup the context for this progress

More information

High Performance Computing on MapReduce Programming Framework

High Performance Computing on MapReduce Programming Framework International Journal of Private Cloud Computing Environment and Management Vol. 2, No. 1, (2015), pp. 27-32 http://dx.doi.org/10.21742/ijpccem.2015.2.1.04 High Performance Computing on MapReduce Programming

More information

JMU ETD SUBMISSION INSTRUCTIONS

JMU ETD SUBMISSION INSTRUCTIONS JMU ETD SUBMISSION INSTRUCTIONS Before you submit your thesis or dissertation electronically, you must: Convert your manuscript to a PDF file. For conversion instructions, go to www.atomiclearning.com

More information
