Building a Digital Library Software

1 Building a Digital Library Software: INVENIO, Part 1. J-Y. Le Meur, Department of Information Technology, CERN. JINR-CERN School on GRID and Information Management Systems, 14 May 2012

2 Outline

4 A physicist office at CERN: the "Non-Digital" Library

5 Specialized Software
Specialist software for running a digital library:
- Content is organized and ready for exchange: support of interoperability protocols
- Metadata and data are preserved for the long term: support of preservation standards
- Submission, editing and curation processes are supported
- Dissemination is organized and controlled
- Combines traditional Library Systems, Document Management SW and Engine SW
- Examples: EPrints, DuraSpace, Greenstone...
Institutional repository software focuses primarily on ingest, preservation and access of locally produced documents, particularly locally produced academic outputs.

6 Invenio DL SW History
- 1954: CERN laboratory is created
- 1989: Tim Berners-Lee invents the Web
- 1991: HEP SPIRES/arXiv is the first DL on the Web
- 1993: CERN Preprint Server starts as an institutional and disciplinary repository
- 1996: CERN Library Server includes Books and Periodicals, as a hybrid library
- 2000: CERN Document Server serves Multimedia material and restricted notes
- 2002: CDSWare SW released Open Source
- 2006: CDSWare becomes Invenio; start of I18N collaborations
- 2010: Invenio 1.0 released and adopted world-wide
- 2012: First Invenio User Group Workshop is organized

7 Project Overview Invenio DL SW
- Open Source GPL project
- Linux, server side
- Medium to big data repositories
- Flexible at every layer
- Technology: Python (and C and Lisp), MySQL and Apache + mod_wsgi (WSGI: Web Server Gateway Interface)
- Supports Library standards: XML-MARC, MARC21, OAI-PMH, OpenURL...

8 Library Standards: exchange, identifiers and preservation
- Exchange protocols: Z39.50 and OAI-PMH between Data and Service providers
- Interoperability: SWORD = Simple Web-service Offering Repository Deposit
- Identifiers: ISBN, DOI, PURL, etc.
- Preservation: METS, PDF/A, OAIS
  - Content description: Metadata Encoding and Transmission Standard
  - Data formats
  - Supporting system: Open Archival Information System ref. model
- Content representation: MARC, DC, XML-MARC

9 De facto Standards Plugins examples
- LibX: Invenio toolbar, compatible with LibX
  - Can be integrated with Internet Explorer and Firefox browsers
  - Integration with the main digital content websites including Amazon, Google Scholar, Wikipedia
  - Highlighted text from a web page can be used to directly query an Invenio installation
- Zotero: export DL content to Zotero, a Firefox plugin for compiling CVs, etc.
- Cooliris: browsing multimedia content as a 3-dimensional wall (integration with the Cooliris plugin)

10 Technology Component concepts 3 Tier architecture:

11 Technology Overview languages

12 Why Python? languages
- Easy to read and understand: good for many temporary developers
- Suitable for rapid prototyping: good for organic-growth software development model
- Write code to throw it away

13 Why Python? art of ikebana programming (T. Simko)

14 Why Python? Speeding up Python
- Bytecode-interpreted language: what about speed?
- Cython makes it easy to write C extensions, combining the efficiency of C with the high-level nature of Python
- Declare C types on variables and class attributes

15 Domain logic

18 Outline

19 Ingestion Modules Overview

22 Ingestion Modules Submission by Humans
- The ingestion is performed by humans (authors, secretaries, cataloguers, etc.)
- WebSubmit is a framework that helps collect user data and create MARCXML records, plus other workflow-related goodies.
- "" Strategy.

23 Ingestion Modules Submission: usual Web Form

24 Ingestion Modules Submission: unusual Web Form

25 Ingestion Modules Submission: Behind the Form

26 Ingestion Modules Submission: interfaces, workflows and functions

30 Ingestion Modules Submission by Robots
Ingestion by robots, three use cases:
- Pulling from an OAI-PMH compatible source
- Pulling from a non-compatible source
- Pushing from external systems into the DL

31 Ingestion Pull: Harvesting from OAI source
Pulling from an OAI-PMH source:
- OAI-PMH: Open Archives Initiative Protocol for Metadata Harvesting
- Data Provider vs Service Provider
- XML (Dublin Core) over HTTP
- Commercial search engines have started using OAI-PMH to acquire/deliver more resources
- Incremental harvesting helps reduce network traffic and other resource usage
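
As an illustration of incremental harvesting, the sketch below builds an OAI-PMH `ListRecords` request restricted by a `from` date and reads one response page. It is a minimal sketch, not Invenio's harvester: the base URL, the sample response and the helper names `list_records_url` and `parse_page` are assumptions made for the example; only the protocol arguments (`verb`, `metadataPrefix`, `from`, `resumptionToken`) come from OAI-PMH itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, since=None, token=None, prefix="oai_dc"):
    """Build a ListRecords URL; `since` limits the harvest to recent changes."""
    if token:
        # When resuming, the token is the only argument besides the verb.
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": prefix}
        if since:
            params["from"] = since
    return base_url + "?" + urlencode(params)


def parse_page(xml_text):
    """Return (record identifiers, resumption token or None) for one page."""
    root = ET.fromstring(xml_text)
    ids = [e.text for e in root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Hypothetical response page, shortened to headers only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier>
      <datestamp>2012-05-10</datestamp></header></record>
    <record><header><identifier>oai:example.org:2</identifier>
      <datestamp>2012-05-12</datestamp></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("http://example.org/oai2d", since="2012-05-01")
ids, token = parse_page(SAMPLE)
next_url = list_records_url("http://example.org/oai2d", token=token)
```

A harvester would remember the date of its last successful run and pass it as `since` on the next one, so only new or modified records travel over the network.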

34 Ingestion Push: Robot Upload
Pushing records:
- To POST (the HTTP way) a record: insert/replace/modify/delete
- Authorization: checked by IP or API key (à la Twitter)
- Upload files via special protocol FFT (Invenio)
- Feedback: immediate HTTP error code in case of invalid MARCXML or other request error
Examples at CERN:
- An experiment (CMS) pushes records automatically to CERN Document Server
- Event recording SW pushes talks held at CERN to CDS automatically
- CERN Drupal web infrastructure pushes photos and other documents to CDS
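
The push scenario can be sketched as follows: build a small MARCXML record and prepare an HTTP POST that carries it. This is an illustrative sketch only. The endpoint URL, the `X-Api-Key` header name and the helper functions are hypothetical and not taken from the talk or from Invenio's documentation; the request is prepared but never sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"


def make_marcxml(title, report_number):
    """Build a minimal MARCXML collection: report number (037 $a), title (245 $a)."""
    ET.register_namespace("", MARC_NS)
    collection = ET.Element("{%s}collection" % MARC_NS)
    record = ET.SubElement(collection, "{%s}record" % MARC_NS)
    for tag, value in (("037", report_number), ("245", title)):
        field = ET.SubElement(record, "{%s}datafield" % MARC_NS,
                              {"tag": tag, "ind1": " ", "ind2": " "})
        sub = ET.SubElement(field, "{%s}subfield" % MARC_NS, {"code": "a"})
        sub.text = value
    return ET.tostring(collection, encoding="utf-8")  # bytes, ready to POST


def build_upload_request(url, marcxml, api_key):
    """Prepare (but do not send) the POST request carrying the record."""
    return urllib.request.Request(
        url,
        data=marcxml,
        method="POST",
        headers={
            "Content-Type": "application/marcxml+xml",
            "X-Api-Key": api_key,  # hypothetical header name for the API key
        },
    )


payload = make_marcxml("Test talk", "DEMO-2012-001")
# Hypothetical endpoint; a real installation defines its own upload URL.
request = build_upload_request(
    "https://dl.example.org/robotupload/insert", payload, "secret")
# urllib.request.urlopen(request) would send it; a 4xx/5xx answer raises
# urllib.error.HTTPError, which is the "immediate feedback" on a bad record.
```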

35 Which one?
- Example 1: Nikos wants to create an archive of all the blogs about High Energy Physics. E.g. Nikos works at CERN and he must harvest blog posts from Fermilab Quantum Diaries.
- Example 2: Sam wants to create an archive of all the scheduled TV programmes in his nation. E.g. Sam lives in France and he would like to harvest content from the TV Broadcast site Telerama.fr.

37 The TV DL Example 2: Sam wants to create an archive of all the scheduled TV programmes in his nation E.g. Sam lives in France and he would like to harvest content from the TV Broadcast site Telerama.fr Good luck Sam!

38 The TV DL
- Find a common input standard: for TV programmes this is XMLTV
- Map the input to MARCXML
- Wrap the harvesting, conversion and uploading into a Tasklet
- That's all!

39 The TV DL Mapping XMLTV to XMLMARC: XML Conversion

    smart_add_field(rec,
        245 : a : programme.get('title'), b : programme.get('sub-title'),   ## Title
        260 : b : channel_map[programme['channel']], c : start_time,        ## "Place"
        269 : c : programme.get('date'),                                    ## Date
        520 : a : programme.get('desc'),                                    ## Summary
        037 : a : u'-'.join([programme.get('channel', ), programme.get('start', ).replace(, )]),
        088 : a : u'-'.join([programme.get('title', [(, )])[0][0].replace(, ), programme.get('episode-num', [(, )])[0][0]]),
        FMT : g : original_xml, f : xmltv,)
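
The same kind of mapping can be tried outside Invenio with the standard library alone. The sketch below is an assumption-laden illustration, not the code from the talk: it covers only fields 245, 269 and 520, the sample XMLTV fragment is invented, and `add_field` and `programme_to_record` are hypothetical helpers standing in for `smart_add_field`.

```python
import xml.etree.ElementTree as ET

# Invented XMLTV fragment with the elements used by the mapping.
XMLTV_SAMPLE = """<tv>
  <programme start="20120514203000 +0200" channel="C1.example">
    <title>Le journal</title>
    <sub-title>Edition du soir</sub-title>
    <desc>Evening news.</desc>
    <date>2012</date>
  </programme>
</tv>"""


def add_field(record, tag, **subfields):
    """Append a datafield, skipping subfields whose value is missing."""
    present = {code: value for code, value in subfields.items() if value}
    if not present:
        return
    field = ET.SubElement(record, "datafield",
                          {"tag": tag, "ind1": " ", "ind2": " "})
    for code, value in present.items():
        ET.SubElement(field, "subfield", {"code": code}).text = value


def programme_to_record(programme):
    """Map one XMLTV <programme> element to a MARCXML-like <record>."""
    record = ET.Element("record")
    add_field(record, "245", a=programme.findtext("title"),
              b=programme.findtext("sub-title"))      # Title
    add_field(record, "269", c=programme.findtext("date"))   # Date
    add_field(record, "520", a=programme.findtext("desc"))   # Summary
    return record


records = [programme_to_record(p)
           for p in ET.fromstring(XMLTV_SAMPLE).iter("programme")]
marcxml = ET.tostring(records[0], encoding="unicode")
```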

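40 The TV DL Tasklet programming: Main function: bibtasklet

    import os
    import tempfile

    # write_message and task_low_level_submission are provided by Invenio;
    # grab_xmltv, xmltv2marcxml and tmpdir are defined elsewhere in the tasklet.
    def bst_xmltv2marcxml():
        # 1. Harvest: fetch the XMLTV data into a local file.
        write_message("grabbing XMLTV data")
        xmltvfile = grab_xmltv()
        write_message("xmltv data saved into %s" % xmltvfile)
        # 2. Convert: write the derived MARCXML into a temporary file.
        fd, marcxmlfile = tempfile.mkstemp(dir=tmpdir, suffix=".xml")
        os.write(fd, xmltv2marcxml(open(xmltvfile)))
        os.close(fd)
        write_message("derived MARCXML saved into %s" % marcxmlfile)
        # 3. Upload: schedule a bibupload task in insert/replace mode.
        task_id = task_low_level_submission("bibupload", "xmltv", "-ir", marcxmlfile)
        write_message("bibupload scheduled with task id %s" % task_id)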

41 The TV DL Scheduler with pending tasklets:

45 The TV DL Push or Pull?

46 Outline

47 Processing Modules Overview

51 Processing Modules Example: indexing

54 Building Indexes: designing a search engine
Performance-driven design assumptions:
- high number of selects, low number of updates
- fast searching, slow indexation
- cache everything cacheable
Search functionality:
- search for words, phrases, regular expressions
- search in any field, authors, titles, etc.
Index design:
- forward indexes: rec1 -> [word1, word8, ...], rec2 -> [word1, word2, ...]
- reverse indexes: word1 -> [rec1, rec2, ...], word2 -> [rec2, rec7, ...]
Zipf's law on word frequency:
- few words occur very often (e.g. "the")
- most words are infrequent (even e.g. "boson")
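
The forward and reverse index shapes above can be sketched in a few lines. This is a toy illustration under simple assumptions (whitespace tokenization, lowercase matching, AND semantics); the record texts and the function names `build_indexes` and `search` are invented for the example and do not reflect Invenio's indexer.

```python
from collections import defaultdict


def build_indexes(records):
    """Return (forward, reverse): record -> words and word -> records."""
    forward = {rec_id: sorted(set(text.lower().split()))
               for rec_id, text in records.items()}
    reverse = defaultdict(set)
    for rec_id, words in forward.items():
        for word in words:
            reverse[word].add(rec_id)
    return forward, dict(reverse)


def search(reverse, query):
    """AND search: intersect the hit sets of all query words."""
    sets = [reverse.get(word, set()) for word in query.lower().split()]
    return set.intersection(*sets) if sets else set()


records = {
    1: "the Higgs boson",
    2: "the top quark",
    3: "the Higgs mechanism",
}
forward, reverse = build_indexes(records)
hits = search(reverse, "Higgs the")
```

Even in this tiny corpus the word "the" maps to every record while "boson" maps to one, which is the Zipf-like skew the slide points at: a few very long hit sets and many short ones.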

55 Building Indexes: Optimizing
Three important speed factors to consider:
- speed of finding sets (DB Server)
- speed of demarshalling sets (DB <-> Web App Server)
- speed of intersecting sets (Web App Server)
Optimizing data structures:
- data structures tested: sorted (lists, Patricia trees), unsorted (hashed sets, binary vectors)
- fast prototyping: (Python, Common Lisp)
Binary vectors found to be the best compromise:
- typical search time gain: 4.0 sec -> 0.2 sec
- typical indexing time loss: 7 hours -> 4 days
- mostly sparse data modelled via a mostly dense data structure
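
To see why binary vectors make set intersection cheap, the sketch below stores a hit set as one integer whose bit `n` is set when record `n` matches, so intersecting two sets is a single bitwise AND. It uses plain Python integers purely for illustration; the helper names are invented and this is not the data structure implementation used by Invenio.

```python
def to_bitvector(rec_ids):
    """Encode a set of record IDs as an integer bit vector."""
    vec = 0
    for rec_id in rec_ids:
        vec |= 1 << rec_id
    return vec


def from_bitvector(vec):
    """Decode a bit vector back into a sorted list of record IDs."""
    ids = []
    rec_id = 0
    while vec:
        if vec & 1:
            ids.append(rec_id)
        vec >>= 1
        rec_id += 1
    return ids


higgs = to_bitvector([1, 3, 8])      # records containing "higgs"
boson = to_bitvector([1, 2, 8, 9])   # records containing "boson"

both = higgs & boson     # intersection (AND query)
either = higgs | boson   # union (OR query)
```

The trade-off named on the slide is visible here: the vector is as long as the highest record ID even when few bits are set (dense structure for sparse data), but AND and OR run over machine words instead of individual IDs.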

56 Processing Modules: Sorting Quickly Very Large Sets
- A processing phase is needed to generate Sorting Buckets
- Example at search time: the first 10 records sorted in ascending order by title
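
One way to read the bucket idea is sketched below; it is an interpretation, since the slide does not show the algorithm. In a preprocessing phase all records are ranked by title and the ranking is cut into buckets. At search time the hit set is intersected with the buckets in order, so only the few hits inside the first buckets are ever sorted. The titles, the bucket size and the function names are assumptions made for the example.

```python
def build_buckets(titles, bucket_size):
    """Preprocessing: rank all records by title, cut the ranking into buckets."""
    ordered = sorted(titles, key=lambda rec_id: titles[rec_id].lower())
    rank = {rec_id: pos for pos, rec_id in enumerate(ordered)}
    buckets = [set(ordered[i:i + bucket_size])
               for i in range(0, len(ordered), bucket_size)]
    return rank, buckets


def first_sorted(hits, rank, buckets, limit=10):
    """Search time: walk buckets in order, sorting only the hits in each one."""
    result = []
    for bucket in buckets:
        result.extend(sorted(hits & bucket, key=rank.__getitem__))
        if len(result) >= limit:
            break  # later buckets can only hold records that sort after these
    return result[:limit]


titles = {
    1: "Zeta functions", 2: "Alpha decay", 3: "Muon g-2",
    4: "Beta beams", 5: "Kaon mixing", 6: "Charm physics",
}
rank, buckets = build_buckets(titles, bucket_size=2)
hits = {1, 3, 4, 6}
top_two = first_sorted(hits, rank, buckets, limit=2)
all_sorted = first_sorted(hits, rank, buckets)
```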

58 Processing Modules Citation-graph based (L. Marian)
- Citation Counts
- Time-dependent Citation Counts
- Link-based
- Time-dependent Link-based
- Link-based with External Citations
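
The first three families in this list can be illustrated on a toy citation graph. The formulas below are generic textbook choices picked for the sketch, not the methods from the cited work: the time-dependent variant assumes an exponential decay with a half-life, and the link-based variant is a plain PageRank-style iteration. The data and all function names are invented.

```python
def citation_counts(citations):
    """citations: iterable of (citing, cited, year). Returns {cited: count}."""
    counts = {}
    for _citing, cited, _year in citations:
        counts[cited] = counts.get(cited, 0) + 1
    return counts


def time_dependent_counts(citations, now, half_life=5.0):
    """Weight each citation by its age (assumed exponential decay)."""
    scores = {}
    for _citing, cited, year in citations:
        weight = 0.5 ** ((now - year) / half_life)
        scores[cited] = scores.get(cited, 0.0) + weight
    return scores


def link_based(citations, damping=0.85, iterations=50):
    """PageRank-style score: a citation from a well-cited paper counts more."""
    nodes = sorted({n for citing, cited, _ in citations for n in (citing, cited)})
    out = {n: [] for n in nodes}
    for citing, cited, _ in citations:
        out[citing].append(cited)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            targets = out[n] or nodes  # papers citing nothing spread evenly
            share = damping * rank[n] / len(targets)
            for target in targets:
                new[target] += share
        rank = new
    return rank


# Toy graph: A is cited by B (in 2002) and C; B is cited by C and D.
citations = [("B", "A", 2002), ("C", "A", 2012),
             ("C", "B", 2012), ("D", "B", 2012)]
counts = citation_counts(citations)
recent = time_dependent_counts(citations, now=2012)
ranks = link_based(citations)
```

On this graph the three scores disagree in an instructive way: plain counts tie A and B, the time-dependent score prefers B because its citations are recent, and the link-based score prefers A because one of its citations comes from the well-cited B.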

60 Processing Modules Citation-graph based methods

62 Processing Modules D-Rank
D-Rank: Distributed technology
- One method to rule them all
- One method that can aggregate all the existing ranking methods + user feedback
- Readjustment of parameters based on user feedback as a function of Relevance and Quality

63 Processing Modules Citation-graph based methods Spying the "life" of a record in all queries:

65 Outline

Invenio: a modern digital library system. <

Invenio: a modern digital library system. < Invenio: a modern digital library system Samuele Kaplun (on behalf of the Invenio Development Team) History I CERN adopted Open Access since almost

More information

Invenio: A Modern Digital Library for Grey Literature

Invenio: A Modern Digital Library for Grey Literature Invenio: A Modern Digital Library for Grey Literature Jérôme Caffaro, CERN Samuele Kaplun, CERN November 25, 2010 Abstract Grey literature has historically played a key role for researchers in the field

More information

CERN Document Server Software

CERN Document Server Software CERN Document Server Software Jean-Yves Le Meur 15 June 2004 CDSware > Introduction > History Since the creation of CERN 50 years ago, library mission is the same: the dissemination and long term keeping

More information

Institutional Repository using DSpace. Yatrik Patel Scientist D (CS)

Institutional Repository using DSpace. Yatrik Patel Scientist D (CS) Institutional Repository using DSpace Yatrik Patel Scientist D (CS) yatrik@inflibnet.ac.in What is Institutional Repository? Institutional repositories [are]... digital collections capturing and preserving

More information

Building a Digital Repository on a Shoestring Budget

Building a Digital Repository on a Shoestring Budget Building a Digital Repository on a Shoestring Budget Christinger Tomer University of Pittsburgh! PALA September 30, 2014 A version this presentation is available at http://www.pitt.edu/~ctomer/shoestring/

More information

Ing. José A. Mejía Villar M.Sc. Computing Center of the Alfred Wegener Institute for Polar and Marine Research

Ing. José A. Mejía Villar M.Sc. Computing Center of the Alfred Wegener Institute for Polar and Marine Research Ing. José A. Mejía Villar M.Sc. jmejia@awi.de Computing Center of the Alfred Wegener Institute for Polar and Marine Research 29. November 2011 Contents 1. Fedora Commons Repository 2. Federico 3. Federico's

More information

Promoting Open Standards for Digital Repository. case study examples and challenges

Promoting Open Standards for Digital Repository. case study examples and challenges Promoting Open Standards for Digital Repository Infrastructures: case study examples and challenges Flavia Donno CERN P. Fuhrmann, DESY, E. Ronchieri, INFN-CNAF OGF-Europe Community Outreach Seminar Digital

More information

Information retrieval concepts Search and browsing on unstructured data sources Digital libraries applications

Information retrieval concepts Search and browsing on unstructured data sources Digital libraries applications Digital Libraries Agenda Digital Libraries Information retrieval concepts Search and browsing on unstructured data sources Digital libraries applications What is Library Collection of books, documents,

More information

Building for the Future

Building for the Future Building for the Future The National Digital Newspaper Program Deborah Thomas US Library of Congress DigCCurr 2007 Chapel Hill, NC April 19, 2007 1 What is NDNP? Provide access to historic newspapers Select

More information

EUDAT. A European Collaborative Data Infrastructure. Daan Broeder The Language Archive MPI for Psycholinguistics CLARIN, DASISH, EUDAT

EUDAT. A European Collaborative Data Infrastructure. Daan Broeder The Language Archive MPI for Psycholinguistics CLARIN, DASISH, EUDAT EUDAT A European Collaborative Data Infrastructure Daan Broeder The Language Archive MPI for Psycholinguistics CLARIN, DASISH, EUDAT OpenAire Interoperability Workshop Braga, Feb. 8, 2013 EUDAT Key facts

More information

Adding OAI ORE Support to Repository Platforms

Adding OAI ORE Support to Repository Platforms Adding OAI ORE Support to Repository Platforms Alexey Maslov, Adam Mikeal, Scott Phillips, John Leggett, Mark McFarland Texas Digital Library OR 09 Texas Digital Library Use Case for OAI OREO Overview

More information

SobekCM. Compiled for presentation to the Digital Library Working Group School of Oriental and African Studies

SobekCM. Compiled for presentation to the Digital Library Working Group School of Oriental and African Studies SobekCM Compiled for presentation to the Digital Library Working Group School of Oriental and African Studies SobekCM Is a digital library system built at and maintained by the University of Florida s

More information

Comparing Open Source Digital Library Software

Comparing Open Source Digital Library Software Comparing Open Source Digital Library Software George Pyrounakis University of Athens, Greece Mara Nikolaidou Harokopio University of Athens, Greece Topic: Digital Libraries: Design and Development, Open

More information

BPMN Processes for machine-actionable DMPs

BPMN Processes for machine-actionable DMPs BPMN Processes for machine-actionable DMPs Simon Oblasser & Tomasz Miksa Contents Start DMP... 2 Specify Size and Type... 3 Get Cost and Storage... 4 Storage Configuration and Cost Estimation... 4 Storage

More information

http://resolver.caltech.edu/caltechlib:spoiti05 Caltech CODA http://coda.caltech.edu CODA: Collection of Digital Archives Caltech Scholarly Communication 15 Production Archives 3102 Records Theses, technical

More information

Persistent identifiers, long-term access and the DiVA preservation strategy

Persistent identifiers, long-term access and the DiVA preservation strategy Persistent identifiers, long-term access and the DiVA preservation strategy Eva Müller Electronic Publishing Centre Uppsala University Library, http://publications.uu.se/epcentre/ 1 Outline DiVA project

More information

The Semantic Institution: An Agenda for Publishing Authoritative Scholarly Facts. Leslie Carr

The Semantic Institution: An Agenda for Publishing Authoritative Scholarly Facts. Leslie Carr The Semantic Institution: An Agenda for Publishing Authoritative Scholarly Facts Leslie Carr http://id.ecs.soton.ac.uk/people/60 What s the Web For? To share information 1. Ad hoc home pages 2. Structured

More information

OAI-PMH. DRTC Indian Statistical Institute Bangalore

OAI-PMH. DRTC Indian Statistical Institute Bangalore OAI-PMH DRTC Indian Statistical Institute Bangalore Problem: No Library contains all the documents in the world Solution: Networking the Libraries 2 Problem No digital Library is expected to have all documents

More information

2nd Technical Validation Questionnaire - interim results -

2nd Technical Validation Questionnaire - interim results - 2nd Technical Validation Questionnaire - interim results - Birgit Matthaei Humboldt-University, Berlin, Germany Electronic Publishing Group Computer- and Mediaservice birgit.matthaei@cms.hu-berlin.de Why

More information

GNU EPrints 2 Overview

GNU EPrints 2 Overview GNU EPrints 2 Overview Christopher Gutteridge 14th October 2002 Abstract An overview of GNU EPrints 2. EPrints is free software which creates a web based archive and database of scholarly output and is

More information

The OAIS Reference Model: current implementations

The OAIS Reference Model: current implementations The OAIS Reference Model: current implementations Michael Day, UKOLN, University of Bath m.day@ukoln.ac.uk Chinese-European Workshop on Digital Preservation, Beijing, China, 14-16 July 2004 Presentation

More information

September Development of favorite collections & visualizing user search queries in CERN Document Server (CDS)

September Development of favorite collections & visualizing user search queries in CERN Document Server (CDS) Development of favorite collections & visualizing user search queries in CERN Document Server (CDS) September 2013 Author: Archit Sharma archit.py@gmail.com Supervisor: Nikolaos Kasioumis CERN Openlab

More information

Flexible Design for Simple Digital Library Tools and Services

Flexible Design for Simple Digital Library Tools and Services Flexible Design for Simple Digital Library Tools and Services Lighton Phiri Hussein Suleman Digital Libraries Laboratory Department of Computer Science University of Cape Town October 8, 2013 SARU archaeological

More information

Lessons Learned. Implementing Rosetta in the Harold B. Lee Library

Lessons Learned. Implementing Rosetta in the Harold B. Lee Library Lessons Learned Implementing Rosetta in the Harold B. Lee Library Provide Long Term Digital Access 1. To preserve BYU digital items: Digitized images, audio, video, Electronic articles, university records,

More information

Introduction to TIND. Guillaume Lastecoueres

Introduction to TIND. Guillaume Lastecoueres Introduction to TIND Guillaume Lastecoueres Good afternoon Introduction to TIND Basics Record types Bibliographic record. Holding record. Item record. Record types Bibliographic Holding Item Holding Item

More information

Digital The Harold B. Lee Library

Digital The Harold B. Lee Library Digital Preservation @ The Harold B. Lee Library CIMA 23 May 2013 How we got here? 1. Understanding Digital Preservation 2. Search for Content 3. Maintain Optical Disc Storage 4. In House Preservation

More information

Capturing and Analyzing User Behavior in Large Digital Libraries

Capturing and Analyzing User Behavior in Large Digital Libraries Capturing and Analyzing User Behavior in Large Digital Libraries Giorgi Gvianishvili, Jean-Yves Le Meur, Tibor Šimko, Jérôme Caffaro, Ludmila Marian, Samuele Kaplun, Belinda Chan, and Martin Rajman European

More information

Horizon Societies of Symbiotic Robot-Plant Bio-Hybrids as Social Architectural Artifacts. Deliverable D4.1

Horizon Societies of Symbiotic Robot-Plant Bio-Hybrids as Social Architectural Artifacts. Deliverable D4.1 Horizon 2020 Societies of Symbiotic Robot-Plant Bio-Hybrids as Social Architectural Artifacts Deliverable D4.1 Data management plan (open research data pilot) Date of preparation: 2015/09/30 Start date

More information

How to contribute information to AGRIS

How to contribute information to AGRIS How to contribute information to AGRIS Guidelines on how to complete your registration form The dashboard includes information about you, your institution and your collection. You are welcome to provide

More information

RVOT: A Tool For Making Collections OAI-PMH Compliant

RVOT: A Tool For Making Collections OAI-PMH Compliant RVOT: A Tool For Making Collections OAI-PMH Compliant K. Sathish, K. Maly, M. Zubair Computer Science Department Old Dominion University Norfolk, Virginia USA {kumar_s,maly,zubair}@cs.odu.edu X. Liu Research

More information

Metadata and Encoding Standards for Digital Initiatives: An Introduction

Metadata and Encoding Standards for Digital Initiatives: An Introduction Metadata and Encoding Standards for Digital Initiatives: An Introduction Maureen P. Walsh, The Ohio State University Libraries KSU-SLIS Organization of Information 60002-004 October 29, 2007 Part One Non-MARC

More information

Data Exchange and Conversion Utilities and Tools (DExT)

Data Exchange and Conversion Utilities and Tools (DExT) Data Exchange and Conversion Utilities and Tools (DExT) Louise Corti, Angad Bhat, Herve L Hours UK Data Archive CAQDAS Conference, April 2007 An exchange format for qualitative data Data exchange models

More information

The Virtual Language Observatory!

The Virtual Language Observatory! The Virtual Language Observatory! Dieter Van Uytvanck! CMDI workshop, Nijmegen! 2012-09-13! 1! Overview! VLO?! What is behind it? Relation to CMDI?! How do I get my data in there?! Demo + excercises!!

More information

Increasing access to OA material through metadata aggregation

Increasing access to OA material through metadata aggregation Increasing access to OA material through metadata aggregation Mark Jordan Simon Fraser University SLAIS Issues in Scholarly Communications and Publishing 2008-04-02 1 We will discuss! Overview of metadata

More information

Problem: Solution: No Library contains all the documents in the world. Networking the Libraries

Problem: Solution: No Library contains all the documents in the world. Networking the Libraries OAI-PMH Problem: No Library contains all the documents in the world Solution: Networking the Libraries 2 Problem No digital Library is expected to have all documents in the world Solution Networking the

More information

The Design of a DLS for the Management of Very Large Collections of Archival Objects

The Design of a DLS for the Management of Very Large Collections of Archival Objects Session: VLDL Architectures The Design of a DLS for the Management of Very Large Collections of Archival Objects Maristella Agosti, Nicola Ferro and Gianmaria Silvello Information Management Research Group

More information

B2SAFE metadata management

B2SAFE metadata management B2SAFE metadata management version 1.2 by Claudio Cacciari, Robert Verkerk, Adil Hasan, Elena Erastova Introduction The B2SAFE service provides a set of functions for long term bit stream data preservation:

More information

COAR Interoperability Roadmap. Uppsala, May 21, 2012 COAR General Assembly

COAR Interoperability Roadmap. Uppsala, May 21, 2012 COAR General Assembly COAR Interoperability Roadmap Uppsala, May 21, 2012 COAR General Assembly 1 Background COAR WG2 s main objective for 2011-2012 was to facilitate a discussion on interoperability among Open Access repositories.

More information

A service-oriented national e-thesis information system and repository

A service-oriented national e-thesis information system and repository Title of the presentation Date 1 A service-oriented national e-thesis information system and repository Nikos Houssos Panagiotis Stathopoulos Ioanna Sarantopoulou Dimitris Zavaliadis Evi Sachini National

More information

ACDH AUSTRIAN CENTRE FOR DIGITAL HUMANITIES

ACDH AUSTRIAN CENTRE FOR DIGITAL HUMANITIES ARCHE = A Resource Centre for the HumanitiEs A digital archive for the humanities Implements the OAIS Reference Model for an Open Archival Information System arche.acdh.oeaw.ac.at WHAT IS AN ARCHIVE? Preserves

More information

Research Data Edinburgh: MANTRA & Edinburgh DataShare. Stuart Macdonald EDINA & Data Library University of Edinburgh

Research Data Edinburgh: MANTRA & Edinburgh DataShare. Stuart Macdonald EDINA & Data Library University of Edinburgh Research Data Services @ Edinburgh: MANTRA & Edinburgh DataShare Stuart Macdonald EDINA & Data Library University of Edinburgh NFAIS Open Data Seminar, 16 June 2016 Context EDINA and Data Library are a

More information

CERN Open Data and Data Analysis Knowledge Preservation

CERN Open Data and Data Analysis Knowledge Preservation CERN Open Data and Data Analysis Knowledge Preservation Tibor Šimko Digital Library 2015 21 23 April 2015 Jasná, Slovakia @tiborsimko @inveniosoftware 1 / 26 1 Invenio @tiborsimko @inveniosoftware 2 /

More information

The Materials Data Facility

The Materials Data Facility The Materials Data Facility Ben Blaiszik (blaiszik@uchicago.edu), Kyle Chard (chard@uchicago.edu) Ian Foster (foster@uchicago.edu) materialsdatafacility.org What is MDF? We aim to make it simple for materials

More information

The Metadata Challenge:

The Metadata Challenge: The Metadata Challenge: Determining local and global needs and expectations for your metadata Gareth Knight, Kultivate Metadata workshop 24 th May 2011 Centre for e-research (CeRch), King s College London

More information

Introduction to Federico 2.0 and Fedora Commons

Introduction to Federico 2.0 and Fedora Commons Introduction to Federico 2.0 and Fedora Commons Dr. Bernadette Fritszch Bernadette.Fritzsch@awi.de http://aforge.awi.de/gf/project/federico/ Ing. José A. Mejía Villar M.Sc. Jose.Mejia@awi.de Computing

More information

Software Requirements Specification for the Names project prototype

Software Requirements Specification for the Names project prototype Software Requirements Specification for the Names project prototype Prepared for the JISC Names Project by Daniel Needham, Amanda Hill, Alan Danskin & Stephen Andrews April 2008 1 Table of Contents 1.

More information

Its All About The Metadata

Its All About The Metadata Best Practices Exchange 2013 Its All About The Metadata Mark Evans - Digital Archiving Practice Manager 11/13/2013 Agenda Why Metadata is important Metadata landscape A flexible approach Case study - KDLA

More information

Role of Social Media and Semantic WEB in Libraries

Role of Social Media and Semantic WEB in Libraries Role of Social Media and Semantic WEB in Libraries By Dr. Anwar us Saeed Email: anwarussaeed@yahoo.com Layout Plan Where Library streams merge the WEB Recent Evolution of the WEB Social WEB Semantic WEB

More information

Building Illinois Electronic Documents Access

Building Illinois Electronic Documents Access Building Illinois Electronic Documents Access August 2009 Illinois State Library in Partnership with the Graduate School of Library and Information Science at the University of Illinois, Urbana-Champaign

More information

2011 Emerging Leaders: Group C Improving ALA Poster Sessions. Final Report and Recommendations Date Submitted: Monday, May 16 th, 2011

2011 Emerging Leaders: Group C Improving ALA Poster Sessions. Final Report and Recommendations Date Submitted: Monday, May 16 th, 2011 2011 Emerging Leaders: Group C Improving ALA Poster Sessions Final Report and Recommendations Date Submitted: Monday, May 16 th, 2011 Amanda Harlan, Ben Hunter, Ariel eff, Thomas Reinsfelder, Kristen Yarmey

More information

SMART CONNECTOR TECHNOLOGY FOR FEDERATED SEARCH

SMART CONNECTOR TECHNOLOGY FOR FEDERATED SEARCH SMART CONNECTOR TECHNOLOGY FOR FEDERATED SEARCH VERSION 1.4 27 March 2018 EDULIB, S.R.L. MUSE KNOWLEDGE HEADQUARTERS Calea Bucuresti, Bl. 27B, Sc. 1, Ap. 10, Craiova 200675, România phone +40 251 413 496

More information

Semantic Web Systems Introduction Jacques Fleuriot School of Informatics

Semantic Web Systems Introduction Jacques Fleuriot School of Informatics Semantic Web Systems Introduction Jacques Fleuriot School of Informatics 11 th January 2015 Semantic Web Systems: Introduction The World Wide Web 2 Requirements of the WWW l The internet already there

More information

Dataverse: Modular Storage and Migration to the Cloud

Dataverse: Modular Storage and Migration to the Cloud Dataverse: Modular Storage and Migration to the Cloud Gustavo Durand, Dataverse Technical Lead / Architect Leonid Andreev, Dataverse Senior Developer Dataverse Overview An open-source platform to publish,

More information

Part 2: Current State of OAR Interoperability. Towards Repository Interoperability Berlin 10 Workshop 6 November 2012

Part 2: Current State of OAR Interoperability. Towards Repository Interoperability Berlin 10 Workshop 6 November 2012 Part 2: Current State of OAR Interoperability The COAR Current State of Open Access Repository Interoperability (2012) Intended Audience: Institutions, and repository managers, operating at different points

More information

Union catalogue models

Union catalogue models Union catalogue models Presentation for Hellenic academic libraries Martin van Muyen Union catalogue: a catalogue that lists the holdings of more than one library 2 Union catalogue base functions Discovery

More information

Digital Curation and Preservation: Defining the Research Agenda for the Next Decade

Digital Curation and Preservation: Defining the Research Agenda for the Next Decade Storage Resource Broker Digital Curation and Preservation: Defining the Research Agenda for the Next Decade Reagan W. Moore moore@sdsc.edu http://www.sdsc.edu/srb Background NARA research prototype persistent

More information

MuseKnowledge Hybrid Search

MuseKnowledge Hybrid Search MuseKnowledge Hybrid Search MuseGlobal, Inc. One Embarcadero Suite 500 San Francisco, CA 94111 415 896-6873 www.museglobal.com MuseGlobal S.A Calea Bucuresti Bl. 27B, Sc. 1, Ap. 10 Craiova, România 40

More information

Nuno Freire National Library of Portugal Lisbon, Portugal

Nuno Freire National Library of Portugal Lisbon, Portugal Date submitted: 05/07/2010 UNIMARC in The European Library and related projects Nuno Freire National Library of Portugal Lisbon, Portugal E-mail: nuno.freire@bnportugal.pt Meeting: 148. UNIMARC WORLD LIBRARY

More information

Metadata for Data Discovery: The NERC Data Catalogue Service. Steve Donegan

Metadata for Data Discovery: The NERC Data Catalogue Service. Steve Donegan Metadata for Data Discovery: The NERC Data Catalogue Service Steve Donegan Introduction NERC, Science and Data Centres NERC Discovery Metadata The Data Catalogue Service NERC Data Services Case study:

More information

Long-term digital preservation of UNSWorks

Long-term digital preservation of UNSWorks Long-term digital preservation of UNSWorks UNSW Library Arif Shaon, Maude Frances CAUL Community Days 2014 UNSW Australia The University of New South Wales at a Glance: https://www.unsw.edu.au/sites/default/files/documents/unsw4009_miniguide_2012_aw2_v2.pdf

More information

Brown University Libraries Technology Plan

Brown University Libraries Technology Plan Brown University Libraries Technology Plan 2009-2011 Technology Vision Brown University Library creates, develops, promotes, and uses technology to further the Library s mission and strategic directions

More information

Copyright 2008, Paul Conway.

Copyright 2008, Paul Conway. Unless otherwise noted, the content of this course material is licensed under a Creative Commons Attribution - Non-Commercial - Share Alike 3.0 License.. http://creativecommons.org/licenses/by-nc-sa/3.0/

More information

CORE: Improving access and enabling re-use of open access content using aggregations

CORE: Improving access and enabling re-use of open access content using aggregations CORE: Improving access and enabling re-use of open access content using aggregations Petr Knoth CORE (Connecting REpositories) Knowledge Media institute The Open University @petrknoth 1/39 Outline 1. The

More information

The Open Archives Initiative and the Sheet Music Consortium

The Open Archives Initiative and the Sheet Music Consortium The Open Archives Initiative and the Sheet Music Consortium Jon Dunn, Jenn Riley IU Digital Library Program October 10, 2003 Presentation outline Jon: OAI introduction Sheet Music Consortium background

DRIVER Step One towards a Pan-European Digital Repository Infrastructure

Norbert Lossau Bielefeld University, Germany Scientific coordinator of the Project DRIVER, Funded by the European Commission Consultation

Survey of Existing Services in the Mathematical Digital Libraries and Repositories in the EuDML Project

Radoslav Pavlov, Desislava Paneva-Marinova, and Georgi Simeonov Institute of Mathematics and Informatics,

Invenio at UAB 11 years after

Invenio User Group Workshop 2017 Heinz Maier-Leibnitz Zentrum (MLZ) 21-24 March 2017 Ferran Jorba Universitat Autònoma de Barcelona Ferran.Jorba@uab.cat Quick summary 2006:

OPENAIRE FP7 POST-GRANT OPEN ACCESS PILOT

Alternative Funding Bid No 10. Hungarian Educational Research Journal (HERJ) Presenter: Laura Morvai University of Debrecen University and National Library Managing

EUDAT-B2FIND A FAIR and Interdisciplinary Discovery Portal for Research Data

Heinrich Widmann, DKRZ Claudia Martens, DKRZ Open Science Days, Berlin, 17 October 2017 www.eudat.eu EUDAT receives funding

Prototyping Data Intensive Apps: TrendingTopics.org

Pete Skomoroch Research Scientist at LinkedIn Consultant at Data Wrangling @peteskomoroch 09/29/09 Talk Outline TrendingTopics Overview Wikipedia Page

Scalable, Reliable Marshalling and Organization of Distributed Large Scale Data Onto Enterprise Storage Environments *

Joseph JaJa joseph@ Mike Smorul toaster@ Fritz McCall fmccall@ Yang Wang wpwy@ Institute

Non-text theses as an integrated part of the University Repository

a case study of the Academy of Performing Arts in Prague Iva Horová, Radim Chvála I. ETDs and the Czech Republic II. AMU and

Citation Services for Institutional Repositories: Citebase Search. Tim Brody Intelligence, Agents, Multimedia Group University of Southampton

28/04/2009 Content The Open Access Literature

Institutional repositories: description of VITAL as an example of a Fedora-based digital assets management system.

ICADLA-2, Johannesburg, South Africa Nabil Saadallah Manager, Middle East and Africa VTLS

Open Archives Initiative protocol development and implementation at arXiv

Simeon Warner (Los Alamos National Laboratory, USA) (simeon@lanl.gov) OAI Open Day, Washington DC 23 January 2001 What is arXiv?

Citation Services for Institutional Repositories: Citebase Search. Tim Brody Intelligence, Agents, Multimedia Group University of Southampton

Content The Research Literature The Open Access Literature Why

1. Download and install the Firefox Web browser if needed. 2. Open Firefox, go to zotero.org and click the big red Download button.

Get Started with Zotero A free, open-source alternative to products such as RefWorks and EndNote, Zotero captures reference data from many sources, and lets you organize your citations and export bibliographies

OpenAIRE. Fostering the social and technical links that enable Open Science in Europe and beyond

Alessia Bardi and Paolo Manghi, Institute of Information Science and Technologies CNR Katerina Iatropoulou, ATHENA, Iryna Kuchma and Gwen Franck, EIFL Pedro Príncipe, University of Minho

The OpenAIREplus Project

Special thanks to Natalia Manola and Yannis Ioannidis (University of Athens), who contributed to these slides. Paolo Manghi Istituto di Scienza e Tecnologie dell'Informazione Consiglio

Digital Preservation Standards Using ISO 16363 for assessment

Preservation Administrators Interest Group, American Library Association, June 25, 2016 Amy Rudersdorf Senior Consultant, AVPreserve amy@avpreserve.com

Artificially enhanced research

Free software and fantastic research Dan Scott November 24, 2008 First, a word about our browser It all starts with Mozilla Firefox: A great, feature filled, secure browser

Archives in a Networked Information Society: The Problem of Sustainability in the Digital Information Environment

Shigeo Sugimoto Research Center for Knowledge Communities Graduate School of Library, Information

Digital Libraries: Interoperability

RAFFAELLA BERNARDI UNIVERSITÀ DEGLI STUDI DI TRENTO P.ZZA VENEZIA, ROOM: 2.05, E-MAIL: BERNARDI@DISI.UNITN.IT Contents 1 Interoperability

Helping Journals to Upgrade Data Publications for Reusable Research

Sonia Barbosa (Project Manager) Eleni Castro (Project Coordinator) Institute for Quantitative Social Science (IQSS) Harvard University @thedataorg

Open source software for building open access repositories. Imma Subirats Coll knowledge and information management officer FAO of the United Nations

Introduction Description of three open source softwares

Data publication and discovery with Globus

Questions and comments to outreach@globus.org The Globus data publication and discovery services make it easy for institutions and projects to establish collections,

Sessions 3/4: Member Node Breakouts. John Cobb Matt Jones Laura Moyers 7 July 2013 DataONE Users Group

Schedule 1:00-2:20 and 2:40-4:00 Member Node Breakouts Member Node Overview and Process Overview Documentation

Registry Interchange Format: Collections and Services (RIF-CS) explained

ANDS Guide Level: Awareness Last updated: 10 January 2017 Web link: www.ands.org.au/guides/rif-cs-explained The RIF-CS schema is

Working with Islandora

Erin Tripp, discoverygarden erin@discoverygarden.ca @eeohalloran April 21, 2015 Jasna, Slovakia Presentation Agenda Introductions Islandora Software Islandora Community Islandora

Introduction

EuropeanaConnect All-Staff Meeting Berlin, May 10-12, 2010 Welcome to the All-Staff Meeting! Introduction This is a quite big meeting. This is the end of successful project year Project established

For those of you who may not have heard of the BHL let me give you some background. The Biodiversity Heritage Library (BHL) is a consortium of

For those of you who may not have heard of the BHL let me give you some background. The Biodiversity Heritage Library (BHL) is a consortium of natural history and botanical libraries that cooperate

OAI-PMH for dummies: how to build an institutional repository with limited resources?

Federaal Kenniscentrum voor de Gezondheidszorg Centre fédéral d'expertise des soins de santé Belgian Health Care Knowledge Centre

MEDIA PROCESSING ON CLOUD

SCALABLE, MANAGEABLE AND COST EFFECTIVE SRINI AKKALA TABLE OF CONTENTS INTRODUCTION SOLUTION Elastic computing Storage and archival Database Disaster

Article begins on next page

NJVid: New Jersey Statewide Digital Video Portal Rutgers University has made this article freely available. Please share how this access benefits you. Your story matters. [https://rucore.libraries.rutgers.edu/rutgers-lib/21708/story/]

Using metadata for interoperability. CS 431 February 28, 2007 Carl Lagoze Cornell University

What is the problem? Getting heterogeneous systems to work together Providing the user with a seamless information

Exploring Open Source Solutions in the Management of ETD Processes CHETAN S SONAWANE KMC COLLEGE, INDIA

Introduction ETD being the most important research materials holds an importance. With the development

DAITSS Demo Virtual Machine Quick Start Guide

The following topics are covered in this document: A brief Glossary Downloading the DAITSS Demo Virtual Machine Starting up the DAITSS Demo Virtual Machine

Representation/Indexing (fig 1.2) IR models - overview (fig 2.1) IR models - vector space. Weighting TF*IDF. User. Tasks

Summary agenda Summary: EITN01 Web Intelligence and Information Retrieval Anders Ardö EIT Electrical and Information Technology, Lund University March 13, 2013

EUDAT B2FIND A Cross-Discipline Metadata Service and Discovery Portal

Heinrich Widmann, DKRZ DI4R 2016, Krakow, 28 September 2016 www.eudat.eu EUDAT receives funding from the European Union's Horizon 2020

The Ohio State University's Knowledge Bank: An Institutional Repository in Practice

Maureen P. Walsh, The Ohio State University Libraries The Ohio State University's Institutional Repository Mission The mission of the institutional repository
