Guidelines for preparing a Z39.50/SRU target to enable metadata harvesting


ECP-2006-DILI TELplus

Deliverable number: D-2.3
Dissemination level: Public
Delivery date: 30th of June 2009
Status: V1.0 Final
Author(s): Nuno Freire (BNP), Diogo Reis (IST)

This project is funded under the eContentplus programme (note 1), a multiannual Community programme to make digital content in Europe more accessible, usable and exploitable.

Note 1: OJ L 79, , p. 1.


Contents

1 Introduction
    1.1 The case for harvesting metadata via Z39.50/SRU
2 Introduction to search & retrieval protocols
    2.1 Z39.50
        2.1.1 History
        2.1.2 Description
    2.2 SRU/SRW
        2.2.1 History
        2.2.2 Description
3 Related work
4 Metadata harvesting via Z39.50/SRU
    4.1 Deployment scenarios
    4.2 The current status of SRU/Z39.50 usage
    4.3 Metadata harvesting efficiency considerations
5 Harvesting methods
    5.1 Incremental harvests
    5.2 Full harvests
        5.2.1 Full harvest by record creation and modification dates
        5.2.2 Full harvest by identifier export
        5.2.3 Full harvest by sequential identifier
        5.2.4 Full harvest by index scan
    5.3 Choosing a harvesting method
6 Conclusion
7 References

1 Introduction

Collections from the national libraries can be made available in The European Library portal via three communication protocols: the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), Z39.50, or Search/Retrieve via URL (SRU). The choice of communication protocol greatly influences the functionality that the portal can provide to the end user. Although all three protocols provide a standard for communication between the portal and the libraries' systems, the underlying communication paradigms are substantially different. While the design of OAI-PMH allows the portal to harvest all metadata records from the libraries into a central repository, Z39.50 and SRU were designed for remote search and retrieval, so the metadata records remain only at the data provider.

The choice of communication paradigm (metadata harvesting or remote search and retrieval) has an enormous impact on the usability of the portal, mainly on search and retrieval, which is an essential function of the portal. Having the metadata harvested into a central repository allows the portal to preprocess the metadata in order to provide services to the end user and generally improve the user experience. With remote search and retrieval, on the other hand, the functionality that the portal can provide for the metadata collections is limited by the functionality that the underlying communication protocol provides. This means that the portal can only provide search and retrieval functionality for the collections that are available by SRU or Z39.50. It also means that the portal has to handle each collection independently of the others, so users cannot see a consolidated view of the results of a query and are forced to navigate the results of each collection individually.

The European Library is also preparing to become a library aggregator for Europeana.
Europeana was designed from the start to follow the metadata harvesting paradigm, with OAI-PMH being the only supported protocol. Only those collections that are harvested into the central metadata repository of The European Library can be made available to Europeana.

The above factors make a strong case for The European Library to pursue the objective of having all of the libraries' collections harvested into its central metadata repository.

1.1 The case for harvesting metadata via Z39.50/SRU

This document is the result of a study to evaluate the possibility of performing metadata harvesting on collections that are available via Z39.50 or SRU. Although many national libraries already have OAI-PMH access implemented, or are working in that direction, there are cases where an implementation of OAI-PMH would require a major effort, because of difficulties in adapting the underlying information systems of the national libraries.

An important factor for OAI-PMH implementation is vendor support. When vendors of library management systems provide an OAI-PMH module for their products, that is usually the option the libraries choose. However, even though OAI-PMH was designed to be a simple communication protocol with a low cost of implementation, many vendors do not provide an OAI-PMH module for their products. In these cases, libraries have to look for other solutions, which require more technical knowledge from the library staff.

Libraries can implement OAI-PMH by deploying one of the software solutions made available by The European Library, or by using other software that is available as open source and free of charge. Whatever software is used for the OAI-PMH implementation, an essential component is the one that accesses the metadata records in the catalogue and makes them available to the OAI-PMH software. This component typically takes one of two forms:

- An export tool provided by the library management system, which exports the metadata records to a file.
- Middleware that accesses the metadata records in the catalogue, typically via a direct connection to the relational database of the library management system.

Difficulties may arise with either of these solutions. For example, in some systems the export tools have to be operated by a person, so the process cannot be automated. Connecting directly to the database of the library catalogue may also be impossible because the database in use is a proprietary solution without an open API.

In these scenarios, a possible solution may be to harvest the metadata records through the Z39.50 server of the catalogue. Unlike OAI-PMH modules, Z39.50 servers are available from the large majority of library management system vendors, and their use within libraries is widespread.

Metadata harvesting done through Z39.50/SRU will certainly present difficulties of its own. Z39.50 and SRU were not designed for metadata harvesting, so some functionality required to make the harvesting process efficient and reliable was not included in the protocols' design. This application of metadata harvesting is therefore likely to be a very inefficient process that places a significant load on the libraries' information systems.

2 Introduction to search & retrieval protocols

This section introduces the search and retrieval protocols.

2.1 Z39.50

The Z39.50 protocol is an ANSI/NISO standard for computer-to-computer communication designed to support searching and retrieval of information from remote databases, provided by a server, in a distributed network environment [1]. The architecture of Z39.50 is database-oriented, at a higher level of abstraction than database management systems, because its semantics are oriented to classes of databases.

2.1.1 History

The initial efforts on Z39.50 started in the 1970s [2] as an experimental protocol within the Linked System Project (LSP), whose goal was to create a national bibliographic database in the United States. To that end, a cross-database searching standard was defined that could perform searches across the major bibliographic databases, which were very homogeneous, in organizations such as the Library of Congress, the Online Computer Library Center (OCLC), and the Research Libraries Information Network. The work that followed in the early 1980s was mainly implementation, which delayed the creation of a protocol. In 1988, when the standard was established, it focused on information retrieval from bibliographic databases, and no correct implementation of it existed. By the end of the 1980s the community interested in the standard had grown, and many libraries now had online catalogues, covering more classes of databases.

NISO released a revised version of Z39.50, Version 2, in 1992. This version benefited from much more feedback from implementers. Still, a major barrier to its deployment was the definition of presentation layer services. In 1995, version 3 of Z39.50 was published. It contained important improvements in performance on fast networks, sorting, and access point browsing. It also introduced complex features such as extended services and the generalized record syntax.
A later edition of Z39.50 revises the standard to incorporate clarifications, amendments and corrections, and has been endorsed by the Z39.50 Implementers Group (ZIG).

2.1.2 Description

Z39.50 is a connection-oriented protocol that defines interactions between two machines. The current protocol runs over TCP/IP. The first version ran over X.25, but TCP/IP performs better and was adopted afterwards. Communication begins with an initialization phase in which client and server negotiate the communication properties (protocol version, maximum record size, etc.). The server may require the client to authenticate. Either the server or the client may end the session.

Note 1: Z39.50 Maintenance Agency Page,

A server has databases containing records. Each database has a set of access points (indexes) that can be used for searching. This is a much more abstract view of a database than one finds with SQL, for example. One deals only with logical entities based on the kind of information that is stored in the database, not with the details of specific database implementations.

One of the basic Z39.50 functions allows the client to transmit a search to the server (a SEARCH request). There are over a hundred search parameters for bibliographic records, in five categories: relation attributes, position attributes, structure attributes, truncation attributes, and completeness attributes. Queries have a Boolean structure and may nest search phrases to multiple levels, e.g. (phrase1 AND (phrase2 OR phrase3)). The query structure also allows operands for Restriction and Proximity. The semantics of the SEARCH request in Z39.50 were the basis for the SRU/SRW Common Query Language.

A search produces a set of records, called a "result set", which is maintained on the server. Records from the result set can subsequently be retrieved by the client using PRESENT requests. A server can provide progress reports for an active search, or can ask the client for authorization to continue a resource-intensive search; a client can abort an active search.

Z39.50 contains facilities for managing and sorting result sets, for browsing the values of access points associated with a database, and for opening and closing connections. It also provides a general mechanism called "extended services", an asynchronous remote procedure call mechanism that the client can use to invoke services on the server, optionally passing a reference to the contents of a result set as a parameter.

Various groups have developed Z39.50 profiles. Of note is the Bath Profile (note 1), which specifies a rigid syntax for bibliographic searches. The maintenance agency keeps a list of these profiles, but their maintenance and support are unclear.
This has led to fragmentation in the implementer community and is a serious problem for interoperability.

2.2 SRU/SRW

SRU (note 2) is an acronym for Search/Retrieve via URL, a standard search protocol for Internet search queries that uses the Common Query Language (CQL), a query syntax for representing queries based on the semantics of Z39.50. SRW (note 3) (Search Retrieve Web Service) is a variation of SRU in which messages are transmitted as XML over HTTP with SOAP (note 4) instead of being encoded in a URL. SRW and SRU are intended to define a standard form for Internet search queries as well as the structure of the responses. SRW/U was defined by the ZING (Z39.50 International: the Next Generation) Group.

2.2.1 History

With the advent of the Internet, the flaws of Z39.50 started to become apparent. SRW/U was developed with the intention of using HTTP instead of a telnet connection, a single URL as a request, and an XML response instead of a fairly complex dialog of commands. The main advantage of SRW/U over Z39.50 is its simplicity and its use of standards (such as HTTP and XML).

Note 1: Interoperability Focus: The Bath Profile,
Note 2: SRU: Search/Retrieve via URL,
Note 3: SRW: Search/Retrieve Web Service,
Note 4: SOAP Specifications,

SRU concepts retained from Z39.50:

- Result sets
- Abstract access points
- Abstract record schemas
- Explain
- Diagnostics

Version 1.0 of SRU (November 2002) was an experimental version. Version 1.1 was released in February 2004. SRU version 1.2 is the current version.

2.2.2 Description

Both SRW and SRU have only three operations (note 2):

- explain: requests information about the server's database. The server must respond with the location of the database, a description of its content, and the protocol features that the server supports.
- scan: lists and enumerates the terms in an index of the remote database.
- searchRetrieve: the most important operation. It is the mechanism for querying the remote database with CQL. Queries may range from free text to Boolean operations with nested queries. Some aspects of CQL are optional, but servers must reply to unsupported requests with diagnostic messages. The result of the query may be returned in several metadata formats (such as Dublin Core, RDF, or MARCXML), as specified in the response to the explain operation.

CQL is a formal language designed to be human-readable and intuitive, yet expressive enough for complex queries. Queries can use:

1. Boolean logic (ex: book or magazine)
2. numeric comparisons (ex: year > 2006)
3. proximity of words in a document (ex: book prox/distance<=5 magazine)
4. multiple dimensions (ex: date within " ")
5. relevance (ex: subject any/relevant "fish frog")

The difference between SRU and SRW is the way messages are encapsulated and transmitted (see Figure 1). SRW is a SOAP-ful Web Service: messages sent and received are encapsulated with SOAP. Because of this, HTTP is not a required transport protocol, although it is generally used for practical reasons. SRW requests over HTTP use the POST method, which has the advantage of imposing no limitation on the length of the arguments.
SRU is a REST-ful Web Service, which means that operations are encoded as name/value pairs. All operations are transmitted as HTTP GET requests. The results are XML streams, as with SRW, except that there is no SOAP envelope. The benefits of SRW over SRU are better extension support, authentication, and web service compliance (W3C standard).

Note 1: SRU and Z39.50,
Note 2: SRU: Version 1.2 and Beyond - DLF Spring Forum 2006,

Figure 1 - a) SRW and b) SRU communication with a Server
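
To make the name/value encoding of an SRU request concrete, the following is a minimal sketch of how a searchRetrieve operation could be assembled into a single HTTP GET URL. The parameter names (operation, version, query, startRecord, maximumRecords, recordSchema) are those of SRU 1.1/1.2; the function name, the base URL and the "marcxml" schema name are assumptions made for illustration, since the schemas a real target offers are announced in its explain response.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, start_record=1, maximum_records=10,
                   record_schema=None, version="1.2"):
    """Encode an SRU searchRetrieve operation as name/value pairs in a URL."""
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,              # the CQL query, percent-encoded below
        "startRecord": start_record,     # 1-based position in the result set
        "maximumRecords": maximum_records,
    }
    if record_schema is not None:
        params["recordSchema"] = record_schema
    return base_url + "?" + urlencode(params)


# Hypothetical target URL, used only for illustration.
url = sru_search_url("http://sru.example.org/catalogue",
                     'subject any "fish frog"',
                     record_schema="marcxml")
print(url)
```

A harvester would page through a result set by repeating the request with an increasing startRecord value.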

3 Related work

A scenario similar to that faced by The European Library regarding metadata harvesting by Z39.50 was addressed in the project Yellow Brick Roads: Building a Digital Shortcut to Statewide Information in the United States of America, by the Library of the University of Illinois at Urbana-Champaign. This project investigated the feasibility of unified searching across library holdings, digitization projects, and online state government information through the use of OAI-PMH together with the Z39.50 protocol. Z39.50 was not used for search and retrieval, but for metadata harvesting. Their experience is documented in [3].

The project faced a scenario similar to the one that The European Library faces today, with several libraries that have their catalogues available only by Z39.50. Of particular relevance for The European Library is their study of the viability of using the Z39.50/OAI Gateway Application Profile with existing Z39.50 servers. The OAI-PMH/Z39.50 Gateway Profile (note 1) was created to produce appropriate responses to OAI-PMH requests layered over a Z39.50 server. The project analyzed whether the Z39.50 servers deployed in libraries had the features needed for an actual implementation of the OAI-PMH/Z39.50 Gateway Profile. The requirements that this profile places on the underlying data structures and search mechanisms of the Z39.50 servers were not supported, and the vendors did not have any plans to implement them.

We have no reason to believe that this conclusion does not apply to the European libraries. We tested a sample of European Z39.50 servers, and they also did not support the full set of required functionality.

Another relevant work analyzed the potential uses of OAI-PMH together with SRU [4]. Of particular interest is its analysis of the functionality that an SRU target must provide to allow an OAI-PMH gateway to be built over the SRU interface.
It is therefore an analysis similar to that of the Z39.50/OAI Gateway Profile, but for SRU. It does not present any real-world deployments of the technique, so it gives no indication of its applicability to existing SRU servers.

Note 1: During the course of our work, we were not able to access the URL of the site that hosts the application profile.

4 Metadata harvesting via Z39.50/SRU

This section analyses the available possibilities for performing metadata harvesting via Z39.50/SRU in The European Library.

4.1 Deployment scenarios

In the context of The European Library, two scenarios have been identified in which metadata harvesting via Z39.50/SRU could be used. These two scenarios are aligned with the ongoing work in TELplus on building an OAI-PMH infrastructure for The European Library.

In the first scenario, metadata harvesting via Z39.50/SRU is deployed at the library. That is, the library harvests its own Z39.50 server and makes the harvested metadata available to The European Library by OAI-PMH. In this scenario, the Z39.50/SRU harvester is a form of middleware between the library management system and the OAI-PMH server of the library, as shown in Figure 2.

[Figure 2 - Deployment of a local Z39.50/SRU harvester. Diagram components: National Library (library management system with Z39.50/SRU interface, Z39.50/SRU harvester, OAI-PMH server with OAI-PMH interface); The European Library (metadata harvesting manager, OAI-PMH harvester).]

In the second scenario, metadata harvesting via Z39.50/SRU is deployed at The European Library. In this scenario, the Z39.50/SRU harvester is integrated into the central metadata harvesting infrastructure of The European Library, as shown in Figure 3.

[Figure 3 - Deployment of a Z39.50/SRU central harvester. Diagram components: National Library (library management system with Z39.50/SRU interface); The European Library (metadata harvesting manager, Z39.50/SRU harvester, OAI-PMH harvester).]

A possible third scenario is to deploy a gateway that translates OAI-PMH requests into Z39.50/SRU requests in real time, as in the Z39.50/OAI Gateway Application Profile. However, as presented in Section 3, it is unlikely that the Z39.50 servers deployed in libraries have the features needed for a successful implementation of such a gateway.

4.2 The current status of SRU/Z39.50 usage

Since the start of the TELplus project, several libraries of The European Library have moved from Z39.50/SRU to OAI-PMH. Eleven libraries are implementing OAI-PMH in TELplus, and others are implementing it in the context of local projects and activities.

A significant number of libraries still have collections available by Z39.50, representing 27% of the searchable collections of The European Library. In these cases, an implementation of OAI-PMH would require a major effort, because of difficulties in adapting the underlying information systems.

SRU, however, is currently used for only 3% of the searchable collections of The European Library. This percentage is expected to drop in the short term, since these collections originate from four national libraries, two of which have OAI-PMH implementations in progress. The remaining two libraries have SRU implementations based on Z39.50/SRU gateways, so these collections are available via both SRU and Z39.50. In this context, supporting metadata harvesting by SRU would provide little added value to The European Library, which should therefore focus mainly on metadata harvesting via Z39.50.

4.3 Metadata harvesting efficiency considerations

Distributed search protocols such as Z39.50 and SRU were not designed to fulfil the requirements of metadata harvesting. The methods described in Section 5 make it possible to harvest Z39.50/SRU servers, but in some cases the process can be inefficient or computationally expensive. The harvesting methods can place a heavy load on the Z39.50/SRU targets and/or on the network.
Their use should be tested beforehand, as it is not possible to predict how the target will perform. The performance of the target can vary with several factors, such as:

- The size of the collection;
- The processing capacity of the hardware;
- The network bandwidth available;
- The network latency;
- The server load from other applications and users of the target;
- The database structure of the target.

The description of the methods in Section 5 discusses their implications for the overall efficiency of the harvesting process.

The efficiency of the harvesting process can be further affected if harvesting is performed directly from The European Library against the Z39.50 or SRU targets at the libraries (scenario 2 of Section 4.1). In this scenario, network latency and bandwidth can make the harvesting process very slow and unreliable. The scenario should therefore be tested for every target individually before it is used.

5 Harvesting methods

Distributed search protocols were not designed for metadata harvesting, so there is no single standard way to perform it. The only work developed to define such a standard is the Z39.50/OAI Gateway Application Profile, but its application in real-world scenarios showed that it cannot be applied to the existing Z39.50 servers. For this reason, this document describes other techniques that can be used to harvest search targets. These techniques are based on functionality that is commonly available in Z39.50 servers. They do not provide an ideal harvesting mechanism such as OAI-PMH or the Z39.50/OAI Gateway Application Profile. Although these methods allow metadata harvesting, they may in some cases be inefficient or computationally expensive. For this reason we also discuss their efficiency.

We start by discussing the requirements that make incremental harvests possible. Incremental harvests are a more efficient way of harvesting because, when synchronization takes place, only the updates are transferred, whereas full harvests must transfer the complete collection whenever the harvester needs to update its copy of the collection. When a search target cannot meet the requirements for incremental harvests, only the full harvest techniques can be used.

Regardless of the method used for harvesting, the search target has to provide the metadata records in a format that can be used in The European Library. That is, it must provide full records in MARC format, encoded either in ISO or XML (according to the MARCXML or MarcXchange schemas), which can be converted to the TEL Application Profile. Alternatively, the target may provide TEL Application Profile records, which can be used directly by The European Library.
The descriptions of the methods focus on Z39.50 and use its terminology, but the same methods can be applied to SRU targets, as both protocols share the same underlying concepts.

5.1 Incremental harvests

Incremental harvests are the preferred method of harvesting, since they transfer only the data that has changed between harvests. The changed records are retrieved by querying the target on the date of creation or modification of the records. To support incremental harvests, the target must meet the following requirements:

- It must support queries by date of creation (Z39.50 Bib-1 use attribute 1011, "Date/time added").
- It must support queries by date of last update (Z39.50 Bib-1 use attribute 1012, "Date/time last modified").
- It must support the relation attributes less than, less than or equal, equal, greater than or equal, and greater than (Z39.50 Bib-1 relation attributes 1, 2, 3, 4 and 5).
- It must not limit the number of search results.
- The records must contain an identifier that is unique in the database.
- Deleted records should be returned in the search results. This is not mandatory, but it is recommended.

Figure 4 shows an example harvest carried out with this method. The harvester starts by sending a search request to the search target, querying for records that were created or modified since the last harvest. It then sends present requests to obtain the full records, until the whole result set has been harvested.

Z39.50 servers that support this method of incremental harvesting can be harvested for the first time with any of the full harvesting methods. The requirements for the full harvest by record creation and modification dates are automatically met by targets that support incremental harvests.

Where the target does not keep track of deleted records, the harvester will not know that those records should be deleted. To remove them from its copy, the harvester has to perform periodic full harvests in addition to the incremental harvests.

[Figure 4 - An incremental harvest by record creation and modification dates. Sequence diagram between the Z39.50 harvester and the Z39.50 server: SearchRequest(dateCreated, dateModified) returns a result set; present requests for result set items return response records and are repeated until the result set is finished.]

5.2 Full harvests

Full harvest methods have to be used when harvesting a target for the first time, or whenever the target cannot meet the requirements for incremental harvests. These methods can place a heavy load on the search targets and/or on the network. Their use should be tested beforehand, as it is not possible to predict how the target will perform.

This section describes several methods for conducting full harvests and discusses their implications for the efficiency of the harvesting process.

5.2.1 Full harvest by record creation and modification dates

This full harvest method is based on the same functionality as the incremental harvest described in Section 5.1, and therefore has similar, but lighter, requirements. The records are harvested by querying the target on the date of creation or modification of the records. The target must meet the following requirements:

- It must support queries by date of creation (Z39.50 Bib-1 use attribute 1011, "Date/time added").
- It must support queries by date of last update (Z39.50 Bib-1 use attribute 1012, "Date/time last modified").
- It must not limit the number of search results.
- The records must contain an identifier that is unique in the database.

Figure 5 shows an example harvest carried out with this method. The harvester starts by sending a search request to the search target, querying for records that were created or modified on the earliest date stamp of the database. The earliest date stamp should be configured on the harvester. The harvester then sends present requests to obtain the full records, until the whole result set for that day has been harvested. It proceeds in the same way for each of the following days until the present date is reached, and then terminates the harvest.

Z39.50 servers that support the incremental harvesting method also meet the requirements for this full harvest method. This method should therefore be used when harvesting those targets for the first time, or when full harvests are performed in order to remove deleted records from the harvester.
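
The two date-based methods differ only in the queries they send, which the following minimal sketch illustrates. It writes the queries in Prefix Query Format (PQF), a textual notation for Z39.50 queries used by some client toolkits; the choice of PQF, the function names and the YYYYMMDD date format are assumptions made for illustration, and the date format in particular has to be checked against each target.

```python
from datetime import date, timedelta

# Bib-1 use attributes named in the requirements above.
USE_DATE_ADDED = 1011
USE_DATE_MODIFIED = 1012
# Bib-1 relation attributes: 3 = equal, 4 = greater than or equal.
REL_EQUAL = 3
REL_GREATER_OR_EQUAL = 4


def date_query(day, relation):
    """PQF query for records created OR modified in the given relation to 'day'."""
    stamp = day.strftime("%Y%m%d")  # assumed date format, varies by target
    return (f"@or @attr 1={USE_DATE_ADDED} @attr 2={relation} {stamp} "
            f"@attr 1={USE_DATE_MODIFIED} @attr 2={relation} {stamp}")


def incremental_query(last_harvest):
    """Single query of an incremental harvest: everything changed since last time."""
    return date_query(last_harvest, REL_GREATER_OR_EQUAL)


def full_harvest_queries(earliest, today):
    """One query per day, from the configured earliest date stamp up to today."""
    day = earliest
    while day <= today:
        yield day, date_query(day, REL_EQUAL)
        day += timedelta(days=1)


print(incremental_query(date(2009, 6, 1)))
for day, query in full_harvest_queries(date(2009, 6, 29), date(2009, 7, 1)):
    print(day, query)
```

Each query would be sent as a search request and followed by present requests until its result set is exhausted. The day-by-day loop keeps every result set small, at the cost of one search request per day since the earliest date stamp.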

[Figure 5 - A full harvest by record creation and modification dates. Sequence diagram between the Z39.50 harvester and the Z39.50 server: the harvester gets the first timestamp on the target, sends a search request for that date and present requests until the result set is finished, then repeats the search for each following date while the last harvest date is before today.]

5.2.2 Full harvest by identifier export

This method requires that the data provider is able to export the complete list of record identifiers of its data set into a file. That file is then made available to the harvester, which queries the target for each of the identifiers individually.

The data provider must meet the following requirement:

- It must be able to export all record identifiers into a file that is accessible to the harvester, either on the file system of the harvester or by HTTP.

The search target must meet the following requirement:

- It must support queries by record identifier (Z39.50 Bib-1 use attribute 12, "Local number").

Figure 6 shows an example harvest carried out with this method. The first step consists of creating a file containing the complete list of record identifiers. The file should be a plain text file with one identifier per line.
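
As an illustration of how a harvester might consume such a file, the following minimal sketch reads one identifier per line and builds one query per identifier on Bib-1 use attribute 12. The queries are written in Prefix Query Format (PQF), which is an assumption, as are all function names; fetch_record is a hypothetical callable standing in for one search request followed by one present request, and a dictionary plays the role of the target.

```python
import io

USE_LOCAL_NUMBER = 12  # Bib-1 use attribute "Local number"


def read_identifiers(lines):
    """Yield identifiers from a plain text source, one identifier per line."""
    for line in lines:
        identifier = line.strip()
        if identifier:  # skip blank lines
            yield identifier


def identifier_query(identifier):
    """Build a PQF query for a single record identifier.

    Assumption: identifiers contain no double quotes or backslashes.
    """
    return f'@attr 1={USE_LOCAL_NUMBER} "{identifier}"'


def harvest_by_identifiers(lines, fetch_record):
    """Query each identifier in turn and collect the records that were found.

    fetch_record is a hypothetical callable: it takes a query, performs the
    search and present requests, and returns the record, or None when the
    result set is empty.
    """
    records = []
    for identifier in read_identifiers(lines):
        record = fetch_record(identifier_query(identifier))
        if record is not None:
            records.append(record)
    return records


# Usage with an in-memory file and a fake target standing in for a server.
exported = io.StringIO("100\n\n200 \n300\n")
fake_target = {'@attr 1=12 "100"': "record 100",
               '@attr 1=12 "300"': "record 300"}
harvested = harvest_by_identifiers(exported, fake_target.get)
print(harvested)
```

Because any iterable of lines is accepted, the same function works for a file opened from the local file system and for a file fetched by HTTP and split into lines.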

17 Libraries can use several methods for exporting the identifiers. The choice will depend on the library management system in use, but typically this can be done by a manual export mechanism or by software program that collects the identifiers via a direct connection to the database of the library management system. The first step executed by the harvest will consist in accessing the file and reading the identifiers. Access to the file can be provided by the local file system of the harvester or by an URL. The harvester starts by sending a search request to the search target, with the first identifier, and followed by a present request for the full record. These operations are repeated for all identifiers. sd Full harv est by identifier export Z39.50 Server Z39.50 harvester Identifier exporter Get list of identifiers() Export list of identifiers() Search request(identifier) :result set Present request(result set item) :response record More identifiers exist() :true Search request(identifier) :result set Present request(result set item) :response record More identifiers exist() :false Figure 6 - A full harvest by identifier export Full harvest by sequential identifier This method is similar to the previous one, but does not require the initial step of exporting the list of record identifiers. This method is applicable only in systems where records identifiers consist of sequential numbers. In these cases, the harvest will query the target for record identifiers in sequential order until it reaches a maximum. The maximum can be manually configured or be detected by the harvester. The search target must support the following requirements: It must support queries by record identifier (Z39.50 Bib-1 use attribute 12 Local number ) Figure 7 shows an example harvest, carried out with this method. The harvester starts by sending a search request to the search target, with the first identifier. If the result set is not

18 empty, a present request for the full record is sent and the record is harvested. The harvester then increments the identifier, checks if the maximum identifier was reached, and proceeds by searching for the next identifier or ends the harvest. If the maximum identifier is not known by the harvester, it may stop the harvest when a certain amount of identifiers consecutively failed to be retrieved. sd Full harv est by identifier sequence Z39.50 Server Z39.50 harvester Search request(identifier) :result set Present request(result set item) :result record Increment identifier() Search request(identifier) Limit identifier reached() :false :result set Present request(result set item) :response record Increment identifier() Limit identifier reached() :true Figure 7 - A full harvest by identifier sequence Full harvest by index scan In this method, harvesting is carried out by first executing a harvest of all terms in a predefined index. This initial step is performed by sending successive scan operations to the target, until all terms are obtained. Once all terms are harvested, search requests are sent to the target for each of the terms. The chosen index should be one where all records of the data set have values on. The harvester will miss any records that don t have any values in the harvested index. The search target must support the following requirements: The target must support scan operations.

- The scan operation must support the step-size parameter with a value of 0 (zero).
- An index must be available for the scan, and all records must have a value for that index.
- The target must not limit the number of search results.
- The records must contain an identifier that is unique in the data set; otherwise they may be harvested more than once, or an additional step for detecting and removing duplicates will need to run after harvesting is complete.

Although this method theoretically allows a target to be harvested, it has several disadvantages:
- It is the least efficient way of harvesting a search target: it requires the greatest number of requests to be sent to the target, and the same record can be harvested several times.
- It does not guarantee that 100% of the records are harvested: any record that does not contain a value in the index will not be harvested.
- It places greater requirements on the target than the other methods.
- It is not well supported in Z39.50 implementations: some targets do not support scan, or support it only with limitations, and it is not always supported by Z39.50 client implementations.

For these reasons the use of this method is not viable in The European Library. Other methods are more easily supported by search targets, and implementing this method for The European Library would be more expensive.

5.3 Choosing a harvesting method

When choosing a harvesting method, a data provider should first evaluate whether the search target can support the requirements of the incremental harvest method. If those cannot be supported, a full harvest method should be chosen, first on the basis of the requirements that the search target can fulfill. If several methods are possible, efficiency should then be taken into consideration to make the final decision.
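The selection order described above can be sketched as a small decision function. This is only an illustration: the `Capability` names, the requirement sets and the `choose_method` function are hypothetical, and the requirement sets are assumptions inferred from the method descriptions in this document, not an authoritative statement of what each method needs.

```python
from enum import Enum, auto


class Capability(Enum):
    """Hypothetical names for features a search target may offer."""
    SEARCH_BY_DATE = auto()
    RELATION_ATTRIBUTES = auto()
    UNLIMITED_RESULTS = auto()
    SEARCH_BY_LOCAL_ID = auto()
    IDENTIFIER_EXPORT = auto()
    SEQUENTIAL_IDS = auto()


# Assumed requirements of the incremental harvest method.
INCREMENTAL = (
    "incremental harvest",
    {Capability.SEARCH_BY_DATE, Capability.RELATION_ATTRIBUTES},
)

# Assumed requirements of the full harvest methods. The identifier-based
# methods come first because they are described as the most efficient;
# the relative order of the two identifier methods is arbitrary here.
FULL_METHODS = [
    ("full harvest by sequential identifier",
     {Capability.SEARCH_BY_LOCAL_ID, Capability.SEQUENTIAL_IDS}),
    ("full harvest by identifier export",
     {Capability.SEARCH_BY_LOCAL_ID, Capability.IDENTIFIER_EXPORT}),
    ("full harvest by record creation and modification dates",
     {Capability.SEARCH_BY_DATE, Capability.RELATION_ATTRIBUTES,
      Capability.UNLIMITED_RESULTS}),
]


def choose_method(capabilities):
    """Return the name of the first method whose requirements are all met."""
    # Step 1: prefer the incremental method when the target supports it.
    name, required = INCREMENTAL
    if required <= capabilities:
        return name
    # Step 2: otherwise take the most efficient feasible full harvest method.
    for name, required in FULL_METHODS:
        if required <= capabilities:
            return name
    return None  # no described method fits this target
```

A target that only supports search by local identifier and an identifier export would, under these assumptions, be assigned the full harvest by identifier export.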
Table 1 provides a quick summary of the methods that can be used when the search target cannot fulfill certain requirements. Regarding the efficiency of the full harvesting methods, the method by record creation/modification date requires more requests to be sent to the target, and therefore generates a greater load on the target. The methods that harvest by record identifier are the most efficient, since all search requests are lightweight and easily handled by the search target.
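To make the cost of an identifier-based harvest concrete, the following minimal sketch walks sequential identifiers and issues one lightweight lookup per identifier. No real Z39.50 client library is used: the `search` callable is a hypothetical stand-in for a search on Bib-1 use attribute 12 (Local number) followed by a present request, and `harvest_sequential`, `local_number_query` and `max_misses` are illustrative names. The query string uses PQF-style notation purely to show the shape of the request; whether a given client accepts it is an assumption.

```python
from typing import Callable, Iterator, Optional


def local_number_query(identifier: int) -> str:
    """Build a PQF-style query on Bib-1 use attribute 12 (Local number)."""
    return f"@attr 1=12 {identifier}"


def harvest_sequential(
    search: Callable[[str], Optional[str]],
    first_id: int = 1,
    max_id: Optional[int] = None,
    max_misses: int = 100,
) -> Iterator[str]:
    """Yield records for first_id, first_id + 1, ... until a stop condition.

    The harvest stops once max_id has been passed or, when max_id is not
    known, after max_misses consecutive identifiers gave an empty result.
    """
    identifier = first_id
    misses = 0
    while True:
        if max_id is not None and identifier > max_id:
            break
        # One search (plus present) per identifier; None means empty result set.
        record = search(local_number_query(identifier))
        if record is not None:
            misses = 0
            yield record
        else:
            misses += 1
            if max_id is None and misses >= max_misses:
                break
        identifier += 1


# Usage with an in-memory stand-in for a search target (identifier 3 is a gap).
fake_target = {
    "@attr 1=12 1": "rec-1",
    "@attr 1=12 2": "rec-2",
    "@attr 1=12 4": "rec-4",
}
harvested = list(harvest_sequential(fake_target.get, max_misses=3))
```

Resetting the miss counter after every successful retrieval lets the harvest survive isolated gaps in the numbering, such as deleted records, while still ending when the identifiers run out.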

Table 1 - Limiting requirements for the harvesting methods

Limitations of the search target:
- Limits the number of search responses
- Does not support search by date stamps
- Records do not contain a local identifier
- Does not support search by local identifier
- Does not support search with relation attributes
- A list of identifiers cannot be exported

Harvesting methods:
- Incremental harvests
- Full harvest by record date stamp
- Full harvest by identifier export
- Full harvest by sequential identifier

6 Conclusion

This document presented an analysis of metadata harvesting methods, alternative to OAI-PMH, based on the search and retrieval protocols Z39.50 and SRU. Several methods were described and analyzed for their efficiency, but their application in real deployments has to be tested case by case. Of all the methods described, the combination of incremental harvests, by record creation and modification dates, and full harvests, by one of the record identifier methods, is the most efficient solution if the search target can support the requirements of these methods.

The methods described here will be implemented in the OAI-PMH software under development in TELplus. These methods will therefore be available for deployment locally at the libraries or centrally in The European Library's metadata harvesting management system. The implementation will start in July 2009 and will be finished by December. During the development process, the methods will be tested on real Z39.50 targets from European national libraries.



More information

BHL-EUROPE: Biodiversity Heritage Library for Europe. Jana Hoffmann, Henning Scholz

BHL-EUROPE: Biodiversity Heritage Library for Europe. Jana Hoffmann, Henning Scholz Nimis P. L., Vignes Lebbe R. (eds.) Tools for Identifying Biodiversity: Progress and Problems pp. 43-48. ISBN 978-88-8303-295-0. EUT, 2010. BHL-EUROPE: Biodiversity Heritage Library for Europe Jana Hoffmann,

More information

SAS Web Infrastructure Kit 1.0. Overview

SAS Web Infrastructure Kit 1.0. Overview SAS Web Infrastructure Kit 1.0 Overview The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2004. SAS Web Infrastructure Kit 1.0: Overview. Cary, NC: SAS Institute Inc.

More information

Managing Learning Objects in Large Scale Courseware Authoring Studio 1

Managing Learning Objects in Large Scale Courseware Authoring Studio 1 Managing Learning Objects in Large Scale Courseware Authoring Studio 1 Ivo Marinchev, Ivo Hristov Institute of Information Technologies Bulgarian Academy of Sciences, Acad. G. Bonchev Str. Block 29A, Sofia

More information

Unified Approach to Searching across Information Services

Unified Approach to Searching across Information Services Unified Approach to Searching across Information Services Devika P. Madalli Documentation Research and Training Centre Indian Statistical Institute, Bangalore, India devika@drtc.isical.ac.in A.R.D. Prasad

More information

SciX Open, self organising repository for scientific information exchange. D15: Value Added Publications IST

SciX Open, self organising repository for scientific information exchange. D15: Value Added Publications IST IST-2001-33127 SciX Open, self organising repository for scientific information exchange D15: Value Added Publications Responsible author: Gudni Gudnason Co-authors: Arnar Gudnason Type: software/pilot

More information

Metadata Overview: digital repositories

Metadata Overview: digital repositories Metadata Overview: digital repositories Presented during Pre-African Summit Workshop no 2: Building digital repositories in public, special and research libraries by Makaba Macanda macanmb@unisa.ac.za

More information

VDX. VDX Web Admin Manual Pt.1

VDX. VDX Web Admin Manual Pt.1 VDX VDX Web Admin Manual Pt.1 OCLC, 2012. OCLC owns the copyright in this document including the content, page layout, graphical images, logos, and photographs and also owns all trademarks so identified.

More information

Alphabet Soup: Choosing Among DC, QDC, MARC, MARCXML, and MODS. Jenn Riley IU Metadata Librarian DLP Brown Bag Series February 25, 2005

Alphabet Soup: Choosing Among DC, QDC, MARC, MARCXML, and MODS. Jenn Riley IU Metadata Librarian DLP Brown Bag Series February 25, 2005 Alphabet Soup: Choosing Among DC, QDC, MARC, MARCXML, and MODS Jenn Riley IU Metadata Librarian DLP Brown Bag Series February 25, 2005 Descriptive metadata Enables users to find relevant materials Used

More information

OAI-PMH. DRTC Indian Statistical Institute Bangalore

OAI-PMH. DRTC Indian Statistical Institute Bangalore OAI-PMH DRTC Indian Statistical Institute Bangalore Problem: No Library contains all the documents in the world Solution: Networking the Libraries 2 Problem No digital Library is expected to have all documents

More information

Ponds, Lakes, Ocean: Pooling Digitized Resources and DPLA. Emily Jaycox, Missouri Historical Society SLRLN Tech Expo 2018

Ponds, Lakes, Ocean: Pooling Digitized Resources and DPLA. Emily Jaycox, Missouri Historical Society SLRLN Tech Expo 2018 Ponds, Lakes, Ocean: Pooling Digitized Resources and DPLA Emily Jaycox, Missouri Historical Society SLRLN Tech Expo 2018 Reflections on the digital world Librarian Good news So many libraries have digitized

More information

Promoting semantic interoperability between public administrations in Europe

Promoting semantic interoperability between public administrations in Europe ISA solutions, Brussels, 23 September 2014 Vassilios.Peristeras@ec.europa.eu Promoting semantic interoperability between public administrations in Europe What semantics is about? ISA work in semantics

More information

A Comparative Study of the Search and Retrieval Features of OAI Harvesting Services

A Comparative Study of the Search and Retrieval Features of OAI Harvesting Services A Comparative Study of the Search and Retrieval Features of OAI Harvesting Services V. Indrani 1 and K. Thulasi 2 1 Information Centre for Aerospace Science and Technology, National Aerospace Laboratories,

More information

Developing Seamless Discovery of Scholarly and Trade Journal Resources Via OAI and RSS Chumbe, Santiago Segundo; MacLeod, Roddy

Developing Seamless Discovery of Scholarly and Trade Journal Resources Via OAI and RSS Chumbe, Santiago Segundo; MacLeod, Roddy Heriot-Watt University Heriot-Watt University Research Gateway Developing Seamless Discovery of Scholarly and Trade Journal Resources Via OAI and RSS Chumbe, Santiago Segundo; MacLeod, Roddy Publication

More information

The DIGMAP Virtual Digital Library

The DIGMAP Virtual Digital Library José Borbinha *, Gilberto Pedrosa *, João Luzio *, Hugo Manguinhas *, Bruno Martins * The DIGMAP Virtual Digital Library Keywords: Geographic information; cartographic heritage; information systems architectures;

More information

GUIDELINES FOR DATABASES AS PUBLIC RECORDS PURPOSE... 1 OVERVIEW... 1 POLICY GUIDELINES... 2 OFFICIAL REQUEST... 2 EXEMPT RECORDS... 2 REQUESTS FOR SPECIFIC RECORDS... 3 REQUEST FOR ENTIRE DATABASES OR

More information

SAS 9.2 Foundation Services. Administrator s Guide

SAS 9.2 Foundation Services. Administrator s Guide SAS 9.2 Foundation Services Administrator s Guide The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2009. SAS 9.2 Foundation Services: Administrator s Guide. Cary, NC:

More information