Bettina Martin-Weber

BASYS - The Federal Archives database-driven archival management system for description, administration and presentation of metadata and digital archives

Back in 2004 I took part in a Jerzy Skowronek conference here in Warsaw. In that talk almost 10 years ago the central question was the same: How can access to archival information through the Internet be optimised? Already in 2004 you could see that the Internet would become (and already was) the platform for the presentation of archival information. While preparing today's talk I found it interesting to see what has happened since 2004: how targets have remained the same or have changed, which approaches turned out to be successful and where new methods have to be found.

So what are our expectations today? Today the Internet is the central platform for the presentation of archival information. Other means, such as printed finding aids, have been abandoned, at least by the Federal Archives. The public today expects more than just an insight or an extract on the Internet; people expect full, current and complete information. And they expect, unrealistic as this may be, to find the archival material itself completely digitized. This expectation is expressed in various political initiatives for access to cultural assets and public information.

Besides, metadata and digital archival material should be accessible not only on the website of the archives but also from national and international portals. In this context, the standardization and automation of data export is gaining importance. For reasons of traceability, the links to metadata and digitized archival material should be permanently stable. This is a new technical requirement which goes beyond the reliability of quotable references.

Unlike in 2004, there is now no alternative to the Internet for accessing archival information.
Thus requirements have grown with respect to quality as well as quantity. In future, large amounts of metadata and digitized archival material will be made available. Everything now depends on procedures and tools that keep the effort low, follow a clearly target-oriented method and provide up-to-date and legally compliant data. The procedures developed over the last decade for displaying metadata and archival material are therefore being
critically examined in the Federal Archives. So far the Federal Archives has provided its access information in separate XML-based online finding aids that are incorporated in the search engine ARGUS, developed specifically for this purpose [1]. The method is rather labour-intensive and ties up a considerable amount of human resources, both in manually exporting data when creating the finding aids and in managing the file systems. We no longer consider this procedure suitable for large quantities of data which, in addition, have to be kept constantly up to date [2]. The method used so far could make only part of the information available in-house accessible on the Internet. At the moment 2,187 finding aids with some 2 million units of description can be accessed via the Internet (by comparison, in 2004 the figure was 100!). The archival database meanwhile manages 7,420 holdings with some 7 million units of description. We believe that this discrepancy between what is online and what is not is not up to today's standards. In 2004 our goal was still to make a substantial part of the information available, while today we need to make available all data whose publication is in line with legal requirements. For that we need an efficient, sustainable and legally compliant procedure.

The Federal Archives is in the favourable position of having used a database-supported archive management software named BASYS 2 for a number of years. It has been developed in the Federal Archives for its special purposes and has been constantly optimised since. A new research platform called Invenio has also been part of BASYS 2 since 2011. The data for Invenio are generated directly from BASYS 2. Invenio is a web application which has been deployed for users in the reading rooms and for the staff of the Federal Archives. It has proven successful in internal operation during that phase.
As from next year Invenio will also be used as an external research platform on the Internet. The benefits of this new database-supported method are:

Through automation, not only is the effort for creating the online finding aids reduced substantially; separate storage and management are also no longer required.

Metadata are kept and maintained in one place and hence remain consistent.

Archival description can be kept up to date: the results of the previous day's work can, provided that there are no legal restrictions, be made available on the Internet on the following day.

The rights registered in the database can be used to design the automated display of data in line with the law in force, so that all disclosed data are available for online research.

I will refer especially to this last point as I now give you an outline of BASYS 2 and our research platform Invenio.

[1] Online finding aids are prepared using different tools that primarily support pooling data from different sources. The data originate partly from the central database of the Federal Archives but also from other sources.
[2] Another functional problem is the inconsistency of data, which is difficult to avoid when data are stored in different places.

Archival management software BASYS 2

BASYS 2 and its predecessors have been in operation in the Federal Archives since the 1990s. The application is supported by an Oracle database. It is the central tool for archival work with documents in the Federal Archives. Currently we are developing BASYS modules for the film archive.

I would like to focus on the features of BASYS that are important for research support and the process of use. BASYS comes into play as soon as documents are acquired. It supports all stages from appraisal, arrangement, description and storage to usage. The system holds all content-related and context-related information as well as all preservation measures, and it records every form of conversion, such as film, fiches or digitized reproductions. Any contact with archival material is supported and documented by BASYS 2.

During arrangement and description, legally relevant data are also entered in the database. These include the date range of the records as well as dates of birth and death in the case of personal documents. Based on these data the term of protection is calculated automatically, and the time of disclosure and accessibility is determined in the system. Other contractual restrictions are also recorded in BASYS and are automatically taken into account in the process of use.

BASYS stores not only data relating to archival material but also details about users and their research intentions.
This includes names and addresses as well as information on the topic of use and, where applicable, details about their individual permission to access metadata and archival material. In principle, all users have the same access rights. These rights may be extended by a special permission. This individual profile of rights is stored in BASYS for every user and is checked automatically when the user does research or requests material.

The Federal Archives grants access to information that is as open as possible, in line with the law in force. It therefore differentiates between research in metadata and access to the records themselves.
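As a rough illustration of the distinction between researching metadata and accessing the records, the following Python sketch computes a disclosure year from the end of a record's date range and answers the two questions separately. It is not BASYS code: the `Unit` class, its field names and the rule that the term runs from the last year of the date range are assumptions made for the example, and the 30-year figure is the general term of protection under the Federal Archives Act. The special rules for personal documents, based on dates of birth and death, are left out.

```python
from dataclasses import dataclass

GENERAL_TERM_YEARS = 30  # general term of protection (Federal Archives Act)


@dataclass
class Unit:
    """Hypothetical unit of description; all field names are illustrative."""
    signature: str
    latest_record_year: int            # end of the date range of the records
    metadata_restricted: bool = False  # some other legal restriction on the metadata


def disclosure_year(unit: Unit) -> int:
    # Assumption: the term is counted from the last year of the date range.
    return unit.latest_record_year + GENERAL_TERM_YEARS


def metadata_searchable(unit: Unit) -> bool:
    # Metadata may be researched even inside the term of protection,
    # unless another legal restriction applies.
    return not unit.metadata_restricted


def can_order(unit: Unit, current_year: int, has_permit: bool = False) -> bool:
    # The records themselves need an expired term or an individual permit.
    return has_permit or current_year >= disclosure_year(unit)


if __name__ == "__main__":
    unit = Unit("EX 1/1", latest_record_year=1995)  # made-up signature
    print(metadata_searchable(unit), can_order(unit, 2013))
```

Keeping the two checks in separate functions mirrors the point made in the text: a user can find a unit in the metadata and only then apply for the permit that makes `can_order` succeed.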
According to the Federal Archives Act, documents can only be used after a period of 30 years. However, the law allows this general term of protection to be shortened. Users can only benefit from this option if they know that the archival material exists. Consequently, metadata on archival material that is less than 30 years old can be researched, unless other legal restrictions preclude the publication of the data. The automatic check of rights in BASYS may therefore lead to the result that metadata can be researched but the documents themselves cannot be ordered. In particular cases a permit may be granted and entered in BASYS to allow a user to request the desired documents.

This management of rights in BASYS significantly reduces staff effort in making metadata and archival material available. A further benefit: when an export of data for external platforms is prepared, the profile of rights and the access terms are also taken into account and determine which data can be exported and which cannot.

Research platform Invenio

Invenio relies on proven methods of displaying archival information. Metadata are displayed according to their provenance. Content-related information is presented on different levels of description for holdings and records. Structural information relating to hierarchy and classification is visible at the same time, ensuring that all information can be interpreted in its context. In this way redundancy is avoided, in compliance with the international standard for description ISAD(G).

I would now like to give you an impression of what Invenio looks like. For this I will show you screenshots of the version that has been deployed in the reading rooms for two years. The presentation of the inventory is familiar to the user: it follows the display of the online finding aids, which themselves correspond to the familiar layout of the printed version.
Divided into three frames, the screen shows the hierarchy, the classification of the selected holding and the units of description in the inventory. Users can choose between two ways of research: they can navigate from superior to more detailed information, or they can use the full-text search. The full-text search covers all holdings but can, if necessary, be limited to certain fonds. Unlike Google and the current online search engine of the Federal Archives, the full-text search does not lead to a flat hit list. It first leads the user into the context and the structural relationships: the first results are displayed as counted hits in the structure of the holdings.
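The difference from a flat hit list can be sketched in a few lines of Python. This is an illustration of the idea only, not of how Invenio is implemented; the `Hit` type and all function names are hypothetical. Hits are first reduced to counts per holding, then to counts per classification point within the chosen holding, and only at the last step are individual units of description listed.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One full-text hit with its structural context (illustrative fields)."""
    holding: str
    classification: str
    title: str


def counts_by_holding(hits: Iterable[Hit]) -> Counter:
    # First view: number of hits per holding in the hierarchy.
    return Counter(hit.holding for hit in hits)


def counts_by_classification(hits: Iterable[Hit], holding: str) -> Counter:
    # Second view: hits per classification point of the selected holding.
    return Counter(h.classification for h in hits if h.holding == holding)


def hits_in_inventory(hits: Iterable[Hit], holding: str,
                      classification: str) -> List[Hit]:
    # Third view: the individual units of description that matched.
    return [h for h in hits
            if h.holding == holding and h.classification == classification]


if __name__ == "__main__":
    sample = [  # made-up data
        Hit("Holding A", "1 Organisation", "File on staff"),
        Hit("Holding A", "2 Finance", "Budget file"),
        Hit("Holding A", "2 Finance", "Audit file"),
        Hit("Holding B", "1 General", "Correspondence"),
    ]
    print(counts_by_holding(sample))
    print(counts_by_classification(sample, "Holding A"))
    print(hits_in_inventory(sample, "Holding A", "2 Finance"))
```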
There, in the hierarchy, users make their selection and then see the hits for that selection in the classification. In the classification they make a choice and get to the inventory, where the search terms are highlighted in the hits. On the records level they initially see only the hits that contain their search terms, but at any time they can have the complete content of a classification point displayed. That means that after a full-text search with a qualified search result, the user can switch to a complete view of the holdings, the classification of the selected holding and all units of description under the marked classification point. It is the same view they would have when searching by navigation. Of course it is possible to go back to the first selection.

[Screenshot: Invenio with its three frames: Hierarchy/Holdings, Classification of the selected holding, Inventory]

Even when using the full-text search, users have to proceed through the structure top-down, from the general to the detail. This consistently structure-oriented approach follows proven archival search strategies. The plain hit list that is common on the Internet, on the other hand, requires each hit to be actively traced back to its origin. Without any knowledge of its origin and the system of classification, however, an item can in most cases not be interpreted correctly. A systematic overview of the hits at a high level of abstraction is also missing, so that an early exclusion of irrelevant hits is not possible. The top-down procedure may seem somewhat unusual and time-consuming at first glance, but on closer examination it is, at least for archival information, more efficient, more transparent and less time-consuming than the interpretation
and allocation of individual hits out of a list.

The user guidance in Invenio is strictly focused on structure and context. An exception, at first sight, is the search for persons and names. Here the first result is a flat hit list, which gives a better overview. In it you can identify the person you are looking for with the aid of further data. If necessary, the full context of every single name hit can be retrieved, as shown a few minutes ago with the example of other records.

The data for Invenio are generated overnight from the BASYS 2 database. They include the structural and content-related information as well as any legal restrictions and the information about the users. On this data pool, each research result is first checked against the user's profile of rights in BASYS, which ensures that only those details are displayed that the user is entitled to see. This procedure is already used in the reading rooms and considerably relieves employees of time-consuming individual checks.

For Internet research we initially intend to make available only information that is not legally restricted. In future, extended access for different user profiles should also be possible. Users who are personally known to the Federal Archives will be permitted to access on the Internet the information that would also be visible to them in the reading room. That kind of access must be well protected. This option will therefore only be introduced once electronic identity cards (eID) are available for authentication.

Orders and reservations can be triggered in BASYS directly from Invenio without having to switch media. As long as digital or other reproductions are on hand, orders and reservations of the original documents will not be possible. Digital reproductions can be retrieved directly from the unit of description in the inventory.
BASYS aims to support internal processes and, where possible, to optimise them through automation, but it also makes sure that users are well looked after throughout the research process. Users will be able to save their research results on a personal, permanent notepad and place orders from it. If a user works on several topics, they can maintain several topic-specific notepads without time limit. Users can add or delete items on the notepad, and newly saved information is automatically arranged in the structure of the archival description, which keeps the user's personal section neat and clearly arranged. If a user wants to look at the saved information again in its provenance context, they can call up this structured view straight from the notepad. The notepad and ordering without switching media make things easier for the user. The notepad can be sent by e-mail to the user's private e-mail address, where it can be processed further.
Invenio has been developed for user research. After it had been implemented, it very soon became apparent that the application also supports the processing of written enquiries and the preparation of archive visits in a very efficient way. Archivists log in to Invenio with the user's profile, can rapidly do advance research tailored to the user's rights and can then e-mail the results to the user in a legally compliant way, straight from the application.

Access to digitized reproductions and digital archival material

Besides BASYS 2, the Federal Archives maintains a digital archive for the accession and long-term preservation of born-digital material. At the moment an additional so-called digital magazine is being set up for digitized archival material. Because the functional requirements for born-digital archival material and for digitized documents differ, separate storage solutions are implemented. The digital archive and the digital magazine are interfaced with BASYS 2. The metadata on digital records from the federal administration are integrated in BASYS 2. All digital items are to be accessible for users via Invenio. Here, too, rights are checked before any digitized reproductions or born-digital material are disclosed, so that nothing can be accessed that is not legally permissible.

Summary

Data for Invenio research on the Intranet and the Internet are prepared automatically from the central database BASYS.

Automatic access checks take place when metadata are researched and when archival material is reserved or ordered:
Are metadata and archival material generally accessible?
May the specific user see metadata and archival material because they are specifically entitled?
Are the original documents issued, or is a converted form available?
Is a digital reproduction available?
Are the documents available or are they in use elsewhere?
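The sequence of questions above can be read as a small decision procedure. The Python sketch below is one possible reading and not the logic implemented in BASYS: the `Record` fields, the `Outcome` values and the order of the checks are assumptions. In particular, offering a reservation when the originals are in use elsewhere is an interpretation; the only rule taken directly from the talk is that originals cannot be ordered while a reproduction exists.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    DENIED = "denied"
    VIEW_DIGITAL = "view digital reproduction"
    ORDER_REPRODUCTION = "order reproduction"
    RESERVE_ORIGINAL = "reserve original"
    ORDER_ORIGINAL = "order original"


@dataclass
class Record:
    """Hypothetical record state; field names are illustrative."""
    generally_accessible: bool
    digital_reproduction: bool = False
    other_reproduction: bool = False  # e.g. film or fiche
    in_use_elsewhere: bool = False


def check_request(record: Record, user_has_permit: bool = False) -> Outcome:
    # 1./2. General accessibility, or an individual entitlement of this user.
    if not (record.generally_accessible or user_has_permit):
        return Outcome.DENIED
    # 3./4. Reproductions take precedence over the originals.
    if record.digital_reproduction:
        return Outcome.VIEW_DIGITAL
    if record.other_reproduction:
        return Outcome.ORDER_REPRODUCTION
    # 5. Originals: order now, or reserve if they are in use (assumption).
    if record.in_use_elsewhere:
        return Outcome.RESERVE_ORIGINAL
    return Outcome.ORDER_ORIGINAL


if __name__ == "__main__":
    print(check_request(Record(generally_accessible=True, in_use_elsewhere=True)))
```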
With the database-supported archive management software BASYS 2 and the linked web application Invenio, the Federal Archives will thus have efficient tools at hand that make access up to date, comprehensive and in line with the law in force without tying up a lot of human resources.
Digitized and digital archival material is available via Invenio and can, provided that there are no legal restrictions, be retrieved directly.

Automated and standardized XML exports in EAD format make data from BASYS available for other portals and platforms.

Persistent identifiers for metadata and digital materials will be established so that other research and archive portals can refer to Invenio.
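To give an idea of what an automated EAD export might involve, the Python sketch below writes a minimal EAD-style fragment with the standard library and skips units that are not disclosed. It is an assumption-laden illustration rather than the BASYS export: the function name, the dictionary keys and the sample values are invented. The element names (`ead`, `archdesc`, `did`, `unitid`, `unittitle`, `unitdate`, `dsc`, `c`) are taken from EAD, but the output omits the header and is not meant to be schema-valid.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable


def export_ead(holding_id: str, holding_title: str,
               units: Iterable[Dict]) -> str:
    """Serialize one holding and its disclosed units as an EAD-style fragment."""
    ead = ET.Element("ead")
    archdesc = ET.SubElement(ead, "archdesc", {"level": "fonds"})
    did = ET.SubElement(archdesc, "did")
    ET.SubElement(did, "unitid").text = holding_id
    ET.SubElement(did, "unittitle").text = holding_title

    dsc = ET.SubElement(archdesc, "dsc")
    for unit in units:
        # Rights check before export: restricted units never leave the system.
        if not unit["disclosed"]:
            continue
        component = ET.SubElement(dsc, "c", {"level": "file"})
        c_did = ET.SubElement(component, "did")
        ET.SubElement(c_did, "unitid").text = unit["id"]
        ET.SubElement(c_did, "unittitle").text = unit["title"]
        ET.SubElement(c_did, "unitdate").text = unit["date"]
    return ET.tostring(ead, encoding="unicode")


if __name__ == "__main__":
    sample_units = [  # made-up data
        {"id": "EX 1/1", "title": "Budget file", "date": "1950-1952", "disclosed": True},
        {"id": "EX 1/2", "title": "Personnel file", "date": "1995", "disclosed": False},
    ]
    print(export_ead("EX 1", "Example holding", sample_units))
```

Filtering inside the export function, rather than afterwards, reflects the point made earlier in the talk that the profile of rights and the access terms decide which data can be exported at all.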